This shows you the differences between two versions of the page.
tuto_ecj_weka [2014/03/03 15:44] Denis Pallez created |
tuto_ecj_weka [2014/03/03 16:16] Denis Pallez [Class diagram:] |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Tutorial: Build Weka classifiers using ECJ algorithms ====== | + | ====== How to build Weka classifiers using the ECJ library? ====== |
- | A short version is available [[easier|here]]. | + | A short version is available [[tuto_ecj_weka_easier|here]]. |
- | ===== Table of contents: ===== | + | If you have any questions, please do not hesitate to ask. You can contact me at romaric DOT pighetti AT gmail DOT com. |
- | + | ||
- | * [[#prerequisities|Prerequisites]] | + | |
- | * [[#classdiagram|Class diagram]] | + | |
- | * [[#GetTheResultsOfTheEvolutionaryAlgorithm|Getting the results of the evolutionary algorithm]] | + | |
- | * [[#RunningTheAlgorithmAndGetTheResults|Running the algorithm and getting the results]] | + | |
- | * [[#GettingDataFromWeka|Getting data from Weka]] | + | |
- | * [[#UsingInstances|Using the ]]**[[#UsingInstances|Instances]]**[[#UsingInstances| and accessing the information enclosed within it]] | + | |
- | * [[#BuildWekaAlgorithm|Build a new algorithm in weka]] | + | |
- | * [[#HandlingOptions|Handling options in a weka clusterer]] | + | |
- | * [[#CallingECJ|Calling the ECJ algorithm:]] | + | |
- | * [[#Conclusion|Conclusion]] | + | |
- | + | ||
- | ---- | + | |
- | + | ||
- | This tutorial is under construction. It should be finished within a couple of weeks. If you have any questions, please do not hesitate to ask. You can contact me at romaric DOT pighetti AT gmail DOT com. | + | |
===== Goal and introduction ===== | ===== Goal and introduction ===== | ||
Line 36: | Line 21: | ||
===== Class diagram: ===== | ===== Class diagram: ===== | ||
- | Here is a short diagram representing what we will add to ECJ in order to use it correctly with Weka, and how it is connected to the core of ECJ (Click to access the image). | + | Here is a short diagram representing what we will add to ECJ in order to use it correctly with Weka, and how it is connected to the core of ECJ. |
- | + | {{:ecj_weka_fichiers:ecj_mods_dgrm.png?direct&300 | UML class diagram}} | |
- | [[images:ecj_mods_dgrm|{{images:ecj_mods_dgrm.png|class diagram}}]]===== Getting the results of the evolutionary algorithm ===== | + | ===== Getting the results of the evolutionary algorithm ===== |
ECJ is based on parameter files, which contain the parameters of the evolutionary algorithms, including the names of the classes used to perform the different parts of the algorithm. The only way to access the results of the computation properly is through the statistics process. Thus I created the class **ec.weka.StatisticsForWeka**, whose role is to keep the best individuals computed for each subpopulation in each job. This class is also responsible for saving this information and restoring it upon checkpointing. | ECJ is based on parameter files, which contain the parameters of the evolutionary algorithms, including the names of the classes used to perform the different parts of the algorithm. The only way to access the results of the computation properly is through the statistics process. Thus I created the class **ec.weka.StatisticsForWeka**, whose role is to keep the best individuals computed for each subpopulation in each job. This class is also responsible for saving this information and restoring it upon checkpointing. | ||
Line 44: | Line 29: | ||
ec.weka.StatisticsForWeka: | ec.weka.StatisticsForWeka: | ||
- | <code> | + | <code java> |
package ec.weka; | package ec.weka; | ||
Line 161: | Line 146: | ||
ec.weka.EvolutionStateForWeka: | ec.weka.EvolutionStateForWeka: | ||
- | <code> | + | <code java> |
package ec.weka; | package ec.weka; | ||
Line 253: | Line 238: | ||
ECJBasedClusterer.java, first step: | ECJBasedClusterer.java, first step: | ||
- | <code> | + | <code java> |
package weka.clusterers; | package weka.clusterers; | ||
Line 316: | Line 301: | ||
Path to the ECJ parameters file: | Path to the ECJ parameters file: | ||
- | <code> | + | <code java> |
private String parameterFilePath; | private String parameterFilePath; | ||
</code> | </code> | ||
Line 328: | Line 313: | ||
These 3 methods in our case: | These 3 methods in our case: | ||
- | <code> | + | <code java> |
/** | /** | ||
* Returns an enumeration describing the available options. | * Returns an enumeration describing the available options. | ||
Line 383: | Line 368: | ||
setter | setter | ||
- | <code> | + | <code java> |
public void setBlabla(TypeOfTheField t) | public void setBlabla(TypeOfTheField t) | ||
{ | { | ||
Line 392: | Line 377: | ||
getter | getter | ||
- | <code> | + | <code java> |
public TypeOfTheField getBlabla() | public TypeOfTheField getBlabla() | ||
{ | { | ||
Line 412: | Line 397: | ||
setLearningDataSet: | setLearningDataSet: | ||
- | <code> | + | <code java> |
Instance[] instances = new Instance[numberOfInstance]; | Instance[] instances = new Instance[numberOfInstance]; | ||
/* | /* | ||
Line 424: | Line 409: | ||
calling the run method with a parameter file: | calling the run method with a parameter file: | ||
- | <code> | + | <code java> |
String[] parameters = new String[3]; | String[] parameters = new String[3]; | ||
parameters[1] = "-file"; | parameters[1] = "-file"; | ||
Line 437: | Line 422: | ||
If you want to load from a checkpoint, you don't need to set the learning data. As mentioned, the data given by Weka to ECJ are stored in a way that allows them to be saved when checkpointing. So when resuming, the previously saved data are restored and used to complete the evolution process. If new data are given, they are ignored. Here is a bit of code showing how to launch an algorithm from a checkpoint file: | If you want to load from a checkpoint, you don't need to set the learning data. As mentioned, the data given by Weka to ECJ are stored in a way that allows them to be saved when checkpointing. So when resuming, the previously saved data are restored and used to complete the evolution process. If new data are given, they are ignored. Here is a bit of code showing how to launch an algorithm from a checkpoint file: | ||
- | <code> | + | <code java> |
String[] parameters = new String[3]; | String[] parameters = new String[3]; | ||
parameters[1] = "-checkpoint"; | parameters[1] = "-checkpoint"; |