public class CheckEstimator extends Object implements OptionHandler, RevisionHandler
java weka.estimators.CheckEstimator -W estimator_name estimator_options
This class uses code from the CheckEstimatorClass
ATTENTION! Current estimators can only
1. split on a nominal class attribute
2. build estimators for nominal and numeric attributes
3. build estimators independently of the class type
The functionality to test on other class and attribute types has largely been left in the code.
CheckEstimator reports on the following:
weka.estimators.AbstractEstimatorTest uses this class to test all the estimators. Any changes made here must also be checked in that abstract test class.
Valid options are:
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 100).
-W Full name of the estimator analysed. e.g.: weka.estimators.NormalEstimator
Options specific to estimator weka.estimators.NormalEstimator:
-D If set, estimator is run in debug mode and may output additional info to the console
Options after -- are passed to the designated estimator.
See also: TestInstances
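The option list above says that everything after `--` goes to the designated estimator. A minimal sketch of that split, assuming the first `--` is the separator, might look as follows. `OptionSplitSketch` and `splitAtDoubleDash` are hypothetical names used only for illustration; they are not part of Weka and do not show how CheckEstimator itself parses its arguments.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: splits a command line at the first "--".
final class OptionSplitSketch {

    /** Returns {optionsBeforeSeparator, optionsAfterSeparator}. */
    static String[][] splitAtDoubleDash(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(args, 0, i),
                    Arrays.copyOfRange(args, i + 1, args.length)
                };
            }
        }
        // No separator: everything belongs to the checker, nothing to the estimator.
        return new String[][] { args.clone(), new String[0] };
    }

    public static void main(String[] argv) {
        String[] args = {"-N", "200", "-W", "weka.estimators.NormalEstimator", "--", "-D"};
        String[][] parts = splitAtDoubleDash(args);
        System.out.println("checker options:   " + Arrays.toString(parts[0]));
        System.out.println("estimator options: " + Arrays.toString(parts[1]));
    }
}
```

In this example the `-D` after `--` would reach the estimator, while a `-D` before it would switch on the checker's own debugging output.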
Modifier and Type | Class and Description |
---|---|
static class | CheckEstimator.AttrTypes: class that contains info about the attribute types the estimator can estimate; estimators work on one attribute only |
static class | CheckEstimator.EstTypes: public class that contains info about the chosen attribute type; estimators work on one attribute only |
class | CheckEstimator.PostProcessor: a class for postprocessing the test-data |
Modifier and Type | Field and Description |
---|---|
protected String | m_AnalysisResults: The results of the analysis as a string |
protected boolean | m_ClasspathProblems: whether classpath problems occurred |
protected boolean | m_Debug: Debugging mode, gives extra output if true |
protected Estimator | m_Estimator: The estimator to be examined |
protected String[] | m_EstimatorOptions: The options to be passed to the base estimator. |
protected int | m_NumInstances: The number of instances in the datasets |
protected CheckEstimator.PostProcessor | m_PostProcessor: for post-processing the data even further |
protected boolean | m_Silent: Silent mode, for no output at all to stdout |
Constructor and Description |
---|
CheckEstimator() |
Modifier and Type | Method and Description |
---|---|
protected void | addMissing(Instances data, int level, boolean attributeMissing, boolean classMissing, int attrIndex): Add missing values to a dataset. |
protected boolean[] | canEstimate(CheckEstimator.AttrTypes attrTypes, boolean supervised, int classType): Checks basic estimation of one attribute of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canHandleClassAsNthAttribute(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int classIndex): Checks whether the scheme can handle class attributes as Nth attribute. |
protected boolean[] | canHandleMissing(CheckEstimator.AttrTypes attrTypes, int classType, boolean attributeMissing, boolean classMissing, int missingLevel): Checks basic missing value handling of the scheme. |
protected boolean[] | canHandleNClasses(CheckEstimator.AttrTypes attrTypes, int numClasses): Checks whether nominal schemes can handle more than two classes. |
protected boolean[] | canHandleZeroTraining(CheckEstimator.AttrTypes attrTypes, int classType): Checks whether the scheme can handle zero training instances. |
protected void | canSplitUpClass(CheckEstimator.AttrTypes attrTypes, int classType): Checks basic estimation of one attribute of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canSplitUpClass(int attrType, int classType): Checks basic estimation of one attribute of the scheme, for simple non-troublesome datasets. |
protected boolean[] | canTakeOptions(): Checks whether the scheme can take command line options. |
protected void | compareDatasets(Instances data1, Instances data2): Compare two datasets to see if they differ. |
protected boolean[] | correctBuildInitialisation(CheckEstimator.AttrTypes attrTypes, int classType): Checks whether the scheme correctly initialises models when buildEstimator is called. |
protected boolean[] | datasetIntegrity(CheckEstimator.AttrTypes attrTypes, int classType, boolean attributeMissing, boolean classMissing): Checks whether the scheme alters the training dataset during training. |
void | doTests(): Begin the tests, reporting results to System.out |
boolean | getDebug(): Get whether debugging is turned on |
Estimator | getEstimator(): Get the estimator being examined |
protected double[] | getMinimumMaximum(Instances inst, int attrIndex): Gets the minimum and maximum of the values of the attribute at the given index in the given data set |
static int | getMinMax(Instances inst, int attrIndex, double[] minMax): Find the minimum and the maximum of the attribute and return them in the last parameter. |
int | getNumInstances(): Gets the current number of instances to use for the datasets. |
String[] | getOptions(): Gets the current settings of the CheckEstimator. |
CheckEstimator.PostProcessor | getPostProcessor(): returns the current PostProcessor, can be null |
String | getRevision(): Returns the revision string. |
boolean | getSilent(): Get whether silent mode is turned on |
boolean | hasClasspathProblems(): returns TRUE if the estimator returned a "not in classpath" Exception |
protected boolean[] | incrementalEstimator(): Checks whether the scheme can build models incrementally. |
protected boolean[] | incrementingEquality(CheckEstimator.AttrTypes attrTypes, int classType): Checks whether an incremental scheme produces the same model when trained incrementally as when batch trained. |
protected boolean[] | instanceWeights(CheckEstimator.AttrTypes attrTypes, int classType): Checks whether the estimator can handle instance weights. |
Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(String[] args): Test method for this class |
protected Instances | makeTestDataset(int seed, int numInstances, int numAttr, CheckEstimator.AttrTypes attrTypes, int numClasses, int classType): Make a simple set of instances, which can later be modified for use in specific tests. |
protected Instances | makeTestDataset(int seed, int numInstances, int numAttr, CheckEstimator.AttrTypes attrTypes, int numClasses, int classType, int classIndex): Make a simple set of instances with variable position of the class attribute, which can later be modified for use in specific tests. |
protected Vector | makeTestValueList(int seed, int numValues, double minValue, double maxValue, int attrType): Make a simple set of values. |
protected Vector | makeTestValueList(int seed, int numValues, Instances data, int attrIndex, int attrType): Make a simple set of values. |
protected void | print(Object msg): prints the given message to stdout, if not silent mode |
protected void | printAttributeSummary(CheckEstimator.AttrTypes attrTypes, int classType): Print out a short summary string for the dataset characteristics |
protected void | printAttributeSummary(int attrType, int classType): Print out a short summary string for the dataset characteristics |
protected void | println(): prints a LF to stdout, if not silent mode |
protected void | println(Object msg): prints the given message (+ LF) to stdout, if not silent mode |
protected Instances | process(Instances data): Provides a hook for derived classes to further modify the data. |
protected boolean[] | runBasicTest(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int missingLevel, boolean attributeMissing, boolean classMissing, int numTrain, int numTest, int numClasses, FastVector accepts): Runs a test on the datasets with the given characteristics. |
protected boolean[] | runBasicTest(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int classIndex, int missingLevel, boolean attributeMissing, boolean classMissing, int numTrain, int numTest, int numClasses, FastVector accepts): Runs a test on the datasets with the given characteristics. |
void | setDebug(boolean debug): Set debugging mode |
void | setEstimator(Estimator newEstimator): Set the estimator to be examined. |
void | setNumInstances(int value): Sets the number of instances to use in the datasets (some estimators might require more instances). |
void | setOptions(String[] options): Parses a given list of options. |
void | setPostProcessor(CheckEstimator.PostProcessor value): sets the PostProcessor to use |
void | setSilent(boolean value): Set silent mode, i.e., no output at all to stdout |
protected boolean[] | supervisedEstimator(): Checks whether the estimator is supervised. |
protected CheckEstimator.AttrTypes | testsPerClassType(int classType, CheckEstimator.EstTypes estTypes): Run a battery of tests for a given class attribute type |
protected Vector | testWithTestValues(Estimator est, Vector test): Test with test values. |
protected boolean[] | weightedInstancesHandler(): Checks whether the scheme says it can handle instance weights. |
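Among the checks summarized above, incrementingEquality asks whether a scheme trained one value at a time ends up with the same model as one trained on the whole batch. The idea can be sketched with a toy running-mean model. Everything here is an assumption made for illustration: `IncrementalEqualitySketch`, `RunningMean`, and the tolerance-based comparison are hypothetical and are not taken from the Weka source.

```java
import java.util.Arrays;

// Hypothetical illustration of an "incremental equals batch" check; not Weka code.
final class IncrementalEqualitySketch {

    /** Toy incremental model: a running mean updated one value at a time. */
    static final class RunningMean {
        private double sum;
        private int count;

        void addValue(double value) {
            sum += value;
            count++;
        }

        double mean() {
            return count == 0 ? Double.NaN : sum / count;
        }
    }

    /** Batch counterpart: computes the mean from the full array in one call. */
    static double batchMean(double[] values) {
        return values.length == 0 ? Double.NaN : Arrays.stream(values).sum() / values.length;
    }

    /** True if incremental and batch training agree within the given tolerance. */
    static boolean sameModel(double[] values, double tolerance) {
        RunningMean incremental = new RunningMean();
        for (double v : values) {
            incremental.addValue(v);
        }
        double a = incremental.mean();
        double b = batchMean(values);
        if (Double.isNaN(a) && Double.isNaN(b)) {
            return true; // both models are empty
        }
        return Math.abs(a - b) <= tolerance;
    }

    public static void main(String[] args) {
        double[] values = {1.5, 2.5, 4.0};
        System.out.println("models agree: " + sameModel(values, 1e-9));
    }
}
```

A tolerance is used rather than exact equality because the two code paths may add floating-point values in a different order.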
protected Estimator m_Estimator
protected String[] m_EstimatorOptions
protected String m_AnalysisResults
protected boolean m_Debug
protected boolean m_Silent
protected int m_NumInstances
protected CheckEstimator.PostProcessor m_PostProcessor
protected boolean m_ClasspathProblems
public Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
public void setOptions(String[] options) throws Exception
-D Turn on debugging output.
-S Silent mode - prints nothing to stdout.
-N <num> The number of instances in the datasets (default 100).
-W Full name of the estimator analysed. e.g.: weka.estimators.NormalEstimator
Options specific to estimator weka.estimators.NormalEstimator:
-D If set, estimator is run in debug mode and may output additional info to the console
Specified by: setOptions in interface OptionHandler
options - the list of options as an array of strings
Throws: Exception - if an option is not supported
public String[] getOptions()
Specified by: getOptions in interface OptionHandler
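The documented flags (-D, -S, -N <num>, -W) map naturally onto a small settings object that can be parsed from an option array and written back out, which is the round trip setOptions and getOptions describe. The sketch below is a stand-alone illustration under that assumption. `CheckerOptionsSketch` is a hypothetical class, and it throws IllegalArgumentException where the documented method declares a plain Exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical settings holder mirroring the documented flags; not Weka code.
final class CheckerOptionsSketch {
    boolean debug;            // -D
    boolean silent;           // -S
    int numInstances = 100;   // -N <num>, documented default is 100
    String estimatorName;     // -W <class name>

    /** Parses the documented flags; unknown flags are rejected. */
    static CheckerOptionsSketch parse(String[] options) {
        CheckerOptionsSketch parsed = new CheckerOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-D": parsed.debug = true; break;
                case "-S": parsed.silent = true; break;
                case "-N": parsed.numInstances = Integer.parseInt(valueAt(options, ++i, "-N")); break;
                case "-W": parsed.estimatorName = valueAt(options, ++i, "-W"); break;
                default: throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    private static String valueAt(String[] options, int index, String flag) {
        if (index >= options.length) {
            throw new IllegalArgumentException(flag + " requires a value");
        }
        return options[index];
    }

    /** Writes the current settings back out as an option array. */
    String[] toOptions() {
        List<String> out = new ArrayList<>();
        if (debug) out.add("-D");
        if (silent) out.add("-S");
        out.add("-N");
        out.add(Integer.toString(numInstances));
        if (estimatorName != null) {
            out.add("-W");
            out.add(estimatorName);
        }
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        CheckerOptionsSketch o = parse(new String[] {"-S", "-N", "50", "-W", "weka.estimators.NormalEstimator"});
        System.out.println(String.join(" ", o.toOptions()));
    }
}
```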
public void setPostProcessor(CheckEstimator.PostProcessor value)
value - the new PostProcessor
See also: m_PostProcessor
public CheckEstimator.PostProcessor getPostProcessor()
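The post-processor is optional (the getter may return null) and serves as a hook for modifying test data further. One way such a hook could be wired is sketched below, assuming the data passes through unchanged when no post-processor is set. The types are hypothetical stand-ins: rows are plain `double[]` arrays rather than Weka Instances, and `firstRowOnly` exists only for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; none of these types are the Weka classes themselves.
final class PostProcessorHookSketch {

    /** Stand-in for a post-processor: by default the data passes through unchanged. */
    static class PostProcessor {
        List<double[]> process(List<double[]> data) {
            return data;
        }
    }

    private PostProcessor postProcessor; // may be null

    void setPostProcessor(PostProcessor value) {
        postProcessor = value;
    }

    PostProcessor getPostProcessor() {
        return postProcessor;
    }

    /** Hook: hands the data to the post-processor if one is set, else returns it as-is. */
    List<double[]> process(List<double[]> data) {
        return postProcessor == null ? data : postProcessor.process(data);
    }

    /** Example post-processor that keeps only the first row. */
    static PostProcessor firstRowOnly() {
        return new PostProcessor() {
            @Override
            List<double[]> process(List<double[]> data) {
                return new ArrayList<>(data.subList(0, Math.min(1, data.size())));
            }
        };
    }

    public static void main(String[] args) {
        List<double[]> data = new ArrayList<>();
        data.add(new double[] {1.0, 2.0});
        data.add(new double[] {3.0, 4.0});

        PostProcessorHookSketch checker = new PostProcessorHookSketch();
        System.out.println("rows without post-processor: " + checker.process(data).size());
        checker.setPostProcessor(firstRowOnly());
        System.out.println("rows with post-processor:    " + checker.process(data).size());
    }
}
```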
public boolean hasClasspathProblems()
public void doTests()
public void setDebug(boolean debug)
debug - true if debug output should be printed
public boolean getDebug()
public void setSilent(boolean value)
value - whether silent mode is active or not
public boolean getSilent()
public void setNumInstances(int value)
value - the number of instances to use
public int getNumInstances()
public void setEstimator(Estimator newEstimator)
newEstimator - the Estimator to use.
public Estimator getEstimator()
protected void print(Object msg)
msg - the text to print to stdout
protected void println(Object msg)
msg - the message to println to stdout
protected void println()
protected CheckEstimator.AttrTypes testsPerClassType(int classType, CheckEstimator.EstTypes estTypes)
classType - the class type (NUMERIC, NOMINAL, etc.)
estTypes - the types the estimator has, such as incremental, weighted, supervised, etc.
protected boolean[] canTakeOptions()
protected boolean[] incrementalEstimator()
protected boolean[] weightedInstancesHandler()
protected boolean[] supervisedEstimator()
protected boolean[] canEstimate(CheckEstimator.AttrTypes attrTypes, boolean supervised, int classType)
attrTypes - the types the estimator can work with
classType - the class type (NOMINAL, NUMERIC, etc.)
protected void canSplitUpClass(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - the types the estimator can work with
classType - the class type (NOMINAL, NUMERIC, etc.)
protected boolean[] canSplitUpClass(int attrType, int classType)
attrType - the type of the estimator
classType - the class type (NOMINAL, NUMERIC, etc.)
protected boolean[] canHandleNClasses(CheckEstimator.AttrTypes attrTypes, int numClasses)
attrTypes - attribute types the estimator accepts
numClasses - the number of classes to test
protected boolean[] canHandleClassAsNthAttribute(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int classIndex)
attrTypes - the attribute types the estimator accepts
numAtts - the number of attributes
attrIndex - the index of the attribute
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class attribute (0-based, -1 means last attribute)
See also: TestInstances.CLASS_IS_LAST
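The classIndex convention described above (0-based, with -1 meaning the last attribute) can be made concrete with a small resolver. This is a sketch only: `ClassIndexSketch` is hypothetical, its `CLASS_IS_LAST` constant is a local copy of the documented -1 convention rather than a reference to the Weka constant, and it assumes `numAttributes` counts every column including the class.

```java
// Hypothetical helper illustrating the documented class-index convention; not Weka code.
final class ClassIndexSketch {

    /** Local mirror of the documented convention: -1 means "class is the last attribute". */
    static final int CLASS_IS_LAST = -1;

    /**
     * Resolves a 0-based class index for a dataset with numAttributes columns in total
     * (assumption: the count includes the class attribute).
     */
    static int resolveClassIndex(int classIndex, int numAttributes) {
        if (numAttributes <= 0) {
            throw new IllegalArgumentException("Dataset needs at least one attribute");
        }
        if (classIndex == CLASS_IS_LAST) {
            return numAttributes - 1;
        }
        if (classIndex < 0 || classIndex >= numAttributes) {
            throw new IllegalArgumentException("Class index out of range: " + classIndex);
        }
        return classIndex;
    }

    public static void main(String[] args) {
        System.out.println(resolveClassIndex(CLASS_IS_LAST, 4)); // last of four columns
        System.out.println(resolveClassIndex(0, 4));             // class as first column
    }
}
```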
protected boolean[] canHandleZeroTraining(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] correctBuildInitialisation(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] canHandleMissing(CheckEstimator.AttrTypes attrTypes, int classType, boolean attributeMissing, boolean classMissing, int missingLevel)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
attributeMissing - true if the missing values may be in the attributes
classMissing - true if the missing values may be in the class
missingLevel - the percentage of missing values
protected boolean[] incrementingEquality(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] instanceWeights(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
protected boolean[] datasetIntegrity(CheckEstimator.AttrTypes attrTypes, int classType, boolean attributeMissing, boolean classMissing)
attrTypes - attribute types that can be estimated
classType - the class type (NUMERIC, NOMINAL, etc.)
attributeMissing - true if we know the estimator can handle (at least) moderate missing attribute values
classMissing - true if we know the estimator can handle (at least) moderate missing class values
protected boolean[] runBasicTest(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int missingLevel, boolean attributeMissing, boolean classMissing, int numTrain, int numTest, int numClasses, FastVector accepts)
attrTypes - attribute types that can be estimated
numAtts - number of attributes
attrIndex - attribute index
classType - the class type (NUMERIC, NOMINAL, etc.)
missingLevel - the percentage of missing values
attributeMissing - true if the missing values may be in the attributes
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numTest - the number of instances in the test set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected boolean[] runBasicTest(CheckEstimator.AttrTypes attrTypes, int numAtts, int attrIndex, int classType, int classIndex, int missingLevel, boolean attributeMissing, boolean classMissing, int numTrain, int numTest, int numClasses, FastVector accepts)
attrTypes - attribute types that can be estimated
numAtts - number of attributes
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the attribute index of the class
missingLevel - the percentage of missing values
attributeMissing - true if the missing values may be in the attributes
classMissing - true if the missing values may be in the class
numTrain - the number of instances in the training set
numTest - the number of instances in the test set
numClasses - the number of classes
accepts - the acceptable string in an exception
protected void compareDatasets(Instances data1, Instances data2) throws Exception
data1 - one set of instances
data2 - the other set of instances
Throws: Exception - if the datasets differ
protected void addMissing(Instances data, int level, boolean attributeMissing, boolean classMissing, int attrIndex)
data - the instances to add missing values to
level - the level of missing values to add (if positive, this is the probability that a value will be set to missing, if negative all but one value will be set to missing (not yet implemented))
attributeMissing - if true, attributes will be modified
classMissing - if true, the class attribute will be modified
attrIndex - index of the attribute
protected Instances makeTestDataset(int seed, int numInstances, int numAttr, CheckEstimator.AttrTypes attrTypes, int numClasses, int classType) throws Exception
seed - the random number seed
numInstances - the number of instances to generate
numAttr - the number of attributes
attrTypes - the attribute types
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
Throws: Exception - if the dataset couldn't be generated
See also: process(Instances)
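Test datasets like the one generated here are the input to addMissing, whose positive `level` is documented above as the probability that a value is set to missing, given as a percentage elsewhere on this page. A minimal sketch of that behaviour follows. It makes several assumptions that are not in the documentation: rows are plain `double[]` arrays, NaN marks a missing value, the class column is passed explicitly as `classIndex`, a `seed` parameter is added for repeatability, and the return value counts the cells that were blanked. `AddMissingSketch` is a hypothetical name.

```java
import java.util.Random;

// Hypothetical illustration of probabilistic missing-value injection; not Weka code.
final class AddMissingSketch {

    /**
     * Sets eligible cells to NaN with probability level/100.
     * A cell is eligible if it is the attribute column and attributeMissing is true,
     * or the class column and classMissing is true.
     * Returns how many cells were set to missing.
     */
    static int addMissing(double[][] data, int level, boolean attributeMissing,
                          boolean classMissing, int attrIndex, int classIndex, long seed) {
        Random random = new Random(seed);
        int changed = 0;
        for (double[] row : data) {
            for (int j = 0; j < row.length; j++) {
                boolean eligible = (j == classIndex && classMissing)
                        || (j == attrIndex && attributeMissing);
                // nextInt(100) is uniform on 0..99, so level 0 never fires and 100 always does.
                if (eligible && random.nextInt(100) < level) {
                    row[j] = Double.NaN;
                    changed++;
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        double[][] data = {{1.0, 0.0}, {2.0, 1.0}, {3.0, 0.0}, {4.0, 1.0}};
        int changed = addMissing(data, 50, true, false, 0, 1, 1L);
        System.out.println("cells set to missing: " + changed);
    }
}
```

The negative-level mode mentioned in the parameter description is left out, since the documentation itself marks it as not yet implemented.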
protected Instances makeTestDataset(int seed, int numInstances, int numAttr, CheckEstimator.AttrTypes attrTypes, int numClasses, int classType, int classIndex) throws Exception
seed - the random number seed
numInstances - the number of instances to generate
numAttr - the number of attributes to generate
attrTypes - the type of attribute that is accepted
numClasses - the number of classes (if nominal class)
classType - the class type (NUMERIC, NOMINAL, etc.)
classIndex - the index of the class (0-based, -1 as last)
Throws: Exception - if the dataset couldn't be generated
See also: TestInstances.CLASS_IS_LAST, process(Instances)
protected Vector makeTestValueList(int seed, int numValues, Instances data, int attrIndex, int attrType) throws Exception
seed - the random number seed
numValues - the number of values to generate
data - the dataset to make test examples for
attrIndex - index of the attribute
attrType - the class type (NUMERIC, NOMINAL, etc.)
Throws: Exception - if the dataset couldn't be generated
See also: process(Instances)
protected Vector makeTestValueList(int seed, int numValues, double minValue, double maxValue, int attrType) throws Exception
seed - the random number seed
numValues - the number of values to generate
minValue - the minimal data value
maxValue - the maximal data value
attrType - the class type (NUMERIC, NOMINAL, etc.)
Throws: Exception - if the dataset couldn't be generated
See also: process(Instances)
protected Vector testWithTestValues(Estimator est, Vector test)
est - estimator to be tested
test - vector with test values
protected double[] getMinimumMaximum(Instances inst, int attrIndex)
inst - the instances
attrIndex - the index of the attribute to find min and max
public static int getMinMax(Instances inst, int attrIndex, double[] minMax) throws Exception
inst - instances used to build the estimator
attrIndex - index of the attribute
minMax - the array to return minimum and maximum in
Throws: Exception - if parameter minMax wasn't initialized properly
protected Instances process(Instances data)
data - the data to process
See also: m_PostProcessor
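getMinMax, documented above, returns its result through its last parameter and fails if that array was not initialized properly. The out-parameter pattern can be sketched on a plain array as follows. The assumptions are labeled in the code: NaN stands in for a missing value, the return value is taken to be the number of non-missing values (the documentation above does not say what the int means), and `MinMaxSketch` is a hypothetical class that throws IllegalArgumentException rather than a plain Exception.

```java
// Hypothetical illustration of the min/max out-parameter pattern; not Weka code.
final class MinMaxSketch {

    /**
     * Writes the minimum to minMax[0] and the maximum to minMax[1], skipping NaN
     * (assumed missing-value marker). Returns the number of non-missing values
     * (assumption; the documented return value is not described).
     */
    static int getMinMax(double[] values, double[] minMax) {
        if (minMax == null || minMax.length < 2) {
            throw new IllegalArgumentException("minMax must be an array of length >= 2");
        }
        double min = Double.NaN;
        double max = Double.NaN;
        int numNotMissing = 0;
        for (double v : values) {
            if (Double.isNaN(v)) {
                continue; // skip missing values
            }
            if (numNotMissing == 0) {
                min = v;
                max = v;
            } else {
                if (v < min) min = v;
                if (v > max) max = v;
            }
            numNotMissing++;
        }
        minMax[0] = min;
        minMax[1] = max;
        return numNotMissing;
    }

    public static void main(String[] args) {
        double[] minMax = new double[2];
        int n = getMinMax(new double[] {3.0, Double.NaN, -1.5, 7.0}, minMax);
        System.out.println(n + " values, min=" + minMax[0] + ", max=" + minMax[1]);
    }
}
```

If every value is missing, both slots stay NaN and the count is zero, so callers can tell an empty range from a real one.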
protected void printAttributeSummary(CheckEstimator.AttrTypes attrTypes, int classType)
attrTypes - the attribute types used (NUMERIC, NOMINAL, etc.)
classType - the class type (NUMERIC, NOMINAL, etc.)
protected void printAttributeSummary(int attrType, int classType)
attrType - the attribute type (NUMERIC, NOMINAL, etc.)
classType - the class type (NUMERIC, NOMINAL, etc.)
public String getRevision()
Specified by: getRevision in interface RevisionHandler
public static void main(String[] args)
args - the commandline parameters
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.