public class NumericCleaner extends SimpleStreamFilter
-D Turns on output of debugging information.
-min <double> The minimum threshold. (default -Double.MAX_VALUE)
-min-default <double> The replacement for values smaller than the minimum threshold. (default -Double.MAX_VALUE)
-max <double> The maximum threshold. (default Double.MAX_VALUE)
-max-default <double> The replacement for values larger than the maximum threshold. (default Double.MAX_VALUE)
-closeto <double> The number that values are checked against for closeness. (default 0)
-closeto-default <double> The replacement for values that are close to '-closeto'. (default 0)
-closeto-tolerance <double> The tolerance below which numbers are considered close to each other. (default 1E-6)
-decimals <int> The number of decimals to round to, -1 means no rounding at all. (default -1)
-R <col1,col2,...> The list of columns to cleanse, e.g., first-last or first-3,5-last. (default first-last)
-V Inverts the matching sense.
-include-class Whether to include the class in the cleansing. The class column will always be skipped, if this flag is not present. (default no)
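The threshold, close-to, and rounding options above can be read as a per-value rule. The following is a minimal plain-Java sketch of that rule, not the Weka implementation: the class name NumericCleanerSketch, its field names, and the order of the checks (minimum, then maximum, then close-to, then rounding) are assumptions made for illustration, as is treating NaN as a missing value that is left untouched.

```java
// Illustrative sketch only: plain Java, no Weka dependency.
// Class name, field names, and check order are assumptions.
class NumericCleanerSketch {
    double min = -Double.MAX_VALUE;        // -min
    double minDefault = -Double.MAX_VALUE; // -min-default
    double max = Double.MAX_VALUE;         // -max
    double maxDefault = Double.MAX_VALUE;  // -max-default
    double closeTo = 0;                    // -closeto
    double closeToDefault = 0;             // -closeto-default
    double closeToTolerance = 1E-6;        // -closeto-tolerance
    int decimals = -1;                     // -decimals (-1 = no rounding)

    /** Applies the threshold, close-to, and rounding rules to one value. */
    double clean(double value) {
        // Assumption: NaN stands for a missing value and is not cleaned.
        if (Double.isNaN(value)) {
            return value;
        }
        double result = value;
        if (result < min) {
            result = minDefault;           // below the minimum threshold
        } else if (result > max) {
            result = maxDefault;           // above the maximum threshold
        } else if (Math.abs(result - closeTo) < closeToTolerance) {
            result = closeToDefault;       // within tolerance of -closeto
        }
        if (decimals > -1) {
            double factor = Math.pow(10, decimals);
            result = Math.round(result * factor) / factor;
        }
        return result;
    }
}
```

With min and max set to 0 and 100 (and the same values as their defaults), a value of -5 would be replaced by 0 and a value of 250 by 100. Whether the real filter applies the checks in exactly this order is not stated on this page.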
Modifier and Type | Field and Description |
---|---|
protected double | m_CloseTo - the number the values are checked for closeness to |
protected double | m_CloseToDefault - the default replacement value for numbers "close-to" |
protected double | m_CloseToTolerance - the tolerance distance, below which numbers are considered being "close-to" |
protected Range | m_Cols - Stores which columns to cleanse |
protected int | m_Decimals - the number of decimals to round to (-1 means no rounding) |
protected boolean | m_IncludeClass - whether to include the class attribute |
protected double | m_MaxDefault - the maximum default replacement value |
protected double | m_MaxThreshold - the maximum threshold |
protected double | m_MinDefault - the minimum default replacement value |
protected double | m_MinThreshold - the minimum threshold |
m_Debug
m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts
Constructor and Description |
---|
NumericCleaner() |
Modifier and Type | Method and Description |
---|---|
String | attributeIndicesTipText() - Returns the tip text for this property |
String | closeToDefaultTipText() - Returns the tip text for this property |
String | closeToTipText() - Returns the tip text for this property |
String | closeToToleranceTipText() - Returns the tip text for this property |
String | decimalsTipText() - Returns the tip text for this property |
protected Instances | determineOutputFormat(Instances inputFormat) - Determines the output format based on the input format and returns this. |
String | getAttributeIndices() - Gets the selection of the columns, e.g., first-last or first-3,5-last |
Capabilities | getCapabilities() - Returns the Capabilities of this filter. |
double | getCloseTo() - Get the "close to" number. |
double | getCloseToDefault() - Get the "close to" default. |
double | getCloseToTolerance() - Get the "close to" Tolerance. |
int | getDecimals() - Get the number of decimals to round to. |
boolean | getIncludeClass() - Gets whether the class is included in the cleaning process or always skipped. |
boolean | getInvertSelection() - Gets whether the selection of the columns is inverted |
double | getMaxDefault() - Get the maximum default. |
double | getMaxThreshold() - Get the maximum threshold. |
double | getMinDefault() - Get the minimum default. |
double | getMinThreshold() - Get the minimum threshold. |
String[] | getOptions() - Gets the current settings of the filter. |
String | getRevision() - Returns the revision string. |
String | globalInfo() - Returns a string describing this filter. |
String | includeClassTipText() - Returns the tip text for this property |
String | invertSelectionTipText() - Returns the tip text for this property |
Enumeration | listOptions() - Returns an enumeration describing the available options. |
static void | main(String[] args) - Runs the filter from the command line; use "-h" to see all options. |
String | maxDefaultTipText() - Returns the tip text for this property |
String | maxThresholdTipText() - Returns the tip text for this property |
String | minDefaultTipText() - Returns the tip text for this property |
String | minThresholdTipText() - Returns the tip text for this property |
protected Instance | process(Instance instance) - Processes the given instance (may change the provided instance) and returns the modified version. |
void | setAttributeIndices(String value) - Sets the columns to use, e.g., first-last or first-3,5-last |
void | setCloseTo(double value) - Set the "close to" number. |
void | setCloseToDefault(double value) - Set the "close to" default. |
void | setCloseToTolerance(double value) - Set the "close to" Tolerance. |
void | setDecimals(int value) - Set the number of decimals to round to. |
void | setIncludeClass(boolean value) - Sets whether the class can be cleaned, too. |
void | setInvertSelection(boolean value) - Sets whether the selection of the indices is inverted or not |
void | setMaxDefault(double value) - Set the maximum default. |
void | setMaxThreshold(double value) - Set the maximum threshold. |
void | setMinDefault(double value) - Set the minimum default. |
void | setMinThreshold(double value) - Set the minimum threshold. |
void | setOptions(String[] options) - Parses a given list of options. |
batchFinished, hasImmediateOutputFormat, input, preprocess, process
debugTipText, getDebug, reset, setDebug, setInputFormat
batchFilterFile, bufferInput, copyValues, copyValues, filterFile, flushInput, getCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputFormatPeek, outputPeek, push, resetQueue, runFilter, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper
protected double m_MinThreshold
protected double m_MinDefault
protected double m_MaxThreshold
protected double m_MaxDefault
protected double m_CloseTo
protected double m_CloseToDefault
protected double m_CloseToTolerance
protected Range m_Cols
protected boolean m_IncludeClass
protected int m_Decimals
public String globalInfo()
globalInfo
in class SimpleFilter
public Enumeration listOptions()
listOptions
in interface OptionHandler
listOptions
in class SimpleFilter
public String[] getOptions()
getOptions
in interface OptionHandler
getOptions
in class SimpleFilter
public void setOptions(String[] options) throws Exception
-D Turns on output of debugging information.
-min <double> The minimum threshold. (default -Double.MAX_VALUE)
-min-default <double> The replacement for values smaller than the minimum threshold. (default -Double.MAX_VALUE)
-max <double> The maximum threshold. (default Double.MAX_VALUE)
-max-default <double> The replacement for values larger than the maximum threshold. (default Double.MAX_VALUE)
-closeto <double> The number that values are checked against for closeness. (default 0)
-closeto-default <double> The replacement for values that are close to '-closeto'. (default 0)
-closeto-tolerance <double> The tolerance below which numbers are considered close to each other. (default 1E-6)
-decimals <int> The number of decimals to round to, -1 means no rounding at all. (default -1)
-R <col1,col2,...> The list of columns to cleanse, e.g., first-last or first-3,5-last. (default first-last)
-V Inverts the matching sense.
-include-class Whether to include the class in the cleansing. The class column will always be skipped, if this flag is not present. (default no)
setOptions
in interface OptionHandler
setOptions
in class SimpleFilter
options
- the list of options as an array of strings
Exception
- if an option is not supported
SimpleFilter.reset()
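The -R and -V options describe column selections such as first-3,5-last and their inversion. As a rough illustration of how such a specification could be resolved to column positions, here is a plain-Java sketch. It is not the Weka Range class: the name ColumnRangeSketch and both methods are hypothetical, and it assumes 1-based column numbers in the string and 0-based indices in the result.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a hypothetical helper, not Weka's Range class.
class ColumnRangeSketch {
    /** Resolves "first", "last", or a 1-based number to a 0-based index. */
    static int index(String token, int numCols) {
        token = token.trim();
        if (token.equals("first")) {
            return 0;
        }
        if (token.equals("last")) {
            return numCols - 1;
        }
        return Integer.parseInt(token) - 1;
    }

    /** Returns the selected 0-based column indices; invert mimics -V. */
    static List<Integer> select(String spec, int numCols, boolean invert) {
        boolean[] picked = new boolean[numCols];
        for (String part : spec.split(",")) {
            // "a-b" is a range; a single token is a range of length one.
            String[] ends = part.split("-");
            int from = index(ends[0], numCols);
            int to = index(ends[ends.length - 1], numCols);
            for (int i = Math.max(from, 0); i <= to && i < numCols; i++) {
                picked[i] = true;
            }
        }
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < numCols; i++) {
            if (picked[i] != invert) {
                result.add(i);
            }
        }
        return result;
    }
}
```

Under these assumptions, select("first-3,5-last", 6, false) covers every column except the fourth, and passing true for invert yields only the fourth.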
public Capabilities getCapabilities()
getCapabilities
in interface CapabilitiesHandler
getCapabilities
in class Filter
Capabilities
protected Instances determineOutputFormat(Instances inputFormat) throws Exception
determineOutputFormat
in class SimpleStreamFilter
inputFormat
- the input format to base the output format on
Exception
- in case the determination goes wrong
SimpleStreamFilter.hasImmediateOutputFormat(), SimpleStreamFilter.batchFinished()
protected Instance process(Instance instance) throws Exception
process
in class SimpleStreamFilter
instance
- the instance to process
Exception
- in case the processing goes wrong
public String minThresholdTipText()
public double getMinThreshold()
public void setMinThreshold(double value)
value
- the minimum threshold to use.
public String minDefaultTipText()
public double getMinDefault()
public void setMinDefault(double value)
value
- the minimum default to use.
public String maxThresholdTipText()
public double getMaxThreshold()
public void setMaxThreshold(double value)
value
- the maximum threshold to use.
public String maxDefaultTipText()
public double getMaxDefault()
public void setMaxDefault(double value)
value
- the maximum default to use.
public String closeToTipText()
public double getCloseTo()
public void setCloseTo(double value)
value
- the number to use for checking closeness.
public String closeToDefaultTipText()
public double getCloseToDefault()
public void setCloseToDefault(double value)
value
- the "close to" default to use.
public String closeToToleranceTipText()
public double getCloseToTolerance()
public void setCloseToTolerance(double value)
value
- the "close to" Tolerance to use.
public String attributeIndicesTipText()
public String getAttributeIndices()
public void setAttributeIndices(String value)
value
- the columns to use
public String invertSelectionTipText()
public boolean getInvertSelection()
public void setInvertSelection(boolean value)
value
- the new invert setting
public String includeClassTipText()
public boolean getIncludeClass()
public void setIncludeClass(boolean value)
value
- true if the class can be cleansed, too
public String decimalsTipText()
public int getDecimals()
public void setDecimals(int value)
value
- the number of decimals.
public String getRevision()
getRevision
in interface RevisionHandler
getRevision
in class Filter
public static void main(String[] args)
args
- the command-line options for the filter
Copyright © 2015 University of Waikato, Hamilton, NZ. All rights reserved.