weka.filters.supervised.instance
Class SMOTE

java.lang.Object
  extended by weka.filters.Filter
      extended by weka.filters.supervised.instance.SMOTE
All Implemented Interfaces:
java.io.Serializable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, SupervisedFilter

public class SMOTE
extends Filter
implements SupervisedFilter, OptionHandler, TechnicalInformationHandler

Resamples a dataset by applying the Synthetic Minority Oversampling TEchnique (SMOTE). The original dataset must fit entirely in memory. The amount of oversampling (as a percentage of SMOTE instances to create) and the number of nearest neighbors may be specified. For more information, see

Nitesh V. Chawla et al. (2002). Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research. 16:321-357.

BibTeX:

 @article{al.2002,
    author = {Nitesh V. Chawla et al.},
    journal = {Journal of Artificial Intelligence Research},
    pages = {321-357},
    title = {Synthetic Minority Over-sampling Technique},
    volume = {16},
    year = {2002}
 }
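
The technique in the cited paper can be sketched in a few lines for purely numeric attributes: each synthetic instance is placed at a random position on the line segment between a minority instance and one of its k nearest minority neighbors. The class and method names below (SmoteSketch, nearestNeighbors, synthesize) are hypothetical and chosen for illustration; this is a minimal sketch of the general idea, not the code of this filter, and it ignores nominal attributes and the percentage option.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

/** Illustrative SMOTE sketch for numeric attributes only; not Weka's implementation. */
final class SmoteSketch {

    /** Squared Euclidean distance between two vectors of equal length. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Indices of the (at most) k nearest neighbors of minority[index], excluding itself. */
    static int[] nearestNeighbors(double[][] minority, int index, int k) {
        List<Integer> others = new ArrayList<>();
        for (int i = 0; i < minority.length; i++) {
            if (i != index) {
                others.add(i);
            }
        }
        // Sort the remaining instances by distance to the query instance.
        others.sort(Comparator.comparingDouble(
                (Integer i) -> squaredDistance(minority[index], minority[i])));
        int n = Math.min(k, others.size());
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            result[i] = others.get(i);
        }
        return result;
    }

    /** Creates one synthetic point per minority instance (roughly a 100% setting). */
    static double[][] synthesize(double[][] minority, int k, long seed) {
        if (minority.length < 2 || k < 1) {
            throw new IllegalArgumentException("need at least two instances and k >= 1");
        }
        Random random = new Random(seed);
        double[][] synthetic = new double[minority.length][];
        for (int i = 0; i < minority.length; i++) {
            int[] neighbors = nearestNeighbors(minority, i, k);
            double[] neighbor = minority[neighbors[random.nextInt(neighbors.length)]];
            double gap = random.nextDouble(); // interpolation factor in [0, 1)
            double[] point = new double[minority[i].length];
            for (int a = 0; a < point.length; a++) {
                point[a] = minority[i][a] + gap * (neighbor[a] - minority[i][a]);
            }
            synthetic[i] = point;
        }
        return synthetic;
    }
}
```

Because every synthetic point lies between two existing minority instances, the new data stays inside the region already occupied by the minority class. Using a fixed seed makes the output reproducible, which is the role the -S option plays for this filter.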
 

Valid options are:

 -S <num>
  Specifies the random number seed.
  (default 1)
 
 -P <percentage>
  Specifies the percentage of SMOTE instances to create.
  (default 100.0)
 
 -K <nearest-neighbors>
  Specifies the number of nearest neighbors to use.
  (default 5)
 
 -C <value-index>
  Specifies the index of the nominal class value to SMOTE.
  (default 0: auto-detect non-empty minority class)
 

Version:
$Revision: 4565 $
Author:
Ryan Lichtenwalter (rlichtenwalter@gmail.com)
See Also:
Serialized Form

Constructor Summary
SMOTE()
           
 
Method Summary
 boolean batchFinished()
          Signify that this batch of input to the filter is finished.
 java.lang.String classValueTipText()
          Returns the tip text for this property.
 Capabilities getCapabilities()
          Returns the Capabilities of this filter.
 java.lang.String getClassValue()
          Gets the index of the class value to which SMOTE should be applied.
 int getNearestNeighbors()
          Gets the number of nearest neighbors to use.
 java.lang.String[] getOptions()
          Gets the current settings of the filter.
 double getPercentage()
          Gets the percentage of SMOTE instances to create.
 int getRandomSeed()
          Gets the random number seed.
 java.lang.String getRevision()
          Returns the revision string.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 java.lang.String globalInfo()
          Returns a string describing this filter.
 boolean input(Instance instance)
          Input an instance for filtering.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for running this filter.
 java.lang.String nearestNeighborsTipText()
          Returns the tip text for this property.
 java.lang.String percentageTipText()
          Returns the tip text for this property.
 java.lang.String randomSeedTipText()
          Returns the tip text for this property.
 void setClassValue(java.lang.String value)
          Sets the index of the class value to which SMOTE should be applied.
 boolean setInputFormat(Instances instanceInfo)
          Sets the format of the input instances.
 void setNearestNeighbors(int value)
          Sets the number of nearest neighbors to use.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setPercentage(double value)
          Sets the percentage of SMOTE instances to create.
 void setRandomSeed(int value)
          Sets the random number seed.
 
Methods inherited from class weka.filters.Filter
batchFilterFile, filterFile, getCapabilities, getOutputFormat, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, numPendingOutput, output, outputPeek, toString, useFilter, wekaStaticWrapper
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SMOTE

public SMOTE()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this filter.

Returns:
a description of the filter suitable for displaying in the explorer/experimenter gui

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Returns:
the revision

getCapabilities

public Capabilities getCapabilities()
Returns the Capabilities of this filter.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Filter
Returns:
the capabilities of this object
See Also:
Capabilities

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options. Valid options are:

 -S <num>
  Specifies the random number seed.
  (default 1)
 
 -P <percentage>
  Specifies the percentage of SMOTE instances to create.
  (default 100.0)
 
 -K <nearest-neighbors>
  Specifies the number of nearest neighbors to use.
  (default 5)
 
 -C <value-index>
  Specifies the index of the nominal class value to SMOTE.
  (default 0: auto-detect non-empty minority class)
 

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
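
As an illustration of the array format that setOptions expects, the following sketch assembles the four documented flags into a String array. The helper class SmoteOptionsSketch and its buildOptions method are hypothetical and not part of Weka; only the flag names and defaults come from the option list above.

```java
/** Hypothetical helper that assembles the documented SMOTE flags; not part of Weka. */
final class SmoteOptionsSketch {

    /** Returns the flags in the order -S, -P, -K, -C, each followed by its value. */
    static String[] buildOptions(int seed, double percentage,
                                 int nearestNeighbors, int classValueIndex) {
        return new String[] {
            "-S", Integer.toString(seed),             // random number seed
            "-P", Double.toString(percentage),        // percentage of SMOTE instances
            "-K", Integer.toString(nearestNeighbors), // number of nearest neighbors
            "-C", Integer.toString(classValueIndex)   // 0 requests auto-detection
        };
    }
}
```

With the documented defaults, buildOptions(1, 100.0, 5, 0) yields the tokens -S 1 -P 100.0 -K 5 -C 0. With Weka on the classpath, such an array would be passed to setOptions on a SMOTE instance; each flag and its value must be separate array elements.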

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the filter.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

randomSeedTipText

public java.lang.String randomSeedTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getRandomSeed

public int getRandomSeed()
Gets the random number seed.

Returns:
the random number seed.

setRandomSeed

public void setRandomSeed(int value)
Sets the random number seed.

Parameters:
value - the new random number seed.

percentageTipText

public java.lang.String percentageTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setPercentage

public void setPercentage(double value)
Sets the percentage of SMOTE instances to create.

Parameters:
value - the percentage to use

getPercentage

public double getPercentage()
Gets the percentage of SMOTE instances to create.

Returns:
the percentage of SMOTE instances to create
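
To make the percentage concrete, the sketch below computes how many synthetic instances a given setting would produce, assuming the percentage is taken relative to the size of the minority class (so 100.0 creates one synthetic instance per minority instance). How a fractional multiple is handled is an assumption here: the sketch gives the fractional remainder to a subset of the minority instances and truncates the result. The class name SmotePercentageSketch is hypothetical.

```java
/** Illustrative arithmetic for the percentage option; the remainder rule is an assumption. */
final class SmotePercentageSketch {

    /** Number of synthetic instances for a minority class of the given size. */
    static int syntheticCount(int minorityCount, double percentage) {
        if (minorityCount < 0 || percentage < 0.0) {
            throw new IllegalArgumentException("count and percentage must be non-negative");
        }
        // Whole multiples of 100% apply to every minority instance.
        int perInstance = (int) Math.floor(percentage / 100.0);
        // The fractional part applies to a subset of the minority instances (assumed).
        double remainder = percentage / 100.0 - perInstance;
        int extra = (int) (remainder * minorityCount);
        return perInstance * minorityCount + extra;
    }
}
```

Under these assumptions, a minority class of 20 instances gains 20 synthetic instances at 100.0, 10 at 50.0, and 50 at 250.0.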

nearestNeighborsTipText

public java.lang.String nearestNeighborsTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setNearestNeighbors

public void setNearestNeighbors(int value)
Sets the number of nearest neighbors to use.

Parameters:
value - the number of nearest neighbors to use

getNearestNeighbors

public int getNearestNeighbors()
Gets the number of nearest neighbors to use.

Returns:
the number of nearest neighbors to use

classValueTipText

public java.lang.String classValueTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setClassValue

public void setClassValue(java.lang.String value)
Sets the index of the class value to which SMOTE should be applied.

Parameters:
value - the class value index

getClassValue

public java.lang.String getClassValue()
Gets the index of the class value to which SMOTE should be applied.

Returns:
the index of the class value to which SMOTE should be applied
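
The default class value of 0 is documented as auto-detecting the non-empty minority class. One plausible reading is sketched below: pick the class value with the fewest instances among those that have at least one. The sketch assumes class value indices are 1-based (since 0 is reserved for auto-detection) and breaks ties in favor of the first class; both points are assumptions, and MinorityClassSketch is a hypothetical name.

```java
/** Illustrative minority-class detection; index base and tie-breaking are assumptions. */
final class MinorityClassSketch {

    /**
     * Returns the 1-based index of the smallest non-empty class,
     * or 0 if every class is empty.
     */
    static int detectMinorityClass(int[] classCounts) {
        int best = 0; // 0 means "nothing found yet"
        for (int i = 0; i < classCounts.length; i++) {
            boolean nonEmpty = classCounts[i] > 0;
            if (nonEmpty && (best == 0 || classCounts[i] < classCounts[best - 1])) {
                best = i + 1;
            }
        }
        return best;
    }
}
```

For class counts of 50, 0, 7 and 12, the sketch selects the third class: the empty second class is skipped even though its count is lowest.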

setInputFormat

public boolean setInputFormat(Instances instanceInfo)
                       throws java.lang.Exception
Sets the format of the input instances.

Overrides:
setInputFormat in class Filter
Parameters:
instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored - only the structure is required).
Returns:
true if the outputFormat may be collected immediately
Throws:
java.lang.Exception - if the input format can't be set successfully

input

public boolean input(Instance instance)
Input an instance for filtering. This filter requires all training instances to be read before it produces any output.

Overrides:
input in class Filter
Parameters:
instance - the input instance
Returns:
true if the filtered instance may now be collected with output().
Throws:
java.lang.IllegalStateException - if no input structure has been defined

batchFinished

public boolean batchFinished()
                      throws java.lang.Exception
Signify that this batch of input to the filter is finished. If the filter requires all instances prior to filtering, output() may now be called to retrieve the filtered instances.

Overrides:
batchFinished in class Filter
Returns:
true if there are instances pending output
Throws:
java.lang.IllegalStateException - if no input structure has been defined
java.lang.Exception - if provided options cannot be executed on input instances
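
The calling sequence described for setInputFormat, input and batchFinished can be pictured with a toy batch filter that buffers everything and releases output only once the batch is finished. BatchFilterSketch is a hypothetical stand-in that works on strings and copies its input unchanged; it is not a Weka class and performs no resampling, but it follows the contract documented above, including the IllegalStateException when no input structure has been defined.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of a batch filter's calling sequence; not a Weka class. */
final class BatchFilterSketch {
    private boolean formatDefined = false;
    private final List<String> buffer = new ArrayList<>();
    private final Queue<String> pending = new ArrayDeque<>();

    /** Defines the input structure and resets any earlier state. */
    void setInputFormat() {
        formatDefined = true;
        buffer.clear();
        pending.clear();
    }

    /** Buffers one instance; returns false because nothing can be collected yet. */
    boolean input(String instance) {
        if (!formatDefined) {
            throw new IllegalStateException("No input instance format defined");
        }
        buffer.add(instance);
        return false;
    }

    /** Processes the whole batch; returns true if instances are pending output. */
    boolean batchFinished() {
        if (!formatDefined) {
            throw new IllegalStateException("No input instance format defined");
        }
        pending.addAll(buffer); // a real filter would add synthetic instances here
        buffer.clear();
        return !pending.isEmpty();
    }

    /** Returns the next pending instance, or null if none is available. */
    String output() {
        return pending.poll();
    }
}
```

A caller would invoke setInputFormat once, call input for every instance, call batchFinished, and then drain output until it returns null.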

main

public static void main(java.lang.String[] args)
Main method for running this filter.

Parameters:
args - should contain arguments to the filter: use -h for help