weka.classifiers.functions
Class PaceRegression

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.functions.PaceRegression
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

public class PaceRegression
extends Classifier
implements OptionHandler, WeightedInstancesHandler, TechnicalInformationHandler

Class for building pace regression linear models and using them for prediction.

Under regularity conditions, pace regression is provably optimal when the number of coefficients tends to infinity. It consists of a group of estimators that are either overall optimal or optimal under certain conditions.

The current state of pace regression theory, and therefore this implementation, does not handle:

- missing values
- non-binary nominal attributes
- the case where n - k is small, where n is the number of instances and k is the number of coefficients (the threshold used in this implementation is 20)
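
The sample-size restriction in the last item can be checked before training. The following is a minimal sketch in plain Java, assuming n and k are already known to the caller. The class and method names are hypothetical and not part of Weka, and treating a difference of exactly 20 as acceptable is an assumption, since the description only names the threshold.

```java
/** Hypothetical pre-check for the sample-size restriction; not part of Weka. */
final class PaceSampleSizeCheck {

    /** Threshold named in the class description. */
    static final int MIN_DIFFERENCE = 20;

    /**
     * Returns true if n - k reaches the threshold.
     * Assumption: a difference of exactly 20 counts as large enough.
     */
    static boolean hasEnoughInstances(int numInstances, int numCoefficients) {
        return numInstances - numCoefficients >= MIN_DIFFERENCE;
    }

    public static void main(String[] args) {
        System.out.println(hasEnoughInstances(100, 10)); // difference is 90
        System.out.println(hasEnoughInstances(25, 10));  // difference is 15
    }
}
```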

For more information see:

Wang, Y. (2000). A new approach to fitting linear models in high dimensional spaces. PhD thesis, Department of Computer Science, University of Waikato, Hamilton, New Zealand.

Wang, Y., Witten, I. H.: Modeling for optimal probability prediction. In: Proceedings of the Nineteenth International Conference in Machine Learning, Sydney, Australia, 650-657, 2002.

BibTeX:

 @phdthesis{Wang2000,
    address = {Hamilton, New Zealand},
    author = {Wang, Y},
    school = {Department of Computer Science, University of Waikato},
    title = {A new approach to fitting linear models in high dimensional spaces},
    year = {2000}
 }
 
 @inproceedings{Wang2002,
    address = {Sydney, Australia},
    author = {Wang, Y. and Witten, I. H.},
    booktitle = {Proceedings of the Nineteenth International Conference in Machine Learning},
    pages = {650-657},
    title = {Modeling for optimal probability prediction},
    year = {2002}
 }
 

Valid options are:

 -D
  Produce debugging output.
  (default no debugging output)
 -E <estimator>
  The estimator can be one of the following:
   eb -- Empirical Bayes estimator for normal mixture (default)
   nested -- Optimal nested model selector for normal mixture
   subset -- Optimal subset selector for normal mixture
   pace2 -- PACE2 for Chi-square mixture
   pace4 -- PACE4 for Chi-square mixture
   pace6 -- PACE6 for Chi-square mixture
 
   ols -- Ordinary least squares estimator
   aic -- AIC estimator
   bic -- BIC estimator
   ric -- RIC estimator
   olsc -- Ordinary least squares subset selector with a threshold
 -S <threshold value>
  Threshold value for the OLSC estimator
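
As an illustration of the option format listed above, the following sketch assembles an option array in plain Java. The helper class and its method are hypothetical and not part of Weka. Emitting -S only together with olsc is an assumption based on the -S description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that assembles an option array; not part of Weka. */
final class PaceOptionsSketch {

    /** Estimator names copied from the -E option description. */
    static final List<String> ESTIMATORS = Arrays.asList(
        "eb", "nested", "subset", "pace2", "pace4", "pace6",
        "ols", "aic", "bic", "ric", "olsc");

    /**
     * Builds an array of option strings for the given settings.
     * Assumption: the -S threshold is only meaningful for olsc,
     * so it is emitted only for that estimator.
     */
    static String[] buildOptions(String estimator, double threshold, boolean debug) {
        if (!ESTIMATORS.contains(estimator)) {
            throw new IllegalArgumentException("Unknown estimator: " + estimator);
        }
        List<String> options = new ArrayList<>();
        if (debug) {
            options.add("-D");
        }
        options.add("-E");
        options.add(estimator);
        if (estimator.equals("olsc")) {
            options.add("-S");
            options.add(Double.toString(threshold));
        }
        return options.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buildOptions("olsc", 2.5, true)));
    }
}
```

An array built this way has the shape that setOptions(java.lang.String[]) accepts.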

Version:
$Revision: 1.9 $
Author:
Yong Wang (yongwang@cs.waikato.ac.nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz)
See Also:
Serialized Form

Field Summary
static Tag[] TAGS_ESTIMATOR
          estimator types
 
Constructor Summary
PaceRegression()
           
 
Method Summary
 void buildClassifier(Instances data)
          Builds a pace regression model for the given data.
 boolean checkForMissing(Instance instance, Instances model)
          Checks if an instance has a missing value.
 double classifyInstance(Instance instance)
          Classifies the given instance using the linear regression function.
 double[] coefficients()
          Returns the coefficients for this linear model.
 java.lang.String debugTipText()
          Returns the tip text for this property
 java.lang.String estimatorTipText()
          Returns the tip text for this property
 Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 boolean getDebug()
          Returns whether debugging output will be printed
 SelectedTag getEstimator()
          Gets the estimator
 java.lang.String[] getOptions()
          Gets the current settings of the classifier.
 java.lang.String getRevision()
          Returns the revision string.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 double getThreshold()
          Gets the threshold for olsc estimator
 java.lang.String globalInfo()
          Returns a string describing this classifier
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Generates a linear regression function predictor.
 int numParameters()
          Get the number of coefficients used in the model
 void setDebug(boolean debug)
          Controls whether debugging output will be printed
 void setEstimator(SelectedTag estimator)
          Sets the estimator.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setThreshold(double newThreshold)
          Set threshold for the olsc estimator
 java.lang.String thresholdTipText()
          Returns the tip text for this property
 java.lang.String toString()
          Outputs the linear regression model as a string.
 
Methods inherited from class weka.classifiers.Classifier
distributionForInstance, forName, makeCopies, makeCopy
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

TAGS_ESTIMATOR

public static final Tag[] TAGS_ESTIMATOR
estimator types

Constructor Detail

PaceRegression

public PaceRegression()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this classifier

Returns:
a description of the classifier suitable for displaying in the explorer/experimenter gui

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

getCapabilities

public Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class Classifier
Returns:
the capabilities of this classifier
See Also:
Capabilities

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
Builds a pace regression model for the given data.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the linear regression function
Throws:
java.lang.Exception - if the classifier could not be built successfully

checkForMissing

public boolean checkForMissing(Instance instance,
                               Instances model)
Checks if an instance has a missing value.

Parameters:
instance - the instance
model - the data
Returns:
true if missing value is present
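
The idea behind such a check can be sketched in plain Java. This is not the Weka implementation: the class name is hypothetical, the instance is reduced to a bare double array, and encoding a missing value as NaN is an assumption of the sketch.

```java
/** Hypothetical stand-alone version of a missing-value scan; not the Weka code. */
final class MissingValueSketch {

    /**
     * Returns true if any attribute value is missing.
     * Assumption: a missing value is encoded as Double.NaN.
     */
    static boolean hasMissing(double[] attributeValues) {
        for (double value : attributeValues) {
            if (Double.isNaN(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasMissing(new double[] {1.0, Double.NaN, 3.0}));
        System.out.println(hasMissing(new double[] {1.0, 2.0, 3.0}));
    }
}
```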

classifyInstance

public double classifyInstance(Instance instance)
                        throws java.lang.Exception
Classifies the given instance using the linear regression function.

Overrides:
classifyInstance in class Classifier
Parameters:
instance - the test instance
Returns:
the classification
Throws:
java.lang.Exception - if classification can't be done successfully
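
Applying a linear regression function amounts to an intercept plus a weighted sum of attribute values. The sketch below shows that computation in plain Java. The class and parameter names are hypothetical, and passing the intercept separately is an assumption of the sketch: it says nothing about how coefficients() orders its array.

```java
/** Hypothetical stand-alone linear prediction; not the Weka code. */
final class LinearPredictionSketch {

    /**
     * Computes intercept + sum of weights[i] * x[i].
     * Assumption: weights and x are aligned, one entry per attribute.
     */
    static double predict(double intercept, double[] weights, double[] x) {
        if (weights.length != x.length) {
            throw new IllegalArgumentException("weights and x must have the same length");
        }
        double sum = intercept;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // 1.0 + 2.0 * 4.0 + 3.0 * 5.0
        System.out.println(predict(1.0, new double[] {2.0, 3.0}, new double[] {4.0, 5.0}));
    }
}
```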

toString

public java.lang.String toString()
Outputs the linear regression model as a string.

Overrides:
toString in class java.lang.Object
Returns:
the model as string

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class Classifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -D
  Produce debugging output.
  (default no debugging output)
 -E <estimator>
  The estimator can be one of the following:
   eb -- Empirical Bayes estimator for normal mixture (default)
   nested -- Optimal nested model selector for normal mixture
   subset -- Optimal subset selector for normal mixture
   pace2 -- PACE2 for Chi-square mixture
   pace4 -- PACE4 for Chi-square mixture
   pace6 -- PACE6 for Chi-square mixture
 
   ols -- Ordinary least squares estimator
   aic -- AIC estimator
   bic -- BIC estimator
   ric -- RIC estimator
   olsc -- Ordinary least squares subset selector with a threshold
 -S <threshold value>
  Threshold value for the OLSC estimator

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class Classifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

coefficients

public double[] coefficients()
Returns the coefficients for this linear model.

Returns:
the coefficients for this linear model

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class Classifier
Returns:
an array of strings suitable for passing to setOptions

numParameters

public int numParameters()
Get the number of coefficients used in the model

Returns:
the number of coefficients

debugTipText

public java.lang.String debugTipText()
Returns the tip text for this property

Overrides:
debugTipText in class Classifier
Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setDebug

public void setDebug(boolean debug)
Controls whether debugging output will be printed

Overrides:
setDebug in class Classifier
Parameters:
debug - true if debugging output should be printed

getDebug

public boolean getDebug()
Returns whether debugging output will be printed

Overrides:
getDebug in class Classifier
Returns:
true if debugging output should be printed

estimatorTipText

public java.lang.String estimatorTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getEstimator

public SelectedTag getEstimator()
Gets the estimator

Returns:
the estimator

setEstimator

public void setEstimator(SelectedTag estimator)
Sets the estimator.

Parameters:
estimator - the new estimator

thresholdTipText

public java.lang.String thresholdTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setThreshold

public void setThreshold(double newThreshold)
Set threshold for the olsc estimator

Parameters:
newThreshold - the threshold for the olsc estimator

getThreshold

public double getThreshold()
Gets the threshold for olsc estimator

Returns:
the threshold
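
To illustrate what a subset selector with a threshold does in general, the sketch below keeps every coefficient whose test statistic exceeds a threshold. It is plain Java and is not taken from the Weka source. The class name, the statistics array, and the strict comparison are all assumptions. The thresholdFor helper uses the textbook per-parameter penalties for AIC, BIC and RIC; that this class applies them in the same way is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of threshold-based selection; not taken from the Weka source. */
final class OlscThresholdSketch {

    /**
     * Textbook per-parameter penalties: AIC = 2, BIC = ln(n), RIC = 2 ln(k),
     * with n instances and k candidate coefficients.
     * Assumption: these are comparable to a value passed to setThreshold.
     */
    static double thresholdFor(String criterion, int n, int k) {
        switch (criterion) {
            case "aic": return 2.0;
            case "bic": return Math.log(n);
            case "ric": return 2.0 * Math.log(k);
            default: throw new IllegalArgumentException("Unknown criterion: " + criterion);
        }
    }

    /**
     * Returns the indices whose statistic is strictly above the threshold.
     * Assumption: the comparison is strict.
     */
    static List<Integer> select(double[] statistics, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int j = 0; j < statistics.length; j++) {
            if (statistics[j] > threshold) {
                kept.add(j);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] statistics = {0.5, 3.0, 2.0, 10.0};
        System.out.println(select(statistics, thresholdFor("aic", 100, 4)));
    }
}
```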

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Returns:
the revision

main

public static void main(java.lang.String[] argv)
Generates a linear regression function predictor.

Parameters:
argv - the options