Package mdp :: Package nodes :: Class PCANode

Class PCANode


Filter the input data through the most significant of its
principal components.

**Internal variables of interest**

  ``self.avg``
      Mean of the input data (available after training).

  ``self.v``
      Transpose of the projection matrix (available after training).

  ``self.d``
      Variance corresponding to the PCA components (eigenvalues of the
      covariance matrix).

  ``self.explained_variance``
      When output_dim has been specified as a fraction of the total
      variance, this is the fraction of the total variance that is
      actually explained.

More information about Principal Component Analysis, a.k.a. the discrete
Karhunen-Loeve transform, can be found in, among others,
I.T. Jolliffe, Principal Component Analysis, Springer-Verlag (1986).
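
To make the roles of the internal variables concrete, here is a minimal NumPy-only sketch of the quantities they describe. It is not MDP's implementation: the helper name `fit_pca`, the storage of eigenvectors as columns of `v`, and the `N - 1` covariance normalization are assumptions made for this illustration.

```python
import numpy as np

def fit_pca(x):
    """Return (avg, v, d) for data x of shape (observations, variables).

    Hypothetical helper for illustration only, not part of MDP.
    """
    avg = x.mean(axis=0)                    # plays the role of self.avg
    cov = np.cov(x - avg, rowvar=False)     # covariance of the variables
    d, v = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(d)[::-1]             # reorder by decreasing variance
    return avg, v[:, order], d[order]       # v: eigenvectors as columns

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 3)) * np.array([3.0, 1.0, 0.1])
avg, v, d = fit_pca(x)
```

In this sketch `d` holds the eigenvalues of the covariance matrix in decreasing order, and the columns of `v` are the matching orthonormal directions.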

Instance Methods
 
__init__(self, input_dim=None, output_dim=None, dtype=None, svd=False, reduce=False, var_rel=1e-12, var_abs=1e-15, var_part=None)
The number of principal components to be kept can be specified as 'output_dim' directly or by the fraction of variance to be explained.
 
_adjust_output_dim(self)
Return the eigenvector range and set the output dim if required.
 
_check_output(self, y)
 
_execute(self, x, n=None)
Project the input on the first 'n' principal components.
 
_inverse(self, y, n=None)
Project 'y' to the input space using the first 'n' components.
 
_set_output_dim(self, n)
 
_stop_training(self, debug=False)
Stop the training phase.
 
_train(self, x)
 
execute(self, x, n=None)
Project the input on the first 'n' principal components.
 
get_explained_variance(self)
Return the fraction of the original variance that can be explained by self._output_dim PCA components.
 
get_projmatrix(self, transposed=1)
Return the projection matrix.
 
get_recmatrix(self, transposed=1)
Return the back-projection matrix (i.e. the reconstruction matrix).
 
inverse(self, y, n=None)
Project 'y' to the input space using the first 'n' components.
 
stop_training(self, debug=False)
Stop the training phase.
 
train(self, x)
Update the internal structures according to the input data `x`.

Inherited from unreachable.newobject: __long__, __native__, __nonzero__, __unicode__, next

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __sizeof__, __subclasshook__

    Inherited from Node
 
__add__(self, other)
 
__call__(self, x, *args, **kwargs)
Calling an instance of `Node` is equivalent to calling its `execute` method.
 
__repr__(self)
repr(x)
 
__str__(self)
str(x)
 
_check_input(self, x)
 
_check_train_args(self, x, *args, **kwargs)
 
_get_supported_dtypes(self)
Return the list of dtypes supported by this node.
 
_get_train_seq(self)
 
_if_training_stop_training(self)
 
_pre_execution_checks(self, x)
This method contains all pre-execution checks.
 
_pre_inversion_checks(self, y)
This method contains all pre-inversion checks.
 
_refcast(self, x)
Helper function to cast arrays to the internal dtype.
 
_set_dtype(self, t)
 
_set_input_dim(self, n)
 
copy(self, protocol=None)
Return a deep copy of the node.
 
get_current_train_phase(self)
Return the index of the current training phase.
 
get_dtype(self)
Return dtype.
 
get_input_dim(self)
Return input dimensions.
 
get_output_dim(self)
Return output dimensions.
 
get_remaining_train_phase(self)
Return the number of training phases still to accomplish.
 
get_supported_dtypes(self)
Return dtypes supported by the node as a list of :numpy:`dtype` objects.
 
has_multiple_training_phases(self)
Return True if the node has multiple training phases.
 
is_training(self)
Return True if the node is in the training phase, False otherwise.
 
save(self, filename, protocol=-1)
Save a pickled serialization of the node to `filename`.
 
set_dtype(self, t)
Set internal structures' dtype.
 
set_input_dim(self, n)
Set input dimensions.
 
set_output_dim(self, n)
Set output dimensions.
Static Methods
    Inherited from Node
 
is_invertible()
Return True if the node can be inverted, False otherwise.
 
is_trainable()
Return True if the node can be trained, False otherwise.
Properties

Inherited from object: __class__

    Inherited from Node
  _train_seq
List of tuples::
  dtype
dtype
  input_dim
Input dimensions
  output_dim
Output dimensions
  supported_dtypes
Supported dtypes
Method Details

__init__(self, input_dim=None, output_dim=None, dtype=None, svd=False, reduce=False, var_rel=1e-12, var_abs=1e-15, var_part=None)
(Constructor)

 
The number of principal components to be kept can be specified as
'output_dim' directly (e.g. 'output_dim=10' means 10 components
are kept) or by the fraction of variance to be explained
(e.g. 'output_dim=0.95' means that as many components as necessary
will be kept in order to explain 95% of the input variance).
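
The fractional form of 'output_dim' can be pictured as a cumulative sum over the sorted eigenvalues. The following is a sketch under assumptions, not MDP's code: `components_for_fraction` is a hypothetical helper, and the eigenvalues are made up for the example.

```python
import numpy as np

def components_for_fraction(d, fraction):
    """Smallest number of leading components whose cumulative variance
    share reaches `fraction`, plus the share actually explained.

    d: eigenvalues sorted in decreasing order (hypothetical helper).
    """
    ratios = np.cumsum(d) / np.sum(d)
    k = int(np.searchsorted(ratios, fraction)) + 1
    k = min(k, len(d))                      # guard against rounding at 1.0
    return k, float(ratios[k - 1])

d = np.array([6.0, 3.0, 0.7, 0.2, 0.1])     # made-up eigenvalues
k, explained = components_for_fraction(d, 0.95)
```

Because components are added whole, the explained fraction usually ends up somewhat above the requested one, which is why a request for 0.95 can report a slightly larger value.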

Other Keyword Arguments:

svd -- if True, use Singular Value Decomposition instead of the
       standard eigenvalue problem solver. Use it when PCANode
       complains about singular covariance matrices.

reduce -- Keep only those principal components which have a variance
          larger than 'var_abs' and a variance relative to the
          first principal component larger than 'var_rel' and a
          variance relative to total variance larger than 'var_part'
          (set var_part to None or 0 for no filtering).
          Note: when the 'reduce' switch is enabled, the actual number
          of principal components (self.output_dim) may be different
          from that set when creating the instance.
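
One way to read the three 'reduce' thresholds is as a conjunction of masks over the sorted eigenvalues. The sketch below is an interpretation of the description above, not MDP's implementation; `reduce_components` is a hypothetical helper and the eigenvalues are invented.

```python
import numpy as np

def reduce_components(d, var_abs=1e-15, var_rel=1e-12, var_part=None):
    """Boolean mask of components to keep (hypothetical helper).

    d: eigenvalues sorted in decreasing order.
    """
    d = np.asarray(d, dtype=float)
    keep = d > var_abs                      # absolute variance threshold
    keep &= (d / d[0]) > var_rel            # relative to first component
    if var_part:                            # None or 0 disables this filter
        keep &= (d / d.sum()) > var_part    # relative to total variance
    return keep

d = np.array([4.0, 1.0, 1e-3, 1e-14, 1e-18])   # made-up eigenvalues
mask_default = reduce_components(d)
mask_part = reduce_components(d, var_part=0.01)
```

With the default thresholds the two near-zero components are dropped; adding `var_part=0.01` also drops the third, so the number of components kept depends on the data rather than on the requested 'output_dim'.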

Overrides: object.__init__

_adjust_output_dim(self)

 
Return the eigenvector range and set the output dim if required.

This is used if the output dimension is smaller than the input
dimension (so only the eigenvectors with the largest eigenvalues
have to be kept).

_check_output(self, y)

 
Overrides: Node._check_output

_execute(self, x, n=None)

 
Project the input on the first 'n' principal components.
If 'n' is not set, use all available components.

Overrides: Node._execute

_inverse(self, y, n=None)

 
Project 'y' to the input space using the first 'n' components.
If 'n' is not set, use all available components.

Overrides: Node._inverse

_set_output_dim(self, n)

 
Overrides: Node._set_output_dim

_stop_training(self, debug=False)

 
Stop the training phase.

Keyword arguments:

debug=True     if stop_training fails because of singular cov
               matrices, the singular matrices themselves are stored in
               self.cov_mtx and self.dcov_mtx to be examined.

Overrides: Node._stop_training

_train(self, x)

 
Overrides: Node._train

execute(self, x, n=None)

 
Project the input on the first 'n' principal components.
If 'n' is not set, use all available components.

Overrides: Node.execute
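
The projection described for `execute` can be sketched in plain NumPy as mean removal followed by a product with the first 'n' eigenvectors. This is an illustration under assumptions (eigenvectors stored as columns of `v`, hypothetical helper `project`), not the node's actual code.

```python
import numpy as np

def project(x, avg, v, n=None):
    """Project x on the first n columns of v (hypothetical helper)."""
    if n is None:
        n = v.shape[1]                      # use all available components
    return (x - avg) @ v[:, :n]

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 4))
avg = x.mean(axis=0)
d, v = np.linalg.eigh(np.cov(x - avg, rowvar=False))
d, v = d[::-1], v[:, ::-1]                  # decreasing variance order

y_all = project(x, avg, v)                  # all four components
y_two = project(x, avg, v, n=2)             # first two components only
```

In this sketch, restricting 'n' simply selects the leading columns of the full projection, and the variance of each output column equals the corresponding eigenvalue.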

get_explained_variance(self)

 
Return the fraction of the original variance that can be
explained by self._output_dim PCA components.
If for example output_dim has been set to 0.95, the explained
variance could be something like 0.958...
Note that if output_dim was explicitly set to be a fixed number
of components, there is no way to calculate the explained variance.

get_projmatrix(self, transposed=1)

 
Return the projection matrix.

get_recmatrix(self, transposed=1)

 
Return the back-projection matrix (i.e. the reconstruction matrix).
        

inverse(self, y, n=None)

 
Project 'y' to the input space using the first 'n' components.
If 'n' is not set, use all available components.

Overrides: Node.inverse
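
Back-projection can be sketched as the transpose of the projection followed by adding the mean back. The helper `back_project` and the column layout of `v` are assumptions for this illustration; the real node may handle 'n' and the shape of 'y' differently.

```python
import numpy as np

def back_project(y, avg, v, n=None):
    """Map y back to the input space using the first n components
    (hypothetical helper)."""
    if n is None:
        n = y.shape[1]                      # use all available components
    return y[:, :n] @ v[:, :n].T + avg

rng = np.random.default_rng(3)
x = rng.normal(size=(100, 3)) * np.array([2.0, 1.0, 0.5])
avg = x.mean(axis=0)
_, v = np.linalg.eigh(np.cov(x - avg, rowvar=False))
v = v[:, ::-1]                              # decreasing variance order
y = (x - avg) @ v                           # full projection

x_full = back_project(y, avg, v)            # exact reconstruction
x_one = back_project(y, avg, v, n=1)        # rank-1 approximation
err_one = float(np.mean((x - x_one) ** 2))  # mean squared error
```

Using every component recovers the input up to rounding, while using fewer gives an approximation whose error reflects the variance of the discarded components.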

stop_training(self, debug=False)

 
Stop the training phase.

Keyword arguments:

debug=True     if stop_training fails because of singular cov
               matrices, the singular matrices themselves are stored in
               self.cov_mtx and self.dcov_mtx to be examined.

Overrides: Node.stop_training
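
Singular covariance matrices typically arise when there are fewer observations than variables. The constructor's 'svd' option is the documented remedy; how MDP implements it is not shown here, so the following is only a generic NumPy sketch of why an SVD of the centered data yields the same variances as the eigenvalue route.

```python
import numpy as np

rng = np.random.default_rng(2)
# Fewer observations (5) than variables (8): the covariance is rank-deficient.
x = rng.normal(size=(5, 8))
xc = x - x.mean(axis=0)

# SVD route: works on the data directly, no covariance matrix is formed.
_, s, vt = np.linalg.svd(xc, full_matrices=False)
d_svd = s ** 2 / (x.shape[0] - 1)           # variances along rows of vt

# Eigenvalue route on the (singular) covariance matrix, for comparison.
d_eig = np.linalg.eigvalsh(np.cov(xc, rowvar=False))[::-1]
```

The leading entries of `d_eig` agree with `d_svd`, and the remaining ones are zero up to rounding, which is the rank deficiency a standard eigenvalue solver may complain about.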

train(self, x)

 
Update the internal structures according to the input data `x`.

`x` is a matrix having different variables on different columns
and observations on the rows.
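
Because `train` may be called repeatedly with successive chunks of data, the "internal structures" can be pictured as running sums from which the mean and covariance are derived when training stops. The class below is a hypothetical illustration of that idea in NumPy, not MDP's own covariance code.

```python
import numpy as np

class RunningCovariance:
    """Accumulate mean and covariance over chunks (hypothetical class)."""

    def __init__(self):
        self.n = 0
        self.sum = None
        self.outer = None

    def update(self, x):
        # x: observations on the rows, variables on the columns
        x = np.asarray(x, dtype=float)
        if self.sum is None:
            dim = x.shape[1]
            self.sum = np.zeros(dim)
            self.outer = np.zeros((dim, dim))
        self.n += x.shape[0]
        self.sum += x.sum(axis=0)
        self.outer += x.T @ x

    def finish(self):
        avg = self.sum / self.n
        cov = (self.outer - self.n * np.outer(avg, avg)) / (self.n - 1)
        return avg, cov

rng = np.random.default_rng(4)
data = rng.normal(size=(300, 3))
acc = RunningCovariance()
for chunk in np.array_split(data, 3):       # three successive "train" calls
    acc.update(chunk)
avg, cov = acc.finish()
```

The result matches what a single pass over the full data set would give, so the chunking only affects memory use, not the outcome.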

By default, subclasses should override `_train` to implement their
training phase. The docstring of the `_train` method replaces this
docstring.

Note: a subclass supporting multiple training phases should implement
the *same* signature for all the training phases and document the
meaning of the arguments in the `_train` method docstring. Having
consistent signatures is a requirement to use the node in a flow.

Overrides: Node.train