Generic 2-dimensional hierarchical clustering with geometric hashing. More...
#include <OpenMS/COMPARISON/CLUSTERING/HierarchicalClustering.h>
Classes | |
class | BoundingBox |
Bounding box of cluster. More... | |
class | Cluster |
Set of points. Describes a cluster on the grid. A point consists of a PointCoordinate and a PointRef. More... | |
class | TreeDistance |
Wrapper class for two trees and the corresponding distance. More... | |
class | TreeNode |
Tree node used for clustering. More... | |
Public Types | |
typedef DPosition< 2, DoubleReal > | PointCoordinate |
Coordinate of a point to be clustered. More... | |
typedef HashGrid< Cluster > | Grid |
The hash grid data type. More... | |
Public Member Functions | |
HierarchicalClustering (const PointCoordinate &cluster_dimension) | |
Constructor. More... | |
Grid::cell_iterator | insertPoint (const PointCoordinate &d, const PointRef &ref) |
Insert new PointCoordinate into grid. More... | |
void | cluster () |
Perform clustering of all existing points. More... | |
Public Attributes | |
Grid | grid |
The hash grid. More... | |
Protected Types | |
typedef std::map< typename Grid::CellIndex, std::pair < typename Grid::CellContent *, bool > > | ClusterCells |
typedef boost::unordered_set < TreeNode * > | ClusterTrees |
typedef std::priority_queue < TreeDistance, std::vector < TreeDistance >, std::greater < TreeDistance > > | TreeDistanceQueue |
Priority queue queue used to find minimum distances. More... | |
Protected Member Functions | |
template<class P > | |
Grid::cell_iterator | insertCluster_ (const P &p) |
Insert new Cluster into grid. More... | |
void | clusterIndex_ (const typename Grid::CellIndex &p) |
Perform clustering at given cell index. More... | |
void | gridCells5x5_ (typename Grid::CellIndex cur, ClusterCells &cells) |
Collect all cells used to cluster at given cell index. More... | |
void | gridCell_ (const typename Grid::CellIndex &cur, ClusterCells &cells, bool center=false, bool ignore_missing=true) |
Collect one cell. More... | |
void | addTreeDistance_ (TreeNode *tree, ClusterTrees &trees, TreeDistanceQueue &dists) |
Add a new tree to the set of trees and distance queue. More... | |
DoubleReal | treeDistance_ (TreeNode *left, TreeNode *right) |
Returns distance of two tree nodes Returns the euclidean distance of the coordinates of the two trees. It checks the size of the bounding box and returns INFINITY if it gets to large. More... | |
void | tree2Cluster_ (const TreeNode *tree, Cluster &cluster) |
Recursively add the points of a finished cluster into the hash grid. All points are saved in the leafs of the tree. More... | |
void | tree2Points_ (const TreeNode *tree) |
Recursively add the points of an unfinished cluster back to the grid. All points are saved in the leafs of the tree. More... | |
Static Protected Member Functions | |
static PointCoordinate | coordScalarDiv_ (const PointCoordinate &lhs, const DoubleReal &rhs) |
static PointCoordinate | coordElemDiv_ (const PointCoordinate &lhs, const PointCoordinate &rhs) |
static bool | coordElemGreater_ (const PointCoordinate &lhs, const PointCoordinate &rhs) |
static DoubleReal | coordDist_ (const PointCoordinate &lhs, const PointCoordinate &rhs) |
Generic 2-dimensional hierarchical clustering with geometric hashing.
The input data is saved into a hash grid. The dimension of the hash cells is also the maximum cluster dimension.
The clustering is performed on a 5x5 subsets of the hash grid. Only clusters with all points in the inner 3x3 subset are accepted into the output; all others are discarded. This makes sure that all clusters are maximal and can't get larger with points not visible.
This clustering only supports centroid linkage. It uses a priority queue to save minimum distances between two subsets (proto-cluster?). No full distance matrix is required.
PointRef | Reference associated with every point. Must have a default constructor. |
|
protected |
|
protected |
typedef DPosition<2, DoubleReal> PointCoordinate |
Coordinate of a point to be clustered.
|
protected |
Priority queue queue used to find minimum distances.
|
inline |
Constructor.
cluster_dimension | Max dimension of cluster |
|
inlineprotected |
Add a new tree to the set of trees and distance queue.
|
inline |
Perform clustering of all existing points.
|
protected |
Perform clustering at given cell index.
p | Cell index. |
References HierarchicalClustering< PointRef >::TreeNode::bbox, HierarchicalClustering< PointRef >::TreeNode::coord, and HierarchicalClustering< PointRef >::TreeNode::points.
Referenced by HierarchicalClustering< SILACPattern * >::cluster().
|
inlinestaticprotected |
Referenced by HierarchicalClustering< SILACPattern * >::treeDistance_().
|
inlinestaticprotected |
Referenced by HierarchicalClustering< SILACPattern * >::treeDistance_().
|
inlinestaticprotected |
Referenced by HierarchicalClustering< SILACPattern * >::treeDistance_().
|
inlinestaticprotected |
|
inlineprotected |
Collect one cell.
cur | Cell index. |
cells | List of cells. |
center | Is the given cell in the center. |
ignore_missing | Defines if non-existent errors should be ignored. |
|
protected |
Collect all cells used to cluster at given cell index.
This function collects all cells in a 5x5 array.
cur | Cell index. |
cells | List of cells to be used. |
|
inlineprotected |
Insert new Cluster into grid.
p | Point to insert. |
Referenced by HierarchicalClustering< SILACPattern * >::insertPoint().
|
inline |
Insert new PointCoordinate into grid.
d | PointCoordinate to insert. |
ref | Associated caller specified info. |
Referenced by HierarchicalClustering< SILACPattern * >::tree2Points_().
Recursively add the points of a finished cluster into the hash grid. All points are saved in the leafs of the tree.
tree | The tree |
cluster | The cluster |
Referenced by HierarchicalClustering< SILACPattern * >::tree2Cluster_().
|
inlineprotected |
Recursively add the points of an unfinished cluster back to the grid. All points are saved in the leafs of the tree.
tree | The tree |
Referenced by HierarchicalClustering< SILACPattern * >::tree2Points_().
|
inlineprotected |
Returns distance of two tree nodes Returns the euclidean distance of the coordinates of the two trees. It checks the size of the bounding box and returns INFINITY if it gets to large.
Referenced by HierarchicalClustering< SILACPattern * >::addTreeDistance_().
Grid grid |
The hash grid.
It contains clusters.
Referenced by HierarchicalClustering< SILACPattern * >::cluster(), HierarchicalClustering< SILACPattern * >::gridCell_(), HierarchicalClustering< SILACPattern * >::insertCluster_(), and HierarchicalClustering< SILACPattern * >::treeDistance_().
OpenMS / TOPP release 1.11.1 | Documentation generated on Thu Nov 14 2013 11:19:28 using doxygen 1.8.5 |