Home  · Classes  · Annotated Classes  · Modules  · Members  · Namespaces  · Related Pages
Classes | Public Types | Public Member Functions | Public Attributes | Protected Types | Protected Member Functions | Static Protected Member Functions | List of all members
HierarchicalClustering< PointRef > Class Template Reference

Generic 2-dimensional hierarchical clustering with geometric hashing. More...

#include <OpenMS/COMPARISON/CLUSTERING/HierarchicalClustering.h>

Classes

class  BoundingBox
 Bounding box of cluster. More...
 
class  Cluster
 Set of points. Describes a cluster on the grid. A point consists of a PointCoordinate and a PointRef. More...
 
class  TreeDistance
 Wrapper class for two trees and the corresponding distance. More...
 
class  TreeNode
 Tree node used for clustering. More...
 

Public Types

typedef DPosition< 2, DoubleRealPointCoordinate
 Coordinate of a point to be clustered. More...
 
typedef HashGrid< ClusterGrid
 The hash grid data type. More...
 

Public Member Functions

 HierarchicalClustering (const PointCoordinate &cluster_dimension)
 Constructor. More...
 
Grid::cell_iterator insertPoint (const PointCoordinate &d, const PointRef &ref)
 Insert new PointCoordinate into grid. More...
 
void cluster ()
 Perform clustering of all existing points. More...
 

Public Attributes

Grid grid
 The hash grid. More...
 

Protected Types

typedef std::map< typename
Grid::CellIndex, std::pair
< typename Grid::CellContent
*, bool > > 
ClusterCells
 
typedef boost::unordered_set
< TreeNode * > 
ClusterTrees
 
typedef std::priority_queue
< TreeDistance, std::vector
< TreeDistance >, std::greater
< TreeDistance > > 
TreeDistanceQueue
 Priority queue queue used to find minimum distances. More...
 

Protected Member Functions

template<class P >
Grid::cell_iterator insertCluster_ (const P &p)
 Insert new Cluster into grid. More...
 
void clusterIndex_ (const typename Grid::CellIndex &p)
 Perform clustering at given cell index. More...
 
void gridCells5x5_ (typename Grid::CellIndex cur, ClusterCells &cells)
 Collect all cells used to cluster at given cell index. More...
 
void gridCell_ (const typename Grid::CellIndex &cur, ClusterCells &cells, bool center=false, bool ignore_missing=true)
 Collect one cell. More...
 
void addTreeDistance_ (TreeNode *tree, ClusterTrees &trees, TreeDistanceQueue &dists)
 Add a new tree to the set of trees and distance queue. More...
 
DoubleReal treeDistance_ (TreeNode *left, TreeNode *right)
 Returns distance of two tree nodes Returns the euclidean distance of the coordinates of the two trees. It checks the size of the bounding box and returns INFINITY if it gets to large. More...
 
void tree2Cluster_ (const TreeNode *tree, Cluster &cluster)
 Recursively add the points of a finished cluster into the hash grid. All points are saved in the leafs of the tree. More...
 
void tree2Points_ (const TreeNode *tree)
 Recursively add the points of an unfinished cluster back to the grid. All points are saved in the leafs of the tree. More...
 

Static Protected Member Functions

static PointCoordinate coordScalarDiv_ (const PointCoordinate &lhs, const DoubleReal &rhs)
 
static PointCoordinate coordElemDiv_ (const PointCoordinate &lhs, const PointCoordinate &rhs)
 
static bool coordElemGreater_ (const PointCoordinate &lhs, const PointCoordinate &rhs)
 
static DoubleReal coordDist_ (const PointCoordinate &lhs, const PointCoordinate &rhs)
 

Detailed Description

template<typename PointRef>
class OpenMS::HierarchicalClustering< PointRef >

Generic 2-dimensional hierarchical clustering with geometric hashing.

The input data is saved into a hash grid. The dimension of the hash cells is also the maximum cluster dimension.

The clustering is performed on a 5x5 subsets of the hash grid. Only clusters with all points in the inner 3x3 subset are accepted into the output; all others are discarded. This makes sure that all clusters are maximal and can't get larger with points not visible.

This clustering only supports centroid linkage. It uses a priority queue to save minimum distances between two subsets (proto-cluster?). No full distance matrix is required.

Template Parameters
PointRefReference associated with every point. Must have a default constructor.

Member Typedef Documentation

typedef std::map<typename Grid::CellIndex, std::pair<typename Grid::CellContent *, bool> > ClusterCells
protected
typedef boost::unordered_set<TreeNode *> ClusterTrees
protected
typedef HashGrid<Cluster> Grid

The hash grid data type.

Coordinate of a point to be clustered.

Attention
To be replaced by a OpenMS coordinate type.
typedef std::priority_queue<TreeDistance, std::vector<TreeDistance>, std::greater<TreeDistance> > TreeDistanceQueue
protected

Priority queue queue used to find minimum distances.

Constructor & Destructor Documentation

HierarchicalClustering ( const PointCoordinate cluster_dimension)
inline

Constructor.

Parameters
cluster_dimensionMax dimension of cluster

Member Function Documentation

void addTreeDistance_ ( TreeNode tree,
ClusterTrees trees,
TreeDistanceQueue dists 
)
inlineprotected

Add a new tree to the set of trees and distance queue.

void cluster ( )
inline

Perform clustering of all existing points.

void clusterIndex_ ( const typename Grid::CellIndex &  p)
protected
static DoubleReal coordDist_ ( const PointCoordinate lhs,
const PointCoordinate rhs 
)
inlinestaticprotected
static PointCoordinate coordElemDiv_ ( const PointCoordinate lhs,
const PointCoordinate rhs 
)
inlinestaticprotected
static bool coordElemGreater_ ( const PointCoordinate lhs,
const PointCoordinate rhs 
)
inlinestaticprotected
static PointCoordinate coordScalarDiv_ ( const PointCoordinate lhs,
const DoubleReal rhs 
)
inlinestaticprotected
void gridCell_ ( const typename Grid::CellIndex &  cur,
ClusterCells cells,
bool  center = false,
bool  ignore_missing = true 
)
inlineprotected

Collect one cell.

Parameters
curCell index.
cellsList of cells.
centerIs the given cell in the center.
ignore_missingDefines if non-existent errors should be ignored.
void gridCells5x5_ ( typename Grid::CellIndex  cur,
ClusterCells cells 
)
protected

Collect all cells used to cluster at given cell index.

This function collects all cells in a 5x5 array.

Parameters
curCell index.
cellsList of cells to be used.
Grid::cell_iterator insertCluster_ ( const P &  p)
inlineprotected

Insert new Cluster into grid.

Parameters
pPoint to insert.
Returns
iterator to inserted cluster.

Referenced by HierarchicalClustering< SILACPattern * >::insertPoint().

Grid::cell_iterator insertPoint ( const PointCoordinate d,
const PointRef &  ref 
)
inline

Insert new PointCoordinate into grid.

Parameters
dPointCoordinate to insert.
refAssociated caller specified info.
Returns
iterator to inserted cluster.

Referenced by HierarchicalClustering< SILACPattern * >::tree2Points_().

void tree2Cluster_ ( const TreeNode tree,
Cluster cluster 
)
inlineprotected

Recursively add the points of a finished cluster into the hash grid. All points are saved in the leafs of the tree.

Parameters
treeThe tree
clusterThe cluster

Referenced by HierarchicalClustering< SILACPattern * >::tree2Cluster_().

void tree2Points_ ( const TreeNode tree)
inlineprotected

Recursively add the points of an unfinished cluster back to the grid. All points are saved in the leafs of the tree.

Parameters
treeThe tree

Referenced by HierarchicalClustering< SILACPattern * >::tree2Points_().

DoubleReal treeDistance_ ( TreeNode left,
TreeNode right 
)
inlineprotected

Returns distance of two tree nodes Returns the euclidean distance of the coordinates of the two trees. It checks the size of the bounding box and returns INFINITY if it gets to large.

Referenced by HierarchicalClustering< SILACPattern * >::addTreeDistance_().

Member Data Documentation

Grid grid

OpenMS / TOPP release 1.11.1 Documentation generated on Thu Nov 14 2013 11:19:28 using doxygen 1.8.5