GRID-BASED METHOD & MODEL-BASED CLUSTERING METHOD
 INTRODUCTION
 STING
 WAVECLUSTER
 CLIQUE (Clustering In QUEst)
 FAST PROCESSING TIME
 The grid-based clustering approach uses a multi-resolution grid data structure.
 The object space is quantized into a finite number of cells that form a grid structure.
 The major advantage of this method is fast
processing time.
 Processing time depends only on the number of cells in each dimension of the quantized space, not on the number of data objects.
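To make the quantization step concrete, here is a minimal sketch in Python with numpy; the 2-D data, the 10-cells-per-dimension setting, and all variable names are illustrative choices, not part of any specific algorithm.

```python
import numpy as np

# A minimal sketch of grid quantization: map each object (point) to a cell
# and count how many objects fall into each cell. The 10-cells-per-dimension
# value is an arbitrary illustration.
rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 2))          # 1000 objects in a 2-D space

cells_per_dim = 10
counts, edges = np.histogramdd(points, bins=cells_per_dim)

# All further processing works on the 10x10 grid of counts, so its cost
# depends on the number of cells, not on the number of objects.
print(counts.shape)        # (10, 10)
print(int(counts.sum()))   # 1000
```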
 STING: STatistical INformation Grid.
 The spatial area is divided into rectangular cells.
 There are several levels of cells, at different levels of resolution.
 Each high-level cell is partitioned into several lower-level cells.
 Statistical attributes (mean, maximum, minimum) are stored in each cell.
 Computation is query-independent.
 Parallel processing is supported.
 Data is processed in a single pass.
 Quality depends on the granularity of the lowest level of the grid.
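The following is a rough illustration of the STING idea in Python with numpy: statistical attributes are computed once per low-level cell in a single pass over the data, then aggregated bottom-up into higher-level cells. The 8x8 and 4x4 grid sizes, the choice of summarized value, and all names are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.uniform(0, 1, size=(5000, 2))

# Low level: an 8x8 grid; each cell stores count, sum, min, max of a value.
# Here the "value" is just the x-coordinate, purely for illustration.
lo = 8
idx = np.minimum((points * lo).astype(int), lo - 1)
count = np.zeros((lo, lo))
total = np.zeros((lo, lo))
vmin = np.full((lo, lo), np.inf)
vmax = np.full((lo, lo), -np.inf)
for (i, j), v in zip(idx, points[:, 0]):
    count[i, j] += 1
    total[i, j] += v
    vmin[i, j] = min(vmin[i, j], v)
    vmax[i, j] = max(vmax[i, j], v)

# High level: each 4x4 cell aggregates a 2x2 block of low-level cells,
# so the hierarchy is built bottom-up without touching the data again.
hi_count = count.reshape(4, 2, 4, 2).sum(axis=(1, 3))
hi_total = total.reshape(4, 2, 4, 2).sum(axis=(1, 3))
hi_mean = hi_total / np.maximum(hi_count, 1)
hi_min = vmin.reshape(4, 2, 4, 2).min(axis=(1, 3))
hi_max = vmax.reshape(4, 2, 4, 2).max(axis=(1, 3))
print(hi_count.shape, hi_mean.shape)  # (4, 4) (4, 4)
```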
WAVECLUSTER
 A multi-resolution clustering approach that applies a wavelet transform to the feature space.
 A wavelet transform is a signal-processing technique that decomposes a signal into different frequency sub-bands.
 Both grid-based and density-based
 Input parameters:
 the number of cells for each dimension
 the wavelet, and the number of applications of the wavelet transform.
 Complexity: O(N), where N is the number of data objects.
 Detects arbitrarily shaped clusters at different scales.
 Not sensitive to noise, not sensitive to input order.
 Only applicable to low-dimensional data.
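A hedged sketch of the WaveCluster pipeline, assuming numpy, PyWavelets (pywt), and scipy are available: quantize the points into a grid, apply one 2-D wavelet transform, and label connected significant cells in the approximation sub-band as clusters. The grid size, the db2 wavelet, and the mean-plus-one-standard-deviation threshold are illustrative choices, not prescribed by the method.

```python
import numpy as np
import pywt
from scipy import ndimage

rng = np.random.default_rng(2)
# Two blobs plus uniform noise in the unit square.
pts = np.vstack([
    rng.normal([0.25, 0.25], 0.05, size=(500, 2)),
    rng.normal([0.75, 0.75], 0.05, size=(500, 2)),
    rng.uniform(0, 1, size=(100, 2)),
])

# Step 1: quantize the feature space into a 64x64 grid of counts.
grid, _ = np.histogramdd(np.clip(pts, 0, 0.999), bins=64,
                         range=[(0, 1), (0, 1)])

# Step 2: one application of a 2-D wavelet transform; the approximation
# sub-band is a smoothed, half-resolution view of the grid.
cA, (cH, cV, cD) = pywt.dwt2(grid, 'db2')

# Step 3: keep "significant" cells in the approximation sub-band and
# label connected components as clusters. The threshold is an assumption.
dense = cA > cA.mean() + cA.std()
labels, n_clusters = ndimage.label(dense)
print(n_clusters)  # the two blobs should emerge; the noise is suppressed
```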
CLIQUE (Clustering In QUEst) can be considered both density-based and grid-based:
1. It partitions each dimension into the same number of equal-length intervals.
2. It partitions an m-dimensional data space into non-overlapping rectangular units.
3. A unit is dense if the fraction of total data points contained in the unit exceeds an input model parameter (the density threshold).
4. A cluster is a maximal set of connected dense units within a subspace (see the sketch after this list).
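A minimal 2-D sketch of steps 1-4, assuming numpy and scipy; the values xi = 10 and tau = 0.02 stand in for the two input parameters and are arbitrary. The real algorithm also searches subspaces of the m dimensions, which this sketch omits.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(3)
pts = np.vstack([rng.normal([2, 2], 0.3, size=(300, 2)),
                 rng.normal([7, 7], 0.3, size=(300, 2))])

xi, tau = 10, 0.02     # xi intervals per dimension, density threshold tau

# Steps 1-2: equal-length intervals per dimension -> rectangular units.
units, _ = np.histogramdd(pts, bins=xi)

# Step 3: a unit is dense if its fraction of all points exceeds tau.
dense = units / len(pts) > tau

# Step 4: clusters = maximal sets of connected dense units.
labels, n_clusters = ndimage.label(dense)
print(n_clusters)  # expected: 2, one cluster per blob
```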
MODEL-BASED CLUSTERING
 Attempts to optimize the fit between the data and some mathematical model.
 ASSUMPTION: the data are generated by a mixture of underlying probability distributions.
 TECHNIQUES:
 Expectation-maximization
 Conceptual clustering
 Neural network approach
EXPECTATION-MAXIMIZATION (EM)
 An iterative refinement algorithm used to find parameter estimates.
 An extension of k-means:
 Assigns an object to a cluster according to a weight representing its probability of membership.
 Starts from an initial estimate of the parameters.
 Iteratively reassigns the membership scores and re-estimates the parameters (see the sketch below).
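A minimal sketch of EM for a two-component 1-D Gaussian mixture in numpy, illustrating the weighted (soft) assignment and the iterative re-estimation; the synthetic data, initial values, and the fixed 50 iterations are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(5, 1, 300)])

# Initial estimates of the parameters (means, variances, mixing weights).
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
w = np.array([0.5, 0.5])

for _ in range(50):
    # E-step: each object's weight of membership in each cluster
    # (the soft analogue of k-means' hard assignment).
    dens = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
             / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate the parameters from the weighted objects.
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    w = nk / len(x)

print(np.round(mu, 2))  # means converge near the true values 0 and 5
```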
CONCEPTUAL CLUSTERING
 A form of clustering in machine learning.
 Produces a classification scheme for a set of unlabeled objects.
 Finds a characteristic description for each concept.
 COBWEB
 A popular and simple method of incremental
conceptual learning.
 Creates a hierarchical clustering in the form of a
classification tree.
[COBWEB classification tree from the slide figure:]
Animal: P(C0)=1.0, P(scales|C0)=0.25
├── Fish: P(C1)=0.25, P(scales|C1)=1.0
├── Amphibian: P(C2)=0.25, P(moist|C2)=1.0
└── Mammal/Bird: P(C3)=0.5, P(hair|C3)=0.5
    ├── Mammal: P(C4)=0.5, P(hair|C4)=1.0
    └── Bird: P(C5)=0.5, P(feathers|C5)=1.0
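COBWEB decides where an object fits in the tree by maximizing a score called category utility. The following is a simplified sketch of that score, assuming nominal attributes stored as Python dicts; the full incremental algorithm (with its insert, merge, and split operators) is not shown.

```python
from collections import Counter

def category_utility(clusters):
    """Category utility of a partition; each cluster is a list of
    objects, each object a dict of attribute -> value."""
    objects = [o for c in clusters for o in c]
    n = len(objects)
    attrs = {a for o in objects for a in o}

    def sq_sum(objs):
        # Sum over attributes and values of P(attribute = value)^2.
        total = 0.0
        for a in attrs:
            counts = Counter(o[a] for o in objs)
            total += sum((c / len(objs)) ** 2 for c in counts.values())
        return total

    base = sq_sum(objects)
    gain = sum(len(c) / n * (sq_sum(c) - base) for c in clusters)
    return gain / len(clusters)

# Fish vs. amphibian, matching the tree above.
fish = [{"scales": "yes", "moist": "no"}] * 2
amph = [{"scales": "no", "moist": "yes"}] * 2
print(category_utility([fish, amph]))  # > 0: the split is informative
```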
NEURAL NETWORK APPROACH
 Represents each cluster as an exemplar, acting as a "prototype" of the cluster.
 New objects are distributed to the cluster whose
exemplar is the most similar according to some
distance measure.
SELF-ORGANIZING MAP (SOM)
 Competitive learning
 Involves a hierarchical architecture of several
units
 The organization of the units forms a feature map.
 Applied to Web document clustering.
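A minimal sketch of SOM-style competitive learning in numpy: for each input, the best-matching unit wins and drags its map neighbours toward the input, which organizes the units into a feature map. The 5x5 map, learning rate, neighbourhood width, and decay schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.uniform(0, 1, size=(2000, 2))      # inputs in the unit square

# A small 5x5 map of units, each with a weight vector in input space.
map_shape = (5, 5)
weights = rng.uniform(0, 1, size=map_shape + (2,))
coords = np.stack(np.meshgrid(range(5), range(5), indexing="ij"), axis=-1)

lr, sigma = 0.5, 2.0
for t, x in enumerate(data):
    # Competitive step: the best-matching unit (BMU) wins.
    d = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(d.argmin(), map_shape)

    # Cooperative step: the winner and its map neighbours move toward x;
    # this neighbourhood update is what organizes the feature map.
    decay = 1 - t / len(data)
    dist2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
    h = np.exp(-dist2 / (2 * (sigma * decay + 1e-3) ** 2))
    weights += (lr * decay) * h[..., None] * (x - weights)
```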
FEATURE TRANSFORMATION METHODS
 PCA, SVD: summarize the data by creating linear combinations of attributes.
 But they do not remove any attributes; the transformed attributes can be complex to interpret.
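A short sketch of what "linear combinations of attributes" means in practice, using numpy's SVD; the data and the choice of two components are arbitrary. Each derived attribute mixes all five original ones, which is why the transformed attributes are harder to interpret.

```python
import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 5))                 # 200 objects, 5 attributes

# PCA via SVD: centre, decompose, project onto the top-2 directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T          # each new attribute combines all 5 originals
print(Z.shape)             # (200, 2)
```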
FEATURE SELECTION METHODS
 Selects the most relevant subset of attributes with respect to the class labels.
 Entropy analysis.
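A minimal sketch of entropy-based attribute relevance, assuming nominal attributes and class labels as Python lists; information_gain is a hypothetical helper written for this illustration, not a named library function.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(attribute, labels):
    # Entropy of the class labels minus the expected entropy after
    # splitting on the attribute: higher gain = more relevant attribute.
    h = entropy(labels)
    n = len(labels)
    for v in set(attribute):
        subset = [l for a, l in zip(attribute, labels) if a == v]
        h -= len(subset) / n * entropy(subset)
    return h

color = ["r", "r", "b", "b"]       # perfectly predicts the class
noise = ["x", "y", "x", "y"]       # unrelated to the class
label = ["pos", "pos", "neg", "neg"]
print(information_gain(color, label))  # 1.0
print(information_gain(noise, label))  # 0.0
```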
