K-means Clustering Algorithm Testing Cases

Outline:
• What is the K-means Clustering Algorithm?
• How Does the K-means Clustering Algorithm Work?
• Example.

What is the K-means Clustering Algorithm?
It is a concept formation (clustering) algorithm that groups
objects into K clusters based on their attributes/features.

How Does the K-means Clustering Algorithm Work?
Given a set of unlabeled data, the aim is to group the data into K clusters.
The algorithm iterates through the following steps:
FIRST ITERATION
1. Choose the number of clusters K.
2. Select an initial centroid for each of the K clusters. For example, if we have 3 clusters,
then we need to choose three centroids, e.g. randomly (there are other ways to select
centroids).
3. Calculate the distance between each object and each centroid.
4. Assign each object to the group of its nearest centroid (minimum distance). For example, if
A and C are centroids and B is an object, the distance between A and B is 0.5, and the distance
between B and C is 1.5, then B will be in A's group because that distance is smaller.
5. Check whether any object moved from one group to another:
• If yes, go for another iteration (calculate the new centroids, then go to step 3).
• Otherwise, the algorithm stops and returns the formed clusters.
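
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `assign`, `update`, and `kmeans`, the `max_iter` safeguard, and the choice to keep the old centroid when a cluster becomes empty are all introduced here for the sketch. The initial centroids (step 2) are supplied by the caller.

```python
import math

def assign(points, centroids):
    """Steps 3-4: for each point, return the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]

def update(points, labels, centroids):
    """Recompute each centroid as the per-attribute mean of its members."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            # Assumption of this sketch: an empty cluster keeps its old centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Iterate until no object changes group (step 5) or max_iter is reached."""
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # nothing moved: stop
            break
        labels = new_labels
    return labels, centroids
```

For instance, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], [(0, 0), (10, 10)])` returns the labels `[0, 0, 1, 1]`: the first two points end up in one cluster and the last two in the other.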

Remark for SECOND, THIRD, FOURTH, … ITERATIONS
Starting from the second iteration, the algorithm updates the centroids. By
now the objects have been divided into groups, and each group has some objects,
so the centroid of a group is calculated as the sum of the objects in the group
divided by the number of objects in it (the average, taken per attribute).
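
Written as a formula, the centroid update reads as below. The symbols are introduced here for illustration only: G_k is the set of objects currently assigned to cluster k, |G_k| is its size, and c_k is the cluster's centroid.

```latex
c_k = \frac{1}{|G_k|} \sum_{x \in G_k} x
```

The sum is taken attribute by attribute, so each coordinate of c_k is the average of that coordinate over the group's members.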

EXAMPLE:
STEP 1: Choose number of clusters K
Each medicine represents a point in the graph. In this example, our goal is to group the objects into two clusters (K=2) based
on two attributes: weight index (X) and pH (Y).
Object        Weight index (X)   pH (Y)
Medicine A    1                  1
Medicine B    2                  1
Medicine C    4                  3
Medicine D    5                  4

STEP 2: Choose centroids
Since we want to have two clusters, we need to
select two centroids. In this example, we chose points A
and B to be the centroids:
c1 = (1,1) cluster 1
c2 = (2,1) cluster 2

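STEP 3: Calculate distance between each object and centroid (Euclidean Distance)
Euclidean Distance Formula:
$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
D1 = (distances rounded to two decimals)
              A      B      C      D
c1 = (1,1)    0.00   1.00   3.61   5.00
c2 = (2,1)    1.00   0.00   2.83   4.24
STEP 4: Choose the minimum distance
G1 = (1 marks the cluster whose centroid is nearest)
              A   B   C   D
cluster 1     1   0   0   0
cluster 2     0   1   1   1
In this 1st iteration: cluster 1 has only A; cluster 2 has B, C, and D.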
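
The first-iteration distances and groups can be checked with a few lines of Python. This is a small sketch using only the standard library; the variable names `points`, `centroids`, `D1`, and `G1` are chosen here to mirror the slides, and the rounding to two decimals is a display choice, not part of the algorithm.

```python
import math

# Data from the example: (weight index, pH) for each medicine.
points = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1, 1), (2, 1)]  # c1 = A, c2 = B

# D1[k][j] = Euclidean distance from centroid k to object j, rounded to 2 decimals.
D1 = [[round(math.dist(c, p), 2) for p in points.values()] for c in centroids]

# G1[k][j] = 1 if centroid k is the nearest one to object j, else 0.
# (A tie would mark both rows; none occurs in this data.)
G1 = [
    [int(D1[k][j] == min(row[j] for row in D1)) for j in range(len(points))]
    for k in range(len(centroids))
]

print(D1)  # [[0.0, 1.0, 3.61, 5.0], [1.0, 0.0, 2.83, 4.24]]
print(G1)  # [[1, 0, 0, 0], [0, 1, 1, 1]]
```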

2nd iteration
Now we need to compute the new centroid of each cluster. Since
cluster 1 has only A, its centroid remains c1 = (1,1).
However, cluster 2 now has three members B, C, and D, so we
calculate its centroid by taking the average of the three
members.
c2 = ((2+4+5)/3, (1+3+4)/3) = (3.67, 2.67)
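
The same average can be reproduced in code. The helper name `centroid` below is hypothetical and assumes two-dimensional points, as in this example.

```python
def centroid(members):
    """Mean of a list of (x, y) points, taken per coordinate."""
    n = len(members)
    return (sum(x for x, _ in members) / n, sum(y for _, y in members) / n)

cluster2 = [(2, 1), (4, 3), (5, 4)]  # B, C, D
c2 = centroid(cluster2)
print(tuple(round(v, 2) for v in c2))  # (3.67, 2.67)
```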

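Calculate the distances for 2nd iteration
D2 = (distances rounded to two decimals)
                    A      B      C      D
c1 = (1,1)          0.00   1.00   3.61   5.00
c2 = (3.67,2.67)    3.14   2.36   0.47   1.89
Choose the minimum distance
G2 =
              A   B   C   D
cluster 1     1   1   0   0
cluster 2     0   0   1   1
In this 2nd iteration: cluster 1 has A, B and cluster 2 has C, D.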

3rd iteration
We repeat the same steps. First we compute the new centroid for
each cluster by taking the average of its members.
c1 = ((1+2)/2, (1+1)/2) = (1.5, 1)
c2 = ((4+5)/2, (3+4)/2) = (4.5, 3.5)

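Calculate the distances for 3rd iteration
D3 = (distances rounded to two decimals)
                  A      B      C      D
c1 = (1.5,1)      0.50   0.50   3.20   4.61
c2 = (4.5,3.5)    4.30   3.54   0.71   0.71
Choose the minimum distance
G3 =
              A   B   C   D
cluster 1     1   1   0   0
cluster 2     0   0   1   1
In this 3rd iteration: cluster 1 has A, B and cluster 2 has C, D.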

• We found that G3 = G2, which means that no object moved to
another group, so we stop the algorithm. Now we have divided our
data into two clusters.
• Cluster 1: A, B
• Cluster 2: C, D
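
The whole example can be replayed in a short script that records the groups after every iteration and stops once two consecutive groupings agree. This is an illustrative sketch: the `history` list and the loop structure are choices made here, and the centroid update assumes that no cluster becomes empty, which holds for this data.

```python
import math

points = {"A": (1, 1), "B": (2, 1), "C": (4, 3), "D": (5, 4)}
centroids = [(1, 1), (2, 1)]  # initial centroids: A and B

history = []  # cluster membership after each iteration
while True:
    groups = [[] for _ in centroids]
    for name, p in points.items():
        nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        groups[nearest].append(name)
    history.append(groups)
    if len(history) > 1 and history[-1] == history[-2]:
        break  # no object moved: stop
    # Assumes every cluster has at least one member (true for this data).
    centroids = [
        tuple(sum(points[n][d] for n in g) / len(g) for d in range(2))
        for g in groups
    ]

for i, groups in enumerate(history, start=1):
    print(f"iteration {i}: {groups}")
# iteration 1: [['A'], ['B', 'C', 'D']]
# iteration 2: [['A', 'B'], ['C', 'D']]
# iteration 3: [['A', 'B'], ['C', 'D']]
```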
