This project applies unsupervised clustering to a digit dataset, grouping similar samples without using their labels.

Clustering methods:

- Hierarchical Clustering (Ward linkage)
- BIRCH Clustering
- Gaussian Mixture Model (GMM)

Tools:

- Python
- Pandas, NumPy
- Scikit-learn
- Matplotlib

Notes and results:

- The data was normalized before clustering.
- Of the three methods, hierarchical (Ward) and BIRCH clustering produced the best results.
- Two experimental approaches (a proximity matrix and deep learning) were also tried but were not effective.
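
The workflow described above (normalize, then cluster with each method) can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code. Several details are assumptions: scikit-learn's bundled `load_digits` data stands in for the project's dataset, `MinMaxScaler` stands in for the normalization step, ten clusters are requested (one per digit), the GMM uses diagonal covariances to keep the fit well-conditioned, and the adjusted Rand index is used only to score the clusters afterwards. The labels are never passed to the clustering models.

```python
from sklearn.cluster import AgglomerativeClustering, Birch
from sklearn.datasets import load_digits
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import MinMaxScaler

# Assumption: scikit-learn's 8x8 digits stand in for the project's dataset.
X, y = load_digits(return_X_y=True)

# Assumption: min-max scaling to [0, 1] as the normalization step.
X_scaled = MinMaxScaler().fit_transform(X)

# Assumption: one cluster per digit class.
n_clusters = 10

models = {
    "ward": AgglomerativeClustering(n_clusters=n_clusters, linkage="ward"),
    "birch": Birch(n_clusters=n_clusters),
    # Assumption: diagonal covariances, chosen to keep the fit stable in 64 dimensions.
    "gmm": GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=0
    ),
}

scores = {}
for name, model in models.items():
    # Clustering sees only the features; y is used for scoring afterwards.
    labels = model.fit_predict(X_scaled)
    scores[name] = adjusted_rand_score(y, labels)

for name, score in scores.items():
    print(f"{name}: adjusted Rand index = {score:.3f}")
```

Scores from this sketch depend on the scaler, the cluster count, and the evaluation metric, so they are not necessarily comparable with the results reported in the notebook.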

Open and run the notebook:

jupyter notebook clustering.ipynb