Machine Learning - Unit 6: Clustering with Python

Overview

Last week we learned about clustering. This week, we apply clustering (K-Means only) to real-life data using Python's scikit-learn library.

My Reflection

This week focused on the hands-on implementation of K-Means Clustering with Python's scikit-learn, mainly through the unit's seminar. We were also encouraged to read scikit-learn's documentation on K-Means Clustering.
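For reference, a minimal sketch of a K-Means workflow in scikit-learn might look like the following. It is not the seminar's code: the synthetic data from make_blobs, the choice of three clusters, and the variable names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, not the seminar dataset):
# 300 two-dimensional points scattered around 3 centres.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# K-Means relies on Euclidean distance, so features are standardised first.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=3 is chosen here only because the toy data has 3 centres.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

print("Cluster sizes:", np.bincount(labels))
print("Centroid array shape:", kmeans.cluster_centers_.shape)
print("Inertia (within-cluster sum of squares):", kmeans.inertia_)
```

On real data the number of clusters is usually not known in advance, so in practice it would be chosen with something like the elbow method or silhouette analysis rather than fixed up front.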

Since I had previous experience with clustering, I found this unit easy to follow as well. To practise implementing clustering, I had already completed a project on clustering Asian religious texts, which can be viewed here. The project uses three different clustering approaches: K-Means, Gaussian Mixture Model (GMM) and Hierarchical Clustering. I evaluated the three approaches and picked Hierarchical Clustering as the most suitable for the task.
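One common way to compare these three approaches side by side is the silhouette score. The sketch below illustrates the idea on synthetic data; it is not the project's code, and the dataset, the number of clusters k, and the silhouette score as the evaluation metric are assumptions rather than the choices actually made in the project.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data (assumption); the real project used text features.
X, _ = make_blobs(n_samples=200, centers=4, random_state=0)
k = 4  # hypothetical number of clusters, matching the toy data

models = {
    "K-Means": KMeans(n_clusters=k, n_init=10, random_state=0),
    "GMM": GaussianMixture(n_components=k, random_state=0),
    "Hierarchical": AgglomerativeClustering(n_clusters=k, linkage="ward"),
}

# Fit each model and score its labelling; a higher silhouette score
# means tighter, better-separated clusters.
scores = {}
for name, model in models.items():
    labels = model.fit_predict(X)
    scores[name] = silhouette_score(X, labels)

for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

A single metric rarely settles the choice on its own, so it would normally be read alongside a qualitative look at what ends up in each cluster.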

This week also included the deadline for the team project, so we finalised the report and submitted it successfully. The code base and documentation can be viewed here, in a GitHub repository prepared by one of my teammates.

Artefacts

Team Project Submission

The full code base and documentation can be viewed here.