Document Clustering using K Means

Clustering documents is an important task because it groups similar documents together, which is useful for recommendations, similarity detection, building a dataset on a topic, generating new data that follows the same pattern, and so on. Clustering has long been a central task in Natural Language Processing. In this article, we combine TF-IDF with similarity metrics and apply the K-Means clustering algorithm to cluster documents.
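
To make the idea concrete, here is a minimal pure-Python sketch of the pipeline described above: turn each document into a TF-IDF vector, then group the vectors with K-Means. It is not the code from the linked article. The function names (`tfidf_vectors`, `dot`, `kmeans`), the whitespace tokenizer, the smoothed IDF formula, and the evenly spaced initial centroids are all assumptions chosen to keep the example short and deterministic. It also uses cosine similarity on length-normalized vectors (often called spherical K-Means) rather than Euclidean distance, and it assumes every document is non-empty.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return unit-length TF-IDF vectors (sparse dicts) for non-empty documents."""
    tokenised = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF (one common variant): log((1 + N) / (1 + df)) + 1.
        vec = {
            t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors; cosine similarity if both are unit length."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, max_iter=20):
    """Cluster unit vectors by cosine similarity; returns one label per vector."""
    n = len(vectors)
    # Assumption: evenly spaced documents as initial centroids, for determinism.
    # Random or k-means++ initialisation is more usual in practice.
    centroids = [dict(vectors[i * n // k]) for i in range(k)]
    labels = [-1] * n
    for _ in range(max_iter):
        # Assignment step: each document joins its most similar centroid.
        new_labels = [
            max(range(k), key=lambda c: dot(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # converged: no assignment changed
        labels = new_labels
        # Update step: centroid = normalised sum (same direction as the mean).
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if not members:
                continue  # keep the previous centroid if a cluster empties
            total = Counter()
            for v in members:
                for t, w in v.items():
                    total[t] += w
            norm = math.sqrt(sum(w * w for w in total.values()))
            centroids[c] = {t: w / norm for t, w in total.items()}
    return labels


docs = [
    "the cat sat on the mat",
    "a cat and a kitten play",
    "stock markets fell sharply today",
    "investors sold stock as markets dropped",
]
labels = kmeans(tfidf_vectors(docs), k=2)
print(labels)
```

On this toy corpus the two cat sentences share the term "cat" and the two finance sentences share "stock" and "markets", so each pair ends up in its own cluster. For real data, a library implementation with proper tokenization, stop-word handling, and multiple random restarts is the more practical choice.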


This is a companion discussion topic for the original entry at http://iq.opengenus.org/document-clustering-nlp-kmeans/