Document Clustering
Main author: | |
---|---|
Format: | Book chapter |
Language: | English |
Online access: | Full text |
Summary: | Similarity-based clustering of documents to find patterns that characterize the data is one of the most important tasks in textual analytics applications. Clustering documents requires efficient ways to represent them and to measure the distance or closeness between them. Given such a representation, different cluster generation strategies can be used. One of the most popular is the K-means clustering method, which creates clusters based on the distance of the input data to the cluster centers, which is why the resulting groups have concentric topologies. Extensions of the technique, such as the Self-Organizing Map (SOM), not only create clusters with different types of topology but also learn the best assignment of input data to those clusters by considering the relationship with neighboring data points, which makes the SOM attractive as a global optimization technique. |
DOI: | 10.1201/9781003280996-7 |
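
The distance-to-center assignment that the summary attributes to K-means can be illustrated with a minimal sketch. This is not the chapter's implementation: the term-frequency representation, the Euclidean distance, the toy corpus, and the helper names `tf_vector`, `euclidean`, and `kmeans` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Term-frequency vector of `text` over a fixed vocabulary (assumed representation)."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(vectors, centers, iterations=10):
    """Lloyd-style K-means: assign each vector to its nearest center,
    then move each center to the mean of its assigned vectors."""
    centers = [list(c) for c in centers]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [
            min(range(len(centers)), key=lambda k: euclidean(v, centers[k]))
            for v in vectors
        ]
        # Update step: recompute each center as the mean of its members.
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical toy corpus, not taken from the chapter.
docs = [
    "cats purr and cats sleep",
    "dogs bark and dogs run",
    "cats sleep all day",
    "dogs run all day",
]
vocab = sorted({w for d in docs for w in d.split()})
vectors = [tf_vector(d, vocab) for d in docs]

# Seed the two centers with the first two documents (an arbitrary choice).
labels, centers = kmeans(vectors, centers=vectors[:2])
print(labels)
```

With this seeding, the two documents about cats end up in one cluster and the two about dogs in the other. In practice the result depends on the initial centers, so K-means is usually run from several starting points.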