Clustering by previous representative
Main authors: | , |
---|---|
Format: | Patent |
Language: | eng |
Summary: | A method may include identifying documents in a current clustering operation, assigning the identified documents to one or more clusters, selecting a current representative document for each of the one or more clusters, determining whether the current representative document has been re-crawled, determining a previous representative document with which the current representative document was previously associated in a prior clustering operation, if it is determined that the current representative document has not been re-crawled, determining one of the one or more clusters to which the previous representative document has been assigned in the current clustering operation, combining one of the one or more clusters associated with the current representative document that has not been re-crawled with the one of the one or more clusters associated with the previous representative document into a combined cluster, and storing information regarding the combined cluster. |
---|
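
The steps in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation; it only mirrors the sequence the summary describes. The `Doc` fields (`cluster_key`, `score`, `recrawled`), the rule of picking the highest-scoring member as representative, and the `previous_rep` mapping are all hypothetical assumptions introduced for illustration, since the record does not say how clusters are formed or how representatives are chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    cluster_key: str  # hypothetical grouping signal, e.g. a content checksum
    score: float      # hypothetical score used to pick a representative
    recrawled: bool   # True if the document was fetched again since the prior run


def cluster_by_previous_representative(docs, previous_rep):
    """Return {cluster id: sorted member ids} after merging.

    previous_rep maps a doc_id to the representative it was associated
    with in the prior clustering operation (assumed input).
    """
    docs = list(docs)

    # Assign the identified documents to clusters (assumption: by cluster_key).
    clusters = {}
    for d in docs:
        clusters.setdefault(d.cluster_key, []).append(d)
    cluster_of = {d.doc_id: d.cluster_key for d in docs}

    # Union-find over cluster keys so that chained merges resolve to one cluster.
    parent = {k: k for k in clusters}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for key, members in clusters.items():
        # Select the current representative (assumption: highest score).
        rep = max(members, key=lambda d: d.score)
        if rep.recrawled:
            continue  # only representatives that were NOT re-crawled trigger a merge
        prev = previous_rep.get(rep.doc_id)
        if prev is None or prev not in cluster_of:
            continue  # no usable previous representative in the current run
        # Combine this cluster with the one now holding the previous representative.
        a, b = find(key), find(cluster_of[prev])
        if a != b:
            parent[a] = b

    # "Store" the combined clusters (here: simply return them).
    combined = {}
    for key, members in clusters.items():
        combined.setdefault(find(key), []).extend(d.doc_id for d in members)
    return {root: sorted(ids) for root, ids in combined.items()}


example_docs = [
    Doc("a", "k1", 1.0, recrawled=False),
    Doc("b", "k1", 0.5, recrawled=True),
    Doc("c", "k2", 2.0, recrawled=True),
    Doc("d", "k3", 1.0, recrawled=True),
]
example_result = cluster_by_previous_representative(
    example_docs, previous_rep={"a": "c", "d": "c"}
)
```

In this example, cluster `k1` has representative `a`, which was not re-crawled and was previously associated with `c`, so `k1` is combined with `k2`. Cluster `k3` is left alone because its representative `d` was re-crawled.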