Patent document clustering with deep embeddings
The analysis of scientific and technical documents is crucial in the process of establishing science and technology strategies. One popular method for such analysis is for field experts to manually classify each scientific or technical document into one of several predefined technical categories. Ho...
Gespeichert in:
Veröffentlicht in: | Scientometrics 2020-05, Vol.123 (2), p.563-577 |
---|---|
Hauptverfasser: | , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The analysis of scientific and technical documents is crucial in the process of establishing science and technology strategies. One popular method for such analysis is for field experts to manually classify each scientific or technical document into one of several predefined technical categories. However, not only is manual classification error-prone and expensive, but it also requires extended efforts to handle frequent data updates. In contrast, machine learning and text mining techniques enable cheaper and faster operations, and can alleviate the burden on human resources. In this paper, we propose a method for extracting embedded feature vectors by applying a neural embedding approach for text features in patent documents and automatically clustering the embedding features by utilizing a deep embedding clustering method. |
---|---|
ISSN: | 0138-9130 1588-2861 |
DOI: | 10.1007/s11192-020-03396-7 |