A text-embedding-based approach to measuring patent-to-patent technological similarity


Bibliographic details
Published in: Technological forecasting & social change 2022-04, Vol.177, p.121559, Article 121559
Main authors: Hain, Daniel S., Jurowetzki, Roman, Buchmann, Tobias, Wolf, Patrick
Format: Article
Language: English
Online access: Full text
Description
Summary:
•We develop a method to create vector representations of patents based on text data.
•We describe an efficient process to use these vectors to create patent-to-patent similarity measures for large numbers of patents.
•We provide all code and data for reproduction, use, and improvement.
•We evaluate and illustrate the results empirically.
•We illustrate the results of the created measures and metrics using the case of electric vehicle patents.

This paper describes an efficient, scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both the technological signatures and the similarity measures in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.
ISSN:0040-1625
1873-5509
DOI:10.1016/j.techfore.2022.121559
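
The general idea named in the summary (text-based patent vectors plus a nearest-neighbor search over them) can be illustrated with a minimal sketch. This is not the authors' implementation: the word vectors, patent IDs, and token lists below are hypothetical toy data, the helper names `embed` and `top_k_similar` are made up for illustration, and an exact brute-force cosine search stands in for the approximate nearest-neighbor step that a full-scale run would need.

```python
import numpy as np

# Hypothetical toy word vectors; a real pipeline would learn these from patent text.
word_vectors = {
    "battery":   np.array([1.0, 0.0, 0.0]),
    "electrode": np.array([0.9, 0.1, 0.0]),
    "charging":  np.array([0.8, 0.2, 0.0]),
    "engine":    np.array([0.0, 1.0, 0.0]),
    "piston":    np.array([0.0, 0.9, 0.1]),
    "cylinder":  np.array([0.1, 0.9, 0.0]),
}

# Hypothetical patents, already tokenized.
patents = {
    "P1": ["battery", "electrode"],
    "P2": ["battery", "charging"],
    "P3": ["engine", "piston", "cylinder"],
}


def embed(tokens, vectors, dim=3):
    """Average the vectors of known tokens and scale the result to unit length."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)
    mean = np.mean(known, axis=0)
    norm = np.linalg.norm(mean)
    return mean / norm if norm > 0 else mean


def top_k_similar(matrix, k):
    """Exact cosine nearest neighbors for unit-length rows (brute force).

    Assumption: this replaces the approximate search used at scale; it is
    quadratic in the number of patents and only suitable for small examples.
    """
    sims = matrix @ matrix.T           # cosine similarity, since rows are unit length
    np.fill_diagonal(sims, -np.inf)    # exclude each patent from its own neighbor list
    order = np.argsort(-sims, axis=1)[:, :k]
    return order, np.take_along_axis(sims, order, axis=1)


ids = list(patents)
matrix = np.vstack([embed(patents[p], word_vectors) for p in ids])
neighbors, scores = top_k_similar(matrix, k=1)

for row, pid in enumerate(ids):
    print(pid, "->", ids[neighbors[row][0]], round(float(scores[row][0]), 3))
```

Each row of `neighbors` holds the indices of the most similar patents, so the resulting pairs can be read as weighted edges of a similarity network. For millions of patents, the brute-force search would be replaced by an approximate index.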