A text-embedding-based approach to measuring patent-to-patent technological similarity

Daniel S. Hain*, Roman Jurowetzki, Tobias Buchmann, Patrick Wolf


Research output: Contribution to journal, Journal article, Research, peer-reviewed

23 Citations (Scopus)


This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.
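The general idea of turning patent text into vectors and then retrieving each patent's most similar neighbors can be illustrated with a small sketch. It is not the authors' implementation: it assumes plain TF-IDF bag-of-words vectors in place of the text embeddings, and an exact brute-force search in place of the nearest-neighbor approximation named in the abstract. The three toy texts and the function names (tfidf_vectors, cosine, nearest_neighbors) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy texts; real input would be patent abstracts or claims.
patents = {
    "P1": "battery cell for electric vehicle with lithium anode",
    "P2": "lithium battery pack cooling for electric vehicle",
    "P3": "method for brewing coffee with pressurized water",
}


def tfidf_vectors(docs):
    """Return {doc_id: {term: weight}} using a simple TF-IDF weighting."""
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many texts each term appears.
    doc_freq = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for doc_id, toks in tokens.items():
        counts = Counter(toks)
        vectors[doc_id] = {
            term: (count / len(toks)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(vectors, query_id, k=1):
    """Exact top-k neighbors by cosine similarity (brute force, O(n) per query)."""
    scores = [
        (other_id, cosine(vectors[query_id], vec))
        for other_id, vec in vectors.items()
        if other_id != query_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


vectors = tfidf_vectors(patents)
neighbors = nearest_neighbors(vectors, "P1", k=2)
```

Each (patent, neighbor, similarity) triple in such a result can be read as a weighted edge, which is how a set of pairwise similarities becomes a network. Brute-force search compares every pair and so scales quadratically with the number of patents; this is the cost that an approximate nearest-neighbor method is meant to avoid at the scale of all existing patents.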

Journal: Technological Forecasting and Social Change
Status: Published - Apr 2022

Bibliographical note

Publisher Copyright:
© 2022

