A text-embedding-based approach to measuring patent-to-patent technological similarity

Daniel S. Hain; Roman Jurowetzki; Tobias Buchmann; Patrick Wolf

doi:10.1016/j.techfore.2022.121559

Daniel S. Hain^*, Roman Jurowetzki, Tobias Buchmann, Patrick Wolf

^*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

26 Citations (Scopus)

Abstract

This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.
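
The core idea in the abstract, comparing patents through the similarity of their text-embedding vectors and keeping each patent's closest neighbors as network edges, can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors and the patent identifiers are invented for illustration, the function names are hypothetical, and the neighbor search is an exact brute-force scan rather than the approximate nearest-neighbor search the paper uses to scale to all patents.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_neighbors(embeddings, k):
    """Map each patent id to its k most similar other patents as (id, similarity) pairs.

    Exact all-pairs comparison: fine for a toy example, far too slow for millions of patents.
    """
    edges = {}
    for pid, vec in embeddings.items():
        sims = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != pid
        ]
        sims.sort(key=lambda pair: pair[1], reverse=True)
        edges[pid] = sims[:k]
    return edges


# Made-up embedding vectors; real ones would come from a text-embedding model.
toy_embeddings = {
    "patent_A": [1.0, 0.0, 0.0],
    "patent_B": [0.9, 0.1, 0.0],
    "patent_C": [0.0, 1.0, 0.0],
}

# Each patent keeps a single edge to its most similar neighbor.
network = top_k_neighbors(toy_embeddings, k=1)
for pid, neighbors in network.items():
    print(pid, neighbors)
```

The resulting dictionary is a small directed similarity network: each key is a patent and each value lists its strongest outgoing edges with their similarity weights.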

Original language	English
Article number	121559
Journal	Technological Forecasting and Social Change
Volume	177
ISSN	0040-1625
DOIs	https://doi.org/10.1016/j.techfore.2022.121559
Publication status	Published - Apr 2022

Bibliographical note

Funding Information:
All code necessary to recreate our workflow, indicator creation, and analysis is freely available at https://github.com/daniel-hain/patent_embedding_research . All data is also available for download and use in third-party analysis. Financial support for ZSW’s research provided by BMBF Kopernikus ENavi (FKZ:03SFK4W0). – Workflow, Code, and Applications –

Publisher Copyright:
© 2022

Keywords

Natural-language processing
Patent data
Patent landscaping
Patent quality
Technological similarity
Technology network

Cite this

@article{e915b3763eab4fbe9efcf6be265122c3,

title = "A text-embedding-based approach to measuring patent-to-patent technological similarity",

abstract = "This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.",

keywords = "Natural-language processing, Patent data, Patent landscaping, Patent quality, Technological similarity, Technology network",

author = "Hain, {Daniel S.} and Jurowetzki, Roman and Buchmann, Tobias and Wolf, Patrick",

note = "Funding Information: All code necessary to recreate our workflow, indicator creation, and analysis is freely available at https://github.com/daniel-hain/patent_embedding_research . All data is also available for download and use in third-party analysis. Financial support for ZSW{\textquoteright}s research provided by BMBF Kopernikus ENavi (FKZ:03SFK4W0). – Workflow, Code, and Applications – Publisher Copyright: {\textcopyright} 2022",

year = "2022",

month = apr,

doi = "10.1016/j.techfore.2022.121559",

language = "English",

volume = "177",

journal = "Technological Forecasting and Social Change",

issn = "0040-1625",

publisher = "Elsevier",

}

TY - JOUR

T1 - A text-embedding-based approach to measuring patent-to-patent technological similarity

AU - Hain, Daniel S.

AU - Jurowetzki, Roman

AU - Buchmann, Tobias

AU - Wolf, Patrick

N1 - Funding Information: All code necessary to recreate our workflow, indicator creation, and analysis is freely available at https://github.com/daniel-hain/patent_embedding_research . All data is also available for download and use in third-party analysis. Financial support for ZSW’s research provided by BMBF Kopernikus ENavi (FKZ:03SFK4W0). – Workflow, Code, and Applications – Publisher Copyright: © 2022

PY - 2022/4

Y1 - 2022/4

N2 - This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.

AB - This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute similarities between all existing patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness in measuring knowledge flows, mapping technological change, and creating patent quality indicators. This paper contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of our methods, including all code and indicators, at https://github.com/AI-Growth-Lab/patent_p2p_similarity_w2v.

KW - Natural-language processing

KW - Patent data

KW - Patent landscaping

KW - Patent quality

KW - Technological similarity

KW - Technology network

UR - http://www.scopus.com/inward/record.url?scp=85124302402&partnerID=8YFLogxK

U2 - 10.1016/j.techfore.2022.121559

DO - 10.1016/j.techfore.2022.121559

M3 - Journal article

AN - SCOPUS:85124302402

SN - 0040-1625

VL - 177

JO - Technological Forecasting and Social Change

JF - Technological Forecasting and Social Change

M1 - 121559

ER -
