A Vector Worth a Thousand Counts: A Temporal Semantic Similarity Approach to Patent Impact Prediction

Daniel Hain, Roman Jurowetzki, Tobias Buchmann, Patrick Wolf

Publication: Working paper/Preprint, Working paper, Research



Patent data has long been used as a widely accessible measure of the
rate and direction of technological change. However, this long tradition of research has
so far focused on producing and analyzing measures of patent quantity, assuming that the
number of patents produced accurately captures the rate of progress and innovation
output. Existing attempts to measure patent quality are mostly limited to the use of
forward and backward citation patterns. In contrast, in this paper, we derive a patent
quality indicator by leveraging the rich but so far under-utilized textual information
in patent abstracts. We employ vector space modeling techniques to create a high-dimensional
vector representation of the patents to capture their technological signature.
Using approximate nearest neighbor matching techniques that scale almost linearly, we
are able to compute dyadic similarity scores across large bodies of patent data. Based on
the temporal distribution of a patent's similarity scores, we compute ex-ante indicators
of a patent's technological novelty and ex-post indicators of technological impact and
significance. Using the case of circa 132,000 electro-mobility patents, we demonstrate the
proposed indicators' ability to map, analyze, and predict patent quality at the individual, firm, and
country level, and its development over time.
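
The pipeline outlined in the abstract (text to vectors, pairwise similarity, then a split of each patent's similarities by time) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the TF-IDF weighting, the exact brute-force cosine comparison (swapped in for the approximate nearest neighbor matching the paper uses), the names `tfidf_vectors`, `cosine`, and `temporal_similarity`, the use of plain means as summaries, and the toy abstracts themselves.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Turn raw texts into sparse TF-IDF vectors (dict: term -> weight)."""
    docs = [Counter(text.lower().split()) for text in texts]
    n_docs = len(docs)
    # Document frequency: in how many texts does each term occur?
    doc_freq = Counter(term for doc in docs for term in doc)
    return [
        {
            # Smoothed IDF keeps every weight strictly positive.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in doc.items()
        }
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_or_none(values):
    """Mean of a list, or None when there is nothing to average."""
    return sum(values) / len(values) if values else None


def temporal_similarity(patents):
    """For each (year, text) pair, average similarity to earlier and later patents.

    Assumption: a low 'sim_past' hints at novelty (ex-ante), a high
    'sim_future' hints at impact (ex-post). Brute force, O(n^2) comparisons.
    """
    vectors = tfidf_vectors([text for _, text in patents])
    results = []
    for i, (year_i, _) in enumerate(patents):
        past, future = [], []
        for j, (year_j, _) in enumerate(patents):
            if i == j:
                continue
            sim = cosine(vectors[i], vectors[j])
            if year_j < year_i:
                past.append(sim)
            elif year_j > year_i:
                future.append(sim)
        results.append({
            "year": year_i,
            "sim_past": mean_or_none(past),
            "sim_future": mean_or_none(future),
        })
    return results


# Hypothetical toy abstracts, invented for this sketch.
patents = [
    (2000, "battery cell cooling system for electric vehicle"),
    (2005, "battery cell cooling plate for electric vehicle"),
    (2005, "wireless charging coil for electric vehicle"),
    (2010, "wireless charging coil alignment for electric vehicle"),
]
scores = temporal_similarity(patents)
for row in scores:
    print(row)
```

The quadratic loop is only workable for toy inputs; at the scale of the roughly 132,000 patents mentioned above, an approximate nearest neighbor index would replace the inner loop, which is the role the abstract assigns to that technique.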
Status: Published - 2022


  • OECD IPSDM “Big Data Analytics” Challenge

    Hain, Daniel (Recipient), Jurowetzki, Roman (Recipient), Buchmann, Tobias (Recipient), Wolf, Patrick (Recipient) & Simmering, Paul (Recipient), 13 Sep 2018

    Prize: Other prizes