A Vector Worth a Thousand Counts: A Temporal Semantic Similarity Approach to Patent Impact Prediction

Research output: Working paper (Research)

Abstract

Patent data has long been used as a widely accessible measure of the
rate and direction of technological change. However, this long tradition of research has
so far focused on producing and analyzing measures of patent quantity, assuming that the
number of patents produced accurately captures the rate of progress and innovation
output. Existing attempts to measure patent quality are mostly limited to the use of
forward and backward citation patterns. In contrast, in this paper, we derive a patent
quality indicator by leveraging the rich but up to now under-utilized textual information
in patent abstracts. We employ vector space modeling techniques to create a high-dimensional
vector representation of the patents to capture their technological signature.
Using near-linear-scaling approximate nearest neighbor matching techniques, we
are able to compute dyadic similarity scores across large bodies of patent data. Based on
the temporal distribution of a patent's similarity scores, we compute ex-ante indicators
of a patent's technological novelty and ex-post indicators of technological impact and
significance. Using the case of circa 132,000 electro-mobility patents, we demonstrate the
proposed indicators' ability to map, analyze, and predict patent quality at the individual, firm, and
country level, and its development over time.
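
The pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's implementation: the function names (`tfidf_vectors`, `cosine`, `temporal_indicators`) are hypothetical, TF-IDF stands in for whichever vector space model the authors use, exact brute-force pairwise comparison replaces the approximate nearest neighbor search, and "novelty" and "impact" are simplified to one minus the mean similarity to earlier patents and the mean similarity to later patents, respectively.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF dicts (term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed inverse document frequency keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors, clamped to [0, 1]."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return min(1.0, dot / (norm_a * norm_b))


def temporal_indicators(patents):
    """patents: list of (year, abstract_text) tuples.

    Returns one dict per patent with hypothetical indicators:
      novelty = 1 - mean similarity to earlier patents (None if there are none)
      impact  = mean similarity to later patents (None if there are none)
    Brute force is O(n^2); it is used here only because the data is tiny.
    """
    vectors = tfidf_vectors([text.lower().split() for _, text in patents])
    results = []
    for i, (year_i, _) in enumerate(patents):
        past, future = [], []
        for j, (year_j, _) in enumerate(patents):
            if year_j < year_i:
                past.append(cosine(vectors[i], vectors[j]))
            elif year_j > year_i:
                future.append(cosine(vectors[i], vectors[j]))
        results.append({
            "year": year_i,
            "novelty": 1.0 - sum(past) / len(past) if past else None,
            "impact": sum(future) / len(future) if future else None,
        })
    return results


# Toy abstracts invented for this example, not taken from the paper's data.
toy_patents = [
    (2000, "battery cell for electric vehicle"),
    (2005, "battery cell cooling for electric vehicle"),
    (2010, "wireless charging pad for electric vehicle"),
]
scores = temporal_indicators(toy_patents)
```

Under these assumptions, a patent that resembles few earlier patents scores high on novelty before any follow-on work exists, while a patent that many later patents resemble scores high on impact. At the scale mentioned in the abstract, the pairwise loop would need to be replaced by an approximate nearest neighbor index.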

Details

Original language: English
Publication status: In preparation - 2018
Publication category: Research
Peer-reviewed: No

    Research areas

  • Technological change, patent data, natural language processing, vector space modeling, quality indicators

ID: 282564570