Pivot selection algorithms in metric spaces: a survey and experimental study

Yifan Zhu, Lu Chen, Yunjun Gao, Christian S. Jensen

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningpeer review

3 Citationer (Scopus)
1 Downloads (Pure)

Abstract

Similarity search in metric spaces is used widely in areas such as multimedia retrieval, data mining, data integration, to name but a few. To accelerate metric similarity search, pivot-based indexing is often employed. Pivot-based indexing first computes the distances between data objects and pivots and then exploits filtering techniques that use the triangle inequality on pre-computed distances to prune search space during search. The performance of pivot-based indexing depends on the quality of the pivots used, and many algorithms have been proposed for selecting high-quality pivots. We present a comprehensive empirical study of pivot selection algorithms. Specifically, we classify all existing algorithms into three categories according to the types of distances they use for selecting pivots. We also propose a new pivot selection algorithm that exploits the power law probabilistic distribution. Next, we report on a comprehensive empirical study of the search performance enabled by different pivot selection approaches, using different datasets and indexes, thus contributing new insight into the strengths and weaknesses of existing selection techniques. Finally, we offer advice on how to select appropriate pivot selection algorithms for different settings.

OriginalsprogEngelsk
TidsskriftVLDB Journal
Vol/bind31
Udgave nummer1
Sider (fra-til)23-47
Antal sider25
ISSN1066-8888
DOI
StatusUdgivet - 2022

Bibliografisk note

Funding Information:
This work was supported in part by the NSFC under Grants No. 62025206 and 61972338. Yunjun Gao is the corresponding author of the work.

Publisher Copyright:
© 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.

Fingeraftryk

Dyk ned i forskningsemnerne om 'Pivot selection algorithms in metric spaces: a survey and experimental study'. Sammen danner de et unikt fingeraftryk.

Citationsformater