Learned Probing Cardinality Estimation for High-Dimensional Approximate NN Search

Bolong Zheng*, Ziyang Yue, Qi Hu, Xiaomeng Yi, Xiaofan Luan, Charles Xie, Xiaofang Zhou, Christian S. Jensen

*Kontaktforfatter

Publikation: Bidrag til bog/antologi/rapport/konference proceedingKonferenceartikel i proceedingForskningpeer review

1 Citationer (Scopus)

Abstract

Approximate nearest neighbor (ANN) search in high-dimensional space plays an essential role in a variety of real-world applications. A well-known solution to ANN search, inverted file product quantization (IVFPQ) adopts inverted files to avoid exhaustive examination and compresses vectors using product quantization to reduce the space overhead. However, existing implementations use the same fixed probing cardinality (i.e., the number of cells to probe) setting for all queries, which leads to too many or too few cell examinations, thus increasing the average query latency or reducing the recall. To achieve a better trade-off between latency and accuracy, we enable probing cardinality estimation for high-dimensional ANN search by using deep learning techniques. We develop HBK-means, a hierarchical balanced clustering algorithm that reduces the data distribution imbalance of cells to enable a better estimation. Next, we develop PCE-Net, an encoder-decoder based neural network for estimating query-dependent minimum probing cardinality. In addition, we introduce two query optimization strategies: lower bound sorting based pruning (LBS-Pruning) and early termination (ET), to further reduce query latency. Extensive experiments with real-world data offer evidence that the proposed solution is capable of achieving better performance than IVFPQ and its variants.

OriginalsprogEngelsk
Titel2023 IEEE 39th International Conference on Data Engineering (ICDE)
Antal sider13
ForlagIEEE
Publikationsdato2023
Sider3209-3221
ISBN (Trykt)979-8-3503-2228-6
ISBN (Elektronisk)979-8-3503-2227-9
DOI
StatusUdgivet - 2023
Begivenhed39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, USA
Varighed: 3 apr. 20237 apr. 2023

Konference

Konference39th IEEE International Conference on Data Engineering, ICDE 2023
Land/OmrådeUSA
ByAnaheim
Periode03/04/202307/04/2023
NavnProceedings of the International Conference on Data Engineering
ISSN1063-6382

Bibliografisk note

Publisher Copyright:
© 2023 IEEE.

Fingeraftryk

Dyk ned i forskningsemnerne om 'Learned Probing Cardinality Estimation for High-Dimensional Approximate NN Search'. Sammen danner de et unikt fingeraftryk.

Citationsformater