Abstract
Approximate nearest neighbor (ANN) search in high-dimensional space plays an essential role in a variety of real-world applications. A well-known solution to ANN search, inverted file product quantization (IVFPQ) adopts inverted files to avoid exhaustive examination and compresses vectors using product quantization to reduce the space overhead. However, existing implementations use the same fixed probing cardinality (i.e., the number of cells to probe) setting for all queries, which leads to too many or too few cell examinations, thus increasing the average query latency or reducing the recall. To achieve a better trade-off between latency and accuracy, we enable probing cardinality estimation for high-dimensional ANN search by using deep learning techniques. We develop HBK-means, a hierarchical balanced clustering algorithm that reduces the data distribution imbalance of cells to enable a better estimation. Next, we develop PCE-Net, an encoder-decoder based neural network for estimating query-dependent minimum probing cardinality. In addition, we introduce two query optimization strategies: lower bound sorting based pruning (LBS-Pruning) and early termination (ET), to further reduce query latency. Extensive experiments with real-world data offer evidence that the proposed solution is capable of achieving better performance than IVFPQ and its variants.
Originalsprog | Engelsk |
---|---|
Titel | 2023 IEEE 39th International Conference on Data Engineering (ICDE) |
Antal sider | 13 |
Forlag | IEEE |
Publikationsdato | 2023 |
Sider | 3209-3221 |
ISBN (Trykt) | 979-8-3503-2228-6 |
ISBN (Elektronisk) | 979-8-3503-2227-9 |
DOI | |
Status | Udgivet - 2023 |
Begivenhed | 39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, USA Varighed: 3 apr. 2023 → 7 apr. 2023 |
Konference
Konference | 39th IEEE International Conference on Data Engineering, ICDE 2023 |
---|---|
Land/Område | USA |
By | Anaheim |
Periode | 03/04/2023 → 07/04/2023 |
Navn | Proceedings of the International Conference on Data Engineering |
---|---|
ISSN | 1063-6382 |
Bibliografisk note
Publisher Copyright:© 2023 IEEE.