TY - ABST
T1 - Answering skyline queries over incomplete data with crowdsourcing (Extended Abstract)
AU - Miao, Xiaoye
AU - Gao, Yunjun
AU - Guo, Su
AU - Chen, Lu
AU - Yin, Jianwei
AU - Li, Qing
N1 - Funding Information:
This work was supported in part by NSFC Grants No. 61902343, 61972338, 61825205, 61772459, and U1609217, the National Key Research and Development Program of China under Grants No. 2018YFB1004003 and 2017YFB1400603, the National Science and Technology Major Project of China under Grant No. 50-D36B02-9002-16/19, the ZJU-Hikvision Joint Project, and the Fundamental Research Funds for the Central Universities. Yunjun Gao is the corresponding author of the work.
Publisher Copyright:
© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/4
Y1 - 2020/4
AB - Because incomplete data is pervasive, queries over incomplete data are vital in a large number of real-life scenarios. Current models and approaches for incomplete data queries rely mainly on machine power. In this paper, we study the problem of skyline queries over incomplete data with crowdsourcing. We propose a novel query framework, termed BayesCrowd, built on top of a Bayesian network and the typical c-table model for incomplete data. Considering budget and latency constraints, we present a suite of effective task selection strategies. In particular, since computing the probability of each object being an answer object is at least as hard as the #SAT problem, we propose an adaptive DPLL (i.e., Davis-Putnam-Logemann-Loveland) algorithm to speed up the computation. Extensive experiments using both real and synthetic data sets confirm the superiority of BayesCrowd over the state-of-the-art method.
UR - http://www.scopus.com/inward/record.url?scp=85085866764&partnerID=8YFLogxK
U2 - 10.1109/ICDE48307.2020.00235
DO - 10.1109/ICDE48307.2020.00235
M3 - Conference abstract in proceeding
AN - SCOPUS:85085866764
T3 - Proceedings - International Conference on Data Engineering
SP - 2032
EP - 2033
BT - Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
PB - IEEE (Institute of Electrical and Electronics Engineers)
T2 - 36th IEEE International Conference on Data Engineering, ICDE 2020
Y2 - 20 April 2020 through 24 April 2020
ER -