Multi-Dimensional Top-k Dominating Queries

Man Lung Yiu; Nikos Mamoulis

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

The top-k dominating query returns k data objects
which dominate the highest number of objects in a
dataset. This query is an important tool for decision support
since it provides data analysts an intuitive way for finding
significant objects. In addition, it combines the advantages
of top-k and skyline queries without sharing their disadvantages:
(i) the output size can be controlled, (ii) no ranking
functions need to be specified by users, and (iii) the result
is independent of the scales at different dimensions. Despite
their importance, top-k dominating queries have not
received adequate attention from the research community.
This paper is an extensive study on the evaluation of top-k
dominating queries. First, we propose a set of algorithms
that apply on indexed multi-dimensional data. Second, we
investigate query evaluation on data that are not indexed. Finally,
we study a relaxed variant of the query which considers
dominance in dimensional subspaces. Experiments using
synthetic and real datasets demonstrate that our algorithms
significantly outperform a previous skyline-based approach.
We also illustrate the applicability of this multi-dimensional
analysis query by studying the meaningfulness of its results
on real data.
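
As an illustration of the query definition in the abstract, the following is a minimal brute-force sketch. It is not one of the paper's algorithms. It assumes the usual skyline convention, which the abstract does not spell out: a point dominates another if it is no worse in every dimension and strictly better in at least one, with smaller values treated as better. The names `dominates`, `top_k_dominating`, and the sample `hotels` data are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """Assumed convention: smaller is better in every dimension.

    a dominates b if a is no worse everywhere and strictly better somewhere.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Return the k points with the highest dominance counts.

    Brute force: compares every pair, so it is quadratic in the number of
    points. Ties keep their input order because Python's sort is stable.
    """
    scored = [(p, sum(dominates(p, q) for q in points)) for p in points]
    scored.sort(key=lambda pair: -pair[1])
    return scored[:k]


# Hypothetical data: (price, distance) pairs, lower is better in both.
hotels = [(1, 9), (2, 2), (3, 3), (4, 1), (5, 5), (9, 4)]
result = top_k_dominating(hotels, 2)
```

In this sample, `(1, 9)` is not dominated by any other point, so it would appear in the skyline, yet it dominates nothing and is not returned here. The score depends only on per-dimension comparisons, so rescaling a dimension by a positive factor leaves the result unchanged, and `k` fixes the output size.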

Original language	English
Journal	VLDB Journal
Volume	18
Issue number	3
Pages (from-to)	695-718
ISSN	1066-8888
Publication status	Published - 2009

Access to Document

http://portal.acm.org/citation.cfm?id=1553321.1553325

Cite this

@article{676373d000f611df9a61000ea68e967b,

title = "Multi-Dimensional Top-k Dominating Queries",

abstract = "The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study on the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply on indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.",

author = "Yiu, {Man Lung} and Mamoulis, Nikos",

year = "2009",

language = "English",

volume = "18",

pages = "695--718",

journal = "VLDB Journal",

issn = "1066-8888",

publisher = "Springer",

number = "3",

}

TY - JOUR

T1 - Multi-Dimensional Top-k Dominating Queries

AU - Yiu, Man Lung

AU - Mamoulis, Nikos

PY - 2009

Y1 - 2009

N2 - The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study on the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply on indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.

AB - The top-k dominating query returns k data objects which dominate the highest number of objects in a dataset. This query is an important tool for decision support since it provides data analysts an intuitive way for finding significant objects. In addition, it combines the advantages of top-k and skyline queries without sharing their disadvantages: (i) the output size can be controlled, (ii) no ranking functions need to be specified by users, and (iii) the result is independent of the scales at different dimensions. Despite their importance, top-k dominating queries have not received adequate attention from the research community. This paper is an extensive study on the evaluation of top-k dominating queries. First, we propose a set of algorithms that apply on indexed multi-dimensional data. Second, we investigate query evaluation on data that are not indexed. Finally, we study a relaxed variant of the query which considers dominance in dimensional subspaces. Experiments using synthetic and real datasets demonstrate that our algorithms significantly outperform a previous skyline-based approach. We also illustrate the applicability of this multi-dimensional analysis query by studying the meaningfulness of its results on real data.

M3 - Journal article

SN - 1066-8888

VL - 18

SP - 695

EP - 718

JO - VLDB Journal

JF - VLDB Journal

IS - 3

ER -
