Lightweight graphical models for selectivity estimation without independence assumptions

Kostas Tzoumas*, Amol Deshpande, Christian S. Jensen

*Kontaktforfatter

Publikation: Bidrag til tidsskriftKonferenceartikel i tidsskriftForskningpeer review

49 Citationer (Scopus)

Abstract

As a result of decades of research and industrial development, modern query optimizers are complex software artifacts. However, the quality of the query plan chosen by an optimizer is largely determined by the quality of the underlying statistical summaries. Small selectivity estimation errors, propagated exponentially, can lead to severely sub-optimal plans. Modern optimizers typically maintain one-dimensional statistical summaries and make the attribute value independence and join uniformity assumptions for efficiently estimating selectivities. Therefore, selectivity estimation errors in today's optimizers are frequently caused by missed correlations between attributes. We present a selectivity estimation approach that does not make the independence assumptions. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution of all the attributes in the database into small, usually two-dimensional distributions. We describe several optimizations that can make selectivity estimation highly efficient, and we present a complete implementation inside PostgreSQL's query optimizer. Experimental results indicate an order of magnitude better selectivity estimates, while keeping optimization time in the range of tens of milliseconds.

OriginalsprogEngelsk
TidsskriftProceedings of the VLDB Endowment
Vol/bind4
Sider (fra-til)852-863
Antal sider12
ISSN2150-8097
StatusUdgivet - aug. 2011
BegivenhedThe 37th International Conference on Very Large Data Bases - Seattle, Washington, USA
Varighed: 29 aug. 20113 sep. 2011

Konference

KonferenceThe 37th International Conference on Very Large Data Bases
Land/OmrådeUSA
BySeattle, Washington
Periode29/08/201103/09/2011

Fingeraftryk

Dyk ned i forskningsemnerne om 'Lightweight graphical models for selectivity estimation without independence assumptions'. Sammen danner de et unikt fingeraftryk.

Citationsformater