Lightweight graphical models for selectivity estimation without independence assumptions

Kostas Tzoumas*, Amol Deshpande, Christian S. Jensen

*Corresponding author for this work

Research output: Contribution to journalConference article in JournalResearchpeer-review

47 Citations (Scopus)

Abstract

As a result of decades of research and industrial development, modern query optimizers are complex software artifacts. However, the quality of the query plan chosen by an optimizer is largely determined by the quality of the underlying statistical summaries. Small selectivity estimation errors, propagated exponentially, can lead to severely sub-optimal plans. Modern optimizers typically maintain one-dimensional statistical summaries and make the attribute value independence and join uniformity assumptions for efficiently estimating selectivities. Therefore, selectivity estimation errors in today's optimizers are frequently caused by missed correlations between attributes. We present a selectivity estimation approach that does not make the independence assumptions. By carefully using concepts from the field of graphical models, we are able to factor the joint probability distribution of all the attributes in the database into small, usually two-dimensional distributions. We describe several optimizations that can make selectivity estimation highly efficient, and we present a complete implementation inside PostgreSQL's query optimizer. Experimental results indicate an order of magnitude better selectivity estimates, while keeping optimization time in the range of tens of milliseconds.

Original languageEnglish
JournalProceedings of the VLDB Endowment
Volume4
Pages (from-to)852-863
Number of pages12
ISSN2150-8097
Publication statusPublished - Aug 2011
EventThe 37th International Conference on Very Large Data Bases - Seattle, Washington, United States
Duration: 29 Aug 20113 Sept 2011

Conference

ConferenceThe 37th International Conference on Very Large Data Bases
Country/TerritoryUnited States
CitySeattle, Washington
Period29/08/201103/09/2011

Fingerprint

Dive into the research topics of 'Lightweight graphical models for selectivity estimation without independence assumptions'. Together they form a unique fingerprint.

Cite this