SimpleETL: ETL Processing by Simple Specifications

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

2 Citations (Scopus)
367 Downloads (Pure)

Abstract

Massive quantities of data are collected from many sources today. However, it is often labor-intensive to handle and integrate these data sources into a data warehouse. Further, the complexity increases when specific requirements exist. One such new requirement is the right to be forgotten, where an organization must, upon request, delete all data about an individual. Another requirement arises when facts are updated retrospectively. In this paper, we present the general framework SimpleETL, which is currently used for Extract-Transform-Load (ETL) processing in a company with such requirements. SimpleETL automatically handles all database interactions, such as creating fact tables, dimensions, and foreign keys. The framework also has features for version management of facts and implements four different methods for handling deleted facts. The framework enables users such as data scientists to program complete and complex ETL solutions very efficiently with only a few lines of code, as demonstrated with a real-world example.
Original language: English
Title of host publication: Proceedings of the 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference
Number of pages: 6
Volume: 2062
Publisher: CEUR Workshop Proceedings
Publication date: 1 Jan 2018
Publication status: Published - 1 Jan 2018
Event: 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference - TU Wien's Faculty of Electrical Engineering, Wien, Austria
Duration: 26 Mar 2018 to 29 Mar 2018
Conference number: 20
http://www.cs.put.poznan.pl/events/DOLAP2018.html

Conference

Conference: 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference
Number: 20
Location: TU Wien's Faculty of Electrical Engineering
Country/Territory: Austria
City: Wien
Period: 26/03/2018 to 29/03/2018
Series: CEUR Workshop Proceedings
Volume: 2062
ISSN: 1613-0073
