Abstract
Massive quantities of data are collected today from many sources. However, it is often labor-intensive to handle and integrate these data sources into a data warehouse. The complexity increases further when specific requirements exist. One such new requirement is the right to be forgotten, where an organization must, upon request, delete all data about an individual. Another requirement arises when facts are updated retrospectively. In this paper, we present the general framework SimpleETL, which is currently used for Extract-Transform-Load (ETL) processing in a company with such requirements. SimpleETL automatically handles all database interactions such as creating fact tables, dimensions, and foreign keys. The framework also has features for handling version management of facts and implements four different methods for handling deleted facts. The framework enables, e.g., data scientists to program complete and complex ETL solutions very efficiently with only a few lines of code, which is demonstrated with a real-world example.
Original language | English |
---|---|
Title of host publication | Proceedings of the 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference |
Number of pages | 6 |
Volume | 2062 |
Publisher | CEUR Workshop Proceedings |
Publication date | 1 Jan 2018 |
Publication status | Published - 1 Jan 2018 |
Event | 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference - TU Wien's Faculty of Electrical Engineering, Wien, Austria. Duration: 26 Mar 2018 → 29 Mar 2018. Conference number: 20. http://www.cs.put.poznan.pl/events/DOLAP2018.html |
Conference
Conference | 20th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data co-located with 10th EDBT/ICDT Joint Conference |
---|---|
Number | 20 |
Location | TU Wien's Faculty of Electrical Engineering |
Country/Territory | Austria |
City | Wien |
Period | 26/03/2018 → 29/03/2018 |
Internet address | http://www.cs.put.poznan.pl/events/DOLAP2018.html |
Series | CEUR Workshop Proceedings |
---|---|
Volume | 2062 |
ISSN | 1613-0073 |