MAIME: A Maintenance Manager for ETL Processes

Darius Butkevicius, Philipp Daniel Freiberger, Frederik Madsen Halberg, Jacob Bach Hansen, Søren Jensen, Michael Tarp, Harry Xuegang Huang, Christian Thomsen

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

Abstract

The proliferation of business intelligence applications moves most organizations into an era where data becomes an essential part of the success factors. More and more business focus has thus been added to the integration and processing of data in the enterprise environment. Developing and maintaining Extract-Transform-Load (ETL) processes becomes critical in most data-driven organizations. External Data Sources (EDSs) often change their schema, which potentially leaves the ETL processes that extract data from those EDSs invalid. Repairing these ETL processes is time-consuming and tedious. As a remedy, we propose MAIME as a tool to (semi-)automatically maintain ETL processes. MAIME works with SQL Server Integration Services (SSIS) and uses a graph model as a layer of abstraction on top of SSIS Data Flow tasks (ETL processes). We introduce a graph alteration algorithm which propagates detected EDS schema changes through the graph. Modifications done to a graph are directly applied to the underlying ETL process. How MAIME handles EDS schema changes can be configured for different SSIS transformations. For the considered set of transformations, MAIME can maintain SSIS Data Flow tasks (semi-)automatically. In an evaluation, compared to doing this manually, the number of user inputs is decreased by a factor of 9.5 and the time spent is reduced by a factor of 9.8.
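
The idea of propagating a schema change through a graph of transformations can be illustrated with a minimal sketch. It is not MAIME's algorithm and does not use any SSIS API: the `Node` class, the `"propagate"`/`"block"` policies, and the `propagate_deletion` function are hypothetical names chosen for illustration. The sketch assumes the only change is a deleted column and that each vertex either accepts the deletion or blocks it when it depends on the column.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical transformation vertex (not an SSIS or MAIME type)."""
    name: str
    columns: set                                   # columns flowing out of this vertex
    depends_on: set = field(default_factory=set)   # columns the transformation needs
    policy: str = "propagate"                      # assumed policies: "propagate" or "block"


def propagate_deletion(nodes, edges, source, column):
    """Walk the graph breadth-first from `source` and delete `column`.

    Returns (modified, blocked). If any vertex blocks the change,
    the graph is left untouched and `modified` is empty.
    """
    affected, blocked = [], []
    queue, seen = deque([source]), {source}
    while queue:
        name = queue.popleft()
        node = nodes[name]
        if column not in node.columns:
            continue  # the column never reaches this vertex
        if column in node.depends_on and node.policy == "block":
            blocked.append(name)
        affected.append(name)
        for succ in edges.get(name, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    if blocked:
        return [], blocked  # leave the decision to the user
    for name in affected:
        nodes[name].columns.discard(column)
        nodes[name].depends_on.discard(column)
    return affected, []


def make_graph(sort_policy):
    """Build a three-vertex example: source -> sort -> destination."""
    cols = {"id", "name", "age"}
    nodes = {
        "source": Node("source", set(cols)),
        "sort": Node("sort", set(cols), depends_on={"age"}, policy=sort_policy),
        "destination": Node("destination", set(cols)),
    }
    edges = {"source": ["sort"], "sort": ["destination"]}
    return nodes, edges


nodes, edges = make_graph("propagate")
modified, blocked = propagate_deletion(nodes, edges, "source", "age")
print(modified, blocked)
```

With the `"propagate"` policy the deletion reaches every vertex; calling `make_graph("block")` instead makes the sort vertex refuse the change, so nothing is modified and the blocking vertex is reported for a manual decision. This mirrors, in a simplified way, the configurable per-transformation handling described in the abstract.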
Original language: English
Title of host publication: Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference (EDBT/ICDT 2017)
Number of pages: 10
Publisher: CEUR Workshop Proceedings
Publication date: 15 Mar 2017
Article number: 8
Publication status: Published - 15 Mar 2017
Event: Nineteenth International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data - Venice, Italy
Duration: 21 Mar 2017 to 24 Mar 2017
Conference number: 19
http://www.info.univ-tours.fr/~marcel/dolap2017/

Workshop

Workshop: Nineteenth International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data
Number: 19
Country/Territory: Italy
City: Venice
Period: 21/03/2017 to 24/03/2017
Series: CEUR Workshop Proceedings
Volume: 1810
ISSN: 1613-0073
