Abstract
The proliferation of business intelligence applications has moved most organizations into an era where data is an essential success factor. Businesses have therefore increased their focus on the integration and processing of data in the enterprise environment. Developing and maintaining Extraction-Transform-Load (ETL) processes has become critical in most data-driven organizations. External Data Sources (EDSs) often change their schema, which can invalidate the ETL processes that extract data from those EDSs. Repairing these ETL processes is time-consuming and tedious. As a remedy, we propose MAIME, a tool to (semi-)automatically maintain ETL processes. MAIME works with SQL Server Integration Services (SSIS) and uses a graph model as a layer of abstraction on top of SSIS Data Flow tasks (ETL processes). We introduce a graph alteration algorithm that propagates detected EDS schema changes through the graph. Modifications made to a graph are applied directly to the underlying ETL process. How MAIME handles EDS schema changes can be configured for different SSIS transformations. For the considered set of transformations, MAIME can maintain SSIS Data Flow tasks (semi-)automatically. In an evaluation, compared to manual maintenance, the number of user inputs decreased by a factor of 9.5 and the time spent by a factor of 9.8.
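To make the idea of propagating a schema change through a graph more concrete, the following is a minimal Python sketch. It is not MAIME's algorithm, which is described in the paper and operates on SSIS Data Flow tasks. It assumes a simplified model in which each node lists its output columns together with the input columns each one depends on, and it handles only one kind of change, a deleted source column. The function name, the node names, and the column names are all hypothetical.

```python
from collections import deque


def propagate_deletion(edges, outputs, source, deleted_column):
    """Return {node: set of output columns that become invalid}.

    edges:   {node: [successor, ...]} describing an acyclic data flow.
    outputs: {node: {output_column: set of input columns it depends on}}.
    This is an illustrative model, not the SSIS or MAIME data structures.
    """
    # Count incoming edges so nodes are visited in topological order.
    indegree = {node: 0 for node in outputs}
    for successors in edges.values():
        for succ in successors:
            indegree[succ] += 1

    # Columns missing from each node's input, accumulated from predecessors.
    missing_input = {node: set() for node in outputs}
    invalid = {node: set() for node in outputs}
    invalid[source] = {deleted_column}

    queue = deque(node for node, degree in indegree.items() if degree == 0)
    while queue:
        node = queue.popleft()
        # An output column is invalid if any column it depends on is missing.
        for column, depends_on in outputs[node].items():
            if depends_on & missing_input[node]:
                invalid[node].add(column)
        for succ in edges.get(node, []):
            missing_input[succ] |= invalid[node]
            indegree[succ] -= 1
            if indegree[succ] == 0:
                queue.append(succ)
    return invalid


# Hypothetical three-step flow: source -> derive -> destination.
edges = {"source": ["derive"], "derive": ["destination"], "destination": []}
outputs = {
    "source": {"id": set(), "name": set(), "age": set()},
    "derive": {"id": {"id"}, "name": {"name"}, "age_group": {"age"}},
    "destination": {"id": {"id"}, "name": {"name"}, "age_group": {"age_group"}},
}
invalid_columns = propagate_deletion(edges, outputs, "source", "age")
```

In this example, deleting `age` at the source invalidates the derived `age_group` column and, in turn, the destination column that depends on it, while `id` and `name` are unaffected. A real tool would also need a policy for each transformation type, for instance whether to drop the affected column or block the change, which is the kind of configuration the abstract refers to.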
Original language | English |
---|---|
Title of host publication | Proceedings of the Workshops of the EDBT/ICDT 2017 Joint Conference (EDBT/ICDT 2017) |
Number of pages | 10 |
Publisher | CEUR Workshop Proceedings |
Publication date | 15 Mar 2017 |
Article number | 8 |
Publication status | Published - 15 Mar 2017 |
Event | Nineteenth International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data, Venice, Italy. Duration: 21 Mar 2017 → 24 Mar 2017. Conference number: 19. http://www.info.univ-tours.fr/~marcel/dolap2017/ |
Workshop
Workshop | Nineteenth International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data |
---|---|
Number | 19 |
Country/Territory | Italy |
City | Venice |
Period | 21/03/2017 → 24/03/2017 |
Internet address | http://www.info.univ-tours.fr/~marcel/dolap2017/ |
Series | CEUR Workshop Proceedings |
---|---|
Volume | 1810 |
ISSN | 1613-0073 |