Evolution of Extract-Transform-Load (ETL) processes towards data product pipelines

Gorka Zarate, María Jose Lopez Osa, Ana I. Torre-Bastida, Urtza Iturraspe, Jordi Arjona, Benjamín Navarro, Antoni Gimeno

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    The rise of data as a first-class asset has led to the creation of infrastructures and tools designed to enhance organizations' abilities to monetize them internally. Among the most powerful of these tools are ETLs, which govern internal data operations and assist companies in their quest to become data-driven. Lately, these horizons have expanded with the emergence of new ecosystems for data exchange, such as Data Spaces or initiatives like Gaia-X or SIMPL, allowing companies to monetize data externally, e.g. by sharing or selling them. However, traditional ETLs fall short of serving this purpose. In this article, we aim to offer a technological comparison of how well current ETL tools are prepared to address the new concept of a data pipeline aimed at producing a data product. Furthermore, this comparison is carried out within the framework of the DATAMITE project, which provides real scenarios and use cases in which its benefits and applicability can be accurately assessed.

    Original language: English
    Title of host publication: Proceedings of 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Spaces, eSAAM 2024
    Publisher: Association for Computing Machinery
    Pages: 25-32
    Number of pages: 8
    ISBN (Electronic): 9798400709845
    DOIs
    Publication status: Published - 22 Oct 2024
    Event: 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Spaces, eSAAM 2024, co-located with Eclipse Open Community Experience 2024 - Mainz, Germany
    Duration: 22 Oct 2024 → …

    Publication series

    Name: ACM International Conference Proceeding Series

    Conference

    Conference: 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Spaces, eSAAM 2024, co-located with Eclipse Open Community Experience 2024
    Country/Territory: Germany
    City: Mainz
    Period: 22/10/24 → …

    Keywords

    • Cloud Continuum
    • Data Discovery
    • Data Product
    • Data Storage
    • Extract-Transform-Load
