TY - GEN
T1 - Evolution of Extract-Transform-Load (ETL) processes towards data product pipelines
AU - Zarate, Gorka
AU - Lopez Osa, María Jose
AU - Torre-Bastida, Ana I.
AU - Iturraspe, Urtza
AU - Arjona, Jordi
AU - Navarro, Benjamín
AU - Gimeno, Antoni
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/10/22
Y1 - 2024/10/22
N2 - The rise of data as a first-class asset has led to the creation of infrastructures and tools designed to enhance organizations' ability to monetize it internally. Among the most powerful of these tools have been ETLs, which govern internal data operations and assist companies in their quest to become data-driven. Lately, these horizons have expanded with the emergence of new ecosystems for data exchange, such as Data Spaces and initiatives like Gaia-X or SIMPL, which allow companies to monetize data externally, e.g. by sharing or selling it. However, traditional ETLs fall short of serving this purpose. In this article, we offer a technological comparison of how well current ETL tools are prepared to address the new concept of a data pipeline aimed at producing a data product. Furthermore, this comparison is carried out within the framework of the DATAMITE project, which provides real scenarios and use cases in which its benefits and applicability can be accurately assessed.
AB - The rise of data as a first-class asset has led to the creation of infrastructures and tools designed to enhance organizations' ability to monetize it internally. Among the most powerful of these tools have been ETLs, which govern internal data operations and assist companies in their quest to become data-driven. Lately, these horizons have expanded with the emergence of new ecosystems for data exchange, such as Data Spaces and initiatives like Gaia-X or SIMPL, which allow companies to monetize data externally, e.g. by sharing or selling it. However, traditional ETLs fall short of serving this purpose. In this article, we offer a technological comparison of how well current ETL tools are prepared to address the new concept of a data pipeline aimed at producing a data product. Furthermore, this comparison is carried out within the framework of the DATAMITE project, which provides real scenarios and use cases in which its benefits and applicability can be accurately assessed.
KW - Cloud Continuum
KW - Data Discovery
KW - Data Product
KW - Data Storage
KW - Extract-Transform-Load
UR - http://www.scopus.com/inward/record.url?scp=85208783521&partnerID=8YFLogxK
U2 - 10.1145/3685651.3686662
DO - 10.1145/3685651.3686662
M3 - Conference contribution
AN - SCOPUS:85208783521
T3 - ACM International Conference Proceeding Series
SP - 25
EP - 32
BT - Proceedings of 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Spaces, eSAAM 2024
PB - Association for Computing Machinery
T2 - 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Spaces, eSAAM 2024, co-located with Eclipse Open Community Experience 2024
Y2 - 22 October 2024
ER -