TY - GEN
T1 - K2E
T2 - 19th IEEE International Conference on Software Architecture Companion, ICSA-C 2022
AU - Zarate, Gorka
AU - Minon, Raul
AU - Diaz-De-Arcaya, Josu
AU - Torre-Bastida, Ana I.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
AB - Extracting value and information from data poses a variety of problems, such as data heterogeneity, data distribution, model versioning, and the vast variety of techniques and approaches. As a result, the data management process becomes hard to implement in real-world scenarios. In this context, catalogue tools for data and Artificial Intelligence models alleviate the burden of dealing with versioning tasks. They thereby facilitate the automation of data and model management processes, in line with DataOps and MLOps good practices. This work in progress enumerates key challenges to address when creating these types of catalogues: on the one hand, managing the diverse internal nature of data and models and their different versions; on the other hand, providing adequate meta-information and governance tools such as access control and auditing. This paper presents the Knowledge to Environment (K2E) platform, whose architecture aims to define the components necessary for creating environments that allow working with data and model catalogues. By environment creation, we mean providing a workspace populated with the datasets and models of an organization, while tracking their distinct versions by using specialised catalogues. In addition, this workspace will incorporate added-value tools for governance and auditing. Finally, an approach for implementing K2E is detailed.
KW - automation
KW - catalogues
KW - data
KW - datalake
KW - DataOps
KW - dataset
KW - management
KW - metadata
KW - MLOps
KW - models
KW - versioning
UR - http://www.scopus.com/inward/record.url?scp=85132192572&partnerID=8YFLogxK
U2 - 10.1109/ICSA-C54293.2022.00047
DO - 10.1109/ICSA-C54293.2022.00047
M3 - Conference contribution
AN - SCOPUS:85132192572
T3 - 2022 IEEE 19th International Conference on Software Architecture Companion, ICSA-C 2022
SP - 206
EP - 209
BT - 2022 IEEE 19th International Conference on Software Architecture Companion, ICSA-C 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 12 March 2022 through 15 March 2022
ER -