Towards the Design, Quality Assessment and Explainability of Synthetic Tabular Data Generation Techniques for Metabolic Syndrome Diagnosis

Diana Manjarrés*, Begoña Ispizua, Iratxe Niño-Adan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In recent years, decision-making Machine Learning (ML) approaches have evolved from traditional methods to evidence-based approaches, particularly in the healthcare sector. However, sharing data with third parties raises significant security and privacy concerns. To address these issues, researchers have explored data anonymization, distributed privacy-preserving data mining, and synthetic data generation (SDG). SDG, in particular, shows promise in enabling secure data sharing while preserving privacy, which is crucial for developing advanced AI models. This paper focuses on data for Metabolic Syndrome (MetS), a condition affecting a significant portion of the population, and investigates various synthetic tabular data generation (STDG) techniques. It evaluates the performance of an AutoML approach for predicting MetS using different percentages of synthetic data, assessed through a specific evaluation framework. Moreover, it presents an explainability and feature relevance analysis of the proposed STDG methods.

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Editors: Mario Cannataro, Huiru Zheng, Lin Gao, Jianlin Cheng, Joao Luis de Miranda, Ester Zumpano, Xiaohua Hu, Young-Rae Cho, Taesung Park
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5009-5015
Number of pages: 7
ISBN (Electronic): 9798350386226
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 - Lisbon, Portugal
Duration: 3 Dec 2024 - 6 Dec 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Conference

Conference: 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024
Country/Territory: Portugal
City: Lisbon
Period: 3/12/24 - 6/12/24

Keywords

  • classification
  • machine learning
  • metabolic syndrome
  • synthetic data evaluation
  • synthetic data generation
