Comprehensive Analysis of Different Techniques for Data Augmentation and Proposal of New Variants of BOSME and GAN

Asier Garmendia-Orbegozo*, Jose David Nuñez-Gonzalez, Miguel Angel Anton Gonzalez, Manuel Graña

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

In many environments where detecting minority-class instances is critical, the data available for training Machine Learning models is poorly distributed. This data imbalance usually degrades the trained model, which tends to misclassify minority-class instances as majority-class instances. To avoid this, the literature has adopted different techniques that expand the original database, such as Generative Adversarial Networks (GANs) or the Bayesian network-based over-sampling method (BOSME). Starting from these two methods, in this work we propose three new variants of data augmentation to address the data imbalance issue. We use traffic event data from three different areas of California, divided into two subgroups according to their severity. Experiments show that the top-performing cases were reached using our variants. The importance of data augmentation techniques as a preprocessing tool is also demonstrated by the performance drop of systems trained on the original imbalanced databases.

Original language: English
Title of host publication: Hybrid Artificial Intelligent Systems - 18th International Conference, HAIS 2023, Proceedings
Editors: Pablo García Bringas, Hilde Pérez García, Francisco Javier Martínez de Pisón, Francisco Martínez Álvarez, Alicia Troncoso Lora, Álvaro Herrero, José Luis Calvo Rolle, Héctor Quintián, Emilio Corchado
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 145-155
Number of pages: 11
ISBN (Print): 9783031407246
Publication status: Published - 2023
Event: Proceedings of the 18th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2023 - Salamanca, Spain
Duration: 5 Sept 2023 to 7 Sept 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14001 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Proceedings of the 18th International Conference on Hybrid Artificial Intelligent Systems, HAIS 2023
Country/Territory: Spain
City: Salamanca
Period: 5/09/23 to 7/09/23

Keywords

  • Data augmentation
  • Data imbalance
  • GANs
