Comprehensive Analysis of Different Techniques for Data Augmentation and Proposal of New Variants of BOSME and GAN

Asier Garmendia-Orbegozo*, Jose David Nuñez-Gonzalez, Miguel Angel Anton Gonzalez, Manuel Graña

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    Abstract

    In many environments where detecting minority-class instances is critical, the data available for training Machine Learning models is poorly distributed. This imbalance usually degrades the trained model, which tends to generalise towards the majority class and predict minority-class instances as majority-class instances. To avoid this, different techniques have been adopted in the literature to expand the original database, such as Generative Adversarial Networks (GANs) or the Bayesian network-based over-sampling method (BOSME). Starting from these two methods, in this work we propose three new variants of data augmentation to address the data imbalance issue. We use traffic event data from three different areas of California, divided into two subgroups according to their severity. Experiments show that the top-performing cases were reached after using our variants. The importance of data augmentation techniques as a preprocessing tool is also demonstrated by the performance drop of systems trained on the original imbalanced databases.

    Original language: English
    Title of host publication: Hybrid Artificial Intelligent Systems - 18th International Conference, HAIS 2023, Proceedings
    Editors: Pablo García Bringas, Hilde Pérez García, Francisco Javier Martínez de Pisón, Francisco Martínez Álvarez, Alicia Troncoso Lora, Álvaro Herrero, José Luis Calvo Rolle, Héctor Quintián, Emilio Corchado
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 145-155
    Number of pages: 11
    ISBN (Print): 9783031407246
    DOIs
    Publication status: Published - 2023
    Event: Proceedings of the 18th International Conference on Hybrid Artificial Intelligence Systems, HAIS 2023 - Salamanca, Spain
    Duration: 5 Sept 2023 to 7 Sept 2023

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 14001 LNAI
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: Proceedings of the 18th International Conference on Hybrid Artificial Intelligence Systems, HAIS 2023
    Country/Territory: Spain
    City: Salamanca
    Period: 5/09/23 to 7/09/23

    Keywords

    • Data augmentation
    • Data imbalance
    • GANs
