TY - JOUR
T1 - DRED: An evolutionary diversity generation method for concept drift adaptation in online learning environments
AU - Lobo, Jesus L.
AU - Del Ser, Javier
AU - Bilbao, Miren Nekane
AU - Perfecto, Cristina
AU - Salcedo-Sanz, Sancho
N1 - Publisher Copyright: © 2017 Elsevier B.V.
PY - 2018/7
Y1 - 2018/7
AB - Fast-arriving information flows underlie many data mining applications today. Such data streams are usually affected by non-stationary events that eventually change their distribution (concept drift), causing predictive models trained on these data to become obsolete and fail to adapt suitably to the new distribution. Especially in online learning scenarios, there is a pressing need for new algorithms that adapt to this change as fast as possible while maintaining good performance scores. Recent studies have revealed that a good strategy is to construct highly diverse ensembles and to use them shortly after the drift (independently of the type of drift) to obtain good performance scores. However, the existence of the so-called trade-off between stability (performance over stable data concepts) and plasticity (recovery and adaptation after drift events) implies that the construction of the ensemble model should account simultaneously for these two conflicting objectives. In this regard, this work presents a new approach to artificially generate an optimal diversity level when building prediction ensembles shortly after a drift occurs. The approach uses a Kernel Density Estimation (KDE) method to generate synthetic data, which are subsequently labeled by means of a multi-objective optimization method that allows training each model of the ensemble with a different subset of synthetic samples. Computational experiments reveal that the proposed approach can be hybridized with other traditional diversity generation approaches, yielding optimized levels of diversity that render an enhanced recovery from drifts.
KW - Concept drift
KW - Diversity
KW - Evolutionary computation
KW - Online learning
UR - http://www.scopus.com/inward/record.url?scp=85032336679&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2017.10.004
DO - 10.1016/j.asoc.2017.10.004
M3 - Article
AN - SCOPUS:85032336679
SN - 1568-4946
VL - 68
SP - 693
EP - 709
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
ER -