Abstract
Artificial Neural Networks (ANNs) are weighted directed graphs of interconnected neurons widely employed to model complex problems. However, the selection of the optimal ANN architecture and its training parameters is not enough to obtain reliable models. The data preprocessing stage is fundamental to improve the model’s performance. Specifically, Feature Normalisation (FN) is commonly utilised to remove the features’ magnitude, aiming at equalising the features’ contribution to the model training. Nevertheless, this work demonstrates that the FN method selection affects the model performance. Also, it is well-known that ANNs are commonly considered a “black box” due to their lack of interpretability. In this sense, several works aim to analyse the features’ contribution to the network for estimating the output. However, these methods, specifically those based on the network’s weights, like Garson’s or Yoon’s methods, do not consider preprocessing factors, such as the dispersion factors previously employed to transform the input data. This work proposes a new feature relevance analysis method that incorporates the dispersion factors into the weight matrix analysis methods to infer each feature’s actual contribution to the network output more precisely. Besides, in this work, the Proportional Dispersion Weights (PWD) are proposed as explanatory factors of similarity between models’ performance results. The conclusions from this work improve the understanding of the features’ contribution to the model, which enhances the feature selection strategy, which in turn is fundamental for reliably modelling a given problem.
Original language | English |
---|---|
Pages (from-to) | 125462-125477 |
Number of pages | 16 |
Journal | IEEE Access |
Volume | 9 |
DOIs | |
Publication status | Published - 2021 |
Keywords
- Artificial neural networks
- Explainability
- Feature contribution
- Feature normalization
Project and Funding Information
- Funding Info
- This work was supported in part by DATA Inc. Fellowship under Grant 48-AF-W1-2019-00002, in part by Tecnalia Research and Innovation Ph.D. Scholarship, in part by the Spanish Centro para el Desarrollo Tecnológico Industrial (CDTI, Ministry of Science and Innovation) through the “Red Cervera” Programme (AI4ES Project) under Grant CER-20191029, and in part by the 3KIA Project funded by the ELKARTEK Program of the SPRI-Basque Government under Grant KK-2020/00049.