TY - CONF
T1 - Advancing towards Safe Reinforcement Learning over Sparse Environments with Out-of-Distribution Observations
T2 - 2024 International Joint Conference on Neural Networks, IJCNN 2024
AU - Martinez-Seras, Aitor
AU - Andres, Alain
AU - Del Ser, Javier
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Safety in AI-based systems is among the highest research priorities, particularly when such systems are deployed in real-world scenarios subject to uncertainties and unpredictable inputs. Among these, the presence of long-tailed stimuli (Out-of-Distribution data, OoD) has attracted much interest in recent years, giving rise to many proposals to detect inputs unfamiliar to the model and to adapt its knowledge accordingly. This work analyzes several OoD detection and adaptation strategies for Reinforcement Learning agents in environments with sparse reward signals. The sparsity of rewards and the impact of OoD objects on the state transition distribution learned by the agent are shown to be crucial for the design of effective knowledge transfer methods once OoD objects are detected. Furthermore, different approaches to detect OoD elements within the agent's observations are outlined, stressing their benefits and potential downsides. Experiments with procedurally generated environments are performed to assess the performance of the considered OoD detection techniques and to gauge the impact of the adaptation strategies on the generalization capability of the RL agent. The results pave the way towards further research on the provision of safety guarantees in sparse open-world Reinforcement Learning environments.
AB - Safety in AI-based systems is among the highest research priorities, particularly when such systems are deployed in real-world scenarios subject to uncertainties and unpredictable inputs. Among these, the presence of long-tailed stimuli (Out-of-Distribution data, OoD) has attracted much interest in recent years, giving rise to many proposals to detect inputs unfamiliar to the model and to adapt its knowledge accordingly. This work analyzes several OoD detection and adaptation strategies for Reinforcement Learning agents in environments with sparse reward signals. The sparsity of rewards and the impact of OoD objects on the state transition distribution learned by the agent are shown to be crucial for the design of effective knowledge transfer methods once OoD objects are detected. Furthermore, different approaches to detect OoD elements within the agent's observations are outlined, stressing their benefits and potential downsides. Experiments with procedurally generated environments are performed to assess the performance of the considered OoD detection techniques and to gauge the impact of the adaptation strategies on the generalization capability of the RL agent. The results pave the way towards further research on the provision of safety guarantees in sparse open-world Reinforcement Learning environments.
KW - Open-World Learning
KW - Out-of-Distribution (OoD)
KW - Reinforcement Learning
KW - Sparse Rewards
UR - http://www.scopus.com/inward/record.url?scp=85205005846&partnerID=8YFLogxK
U2 - 10.1109/IJCNN60899.2024.10650670
DO - 10.1109/IJCNN60899.2024.10650670
M3 - Conference contribution
AN - SCOPUS:85205005846
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 June 2024 through 5 July 2024
ER -