Abstract
Given the increasing incidence of sophisticated cyber-attacks, particularly Advanced Persistent Threats (APTs), there is a growing need for intelligent and adaptive intrusion response solutions. In this paper, we propose a Reinforcement Learning (RL)-based model for APT intrusion response that can manage dynamic, multi-stage attacks and large observation spaces. The model supports both policy-based and value-based learning approaches, enabling comparative evaluation between different strategies. We introduce a realistic RL training environment based on emulation infrastructure, which accurately reproduces APT scenarios using real systems and executes a wide range of authentic Intrusion Response System (IRS) actions. This setup includes time and variability constraints commonly encountered in operational environments, offering a more practical alternative to traditional simulations. The RL agents, implemented using the Proximal Policy Optimization (PPO) and Deep Q-Network (DQN) algorithms, were both trained and evaluated within this industrial-style emulated environment. Empirical results demonstrate that both deep RL algorithms successfully learned effective and well-timed defensive actions under realistic constraints, confirming their capability to operate in dynamic, real-world APT scenarios.
| Field | Value |
|---|---|
| Original language | English |
| Article number | 129168 |
| Journal | Expert Systems with Applications |
| Volume | 296 |
| Publication status | Published - 15 Jan 2026 |
Keywords
- APT
- Advanced persistent threat
- Decision making
- IRS
- Intrusion response system
- Multi-stage attack
- Reinforcement learning
Title: Reinforcement Learning in action: Powering intelligent intrusion responses to advanced cyber threats in realistic scenarios