TY - JOUR
T1 - Offline reinforcement learning for job-shop scheduling problems
AU - Echeverria, Imanol
AU - Murua, Maialen
AU - Santana, Roberto
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2025/12
Y1 - 2025/12
AB - Recent advances in deep learning have shown significant potential for solving combinatorial optimization problems in real time. Unlike traditional methods, deep learning can generate high-quality solutions efficiently, which is crucial for applications such as routing and scheduling. However, existing approaches such as deep reinforcement learning (RL) and behavioral cloning have notable limitations: deep RL suffers from slow learning, while behavioral cloning relies solely on expert actions, which can lead to poor generalization and neglect of the optimization objective. Offline RL addresses these challenges by learning from fixed datasets while leveraging reward signals, making it especially suitable for constrained combinatorial problems where online exploration is impractical. This paper introduces a novel offline RL method designed for combinatorial optimization problems with complex constraints, where the state is represented as a heterogeneous graph and the action space is variable. Our approach encodes actions in edge attributes and balances expected rewards with the imitation of expert solutions. We demonstrate the effectiveness of this method on job-shop scheduling and flexible job-shop scheduling benchmarks, achieving superior performance compared to state-of-the-art techniques.
KW - Deep neural networks
KW - Graph neural networks
KW - Heterogeneous data
KW - Job-shop scheduling problem
KW - Offline reinforcement learning
UR - https://www.scopus.com/pages/publications/105013848926
DO - 10.1016/j.asoc.2025.113736
M3 - Article
AN - SCOPUS:105013848926
SN - 1568-4946
VL - 184
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
M1 - 113736
ER -