Minimizing the Late Work of the Flow Shop Scheduling Problem with a Deep Reinforcement Learning Based Approach
Published in: | Applied sciences 2022-03, Vol. 12 (5), p. 2366 |
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | In industrial manufacturing, assembly line production is the most common production process and can be modeled as a permutation flow shop scheduling problem (PFSP). Minimizing the late-work criterion (the portion of each task remaining after its due date) in production planning can effectively reduce production costs and allow for faster product delivery. In this article, a novel learning-based approach is proposed to minimize the late work of the PFSP using deep reinforcement learning (DRL) and a graph isomorphism network (GIN), an innovative combination of combinatorial optimization and deep learning. The instances considered are the well-known permutation flow shop problem in which each job additionally carries a release-date constraint. In this work, the PFSP is formulated as a Markov decision process (MDP) that can be solved by reinforcement learning (RL). A complete graph is introduced to describe a PFSP instance. The proposed policy network combines the graph representation of the PFSP with the sequence information of the jobs to predict a distribution over candidate jobs, and is invoked repeatedly until a complete sequence is obtained. To further improve the quality of the solution obtained by reinforcement learning, an improved iterated greedy (IG) algorithm is proposed to search the solution space locally. The experimental results show that the proposed RL method and the combined RL+IG method obtain better solutions than other strong heuristic and meta-heuristic algorithms in a short time. |
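The late-work criterion summarized in the abstract can be made concrete with a small sketch. The following Python function is a hypothetical illustration (not taken from the paper): it evaluates the total late work of a given job permutation using the standard PFSP completion-time recurrence with release dates, and, as a simplification, counts only the work on the last machine that is performed after each job's due date. All names and the tiny instance below are illustrative assumptions.

```python
# Hypothetical sketch: total late work of a permutation flow shop schedule.
# order: job permutation; p[j][k]: processing time of job j on machine k;
# r[j]: release date of job j; d[j]: due date of job j.

def total_late_work(order, p, r, d):
    """Sum, over all jobs, of the part of each job's last-machine
    processing that falls after its due date (capped at that job's
    processing time on the last machine)."""
    m = len(p[0])            # number of machines
    C = [0.0] * m            # current completion time on each machine
    late = 0.0
    for j in order:
        # A job cannot start on machine 0 before its release date.
        C[0] = max(C[0], r[j]) + p[j][0]
        # Standard flow shop recurrence for the remaining machines.
        for k in range(1, m):
            C[k] = max(C[k], C[k - 1]) + p[j][k]
        # Late work on the last machine: the overlap of the job's
        # final operation with the time after its due date.
        late += min(p[j][m - 1], max(0.0, C[m - 1] - d[j]))
    return late


# Illustrative two-job, two-machine instance: job 1 finishes at time 6
# against a due date of 3, so 2 units of its last operation are late.
late = total_late_work([0, 1], [[2, 2], [2, 2]], [0, 0], [10, 3])
```

A learned policy, as described above, would propose the permutation `order`; this evaluation is the quantity the RL agent and the IG local search would try to minimize.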
ISSN: | 2076-3417 |
DOI: | 10.3390/app12052366 |