Parameterized reinforcement learning for optical system optimization
Published in: | Journal of physics. D, Applied physics, 2021-07, Vol.54 (30), p.305104 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | Engineering a physical system to exhibit designated characteristics poses an inverse design problem, which is often determined by several discrete and continuous parameters. If such a system must exhibit a particular behavior, this combination of discrete and continuous parameters results in a challenging optimization problem that requires an extensive search for an optimal system design. However, if the corresponding inverse design problem can be reformulated as a parameterized Markov decision process, reinforcement learning (RL) provides a heuristic framework to solve it. In this work, we use multi-layer thin films as an example of such optimization problems and consider three design parameters: each thin-film layer’s dielectric material (discrete) and thickness (continuous), as well as the total number of layers (discrete). While recent methods merely determine the optimal thicknesses and, less commonly, the layers’ materials, our approach optimizes the total number of stacked layers as well. In summary, we further develop a Q-learning variant to solve inverse design optimization and thereby outperform human experts and current approaches like needle-point optimization or naive RL. For this purpose, we propose an exponentially transformed reward signal that eases policy search and enables constrained optimization. Moreover, the learned Q-values contain information about the optical properties of multi-layer thin films, which allows a physical interpretation or what-if analysis and thus enables explainability. |
---|---|
ISSN: | 0022-3727, 1361-6463 |
DOI: | 10.1088/1361-6463/abfddb |