Training Hybrid Neural Networks with Multimode Optical Nonlinearities Using Digital Twins
Main authors: | , , , , , , |
---|---|
Format: | Article |
Language: | English |
Abstract: | The ability to train ever-larger neural networks brings artificial
intelligence to the forefront of scientific and technical discoveries. However,
their exponentially increasing size creates a proportionally greater demand for
energy and computational hardware. Incorporating complex physical events in
networks as fixed, efficient computation modules can address this demand by
decreasing the complexity of trainable layers. Here, we utilize ultrashort
pulse propagation in multimode fibers, which perform large-scale nonlinear
transformations, for this purpose. Training the hybrid architecture is achieved
through a neural model that differentiably approximates the optical system. The
training algorithm updates the neural simulator and backpropagates the error
signal over this proxy to optimize layers preceding the optical one. Our
experimental results achieve state-of-the-art image classification accuracies
and simulation fidelity. Moreover, the framework demonstrates exceptional
resilience to experimental drifts. By integrating low-energy physical systems
into neural networks, this approach enables scalable, energy-efficient AI
models with significantly reduced computational demands. |
---|---|
DOI: | 10.48550/arxiv.2501.07991 |
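
The training loop described in the abstract (measure the physical system, update a differentiable proxy, backpropagate through the proxy to the layer in front of the physical one) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `physical_system` is a scalar `tanh` standing in for the multimode fiber, the "digital twin" is a linear least-squares surrogate rather than a neural model, the trainable layer is a single weight and bias, and the names `fit_twin` and `train` are hypothetical.

```python
import math

def physical_system(u):
    """Stand-in for the optical experiment (assumption): a fixed black-box
    nonlinearity. Only its outputs are used; no gradient is taken from it."""
    return math.tanh(2.0 * u)

def fit_twin(us, outs):
    """Refit the differentiable proxy out ~= a*u + c by least squares.
    A linear surrogate replaces the neural twin to keep the sketch short."""
    n = len(us)
    mean_u, mean_o = sum(us) / n, sum(outs) / n
    var = sum((u - mean_u) ** 2 for u in us)
    cov = sum((u - mean_u) * (o - mean_o) for u, o in zip(us, outs))
    a = cov / var if var > 1e-12 else 0.0
    return a, mean_o - a * mean_u

def train(xs, ys, w=0.2, b=0.0, lr=0.1, steps=2000):
    """Alternate between updating the twin and updating the preceding layer."""
    n = len(xs)
    losses = []
    for _ in range(steps):
        us = [w * x + b for x in xs]             # trainable layer before the physical one
        outs = [physical_system(u) for u in us]  # forward pass through the "experiment"
        a, _c = fit_twin(us, outs)               # update the twin on fresh measurements
        errs = [o - y for o, y in zip(outs, ys)]
        losses.append(sum(e * e for e in errs) / n)
        # Backward pass: d(out)/d(u) comes from the twin (slope a),
        # not from the physical system itself.
        grad_w = sum(2 * e * a * x for e, x in zip(errs, xs)) / n
        grad_b = sum(2 * e * a for e in errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses

xs = [-1 + 2 * i / 19 for i in range(20)]
# Targets are chosen so that w=0.8, b=-0.3 reproduces them exactly.
ys = [physical_system(0.8 * x - 0.3) for x in xs]
w, b, losses = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, loss {losses[0]:.4f} -> {losses[-1]:.2e}")
```

The forward pass always uses the measured output, so the error signal reflects the real system. Only the gradient is borrowed from the proxy. Refitting the proxy at every step is what lets such a scheme follow a slowly changing system, which is one plausible reading of the drift resilience the abstract mentions.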