Neuromorphic spatiotemporal optical flow: Enabling ultrafast visual perception beyond human capabilities
Main authors: | , , , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Optical flow, inspired by the mechanisms of biological visual systems,
calculates the spatial motion vectors within visual scenes that robots need
to excel in complex and dynamic working environments.
However, current optical flow algorithms, despite human-competitive task
performance on benchmark datasets, remain constrained by unacceptable time
delays (~0.6 seconds per inference, about 4 times the human processing time) in practical
deployment. Here, we introduce a neuromorphic optical flow approach that
addresses delay bottlenecks by encoding temporal information directly in a
synaptic transistor array to assist spatial motion analysis. Compared to
conventional spatial-only optical flow methods, our spatiotemporal neuromorphic
optical flow offers the spatial-temporal consistency of motion information,
rapidly identifying regions of interest in as little as 1-2 ms using the
temporal motion cues derived from the embedded temporal information in the
two-dimensional floating-gate synaptic transistors. The visual input can thus
be selectively filtered, speeding up velocity calculation and the execution of
various tasks. At the hardware level, due to the atomically sharp interfaces
between distinct functional layers in two-dimensional van der Waals
heterostructures, the synaptic transistor offers high-frequency response (~100 μs),
robust non-volatility (>10000 s), and excellent endurance (>8000
cycles), enabling robust visual processing. In software benchmarks, our system
outperforms state-of-the-art algorithms with a 400% speedup, frequently
surpassing human-level performance while maintaining or enhancing accuracy by
utilizing the temporal priors provided by the embedded temporal information. |
---|---|
DOI: | 10.48550/arxiv.2409.15345 |
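
The summary describes a two-stage idea: a fast temporal cue picks out a region of interest, and the slower spatial motion analysis runs only inside that region. The following is a minimal software sketch of that gating idea, not the authors' implementation. It assumes plain frame differencing as a stand-in for the temporal information stored in the synaptic transistor array, and exhaustive block matching as a stand-in for the optical flow algorithm. All function names, the threshold, and the search radius are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = List[List[float]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def temporal_cue(prev: Frame, curr: Frame) -> Frame:
    """Absolute per-pixel change (assumed stand-in for the hardware temporal signal)."""
    return [[abs(c - p) for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]


def region_of_interest(cue: Frame, threshold: float) -> Optional[Box]:
    """Bounding box of all pixels whose cue exceeds the threshold, or None if none do."""
    hits = [(y, x) for y, row in enumerate(cue) for x, v in enumerate(row) if v > threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys) + 1, max(xs) + 1


def block_match(prev: Frame, curr: Frame, box: Box, search: int = 2) -> Tuple[int, int]:
    """Return the (dy, dx) shift that best maps the box contents of prev onto curr."""
    top, left, bottom, right = box
    height, width = len(curr), len(curr[0])
    best_shift, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip shifts that would move the box outside the frame.
            if top + dy < 0 or left + dx < 0 or bottom + dy > height or right + dx > width:
                continue
            # Sum of absolute differences between the box in prev and the shifted box in curr.
            cost = sum(
                abs(curr[y + dy][x + dx] - prev[y][x])
                for y in range(top, bottom)
                for x in range(left, right)
            )
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift


def gated_motion(prev: Frame, curr: Frame, threshold: float = 0.5,
                 search: int = 2) -> Optional[Tuple[int, int]]:
    """Run block matching only inside the region flagged by the temporal cue."""
    box = region_of_interest(temporal_cue(prev, curr), threshold)
    if box is None:
        return None  # nothing changed, so the costly spatial step is skipped entirely
    return block_match(prev, curr, box, search)
```

In this sketch the cost of the matching step scales with the area of the flagged box rather than with the full frame, which is the intuition behind filtering the visual input before computing velocities. A static scene returns `None` without any matching at all.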