MotionDeltaCNN: Sparse CNN Inference of Frame Differences in Moving Camera Videos
Format: Article
Language: English
Abstract: Convolutional neural network inference on video input is computationally expensive and requires high memory bandwidth. Recently, DeltaCNN managed to reduce the cost by only processing pixels with significant updates over the previous frame. However, DeltaCNN relies on static camera input. Moving cameras add new challenges in how to fuse newly unveiled image regions with already processed regions efficiently to minimize the update rate, without increasing memory overhead and without knowing the camera extrinsics of future frames. In this work, we propose MotionDeltaCNN, a sparse CNN inference framework that supports moving cameras. We introduce spherical buffers and padded convolutions to enable seamless fusion of newly unveiled regions and previously processed regions without increasing the memory footprint. Our evaluation shows that we outperform DeltaCNN by up to 90% for moving camera videos.
DOI: 10.48550/arxiv.2210.09887
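
As background for the delta idea the abstract builds on: DeltaCNN-style gating can be pictured as thresholding the per-pixel difference between consecutive frames and propagating only the flagged pixels through the network. The sketch below is a minimal NumPy illustration of that criterion; the function name and threshold value are assumptions for illustration, not DeltaCNN's actual API.

```python
import numpy as np

def update_mask(prev_frame: np.ndarray, curr_frame: np.ndarray,
                threshold: float = 0.04) -> np.ndarray:
    """Flag pixels whose change since the previous frame is significant.

    Frames are H x W x C arrays; a pixel counts as "updated" when its
    largest per-channel absolute difference exceeds `threshold`.
    """
    delta = np.abs(curr_frame.astype(np.float32) -
                   prev_frame.astype(np.float32))
    return delta.max(axis=-1) > threshold  # H x W boolean mask

# With a static camera, only the moving subject is flagged, so
# downstream convolutions can skip the unchanged background.
prev = np.zeros((4, 4, 3), dtype=np.float32)
curr = prev.copy()
curr[1, 2] = 0.5               # one pixel changes noticeably
print(update_mask(prev, curr))  # True only at (1, 2)
```

With a moving camera, nearly every pixel crosses such a threshold, which is exactly the failure mode MotionDeltaCNN addresses.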
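
The abstract names spherical buffers only at a high level. One plausible reading is a fixed-size buffer with wrap-around (toroidal) indexing: as the camera pans, the viewport origin moves modulo the buffer size, so newly unveiled rows and columns land in the cells that just scrolled out of view, and no memory is moved or reallocated. The NumPy sketch below illustrates that idea; the class name, API, and roll-based view are assumptions, not the paper's CUDA implementation.

```python
import numpy as np

class SphericalBuffer:
    """Fixed-size feature buffer with wrap-around (toroidal) indexing.

    A sketch of the wrap-around idea only: panning shifts the viewport
    origin modulo the buffer size instead of copying feature maps.
    """

    def __init__(self, height: int, width: int, channels: int):
        self.buf = np.zeros((height, width, channels), dtype=np.float32)
        self.oy = 0  # buffer row of the viewport's top-left corner
        self.ox = 0  # buffer column of the viewport's top-left corner

    def shift(self, dy: int, dx: int) -> None:
        """Pan the viewport by (dy, dx) pixels, wrapping the origin.

        The cells the viewport scrolled away from reappear on the
        opposite side as the "newly unveiled" region; they hold stale
        data and must be processed densely before reuse.
        """
        h, w, _ = self.buf.shape
        self.oy = (self.oy + dy) % h
        self.ox = (self.ox + dx) % w

    def view(self) -> np.ndarray:
        """Buffer contents aligned to the current viewport."""
        return np.roll(self.buf, shift=(-self.oy, -self.ox), axis=(0, 1))

ring = SphericalBuffer(4, 4, 1)
ring.buf[:] = 1.0           # pretend all features are cached
ring.shift(0, 1)            # camera pans one pixel to the right
stale = ring.view()[:, -1]  # rightmost viewport column: newly unveiled,
                            # wrapped from the left edge, needs reinit
```

The appeal of this layout is that previously processed regions keep their buffer addresses across frames, so cached results fuse with fresh regions at no extra memory cost, matching the abstract's claim of a constant memory footprint.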