Accelerating the Training of Video Super-Resolution Models
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online Access: | Order full text |
Abstract: | Although convolutional neural networks (CNNs) have recently demonstrated
high-quality reconstruction for video super-resolution (VSR), efficiently
training competitive VSR models remains a challenging problem. It usually takes
an order of magnitude more time than training the counterpart image models,
leading to long research cycles. Existing VSR methods typically train models
with fixed spatial and temporal sizes from beginning to end. These fixed sizes
are usually set to large values for good performance, resulting in slow
training. However, is such a rigid training strategy necessary for VSR? In this
work, we show that it is possible to gradually train video models from small to
large spatial/temporal sizes, i.e., in an easy-to-hard manner. In particular,
the whole training is divided into several stages, and earlier stages use a
smaller training spatial shape. Inside each stage, the temporal size also
varies from short to long while the spatial size remains unchanged. Training is
accelerated by such a multigrid training strategy, as most of the computation is
performed on smaller spatial and shorter temporal shapes. For further
acceleration with GPU parallelization, we also investigate large-minibatch
training without loss in accuracy. Extensive experiments demonstrate that
our method is capable of significantly speeding up training (up to $6.2\times$
speedup in wall-clock training time) without a performance drop for various VSR
models. The code is available at
https://github.com/TencentARC/Efficient-VSR-Training. |
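The easy-to-hard multigrid schedule described above can be illustrated with a minimal training-loop sketch. This is not the authors' released code (see the repository linked above for that); the stage sizes in `MULTIGRID_SCHEDULE`, the `make_loader` factory, and the plain L1 loss are illustrative assumptions used only to show how spatial crops grow across stages while clip lengths grow within each stage.

```python
# Minimal sketch of an easy-to-hard multigrid training schedule for VSR.
# Assumptions (not from the paper): stage boundaries, crop/clip sizes,
# the make_loader(crop_size, clip_len) factory, and the L1 training loss.
import torch

# (spatial crop size, [clip lengths used within the stage, short -> long])
MULTIGRID_SCHEDULE = [
    (64,  [5, 7]),    # early stage: small crops, short clips -> cheap iterations
    (96,  [7, 11]),
    (128, [11, 15]),  # final stage: full spatial/temporal training size
]

def train_multigrid(model, make_loader, optimizer, iters_per_phase=10_000):
    """make_loader(crop_size, clip_len) is assumed to return a DataLoader
    yielding (low_res_clip, high_res_clip) batches of the requested shape."""
    model.train()
    for crop, clip_lens in MULTIGRID_SCHEDULE:          # spatial: small -> large
        for clip_len in clip_lens:                      # temporal: short -> long
            loader = make_loader(crop_size=crop, clip_len=clip_len)
            for it, (lr_clip, hr_clip) in enumerate(loader):
                if it >= iters_per_phase:
                    break
                sr_clip = model(lr_clip)                # VSR forward pass
                loss = torch.nn.functional.l1_loss(sr_clip, hr_clip)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
```

Because the early phases operate on small crops and short clips, most iterations are cheap, which is where the reported wall-clock speedup comes from; only the final phase pays the full per-iteration cost of the conventional fixed-size training.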
DOI: | 10.48550/arxiv.2205.05069 |