Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching
Abstract: Despite the great success of deep learning in stereo matching, recovering accurate disparity maps is still challenging. Currently, L1 and cross-entropy are the two most widely used losses for stereo network training. Compared with the former, the latter usually performs better thanks to its probability modeling and direct supervision of the cost volume. However, how to accurately model the stereo ground truth for the cross-entropy loss remains largely under-explored. Existing works simply assume that the ground-truth distributions are uni-modal, ignoring the fact that most edge pixels can be multi-modal. In this paper, a novel adaptive multi-modal cross-entropy loss (ADL) is proposed to guide the networks to learn different distribution patterns for each pixel. Moreover, we optimize the disparity estimator to further alleviate bleeding and misalignment artifacts at inference. Extensive experimental results show that our method is generic and can help classic stereo networks regain state-of-the-art performance. In particular, GANet with our method ranks $1^{st}$ on both the KITTI 2015 and 2012 benchmarks among the published methods. Meanwhile, excellent synthetic-to-realistic generalization performance can be achieved by simply replacing the traditional loss with ours.
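To make the abstract's core idea concrete, below is a minimal PyTorch sketch of a multi-modal cross-entropy loss over a cost volume, together with a dominant-mode disparity estimator of the kind used to curb edge bleeding. This is not the paper's ADL: the fixed two-mode Laplacian target and the window-based readout are illustrative stand-ins, and all function names (`bimodal_target`, `dominant_mode_disparity`, etc.) are hypothetical.

```python
import torch
import torch.nn.functional as F

def laplacian_target(disp, num_disp, b=1.0):
    """Uni-modal target: a discretized Laplacian centred on `disp`.
    disp: (B, H, W) float disparities -> returns (B, D, H, W)."""
    d = torch.arange(num_disp, device=disp.device, dtype=disp.dtype)
    d = d.view(1, -1, 1, 1)
    return F.softmax(-(d - disp.unsqueeze(1)).abs() / b, dim=1)

def bimodal_target(fg_disp, bg_disp, w_fg, num_disp, b=1.0):
    """Two-mode target for edge pixels: a weighted mixture of Laplacians at a
    foreground and a background disparity. (ADL derives its modes and weights
    adaptively from the local ground truth; this fixed mixture is a stand-in.)"""
    w = w_fg.unsqueeze(1)
    return (w * laplacian_target(fg_disp, num_disp, b)
            + (1.0 - w) * laplacian_target(bg_disp, num_disp, b))

def multimodal_ce_loss(cost_logits, target):
    """Pixel-wise cross-entropy between the softmaxed cost volume and the
    (possibly multi-modal) target distribution; both are (B, D, H, W)."""
    log_p = F.log_softmax(cost_logits, dim=1)
    return -(target * log_p).sum(dim=1).mean()

def dominant_mode_disparity(prob, radius=2):
    """Regress disparity from the dominant mode only: a full-band soft-argmax
    averages across modes and 'bleeds' at edges, so restrict the weighted
    mean to a small window around the distribution's peak."""
    D = prob.shape[1]
    d = torch.arange(D, device=prob.device, dtype=prob.dtype).view(1, -1, 1, 1)
    peak = prob.argmax(dim=1, keepdim=True)             # (B, 1, H, W)
    mask = ((d - peak).abs() <= radius).to(prob.dtype)  # keep the dominant mode
    p = prob * mask
    p = p / p.sum(dim=1, keepdim=True).clamp_min(1e-8)  # renormalize the mode
    return (p * d).sum(dim=1)                           # (B, H, W)

# Toy usage: random cost logits, a synthetic two-mode target.
B, D, H, W = 2, 48, 8, 8
logits = torch.randn(B, D, H, W, requires_grad=True)
fg = torch.rand(B, H, W) * (D - 1)
bg = torch.rand(B, H, W) * (D - 1)
target = bimodal_target(fg, bg, torch.full((B, H, W), 0.7), D)
loss = multimodal_ce_loss(logits, target)
loss.backward()
disp = dominant_mode_disparity(F.softmax(logits.detach(), dim=1))
```

The point of the mixture target is that an edge pixel straddling two surfaces is supervised toward both plausible disparities rather than a single averaged, physically meaningless one; the windowed estimator then reads out one consistent disparity at inference instead of blending the modes.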
DOI: 10.48550/arxiv.2306.15612