Looking for Tiny Defects via Forward-Backward Feature Transfer
Saved in:

Main Authors: , , ,
Format: Article
Language: eng
Subjects:
Online Access: Order full text
Summary:

Motivated by efficiency requirements, most anomaly detection and segmentation (AD&S) methods focus on processing low-resolution images, e.g., $224\times 224$ pixels, obtained by downsampling the original input images. In this setting, downsampling is typically also applied to the provided ground-truth defect masks. Yet, as numerous industrial applications demand the identification of both large and tiny defects, the above protocol may fall short of providing a realistic picture of the performance attainable by current methods. Hence, in this work we introduce a novel benchmark that evaluates methods on the original, high-resolution images and ground-truth masks, focusing on segmentation performance as a function of anomaly size. Our benchmark includes a metric that captures robustness with respect to defect size, i.e., the ability of a method to preserve good localization from large anomalies to tiny ones. Furthermore, we introduce an AD&S approach based on a novel Teacher-Student paradigm that relies on two shallow MLPs (the Students) which learn to transfer patch features across the layers of a frozen vision transformer (the Teacher). By means of our benchmark, we evaluate our proposal and other recent AD&S methods on high-resolution inputs containing large and tiny defects. Our proposal features the highest robustness to defect size, runs at the fastest speed, and yields state-of-the-art performance on the MVTec AD dataset as well as state-of-the-art segmentation performance on the VisA dataset.
DOI: 10.48550/arxiv.2407.04092
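
The summary above describes the method only at a high level: two shallow MLPs (the Students) learn to transfer patch features across the layers of a frozen vision transformer (the Teacher), and, per the title, the transfer runs both forward and backward. The sketch below illustrates one plausible reading of that idea in PyTorch; the choice of layers, the MLP width, the MSE objective, and the use of the per-patch transfer error as an anomaly map are assumptions made for illustration, not details taken from the paper.

```python
# Hedged sketch of a forward-backward feature-transfer Teacher-Student setup.
# The Teacher (a frozen ViT) is assumed to provide per-patch features from an
# "early" and a "late" layer; the Students regress one from the other.
import torch
import torch.nn as nn


class ShallowMLP(nn.Module):
    """A small per-patch MLP mapping features from one ViT layer to another."""

    def __init__(self, dim: int, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, dim) patch tokens
        return self.net(x)


class ForwardBackwardStudents(nn.Module):
    """Forward Student: early-layer -> late-layer features.
    Backward Student: late-layer -> early-layer features.
    Both are trained on anomaly-free images only; at test time the per-patch
    transfer errors are combined into an anomaly map (assumed scoring rule)."""

    def __init__(self, dim: int):
        super().__init__()
        self.forward_student = ShallowMLP(dim)
        self.backward_student = ShallowMLP(dim)

    def losses(self, feat_early: torch.Tensor, feat_late: torch.Tensor) -> torch.Tensor:
        # Regression losses on nominal training patches (MSE assumed).
        l_fwd = torch.mean((self.forward_student(feat_early) - feat_late) ** 2)
        l_bwd = torch.mean((self.backward_student(feat_late) - feat_early) ** 2)
        return l_fwd + l_bwd

    @torch.no_grad()
    def anomaly_map(self, feat_early: torch.Tensor, feat_late: torch.Tensor) -> torch.Tensor:
        # Per-patch squared transfer error in both directions, shape (B, N).
        e_fwd = ((self.forward_student(feat_early) - feat_late) ** 2).mean(dim=-1)
        e_bwd = ((self.backward_student(feat_late) - feat_early) ** 2).mean(dim=-1)
        return e_fwd + e_bwd
```

In practice, the patch features would presumably be extracted from two layers of a frozen ViT backbone, and the per-patch error map reshaped to the token grid and upsampled to the input resolution to obtain a segmentation map; those steps are omitted here.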