On the limits of cross-domain generalization in automated X-ray prediction
Saved in:

Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Order full text |
Abstract: This large-scale study focuses on quantifying which X-ray diagnostic prediction tasks generalize well across multiple different datasets. We present evidence that the generalization problem is due not to a shift in the images but to a shift in the labels. We study cross-domain performance, agreement between models, and model representations. We find interesting discrepancies between performance and agreement: models that both achieve good performance can disagree in their predictions, and models that agree can both achieve poor performance. We also test for concept similarity by regularizing a network to group tasks across multiple datasets together, and observe variation across the tasks. All code is made available online and the data is publicly available: https://github.com/mlmed/torchxrayvision
DOI: 10.48550/arxiv.2002.02497