Artificial intelligence to identify fractures on pediatric and young adult upper extremity radiographs
Published in: | Pediatric radiology 2023-11, Vol.53 (12), p.2386-2397 |
---|---|
Format: | Article |
Language: | eng |
Online access: | Full text |
Summary: | Background
Pediatric fractures are challenging to identify given the different response of the pediatric skeleton to injury compared to adults, and most artificial intelligence (AI) fracture detection work has focused on adults.
Objective
Develop and transparently share an AI model capable of detecting a range of pediatric upper extremity fractures.
Materials and methods
In total, 58,846 upper extremity radiographs (finger/hand, wrist/forearm, elbow, humerus, shoulder/clavicle) from 14,873 pediatric and young adult patients were divided into train (n = 12,232 patients), tune (n = 1,307), internal test (n = 819), and external test (n = 515) splits. Fracture was determined by manual inspection of all test radiographs and of the subset of train/tune radiographs whose reports were classified fracture-positive by a rule-based natural language processing (NLP) algorithm. We trained an object detection model (Faster Region-based Convolutional Neural Network [R-CNN]; “strongly-supervised”) and an image classification model (EfficientNetV2-Small; “weakly-supervised”) to detect fractures using train/tune data and evaluated both on test data. AI fracture detection accuracy was compared with the accuracy of on-call residents on cases they preliminarily interpreted overnight.
Results
A strongly-supervised fracture detection AI model achieved overall test area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.95–0.97), accuracy 89.7% (95% CI 88.0–91.3%), sensitivity 90.8% (95% CI 88.5–93.1%), and specificity 88.7% (95% CI 86.4–91.0%), and outperformed a weakly-supervised model (AUC 0.93, 95% CI 0.92–0.94, P
|
---|---|
ISSN: | 1432-1998, 0301-0449 |
DOI: | 10.1007/s00247-023-05754-y |
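
The summary reports accuracy, sensitivity, and specificity with 95% confidence intervals. As a rough illustration of how such figures are derived from a confusion matrix, the following Python sketch computes the three proportions and a Wilson score interval for each. The counts are hypothetical, the function names are invented for this example, and the Wilson interval is an assumption: the record does not state which interval method the authors used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the record does not say how the 95% CIs were computed;
    Wilson is used here only as one common choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def classification_metrics(tp, fn, tn, fp):
    """Return (estimate, (ci_low, ci_high)) for each metric."""
    positives = tp + fn   # radiographs with a fracture
    negatives = tn + fp   # radiographs without a fracture
    total = positives + negatives
    return {
        "sensitivity": (tp / positives, wilson_interval(tp, positives)),
        "specificity": (tn / negatives, wilson_interval(tn, negatives)),
        "accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


# Hypothetical counts, NOT taken from the article.
metrics = classification_metrics(tp=90, fn=10, tn=80, fp=20)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Sensitivity and specificity are each computed over their own denominator (fracture-positive and fracture-negative radiographs respectively), which is why their intervals are wider than the interval for overall accuracy when the counts are similar.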