Artificial intelligence-based detection of paediatric appendicular skeletal fractures: performance and limitations for common fracture types and locations
Published in: | Pediatric radiology 2024-01, Vol.54 (1), p.136-145 |
---|---|
Main authors: | , , , , , , , |
Format: | Article |
Language: | eng |
Abstract: | Background
Research into artificial intelligence (AI)-based fracture detection in children is scarce and has disregarded the detection of indirect fracture signs and dislocations.
Objective
To assess the diagnostic accuracy of an existing AI-tool for the detection of fractures, indirect fracture signs, and dislocations.
Materials and methods
The AI software BoneView (Gleamer, Paris, France) was assessed for diagnostic accuracy of fracture detection, using paediatric radiology consensus diagnoses as the reference standard. Radiographs from a single emergency department were enrolled retrospectively, working backward from December 2021, with a limit of 1,000 radiographs per body part. Enrolment criteria were as follows: suspected fractures of the forearm, lower leg, or elbow; age 0–18 years; and radiographs in at least two projections.
Results
Lower leg radiographs showed 607 fractures. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were high (87.5%, 87.5%, 98.3%, 98.3%, respectively). Detection rate was low for toddler’s fractures, trampoline fractures, and proximal tibial Salter-Harris-II fractures.
Forearm radiographs showed 1,137 fractures. Sensitivity, specificity, PPV, and NPV were high (92.9%, 98.1%, 98.4%, 91.7%, respectively). Radial and ulnar bowing fractures were not reliably detected (one out of 11 radial bowing fractures and zero out of seven ulnar bowing fractures were correctly detected). Detection rate was low for styloid process avulsions, proximal radial buckle, and complete olecranon fractures.
Elbow radiographs showed 517 fractures. Sensitivity and NPV were moderate (80.5%, 84.7%, respectively). Specificity and PPV were high (94.9%, 93.3%, respectively). For joint effusion, sensitivity, specificity, PPV, and NPV were moderate (85.1%, 85.7%, 89.5%, 80%, respectively). For elbow dislocations, sensitivity and PPV were low (65.8%, 50%, respectively). Specificity and NPV were high (97.7%, 98.8%, respectively).
Conclusions
The diagnostic performance of BoneView is promising for forearm and lower leg fractures. However, improvement is mandatory before clinicians can rely solely on AI-based paediatric fracture detection using this software.
|
---|---|
ISSN: | 1432-1998 0301-0449 |
DOI: | 10.1007/s00247-023-05822-3 |
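
The four accuracy measures reported in the abstract (sensitivity, specificity, PPV, NPV) all derive from a 2x2 table of AI output against the reference diagnosis. The following is a minimal sketch of that arithmetic, not the authors' analysis code. The names `ConfusionCounts`, `ratio`, and `diagnostic_metrics` are illustrative assumptions, and the counts in `example` are hypothetical, not study data. Only the bowing fracture counts (1 of 11 radial, 0 of 7 ulnar) come from the abstract; pooling them into one rate is an illustration added here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a 2x2 table: AI output versus reference diagnosis."""
    tp: int  # fracture present, flagged by the AI
    fp: int  # fracture absent, flagged by the AI
    fn: int  # fracture present, missed by the AI
    tn: int  # fracture absent, not flagged by the AI


def ratio(numerator, denominator):
    """Return numerator / denominator, or None if the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(c):
    """Compute the four standard accuracy measures from 2x2 counts."""
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # detected / all with fracture
        "specificity": ratio(c.tn, c.tn + c.fp),  # cleared / all without fracture
        "ppv": ratio(c.tp, c.tp + c.fp),          # correct among AI-positive
        "npv": ratio(c.tn, c.tn + c.fn),          # correct among AI-negative
    }


# Hypothetical counts, chosen only to illustrate the arithmetic (not study data).
example = ConfusionCounts(tp=90, fp=5, fn=10, tn=95)
metrics = diagnostic_metrics(example)

# Counts taken from the abstract: 1 of 11 radial and 0 of 7 ulnar bowing
# fractures were detected. Pooling the two is an assumption made here.
bowing_detection_rate = ratio(1 + 0, 11 + 7)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print(f"pooled bowing fracture detection rate: {bowing_detection_rate:.3f}")
```

PPV and NPV depend on how common fractures are in the sample, so the values reported for a single emergency department may not carry over to settings with a different fracture prevalence.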