Statistical validation of image segmentation quality based on a spatial overlap index

Bibliographic details
Published in: Academic Radiology 2004-02, Vol. 11 (2), p. 178-189
Main authors: Zou, Kelly H, Warfield, Simon K, Bharatha, Aditya, Tempany, Clare M C, Kaus, Michael R, Haker, Steven J, Wells, William M, Jolesz, Ferenc A, Kikinis, Ron
Format: Article
Language: English
Description
Abstract:
Rationale and Objectives. To examine a statistical validation method based on the spatial overlap between two sets of segmentations of the same anatomy.
Materials and Methods. The Dice similarity coefficient (DSC) was used as a statistical validation metric to evaluate both the reproducibility of manual segmentations and the spatial overlap accuracy of automated probabilistic fractional segmentation of MR images, illustrated on two clinical examples. Example 1: 10 consecutive prostate brachytherapy patients underwent both preoperative 1.5T and intraoperative 0.5T MR imaging. For each case, 5 repeated manual segmentations of the prostate peripheral zone were performed separately on the preoperative and the intraoperative images. Example 2: A semi-automated probabilistic fractional segmentation algorithm was applied to MR imaging of 9 cases with 3 types of brain tumors. DSC values were computed, and the means of the logit-transformed values were compared using analysis of variance (ANOVA).
Results. Example 1: The mean DSCs of 0.883 (range, 0.876-0.893) with 1.5T preoperative MRI and 0.838 (range, 0.819-0.852) with 0.5T intraoperative MRI (P < .001) were within and at the margin of the range of good reproducibility, respectively. Example 2: Wide ranges of DSC were observed in brain tumor segmentations: meningiomas (0.519-0.893), astrocytomas (0.487-0.972), and other mixed gliomas (0.490-0.899).
Conclusion. The DSC value is a simple and useful summary measure of spatial overlap, which can be applied to studies of reproducibility and accuracy in image segmentation. We observed generally satisfactory but variable validation results in two clinical applications. This metric may be adapted for similar validation tasks.
ISSN: 1076-6332
DOI: 10.1016/S1076-6332(03)00671-8
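
The abstract names the Dice similarity coefficient and a logit transform but this record does not spell out either formula. As a minimal sketch, assuming the standard definitions DSC = 2|A ∩ B| / (|A| + |B|) for two binary segmentations A and B, and logit(DSC) = ln(DSC / (1 - DSC)), the computation could look like the following. The function names, the flat 0/1 mask representation, and the toy masks are illustrative assumptions, not taken from the article.

```python
import math


def dice_coefficient(mask_a, mask_b):
    """Return DSC = 2|A ∩ B| / (|A| + |B|) for two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    if size_a + size_b == 0:
        # Assumed convention: two empty segmentations agree perfectly.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


def logit(dsc):
    """Map a DSC in the open interval (0, 1) onto the real line."""
    if not 0.0 < dsc < 1.0:
        raise ValueError("logit is only defined for values strictly between 0 and 1")
    return math.log(dsc / (1.0 - dsc))


# Hypothetical toy masks (flattened voxel labels), for illustration only.
seg_1 = [0, 1, 1, 1, 0, 0, 1, 0]
seg_2 = [0, 1, 1, 0, 0, 1, 1, 0]

dsc = dice_coefficient(seg_1, seg_2)  # 3 shared voxels, 4 + 4 labeled voxels
print(f"DSC = {dsc:.3f}, logit(DSC) = {logit(dsc):.3f}")
```

The logit step is useful because DSC is bounded between 0 and 1, whereas the transformed values are unbounded, which is better suited to a comparison of means such as ANOVA.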