Automated location of orofacial landmarks to characterize airway morphology in anaesthesia via deep convolutional neural networks

Bibliographic details
Published in: Computer methods and programs in biomedicine, 2023-04, Vol. 232, p. 107428, Article 107428
Main authors: García-García, Fernando; Lee, Dae-Jin; Mendoza-Garcés, Francisco J.; Irigoyen-Miró, Sofía; Legarreta-Olabarrieta, María J.; García-Gutiérrez, Susana; Arostegui, Inmaculada
Format: Article
Language: English
Online access: Full text
Description
Summary:
• Proposed two ad-hoc deep learning networks to locate orofacial landmarks for anaesthesia from preoperative photos.
• Trained by successive transfer learning stages, and with data augmentation techniques.
• Compared to the consensus between manual annotations by two independent anaesthesiologists, as ground truth.
• Achieved satisfactory automatic landmark location, without overfitting.
• Losses were comparable to inter-human discrepancy in the frontal view.

Background: Reliably anticipating a difficult airway may notably enhance safety during anaesthesia. In current practice, clinicians use bedside screenings based on manual measurements of patients’ morphology.

Objective: To develop and evaluate algorithms for the automated extraction of orofacial landmarks, which characterize airway morphology.

Methods: We defined 27 frontal and 13 lateral landmarks. We collected n=317 pairs of pre-surgery photos from patients undergoing general anaesthesia (140 females, 177 males). As ground truth reference for supervised learning, landmarks were independently annotated by two anaesthesiologists. We trained two ad-hoc deep convolutional neural network architectures based on InceptionResNetV2 (IRNet) and MobileNetV2 (MNet), to predict simultaneously: (a) whether each landmark is visible or not (occluded, out of frame), and (b) its 2D coordinates (x,y). We implemented successive stages of transfer learning, combined with data augmentation. We added custom top layers to these networks, whose weights were fully tuned for our application. Performance in landmark extraction was evaluated by 10-fold cross-validation (CV) and compared against 5 state-of-the-art deformable models.
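The 10-fold cross-validation mentioned in the Methods can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names `k_fold_indices` and `median_and_iqr`, the random seed, and the shuffling scheme are assumptions, and the abstract does not state how patients were assigned to folds. The only figures taken from the abstract are n=317 and k=10, together with the fact that losses are summarized as median and inter-quartile range.

```python
import random
import statistics


def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled, near-equal validation folds.

    Hypothetical helper: the seed and shuffling scheme are assumptions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def median_and_iqr(values):
    """Return (median, first quartile, third quartile) of a list of losses."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# n=317 photo pairs and k=10 folds, as stated in the abstract.
folds = k_fold_indices(317, k=10)
for i, val_idx in enumerate(folds):
    # Every sample outside fold i is used for training in round i.
    train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
    # A real pipeline would fit the network on train_idx and score val_idx here.
    assert len(train_idx) + len(val_idx) == 317
```

With 317 samples, this split yields seven folds of 32 and three folds of 31, so each photo pair is used for validation exactly once.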
Results: With annotators’ consensus as the ‘gold standard’, our IRNet-based network performed comparably to humans in the frontal view: median CV loss L=1.277·10⁻³, inter-quartile range (IQR) [1.001, 1.660]; versus median 1.360, IQR [1.172, 1.651], and median 1.352, IQR [1.172, 1.619], for each annotator against consensus, respectively. MNet yielded slightly worse results: median 1.471, IQR [1.139, 1.982]. In the lateral view, both networks attained performances statistically poorer than humans: median CV loss L=2.141·10⁻³, IQR [1.676, 2.915], and median 2.611, IQR [1.898, 3.535], respectively; versus median 1.507, IQR [1.188, 1.988], and median 1.442, IQR [1.147, 2.010] for the two annotators. However, standardized effect sizes in CV loss were small: 0.0322 and 0.0235 (non-significant) for IRNet,
ISSN: 0169-2607, 1872-7565
DOI:10.1016/j.cmpb.2023.107428