BowNet: Dilated convolutional neural network for ultrasound tongue contour extraction

Bibliographic Details
Published in: The Journal of the Acoustical Society of America, 2019-10, Vol. 146 (4), pp. 2940-2941
Authors: Mozaffari, M. Hamed; Sankoff, David; Lee, Won-Sook
Format: Article
Language: English
Online access: Full text
Abstract: One use of medical ultrasound imaging is to visualize and characterize human tongue shape and motion during real-time speech in order to study healthy or impaired speech production. Because of the low contrast and noisy nature of ultrasound images, it can be difficult for non-expert users to recognize tongue gestures in applications such as visual training for a second language. Several end-to-end deep learning segmentation methods provide promising alternatives, offering higher accuracy and robustness without any manual intervention. Employing the power of the graphics processing unit together with state-of-the-art deep neural network models makes it feasible to build fully automatic, accurate, and robust segmentation methods capable of real-time performance. This paper presents a novel deep neural network for tongue contour extraction, BowNet, which benefits from the exploitation capability of dilated convolutions: by effectively expanding the receptive field without losing resolution, it extracts clear tongue contours. In addition, efficient exploration of abstract context is carried out by down-sampling layers to achieve segmentation results with high resolution and relevancy. Two versions, BowNet and wBowNet, are studied qualitatively and quantitatively on datasets from two different ultrasound machines. Our experiments revealed the outstanding performance of the proposed models in terms of accuracy and robustness in comparison with similarly sized models.
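The abstract gives no implementation details, but the core idea it describes, expanding the receptive field without losing resolution, is easy to illustrate. Below is a minimal PyTorch sketch of a dilated-convolution block; the channel counts, dilation rates, and layer arrangement are assumptions for illustration only, not the published BowNet architecture. With dilation d, a 3x3 kernel has an effective extent of 2d + 1 pixels along each axis, so stacking dilations 1, 2, 4 widens the receptive field rapidly while stride-1, same-padded convolutions keep the feature map at full resolution.

```python
# Minimal sketch of a dilated-convolution block in PyTorch.
# The layer names, channel counts, and dilation rates are
# illustrative assumptions, not the published BowNet configuration.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    """Stack of 3x3 convolutions with growing dilation rates.

    With dilation d, a 3x3 kernel covers an effective area of
    (2d + 1) x (2d + 1), so dilations 1, 2, 4 widen the receptive
    field quickly while the feature-map resolution stays unchanged
    (stride 1, padding chosen to preserve spatial size).
    """
    def __init__(self, channels=64, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        for d in dilations:
            layers += [
                # padding=d keeps the spatial size constant for a
                # 3x3 kernel with dilation d and stride 1
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

# A 128x128 feature map (e.g., from an ultrasound frame) passes
# through with its spatial size unchanged:
x = torch.randn(1, 64, 128, 128)
print(DilatedBlock()(x).shape)  # torch.Size([1, 64, 128, 128])
```

For low-contrast ultrasound frames this property matters because the contour decision at each pixel can draw on wide spatial context without the detail loss that repeated down-sampling and up-sampling would introduce.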
ISSN: 0001-4966 (print); 1520-8524 (electronic)
DOI: 10.1121/1.5137212