Lip Reading Using Committee Networks With Two Different Types of Concatenated Frame Images
Published in: | IEEE Access 2019, Vol. 7, pp. 90125-90131 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | This paper proposes a lip-reading method based on convolutional neural networks (CNNs) applied to two different types of concatenated frame images (CFIs), consisting of (a) full-lip images and (b) patches around lip landmarks. In addition, we introduce committee networks that combine the predictions obtained from the two types of CFIs; these perform better than single networks or committee networks that use only one type of CFI. For efficient training on a limited dataset, such as OuluVS2, we propose a time-based label-preserving transform and use a quarter VGG-m, in which the number of parameters is reduced compared to VGG-m. Experimental results on the OuluVS2 dataset show that the proposed method, using both types of CFIs in committee networks, outperforms state-of-the-art methods that do not pre-train on a large-scale dataset. |
---|---|
ISSN: | 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2927166 |
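
The two ideas named in the abstract, concatenated frame images and committee networks, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how frames are arranged within a CFI or how the committee combines predictions, so the row-major grid layout, the plain probability averaging, and the function names `concatenate_frames` and `committee_predict` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def concatenate_frames(frames: Sequence[Frame], cols: int) -> Frame:
    """Tile equally sized frames row-major into one grid image.

    Assumed layout: the abstract does not specify the arrangement.
    """
    if not frames or cols <= 0 or len(frames) % cols != 0:
        raise ValueError("need a non-empty frame list that fills whole grid rows")
    height = len(frames[0])
    grid: Frame = []
    for start in range(0, len(frames), cols):
        row_of_frames = frames[start:start + cols]
        for y in range(height):
            line: List[float] = []
            for frame in row_of_frames:
                line.extend(frame[y])  # append this frame's row y to the grid row
            grid.append(line)
    return grid


def committee_predict(prob_sets: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities from several networks, return the argmax.

    Plain averaging is an assumption; the paper may combine predictions differently.
    """
    n_classes = len(prob_sets[0])
    mean = [sum(p[c] for p in prob_sets) / len(prob_sets) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


if __name__ == "__main__":
    # Four 2x2 frames tiled into a 2-column grid, giving a 4x4 image.
    frames = [[[float(i)] * 2 for _ in range(2)] for i in range(1, 5)]
    cfi = concatenate_frames(frames, cols=2)
    print(len(cfi), len(cfi[0]))
    # Hypothetical outputs of a full-lip network and a landmark-patch network.
    print(committee_predict([[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]))
```

In this sketch, each network would receive its own CFI (full-lip or landmark patches), and only the class probabilities are merged at the end.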