Speaker tracking method and system based on multi-modal information
Main authors: | , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | The invention discloses a speaker tracking method and system based on multi-modal information, and relates to the field of speaker tracking. The method can be applied to online speaker tracking for in-person or online conferences, where it locates the speaker quickly and accurately and can provide a close-up of the speaker. It can also be used for offline tasks that label the speaker in each part of a provided video. When several faces appear in the same picture and the people take turns speaking, the method uses the input image and the corresponding audio to calculate, for each face in the image, a lip-movement score, a voice-appearance matching score, and a lip synchronization score, and locates the specific speaker according to the scores of each face. The method also supports inputting registered, paired voice-face pairs in advance, and the voice and face pairs with high pa |
---|
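
The scoring step described in the summary can be illustrated with a minimal sketch. The summary names three per-face scores but does not say how they are combined, so the weighted average, the `threshold` parameter, and all names below (`FaceScores`, `combined_score`, `locate_speaker`) are assumptions made for illustration, not the patented method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class FaceScores:
    """Hypothetical container for the three per-face scores named in the summary."""
    face_id: str
    lip_movement: float      # how strongly the lips are moving
    voice_face_match: float  # how well the voice matches the face's appearance
    lip_sync: float          # how well lip shape is synchronized with the audio


def combined_score(s: FaceScores,
                   weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Weighted average of the three scores (the combination rule is an assumption)."""
    w_move, w_match, w_sync = weights
    total = w_move + w_match + w_sync
    return (w_move * s.lip_movement
            + w_match * s.voice_face_match
            + w_sync * s.lip_sync) / total


def locate_speaker(faces: Sequence[FaceScores],
                   threshold: float = 0.5) -> Optional[str]:
    """Return the id of the highest-scoring face, or None if no face reaches the threshold."""
    if not faces:
        return None
    best = max(faces, key=combined_score)
    return best.face_id if combined_score(best) >= threshold else None


faces = [
    FaceScores("face_a", lip_movement=0.9, voice_face_match=0.8, lip_sync=0.7),
    FaceScores("face_b", lip_movement=0.2, voice_face_match=0.3, lip_sync=0.1),
]
print(locate_speaker(faces))
```

Returning `None` when every face scores below the threshold is one way to handle frames in which nobody visible is speaking; the summary does not state how that case is treated.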