Training for long form speech recognition
Main authors: | , , , , , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | A method (700) includes obtaining training samples (400), each training sample including a corresponding sequence of speech segments (405) corresponding to a training utterance and a corresponding sequence of ground-truth transcriptions (415) of the sequence of speech segments, each ground-truth transcription including a start time (414) and an end time (416) of the corresponding speech segment. For each training sample, the method includes processing the corresponding sequence of speech segments using a speech recognition model (200) to obtain one or more speech recognition hypotheses (522) for the training utterance and, for each speech recognition hypothesis obtained for the training utterance, identifying a respective number of word errors relative to the corresponding sequence of ground-truth transcriptions. The method trains the speech recognition model to minimize word error rate based on the respective number of word errors identified for each speech recognition hypothesis obtained for the training utterance. |
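
The summary describes counting word errors for each hypothesis against the sequence of reference transcriptions and training to minimize word error rate. The following is a minimal Python sketch of that idea, not the patented implementation. It assumes, purely for illustration, that each reference segment is a `(start_time, end_time, text)` tuple, that the segment texts are concatenated in start-time order to form one reference for the utterance, and that the loss is the expected number of word errors over the hypotheses with the mean subtracted (a form commonly used for minimum word error rate training). The names `word_errors` and `expected_word_error_loss` are hypothetical.

```python
import math


def word_errors(hypothesis, reference):
    """Word-level edit distance: substitutions + insertions + deletions."""
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))  # distances from an empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # extra word in hypothesis
                            curr[j - 1] + 1,      # word missing from hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def expected_word_error_loss(hypotheses, log_probs, segments):
    """Return (per-hypothesis word errors, mean-subtracted expected errors).

    Assumption: `segments` is a list of (start_time, end_time, text) tuples;
    their texts are joined in start-time order to form the reference.
    """
    reference = " ".join(text for _, _, text in sorted(segments))
    errors = [word_errors(h, reference) for h in hypotheses]

    # Renormalize the model scores over the hypothesis list (softmax).
    peak = max(log_probs)
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Subtracting the mean error count is an assumed variance-reduction step.
    mean_errors = sum(errors) / len(errors)
    loss = sum(p * (e - mean_errors) for p, e in zip(probs, errors))
    return errors, loss


if __name__ == "__main__":
    segments = [(0.0, 1.2, "the cat sat"), (1.5, 2.8, "on the mat")]
    hypotheses = ["the cat sat on the mat",
                  "the cat sat on a mat",
                  "a cat on the mat"]
    log_probs = [math.log(0.5), math.log(0.3), math.log(0.2)]
    print(expected_word_error_loss(hypotheses, log_probs, segments))
```

In a real training loop the hypothesis log-probabilities would come from the speech recognition model and the loss would be differentiated with respect to its parameters; here they are plain floats so the arithmetic can be checked by hand.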