Method and device for aligning audio and text, electronic equipment and storage medium
Main authors: | , , |
---|---|
Format: | Patent |
Language: | chi ; eng |
Subjects: | |
Summary: | An embodiment of the invention provides a method and device for aligning audio and text, electronic equipment, and a storage medium. The method comprises the following steps: acquiring a target text and its corresponding target audio; determining the first phonemes corresponding to the words in the target text according to a preset correspondence between words and phonemes; adding, according to the first phoneme sequence formed by the first phonemes, a preset phoneme after each to-be-processed phoneme to obtain the second phonemes; obtaining a target probability that each target audio frame corresponds to each second phoneme, based on the spectral features of each target audio frame in the target audio and a pre-trained probability prediction model; determining, from the second phonemes, the target phoneme corresponding to each target audio frame, based on the target probabilities and the second phoneme sequence formed by the second phonemes; and according to the embodiment of the invention, determining the text to |
---|
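
The steps in the summary resemble a conventional forced alignment, which can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation. The names `expand_phonemes` and `align_frames`, the filler symbol `"sil"`, the toy lexicon, and the choice of a Viterbi search are all assumptions. The sketch also assumes that every first phoneme counts as "to-be-processed", that the preset phoneme may be skipped, and that the per-frame probabilities are already available from some model.

```python
import math

NEG_INF = float("-inf")


def expand_phonemes(words, lexicon, filler="sil"):
    """Map words to phonemes, then add a filler phoneme after each one.

    Assumption: every phoneme is treated as 'to-be-processed'.
    """
    first = [p for w in words for p in lexicon[w]]  # first phoneme sequence
    second = []
    for p in first:
        second.extend([p, filler])  # preset phoneme placed after each phoneme
    return second


def align_frames(frame_probs, second, filler="sil"):
    """Assign one phoneme of `second` to every frame, keeping sequence order.

    frame_probs[t][ph] is the (hypothetical) model probability that frame t
    is phoneme ph. A Viterbi search is used here as a stand-in technique;
    fillers are optional and may be skipped.
    """
    T, S = len(frame_probs), len(second)

    def logp(t, j):
        return math.log(max(frame_probs[t][second[j]], 1e-12))

    score = [[NEG_INF] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    score[0][0] = logp(0, 0)  # the path must start on the first phoneme
    for t in range(1, T):
        for j in range(S):
            cands = [j]  # stay on the same phoneme
            if j >= 1:
                cands.append(j - 1)  # advance by one position
            if j >= 2 and second[j - 1] == filler:
                cands.append(j - 2)  # skip an optional filler
            best = max(cands, key=lambda i: score[t - 1][i])
            if score[t - 1][best] > NEG_INF:
                score[t][j] = score[t - 1][best] + logp(t, j)
                back[t][j] = best
    # The path may end on the last real phoneme or on the trailing filler.
    end = max(range(max(S - 2, 0), S), key=lambda j: score[T - 1][j])
    if score[T - 1][end] == NEG_INF:
        raise ValueError("too few frames to cover the phoneme sequence")
    path = [end]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return [second[j] for j in path]


# Toy usage with made-up probabilities for a one-word text.
lexicon = {"hi": ["HH", "AY"]}
second = expand_phonemes(["hi"], lexicon)
frames = [
    {"HH": 0.8, "AY": 0.1, "sil": 0.1},
    {"HH": 0.7, "AY": 0.2, "sil": 0.1},
    {"HH": 0.1, "AY": 0.8, "sil": 0.1},
    {"HH": 0.1, "AY": 0.1, "sil": 0.8},
]
aligned = align_frames(frames, second)
```

Once each frame has a phoneme, frame indices can be converted to times using the frame hop, and the phoneme spans can be grouped back into words through the lexicon. How the patent itself performs that final step is not recoverable from the truncated summary.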