AI-ASSISTED SOUND EFFECT GENERATION FOR SILENT VIDEO
Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference visual, a positive audio signal, and a negative audio signal. A trained Sound Recommendation Network is configured to...
Gespeichert in:
1. Verfasser: | |
---|---|
Format: | Patent |
Sprache: | eng ; fre ; ger |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference visual, a positive audio signal, and a negative audio signal. A trained Sound Recommendation Network is configured to output an audio embedding and a visual embedding and use the audio embedding and visual embedding to compute a correlation distance between an image frame or video segment and one or more audio segments retrieved from a database. The correlation distances for the one or more audio segments in the database are sorted and one or more audio segments with the closest correlation distance from the sorted audio correlation distances are determined. The audio segment with the closest audio correlation distance is applied to the input image frame or video segment. |
---|