AI-ASSISTED SOUND EFFECT GENERATION FOR SILENT VIDEO

Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference visual, a positive audio signal, and a negative audio signal. A trained Sound Recommendation Network is configured to...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
1. Verfasser:	KRISHNAMURTHY, Sudha
Format:	Patent
Sprache:	eng ; fre ; ger
Schlagworte:	ACOUSTICS CALCULATING COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS COMPUTING COUNTING ELECTRIC COMMUNICATION TECHNIQUE ELECTRIC DIGITAL DATA PROCESSING ELECTRICITY INFORMATION STORAGE INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORDCARRIER AND TRANSDUCER MUSICAL INSTRUMENTS PHYSICS PICTORIAL COMMUNICATION, e.g. TELEVISION SPEECH ANALYSIS OR SYNTHESIS SPEECH OR AUDIO CODING OR DECODING SPEECH OR VOICE PROCESSING SPEECH RECOGNITION
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Sound effect recommendations for visual input are generated by training machine learning models that learn coarse-grained and fine-grained audio-visual correlations from a reference visual, a positive audio signal, and a negative audio signal. A trained Sound Recommendation Network is configured to output an audio embedding and a visual embedding and use the audio embedding and visual embedding to compute a correlation distance between an image frame or video segment and one or more audio segments retrieved from a database. The correlation distances for the one or more audio segments in the database are sorted and one or more audio segments with the closest correlation distance from the sorted audio correlation distances are determined. The audio segment with the closest audio correlation distance is applied to the input image frame or video segment.