ACTIVE SPEECH SEPARATION SYSTEM BY FACE RECOGNITION

Bibliographic details
Main author:	CHOI, HYUN BAE
Format:	Patent
Language:	English; Korean
Subjects:	ACOUSTICS; CALCULATING; COMPUTING; COUNTING; HANDLING RECORD CARRIERS; MUSICAL INSTRUMENTS; PHYSICS; PRESENTATION OF DATA; RECOGNITION OF DATA; RECORD CARRIERS; SPEECH ANALYSIS OR SYNTHESIS; SPEECH OR AUDIO CODING OR DECODING; SPEECH OR VOICE PROCESSING; SPEECH RECOGNITION

Description
Summary:	The present invention relates to an active voice separation system based on face recognition that can separate a voice from noise even with a single microphone, by means of a model trained through an active feature extraction unit operating on video. The system comprises: a device (101) that records with one or more microphones and a camera and transmits the recorded data to a main server over a communication network; a face recognition unit (201) that recognizes a face in the original video transmitted to the main server; a voice feature extraction unit (203) that extracts preset voice-related facial features from the face image data, the extracted values having specific characteristics that depend on the wavelength data of the voice; a voice wavelength processing unit (206) that separates the wavelengths of the audio; an active noise separation unit (205) that matches the video features with the audio wavelengths to separate noise; a terminal transmission unit (204) that selectively transfers the matched audio; a voice feature extraction unit (208) that receives the face image data and extracts voice features; an active noise learning unit (210) that receives pure voice data and pure noise data, synthesizes noisy voice data from them, and trains a model with the pure voice data as the output values to produce a noise separation model; and a noise separation model extraction unit (209) that extracts the model produced by the active noise learning unit.
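
The data preparation attributed to the active noise learning unit (210) can be illustrated with a minimal sketch: pure voice and pure noise are combined into a noisy input, and the pure voice is kept as the training target. The abstract does not say how the two signals are combined, so the SNR-based scaling below is an assumption, and the names mix_at_snr and make_training_pair, the 16 kHz rate, and the toy signals are all hypothetical.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_at_snr(voice, noise, snr_db):
    """Add noise to voice, scaled so the mix has the requested SNR in dB.

    Assumption: the abstract only says the two signals are synthesized
    together; scaling to a target SNR is one common way to do that.
    """
    if len(voice) != len(noise):
        raise ValueError("voice and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise is silent")
    # Gain that places the noise snr_db decibels below the voice level.
    gain = rms(voice) / (noise_rms * 10 ** (snr_db / 20))
    return [v + gain * n for v, n in zip(voice, noise)]


def make_training_pair(voice, noise, snr_db=5.0):
    """Return (model_input, target): the noisy mix and the clean voice."""
    return mix_at_snr(voice, noise, snr_db), list(voice)


# Toy data: a 440 Hz tone stands in for pure voice, Gaussian samples for pure noise.
rate = 16000
voice = [math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(rate)]

noisy, target = make_training_pair(voice, noise, snr_db=5.0)

# Sanity check: the noise actually present in the mix sits at the requested SNR.
residual = [x - y for x, y in zip(noisy, target)]
achieved_snr_db = 20 * math.log10(rms(target) / rms(residual))
```

A separation model would then be fitted to map noisy (together with the video features, in the system described) to target; that training step is outside the scope of this sketch.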