Method and apparatus for generating speech video

In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, gen...
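The masking step named in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the record does not say how the key points are obtained or what shape the mask has. The sketch assumes the chin key points trace the jaw line as (x, y) pixel coordinates, that the nose average value point is the mean of a few nose key points, and that the masked region lies between the horizontal line through that point and the jaw contour. The function names `nose_average_point` and `mask_lower_face` are hypothetical.

```python
import numpy as np


def nose_average_point(nose_points):
    """Mean (x, y) of a set of nose key points (assumed definition)."""
    return np.asarray(nose_points, dtype=float).mean(axis=0)


def mask_lower_face(image, chin_points, nose_avg, fill=0):
    """Return a copy of `image` with the lower face region set to `fill`.

    Assumed region: for every pixel column spanned by the chin key points,
    the rows from the nose average point's y down to the jaw contour,
    where the contour is linearly interpolated between chin key points.
    """
    chin = np.asarray(chin_points, dtype=float)
    chin = chin[np.argsort(chin[:, 0])]  # order jaw points left to right

    height, width = image.shape[:2]
    x0 = max(int(np.ceil(chin[0, 0])), 0)
    x1 = min(int(np.floor(chin[-1, 0])), width - 1)
    cols = np.arange(x0, x1 + 1)

    # Jaw height (y) at each pixel column.
    jaw_y = np.interp(cols, chin[:, 0], chin[:, 1])
    top = int(round(float(nose_avg[1])))

    # Boolean region: below the nose line and above (or on) the jaw contour.
    rows = np.arange(height)[:, None]
    region = (rows >= top) & (rows <= jaw_y[None, :])

    masked = image.copy()
    masked[:, x0:x1 + 1][region] = fill  # writes through the slice view
    return masked
```

Called as `mask_lower_face(frame, chin_points, nose_average_point(nose_points))`, this would yield something like the "first person background image" of the abstract; combining lip key points with that image and generating the final utterance image are separate steps that the record does not detail.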

Detailed description

Bibliographic details
Main authors:	JO SO YEON, SHIN YOON HO, HWANG SUN HEE, LEE SEUNG HYUN, JEON BYOUNG KI, PARK SANG HOON
Format:	Patent
Language:	eng ; kor
Subjects:	ACOUSTICS ; CALCULATING ; COMPUTING ; COUNTING ; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON
description	In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_KR20230123184A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>KR20230123184A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_KR20230123184A3</originalsourceid><addsrcrecordid>eNrjZDDwTS3JyE9RSMwD4oKCxKLEktJihbT8IoX01LxUIC8zL12huCA1NTlDoSwzJTWfh4E1LTGnOJUXSnMzKLu5hjh76KYW5MenFhckJgP1lcR7BxkZGBkbGBoZG1qYOBoTpwoA_uEsBg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Method and apparatus for generating speech video</title><source>esp@cenet</source><creator>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</creator><creatorcontrib>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</creatorcontrib><description>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 
일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</description><language>eng ; kor</language><subject>ACOUSTICS ; CALCULATING ; COMPUTING ; COUNTING ; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230823&DB=EPODOC&CC=KR&NR=20230123184A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76418</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20230823&DB=EPODOC&CC=KR&NR=20230123184A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>JO SO YEON</creatorcontrib><creatorcontrib>SHIN YOON HO</creatorcontrib><creatorcontrib>HWANG SUN HEE</creatorcontrib><creatorcontrib>LEE SEUNG HYUN</creatorcontrib><creatorcontrib>JEON BYOUNG KI</creatorcontrib><creatorcontrib>PARK SANG HOON</creatorcontrib><title>Method and apparatus for generating speech video</title><description>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point 
with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</description><subject>ACOUSTICS</subject><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>IMAGE DATA PROCESSING OR GENERATION, IN GENERAL</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDDwTS3JyE9RSMwD4oKCxKLEktJihbT8IoX01LxUIC8zL12huCA1NTlDoSwzJTWfh4E1LTGnOJUXSnMzKLu5hjh76KYW5MenFhckJgP1lcR7BxkZGBkbGBoZG1qYOBoTpwoA_uEsBg</recordid><startdate>20230823</startdate><enddate>20230823</enddate><creator>JO SO YEON</creator><creator>SHIN YOON HO</creator><creator>HWANG SUN HEE</creator><creator>LEE SEUNG HYUN</creator><creator>JEON BYOUNG KI</creator><creator>PARK SANG HOON</creator><scope>EVB</scope></search><sort><creationdate>20230823</creationdate><title>Method and apparatus for generating speech video</title><author>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_KR20230123184A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; 
kor</language><creationdate>2023</creationdate><topic>ACOUSTICS</topic><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>IMAGE DATA PROCESSING OR GENERATION, IN GENERAL</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>JO SO YEON</creatorcontrib><creatorcontrib>SHIN YOON HO</creatorcontrib><creatorcontrib>HWANG SUN HEE</creatorcontrib><creatorcontrib>LEE SEUNG HYUN</creatorcontrib><creatorcontrib>JEON BYOUNG KI</creatorcontrib><creatorcontrib>PARK SANG HOON</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>JO SO YEON</au><au>SHIN YOON HO</au><au>HWANG SUN HEE</au><au>LEE SEUNG HYUN</au><au>JEON BYOUNG KI</au><au>PARK SANG HOON</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Method and apparatus for generating speech video</title><date>2023-08-23</date><risdate>2023</risdate><abstract>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 
일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	eng ; kor
recordid	cdi_epo_espacenet_KR20230123184A
source	esp@cenet
subjects	ACOUSTICS ; CALCULATING ; COMPUTING ; COUNTING ; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION
title	Method and apparatus for generating speech video
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T17%3A11%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=JO%20SO%20YEON&rft.date=2023-08-23&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EKR20230123184A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true