Method and apparatus for generating speech video

In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, gen...
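
The masking step named in the abstract can be illustrated with a minimal Python sketch. The record gives no implementation details, so everything here is an assumption made for illustration: the function name `mask_lower_face`, key points given as (x, y) pixel coordinates, the nose average value point treated as a single point whose y-coordinate marks the top of the masked region, and linear interpolation between chin key points for the lower boundary.

```python
import numpy as np


def mask_lower_face(image, chin_points, nose_point, fill=0):
    """Return a copy of `image` with the area between the nose line and
    the chin contour overwritten by `fill`.

    Hypothetical helper: chin_points is a sequence of (x, y) pixel
    coordinates, nose_point is a single (x, y) pair. Neither layout is
    taken from the patent record.
    """
    chin = np.asarray(chin_points, dtype=float)
    chin = chin[np.argsort(chin[:, 0])]  # np.interp needs increasing x
    nose_y = int(round(nose_point[1]))
    height, width = image.shape[:2]

    # Columns covered by the chin contour, clipped to the image width.
    x_start = max(int(np.ceil(chin[0, 0])), 0)
    x_end = min(int(np.floor(chin[-1, 0])), width - 1)
    cols = np.arange(x_start, x_end + 1)

    # Chin height at every integer column (assumed linear between points).
    chin_y = np.interp(cols, chin[:, 0], chin[:, 1])

    # Select pixels at or below the nose line and at or above the chin line.
    rows = np.arange(height)[:, None]
    region = (rows >= nose_y) & (rows <= chin_y[None, :])

    masked = image.copy()
    masked[:, x_start:x_end + 1][region] = fill
    return masked


# Example call on a synthetic 10x10 single-channel image.
frame = np.ones((10, 10), dtype=np.uint8)
background = mask_lower_face(frame, [(2, 6), (5, 8), (8, 6)], (5, 4))
```

The result corresponds loosely to what the abstract calls the first person background image: the original frame with the lower face blanked out, onto which lip key points would later be combined. How the patent actually builds the mask may differ from this column-wise approach.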

Detailed description

Bibliographic details
Main authors: JO SO YEON, SHIN YOON HO, HWANG SUN HEE, LEE SEUNG HYUN, JEON BYOUNG KI, PARK SANG HOON
Format: Patent
Language: eng ; kor
creator JO SO YEON
SHIN YOON HO
HWANG SUN HEE
LEE SEUNG HYUN
JEON BYOUNG KI
PARK SANG HOON
description In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input; generates a first person background image by masking the lower part of the face in an original image using the chin key point and the nose average value point; generates a second person background image by combining the lip key point with the first person background image; and generates a final utterance image from the second person background image. The invention thereby addresses the difficulty of generating the utterance image. 일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.
format Patent
fullrecord <record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_KR20230123184A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>KR20230123184A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_KR20230123184A3</originalsourceid><addsrcrecordid>eNrjZDDwTS3JyE9RSMwD4oKCxKLEktJihbT8IoX01LxUIC8zL12huCA1NTlDoSwzJTWfh4E1LTGnOJUXSnMzKLu5hjh76KYW5MenFhckJgP1lcR7BxkZGBkbGBoZG1qYOBoTpwoA_uEsBg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>Method and apparatus for generating speech video</title><source>esp@cenet</source><creator>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</creator><creatorcontrib>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</creatorcontrib><description>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 
일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</description><language>eng ; kor</language><subject>ACOUSTICS ; CALCULATING ; COMPUTING ; COUNTING ; IMAGE DATA PROCESSING OR GENERATION, IN GENERAL ; MUSICAL INSTRUMENTS ; PHYSICS ; SPEECH ANALYSIS OR SYNTHESIS ; SPEECH OR AUDIO CODING OR DECODING ; SPEECH OR VOICE PROCESSING ; SPEECH RECOGNITION</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230823&amp;DB=EPODOC&amp;CC=KR&amp;NR=20230123184A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,780,885,25563,76418</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&amp;date=20230823&amp;DB=EPODOC&amp;CC=KR&amp;NR=20230123184A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>JO SO YEON</creatorcontrib><creatorcontrib>SHIN YOON HO</creatorcontrib><creatorcontrib>HWANG SUN HEE</creatorcontrib><creatorcontrib>LEE SEUNG HYUN</creatorcontrib><creatorcontrib>JEON BYOUNG KI</creatorcontrib><creatorcontrib>PARK SANG HOON</creatorcontrib><title>Method and apparatus for generating speech video</title><description>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image 
by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</description><subject>ACOUSTICS</subject><subject>CALCULATING</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>IMAGE DATA PROCESSING OR GENERATION, IN GENERAL</subject><subject>MUSICAL INSTRUMENTS</subject><subject>PHYSICS</subject><subject>SPEECH ANALYSIS OR SYNTHESIS</subject><subject>SPEECH OR AUDIO CODING OR DECODING</subject><subject>SPEECH OR VOICE PROCESSING</subject><subject>SPEECH RECOGNITION</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZDDwTS3JyE9RSMwD4oKCxKLEktJihbT8IoX01LxUIC8zL12huCA1NTlDoSwzJTWfh4E1LTGnOJUXSnMzKLu5hjh76KYW5MenFhckJgP1lcR7BxkZGBkbGBoZG1qYOBoTpwoA_uEsBg</recordid><startdate>20230823</startdate><enddate>20230823</enddate><creator>JO SO YEON</creator><creator>SHIN YOON HO</creator><creator>HWANG SUN HEE</creator><creator>LEE SEUNG HYUN</creator><creator>JEON BYOUNG KI</creator><creator>PARK SANG HOON</creator><scope>EVB</scope></search><sort><creationdate>20230823</creationdate><title>Method and apparatus for generating speech video</title><author>JO SO YEON ; SHIN YOON HO ; HWANG SUN HEE ; LEE SEUNG HYUN ; JEON BYOUNG KI ; PARK SANG HOON</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_KR20230123184A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>eng ; 
kor</language><creationdate>2023</creationdate><topic>ACOUSTICS</topic><topic>CALCULATING</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>IMAGE DATA PROCESSING OR GENERATION, IN GENERAL</topic><topic>MUSICAL INSTRUMENTS</topic><topic>PHYSICS</topic><topic>SPEECH ANALYSIS OR SYNTHESIS</topic><topic>SPEECH OR AUDIO CODING OR DECODING</topic><topic>SPEECH OR VOICE PROCESSING</topic><topic>SPEECH RECOGNITION</topic><toplevel>online_resources</toplevel><creatorcontrib>JO SO YEON</creatorcontrib><creatorcontrib>SHIN YOON HO</creatorcontrib><creatorcontrib>HWANG SUN HEE</creatorcontrib><creatorcontrib>LEE SEUNG HYUN</creatorcontrib><creatorcontrib>JEON BYOUNG KI</creatorcontrib><creatorcontrib>PARK SANG HOON</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>JO SO YEON</au><au>SHIN YOON HO</au><au>HWANG SUN HEE</au><au>LEE SEUNG HYUN</au><au>JEON BYOUNG KI</au><au>PARK SANG HOON</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>Method and apparatus for generating speech video</title><date>2023-08-23</date><risdate>2023</risdate><abstract>In one embodiment, a method generates a nose average value point, a chin key point, and a lip key point corresponding to a speech input, generates a first person background image by masking a lower end part of a face in an original image using the chin key point and the nose average value point, generates a second person background image by combining the lip key point with the first person background image, and generates a final utterance image from the second person background image. Therefore, the present invention is capable of solving a problem of difficulty in generating the utterance image. 
일 실시예는, 음성입력에 해당하는 코 평균값 포인트, 턱 키포인트 및 입술 키포인트를 생성; 상기 턱 키포인트와 상기 코 평균값 포인트를 사용하여, 원본 영상에서 얼굴 하단부를 마스킹하여 제1 인물 배경 영상을 생성; 상기 제1 인물 배경 영상에 상기 입술 키포인트를 합성하여 제2 인물 배경 영상을 생성; 상기 제2 인물 배경 영상으로부터 최종 발화 영상을 생성하는, 방법이다.</abstract><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
language eng ; kor
recordid cdi_epo_espacenet_KR20230123184A
source esp@cenet
subjects ACOUSTICS
CALCULATING
COMPUTING
COUNTING
IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
MUSICAL INSTRUMENTS
PHYSICS
SPEECH ANALYSIS OR SYNTHESIS
SPEECH OR AUDIO CODING OR DECODING
SPEECH OR VOICE PROCESSING
SPEECH RECOGNITION
title Method and apparatus for generating speech video
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T17%3A11%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=JO%20SO%20YEON&rft.date=2023-08-23&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EKR20230123184A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true