Multi-modal few-shot learning with frozen language models (고정 언어 모델을 사용한 다중-모달 퓨-샷 학습)

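Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using language models. In particular, the inputs include an image, and the image is encoded by an image encoder neural network to generate a sequence of image embeddings representing the image. The sequence of image embeddings is provided as at least part of an input sequence that is processed by a language model neural network.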
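
The data flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the patented implementation: the names `encode_image`, `embed_text` and `build_input_sequence`, the sizes `EMBED_DIM` and `NUM_IMAGE_TOKENS`, the toy random projection standing in for the image encoder neural network, and the choice to place the image embeddings before the text embeddings (the abstract only says they form "at least part of" the input sequence).

```python
import random
from typing import List

Vector = List[float]

EMBED_DIM = 8         # hypothetical embedding width shared by both networks
NUM_IMAGE_TOKENS = 2  # hypothetical number of embeddings produced per image


def encode_image(pixels: List[float]) -> List[Vector]:
    """Toy stand-in for the image encoder: pixels -> sequence of embeddings."""
    rng = random.Random(0)  # fixed seed keeps the toy projection deterministic
    sequence = []
    for _ in range(NUM_IMAGE_TOKENS):
        # One random linear projection per output embedding.
        embedding = [
            sum(rng.uniform(-1.0, 1.0) * p for p in pixels)
            for _ in range(EMBED_DIM)
        ]
        sequence.append(embedding)
    return sequence


def embed_text(token_ids: List[int]) -> List[Vector]:
    """Toy stand-in for the language model's token embedding lookup."""
    return [[(tid + 1) / (d + 1) for d in range(EMBED_DIM)] for tid in token_ids]


def build_input_sequence(pixels: List[float], token_ids: List[int]) -> List[Vector]:
    """Assumed layout: image embeddings first, then text embeddings."""
    return encode_image(pixels) + embed_text(token_ids)


if __name__ == "__main__":
    seq = build_input_sequence([0.1, 0.5, 0.9, 0.3], [4, 7, 1])
    # The language model would consume `seq` as its input sequence.
    print(len(seq), "embeddings of width", len(seq[0]))
```

The point of the sketch is only that both parts of the sequence share one embedding width, so the language model can process image embeddings and text embeddings uniformly.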

Detailed description

Bibliographic details
Main authors:	TSIMPOUKELLI MARIA RAFAILIA, CABI SERKAN, HILL FELIX GEORGE, VINYALS ORIOL, ESLAMI SEYED MOHAMMADALI, MENICK JACOB LEE
Format:	Patent
Language:	kor
Subjects:	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS

container_end_page
container_issue
container_start_page
container_title
container_volume
creator	TSIMPOUKELLI MARIA RAFAILIA ; CABI SERKAN ; HILL FELIX GEORGE ; VINYALS ORIOL ; ESLAMI SEYED MOHAMMADALI ; MENICK JACOB LEE
description	Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using language models. In particular, the inputs include an image, and the image is encoded by an image encoder neural network to generate a sequence of image embeddings representing the image. The sequence of image embeddings is provided as at least part of an input sequence that is processed by a language model neural network.
format	Patent
fullrecord	<record><control><sourceid>epo_EVB</sourceid><recordid>TN_cdi_epo_espacenet_KR20230152741A</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>KR20230152741A</sourcerecordid><originalsourceid>FETCH-epo_espacenet_KR20230152741A3</originalsourceid><addsrcrecordid>eNrjZLB7tXnBmwVTFd5M2_Fm2haF16tWvO7d8WZui8KbpjVvZq18O3WOwuvuJW-WTNQFSXWvUXg7eYXum-btCm-nznzTtZWHgTUtMac4lRdKczMou7mGOHvophbkx6cWFyQmp-allsR7BxkZGBkbGJoamZsYOhoTpwoAL4FB7g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>patent</recordtype></control><display><type>patent</type><title>고정 언어 모델을 사용한 다중-모달 퓨-샷 학습</title><source>esp@cenet</source><creator>TSIMPOUKELLI MARIA RAFAILIA ; CABI SERKAN ; HILL FELIX GEORGE ; VINYALS ORIOL ; ESLAMI SEYED MOHAMMADALI ; MENICK JACOB LEE</creator><creatorcontrib>TSIMPOUKELLI MARIA RAFAILIA ; CABI SERKAN ; HILL FELIX GEORGE ; VINYALS ORIOL ; ESLAMI SEYED MOHAMMADALI ; MENICK JACOB LEE</creatorcontrib><description>언어 모델을 사용하여 다중-모달 입력을 프로세싱하기 위한 컴퓨터 저장 매체에 인코딩된 컴퓨터 프로그램을 포함한 방법, 시스템 및 장치가 제공된다. 특히, 입력에는 이미지가 포함되며, 이미지는 이미지를 나타내는 이미지 임베딩 시퀀스를 생성하기 위해 이미지 인코더 신경망에 의해 인코딩된다. 이미지 임베딩 시퀀스는 언어 모델 신경망에 의해 프로세싱되는 입력 시퀀스의 적어도 일부로 제공된다. Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using language models. In particular, the inputs include an image, and the image is encoded by an image encoder neural network to generate a sequence of image embeddings representing the image. 
The sequence of image embeddings is provided as at least part of an input sequence to that is processed by a language model neural network.</description><language>kor</language><subject>CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS</subject><creationdate>2023</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231103&DB=EPODOC&CC=KR&NR=20230152741A$$EHTML$$P50$$Gepo$$Hfree_for_read</linktohtml><link.rule.ids>230,308,778,883,25547,76298</link.rule.ids><linktorsrc>$$Uhttps://worldwide.espacenet.com/publicationDetails/biblio?FT=D&date=20231103&DB=EPODOC&CC=KR&NR=20230152741A$$EView_record_in_European_Patent_Office$$FView_record_in_$$GEuropean_Patent_Office$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>TSIMPOUKELLI MARIA RAFAILIA</creatorcontrib><creatorcontrib>CABI SERKAN</creatorcontrib><creatorcontrib>HILL FELIX GEORGE</creatorcontrib><creatorcontrib>VINYALS ORIOL</creatorcontrib><creatorcontrib>ESLAMI SEYED MOHAMMADALI</creatorcontrib><creatorcontrib>MENICK JACOB LEE</creatorcontrib><title>고정 언어 모델을 사용한 다중-모달 퓨-샷 학습</title><description>언어 모델을 사용하여 다중-모달 입력을 프로세싱하기 위한 컴퓨터 저장 매체에 인코딩된 컴퓨터 프로그램을 포함한 방법, 시스템 및 장치가 제공된다. 특히, 입력에는 이미지가 포함되며, 이미지는 이미지를 나타내는 이미지 임베딩 시퀀스를 생성하기 위해 이미지 인코더 신경망에 의해 인코딩된다. 이미지 임베딩 시퀀스는 언어 모델 신경망에 의해 프로세싱되는 입력 시퀀스의 적어도 일부로 제공된다. Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using language models. In particular, the inputs include an image, and the image is encoded by an image encoder neural network to generate a sequence of image embeddings representing the image. 
The sequence of image embeddings is provided as at least part of an input sequence to that is processed by a language model neural network.</description><subject>CALCULATING</subject><subject>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</subject><subject>COMPUTING</subject><subject>COUNTING</subject><subject>ELECTRIC DIGITAL DATA PROCESSING</subject><subject>PHYSICS</subject><fulltext>true</fulltext><rsrctype>patent</rsrctype><creationdate>2023</creationdate><recordtype>patent</recordtype><sourceid>EVB</sourceid><recordid>eNrjZLB7tXnBmwVTFd5M2_Fm2haF16tWvO7d8WZui8KbpjVvZq18O3WOwuvuJW-WTNQFSXWvUXg7eYXum-btCm-nznzTtZWHgTUtMac4lRdKczMou7mGOHvophbkx6cWFyQmp-allsR7BxkZGBkbGJoamZsYOhoTpwoAL4FB7g</recordid><startdate>20231103</startdate><enddate>20231103</enddate><creator>TSIMPOUKELLI MARIA RAFAILIA</creator><creator>CABI SERKAN</creator><creator>HILL FELIX GEORGE</creator><creator>VINYALS ORIOL</creator><creator>ESLAMI SEYED MOHAMMADALI</creator><creator>MENICK JACOB LEE</creator><scope>EVB</scope></search><sort><creationdate>20231103</creationdate><title>고정 언어 모델을 사용한 다중-모달 퓨-샷 학습</title><author>TSIMPOUKELLI MARIA RAFAILIA ; CABI SERKAN ; HILL FELIX GEORGE ; VINYALS ORIOL ; ESLAMI SEYED MOHAMMADALI ; MENICK JACOB LEE</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-epo_espacenet_KR20230152741A3</frbrgroupid><rsrctype>patents</rsrctype><prefilter>patents</prefilter><language>kor</language><creationdate>2023</creationdate><topic>CALCULATING</topic><topic>COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS</topic><topic>COMPUTING</topic><topic>COUNTING</topic><topic>ELECTRIC DIGITAL DATA PROCESSING</topic><topic>PHYSICS</topic><toplevel>online_resources</toplevel><creatorcontrib>TSIMPOUKELLI MARIA RAFAILIA</creatorcontrib><creatorcontrib>CABI SERKAN</creatorcontrib><creatorcontrib>HILL FELIX GEORGE</creatorcontrib><creatorcontrib>VINYALS ORIOL</creatorcontrib><creatorcontrib>ESLAMI SEYED MOHAMMADALI</creatorcontrib><creatorcontrib>MENICK 
JACOB LEE</creatorcontrib><collection>esp@cenet</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>TSIMPOUKELLI MARIA RAFAILIA</au><au>CABI SERKAN</au><au>HILL FELIX GEORGE</au><au>VINYALS ORIOL</au><au>ESLAMI SEYED MOHAMMADALI</au><au>MENICK JACOB LEE</au><format>patent</format><genre>patent</genre><ristype>GEN</ristype><title>고정 언어 모델을 사용한 다중-모달 퓨-샷 학습</title><date>2023-11-03</date><risdate>2023</risdate><abstract>언어 모델을 사용하여 다중-모달 입력을 프로세싱하기 위한 컴퓨터 저장 매체에 인코딩된 컴퓨터 프로그램을 포함한 방법, 시스템 및 장치가 제공된다. 특히, 입력에는 이미지가 포함되며, 이미지는 이미지를 나타내는 이미지 임베딩 시퀀스를 생성하기 위해 이미지 인코더 신경망에 의해 인코딩된다. 이미지 임베딩 시퀀스는 언어 모델 신경망에 의해 프로세싱되는 입력 시퀀스의 적어도 일부로 제공된다. Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing multi-modal inputs using language models. In particular, the inputs include an image, and the image is encoded by an image encoder neural network to generate a sequence of image embeddings representing the image. The sequence of image embeddings is provided as at least part of an input sequence to that is processed by a language model neural network.</abstract><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier
ispartof
issn
language	kor
recordid	cdi_epo_espacenet_KR20230152741A
source	esp@cenet
subjects	CALCULATING ; COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS ; COMPUTING ; COUNTING ; ELECTRIC DIGITAL DATA PROCESSING ; PHYSICS
title	Multi-modal few-shot learning with frozen language models (고정 언어 모델을 사용한 다중-모달 퓨-샷 학습)
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-16T12%3A07%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-epo_EVB&rft_val_fmt=info:ofi/fmt:kev:mtx:patent&rft.genre=patent&rft.au=TSIMPOUKELLI%20MARIA%20RAFAILIA&rft.date=2023-11-03&rft_id=info:doi/&rft_dat=%3Cepo_EVB%3EKR20230152741A%3C/epo_EVB%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true