Building Efficient Universal Classifiers with Natural Language Inference
Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows similar principles to instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share have been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4%.
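The core idea in the abstract is that any label set can be recast as Natural Language Inference: the input text is the premise, each candidate class is verbalised as a hypothesis (for example "This text is about economy."), and the entailment probability serves as the class score. The sketch below illustrates this formulation with an off-the-shelf NLI checkpoint from the Hugging Face Hub; the model id, hypothesis template, and example text and labels are illustrative assumptions, not the exact setup released with this paper.

```python
# Minimal sketch (not the authors' exact code) of NLI as a universal
# classification task: the text is the premise, each candidate label is
# verbalised as a hypothesis, and the entailment probability is the score.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Example NLI checkpoint; any model trained on MNLI-style data would work here.
model_name = "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

premise = "The central bank raised interest rates by 50 basis points."
labels = ["economy", "sports", "environment"]
hypotheses = [f"This text is about {label}." for label in labels]

# Find which output index corresponds to "entailment" in this checkpoint.
entail_idx = next(
    i for i, name in model.config.id2label.items() if name.lower() == "entailment"
)

scores = []
for hypothesis in hypotheses:
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    scores.append(probs[entail_idx].item())  # entailment probability = class score

best = max(range(len(labels)), key=scores.__getitem__)
print(f"predicted label: {labels[best]} (entailment prob {scores[best]:.3f})")
```

Because the hypotheses are plain text, the same model can score any label set at inference time, which is what makes the task "universal" in the sense used by the abstract.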
Saved in:
Main Authors: | Laurer, Moritz; van Atteveldt, Wouter; Casas, Andreu; Welbers, Kasper |
---|---|
Format: | Article |
Language: | eng |
Subjects: | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
Online Access: | Order full text |
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Laurer, Moritz; van Atteveldt, Wouter; Casas, Andreu; Welbers, Kasper |
description | Generative Large Language Models (LLMs) have become the mainstream choice for fewshot and zeroshot learning thanks to the universality of text generation. Many users, however, do not need the broad capabilities of generative LLMs when they only want to automate a classification task. Smaller BERT-like models can also learn universal tasks, which allow them to do any text classification task without requiring fine-tuning (zeroshot classification) or to learn new tasks with only a few examples (fewshot), while being significantly more efficient than generative LLMs. This paper (1) explains how Natural Language Inference (NLI) can be used as a universal classification task that follows similar principles to instruction fine-tuning of generative LLMs, (2) provides a step-by-step guide with reusable Jupyter notebooks for building a universal classifier, and (3) shares the resulting universal classifier that is trained on 33 datasets with 389 diverse classes. Parts of the code we share have been used to train our older zeroshot classifiers that have been downloaded more than 55 million times via the Hugging Face Hub as of December 2023. Our new classifier improves zeroshot performance by 9.4%. |
doi_str_mv | 10.48550/arxiv.2312.17543 |
format | Article |
date | 2023-12-29 |
rights | http://creativecommons.org/licenses/by-sa/4.0 (free to read) |
backlink | https://arxiv.org/abs/2312.17543 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2312.17543 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2312_17543 |
source | arXiv.org |
subjects | Computer Science - Artificial Intelligence; Computer Science - Computation and Language |
title | Building Efficient Universal Classifiers with Natural Language Inference |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T21%3A32%3A00IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Building%20Efficient%20Universal%20Classifiers%20with%20Natural%20Language%20Inference&rft.au=Laurer,%20Moritz&rft.date=2023-12-29&rft_id=info:doi/10.48550/arxiv.2312.17543&rft_dat=%3Carxiv_GOX%3E2312_17543%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |
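Point (3) of the abstract refers to the resulting universal classifier shared via the Hugging Face Hub. NLI-based checkpoints of this kind are typically used through the transformers zero-shot-classification pipeline, which builds and scores the label hypotheses internally. The sketch below assumes such a checkpoint; the model id, labels, and input text are illustrative examples, not necessarily the exact classifier released with this record.

```python
# Minimal usage sketch, assuming an NLI-style zeroshot checkpoint on the Hub.
# The model id below is an example id, not necessarily the classifier
# released with this paper.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-base-zeroshot-v1",
)

result = classifier(
    "The team won the championship after a dramatic penalty shootout.",
    candidate_labels=["politics", "economy", "sports", "science"],
    hypothesis_template="This text is about {}.",  # hypothesis built per label
    multi_label=False,  # labels compete for a single class
)
print(result["labels"][0], round(result["scores"][0], 3))
```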