Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions

Recent years have seen an exponential growth (+98% in 2022 w.r.t. the previous year) of the number of research articles in the few-shot learning field, which aims at training machine learning models with extremely limited available data. The research interest toward few-shot learning systems for Nam...

Full description

Bibliographic details
Published in: ACM transactions on intelligent systems and technology 2023-10, Vol.14 (5), p.1-46, Article 94
Main authors: Moscato, Vincenzo, Postiglione, Marco, Sperlí, Giancarlo
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 46
container_issue 5
container_start_page 1
container_title ACM transactions on intelligent systems and technology
container_volume 14
creator Moscato, Vincenzo
Postiglione, Marco
Sperlí, Giancarlo
description Recent years have seen an exponential growth (+98% in 2022 w.r.t. the previous year) of the number of research articles in the few-shot learning field, which aims at training machine learning models with extremely limited available data. The research interest toward few-shot learning systems for Named Entity Recognition (NER) is thus at the same time increasing. NER consists in identifying mentions of pre-defined entities from unstructured text, and serves as a fundamental step in many downstream tasks, such as the construction of Knowledge Graphs, or Question Answering. The need for a NER system able to be trained with few-annotated examples comes in all its urgency in domains where the annotation process requires time, knowledge and expertise (e.g., healthcare, finance, legal), and in low-resource languages. In this survey, starting from a clear definition and description of the few-shot NER (FS-NER) problem, we take stock of the current state-of-the-art and propose a taxonomy which divides algorithms in two macro-categories according to the underlying mechanisms: model-centric and data-centric. For each category, we line-up works as a story to show how the field is moving toward new research directions. Eventually, techniques, limitations, and key aspects are deeply analyzed to facilitate future studies.
doi_str_mv 10.1145/3609483
format Article
fullrecord <record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3609483</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3609483</sourcerecordid><originalsourceid>FETCH-LOGICAL-a277t-e024e6581af885d2aa7e8d361efc5a8c15b6284ed62c46f66d47b14b249f46cc3</originalsourceid><addsrcrecordid>eNo90M9LwzAUB_AgCo45vHvKzYvVJE3S1JvshwrDgcxzSZMXF7GNJAXtf7-Ozr3L-8L78A5fhK4puaeUi4dckpKr_AxNGBVFJkvKzk-Z8Es0S-mLDMNLVlI1QZsV_GZpFzr8phuweNl2vuvxO5jw2frOh_YRL8D5Md_hrf4LbWh6rFs7qAQ6mh1e-AjmANIVunD6O8HsuKfoY7Xczl-y9eb5df60zjQrii4DwjhIoah2SgnLtC5A2VxScEZoZaioJVMcrGSGSyel5UVNec146bg0Jp-i2_GviSGlCK76ib7Rsa8oqQ5VVMcqBnkzSm2aE_o_7gFiu1hx</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions</title><source>ACM Digital Library Complete</source><creator>Moscato, Vincenzo ; Postiglione, Marco ; Sperlí, Giancarlo</creator><creatorcontrib>Moscato, Vincenzo ; Postiglione, Marco ; Sperlí, Giancarlo</creatorcontrib><description>Recent years have seen an exponential growth (+98% in 2022 w.r.t. the previous year) of the number of research articles in the few-shot learning field, which aims at training machine learning models with extremely limited available data. The research interest toward few-shot learning systems for Named Entity Recognition (NER) is thus at the same time increasing. NER consists in identifying mentions of pre-defined entities from unstructured text, and serves as a fundamental step in many downstream tasks, such as the construction of Knowledge Graphs, or Question Answering. The need for a NER system able to be trained with few-annotated examples comes in all its urgency in domains where the annotation process requires time, knowledge and expertise (e.g., healthcare, finance, legal), and in low-resource languages. 
In this survey, starting from a clear definition and description of the few-shot NER (FS-NER) problem, we take stock of the current state-of-the-art and propose a taxonomy which divides algorithms in two macro-categories according to the underlying mechanisms: model-centric and data-centric. For each category, we line-up works as a story to show how the field is moving toward new research directions. Eventually, techniques, limitations, and key aspects are deeply analyzed to facilitate future studies.</description><identifier>ISSN: 2157-6904</identifier><identifier>EISSN: 2157-6912</identifier><identifier>DOI: 10.1145/3609483</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Computing methodologies ; Information extraction ; Learning paradigms ; Natural language processing</subject><ispartof>ACM transactions on intelligent systems and technology, 2023-10, Vol.14 (5), p.1-46, Article 94</ispartof><rights>Copyright held by the owner/author(s).</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-a277t-e024e6581af885d2aa7e8d361efc5a8c15b6284ed62c46f66d47b14b249f46cc3</citedby><cites>FETCH-LOGICAL-a277t-e024e6581af885d2aa7e8d361efc5a8c15b6284ed62c46f66d47b14b249f46cc3</cites><orcidid>0000-0003-1470-8053 ; 0000-0002-0754-7696 ; 0000-0003-4033-3777</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://dl.acm.org/doi/pdf/10.1145/3609483$$EPDF$$P50$$Gacm$$Hfree_for_read</linktopdf><link.rule.ids>314,776,780,2276,27901,27902,40172,75971</link.rule.ids></links><search><creatorcontrib>Moscato, Vincenzo</creatorcontrib><creatorcontrib>Postiglione, Marco</creatorcontrib><creatorcontrib>Sperlí, Giancarlo</creatorcontrib><title>Few-shot Named Entity Recognition: Definition, Taxonomy and Research 
Directions</title><title>ACM transactions on intelligent systems and technology</title><addtitle>ACM TIST</addtitle><description>Recent years have seen an exponential growth (+98% in 2022 w.r.t. the previous year) of the number of research articles in the few-shot learning field, which aims at training machine learning models with extremely limited available data. The research interest toward few-shot learning systems for Named Entity Recognition (NER) is thus at the same time increasing. NER consists in identifying mentions of pre-defined entities from unstructured text, and serves as a fundamental step in many downstream tasks, such as the construction of Knowledge Graphs, or Question Answering. The need for a NER system able to be trained with few-annotated examples comes in all its urgency in domains where the annotation process requires time, knowledge and expertise (e.g., healthcare, finance, legal), and in low-resource languages. In this survey, starting from a clear definition and description of the few-shot NER (FS-NER) problem, we take stock of the current state-of-the-art and propose a taxonomy which divides algorithms in two macro-categories according to the underlying mechanisms: model-centric and data-centric. For each category, we line-up works as a story to show how the field is moving toward new research directions. 
Eventually, techniques, limitations, and key aspects are deeply analyzed to facilitate future studies.</description><subject>Computing methodologies</subject><subject>Information extraction</subject><subject>Learning paradigms</subject><subject>Natural language processing</subject><issn>2157-6904</issn><issn>2157-6912</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNo90M9LwzAUB_AgCo45vHvKzYvVJE3S1JvshwrDgcxzSZMXF7GNJAXtf7-Ozr3L-8L78A5fhK4puaeUi4dckpKr_AxNGBVFJkvKzk-Z8Es0S-mLDMNLVlI1QZsV_GZpFzr8phuweNl2vuvxO5jw2frOh_YRL8D5Md_hrf4LbWh6rFs7qAQ6mh1e-AjmANIVunD6O8HsuKfoY7Xczl-y9eb5df60zjQrii4DwjhIoah2SgnLtC5A2VxScEZoZaioJVMcrGSGSyel5UVNec146bg0Jp-i2_GviSGlCK76ib7Rsa8oqQ5VVMcqBnkzSm2aE_o_7gFiu1hx</recordid><startdate>20231009</startdate><enddate>20231009</enddate><creator>Moscato, Vincenzo</creator><creator>Postiglione, Marco</creator><creator>Sperlí, Giancarlo</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0003-1470-8053</orcidid><orcidid>https://orcid.org/0000-0002-0754-7696</orcidid><orcidid>https://orcid.org/0000-0003-4033-3777</orcidid></search><sort><creationdate>20231009</creationdate><title>Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions</title><author>Moscato, Vincenzo ; Postiglione, Marco ; Sperlí, Giancarlo</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a277t-e024e6581af885d2aa7e8d361efc5a8c15b6284ed62c46f66d47b14b249f46cc3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computing methodologies</topic><topic>Information extraction</topic><topic>Learning paradigms</topic><topic>Natural language processing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Moscato, Vincenzo</creatorcontrib><creatorcontrib>Postiglione, 
Marco</creatorcontrib><creatorcontrib>Sperlí, Giancarlo</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on intelligent systems and technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Moscato, Vincenzo</au><au>Postiglione, Marco</au><au>Sperlí, Giancarlo</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions</atitle><jtitle>ACM transactions on intelligent systems and technology</jtitle><stitle>ACM TIST</stitle><date>2023-10-09</date><risdate>2023</risdate><volume>14</volume><issue>5</issue><spage>1</spage><epage>46</epage><pages>1-46</pages><artnum>94</artnum><issn>2157-6904</issn><eissn>2157-6912</eissn><abstract>Recent years have seen an exponential growth (+98% in 2022 w.r.t. the previous year) of the number of research articles in the few-shot learning field, which aims at training machine learning models with extremely limited available data. The research interest toward few-shot learning systems for Named Entity Recognition (NER) is thus at the same time increasing. NER consists in identifying mentions of pre-defined entities from unstructured text, and serves as a fundamental step in many downstream tasks, such as the construction of Knowledge Graphs, or Question Answering. The need for a NER system able to be trained with few-annotated examples comes in all its urgency in domains where the annotation process requires time, knowledge and expertise (e.g., healthcare, finance, legal), and in low-resource languages. In this survey, starting from a clear definition and description of the few-shot NER (FS-NER) problem, we take stock of the current state-of-the-art and propose a taxonomy which divides algorithms in two macro-categories according to the underlying mechanisms: model-centric and data-centric. 
For each category, we line-up works as a story to show how the field is moving toward new research directions. Eventually, techniques, limitations, and key aspects are deeply analyzed to facilitate future studies.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3609483</doi><tpages>46</tpages><orcidid>https://orcid.org/0000-0003-1470-8053</orcidid><orcidid>https://orcid.org/0000-0002-0754-7696</orcidid><orcidid>https://orcid.org/0000-0003-4033-3777</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2157-6904
ispartof ACM transactions on intelligent systems and technology, 2023-10, Vol.14 (5), p.1-46, Article 94
issn 2157-6904
2157-6912
language eng
recordid cdi_crossref_primary_10_1145_3609483
source ACM Digital Library Complete
subjects Computing methodologies
Information extraction
Learning paradigms
Natural language processing
title Few-shot Named Entity Recognition: Definition, Taxonomy and Research Directions
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-28T15%3A31%3A20IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Few-shot%20Named%20Entity%20Recognition:%20Definition,%20Taxonomy%20and%20Research%20Directions&rft.jtitle=ACM%20transactions%20on%20intelligent%20systems%20and%20technology&rft.au=Moscato,%20Vincenzo&rft.date=2023-10-09&rft.volume=14&rft.issue=5&rft.spage=1&rft.epage=46&rft.pages=1-46&rft.artnum=94&rft.issn=2157-6904&rft.eissn=2157-6912&rft_id=info:doi/10.1145/3609483&rft_dat=%3Cacm_cross%3E3609483%3C/acm_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true