Long Range Named Entity Recognition for Marathi Documents
Field | Value |
---|---|
creator | Deshmukh, Pranita ; Kulkarni, Nikita ; Kulkarni, Sanhita ; Manghani, Kareena ; Kale, Geetanjali ; Joshi, Raviraj |
description | The demand for sophisticated natural language processing (NLP) methods,
particularly Named Entity Recognition (NER), has increased due to the
exponential growth of Marathi-language digital content. In particular, NER is
essential for recognizing distant entities and for arranging and understanding
unstructured Marathi text data. With an emphasis on managing long-range
entities, this paper offers a comprehensive analysis of current NER techniques
designed for Marathi documents. It dives into current practices and
investigates the BERT transformer model's potential for long-range Marathi NER.
Along with analyzing the effectiveness of earlier methods, the paper compares
Marathi NER with NER in English literature and suggests adaptation
strategies for Marathi literature. The paper discusses the difficulties caused
by Marathi's particular linguistic traits and contextual subtleties while
acknowledging NER's critical role in NLP. To conclude, this project is a major
step forward in improving Marathi NER techniques, with potential wider
applications across a range of NLP tasks and domains. |
doi_str_mv | 10.48550/arxiv.2410.09192 |
format | Article |
creationdate | 2024-10-11 |
rights | http://arxiv.org/licenses/nonexclusive-distrib/1.0 |
oa | free_for_read |
sourcetype | Open Access Repository |
linktorsrc | https://arxiv.org/abs/2410.09192 |
backlink | https://doi.org/10.48550/arXiv.2410.09192 |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2410.09192 |
language | eng |
recordid | cdi_arxiv_primary_2410_09192 |
source | arXiv.org |
subjects | Computer Science - Computation and Language ; Computer Science - Learning |
title | Long Range Named Entity Recognition for Marathi Documents |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T22%3A02%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Long%20Range%20Named%20Entity%20Recognition%20for%20Marathi%20Documents&rft.au=Deshmukh,%20Pranita&rft.date=2024-10-11&rft_id=info:doi/10.48550/arxiv.2410.09192&rft_dat=%3Carxiv_GOX%3E2410_09192%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |