A large-scale dataset for Korean document-level relation extraction from encyclopedia texts

Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the stu...

Bibliographic details
Published in: Applied intelligence (Dordrecht, Netherlands), 2024-09, Vol.54 (17-18), p.8681-8701
Main authors: Son, Suhyune; Lim, Jungwoo; Koo, Seonmin; Kim, Jinsung; Kim, Younghoon; Lim, Youngsik; Hyun, Dongseok; Lim, Heuiseok
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 8701
container_issue 17-18
container_start_page 8681
container_title Applied intelligence (Dordrecht, Netherlands)
container_volume 54
creator Son, Suhyune
Lim, Jungwoo
Koo, Seonmin
Kim, Jinsung
Kim, Younghoon
Lim, Youngsik
Hyun, Dongseok
Lim, Heuiseok
description Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike the widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate these studies, we present the TREK (Toward Document-Level Relation Extraction in Korean) dataset, constructed from Korean encyclopedia documents written by domain experts. We provide detailed statistical analyses of our large-scale dataset, and human evaluation results suggest the assured quality of TREK. We also introduce a document-level RE model that considers the named entity type while accounting for the properties of the Korean language. In the experiments, we demonstrate that our proposed model outperforms the baselines, and we conduct a qualitative analysis.
doi_str_mv 10.1007/s10489-024-05605-9
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3090094981</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3090094981</sourcerecordid><originalsourceid>FETCH-LOGICAL-c244t-87a83a982b9aef48d5183afa02ceecd9b6be162b05ef537209eb9f721889db683</originalsourceid><addsrcrecordid>eNp9kEtLAzEQx4MoWKtfwFPAc3SS7CM5luILCl4UBA8hm52U1u2mJqnYb-_aFbx5mhn-j4EfIZccrjlAfZM4FEozEAWDsoKS6SMy4WUtWV3o-phMQA9SVenXU3KW0hoApAQ-IW8z2tm4RJac7ZC2NtuEmfoQ6XuIaHvaBrfbYJ9Zh5_Y0YidzavQU_zK0brD6mPYUOzd3nVhi-3K0jyo6ZyceNslvPidU_Jyd_s8f2CLp_vH-WzBnCiKzFRtlbRaiUZb9IVqSz7c3oJwiK7VTdUgr0QDJfpS1gI0NtrXgiul26ZSckquxt5tDB87TNmswy72w0sjQQPoQis-uMTocjGkFNGbbVxtbNwbDuYHohkhmgGiOUA0egjJMZQGc7_E-Ff9T-obEeh2Qw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3090094981</pqid></control><display><type>article</type><title>A large-scale dataset for korean document-level relation extraction from encyclopedia texts</title><source>SpringerLink Journals</source><creator>Son, Suhyune ; Lim, Jungwoo ; Koo, Seonmin ; Kim, Jinsung ; Kim, Younghoon ; Lim, Youngsik ; Hyun, Dongseok ; Lim, Heuiseok</creator><creatorcontrib>Son, Suhyune ; Lim, Jungwoo ; Koo, Seonmin ; Kim, Jinsung ; Kim, Younghoon ; Lim, Youngsik ; Hyun, Dongseok ; Lim, Heuiseok</creatorcontrib><description>Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the studies, we present TREK ( T oward Document-Level R elation E xtraction in K orean) dataset constructed from Korean encyclopedia documents written by the domain experts. We provide detailed statistical analyses for our large-scale dataset and human evaluation results suggest the assured quality of TREK . 
Also, we introduce the document-level RE model that considers the named entity-type while considering the Korean language’s properties. In the experiments, we demonstrate that our proposed model outperforms the baselines and conduct qualitative analysis.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-024-05605-9</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Annotations ; Artificial Intelligence ; Computer Science ; Cultural heritage ; Datasets ; Documents ; Encyclopedias ; Korean language ; Language ; Machines ; Manufacturing ; Mechanical Engineering ; Processes ; Qualitative analysis ; Statistical analysis ; Subject specialists</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2024-09, Vol.54 (17-18), p.8681-8701</ispartof><rights>The Author(s) 2024</rights><rights>The Author(s) 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c244t-87a83a982b9aef48d5183afa02ceecd9b6be162b05ef537209eb9f721889db683</cites><orcidid>0000-0002-9269-1157</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10489-024-05605-9$$EPDF$$P50$$Gspringer$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10489-024-05605-9$$EHTML$$P50$$Gspringer$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Son, Suhyune</creatorcontrib><creatorcontrib>Lim, Jungwoo</creatorcontrib><creatorcontrib>Koo, Seonmin</creatorcontrib><creatorcontrib>Kim, Jinsung</creatorcontrib><creatorcontrib>Kim, Younghoon</creatorcontrib><creatorcontrib>Lim, Youngsik</creatorcontrib><creatorcontrib>Hyun, Dongseok</creatorcontrib><creatorcontrib>Lim, Heuiseok</creatorcontrib><title>A large-scale dataset for korean document-level relation extraction from encyclopedia texts</title><title>Applied intelligence (Dordrecht, Netherlands)</title><addtitle>Appl Intell</addtitle><description>Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the studies, we present TREK ( T oward Document-Level R elation E xtraction in K orean) dataset constructed from Korean encyclopedia documents written by the domain experts. 
We provide detailed statistical analyses for our large-scale dataset and human evaluation results suggest the assured quality of TREK . Also, we introduce the document-level RE model that considers the named entity-type while considering the Korean language’s properties. In the experiments, we demonstrate that our proposed model outperforms the baselines and conduct qualitative analysis.</description><subject>Annotations</subject><subject>Artificial Intelligence</subject><subject>Computer Science</subject><subject>Cultural heritage</subject><subject>Datasets</subject><subject>Documents</subject><subject>Encyclopedias</subject><subject>Korean language</subject><subject>Language</subject><subject>Machines</subject><subject>Manufacturing</subject><subject>Mechanical Engineering</subject><subject>Processes</subject><subject>Qualitative analysis</subject><subject>Statistical analysis</subject><subject>Subject specialists</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>C6C</sourceid><recordid>eNp9kEtLAzEQx4MoWKtfwFPAc3SS7CM5luILCl4UBA8hm52U1u2mJqnYb-_aFbx5mhn-j4EfIZccrjlAfZM4FEozEAWDsoKS6SMy4WUtWV3o-phMQA9SVenXU3KW0hoApAQ-IW8z2tm4RJac7ZC2NtuEmfoQ6XuIaHvaBrfbYJ9Zh5_Y0YidzavQU_zK0brD6mPYUOzd3nVhi-3K0jyo6ZyceNslvPidU_Jyd_s8f2CLp_vH-WzBnCiKzFRtlbRaiUZb9IVqSz7c3oJwiK7VTdUgr0QDJfpS1gI0NtrXgiul26ZSckquxt5tDB87TNmswy72w0sjQQPoQis-uMTocjGkFNGbbVxtbNwbDuYHohkhmgGiOUA0egjJMZQGc7_E-Ff9T-obEeh2Qw</recordid><startdate>20240901</startdate><enddate>20240901</enddate><creator>Son, Suhyune</creator><creator>Lim, Jungwoo</creator><creator>Koo, Seonmin</creator><creator>Kim, Jinsung</creator><creator>Kim, Younghoon</creator><creator>Lim, Youngsik</creator><creator>Hyun, Dongseok</creator><creator>Lim, Heuiseok</creator><general>Springer US</general><general>Springer Nature 
B.V</general><scope>C6C</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-9269-1157</orcidid></search><sort><creationdate>20240901</creationdate><title>A large-scale dataset for korean document-level relation extraction from encyclopedia texts</title><author>Son, Suhyune ; Lim, Jungwoo ; Koo, Seonmin ; Kim, Jinsung ; Kim, Younghoon ; Lim, Youngsik ; Hyun, Dongseok ; Lim, Heuiseok</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c244t-87a83a982b9aef48d5183afa02ceecd9b6be162b05ef537209eb9f721889db683</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Annotations</topic><topic>Artificial Intelligence</topic><topic>Computer Science</topic><topic>Cultural heritage</topic><topic>Datasets</topic><topic>Documents</topic><topic>Encyclopedias</topic><topic>Korean language</topic><topic>Language</topic><topic>Machines</topic><topic>Manufacturing</topic><topic>Mechanical Engineering</topic><topic>Processes</topic><topic>Qualitative analysis</topic><topic>Statistical analysis</topic><topic>Subject specialists</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Son, Suhyune</creatorcontrib><creatorcontrib>Lim, Jungwoo</creatorcontrib><creatorcontrib>Koo, Seonmin</creatorcontrib><creatorcontrib>Kim, Jinsung</creatorcontrib><creatorcontrib>Kim, Younghoon</creatorcontrib><creatorcontrib>Lim, Youngsik</creatorcontrib><creatorcontrib>Hyun, Dongseok</creatorcontrib><creatorcontrib>Lim, Heuiseok</creatorcontrib><collection>Springer Nature OA Free Journals</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science 
Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Son, Suhyune</au><au>Lim, Jungwoo</au><au>Koo, Seonmin</au><au>Kim, Jinsung</au><au>Kim, Younghoon</au><au>Lim, Youngsik</au><au>Hyun, Dongseok</au><au>Lim, Heuiseok</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A large-scale dataset for korean document-level relation extraction from encyclopedia texts</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><stitle>Appl Intell</stitle><date>2024-09-01</date><risdate>2024</risdate><volume>54</volume><issue>17-18</issue><spage>8681</spage><epage>8701</epage><pages>8681-8701</pages><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>Document-level relation extraction (RE) aims to predict the relational facts between two given entities from a document. Unlike widespread research on document-level RE in English, Korean document-level RE research is still at the very beginning due to the absence of a dataset. To accelerate the studies, we present TREK ( T oward Document-Level R elation E xtraction in K orean) dataset constructed from Korean encyclopedia documents written by the domain experts. We provide detailed statistical analyses for our large-scale dataset and human evaluation results suggest the assured quality of TREK . Also, we introduce the document-level RE model that considers the named entity-type while considering the Korean language’s properties. 
In the experiments, we demonstrate that our proposed model outperforms the baselines and conduct qualitative analysis.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10489-024-05605-9</doi><tpages>21</tpages><orcidid>https://orcid.org/0000-0002-9269-1157</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0924-669X
ispartof Applied intelligence (Dordrecht, Netherlands), 2024-09, Vol.54 (17-18), p.8681-8701
issn 0924-669X
1573-7497
language eng
recordid cdi_proquest_journals_3090094981
source SpringerLink Journals
subjects Annotations
Artificial Intelligence
Computer Science
Cultural heritage
Datasets
Documents
Encyclopedias
Korean language
Language
Machines
Manufacturing
Mechanical Engineering
Processes
Qualitative analysis
Statistical analysis
Subject specialists
title A large-scale dataset for Korean document-level relation extraction from encyclopedia texts
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T10%3A56%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20large-scale%20dataset%20for%20korean%20document-level%20relation%20extraction%20from%20encyclopedia%20texts&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Son,%20Suhyune&rft.date=2024-09-01&rft.volume=54&rft.issue=17-18&rft.spage=8681&rft.epage=8701&rft.pages=8681-8701&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-024-05605-9&rft_dat=%3Cproquest_cross%3E3090094981%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3090094981&rft_id=info:pmid/&rfr_iscdi=true