Using Natural Language Processing to Predict Costume Core Vocabulary of Historical Artifacts

Historic dress artifacts are a valuable source for human studies. In particular, they can provide important insights into the social aspects of their corresponding era. These insights are commonly drawn from garment pictures as well as the accompanying descriptions and are usually stored in a standa...

Bibliographic Details
Main Authors:	Muralikrishnan, Madhuvanti; Hilal, Amr; Miller, Chreston; Smith-Glaviana, Dina
Format:	Article
Language:	English
Subjects:	Computer Science - Computation and Language; Computer Science - Digital Libraries; Computer Science - Information Retrieval

creator	Muralikrishnan, Madhuvanti; Hilal, Amr; Miller, Chreston; Smith-Glaviana, Dina
description	Historic dress artifacts are a valuable source for human studies. In particular, they can provide important insights into the social aspects of their era. These insights are commonly drawn from garment pictures and the accompanying descriptions, and are usually recorded using the Costume Core Vocabulary, a standardized, controlled vocabulary that accurately describes garments and costume items. Building an accurate Costume Core from garment descriptions can be challenging because historic garments are often donated, and the accompanying descriptions may be written by untrained individuals in language common to the period of the items. In this paper, we present an approach that uses Natural Language Processing (NLP) to map free-form text descriptions of historic items to the controlled vocabulary provided by the Costume Core. Despite the limited dataset, we were able to train an NLP model based on the Universal Sentence Encoder to perform this mapping with more than 90% test accuracy for a subset of the Costume Core vocabulary. We describe our methodology, design choices, and the development of our approach, and show the feasibility of predicting the Costume Core for unseen descriptions. With more garment descriptions still being curated for use in training, we expect higher accuracy and better generalizability.
doi_str_mv	10.48550/arxiv.2212.07931
format	Article
date	2022-11-23
rights	http://arxiv.org/licenses/nonexclusive-distrib/1.0
oa	free_for_read
linktorsrc	https://arxiv.org/abs/2212.07931
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2212.07931
language	eng
recordid	cdi_arxiv_primary_2212_07931
source	arXiv.org
subjects	Computer Science - Computation and Language; Computer Science - Digital Libraries; Computer Science - Information Retrieval
title	Using Natural Language Processing to Predict Costume Core Vocabulary of Historical Artifacts
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T14%3A40%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Using%20Natural%20Language%20Processing%20to%20Predict%20Costume%20Core%20Vocabulary%20of%20Historical%20Artifacts&rft.au=Muralikrishnan,%20Madhuvanti&rft.date=2022-11-23&rft_id=info:doi/10.48550/arxiv.2212.07931&rft_dat=%3Carxiv_GOX%3E2212_07931%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true