The Annotated Corpus of Classical Tibetan (ACTib), Part I - Segmented version, based on the BDRC digitised text collection, tagged with the Memory-Based Tagger from TiMBL

Main authors: Meelen, Marieke; Hill, Nathan W.; Handy, Christopher
Format: Dataset
Language: eng
creator Meelen, Marieke
Hill, Nathan W.
Handy, Christopher
description This corpus is a part-of-speech tagged version of Wallman, Jeff, Rowinski, Zach, Ngawang Trinley, Tomlinson, Chris, & Keutzer, Kurt. (2017). Collection of Tibetan etexts compiled by the Buddhist Digital Resource Center [Data set]. Zenodo. http://doi.org/10.5281/zenodo.821218. It was tagged with the memory-based tagger from https://languagemachines.github.io/mbt/, using as training data Hill, Nathan W., & Garrett, Edward. (2017). A part-of-speech (POS) tagged corpus of Classical Tibetan [Data set]. Zenodo. http://doi.org/10.5281/zenodo.574878. Please note that the files have not been post-processed or manually corrected, and that a small number of files in the KarmaDelek directory were annotated even though their original XML input was already corrupted.
doi_str_mv 10.5281/zenodo.823706
format Dataset
publisher Zenodo
creationdate 2017-07-06
orcidid https://orcid.org/0000-0001-6423-017X
oa free_for_read
fulltext fulltext_linktorsrc
identifier DOI: 10.5281/zenodo.823706
language eng
recordid cdi_datacite_primary_10_5281_zenodo_823706
source DataCite
subjects corpus linguistics
memory based tagging
natural language processing
Tibetan language
Tibetan linguistics
Trans-Himalayan Linguistics
title The Annotated Corpus of Classical Tibetan (ACTib), Part I - Segmented version, based on the BDRC digitised text collection, tagged with the Memory-Based Tagger from TiMBL
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T00%3A37%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-datacite_PQ8&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.au=Meelen,%20Marieke&rft.date=2017-07-06&rft_id=info:doi/10.5281/zenodo.823706&rft_dat=%3Cdatacite_PQ8%3E10_5281_zenodo_823706%3C/datacite_PQ8%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true