Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022

Contents: 1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7 2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-...

Detailed description

Bibliographic details
Main authors:	Bański, Piotr; Barbaresi, Adrien; Clematide, Simon; Kupietz, Marc; Lüngen, Harald
Format:	Web Resource
Language:	eng ; ger
Subjects:	Data; Data analysis; Data management; Data quality; Data collection; Dataset; Corpus; Language

creator	Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald
description	Contents: 1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7 2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-15 3. Luca Brigada Villa: UDeasy: a Tool for Querying Treebanks in CoNLL-U Format. Pp. 16-19 4. Nils Diewald: Matrix and Double-Array Representations for Efficient Finite State Tokenization. Pp. 20-26 5. Peter Fankhauser and Marc Kupietz: Count-Based and Predictive Language Models for Exploring DeReKo. Pp. 27-31 6. Hanno Biber: “The word expired when that world awoke.” New Challenges for Research with Large Text Corpora and Corpus-Based Discourse Studies in Totalitarian Times. Pp. 32-35
format	Web Resource
fullrecord	<record><control><sourceid>europeana_1GC</sourceid><recordid>TN_cdi_europeana_collections_2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6</sourcerecordid><originalsourceid>FETCH-europeana_collections_2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR63</originalsourceid><addsrcrecordid>eNqtjc2OgjAUhdm4MDrvcJdjogYLiPsGdLBmHDMZjBvS4BUasZe0uPXZrcRHmNVJzs93ht5jb6hEPCtdWaALdDWCOCQcmM8Y5GSutqYWSAOvZdOgrtCC0n1vJ7Ws8Ia6ey2FNBUCJ9OSkfDJd4LPFn7Pmcxd11hUDjB1DmR3jX0y9gYX2Vj8eOvIy9Lkl29meDfUojsoSnKrslOkbcH8cBWyuFAd3opQ_GzX2bc4pUkY59lXGkTxZhv9ify4j9aHZfCvsCcpi1y3</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>web_resource</recordtype></control><display><type>web_resource</type><title>Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022</title><source>Europeana Collections</source><creator>Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald</creator><creatorcontrib>Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald</creatorcontrib><description>Contents: 1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7 2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-15 3. Luca Brigada Villa: UDeasy: a Tool for Querying Treebanks in CoNLL-U Format. Pp. 16-19 4. Nils Diewald: Matrix and Double-Array Representations for Efficient Finite State Tokenization. Pp. 20-26 5. Peter Fankhauser and Marc Kupietz: Count-Based and Predictive Language Models for Exploring DeReKo. Pp. 27-31 6. 
Hanno Biber: “The word expired when that world awoke.” New Challenges for Research with Large Text Corpora and Corpus-Based Discourse Studies in Totalitarian Times. Pp. 32-35</description><language>eng ; ger</language><publisher>Paris : European Language Resources Association (ELRA)</publisher><subject>Daten ; Datenanalyse ; Datenmanagement ; Datenqualität ; Datensammlung ; Datensatz ; Korpus ; Sprache</subject><creationdate>2022</creationdate><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://data.europeana.eu/item/2048427/item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6$$EHTML$$P50$$Geuropeana$$Hfree_for_read</linktohtml><link.rule.ids>777,38498,75925</link.rule.ids><linktorsrc>$$Uhttps://data.europeana.eu/item/2048427/item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6$$EView_record_in_Europeana$$FView_record_in_$$GEuropeana$$Hfree_for_read</linktorsrc></links><search><creatorcontrib>Bański, Piotr</creatorcontrib><creatorcontrib>Barbaresi, Adrien</creatorcontrib><creatorcontrib>Clematide, Simon</creatorcontrib><creatorcontrib>Kupietz, Marc</creatorcontrib><creatorcontrib>Lüngen, Harald</creatorcontrib><title>Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022</title><description>Contents: 1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7 2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-15 3. Luca Brigada Villa: UDeasy: a Tool for Querying Treebanks in CoNLL-U Format. Pp. 16-19 4. Nils Diewald: Matrix and Double-Array Representations for Efficient Finite State Tokenization. Pp. 20-26 5. 
Peter Fankhauser and Marc Kupietz: Count-Based and Predictive Language Models for Exploring DeReKo. Pp. 27-31 6. Hanno Biber: “The word expired when that world awoke.” New Challenges for Research with Large Text Corpora and Corpus-Based Discourse Studies in Totalitarian Times. Pp. 32-35</description><subject>Daten</subject><subject>Datenanalyse</subject><subject>Datenmanagement</subject><subject>Datenqualität</subject><subject>Datensammlung</subject><subject>Datensatz</subject><subject>Korpus</subject><subject>Sprache</subject><fulltext>true</fulltext><rsrctype>web_resource</rsrctype><creationdate>2022</creationdate><recordtype>web_resource</recordtype><sourceid>1GC</sourceid><recordid>eNqtjc2OgjAUhdm4MDrvcJdjogYLiPsGdLBmHDMZjBvS4BUasZe0uPXZrcRHmNVJzs93ht5jb6hEPCtdWaALdDWCOCQcmM8Y5GSutqYWSAOvZdOgrtCC0n1vJ7Ws8Ia6ey2FNBUCJ9OSkfDJd4LPFn7Pmcxd11hUDjB1DmR3jX0y9gYX2Vj8eOvIy9Lkl29meDfUojsoSnKrslOkbcH8cBWyuFAd3opQ_GzX2bc4pUkY59lXGkTxZhv9ify4j9aHZfCvsCcpi1y3</recordid><startdate>20220701</startdate><enddate>20220701</enddate><creator>Bański, Piotr</creator><creator>Barbaresi, Adrien</creator><creator>Clematide, Simon</creator><creator>Kupietz, Marc</creator><creator>Lüngen, Harald</creator><general>Paris : European Language Resources Association (ELRA)</general><general>Mannheim : Leibniz-Institut für Deutsche Sprache (IDS)</general><scope>1GC</scope></search><sort><creationdate>20220701</creationdate><title>Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). 
Marseille, 20 June 2022</title><author>Bański, Piotr ; Barbaresi, Adrien ; Clematide, Simon ; Kupietz, Marc ; Lüngen, Harald</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-europeana_collections_2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR63</frbrgroupid><rsrctype>web_resources</rsrctype><prefilter>web_resources</prefilter><language>eng ; ger</language><creationdate>2022</creationdate><topic>Daten</topic><topic>Datenanalyse</topic><topic>Datenmanagement</topic><topic>Datenqualität</topic><topic>Datensammlung</topic><topic>Datensatz</topic><topic>Korpus</topic><topic>Sprache</topic><toplevel>online_resources</toplevel><creatorcontrib>Bański, Piotr</creatorcontrib><creatorcontrib>Barbaresi, Adrien</creatorcontrib><creatorcontrib>Clematide, Simon</creatorcontrib><creatorcontrib>Kupietz, Marc</creatorcontrib><creatorcontrib>Lüngen, Harald</creatorcontrib><collection>Europeana Collections</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Bański, Piotr</au><au>Barbaresi, Adrien</au><au>Clematide, Simon</au><au>Kupietz, Marc</au><au>Lüngen, Harald</au><format>book</format><genre>unknown</genre><ristype>GEN</ristype><btitle>Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022</btitle><date>2022-07-01</date><risdate>2022</risdate><abstract>Contents: 1. Vasile Pais, Maria Mitrofan, Verginica Barbu Mititelu, Elena Irimia, Roxana Micu and Carol Luca Gasan: Challenges in Creating a Representative Corpus of Romanian Micro-Blogging Text. Pp. 1-7 2. Modest von Korff: Exhaustive Indexing of PubMed Records with Medical Subject Headings. Pp. 8-15 3. Luca Brigada Villa: UDeasy: a Tool for Querying Treebanks in CoNLL-U Format. Pp. 16-19 4. Nils Diewald: Matrix and Double-Array Representations for Efficient Finite State Tokenization. Pp. 20-26 5. 
Peter Fankhauser and Marc Kupietz: Count-Based and Predictive Language Models for Exploring DeReKo. Pp. 27-31 6. Hanno Biber: “The word expired when that world awoke.” New Challenges for Research with Large Text Corpora and Corpus-Based Discourse Studies in Totalitarian Times. Pp. 32-35</abstract><pub>Paris : European Language Resources Association (ELRA)</pub><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
language	eng ; ger
recordid	cdi_europeana_collections_2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6
source	Europeana Collections
subjects	Daten ; Datenanalyse ; Datenmanagement ; Datenqualität ; Datensammlung ; Datensatz ; Korpus ; Sprache
title	Proceedings of the LREC 2022 Workshop on Challenges in the Management of Large Corpora (CMLC-10 2022). Marseille, 20 June 2022
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T08%3A08%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-europeana_1GC&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=unknown&rft.btitle=Proceedings%20of%20the%20LREC%202022%20Workshop%20on%20Challenges%20in%20the%20Management%20of%20Large%20Corpora%20(CMLC-10%202022).%20Marseille,%2020%20June%202022&rft.au=Ba%C5%84ski,%20Piotr&rft.date=2022-07-01&rft_id=info:doi/&rft_dat=%3Ceuropeana_1GC%3E2048427_item_4LQKGJOLZFE47WJIF357HK5VLWXP5GR6%3C/europeana_1GC%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true