Lietuvių kalbos morfologiškai ir sintaksiškai anotuoti tekstynai

Bibliographic Details
Published in: Bendrinė kalba (iki 2014 metų – Kalbos kultūra) 2017 (90), p. 1-30
Main Authors: Rimkutė, Erika, Boizou, Loïc, Bielinskienė, Agnė
Format: Article
Language: Lithuanian (lit)
Online Access: Full text
Description
Summary: Annotated corpora are fundamental resources for developing language technology. The size, quality, and structure of such corpora have a direct influence on the development of other tools. This article describes two annotated corpora prepared by the Centre of Computational Linguistics at Vytautas Magnus University: MATAS, a morphologically annotated corpus, and ALKSNIS, a treebank. It mainly discusses the structure and tag set of both corpora, as well as the annotation procedure and tools. Both corpora are available online through the ANNIS interface, so the syntax of simple and complex ANNIS queries is summarized for potential Lithuanian users.
ISSN: 0130-2795, 2351-7204