Morphosyntactic Annotation in the Technical Corpus of Galician

Bibliographic details
Published in: Linguamática (Braga, Portugal), 2009-01, Vol. 1, p. 61-70
Main authors: Gomez Guinovart, Xavier; Lopez Fernandez, Susana
Format: Article
Language: orm
Online access: Full text
Description
Summary: The Corpus Tecnico do Galego (CTG, Technical Corpus of Galician), developed at the University of Vigo in collaboration with the University of Santiago de Compostela and freely available for consultation on the Internet (www.sli.uvigo.es/CTG) since 2006, is presented, specifying its content and size and reporting the tagging and lemmatization work. The Corpus Tecnico Anotado do Galego (CTAG, Annotated Technical Corpus of Galician) is discussed as a categorized and lemmatized version of the CTG. Geoffrey Leech and Andrew Wilson's (1996) guidelines for morphosyntactic annotation were applied, incorporating proposals by Montserrat Civit (2003) for Spanish. Form-lemma-tag samples are given for nouns, verbs, adjectives, and other parts of speech. The treatment of standard and nonstandard word forms, lexemes that do not conform to normative orthography, loanwords, abbreviations, and symbols is discussed. Samples of tagged text fragments are also included. Adapted from the source document.
ISSN: 1647-0818