Zachycení výstavby textu v Pražském závislostním korpusu [Capturing text structure in the Prague Dependency Treebank]
Published in: | Slovo a slovesnost 2015, Vol. 76 (3), pp. 163-197 |
---|---|
Format: | Article |
Language: | eng |
Summary: | Annotation schemes for language corpora now cover various layers of sentence description, from morphology to semantics. Annotation projects concerning phenomena beyond sentence boundaries, however, have only recently started to attract the attention of corpus linguists. In the present contribution, we describe a unified approach to the analysis of discourse phenomena, developed for large-scale annotation of Czech empirical data in the Prague Dependency Treebank. This approach rests on two fundamental pillars: (i) it exploits the results of one of the first complex schemes for discourse annotation, proposed and realized in the Penn Discourse Treebank for English; (ii) it follows the Praguian Functional Generative Description and treebanking tradition, taking advantage of the tectogrammatical (underlying) layer of sentence analysis and extending it to a full discourse-level description. Our analysis concentrates on two major aspects of discourse coherence: (i) discourse relations (semantic relations between discourse segments) and discourse connectives as their lexical anchors; and (ii) coreference and so-called bridging anaphora. We present a detailed description of the annotation scheme and procedure, address individual problematic issues, and offer basic corpus statistics and an evaluation of the annotation. |
---|---|
ISSN: | 0037-7031, 2571-0885 |