Measuring Lexical Style and Competence: The Type-Token Vocabulary Curve
Published in: | Style (University Park, PA), 1990-12, Vol.24 (4), p.584-599 |
---|---|
Main author: | |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | A personal computer is used to analyze samples from literary texts by thirteen different authors. The total number of words (tokens) and the number of distinct vocabulary words (types) are computed for each sample. The number of types is then plotted against the number of tokens for eight of the texts. From these type-token curves, inferences are drawn about both lexical style (vocabulary use) and lexical competence (vocabulary size). The curve for Joyce's Ulysses, for example, rises much more rapidly than that for a late passage from A Portrait of the Artist as a Young Man; however, after 800 tokens, the two curves begin to converge. This suggests that the difference between Ulysses and Portrait is largely one of lexical style rather than competence. The highest type-token curve for the samples tested was that for Finnegans Wake; the lowest was that for Genesis. Comparison with type-token statistics gathered by Henry Kucera and W. Nelson Francis suggests that the curves for the Wake and Genesis are near the maximum and minimum, respectively, for English literature. |
---|---|
ISSN: | 0039-4238 2374-6629 |
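
The type-token curve described in the summary can be illustrated with a minimal Python sketch. It is not the article's program: the record does not say how words were delimited or whether case was folded, so the `tokenize` rule below (lowercased runs of ASCII letters, with internal apostrophes allowed) and the function names are assumptions made for illustration only.

```python
import re


def tokenize(text):
    """Split text into word tokens.

    Assumption (not from the article): a token is a lowercased run of
    ASCII letters, optionally joined by internal apostrophes.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def type_token_curve(tokens):
    """Return (tokens_so_far, types_so_far) after each token is read."""
    seen = set()   # distinct vocabulary words (types) met so far
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


sample = "the cat saw the dog and the dog saw the cat"
curve = type_token_curve(tokenize(sample))
print(curve[-1])  # (11, 5): 11 tokens, 5 types
```

Plotting the second coordinate against the first gives one curve per sample. A text that keeps introducing new words yields a curve that stays close to the diagonal, while a repetitive text flattens early, as in the toy sample above.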