Determining the Importance of Frequency and Contextual Diversity in the Lexical Organization of Multiword Expressions

Bibliographic details
Published in: Canadian journal of experimental psychology 2022-06, Vol. 76 (2), p. 87-98
Main authors: Senaldi, Marco S. G., Titone, Debra A., Johns, Brendan T.
Format: Article
Language: English
Description
Abstract: Corpus-based models of lexical strength have called into question the role of word frequency as an organizing principle of the lexicon, revealing that contextual and semantic diversity measures provide a closer fit to lexical behavior data (Adelman et al., 2006; Jones et al., 2012). Contextual diversity measures modify word frequency by ignoring word repetition in context, while semantic diversity measures consider the semantic consistency of contextual word occurrence. Recent research has shown that a better account of lexical organization data is provided by socially based measures of semantic diversity, which encode the communication patterns of individuals across discourses (Johns, 2021b). While most research on contextual diversity has focused on single words, recent corpus-based and experimental evidence suggests that an integral part of language use involves recurrent and more structurally complex units, such as multiword phrases and idioms. The aim of the present work was to determine if contextual and semantic diversity drive lexical organization at the level of multiword units (here, operationalized as idiomatic expressions), in addition to single words. To this end, we analyzed normative ratings of familiarity for 210 English idioms (Libben & Titone, 2008) using a set of contextual, semantic, and socially based diversity measures that were computed from a 55-billion word corpus of Reddit comments. The results confirm the superiority of diversity measures over frequency for multiword expressions, suggesting that multiword units, such as idiomatic phrases, show similar lexical organization dynamics as single words.
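As an illustration of the distinction the abstract draws between frequency and contextual diversity, the following is a minimal sketch and not the authors' implementation. It assumes that a "context" is a single text (for example, one comment), that an expression is matched by plain case-insensitive substring search, and that the toy contexts and the example idiom are invented for this purpose. The function name is hypothetical. The semantic and socially based diversity measures mentioned in the abstract are not reproduced here.

```python
def frequency_and_contextual_diversity(contexts, phrase):
    """Return (frequency, contextual_diversity) of `phrase` over `contexts`.

    frequency            : total number of occurrences across all contexts
    contextual_diversity : number of contexts containing the phrase at
                           least once (repetition within a context is ignored)

    Assumption: matching is a simple case-insensitive substring count,
    which is a simplification chosen only for this sketch.
    """
    phrase = phrase.lower()
    frequency = 0
    contextual_diversity = 0
    for text in contexts:
        occurrences = text.lower().count(phrase)
        frequency += occurrences
        if occurrences > 0:
            contextual_diversity += 1
    return frequency, contextual_diversity


# Invented toy contexts (hypothetical data, not taken from the study's corpus).
toy_contexts = [
    "He decided to break the ice, and to break the ice he told a joke.",
    "A good way to break the ice is to ask a question.",
    "The ship could not get through the frozen harbour.",
]

freq, cd = frequency_and_contextual_diversity(toy_contexts, "break the ice")
print(freq, cd)  # 3 2
```

In this toy case the expression occurs three times but in only two contexts, so the repeated use inside the first context raises the frequency count without raising the contextual diversity count.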
ISSN: 1196-1961, 1878-7290
DOI: 10.1037/cep0000271