Lexical ambiguity detection in professional discourse
Published in: | Information processing & management 2022-09, Vol.59 (5), p.103000, Article 103000 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Professional discourse is the language used by specialists, such as lawyers, doctors and academics, to communicate the knowledge and assumptions associated with their respective fields. Professional discourse can be especially difficult for non-specialists to understand due to the lexical ambiguity of commonplace words that have a different or more specific meaning within a specialist domain. This phenomenon also makes it harder for specialists to communicate with the general public because they are similarly unaware of the potential for misunderstandings.
In this article, we present an approach for detecting domain terms with lexical ambiguity versus everyday English. We demonstrate the efficacy of our approach with three case studies in statistics, law and biomedicine. In all case studies, we identify domain terms with a precision@100 greater than 0.9, outperforming the best performing baseline by 18.1–91.7%. Most importantly, we show this ranking is broadly consistent with semantic differences. Our results highlight the difficulties that existing semantic difference methods have in the cross-domain setting, which rank non-domain terms highly due to noise or biases in the data. We additionally show that our approach generalizes to short phrases and investigate its data efficiency by varying the number of labeled examples.
• Lexically ambiguous terms in professional discourse can confuse non-specialists.
• Semantic shift methods perform poorly due to noisy terms and data set biases.
• We present three case studies in law, statistics and medicine.
• Our method has high precision and ranks terms consistently with semantic shift.
• The ranking of short phrases reflects their increased specificity. |
---|---|
ISSN: | 0306-4573 1873-5371 |
DOI: | 10.1016/j.ipm.2022.103000 |
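
The summary reports results as precision@100, that is, the share of the 100 highest-ranked terms that are genuine domain terms. The following is a minimal sketch of that metric in general, not the article's own evaluation code; the function name `precision_at_k`, its parameters, and the toy term lists are assumptions made for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked_terms: Sequence[str], domain_terms: Set[str], k: int = 100) -> float:
    """Return the fraction of the top-k ranked terms that are labeled domain terms.

    Hypothetical helper: `ranked_terms` is a candidate list ordered from most to
    least likely to be a domain term, and `domain_terms` is the labeled gold set.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_terms[:k]
    # Count how many of the top-k candidates appear in the gold set.
    hits = sum(1 for term in top_k if term in domain_terms)
    # Standard precision@k divides by k, even if fewer than k candidates exist.
    return hits / k


# Toy data (invented for illustration, not taken from the article).
ranked = ["significant", "power", "table", "bias", "chair"]
gold = {"significant", "power", "bias"}
score = precision_at_k(ranked, gold, k=4)  # 3 of the top 4 are in the gold set
```

With k = 100, a value above 0.9 means that more than 90 of the top 100 ranked terms were judged to be domain terms.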