Unsupervised concept extraction from clinical text through semantic composition

Bibliographic details
Published in: Journal of biomedical informatics, 2019-03, Vol. 91, p. 103120, Article 103120
Main authors: Tulkens, Stéphan; Šuster, Simon; Daelemans, Walter
Format: Article
Language: English
Description
Highlights:
• Unsupervised concept extraction from clinical text.
• Uses semantic correspondence between UMLS and free text for concept extraction.
• Does not use any relation information.
• Achieves F-scores of .32 (exact) and .45 (inexact) on the I2b2-2010 challenge.

Abstract: Concept extraction is an important step in clinical natural language processing. Once extracted, concepts can improve the accuracy and generalization of downstream systems. We present a new unsupervised system for the extraction of concepts from clinical text. The system creates representations of concepts from the Unified Medical Language System (UMLS®) by combining natural language descriptions of concepts with word representations, and composing these into higher-order concept vectors. These concept vectors are then used to assign labels to candidate phrases, which are extracted using a syntactic chunker. Our approach achieves an exact F-score of .32 and an inexact F-score of .45 on the well-known I2b2-2010 challenge corpus, outperforming the only other unsupervised concept extraction method. Because our approach relies only on word representations and a chunker, it is completely unsupervised. As such, it can be applied to languages and corpora for which no prior annotations exist. All our code is open-source and can be found at www.github.com/clips/conch.
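The pipeline described in the abstract (compose word vectors into concept vectors, then label chunked candidate phrases) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the authors' code: the toy word vectors, the example descriptions, the three label names, the use of averaging as the composition function, and cosine similarity as the matching criterion. The published system may compose and match differently, and candidate phrases would come from a syntactic chunker rather than being supplied by hand.

```python
import math

# Toy 3-dimensional word vectors (hypothetical values, for illustration only).
WORD_VECTORS = {
    "chest": [0.9, 0.1, 0.0],
    "pain": [0.8, 0.2, 0.1],
    "headache": [0.7, 0.3, 0.0],
    "aspirin": [0.1, 0.9, 0.1],
    "therapy": [0.2, 0.8, 0.2],
    "scan": [0.0, 0.2, 0.9],
    "x-ray": [0.1, 0.1, 0.8],
}

# Hypothetical natural language descriptions, grouped under assumed labels.
CONCEPT_DESCRIPTIONS = {
    "problem": ["chest pain", "headache"],
    "treatment": ["aspirin therapy"],
    "test": ["x-ray scan"],
}


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def compose(phrase):
    """Average the vectors of the known words in a phrase (assumed composition).

    Returns None when no word of the phrase has a vector.
    """
    known = [WORD_VECTORS[w] for w in phrase.lower().split() if w in WORD_VECTORS]
    return mean(known) if known else None


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_concept_vectors(descriptions):
    """Compose each description, then average the descriptions per label."""
    concept_vectors = {}
    for label, phrases in descriptions.items():
        composed = [v for v in (compose(p) for p in phrases) if v is not None]
        if composed:
            concept_vectors[label] = mean(composed)
    return concept_vectors


def label_phrase(phrase, concept_vectors):
    """Return the label whose concept vector is most similar to the phrase."""
    vector = compose(phrase)
    if vector is None:
        return None
    return max(concept_vectors, key=lambda lab: cosine(vector, concept_vectors[lab]))


if __name__ == "__main__":
    vectors = build_concept_vectors(CONCEPT_DESCRIPTIONS)
    for candidate in ["chest pain", "aspirin", "scan"]:
        print(candidate, "->", label_phrase(candidate, vectors))
```

No annotated training data appears anywhere in this sketch, which mirrors the abstract's point: labels come only from the similarity between composed phrase vectors and composed concept vectors.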
ISSN: 1532-0464, 1532-0480
DOI: 10.1016/j.jbi.2019.103120