Neural negated entity recognition in Spanish electronic health records
Published in: | Journal of biomedical informatics 2020-05, Vol.105, p.103419-103419, Article 103419 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: |
• The goal is to detect negated Clinical Named Entities.
• Character embeddings are able to cope with lexical variability in EHRs.
• The Bi-LSTM proved robust at capturing contextual information about the negation cue.
• The system using lemma embeddings outperformed its counterpart using word embeddings.
• F-measure of 65.1 for exact match and 82.4 for partial match in Spanish EHRs.
This work deals with negation detection in clinical texts. Negation detection is key for decision support systems, since negated events (that is, events detected as absent) help ascertain a patient's current medical conditions. For artificial intelligence, negation detection matters because a negation can reverse the meaning of part of a text and, accordingly, influence other tasks such as medical dosage adjustment or the detection of adverse drug reactions and hospital-acquired diseases.
We focus on negated medical events such as disorders, findings and allergies. From a Natural Language Processing (NLP) perspective, we refer to them as negated medical entities. A novelty of this work is that we approached the task as Named Entity Recognition (NER) with the restriction that only negated medical entities must be recognized (in an attempt to help distinguish them from non-negated ones).
Our study was conducted on Electronic Health Records (EHRs) written in Spanish. One challenge is lexical variability (alternative medical forms, abbreviations, etc.). To address it, we employed an approach based on deep learning. Specifically, the system combines character embeddings to cope with out-of-vocabulary (OOV) words, Long Short-Term Memory (LSTM) networks to model contextual representations, and Conditional Random Fields (CRF) to classify each medical entity as either negated or not, given the contextual dense representation. Moreover, we explored both embeddings created from words and embeddings created from lemmas.
The best results were obtained with the lemmatized embeddings. Apparently, this approach reinforced the capability of the LSTMs to cope with the high lexical variability. The F-measure was 65.1 for exact match and 82.4 for partial match. |
---|---|
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2020.103419 |
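
The abstract reports separate F-measures for exact and partial match. The sketch below illustrates how the two scores can diverge on the same predictions. It is not the authors' evaluation script: the record does not define partial match, so the code assumes that a predicted span counts as a partial match whenever it overlaps a gold span by at least one token. The names `Span`, `overlaps` and `f_measure`, and the toy spans, are hypothetical.

```python
from typing import List, Tuple

# A negated entity is represented by its token offsets, end exclusive.
# This representation is an assumption made for illustration only.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


def f_measure(gold: List[Span], pred: List[Span], partial: bool = False) -> float:
    """F-measure over entity spans.

    Exact match: a prediction is correct only if both boundaries agree.
    Partial match (assumed definition): any overlap with a gold span counts.
    """
    gold_set, pred_set = set(gold), set(pred)
    if partial:
        # Predictions that touch some gold span, and gold spans that were touched.
        hit_pred = sum(any(overlaps(p, g) for g in gold_set) for p in pred_set)
        hit_gold = sum(any(overlaps(g, p) for p in pred_set) for g in gold_set)
    else:
        hit_pred = hit_gold = len(gold_set & pred_set)

    precision = hit_pred / len(pred_set) if pred_set else 0.0
    recall = hit_gold / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data: three gold negated entities and three predictions.
gold = [(2, 4), (7, 8), (10, 13)]
pred = [(2, 4), (7, 9), (15, 16)]  # one exact, one boundary error, one spurious

print(round(f_measure(gold, pred), 3))                # 0.333
print(round(f_measure(gold, pred, partial=True), 3))  # 0.667
```

In the toy data, the prediction `(7, 9)` overshoots the gold span `(7, 8)` by one token, so it is rejected under exact match and accepted under the assumed partial criterion. Boundary errors of this kind are one way a gap like the reported 65.1 versus 82.4 can arise.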