Extracting drug-drug interactions from texts with BioBERT and multiple entity-aware attentions


Bibliographic details
Published in: Journal of Biomedical Informatics, 2020-06, Vol. 106, p. 103451, Article 103451
Authors: Zhu, Yu; Li, Lishuang; Lu, Hongbin; Zhou, Anqiao; Qin, Xueyang
Format: Article
Language: English
Online access: Full text
Description
Highlights:
•We propose multiple entity-aware attentions with various entity information to strengthen the representations of drug entities in sentences.
•We integrate drug descriptions from Wikipedia and DrugBank into our model to enhance the semantic information of drug entities.
•We modify the output of the BioBERT model, and the results show that this is better than using the BioBERT model directly.
•We achieve state-of-the-art results for DDI extraction with an F-score of 80.9%.

Abstract: Drug-drug interaction (DDI) extraction is an important task in biomedical relation extraction and plays an important role in pharmacovigilance. Previous neural-network-based models have achieved good performance on DDI extraction. However, most previous models did not make good use of drug entity names, which can help to judge the relation between drugs. This is mainly because drug names are often very complex, so neural network models cannot capture their semantics directly. To address this issue, we propose a DDI extraction model using multiple entity-aware attentions with various entity information. We use an output-modified bidirectional transformer (BioBERT) and a bidirectional gated recurrent unit (BiGRU) layer to obtain the vector representations of sentences. Drug description documents encoded by Doc2Vec serve as drug description information, an external knowledge source for our model. We then construct three different kinds of entity-aware attention to obtain sentence representations weighted by entity information, including attentions that use the drug description information. The outputs of the attention layers are concatenated and fed into a multi-layer perceptron. Finally, a softmax classifier produces the result. The F-score, also adopted by most previous DDI extraction models, is used to evaluate our model. We evaluate our proposed model on the DDIExtraction 2013 corpus, the benchmark corpus of this domain, and achieve a state-of-the-art result (80.9% F-score).
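To make the described pipeline concrete, the following is a minimal sketch, assuming PyTorch and the HuggingFace transformers library, of how a BioBERT + BiGRU encoder with entity-aware attentions, an MLP, and a softmax classifier could be wired together. It is not the authors' implementation: the module and parameter names (EntityAwareAttention, DDIExtractor, gru_hidden, desc_size), the BioBERT checkpoint identifier, and the choice of five output classes (four DDI types plus a negative class) are illustrative assumptions.

```python
# Hypothetical sketch of the described pipeline; names and dimensions are assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel

class EntityAwareAttention(nn.Module):
    """Weights token states by their similarity to an entity vector."""
    def __init__(self, hidden_size: int, entity_size: int):
        super().__init__()
        self.proj = nn.Linear(entity_size, hidden_size)

    def forward(self, hidden, entity_vec):
        # hidden: (batch, seq_len, hidden); entity_vec: (batch, entity_size)
        query = self.proj(entity_vec).unsqueeze(2)           # (batch, hidden, 1)
        scores = torch.bmm(hidden, query).squeeze(2)         # (batch, seq_len)
        weights = torch.softmax(scores, dim=1).unsqueeze(1)  # (batch, 1, seq_len)
        return torch.bmm(weights, hidden).squeeze(1)         # (batch, hidden)

class DDIExtractor(nn.Module):
    def __init__(self, encoder_name="dmis-lab/biobert-base-cased-v1.1",
                 gru_hidden=256, desc_size=300, num_classes=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)   # BioBERT
        self.bigru = nn.GRU(self.encoder.config.hidden_size, gru_hidden,
                            batch_first=True, bidirectional=True)
        h = 2 * gru_hidden
        # Three entity-aware attentions: one over drug mention states and two
        # over Doc2Vec drug-description vectors (the external knowledge).
        self.att_mention = EntityAwareAttention(h, h)
        self.att_desc1 = EntityAwareAttention(h, desc_size)
        self.att_desc2 = EntityAwareAttention(h, desc_size)
        self.mlp = nn.Sequential(nn.Linear(3 * h, h), nn.ReLU(),
                                 nn.Linear(h, num_classes))

    def forward(self, input_ids, attention_mask, mention_vec, desc1, desc2):
        states = self.encoder(input_ids,
                              attention_mask=attention_mask).last_hidden_state
        states, _ = self.bigru(states)                        # (batch, seq_len, 2*gru_hidden)
        pooled = torch.cat([self.att_mention(states, mention_vec),
                            self.att_desc1(states, desc1),
                            self.att_desc2(states, desc2)], dim=-1)
        return torch.log_softmax(self.mlp(pooled), dim=-1)    # per-class scores
```

In such a setup, the drug description vectors (desc1, desc2) would be precomputed offline, for example with gensim's Doc2Vec over the Wikipedia/DrugBank description texts, and passed to the model alongside the tokenized sentence; the mention vector could be pooled from the BiGRU states at the drug mention positions.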
ISSN: 1532-0464, 1532-0480
DOI: 10.1016/j.jbi.2020.103451