Accelerating discoveries in medicine using distributed vector representations of words
Published in: | Expert Systems with Applications 2024-09, Vol.250, p.123566, Article 123566 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | English |
Keywords: | |
Online access: | Full text |
Abstract: | Over the years, several neural network architectures have been proposed to process and represent texts using dense vectors (known as word embeddings): mathematical representations that encode the meaning of words or phrases. Word embeddings can be computed by many different algorithms, usually trained on large amounts of textual data to capture semantic relationships between words. These embeddings revolutionized many Natural Language Processing applications, enabling more accurate and nuanced language understanding. Recently, it was demonstrated that word embeddings can be used to uncover latent knowledge, i.e., information that is implicit in a set of texts and would hardly be perceptible to humans. This study extends that strategy by combining different unsupervised models to accelerate discoveries in medicine. Our word embeddings were trained on a large corpus of medical papers related to Acute Myeloid Leukemia, a highly malignant form of cancer. Our study shows that established therapies could have been developed earlier: our system issued treatment-testing notifications up to 11 years before those therapies were first proposed. The results show the potential of uncovering latent knowledge from the biomedical field to enable faster and more efficient drug testing for medical discoveries.
• A new embedding-based system is proposed to uncover latent knowledge from literature.
• The system coherently encodes biomedical knowledge about Acute Myeloid Leukemia.
• The system is capable of anticipating discoveries years before they are published.
• Medical discoveries can be sped up through a data-driven drug-testing strategy. |
---|---|
ISSN: | 0957-4174 |
DOI: | 10.1016/j.eswa.2024.123566 |