Towards Using Scientific Publications to Automatically Extract Information on Rare Diseases

Bibliographic Details
Published in: Mobile networks and applications 2020-06, Vol.25 (3), p.953-960
Main authors: Cousyn, Charles, Bouchard, Kévin, Gaboury, Sébastien, Bouchard, Bruno
Format: Article
Language: English
Description
Summary: A small percentage of the population is afflicted by what is called an orphan or rare disease. Worldwide, there are several thousand of these diseases. Taken together, the individuals affected amount to up to 10% of the US population. Scientific work on these diseases is often poorly financed because of the lack of a potential market for a treatment, which leaves patients and clinicians with very limited and scattered access to vital information. To help address this issue, we present in this paper a new software tool for automating the extraction of information related to rare diseases from scientific publications. More precisely, our contribution is a new method for automatically extracting the symptoms of these diseases from research papers, using a Named Entity Recognition (NER) algorithm based on the numerical statistic Term Frequency - Inverse Document Frequency (TF-IDF). The proposed tool has been tested on the PubMed Central (PMC) database.
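This record does not describe how the authors' NER algorithm applies TF-IDF, so the following is only a minimal sketch of the TF-IDF statistic itself, not the method from the paper. The function names (`tokenize`, `tf_idf`, `top_terms`), the tokenization rule, and the toy corpus are assumptions made for illustration. It scores each term by its relative frequency in one document multiplied by the logarithm of the inverse fraction of documents that contain it, so terms shared by every document score zero and document-specific terms rise to the top.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(docs):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return scores


def top_terms(score, k=3):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    ranked = sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, invented for this sketch (not data from the paper).
TOY_DOCS = [
    "patients with the disease report fatigue and fatigue",
    "patients with the disease report seizures",
    "patients with the disease report rash",
]

if __name__ == "__main__":
    for doc_scores in tf_idf(TOY_DOCS):
        print(top_terms(doc_scores, k=1))
```

In a setting like the one the summary describes, candidate terms ranked this way would still need filtering to decide which of them are symptoms; that step is outside this sketch.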
ISSN:1383-469X
1572-8153
DOI:10.1007/s11036-019-01237-3