A collaborative filtering-based approach to biomedical knowledge discovery

Abstract Motivation The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future as...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Bioinformatics 2018-02, Vol.34 (4), p.652-659
Hauptverfasser: Lever, Jake, Gakkhar, Sitanshu, Gottlieb, Michael, Rashnavadi, Tahereh, Lin, Santina, Siu, Celia, Smith, Maia, Jones, Martin R, Krzywinski, Martin, Jones, Steven J M
Format: Artikel
Sprache:eng
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Abstract Motivation The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers. Results We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations. Availability and implementation All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery. Supplementary information Supplementary data are available at Bioinformatics online.
ISSN:1367-4803
1460-2059
1367-4811
DOI:10.1093/bioinformatics/btx613