Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles

Understanding the complex connectivity structure of the brain is a major challenge in neuroscience. Vast and ever-expanding literature about neuronal connectivity between brain regions already exists in published research articles and databases. However, with the ever-expanding increase in published...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Interdisciplinary sciences : computational life sciences 2021-12, Vol.13 (4), p.731-750
Hauptverfasser: Sharma, Ashika, Jayakumar, Jaikishan, Mitra, Partha P., Chakraborti, Sutanu, Kumar, P. Sreenivasa
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Understanding the complex connectivity structure of the brain is a major challenge in neuroscience. Vast and ever-expanding literature about neuronal connectivity between brain regions already exists in published research articles and databases. However, with the ever-expanding increase in published articles and repositories, it becomes difficult for a neuroscientist to engage with the breadth and depth of any given field within neuroscience. Natural Language Processing (NLP) techniques can be used to mine ‘Brain Region Connectivity’ information from published articles to build a centralized connectivity resource helping neuroscience researchers to gain quick access to research findings. Manually curating and continuously updating such a resource involves significant time and effort. This paper presents an application of supervised machine learning algorithms that perform shallow and deep linguistic analysis of text to automatically extract connectivity between brain region mentions. Our proposed algorithms are evaluated using benchmark datasets collated from PubMed and our own dataset of full text articles annotated by a domain expert. We also present a comparison with state-of-the-art methods including BioBERT. Proposed methods achieve best recall and F 2 scores negating the need for any domain-specific predefined linguistic patterns. Our paper presents a novel effort towards automatically generating interpretable patterns of connectivity for extracting connected brain region mentions from text and can be expanded to include any other domain-specific information. Graphic Abstract
ISSN:1913-2751
1867-1462
DOI:10.1007/s12539-021-00443-6