Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
Main authors: | , , , , , , , , , , , , , , , , , |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | The COVID-19 pandemic demands assimilation of all available biomedical
knowledge to decode its mechanisms of pathogenicity and transmission. Despite
the recent renaissance in unsupervised neural networks for decoding
unstructured natural languages, a platform for the real-time synthesis of the
exponentially growing biomedical literature and its comprehensive triangulation
with deep omic insights is not available. Here, we present the nferX platform
for dynamic inference from over 45 quadrillion possible conceptual associations
extracted from unstructured biomedical text, and their triangulation with
Single Cell RNA-sequencing based insights from over 25 tissues. Using this
platform, we identify intersections between the pathologic manifestations of
COVID-19 and the comprehensive expression profile of the SARS-CoV-2 receptor
ACE2. We find that tongue keratinocytes and olfactory epithelial cells are
likely under-appreciated targets of SARS-CoV-2 infection, correlating with
reported loss of sense of taste and smell as early indicators of COVID-19
infection, including in otherwise asymptomatic patients. Airway club cells,
ciliated cells and type II pneumocytes in the lung, and enterocytes of the gut
also express ACE2. This study demonstrates how a holistic data science platform
can leverage unprecedented quantities of structured and unstructured publicly
available data to accelerate the generation of impactful biological insights
and hypotheses. |
---|---|
DOI: | 10.48550/arxiv.2003.12773 |