Representation Learning for Person or Entity-centric Knowledge Graphs: An Application in Healthcare
Main authors: | |
---|---|
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Knowledge graphs (KGs) are a popular way to organise information based on
ontologies or schemas and have been used across a variety of scenarios from
search to recommendation. Despite advances in KGs, representing knowledge
remains a non-trivial task across industries and it is especially challenging
in the biomedical and healthcare domains due to complex interdependent
relations between entities, heterogeneity, lack of standardization, and
sparseness of data. KGs are used to discover diagnoses or prioritize genes
relevant to disease, but they often rely on schemas that are not centred around
a node or entity of interest, such as a person. Entity-centric KGs are
relatively unexplored but hold promise in representing important facets
connected to a central node and unlocking downstream tasks beyond graph
traversal and reasoning, such as generating graph embeddings and training graph
neural networks for a wide range of predictive tasks. This paper presents an
end-to-end representation learning framework to extract entity-centric KGs from
structured and unstructured data. We introduce a star-shaped ontology to
represent the multiple facets of a person and use it to guide KG creation.
Compact representations of the graphs are created leveraging graph neural
networks and experiments are conducted using different levels of heterogeneity
or explicitness. A readmission prediction task is used to evaluate the results
of the proposed framework, showing a stable system, robust to missing data,
that outperforms a range of baseline machine learning classifiers. We highlight
that this approach has several potential applications across domains and is
open-sourced. Lastly, we discuss lessons learned, challenges, and next steps
for the adoption of the framework in practice. |
---|---|
DOI: | 10.48550/arxiv.2305.05640 |
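
The summary describes a star-shaped ontology in which facets of a person are attached to one central node. The following is a minimal sketch of that idea in plain Python, not the authors' implementation: the facet names, the `has_<facet>` predicate pattern, the identifier `person:001`, and the sample values are all hypothetical, since the record does not list the ontology's actual classes. Missing values are skipped, so a sparse record still yields a valid (smaller) star.

```python
from collections import defaultdict


def build_person_graph(person_id, facets):
    """Return (subject, predicate, object) triples forming a star around person_id.

    `facets` maps a facet name to a list of values; None marks a missing value.
    Facet names and the predicate naming scheme are illustrative assumptions.
    """
    triples = []
    for facet, values in facets.items():
        for value in values:
            if value is None:  # tolerate missing data by omitting the edge
                continue
            triples.append((person_id, f"has_{facet}", f"{facet}:{value}"))
    return triples


def to_adjacency(triples):
    """Build an undirected adjacency map from the triples."""
    adjacency = defaultdict(set)
    for subject, _, obj in triples:
        adjacency[subject].add(obj)
        adjacency[obj].add(subject)
    return adjacency


# Hypothetical patient record; the codes are used only as sample strings.
record = {
    "demographics": ["age_60_70"],
    "diagnosis": ["I50.9", "E11.9"],
    "medication": [None],  # missing entry
}

triples = build_person_graph("person:001", record)
adjacency = to_adjacency(triples)
```

In a star-shaped graph every facet node connects only to the central person node, which is what keeps the per-person graph compact. A graph neural network, as mentioned in the summary, would then consume graphs like this one; that step is outside the scope of this sketch.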