An Open-Source Knowledge Graph Ecosystem for the Life Sciences

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenome...

Bibliographic details
Main authors: Callahan, Tiffany J, Tripodi, Ignacio J, Stefanski, Adrianne L, Cappelletti, Luca, Taneja, Sanya B, Wyrwa, Jordan M, Casiraghi, Elena, Matentzoglu, Nicolas A, Reese, Justin, Silverstein, Jonathan C, Hoyt, Charles Tapley, Boyce, Richard D, Malec, Scott A, Unni, Deepak R, Joachimiak, Marcin P, Robinson, Peter N, Mungall, Christopher J, Cavalleri, Emanuele, Fontana, Tommaso, Valentini, Giorgio, Mesiti, Marco, Gillenwater, Lucas A, Santangelo, Brook, Vasilevsky, Nicole A, Hoehndorf, Robert, Bennett, Tellen D, Ryan, Patrick B, Hripcsak, George, Kahn, Michael G, Bada, Michael, Baumgartner Jr, William A, Hunter, Lawrence E
Format: Article
Language: eng
creator Callahan, Tiffany J
Tripodi, Ignacio J
Stefanski, Adrianne L
Cappelletti, Luca
Taneja, Sanya B
Wyrwa, Jordan M
Casiraghi, Elena
Matentzoglu, Nicolas A
Reese, Justin
Silverstein, Jonathan C
Hoyt, Charles Tapley
Boyce, Richard D
Malec, Scott A
Unni, Deepak R
Joachimiak, Marcin P
Robinson, Peter N
Mungall, Christopher J
Cavalleri, Emanuele
Fontana, Tommaso
Valentini, Giorgio
Mesiti, Marco
Gillenwater, Lucas A
Santangelo, Brook
Vasilevsky, Nicole A
Hoehndorf, Robert
Bennett, Tellen D
Ryan, Patrick B
Hripcsak, George
Kahn, Michael G
Bada, Michael
Baumgartner Jr, William A
Hunter, Lawrence E
description Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoints and abstraction algorithms), and benchmarks (e.g., prebuilt KGs and embeddings). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
doi_str_mv 10.48550/arxiv.2307.05727
format Article
fullrecord <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2307_05727</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2307_05727</sourcerecordid><originalsourceid>FETCH-LOGICAL-a677-5b83101e37b6d3bc2546c834c53d7710b7ce4f797f6283a1d0abef171ab455b23</originalsourceid><addsrcrecordid>eNotz71uwjAUQGEvDAh4AKb6BRJsXzs3LEgIAa0aiQH2yHauSyRIIofft0elnc52pI-xqRSpzo0RMxsf9S1VIDAVBhUO2WLZ8F1HTbJvr9ET_27a-4mqH-LbaLsjX_u2f_YXOvPQRn45Ei_qQHzva2o89WM2CPbU0-S_I3bYrA-rz6TYbb9WyyKxGWJiXA5SSAJ0WQXOK6Mzn4P2BipEKRx60gHnGDKVg5WVsI6CRGmdNsYpGLGPv-0bUHaxPtv4LH8h5RsCL8R5Qe8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>An Open-Source Knowledge Graph Ecosystem for the Life Sciences</title><source>arXiv.org</source><creator>Callahan, Tiffany J ; Tripodi, Ignacio J ; Stefanski, Adrianne L ; Cappelletti, Luca ; Taneja, Sanya B ; Wyrwa, Jordan M ; Casiraghi, Elena ; Matentzoglu, Nicolas A ; Reese, Justin ; Silverstein, Jonathan C ; Hoyt, Charles Tapley ; Boyce, Richard D ; Malec, Scott A ; Unni, Deepak R ; Joachimiak, Marcin P ; Robinson, Peter N ; Mungall, Christopher J ; Cavalleri, Emanuele ; Fontana, Tommaso ; Valentini, Giorgio ; Mesiti, Marco ; Gillenwater, Lucas A ; Santangelo, Brook ; Vasilevsky, Nicole A ; Hoehndorf, Robert ; Bennett, Tellen D ; Ryan, Patrick B ; Hripcsak, George ; Kahn, Michael G ; Bada, Michael ; BaumgartnerJr, William A ; Hunter, Lawrence E</creator><creatorcontrib>Callahan, Tiffany J ; Tripodi, Ignacio J ; Stefanski, Adrianne L ; Cappelletti, Luca ; Taneja, Sanya B ; Wyrwa, Jordan M ; Casiraghi, Elena ; Matentzoglu, Nicolas A ; Reese, Justin ; Silverstein, Jonathan C ; Hoyt, Charles Tapley ; Boyce, Richard D ; Malec, Scott A ; Unni, Deepak R ; Joachimiak, Marcin P ; Robinson, Peter N ; Mungall, Christopher J ; Cavalleri, Emanuele ; Fontana, Tommaso ; Valentini, Giorgio ; Mesiti, Marco ; 
Gillenwater, Lucas A ; Santangelo, Brook ; Vasilevsky, Nicole A ; Hoehndorf, Robert ; Bennett, Tellen D ; Ryan, Patrick B ; Hripcsak, George ; Kahn, Michael G ; Bada, Michael ; BaumgartnerJr, William A ; Hunter, Lawrence E</creatorcontrib><description>Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoints and abstraction algorithms), and benchmarks (e.g., prebuilt KGs and embeddings). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 large-scale KGs. 
With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.</description><identifier>DOI: 10.48550/arxiv.2307.05727</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computational Engineering, Finance, and Science</subject><creationdate>2023-07</creationdate><rights>http://creativecommons.org/licenses/by/4.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,776,881</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2307.05727$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2307.05727$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Callahan, Tiffany J</creatorcontrib><creatorcontrib>Tripodi, Ignacio J</creatorcontrib><creatorcontrib>Stefanski, Adrianne L</creatorcontrib><creatorcontrib>Cappelletti, Luca</creatorcontrib><creatorcontrib>Taneja, Sanya B</creatorcontrib><creatorcontrib>Wyrwa, Jordan M</creatorcontrib><creatorcontrib>Casiraghi, Elena</creatorcontrib><creatorcontrib>Matentzoglu, Nicolas A</creatorcontrib><creatorcontrib>Reese, Justin</creatorcontrib><creatorcontrib>Silverstein, Jonathan C</creatorcontrib><creatorcontrib>Hoyt, Charles Tapley</creatorcontrib><creatorcontrib>Boyce, Richard D</creatorcontrib><creatorcontrib>Malec, Scott A</creatorcontrib><creatorcontrib>Unni, Deepak R</creatorcontrib><creatorcontrib>Joachimiak, Marcin P</creatorcontrib><creatorcontrib>Robinson, Peter N</creatorcontrib><creatorcontrib>Mungall, Christopher J</creatorcontrib><creatorcontrib>Cavalleri, Emanuele</creatorcontrib><creatorcontrib>Fontana, Tommaso</creatorcontrib><creatorcontrib>Valentini, 
Giorgio</creatorcontrib><creatorcontrib>Mesiti, Marco</creatorcontrib><creatorcontrib>Gillenwater, Lucas A</creatorcontrib><creatorcontrib>Santangelo, Brook</creatorcontrib><creatorcontrib>Vasilevsky, Nicole A</creatorcontrib><creatorcontrib>Hoehndorf, Robert</creatorcontrib><creatorcontrib>Bennett, Tellen D</creatorcontrib><creatorcontrib>Ryan, Patrick B</creatorcontrib><creatorcontrib>Hripcsak, George</creatorcontrib><creatorcontrib>Kahn, Michael G</creatorcontrib><creatorcontrib>Bada, Michael</creatorcontrib><creatorcontrib>BaumgartnerJr, William A</creatorcontrib><creatorcontrib>Hunter, Lawrence E</creatorcontrib><title>An Open-Source Knowledge Graph Ecosystem for the Life Sciences</title><description>Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoints and abstraction algorithms), and benchmarks (e.g., prebuilt KGs and embeddings). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 large-scale KGs. 
With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computational Engineering, Finance, and Science</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotz71uwjAUQGEvDAh4AKb6BRJsXzs3LEgIAa0aiQH2yHauSyRIIofft0elnc52pI-xqRSpzo0RMxsf9S1VIDAVBhUO2WLZ8F1HTbJvr9ET_27a-4mqH-LbaLsjX_u2f_YXOvPQRn45Ei_qQHzva2o89WM2CPbU0-S_I3bYrA-rz6TYbb9WyyKxGWJiXA5SSAJ0WQXOK6Mzn4P2BipEKRx60gHnGDKVg5WVsI6CRGmdNsYpGLGPv-0bUHaxPtv4LH8h5RsCL8R5Qe8</recordid><startdate>20230711</startdate><enddate>20230711</enddate><creator>Callahan, Tiffany J</creator><creator>Tripodi, Ignacio J</creator><creator>Stefanski, Adrianne L</creator><creator>Cappelletti, Luca</creator><creator>Taneja, Sanya B</creator><creator>Wyrwa, Jordan M</creator><creator>Casiraghi, Elena</creator><creator>Matentzoglu, Nicolas A</creator><creator>Reese, Justin</creator><creator>Silverstein, Jonathan C</creator><creator>Hoyt, Charles Tapley</creator><creator>Boyce, Richard D</creator><creator>Malec, Scott A</creator><creator>Unni, Deepak R</creator><creator>Joachimiak, Marcin P</creator><creator>Robinson, Peter N</creator><creator>Mungall, Christopher J</creator><creator>Cavalleri, Emanuele</creator><creator>Fontana, Tommaso</creator><creator>Valentini, Giorgio</creator><creator>Mesiti, Marco</creator><creator>Gillenwater, Lucas A</creator><creator>Santangelo, Brook</creator><creator>Vasilevsky, Nicole A</creator><creator>Hoehndorf, Robert</creator><creator>Bennett, Tellen D</creator><creator>Ryan, Patrick B</creator><creator>Hripcsak, George</creator><creator>Kahn, Michael G</creator><creator>Bada, Michael</creator><creator>BaumgartnerJr, William A</creator><creator>Hunter, Lawrence 
E</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20230711</creationdate><title>An Open-Source Knowledge Graph Ecosystem for the Life Sciences</title><author>Callahan, Tiffany J ; Tripodi, Ignacio J ; Stefanski, Adrianne L ; Cappelletti, Luca ; Taneja, Sanya B ; Wyrwa, Jordan M ; Casiraghi, Elena ; Matentzoglu, Nicolas A ; Reese, Justin ; Silverstein, Jonathan C ; Hoyt, Charles Tapley ; Boyce, Richard D ; Malec, Scott A ; Unni, Deepak R ; Joachimiak, Marcin P ; Robinson, Peter N ; Mungall, Christopher J ; Cavalleri, Emanuele ; Fontana, Tommaso ; Valentini, Giorgio ; Mesiti, Marco ; Gillenwater, Lucas A ; Santangelo, Brook ; Vasilevsky, Nicole A ; Hoehndorf, Robert ; Bennett, Tellen D ; Ryan, Patrick B ; Hripcsak, George ; Kahn, Michael G ; Bada, Michael ; BaumgartnerJr, William A ; Hunter, Lawrence E</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a677-5b83101e37b6d3bc2546c834c53d7710b7ce4f797f6283a1d0abef171ab455b23</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computational Engineering, Finance, and Science</topic><toplevel>online_resources</toplevel><creatorcontrib>Callahan, Tiffany J</creatorcontrib><creatorcontrib>Tripodi, Ignacio J</creatorcontrib><creatorcontrib>Stefanski, Adrianne L</creatorcontrib><creatorcontrib>Cappelletti, Luca</creatorcontrib><creatorcontrib>Taneja, Sanya B</creatorcontrib><creatorcontrib>Wyrwa, Jordan M</creatorcontrib><creatorcontrib>Casiraghi, Elena</creatorcontrib><creatorcontrib>Matentzoglu, Nicolas A</creatorcontrib><creatorcontrib>Reese, Justin</creatorcontrib><creatorcontrib>Silverstein, Jonathan C</creatorcontrib><creatorcontrib>Hoyt, Charles Tapley</creatorcontrib><creatorcontrib>Boyce, Richard D</creatorcontrib><creatorcontrib>Malec, Scott A</creatorcontrib><creatorcontrib>Unni, Deepak 
R</creatorcontrib><creatorcontrib>Joachimiak, Marcin P</creatorcontrib><creatorcontrib>Robinson, Peter N</creatorcontrib><creatorcontrib>Mungall, Christopher J</creatorcontrib><creatorcontrib>Cavalleri, Emanuele</creatorcontrib><creatorcontrib>Fontana, Tommaso</creatorcontrib><creatorcontrib>Valentini, Giorgio</creatorcontrib><creatorcontrib>Mesiti, Marco</creatorcontrib><creatorcontrib>Gillenwater, Lucas A</creatorcontrib><creatorcontrib>Santangelo, Brook</creatorcontrib><creatorcontrib>Vasilevsky, Nicole A</creatorcontrib><creatorcontrib>Hoehndorf, Robert</creatorcontrib><creatorcontrib>Bennett, Tellen D</creatorcontrib><creatorcontrib>Ryan, Patrick B</creatorcontrib><creatorcontrib>Hripcsak, George</creatorcontrib><creatorcontrib>Kahn, Michael G</creatorcontrib><creatorcontrib>Bada, Michael</creatorcontrib><creatorcontrib>BaumgartnerJr, William A</creatorcontrib><creatorcontrib>Hunter, Lawrence E</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Callahan, Tiffany J</au><au>Tripodi, Ignacio J</au><au>Stefanski, Adrianne L</au><au>Cappelletti, Luca</au><au>Taneja, Sanya B</au><au>Wyrwa, Jordan M</au><au>Casiraghi, Elena</au><au>Matentzoglu, Nicolas A</au><au>Reese, Justin</au><au>Silverstein, Jonathan C</au><au>Hoyt, Charles Tapley</au><au>Boyce, Richard D</au><au>Malec, Scott A</au><au>Unni, Deepak R</au><au>Joachimiak, Marcin P</au><au>Robinson, Peter N</au><au>Mungall, Christopher J</au><au>Cavalleri, Emanuele</au><au>Fontana, Tommaso</au><au>Valentini, Giorgio</au><au>Mesiti, Marco</au><au>Gillenwater, Lucas A</au><au>Santangelo, Brook</au><au>Vasilevsky, Nicole A</au><au>Hoehndorf, Robert</au><au>Bennett, Tellen D</au><au>Ryan, Patrick B</au><au>Hripcsak, George</au><au>Kahn, Michael G</au><au>Bada, Michael</au><au>BaumgartnerJr, William A</au><au>Hunter, Lawrence 
E</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An Open-Source Knowledge Graph Ecosystem for the Life Sciences</atitle><date>2023-07-11</date><risdate>2023</risdate><abstract>Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoints and abstraction algorithms), and benchmarks (e.g., prebuilt KGs and embeddings). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.</abstract><doi>10.48550/arxiv.2307.05727</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier DOI: 10.48550/arxiv.2307.05727
language eng
recordid cdi_arxiv_primary_2307_05727
source arXiv.org
subjects Computer Science - Artificial Intelligence
Computer Science - Computational Engineering, Finance, and Science
title An Open-Source Knowledge Graph Ecosystem for the Life Sciences
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-31T06%3A55%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20Open-Source%20Knowledge%20Graph%20Ecosystem%20for%20the%20Life%20Sciences&rft.au=Callahan,%20Tiffany%20J&rft.date=2023-07-11&rft_id=info:doi/10.48550/arxiv.2307.05727&rft_dat=%3Carxiv_GOX%3E2307_05727%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true