A knowledge base of clinical trial eligibility criteria
Published in: | Journal of Biomedical Informatics 2021-05, Vol.117, p.103771-103771, Article 103771 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Summary: |
• Eligibility criteria extracted from 352,110 clinical trials in ClinicalTrials.gov.
• A unique large-scale knowledge base with computable eligibility criteria entities and attributes.
• User-friendly web interface to enable queries and visualization.
• Enables clinical trial cohort definition, population representativeness assessment, and electronic phenotyping.
We present the Clinical Trial Knowledge Base (CTKB), a regularly updated knowledge base of discrete clinical trial eligibility criteria, equipped with a web-based user interface for querying and aggregate analysis of common eligibility criteria.
We used a natural language processing (NLP) tool named Criteria2Query (Yuan et al., 2019) to transform free-text clinical trial eligibility criteria from ClinicalTrials.gov into discrete criteria concepts and attributes, which are encoded using the widely adopted Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and stored in a relational SQL database. A web application accessible via RESTful APIs was implemented to enable queries and visual aggregate analyses. We demonstrate CTKB’s potential role in EHR phenotype knowledge engineering using ten validated phenotyping algorithms.
At the time of writing, CTKB contained 87,504 distinct OMOP CDM standard concepts, including Condition (47.82%), Drug (23.01%), Procedure (13.73%), Measurement (24.70%) and Observation (5.28%), with 34.78% for inclusion criteria and 65.22% for exclusion criteria, extracted from 352,110 clinical trials. The average hit rate of criteria concepts in eMERGE phenotype algorithms is 77.56%.
CTKB is a novel, comprehensive knowledge base of discrete eligibility criteria concepts with the potential to enable knowledge engineering for clinical trial cohort definition, clinical trial population representativeness assessment, electronic phenotyping, and data gap analyses for using electronic health records to support clinical trial recruitment. |
---|---|
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2021.103771 |
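
The summary describes discrete criteria concepts stored in a relational SQL database and evaluated by a "hit rate" against phenotyping algorithms. The following is a minimal sketch of that idea, not the authors' implementation. The table layout, the function names, the placeholder identifiers, and the definition of hit rate used here (the fraction of a phenotype algorithm's concepts that also occur in the knowledge base) are all assumptions made for illustration; the record does not specify any of them.

```python
import sqlite3

# Hypothetical, simplified schema. The real CTKB tables are not described
# in this record; this layout is an assumption for illustration only.
SCHEMA = """
CREATE TABLE criterion (
    nct_id       TEXT    NOT NULL,  -- trial identifier (placeholder values below)
    concept_id   INTEGER NOT NULL,  -- standard concept id (toy integers below)
    domain       TEXT    NOT NULL,  -- e.g. Condition, Drug, Procedure
    is_inclusion INTEGER NOT NULL   -- 1 = inclusion criterion, 0 = exclusion
);
"""


def build_demo_db(rows):
    """Create an in-memory database and load (nct_id, concept_id, domain, is_inclusion) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO criterion VALUES (?, ?, ?, ?)", rows)
    return conn


def domain_shares(conn):
    """Return each domain's share of distinct concepts as a fraction of the total."""
    counts = dict(
        conn.execute(
            "SELECT domain, COUNT(DISTINCT concept_id) FROM criterion GROUP BY domain"
        ).fetchall()
    )
    total = sum(counts.values())
    return {domain: n / total for domain, n in counts.items()}


def hit_rate(conn, phenotype_concepts):
    """Fraction of the given concept ids that occur in the criterion table.

    This definition is an assumption; the record does not define hit rate.
    """
    wanted = set(phenotype_concepts)
    if not wanted:
        return 0.0
    known = {row[0] for row in conn.execute("SELECT DISTINCT concept_id FROM criterion")}
    return len(wanted & known) / len(wanted)


if __name__ == "__main__":
    # Toy rows with made-up identifiers, not real trial or concept data.
    demo_rows = [
        ("TRIAL-A", 1, "Condition", 1),
        ("TRIAL-A", 2, "Drug", 0),
        ("TRIAL-B", 1, "Condition", 0),
        ("TRIAL-B", 3, "Condition", 1),
    ]
    conn = build_demo_db(demo_rows)
    print(domain_shares(conn))
    print(hit_rate(conn, [1, 2, 99, 100]))
```

Counting distinct concept ids per domain, rather than rows, keeps a concept that appears in many trials from inflating its domain's share.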