Chia, a large annotated corpus of clinical trial eligibility criteria

Bibliographic details
Published in: Scientific data 2020-08, Vol.7 (1), p.281-281, Article 281
Main authors: Kury, Fabrício; Butler, Alex; Yuan, Chi; Fu, Li-heng; Sun, Yingcheng; Liu, Hao; Sim, Ida; Carini, Simona; Weng, Chunhua
Format: Article
Language: eng
Description
Summary: We present Chia, a novel, large annotated corpus of patient eligibility criteria extracted from 1,000 interventional, Phase IV clinical trials registered in ClinicalTrials.gov. This dataset includes 12,409 annotated eligibility criteria, represented by 41,487 distinctive entities of 15 entity types and 25,017 relationships of 12 relationship types. Each criterion is represented as a directed acyclic graph, which can be easily transformed into Boolean logic to form a database query. Chia can serve as a shared benchmark to develop and test future machine learning, rule-based, or hybrid methods for information extraction from free-text clinical trial eligibility criteria.
Measurement(s): Clinical Trial Eligibility Criteria • Analytical Procedure Accuracy
Technology Type(s): digital curation • computational modeling technique
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.12765602
ISSN:2052-4463
DOI:10.1038/s41597-020-00620-0
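
The summary states that each criterion is a directed acyclic graph that can be transformed into Boolean logic. The following is only a minimal sketch of that idea, not Chia's actual annotation schema or the authors' method. The `Node` class, the operator names, and the example criterion are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One node of a hypothetical criterion graph (not Chia's actual schema)."""
    label: str            # entity text for leaves, a free-form name for operators
    op: str = "LEAF"      # "LEAF", "AND", "OR", or "NOT"
    children: tuple = ()  # child nodes; a child may be shared, hence a DAG


def to_boolean(node, cache=None):
    """Render a node as a parenthesized Boolean expression string."""
    if cache is None:
        cache = {}
    key = id(node)
    if key in cache:      # a shared sub-graph is rendered only once
        return cache[key]
    if node.op == "LEAF":
        expr = node.label
    elif node.op == "NOT":
        expr = "NOT " + to_boolean(node.children[0], cache)
    else:
        parts = [to_boolean(child, cache) for child in node.children]
        expr = "(" + f" {node.op} ".join(parts) + ")"
    cache[key] = expr
    return expr


# Made-up criterion for illustration: adults with type 2 diabetes, not pregnant.
age = Node("age >= 18")
t2d = Node("condition = 'type 2 diabetes'")
pregnant = Node("condition = 'pregnancy'")
criterion = Node("root", "AND", (age, t2d, Node("neg", "NOT", (pregnant,))))

print(to_boolean(criterion))
```

The resulting string has the shape of a database filter clause. A real pipeline would also need to map entity text to coded concepts, which this sketch does not attempt.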