First Steps towards a Risk of Bias Corpus of Randomized Controlled Trials
Main authors: | , , , , , , , |
---|---|
Format: | Dataset |
Language: | eng |
Keywords: | |
Summary: | Abstract: Risk of bias (RoB) assessment of randomized clinical trials (RCTs) is vital to conducting systematic reviews. Manual RoB assessment for hundreds of RCTs is a cognitively demanding, lengthy process and is prone to subjective judgment. Supervised machine learning (ML) can help to accelerate this process but requires a hand-labelled corpus. There are currently neither RoB annotation guidelines for RCTs nor RoB-annotated corpora. In this pilot project, we test the practicality of directly using the revised Cochrane RoB 2.0 guidelines to develop an RoB-annotated corpus with a novel multi-level annotation scheme. We report inter-annotator agreement among four annotators who used the Cochrane RoB 2.0 guidelines. The agreement ranges from 0% for some bias classes to 76% for others. Finally, we discuss the shortcomings of this direct translation of the guidelines into an annotation scheme and suggest ways to improve both in order to obtain an RoB-annotated corpus suitable for ML. Methods: The upload contains two .zip files and a .json file. plain.html.zip: the original corpus (n = 10) in .html format, generated using the methodology described in the paper. Each .html file can be opened in any text editor or browser on any operating system and contains the full text divided into several annotatable text parts. ann.json.zip: the RoB annotations made by the authors (R.H., M.S., K.G., R.C.), in .json format. Each .json file consists of two JSON objects and three JSON arrays: annotatable (object), the parts of the full-text document that correspond to the text parts of the plain .html files; metas (object), the full-text document label; entities (array), the labelled entities, each linked to the part of the full text in which it occurs; relations (array); and sources (array). annotations-legend.json: this .json file maps the encoded entity class identifiers to their text labels. For example, the entity class label "1_2_Yes_Good" is encoded as "e_113". Resources: The code to parse the annotations can be found on GitHub. Funding: HES-SO Valais-Wallis, Sierre, Switzerland |
---|---|
DOI: | 10.5281/zenodo.7698940 |
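
The file layout described in the summary can be illustrated with a minimal Python sketch that decodes the entity class identifiers of one annotation document using the legend. This is not the authors' parser from GitHub. The key names `classId` and `part`, the direction of the legend mapping (identifier to label), and the miniature sample data are all assumptions made for illustration; only the top-level names `annotatable`, `metas`, `entities`, `relations`, `sources` and the pair "e_113" / "1_2_Yes_Good" come from the record itself.

```python
import json
from collections import Counter


def decode_entities(annotation, legend):
    """Return (part, label) pairs for every entity in one annotation document.

    `annotation` is the parsed content of one file from ann.json.zip and
    `legend` the parsed content of annotations-legend.json.
    """
    decoded = []
    for entity in annotation.get("entities", []):
        class_id = entity.get("classId")  # assumed key name
        part = entity.get("part")         # assumed key name
        # Fall back to the raw identifier when the legend has no entry for it.
        decoded.append((part, legend.get(class_id, class_id)))
    return decoded


# Hypothetical miniature inputs; the real files hold many more fields.
legend = json.loads('{"e_113": "1_2_Yes_Good"}')
annotation = json.loads("""
{
  "annotatable": {"parts": ["s1p1", "s1p2"]},
  "metas": {},
  "entities": [
    {"classId": "e_113", "part": "s1p1"},
    {"classId": "e_999", "part": "s1p2"}
  ],
  "relations": [],
  "sources": []
}
""")

pairs = decode_entities(annotation, legend)
label_counts = Counter(label for _, label in pairs)
print(pairs)
print(label_counts)
```

With the real data, the two dictionaries would instead be loaded with `json.load` from annotations-legend.json and from one file extracted from ann.json.zip, and the key names should be checked against an actual file before relying on them.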