Web-based and machine learning approaches for identification of patient-reported outcomes in inflammatory bowel disease

Bibliographic details
Published in: Digestive and liver disease, 2022-04, Vol. 54 (4), p. 483-489
Main authors: Ricci, Laetitia; Toussaint, Yannick; Becker, Justine; Najjar, Hiba; Renier, Alix; Choukour, Myriam; Buisson, Anne; Devos, Corinne; Epstein, Jonathan; Peyrin Biroulet, Laurent; Guillemin, Francis
Format: Article
Language: English
Online access: Full text
Description
Summary: Messages from an Internet forum are raw material that emerges in a natural setting (i.e., not induced by a research situation). The FLARE-IBD project aimed to collect messages posted by patients in an Internet forum and to conduct a machine-learning study (data analysis/language processing) in order to develop a patient-reported outcome measuring flare in inflammatory bowel disease that meets international requirements. We used web-based and machine-learning approaches in the following steps. 1) Web scraping was used to collect all available posts in an Internet forum (23 656 messages) and to extract metadata from the forum. 2) Twenty patients were randomly assigned 50 extracted messages; participants indicated whether or not each message corresponded to the flare phenomenon (labeling). If it did, participants were asked to identify excerpts from the text that they considered significant flare markers (annotation). 3) The set of annotated messages underwent a vocabulary analysis. The flare phenomenon was circumscribed by identifying 20 surrogate flare markers, classified into five dimensions with their frequency within the extracted labeled data: impact on life, symptoms, extra-intestinal manifestations, drugs, and environmental factors. Web-based and machine-learning approaches met international recommendations to inform the content and structure for the development of patient-reported outcomes.
ISSN: 1590-8658, 1878-3562
DOI: 10.1016/j.dld.2021.09.005
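
Steps 2 and 3 of the summary (random assignment of messages for labeling, then counting annotated excerpts) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `assign_messages` and `marker_frequencies` are hypothetical, the sketch assumes that each of the 20 participants receives 50 distinct messages drawn without replacement, and the example excerpts are invented placeholders rather than data from the study.

```python
import random
from collections import Counter


def assign_messages(messages, n_annotators=20, per_annotator=50, seed=0):
    """Randomly give each annotator a disjoint batch of messages.

    Assumption: batches do not overlap; the summary does not say whether
    the same message could be shown to more than one participant.
    """
    rng = random.Random(seed)
    # Sample without replacement, then slice into equal batches.
    drawn = rng.sample(messages, n_annotators * per_annotator)
    return {
        annotator: drawn[annotator * per_annotator:(annotator + 1) * per_annotator]
        for annotator in range(n_annotators)
    }


def marker_frequencies(annotations):
    """Count annotated excerpts among messages labeled as flare.

    `annotations` is a list of (is_flare, excerpts) pairs, where
    `excerpts` is the list of text spans a participant highlighted.
    """
    counts = Counter()
    for is_flare, excerpts in annotations:
        if is_flare:
            # Normalize case and whitespace so identical markers are merged.
            counts.update(excerpt.strip().lower() for excerpt in excerpts)
    return counts


# Placeholder corpus with the same size as the scraped forum (23 656 posts).
messages = [f"message {i}" for i in range(23656)]
assignment = assign_messages(messages)

# Invented annotations, for illustration only.
toy_annotations = [
    (True, ["abdominal pain", "fatigue"]),
    (False, []),
    (True, ["Abdominal pain "]),
]
freqs = marker_frequencies(toy_annotations)
```

In this sketch, `freqs` maps each normalized excerpt to the number of times it was highlighted in a flare-labeled message. Grouping such excerpts into the 20 markers and five dimensions reported in the summary would still require the vocabulary analysis described there.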