BNPA: An R package to learn path analysis input models from a data set semi-automatically using Bayesian networks

Bibliographic details
Published in: Knowledge-based systems 2021-07, Vol.223, p.107042, Article 107042
Main authors: Carvalho, Elias Cesar Araujo de, Vissoci, Joao Ricardo Nickenig, Andrade, Luciano de, Machado, Wagner de Lara, Paraiso, Emerson Cabrera, Nievola, Julio Cesar
Format: Article
Language: English
Subjects:
Online access: Full text
Description
Abstract: Epidemiologists constantly search for methodologies that help them better understand how diseases work, and populations call for such improvements so that these diseases can be combated more effectively. Several authors in the literature argue that epidemiologists should be able to develop causal models. In this area, the technique of structural equation models (SEM) has stood out in scientific research. Although SEM has been widely used in several research areas, it has been little explored by epidemiologists. Despite its evolution and efficiency, SEM has a gap when it comes to discovering causal relationships. To fill this gap, this study developed an R package called BNPA, whose methodology combines Bayesian network structural learning (BNSL) algorithms, which learn structure from data, with path analysis (PA), a subarea of SEM. BNPA includes pre-processing functions. Its main algorithm semi-automatically creates, from a data set, an input model to start the PA and generates information for analyzing the PA's performance. An analysis of the main predictors of cardiovascular disease was performed using BNPA with data from the Canadian Community Health Survey (CCHS). Multiple linear regression (MR) was used as the gold-standard methodology; the results of BNPA matched 85% of the MR results. In conclusion, BNPA is efficient and can benefit researchers, mainly novices, by enabling them to build PA models from data. Furthermore, statisticians and PA experts will have more time to support these researchers instead of creating an initial model.
ISSN: 0950-7051, 1872-7409
DOI:10.1016/j.knosys.2021.107042
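
The abstract describes turning a Bayesian network structure learned from data into an input model for path analysis. The following is a minimal sketch of that general idea, not BNPA's implementation: it assumes the learned structure is already available as a list of directed (parent, child) edges and writes one regression line per child variable in the "child ~ parent1 + parent2" style used by lavaan-like model syntax. The function name, the edge list, and the variable names are hypothetical and chosen only for illustration; BNPA itself is an R package, and its actual interface is not shown here.

```python
from collections import defaultdict


def edges_to_path_model(edges):
    """Convert directed (parent, child) edges into lavaan-style regression lines.

    Assumption: the edges come from some Bayesian network structure learning
    step and form a directed acyclic graph. This helper is illustrative and
    is not part of BNPA.
    """
    parents = defaultdict(list)
    for parent, child in edges:
        # Skip duplicate edges so each parent appears once per equation.
        if parent not in parents[child]:
            parents[child].append(parent)
    # One regression equation per variable that has at least one parent.
    lines = [f"{child} ~ {' + '.join(ps)}" for child, ps in parents.items()]
    return "\n".join(lines)


# Hypothetical edges for illustration only (not results from the paper).
edges = [
    ("age", "bmi"),
    ("age", "hypertension"),
    ("bmi", "hypertension"),
    ("hypertension", "cvd"),
]

model = edges_to_path_model(edges)
print(model)
```

The resulting text could then be handed to a path analysis tool as a starting model, which a researcher would still review and adjust; that review step is what makes the workflow semi-automatic rather than fully automatic.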