Estimating substance use disparities across intersectional social positions using machine learning: An application of group-lasso interaction network
Published in: | Psychology of addictive behaviors 2024-06 |
---|---|
Main authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | An aim of quantitative intersectional research is to model the joint impact of multiple social positions on health risk behaviors. Although moderated multiple regression is frequently used to pursue intersectional research hypotheses, such parametric approaches may produce unreliable effect estimates due to data sparsity and high dimensionality. Machine learning provides viable alternatives, offering greater flexibility in evaluating many candidate interactions amid sparse data conditions, yet remains rarely employed. This study introduces group-lasso interaction network (glinternet), a novel machine learning approach involving hierarchical regularization, to assess intersectional differences in substance use prevalence.
Utilizing variable selection and parameter stabilization functionality for main and interaction effects, glinternet was employed to examine two-way interactions between three primary social positions (gender, sexual orientation, and race) predicting heavy episodic drinking, cannabis use, and cigarette use prevalence. Analyses were conducted using the All of Us Research Program (N = 283,403), a national sample with high representation from populations historically underrepresented in biomedical research. Results were replicated using holdout cross-validation and compared against logistic regression estimates.
Glinternet prevalence estimates were more stable across discovery and replication samples relative to logistic regression, particularly among sparsely represented groups. Prevalence estimates for cigarette and cannabis use were elevated among sexual minority and White cisgender women compared to heterosexual and non-White women, respectively.
Glinternet may improve upon traditional moderated multiple regression methods for pursuing intersectional hypotheses by improving model parsimony and parameter stability, providing novel means for quantifying health disparities among intersectional social positions. (PsycInfo Database Record (c) 2024 APA, all rights reserved). |
---|---|
ISSN: | 0893-164X, 1939-1501 |
DOI: | 10.1037/adb0001020 |
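
The abstract describes evaluating main effects and two-way interactions among three categorical social positions. As a rough illustration only, the sketch below enumerates the candidate dummy-coded terms such a model would consider. The category levels, the reference-level coding, and the function names `design_columns` and `encode` are hypothetical and are not taken from the article; the code does not perform the group-lasso fit itself.

```python
from itertools import combinations, product

# Hypothetical levels: the abstract names the three positions but not their coding.
LEVELS = {
    "gender": ["cis_man", "cis_woman", "gender_minority"],
    "sexual_orientation": ["heterosexual", "sexual_minority"],
    "race": ["white", "non_white"],
}


def design_columns(levels):
    """List main-effect and two-way interaction dummy terms.

    The first level of each factor is treated as the reference category
    (an assumption made for this sketch).
    """
    main = [(factor, lv) for factor, lvs in levels.items() for lv in lvs[1:]]
    inter = []
    for f1, f2 in combinations(levels, 2):
        for l1, l2 in product(levels[f1][1:], levels[f2][1:]):
            inter.append(((f1, l1), (f2, l2)))
    return main, inter


def encode(person, main, inter):
    """Encode one respondent as a 0/1 row over main and interaction terms."""
    row = [1 if person[f] == lv else 0 for f, lv in main]
    row += [
        1 if person[f1] == l1 and person[f2] == l2 else 0
        for (f1, l1), (f2, l2) in inter
    ]
    return row


main_terms, interaction_terms = design_columns(LEVELS)
example_row = encode(
    {"gender": "cis_woman", "sexual_orientation": "sexual_minority", "race": "white"},
    main_terms,
    interaction_terms,
)
```

Even with only three positions, the number of interaction columns grows with the product of the level counts, and some cells contain few respondents. That sparsity is the condition under which the abstract reports regularized estimates being more stable than unpenalized logistic regression.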