Identifying HIV sequences that escape antibody neutralization using random forests and collaborative targeted learning

Recent studies have indicated that it is possible to protect individuals from HIV infection using passive infusion of monoclonal antibodies. However, in order for monoclonal antibodies to confer robust protection, the antibodies must be capable of neutralizing many possible strains of the virus. Thi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Journal of causal inference 2022-10, Vol.10 (1), p.280-295
Hauptverfasser: Jin, Yutong, Benkeser, David
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Recent studies have indicated that it is possible to protect individuals from HIV infection using passive infusion of monoclonal antibodies. However, in order for monoclonal antibodies to confer robust protection, the antibodies must be capable of neutralizing many possible strains of the virus. This is particularly challenging in the context of a highly diverse pathogen like HIV. It is therefore of great interest to leverage existing observational data sources to discover antibodies that are able to neutralize HIV viruses residues where existing antibodies show modest protection. Such information feeds directly into the clinical trial pipeline for monoclonal antibody therapies by providing information on (i) whether and to what extent combinations of antibodies can generate superior protection and (ii) strategies for analyzing past clinical trials to identify evidence of antibody resistance. These observational data include genetic features of many diverse HIV genetic sequences, as well as measures of antibody resistance. The statistical learning problem we are interested in is developing statistical methodology that can be used to analyze these data to identify important genetic features that are significantly associated with antibody resistance. This is a challenging problem owing to the high-dimensional and strongly correlated nature of the genetic sequence data. To overcome these challenges, we propose an outcome-adaptive, collaborative targeted minimum loss-based estimation approach using random forests. We demonstrate simulation that the approach enjoys important statistical benefits over existing approaches in terms of bias, mean squared error, and type I error. We apply the approach to the Compile, Analyze, and Tally Nab Panels database to identify AA positions that are potentially causally related to resistance to neutralization by several different antibodies.
ISSN:2193-3685
2193-3685
DOI:10.1515/jci-2021-0053