A Novel Biosignature for Potential Stratification and Elucidation of Newly Diagnosed Multiple Myeloma Patients at-Risk
Main authors: | |
---|---|
Format: | Conference proceeding |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Multiple myeloma (MM) patients have highly variable overall survival (OS), ranging from a few weeks to more than ten years. Discovering an early biosignature that separates short-term from long-term survivors offers the prospect of treating at-risk patients. Machine learning (ML) algorithms are currently being tested to discover biosignatures, but the resulting biosignatures consist of several features, which makes their implementation in healthcare an arduous task.
Here, we have developed an algorithm called AlgoOS to stratify newly diagnosed MM (NDMM) patients by integrating NetRank, a variation of the Google PageRank algorithm, with ML algorithms, the first approach of its kind in MM. We also built a dataset of NDMM patients (n=31) consisting of transcriptomic (28,256 features), clinical (13 features), biochemical (12 features), and fluorescent in situ hybridization (FISH) (3 features) data. Based on domain knowledge, an OS cut-off of 46 months was used to separate short-term from long-term survivors. Finally, AlgoOS was applied to this dataset, and a biosignature predictive of NDMM patient stratification was extracted. The prediction model's performance was evaluated using accuracy, precision, and F1-score with 5-fold cross-validation. R was used to build a transcription-factor-gene-regulatory network, while all other analyses, including ML, were performed using Python.
In the first step of AlgoOS, all transcriptomic features were ranked by NetRank score, and the top 20 were selected for further processing. This ranking is similar to the web page ranking performed by the Google PageRank algorithm, except that NetRank also takes into account the correlation of each feature with OS. In detail, we calculated the NetRank scores of transcriptomic features by building a transcription-factor-gene-regulatory network using the JASPAR-v2022 database. Each transcription factor motif in JASPAR was matched to the putative promoter region (1000 base pairs upstream) of genes in the hg38 human reference genome. Furthermore, correlations of transcriptomic features with OS were computed, and the NetRank score was calculated by iteratively optimizing a damping factor parameter d (see Figure 1). We trained a support vector machine (SVM) on the top 10 to 20,000 transcriptomic features ranked by NetRank score, trained it separately on the same number of randomly selected features, and calculated the performance scores of both. The models were chosen by the criteria 1) precision and accuracy ≥ 80%, 2) kernel = non-linear, and 3) C |
---|---|
ISSN: | 0006-4971 1528-0020 |
DOI: | 10.1182/blood-2023-179999 |
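
The NetRank step described in the summary can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the code below assumes one common NetRank-style formulation: each feature's score is a mix of its own absolute correlation with OS (weight 1 - d) and the score passed along network edges from its neighbours (weight d). The function names `pearson`, `netrank`, and `top_k`, the toy expression values, the edge direction, and the choice d = 0.5 are all hypothetical and are not taken from AlgoOS.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def netrank(edges, relevance, d=0.5, n_iter=100, tol=1e-10):
    """PageRank-style scores biased toward features that correlate with OS.

    edges     : dict mapping a source node to the nodes that receive its score
    relevance : dict mapping every node to a non-negative score (here |corr with OS|)
    d         : damping factor in [0, 1); d = 0 returns the relevance unchanged
    """
    nodes = list(relevance)
    rank = dict(relevance)
    for _ in range(n_iter):
        # Each node keeps a (1 - d) share of its own relevance ...
        new = {node: (1.0 - d) * relevance[node] for node in nodes}
        # ... and receives a d share of its in-neighbours' current scores,
        # split evenly over each neighbour's outgoing edges.
        for src, targets in edges.items():
            if not targets:
                continue
            share = d * rank[src] / len(targets)
            for tgt in targets:
                new[tgt] += share
        delta = max(abs(new[node] - rank[node]) for node in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def top_k(scores, k):
    """Names of the k highest-scoring features."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 3 features measured in 5 patients, with OS in months.
expression = {
    "TF1": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_A": [1.0, 2.0, 2.5, 4.0, 5.0],
    "GENE_B": [3.0, 3.1, 2.9, 3.0, 3.2],
}
os_months = [10.0, 20.0, 40.0, 60.0, 90.0]

# Assumed edge direction: target genes pass score to their regulator.
edges = {"GENE_A": ["TF1"], "GENE_B": ["TF1"]}

relevance = {g: abs(pearson(v, os_months)) for g, v in expression.items()}
scores = netrank(edges, relevance, d=0.5)
selected = top_k(scores, 2)
print(selected)
```

In this sketch the damping factor is fixed. The summary describes optimizing d iteratively, which could be approximated by repeating the ranking over a grid of d values and keeping the value whose top-ranked features give the best cross-validated classifier.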