A systematic assessment of deep learning methods for drug response prediction: From in vitro to clinical applications

Bibliographic details
Main authors: Shen, Bihan; Feng, Fangyoumin; Li, Kunshi; Lin, Ping; Ma, Liangxiao; Li, Hong
Format: Dataset
Language: eng
Description
Summary:

## GDSC dataset

**GDSC_EXP.csv** GDSC gene expression profiles for 966 cancer cell lines. Each column is a cell line, identified by its name and tissue collection site, and each row is a gene, identified by its HGNC symbol.

**GDSC_MUT.csv** GDSC gene mutation profiles for 966 cancer cell lines. Each column is a cell line, identified by its name and tissue collection site, and each row is a gene, identified by its HGNC symbol. Mutated genes are coded as 1 and wild type as 0.

**GDSC_CNV.csv** GDSC copy number variation profiles for 966 cancer cell lines. Each column is a cell line, identified by its name and tissue collection site, and each row is a gene, identified by its HGNC symbol. Copy-neutral genes are coded as 0 and deletions or amplifications as 1.

**GDSC_DR.csv** GDSC drug response data for 966 cancer cell lines and 282 drugs, given as the natural logarithm of the IC50 readout. The first column gives the cell line name and tissue collection site, the second the drug name, and the third the drug response readout.

**GDSC_DrugAnnotation.csv** GDSC annotations for the 282 drugs: drug name, PubChem CID, PubChem canonical SMILES, RDKit canonical SMILES, Target Pathway, standard deviation, bimodality coefficient and density coverage.

## TCGA dataset

**TCGA_EXP.csv** TCGA gene expression profiles. Each column is a patient, identified by TCGA patient ID, and each row is a gene, identified by its HGNC symbol.

**TCGA_MUT.csv** TCGA gene mutation profiles. Each column is a patient, identified by TCGA patient ID, and each row is a gene, identified by its HGNC symbol. Mutated genes are coded as 1 and wild type as 0.

**TCGA_CNV.csv** TCGA copy number variation profiles. Each column is a patient, identified by TCGA patient ID, and each row is a gene, identified by its HGNC symbol. Copy-neutral genes are coded as 0 and deletions or amplifications as 1.

**TCGA_DR.csv** TCGA clinical response data. The first column gives the TCGA patient ID, the second the drug name, the third the clinical response category, the fourth the cancer type, and the last the clinical label (responder or non-responder).

## PMID17185464 (Bortezomib) dataset

**PMID17185464_EXP.csv** Bortezomib clinical trial gene expression profiles, where each column represents a patient in
DOI:10.5281/zenodo.7264572
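
The file layouts described in the summary (gene-by-sample matrices plus a long-format response table) can be joined into per-pair training examples. The following is a minimal sketch using only the Python standard library, not the authors' pipeline. It assumes that each matrix file has a header row of sample identifiers with the gene symbol in the first column, and that the response file has a header row followed by sample, drug and ln(IC50) columns. The function names, the toy identifiers (`CELL_A_lung`, `DrugX`) and all toy values are hypothetical and are not taken from the dataset.

```python
import csv
import io
import math


def read_feature_matrix(handle):
    """Read a gene-by-sample CSV (rows = genes, columns = samples).

    Returns {sample_id: {gene_symbol: value}}. Assumes the header row holds
    the sample IDs and the first column holds the gene symbol.
    """
    reader = csv.reader(handle)
    header = next(reader)
    sample_ids = header[1:]
    per_sample = {sample: {} for sample in sample_ids}
    for row in reader:
        gene, values = row[0], row[1:]
        for sample, value in zip(sample_ids, values):
            per_sample[sample][gene] = float(value)
    return per_sample


def read_drug_response(handle):
    """Read a long-format table with columns: sample, drug, ln(IC50).

    Assumes one header row; the actual column names are not relied upon.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    return [(row[0], row[1], float(row[2])) for row in reader]


def build_examples(features, responses):
    """Pair each (sample, drug) response with that sample's feature dict."""
    examples = []
    for sample, drug, ln_ic50 in responses:
        if sample not in features:
            continue  # no profile available for this sample
        examples.append({
            "sample": sample,
            "drug": drug,
            "ln_ic50": ln_ic50,
            "ic50": math.exp(ln_ic50),  # undo the natural logarithm
            "features": features[sample],
        })
    return examples


# Toy stand-ins for a matrix file and a response file (made-up values).
toy_matrix = io.StringIO(
    "gene,CELL_A_lung,CELL_B_skin\n"
    "TP53,1,0\n"
    "KRAS,0,0\n"
)
toy_response = io.StringIO(
    "cell_line,drug,ln_ic50\n"
    "CELL_A_lung,DrugX,0.0\n"
    "CELL_B_skin,DrugX,1.5\n"
    "CELL_C_bone,DrugX,2.0\n"
)

features = read_feature_matrix(toy_matrix)
responses = read_drug_response(toy_response)
examples = build_examples(features, responses)
print(len(examples), examples[0]["features"])
```

To read the real files, the `io.StringIO` objects would be replaced by handles such as `open("GDSC_MUT.csv", newline="")`. Whether those files actually carry the header row assumed here should be checked against the downloaded data first.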