Derivation of minimum best sample size from microarray data sets: A Monte Carlo approach


Bibliographic details
Main authors: Chengpeng Bi, Becker, M., Leeder, S.
Format: Conference proceeding
Language: English
Description
Summary: NCBI has been accumulating a large repository of microarray data sets, the Gene Expression Omnibus (GEO). GEO is a valuable resource for pursuing a wide range of biological and pathological questions. The question we ask here is: given a set of gene signatures and a classifier, what is the best minimum sample size in a clinical microarray study that can effectively distinguish different types of patient response to a therapeutic drug? The question is difficult to answer because the sample size of most microarray experiments stored in GEO is very limited. This paper presents a Monte Carlo approach to estimating, by simulation, the best minimum microarray sample size from the available data sets. A Support Vector Machine (SVM) is used as the classifier to compute prediction accuracy at different sample sizes. A logistic function is then fitted to the relationship between sample size and accuracy, from which a theoretical minimum sample size can be derived.
DOI:10.1109/CIBCB.2011.5948461
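
The procedure outlined in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not give the resampling scheme, the exact logistic form, or any fitted values, so everything below is an assumption made for illustration. The names `monte_carlo_accuracy`, `logistic`, `min_sample_size`, the `evaluate` callback (which would wrap an SVM in practice), and the parameters `upper`, `rate`, and `midpoint` are hypothetical. The sketch assumes a three-parameter logistic curve, accuracy(n) = upper / (1 + exp(-rate * (n - midpoint))), and inverts it to find the smallest sample size that reaches a target accuracy.

```python
import math
import random


def monte_carlo_accuracy(samples, labels, n, evaluate, trials=100, seed=0):
    """Average a classifier's accuracy over random subsamples of size n.

    `evaluate` is a hypothetical callback: it receives the subsampled
    data and labels and returns an accuracy in [0, 1]. In the setting
    described above it would train and score an SVM.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    total = 0.0
    for _ in range(trials):
        # Assumption: draw n samples without replacement in each trial.
        chosen = rng.sample(indices, n)
        total += evaluate([samples[i] for i in chosen],
                          [labels[i] for i in chosen])
    return total / trials


def logistic(n, upper, rate, midpoint):
    """Assumed logistic model of accuracy as a function of sample size n."""
    return upper / (1.0 + math.exp(-rate * (n - midpoint)))


def min_sample_size(target, upper, rate, midpoint):
    """Smallest integer n whose modelled accuracy reaches `target`."""
    if not 0.0 < target < upper:
        raise ValueError("target must lie strictly between 0 and the plateau")
    # Solve target = upper / (1 + exp(-rate * (n - midpoint))) for n.
    n = midpoint - math.log(upper / target - 1.0) / rate
    return max(1, math.ceil(n))
```

With made-up curve parameters (plateau 0.95, rate 0.2, midpoint 10), `min_sample_size(0.9, 0.95, 0.2, 10)` solves for the point where the modelled accuracy first reaches 0.9. In a real analysis the three parameters would come from fitting the curve to the Monte Carlo accuracies at each sample size.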