Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test

Background Microarray technology provides information about hundreds and thousands of gene-expression data in a single experiment. To search for disease-related genes, researchers test for those genes that are differentially expressed between the case subjects and the control subjects. Methods The a...

Bibliographic details
Published in: International journal of epidemiology 2010-12, Vol.39 (6), p.1597-1604
Main authors: Hsu, Chun-Lun; Lee, Wen-Chung
Format: Article
Language: eng
Online access: Full text
container_end_page 1604
container_issue 6
container_start_page 1597
container_title International journal of epidemiology
container_volume 39
creator Hsu, Chun-Lun
Lee, Wen-Chung
description Background Microarray technology provides information about hundreds and thousands of gene-expression data in a single experiment. To search for disease-related genes, researchers test for those genes that are differentially expressed between the case subjects and the control subjects. Methods The authors propose a new test, the ‘half Student’s t-test’, specifically for detecting differentially expressed genes in heterogeneous diseases. Monte–Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. Power of the half Student’s t is higher than that of the conventional ‘pooled’ Student’s t when there is heterogeneity in the disease under study. The power gain by using the half Student’s t can reach ∼10% when the standard deviation of the case group is 50% larger than that of the control group. Results Application to a colon cancer data set reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student’s t can detect 344 differentially expressed genes, whereas the pooled Student’s t can detect only 65 genes. Alternatively, if only 50 genes are to be selected, the FDR for the pooled Student’s t has to be set at 0.0320 (false positive rate of ∼3%), but for the half Student’s t, it can be as low as 0.0001 (false positive rate of about one per ten thousand). Conclusions The half Student’s t-test is to be recommended for the detection of differentially expressed genes in heterogeneous diseases.
doi_str_mv 10.1093/ije/dyq093
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_860378528</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>815547112</sourcerecordid><originalsourceid>FETCH-LOGICAL-c454t-4170ae3f8a2ffbe6b759b28e4dfaa4b63f580b0d5a205b0b0dc3420101fd6e823</originalsourceid><addsrcrecordid>eNqF0cFO3DAQAFCrKipb6KUfUOVSVaoU1o7txDlSoFBpJVYCJMTFdZIxeMlmF48jsbf-Br_Hl-B0FzjuyWPN88zIQ8hXRg8YLfnYzWDcrB5i-IGMmMhFynMlP5IR5ZSmsijYLvmMOKOUCSHKT2Q3o5KVnMsR-XsMAergutukcdaChy4407arBB6XHhChSW6hA0xcl9xF6xfDddFj9AgGY6bH4fmdaW1yEfomVnj-94RJSANg2Cc71rQIXzbnHrn6fXJ5dJZOzk__HB1O0lpIEVLBCmqAW2UyayvIq0KWVaZANNYYUeXcSkUr2kgTZ6-GqOYio4wy2-SgMr5HfqzrLv3ioY-N9dxhDW1r_k-rVU55oWSmtkupRJYxIbdLJqUoGBu6_1zL2i8QPVi99G5u_Eozqocl6bgkvV5SxN82ZftqDs0bfd1KBN83wGAdv9Wbrnb47njOpZBldOnaOQzw-JY3_l7nBS-kPru-0ZOpmnJ-80sr_gJtb6wI</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>815547112</pqid></control><display><type>article</type><title>Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test</title><source>MEDLINE</source><source>Oxford University Press Journals All Titles (1996-Current)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Alma/SFX Local Collection</source><creator>Hsu, Chun-Lun ; Lee, Wen-Chung</creator><creatorcontrib>Hsu, Chun-Lun ; Lee, Wen-Chung</creatorcontrib><description>Background Microarray technology provides information about hundreds and thousands of gene-expression data in a single experiment. To search for disease-related genes, researchers test for those genes that are differentially expressed between the case subjects and the control subjects. Methods The authors propose a new test, the ‘half Student’s t-test’, specifically for detecting differentially expressed genes in heterogeneous diseases. 
Monte–Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. Power of the half Student’s t is higher than that of the conventional ‘pooled’ Student’s t when there is heterogeneity in the disease under study. The power gain by using the half Student’s t can reach ∼10% when the standard deviation of the case group is 50% larger than that of the control group. Results Application to a colon cancer data reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student’s t can detect 344 differentially expressed genes, whereas the pooled Student’s t can detect only 65 genes. Or alternatively, if only 50 genes are to be selected, the FDR for the pooled Student’s t has to be set at 0.0320 (false positive rate of ∼3%), but for the half Student’s t, it can be at as low as 0.0001 (false positive rate of about one per ten thousands). Conclusions The half Student’s t-test is to be recommended for the detection of differentially expressed genes in heterogeneous diseases.</description><identifier>ISSN: 0300-5771</identifier><identifier>EISSN: 1464-3685</identifier><identifier>DOI: 10.1093/ije/dyq093</identifier><identifier>PMID: 20519335</identifier><identifier>CODEN: IJEPBF</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Analysis. Health state ; Biological and medical sciences ; Colonic Neoplasms - epidemiology ; Colonic Neoplasms - genetics ; Computer Simulation ; epidemiological methods ; Epidemiology ; Gene Expression ; General aspects ; heterogeneous disease ; Humans ; Medical sciences ; Models, Statistical ; Monte Carlo Method ; Public health. Hygiene ; Public health. 
Hygiene-occupational medicine ; Statistics, Nonparametric ; Student’s t-test</subject><ispartof>International journal of epidemiology, 2010-12, Vol.39 (6), p.1597-1604</ispartof><rights>2015 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c454t-4170ae3f8a2ffbe6b759b28e4dfaa4b63f580b0d5a205b0b0dc3420101fd6e823</citedby><cites>FETCH-LOGICAL-c454t-4170ae3f8a2ffbe6b759b28e4dfaa4b63f580b0d5a205b0b0dc3420101fd6e823</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=23635459$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/20519335$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hsu, Chun-Lun</creatorcontrib><creatorcontrib>Lee, Wen-Chung</creatorcontrib><title>Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test</title><title>International journal of epidemiology</title><addtitle>Int J Epidemiol</addtitle><description>Background Microarray technology provides information about hundreds and thousands of gene-expression data in a single experiment. To search for disease-related genes, researchers test for those genes that are differentially expressed between the case subjects and the control subjects. Methods The authors propose a new test, the ‘half Student’s t-test’, specifically for detecting differentially expressed genes in heterogeneous diseases. Monte–Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. 
Power of the half Student’s t is higher than that of the conventional ‘pooled’ Student’s t when there is heterogeneity in the disease under study. The power gain by using the half Student’s t can reach ∼10% when the standard deviation of the case group is 50% larger than that of the control group. Results Application to a colon cancer data reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student’s t can detect 344 differentially expressed genes, whereas the pooled Student’s t can detect only 65 genes. Or alternatively, if only 50 genes are to be selected, the FDR for the pooled Student’s t has to be set at 0.0320 (false positive rate of ∼3%), but for the half Student’s t, it can be at as low as 0.0001 (false positive rate of about one per ten thousands). Conclusions The half Student’s t-test is to be recommended for the detection of differentially expressed genes in heterogeneous diseases.</description><subject>Analysis. Health state</subject><subject>Biological and medical sciences</subject><subject>Colonic Neoplasms - epidemiology</subject><subject>Colonic Neoplasms - genetics</subject><subject>Computer Simulation</subject><subject>epidemiological methods</subject><subject>Epidemiology</subject><subject>Gene Expression</subject><subject>General aspects</subject><subject>heterogeneous disease</subject><subject>Humans</subject><subject>Medical sciences</subject><subject>Models, Statistical</subject><subject>Monte Carlo Method</subject><subject>Public health. Hygiene</subject><subject>Public health. 
Hygiene-occupational medicine</subject><subject>Statistics, Nonparametric</subject><subject>Student’s t-test</subject><issn>0300-5771</issn><issn>1464-3685</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqF0cFO3DAQAFCrKipb6KUfUOVSVaoU1o7txDlSoFBpJVYCJMTFdZIxeMlmF48jsbf-Br_Hl-B0FzjuyWPN88zIQ8hXRg8YLfnYzWDcrB5i-IGMmMhFynMlP5IR5ZSmsijYLvmMOKOUCSHKT2Q3o5KVnMsR-XsMAergutukcdaChy4407arBB6XHhChSW6hA0xcl9xF6xfDddFj9AgGY6bH4fmdaW1yEfomVnj-94RJSANg2Cc71rQIXzbnHrn6fXJ5dJZOzk__HB1O0lpIEVLBCmqAW2UyayvIq0KWVaZANNYYUeXcSkUr2kgTZ6-GqOYio4wy2-SgMr5HfqzrLv3ioY-N9dxhDW1r_k-rVU55oWSmtkupRJYxIbdLJqUoGBu6_1zL2i8QPVi99G5u_Eozqocl6bgkvV5SxN82ZftqDs0bfd1KBN83wGAdv9Wbrnb47njOpZBldOnaOQzw-JY3_l7nBS-kPru-0ZOpmnJ-80sr_gJtb6wI</recordid><startdate>20101201</startdate><enddate>20101201</enddate><creator>Hsu, Chun-Lun</creator><creator>Lee, Wen-Chung</creator><general>Oxford University Press</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope></search><sort><creationdate>20101201</creationdate><title>Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test</title><author>Hsu, Chun-Lun ; Lee, Wen-Chung</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c454t-4170ae3f8a2ffbe6b759b28e4dfaa4b63f580b0d5a205b0b0dc3420101fd6e823</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Analysis. 
Health state</topic><topic>Biological and medical sciences</topic><topic>Colonic Neoplasms - epidemiology</topic><topic>Colonic Neoplasms - genetics</topic><topic>Computer Simulation</topic><topic>epidemiological methods</topic><topic>Epidemiology</topic><topic>Gene Expression</topic><topic>General aspects</topic><topic>heterogeneous disease</topic><topic>Humans</topic><topic>Medical sciences</topic><topic>Models, Statistical</topic><topic>Monte Carlo Method</topic><topic>Public health. Hygiene</topic><topic>Public health. Hygiene-occupational medicine</topic><topic>Statistics, Nonparametric</topic><topic>Student’s t-test</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hsu, Chun-Lun</creatorcontrib><creatorcontrib>Lee, Wen-Chung</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><jtitle>International journal of epidemiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hsu, Chun-Lun</au><au>Lee, Wen-Chung</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test</atitle><jtitle>International journal of epidemiology</jtitle><addtitle>Int J 
Epidemiol</addtitle><date>2010-12-01</date><risdate>2010</risdate><volume>39</volume><issue>6</issue><spage>1597</spage><epage>1604</epage><pages>1597-1604</pages><issn>0300-5771</issn><eissn>1464-3685</eissn><coden>IJEPBF</coden><abstract>Background Microarray technology provides information about hundreds and thousands of gene-expression data in a single experiment. To search for disease-related genes, researchers test for those genes that are differentially expressed between the case subjects and the control subjects. Methods The authors propose a new test, the ‘half Student’s t-test’, specifically for detecting differentially expressed genes in heterogeneous diseases. Monte–Carlo simulation shows that the test maintains the nominal α level quite well for both normal and non-normal distributions. Power of the half Student’s t is higher than that of the conventional ‘pooled’ Student’s t when there is heterogeneity in the disease under study. The power gain by using the half Student’s t can reach ∼10% when the standard deviation of the case group is 50% larger than that of the control group. Results Application to a colon cancer data reveals that when the false discovery rate (FDR) is controlled at 0.05, the half Student’s t can detect 344 differentially expressed genes, whereas the pooled Student’s t can detect only 65 genes. Or alternatively, if only 50 genes are to be selected, the FDR for the pooled Student’s t has to be set at 0.0320 (false positive rate of ∼3%), but for the half Student’s t, it can be at as low as 0.0001 (false positive rate of about one per ten thousands). Conclusions The half Student’s t-test is to be recommended for the detection of differentially expressed genes in heterogeneous diseases.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>20519335</pmid><doi>10.1093/ije/dyq093</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0300-5771
ispartof International journal of epidemiology, 2010-12, Vol.39 (6), p.1597-1604
issn 0300-5771
1464-3685
language eng
recordid cdi_proquest_miscellaneous_860378528
source MEDLINE; Oxford University Press Journals All Titles (1996-Current); EZB-FREE-00999 freely available EZB journals; Alma/SFX Local Collection
subjects Analysis. Health state
Biological and medical sciences
Colonic Neoplasms - epidemiology
Colonic Neoplasms - genetics
Computer Simulation
epidemiological methods
Epidemiology
Gene Expression
General aspects
heterogeneous disease
Humans
Medical sciences
Models, Statistical
Monte Carlo Method
Public health. Hygiene
Public health. Hygiene-occupational medicine
Statistics, Nonparametric
Student’s t-test
title Detecting differentially expressed genes in heterogeneous diseases using half Student’s t-test
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-22T14%3A53%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Detecting%20differentially%20expressed%20genes%20in%20heterogeneous%20diseases%20using%20half%20Student%E2%80%99s%20t-test&rft.jtitle=International%20journal%20of%20epidemiology&rft.au=Hsu,%20Chun-Lun&rft.date=2010-12-01&rft.volume=39&rft.issue=6&rft.spage=1597&rft.epage=1604&rft.pages=1597-1604&rft.issn=0300-5771&rft.eissn=1464-3685&rft.coden=IJEPBF&rft_id=info:doi/10.1093/ije/dyq093&rft_dat=%3Cproquest_cross%3E815547112%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=815547112&rft_id=info:pmid/20519335&rfr_iscdi=true
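
The abstract in this record compares the proposed test against the conventional ‘pooled’ Student’s t and reports gene counts at a controlled false discovery rate. The record does not define the half Student’s t statistic itself, so the sketch below covers only the comparator: a pooled-variance two-sample t statistic for one gene, followed by the Benjamini-Hochberg step-up procedure as one common way to control the FDR. Whether the authors used Benjamini-Hochberg is an assumption, and the function names `pooled_t` and `benjamini_hochberg` are hypothetical, chosen for illustration only.

```python
import math
from statistics import mean, variance


def pooled_t(cases, controls):
    """Conventional pooled-variance two-sample Student's t statistic.

    Illustrative helper (hypothetical name); this is the comparator test,
    not the half Student's t proposed in the article.
    """
    n1, n0 = len(cases), len(controls)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(cases) + (n0 - 1) * variance(controls)) / (n1 + n0 - 2)
    return (mean(cases) - mean(controls)) / math.sqrt(sp2 * (1 / n1 + 1 / n0))


def benjamini_hochberg(pvalues, q=0.05):
    """Indices of hypotheses rejected by the BH step-up procedure at FDR level q.

    Assumption: BH is used here only as a familiar FDR-controlling procedure;
    the record does not say which procedure the authors applied.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    # Find the largest rank whose p-value lies under its step-up threshold.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])
```

In a gene-expression setting, `pooled_t` would be evaluated once per gene, each statistic converted to a p-value against a t distribution with n1 + n0 - 2 degrees of freedom, and the resulting list passed to `benjamini_hochberg` with `q=0.05` to obtain the selected genes.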