Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors

Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six i...

Detailed description

Bibliographic details
Published in:	Statistical applications in genetics and molecular biology 2005-05, Vol.4 (1), p.1120-1120
Main authors:	Feten, Guri, Almoy, Trygve, Aastveit, Are H
Format:	Article
Language:	eng
Online access:	Full text

container_end_page	1120
container_issue	1
container_start_page	1120
container_title	Statistical applications in genetics and molecular biology
container_volume	4
creator	Feten, Guri Almoy, Trygve Aastveit, Are H
description	Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six imputation methods. A comparative study of the methods was performed on data from mice and data from the bacterium Enterococcus faecalis, and a linear mixed model was used to test for differences between the methods. The study showed that different methods' capability to predict is dependent on the data, hence the ideal choice of method and number of components are different for each data set. For data with correlation structure, methods based on K-nearest neighbours seemed to be best, while for data without correlation structure, using the average of the gene was to be preferred.
format	Article
fullrecord	<record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_miscellaneous_29002745</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>29002745</sourcerecordid><originalsourceid>FETCH-proquest_miscellaneous_290027453</originalsourceid><addsrcrecordid>eNqNjssKwjAQRbNQsD7-YVbuCn0qrqXipuBC3UkJzVQjMaOZVPTvDegHuLpwOfcxEFFaFkW8SNNyJMbM1yTJ0ixPInHaOVS69ZosUAe1Ztb2DEdpemTQNjitI-mcfIO0Cg6MX-6FCmpSaBg8QfUMAekR_AXhV0mOp2LYScM4--lEzDfVfr2N744eYcA3N80tGiMtUs9NtgrHlkWZ_w1-AO7aRbM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>29002745</pqid></control><display><type>article</type><title>Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors</title><source>De Gruyter journals</source><creator>Feten, Guri ; Almoy, Trygve ; Aastveit, Are H</creator><creatorcontrib>Feten, Guri ; Almoy, Trygve ; Aastveit, Are H</creatorcontrib><description>Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six imputation methods. A comparative study of the methods was performed on data from mice and data from the bacterium Enterococcus faecalis, and a linear mixed model was used to test for differences between the methods. The study showed that different methods' capability to predict is dependent on the data, hence the ideal choice of method and number of components are different for each data set. 
For data with correlation structure methods based on K-nearest neighbours seemed to be best, while for data without correlation structure using the average of the gene was to be preferred.</description><identifier>ISSN: 1544-6115</identifier><language>eng</language><ispartof>Statistical applications in genetics and molecular biology, 2005-05, Vol.4 (1), p.1120-1120</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780</link.rule.ids></links><search><creatorcontrib>Feten, Guri</creatorcontrib><creatorcontrib>Almoy, Trygve</creatorcontrib><creatorcontrib>Aastveit, Are H</creatorcontrib><title>Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors</title><title>Statistical applications in genetics and molecular biology</title><description>Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six imputation methods. A comparative study of the methods was performed on data from mice and data from the bacterium Enterococcus faecalis, and a linear mixed model was used to test for differences between the methods. The study showed that different methods' capability to predict is dependent on the data, hence the ideal choice of method and number of components are different for each data set. 
For data with correlation structure methods based on K-nearest neighbours seemed to be best, while for data without correlation structure using the average of the gene was to be preferred.</description><issn>1544-6115</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><recordid>eNqNjssKwjAQRbNQsD7-YVbuCn0qrqXipuBC3UkJzVQjMaOZVPTvDegHuLpwOfcxEFFaFkW8SNNyJMbM1yTJ0ixPInHaOVS69ZosUAe1Ztb2DEdpemTQNjitI-mcfIO0Cg6MX-6FCmpSaBg8QfUMAekR_AXhV0mOp2LYScM4--lEzDfVfr2N744eYcA3N80tGiMtUs9NtgrHlkWZ_w1-AO7aRbM</recordid><startdate>20050505</startdate><enddate>20050505</enddate><creator>Feten, Guri</creator><creator>Almoy, Trygve</creator><creator>Aastveit, Are H</creator><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20050505</creationdate><title>Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors</title><author>Feten, Guri ; Almoy, Trygve ; Aastveit, Are H</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_miscellaneous_290027453</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Feten, Guri</creatorcontrib><creatorcontrib>Almoy, Trygve</creatorcontrib><creatorcontrib>Aastveit, Are H</creatorcontrib><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Statistical applications in genetics and molecular 
biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Feten, Guri</au><au>Almoy, Trygve</au><au>Aastveit, Are H</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors</atitle><jtitle>Statistical applications in genetics and molecular biology</jtitle><date>2005-05-05</date><risdate>2005</risdate><volume>4</volume><issue>1</issue><spage>1120</spage><epage>1120</epage><pages>1120-1120</pages><issn>1544-6115</issn><abstract>Gene expression microarray experiments generate data sets with multiple missing expression values. In some cases, analysis of gene expression requires a complete matrix as input. Either genes with missing values can be removed, or the missing values can be replaced using prediction. We propose six imputation methods. A comparative study of the methods was performed on data from mice and data from the bacterium Enterococcus faecalis, and a linear mixed model was used to test for differences between the methods. The study showed that different methods' capability to predict is dependent on the data, hence the ideal choice of method and number of components are different for each data set. For data with correlation structure methods based on K-nearest neighbours seemed to be best, while for data without correlation structure using the average of the gene was to be preferred.</abstract></addata></record>
fulltext	fulltext
identifier	ISSN: 1544-6115
ispartof	Statistical applications in genetics and molecular biology, 2005-05, Vol.4 (1), p.1120-1120
issn	1544-6115
language	eng
recordid	cdi_proquest_miscellaneous_29002745
source	De Gruyter journals
title	Prediction of Missing Values in Microarray and Use of Mixed Models to Evaluate the Predictors
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-21T15%3A13%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Prediction%20of%20Missing%20Values%20in%20Microarray%20and%20Use%20of%20Mixed%20Models%20to%20Evaluate%20the%20Predictors&rft.jtitle=Statistical%20applications%20in%20genetics%20and%20molecular%20biology&rft.au=Feten,%20Guri&rft.date=2005-05-05&rft.volume=4&rft.issue=1&rft.spage=1120&rft.epage=1120&rft.pages=1120-1120&rft.issn=1544-6115&rft_id=info:doi/&rft_dat=%3Cproquest%3E29002745%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=29002745&rft_id=info:pmid/&rfr_iscdi=true
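
The abstract in this record contrasts two families of predictors for missing expression values: the average of the gene, and methods based on K-nearest neighbours. The record does not describe the six proposed methods in detail, so the following is only a minimal sketch of those two general ideas, not the authors' implementation. The function names, the use of None for a missing value, the root-mean-square distance over shared columns, and the toy matrix are all assumptions made for illustration.

```python
import math


def row_mean_impute(matrix):
    """Fill each missing value (None) with the mean of the observed
    values in the same row (one row = one gene). Hypothetical helper."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        # Assumption: a row with no observed values falls back to 0.0.
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled


def knn_impute(matrix, k=2):
    """Fill each missing value with the average of that column's value
    in the k most similar rows. Similarity is an assumed choice here:
    root-mean-square difference over columns observed in both rows."""
    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                # A neighbour must be another row that has column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
            else:
                # No usable neighbour: fall back to the row mean.
                observed = [v for v in row if v is not None]
                filled[i][j] = (sum(observed) / len(observed)
                                if observed else 0.0)
    return filled


if __name__ == "__main__":
    # Toy matrix (rows = genes, columns = arrays); values are made up.
    genes = [
        [1.0, 2.0, 3.0],
        [1.0, 2.0, None],
        [1.1, 2.1, 3.2],
        [5.0, 5.0, 5.0],
    ]
    print("row mean:", row_mean_impute(genes)[1])
    print("KNN (k=2):", knn_impute(genes, k=2)[1])
```

In the toy matrix the second gene closely tracks the first and third, so the neighbour-based estimate borrows from them, while the row mean uses only the gene's own two observed values. This mirrors the abstract's remark that neighbour-based methods seemed best when the data have correlation structure.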