Genome‐enabled prediction through machine learning methods considering different levels of trait complexity

Genomic‐wide selection (GWS) consists of the use of a large number of molecular markers for the prediction of genetic values and has been shown to be highly relevant for genetic improvement. The objective of this work was to evaluate and compare the predictive performance of statistical (ridge regre...

Bibliographic details
Published in: Crop science 2021-05, Vol.61 (3), p.1890-1902
Main authors: Barbosa, Ivan de Paiva, da Silva, Michele Jorge, da Costa, Weverton Gomes, Castro Sant'Anna, Isabela, Nascimento, Moysés, Cruz, Cosme Damião
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1902
container_issue 3
container_start_page 1890
container_title Crop science
container_volume 61
creator Barbosa, Ivan de Paiva
da Silva, Michele Jorge
da Costa, Weverton Gomes
Castro Sant'Anna, Isabela
Nascimento, Moysés
Cruz, Cosme Damião
description Genomic‐wide selection (GWS) consists of the use of a large number of molecular markers for the prediction of genetic values and has been shown to be highly relevant for genetic improvement. The objective of this work was to evaluate and compare the predictive performance of statistical (ridge regression‐best linear unbiased predictor [RR‐BLUP] and BayesB) and machine learning methods through GWS in simulated populations with traits presenting different levels of heritability and quantitative trait loci (QTL) numbers in the presence of dominant and epistatic effects. The simulated genome of population F2 was formed by 1,000 individuals and genotyped with 2,010 single nucleotide polymorphism (SNP) markers. Twenty‐six traits were simulated considering QTL numbers ranging from two to 88 and heritabilities of .3 and .6. The selective and predictive performances were evaluated using the multilayer perceptron (MLP), radial basis function (RBF), decision trees (DT), bagging (BA), random forest (RF), and boosting (BO) machine learning models and the classical RR‐BLUP and BayesB methods. A high effect of heritability was observed for the results of selective accuracy when compared to the increased QTL number. In addition, the selective accuracy based on the number of QTL demonstrates that the application of alternative machine learning models, such as RBF, BA, BO, and RF, can be suitable for the analysis according to QTL number. Machine learning methods are powerful tools for predicting genetic values with epistatic gene control in traits with different degrees of heritability and different numbers of controlling genes. Core Ideas: Currently, there are many forecasting techniques whose comparative efficiency is still the subject of study. Adequate knowledge of the techniques on complex traits is useful for the researcher to concentrate efforts. The machine learning model can capture nonlinear relationships and does not require a priori distributions.
doi_str_mv 10.1002/csc2.20488
format Article
fullrecord <record><control><sourceid>wiley_cross</sourceid><recordid>TN_cdi_wiley_primary_10_1002_csc2_20488_CSC220488</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>CSC220488</sourcerecordid><originalsourceid>FETCH-LOGICAL-c2738-218419ab5e5537965b6db3bfccab8e3e5141b893c6693f7762c62a434525c1bf3</originalsourceid><addsrcrecordid>eNqNkMtKxDAUhoMoOI5ufIKulWouTZsupegoCC5UcFeS9MSJtMmQxMvsfASf0SexdcSluDqHw_cffj6EDgk-IRjTUx01PaG4EGILzUjBeI5LzrbRDGNCciLYwy7ai_EJY1zVFZ-hYQHOD_D5_gFOqh66bBWgszpZ77K0DP75cZkNUi-tg6wHGZx1j9kAaem7mGnvou0gTLfOGgMBXBqxF-hj5k2WgrRppIZVD282rffRjpF9hIOfOUf3F-d3zWV-fbO4as6uc00rJnJKREFqqThwzqq65KrsFFNGa6kEMOCkIErUTJdlzUxVlVSXVBas4JRrogybo6PNXx18jAFMuwp2kGHdEtxOotpJVPstaoSPN_ArKG-ituA0_AZGUyWrBaVs3DAZafF_urFJTiYb_-zSGCU_UdvD-o9KbXPb0E25L5iDjrc</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Genome‐enabled prediction through machine learning methods considering different levels of trait complexity</title><source>Access via Wiley Online Library</source><source>Web of Science - Science Citation Index Expanded - 2021&lt;img src="https://exlibris-pub.s3.amazonaws.com/fromwos-v2.jpg" /&gt;</source><source>Alma/SFX Local Collection</source><creator>Barbosa, Ivan de Paiva ; da Silva, Michele Jorge ; da Costa, Weverton Gomes ; Castro Sant'Anna, Isabela ; Nascimento, Moysés ; Cruz, Cosme Damião</creator><creatorcontrib>Barbosa, Ivan de Paiva ; da Silva, Michele Jorge ; da Costa, Weverton Gomes ; Castro Sant'Anna, Isabela ; Nascimento, Moysés ; Cruz, Cosme Damião</creatorcontrib><description>Genomic‐wide selection (GWS) consists of the use of a large number of molecular markers for the prediction of genetic values and has been shown to be highly relevant for genetic improvement. 
The objective of this work was to evaluate and compare the predictive performance of statistical (ridge regression‐best linear unbiased predictor [RR‐BLUP] and BayesB) and machine learning methods through GWS in simulated populations with traits presenting different levels of heritability and quantitative trait loci (QTL) numbers in the presence of dominant and epistatic effects. The simulated genome of population F2 was formed by 1,000 individuals and genotyped with 2,010 single nucleotide polymorphism (SNP) markers. Twenty‐six traits were simulated considering QTL numbers ranging from two to 88 and heritabilities of .3 and .6. The selective and predictive performances were evaluated using the multilayer perceptron (MLP), radial basis function (RBF), decision trees (DT), bagging (BA), random forest (RF), and boosting (BO) machine learning models and the classical RR‐BLUP and BayesB methods. A high effect of heritability was observed for the results of selective accuracy when compared to the increased QTL number. In addition, the selective accuracy based on the number of QTL demonstrates that the application of alternative machine learning models, such as RBF, BA, BO, and RF, can be suitable for the analysis according to QTL number. Machine learning methods are powerful tools for predicting genetic values with epistatic gene control in traits with different degrees of heritability and different numbers of controlling genes. Core Ideas Currently, there are many forecasting techniques whose comparative efficiency is still the subject of study. Adequate knowledge of the techniques on complex traits is useful for the researcher to concentrate efforts. 
The machine learning model can capture nonlinear relationships and does not require a priori distributions.</description><identifier>ISSN: 0011-183X</identifier><identifier>EISSN: 1435-0653</identifier><identifier>DOI: 10.1002/csc2.20488</identifier><language>eng</language><publisher>HOBOKEN: Wiley</publisher><subject>Agriculture ; Agronomy ; Life Sciences &amp; Biomedicine ; Science &amp; Technology</subject><ispartof>Crop science, 2021-05, Vol.61 (3), p.1890-1902</ispartof><rights>2021 The Authors. Crop Science © 2021 Crop Science Society of America</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>true</woscitedreferencessubscribed><woscitedreferencescount>8</woscitedreferencescount><woscitedreferencesoriginalsourcerecordid>wos000639822300001</woscitedreferencesoriginalsourcerecordid><citedby>FETCH-LOGICAL-c2738-218419ab5e5537965b6db3bfccab8e3e5141b893c6693f7762c62a434525c1bf3</citedby><cites>FETCH-LOGICAL-c2738-218419ab5e5537965b6db3bfccab8e3e5141b893c6693f7762c62a434525c1bf3</cites><orcidid>0000-0001-8266-4414 ; 0000-0003-0742-5936 ; 0000-0001-8648-8825 ; 0000-0001-5886-9540</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fcsc2.20488$$EPDF$$P50$$Gwiley$$H</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fcsc2.20488$$EHTML$$P50$$Gwiley$$H</linktohtml><link.rule.ids>315,781,785,1418,27929,27930,39263,45579,45580</link.rule.ids></links><search><creatorcontrib>Barbosa, Ivan de Paiva</creatorcontrib><creatorcontrib>da Silva, Michele Jorge</creatorcontrib><creatorcontrib>da Costa, Weverton Gomes</creatorcontrib><creatorcontrib>Castro Sant'Anna, Isabela</creatorcontrib><creatorcontrib>Nascimento, Moysés</creatorcontrib><creatorcontrib>Cruz, Cosme Damião</creatorcontrib><title>Genome‐enabled prediction through machine learning methods considering 
different levels of trait complexity</title><title>Crop science</title><addtitle>CROP SCI</addtitle><description>Genomic‐wide selection (GWS) consists of the use of a large number of molecular markers for the prediction of genetic values and has been shown to be highly relevant for genetic improvement. The objective of this work was to evaluate and compare the predictive performance of statistical (ridge regression‐best linear unbiased predictor [RR‐BLUP] and BayesB) and machine learning methods through GWS in simulated populations with traits presenting different levels of heritability and quantitative trait loci (QTL) numbers in the presence of dominant and epistatic effects. The simulated genome of population F2 was formed by 1,000 individuals and genotyped with 2,010 single nucleotide polymorphism (SNP) markers. Twenty‐six traits were simulated considering QTL numbers ranging from two to 88 and heritabilities of .3 and .6. The selective and predictive performances were evaluated using the multilayer perceptron (MLP), radial basis function (RBF), decision trees (DT), bagging (BA), random forest (RF), and boosting (BO) machine learning models and the classical RR‐BLUP and BayesB methods. A high effect of heritability was observed for the results of selective accuracy when compared to the increased QTL number. In addition, the selective accuracy based on the number of QTL demonstrates that the application of alternative machine learning models, such as RBF, BA, BO, and RF, can be suitable for the analysis according to QTL number. Machine learning methods are powerful tools for predicting genetic values with epistatic gene control in traits with different degrees of heritability and different numbers of controlling genes. Core Ideas Currently, there are many forecasting techniques whose comparative efficiency is still the subject of study. Adequate knowledge of the techniques on complex traits is useful for the researcher to concentrate efforts. 
The machine learning model can capture nonlinear relationships and does not require a priori distributions.</description><subject>Agriculture</subject><subject>Agronomy</subject><subject>Life Sciences &amp; Biomedicine</subject><subject>Science &amp; Technology</subject><issn>0011-183X</issn><issn>1435-0653</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>HGBXW</sourceid><recordid>eNqNkMtKxDAUhoMoOI5ufIKulWouTZsupegoCC5UcFeS9MSJtMmQxMvsfASf0SexdcSluDqHw_cffj6EDgk-IRjTUx01PaG4EGILzUjBeI5LzrbRDGNCciLYwy7ai_EJY1zVFZ-hYQHOD_D5_gFOqh66bBWgszpZ77K0DP75cZkNUi-tg6wHGZx1j9kAaem7mGnvou0gTLfOGgMBXBqxF-hj5k2WgrRppIZVD282rffRjpF9hIOfOUf3F-d3zWV-fbO4as6uc00rJnJKREFqqThwzqq65KrsFFNGa6kEMOCkIErUTJdlzUxVlVSXVBas4JRrogybo6PNXx18jAFMuwp2kGHdEtxOotpJVPstaoSPN_ArKG-ituA0_AZGUyWrBaVs3DAZafF_urFJTiYb_-zSGCU_UdvD-o9KbXPb0E25L5iDjrc</recordid><startdate>202105</startdate><enddate>202105</enddate><creator>Barbosa, Ivan de Paiva</creator><creator>da Silva, Michele Jorge</creator><creator>da Costa, Weverton Gomes</creator><creator>Castro Sant'Anna, Isabela</creator><creator>Nascimento, Moysés</creator><creator>Cruz, Cosme Damião</creator><general>Wiley</general><scope>BLEPL</scope><scope>DTL</scope><scope>HGBXW</scope><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0001-8266-4414</orcidid><orcidid>https://orcid.org/0000-0003-0742-5936</orcidid><orcidid>https://orcid.org/0000-0001-8648-8825</orcidid><orcidid>https://orcid.org/0000-0001-5886-9540</orcidid></search><sort><creationdate>202105</creationdate><title>Genome‐enabled prediction through machine learning methods considering different levels of trait complexity</title><author>Barbosa, Ivan de Paiva ; da Silva, Michele Jorge ; da Costa, Weverton Gomes ; Castro Sant'Anna, Isabela ; Nascimento, Moysés ; Cruz, Cosme 
Damião</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c2738-218419ab5e5537965b6db3bfccab8e3e5141b893c6693f7762c62a434525c1bf3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Agriculture</topic><topic>Agronomy</topic><topic>Life Sciences &amp; Biomedicine</topic><topic>Science &amp; Technology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Barbosa, Ivan de Paiva</creatorcontrib><creatorcontrib>da Silva, Michele Jorge</creatorcontrib><creatorcontrib>da Costa, Weverton Gomes</creatorcontrib><creatorcontrib>Castro Sant'Anna, Isabela</creatorcontrib><creatorcontrib>Nascimento, Moysés</creatorcontrib><creatorcontrib>Cruz, Cosme Damião</creatorcontrib><collection>Web of Science Core Collection</collection><collection>Science Citation Index Expanded</collection><collection>Web of Science - Science Citation Index Expanded - 2021</collection><collection>CrossRef</collection><jtitle>Crop science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Barbosa, Ivan de Paiva</au><au>da Silva, Michele Jorge</au><au>da Costa, Weverton Gomes</au><au>Castro Sant'Anna, Isabela</au><au>Nascimento, Moysés</au><au>Cruz, Cosme Damião</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Genome‐enabled prediction through machine learning methods considering different levels of trait complexity</atitle><jtitle>Crop science</jtitle><stitle>CROP SCI</stitle><date>2021-05</date><risdate>2021</risdate><volume>61</volume><issue>3</issue><spage>1890</spage><epage>1902</epage><pages>1890-1902</pages><issn>0011-183X</issn><eissn>1435-0653</eissn><abstract>Genomic‐wide selection (GWS) consists of the use of a large number of molecular markers for the prediction of genetic values and has been shown to be highly relevant for genetic 
improvement. The objective of this work was to evaluate and compare the predictive performance of statistical (ridge regression‐best linear unbiased predictor [RR‐BLUP] and BayesB) and machine learning methods through GWS in simulated populations with traits presenting different levels of heritability and quantitative trait loci (QTL) numbers in the presence of dominant and epistatic effects. The simulated genome of population F2 was formed by 1,000 individuals and genotyped with 2,010 single nucleotide polymorphism (SNP) markers. Twenty‐six traits were simulated considering QTL numbers ranging from two to 88 and heritabilities of .3 and .6. The selective and predictive performances were evaluated using the multilayer perceptron (MLP), radial basis function (RBF), decision trees (DT), bagging (BA), random forest (RF), and boosting (BO) machine learning models and the classical RR‐BLUP and BayesB methods. A high effect of heritability was observed for the results of selective accuracy when compared to the increased QTL number. In addition, the selective accuracy based on the number of QTL demonstrates that the application of alternative machine learning models, such as RBF, BA, BO, and RF, can be suitable for the analysis according to QTL number. Machine learning methods are powerful tools for predicting genetic values with epistatic gene control in traits with different degrees of heritability and different numbers of controlling genes. Core Ideas Currently, there are many forecasting techniques whose comparative efficiency is still the subject of study. Adequate knowledge of the techniques on complex traits is useful for the researcher to concentrate efforts. 
The machine learning model can capture nonlinear relationships and does not require a priori distributions.</abstract><cop>HOBOKEN</cop><pub>Wiley</pub><doi>10.1002/csc2.20488</doi><tpages>13</tpages><orcidid>https://orcid.org/0000-0001-8266-4414</orcidid><orcidid>https://orcid.org/0000-0003-0742-5936</orcidid><orcidid>https://orcid.org/0000-0001-8648-8825</orcidid><orcidid>https://orcid.org/0000-0001-5886-9540</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0011-183X
ispartof Crop science, 2021-05, Vol.61 (3), p.1890-1902
issn 0011-183X
1435-0653
language eng
recordid cdi_wiley_primary_10_1002_csc2_20488_CSC220488
source Access via Wiley Online Library; Web of Science - Science Citation Index Expanded - 2021; Alma/SFX Local Collection
subjects Agriculture
Agronomy
Life Sciences & Biomedicine
Science & Technology
title Genome‐enabled prediction through machine learning methods considering different levels of trait complexity
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-13T23%3A21%3A01IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wiley_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genome%E2%80%90enabled%20prediction%20through%20machine%20learning%20methods%20considering%20different%20levels%20of%20trait%20complexity&rft.jtitle=Crop%20science&rft.au=Barbosa,%20Ivan%20de%20Paiva&rft.date=2021-05&rft.volume=61&rft.issue=3&rft.spage=1890&rft.epage=1902&rft.pages=1890-1902&rft.issn=0011-183X&rft.eissn=1435-0653&rft_id=info:doi/10.1002/csc2.20488&rft_dat=%3Cwiley_cross%3ECSC220488%3C/wiley_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true