Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers

Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the volume of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 meth...

Detailed description

Bibliographic details
Published in:	Plant genetic resources: characterization and utilization 2018-06, Vol.16 (3), p.228-236
Main authors:	Acuña-Matamoros, Carlos L.; Reyes-Valdés, M. Humberto
Format:	Article
Language:	eng
Subjects:	Algorithms; Alleles; Bioinformatics; Computer simulation; Corn; Criteria; Data collection; Datasets; Genetic distance; Genetic diversity; Genetic markers; Heuristic; Hypothesis testing; Markers; Maximization; Methods; Optimization; Polymorphism; Seed banks; Seeds; Single-nucleotide polymorphism; Software; Wheat
Online access:	Full text

container_end_page	236
container_issue	3
container_start_page	228
container_title	Plant genetic resources: characterization and utilization
container_volume	16
creator	Acuña-Matamoros, Carlos L.; Reyes-Valdés, M. Humberto
description	Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the volume of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 methods for core subset selection, through optimization criteria containing average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to perform the method evaluations through four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays for maximizing the objective functions, and they performed well even for criteria not included in those functions. Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and they outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. Given their efficiency in optimizing the objective functions and their computing speed, the LRSemi and SFSSemi methods proved to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.
doi_str_mv	10.1017/S1479262117000247
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2031052207</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><cupid>10_1017_S1479262117000247</cupid><sourcerecordid>2031052207</sourcerecordid><originalsourceid>FETCH-LOGICAL-c317t-106b281b307d5e43fa8652cf2828f959b9742c064289d03a669ad9f4e62dae793</originalsourceid><addsrcrecordid>eNp1kMtOwzAQRSMEEqXwAewssQ74kdjJElW8pPKQChK7yHHGrUsSBzsVtN_AR-PSAgvEau6Mzr0zmig6JviUYCLOJiQROeWUEIExponYiQbrUUw5e9790ZTsRwfezwOSCpEOoo-RbTrpjLctshrZrjeNWcnehL6BfmYrj7R1SFkHyC9KDz3yUIP6IrSzDZKolm4KAam_5yHpFt6Nki16m4HsA9FWTirwSM1kED04s4IKlUs0uXtAjXQv4PxhtKdl7eFoW4fR0-XF4-g6Ht9f3YzOx7FiRPQxwbykGSkZFlUKCdMy4ylVmmY003mal7lIqMI8oVleYSY5z2WV6wQ4rSSInA2jk01u5-zrAnxfzO3CtWFlQTEjOKUUi0CRDaWc9d6BLjpnwqXLguBi_fTiz9ODh209simdqabwG_2_6xOmI4Vo</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2031052207</pqid></control><display><type>article</type><title>Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers</title><source>Cambridge University Press Journals Complete</source><creator>Acuña-Matamoros, Carlos L. ; Reyes-Valdés, M. Humberto</creator><creatorcontrib>Acuña-Matamoros, Carlos L. ; Reyes-Valdés, M. Humberto</creatorcontrib><description>Core subset selection from collections hosted by seed banks, grow in importance as the number of accessions and genetic marker information rapidly increases. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces, was used to test 11 methods for core subset selection, through optimization criteria containing average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. 
Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to perform the method evaluations through four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays for maximizing the objective functions, and they performed well even for criteria not included in those functions. The Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. For their efficiency to optimize the objective functions and computing speed, the LRSemi and SFSSemi methods demonstrated to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.</description><identifier>ISSN: 1479-2621</identifier><identifier>EISSN: 1479-263X</identifier><identifier>DOI: 10.1017/S1479262117000247</identifier><language>eng</language><publisher>Cambridge, UK: Cambridge University Press</publisher><subject>Algorithms ; Alleles ; Bioinformatics ; Computer simulation ; Corn ; Criteria ; Data collection ; Datasets ; Genetic distance ; Genetic diversity ; Genetic markers ; Heuristic ; Hypothesis testing ; Markers ; Maximization ; Methods ; Optimization ; Polymorphism ; Seed banks ; Seeds ; Single-nucleotide polymorphism ; Software ; Wheat</subject><ispartof>Plant genetic resources: characterization and utilization, 2018-06, Vol.16 (3), p.228-236</ispartof><rights>Copyright © NIAB 
2017</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c317t-106b281b307d5e43fa8652cf2828f959b9742c064289d03a669ad9f4e62dae793</citedby><cites>FETCH-LOGICAL-c317t-106b281b307d5e43fa8652cf2828f959b9742c064289d03a669ad9f4e62dae793</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.cambridge.org/core/product/identifier/S1479262117000247/type/journal_article$$EHTML$$P50$$Gcambridge$$H</linktohtml><link.rule.ids>164,314,780,784,27924,27925,55628</link.rule.ids></links><search><creatorcontrib>Acuña-Matamoros, Carlos L.</creatorcontrib><creatorcontrib>Reyes-Valdés, M. Humberto</creatorcontrib><title>Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers</title><title>Plant genetic resources: characterization and utilization</title><addtitle>Plant Genet. Resour</addtitle><description>Core subset selection from collections hosted by seed banks, grow in importance as the number of accessions and genetic marker information rapidly increases. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces, was used to test 11 methods for core subset selection, through optimization criteria containing average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to perform the method evaluations through four different objective functions. 
The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays for maximizing the objective functions, and they performed well even for criteria not included in those functions. The Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. For their efficiency to optimize the objective functions and computing speed, the LRSemi and SFSSemi methods demonstrated to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.</description><subject>Algorithms</subject><subject>Alleles</subject><subject>Bioinformatics</subject><subject>Computer simulation</subject><subject>Corn</subject><subject>Criteria</subject><subject>Data collection</subject><subject>Datasets</subject><subject>Genetic distance</subject><subject>Genetic diversity</subject><subject>Genetic markers</subject><subject>Heuristic</subject><subject>Hypothesis testing</subject><subject>Markers</subject><subject>Maximization</subject><subject>Methods</subject><subject>Optimization</subject><subject>Polymorphism</subject><subject>Seed banks</subject><subject>Seeds</subject><subject>Single-nucleotide 
polymorphism</subject><subject>Software</subject><subject>Wheat</subject><issn>1479-2621</issn><issn>1479-263X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNp1kMtOwzAQRSMEEqXwAewssQ74kdjJElW8pPKQChK7yHHGrUsSBzsVtN_AR-PSAgvEau6Mzr0zmig6JviUYCLOJiQROeWUEIExponYiQbrUUw5e9790ZTsRwfezwOSCpEOoo-RbTrpjLctshrZrjeNWcnehL6BfmYrj7R1SFkHyC9KDz3yUIP6IrSzDZKolm4KAam_5yHpFt6Nki16m4HsA9FWTirwSM1kED04s4IKlUs0uXtAjXQv4PxhtKdl7eFoW4fR0-XF4-g6Ht9f3YzOx7FiRPQxwbykGSkZFlUKCdMy4ylVmmY003mal7lIqMI8oVleYSY5z2WV6wQ4rSSInA2jk01u5-zrAnxfzO3CtWFlQTEjOKUUi0CRDaWc9d6BLjpnwqXLguBi_fTiz9ODh209simdqabwG_2_6xOmI4Vo</recordid><startdate>201806</startdate><enddate>201806</enddate><creator>Acuña-Matamoros, Carlos L.</creator><creator>Reyes-Valdés, M. Humberto</creator><general>Cambridge University Press</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X2</scope><scope>8FD</scope><scope>8FE</scope><scope>8FH</scope><scope>8FK</scope><scope>AFKRA</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>LK8</scope><scope>M0K</scope><scope>M7P</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>RC3</scope></search><sort><creationdate>201806</creationdate><title>Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers</title><author>Acuña-Matamoros, Carlos L. ; Reyes-Valdés, M. 
Humberto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c317t-106b281b307d5e43fa8652cf2828f959b9742c064289d03a669ad9f4e62dae793</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Alleles</topic><topic>Bioinformatics</topic><topic>Computer simulation</topic><topic>Corn</topic><topic>Criteria</topic><topic>Data collection</topic><topic>Datasets</topic><topic>Genetic distance</topic><topic>Genetic diversity</topic><topic>Genetic markers</topic><topic>Heuristic</topic><topic>Hypothesis testing</topic><topic>Markers</topic><topic>Maximization</topic><topic>Methods</topic><topic>Optimization</topic><topic>Polymorphism</topic><topic>Seed banks</topic><topic>Seeds</topic><topic>Single-nucleotide polymorphism</topic><topic>Software</topic><topic>Wheat</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Acuña-Matamoros, Carlos L.</creatorcontrib><creatorcontrib>Reyes-Valdés, M. 
Humberto</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Agricultural Science Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Agricultural &amp; Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Biological Science Collection</collection><collection>Agricultural Science Database</collection><collection>Biological Science Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Genetics Abstracts</collection><jtitle>Plant genetic resources: characterization and utilization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Acuña-Matamoros, Carlos L.</au><au>Reyes-Valdés, M. 
Humberto</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers</atitle><jtitle>Plant genetic resources: characterization and utilization</jtitle><addtitle>Plant Genet. Resour</addtitle><date>2018-06</date><risdate>2018</risdate><volume>16</volume><issue>3</issue><spage>228</spage><epage>236</epage><pages>228-236</pages><issn>1479-2621</issn><eissn>1479-263X</eissn><abstract>Core subset selection from collections hosted by seed banks, grow in importance as the number of accessions and genetic marker information rapidly increases. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces, was used to test 11 methods for core subset selection, through optimization criteria containing average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to perform the method evaluations through four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays for maximizing the objective functions, and they performed well even for criteria not included in those functions. The Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. 
For their efficiency to optimize the objective functions and computing speed, the LRSemi and SFSSemi methods demonstrated to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.</abstract><cop>Cambridge, UK</cop><pub>Cambridge University Press</pub><doi>10.1017/S1479262117000247</doi><tpages>9</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1479-2621
ispartof	Plant genetic resources: characterization and utilization, 2018-06, Vol.16 (3), p.228-236
issn	1479-2621 1479-263X
language	eng
recordid	cdi_proquest_journals_2031052207
source	Cambridge University Press Journals Complete
subjects	Algorithms; Alleles; Bioinformatics; Computer simulation; Corn; Criteria; Data collection; Datasets; Genetic distance; Genetic diversity; Genetic markers; Heuristic; Hypothesis testing; Markers; Maximization; Methods; Optimization; Polymorphism; Seed banks; Seeds; Single-nucleotide polymorphism; Software; Wheat
title	Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T21%3A47%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20optimization%20methods%20for%20core%20subset%20selection%20from%20a%20large%20collection%20of%20Mexican%20wheat%20landraces%20characterized%20by%20SNP%20markers&rft.jtitle=Plant%20genetic%20resources:%20characterization%20and%20utilization&rft.au=Acu%C3%B1a-Matamoros,%20Carlos%20L.&rft.date=2018-06&rft.volume=16&rft.issue=3&rft.spage=228&rft.epage=236&rft.pages=228-236&rft.issn=1479-2621&rft.eissn=1479-263X&rft_id=info:doi/10.1017/S1479262117000247&rft_dat=%3Cproquest_cross%3E2031052207%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2031052207&rft_id=info:pmid/&rft_cupid=10_1017_S1479262117000247&rfr_iscdi=true