Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers

Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the volume of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 meth...

Bibliographic details
Published in: Plant genetic resources: characterization and utilization, 2018-06, Vol.16 (3), p.228-236
Main authors: Acuña-Matamoros, Carlos L., Reyes-Valdés, M. Humberto
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 236
container_issue 3
container_start_page 228
container_title Plant genetic resources: characterization and utilization
container_volume 16
creator Acuña-Matamoros, Carlos L.
Reyes-Valdés, M. Humberto
description Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the volume of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 methods for core subset selection, using optimization criteria that combine average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to evaluate the methods under four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays at maximizing the objective functions, and they performed well even for criteria not included in those functions. Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and they outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. Given their efficiency in optimizing the objective functions and their computing speed, the LRSemi and SFSSemi methods proved to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.
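The abstract names sequential forward selection (SFS) and a variant with a random first pair (SFSSemi) but does not spell out either procedure. The following is a minimal sketch of the general idea only, under stated assumptions: the objective is taken to be the mean pairwise fraction of differing loci (a simple stand-in for the paper's four objective functions, which are not given here), the default start is the most distant pair, and all function and parameter names are hypothetical.

```python
import random
from itertools import combinations


def mean_pairwise_distance(genotypes, subset):
    """Average fraction of differing loci over all pairs in `subset`.

    Assumed objective for illustration; not the paper's exact criterion.
    """
    pairs = list(combinations(subset, 2))
    if not pairs:
        return 0.0
    n_loci = len(genotypes[0])
    total = 0.0
    for i, j in pairs:
        diffs = sum(a != b for a, b in zip(genotypes[i], genotypes[j]))
        total += diffs / n_loci
    return total / len(pairs)


def forward_selection(genotypes, core_size, random_first_pair=False, seed=None):
    """Greedy forward selection of `core_size` accession indices.

    With random_first_pair=True the starting pair is drawn at random
    (the assumed reading of the "Semi" variants); otherwise the most
    distant pair is used as the start.
    """
    n = len(genotypes)
    if not 2 <= core_size <= n:
        raise ValueError("core_size must be between 2 and the number of accessions")
    rng = random.Random(seed)
    if random_first_pair:
        selected = rng.sample(range(n), 2)
    else:
        selected = list(
            max(
                combinations(range(n), 2),
                key=lambda pair: mean_pairwise_distance(genotypes, pair),
            )
        )
    remaining = [i for i in range(n) if i not in selected]
    # Add, one at a time, the accession that maximizes the objective.
    while len(selected) < core_size:
        best = max(
            remaining,
            key=lambda i: mean_pairwise_distance(genotypes, selected + [i]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: five homozygous accessions scored at four biallelic loci.
toy = ["0000", "0001", "1111", "1110", "0011"]
core = forward_selection(toy, core_size=3)
```

Drawing the first pair at random skips the exhaustive search over all pairs, which is quadratic in the number of accessions. That may be one reason such variants are attractive for large collections, though the abstract itself only reports that they were fast and effective.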
doi_str_mv 10.1017/S1479262117000247
format Article
fulltext fulltext
identifier ISSN: 1479-2621
ispartof Plant genetic resources: characterization and utilization, 2018-06, Vol.16 (3), p.228-236
issn 1479-2621
1479-263X
language eng
recordid cdi_proquest_journals_2031052207
source Cambridge University Press Journals Complete
subjects Algorithms
Alleles
Bioinformatics
Computer simulation
Corn
Criteria
Data collection
Datasets
Genetic distance
Genetic diversity
Genetic markers
Heuristic
Hypothesis testing
Markers
Maximization
Methods
Optimization
Polymorphism
Seed banks
Seeds
Single-nucleotide polymorphism
Software
Wheat
title Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T21%3A47%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20optimization%20methods%20for%20core%20subset%20selection%20from%20a%20large%20collection%20of%20Mexican%20wheat%20landraces%20characterized%20by%20SNP%20markers&rft.jtitle=Plant%20genetic%20resources:%20characterization%20and%20utilization&rft.au=Acu%C3%B1a-Matamoros,%20Carlos%20L.&rft.date=2018-06&rft.volume=16&rft.issue=3&rft.spage=228&rft.epage=236&rft.pages=228-236&rft.issn=1479-2621&rft.eissn=1479-263X&rft_id=info:doi/10.1017/S1479262117000247&rft_dat=%3Cproquest_cross%3E2031052207%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2031052207&rft_id=info:pmid/&rft_cupid=10_1017_S1479262117000247&rfr_iscdi=true