Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers
Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the amount of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 meth...
Published in: | Plant genetic resources: characterization and utilization 2018-06, Vol.16 (3), p.228-236 |
---|---|
Main authors: | Acuña-Matamoros, Carlos L. ; Reyes-Valdés, M. Humberto |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 236 |
---|---|
container_issue | 3 |
container_start_page | 228 |
container_title | Plant genetic resources: characterization and utilization |
container_volume | 16 |
creator | Acuña-Matamoros, Carlos L. ; Reyes-Valdés, M. Humberto |
description | Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the amount of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 methods for core subset selection, using optimization criteria based on average genetic distance and genetic diversity. Allele richness was used as an additional criterion to assess the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to evaluate the methods under four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays at maximizing the objective functions, and they performed well even for criteria not included in those functions. Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and they outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. Given their efficiency in optimizing the objective functions and their computing speed, the LRSemi and SFSSemi methods proved to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers. |
doi_str_mv | 10.1017/S1479262117000247 |
format | Article |
addtitle | Plant Genet. Resour |
publisher | Cambridge, UK: Cambridge University Press |
rights | Copyright © NIAB 2017 |
toplevel | peer_reviewed ; online_resources |
tpages | 9 |
fulltext | fulltext |
identifier | ISSN: 1479-2621 |
ispartof | Plant genetic resources: characterization and utilization, 2018-06, Vol.16 (3), p.228-236 |
issn | 1479-2621 ; 1479-263X |
language | eng |
recordid | cdi_proquest_journals_2031052207 |
source | Cambridge University Press Journals Complete |
subjects | Algorithms ; Alleles ; Bioinformatics ; Computer simulation ; Corn ; Criteria ; Data collection ; Datasets ; Genetic distance ; Genetic diversity ; Genetic markers ; Heuristic ; Hypothesis testing ; Markers ; Maximization ; Methods ; Optimization ; Polymorphism ; Seed banks ; Seeds ; Single-nucleotide polymorphism ; Software ; Wheat |
title | Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T21%3A47%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Comparison%20of%20optimization%20methods%20for%20core%20subset%20selection%20from%20a%20large%20collection%20of%20Mexican%20wheat%20landraces%20characterized%20by%20SNP%20markers&rft.jtitle=Plant%20genetic%20resources:%20characterization%20and%20utilization&rft.au=Acu%C3%B1a-Matamoros,%20Carlos%20L.&rft.date=2018-06&rft.volume=16&rft.issue=3&rft.spage=228&rft.epage=236&rft.pages=228-236&rft.issn=1479-2621&rft.eissn=1479-263X&rft_id=info:doi/10.1017/S1479262117000247&rft_dat=%3Cproquest_cross%3E2031052207%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2031052207&rft_id=info:pmid/&rft_cupid=10_1017_S1479262117000247&rfr_iscdi=true |