Improved metaheuristics for the quartet method of hierarchical clustering

The quartet method is a novel hierarchical clustering approach in which, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from among all possible combinations of quartet topologies on the n objects, where optimality means that the sum of the dissi...

Bibliographic Details
Published in: Journal of global optimization 2020-10, Vol.78 (2), p.241-270
Main Authors: Consoli, Sergio, Korst, Jan, Pauws, Steffen, Geleijnse, Gijs
Format: Article
Language: eng
Subjects:
Online Access: Full text
container_end_page 270
container_issue 2
container_start_page 241
container_title Journal of global optimization
container_volume 78
creator Consoli, Sergio
Korst, Jan
Pauws, Steffen
Geleijnse, Gijs
description The quartet method is a novel hierarchical clustering approach in which, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from among all possible combinations of quartet topologies on the n objects, where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as the minimum quartet tree cost (MQTC) problem. We provide details and a formulation of this challenging problem, and propose a basic greedy heuristic built on several observations that speed up and simplify solution generation and evaluation: the use of adjacency-like matrices to represent the topology structures of candidate solutions; fast calculation of coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges highly connected objects that may yield lower values of the quartet cost function in a given partial solution. We show that this basic greedy heuristic consistently improves the performance of popular quartet clustering algorithms in the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, yielding new efficient solution approaches to the MQTC problem.
doi_str_mv 10.1007/s10898-019-00871-1
format Article
fullrecord <record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_journals_2441386867</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A718396623</galeid><sourcerecordid>A718396623</sourcerecordid><originalsourceid>FETCH-LOGICAL-c406t-ffb765db831024481df5d729af907f2bd5c9675423007d2885f112cb73494453</originalsourceid><addsrcrecordid>eNp9kE1PAyEQhonRxFr9A5428bx1BpYFjqbxo0kTL70TyoKLabsVWBP_vdQ18WY4kDDzzLw8hNwiLBBA3CcEqWQNqGoAKbDGMzJDLlhNFbbnZAaK8poD4CW5SukdAJTkdEZWq_0xDp-uq_Yum96NMaQcbKr8EKvcu-pjNDG7fCr3Q1cNvuqDiybaPlizq-xuTNnFcHi7Jhfe7JK7-b3nZPP0uFm-1OvX59XyYV3bBtpce78VLe-2kiHQppHYed4JqoxXIDzddtyqVvCGsvKvjkrJPSK1W8Ea1TSczcndNLbE_hhdyvp9GOOhbNRlHDLZylaUrsXU9WZ2ToeDH3I0tpzO7YMdDs6H8v4gUDLVtpQVgE6AjUNK0Xl9jGFv4pdG0CfFelKsi2L9o1hjgdgEpePJgIt_Wf6hvgGD9X3G</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2441386867</pqid></control><display><type>article</type><title>Improved metaheuristics for the quartet method of hierarchical clustering</title><source>SpringerLink (Online service)</source><creator>Consoli, Sergio ; Korst, Jan ; Pauws, Steffen ; Geleijnse, Gijs</creator><creatorcontrib>Consoli, Sergio ; Korst, Jan ; Pauws, Steffen ; Geleijnse, Gijs</creatorcontrib><description>The quartet method is a novel hierarchical clustering approach where, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from the total number of possible combinations of quartet topologies on n , where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as minimum quartet tree cost (MQTC) problem. 
We provide details and formulation of this challenging problem, and propose a basic greedy heuristic that is characterized by some appealing insights and findings for speeding up and simplifying the processes of solution generation and evaluation, such as the use of adjacency-like matrices to represent the topology structures of candidate solutions; fast calculation of coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges together highly connected objects which may bring lower values of the quartet cost function in a given partial solution. It will be shown that this basic greedy heuristic is able to improve consistently the performance of popular quartet clustering algorithms in the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, producing novel efficient solution approaches to the MQTC problem.</description><identifier>ISSN: 0925-5001</identifier><identifier>EISSN: 1573-2916</identifier><identifier>DOI: 10.1007/s10898-019-00871-1</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Cluster analysis ; Clustering ; Combinatorial analysis ; Computer Science ; Computer simulation ; Cost function ; Enumeration ; Heuristic methods ; Mathematical analysis ; Mathematics ; Mathematics and Statistics ; Matrix reduction ; Operations Research/Decision Theory ; Optimization ; Permutations ; Real Functions ; Simulated annealing ; Topology</subject><ispartof>Journal of global optimization, 2020-10, Vol.78 (2), p.241-270</ispartof><rights>Springer Science+Business Media, LLC, part of Springer Nature 2020</rights><rights>COPYRIGHT 2020 Springer</rights><rights>Springer Science+Business Media, LLC, part of Springer Nature 
2020.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c406t-ffb765db831024481df5d729af907f2bd5c9675423007d2885f112cb73494453</citedby><cites>FETCH-LOGICAL-c406t-ffb765db831024481df5d729af907f2bd5c9675423007d2885f112cb73494453</cites><orcidid>0000-0001-7357-5858</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s10898-019-00871-1$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s10898-019-00871-1$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,41488,42557,51319</link.rule.ids></links><search><creatorcontrib>Consoli, Sergio</creatorcontrib><creatorcontrib>Korst, Jan</creatorcontrib><creatorcontrib>Pauws, Steffen</creatorcontrib><creatorcontrib>Geleijnse, Gijs</creatorcontrib><title>Improved metaheuristics for the quartet method of hierarchical clustering</title><title>Journal of global optimization</title><addtitle>J Glob Optim</addtitle><description>The quartet method is a novel hierarchical clustering approach where, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from the total number of possible combinations of quartet topologies on n , where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as minimum quartet tree cost (MQTC) problem. 
We provide details and formulation of this challenging problem, and propose a basic greedy heuristic that is characterized by some appealing insights and findings for speeding up and simplifying the processes of solution generation and evaluation, such as the use of adjacency-like matrices to represent the topology structures of candidate solutions; fast calculation of coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges together highly connected objects which may bring lower values of the quartet cost function in a given partial solution. It will be shown that this basic greedy heuristic is able to improve consistently the performance of popular quartet clustering algorithms in the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, producing novel efficient solution approaches to the MQTC problem.</description><subject>Algorithms</subject><subject>Cluster analysis</subject><subject>Clustering</subject><subject>Combinatorial analysis</subject><subject>Computer Science</subject><subject>Computer simulation</subject><subject>Cost function</subject><subject>Enumeration</subject><subject>Heuristic methods</subject><subject>Mathematical analysis</subject><subject>Mathematics</subject><subject>Mathematics and Statistics</subject><subject>Matrix reduction</subject><subject>Operations Research/Decision Theory</subject><subject>Optimization</subject><subject>Permutations</subject><subject>Real Functions</subject><subject>Simulated 
annealing</subject><subject>Topology</subject><issn>0925-5001</issn><issn>1573-2916</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNp9kE1PAyEQhonRxFr9A5428bx1BpYFjqbxo0kTL70TyoKLabsVWBP_vdQ18WY4kDDzzLw8hNwiLBBA3CcEqWQNqGoAKbDGMzJDLlhNFbbnZAaK8poD4CW5SukdAJTkdEZWq_0xDp-uq_Yum96NMaQcbKr8EKvcu-pjNDG7fCr3Q1cNvuqDiybaPlizq-xuTNnFcHi7Jhfe7JK7-b3nZPP0uFm-1OvX59XyYV3bBtpce78VLe-2kiHQppHYed4JqoxXIDzddtyqVvCGsvKvjkrJPSK1W8Ea1TSczcndNLbE_hhdyvp9GOOhbNRlHDLZylaUrsXU9WZ2ToeDH3I0tpzO7YMdDs6H8v4gUDLVtpQVgE6AjUNK0Xl9jGFv4pdG0CfFelKsi2L9o1hjgdgEpePJgIt_Wf6hvgGD9X3G</recordid><startdate>20201001</startdate><enddate>20201001</enddate><creator>Consoli, Sergio</creator><creator>Korst, Jan</creator><creator>Pauws, Steffen</creator><creator>Geleijnse, Gijs</creator><general>Springer US</general><general>Springer</general><general>Springer Nature 
B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>88I</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M2O</scope><scope>M2P</scope><scope>M7S</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0001-7357-5858</orcidid></search><sort><creationdate>20201001</creationdate><title>Improved metaheuristics for the quartet method of hierarchical clustering</title><author>Consoli, Sergio ; Korst, Jan ; Pauws, Steffen ; Geleijnse, Gijs</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c406t-ffb765db831024481df5d729af907f2bd5c9675423007d2885f112cb73494453</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Cluster analysis</topic><topic>Clustering</topic><topic>Combinatorial analysis</topic><topic>Computer Science</topic><topic>Computer simulation</topic><topic>Cost function</topic><topic>Enumeration</topic><topic>Heuristic methods</topic><topic>Mathematical analysis</topic><topic>Mathematics</topic><topic>Mathematics and 
Statistics</topic><topic>Matrix reduction</topic><topic>Operations Research/Decision Theory</topic><topic>Optimization</topic><topic>Permutations</topic><topic>Real Functions</topic><topic>Simulated annealing</topic><topic>Topology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Consoli, Sergio</creatorcontrib><creatorcontrib>Korst, Jan</creatorcontrib><creatorcontrib>Pauws, Steffen</creatorcontrib><creatorcontrib>Geleijnse, Gijs</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Science Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni Edition)</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>ProQuest Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Business Premium Collection 
(Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer science database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database</collection><collection>ProQuest research library</collection><collection>ProQuest Science Journals</collection><collection>Engineering Database</collection><collection>Research Library (Corporate)</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering collection</collection><collection>ProQuest Central Basic</collection><jtitle>Journal of global optimization</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Consoli, Sergio</au><au>Korst, Jan</au><au>Pauws, Steffen</au><au>Geleijnse, 
Gijs</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Improved metaheuristics for the quartet method of hierarchical clustering</atitle><jtitle>Journal of global optimization</jtitle><stitle>J Glob Optim</stitle><date>2020-10-01</date><risdate>2020</risdate><volume>78</volume><issue>2</issue><spage>241</spage><epage>270</epage><pages>241-270</pages><issn>0925-5001</issn><eissn>1573-2916</eissn><abstract>The quartet method is a novel hierarchical clustering approach where, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from the total number of possible combinations of quartet topologies on n , where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as minimum quartet tree cost (MQTC) problem. We provide details and formulation of this challenging problem, and propose a basic greedy heuristic that is characterized by some appealing insights and findings for speeding up and simplifying the processes of solution generation and evaluation, such as the use of adjacency-like matrices to represent the topology structures of candidate solutions; fast calculation of coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges together highly connected objects which may bring lower values of the quartet cost function in a given partial solution. 
It will be shown that this basic greedy heuristic is able to improve consistently the performance of popular quartet clustering algorithms in the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, producing novel efficient solution approaches to the MQTC problem.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10898-019-00871-1</doi><tpages>30</tpages><orcidid>https://orcid.org/0000-0001-7357-5858</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0925-5001
ispartof Journal of global optimization, 2020-10, Vol.78 (2), p.241-270
issn 0925-5001
1573-2916
language eng
recordid cdi_proquest_journals_2441386867
source SpringerLink (Online service)
subjects Algorithms
Cluster analysis
Clustering
Combinatorial analysis
Computer Science
Computer simulation
Cost function
Enumeration
Heuristic methods
Mathematical analysis
Mathematics
Mathematics and Statistics
Matrix reduction
Operations Research/Decision Theory
Optimization
Permutations
Real Functions
Simulated annealing
Topology
title Improved metaheuristics for the quartet method of hierarchical clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T06%3A05%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improved%20metaheuristics%20for%20the%20quartet%20method%20of%20hierarchical%20clustering&rft.jtitle=Journal%20of%20global%20optimization&rft.au=Consoli,%20Sergio&rft.date=2020-10-01&rft.volume=78&rft.issue=2&rft.spage=241&rft.epage=270&rft.pages=241-270&rft.issn=0925-5001&rft.eissn=1573-2916&rft_id=info:doi/10.1007/s10898-019-00871-1&rft_dat=%3Cgale_proqu%3EA718396623%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2441386867&rft_id=info:pmid/&rft_galeid=A718396623&rfr_iscdi=true
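
The quartet cost that the description field refers to can be illustrated with a small brute-force sketch. It is not the authors' method: the abstract mentions adjacency-like matrices and fast coefficient calculation, which are not reproduced here. The sketch assumes the common convention that the cost of a quartet topology ab|cd is d(a,b) + d(c,d), and that the candidate tree is unrooted and fully resolved (every internal node has degree 3), so each group of four leaves has exactly one embedded topology. The names `leaf_hop_distances`, `quartet_tree_cost`, `tree`, `leaves` and `d` are hypothetical and chosen for illustration only.

```python
from collections import deque
from itertools import combinations


def leaf_hop_distances(tree, leaves):
    """Breadth-first search from every leaf; returns hops[leaf][node] in edges."""
    hops = {}
    for src in leaves:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in tree[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        hops[src] = dist
    return hops


def quartet_tree_cost(tree, leaves, d):
    """Sum, over all groups of four leaves, the cost of the embedded topology.

    Assumption: the tree is fully resolved, so for each quartet exactly one of
    the three pairings has the strictly smallest sum of within-pair path
    lengths (four-point condition); that pairing is the embedded topology.
    """
    hops = leaf_hop_distances(tree, leaves)
    total = 0.0
    for a, b, c, e in combinations(leaves, 4):
        pairings = [((a, b), (c, e)), ((a, c), (b, e)), ((a, e), (b, c))]
        (p, q), (r, s) = min(
            pairings,
            key=lambda pr: hops[pr[0][0]][pr[0][1]] + hops[pr[1][0]][pr[1][1]],
        )
        # Assumed cost convention: cost(pq|rs) = d(p, q) + d(r, s).
        total += d[frozenset((p, q))] + d[frozenset((r, s))]
    return total


# Toy instance: leaves A, B hang off internal node x; C, D hang off y.
tree = {
    "A": ["x"], "B": ["x"], "C": ["y"], "D": ["y"],
    "x": ["A", "B", "y"], "y": ["C", "D", "x"],
}
leaves = ["A", "B", "C", "D"]
d = {
    frozenset("AB"): 1.0, frozenset("CD"): 2.0,
    frozenset("AC"): 5.0, frozenset("AD"): 5.0,
    frozenset("BC"): 5.0, frozenset("BD"): 5.0,
}
print(quartet_tree_cost(tree, leaves, d))  # embedded topology is AB|CD -> 3.0
```

This evaluation enumerates all four-element subsets of the leaves, so it is only practical for small n; a heuristic for the MQTC problem would call such an evaluation (or a faster equivalent) on each candidate tree and keep the one with the lowest cost.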