Improved metaheuristics for the quartet method of hierarchical clustering
The quartet method is a novel hierarchical clustering approach where, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from the total number of possible combinations of quartet topologies on n, where optimality means that the sum of the dissi...
Published in: | Journal of global optimization 2020-10, Vol.78 (2), p.241-270 |
---|---|
Main authors: | Consoli, Sergio; Korst, Jan; Pauws, Steffen; Geleijnse, Gijs |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 270 |
---|---|
container_issue | 2 |
container_start_page | 241 |
container_title | Journal of global optimization |
container_volume | 78 |
creator | Consoli, Sergio; Korst, Jan; Pauws, Steffen; Geleijnse, Gijs |
description | The quartet method is a novel hierarchical clustering approach where, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from the total number of possible combinations of quartet topologies on n, where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as minimum quartet tree cost (MQTC) problem. We provide details and formulation of this challenging problem, and propose a basic greedy heuristic that is characterized by some appealing insights and findings for speeding up and simplifying the processes of solution generation and evaluation, such as the use of adjacency-like matrices to represent the topology structures of candidate solutions; fast calculation of coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges together highly connected objects which may bring lower values of the quartet cost function in a given partial solution. It will be shown that this basic greedy heuristic is able to improve consistently the performance of popular quartet clustering algorithms in the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, producing novel efficient solution approaches to the MQTC problem. |
doi_str_mv | 10.1007/s10898-019-00871-1 |
format | Article |
date | 2020-10-01 |
orcid | https://orcid.org/0000-0001-7357-5858 |
publisher | New York: Springer US |
rights | Springer Science+Business Media, LLC, part of Springer Nature 2020 |
stitle | J Glob Optim |
tpages | 30 |
fulltext | fulltext |
identifier | ISSN: 0925-5001 |
ispartof | Journal of global optimization, 2020-10, Vol.78 (2), p.241-270 |
issn | 0925-5001; 1573-2916 |
language | eng |
recordid | cdi_proquest_journals_2441386867 |
source | SpringerLink (Online service) |
subjects | Algorithms; Cluster analysis; Clustering; Combinatorial analysis; Computer Science; Computer simulation; Cost function; Enumeration; Heuristic methods; Mathematical analysis; Mathematics; Mathematics and Statistics; Matrix reduction; Operations Research/Decision Theory; Optimization; Permutations; Real Functions; Simulated annealing; Topology |
title | Improved metaheuristics for the quartet method of hierarchical clustering |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T06%3A05%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improved%20metaheuristics%20for%20the%20quartet%20method%20of%20hierarchical%20clustering&rft.jtitle=Journal%20of%20global%20optimization&rft.au=Consoli,%20Sergio&rft.date=2020-10-01&rft.volume=78&rft.issue=2&rft.spage=241&rft.epage=270&rft.pages=241-270&rft.issn=0925-5001&rft.eissn=1573-2916&rft_id=info:doi/10.1007/s10898-019-00871-1&rft_dat=%3Cgale_proqu%3EA718396623%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2441386867&rft_id=info:pmid/&rft_galeid=A718396623&rfr_iscdi=true |
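
The objective named in the description, the sum of the dissimilarities of the embedded quartet topologies, can be illustrated with a small brute-force sketch. It is not the greedy heuristic or either metaheuristic from the article. It assumes the formulation commonly used for the quartet method: the cost of a topology uv|wx is d(u,v) + d(w,x), and a tree embeds uv|wx when the path from u to v shares no node with the path from w to x. The function names, the five-leaf tree and the dissimilarity values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def path(tree, a, b):
    """Return the set of nodes on the unique path from a to b in a tree."""
    stack = [(a, None, [a])]
    while stack:
        node, parent, trail = stack.pop()
        if node == b:
            return set(trail)
        for nxt in tree[node]:
            if nxt != parent:
                stack.append((nxt, node, trail + [nxt]))
    raise ValueError("nodes are not connected")


def embedded_topology(tree, u, v, w, x):
    """Return the pairing of {u, v, w, x} whose two paths are node-disjoint."""
    pairings = (((u, v), (w, x)), ((u, w), (v, x)), ((u, x), (v, w)))
    for (a, b), (c, d) in pairings:
        if path(tree, a, b).isdisjoint(path(tree, c, d)):
            return (a, b), (c, d)
    raise ValueError("no topology is embedded; the tree is not fully resolved")


def tree_cost(tree, leaves, dist):
    """Sum d(a,b) + d(c,d) over the embedded topology ab|cd of every quartet."""
    total = 0.0
    for quartet in combinations(leaves, 4):
        (a, b), (c, d) = embedded_topology(tree, *quartet)
        total += dist[frozenset((a, b))] + dist[frozenset((c, d))]
    return total


# Hypothetical example: leaves a..e, internal nodes p, q, r (each of degree 3).
tree = {
    "a": ["p"], "b": ["p"], "c": ["q"], "d": ["r"], "e": ["r"],
    "p": ["a", "b", "q"], "q": ["p", "c", "r"], "r": ["q", "d", "e"],
}
leaves = ["a", "b", "c", "d", "e"]

# Hypothetical symmetric dissimilarities, keyed by unordered leaf pair.
raw = {"ab": 1, "ac": 2, "ad": 3, "ae": 3, "bc": 2,
       "bd": 3, "be": 3, "cd": 2, "ce": 2, "de": 1}
dist = {frozenset(pair): value for pair, value in raw.items()}

if __name__ == "__main__":
    print(tree_cost(tree, leaves, dist))
```

With n leaves there are C(n,4) quartets and three candidate topologies for each, so this exhaustive evaluation grows quickly with n. That growth is the motivation the description gives for faster solution generation and evaluation.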