Improved metaheuristics for the quartet method of hierarchical clustering

Bibliographic details
Published in: Journal of global optimization 2020-10, Vol. 78 (2), p. 241-270
Main authors: Consoli, Sergio; Korst, Jan; Pauws, Steffen; Geleijnse, Gijs
Format: Article
Language: English
Description
Summary: The quartet method is a novel hierarchical clustering approach in which, given a set of n data objects and their pairwise dissimilarities, the aim is to construct an optimal tree from all possible combinations of quartet topologies on the n objects, where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as the minimum quartet tree cost (MQTC) problem. We provide details and a formulation of this challenging problem, and propose a basic greedy heuristic built on several observations that speed up and simplify solution generation and evaluation: the use of adjacency-like matrices to represent the topology of candidate solutions; fast calculation of the coefficients and weights of the solution matrices; shortcuts in the enumeration of all solution permutations for a given configuration; and an iterative distance matrix reduction procedure, which greedily merges highly connected objects whose merging may lower the quartet cost function of a given partial solution. We show that this basic greedy heuristic consistently improves the performance of popular quartet clustering algorithms from the literature, namely a reduced variable neighbourhood search and a simulated annealing metaheuristic, yielding novel, efficient solution approaches to the MQTC problem.
ISSN: 0925-5001; 1573-2916
DOI: 10.1007/s10898-019-00871-1
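
The cost function named in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation and uses none of the matrix-based speed-ups the summary mentions. It assumes the standard quartet-method convention that a topology ab|cd has cost d(a,b) + d(c,d), and that the candidate tree is fully resolved with the objects as leaves. The names `tree_distances`, `quartet_tree_cost`, `adj`, `leaves`, and `d` are hypothetical. The embedded topology of each quartet is identified as the pairing with the smallest summed path length in the tree.

```python
from collections import deque
from itertools import combinations


def tree_distances(adj, leaves):
    """Hop counts from every leaf to every node, via breadth-first search."""
    dist = {}
    for start in leaves:
        seen = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        dist[start] = seen
    return dist


def quartet_tree_cost(adj, leaves, d):
    """Sum the costs of the quartet topologies embedded in the tree.

    Assumption: topology ab|cd costs d[a][b] + d[c][d], and the tree is
    fully resolved, so exactly one pairing per quartet has the strictly
    smallest summed path length and is the embedded one.
    """
    hops = tree_distances(adj, leaves)
    total = 0.0
    for a, b, c, e in combinations(leaves, 4):
        pairings = [((a, b), (c, e)), ((a, c), (b, e)), ((a, e), (b, c))]
        (p, q), (r, s) = min(
            pairings,
            key=lambda t: hops[t[0][0]][t[0][1]] + hops[t[1][0]][t[1][1]],
        )
        total += d[p][q] + d[r][s]
    return total


# Toy instance (made-up data): leaves a, b hang off internal node x,
# leaves c, d hang off internal node y, and x is joined to y.
leaves = ["a", "b", "c", "d"]
adj = {
    "a": ["x"], "b": ["x"], "c": ["y"], "d": ["y"],
    "x": ["a", "b", "y"], "y": ["c", "d", "x"],
}
pairs = {("a", "b"): 1, ("c", "d"): 1, ("a", "c"): 5,
         ("a", "d"): 5, ("b", "c"): 5, ("b", "d"): 5}
d = {u: {} for u in leaves}
for (u, v), w in pairs.items():
    d[u][v] = d[v][u] = w

cost = quartet_tree_cost(adj, leaves, d)
print(cost)  # 2.0: the single embedded topology is ab|cd
```

This brute-force evaluation enumerates every quartet, so it is only practical for small n; it is meant to make the objective concrete rather than to be efficient.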