A matheuristic for large-scale capacitated clustering

Clustering addresses the problem of assigning similar objects to groups. Since the size of the clusters is often constrained in practical clustering applications, various capacitated clustering problems have received increasing attention. We consider here the capacitated p-median problem (CPMP) in w...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Computers & operations research 2021-08, Vol.132, p.105304, Article 105304
Hauptverfasser: Gnägi, Mario, Baumann, Philipp
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Clustering addresses the problem of assigning similar objects to groups. Since the size of the clusters is often constrained in practical clustering applications, various capacitated clustering problems have received increasing attention. We consider here the capacitated p-median problem (CPMP) in which p objects are selected as cluster centers (medians) such that the total distance from these medians to their assigned objects is minimized. Each object is associated with a weight, and the total weight in each cluster must not exceed a given capacity. Numerous exact and heuristic solution approaches have been proposed for the CPMP. The state-of-the-art approach performs well for instances with up to 5,000 objects but becomes computationally expensive for instances with a much larger number of objects. We propose a matheuristic with new problem decomposition strategies that can deal with instances comprising up to 500,000 objects. In a computational experiment, the proposed matheuristic consistently outperformed the state-of-the-art approach on medium- and large-scale instances while having similar performance for small-scale instances. As an extension, we show that our matheuristic can be applied to related capacitated clustering problems, such as the capacitated centered clustering problem (CCCP). For several test instances of the CCCP, our matheuristic found new best-known solutions. •We introduce a scalable matheuristic for the capacitated p-median problem.•We achieve scalability with novel problem decomposition strategies.•We introduce new large-scale test instances with up to around 500,000 objects.•Our matheuristic represents the new state-of-the-art approach for large instances.•Our matheuristic shows leading performance for other capacitated clustering problems.
ISSN:0305-0548
1873-765X
0305-0548
DOI:10.1016/j.cor.2021.105304