Fast and Memory-Efficient Approximate Minimum Spanning Tree Generation for Large Datasets
Conventional minimum spanning tree (MST) algorithms typically start by creating a distance matrix of the n ( n - 1 ) / 2 pairs of data points, leading to a time complexity of O ( n 2 ) . This initial step poses a computational bottleneck. To overcome this limitation, we present a novel method that c...
Gespeichert in:
Veröffentlicht in: | Arabian journal for science and engineering (2011) 2025, Vol.50 (2), p.1233-1246 |
---|---|
Hauptverfasser: | , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Conventional minimum spanning tree (MST) algorithms typically start by creating a distance matrix of the
n
(
n
-
1
)
/
2
pairs of data points, leading to a time complexity of
O
(
n
2
)
. This initial step poses a computational bottleneck. To overcome this limitation, we present a novel method that constructs an initial random
k
-neighbor graph and optimizes this graph by employing a crawling technique to efficiently approximate the
k
Nearest Neighbors (
k
NN) graph. This crawling approach allows us to approximate the closest neighbors of each node. Subsequently, the approximate
k
NN graph is utilized to build an initial approximate MST and iteratively refine it by the same crawling process. Using this approach, an approximate MST can be obtained for a data set of size
n
with empirical cost around
O
(
n
1.07
)
and a minimal
O
(
n
) memory consumption. In contrast to spatial tree-based approaches, the presented method also scales well to high dimensional data. We have shown that the proposed approach achieves such a level of performance with only a marginal accuracy reduction between 0.5% and 6%. These qualities make it an excellent candidate for approximate MST calculation for high-dimensional, large data sets. |
---|---|
ISSN: | 2193-567X 1319-8025 2191-4281 |
DOI: | 10.1007/s13369-024-08974-y |