Computing the Expected Value and Variance of Geometric Measures

Let P be a point set in ℝ d , and let M be a function that maps any subset of P to a positive real. We examine the problem of computing the mean and variance of M when a subset in P is selected according to a random distribution. We consider two distributions; in the first distribution (the Bernoull...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:The ACM journal of experimental algorithmics 2018-11, Vol.23, p.1-32
Hauptverfasser: Tsirogiannis, Constantinos, Staals, Frank, Pellissier, Vincent
Format: Artikel
Sprache:eng
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Let P be a point set in ℝ d , and let M be a function that maps any subset of P to a positive real. We examine the problem of computing the mean and variance of M when a subset in P is selected according to a random distribution. We consider two distributions; in the first distribution (the Bernoulli distribution), each point p in P is included in the random subset independently, with probability π( p ). In the second distribution (the fixed-size distribution), exactly s points are selected uniformly at random among all possible subsets of s points in P . We present efficient algorithms for computing the mean and variance of several geometric measures when point sets are selected under one of the described random distributions. We also implemented four of those algorithms: an algorithm that computes the mean 2D bounding box volume in the Bernoulli distribution, an algorithm for the mean 2D convex hull area in the fixed-size distribution, an algorithm that computes the exact mean and variance of the mean pairwise distance (MPD) for d -dimensional point sets in the fixed-size distribution, and an (1 − ε)-approximation algorithm for the same measure. We conducted experiments where we compared the performance of our implementations with a standard heuristic approach, and we show that our implementations are very efficient. We also compared the implementation of our exact MPD algorithm and the (1 − ε)-approximation algorithm; the approximation method performs faster on real-world datasets for point sets of up to 13 dimensions, and provides high-precision approximations.
ISSN:1084-6654
1084-6654
DOI:10.1145/3228331