Sampling and sparsification for approximating the packedness of trajectories and detecting gatherings

Bibliographic details
Published in:	International journal of data science and analytics 2023-03, Vol.15 (2), p.201-216
Main authors:	Aghamolaei, Sepideh; Keikha, Vahideh; Ghodsi, Mohammad; Mohades, Ali
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Business Information Systems; Computational Biology/Bioinformatics; Computer Science; Data Mining and Knowledge Discovery; Database Management; Regular Paper
Online access:	Full text

Description
Abstract:	Packedness is a measure defined for curves: the maximum, over all disks, of the ratio of the curve length inside the disk to the disk's radius. Sparsification reduces the number of candidate disks for maximum packedness to a polynomial in the number of vertices of the polygonal curve, which gives an exact algorithm for computing packedness. We prove that using a fat shape, such as a square, instead of a disk gives a constant-factor approximation of packedness. Further sparsification using well-separated pair decomposition improves the time complexity at the cost of some accuracy. By adjusting the ratio of the separation factor to the size of the query, we improve the approximation factor of the existing algorithm for packedness using square queries. Our experiments show that uniform sampling works well for finding the average packedness of trajectories with almost constant speed. The empirical results confirm that the sparsification method approximates the maximum packedness for arbitrary polygonal curves. In big data models such as massively parallel computation, both sampling and sparsification are efficient and take a constant number of rounds, whereas most existing algorithms use line sweeping, which is sequential in nature. We also design two data structures for computing the length of the curve inside a query shape: an exact data structure for disks, called hierarchical aggregated queries, and an approximate data structure for a given set of square queries. Using our modified segment tree, we achieve a near-linear-time approximation algorithm.
ISSN:	2364-415X 2364-4168
DOI:	10.1007/s41060-021-00301-0
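
Illustration:	The definition of packedness in the abstract can be made concrete with a small sketch. The Python code below is not the authors' algorithm: it simply evaluates, by brute force, the ratio of curve length inside a disk to the disk's radius for a caller-supplied list of candidate disks. The function names `length_inside_disk` and `packedness_over_candidates`, and the idea of passing candidates as `(center, radius)` pairs, are assumptions made for this example only.

```python
import math


def length_inside_disk(curve, center, radius):
    """Length of the polygonal curve (list of (x, y) vertices) inside a disk."""
    cx, cy = center
    total = 0.0
    for (px, py), (qx, qy) in zip(curve, curve[1:]):
        dx, dy = qx - px, qy - py      # segment direction
        fx, fy = px - cx, py - cy      # segment start relative to the center
        a = dx * dx + dy * dy
        if a == 0.0:
            continue                   # degenerate segment, no length
        # Solve |p + t (q - p) - c|^2 = r^2 for the parameter t.
        b = 2.0 * (fx * dx + fy * dy)
        c = fx * fx + fy * fy - radius * radius
        disc = b * b - 4.0 * a * c
        if disc <= 0.0:
            continue                   # line misses or only touches the disk
        root = math.sqrt(disc)
        # Clip the chord's parameter interval to the segment [0, 1].
        t0 = max(0.0, (-b - root) / (2.0 * a))
        t1 = min(1.0, (-b + root) / (2.0 * a))
        if t1 > t0:
            total += (t1 - t0) * math.sqrt(a)
    return total


def packedness_over_candidates(curve, candidates):
    """Maximum length-to-radius ratio over hypothetical candidate disks."""
    return max(length_inside_disk(curve, c, r) / r for c, r in candidates)


if __name__ == "__main__":
    # A curve that crosses the unit disk twice along the x-axis.
    zigzag = [(-2.0, 0.0), (2.0, 0.0), (-2.0, 0.0)]
    disks = [((0.0, 0.0), 1.0), ((5.0, 5.0), 1.0)]
    print(packedness_over_candidates(zigzag, disks))
```

The result is only the maximum over the disks supplied; choosing a candidate set that provably contains the optimum is the part the paper addresses through sparsification, and it is not reproduced here.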