Worst-Case I/O-Efficient Skyline Algorithms
Published in: | ACM Transactions on Database Systems, 2012-12, Vol. 37 (4), p. 1-22 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | We consider the skyline problem (aka the maxima problem), which has been extensively studied in the database community. The input is a set $P$ of $d$-dimensional points. A point dominates another if the coordinate of the former is at most that of the latter on every dimension. The goal is to find the skyline, which is the set of points $p \in P$ such that $p$ is not dominated by any other point in $P$. The main result of this article is that, for any fixed dimensionality $d \ge 3$, in external memory the skyline problem can be settled by performing $O((N/B) \log_{M/B}^{d-2}(N/B))$ I/Os in the worst case, where $N$ is the cardinality of $P$, $B$ the size of a disk block, and $M$ the capacity of main memory. Similar bounds can also be achieved for computing several skyline variants, including the k-dominant skyline, k-skyband, and α-skyline. Furthermore, the performance can be improved if some dimensions of the data space have small domains. When the dimensionality $d$ is not fixed, the challenge is to outperform the naive algorithm that simply checks all pairs of points in $P \times P$. We give an algorithm that terminates in $O((N/B) \log^{d-2} N)$ I/Os, thus beating the naive solution for any $d = O(\log N / \log \log N)$. |
---|---|
ISSN: | 0362-5915, 1557-4644 |
DOI: | 10.1145/2389241.2389245 |
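
The dominance definition in the abstract, and the naive algorithm it mentions as the baseline, can be illustrated with a minimal in-memory sketch. This is not the article's I/O-efficient algorithm; it only checks all pairs of points. The names `dominates` and `naive_skyline` are hypothetical, and the sketch assumes the input points are distinct (duplicates are dropped), so that "another point" always means a different point.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p differs from q and p's coordinate is at most q's on every dimension."""
    return p != q and all(a <= b for a, b in zip(p, q))


def naive_skyline(points: Iterable[Point]) -> List[Point]:
    """Return the points not dominated by any other point (quadratic pairwise check)."""
    # Assumption: duplicates are removed; sorting only makes the output order deterministic.
    pts = sorted(set(points))
    return [p for p in pts if not any(dominates(q, p) for q in pts)]


example = [(1, 5, 3), (2, 2, 2), (3, 3, 3), (0, 9, 9), (2, 2, 4)]
print(naive_skyline(example))  # [(0, 9, 9), (1, 5, 3), (2, 2, 2)]
```

Here `(2, 2, 4)` and `(3, 3, 3)` are excluded because `(2, 2, 2)` is at most as large on every dimension. The pairwise check performs a number of comparisons quadratic in the number of points, which is the baseline the abstract's bounds are meant to beat.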