Continuous k-Regret Minimization Queries: A Dynamic Coreset Approach
Published in: | IEEE Transactions on Knowledge and Data Engineering 2023-01, Vol.35 (6), p.5680 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Summary: | Finding a small set of representative tuples from a large database is an important functionality for supporting multi-criteria decision making. Top-k queries and skyline queries are two widely studied queries for this task. However, both have limitations: a top-k query requires the user to provide a utility function, which is then used to find the k tuples with the highest scores; a skyline query does not need any user-specified utility function but cannot control the result size. To overcome these drawbacks, the k-regret minimization query was proposed and has recently received much attention, since it does not require any user-specified utility function and returns a fixed-size result set. Specifically, it selects a set of tuples with a pre-defined size from a database such that the maximum k-regret ratio, which captures how well the top-ranked tuple in the selected set represents the top-k tuples in the database for any possible utility function, is minimized. Although there are many methods for k-regret minimization query processing, most of them are designed for static databases without tuple insertions and deletions. The only known algorithm for processing continuous k-regret minimization queries (CkRMQ) in dynamic databases suffers from suboptimal approximation and high time complexity. In this paper, we propose a novel dynamic coreset-based approach, called DynCore, for CkRMQ processing. It achieves the same (asymptotically optimal) upper bound on the maximum k-regret ratio as the best-known static algorithm. Meanwhile, its time complexity is sublinear in the database size, which is significantly lower than that of the existing dynamic algorithm. The efficiency and effectiveness of DynCore are confirmed by experimental results on real-world and synthetic datasets. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2022.3166835 |
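
The k-regret ratio described in the summary can be made concrete with a small sketch. This is not DynCore and is not taken from the article; it only illustrates the quantity being minimized, under two assumptions that the record does not state: utility functions are nonnegative linear functions over the tuple attributes, and the maximum over all utility functions is approximated by random sampling (which can only underestimate the true maximum). The function names `k_regret_ratio` and `estimate_max_k_regret_ratio` are hypothetical.

```python
import random


def k_regret_ratio(database, subset, utility, k):
    """k-regret ratio of `subset` for one linear utility vector (assumed form).

    Compares the best score in `subset` with the k-th best score in `database`.
    """
    def score(t):
        return sum(w * x for w, x in zip(utility, t))

    kth_best = sorted((score(t) for t in database), reverse=True)[k - 1]
    if kth_best <= 0:
        return 0.0  # nothing to regret when the k-th best score is not positive
    best_in_subset = max(score(t) for t in subset)
    return max(0.0, 1.0 - best_in_subset / kth_best)


def estimate_max_k_regret_ratio(database, subset, k, samples=1000, seed=0):
    """Sampling-based lower estimate of the maximum k-regret ratio.

    Assumption: utilities are drawn as random nonnegative weight vectors.
    """
    rng = random.Random(seed)
    dim = len(database[0])
    worst = 0.0
    for _ in range(samples):
        utility = [rng.random() for _ in range(dim)]
        worst = max(worst, k_regret_ratio(database, subset, utility, k))
    return worst


if __name__ == "__main__":
    db = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6)]
    chosen = [(0.6, 0.6)]
    print(estimate_max_k_regret_ratio(db, chosen, k=1))
    print(estimate_max_k_regret_ratio(db, chosen, k=2))
```

Raising k relaxes the target: the selected set only has to compete with the k-th best tuple in the database rather than the single best one, so the ratio for a fixed set never increases as k grows.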