Guessing Cost: Bounds and Applications to Data Repair in Distributed Storage
The guesswork refers to the distribution of the minimum number of trials needed to guess a realization of a random variable accurately. In this study, a non-trivial generalization of the guesswork called guessing cost (also referred to as cost of guessing) is introduced, and an optimal strategy for...
Gespeichert in:
Veröffentlicht in: | IEEE transactions on information theory 2024-10, Vol.70 (10), p.6757-6779 |
---|---|
Hauptverfasser: | , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext bestellen |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | The guesswork refers to the distribution of the minimum number of trials needed to guess a realization of a random variable accurately. In this study, a non-trivial generalization of the guesswork called guessing cost (also referred to as cost of guessing) is introduced, and an optimal strategy for finding the \rho -th moment of guessing cost is provided for a random variable defined on a finite set whereby each choice is associated with a positive finite cost value (unit cost corresponds to the original guesswork). Moreover, we drive asymptotically tight upper and lower bounds on the logarithm of guessing cost moments. Similar to previous studies on the guesswork, established bounds on the moments of guessing cost quantify the accumulated cost of guesses required for correctly identifying the unknown choice and are expressed in terms of Rényi's entropy. Moreover, new random variables are introduced to establish connections between the guessing cost and the guesswork, leading to induced strategies. Establishing this implicit connection helped us obtain improved bounds for the non-asymptotic region. As a consequence, we establish the guessing cost exponent in terms of Rényi entropy rate on the moments of the guessing cost using the optimal strategy by considering a sequence of independent random variables with different cost distributions. Finally, with slight modifications to the original problem, these results are shown to be applicable for bounding the overall repair bandwidth for distributed data storage systems backed up by base stations and protected by bipartite graph codes. |
---|---|
ISSN: | 0018-9448 1557-9654 |
DOI: | 10.1109/TIT.2023.3339066 |