Cooperative Game in Dynamic Spectrum Access with Unknown Model and Imperfect Sensing

Bibliographic details
Published in: IEEE Transactions on Wireless Communications, 2012-04, Vol. 11 (4), p. 1596-1604
Main authors: Liu, Keqin; Zhao, Qing
Format: Article
Language: English
Description
Summary: We consider dynamic spectrum access where distributed secondary users search for spectrum opportunities without knowing the primary traffic statistics. In each slot, a secondary transmitter chooses one channel to sense and subsequently transmit if the channel is sensed as idle. Sensing is imperfect, i.e., an idle channel may be sensed as busy and vice versa. Without centralized control, each secondary user needs to independently identify the channels that offer the most opportunities while avoiding collisions with both primary and other secondary users. We address the problem within a cooperative game framework, where the objective is to maximize the throughput of the secondary network under a constraint on the collision with the primary system. The performance of a decentralized channel access policy is measured by the system regret, defined as the expected total performance loss with respect to the optimal performance in the ideal scenario where the traffic load of the primary system on each channel is known to all secondary users and collisions among secondary users are eliminated through centralized scheduling. By exploring the rich communication structure of the problem, we show that the optimal system regret has the same logarithmic order as in the centralized counterpart with perfect sensing. A decentralized policy is constructed to achieve the logarithmic order of the system regret. In a broader context, this work addresses imperfect reward observation in decentralized multi-armed bandit problems.
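The sense-then-transmit slot structure with imperfect sensing described in the summary can be illustrated with a minimal Python sketch. This is not the policy constructed in the article; it only simulates a single secondary user that stays on one fixed channel. The function names and the parameters `p_false_alarm` (an idle channel sensed as busy), `p_miss` (a busy channel sensed as idle), and `idle_probs` are hypothetical names chosen for illustration, and the i.i.d. per-slot channel state is an assumption.

```python
import random


def sense_and_transmit(channel_idle, p_false_alarm, p_miss, rng):
    """Simulate one slot on one channel (illustrative assumption, not the paper's policy).

    Returns (sensed_idle, success, collision_with_primary).
    """
    if channel_idle:
        # False alarm: an idle channel is sensed as busy with prob. p_false_alarm.
        sensed_idle = rng.random() >= p_false_alarm
    else:
        # Miss detection: a busy channel is sensed as idle with prob. p_miss.
        sensed_idle = rng.random() < p_miss
    transmitted = sensed_idle  # transmit only if the channel is sensed as idle
    success = transmitted and channel_idle
    collision = transmitted and not channel_idle
    return sensed_idle, success, collision


def simulate(idle_probs, channel, slots, p_false_alarm, p_miss, seed=0):
    """Count successful transmissions and primary collisions on a fixed channel."""
    rng = random.Random(seed)
    successes = 0
    collisions = 0
    for _ in range(slots):
        # Assumed model: the channel is idle independently in each slot.
        idle = rng.random() < idle_probs[channel]
        _, ok, hit = sense_and_transmit(idle, p_false_alarm, p_miss, rng)
        successes += ok
        collisions += hit
    return successes, collisions
```

For example, `simulate([0.3, 0.8], channel=1, slots=1000, p_false_alarm=0.1, p_miss=0.05)` returns the success and collision counts for the second channel. Comparing such counts against a user who always picks the channel with the highest idle probability gives a rough feel for the performance loss that the regret measures.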
ISSN: 1536-1276, 1558-2248
DOI: 10.1109/TWC.2012.020812.111547