Learning in Repeated Multi-Unit Pay-As-Bid Auctions

Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, Procurement Auctions, and Wholesale Electricity Markets, which all involve the auctioning of homogeneous multiple units, we consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions. In each of these aucti...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2024-11
Hauptverfasser: Galgana, Rigel, Golrezaei, Negin
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Motivated by Carbon Emissions Trading Schemes, Treasury Auctions, Procurement Auctions, and Wholesale Electricity Markets, which all involve the auctioning of homogeneous multiple units, we consider the problem of learning how to bid in repeated multi-unit pay-as-bid auctions. In each of these auctions, a large number of (identical) items are to be allocated to the largest submitted bids, where the price of each of the winning bids is equal to the bid itself. In this work, we study the problem of optimizing bidding strategies from the perspective of a single bidder. Effective bidding in pay-as-bid (PAB) auctions is complex due to the combinatorial nature of the action space. We show that a utility decoupling trick enables a polynomial time algorithm to solve the offline problem where competing bids are known in advance. Leveraging this structure, we design efficient algorithms for the online problem under both full information and bandit feedback settings that achieve an upper bound on regret of \(O(M \sqrt{T \log T})\) and \(O(M T^{\frac{2}{3}} \sqrt{\log T})\) respectively, where \(M\) is the number of units demanded by the bidder and \(T\) is the total number of auctions. We accompany these results with a regret lower bound of \(\Omega(M\sqrt{T})\) for the full information setting and \(\Omega (M^{2/3}T^{2/3})\) for the bandit setting. We also present additional findings on the characterization of PAB equilibria. While the Nash equilibria of PAB auctions possess nice properties such as winning bid uniformity and high welfare \& revenue, they are not guaranteed under no regret learning dynamics. Nevertheless, our simulations suggest these properties hold anyways, regardless of Nash equilibrium existence. Compared to its uniform price counterpart, the PAB dynamics converge faster and achieve higher revenue, making PAB appealing whenever revenue holds significant social value.
ISSN:2331-8422