An optimal bidimensional multi-armed bandit auction for multi-unit procurement

We study the problem of a buyer who gains stochastic rewards by procuring through an auction, multiple units of a service or item from a pool of heterogeneous agents who are strategic on two dimensions, namely cost and capacity. The reward obtained for a single unit from an allocated agent depends o...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Annals of mathematics and artificial intelligence 2019, Vol.85 (1), p.1-19
Hauptverfasser: Bhat, Satyanath, Jain, Shweta, Gujar, Sujit, Narahari, Y.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:We study the problem of a buyer who gains stochastic rewards by procuring through an auction, multiple units of a service or item from a pool of heterogeneous agents who are strategic on two dimensions, namely cost and capacity. The reward obtained for a single unit from an allocated agent depends on the inherent quality of the agent; the agent’s quality is fixed but unknown. Each agent can only supply a limited number of units (capacity of the agent). The cost incurred per unit and capacity (maximum number of units that can be supplied) are private information of each agent. The auctioneer is required to elicit from the agents their costs as well as capacities (making the mechanism design bidimensional) and further, learn the qualities of the agents as well, with a view to maximize her utility. Motivated by this, we design a bidimensional multi-armed bandit procurement auction that seeks to maximize the expected utility of the auctioneer subject to incentive compatibility and individual rationality, while simultaneously learning the unknown qualities of the agents. We first work with the assumption that the qualities are known, and propose an optimal, truthful mechanism 2D-OPT for the auctioneer to elicit costs and capacities. Next, in order to learn the qualities of the agents as well, we provide sufficient conditions for a learning algorithm to be Bayesian incentive compatible and individually rational. We finally design a novel learning mechanism, 2D-UCB that is stochastic Bayesian incentive compatible and individually rational.
ISSN:1012-2443
1573-7470
DOI:10.1007/s10472-018-9611-0