Policy Optimization Using Semi-parametric Models for Dynamic Pricing
Main authors: | , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | In this paper, we study the contextual dynamic pricing problem where the
market value of a product is linear in its observed features plus some market
noise. Products are sold one at a time, and only a binary response indicating
success or failure of a sale is observed. Our model setting is similar to
Javanmard and Nazerzadeh [2019] except that we expand the demand curve to a
semiparametric model and need to learn dynamically both parametric and
nonparametric components. We propose a dynamic statistical learning and
decision-making policy that combines semiparametric estimation from a
generalized linear model with an unknown link and online decision-making to
minimize regret (maximize revenue). Under mild conditions, we show that for a
market noise c.d.f. $F(\cdot)$ with an $m$-th order derivative ($m\geq 2$), our
policy achieves a regret upper bound of $\tilde{O}_{d}(T^{\frac{2m+1}{4m-1}})$,
where $T$ is the time horizon and $\tilde{O}_{d}$ denotes an order that hides
logarithmic terms and the dependence on the feature dimension $d$. The upper bound is
further reduced to $\tilde{O}_{d}(\sqrt{T})$ if $F$ is super smooth, that is, if its
Fourier transform decays exponentially. In terms of dependence on the horizon
$T$, these upper bounds are close to $\Omega(\sqrt{T})$, the lower bound when
$F$ belongs to a parametric class. We further generalize these results to the
case with dynamically dependent product features under the strong mixing
condition. |
---|---|
DOI: | 10.48550/arxiv.2109.06368 |
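
As a rough illustration of the setting described in the summary, the following Python sketch shows two things: the exponent $\frac{2m+1}{4m-1}$ from the stated regret bound as a function of the smoothness order $m$, and one round of the binary sale feedback. The function names `regret_exponent` and `simulate_sale` are hypothetical, and the rule that a sale succeeds exactly when the posted price does not exceed the market value is an assumption made here for illustration; the summary only states that a binary success or failure response is observed. This is not the paper's policy, only a sketch of the model it operates on.

```python
def regret_exponent(m: int) -> float:
    """Exponent of T in the stated upper bound, (2m + 1) / (4m - 1)."""
    if m < 2:
        # The summary states the bound only for m >= 2.
        raise ValueError("the bound is stated for m >= 2")
    return (2 * m + 1) / (4 * m - 1)


def simulate_sale(features, theta, price, noise_sampler):
    """Return 1 if the sale succeeds and 0 otherwise.

    The market value is linear in the features plus noise, as in the summary.
    Assumption (not stated in the summary): the sale succeeds exactly when
    the posted price is at most the market value.
    """
    linear_part = sum(x * t for x, t in zip(features, theta))
    market_value = linear_part + noise_sampler()
    return 1 if price <= market_value else 0


if __name__ == "__main__":
    # The exponent decreases toward 1/2 as the smoothness order m grows.
    for m in (2, 3, 10, 100):
        print(m, regret_exponent(m))
```

For $m = 2$ the exponent is $5/7$, and since $\frac{2m+1}{4m-1} - \frac{1}{2} = \frac{3}{2(4m-1)} > 0$, it stays above $1/2$ for every finite $m$ while approaching it as $m$ grows, consistent with the $\sqrt{T}$ rate quoted for the super smooth case.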