Online Regularization for High-Dimensional Dynamic Pricing Algorithms
We propose a novel online regularization scheme for revenue-maximization in high-dimensional dynamic pricing algorithms. The online regularization scheme equips the proposed optimistic online regularized maximum likelihood pricing (OORMLP) algorithm with three major advantages: encode market noise knowledge into pricing process optimism; empower online statistical learning with always-validity over all decision points; envelop prediction error process with time-uniform non-asymptotic oracle inequalities. This type of non-asymptotic inference results allows us to design safer and more robust dynamic pricing algorithms in practice. In theory, the proposed OORMLP algorithm exploits the sparsity structure of high-dimensional models and obtains a logarithmic regret in a decision horizon. These theoretical advances are made possible by proposing an optimistic online LASSO procedure that resolves dynamic pricing problems at the process level, based on a novel use of non-asymptotic martingale concentration. In experiments, we evaluate OORMLP in different synthetic pricing problem settings and observe that OORMLP performs better than RMLP proposed in <cit.>.
READ FULL TEXT