On Solution Functions of Optimization: Universal Approximation and Covering Number Bounds
We study the expressibility and learnability of convex optimization solution functions and their multi-layer architectural extension. The main results are: (1) the class of solution functions of linear programming (LP) and quadratic programming (QP) is a universal approximant for the C^k smooth model class or some restricted Sobolev space, and we characterize the rate-distortion, (2) the approximation power is investigated through a viewpoint of regression error, where information about the target function is provided in terms of data observations, (3) compositionality in the form of a deep architecture with optimization as a layer is shown to reconstruct some basic functions used in numerical analysis without error, which implies that (4) a substantial reduction in rate-distortion can be achieved with a universal network architecture, and (5) we discuss the statistical bounds of empirical covering numbers for LP/QP, as well as a generic optimization problem (possibly nonconvex) by exploiting tame geometry. Our results provide the first rigorous analysis of the approximation and learning-theoretic properties of solution functions with implications for algorithmic design and performance guarantees.
READ FULL TEXT