Learning the Hypotheses Space from data Part I: Learning Space and U-curve Property
The agnostic PAC learning model consists of: a Hypothesis Space H, a probability distribution P, a sample complexity function m_H(ϵ,δ): [0,1]^2 → Z_+ of precision ϵ and confidence 1 − δ, a finite i.i.d. sample D_N, a cost function ℓ, and a learning algorithm A(H,D_N), which estimates ĥ ∈ H, an approximation of a target function h* ∈ H, seeking to minimize the out-of-sample error. In this model, prior information is represented by H and ℓ, while problems are solved by instantiating them in applied learning models with specific algebraic structures for H and corresponding learning algorithms. However, these applied models rely on two important concepts not covered by classic PAC learning theory: model selection and regularization. This paper presents an extension of the model that covers these concepts. The main principle added is the selection, based solely on data, of a subspace of H with a VC dimension compatible with the available sample. In order to formalize this principle, the concept of a Learning Space L(H), a poset of subsets of H that covers H and satisfies a property regarding the VC dimension of related subspaces, is presented as the natural search space for model selection algorithms. A remarkable result obtained in this new framework is a set of conditions on L(H) and ℓ under which the estimated out-of-sample error surfaces are true U-curves on chains of L(H), enabling a more efficient search of L(H). Hence, in this new framework, the U-curve optimization problem becomes a natural component of model selection algorithms.
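For reference, the sample complexity function admits the standard agnostic PAC formulation below. This is a minimal sketch in textbook notation and may differ from the paper's exact definitions; L(h) here denotes the out-of-sample error of h under P and ℓ, and the probability is taken over the i.i.d. sample D_N of size N:

\[
L(h) \;=\; \mathbb{E}_{Z \sim P}\big[\ell(h, Z)\big], \qquad
N \geq m_H(\epsilon, \delta) \;\Longrightarrow\;
\mathbb{P}\Big(L(\hat{h}) - \min_{h \in H} L(h) \leq \epsilon\Big) \geq 1 - \delta,
\]

for every ϵ, δ ∈ (0,1) and every distribution P.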
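The practical value of the U-curve property is that a search along a chain of L(H) can stop as soon as the estimated error begins to rise. The sketch below is a hypothetical illustration, not the paper's algorithm: the names chain and estimate_error are assumptions, and a strict U-curve along the chain is assumed, so that the first local minimum is global.

# Minimal sketch of early-stopped search along one chain of L(H).
# Assumption (not from the paper): the estimated error is a strict
# U-curve along the chain, so the first local minimum is global.
def u_curve_minimum(chain, estimate_error):
    """Return the subspace in `chain` minimizing the estimated error.

    `chain` is a list of subspaces ordered by increasing VC dimension;
    `estimate_error` maps a subspace to its estimated out-of-sample
    error (hypothetical callable, e.g. a cross-validation estimate
    computed on D_N).
    """
    best_space, best_err = chain[0], estimate_error(chain[0])
    for space in chain[1:]:
        err = estimate_error(space)
        if err > best_err:
            # Under the U-curve property the error cannot decrease
            # again further along the chain, so stop early instead of
            # evaluating every remaining subspace.
            return best_space, best_err
        best_space, best_err = space, err
    return best_space, best_err

The early stop is what makes searching a chain cheaper than exhaustive evaluation: in the worst case every subspace is visited, but whenever the minimum lies early in the chain, the high-VC-dimension subspaces are never evaluated at all.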