One way of introducing sparsity into deep networks is by attaching an ex...
It is well established that increasing scale in deep transformer network...
Deep and wide neural networks successfully fit very complex functions to...
Recent investigations in noise contrastive estimation suggest, both empi...
We consider a general statistical estimation problem wherein binary labe...
It is well established that training deep neural networks gives useful r...
We develop an approach for estimating models described via conditional m...
Given one sample X ∈ {±1}^n from an Ising model Pr[X=x] ∝ exp(x^⊤ J x/2), whose ...
Spin glass models, such as the Sherrington-Kirkpatrick, Hopfield and Isi...
Statistical learning theory has largely focused on learning and generali...
The standard linear and logistic regression models assume that the respo...
Asynchronous Gibbs sampling has been recently shown to be fast-mixing an...
A popular methodology for building binary decision-making classifiers in...
We prove near-tight concentration of measure for polynomial functions of...