The question of what makes a data distribution suitable for deep learnin...
Graph neural networks (GNNs) are widely used for modeling complex
intera...
In the pursuit of explaining implicit regularization in deep learning,
p...
Implicit regularization in deep learning is perceived as a tendency of
g...
Language models that utilize extensive self-supervised pre-training from...
Mathematically characterizing the implicit regularization induced by
gra...
Attention based models have become the new state-of-the-art in natural
l...