We continue the investigation into the power of smaller Transformer-base...
In this paper, we investigate the impact of stochasticity and large step...
SGD (with momentum) and AdamW are the two most used optimizers for fine-...
Training computer vision models usually requires collecting and labeling...
We provide a detailed evaluation of various image classification archite...
We propose a synthetic task, LEGO (Learning Equality and Group Operation...
Data augmentation is a cornerstone of the machine learning pipeline, yet...
We study the function space characterization of the inductive bias resul...
Understanding generalization in deep learning is arguably one of the mos...
We provide a detailed asymptotic study of gradient flow trajectories and...
We present a direct (primal only) derivation of Mirror Descent as a "par...
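The entry above concerns a derivation of Mirror Descent. As a generic illustration of the algorithm itself (not the paper's derivation), here is a minimal sketch of mirror descent under the negative-entropy mirror map on the probability simplex, better known as exponentiated gradient; the function `exponentiated_gradient`, the cost vector, and all parameter values are illustrative assumptions:

```python
import numpy as np

# Mirror descent with the negative-entropy mirror map on the simplex
# (exponentiated gradient). Minimizing a linear loss <c, x> over the
# probability simplex concentrates the mass on the argmin of c.
def exponentiated_gradient(c, steps=500, lr=0.1):
    x = np.full(len(c), 1.0 / len(c))  # start at the uniform distribution
    for _ in range(steps):
        x = x * np.exp(-lr * c)        # multiplicative (dual-space) update
        x = x / x.sum()                # Bregman projection = renormalization
    return x

c = np.array([3.0, 1.0, 2.0])
x = exponentiated_gradient(c)
# x places essentially all probability mass on index 1, the smallest cost
```

The multiplicative form is exactly what the entropy mirror map induces: the additive gradient step happens in the dual (log) space, and projecting back onto the simplex reduces to renormalization.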
A recent line of work studies overparametrized neural networks in the "k...
Normalization methods such as batch normalization are commonly used in o...
A recent line of work studies overparametrized neural networks in the "k...
With an eye toward understanding complexity control in deep learning, we...
We study the interplay between sequential decision making and avoiding d...
We show that gradient descent on full-width linear convolutional network...
The implicit bias of gradient descent is not fully understood even in si...
We study the bias of generic optimization methods, including Mirror Desc...
We study implicit regularization when optimizing an underdetermined quad...
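The entry above concerns implicit regularization on an underdetermined quadratic. A standard, verifiable instance of this phenomenon (offered as a generic sketch, not as the paper's analysis) is that gradient descent on underdetermined least squares, initialized at zero, converges to the minimum-Euclidean-norm interpolating solution, i.e. the pseudoinverse solution; the dimensions, step size, and iteration count below are illustrative assumptions:

```python
import numpy as np

# Underdetermined least squares: 10 unknowns, 3 equations, so infinitely
# many w satisfy A @ w = b. Which one does gradient descent pick?
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 10))
b = rng.standard_normal(3)

# Gradient descent on f(w) = 0.5 * ||A w - b||^2 from zero initialization.
# Starting in the row space of A, the iterates never leave it.
w = np.zeros(10)
lr = 0.01
for _ in range(20000):
    w -= lr * A.T @ (A @ w - b)

# The minimum-norm interpolant is the Moore-Penrose pseudoinverse solution.
w_min_norm = np.linalg.pinv(A) @ b
print(np.allclose(w, w_min_norm, atol=1e-6))
```

The implicit bias here comes entirely from the geometry of the updates: every gradient lies in the row space of `A`, so the limit is the unique interpolant in that subspace, which is the minimum-norm one.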
We propose a novel and efficient algorithm for the collaborative prefere...
This work proposes a new algorithm for automated and simultaneous phenot...
In this paper, we present a unified analysis of matrix completion under ...
We consider the matrix completion problem of recovering a structured mat...
We address the collective matrix completion problem of jointly recoverin...