CoSCL: Cooperation of Small Continual Learners is Stronger than a Big One

by   Liyuan Wang, et al.

Continual learning requires incremental compatibility with a sequence of tasks. However, the design of model architecture remains an open question: In general, learning all tasks with a shared set of parameters suffers from severe interference between tasks; while learning each task with a dedicated parameter subspace is limited by scalability. In this work, we theoretically analyze the generalization errors for learning plasticity and memory stability in continual learning, which can be uniformly upper-bounded by (1) discrepancy between task distributions, (2) flatness of loss landscape and (3) cover of parameter space. Then, inspired by the robust biological learning system that processes sequential experiences with multiple parallel compartments, we propose Cooperation of Small Continual Learners (CoSCL) as a general strategy for continual learning. Specifically, we present an architecture with a fixed number of narrower sub-networks to learn all incremental tasks in parallel, which can naturally reduce the two errors through improving the three components of the upper bound. To strengthen this advantage, we encourage to cooperate these sub-networks by penalizing the difference of predictions made by their feature representations. With a fixed parameter budget, CoSCL can improve a variety of representative continual learning approaches by a large margin (e.g., up to 10.64 CUB-200-2011 and 6.72 performance.


Energy-Based Models for Continual Learning

We motivate Energy-Based Models (EBMs) as a promising model class for co...

Unicorn: Continual Learning with a Universal, Off-policy Agent

Some real-world domains are best characterized as a single task, but for...

Is Multi-Task Learning an Upper Bound for Continual Learning?

Continual and multi-task learning are common machine learning approaches...

Multiple Modes for Continual Learning

Adapting model parameters to incoming streams of data is a crucial facto...

Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates

Continual learning (CL) refers to the ability of an intelligent system t...

Incorporating Neuro-Inspired Adaptability for Continual Learning in Artificial Intelligence

Continual learning aims to empower artificial intelligence (AI) with str...

Multimodal Parameter-Efficient Few-Shot Class Incremental Learning

Few-Shot Class Incremental Learning (FSCIL) is a challenging continual l...

Please sign up or login with your details

Forgot password? Click here to reset