Large-scale Transformer models bring significant improvements for variou...
While vision transformers (ViTs) have continuously achieved new mileston...
The conventional lottery ticket hypothesis (LTH) claims that there exist...
Recently, sparse training has emerged as a promising paradigm for effici...
State-of-the-art Transformer-based models, with gigantic parameters, are...
Recent research demonstrated the promise of using resistive random acces...
Pretrained large-scale language models have increasingly demonstrated hi...
Structured weight pruning is a representative model compression techniqu...