There is growing interest in adapting large-scale language models using...
The recent advance of self-supervised learning associated with the Trans...
While model compression is increasingly important because of large neura...
Even though fine-grained pruning techniques achieve a high compression r...
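As a rough sketch of what fine-grained (unstructured) pruning refers to, the NumPy snippet below zeroes out the smallest-magnitude entries of a weight matrix until a target sparsity is reached; the matrix shape and the 90% sparsity target are illustrative assumptions, not values taken from the work listed here.

```python
# Minimal sketch of fine-grained (magnitude) pruning; shapes and sparsity are assumptions.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # fine-grained, element-wise mask
    return weights * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)    # illustrative layer shape
w_pruned = magnitude_prune(w, sparsity=0.9)
print("fraction zeroed:", float(np.mean(w_pruned == 0)))  # roughly 0.9
```

Because the surviving weights are scattered irregularly, the resulting sparsity pattern is hard to exploit efficiently on commodity hardware, which is the usual caveat attached to the high compression ratios of fine-grained pruning.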
Various post-training uniform quantization methods have usually been stu...
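For reference, a minimal sketch of post-training uniform (affine) quantization is given below, assuming simple per-tensor min/max calibration to signed 8-bit integers; the methods studied in the papers listed here are more refined, so this is only an illustration of the basic scheme.

```python
# Minimal sketch of post-training uniform (affine) quantization; calibration is assumed per-tensor.
import numpy as np

def uniform_quantize(x: np.ndarray, num_bits: int = 8):
    """Affine quantization of a tensor with per-tensor min/max calibration."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = round(float(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512)).astype(np.float32)    # illustrative weight tensor
q, scale, zp = uniform_quantize(w, num_bits=8)
err = float(np.abs(dequantize(q, scale, zp) - w).mean())
print(f"mean absolute reconstruction error: {err:.5f}")
```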
The Transformer is widely used in Neural Machine Translation (NMT). De...
Quantization based on the binary codes is gaining attention because each...
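A minimal sketch of the binary-coding idea follows: a weight vector is approximated as a sum of scaled binary codes, w ≈ Σ_i α_i b_i with b_i ∈ {−1, +1}^n, so the full-precision weights are replaced by a few bits per element plus shared scaling factors. The greedy residual fit below is a standard illustration and not necessarily the exact scheme used in the work above.

```python
# Minimal sketch of binary-coding quantization via a greedy residual fit; bit width is an assumption.
import numpy as np

def binary_code_quantize(w: np.ndarray, num_bits: int = 3):
    """Approximate w as sum_i alpha_i * b_i with b_i in {-1, +1}, one (alpha, b) pair per bit."""
    residual = w.astype(np.float32).copy()
    alphas, codes = [], []
    for _ in range(num_bits):
        b = np.where(residual >= 0, 1.0, -1.0).astype(np.float32)  # binary code for this bit
        alpha = np.abs(residual).mean()   # least-squares optimal scale for b = sign(residual)
        alphas.append(alpha)
        codes.append(b)
        residual -= alpha * b
    return np.array(alphas), np.stack(codes)

def reconstruct(alphas: np.ndarray, codes: np.ndarray) -> np.ndarray:
    return np.tensordot(alphas, codes, axes=1)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)           # illustrative weight vector
alphas, codes = binary_code_quantize(w, num_bits=3)
print("mean absolute error:", float(np.abs(reconstruct(alphas, codes) - w).mean()))
```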
The number of parameters in deep neural networks (DNNs) is rapidly incre...
Model compression techniques, such as pruning and quantization, are beco...