research
∙
11/30/2022
Fast Inference from Transformers via Speculative Decoding
Inference from large autoregressive models like Transformers is slow - d...
research
∙
10/17/2022