MP-Rec: Hardware-Software Co-Design to Enable Multi-Path Recommendation

02/21/2023
by Samuel Hsia, et al.

Deep learning recommendation systems serve personalized content under diverse tail-latency targets and input-query loads. To do so, state-of-the-art recommendation models rely on terabyte-scale embedding tables to learn user preferences over large bodies of content. The reliance on a fixed embedding representation of embedding tables not only imposes significant memory capacity and bandwidth requirements but also limits the scope of compatible system solutions. This paper challenges the assumption of fixed embedding representations by showing how synergies between embedding representations and hardware platforms can lead to improvements in both algorithmic and system performance. Based on our characterization of various embedding representations, we propose a hybrid embedding representation that achieves higher-quality embeddings at the cost of increased memory and compute requirements. To address the system performance challenges of the hybrid representation, we propose MP-Rec, a co-design technique that exploits heterogeneity and dynamic selection of embedding representations and underlying hardware platforms. On real system hardware, we demonstrate how matching custom accelerators, i.e., GPUs, TPUs, and IPUs, with compatible embedding representations can lead to 16.65x performance speedup. Additionally, in query-serving scenarios, MP-Rec achieves 2.49x and 3.76x higher correct prediction throughput and 0.19% and 0.22% better model quality on Kaggle and Terabyte datasets, respectively.
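The idea of dynamically selecting an embedding representation and hardware platform per query load can be illustrated with a minimal sketch. This is not the MP-Rec implementation: the `Path` class, the `select_path` function, the path names, and every quality and latency number below are hypothetical placeholders chosen only to show the selection logic (pick the highest-quality path that meets the latency target, otherwise fall back to the fastest one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """One (embedding representation, hardware) pairing. All fields are illustrative."""
    name: str               # hypothetical label for the pairing
    quality: float          # made-up model-quality score, higher is better
    base_latency_ms: float  # made-up fixed cost per batch
    per_query_ms: float     # made-up cost added per query in the batch

    def latency_ms(self, batch_size: int) -> float:
        # Simple linear latency model, assumed purely for illustration.
        return self.base_latency_ms + self.per_query_ms * batch_size


def select_path(paths, batch_size, sla_ms):
    """Return the highest-quality path that meets the latency target.

    If no path meets the target, fall back to the lowest-latency path.
    """
    feasible = [p for p in paths if p.latency_ms(batch_size) <= sla_ms]
    if feasible:
        return max(feasible, key=lambda p: p.quality)
    return min(paths, key=lambda p: p.latency_ms(batch_size))


# Hypothetical paths; the numbers are placeholders, not measurements.
PATHS = [
    Path("table-on-cpu", quality=0.800, base_latency_ms=1.0, per_query_ms=0.05),
    Path("compute-based-on-accelerator", quality=0.805, base_latency_ms=3.0, per_query_ms=0.01),
    Path("hybrid-on-accelerator", quality=0.810, base_latency_ms=5.0, per_query_ms=0.02),
]

if __name__ == "__main__":
    for batch_size, sla_ms in [(10, 2.0), (100, 5.0), (100, 10.0), (1000, 10.0)]:
        chosen = select_path(PATHS, batch_size, sla_ms)
        print(f"batch={batch_size:5d} sla={sla_ms:5.1f} ms -> {chosen.name}")
```

Under these made-up numbers, a loose latency target lets the selector use the higher-quality hybrid path, while tight targets or heavy loads push it toward cheaper paths. A real system would replace the linear latency model with measured profiles per platform.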

research
10/12/2020

MicroRec: Efficient Recommendation Inference by Hardware and Data Structure Solutions

Deep neural networks are widely used in personalized recommendation syst...
research
05/18/2021

RecPipe: Co-designing Models and Hardware to Jointly Optimize Recommendation Quality and Performance

Deep learning recommendation systems must provide high quality, personal...
research
03/01/2021

High-Performance Training by Exploiting Hot-Embeddings in Recommendation Systems

Recommendation models are commonly used learning models that suggest rel...
research
11/04/2020

Understanding Capacity-Driven Scale-Out Neural Recommendation Inference

Deep learning recommendation models have grown to the terabyte scale. Tr...
research
09/11/2020

Accelerating Recommender Systems via Hardware "scale-in"

In today's era of "scale-out", this paper makes the case that a speciali...
research
12/02/2022

DisaggRec: Architecting Disaggregated Systems for Large-Scale Personalized Recommendation

Deep learning-based personalized recommendation systems are widely used ...
research
01/25/2022

RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation

We propose RecShard, a fine-grained embedding table (EMB) partitioning a...