IOCA: High-Speed I/O-Aware LLC Management for Network-Centric Multi-Tenant Platform

by Yifan Yuan, et al.

In modern server CPUs, the last-level cache (LLC) is a critical hardware resource that significantly influences workload performance, and managing it well is key to performance isolation and QoS in multi-tenant clouds. In this paper, we argue that besides CPU cores, high-speed network I/O is also important for LLC management. This is because of an Intel architectural innovation, Data Direct I/O (DDIO), that directly injects inbound I/O traffic into (part of) the LLC instead of main memory. We summarize two problems caused by DDIO and show that (1) the default DDIO configuration may not always achieve optimal performance, and (2) DDIO can decrease the performance of non-I/O workloads that share the LLC with it by as much as 32%. We then present IOCA, the first LLC management mechanism for network-centric platforms that treats I/O as a first-class citizen. IOCA monitors and analyzes the performance of the cores, LLC, and DDIO using the CPU's hardware performance counters, and adaptively adjusts the number of LLC ways for DDIO or for the tenants that demand more LLC capacity. In addition, IOCA dynamically chooses which tenants share LLC resources with DDIO, to minimize the performance interference from both the tenants and the I/O. Our experiments with multiple microbenchmarks and real-world applications in two major end-host network models demonstrate that IOCA can effectively reduce the performance degradation caused by DDIO, with minimal overhead.
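To make the idea of adaptively adjusting the number of LLC ways for DDIO concrete, the following is a minimal sketch, not the paper's algorithm. It assumes that LLC ways are expressed as a contiguous bitmask and that some monitor supplies a DDIO miss rate between 0 and 1. The function names, the thresholds, the way limits, and the 11-way LLC size are all hypothetical values chosen for illustration.

```python
def ways_to_mask(n_ways: int, total_ways: int = 11) -> int:
    """Return a bitmask with the n_ways highest of total_ways bits set.

    total_ways=11 is an assumed LLC associativity, not a value from the abstract.
    """
    if not 1 <= n_ways <= total_ways:
        raise ValueError("n_ways must be between 1 and total_ways")
    return ((1 << n_ways) - 1) << (total_ways - n_ways)


def next_ddio_ways(
    current: int,
    ddio_miss_rate: float,
    grow_above: float = 0.30,    # hypothetical threshold
    shrink_below: float = 0.05,  # hypothetical threshold
    min_ways: int = 1,           # hypothetical lower bound
    max_ways: int = 6,           # hypothetical upper bound
) -> int:
    """Pick the DDIO way count for the next interval from one sampled miss rate."""
    if ddio_miss_rate > grow_above:
        # Inbound traffic is missing the LLC often: give DDIO one more way.
        return min(current + 1, max_ways)
    if ddio_miss_rate < shrink_below:
        # DDIO has spare capacity: return one way to the tenants.
        return max(current - 1, min_ways)
    return current


if __name__ == "__main__":
    ways = 2
    for miss_rate in (0.45, 0.40, 0.10, 0.01):
        ways = next_ddio_ways(ways, miss_rate)
        print(f"miss_rate={miss_rate:.2f} -> ways={ways} mask={ways_to_mask(ways):#05x}")
```

A real controller would read the miss rate from hardware performance counters and write the resulting mask to the platform's configuration registers; both steps are deliberately left out here because their details are not given in the abstract.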




Workload Behavior Driven Memory Subsystem Design for Hyperscale

Hyperscalars run services across a large fleet of servers, serving billi...

Economical and efficient network super points detection based on GPU

Network super point is a kind of special host which plays an important r...

λ-NIC: Interactive Serverless Compute on Programmable SmartNICs

There is a growing interest in serverless compute, a cloud computing mod...

ALP: Alleviating CPU-Memory Data Movement Overheads in Memory-Centric Systems

Partitioning applications between NDP and host CPU cores causes inter-se...

Automatic Parallelization of Software Network Functions

Software network functions (NFs) trade-off flexibility and ease of deplo...

CuttleSys: Data-Driven Resource Management for Interactive Applications on Reconfigurable Multicores

Multi-tenancy for latency-critical applications leads to resource inter...

Collie: Finding Performance Anomalies in RDMA Subsystems

High-speed RDMA networks are getting rapidly adopted in the industry for...
