Autodidactic Neurosurgeon: Collaborative Deep Inference for Mobile Edge Intelligence via Online Learning

by Letian Zhang, et al.

Recent breakthroughs in deep learning (DL) have led to the emergence of many intelligent mobile applications and services, but they also pose unprecedented computing challenges for resource-constrained mobile devices. This paper builds a collaborative deep inference system between a resource-constrained mobile device and a powerful edge server, aiming to combine the strengths of on-device processing and computation offloading. The basic idea is to partition a deep neural network (DNN) into a front-end part running on the mobile device and a back-end part running on the edge server; the key challenge is locating the partition point that minimizes the end-to-end inference delay. Unlike existing work on DNN partitioning, which relies heavily on a dedicated offline profiling stage to search for the optimal partition point, our system has a built-in online learning module, called Autodidactic Neurosurgeon (ANS), that learns the optimal partition point on the fly. ANS can therefore closely track changes in the system environment by generating new knowledge for adaptive decision making. The core of ANS is a novel contextual bandit learning algorithm, called μLinUCB, which not only has a provable theoretical guarantee on learning performance but is also ultra-lightweight and therefore easy to implement in practice. We implement our system on a video stream object detection testbed to validate the design of ANS and evaluate its performance. The experiments show that ANS significantly outperforms state-of-the-art benchmarks in tracking system changes and reducing the end-to-end inference delay.
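
The partition-point search described above can be illustrated with a small sketch. It is not the paper's μLinUCB, whose details the abstract does not give. It is a plain LinUCB-style learner under assumptions made here for illustration: the front-end delay of each partition point is known, and the remaining delay (transmission plus back-end) is linear in two features, the intermediate data size and the back-end workload. The names PROFILE, TRUE_THETA, LinearDelayLearner, and run, and every number in the profile, are hypothetical.

```python
import math
import random

# Hypothetical profile, one row per candidate partition point p:
# (front-end delay on the device in ms, intermediate data size in KB,
#  back-end workload in MFLOPs). Illustrative numbers, not from the paper.
PROFILE = [
    (0.0, 600.0, 900.0),   # p = 0: send the raw input, run everything on the edge
    (20.0, 800.0, 700.0),
    (45.0, 200.0, 400.0),
    (80.0, 50.0, 150.0),
    (130.0, 0.0, 0.0),     # p = 4: run everything on the device
]

# Assumed hidden environment: ms per KB transmitted, ms per MFLOP on the edge.
TRUE_THETA = (0.1, 0.05)


def total_delay(p, theta=TRUE_THETA):
    """End-to-end delay of partition point p under the assumed linear model."""
    front, size, work = PROFILE[p]
    return front + theta[0] * size + theta[1] * work


def oracle_partition(theta=TRUE_THETA):
    """Best partition point if theta were known (what offline profiling finds)."""
    return min(range(len(PROFILE)), key=lambda p: total_delay(p, theta))


class LinearDelayLearner:
    """LinUCB-style learner with one shared 2-parameter model (an assumption)."""

    def __init__(self, alpha=1.0, ridge=1.0):
        self.alpha = alpha                      # exploration weight
        self.a = [[ridge, 0.0], [0.0, ridge]]   # ridge * I + sum of x x^T
        self.b = [0.0, 0.0]                     # sum of x * y

    def _inverse(self):
        (p, q), (r, s) = self.a
        det = p * s - q * r
        return [[s / det, -q / det], [-r / det, p / det]]

    def choose(self):
        """Pick the point with the smallest optimistic (lower-bound) delay."""
        inv = self._inverse()
        theta = [inv[i][0] * self.b[0] + inv[i][1] * self.b[1] for i in range(2)]

        def score(p):
            front, size, work = PROFILE[p]
            x = (size, work)
            mean = theta[0] * x[0] + theta[1] * x[1]
            var = sum(x[i] * inv[i][j] * x[j] for i in range(2) for j in range(2))
            return front + mean - self.alpha * math.sqrt(max(var, 0.0))

        return min(range(len(PROFILE)), key=score)

    def update(self, p, observed_total_ms):
        """Fold one measured end-to-end delay into the ridge-regression state."""
        front, size, work = PROFILE[p]
        x = (size, work)
        y = observed_total_ms - front   # delay attributed to transmission + edge
        for i in range(2):
            self.b[i] += x[i] * y
            for j in range(2):
                self.a[i][j] += x[i] * x[j]


def run(rounds=50, noise_ms=0.0, seed=0):
    """Simulate online learning and return the chosen partition point per round."""
    rng = random.Random(seed)
    learner = LinearDelayLearner()
    choices = []
    for _ in range(rounds):
        p = learner.choose()
        observed = total_delay(p) + rng.uniform(-noise_ms, noise_ms)
        learner.update(p, observed)
        choices.append(p)
    return choices
```

With noise_ms=0.0, run() tries partition points 0 and 1 once each and then settles on point 2, the same point oracle_partition() returns for these made-up numbers. Note that the all-on-device point has zero features in this sketch, so choosing it teaches the learner nothing about the network or the edge server.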




Real-Time Video Inference on Edge Devices via Adaptive Model Streaming

Real-time video inference on compute-limited edge devices like mobile ph...

Dynamic DNN Decomposition for Lossless Synergistic Inference

Deep neural networks (DNNs) sustain high performance in today's data pro...

Online Learning for Orchestration of Inference in Multi-User End-Edge-Cloud Networks

Deep-learning-based intelligent services have become prevalent in cyber-...

Collaborative Video Analytics on Distributed Edges with Multiagent Deep Reinforcement Learning

Deep Neural Network (DNN) based video analytics empowers many computer v...

Adaptive DNN Surgery for Selfish Inference Acceleration with On-demand Edge Resource

Deep Neural Networks (DNNs) have significantly improved the accuracy of ...

Deep Learning on Mobile Devices Through Neural Processing Units and Edge Computing

Deep Neural Network (DNN) is becoming adopted for video analytics on mob...

AppealNet: An Efficient and Highly-Accurate Edge/Cloud Collaborative Architecture for DNN Inference

This paper presents AppealNet, a novel edge/cloud collaborative architec...
