Understanding Auto-Scheduling Optimizations for Model Deployment via Visualizations
After a deep learning model has been designed and trained, it must be deployed onto specific hardware before it can be used in practice. Targeted optimizations are needed to reduce the model's inference latency. Auto-scheduling, an automated technique that offers various optimization options, is a viable solution for large-scale automated deployment. However, the low-level code that auto-scheduling generates resembles hardware-level code, which can hinder human comprehension and impede manual optimization. In this ongoing study, we aim to develop an enhanced visualization that handles the extensive profiling metrics associated with auto-scheduling. The visualization is intended to illuminate the intricate scheduling process and to enable further latency optimization through insights derived from the schedule.