Resource Management for GPT-based Model Deployed on Clouds: Challenges, Solutions, and Future Directions

08/05/2023
by Yongkang Dang, et al.

The widespread adoption of large language models (LLMs), e.g., the Generative Pre-trained Transformer (GPT), deployed in cloud computing environments (e.g., Azure) has sharply increased the demand for resources. This surge poses significant challenges to resource management in clouds. This paper highlights these challenges by first identifying the unique characteristics of resource management for GPT-based models. Building on this understanding, we analyze the specific challenges that resource management faces for GPT-based models deployed on clouds and propose corresponding potential solutions. To facilitate effective resource management, we introduce a comprehensive resource management framework and present resource scheduling algorithms designed specifically for GPT-based models. We then discuss future directions for resource management of GPT-based models, highlighting areas for further exploration and improvement. Through this study, we aim to provide valuable insights into resource management for GPT-based models deployed in clouds and to promote the sustainable development of GPT-based models and applications.


