Analyzing HPC Support Tickets: Experience and Recommendations
High-performance computing (HPC) user support teams are the first line of defense against large-scale problems, as they are often the first to learn of issues reported by users. Developing tools that help support teams solve user problems and track issue trends is critical for maintaining system health. Our work examines the Los Alamos National Laboratory HPC Consult Team's user support ticketing system and develops proof-of-concept tools to automate tasks such as category assignment and similar-ticket recommendation. We also generate new categories for reporting and discuss ideas to improve future ticketing systems.