Query2Vec: NLP Meets Databases for Generalized Workload Analytics

01/17/2018
by   Shrainik Jain, et al.

We propose methods for learning vector representations of SQL workloads to support a variety of administration tasks and application features, including query recommendation, workload summarization, index selection, identifying expensive queries, and predicting query reuse. We consider vector representations of both raw SQL text and optimized query plans under various assumptions and pre-processing strategies, and evaluate these methods on multiple real SQL workloads by comparing against the task and application metrics reported in the literature. We find that simple algorithms based on these generic vector representations compete favorably with previous approaches that require a number of assumptions and task-specific heuristics. We then present a new embedding strategy specialized for queries, based on tree-structured Long Short-Term Memory (LSTM) network architectures, that improves on the text-oriented embeddings for some tasks. We find that the general approach, when trained on a large corpus of SQL queries, provides a robust foundation for a variety of workload analysis tasks. We conclude by considering how workload embeddings can be deployed as a core database system feature to support database maintenance and novel applications.
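
To make the idea of a vector representation of raw SQL text concrete, the following is a minimal sketch and not the method from the paper: it swaps in a plain bag-of-tokens count vector in place of a learned embedding, and uses cosine similarity as a stand-in for query recommendation. The function names (`tokenize`, `build_vocab`, `embed`, `cosine`, `most_similar`), the literal-masking step, and the toy workload are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sql):
    """Lowercase a query and mask literals (assumed pre-processing step)."""
    sql = sql.lower()
    sql = re.sub(r"'[^']*'", " <str> ", sql)          # string literals
    sql = re.sub(r"\b\d+(\.\d+)?\b", " <num> ", sql)  # numeric literals
    # Placeholders first, then identifiers/keywords, then single symbols.
    return re.findall(r"<str>|<num>|[a-z_][a-z0-9_.]*|[^\s\w]", sql)


def build_vocab(queries):
    """Map every token seen in the workload to a vector index."""
    tokens = sorted({tok for q in queries for tok in tokenize(q)})
    return {tok: i for i, tok in enumerate(tokens)}


def embed(sql, vocab):
    """Bag-of-tokens count vector; a stand-in for a learned embedding."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(tokenize(sql)).items():
        if tok in vocab:  # tokens unseen in the workload are ignored
            vec[vocab[tok]] = float(count)
    return vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query, workload, vocab):
    """Return the workload query closest to `query` in vector space."""
    qv = embed(query, vocab)
    return max(workload, key=lambda w: cosine(qv, embed(w, vocab)))


# Toy workload, invented for this example.
workload = [
    "SELECT name FROM users WHERE id = 5",
    "SELECT COUNT(*) FROM orders GROUP BY customer_id",
    "UPDATE users SET name = 'bob' WHERE id = 7",
]
vocab = build_vocab(workload)
recommended = most_similar("SELECT name FROM users WHERE id = 42", workload, vocab)
```

Because literals are masked before counting, queries that differ only in their constants map to the same vector. Count vectors ignore token order and query structure, which is the kind of limitation that learned text embeddings and plan-based or tree-structured embeddings are meant to address.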

research
01/17/2018

Query2Vec: An Evaluation of NLP Techniques for Generalized Workload Analytics

We consider methods for learning vector representations of SQL queries t...
research
08/25/2018

Database-Agnostic Workload Management

We present a system to support generalized SQL workload analysis and man...
research
07/12/2019

Detecting coherent explorations in SQL workloads

This paper presents a proposal aiming at better understanding a workload...
research
05/26/2021

Database Workload Characterization with Query Plan Encoders

Smart databases are adopting artificial intelligence (AI) technologies t...
research
08/09/2021

"What makes my queries slow?": Subgroup Discovery for SQL Workload Analysis

Among daily tasks of database administrators (DBAs), the analysis of que...
research
06/01/2023

BitE : Accelerating Learned Query Optimization in a Mixed-Workload Environment

Although the many efforts to apply deep reinforcement learning to query ...
research
09/02/2018

Query Log Compression for Workload Analytics

Analyzing database access logs is a key part of performance tuning, intr...