Degree Sequence Bound For Join Cardinality Estimation
Recent work has demonstrated the catastrophic effects of poor cardinality estimates on query processing time. In particular, underestimating query cardinality can result in overly optimistic query plans that take orders of magnitude longer to complete than plans generated with the true cardinality. Cardinality bounding avoids this pitfall by computing a strict upper bound on the query's output size using statistics about the database, such as table sizes and degrees, i.e., value frequencies. In this paper, we extend this line of work by proving a novel bound, called the Degree Sequence Bound, which takes into account the full degree sequences and the maximum tuple multiplicity. This bound improves upon previous work on degree constraints, which used only the maximum degree rather than the full degree sequence. Further, we describe how to compute this bound in practice using a learned approximation of the true degree sequences.
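To illustrate why a full degree sequence can give a tighter bound than the maximum degree alone, the following is a minimal sketch for the simplest case only: a single equi-join of two columns. It is not the paper's algorithm, which also covers larger queries and tuple multiplicities. The function names and the sample data are hypothetical. The sketch relies on the rearrangement inequality: pairing the sorted degree sequences element-wise maximizes the sum of products, so the result cannot be smaller than the true join size.

```python
from collections import Counter


def degree_sequence(column):
    """Value frequencies of a join column, sorted in descending order."""
    return sorted(Counter(column).values(), reverse=True)


def true_join_size(r_col, s_col):
    """Exact size of the equi-join: sum over values of deg_R(v) * deg_S(v)."""
    r_deg, s_deg = Counter(r_col), Counter(s_col)
    return sum(count * s_deg[value] for value, count in r_deg.items())


def max_degree_bound(r_col, s_col):
    """Bound using only table sizes and maximum degrees."""
    r_seq, s_seq = degree_sequence(r_col), degree_sequence(s_col)
    if not r_seq or not s_seq:
        return 0
    return min(len(r_col) * s_seq[0], len(s_col) * r_seq[0])


def degree_sequence_bound(r_col, s_col):
    """Bound using full degree sequences (two-table case only).

    Pairs the i-th largest degree in R with the i-th largest degree in S,
    which is the worst case by the rearrangement inequality.
    """
    r_seq, s_seq = degree_sequence(r_col), degree_sequence(s_col)
    return sum(a * b for a, b in zip(r_seq, s_seq))


# Hypothetical join columns, for illustration only.
r_col = [1, 1, 1, 2, 2, 3]
s_col = [1, 2, 2, 2, 3, 4]

print(true_join_size(r_col, s_col))         # 10
print(degree_sequence_bound(r_col, s_col))  # 12
print(max_degree_bound(r_col, s_col))       # 18
```

On this sample data the true join size is 10, the degree-sequence bound is 12, and the maximum-degree bound is 18, so both bounds hold and the degree-sequence bound is the tighter of the two.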