Optimal Sketching for Kronecker Product Regression and Low Rank Approximation
We study the Kronecker product regression problem, in which the design matrix is a Kronecker product of two or more matrices. Given A_i ∈ R^{n_i × d_i} for i = 1, 2, ..., q, where n_i ≫ d_i for each i, and b ∈ R^{n_1 n_2 ⋯ n_q}, let A = A_1 ⊗ A_2 ⊗ ⋯ ⊗ A_q. Then, for p ∈ [1, 2], the goal is to find x ∈ R^{d_1 ⋯ d_q} that approximately minimizes ‖Ax − b‖_p. Recently, Diao, Song, Sun, and Woodruff (AISTATS, 2018) gave an algorithm that is faster than forming the Kronecker product A. Specifically, for p = 2 their running time is O(∑_{i=1}^q nnz(A_i) + nnz(b)), where nnz(A_i) is the number of non-zero entries in A_i. Note that nnz(b) can be as large as n_1 ⋯ n_q. For p = 1, q = 2, and n_1 = n_2, they achieve a worse bound of O(n_1^{3/2} poly(d_1 d_2) + nnz(b)). In this work, we provide significantly faster algorithms. For p = 2, our running time is O(∑_{i=1}^q nnz(A_i)), which has no dependence on nnz(b). For p < 2, our running time is O(∑_{i=1}^q nnz(A_i) + nnz(b)), which matches the prior best running time for p = 2. We also consider the related all-pairs regression problem: given A ∈ R^{n × d} and b ∈ R^n, we want to solve min_x ‖A̅x − b̅‖_p, where A̅ ∈ R^{n^2 × d} and b̅ ∈ R^{n^2} consist of all pairwise differences of the rows of A and b, respectively. We give an O(nnz(A)) time algorithm for p ∈ [1, 2], improving on the Ω(n^2) time needed to form A̅. Finally, we initiate the study of Kronecker product low rank and low t-rank approximation. For input A as above, we give O(∑_{i=1}^q nnz(A_i)) time algorithms, which is much faster than computing A.
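To make the problem setup concrete, below is a minimal NumPy sketch of the p = 2, q = 2 case. It is not the paper's algorithm: it only illustrates the structural fact that lets one sketch the factors separately, namely the mixed-product identity (S_1 ⊗ S_2)(A_1 ⊗ A_2) = (S_1 A_1) ⊗ (S_2 A_2), and it uses dense Gaussian sketches with hypothetical sketch sizes m_1, m_2 as stand-ins for the sparse sketches and size bounds analyzed in the paper.

```python
# Illustrative sketch of p = 2, q = 2 Kronecker product regression.
# NOT the paper's algorithm: Gaussian sketches and the sizes m1, m2
# are stand-in assumptions, chosen only for this toy example.
import numpy as np

rng = np.random.default_rng(0)
n1, d1, n2, d2 = 300, 4, 250, 3
A1 = rng.standard_normal((n1, d1))
A2 = rng.standard_normal((n2, d2))
b = rng.standard_normal(n1 * n2)

m1, m2 = 60, 60  # hypothetical sketch sizes per factor
S1 = rng.standard_normal((m1, n1)) / np.sqrt(m1)
S2 = rng.standard_normal((m2, n2)) / np.sqrt(m2)

# Mixed-product property: (S1 ⊗ S2)(A1 ⊗ A2) = (S1 A1) ⊗ (S2 A2),
# so the sketched design matrix never requires forming A1 ⊗ A2.
SA = np.kron(S1 @ A1, S2 @ A2)

# Apply S1 ⊗ S2 to b by reshaping: with NumPy's row-major flattening,
# (S1 ⊗ S2) @ B.flatten() == (S1 @ B @ S2.T).flatten() for B of shape (n1, n2).
Sb = (S1 @ b.reshape(n1, n2) @ S2.T).reshape(-1)

x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)

# Sanity check against the exact solution on the full Kronecker product
# (feasible here only because the toy dimensions are tiny).
A = np.kron(A1, A2)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```

The printed ratio should be close to 1, i.e., the solution of the small sketched least-squares problem nearly matches the residual of the exact solution, while the sketched system has only m_1 m_2 rows instead of n_1 n_2.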