Clique-Based Lower Bounds for Parsing Tree-Adjoining Grammars
Tree-adjoining grammars are a generalization of context-free grammars that are well suited to model human languages and are thus popular in computational linguistics. In the tree-adjoining grammar recognition problem, given a grammar Γ and a string s of length n, the task is to decide whether s can be obtained from Γ. Rajasekaran and Yooseph's parser (JCSS'98) solves this problem in time O(n^{2ω}), where ω < 2.373 is the matrix multiplication exponent. The best algorithms avoiding fast matrix multiplication take time O(n^6). The first evidence for hardness was given by Satta (J. Comp. Linguist.'94): For a more general parsing problem, any algorithm that avoids fast matrix multiplication and is significantly faster than O(|Γ| n^6) in the case of |Γ| = Θ(n^{12}) would imply a breakthrough for Boolean matrix multiplication. Following an approach by Abboud et al. (FOCS'15) for context-free grammar recognition, in this paper we resolve many of the disadvantages of the previous lower bound. We show that, even on constant-size grammars, any improvement on Rajasekaran and Yooseph's parser would imply a breakthrough for the k-Clique problem. This establishes tree-adjoining grammar parsing as a practically relevant problem with the unusual running time of n^{2ω}, up to lower order factors.
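For orientation, the two running times quoted above can be compared directly. The sketch below uses only the bound ω < 2.373 given in the abstract; the lower bound ω ≥ 2 is a standard fact about matrix multiplication and is an assumption added here, not a statement from the source.

```latex
\[
  2 \le \omega < 2.373
  \quad\Longrightarrow\quad
  4 \le 2\omega < 4.746 < 6 .
\]
% Hence the fast-matrix-multiplication parser runs in time
% O(n^{2\omega}) \subseteq O(n^{4.746}), polynomially faster than the
% O(n^{6}) bound of algorithms that avoid fast matrix multiplication.
```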