Optimizing SPARQL Query Answering over OWL Ontologies
The SPARQL query language is currently being extended by the World Wide Web Consortium (W3C) with so-called entailment regimes. An entailment regime defines how queries are evaluated under more expressive semantics than SPARQLs standard simple entailment, which is based on subgraph matching. The queries are very expressive since variables can occur within complex concepts and can also bind to concept or role names. In this paper, we describe a sound and complete algorithm for the OWL Direct Semantics entailment regime. We further propose several novel optimizations such as strategies for determining a good query execution order, query rewriting techniques, and show how specialized OWL reasoning tasks and the concept and role hierarchy can be used to reduce the query execution time. For determining a good execution order, we propose a cost-based model, where the costs are based on information about the instances of concepts and roles that are extracted from a model abstraction built by an OWL reasoner. We present two ordering strategies: a static and a dynamic one. For the dynamic case, we improve the performance by exploiting an individual clustering approach that allows for computing the cost functions based on one individual sample from a cluster. We provide a prototypical implementation and evaluate the efficiency of the proposed optimizations. Our experimental study shows that the static ordering usually outperforms the dynamic one when accurate statistics are available. This changes, however, when the statistics are less accurate, e.g., due to nondeterministic reasoning decisions. For queries that go beyond conjunctive instance queries we observe an improvement of up to three orders of magnitude due to the proposed optimizations.
READ FULL TEXT