Heuristics-based Query Reordering for Federated Queries in SPARQL 1.1 and SPARQL-LD
The federated query extension of SPARQL 1.1 allows executing queries distributed over different SPARQL endpoints. SPARQL-LD is a recent extension of SPARQL 1.1 which enables to directly query any HTTP web source containing RDF data, like web pages embedded with RDFa, JSON-LD or Microformats, without requiring the declaration of named graphs. This makes possible to query a large number of data sources (including SPARQL endpoints, online resources, or even Web APIs returning RDF data) through a single one concise query. However, not optimal formulation of SPARQL 1.1 and SPARQL-LD queries can lead to a large number of calls to remote resources which in turn can lead to extremely high query execution times. In this paper, we address this problem and propose a set of query reordering methods which make use of heuristics to reorder a set of SERVICE graph patterns based on their restrictiveness, without requiring the gathering and use of statistics from the remote sources. Such a query optimization approach is widely applicable since it can be exploited on top of existing SPARQL 1.1 and SPARQL-LD implementations. Evaluation results show that query reordering can highly decrease the query-execution time, while a method that considers the number and type of unbound variables and joins achieves the optimal query plan in 88
READ FULL TEXT