In the context of the Cloud of Linked Data, a large number of huge RDF
linked datasets have become available, and this number keeps growing.
Simultaneously, scalable and efficient RDF engines that follow the
traditional optimize-then-execute paradigm have been developed to
locally access RDF data, and SPARQL endpoints have been implemented for
remote query processing. However, given the size of existing datasets,
lack of statistics to describe available sources, and unpredictable
conditions of remote queries, existing solutions are still
First, the most efficient RDF engines base their query
processing algorithms on physical access and storage structures that
are stored locally; however, because of the size of existing linked
datasets, loading the data and their links is not always feasible.
Second, remote linked data query processing can be extremely costly
because of the lack of query planning; moreover, current techniques do not
adapt to unpredictable data transfers or data availability, so
executions can fail. To overcome these limitations,
physical query operators and execution engines need to be able to
access remote data and adapt query execution schedulers to data
In this tutorial
we present the foundations of adaptive query processing frameworks defined in
the database area, and their applicability in the Linked Data context.
This tutorial targets any conference attendee who wants to learn about the
limitations of existing RDF engines, adaptive query processing
techniques, and how traditional RDF data management approaches can be
extended to remotely access linked data and be well suited to runtime
practitioners who develop or use query engines to
consume Linked Data.
participants to have just a basic understanding of RDF and SPARQL.
The tutorial covers
traditional data management solutions that implement the optimize-then-execute paradigm, and
their pros and cons for Linked Data query processing; novel storage and data access
structures, as well as query optimization and execution techniques implemented by state-of-the-art RDF
engines, are described. Then, adaptive frameworks defined in the database
area to manage remote query processing are analyzed; adaptive operators such as
symmetric hash joins (binary and n-ary), routing operators, and adaptive engines are studied.
Finally, the applicability of adaptive techniques is illustrated
with an adaptive query processing engine for SPARQL endpoints; we
show the implemented physical operators and
the query scheduler, as well as their performance.
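
To make the symmetric hash join mentioned above concrete, here is a minimal sketch of the binary case in Python. It is an illustration under stated assumptions, not the engine presented in the tutorial: the function name `symmetric_hash_join`, the "left"/"right" source labels, the representation of solution mappings as dictionaries, and the sample data are all hypothetical. The point it shows is that each arriving tuple is inserted into its own hash table and immediately probed against the other one, so results are produced as soon as matching data arrives from either source, without waiting for one input to finish.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# A solution mapping: variable name -> value (hypothetical representation).
Binding = Dict[str, str]


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Binding]], join_var: str
) -> Iterator[Binding]:
    """Join two interleaved inputs on join_var without blocking on either.

    arrivals yields (side, binding) pairs, where side is "left" or "right",
    in whatever order the remote sources happen to deliver them.
    """
    # One hash table per input, keyed by the value of the join variable.
    tables = {"left": defaultdict(list), "right": defaultdict(list)}
    for side, binding in arrivals:
        other = "right" if side == "left" else "left"
        key = binding[join_var]
        # Insert step: remember this tuple for future probes from the other side.
        tables[side][key].append(binding)
        # Probe step: combine with every matching tuple seen so far on the other side.
        for match in tables[other].get(key, []):
            yield {**match, **binding}


# Illustrative data only; arrival order simulates unpredictable delivery.
arrivals = [
    ("left", {"?drug": "d1", "?name": "Aspirin"}),
    ("right", {"?drug": "d2", "?target": "t9"}),
    ("right", {"?drug": "d1", "?target": "t1"}),
    ("left", {"?drug": "d2", "?name": "Ibuprofen"}),
]

for row in symmetric_hash_join(arrivals, "?drug"):
    print(row)
```

The price of this non-blocking behavior is memory: both inputs are kept in hash tables for the lifetime of the join, whereas a classic hash join builds a table for only one input.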
• Traditional data management system architecture and its main
• Basic terminology.
• Cost-based optimization techniques.
• Traditional iterator model architecture.
• Centralized data management physical operators.
• Centralized data management query engines.
• Query optimization and execution techniques in existing RDF engines
like RDF-3X and Jena.
• SPARQL endpoints and their execution model.
• Current linked data query processing approaches.
Lecture 3-Adaptive Query Processing (50 minutes):
• Adaptive physical operators: symmetric hash joins and n-ary joins.
• Adaptive query processing schedulers, routing policies.
• Adaptive query engines.
• Requirements of physical operators for Linked data query processing.
• An adaptive query engine for Linked data.
is a Full Professor of the Computer Science department at the
Caracas, Venezuela, where she has taught several database courses at
the undergraduate level. Prof. Ruckhaus has
participated in several international projects supported by AECI
Ruckhaus has over 20 publications in international and national
conferences and journals. She has served as a reviewer and as a
Program Committee member for several international conferences.
Maria-Esther Vidal is a Full Professor of
the Computer Science department at the Universidad
in several international
projects supported by NSF (USA), AECI (Spain) and CNRS (France). She
has advised five PhD students and more than 45 master's and undergraduate
students. Professor Vidal has published more than 50 papers in international
conferences and journals in the database and Semantic Web areas. She has served as a reviewer
and as a Program Committee member for several international
journals and conferences.