Seminars - Reading Group

Foundations of RDF Triple Storage & Querying: Hexastore and RDF-3X.

February 20th 2013

Triple storage is sometimes overlooked when designing semantic
applications. It is one of the modules that almost always appears in the
architecture, labeled "Triple Store" or "SPARQL Endpoint", and developers
assume that it will work just fine. However, designing these systems is a
challenging task: the requirements of final applications grow more
demanding each day, not only in data volume but also in the complexity
of the queries and the expected response time.

In this talk we will review some of the basic concepts behind the
design of these data stores by presenting two highly influential papers
in the field: Hexastore, which proposes an internal organization of the
data using multiple indices over the list of RDF statements; and
RDF-3X, which presents a simple yet powerful engine designed to improve
query performance.
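
As a rough illustration of the "multiple indices" idea, the sketch below
keeps one nested index for each of the six orderings of subject (s),
predicate (p) and object (o), so that any triple pattern can be answered
from an index whose leading positions are the bound ones. The class name
SixIndexStore and its methods are hypothetical; this is a toy in-memory
model, not the data layout or code of either paper.

```python
from collections import defaultdict
from itertools import permutations


class SixIndexStore:
    """Toy triple store with one nested index per ordering of (s, p, o)."""

    # The six orderings: spo, sop, pso, pos, osp, ops.
    ORDERS = ["".join(p) for p in permutations("spo")]

    def __init__(self):
        # index[order][first][second] -> set of third components
        self.index = {
            order: defaultdict(lambda: defaultdict(set)) for order in self.ORDERS
        }

    def add(self, s, p, o):
        """Insert one triple into all six indices."""
        triple = {"s": s, "p": p, "o": o}
        for order in self.ORDERS:
            a, b, c = (triple[k] for k in order)
            self.index[order][a][b].add(c)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        pattern = {"s": s, "p": p, "o": o}
        bound = [k for k in "spo" if pattern[k] is not None]
        free = [k for k in "spo" if pattern[k] is None]
        order = "".join(bound + free)  # pick the index with bound positions first
        idx = self.index[order]
        k1, k2, k3 = order
        results = []
        firsts = [pattern[k1]] if pattern[k1] is not None else list(idx)
        for a in firsts:
            level2 = idx.get(a, {})
            seconds = [pattern[k2]] if pattern[k2] is not None else list(level2)
            for b in seconds:
                level3 = level2.get(b, set())
                thirds = [pattern[k3]] if pattern[k3] is not None else list(level3)
                for c in thirds:
                    if c in level3:
                        found = dict(zip(order, (a, b, c)))
                        results.append((found["s"], found["p"], found["o"]))
        return sorted(results)


store = SixIndexStore()
store.add("alice", "knows", "bob")
store.add("alice", "knows", "carol")
store.add("bob", "knows", "carol")
who_knows_carol = store.match(p="knows", o="carol")  # answered from the "pos" index
```

The trade-off is visible even in this toy: every triple is stored six
times, in exchange for never having to scan for a bound component.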

Finding Co-solvers on Twitter, with a Little Help from Linked Data

January 23rd 2013

If you have ever tried to find the best collaborator for a research
project, you may have found it necessary to look for experts in that
domain. If so, you should come to this talk.
The talk gives an overview of the main state-of-the-art techniques for
user profiling and user recommendation on the Social Web, and in
particular on Twitter. It also introduces a novel approach to co-solver
recommendation that leverages the potential of Linked Data.

Can Tweets Predict Citations? Metrics of Social Impact Based on Twitter and Correlation with Traditional Metrics of Scientific Impact

December 17th 2012

Tweets can predict highly cited articles within the first 3 days of
article publication. Social media activity either increases citations or
reflects the underlying qualities of the article that also predict
citations, but the true use of these metrics is to measure the distinct
concept of social impact. Social impact measures based on tweets are
proposed to complement traditional citation metrics. The proposed
twimpact factor may be a useful and timely metric to measure uptake of
research findings and to filter research findings resonating with the
public in real time.

Towards Automated Quality Models for Software Development Communities: the QualOSS and FLOSSMetrics case

December 12th 2012

Software metrics and quality models play a pivotal role in the
measurement of software quality. In this presentation, I will present
the state of the art and a comparison of different FLOSS quality models
(OpenBRR/QSoS/QualOSS/SQO-OSS). Most of the quality models available
today require manual effort, but the authors of this paper propose an
automated approach to calculate metrics and attributes.

How Novices Model Business Processes

December 10th 2012

You might be interested in representing your
plans/solutions/methodologies using some design standards, but you are
not sure whether learning a modelling language is going to make your
life easier. A custom-made design method might be sufficient.
This paper addresses these questions and helps you determine what type
of modeller you are and whether you have good-quality design skills.

What can quantum theory bring to information retrieval?

December 5th 2012

Recent work in information retrieval (IR) explores the application
of the principles and formalisms of quantum physics as representation
models for IR. The objective is to evolve beyond traditional vector
space models, which are at the center of IR, to a richer mathematical
representation for future IR systems. This new field, named Quantum
Information Retrieval (QIR), already has a solid community behind it and
is a promising research direction for IR.
The objective of this talk is to introduce the basic concepts of Quantum
Information Retrieval. The paper of Piwowarski et al. is used as the
guiding reference in this introduction but other references will be used
to depict the landscape of this area. The talk is also going to briefly
describe connections between the formalism behind QIR and computational

The Internet of Things: A survey.

November 1st 2012

The Internet of Things (IoT) will change the way we live. We will be
surrounded by objects that are no longer passive or isolated. They will
be smart, interconnected through a network, and will cooperate to help
us in many areas, such as domotics, e-health, assisted living, and
transport, among many others. In this talk we will review some
definitions of the IoT, describe the scenarios where this technology
has great potential, and summarize the already-solved and
still-challenging research problems involved.

Context-Aware Access Control for RDF Graph Stores

October 24th 2012

We present SHI3LD, an access control framework for RDF
stores. Our solution supports access from mobile devices with
context-aware policies and is exclusively grounded on standard Semantic
Web languages. Designed as a pluggable filter for generic SPARQL
endpoints, the module uses RDF named graphs and SPARQL to protect
triples. Evaluation shows faster execution time for low-selective
queries and less impact on larger datastores.
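
To make the idea of protecting triples through named graphs more
concrete, here is a minimal sketch under stated assumptions. It is not
the SHI3LD implementation: the framework expresses its policies with
Semantic Web languages, whereas this toy uses plain Python predicates,
and the names QUADS, POLICIES, accessible_graphs and visible_triples are
all hypothetical. It only shows the general shape of a filter that
exposes a triple when the requester's context satisfies the policy of
the named graph containing it.

```python
# Each quad is (subject, predicate, object, named_graph). Illustrative data only.
QUADS = [
    ("ex:report", "ex:title", "Annual report", "g:public"),
    ("ex:report", "ex:budget", "120000", "g:staff"),
    ("ex:printer", "ex:status", "online", "g:onsite"),
]

# Hypothetical policies: named graph -> condition on the request context.
POLICIES = {
    "g:public": lambda ctx: True,
    "g:staff": lambda ctx: ctx.get("role") == "staff",
    "g:onsite": lambda ctx: ctx.get("location") == "office",
}


def accessible_graphs(context):
    """Return the named graphs whose policy condition holds for this context."""
    return {graph for graph, condition in POLICIES.items() if condition(context)}


def visible_triples(quads, context):
    """Keep only triples stored in a named graph the context may access."""
    allowed = accessible_graphs(context)
    return [(s, p, o) for s, p, o, g in quads if g in allowed]


# A guest on a mobile device located in the office sees public and on-site data.
guest_view = visible_triples(QUADS, {"role": "guest", "location": "office"})
```

Graphs without a policy are denied by default in this sketch, which is a
design choice of the example rather than a statement about the paper.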

Fusion of Background Knowledge and Streams of Events

October 17th 2012

Adding background knowledge to events can provide many benefits to
complex event processing systems. However, most current approaches
require loading the knowledge base into memory in order to query the
knowledge or perform reasoning at runtime. This causes scalability
issues, and performance is one of the top concerns of event processing
systems. In this paper, the authors demonstrate that, for some kinds of
event rules, it is not necessary to load the entire dataset into memory
and query the knowledge base every time a new event occurs. Moreover,
for the rules that do require it, the process can be optimized.

Executing SPARQL queries over the web of linked data

October 3rd 2012

The Web of Linked Data forms a single, globally distributed dataspace.
Due to the openness of this dataspace, it is not possible to know in
advance all data sources that might be relevant for query answering.
This openness poses a new challenge that is not addressed by traditional
research on federated query processing. In this paper we present an
approach to execute SPARQL queries over the Web of Linked Data. The main
idea of our approach is to discover data that might be relevant for
answering a query during the query execution itself. This discovery is
driven by following RDF links between data sources based on URIs in the
query and in partial results. The URIs are resolved over the HTTP
protocol into RDF data which is continuously added to the queried
dataset. This paper describes concepts and algorithms to implement our
approach using an iterator-based pipeline. We introduce a formalization
of the pipelining approach and show that classical iterators may cause
blocking due to the latency of HTTP requests. To avoid blocking, we
propose an extension of the iterator paradigm. The evaluation of our
approach shows its strengths as well as the still existing challenges.
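
The core idea, discovering data during query execution by looking up
URIs from the query and from partial results, can be sketched as
follows. Everything here is an assumption made for illustration: the
FAKE_WEB dictionary stands in for HTTP lookups, the "ex:" prefix test
stands in for URI detection, and the function names are hypothetical.
The sketch also evaluates one triple pattern at a time with synchronous
lookups, so it does not reproduce the iterator pipeline or the
non-blocking extension proposed in the paper.

```python
# Hypothetical stand-in for the Web: URI -> triples obtained by resolving it.
FAKE_WEB = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob")],
    "ex:bob": [
        ("ex:bob", "foaf:knows", "ex:carol"),
        ("ex:bob", "foaf:name", "Bob"),
    ],
    "ex:carol": [("ex:carol", "foaf:name", "Carol")],
}


def dereference(uri):
    """Simulate an HTTP lookup; a real system would fetch and parse RDF here."""
    return FAKE_WEB.get(uri, [])


def is_var(term):
    return term.startswith("?")


def is_uri(term):
    return term.startswith("ex:")  # toy convention for resolvable URIs


def evaluate(patterns):
    """Evaluate triple patterns in order, growing the dataset as URIs appear."""
    dataset, seen = set(), set()

    def lookup(term):
        # Resolve each URI at most once and add its triples to the dataset.
        if is_uri(term) and term not in seen:
            seen.add(term)
            dataset.update(dereference(term))

    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            # Substitute variables already bound by earlier patterns.
            bound = tuple(binding.get(term, term) for term in pattern)
            for term in bound:
                if not is_var(term):
                    lookup(term)
            for triple in list(dataset):
                new_binding = dict(binding)
                matched = True
                for query_term, data_term in zip(bound, triple):
                    if is_var(query_term):
                        if new_binding.get(query_term, data_term) != data_term:
                            matched = False
                            break
                        new_binding[query_term] = data_term
                    elif query_term != data_term:
                        matched = False
                        break
                if matched:
                    next_solutions.append(new_binding)
        solutions = next_solutions
    return solutions


# Names of the people known by the people Alice knows.
QUERY = [
    ("ex:alice", "foaf:knows", "?x"),
    ("?x", "foaf:knows", "?y"),
    ("?y", "foaf:name", "?n"),
]
answers = evaluate(QUERY)
```

Only "ex:alice" is known up front; "ex:bob" and "ex:carol" are resolved
because they show up in partial results, which is the discovery step the
abstract describes.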