Estimating query rewriting quality over LOD

Ana I. Torre-Bastida*, Jesús Bermúdez, Arantza Illarramendi

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Nowadays it is increasingly necessary to query data stored in different publicly accessible datasets, such as those in the Linked Data environment, in order to gather as much information as possible on a given topic. However, users find it difficult to query datasets with different vocabularies and data structures. It is therefore worthwhile to develop systems that can produce rewritings of queries on demand. Moreover, such systems often cannot guarantee a semantics-preserving rewriting due to the heterogeneity of the vocabularies, and it is at this point that the quality estimation of the produced rewriting becomes crucial. In this paper we present a novel framework that, given a query written in the vocabulary the user is most familiar with, rewrites the query in terms of the vocabulary of a target dataset. Moreover, it reports the quality of the rewritten query with two scores: a similarity factor, which is based on the rewriting process itself, and a quality score produced by a predictive model. This machine-learning-based model learns from a set of queries and their intended (gold standard) rewritings. The feasibility of the framework has been validated in a real scenario.
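To illustrate the idea behind a rewriting-based similarity factor, the following is a minimal, hypothetical sketch: query terms are mapped from a source vocabulary to a target vocabulary, each mapping carries a confidence score (below 1.0 when the mapped terms are not exactly equivalent), and the per-term scores are aggregated into an overall similarity factor. The mapping table, term names, and multiplicative aggregation are illustrative assumptions, not the paper's actual algorithm.

```python
# Hypothetical mapping from source-vocabulary terms to target-vocabulary
# terms; each entry carries a confidence score (1.0 = exact equivalence,
# lower = an approximate, non-semantics-preserving mapping).
MAPPING = {
    "dbo:author": ("schema:creator", 1.0),
    "dbo:birthPlace": ("schema:homeLocation", 0.7),
}

def rewrite(query_terms, mapping):
    """Rewrite each term via the mapping and compute a similarity factor.

    Terms without a mapping are kept unchanged (assumed shared between
    the two vocabularies) with score 1.0. The similarity factor is the
    product of the per-term scores, so any approximate mapping lowers it.
    """
    rewritten, similarity = [], 1.0
    for term in query_terms:
        target, score = mapping.get(term, (term, 1.0))
        rewritten.append(target)
        similarity *= score
    return rewritten, similarity

# Example: one exact and one approximate mapping give a factor of 0.7.
terms, factor = rewrite(["dbo:author", "dbo:birthPlace"], MAPPING)
```

In the paper's framework this process-based factor is complemented by a learned quality score; in this sketch that would correspond to a regression model trained on (query, gold-standard rewriting) pairs, which is omitted here.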

Original language: English
Pages (from-to): 529-554
Number of pages: 26
Journal: Semantic Web
Volume: 10
Issue number: 3
DOIs
Publication status: Published - 2019

Keywords

  • Linked Open Data
  • query rewriting
  • Semantic web
  • similarity
  • SPARQL
