
ar2sparql
AR2SPARQL translates Arabic natural-language (NL) queries into SPARQL, the W3C standard query language for the Semantic Web (SW). It uses Arabic NLP techniques to map query terms to ontological entities. It then uses a set of grammar rules, together with the knowledge in the ontology, to construct a SPARQL query by linking those entities. AR2SPARQL can handle not only simple queries but also complex ones, such as queries consisting of multiple sentences linked with conjunctions, i.e. “و” (and), “أو” (or), or relative pronouns, i.e. “الذي”, “التي” (who/which). The following types of questions are supported:
• Factoid queries starting with “ما” (what), “من” (who), “ماذا” (what) or “أين” (where). Queries that require analysis, such as those starting with “كيف” (how), “ولماذا” (why) or “كم عدد” (how many), are out of scope. Although rules could be defined to answer such analytical questions, this work focuses on translating the natural-language query into triple patterns, leaving the door open for future extensions that target analytical and complex queries.
• Affirmative/negative (yes/no) queries, e.g. queries starting with “هل”: queries that require an affirmation or a negation as the answer.
• Questions starting with imperative verbs such as “أذكر” (mention) or “عدد” (list).
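The question types above are distinguished by their opening word(s). The following is a minimal sketch of that idea in Python, not AR2SPARQL's actual implementation: the function name `classify_question` and the category labels are hypothetical, and the bare form “لماذا” (without the leading conjunction) is added as an assumption.

```python
# Illustrative only: classify_question and the category names are hypothetical
# helpers, not part of AR2SPARQL itself.

FACTOID_WORDS = {"ما", "من", "ماذا", "أين"}
YES_NO_WORDS = {"هل"}
IMPERATIVE_WORDS = {"أذكر", "عدد"}
# "لماذا" (without the leading conjunction) is included here as an assumption.
ANALYTIC_WORDS = {"كيف", "لماذا", "ولماذا"}


def classify_question(question: str) -> str:
    """Classify an Arabic question by its leading word(s)."""
    tokens = question.strip().split()
    if not tokens:
        return "unknown"
    # "كم عدد" is a two-word opener, so it is checked before single words.
    if tokens[:2] == ["كم", "عدد"] or tokens[0] in ANALYTIC_WORDS:
        return "out_of_scope"
    if tokens[0] in FACTOID_WORDS:
        return "factoid"
    if tokens[0] in YES_NO_WORDS:
        return "yes_no"
    if tokens[0] in IMPERATIVE_WORDS:
        return "imperative"
    return "unknown"
```

Matching whole tokens rather than string prefixes keeps “ما” from also matching “ماذا”. A real system would likely normalize spelling variants (for example different forms of alef) before the lookup, which this sketch does not do.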
AR2SPARQL is designed to be ontology-portable: it makes no assumptions about any specific knowledge domain. It can be interfaced with any ontology, as long as the ontology terms are represented in Arabic or their Arabic translations are provided within the ontology.
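To illustrate why Arabic labels in the ontology are sufficient, here is a rough Python sketch of mapping query terms to ontology entities and linking them into a single triple pattern. Everything in it is an assumption made for illustration: the IRIs under `http://example.org/geo#`, the label table `LABELS`, and the helpers `map_terms` and `build_query` are hypothetical, and the fixed subject-property-object ordering stands in for the grammar rules and ontology knowledge that AR2SPARQL actually uses.

```python
# Hypothetical label index: every IRI and label below is made up for illustration.
# In a real ontology the Arabic labels would come from the ontology itself.
NS = "http://example.org/geo#"

LABELS = {
    "عاصمة": ("property", NS + "hasCapital"),
    "تكساس": ("individual", NS + "Texas"),
}


def map_terms(question: str) -> list:
    """Return (kind, iri) pairs for each token that matches an Arabic label."""
    tokens = question.replace("؟", " ").split()
    return [LABELS[t] for t in tokens if t in LABELS]


def build_query(entities: list) -> str:
    """Link one individual and one property into a SELECT query.

    Assumes the individual is the subject of the triple; a real system
    would decide this from the property's domain and range.
    """
    props = [iri for kind, iri in entities if kind == "property"]
    inds = [iri for kind, iri in entities if kind == "individual"]
    if len(props) != 1 or len(inds) != 1:
        raise ValueError("sketch supports exactly one property and one individual")
    pattern = f"<{inds[0]}> <{props[0]}> ?answer ."
    return f"SELECT ?answer WHERE {{ {pattern} }}"


if __name__ == "__main__":
    # "What is the capital of Texas?"
    print(build_query(map_terms("ما عاصمة تكساس؟")))
```

Because the lookup depends only on the label table, pointing the same code at a different ontology only requires a different set of Arabic labels, which is the portability property described above.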
Arabic OWL test data based on the Mooney geography dataset:
Ontology + 877 Arabic questions
Arabic OWL test data based on the Diseases ontology:
Ontology + 100 Arabic questions
Project Information
The project was created on Dec 18, 2014.
- License: New BSD License
- svn-based source control
Labels:
Academic