FuXiUserManual
End User Manual for FuXi
Introduction
FuXi (pronounced foo-shee) is a forward-chaining production system for Notation 3 Description Logic Programming. FuXi was originally meant as a Python Swiss Army knife for all things Semantic Web related. It works as a companion to RDFLib, a Python library for working with RDF.

The Primary Modules
An overview of the top-level modules in FuXi serves as an introduction to its general features. The FuXi libraries are divided as follows:

FuXi.Horn
The Horn module was originally meant as a reference implementation of the W3C's Rule Interchange Format Basic Logic Dialect (work in progress) but eventually evolved into a Pythonic API for managing an abstract Logic Programming syntax. This module is heavily used by both the DLP and Rete modules for (respectively) creating the rulesets converted from OWL RDF expressions and creating a Horn ruleset from a parsed Notation 3 graph. The Horn module includes Python classes for each of the major components of the RIF BLD abstract syntax (see the EBNF Grammar for the Presentation Syntax of RIF-BLD).
Horn rulesets can be built from the ground up by instantiating the objects piecemeal:

Example: {?C rdfs:subClassOf ?SC. ?M a ?C} => {?M a ?SC}.

>>> clause = Clause(And([Uniterm(RDFS.subClassOf,[Variable('C'),Variable('SC')]),
... Uniterm(RDF.type,[Variable('M'),Variable('C')])]),
... Uniterm(RDF.type,[Variable('M'),Variable('SC')]))
>>> Rule(clause,[Variable('M'),Variable('SC'),Variable('C')])
Forall ?M ?SC ?C ( ?SC(?M) :- And( rdfs:subClassOf(?C ?SC) ?C(?M) ) )
>>> And([Uniterm(RDF.type,[RDFS.comment,RDF.Property]),
... Uniterm(RDF.type,[OWL.Class,RDFS.Class])])
And( rdf:Property(rdfs:comment) rdfs:Class(owl:Class) )
>>> Exists(formula=Or([Uniterm(RDF.type,[RDFS.comment,RDF.Property]),
... Uniterm(RDF.type,[OWL.Class,RDFS.Class])]),
... declare=[Variable('X'),Variable('Y')])
Exists ?X ?Y ( Or( rdf:Property(rdfs:comment) rdfs:Class(owl:Class) ) )
>>> And([Uniterm(RDF.type,[RDFS.comment,RDF.Property]),
... Uniterm(RDF.type,[OWL.Class,RDFS.Class])]).n3()
u'rdfs:comment a rdf:Property .\n owl:Class a rdfs:Class'

RIF BLD objects can also be constructed by parsing a Notation 3 document like so:

>>> from FuXi.Rete.RuleStore import N3RuleStore
>>> from FuXi.Horn.HornRules import Ruleset
>>> from rdflib.Graph import Graph
>>> from rdflib.syntax.NamespaceManager import NamespaceManager

First, we instantiate an N3RuleStore, which will be the recipient of the parsed Notation 3 assertions. Then we instantiate an rdflib NamespaceManager, passing in a Graph that makes use of the rule store (so that any namespace prefix definitions are picked up). Then an rdflib Graph is created using the rule store instance and the namespace manager, and the rdfs-rules.n3 Notation 3 document is parsed from the web:

>>> ruleStore = N3RuleStore()
>>> nsMgr = NamespaceManager(Graph(ruleStore))
>>> ruleGraph = Graph(ruleStore,namespace_manager=nsMgr)
>>> ruleGraph.parse('http://www.agfa.com/w3c/euler/rdfs-rules.n3',format='n3')
<Graph identifier=... (<class 'rdflib.Graph.Graph'>)>

Note, the latest revision in SVN simplifies this with a utility method called SetupRuleStore:

>>> from FuXi.Rete.RuleStore import SetupRuleStore
>>> ruleStore,ruleGraph=SetupRuleStore()
>>> ruleGraph.parse('http://www.agfa.com/w3c/euler/rdfs-rules.n3',format='n3')
<Graph identifier=... (<class 'rdflib.Graph.Graph'>)>

Finally, a Ruleset object is instantiated, passing the rule store and the namespace mappings. The Ruleset object is iterated over, and each of the rules parsed from the Notation 3 document is printed (which serializes each rule using the RIF BLD syntax):

>>> for rule in Ruleset(n3StoreSrc=ruleStore,nsMapping=ruleStore.nsMgr): print rule
...
Forall ?Q ?P ?S ?O ( ?Q(?O ?S) :- And( owl:inverseOf(?P ?Q) ?P(?S ?O) ) )
Forall ?P ?S ?O ( ?P(?O ?S) :- And( owl:SymmetricProperty(?P) ?P(?S ?O) ) )
Forall ?P ?S ?O ?X ( ?P(?S ?O) :- And( owl:TransitiveProperty(?P) ?P(?X ?O) ?P(?S ?X) ) )
Forall ?Y ?P ?R ?X ( ?P(?X ?Y) :- And( owl:onProperty(?R ?P) owl:hasValue(?R ?Y) ?R(?X) ) )
... snip ...
Forall ?P ?S ?R ?O ( ?R(?S ?O) :- And( rdfs:subPropertyOf(?P ?R) ?P(?S ?O) ) )
Forall ?C ( rdfs:subClassOf(?C rdfs:Resource) :- rdfs:Class(?C) )
Forall ?A ?S ?B ( ?B(?S) :- And( rdfs:subClassOf(?A ?B) ?A(?S) ) )
Forall ?A ?C ?B ( rdfs:subClassOf(?A ?C) :- And( rdfs:subClassOf(?B ?C) rdfs:subClassOf(?A ?B) ) )

Serialization
As the examples above show, instantiated RIF BLD objects can be serialized in one of two ways: as human-readable RIF syntax or as Notation 3. The former serialization is built in by overriding the __repr__ method, the standard Python mechanism used to "compute the 'official' string representation of an object". The latter serialization can be achieved by invoking the n3 method on any RIF BLD Python object. The Horn module simplifies the process of serializing appropriate QNames (or CURIEs) for the URIs associated with Uniterms. Uniterms can be thought of as the RIF equivalent of RDF statements or Logic Programming atoms.
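
To make the QName (CURIE) abbreviation concrete, here is a minimal, library-independent sketch. It is not FuXi's implementation: the helper names to_qname and uniterm_repr, and the use of plain strings for namespace URIs, are assumptions made only for illustration.

```python
# Hypothetical helpers, not part of FuXi or rdflib: they abbreviate full URIs
# to prefix:local QNames using a prefix -> namespace URI dictionary.
NS_MAP = {
    'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
    'rdfs': 'http://www.w3.org/2000/01/rdf-schema#',
    'owl': 'http://www.w3.org/2002/07/owl#',
}


def to_qname(uri, ns_map):
    """Return prefix:local for uri, or <uri> if no namespace matches."""
    best = None
    for prefix, namespace in ns_map.items():
        # Prefer the longest matching namespace so that nested namespaces
        # resolve to the most specific prefix.
        if uri.startswith(namespace):
            if best is None or len(namespace) > len(best[1]):
                best = (prefix, namespace)
    if best is None:
        return '<%s>' % uri
    return '%s:%s' % (best[0], uri[len(best[1]):])


def uniterm_repr(predicate, args, ns_map):
    """Render predicate(arg1 arg2 ...) with every URI abbreviated."""
    return '%s(%s)' % (to_qname(predicate, ns_map),
                       ' '.join(to_qname(a, ns_map) for a in args))
```

With this mapping, the RDFS subClassOf URI applied to owl:Class and rdfs:Class would render as rdfs:subClassOf(owl:Class rdfs:Class), while a URI outside every known namespace falls back to the angle-bracket form.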

In order to associate a namespace mapping dictionary (a Python dictionary of prefixes to rdflib.URIRef instances of the corresponding fully qualified namespace URI) with a Uniterm, the Uniterm constructor can be invoked and passed such a dictionary via the newNss keyword argument.

FuXi.Syntax
The FuXi.Syntax module incorporates the InfixOwl library (see the linked Wiki for more information).

FuXi.Rete
At the heart of the python-dlp framework is an implementation of most of the RETE-UL algorithms outlined in the PhD thesis (1995) of Robert Doorenbos: Production Matching for Large Learning Systems. The thesis describes a modification of the original Rete algorithm that (amongst other things) limits the fact syntax (referred to as Working Memory Elements) to 3-item tuples, which corresponds quite nicely with the RDF abstract syntax. It also describes methods for using hash tables to improve the efficiency of alpha nodes and beta nodes. Instances of the FuXi.Rete.ReteNetwork class are RETE-UL networks. So, to programmatically build a RETE-UL network, a developer would write:

from FuXi.Rete import ReteNetwork
from FuXi.Rete.RuleStore import N3RuleStore
from rdflib.syntax.NamespaceManager import NamespaceManager
from rdflib.Graph import Graph
closureDeltaGraph=Graph()
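# Optionally pass user-specified built-ins (a URI to callable dictionary):
ruleStore = N3RuleStore(additionalBuiltins={ .. URI to callable dictionary })
# or, without custom built-ins:
# ruleStore = N3RuleStore()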
nsMgr = NamespaceManager(Graph(ruleStore))
ruleGraph = Graph(ruleStore,namespace_manager=nsMgr)
# .. parse a Notation 3 document into ruleGraph ..
network = ReteNetwork(ruleStore,inferredTarget = closureDeltaGraph,nsMap = ruleStore.nsMgr)

First, a closure delta graph is created. This is the graph where all the inferred RDF statements will be stored. Next, an N3RuleStore is instantiated, passing in an (optional) dictionary of user-specified built-ins. For a list of 'standard' CWM built-ins, see: CWM Builtins. Note that the RETE-UL implementation does not support built-ins that denote (or calculate) new values; it only supports built-in predicates that compare existing values. So, for example, math:product is not supported, but math:lessThan is. The additionalBuiltins keyword argument expects a dictionary where each key is an RDFLib URIRef instance (the URI of the built-in predicate) and each value is a Python callable that takes two arguments and returns a boolean value corresponding to the expected semantics of the custom built-in predicate.

Finally, the network is instantiated, passing in the ruleStore, the closure delta graph, and a namespace mapping. Here, the ruleStore's namespace mapping is passed in, so the RETE network will inherit any namespace bindings parsed in from the N3 document. If a closure delta graph is not provided, one will be created. In either case, the inferredFacts attribute of the network will be set to the closure delta graph.

From here, RDF facts can be fed into the network in order to calculate the inferred RDF statements and add them to the closure delta graph:

from FuXi.Rete.Util import generateTokenSet
network.feedFactsToAdd(generateTokenSet(someRDFGraph))

Here, someRDFGraph is an RDFLib Graph instance that contains the RDF facts to pass into the network. At this point, network.inferredFacts should consist of the RDF statements that can be inferred from the given ruleset and initial RDF facts.

FuXi.DLP
This module is a Description Horn Logic implementation as defined by Grosof, B. et al. ("Description Logic Programs: Combining Logic Programs with Description Logic") in section 4.4.
As such, it implements the recursive mapping functions "T", "Th" and "Tb", which result in "custom" (dynamic) rulesets. For the non-logic-inclined, this essentially allows OWL ontologies (or a subset of OWL ontologies) to be automatically converted to a set of rules that exactly capture the semantics of the OWL document. This mechanism is fundamental to the larger framework that FuXi is a part of (python-dlp).

The premise is two-fold. First (and most importantly), the rulesets generated from an OWL ontology will be much more tailored to the specific constraints of the ontology than a general-purpose ruleset would be. As such, the inference mechanism will be several orders of magnitude more efficient. Secondly, tools that are used for authoring OWL ontologies are significantly more mature than those used for authoring Notation 3 rulesets (or any other comparable Semantic Web rule language). Using the DLP mechanism, a domain expert can model the semantics of a particular domain using any off-the-shelf OWL editor and generate a corresponding ruleset.

To invoke the DLP implementation, a developer would do the following:

from FuXi.Rete.Util import generateTokenSet
network.setupDescriptionLogicProgramming(tBoxGraph)
network.feedFactsToAdd(generateTokenSet(tBoxGraph))
network.feedFactsToAdd(generateTokenSet(someRDFGraph))

The setupDescriptionLogicProgramming method can be invoked on a ReteNetwork instance, passing in as the only argument an RDFLib Graph that consists of the OWL assertions that we wish to translate to a ruleset. This method will return a list of RuleSet objects, each of which represents a rule that was translated from the OWL assertions. The first feedFactsToAdd call then sends the OWL RDF assertions through the network. This is necessary to fully classify the OWL ontology. Finally, an RDF graph of facts is sent through the network.
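
The idea of deriving a tailored ruleset from an ontology can be illustrated with a toy sketch that handles only rdfs:subClassOf axioms. This is not the "T", "Th" and "Tb" mapping from the paper, nor FuXi's code: the function names, the tuple-based triples, and the string constants below are all assumptions made purely for illustration.

```python
# Toy illustration only: facts are plain (subject, predicate, object) tuples
# and the only axiom type handled is rdfs:subClassOf.
SUBCLASS_OF = 'rdfs:subClassOf'
TYPE = 'rdf:type'


def rules_from_tbox(tbox):
    """Turn each (C, rdfs:subClassOf, D) axiom into a rule D(x) :- C(x),
    represented here as a (body_class, head_class) pair."""
    return [(s, o) for (s, p, o) in tbox if p == SUBCLASS_OF]


def forward_chain(rules, facts):
    """Apply the rules until no new fact appears and return only the newly
    inferred facts (the analogue of a closure delta graph)."""
    known = set(facts)
    inferred = set()
    changed = True
    while changed:
        changed = False
        for body_class, head_class in rules:
            for (s, p, o) in list(known):
                if p == TYPE and o == body_class:
                    new_fact = (s, TYPE, head_class)
                    if new_fact not in known:
                        known.add(new_fact)
                        inferred.add(new_fact)
                        changed = True
    return inferred


tbox = [('ex:Dog', SUBCLASS_OF, 'ex:Mammal'),
        ('ex:Mammal', SUBCLASS_OF, 'ex:Animal')]
abox = [('ex:rex', TYPE, 'ex:Dog')]

rules = rules_from_tbox(tbox)       # two class-specific rules
delta = forward_chain(rules, abox)  # facts inferred for ex:rex
```

Each generated rule names specific classes, so it can only match facts about those classes, whereas a generic rule such as {?C rdfs:subClassOf ?SC. ?M a ?C} => {?M a ?SC} has to be matched against every subclass and type statement. That is the intuition behind the tailoring argument, shown here on a deliberately tiny scale.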

Typically, a user will have an RDF graph with instance-level statements (the ABox) and an OWL RDF graph that describes the vocabulary terms used in the instance graph (the TBox). After following the three steps above, the network.inferredFacts graph will have all the RDF statements that can be inferred from the combination of the OWL graph and the instance graph. Note that the DLP algorithm only supports a subset of OWL-DL, so not all OWL graphs will be properly axiomatized.

Finally, a network can be reset via the network.reset() method. This will clear the RETE-UL network, and is useful when you want to set up a network once from an OWL graph and calculate the closure delta graph for multiple instance graphs from the same ruleset. After resetting the network, the TBox graph will need to be sent through the network again, followed by the subsequent instance graph:

network.setupDescriptionLogicProgramming(tBoxGraph)
network.feedFactsToAdd(generateTokenSet(tBoxGraph))
network.feedFactsToAdd(generateTokenSet(someRDFGraph1))
network.reset()
network.feedFactsToAdd(generateTokenSet(tBoxGraph))
network.feedFactsToAdd(generateTokenSet(someRDFGraph2))
..etc..
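
The reuse pattern described above (set the rules up once, then compute a closure delta per instance graph) can be sketched with a small stand-in class. ToyNetwork and its methods are hypothetical and only mimic the shape of the workflow; this is not FuXi's ReteNetwork, and whether the real reset() also empties inferredFacts is not something this sketch claims.

```python
# Hypothetical stand-in for the workflow; not FuXi's ReteNetwork.
SUBCLASS_OF = 'rdfs:subClassOf'
TYPE = 'rdf:type'


class ToyNetwork(object):
    """Holds one fixed rule, {?C rdfs:subClassOf ?SC. ?M a ?C} => {?M a ?SC},
    plus a working memory and a set of inferred facts."""

    def __init__(self):
        self.working_memory = set()
        self.inferred_facts = set()

    def feed_facts(self, facts):
        """Add facts, then re-apply the rule until nothing new is derived."""
        self.working_memory.update(facts)
        changed = True
        while changed:
            changed = False
            subclass_pairs = [(s, o) for (s, p, o) in self.working_memory
                              if p == SUBCLASS_OF]
            for (c, sc) in subclass_pairs:
                for (m, p, o) in list(self.working_memory):
                    if p == TYPE and o == c:
                        new_fact = (m, TYPE, sc)
                        if new_fact not in self.working_memory:
                            self.working_memory.add(new_fact)
                            self.inferred_facts.add(new_fact)
                            changed = True

    def reset(self):
        """Forget all facts (asserted and inferred); the rule itself stays."""
        self.working_memory = set()
        self.inferred_facts = set()


tbox = [('ex:Dog', SUBCLASS_OF, 'ex:Animal')]
network = ToyNetwork()

network.feed_facts(tbox)
network.feed_facts([('ex:rex', TYPE, 'ex:Dog')])
first_delta = set(network.inferred_facts)

network.reset()
network.feed_facts(tbox)  # the TBox has to be fed again after a reset
network.feed_facts([('ex:tom', TYPE, 'ex:Dog')])
second_delta = set(network.inferred_facts)
```

Because reset() empties the working memory, the second instance graph is evaluated without any leftover facts from the first, which is why the TBox statements must be fed in again before it.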
