bionlp-sadi


BioNLP Semantic Web Services (SADI)

Project has been moved to https://sites.google.com/site/bionlpsadi/

Overview

Abstract:

The number of NLP and BioNLP tools published as web services grows every year. Web services do not require installation, are platform independent, and provide access to software modules that cannot be installed on regular computers due to their complexity and resource requirements. Whereas XML is the de facto interchange format for web services, the variety of XML schemas and the absence of semantics make the integration of resources (XML-based web services and their outputs) a challenging task that requires significant effort from end users. We propose the use of semantic web services that provide semantic descriptions of their inputs and outputs to achieve interoperability of BioNLP services and ad-hoc consolidation of their results. We leverage the SADI framework as a development platform to realize, by example, a number of highly integrated application and data integration scenarios.

Targets:

  • Development of BioNLP (and NLP) SADI Semantic Web Services whose input and output are defined in terms of OWL ontologies
  • Interoperability of (Bio)NLP tools and web services
  • Easy building of text-mining pipelines for users without any programming skills
  • Easy comparative evaluation of (Bio)NLP tools (benchmarking, e.g. using SPARQL)

Possible end users:

  • Corpus developers
  • Text mining system developers
  • Bio database curators

Methods:

  • Service output is represented in RDF format.
  • OWL ontologies are used for modelling to ensure interoperability of the web services.
  • Semantic triple stores are used to store, query, and manipulate text-mining results.
  • The SPARQL query language is used for ad-hoc semantic querying of results and for implementing benchmarking metrics for the comparative evaluation of web services.
  • SADI SPARQL clients are used for ad-hoc querying and for consolidating text-mining results with data from biological databases.
  • Third-party tools and APIs provide easy access (Taverna, web interface, SADI Java, Python, and Perl APIs, annotation toolkits with a graphical interface).
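The benchmarking metrics mentioned above are implemented in the project as SPARQL queries over the services' RDF output. As a rough illustration of what such a metric computes, the following sketch uses plain Python over annotation tuples instead of SPARQL. The annotation layout, the service names, and all data are assumptions made for this example and are not taken from the project.

```python
from typing import Dict, Set, Tuple

# Assumed annotation layout: (document id, start offset, end offset, label).
Annotation = Tuple[str, int, int, str]


def precision_recall_f1(predicted: Set[Annotation],
                        gold: Set[Annotation]) -> Dict[str, float]:
    """Exact-match precision, recall, and F1 of predicted vs. gold annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold standard and outputs of two hypothetical services.
gold = {("doc1", 0, 7, "Drug"), ("doc1", 20, 29, "Drug"), ("doc1", 40, 48, "Drug")}
service_a = {("doc1", 0, 7, "Drug"), ("doc1", 20, 29, "Drug"), ("doc1", 60, 66, "Drug")}
service_b = {("doc1", 0, 7, "Drug")}

# Score each service against the same gold standard for a side-by-side comparison.
scores = {
    name: precision_recall_f1(output, gold)
    for name, output in {"service_a": service_a, "service_b": service_b}.items()
}

for name, result in scores.items():
    print(name, result)
```

Because every service describes its output with the same ontology terms, the same comparison can be run over any pair of services without service-specific parsing, which is the point of the comparative evaluation target.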

Related

The ontology contains classes and properties that are missing from the other ontologies used for modelling. For example, the property hasSourceDocument is the inverse of http://purl.org/ao/onSourceDocument, and processedBy links a text document to the service that processed it.
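To make the inverse-property relationship concrete, here is a minimal sketch that treats triples as plain Python tuples and derives the inverse statements. The `http://example.org/bionlp-sadi#` namespace and all resource URIs are hypothetical placeholders, since the ontology's actual namespace is not given here; only http://purl.org/ao/onSourceDocument comes from the text above.

```python
# Triples are plain (subject, predicate, object) tuples of URI strings.
AO_ON_SOURCE_DOCUMENT = "http://purl.org/ao/onSourceDocument"

# Hypothetical namespace: the project's real ontology URI is not stated here.
EX = "http://example.org/bionlp-sadi#"
HAS_SOURCE_DOCUMENT = EX + "hasSourceDocument"
PROCESSED_BY = EX + "processedBy"

# Inverse declarations, recorded in both directions.
INVERSES = {
    AO_ON_SOURCE_DOCUMENT: HAS_SOURCE_DOCUMENT,
    HAS_SOURCE_DOCUMENT: AO_ON_SOURCE_DOCUMENT,
}


def add_inverse_triples(triples):
    """Return the triples plus (o, inverse(p), s) for each predicate with an inverse."""
    inferred = {(o, INVERSES[p], s) for (s, p, o) in triples if p in INVERSES}
    return set(triples) | inferred


# Hypothetical resources.
item = EX + "item1"
document = EX + "document1"
service = EX + "sentenceSplitter"

graph = {
    (item, AO_ON_SOURCE_DOCUMENT, document),
    (document, PROCESSED_BY, service),
}

closed = add_inverse_triples(graph)
for triple in sorted(closed):
    print(triple)
```

An OWL reasoner would derive the same statement from an inverse-property axiom; the dictionary here only stands in for that axiom.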
  • Registry http://cbakerlab.unbsj.ca:8080/sadi-registry-0.1.0-bionlp-sadi/services
  • SADI services can be registered in a SADI registry. The registry gives an overview of the available services and their descriptions (input and output classes). Services are indexed by the properties they attach.
  • Registry 2 http://cbakerlab.unbsj.ca:8080/sadi-registry-0.1.0-bionlp-sadi-ext/services
  • Additional registry with data retrieval services for demo purposes.
  • SHARE Client http://cbakerlab.unbsj.ca:8080/cardioSHARE-bionlp-sadi
  • SHARE is an open-source prototype client for SADI services. It answers SPARQL queries by drawing data from multiple SADI services: the user submits a SPARQL query and gets the results. Here it is used to apply text-mining services to example input (a document or text), to apply SADI data-retrieval services that enrich the text-mining results, and to search over the results at the same time. In effect, it combines text-mining pipelining and search. Results from different services are merged automatically, and the search runs on the merged data. It works for simple use cases such as processing a single text.
  • Web demo http://cbakerlab.unbsj.ca:8080/bionlp-sadi-web-demo
  • The web demo can be used to build pipelines from the services available in the registry. Once a new service is registered in the registry, it appears in the web demo. Insert text and get back a merged RDF graph.

    Demo SPARQL queries

    • Find drugs in text and get their DrugBank IDs
    • Retrieve drug-drug interactions from DrugBank for drugs found in document in the same sentence
      It shows how text-mining results are automatically mashed up and extended with information derived from knowledge bases.
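The demo queries themselves are written in SPARQL and run against the SADI services. As a rough illustration of the logic behind the second query, the sketch below uses plain Python rather than SPARQL: it pairs drugs mentioned in the same sentence and keeps the pairs found in an interaction table. The sentence spans, drug mentions, and the `known_interactions` table are made-up stand-ins for the outputs of a sentence splitter, a drug extraction service, and a DrugBank lookup.

```python
from itertools import combinations

# Hypothetical service outputs: sentence spans (start, end) and
# drug mentions (start, end, name), all as character offsets.
sentences = [(0, 50), (51, 120)]
drug_mentions = [(5, 13, "warfarin"), (30, 37, "aspirin"), (60, 69, "ibuprofen")]

# Hypothetical stand-in for interaction data retrieved from a knowledge base.
known_interactions = {frozenset({"aspirin", "warfarin"})}


def drugs_in_same_sentence(sentence_spans, mentions):
    """Return alphabetically ordered pairs of distinct drugs sharing a sentence."""
    pairs = set()
    for s_start, s_end in sentence_spans:
        names = sorted({
            name for (start, end, name) in mentions
            if s_start <= start and end <= s_end
        })
        pairs.update(combinations(names, 2))
    return pairs


co_mentioned = drugs_in_same_sentence(sentences, drug_mentions)

# Keep only the co-mentioned pairs that the interaction table knows about.
interacting = [
    pair for pair in sorted(co_mentioned)
    if frozenset(pair) in known_interactions
]
print(interacting)
```

In the actual setup this join is expressed declaratively in the query, and the client decides which services to call to supply the sentence, drug, and interaction data.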

    Example services

    • Drug Extraction
    • Sentence Splitter

    Modelling

    Next presentation at DILS2013 in July.

    References



    Citation

    Ahmad C. Bukhari, Artjom Klein, Christopher J. O. Baker, "Towards Interoperable BioNLP Semantic Web Services Using the SADI Framework", Data Integration in the Life Sciences, Lecture Notes in Computer Science, Volume 7970, 2013, pp. 69-80.
    http://link.springer.com/chapter/10.1007%2F978-3-642-39437-9_6

    Project Information

    The project was created on Oct 2, 2012.

    Labels:
    BioNLP NLP natural-language-processing text-mining semantic-web-services SADI