UCI_Requirements  
Featured, Phase-Design, Phase-Requirements, Phase-Support
Updated Feb 4, 2010 by ruv...@gmail.com

Introduction

This document specifies the implementation of a semantic process that brokers access to, and represents, multiple cloud providers, whether they are cloud-platform or cloud-infrastructure designs. The concept is to provide a single interface that can be used to retrieve a unified representation of all multi-cloud resources and to control those resources as needed.

What is the Semantic Web and why does it matter for a unified cloud interface?

Wikipedia describes the Semantic Web as "a vision of information that is understandable by computers, so that they can perform more of the tedious work involved in finding, sharing and combining information on the web." Why not apply the same philosophy to the underlying control structures of the web (i.e. the cloud) itself? The result would be a kind of Semantic Cloud Infrastructure that can be adapted to a variety of methodologies and architectures and is completely agnostic to any specific API or platform being described: a general abstraction that doesn't care whether you're talking about platforms (Google App Engine, Salesforce, Mosso, etc.), applications (SaaS, Web 2.0, email, id/auth) or infrastructure models (EC2, VMware, CIM, Microsoft, etc.).

Vision

The key driver of a unified cloud interface is to create an API for and about other APIs (one abstraction to rule them all): a singular abstraction that can encompass the entire infrastructure stack as well as emerging cloud-centric technologies, all through a unified interface. What a semantic model enables for the UCI is the capability to bridge cloud-based APIs such as Amazon Web Services with existing protocols and standards, regardless of the level of adoption of the underlying APIs. (Develop your application once, deploy anywhere at any time for any reason.)

The other benefit of a semantic model is future-proofing. Rather than implementing a static specification based on current technology limitations, we assume that the industry is moving forward, make no assumptions about future advances in technology, and create a model that can adapt as we adapt.

In this vision for a unified cloud interface, the Resource Description Framework (RDF) or something similar would be an ideal method to describe our semantic cloud data model (taxonomy and ontology). The benefit of RDF-based ontology languages is that they act as a general method for the conceptual description or modeling of information that is implemented by web resources. These web resources could just as easily be "cloud resources" or APIs. This approach may also allow us to easily take an RDF-based cloud data model and use it within other ontology languages or web services, making it both platform and vendor agnostic. Using this approach, we're not so much defining how as describing what.

Use Case

A user has access to internal cloud infrastructure and to an external cloud for off-site processing, such as Amazon EC2. This user normally sees the cloud infrastructure through multiple access programs with multiple credentials. The UCI Agent is installed on a single server somewhere in the user's infrastructure and is configured to connect to a Jabber server, with access credentials for the two cloud providers. The agent starts up, connects to both cloud providers and the Jabber server, then waits for XMPP commands.

The user opens the UCI Interface (unspecified at this point; we'll assume a web browser) and authenticates with the UCI Agent process. The user directs the Interface to query the agent for the current running state of both cloud infrastructures. The UCI Agent returns an XML RDF document representing the state of the cluster, and the UCI Interface renders it for the user. The user then chooses to start up multiple machines on Amazon EC2. She does this through the UCI Interface, which directs the UCI Agent to provision multiple machines of the specified type. While this job is running, the UCI Interface occasionally probes the agent for state information to show progress.

Every time the UCI Agent gets a request for state, it returns this XML RDF document quickly, because it stores the state of the cluster in a local data store and does not need to probe the cloud itself. While the job is running, the user also directs the interface to delete machines on the UCI network to free up resources. This operation also shows progress as it proceeds. The user then decides she would like to add another cloud provider for the agent to manage. She shuts down the UCI Agent, installs a new egg for the Google App Engine plugin and provides her credentials in the config file.

The user starts the agent back up and refreshes the UCI Interface, which now shows all the new resources on Google App Engine. This new representation shows applications and statistics on these applications instead of virtual machines, as is appropriate for a cloud platform. The user logs out of the system, and all resources continue to function and state continues to update as changes happen on the cloud providers.
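The caching behaviour in this use case (the agent answers state requests from a local data store instead of probing each cloud) could be sketched roughly as follows. This is only an illustration under assumptions: the `StateCache` class, its table layout and the resource identifiers are hypothetical, and SQLite is used because it is one of the two storage options named under Interface Specifications.

```python
import json
import sqlite3
import time


class StateCache:
    """Hypothetical local cache of the last known state of each resource."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS resources ("
            " provider TEXT, resource_id TEXT, state TEXT, updated REAL,"
            " PRIMARY KEY (provider, resource_id))"
        )

    def update(self, provider, resource_id, state):
        """Record a change observed on a provider (overwrites older state)."""
        self.db.execute(
            "INSERT OR REPLACE INTO resources VALUES (?, ?, ?, ?)",
            (provider, resource_id, json.dumps(state), time.time()),
        )
        self.db.commit()

    def remove(self, provider, resource_id):
        """Forget a resource that was deleted on the provider."""
        self.db.execute(
            "DELETE FROM resources WHERE provider = ? AND resource_id = ?",
            (provider, resource_id),
        )
        self.db.commit()

    def snapshot(self):
        """Answer a state request from the cache, without contacting a cloud."""
        rows = self.db.execute(
            "SELECT provider, resource_id, state FROM resources"
            " ORDER BY provider, resource_id"
        )
        return [
            {"provider": p, "id": r, "state": json.loads(s)} for p, r, s in rows
        ]


# Illustrative usage with made-up identifiers.
cache = StateCache()
cache.update("ec2", "i-0001", {"status": "pending"})
cache.update("ec2", "i-0001", {"status": "running"})
cache.update("internal", "vm-7", {"status": "running"})
print(cache.snapshot())
```

In a design like this, provider plugins would call `update` and `remove` as they observe changes, so a request for state costs one local query regardless of how many providers are configured.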

Details

Interface Specifications

  • Agent Code - Python daemonized process with event handling (possibly using Twisted).
  • Agent Configuration - Config file stored in a known location for the agent to parse on startup.
  • Agent Storage - Berkeley DB or SQLite DB stored in a known location for the agent to open on startup.
  • Agent Logging - Standard Python syslog logging.
  • Agent Command Interface - XMPP Server (API to be determined)
  • Agent Plugins - Python based code that hooks into Agent Code (hooking method unknown).
  • Cloud Control - Agent Plugin designed as a class with standardized, versioned methods.
  • Testing framework - A testing framework must be designed alongside this application to validate all functions of the system. It must interface with the Agent externally over XMPP to ensure that changes do not break the application.
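Since the hooking method for plugins is still unknown, the following is only one possible shape for the "class with standardized, versioned methods" idea. Every name here (`CloudPlugin`, `INTERFACE_VERSION`, `Agent.register`, `FakeProvider`) is a hypothetical placeholder, and the provider is an in-memory stand-in rather than a real cloud API.

```python
class CloudPlugin:
    """Hypothetical base class that every provider plugin would subclass."""

    INTERFACE_VERSION = 1  # version of the method set this plugin implements
    name = "base"

    def __init__(self, credentials):
        self.credentials = credentials

    def list_resources(self):
        raise NotImplementedError

    def start(self, resource_type, count):
        raise NotImplementedError

    def stop(self, resource_id):
        raise NotImplementedError


class Agent:
    """Keeps a registry of plugins and rejects unsupported interface versions."""

    SUPPORTED_VERSIONS = {1}

    def __init__(self):
        self.plugins = {}

    def register(self, plugin):
        if plugin.INTERFACE_VERSION not in self.SUPPORTED_VERSIONS:
            raise ValueError(
                "plugin %r uses unsupported interface version %r"
                % (plugin.name, plugin.INTERFACE_VERSION)
            )
        self.plugins[plugin.name] = plugin

    def list_all(self):
        """Unified view: resources of every registered provider, by name."""
        return {n: p.list_resources() for n, p in sorted(self.plugins.items())}


class FakeProvider(CloudPlugin):
    """In-memory stand-in for a real provider, for illustration only."""

    name = "fake"

    def __init__(self, credentials):
        super().__init__(credentials)
        self._machines = {}
        self._next_id = 1

    def list_resources(self):
        return sorted(self._machines)

    def start(self, resource_type, count):
        started = []
        for _ in range(count):
            resource_id = "%s-%d" % (resource_type, self._next_id)
            self._next_id += 1
            self._machines[resource_id] = "running"
            started.append(resource_id)
        return started

    def stop(self, resource_id):
        del self._machines[resource_id]


agent = Agent()
provider = FakeProvider({"user": "demo"})
agent.register(provider)
provider.start("small", 2)
print(agent.list_all())
```

Checking the version at registration time means an outdated plugin fails when the agent starts, not later when a command is dispatched to it.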

Resource Description Framework RDF

The RDF format shall be used as a mechanism for describing cloud / ECP resources. An RDF statement is a triple consisting of a subject, a predicate and an object. The subject is a resource, possibly named by a Uniform Resource Identifier (URI). Some resources are unnamed; these are called blank nodes or anonymous resources and are not directly identifiable. The predicate is a resource as well, representing a relationship. The object is a resource or a Unicode string literal.
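To make the subject / predicate / object structure concrete for a cloud resource, here is a minimal sketch that holds statements as Python tuples and writes them out as RDF/XML with the standard library. The `http://example.org/uci#` vocabulary, the property names and the machine URI are assumptions made for illustration; this document does not define a vocabulary. The sketch handles only URI subjects with string-literal objects and ignores blank nodes.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
UCI_NS = "http://example.org/uci#"  # hypothetical vocabulary, for illustration

Triple = namedtuple("Triple", "subject predicate object")


def to_rdf_xml(triples):
    """Serialize triples (URI subject, URI predicate, literal object)."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("uci", UCI_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    descriptions = {}  # one rdf:Description element per distinct subject
    for triple in triples:
        description = descriptions.get(triple.subject)
        if description is None:
            description = ET.SubElement(
                root,
                "{%s}Description" % RDF_NS,
                {"{%s}about" % RDF_NS: triple.subject},
            )
            descriptions[triple.subject] = description
        # Split the predicate URI into namespace and local name at the '#'.
        namespace, local_name = triple.predicate.rsplit("#", 1)
        prop = ET.SubElement(description, "{%s#}%s" % (namespace, local_name))
        prop.text = triple.object
    return ET.tostring(root, encoding="unicode")


machine = "http://example.org/clouds/ec2/i-0001"  # made-up resource URI
statements = [
    Triple(machine, UCI_NS + "provider", "ec2"),
    Triple(machine, UCI_NS + "status", "running"),
]
print(to_rdf_xml(statements))
```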

The following example shows how such simple claims can be elaborated on, by combining multiple RDF vocabularies. Here, we note that the primary topic of the Wikipedia page is a "Person" whose name is "Tony Benn":

<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:foaf="http://xmlns.com/foaf/0.1/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>

Query and inference languages

SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns.

PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital ;
     abc:isCapitalOf ?y .
  ?y abc:countryname ?country ;
     abc:isInContinent abc:Africa .
}

Variables are indicated by a "?" or "$" prefix. Bindings for ?capital and ?country will be returned.

The SPARQL query processor will search for sets of triples that match these four triple patterns, binding the variables in the query to the corresponding parts of each triple. Important to note here is the "property orientation": class matches can be conducted solely through class attributes or properties.

To make queries concise, SPARQL allows the definition of prefixes and base URIs in a fashion similar to Turtle. In this query, the prefix "abc" stands for "http://example.com/exampleOntology#".
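The way a query processor binds variables across several triple patterns can be illustrated with a naive matcher over an in-memory list of triples. This is a teaching sketch only, not how a real SPARQL engine is implemented. The sample triples are invented for illustration, and prefixed names are kept as plain strings without being expanded to full URIs.

```python
def is_var(term):
    """Pattern terms starting with '?' are variables."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend bindings so pattern matches triple, or return None."""
    extended = dict(bindings)
    for pattern_term, value in zip(pattern, triple):
        if is_var(pattern_term):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(pattern_term, value) != value:
                return None
        elif pattern_term != value:
            return None
    return extended


def query(patterns, triples):
    """Return every set of bindings that satisfies all patterns together."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Invented sample data, for illustration only.
TRIPLES = [
    ("abc:nairobi", "abc:cityname", "Nairobi"),
    ("abc:nairobi", "abc:isCapitalOf", "abc:kenya"),
    ("abc:kenya", "abc:countryname", "Kenya"),
    ("abc:kenya", "abc:isInContinent", "abc:Africa"),
    ("abc:paris", "abc:cityname", "Paris"),
    ("abc:paris", "abc:isCapitalOf", "abc:france"),
    ("abc:france", "abc:countryname", "France"),
    ("abc:france", "abc:isInContinent", "abc:Europe"),
]

# The four triple patterns of the capital/country query.
PATTERNS = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]

for row in query(PATTERNS, TRIPLES):
    print(row["?capital"], row["?country"])
```

Because ?x and ?y appear in more than one pattern, a candidate triple is accepted only when it agrees with the values already bound, which is what joins the four patterns into a single answer.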

<xs:complexType name="StorageTemplateType">
  <xs:sequence>
    <xs:element ref="storage-t-xs:Size" />
    <xs:element ref="storage-t-xs:RaidLevel" />
    <xs:element ref="storage-t-xs:Hosts" minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
</xs:complexType>
<xs:element name="Size" type="xs:unsignedInt" />
<xs:element name="RaidLevel" type="xs:unsignedInt" />
<xs:element name="Hosts" type="storage-t-xs:HostType" />
<xs:complexType name="HostType">
  <xs:sequence>
    <xs:element name="WWN" type="xs:string" minOccurs="1" maxOccurs="unbounded" />
  </xs:sequence>
</xs:complexType>

Pros & Cons

The primary strengths of RDF/OWL are:

  • support for information integration and reuse of shared vocabularies
  • handling of semi-structured data
  • separation of syntax from data modelling
  • web embedding
  • extensibility and resilience to change
  • support for inference and classification, based on a formal semantics
  • representation flexibility, especially ability to model graph structures
  • ability to represent instance and class information in the same formalism and hence combine them

Weaknesses noted are:

  • weak ability to validate documents
  • expressivity limitations, particularly in terms of correlating across different properties of a resource
  • performance
  • XML serialization issues and impedance mismatch with XML tooling
  • lack of familiarity and potentially high learning curve
  • inability to natively represent uncertain data and continuous domains
  • no built-in representation of processes and change

RDF/OWL is particularly suited to modelling applications which involve distributed information problems such as integration of data from multiple sources, publication of shared vocabularies to enable interoperability and development of resilient networks of systems which can cope with changes to the data models.

More details here:

http://www.hpl.hp.com/techreports/2005/HPL-2005-189.html

Potential implementation

http://www.w3.org/TR/rdf-sparql-json-res/
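If query results were returned in the SPARQL JSON results format linked above, a response document could be assembled along these lines. This is a sketch under assumptions: the helper name `to_sparql_json` is hypothetical, the sample row is invented, and every value is treated as a plain literal, whereas the format also distinguishes URIs and blank nodes.

```python
import json


def to_sparql_json(variables, solutions):
    """Build a results document: a 'head' listing the variables, then bindings."""
    return {
        "head": {"vars": list(variables)},
        "results": {
            "bindings": [
                {
                    var: {"type": "literal", "value": row[var]}
                    for var in variables
                    if var in row  # unbound variables are simply omitted
                }
                for row in solutions
            ]
        },
    }


# Invented sample row, for illustration only.
document = to_sparql_json(
    ["capital", "country"],
    [{"capital": "Nairobi", "country": "Kenya"}],
)
print(json.dumps(document, indent=2))
```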

Diagram

