hydrant-kepler


Web front end for the Kepler scientific workflow application

Kepler (http://www.kepler-project.org)

Kepler is a desktop application for building and executing scientific workflows. A workflow in Kepler is built by selecting specialised units of computation (called Actors) and linking their inputs and outputs to perform the desired overall task. Actors are generally simple concepts such as file reading/writing and basic mathematics, but they can also perform more complex tasks such as web service calls, command line execution and even grid submission.

Creation of workflows is made simple by supplying a GUI with a library of available actors and a Canvas similar to conventional drawing tools. Actors are added to the workflow by dragging the desired actor from the library and dropping it on the canvas. Links between actors are then made by drawing lines between the actors' inputs and outputs.

Kepler is based on an engineering modelling tool named Ptolemy II, from which it inherits the Director concept. A Director controls how the workflow is executed, i.e., what order the actors execute in, whether they block while waiting for input, and much more. This ability to change the execution model is what separates Kepler from other workflow tools. Kepler is built using the Java programming language and contains a very easy API for building custom actors, making it easy for users who have specialised requirements for their workflows to build their own actors.
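To make the Actor and Director ideas concrete, here is a minimal toy model in plain Python. It is not Kepler's or Ptolemy II's API: the `Actor` and `SequentialDirector` classes and all names in it are hypothetical, and the director shown implements only one very simple execution model (fire each actor once, after its upstream actors have produced output).

```python
class Actor(object):
    """A unit of computation with upstream inputs and one output (toy model)."""

    def __init__(self, name, func, inputs=()):
        self.name = name
        self.func = func
        self.inputs = list(inputs)  # upstream actors linked to this one

    def fire(self, values):
        return self.func(*values)


class SequentialDirector(object):
    """Fires each actor once, after all of its upstream actors have fired."""

    def run(self, actors):
        results = {}
        pending = list(actors)
        while pending:
            # An actor is ready when every upstream actor has a result.
            ready = [a for a in pending
                     if all(up.name in results for up in a.inputs)]
            if not ready:
                raise ValueError("workflow has a cycle or a missing actor")
            for actor in ready:
                args = [results[up.name] for up in actor.inputs]
                results[actor.name] = actor.fire(args)
                pending.remove(actor)
        return results


# Two constant sources feed an adder, whose output feeds a doubler.
const_a = Actor("a", lambda: 2)
const_b = Actor("b", lambda: 3)
add = Actor("add", lambda x, y: x + y, inputs=[const_a, const_b])
double = Actor("double", lambda x: x * 2, inputs=[add])

# The order the actors are listed in does not matter; the director decides.
results = SequentialDirector().run([double, add, const_a, const_b])
```

Swapping in a different director class would change how and when the same actors fire without touching the actors themselves, which is the separation the Director concept provides.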

Problems with Kepler

For someone with little technical knowledge (i.e. your average scientist), Kepler is hard. As simple as the visual interface for building workflows is, it's still not simple enough. The application is analogous to a programming IDE. It is an excellent tool for building and testing workflows, but deploying them is another matter. Simply setting users up with a copy of Kepler and expecting them to run their workflows, which may require tweaking the actors' properties for different runs, is like expecting a user to modify a program's source code each time they want to run the program. Another problem is that, with the exception of workflows that use grid job submission actors, all workflow execution is performed on the user's desktop. This is not an ideal environment for computationally heavy or long-running jobs.

The Solution

Kepler provides a great tool for building and testing workflows, so all that needs to be done is to build a simple, easy-to-use platform for deploying and executing workflows. Hydrant (which is just a codename for now) is a web-based portal that sits on top of the core Kepler engine and whose end goal is to solve this problem.

Design Concepts

The current concept driving the design of Hydrant is as follows:

1. Workflow “designers” build workflows for scientists using the Kepler desktop application.
1. The designer uploads the workflow to Hydrant.
1. From Hydrant, the designer can configure specific input parameters of the workflow to be displayed to the scientists via a standard web form.
1. The scientist logs into Hydrant, selects the workflow, fills in the web form and submits a job.
1. The job is then executed, and when it completes the outputs from the workflow are made available through a web page in Hydrant.
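The middle steps of this flow (exposing parameters, filling in the form, submitting a job) can be sketched in plain Python. This is an illustration only, not Hydrant's code: the dictionary layout, the `actor.property` key format, and the `build_form` and `submit_job` helpers are all assumptions made for the example.

```python
import copy

# Hypothetical workflow properties, keyed by "actor.property".
workflow = {
    "reader.fileName": "/data/default.csv",
    "filter.threshold": "0.5",
    "writer.outputDir": "/tmp/out",
}

# The designer chooses which properties scientists may change,
# and gives each one a human-readable label for the web form.
exposed = {
    "reader.fileName": "Input file",
    "filter.threshold": "Threshold",
}


def build_form(workflow, exposed):
    """Return (label, key, default) tuples for the exposed properties only."""
    return [(exposed[key], key, workflow[key]) for key in sorted(exposed)]


def submit_job(workflow, exposed, form_data):
    """Apply submitted values to a copy of the workflow.

    Keys the designer did not expose are rejected, and the stored
    workflow itself is never modified.
    """
    job = copy.deepcopy(workflow)
    for key, value in form_data.items():
        if key not in exposed:
            raise KeyError("property not exposed: %s" % key)
        job[key] = value
    return job


form = build_form(workflow, exposed)
job = submit_job(workflow, exposed, {"filter.threshold": "0.9"})
```

Working on a copy keeps the uploaded workflow reusable across jobs, so each run carries only its own job-specific values.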

Status

Hydrant is still in proof-of-concept stages.

The basic things that have been implemented are:

* Loading workflows with Kepler, allowing direct access to workflow properties
* Web view of Kepler workflow
* Displaying and editing of actor properties
* Executing workflows
* Web form generation for job specific changes to workflow properties
* Capturing basic workflow output for display on the web
* Workflow repository searching
* Basic workflow sharing

Things that need doing in the near future:

* Workflow validation
    * making sure the files uploaded are valid workflows
    * making sure all dependencies of the workflow are met
* More work on translating actor property types to web form view
* More advanced workflow sharing
* Notification system (alert users of finished jobs, errors, etc.)
* User Interface enhancements
* much, much more...
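As a rough idea of what the two validation checks might look like, here is a sketch in plain Python. It assumes the upload is a MoML-style XML file whose root is an `entity` element and whose actors are nested `entity` elements with a `class` attribute. The `validate_workflow` function, the `available_classes` set and the `example.MissingPlotter` class name are hypothetical, and nothing here is taken from Hydrant itself.

```python
import xml.etree.ElementTree as ET


def validate_workflow(xml_text, available_classes):
    """Return a list of problems; an empty list means both checks passed."""
    # Check 1: is the upload a workflow at all?
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return ["not well-formed XML: %s" % exc]
    if root.tag != "entity":
        return ["root element is <%s>, expected <entity>" % root.tag]

    # Check 2: is every actor class the workflow depends on available?
    problems = []
    for entity in root.iter("entity"):
        if entity is root:
            continue
        cls = entity.get("class")
        if cls not in available_classes:
            problems.append("actor %s needs unavailable class %s"
                            % (entity.get("name"), cls))
    return problems


SAMPLE = """<entity name="demo" class="ptolemy.actor.TypedCompositeActor">
  <entity name="Const" class="ptolemy.actor.lib.Const"/>
  <entity name="Plot" class="example.MissingPlotter"/>
</entity>"""

# Assumed list of actor classes installed on the server.
available = set(["ptolemy.actor.lib.Const"])
problems = validate_workflow(SAMPLE, available)
```

Returning a list of messages rather than raising on the first failure lets the upload page report every unmet dependency at once.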

Implementation

The proof-of-concept prototype of Hydrant has been written in Jython, using Django as the web framework.

Jython (http://www.jython.org) was chosen for the rapid turn-around and rapid prototyping that the Python language supplies, along with Jython's ability to directly access Java classes, thus allowing direct access to the Kepler engine.
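As a small illustration of that Java access, under Jython a Java package can be imported like a Python module. The sketch below uses the standard `java.util.ArrayList` class rather than any Kepler class, since no Kepler class names are given here; the `make_names` helper is hypothetical, and the fallback branch exists only so the snippet also runs under a regular Python interpreter.

```python
def make_names(names):
    """Collect names in a Java ArrayList under Jython, or a list otherwise."""
    try:
        # Under Jython, Java classes import like ordinary Python modules.
        from java.util import ArrayList
    except ImportError:
        # Not running on the JVM: fall back to a plain Python list.
        return list(names)
    java_list = ArrayList()
    for name in names:
        java_list.add(name)  # calls the Java method directly
    return java_list


actor_names = make_names(["Const", "Display"])
```

The same mechanism is what would let Python code call into a Java engine without any bridging layer in between.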

Django (http://www.djangoproject.org) was chosen simply for exploratory reasons. It was a new web framework that seemed to have huge potential. It has so far provided all the required functionality, and it is quick to develop under and easy to learn.

js-graph-it (http://js-graph-it.sourceforge.net) has been
