EngineConcepts
Core Sarasvati engine concepts.
Core Concepts
Introduction
Graph based workflow/business process management engines are common. They have areas of commonality, but they also vary greatly in concept and implementation. For example, there are differences in how concurrency and synchronization are modeled and in how modularity and re-use are promoted. We begin with some definitions, move on to features likely to be common across most engines, then explain Sarasvati specifics.
Definitions
Graphs come with a set of common terms. To begin with, a graph is made up of a set of things, hereafter referred to as nodes, and a set of connections between nodes, known as arcs.
These definitions cover the parts of a process definition. However, they don't cover how that process definition is actually executed. When a process definition gets executed, the execution is called a process. Somehow, a process must track which nodes are being executed. This is generally accomplished by placing markers called tokens on the active nodes.
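The terms above can be sketched as plain data structures. This is a minimal illustration only: the class and method names (`Node`, `Arc`, `Graph`, `Process`, `successors`, `place_token`) are assumptions made for this sketch and are not Sarasvati's actual API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    name: str


@dataclass(frozen=True)
class Arc:
    start: Node  # node the arc leaves
    end: Node    # node the arc enters


@dataclass
class Graph:
    """A process definition: nodes plus the arcs connecting them."""
    nodes: list
    arcs: list

    def successors(self, node):
        return [arc.end for arc in self.arcs if arc.start == node]


@dataclass
class Process:
    """One execution of a process definition."""
    graph: Graph
    tokens: list = field(default_factory=list)  # nodes currently holding a token

    def place_token(self, node):
        self.tokens.append(node)


a, b = Node("A"), Node("B")
graph = Graph(nodes=[a, b], arcs=[Arc(a, b)])
process = Process(graph)
process.place_token(a)
print([n.name for n in process.tokens])        # ['A']
print([n.name for n in graph.successors(a)])   # ['B']
```

The split between `Graph` and `Process` mirrors the distinction drawn above: the definition is static, while the process carries the tokens that mark where execution currently is.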
Sarasvati Graph Execution
Let us start with a simple process definition, the classic 'Hello World'. When executed, this process will print out 'Hello, World!' and then complete.
Legend
First, we introduce a graphical notation for process definitions and execution. Not all the symbols will make sense immediately, but they will all be explained.
Single Node
The simplest useful process definition consists of a single node. Here is the graphical representation:
How will this process be executed? First the engine needs to determine where to start execution.
There are various ways of handling this. For example, there may be a specific type of node designated for start positions. All nodes of this type will have tokens placed in them at process start. Alternately, nodes may have an attribute which indicates whether or not they are a start node, allowing any node to be a start node. Sarasvati takes this second approach. Assuming that the 'Hello World' node is a start node, execution would begin by creating a new node token at the 'Hello World' node.
With the addition of the node token, the process would now look like:
As you can see, the node now has an active node token stationed on it. At this point the node has not yet been executed. Before it can be, its guard would need to be invoked.
By default, a node's guard will return Accept. The node will then be executed. This should cause 'Hello, World!' to be printed out.
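The sequence just described (place a node token on each start node, invoke the guard, execute on Accept) can be sketched roughly as follows. The names `GuardResponse`, `NodeToken`, and `start_process` are hypothetical and chosen for this illustration; they are not taken from the Sarasvati API.

```python
from enum import Enum


class GuardResponse(Enum):
    ACCEPT = "accept"


class Node:
    def __init__(self, name, is_start=False, action=None):
        self.name = name
        self.is_start = is_start              # any node may be flagged as a start node
        self.action = action or (lambda: None)

    def guard(self, token):
        # Default guard: always let the token in.
        return GuardResponse.ACCEPT


class NodeToken:
    def __init__(self, node):
        self.node = node
        self.complete = False


def start_process(nodes):
    """Place a node token on every start node, then run guard and action."""
    tokens = [NodeToken(n) for n in nodes if n.is_start]
    for token in tokens:
        if token.node.guard(token) is GuardResponse.ACCEPT:
            token.node.action()
            token.complete = True
    return tokens


output = []
hello = Node("Hello World", is_start=True,
             action=lambda: output.append("Hello, World!"))
tokens = start_process([hello])
print(output[0])  # Hello, World!
```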
As there are no further steps in the process, it is now complete and looks like:
The entire process can be viewed here.
Two Nodes
Let's now examine a slightly more complicated example. Instead of a single node, we'll have two, the first of which prints out 'Hello', the second prints out 'World'. It looks as follows:
The Hello node is a predecessor of the World node. This dependency is indicated by the directed arc. As the Hello node is marked as a start node, a node token will be placed there when the process begins executing.
When the node token on Hello is completed, an arc token will be generated on the outgoing arc.
Since the arc on which the arc token is situated goes into a non-join node, a node token will be created on World immediately.
The process now looks like:
The World node will now run its guard and then execute. Finally the node token will be completed.
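The hand-off from node token to arc token to node token can be traced with a small simulation. This is a sketch under simplifying assumptions: every guard accepts, no node is a join node, and no token waits. The `run` function and its log format are invented for illustration.

```python
class Node:
    def __init__(self, name, is_start=False):
        self.name = name
        self.is_start = is_start


class Arc:
    def __init__(self, start, end):
        self.start = start
        self.end = end


def run(nodes, arcs):
    """Execute a graph with no join nodes and return a log of events."""
    log = []
    pending = [n for n in nodes if n.is_start]  # nodes holding a live node token
    while pending:
        node = pending.pop(0)
        log.append(f"node token on {node.name}")
        log.append(f"executed {node.name}")     # guard assumed to return Accept
        for arc in arcs:
            if arc.start is node:
                log.append(f"arc token on {arc.start.name}->{arc.end.name}")
                # Non-join target: a node token is created immediately.
                pending.append(arc.end)
    return log


hello = Node("Hello", is_start=True)
world = Node("World")
log = run([hello, world], [Arc(hello, world)])
for line in log:
    print(line)
```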
The entire process can be viewed here.
Split and Join with Wait States
Let us now examine an example which contains concurrent execution. The process describes an approval process.
The process looks like:
This is a simplified system, since it does not allow approvals to be denied. There is more than one way that this process could be executed.
Let us view process execution for both these cases, starting with the case where approvals are done by people and thus tokens will need to enter wait states. Execution will begin as usual, by placing a node token in the nodes marked as being start nodes. The Request node will be executed. It generates a task for the requester to complete. Until the requester has filled out the request and completed the task, the token will be in a wait state. During this time the process will look like:
Question: What happens once the Request has been completed? Which arc or arcs will arc tokens be generated on? Answer: Sarasvati requires that an arc name be specified when completing a node token. All arcs with this name will have arc tokens generated on them. Some things to note:
So now the node token on Request has been completed and arc tokens will be generated on the outgoing arcs. First an arc token will be generated on the upper arc (though order of arc execution is not guaranteed).
This arc leads to a node which can be executed. The arc token will be completed and a node token will be placed in the Approval 1 node.
Here the node token will enter a wait state. Since no further execution can take place here, an arc token will now be generated on the second outgoing arc.
Again, since node Approval 2 can be executed immediately, the arc token will be completed and a node token will be created. It will also enter into a wait state once the notification to the user has been created.
At some point one of the approvals will be completed. Let's say that it's Approval 2. This will mark the node token complete and generate an arc token on the outgoing arc.
Now the engine will see if the Grant node can be executed. However, as the dashed border indicates, the Grant node is a join node.
Since there are two arcs with the 'default' name coming into Grant, and only one of them has an arc token, the node cannot be executed at this time. Execution will halt at this point. At some point later, the token at Approval 1 is completed. This generates an arc token on the outgoing arc.
Now when the engine tries to execute Grant it finds arc tokens on all the incoming 'default' arcs. These arc tokens are marked complete and a node token is generated on Grant.
Once the Grant task is finished, its node token will also be completed and the process will be complete.
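The join behaviour walked through above can be sketched as a small state machine. This is an illustrative model, not Sarasvati code: the `Process` class, its `complete` method, and the use of plain strings for nodes are all assumptions. Every node is treated as entering a wait state, so a node token stays put until `complete` is called on it.

```python
class Arc:
    def __init__(self, start, end, name="default"):
        self.start, self.end, self.name = start, end, name


class Process:
    def __init__(self, arcs, join_nodes):
        self.arcs = arcs
        self.join_nodes = set(join_nodes)
        self.arc_tokens = []    # arcs currently holding an active arc token
        self.node_tokens = []   # nodes holding a waiting node token

    def complete(self, node, arc_name="default"):
        """Complete the waiting node token on `node`, exiting on `arc_name`."""
        self.node_tokens.remove(node)
        for arc in self.arcs:
            if arc.start == node and arc.name == arc_name:
                self.arc_tokens.append(arc)
                self._try_enter(arc)

    def _try_enter(self, arc):
        target = arc.end
        if target in self.join_nodes:
            # A join needs arc tokens on all incoming arcs with the same name.
            needed = [a for a in self.arcs
                      if a.end == target and a.name == arc.name]
            if not all(a in self.arc_tokens for a in needed):
                return  # join not satisfied yet; execution halts here
            consumed = needed
        else:
            consumed = [arc]
        for a in consumed:
            self.arc_tokens.remove(a)      # arc tokens are marked complete
        self.node_tokens.append(target)    # new node token, in a wait state


arcs = [Arc("Request", "Approval 1"), Arc("Request", "Approval 2"),
        Arc("Approval 1", "Grant"), Arc("Approval 2", "Grant")]
p = Process(arcs, join_nodes=["Grant"])
p.node_tokens.append("Request")   # start: node token on Request
p.complete("Request")
after_request = list(p.node_tokens)
p.complete("Approval 2")
after_first_approval = list(p.node_tokens)
p.complete("Approval 1")
print(after_request)          # ['Approval 1', 'Approval 2']
print(after_first_approval)   # ['Approval 1']
print(p.node_tokens)          # ['Grant']
```

After Approval 2 completes, an arc token sits on its outgoing arc but Grant receives no node token. Only when Approval 1 completes are both incoming 'default' arcs populated and the join released.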
The entire process can be viewed here.
Multithreading
As seen in the previous example, a process may have multiple tokens active concurrently. Does this imply that each token executes in a separate thread? No. Concurrency here is like that of multiple programs running on a single chip. Each runs in turn, but may present the appearance of running simultaneously. However, true multithreading can be done at the node level. Each node, when executed, may hand off its work to a background thread. The node token will then enter a wait state, and other nodes may be executed. When the background task is complete, it may then complete the node token, allowing further execution. Note that only one thread may safely execute the process at any given time, and care must be taken to serialize access to the process itself.
Split and Join without Wait States
Let's now take a look at the same process, except now the approvals will be done by software and will not require a wait state. The execution will be the same up to the point where Approval 1 is executing.
Previously, the node token went into a wait state. This time, the approval is done synchronously and the token will be completed. This will generate an arc token on the outgoing arc.
Again, the Grant node is a join node, so it will wait for an arc token on the other incoming arc before executing. Execution will continue on the lower outgoing arc of Request.
Execution will continue into Approval 2.
This execution will also finish synchronously and an arc token will be generated on the outgoing arc.
Execution will finish as before now that all required incoming arcs have tokens on them. The entire process can be viewed here.
Flow Control with Guards using Skip
Now that we've seen how execution can split across arcs and how join nodes can bring concurrent executions back together, let us examine how to select which outgoing arcs receive tokens and which nodes get executed. This example uses almost the same process as the previous example. The difference is that either or both approvals may be optional, depending on what is being requested. Let us pick up execution after the request has been entered and an arc token generated on the upper arc:
Now the node token will be generated in Approval 1.
However, remember that this does not mean that the node will immediately execute. First the guard must be invoked. Up until now, the guard has always been assumed to just return Accept. This time however, the guard is intelligent. It will check to see if this approval is required. If not, it will return a Skip response.
Assume that Approval 1 is not required. The node token will be marked as having skipped the node, and execution will continue on the outgoing arc.
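A guard that returns Skip can be sketched as below. The names `GuardResponse`, `approval_guard`, and `enter_node` are hypothetical, and the sketch assumes nodes execute synchronously with no wait state.

```python
from enum import Enum


class GuardResponse(Enum):
    ACCEPT = "accept"
    SKIP = "skip"


def approval_guard(required):
    """Build a guard that accepts only when the approval is required."""
    def guard(token):
        return GuardResponse.ACCEPT if required else GuardResponse.SKIP
    return guard


def enter_node(name, guard, execute):
    """Create a node token on `name`, consult its guard, report the outcome."""
    token = {"node": name, "skipped": False, "executed": False}
    response = guard(token)
    if response is GuardResponse.ACCEPT:
        execute(token)
        token["executed"] = True
    elif response is GuardResponse.SKIP:
        # The node's work is bypassed; flow still continues on the outgoing arc.
        token["skipped"] = True
    return token


t1 = enter_node("Approval 1", approval_guard(required=False), lambda t: None)
t2 = enter_node("Approval 2", approval_guard(required=True), lambda t: None)
print(t1["skipped"], t1["executed"])  # True False
print(t2["skipped"], t2["executed"])  # False True
```

In both cases an arc token would still be generated on the outgoing arc, which is why Grant can remain a join node in this variant.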
For the case where the guard for Approval 2 returns Accept, the entire execution can be seen here.
Flow Control with Guards using Discard
Having seen Skip, let us examine how to use the Discard response from guards. The same basic process definition is used, only this time, the assumption is that only one of the approvals is required. The graph now looks like:
Because we are using discard, only one token will reach Grant. This is why the Grant node is no longer a join node. Execution begins as normal. We pick up execution where a node token has been generated in Approval 1.
In this case, the guard determines that Approval 1 is not required, and returns a Discard response.
The process now looks like:
The node token has been discarded, and execution has continued from the completion of Request where an arc token has been generated on the lower outgoing arc. Execution will now continue.
Approval 2 will accept its node token and will continue normally.
Remember, because Grant is no longer a join node, it will have a node token generated on it as soon as any arc token arrives.
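The effect of Discard on this process can be sketched as follows. This is a deliberately simplified model: `run_approvals` is a hypothetical helper that hard-codes the two approval branches and assumes approvals complete synchronously.

```python
from enum import Enum


class GuardResponse(Enum):
    ACCEPT = "accept"
    DISCARD = "discard"


def run_approvals(required):
    """`required` maps approval node name -> bool.

    Returns the approval nodes whose outgoing arc token reaches Grant.
    """
    reached_grant = []
    for approval in ("Approval 1", "Approval 2"):  # one arc token each from Request
        if required[approval]:
            response = GuardResponse.ACCEPT
        else:
            response = GuardResponse.DISCARD
        if response is GuardResponse.DISCARD:
            continue  # token is dropped; nothing is generated on the outgoing arc
        reached_grant.append(approval)  # arc token arrives at Grant
    return reached_grant


arrivals = run_approvals({"Approval 1": False, "Approval 2": True})
print(arrivals)  # ['Approval 2']
```

Unlike Skip, a discarded token produces no arc token downstream. In this model only one arc token ever arrives at Grant, which matches the reason given above for Grant no longer being a join node.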
The complete process can be seen here.
Flow Control with Guards using Named Arcs
This same basic process could be implemented using a guard which returns Skip along with an arc name. In this variant, a Select node has been inserted after Request. This node has no functionality; it exists only to give the guard a place to run.
Let us pick up execution after the process has started, when Select has had a node token generated on it and its guard is invoked.
The Select guard will return a Skip response which includes the arc name on which to exit. All arcs with this name will have an arc token generated on them. In this case, let us say the guard determines that Approval 2 is required. It returns Skip two. An arc token is then generated on all arcs named two (of which there is only one in this case).
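Selecting outgoing arcs by name from a guard result might look like the sketch below. `GuardResult`, `select_guard`, and `arcs_to_follow` are hypothetical names. The arc name `one` for the Approval 1 branch is also an assumption, since the text only names the arc `two`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GuardResult:
    action: str                     # "accept", "skip" or "discard"
    arc_name: Optional[str] = None  # only meaningful for "skip" in this sketch


def select_guard(needs_approval_two):
    # The Select node does no work itself; its guard only picks the exit arc.
    return GuardResult("skip", "two" if needs_approval_two else "one")


def arcs_to_follow(outgoing, result):
    """`outgoing` is a list of (arc_name, target) pairs.

    Returns the targets whose arcs receive an arc token.
    """
    if result.action != "skip":
        return []
    return [target for name, target in outgoing if name == result.arc_name]


outgoing = [("one", "Approval 1"), ("two", "Approval 2")]
targets = arcs_to_follow(outgoing, select_guard(needs_approval_two=True))
print(targets)  # ['Approval 2']
```

If several outgoing arcs shared the name `two`, each of them would appear in the result, matching the rule that all arcs with the given name receive a token.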
From here execution continues as normal. The complete process can be seen here.
Flow Control from Node Completion using Named Arcs
As mentioned previously, when a node token is completed, an arc name must be specified. Arc tokens will be generated on all outgoing arcs with that name. So the previous example could also be implemented like this:
Instead of using the guard on the Select node, the Request node will specify which arc to exit on.
If we again specify two, then an arc token will be generated on that arc.
From there, execution will continue. The complete process can be seen here.
Graph Composition and Nested Processes
Much like any software, a set of process definitions can grow larger, more complex and more intertwined as time goes on. One solution used in the broader software world is encapsulation. This involves pulling out common functionality and breaking up large pieces into smaller components. These same techniques can be used with a set of process definitions. Rather than using copy/paste, sections of process definitions that are common can be extracted. Large process definitions can be split out into smaller components. Sarasvati supports two ways of doing encapsulation, each with its own advantages and disadvantages. The first is graph composition, the second is nested processes. Both of these techniques allow complete process definitions and components that have been split out to be defined separately. The difference lies in when they are composed.
Now that we have a general idea of how graph composition and nested processes compare, let us investigate them in more detail.
Graph Composition Example One
Let's look at an example of how this works in practice. Here is a small process definition which we want to embed. This process definition will be named ext.
It only has two nodes. Notice that both nodes are join nodes, even though one node has no inputs and the other only has one. However, in the composed graph these nodes may have more inputs. Next is the process definition which will be using ext.
This process definition looks very different from previous examples. It isn't even fully connected. Some things to note:
When the graph is loaded, the composed version will look as follows:
Graph Composition Example Two
The previous example referenced only a single instance. Here is the same example using two instances of ext.
When it is loaded, the composed graph looks like:
As you can see, we now have two copies of ext embedded in the process definition. One copy will be made for each unique instance referenced. A process definition can have references to any number of different external definitions and each external process definition can be imported any number of times.
Nested Processes Example
The above example could not be implemented with nested processes because a nested process must be represented by a single node in the parent process. So, here is a similar, but simpler example using nested processes.
Nodes S and T both refer to the nested process named nested. Note that nested is almost the same as ext, except that the first node is a start node. This is because nested will be executed as a separate process. If it didn't have a start node, it would not execute. When S and T execute, each will spawn a separate process. When S is executed, it will have an incomplete node token t. As part of execution it will start a new nested process P which will have the token t as a parent. When P completes, it will check if it has a parent token, and finding that it does, will complete t. This will allow execution to continue in the original process.
Execution Environment
While executing your process definitions, it may be desirable to have some shared state or to send data between nodes via the tokens. Sarasvati supports both these things via the execution environment. Each process has an environment on which attributes/variables can be set. In addition, each token also has its own environment.
When using a memory backed engine, all environment attributes are stored in memory. However, when using a database backed engine, we may wish to persist only certain attributes. Also, storing arbitrary objects in a database is more complicated than keeping them in memory. By default, attributes are persistent; however, there is a separate set of variables which are transient.
Process Attributes
If you want state that is accessible from anywhere during process execution, then attributes can be set on the process environment. These attributes are visible to and mutable by all nodes.
Token Attributes
Each node token also has its own environment. Arc tokens do not have an environment, because they do not execute in the same way that node tokens do, and thus have no need for private state. Node tokens are initialized with the state of their parent tokens.
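The relationship between the process environment and token environments can be sketched as below. `Env`, `ProcessState`, and `NodeToken` are hypothetical names, not Sarasvati classes. Copying only the persistent attributes from parent tokens is an assumption of this sketch, since the text does not say whether transient attributes are inherited.

```python
class Env:
    """Attribute store with separate persistent and transient sets."""

    def __init__(self):
        self.persistent = {}  # would be saved by a database backed engine
        self.transient = {}   # kept in memory only


class ProcessState:
    def __init__(self):
        self.env = Env()      # shared: visible to and mutable by all nodes


class NodeToken:
    def __init__(self, process, parents=()):
        self.process = process
        self.env = Env()      # private to this token
        # Assumption: start with a copy of each parent's persistent attributes.
        for parent in parents:
            self.env.persistent.update(parent.env.persistent)


process = ProcessState()
process.env.persistent["requester"] = "alice"

first = NodeToken(process)
first.env.persistent["approved"] = True

second = NodeToken(process, parents=[first])
second.env.persistent["approved"] = False  # private copy; parent is unchanged

print(first.env.persistent["approved"], second.env.persistent["approved"])  # True False
print(second.process.env.persistent["requester"])  # alice
```

Because each token holds a copy rather than a reference, a change made downstream does not leak back to the parent token, while the process environment remains the place for state that every node should see.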