All servers use the same database and a common file share. The file share holds the following directories: logs, var/email, var/plugins, and var/repository. The database stores shared data such as configuration information and runtime data. Each server also independently maintains some configuration information, such as ports and hosts.
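As a rough illustration of the shared layout, the following Python sketch checks that the four shared directories are present under a mounted file share. The function name and the idea of passing the share's mount point as an argument are assumptions made for this example; the product itself does not necessarily provide such a check.

```python
from pathlib import Path

# Directories that the text says live on the common file share.
SHARED_DIRS = ("logs", "var/email", "var/plugins", "var/repository")


def missing_shared_dirs(share_root):
    """Return the shared directories that do not exist under share_root.

    share_root is a hypothetical mount point for the common file share;
    the real location depends on how each server was installed.
    """
    root = Path(share_root)
    return [name for name in SHARED_DIRS if not (root / name).is_dir()]
```

Running this on each server against its own mount point would show whether every server sees the same set of directories; an empty list means all four are present.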
Because the servers share the database, all servers run on the same interval.
Some configuration properties remain on the server, such as database and JMS connection information. Database configuration is handled during product installation; no additional configuration is required post-installation.
Workflows consist of activities. Activities can be run sequentially, run in parallel with one another, or some combination of the two. A typical workflow might consist of several sequential activities, such as:
For JMS-based communications, agents can be configured in several ways:
All servers constantly poll for pending workflows, so any server might initiate the workflow. The server that acquires this workflow runs the following tasks:
After it completes the work, the agent sends a response message over JMS. One of the servers writes the message to the database, and one of the servers starts the next activity. The server that starts Activity B runs the same steps as described above.
In the simple workflow that is sketched here, the activities might all be handled by the same server or different servers (or some combination). Of course, the same would be true if this workflow consisted of three parallel activities.
An application workflow is maintained as a single record in the database, and only one thread handles a given workflow at any one time.
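One common way to get this kind of "one handler at a time" behavior when several servers poll a shared database is a conditional update on the workflow record. The Python sketch below illustrates that general technique using SQLite. The table name, column names, status values, and function names are all hypothetical; the product's actual schema and claiming mechanism are not described here and may differ.

```python
import sqlite3


def create_demo_schema(conn):
    """Create a hypothetical workflow table (not the product's schema)."""
    conn.execute(
        "CREATE TABLE workflow ("
        " id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " owner TEXT)"
    )
    conn.commit()


def claim_next_workflow(conn, server_id):
    """Try to claim one pending workflow; return its id, or None."""
    row = conn.execute(
        "SELECT id FROM workflow"
        " WHERE status = 'PENDING' AND owner IS NULL"
        " ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None  # nothing is pending
    # Conditional update: it changes the row only if no other server
    # claimed the workflow between the SELECT and this statement.
    cur = conn.execute(
        "UPDATE workflow SET owner = ?, status = 'RUNNING'"
        " WHERE id = ? AND owner IS NULL",
        (server_id, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None
```

A server that loses the race sees a row count of zero and simply polls again, so each workflow record ends up with exactly one owner.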
During application processing, command failures are marked in the workflow. Error handling is the responsibility of the application author. Component rollback can be handled with rollback commands or steps. Rolling back, as used here, means reinstalling an earlier component version.
If the server crashes while an agent is running a command, the JMS mesh assigns the workflow to another server.
If an agent crashes or otherwise disappears while running a command (note that failed steps do not cause agents themselves to fail), the server assumes the command is still running; there is no automatic timeout. Normally, it is not practicable to assign a timeout interval.