Message flow transactions

A transaction describes a set of updates that are made by an application program, which must be managed together. The updates might be made to one or more systems. The updates made by the program are controlled by the environment in which the program executes, and either all are completed, or none. This property of a transaction is known as consistency: transactions might have other properties of atomicity, isolation, and durability.

IBM® Integration Bus supports transactions, and every piece of data processed by a message flow has an associated transaction. A message flow transaction is started by the integration node when input data is received at an input node in the flow; it is committed when the flow has finished with that message, or rolled back if an error occurs.

If the flow contains more than one input node, one transaction is started for each input node when it receives input data. A transaction is started for every type of input node, including user-defined input nodes.

The nodes that you include in your message flow provide specific processing of a message, according to the defined function of each node. The processing that they do includes internal work, some of which you can influence by configuring the node properties. Some nodes perform additional tasks that might affect systems that are external to the message flow or the integration node.

If an external system, such as a database, supports the concept of commit and rollback, and it can take part in an integration node transaction, you can configure the node so that the work it does is included in the flow transaction. Depending on the node, you can also specify if the work done in an external system that supports transactions is committed immediately, or when the message flow transaction completes.

Many of the resources with which your message flows can interact are controlled by resource managers that can participate in coordinated transactions; for example, databases, WebSphere® MQ messages and queues, and JMS messages. Other resource managers do not provide transactional support; for example, the HTTP protocol and file systems.

Commit or rollback

If the resource can participate in a transaction, you can configure it so that the work it does is committed or rolled back only when the message flow completes, or when the node completes. Databases and WebSphere MQ queues are examples of resources that you can use in this way. If the resource does not have transactional behavior, all the work that it does is committed immediately. For example, files and HTTP connections do not support transactions.

Updates that are made by a message flow are committed when the flow processes the input message successfully. The updates are rolled back if the following conditions are met:

A node in the flow throws an exception that is not caught by a node other than the input node (for example, the node itself, or a TryCatch node)
The Catch terminal of the input node is not connected
The Catch terminal of the input node is connected, but an unhandled exception occurs in the message flow nodes that are connected to the Catch terminal.

On distributed systems, message flow transactions are managed by the integration node, by default. These transactions are known as local transactions or locally coordinated transactions. When control returns to the input node when the flow finishes processing, the node either commits or rolls back the operations that have been taken, excluding the individual nodes that have been configured to perform their own commits and rollbacks, or that have no support for this option.

If more than one resource is accessed by the message flow, an error might occur that prevents all the resources committing all the work that has been done. The integration node raises an exception, and handles exception processing in a way that is determined by the transport that is involved. For example, messages that were read from WebSphere MQ queues are restored to those queues, and fault messages are sent to applications that submitted a message across HTTP (because HTTP has no concept of rollback). Because of these actions, the status of the resources might become inconsistent.

If it is important that your data and operations remain consistent, and that all operations are committed, or rolled back if one or more operations fail, you can coordinate the activity of the message flow.

Coordinating transactions

Coordination is provided by an external transaction manager which uses XA protocols to interact with resource managers. The transaction manager is called by the input node when the message flow has concluded (successfully or with errors). The transaction manager, rather than the input node and the integration node, interacts with the relevant resource managers to initiate the correct actions for each resource. Transactions that are controlled by a transaction manager in this way are known as globally coordinated transactions.

On distributed systems, a WebSphere MQ queue manager associated with the integration node performs the transaction manager role, which means that IBM Integration Bus requires access to WebSphere MQ when processing messages. The queue manager must be local, and configured to be the transaction manager. All other queue managers that interact with MQ nodes in that message flow are treated as locally-coordinated resources. For more information about using WebSphere MQ with IBM Integration Bus, see Enhanced flexibility in interactions with WebSphere MQ.

On z/OS®, transactions are always globally coordinated; you do not have to choose or configure for this option.

The role of the transaction manager

The result of the actions taken by message flows is the same for both local and globally coordinated transactions if the message flow is successful in all its actions. The advantage of a globally coordinated transaction is the ability to ensure that either all actions are committed, or none.

The external transaction manager, which operates a two-phase commit strategy, supports cases where one or more external resource managers are temporarily unavailable during commit processing. This potentially small window for failure might be costly for your business environment; the external transaction manager helps to eliminate the occurrence of a failure window. Therefore, the decision to include an external transaction manager, which involves a performance overhead, is an administrative decision, not one to be taken at message flow design time.

An external transaction manager does not prevent message loss; even if you use transaction coordination, you must configure and code your message flows to handle potential errors as much as you can.

To configure a message flow to be globally coordinated, you must also set up your environment so that your resource managers are defined to the supported transaction manager:

On distributed systems, transactions can be coordinated by WebSphere MQ
On z/OS, all transactions are globally coordinated by Resource Recovery Service (RRS)

This configuration might require you to change settings in the transaction manager as well as the participating resource managers.

Database access modes and locks

You must use separate ODBC connections if you want to include nodes with Automatic transaction status and nodes with Commit transaction status in the same message flow, where the nodes operate on the same external database. Set up one connection for the nodes that are not to commit until the completion of the message flow, and a second connection for the nodes that are to commit immediately.

If nodes with Commit transaction status are followed by a node with Automatic transaction status, the nodes with Commit transaction status commit independently of the flow transaction, and the nodes with Automatic transaction status commit at the end of the flow.
However, if nodes with Automatic transaction status are followed by a node with Commit transaction status, and you do not use separate ODBC connections, error message BIP4001 is issued, because otherwise the node with Commit transaction status commits the work of the Automatic nodes prematurely.

Linux platform Windows platform UNIX platform On systems other than z/OS, individual relational databases might not support this mode of operation.

If you define more than one ODBC connection to the same data source, you might get database locking problems. In particular, if a node with Automatic transaction status carries out an operation, such as an INSERT or an UPDATE, that causes a database object (such as a table) to be locked, and a subsequent node tries to access that database object by using a different ODBC connection, an infinite lock (deadlock) occurs.

The second node waits for the lock acquired by the first to be released, but the first node does commit its operations and release its lock until the message flow completes. However, the flow cannot complete because the second node is waiting for the database lock held by the first node to be released.

Such a situation cannot be detected by a DBMS automatic deadlock-avoidance routine because the two operations are interfering with each other indirectly by using the integration node.

You can use either of two options to avoid this type of locking problem:

Design your message flow so that uncommitted (automatic) operations do not lock database objects that subsequent operations access across a different ODBC connection.
Configure the lock timeout parameter of your database so that an attempt to acquire a lock fails after a specified length of time. If a database operation fails because of a lock timeout, an exception is thrown that the integration node handles in the typical way.

For information concerning which database objects are locked by particular operations, and how to configure the lock timeout parameter of your database, consult your database product documentation.