CICS Transaction Server for z/OS V3.1

An edition of CICS Transaction Server

Technical detail

Autonomic computing offers improved service, increased operator productivity, and simplified growth.

The primary objective of CICS® is to support the growth and functional requirements of transaction processing applications. Because many of these applications are fundamental to the operation of today's businesses, CICS availability is, in turn, fundamental when considering autonomic goals.

Your operators issue commands and receive messages to communicate with and control operating systems, subsystems, and the network. Each of these systems, subsystems, and networks has its own command language syntax and message formats. Your operators have to keep in mind many different commands and messages, and sometimes they have to enter a complex series of commands and replies to messages to resolve a problem. The more complex the sequence of commands and replies, the greater the possibility of making a mistake.

If your operators have to recover an online application system in a production environment, they often have to make an immediate decision on a complex problem, which again increases the possibility of making a mistake. Delays in the resolution of problems cause a reduction in the quality of service that your end users experience. Autonomic computing can help you by speeding the detection of the problem and reducing the complexity of the decisions your operators must make.

As your system grows, you typically have to add more resources to it (for example, new files, new transactions, new programs, or new CICS regions). The amount of monitoring your operators have to carry out grows in proportion. They still have to respond promptly to critical messages displayed on the consoles, but they typically see more and more messages. Parallel processing and an increasing number of CICS regions spread over several MVS images further increase the complexity of the operators' job, making autonomic computing more desirable.

Providing a service based on CICS is important, but installations are increasingly finding that their users expect service of high quality. Users are asking for commitment to reaching certain targets of quality in areas such as availability and response time. Many installations already have formal commitments to reaching defined service level objectives. Those that do not have such agreements are implementing them now.

Satisfying service level objectives requires a consistent and reliable CICS service. As a business becomes more dependent on CICS, service interruption problems become more serious and costly. Many data processing organizations already require operations 24 hours a day, seven days a week, making even planned interruptions difficult and unplanned outages disastrous. Automation of operations helps to achieve service level objectives by detecting problems early and taking actions to resolve problems wherever possible.

To automate CICS operations in your environment you first have to think about the level of automation you want. You have to decide which events in your CICS environment can be handled automatically, whether you want those events to be handled automatically, and whether you want the automation procedures to handle the event completely or take some actions and leave your operator to complete the task of handling the event.
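The three decisions above can be written down as a simple policy table. The following is a minimal sketch in Python, not part of any CICS or IBM automation product: the event names, the `Handling` levels, and the `handling_for` helper are all hypothetical and serve only to illustrate the choice between full automation, partial automation, and operator handling.

```python
from enum import Enum


class Handling(Enum):
    """Hypothetical automation levels for a detected event."""
    MANUAL = "operator handles the event"
    ASSISTED = "automation takes some actions, operator completes the task"
    FULL = "automation handles the event completely"


# Hypothetical policy table: the event names and the levels assigned to them
# are illustrative assumptions, not recommendations.
AUTOMATION_POLICY = {
    "region_failure": Handling.FULL,
    "short_on_storage": Handling.ASSISTED,
}


def handling_for(event: str) -> Handling:
    """Return the chosen handling level; events not listed stay manual."""
    return AUTOMATION_POLICY.get(event, Handling.MANUAL)
```

Defaulting unlisted events to `MANUAL` reflects the idea that an event is automated only after you have explicitly decided that it should be.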

Detecting an event is the first step in an automation process. An event could be, for example, a short-on-storage condition, a failing task, or a missed service level objective. These events differ in severity because they affect the end user to different degrees. We group CICS event detection and automation possibilities into three areas and base the comparison of the abilities of the different IBM automation products on those three areas.

The areas are:

Detecting the problem is one part of the autonomic process. Once you have detected a problem, you can take an action to correct it. Your action can be either reactive or preventive; in other words, you can either deal with a problem after it happens (reaction) or try to prevent the problem from happening (prevention).

Simple examples of reaction to a detected event are automatically restarting a failed CICS region or automatically canceling a CICS region that has been short on storage for an extended time. Typically, reaction involves an automation action at a point in time when most users of the CICS region are already suffering.

Prevention involves an action at a very early stage, to prevent a problem from developing. For example, prevention could mean that you automatically start another CICS application-owning region (AOR) in a CICSPlex if the agreed service level for response time is being met, but only just. To take preventive actions you need a tool that enables you to detect events at an early point in time. CICSPlex System Manager is such a tool.
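The difference between reaction and prevention can be illustrated as a threshold check on response time. The sketch below is in Python and makes several assumptions: the `choose_action` function, the `Action` names, and the 10% warning margin are hypothetical, and the code does not use any CICSPlex System Manager interface. It only shows that prevention triggers while the service level is still being met, whereas reaction triggers after it has been missed.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    PREVENT = "start another AOR before the target is missed"
    REACT = "recover after the target has been missed"


def choose_action(response_time_s: float, target_s: float,
                  margin: float = 0.1) -> Action:
    """Classify a measured response time against a service level target.

    margin is a hypothetical warning band: with the default of 0.1, any
    response time within the last 10% below the target counts as
    "met, but only just" and calls for a preventive action.
    """
    if target_s <= 0 or not 0 <= margin < 1:
        raise ValueError("target_s must be positive and margin in [0, 1)")
    if response_time_s > target_s:
        return Action.REACT        # users are already affected
    if response_time_s >= target_s * (1 - margin):
        return Action.PREVENT      # target met, but only just
    return Action.NONE
```

For example, with a target of 1.0 second, a measured response time of 0.95 seconds falls in the warning band and yields `Action.PREVENT`, while 1.2 seconds yields `Action.REACT`.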