Recommended maintenance and configuration to enhance automatic recovery of a highly available WebSphere Process Server environment
Abstract

This document provides a configuration and maintenance recommendation to enhance automatic recovery of a highly available WebSphere Process Server environment, following unplanned failures of hardware or software components.

Content

A highly available WebSphere Process Server environment includes a number of software and hardware components, such as machines, networks, databases and application servers. The system provides resilience to, and automatic recovery from, many forms of failure which can affect any one of these components. This document describes recommended maintenance and configuration that further enhances the system's ability to automatically recover from failures.
Configuration prerequisites:
- This document assumes that one of the following deployment environment patterns has been used to configure the WebSphere Process Server environment for high availability:
  - Remote messaging pattern, often referred to as "silver topology"
  - Remote messaging and remote support pattern, often referred to as "gold topology"
- This document also assumes that you have set up your WebSphere Process Server environment as detailed in the developerWorks article Building clustered topologies in WebSphere Process Server V6.1. Additional information is available in the Planning your deployment environment topic of the WebSphere Process Server information center.
- If the topology has been scaled to include additional application and messaging clusters, this document assumes that this configuration has been performed as described in the following developerWorks article:
  Configuring efficient messaging in multicluster WebSphere Process Server cells
- This document assumes that the TCP keepalive settings of the server hosting your database have been configured to allow the database to detect the failure of an application server or network, in a time period appropriate for the high availability requirements of the environment.
Messaging engines will not be able to restart after a failure until the database has detected the failure and cleaned up the table locks previously established by the running messaging engines.
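As a rough illustration of why these settings matter, the sketch below estimates how long an idle connection to a failed peer can remain open before keepalive declares it dead. It assumes the common model in which the first probe is sent after an idle period and the connection is dropped after a fixed number of unanswered probes. The function name and the tuned values are hypothetical, and parameter names and units differ between operating systems.

```python
def keepalive_detection_seconds(idle_seconds, probe_interval_seconds, probe_count):
    """Approximate time before keepalive drops an idle connection to a dead peer."""
    return idle_seconds + probe_interval_seconds * probe_count


# Linux defaults: tcp_keepalive_time=7200, tcp_keepalive_intvl=75, tcp_keepalive_probes=9
linux_default = keepalive_detection_seconds(7200, 75, 9)

# Hypothetical tuned values, chosen only for illustration
tuned = keepalive_detection_seconds(300, 30, 5)

print("default: %d s, tuned: %d s" % (linux_default, tuned))
```

With the Linux defaults the estimate is 7875 seconds, more than two hours, during which the database could continue to hold the table locks of the failed messaging engine. The hypothetical tuned values reduce the estimate to 450 seconds.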
Refer to the documentation for the operating system hosting your database for details on configuring TCP keepalive.

Maintenance recommendation for WebSphere Process Server V6.0.2:
WebSphere Process Server fix pack 6.0.2.5
This includes WebSphere Application Server fix pack 6.0.2.31 (includes APAR PK71156)
Install the interim fixes below in this order:
1. Interim fix for APAR PK65439
2. Interim fix for APAR PK72567
3. Interim fix for APAR PK75571

Maintenance recommendation for WebSphere Process Server V6.1.0:
WebSphere Process Server fix pack 6.1.0.3
This includes WebSphere Application Server fix pack 6.1.0.21
Install the interim fixes in this order:
1. Interim fix for APAR PK72567
2. Interim fix for APAR PK75571

Maintenance recommendation for WebSphere Process Server V6.1.2:
WebSphere Process Server fix pack 6.1.2.2
This includes WebSphere Application Server fix pack 6.1.0.19
Install the interim fixes below in this order:
1. Interim fix for APAR PK74220
2. Interim fix for APAR PK72567
3. Interim fix for APAR PK75571

Maintenance recommendation for WebSphere Process Server V6.2.0:
WebSphere Process Server refresh pack 6.2.0.0
This includes WebSphere Application Server fix pack 6.1.0.21
Install the interim fix for APAR PK75571

Introduction to a new property: sib.msgstore.jdbcFailoverOnDBConnectionLoss

Application consistency in the face of a fault can be enhanced by enabling a function introduced in APAR PK72567. The applications affected are those with SCA asynchronous interactions and those with long-running processes. The fault that requires extra protection is the loss of the message store database. When the loss is detected, the function stops all message flows. The messaging engine is then recovered by stopping and starting the application server. You enable this function by setting a new custom property.
The new messaging engine custom property sib.msgstore.jdbcFailoverOnDBConnectionLoss was introduced in APAR PK72567 and must be set in the WebSphere Application Server administrative console to take advantage of this APAR. If you do not add this custom property and set its value to true, the new behavior is not enabled.

Follow these steps:
- Navigate to the server's messaging engine: Service Integration -> Buses -> <Bus name> -> Messaging Engines -> <Messaging engine name> -> Custom Properties
- Click New
- Copy and paste the following string into the Name field:
sib.msgstore.jdbcFailoverOnDBConnectionLoss
- Set the Value field to
true
- Click OK
- Save the configuration
- You will need to restart your servers for the property to be picked up.
You need to add this property for each messaging engine in each messaging cluster, in each bus.
This includes messaging engines in the SCA.SYSTEM, SCA.APPLICATION, BPC and CommonEventInfrastructure buses.
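This document describes the administrative console procedure. In a cell with many messaging engines, the same change might be scripted, and the following is an untested sketch of that idea rather than a supported procedure. It takes the wsadmin AdminConfig object as a parameter so that the logic can be exercised outside wsadmin. It assumes that each messaging engine is a SIBMessagingEngine configuration object whose custom properties are child Property objects held in an attribute named properties; the function names are hypothetical. Verify these assumptions against your release before using anything like this.

```python
PROPERTY_NAME = "sib.msgstore.jdbcFailoverOnDBConnectionLoss"


def property_names(admin_config, engine_id):
    """Return the names of the custom properties already set on one messaging engine."""
    # Assumption: showAttribute returns a bracketed, space-separated list of config IDs.
    raw = admin_config.showAttribute(engine_id, "properties")
    return [admin_config.showAttribute(prop_id, "name") for prop_id in raw[1:-1].split()]


def enable_failover_property(admin_config, name=PROPERTY_NAME, value="true"):
    """Add the property to every messaging engine that lacks it; return the IDs changed."""
    changed = []
    for engine_id in admin_config.list("SIBMessagingEngine").splitlines():
        if not engine_id:
            continue
        if name in property_names(admin_config, engine_id):
            continue  # already present; an existing value is not modified
        # Assumption: a Property child can be created directly under the engine.
        admin_config.create("Property", engine_id, [["name", name], ["value", value]])
        changed.append(engine_id)
    if changed:
        admin_config.save()
    return changed
```

Inside a wsadmin Jython session this would be called as enable_failover_property(AdminConfig). The sketch only adds a missing property and does not correct an existing property whose value is not true. As with the console procedure, the servers still need to be restarted for the property to be picked up.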
Setting this property to true ensures that predictable recovery behavior occurs as a result of planned or unplanned restarts of the messaging database, which is used to assure the delivery of messages that are fundamental to the operation of WebSphere Process Server.
With the property set to true, application servers hosting messaging engines will be terminated after connectivity to the messaging database is lost. This stops all messaging activity in the environment until connectivity to the database is restored.
After connectivity to the database is restored, it is important to ensure that each messaging engine has been restarted and remains highly available.
The overall status of each messaging engine can be viewed by navigating to the following panel in the WebSphere Application Server administrative console: Service integration -> Buses -> <Bus Name> -> Messaging engines. This panel lists the messaging engines in that bus and the status of each.
You should also navigate to the following panel in the WebSphere Application Server administrative console for each messaging engine cluster in each bus: Application Servers -> Core groups -> Core group settings -> DefaultCoreGroup -> Runtime -> Show groups -> "IBM_hc=CLUSTERNAME,WSAF_SIB_BUS=BUSNAME,WSAF_SIB_MESSAGING_ENGINE=CLUSTERNAME.000-BUSNAME,type=WSAF_SIB"
- DefaultCoreGroup is the name of the core group your messaging engine cluster servers are a member of
- CLUSTERNAME is the name of a messaging engine cluster
- BUSNAME is the name of the bus (BPC, SCA.SYSTEM, SCA.APPLICATION, CommonEventInfrastructure_bus)
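Because this group name must be entered once for each messaging engine cluster in each bus, it can help to generate the strings rather than type them. The sketch below builds the name from the placeholders described above. It assumes that the WSAF_SIB_BUS field carries the bus name; the function names, the index parameter, and the sample cluster and bus names are hypothetical.

```python
def messaging_engine_name(cluster_name, bus_name, index=0):
    """Build a messaging engine name of the form CLUSTERNAME.000-BUSNAME."""
    return "%s.%03d-%s" % (cluster_name, index, bus_name)


def ha_group_name(cluster_name, bus_name, index=0):
    """Build the high availability group name to enter in the Show groups panel."""
    engine = messaging_engine_name(cluster_name, bus_name, index)
    return "IBM_hc=%s,WSAF_SIB_BUS=%s,WSAF_SIB_MESSAGING_ENGINE=%s,type=WSAF_SIB" % (
        cluster_name,
        bus_name,
        engine,
    )


# Hypothetical cluster and bus names, for illustration only
for bus in ["SCA.SYSTEM.myCell.Bus", "SCA.APPLICATION.myCell.Bus"]:
    print(ha_group_name("MECluster", bus))
```

Substitute the cluster and bus names from your own cell before pasting the result into the console.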
The Show groups panel shows the high availability state associated with each running server in the messaging engine cluster. If any server shows 'Disabled' state (indicated by a red square), the high availability of your environment is compromised because the messaging engine will not start on that server. If all servers show 'Disabled' state, your messaging engine will not start until at least one server is enabled.
You can clear the 'Disabled' state by selecting the application server which is disabled and clicking 'Enable'. 'Disabled' state is automatically cleared when a server is restarted.
Checking that your messaging engines have restarted, and that all available servers are enabled as failover locations for the messaging engine, is particularly important if your database was unavailable for a period of 15 minutes or longer.
Product categories: Software, Business Integration and Optimization, Dynamic Business Process Management, WebSphere Process Server, General
Operating system(s): AIX, HP-UX, Linux, Solaris, Windows, i5/OS
Software version: 6.0.2.5, 6.1.0.3, 6.1.2.2, 6.2
Reference #: 1326291
IBM Group: Software Group
Modified date: 2009-02-24