Applying LAIF 7, 8, 9, or 10 on top of WebSphere InterChange Server (WICS) 220.127.116.11 causes the recovery mechanism in WICS to fail.
LAIF 7 for 18.104.22.168 added an enhancement to the recovery API. That enhancement introduced a defect in the recovery mechanism itself.
When WICS restarts after a crash or an immediate shutdown, the controllers go into recovery but process neither new messages nor the messages that were in progress before the crash. The WICS startup log also shows an exception such as the following:
[Time: 2010/09/07 15:28:18.764] [System: Server] [Thread: Thread-14 (#2033372332)] [Mesg: _Recovery failed. Reason: java.lang.NullPointerException
In some cases the exception is not reported, and only an error message is logged:
[Time: 2010/09/07 15:28:18.764] [System: Server] [Thread: Thread-14 (#2033372332)] [Type: Error] [MsgID: 191] [Mesg: Recovery failed. Reason .]
The controller remains hung in recovery, and the server rejects any attempt to start or stop the controllers. Instead, it logs the following message:
[Time: 2010/09/07 15:31:57.358] [System: Server] [Thread: WT=1 (#2130054316)] [Type: Error] [MsgID: 14316] [Mesg: Failed to handle deactivate operation because the controller is performing recovery work.]
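To check an existing log for these symptoms, the messages quoted above can be searched for by their fixed text. The following is a minimal sketch, not an IBM-supplied tool: the function name `scan_for_symptoms` and the symptom labels are hypothetical, and the marker strings are taken only from the sample messages in this document, so other variants of the messages may not match.

```python
# Marker substrings copied from the sample log messages in this document.
# Assumption: other WICS builds may word these messages differently.
SYMPTOMS = {
    "recovery_failed": "Recovery failed. Reason",
    "controller_stuck": "controller is performing recovery work",
}


def scan_for_symptoms(log_lines):
    """Return a dict mapping each symptom label to the log lines that contain it."""
    found = {name: [] for name in SYMPTOMS}
    for line in log_lines:
        for name, marker in SYMPTOMS.items():
            if marker in line:
                found[name].append(line.rstrip("\n"))
    return found
```

Passing the lines of a startup log (for example, from `open(path).readlines()`) returns the matching lines grouped by symptom. An empty list for both labels means none of the quoted messages were found.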
If you are using LAIF 7, 8, 9, or 10 on WICS 22.214.171.124, back out the LAIF and apply LAIF 11.
Note: The installed LAIF level is reported in the WICS startup logs, in a message such as the following:
[Time: 2010/09/09 15:26:29.405] [System: Server] [Thread: main (#2018909695)] [Mesg: LAIF 10]
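The LAIF level can also be extracted from the startup log programmatically. This is a small sketch that assumes the message always has the form `[Mesg: LAIF <number>]`, as in the sample line above. The names `find_laif_level` and `is_affected` are hypothetical helpers, not part of WICS.

```python
import re

# Pattern based on the sample startup message "[Mesg: LAIF 10]".
# Assumption: the LAIF level is always logged in exactly this form.
LAIF_PATTERN = re.compile(r"\[Mesg: LAIF (\d+)\]")

# LAIF levels this document identifies as affected.
AFFECTED_LAIFS = {7, 8, 9, 10}


def find_laif_level(log_lines):
    """Return the most recently logged LAIF level as an int, or None if absent."""
    level = None
    for line in log_lines:
        match = LAIF_PATTERN.search(line)
        if match:
            level = int(match.group(1))
    return level


def is_affected(level):
    """Return True if the given LAIF level is one of the affected levels."""
    return level in AFFECTED_LAIFS
```

Because a log file can span several restarts, the sketch keeps the last level it sees rather than the first.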
Please contact IBM Support to obtain LAIF 11.
Steps to back out the affected LAIF:
1) Shut down WICS.
2) Back up the jars that were installed as part of the affected LAIF.
3) Replace them with the jars provided in LAIF 11.
LAIF 11 includes all APARs up to and including LAIF 10, plus the additional APAR JR37740, which fixes the problem with the recovery component.
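The backup in step 2 can be scripted so that the original jars are preserved before they are replaced. The sketch below is illustrative only: the function `back_up_jars`, the directory arguments, and the jar names passed to it are all hypothetical, because this document does not list which jars a given LAIF installs or where they are located. Run it only after WICS has been shut down.

```python
import shutil
from pathlib import Path


def back_up_jars(lib_dir, backup_dir, jar_names):
    """Copy the named jars from lib_dir into backup_dir.

    Returns the names that were actually copied. Jars that do not exist
    in lib_dir are skipped. The originals are left in place.
    """
    lib_dir = Path(lib_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in jar_names:
        source = lib_dir / name
        if source.is_file():
            # copy2 also preserves file timestamps where the platform allows.
            shutil.copy2(source, backup_dir / name)
            copied.append(name)
    return copied
```

Comparing the returned list with the list of jars shipped in the LAIF shows whether any expected jar was missing before the replacement in step 3.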