IBM Support

Server hang with Gresham EDT and STK devices connected via Emulex HBA

Technote (troubleshooting)


Problem (Abstract)

The Tivoli Storage Manager device driver may hang due to a deadlock with the Emulex HBA driver. This causes the Tivoli Storage Manager server to hang, and the system must be rebooted to clear the hang condition.

Cause

A newer level of the Emulex HBA driver is needed to avoid the server hang on Solaris.

Resolving the problem

The Tivoli Storage Manager server has been reported to hang on Sun Solaris with Gresham EDT and STK devices connected via an Emulex HBA. Emulex HBA driver levels 6.02c and 6.02e have a timing problem with the Tivoli Storage Manager tape driver under heavy workloads with high-frequency I/O operations.

In this case, the HBA driver releases the mutex before resetting the condition variables, and this can happen at the same moment that the Tivoli Storage Manager tape driver is awakened. When it does, the Tivoli Storage Manager tape driver acquires the mutex but cannot set the condition variables for its own conditions, because they are still locked by the previous thread (the Emulex HBA driver thread). The result is a deadlock, which causes the Tivoli Storage Manager tape driver to hang the server. The kill command cannot terminate the Tivoli Storage Manager server thread; only rebooting the system resolves the problem.
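The ordering problem described above can be illustrated with a small user-space analogy. The following is only a sketch in Python, not the Emulex or Tivoli Storage Manager driver code. The class and function names (CompletionSignal, run_demo) are hypothetical. It shows the safe ordering, where the shared state is updated and the waiter is notified while the lock is still held.

```python
import threading


class CompletionSignal:
    """Toy model of a completion flag guarded by a mutex and a condition variable."""

    def __init__(self):
        self._cond = threading.Condition()  # owns its own lock (the "mutex")
        self._done = False                  # predicate the waiter checks

    def complete(self):
        # Update the predicate and notify while still holding the lock,
        # so a waiter never wakes up to a half-updated state.
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def wait(self, timeout=None):
        # wait_for re-checks the predicate after every wakeup and
        # returns its final value (False if the timeout expires).
        with self._cond:
            return self._cond.wait_for(lambda: self._done, timeout=timeout)


def run_demo():
    """Start one waiter thread, signal completion, and collect the waiter's result."""
    sig = CompletionSignal()
    results = []
    waiter = threading.Thread(target=lambda: results.append(sig.wait(timeout=5.0)))
    waiter.start()
    sig.complete()
    waiter.join(timeout=5.0)
    return results
```

Because the predicate is changed only under the lock, the waiter sees a consistent state whether it starts waiting before or after complete() runs.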

One recommendation is to upgrade the Emulex driver and firmware to the following levels:
HBA driver: 6.02f
HBA firmware: 3.9283
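As a rough aid, the levels named in this technote can be captured in a small lookup. This is a sketch only: the function name classify_driver_level is hypothetical, the driver level string is assumed to have been obtained by other means, and any level not named in this technote is reported as "unknown" rather than guessed at.

```python
# Levels taken from this technote; nothing is assumed about other levels.
KNOWN_AFFECTED_LEVELS = {"6.02c", "6.02e"}
RECOMMENDED_DRIVER_LEVEL = "6.02f"
RECOMMENDED_FIRMWARE_LEVEL = "3.9283"


def classify_driver_level(level):
    """Return 'affected', 'recommended', or 'unknown' for an Emulex HBA driver level string."""
    normalized = level.strip().lower()
    if normalized in KNOWN_AFFECTED_LEVELS:
        return "affected"
    if normalized == RECOMMENDED_DRIVER_LEVEL:
        return "recommended"
    return "unknown"
```

For example, classify_driver_level("6.02e") returns "affected", which indicates that the upgrade above applies.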

Emulex support indicates that the v6.xx driver was rewritten and maintains a SCSI command buffer. In the v6.xx drivers, once the upper layer starts a SCSI command, the driver will allocate a buffer. When the command completes, the upper layer needs to notify the device driver to free the buffer.
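The buffer lifecycle that Emulex support describes can be modeled with a toy example. This is an illustration only and does not reflect the actual driver interfaces. The names CommandBufferPool, start_command, and complete_command are hypothetical.

```python
class CommandBufferPool:
    """Toy model: one buffer is allocated per started command and freed on completion."""

    def __init__(self):
        self._next_id = 0
        self._in_flight = {}  # command id -> buffer

    def start_command(self, cdb):
        # The upper layer starts a command; the driver allocates a buffer for it.
        cmd_id = self._next_id
        self._next_id += 1
        self._in_flight[cmd_id] = bytearray(cdb)
        return cmd_id

    def complete_command(self, cmd_id):
        # The upper layer reports completion; the driver frees the buffer.
        # Raises KeyError if the id is unknown or was already freed.
        del self._in_flight[cmd_id]

    def outstanding(self):
        # Number of buffers still allocated.
        return len(self._in_flight)
```

In this model, a command that is never reported as complete leaves its buffer allocated, which is why the completion notification matters.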

These new levels of HBA driver and firmware are supported by the Tivoli Storage Manager server.

Product Alias/Synonym

ITSM TSM

Document information

More support for: Tivoli Storage Manager
Server

Software version: All Supported Versions

Operating system(s): Solaris

Reference #: 1217457

Modified date: 23 September 2010