APAR status
INTRAN
Error description
The best way to get documentation is to use parmlib member IEADMCxx, where "xx" is the suffix you specify on the PARMLIB= operand of the DUMP command. The DUMP parmlib member avoids the use of SLIP IF (PER), which is limited to only one PER trap per system.
**************************************************************
The purpose of this APAR is to document the known DB2 RA10 RB10 RC10 5740XYR00 HANG-WAIT-SUSPEND problems. For HANG or WAIT problems in the DB2 DISTRIBUTED asid, also see II08215.

Note: if your problem is the failure of DB2 to start up, remember that DB2 does NOT function without IRLM. Verify that your IRLM can start up without DB2. If IRLM is in an indeterminate state in which it cannot IDENTIFY the DB2, then DB2 cannot fully start up. Check your SYSLOG for errors related to IRLM, or look for DXRxxxx type messages.

In addition to a current fix-list, a process has been provided that indicates what to do when DB2 or an ALLIED asid is hung (or looping). Please follow this process and have the listed documentation available for DB2 SUPPORT analysis.
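The SYSLOG check for IRLM messages can be done by eye, but a small filter may help on a large log. The following is a minimal Python sketch, not part of the APAR process. It assumes the SYSLOG has been saved as plain text lines and that IRLM message identifiers have the form DXR, three digits, and one type letter. The function name find_irlm_messages and the sample lines are hypothetical.

```python
import re
from typing import Iterable, List

# Assumption: IRLM message IDs look like "DXR" + 3 digits + 1 letter.
DXR_MESSAGE = re.compile(r"\bDXR\d{3}[A-Z]\b")


def find_irlm_messages(lines: Iterable[str]) -> List[str]:
    """Return the lines that contain a DXRnnnX style message ID."""
    return [line.rstrip("\n") for line in lines if DXR_MESSAGE.search(line)]


if __name__ == "__main__":
    # Invented sample lines, for illustration only (not real SYSLOG output).
    sample = [
        "10:15:01 STC01234 DXR000I SAMPLE TEXT ONLY",
        "10:15:02 STC01234 DSN0000I SAMPLE TEXT ONLY",
    ]
    for hit in find_irlm_messages(sample):
        print(hit)
```

The filter only narrows the log down to candidate lines; the message explanations still have to be looked up in the IRLM messages documentation.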
Local fix
A new keyword 'service' has been added to the display thread command to assist in the diagnosis of DB2 thread hangs. The command is issued as follows:
    -dis thd(*) service(wait)
This will display all threads that have been suspended for 2 times the IRLM timeout limit or a minimum of 60 seconds. If the thread is suspended due to IRLM resource contention or DB2 latch contention, additional information will be displayed to assist in identifying the problem.

WHAT TO DO IF DB2, OR DB2 ALLIED ASID IS HUNG
REFERENCE SECTION: TYPE-of-failure keywords in THE DB2 DIAGNOSIS REFERENCE GUIDE
________________________________________________________
ALL OF THE BELOW DOCUMENTATION WILL BE NEEDED BY DB2 LEVEL2
----------------------------------------------------------------
1) Displays:
     -DISPLAY THREAD(*) DETAIL
     D A,ALL   (or D A,ssnm*  or  D A,IRL*)
     D GRS,CONTENTION
     D OPDATA
   (If possible, also execute the following 2 DISPLAY cmds)
     -DISPLAY DATABASE(*) USE/LOCKS LIMIT(*)
     -DISPLAY UTILITY (*)
   *NOTE* Be sure to keep the MVS SYSLOG to enable reference to the DISPLAY command's DSNV404I response output.
   *NOTE* A thread STATUS of PT* means that the thread is in DB2 and using Query CP Parallelism. (DEGREE ANY)
----------------------------------------------------------------
2) Obtain MVS console dumps using the MVS DUMP command:
   Always include the DB2 subsystem name in your DUMP cmd title.
     DUMP COMM=(DB2P thread 505 hung)
   (a) If you suspect that your WAIT or HANG may be due to a LOOP in DB2 on an ALLIED asid, make sure that the MVS INTERNAL SYSTEM TRACE is set ON, and that this trace is set to a good working DIAGNOSTIC value.
       Enter MVS commands:
         TRACE ST,999K,BR=OFF
         TRACE MT,264K
       NOTE: If the system is on z/OS 1.10 or higher, TRACE ST could be nnnM or nG.
       Note: When dealing with problems of this nature, this internal system TRACE may be the most important diagnostic item captured in the dump. Unless altered, TRACE default is 64K.
   (b) IF AN ALLIED ADDRESS SPACE IS HUNG:
       Determine if the relevant ALLIED asid is getting CPU cycles or if the asid is swapped out.
       - IF THIS ASID IS SWAPPED OUT -
         Take a CONSOLE dump of the MVS *MASTER* (asid 0001). The order is IMPORTANT. The MVS MASTER must be dumped first to ensure good dump data. After asid(1) is dumped, then dump the ALLIED ASID, IRLM, ssnmMSTR, ssnmDBM1 from all members. Try to capture all in one.
         Use a joblist with wildcards in the DUMP cmd to get all members:
           JOBNAME=(*MASTER*,BADjob,XCFAS,ssnmIRLM,ssnmMSTR,ssnmDBM1),
         where ssnm is the subsystem name of the DB2 members.
         Use the REMOTE keyword to get dumps of all Datasharing members:
           REMOTE=(JOBLIST,SDATA,DSPNAME),DSPNAME=('ssnmIRLM'.*,'XCFAS'.*)
       - IF THIS ASID IS NOT SWAPPED OUT -
         The MVS MASTER (asid 0001) is not required; dump the ALLIED, ssnmMSTR, ssnmDBM1, XCFAS, and IRLM asids of all members.
       - If the D GRS,C command shows that an address space is contending for a resource with DB2, dump that address space first.
   (c) IF DB2 IS HUNG:
       Take a CONSOLE dump of ssnmMSTR, ssnmDBM1, IRLM, & XCFAS.
         SDATA=(RGN,CSA,SQA,LPA,LSQA,SWA,PSA,ALLNUC,XESDATA,TRT,GRSQ,SUM)
       Use the MVS command D D,OPTIONS to check SDUMP defaults.
       (If using or considering SLIP, read II10850 and/or PN80921.)

   To facilitate the accurate and timely diagnosis of a reported problem, it is imperative that the user produce COMPLETE dumps of the associated malady. PARTIAL dumps waste valuable time and are usually inadequate for full problem diagnosis.
   Always dump the DB2MSTR, DB2DBM1, IRLM and other DB2 asids (DIST and SPAS might be needed).
   Check for the MSGIEA911E message after the DUMP command is issued. The dump may take a minute or so to complete. When finished, MVS will issue the IEA911E message noting the condition of the dump. The condition will be either COMPLETE or PARTIAL. The message can be MSGIEA611I if the dump had been allocated through DYNALLOC.
   Another message to be aware of is MSGIEA043I MAXSPACE REACHED.
   This indicates a PARTIAL dump. At minimum, set the dump options on the system where DB2 runs to a reasonable level. In MVS Commands, see these commands:
     DISPLAY  : D D,OPTIONS
     CHNGDUMP : CD SET,SDUMP,TYPE=XMEME,MAXSPACE=8000M
   MAXSPACE=8000M is the minimum for DB2 z/OS V8 & V9. V10 and above would require 16000M+. MAXSPACE should be set when the system permits. There is still a chance of getting a partial dump, depending on the customer configuration.

   ||NOTE|| Should you see a partial dump with 16G of storage in V10, contact IBM DB2 SYSTEMS support.

   Please be sure to have z/OS APARs OA40015, OA39596/OA41315, OA40856, OA41994 to avoid the partial dumps in V10 systems.

   Note: See II06471: DUMPSRV uses AUX storage for dumping; you may need to add an extra PAGE dataset when dumping DBM1.
   Note: Allocate a hi-capacity device like a 3390 mod9 for dumps. ** use ACS routines for size **
   Note: With DFSMS120 and Dynamic Dump Allocation, multi-volume EFDS format datasets can be created for your SVCDUMP.
----------------------------------------------------------------
3) RECYCLE DB2:
   If the CANCEL of a hung thread is not successful, or if DB2 is hung, execute the following commands in the noted order until one of the commands accommodates your need (ssnm is the DB2 subsystem name):
   A. -STOP DB2 MODE(QUIESCE)
   B. -STOP DB2 MODE(FORCE)
   C. If ssnmDIST is running, issue the MVS command:
        CANCEL ssnmDIST,A=xx
      or modify IRLMPROC with abend using the command:
        F IRLMPROC,ABEND,NODUMP
   D. CANCEL ssnmDBM1,A=yy   (issue 2 consecutive CANCEL commands)
      If CANCEL ssnmDBM1 does not work, then:
   E. CANCEL ssnmMSTR,A=zz
      If CANCEL ssnmMSTR does not work, then:
      There is always a FORCE ssnmMSTR,ARM to use as noted earlier, but we recommend avoiding its use. IRLM can remain in an indefinite state and you may not be able to restart DB2 before an IPL is done.
      Use the MVS command FORCE jobname as a LAST resort.
      This FORCE command may need to be issued several times before the wanted job finally terminates with MSGIEF404I. OEM products like RESOLVE and KILL can be used in lieu of this MVS FORCE command.
   Use MVS display commands (D A,ALL) to verify that the DB2/IRLM STC job and ASIDs are no longer active to MVS.
----------------------------------------------------------------
4) Use the MVS command SETDMN to verify DOMAIN parameters. If MAX and MIN are set TOO low, then the DB2 subsystem will not stop and start cleanly, i.e.:
     SETDMN MAX=200,MIN=255
----------------------------------------------------------------
5) OBTAIN SYS1.LOGREC:
   Use the IFCEREP1 service aid to obtain DETAILed software event records for at least 1 hour prior to the error of note. You may find it beneficial to first run a HISTORY report.
   Note: If DB2 is FORCEd down, or abends in some way, expect to see MVS CROSS-Memory errors like S0D5 S0D6 S0D7 and S058 S0E0. There may also be several TASK term S13E errors logged in LOGREC. Do NOT interpret any of these secondary recoveries as the source of your DB2 subsystem outage concern. DB2 generated SOFT CANCEL entries like rc00E50013 may also be issued.
----------------------------------------------------------------
6) If it appears that IRLM is hung, DB2 will most likely be hung along with associated DB2 jobs (threads). It may be necessary to obtain IRLM doc to diagnose the hang. This is especially critical in a DataSharing environment.
   Get a console dump of the DB2MSTR, DB2DBM1 and IRLM ASIDs, or see SLIP info II10850.
   Run with IRLM component traces active.
--------------------------------------------------------------
7) Check the PSP upgrades for HIPER fixes associated with your DB2 release (UPGRADEs: DB2810 DB2710 DB2610).
   To prevent unexpected DB2 outages caused by a WAIT / HANG / LOOP / SUSPEND, the DB2 SUPPORT TEAM highly recommends that the following APARs be applied to your ESA or OS390 SYSTEM:
---------- Maintenance/Info APAR -------------------------
   II06310  DUMPSRV info plus fixlist (II06226)
   II05402  CANCEL command fails to complete
   II10817  DB2 R610 R710 R810 STORAGE USAGE FIXLIST

e-Support web site: http://www-3.ibm.com/software/data/db2/os3
Technical information is categorized so you can navigate directl
COMMENTS: CLOSED FOR DB2INFO
RETENTION: See II04309 for DB2 storage info and more DB2 diagnostic setup.
________________________________________________________________
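The job list and SDATA options from step 2 can be assembled mechanically once the job names are known. The following is a minimal Python sketch, not part of the APAR process. The function name dump_operands, its parameters, and the sample job names (DB2P, DB2PIRLM, BADJOB) are hypothetical. It only strings together the JOBNAME= and SDATA= operands quoted in step 2, placing *MASTER* first when the allied address space is swapped out.

```python
from typing import List, Optional

# SDATA options copied from step 2(c) of this APAR.
SDATA = ("RGN", "CSA", "SQA", "LPA", "LSQA", "SWA", "PSA",
         "ALLNUC", "XESDATA", "TRT", "GRSQ", "SUM")


def dump_operands(ssnm: str, irlm_job: str,
                  allied_job: Optional[str] = None,
                  swapped_out: bool = False) -> str:
    """Build the JOBNAME= and SDATA= operands for a console dump.

    ssnm        DB2 subsystem name (hypothetical example: "DB2P")
    irlm_job    IRLM job name, passed in because it is site-specific
    allied_job  hung allied job, or None when DB2 itself is hung
    swapped_out True when the allied address space is swapped out
    """
    jobs: List[str] = []
    if allied_job and swapped_out:
        jobs.append("*MASTER*")  # step 2(b): MVS master is dumped first
    if allied_job:
        jobs.append(allied_job)
    jobs += ["XCFAS", irlm_job, f"{ssnm}MSTR", f"{ssnm}DBM1"]
    return f"JOBNAME=({','.join(jobs)}),SDATA=({','.join(SDATA)})"


if __name__ == "__main__":
    print(dump_operands("DB2P", "DB2PIRLM",
                        allied_job="BADJOB", swapped_out=True))
```

The sketch does not cover the REMOTE= and DSPNAME= operands or the rule about dumping a contending address space first; those would need to be added by hand for a data sharing group.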
Problem summary
Problem conclusion
Temporary fix
Comments
APAR Information
APAR number
II14016
Reported component name
PB LIB INFO ITE
Reported component ID
INFOPBLIB
Reported release
001
Status
INTRAN
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2005-03-29
Closed date
Last modified date
2018-03-29
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
Document Information
Modified date:
29 March 2018