IY97698: TEMS SHUTDOWN HANG IN FLUSH.

APAR status

  • Closed as duplicate of another APAR.

Error description

  • Problem Description:
    TEPS loses communication with the hub TEMS. RPC timeouts of
    900 seconds or longer appear in the RAS1 log:
    +4627F7A4.0000 extend: 372 duration: 931 state: 1
    The timed-out calls are executing the CTDS_DestroyRequest() RPC,
    interface UUID 684152a852f9.02.c6.d2.2d.fd.00.00.00, opnum: 7.
    A state capture shows threads blocked in fflush.
    
    Detailed Recreation Procedure:
      Very difficult to reproduce; the problem is timing and load
      dependent.
    
    Related Files and Output:
      The RAS1 log shows the SAR state summary with an extended
      duration of 900+ seconds and a UUID / opnum indicating
      CTDS_DestroyRequest().
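
The check described above can be sketched as a small log scan. This is only an illustration, not an IBM tool: the function name `long_sar_calls` and the 900-second default are assumptions, and the regular expressions rely only on the `rpc__sar`, `opnum:` and `duration:` fields visible in the log excerpts in this APAR.

```python
import re

SAR_START = re.compile(r'"rpc__sar"\)')      # header of each SAR state summary
DURATION = re.compile(r'duration:\s*(\d+)')
OPNUM = re.compile(r'opnum:\s*(\d+)')

def long_sar_calls(log_text, threshold=900):
    """Return (opnum, duration) for each rpc__sar summary at or above threshold."""
    hits = []
    # Everything before the first rpc__sar header is discarded.
    for block in SAR_START.split(log_text)[1:]:
        duration = DURATION.search(block)
        opnum = OPNUM.search(block)
        if duration and int(duration.group(1)) >= threshold:
            hits.append((int(opnum.group(1)) if opnum else None,
                         int(duration.group(1))))
    return hits

# Abbreviated from the log excerpts in this APAR.
sample = '''(4627F7A4.0000-1:kdcc1sr.c,1000,"rpc__sar") Remote call failure: 1C010001
+4627F7A4.0000 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 7
+4627F7A4.0000 extend: 372 duration: 931 state: 1
(462800CC.0000-6:kdcc1sr.c,1000,"rpc__sar") Remote call failure: 1C010001
+462800CC.0000 object: 000000000000.00.00.00.00.00.00.00.00 opnum: 3
+462800CC.0000 extend: 0 duration: 360 state: 1
'''
print(long_sar_calls(sample))  # [(7, 931)]
```

Only the opnum 7 call is reported; the opnum 3 failure lasted 360 seconds and falls below the threshold.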
    
    Logs provided:
    ============================================
    During our daily call we saw the TEPS disconnect, and it then
    failed to reconnect, but the issue had occurred much earlier.
    The problem was at the hub TEMS, which was stuck in a loop.
    
    The TEPS logs showed an rpc__sar error:
    
    (2007.109 12.23.37-38:kdcc1sr.c,1000,"rpc__sar") Remote call
    failure:
    1C010001
    +2007.109 12.23.37 activity:
    c3e147f8920f.42.53.00.32.41.81.27.17
    started: 4627BE17
    +2007.109 12.23.37 interface:
    684152a852f9.02.c6.d2.2d.fd.00.00.00
    version: 121
    +2007.109 12.23.37 object: 000000000000.00.00.00.00.00.00.00.00
    opnum: 7
    +2007.109 12.23.37 srvr-boot: 4624E186 length: 16 a/i-hints:
    F923/0005
    +2007.109 12.23.37 interval: 15 pkts-in: 30 retries: 0
    +2007.109 12.23.37 pings: 31 no-calls: 0 working: 30
    +2007.109 12.23.37 facks: 0 mtu: 944 sequence: 2
    +2007.109 12.23.37 b-size: 32 b-fail: 0 b-hist: 0
    +2007.109 12.23.37 nextfrag: 0 fragnum: 0 timeouts: 31
    +2007.109 12.23.37 idem: false maybe: false large: false
    +2007.109 12.23.37 callback: false snd-frags: false rcv-frags:
    false
    +2007.109 12.23.37 extend: 372 duration: 930 state: 1
    +2007.109 12.23.37 bld-date: Mar 9 2007 bld-time: 18:12:12
    revision:
    1.1.1.2
    +2007.109 12.23.37 bsn: 3718534 bsq: 4 driver: d7068a
    +2007.109 12.23.37 short: 10 contact: 60 reply: 240
    +2007.109 12.23.37 req-int: 30 frag-int: 30 ping-int: 15
    +2007.109 12.23.37 limit: 900 work-allow: 30
    +2007.109 12.23.37 loc-endpt: ip.spipe:#*[7759]
    +2007.109 12.23.37 rmt-endpt: ip.spipe:#129.39.23.80[3660]
    (2007.109 12.23.37-38:kdsnccns.c,260,"NCSErrorMessage") CT/DS
    RPC Error:
    DSR010 - CTDS_DestroyRequest RPC abend
    (2007.109 12.23.37-38:kdsnccns.c,59,"ConvertNCSStatus") NCS
    Status Code:
    1c010001
    (2007.109 12.23.37-38:kdsnccns.c,209,"ConvertNCSStatus") CT/DS
    status
    returned: 155
    
    This indicates a communication failure between the hub and the
    TEPS. We recycled the TEPS, but it failed again with the same
    error, indicating that the TEMS was not accepting new
    responses. Here are two consecutive RPC errors after 932
    seconds:
    
    
    (4627F7A4.0000-1:kdcc1sr.c,1000,"rpc__sar") Remote call failure:
    1C010001
    +4627F7A4.0000 activity: c3e215e6c4a5.42.53.00.33.41.81.27.17
    started:
    4627F401
    +4627F7A4.0000 interface: 684152a852f9.02.c6.d2.2d.fd.00.00.00
    version:
    121
    +4627F7A4.0000 object: 000000000000.00.00.00.00.00.00.00.00
    opnum: 7
    +4627F7A4.0000 srvr-boot: 4624E186 length: 16 a/i-hints:
    F8BD/0005
    +4627F7A4.0000 interval: 15 pkts-in: 30 retries: 0
    +4627F7A4.0000 pings: 31 no-calls: 0 working: 30
    +4627F7A4.0000 facks: 0 mtu: 944 sequence: 4
    +4627F7A4.0000 b-size: 32 b-fail: 0 b-hist: 0
    +4627F7A4.0000 nextfrag: 0 fragnum: 0 timeouts: 31
    +4627F7A4.0000 idem: false maybe: false large: false
    +4627F7A4.0000 callback: false snd-frags: false rcv-frags: false
    +4627F7A4.0000 extend: 372 duration: 931 state: 1
    +4627F7A4.0000 bld-date: Mar 9 2007 bld-time: 18:12:12 revision:
    1.1.1.2
    +4627F7A4.0000 bsn: 3718534 bsq: 4 driver: d7068a
    +4627F7A4.0000 short: 10 contact: 60 reply: 240
    +4627F7A4.0000 req-int: 30 frag-int: 30 ping-int: 15
    +4627F7A4.0000 limit: 900 work-allow: 30
    +4627F7A4.0000 loc-endpt: ip.spipe:#*[7758]
    +4627F7A4.0000 rmt-endpt: ip.spipe:#129.39.23.80[3660]
    (4627F7A4.0001-1:kdsnccns.c,260,"NCSErrorMessage") CT/DS RPC
    Error:
    DSR010 - CTDS_DestroyRequest RPC abend
    (4627F7A4.0002-1:kdsnccns.c,59,"ConvertNCSStatus") NCS Status
    Code:
    1c010001
    
    (462800CC.0000-6:kdcc1sr.c,1000,"rpc__sar") Remote call failure:
    1C010001
    +462800CC.0000 activity: c3e216897f00.42.53.00.33.41.81.27.17
    started:
    4627FF64
    +462800CC.0000 interface: 684152a852f9.02.c6.d2.2d.fd.00.00.00
    version:
    121
    +462800CC.0000 object: 000000000000.00.00.00.00.00.00.00.00
    opnum: 3
    +462800CC.0000 srvr-boot: 4624E186 length: 364 a/i-hints:
    FFFF/FFFF
    +462800CC.0000 interval: 15 pkts-in: 11 retries: 11
    +462800CC.0000 pings: 12 no-calls: 11 working: 0
    +462800CC.0000 facks: 0 mtu: 944 sequence: 13
    +462800CC.0000 b-size: 32 b-fail: 0 b-hist: 0
    +462800CC.0000 nextfrag: 0 fragnum: 0 timeouts: 12
    +462800CC.0000 idem: false maybe: false large: false
    +462800CC.0000 callback: false snd-frags: false rcv-frags: false
    +462800CC.0000 extend: 0 duration: 360 state: 1
    +462800CC.0000 bld-date: Mar 9 2007 bld-time: 18:12:12 revision:
    1.1.1.2
    +462800CC.0000 bsn: 3718534 bsq: 4 driver: d7068a
    +462800CC.0000 short: 10 contact: 60 reply: 240
    +462800CC.0000 req-int: 30 frag-int: 30 ping-int: 15
    +462800CC.0000 limit: 900 work-allow: 30
    +462800CC.0000 loc-endpt: ip.spipe:#*[7759]
    +462800CC.0000 rmt-endpt: ip.spipe:#129.39.23.80[3660]
    (462800CC.0001-6:kdsnccns.c,260,"NCSErrorMessage") CT/DS RPC
    Error:
    DSR034 - CTDS_CreateRequest RPC abend
    (462800CC.0002-6:kdsnccns.c,59,"ConvertNCSStatus") NCS Status
    Code:
    1c010001
    (462800CC.0003-6:kdsnccns.c,209,"ConvertNCSStatus") CT/DS status
    returned: 155
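
The hexadecimal prefixes on these log lines appear to be Unix epoch seconds, since the difference between each `started:` value and the line prefix matches the reported `duration:`. A minimal sketch of that arithmetic follows; the helper names `ras1_epoch` and `elapsed` are hypothetical, and the epoch interpretation is an inference from the values in this APAR rather than a documented format.

```python
from datetime import datetime, timezone

def ras1_epoch(hex_stamp):
    """Convert a stamp such as '4627F7A4.0000' to integer seconds.

    Assumption: the part before the dot is hexadecimal Unix epoch seconds.
    """
    return int(hex_stamp.split(".")[0], 16)

def elapsed(started_hex, failed_hex):
    """Seconds between the 'started:' value and the failure line prefix."""
    return ras1_epoch(failed_hex) - ras1_epoch(started_hex)

print(elapsed("4627F401", "4627F7A4.0000"))  # 931, matches duration: 931
print(elapsed("4627FF64", "462800CC.0000"))  # 360, matches duration: 360

# The same stamp rendered as a UTC date and time.
print(datetime.fromtimestamp(ras1_epoch("4627F7A4.0000"), tz=timezone.utc))
```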
    
    We ran gencore against the kdsmain process on the hub and
    analyzed the core with dbx. Thread 130 was in a hung state:
    
    (dbx) where
    write.write(??, ??, ??) at 0xd033b988
    flsbuf._xwrite(??, ??, ??, ??) at 0xd0332cd8
    flsbuf._xflsbuf(??) at 0xd0332c10
    flsbuf.fflush_unlocked(??) at 0xd0332478
    flsbuf.fflush(??) at 0xd03330ac
    KBBRAFH(0x33e47278, 0x43431cad, 0x2) at 0x2018c244
    KBBSS_FlushBuffer(0x43431ba8, 0x1) at 0x20182a74
    BSS1_EndFormat(0x43431ba8) at 0x201890ac
    RAS1_Format(0x305229b0, 0x283, 0x3049c810, 0x409c0d84) at
    0x2017e840
    RAS1_Printf(0x305229b0, 0x283, 0x3049c810, 0x4c0cfec4, 0x14c,
    0x32cc8948, 0x101dc0b1, 0x43477ce0) at 0x2017e7a4
    Index_GetNextLocate(0x409c0fd8, 0x409c0fcc, 0x409c0fd0) at
    0x3047fe20
    KFAUS_PositionEqualFromBeginning(0x409c0fd8, 0x409c11e8, 0x64,
    0x409c0fcc, 0x409c0fd0) at 0x3047dfa8
    KFAUS_RetrieveUserIndexEntries(0x409c14d8, 0x154, 0x409c1378,
    0x409c1348, 0x304a785c, 0x1, 0x1, 0x409c11e8) at 0x3047c8c0
    kfastsal(0x4b6a3b58, 0x4b657d77, 0x4ae47c98, 0x0, 0x0, 0x0) at
    0x30492f20
    kfastpst(0x4aa78c18, 0x4b657d77, 0x4ae47c98, 0x4ae47c68,
    0x55000055) at
    0x30490d98
    kfastalr(0x4aa78c18, 0x4b640ad4, 0x4, 0x4aa78e90, 0x20, 0x0,
    0x4b657d77,
    0x0) at 0x30492658
    kfaatloc.Process(0xae900000, 0x4b203868) at 0x3044b660
    VST11PT(0xae810000, 0x4b18b9e8) at 0x1009db40
    VVW11_ManageView(0x4bd90978) at 0x10035ef4
    ThreadManager(0x38c44e98) at 0x2007fcdc
    
    (dbx) 0x4b18b9e8 / 100 c
    0x4b18b9e8: 'V' 'S' 'T' 'O' 'ᄅ' '' '\0' '\0' 'K' '^X' '"' '^T'
    '0' 'T'
    '^U' '￐'
    0x4b18b9f8: 'K' '^X' '' 'ハ' 'K' '^X' '"' 'D' '\0' '\0' '\0'
    '\0' '\0'
    '\0' '\0' '\0'
    0x4b18ba08: 'ハ' '^F' '' '\0' 'Q' 'A' '1' 'C' 'S' 'I' 'T' 'F' ' '
    ' ' ' '
    '\0'
    0x4b18ba18: '\0' '\0' '\0' '\0' '\0' 'T' 'S' 'I' 'T' 'D' 'E' 'S'
    'C'
    '\0' '\0' '\0'
    0x4b18ba28: 'O' '4' 'S' 'R' 'V' '\0' '\0' '\0' '\0' '\0' '\0'
    '\0' '\0'
    '\0' '\0' '\0'
    0x4b18ba38: '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0' '\0'
    '\0' '\0'
    '\0' '\0' '\0' '\0'
    0x4b18ba48: ' ' '￘' '\0' '\0'
    ==================================
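
When many threads have to be checked, the dbx `where` output can be scanned for the same pattern: an fflush frame reached through the RAS1 trace routines. The sketch below is an illustration only. The function name `trace_flush_caller` is hypothetical, and treating the `RAS1_`, `BSS1_` and `KBB` prefixes as trace-library frames is an assumption based on the stack shown above.

```python
import re

# A frame line starts with the function name followed by '('.
# Wrapped continuation lines (arguments or addresses) do not match.
FRAME = re.compile(r'^\s*([\w.]+)\(', re.MULTILINE)

# Assumption: frames with these prefixes belong to the trace library.
TRACE_PREFIXES = ("RAS1_", "BSS1_", "KBB")

def trace_flush_caller(where_text):
    """Return the first non-trace frame beneath an fflush call, or None."""
    names = FRAME.findall(where_text)
    flush_at = next((i for i, n in enumerate(names) if "fflush" in n), None)
    if flush_at is None:
        return None  # this thread is not inside fflush
    for name in names[flush_at + 1:]:
        if "fflush" in name or name.startswith(TRACE_PREFIXES):
            continue  # still inside libc flush or the trace library
        return name
    return None

# Abbreviated from the thread 130 stack in this APAR.
stack = '''(dbx) where
write.write(??, ??, ??) at 0xd033b988
flsbuf.fflush_unlocked(??) at 0xd0332478
flsbuf.fflush(??) at 0xd03330ac
KBBRAFH(0x33e47278, 0x43431cad, 0x2) at 0x2018c244
RAS1_Printf(0x305229b0, 0x283, 0x3049c810, 0x4c0cfec4, 0x14c,
0x32cc8948, 0x101dc0b1, 0x43477ce0) at 0x2017e7a4
Index_GetNextLocate(0x409c0fd8, 0x409c0fcc, 0x409c0fd0) at
0x3047fe20
'''
print(trace_flush_caller(stack))  # Index_GetNextLocate
```

For the stack above, the caller that issued the trace write is Index_GetNextLocate.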
    

Local fix

  • No workaround available.
    

Problem summary

Problem conclusion

Temporary fix

Comments

  • This APAR is a duplicate of IY93582
    

APAR Information

  • APAR number

    IY97698

  • Reported component name

    TEMS

  • Reported component ID

    5724C04MS

  • Reported release

    610

  • Status

    CLOSED DUB

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    2007-04-23

  • Closed date

    2007-10-27

  • Last modified date

    2007-10-27

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels



Document information


More support for:

Tivoli Components
ITM Tivoli Enterprise Mgmt Server V6

Software version:

610

Reference #:

IY97698

Modified date:

2007-10-27
