
IBM Sterling B2B Integrator node crashed with: "java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11" due to nproc settings

Troubleshooting


Problem

An OutOfMemory (OOM) error occurred on node1 of a 2-node IBM Sterling B2B Integrator (SBI) cluster.

Symptom

The following error was noted in the SBI logs, for example opsServer.log, noapp.log (with a date & time stamp), and wf.log:

Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11

The following error was noted in the noapp.log (without a date & time stamp):

JVMDUMP012E Error in System dump: insufficient system resources to generate dump, errno=11 "Resource temporarily unavailable"

Cause

The issue was at the operating system level:
The required Red Hat 'nproc' values in the /etc/security/limits.conf file (* hard nproc 16000, * soft nproc 16000) were being overridden by the /etc/security/limits.d/90-nproc.conf file, which lowered the soft limit to 1024. Once processing picked up, the overridden limit of 1024 was easily exceeded.
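
The override described above can be made visible by listing every active nproc entry in the order the files are parsed. The following is a minimal sketch, assuming that limits.conf is read first, that the files in limits.d follow in alphabetical order, and that a later entry for the same domain and type wins. The function name list_nproc_entries is hypothetical.

```shell
#!/bin/sh
# List active (non-comment) nproc entries from the given files, in the
# order given. Assumption: later matching lines take precedence.
list_nproc_entries() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        # Fields are: <domain> <type> <item> <value>
        awk -v file="$f" '$1 !~ /^#/ && $3 == "nproc" {
            print file ": " $1, $2, $3, $4
        }' "$f"
    done
}

# Typical call on the affected server:
# list_nproc_entries /etc/security/limits.conf /etc/security/limits.d/*.conf
```

If the last "* soft nproc" line printed comes from 90-nproc.conf, that is the value ordinary users receive at login.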

Thread dumps (javacore files) from the OOM reflect the soft nproc value being overridden to 1024:
1CIUSERLIMITS  User Limits (in bytes except for NOFILE and NPROC)
NULL           ------------------------------------------------------------------------
NULL           type            soft limit    hard limit
2CIUSERLIMIT   RLIMIT_NPROC    1024          16000
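
When several dumps need checking, the limit line can be pulled out with a short script. This is a sketch only, assuming the "2CIUSERLIMIT RLIMIT_NPROC <soft> <hard>" layout shown above. The function name javacore_nproc and the file path are hypothetical.

```shell
#!/bin/sh
# Print the soft and hard RLIMIT_NPROC values recorded in a javacore file.
# Assumes whitespace-separated fields: tag, limit name, soft, hard.
javacore_nproc() {
    awk '$1 == "2CIUSERLIMIT" && $2 == "RLIMIT_NPROC" {
        printf "soft=%s hard=%s\n", $3, $4
    }' "$1"
}

# Placeholder path; substitute a real javacore file:
# javacore_nproc /path/to/javacore.txt
```

A soft value below the intended 16000 indicates that the JVM was started with the overridden limit.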

Environment

SBI 5.2.4.2_Interim Fix 4, Red Hat Enterprise Linux 6.x

Diagnosing The Problem

  • The error "Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11" means that this wasn't a JVM heap OutOfMemory error. The heap wasn't exhausted; errno 11 ("Resource temporarily unavailable") shows that the OS refused to create a new native thread.
  • Hardware wasn't a limiting factor (servers with 128 GB RAM and dual 12-core processors, 24 CPUs).

The normal troubleshooting steps for diagnosing a native OOM were followed.
Monitoring was started on the OS for the ulimit items nproc (number of processes) and nofile (number of open files):
  • nofile: lsof | wc -l
  • nproc: ps -eLf | grep <SBIuser> | wc -l

From the observed counts, the conclusion was that the configured nproc values were not being honored.
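
The monitoring above can be turned into a comparison of the thread count against the limit actually in effect for the running process. The following is a rough sketch, assuming a Linux procps ps and the "Max processes" line format of /proc/<pid>/limits. All function names, the user name sbiuser, and the PID are hypothetical.

```shell
#!/bin/sh
# Count threads (LWPs) owned by a user; every thread counts against nproc.
count_user_threads() {
    ps -L -u "$1" -o lwp= | wc -l
}

# Read the soft "Max processes" limit from a /proc/<pid>/limits style file.
soft_nproc_from_limits() {
    awk '$1 == "Max" && $2 == "processes" { print $3 }' "$1"
}

# Print usage as a whole-number percentage of the limit.
# An "unlimited" limit is reported as 0.
nproc_usage_pct() {
    case $2 in
        unlimited) echo 0 ;;
        *) echo $(( $1 * 100 / $2 )) ;;
    esac
}

# Placeholder user and PID; substitute the SBI user and a JVM process ID:
# used=$(count_user_threads sbiuser)
# limit=$(soft_nproc_from_limits /proc/12345/limits)
# echo "threads=$used soft_nproc=$limit usage=$(nproc_usage_pct "$used" "$limit")%"
```

Reading the limit from /proc rather than from the configuration files shows the value the process really inherited, which is what matters when an override is suspected.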

Resolving The Problem

Make the necessary changes in the /etc/security/limits.d/90-nproc.conf file so that the hard and soft nproc settings of 16000 are honored.
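
As an illustration only, the change and a follow-up check might look like the sketch below. It assumes the stock Red Hat Enterprise Linux 6 content of 90-nproc.conf (a "* soft nproc 1024" line), so confirm the actual content of the file before editing. The helper name nproc_meets is hypothetical.

```shell
#!/bin/sh
# Assumed edit in /etc/security/limits.d/90-nproc.conf:
#   before:  *          soft    nproc     1024
#   after:   *          soft    nproc     16000

# Succeed only if every given limit value is at least the required value.
# A value of "unlimited" always passes.
nproc_meets() {
    required=$1
    shift
    for v in "$@"; do
        case $v in
            unlimited) ;;
            *) [ "$v" -ge "$required" ] || return 1 ;;
        esac
    done
    return 0
}

# Run from a NEW login session of the SBI user (bash syntax for ulimit):
# nproc_meets 16000 "$(ulimit -Su)" "$(ulimit -Hu)" && echo "nproc OK"
```

Limits are applied when a session is opened, so SBI has to be restarted from a new login session before the JVM picks up the corrected values.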

For additional information on the subject matter, or with further questions or concerns, consult Red Hat Support.


Document Information

Modified date:
11 February 2020

UID

swg21689324