Troubleshooting
Problem
An OutOfMemory (OOM) error occurred on node1 of a 2-node IBM Sterling B2B Integrator (SBI) cluster.
Symptom
The following error was noted in the SBI logs, for example opsServer.log, noapp.log (the copy with a date & time stamp), and wf.log:
Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
The following error was noted in the noapp.log (without a date & time stamp):
JVMDUMP012E Error in System dump: insufficient system resources to generate dump, errno=11 "Resource temporarily unavailable"
Cause
The issue was at the operating system level:
The required Red Hat 'nproc' values (* hard nproc 16000, * soft nproc 16000) set in the /etc/security/limits.conf file were being overridden by the /etc/security/limits.d/90-nproc.conf file, which lowered the soft limit to 1024. Once processing picked up, the overridden soft limit of 1024 was easily exceeded.
The javacore (thread dump) files generated at the time of the OOM show the soft nproc value overridden to 1024:
1CIUSERLIMITS  User Limits (in bytes except for NOFILE and NPROC)
NULL           ------------------------------------------------------------------------
NULL           type                 soft limit      hard limit
2CIUSERLIMIT   RLIMIT_NPROC               1024           16000
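As a quick way to pull the same values out of a javacore, the following sketch prints the soft and hard RLIMIT_NPROC limits recorded in the file. The function name nproc_from_javacore is illustrative only (it is not part of SBI or the JVM), and the sketch assumes the 2CIUSERLIMIT line layout shown in the excerpt above.

```shell
# Print the soft and hard RLIMIT_NPROC values recorded in a javacore file.
# Usage: nproc_from_javacore <javacore-file>
# Assumes whitespace-separated fields: tag, limit name, soft limit, hard limit.
nproc_from_javacore() {
    awk '$1 == "2CIUSERLIMIT" && $2 == "RLIMIT_NPROC" {
        print "soft=" $3, "hard=" $4
    }' "$1"
}
```

For the excerpt above this prints soft=1024 hard=16000. On a live system, the limits applied to a running process can also be read from the "Max processes" row of /proc/<pid>/limits.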
Environment
SBI 5.2.4.2_Interim Fix 4, Red Hat Enterprise Linux 6.x
Diagnosing The Problem
- The error "Caused by: java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11" means that this wasn't a JVM heap OutOfMemory error. The heap wasn't exhausted; the JVM was unable to create a new native thread, and errno 11 matches the "Resource temporarily unavailable" message in the JVMDUMP012E entry.
- Hardware wasn't a limiting factor (128 GB RAM on servers with dual 12-core processors (24 CPUs)).
The normal troubleshooting steps for diagnosing a native OOM were exhausted:
- See java.lang.OutOfMemoryError while creating new threads
- See Thanks for the memory - Understanding how the JVM uses native memory on Windows and Linux
Monitoring was then started on the OS-level ulimit items nproc (number of processes) and nofile (number of open files):
- nofile: lsof | wc -l
- nproc: ps -eLf | grep <SBIuser> | wc -l
The monitoring led to the conclusion that the configured nproc values were not being honored.
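The nproc monitoring above can be sketched as a small set of shell functions that count the threads owned by the SBI user and compare the count against a limit. This is only a sketch: the function names and the default warning threshold of 80 percent are assumptions for illustration, not values from SBI or Red Hat documentation.

```shell
# Count the threads (lightweight processes) owned by a user.
# -L prints one line per thread; "-o lwp=" prints only the thread id, no header.
# Usage: thread_count <user>
thread_count() {
    ps -L -u "$1" -o lwp= | wc -l | tr -d ' '
}

# Print thread usage as a whole-number percentage of a limit.
# Usage: nproc_usage_pct <threads> <limit>
nproc_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Succeed when usage has reached the warning threshold (third argument,
# assumed default of 80 percent).
# Usage: nproc_near_limit <threads> <limit> [threshold-percent]
nproc_near_limit() {
    threshold=${3:-80}
    [ "$(nproc_usage_pct "$1" "$2")" -ge "$threshold" ]
}
```

For example, in bash: nproc_near_limit "$(thread_count <SBIuser>)" "$(ulimit -Su)" && echo "nproc usage is high". Because ulimit reports the limit of the current shell, it should be run as the SBI user in a session that is set up the same way SBI is started.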
Resolving The Problem
Make the necessary changes in the /etc/security/limits.d/90-nproc.conf file so that the hard and soft nproc settings of 16000 are honored.
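As a hedged illustration of the change and a way to check it, the sketch below lists every nproc entry in the order in which the files are expected to be read (limits.conf first, then the limits.d/*.conf files in sorted order), so that a later entry overriding an earlier one is easy to spot. The function name list_nproc_entries is hypothetical, and the example file content in the comments is an assumption about what the corrected 90-nproc.conf might look like; confirm the exact entries with Red Hat Support for your system.

```shell
# Assumed example content for /etc/security/limits.d/90-nproc.conf after the fix:
#   *     soft  nproc  16000
#   *     hard  nproc  16000
#   root  soft  nproc  unlimited

# List all non-comment nproc entries, prefixed with the file they come from.
# Usage: list_nproc_entries [config-dir]   (config-dir defaults to /etc/security)
list_nproc_entries() {
    dir=${1:-/etc/security}
    for f in "$dir/limits.conf" "$dir"/limits.d/*.conf; do
        [ -f "$f" ] || continue
        # Field 3 is the limit item; skip lines whose first field starts with '#'.
        awk -v file="$f" '$1 !~ /^#/ && $3 == "nproc" { print file ": " $0 }' "$f"
    done
}
```

Limits are applied at login, so the SBI user needs a new session (and SBI a restart from that session) before ulimit -u reflects the change.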
Additional information on the subject matter:
- /etc/security/limits.conf soft nproc limit appears to be ignored
- Sterling B2B Integrator 5.2.0 > UNIX/Linux Cluster Environment Installation > Operating System Verification
Consult Red Hat Support with any further questions or concerns on the matter.
Document Information
Modified date:
11 February 2020
UID
swg21689324