IBM Support

IBM Java for AIX HowTo: Troubleshoot 'java -version' hang or timeout issue

Question & Answer


Question

IBM Java for AIX HowTo: Troubleshoot 'java -version' hang or timeout issue

Answer

When using IBM Java for AIX, some customers have reported that 'java -version' hangs or times out. The issue has been seen with several different IBM Java versions.

This document provides step-by-step details to confirm and resolve the issue. If the problem continues after you have completed all of these steps, open a new IBM support call with the IBM AIX Java Support team for a more in-depth analysis. Because this is a well-known and documented issue, the IBM AIX Java Support team will confirm that all of these steps have been completed before starting any additional analysis.
The instructions in this document refer to generic terms in italics, which must be replaced with information specific to the support call and the environment. Use consistent and accurate values in place of the italicized generic terms when collecting data, so that uploaded data is delivered promptly and correctly.

Causes

For most of the situations reported to IBM support, the 'java -version' hang or timeout has occurred as a result of these scenarios:

Scenario #1. IBM Java is using IPv6, but other parts of the environment are not IPv6-enabled.

Scenario #2. IBM Java is having issues connecting to a server/IP, especially when using DNS.

Scenario #3. An application does not clean up semaphores (a.k.a. semaphore leak).

Scenario #4. A known issue in IBM Java that has been fixed in the later releases or a corrupted IBM Java for AIX Installation.

Scenario #5. The Java (JVM) process memory size is being restricted, as most of the physical memory is pinned as large pages.


The instructions that follow explain how to identify or rule out each of these common scenarios and which corrective actions resolve it.

Confirm

Scenario #1:

IPv6 Issue

Issue:

Java commands that use IPv6 hostname resolution hang or perform poorly.

Root Cause:

If the Domain Name System (DNS) server is not configured to handle IPv6 queries, the application may have to wait for a response from the DNS server until the IPv6 query times out.

Confirm:

Thread dumps or javacores taken during the slow response or hang will show threads with the following lines at the top of the stack:

at java.net.Inet6AddressImpl.getLocalHostName(Native Method)
at java.net.InetAddress.getLocalHost(InetAddress.java:123)
.....................
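As a rough aid, the check above can be scripted. The sketch below counts matching stack frames in one or more dump files. The function name ipv6_lookup_hits is hypothetical, and the pattern assumes that javacores may write the class name with slashes (java/net/...) while other thread dumps use dots.

```shell
# Count stack frames that sit in the IPv6 local-hostname lookup.
# Reads the named files, or standard input when no file is given.
ipv6_lookup_hits() {
    grep -c 'java[./]net[./]Inet6AddressImpl\.getLocalHostName' "$@"
}

# Hypothetical usage (the file name is a placeholder):
#   ipv6_lookup_hits javacore.txt
```

A non-zero count only suggests that threads were in the IPv6 lookup when the dump was taken; the stack still needs to be read in context.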


Resolution:

If your environment only uses IPv4, add the following option to the failing java command:

-Djava.net.preferIPv4Stack=true

This disables IPv6 lookup requests from the Java application, so that only IPv4 is used.
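For illustration, a minimal sketch of where the option goes on the command line: it is a JVM option, so it is placed directly after the java executable and before '-version' or the application's main class. The wrapper name with_ipv4_only is hypothetical.

```shell
# Run a java command with the IPv4-only option inserted
# directly after the executable name.
with_ipv4_only() {
    cmd=$1
    shift
    "$cmd" -Djava.net.preferIPv4Stack=true "$@"
}

# Hypothetical usage:
#   with_ipv4_only java -version
# which is equivalent to:
#   java -Djava.net.preferIPv4Stack=true -version
```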

Scenario #2:

DNS Connection issue

Issue:

The Java command hangs or times out when communicating with the DNS server.

Confirm:

To confirm that the issue is DNS accessibility, follow the instructions in the "Prepare", "Actions" and "Examples" sections of:

http://www-01.ibm.com/support/docview.wss?uid=isg3T1024364

Use the 'Prepare' section of the technote above to determine whether your system is using DNS or /etc/hosts. If DNS is being used, use the 'Actions' and 'Examples' sections to check whether the connection is failing. If DNS is being used and host name resolution is failing, there is an issue with the DNS server that your system is trying to connect to.

Resolution:

If switching to /etc/hosts hostname resolution resolves the issue, investigate the connection to the DNS server.

Scenario #3:

Semaphore Leak

Issue:

An application does not clean up semaphores (a.k.a. semaphore leak).

Confirm:

AIX has a maximum limit of 65535 semaphores per ID. There have been reported situations in which applications (Java or non-Java) do not correctly delete semaphores, resulting in a gradual semaphore leak. When the number of semaphores reaches the maximum limit for the user ID, the JVM is unable to create the semaphores it requires on startup, resulting in the "Port Library" error message.

To confirm whether a semaphore leak is occurring, run the following command repeatedly from a command prompt:

# ipcs -saPrX | wc -l

over a period of time. If the number of semaphores increases continuously, there is a semaphore leak.
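The repeated sampling can be sketched as a small helper that reads one count per line and reports whether each sample is larger than the one before it. The function name samples_increasing, the sample count and the 60-second interval are assumptions for illustration. This is a strict reading of "continuous increase", so a leak that plateaus between two samples would be reported as "not increasing".

```shell
# Read one integer per line on standard input and report whether
# every sample is strictly larger than the previous one.
samples_increasing() {
    awk '
        NR > 1 && $1 <= prev { bad = 1 }
        { prev = $1 }
        END {
            if (NR > 1 && !bad) print "increasing"
            else print "not increasing"
        }
    '
}

# Hypothetical usage on AIX (five samples, one per minute):
#   for i in 1 2 3 4 5; do ipcs -saPrX | wc -l; sleep 60; done | samples_increasing
```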

Resolution:

a. Rebooting the system provides temporary relief. However, the issue is expected to recur over time.

b. If IBM InfoSphere or IBM DataStage is running on the same system, then follow the instructions on the web page:

http://www-01.ibm.com/support/docview.wss?uid=swg21654008

c. Open a support call with the vendor or team responsible for the application (not the JVM) that is creating but not deleting the semaphores. That vendor or team is responsible for resolving the semaphores not being released.

Scenario #4:

Java Install issue

Issue:

A known issue in a specific version and release of IBM Java that has been fixed in the later releases.

Resolution:


To rule out an issue that has already been fixed in a later release, upgrade Java to the latest release of your Java version as per:

http://www-01.ibm.com/support/docview.wss?uid=isg3T1022692

Java fixes are cumulative, so upgrading to the latest release includes all fixes released so far for that Java version.

If you are already at the latest release for your Java version and the issue persists, rule out a corrupted IBM Java installation (for example, corrupted or overwritten shared library files) by doing the following:

1. Uninstall, then re-install that version of IBM Java for AIX
2. Retry the java command

Scenario #5:

Sustained Paging and low memory

Issue:

The RAM/physical memory is overcommitted on the AIX LPAR.

Confirm:

To check whether memory is overcommitted on the system, run the following 'vmstat' command repeatedly and monitor its output:

# vmstat -It 2 25 > vmstat.out &

Sustained paging activity is the best indication of low memory. If the page-in (pi) and page-out (po) columns show non-zero values over long periods of time, the system is short on memory. (All systems show occasional paging, which is not a concern.)
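One way to review the captured file is sketched below. It counts how many samples have a non-zero pi or po value. The function name paging_samples is hypothetical, and the sketch assumes that the vmstat header row labels the columns "pi" and "po" and that every data row starts with a number.

```shell
# Count vmstat samples whose pi or po column is non-zero.
# Reads the named files, or standard input when no file is given.
paging_samples() {
    awk '
        # Until the header row is found, look for the pi and po labels.
        !found {
            pi = 0; po = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi = i
                if ($i == "po") po = i
            }
            if (pi && po) found = 1
            next
        }
        # Data rows begin with a number (the run-queue column).
        $1 ~ /^[0-9]+$/ {
            total++
            if ($pi > 0 || $po > 0) paging++
        }
        END { printf "%d of %d samples show paging\n", paging, total }
    ' "$@"
}

# Hypothetical usage with the file captured above:
#   paging_samples vmstat.out
```

The count alone does not decide anything: a few paging samples are normal, and it is paging in most samples over a long period that points to a memory shortage.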

Resolution:

To understand, optimize and tune memory, refer to the following resources and use the 'vmo' command with the appropriate options:

https://www.ibm.com/developerworks/aix/library/au-aix7memoryoptimize1/
https://www.ibm.com/developerworks/aix/library/au-aix7memoryoptimize2/
https://www.ibm.com/developerworks/aix/library/au-aix7memoryoptimize3/

For example, a system may be configured with a high number of large pages (16MB pages), which are always pinned, while most of the applications on the system are not configured to use large pages and therefore cannot use this pinned memory. As a result, less memory is available to these applications. They constantly wait for page-ins, which appears to the user as an application or command hang or slowness.

To check the allocated and used large pages, run the command 'svmon -G'. The output is similar to:

# svmon -G

               size       inuse        free         pin     virtual   mmode
memory      4194304     4190023        4281     4185826      824553     Ded
pg space    4194304      167259

               work        pers        clnt       other
pin          480838           0        2108      204896
in use       688731           0        3308

PageSize   PoolSize       inuse        pgsp         pin     virtual
s    4KB          -      261671       26715      257858      284009
m   64KB          -       26898        8784       26874       33784
L   16MB        854           0           0         854           0
S   16GB          -           0           0           0           0
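The large-page figures can be pulled out of output like the above with a short filter, sketched here. The function name large_page_usage is hypothetical. The sketch assumes that the 16MB row starts with "L" and counts columns from the right, so that it works whether the page size is printed as "16MB" or "16 MB".

```shell
# Print the pool size and in-use count from the 16MB (L) row of
# 'svmon -G' output. Reads the named files, or standard input.
large_page_usage() {
    awk '$1 == "L" && NF >= 7 {
        # Columns from the right: PoolSize inuse pgsp pin virtual
        printf "16MB pool=%s inuse=%s\n", $(NF - 4), $(NF - 3)
    }' "$@"
}

# Hypothetical usage:
#   svmon -G | large_page_usage
```

A non-zero pool with an in-use count of 0 corresponds to the situation described in this scenario: memory is pinned as large pages that nothing is using.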



If you are not using those large pages on your LPAR, we suggest that you disable large pages or reduce their number. Large pages are unused when the 'inuse' column of the large-page row (the row beginning with 'L 16MB' in the 'svmon -G' output above) is zero. To disable large pages, use the following command:

# vmo -p -d lgpg_regions

The above command sets the number of large pages to 0. To set the number of large pages to a smaller non-zero value instead, use the following command, supplying the desired count after 'lgpg_regions=':

# vmo -p -o lgpg_regions= -o lgpg_size=16777216

Contact IBM:

If, after following the above instructions, 'java -version' continues to hang or time out, complete the following steps:

1. Confirm that you have completed all of the above steps.
2. Contact IBM and open a new IBM service request (new PMR).
3. Collect and upload data as per the data collection procedure for Java hang.

Document Type: Instruction
Content Type: Howto
Hardware: all Power
Operating System: all AIX Versions
IBM Java: all Java Versions
Author(s): Vidya Makineedi
Reviewer(s): Rama Tenjarla

Document Information

Modified date:
17 June 2018

UID

isg3T1025094