DB2 Connect Performance Issue Related to Cryptography on Power Series Hardware

Technote (troubleshooting)


Problem (Abstract)

DB2 LUW on AIX may experience slow connect performance on pSeries hardware with POWER5+ CPUs and higher. The defect has been fixed starting with DB2 V9.7 FP9, V10.1 FP3a and V10.5 FP3a.

If you are unable to upgrade to the latest fix packs, the instructions in this technote allow you to work around the problem until you can upgrade.

Symptom

The slow connect performance problem has been observed in two scenarios:

  1. Connection is slow on the first connect only. Subsequent connects from the same application/CLP are quick.
  2. Connection is slow on the first and subsequent connects.

This article only applies to AIX systems running on POWER5+ hardware or later.

Reference APARs
  • IZ00735 for V8.2 (Out of service)
  • IZ12129 for V9.1 (Out of service)
  • IZ29155 for V9.5 (No plans to fix. Must upgrade to newer release.)
  • IC77542 for V9.7 (Fixed in V9.7 FP9)
  • IC87954 for V10.1 (Fixed in V10.1 FP3a)
  • IC97731 for V10.5 (Fixed in V10.5 FP3a)

Cause

The performance problems are related to the mechanisms used to generate good random numbers, a requirement for strong cryptography.


Environment

AIX

Diagnosing the problem

The problem described here is specific to DB2 running on PowerPC hardware, and the workaround is AIX specific. The behavior applies to both DB2 client and DB2 server products.

In case 1 below, the affected product can be either a DB2 runtime client or a DB2 server. If the application runs on a machine with the DB2 runtime client installed, apply the workaround to the DB2 runtime client; there is no need to modify the DB2 server. If the application runs on a machine where the DB2 server is installed, apply the workaround to that DB2 server.

In case 2 below, the performance problem is at the DB2 server, so the workaround is applied to the DB2 server only.

Case 1) Connection is slow on first connect, only.

This is a DB2 application-side problem. On initialization, the cryptographic code used by DB2 has trouble generating sufficient entropy. The DB2 client (application) performs this initialization once per process, on the first connect. Cryptographic code is used on the client side when the database has been cataloged with one of the following authentication types:

  • NOT_SPEC (authentication type was not specified; uses the server's authentication type but tries SERVER_ENCRYPT first)
  • SERVER_ENCRYPT
  • DATA_ENCRYPT
  • KERBEROS (DB2 encrypts password in memory)


Hence, if you run "db2 list db directory" and the database the application connects to is cataloged with any one of the above, there is a good chance that you are experiencing the issue described in this document. When running an application directly on a DB2 server system, you cannot specify an authentication type for the local database, so connecting to a local database is affected too.

Example of a database that was cataloged without specifying an authentication type:

$ db2 list db directory

System Database Directory

Number of entries in the directory = 1

Database 1 entry:

Database alias                       = SAMPLE
Database name                        = SAMPLE
Node name                            = MYNODE
Database release level               = d.00
Comment                              =
Directory entry type                 = Remote
Catalog database partition number    = -1
Alternate server hostname            =
Alternate server port number         =

Example of a database that was cataloged with a specific authentication type:

$ db2 list db directory

System Database Directory

Number of entries in the directory = 1

Database 1 entry:

Database alias                       = SAMPLE
Database name                        = SAMPLE
Node name                            = MYNODE
Database release level               = d.00
Comment                              =
Directory entry type                 = Remote
Authentication                       = SERVER_ENCRYPT
Catalog database partition number    = -1
Alternate server hostname            =
Alternate server port number         =
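
The directory check above can be scripted. The following is a minimal sketch and not part of DB2: flag_crypto_aliases is a hypothetical helper name, and the parsing assumes the directory output uses the "Database alias" and "Authentication" labels exactly as in the two examples above. An entry with no Authentication line is reported as NOT_SPEC.

```shell
# Hypothetical helper (not shipped with DB2): reads "db2 list db directory"
# output on stdin and prints one line per alias with its cataloged
# authentication type and whether that type loads the cryptographic code.
flag_crypto_aliases() {
    awk -F'= *' '
        # Print the entry collected so far, if any.
        function report() {
            if (db_alias == "") return
            if (auth == "") auth = "NOT_SPEC"
            crypto = (auth ~ /^(NOT_SPEC|SERVER_ENCRYPT|DATA_ENCRYPT|KERBEROS)$/) ? "yes" : "no"
            printf "%s %s crypto=%s\n", db_alias, auth, crypto
        }
        # Return the text after "=" with trailing whitespace removed.
        function value(   v) { v = $2; sub(/[ \t\r]+$/, "", v); return v }
        /^ *Database alias/ { report(); db_alias = value(); auth = "" }
        /^ *Authentication/ { auth = value() }
        END { report() }
    '
}
```

Typical use would be "db2 list db directory | flag_crypto_aliases". Any alias reported with crypto=yes is cataloged in a way that makes the client initialize the cryptographic code on first connect.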

There are two ways to determine whether you are experiencing this problem: 1) tprof and 2) db2 trace.

Use tprof to measure CPU usage. If the results show that the cryptographic libraries have excessive CPU usage, you are experiencing the issue described here. The cryptographic libraries have different names in different releases; however, they all reside under a directory named "icc".

 tprof -skeuj -x sleep 5

The other way is to collect a DB2 trace with timestamps enabled. If cryptContextRealInit takes more than a second to complete, you are experiencing the problem described here.


Case 2) Connection is slow on first and subsequent connects


The problem here is at the DB2 server, which is using authentication type SERVER_ENCRYPT or DATA_ENCRYPT. The performance problem is also related to random number generation, but this time it occurs during the key exchange rather than during initialization. Typically, the customer experiences connects so slow that they can almost be called a hang. The stack of the associated DB2 thread looks similar to the following:

0x090000000D73DB20 getbyte + 0xA0
0x090000000D73C9F4 trng_raw_gen + 0x2F4
0x090000000D73B9B8 CLiC_trng + 0x38
0x090000000D73EA5C efGenerateRandomSeed + 0xDC
0x090000000D73B024 PRNG_GenerateRandomSeed@AF34_12 + 0x24
0x090000000D73B1F8 efRNG_ReSeed@AF35_17 + 0x118
0x090000000D73B4D4 OldefRNG_Generate@AF36_1 + 0x1D4
0x090000000D73A4B4 efRNG_Generate + 0x114
0x090000000D7519AC fips_rand_bytes + 0x6C
0x090000000D790A5C C98E_RAND_bytes + 0x5C
0x090000000D7B1040 bnrand + 0x160
0x090000000D7B1388 C98E_BN_rand + 0x48
0x090000000D7D8E7C generate_key + 0x15C
0x090000000D7D8C30 C98E_DH_generate_key + 0x30
0x090000000D72A28C METAC_DH_generate_key@AF409_179 + 0x6C
0x090000000D685774 ICCC_DH_generate_key@AF359_157 + 0x34
0x090000000D67E9E8 ICCC_DH_generate_key + 0x28
0x090000000D674B90 ICC_DH_generate_key + 0x30
0x090000000947DB3C cryptDHGetPublicKey + 0x200

Use the following to generate the stack trace:

db2pd -stack all

Normally, the stack traces are written to sqllib/db2dump. Change to the sqllib/db2dump directory and run the following command:

find . -name "*stack.txt" -exec grep -l trng_raw_gen {} \;

If there is a hit then you know you are experiencing this issue.
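
The same search can be wrapped so that it prints a verdict and returns an exit status that other scripts can test. This is a sketch only: find_trng_stacks and report_trng_stacks are hypothetical names, and the dump directory is passed as an argument because its location depends on the instance.

```shell
# Hypothetical helpers (not shipped with DB2).

# List stack files under the given directory that mention trng_raw_gen.
find_trng_stacks() {
    find "${1:-.}" -name "*stack.txt" -exec grep -l trng_raw_gen {} \;
}

# Print the matching files, or a note that none matched.
# Returns 0 when at least one stack file matches, 1 otherwise.
report_trng_stacks() {
    hits=$(find_trng_stacks "$1")
    if [ -n "$hits" ]; then
        printf 'Stack files showing trng_raw_gen:\n%s\n' "$hits"
    else
        printf 'No stack file mentions trng_raw_gen\n'
        return 1
    fi
}
```

For example, "report_trng_stacks $HOME/sqllib/db2dump" could be run after "db2pd -stack all", assuming the instance writes its dumps to that default location.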

Resolving the problem

What is described here is a workaround. The official fix is available starting with DB2 V9.7 FP9, V10.1 FP3a and V10.5 FP3a.

Quick Patch for case 1

If the application is remote and the DB2 server is using authentication type SERVER, the problem can be avoided by changing the client-side database catalog. This issue occurs only when encryption is needed, and because the server's authentication type is SERVER, it is not necessary to load the cryptographic libraries. DB2 attempts to be secure by default, so if the authentication type is not specified on the DB2 client, the initial attempt to connect to the server uses encryption. If the authentication type on the server does not require encryption, the connection is retried, but at that point it is too late to avoid the issue described here on susceptible systems. When you explicitly specify authentication type SERVER at the DB2 client, the client has no need to try SERVER_ENCRYPT first, and hence no cryptographic code is loaded or initialized on the client.
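
As an illustration of the catalog change, the sketch below prints the CLP commands that would recatalog a remote database with an explicit authentication type. It only prints them so they can be reviewed before being run. print_recatalog_cmds is a hypothetical helper name, and the sketch assumes the alias and the database name are identical, as in the SAMPLE entry shown earlier; if they differ, the catalog command would also need an "as <alias>" clause.

```shell
# Hypothetical helper (not shipped with DB2): print, but do not run, the CLP
# commands that recatalog a remote database with authentication SERVER.
# $1 = database alias (assumed equal to the database name), $2 = node name.
print_recatalog_cmds() {
    db_alias=$1
    node=$2
    # Remove the existing entry that has no explicit authentication type.
    printf 'db2 uncatalog database %s\n' "$db_alias"
    # Recreate it with authentication SERVER so SERVER_ENCRYPT is not tried.
    printf 'db2 catalog database %s at node %s authentication server\n' "$db_alias" "$node"
    # End the CLP back-end process so the directory change is picked up.
    printf 'db2 terminate\n'
}
```

Running "print_recatalog_cmds SAMPLE MYNODE" lists the three commands for the example entry. Afterwards, "db2 list db directory" should show an Authentication line for the alias.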

Workaround in DB2 V10.1 or DB2 V9.7 FP7 and Beyond

The workaround is applicable to cases 1 and 2. It is available in GSKit v8.0.14.14, which is the version of GSKit used in DB2 V10.1 and, starting with FP7, in DB2 V9.7. DB2 V10.5 uses GSKit v8.0.14.27. For all other releases, you need to contact DB2 technical support to obtain a special build.

To enable the performance workaround, environment variables need to be set. Note that enabling the workaround causes DB2 to use an alternate set of cryptographic algorithms that have not yet been FIPS 140 certified.

At the DB2 Server, the following must be set. If the DB2 Server has multiple partitions or members then the "export" statements must be added to db2profile. Do not add the "db2set" statement into db2profile.

export ICC_TRNG=ALT
export ICC_IGNORE_FIPS=YES
db2set DB2ENVLIST="ICC_TRNG ICC_IGNORE_FIPS"


At the DB2 clients:

export ICC_TRNG=ALT
export ICC_IGNORE_FIPS=YES
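
Because the workaround only takes effect when both variables are actually exported to the process, a quick pre-flight check can help before starting the application or the instance. The following is a sketch; check_icc_env is a hypothetical helper name. It checks the environment only and does not verify the DB2ENVLIST registry setting on the server.

```shell
# Hypothetical pre-flight check (not shipped with DB2): confirm that both
# workaround variables are exported in the current environment.
# Prints each variable found, reports any that are missing, and returns
# 0 only when both are present.
check_icc_env() {
    missing=0
    for name in ICC_TRNG ICC_IGNORE_FIPS; do
        # env lists exported variables only; grep prints the matching line.
        if ! env | grep "^${name}="; then
            printf '%s is not exported\n' "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

A start-up script could call "check_icc_env || exit 1" so that the application is not launched without the workaround in place.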


For V10.5, ICC_TRNG also supports ALT2. If ALT does not solve the performance problem then try ALT2.

DB2 V9.7 FP9 is shipped with GSKit v8.0.14.32, which contains a FIPS certified ICC that includes the fix. Setting environment variables or DB2 registry variables is not required. The fix is enabled by default.


What does setting ICC_TRNG=ALT and ICC_IGNORE_FIPS=YES do?

Setting ICC_TRNG=ALT instructs the cryptographic code to use an alternate random number generator that fixes the problem. This alternate code is only available in non-FIPS mode. Setting ICC_IGNORE_FIPS=YES causes GSKit-crypto to use the non-FIPS code. The cryptographic modules come in two flavours: FIPS and non-FIPS. Both are bundled and managed under a product called GSKit-crypto. DB2 interfaces with GSKit-crypto to get encryption support, and it is GSKit-crypto that provides the ability to switch between the FIPS and non-FIPS cryptographic modules. By default, DB2 always uses the FIPS version. To reach the non-FIPS module that contains the workaround, you need to set both environment variables so that GSKit-crypto switches to the non-FIPS modules.


What is FIPS?

FIPS stands for Federal Information Processing Standards. FIPS 140 is a U.S. government standard for cryptographic modules. Some institutions require that any software they use that performs encryption have FIPS 140 certified cryptographic modules. For more information on the FIPS 140 standard see http://csrc.nist.gov/publications/fips/fips140-2/fips1402.pdf.


CHANGE HISTORY
December 20, 2013: Defect is now fixed in V9.7 FP9.
June 3, 2014: Defect is now fixed in V10.1 FP3a and V10.5 FP3a

Related information

IC77542
IZ29155

Document information


More support for: DB2 for Linux, UNIX and Windows
Software version: 9.5, 9.7, 10.1, 10.5
Operating system(s): AIX
Reference #: 1614548
Modified date: 2014-06-03
