IBM Support

Regular Expression APIs Give Different Results in Different Environments

Troubleshooting


Problem

Different environments give different results when working with regular expressions using regcomp() or regexec().

Resolving The Problem

Resolution

The regcomp() and regexec() functions are locale sensitive. The locale that is loaded must match the CCSID of the character strings passed to the functions. For instance, if the CCSID is 1146, the locale must be '/QSYS.LIB/EN_GB.LOCALE'.

Programs can sometimes work differently based on the activation group that they are running in. This is due to a difference in the locale that is in use. Locales are activation group resources. When the job is created, the locale is loaded from the system value QLOCALE and the LANG environment variable. It can be changed at any time by calling the setlocale() function. Once it is changed, it affects the entire activation group.
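
When results differ between environments, a first diagnostic step can be to record which locale is actually in effect in the activation group at the point of the regular expression call. The following is a minimal sketch using the standard setlocale() query form (a NULL second argument returns the current setting without changing it). The helper names current_locale_name() and log_current_locale() are hypothetical and are not part of any IBM API.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: returns the name of the locale currently in
 * effect. Passing NULL to setlocale() queries without modifying. */
const char *current_locale_name(void)
{
    const char *name = setlocale(LC_ALL, NULL);
    return name != NULL ? name : "(unknown)";
}

/* Hypothetical helper: writes the current locale name to stderr,
 * tagged with a caller-supplied label such as a function name. */
void log_current_locale(const char *where)
{
    fprintf(stderr, "[%s] locale in effect: %s\n",
            where, current_locale_name());
}
```

Calling log_current_locale() immediately before regcomp() in each environment may show whether the two environments are really running under the same locale.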

Often, the locale in the default activation group gets changed. Something (the operating system, another application, and so on) sets the locale to a different value, such as the default "C" locale. The default "C" locale is EN_US, which matches CCSID 37. When the regcomp() and regexec() functions are called in that locale, they expect CCSID 37 data. If CCSID 1146 data is used in this locale, the APIs will not produce the same results. Often, the APIs still run successfully and return 0; however, the results differ from what is expected.

Each activation group has only one locale. The default activation group is shared by many different programs, so changing the locale within it can cause other code to work incorrectly. If the application must run in the default activation group, it should call setlocale() to load the correct locale before calling regcomp() and regexec(). A safer approach is to run in a named activation group, where there is no danger of causing inadvertent behavior changes in the default activation group. When a named activation group is created, it loads the locale from QLOCALE and the LANG environment variable, so its locale is the same regardless of whether the default activation group's locale was changed.
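
The setlocale() approach described above can be sketched as follows. The function name match_in_locale() and its return convention are hypothetical. Saving and restoring the previous locale is an added precaution that this document does not prescribe; it is meant to limit side effects on other code sharing the activation group, and it does not protect code that runs concurrently while the locale is switched.

```c
#include <locale.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: matches text against an extended regular
 * expression while locale_name is in effect.
 * Returns 1 on match, 0 on no match, -1 on any error. */
int match_in_locale(const char *locale_name,
                    const char *pattern, const char *text)
{
    regex_t re;
    int result = -1;
    char *saved = NULL;
    const char *current = setlocale(LC_ALL, NULL);

    /* Copy the current name: later setlocale() calls may overwrite
     * the buffer that the returned pointer refers to. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved == NULL)
            return -1;
        strcpy(saved, current);
    }

    /* Load the locale that matches the CCSID of the data. */
    if (setlocale(LC_ALL, locale_name) != NULL) {
        if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0) {
            int rc = regexec(&re, text, 0, NULL, 0);
            result = (rc == 0) ? 1 : (rc == REG_NOMATCH ? 0 : -1);
            regfree(&re);
        }
    }

    /* Restore the locale that was in effect on entry. */
    if (saved != NULL) {
        setlocale(LC_ALL, saved);
        free(saved);
    }
    return result;
}
```

For CCSID 1146 data, a caller on IBM i would pass '/QSYS.LIB/EN_GB.LOCALE' as locale_name, as in the example earlier in this document. In a named activation group the save and restore steps matter less, because the locale change stays within that activation group.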

Additional background information on locales can be found in the "Understanding CCSIDs and Locales" and "Unicode Support" sections of Chapter 3 of the ILE C/C++ Runtime Library Functions manual, which is available at the following URL:
http://publib.boulder.ibm.com/infocenter/iseries/v6r1m0/topic/apis/sc415607.pdf

Historical Number

639749172

Document Information

Modified date:
18 December 2019

UID

nas8N1010877