Inserting data in GB18030 and UTF-8 environments, with the results (including the unexpected ones) shown for each configuration.
GB18030 environment:
To enable WAS 5.0 to run in a GB18030 environment, configure it as follows:
- In the web application, open the file \WEB-INF\ibm-web-ext.xmi and add the lines:
autoRequestEncoding="true"
autoResponseEncoding="true"
If these two values are “false” (the default), the web container does not automatically set the request and response encoding according to the encoding.properties file, and the programmer is expected to set them using the methods available in the Servlet 2.3 API. To avoid this manual effort during application development, we suggest letting the web application set these values automatically.
- Open the file encoding.properties under the WAS installation. After WAS 5.0 is installed, the default encoding parameter in this file for Simplified Chinese is “zh=GB2312”. Change it to “zh=GB18030”.
With the above configurations, the value of request.getCharacterEncoding() is GB18030.
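To make the role of the “zh=GB18030” entry concrete, the following is a minimal sketch of a language-to-charset lookup over a properties file. It is not the WAS implementation: the class name `EncodingLookupSketch`, the method `charsetFor`, the lookup by language code only, and the ISO-8859-1 fallback are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Properties;

public class EncodingLookupSketch {

    /**
     * Looks up a charset name by the locale's language code ("zh", "en", ...).
     * The fallback value is an assumption of this sketch, not a documented default.
     */
    static String charsetFor(Locale locale, String propertiesText, String fallback) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mapping.getProperty(locale.getLanguage(), fallback);
    }

    public static void main(String[] args) {
        String shipped = "zh=GB2312\n";   // entry as installed, per the text above
        String modified = "zh=GB18030\n"; // entry after the suggested change

        System.out.println(charsetFor(Locale.SIMPLIFIED_CHINESE, shipped, "ISO-8859-1"));
        System.out.println(charsetFor(Locale.SIMPLIFIED_CHINESE, modified, "ISO-8859-1"));
    }
}
```

With the modified entry, a request whose locale language is `zh` resolves to GB18030, which matches the value of request.getCharacterEncoding() reported above.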
We enter GB18030 data in a text field of an HTML file, GB18030Form.html. A JSP file, InsData.jsp, reads the input parameters and inserts them into the database via JDBC.
GB18030Form.html looks like this:

Figure 1
Netscape’s View > Character Encoding is set to “Simplified Chinese (GB18030)”:

Figure 2
We enter the test character U+3401 and then open DB2 Control Center to view the inserted data:

Figure 3
However, if the encoding.properties file still contains “zh=GB2312”, the result is:

Figure 4
The following table summarizes the results for the different configurations:
| Browser's View > Encoding | encoding.properties: zh=GB2312 | encoding.properties: zh=GB18030 |
| --- | --- | --- |
| Simplified Chinese (GB2312) | Fail (Figure 4) | Fail (Figure 4) |
| Simplified Chinese (GB18030) | Fail (Data damaged with REPLACEMENT CHARACTER U+FFFD) | Success (Figure 3) |
| Western Europe (Windows) | Fail (Figure 4) | Fail (Figure 4) |
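The U+FFFD damage in the GB18030/GB2312 cell can be reproduced outside the application server. The sketch below uses only the JDK: it encodes U+3401 as GB18030, as a browser set to that encoding would, and then decodes the same bytes with each charset the server might assume. The class and method names are hypothetical, and plain string decoding only approximates what the web container does with request parameters.

```java
import java.nio.charset.Charset;

public class EncodingMismatchDemo {

    /** Decodes the bytes sent by the client using the charset the server assumes. */
    static String decodeAs(byte[] wire, String serverCharset) {
        // Malformed or unmappable input is replaced with U+FFFD by this constructor.
        return new String(wire, Charset.forName(serverCharset));
    }

    /** Formats bytes as space-separated hexadecimal for display. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\u3401"; // the test character used in the article
        byte[] wire = input.getBytes(Charset.forName("GB18030"));
        System.out.println("GB18030 bytes on the wire: " + hex(wire));

        for (String server : new String[] {"GB18030", "GB2312"}) {
            String seen = decodeAs(wire, server);
            System.out.println("server assumes " + server
                    + ": intact=" + input.equals(seen)
                    + ", contains U+FFFD=" + (seen.indexOf('\uFFFD') >= 0));
        }
    }
}
```

U+3401 needs a four-byte GB18030 sequence that GB2312 cannot interpret, so the GB2312 decoder substitutes replacement characters, while the GB18030 decoder recovers the original character.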
Unicode (UTF-8) environment:
To enable WAS 5.0 to run in a UTF-8 environment, configure it as follows:
Specify -Dclient.encoding.override=UTF-8 as a JVM argument for WAS. This property overrides any client preferences for parsing client input values, so the configuration in the encoding.properties file is ignored.
With the above configuration, the value of request.getCharacterEncoding() is UTF-8.
Running GB18030Form.html in WAS and repeating the same steps as in the GB18030 environment gives the results in the following table:
| Browser's View > Encoding | Result |
| --- | --- |
| Simplified Chinese (GB2312) | Fail (Figure 4) |
| Simplified Chinese (GB18030) | Fail (Data is damaged) |
| Western Europe (Windows) | Fail (Figure 4) |
| UTF-8 | Success (Figure 3) |
Note that we get the expected result shown in Figure 3 only when the browser’s View > Encoding setting is identical to the encoding of the WAS environment.
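The matching rule can be checked with a small simulation of a form submission. In this sketch, `URLEncoder` stands in for the browser percent-encoding the field in its selected encoding, and `URLDecoder` stands in for the server parsing the parameter with its configured encoding. The class and method names are hypothetical, and real browsers may handle characters they cannot encode differently from `URLEncoder`, so the sketch covers only GB18030 and UTF-8, which can both represent U+3401.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormRoundTripSketch {

    /** Percent-encodes a form value the way a client using the given charset would. */
    static String encodeForm(String value, String clientCharset) {
        try {
            return URLEncoder.encode(value, clientCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + clientCharset, e);
        }
    }

    /** Returns true if the value survives client encoding followed by server decoding. */
    static boolean roundTrips(String value, String clientCharset, String serverCharset) {
        String onTheWire = encodeForm(value, clientCharset);
        try {
            String received = URLDecoder.decode(onTheWire, serverCharset);
            return value.equals(received);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + serverCharset, e);
        }
    }

    public static void main(String[] args) {
        String input = "\u3401";
        String[] charsets = {"GB18030", "UTF-8"};
        for (String client : charsets) {
            for (String server : charsets) {
                System.out.println("browser=" + client + ", server=" + server
                        + " -> " + (roundTrips(input, client, server) ? "Success" : "Fail"));
            }
        }
    }
}
```

Only the two combinations in which the browser and server charsets agree report Success, which mirrors the Success rows of both tables.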