Java character encodings

Java™ programs can convert data in different formats, enabling your applications to transfer and use information from many kinds of international character sets.

Internally, the Java virtual machine (JVM) always operates with data in Unicode. By default, however, character data transferred into or out of the JVM is in the encoding named by the file.encoding property. Data read into the JVM is converted from file.encoding to Unicode, and data sent out of the JVM is converted from Unicode to file.encoding.
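The conversion described above can be observed from inside a program. The following is a minimal sketch, not taken from the product documentation: the class name DefaultEncodingDemo and its helper methods are hypothetical, and the charsets are passed explicitly so the result does not depend on the file.encoding value of the JVM that runs it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    // Returns the file.encoding system property as the running JVM reports it.
    static String fileEncodingProperty() {
        return System.getProperty("file.encoding");
    }

    // Unicode text -> bytes: the direction used when data leaves the JVM.
    static byte[] toBytes(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // Bytes -> Unicode text: the direction used when data enters the JVM.
    static String fromBytes(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        System.out.println("file.encoding   = " + fileEncodingProperty());
        System.out.println("default charset = " + Charset.defaultCharset());

        // U+00E9 (e with acute accent) is one Unicode character in the JVM,
        // but its byte form differs from one encoding to another.
        String text = "\u00e9";
        System.out.println("ISO-8859-1 bytes: " + toBytes(text, StandardCharsets.ISO_8859_1).length);
        System.out.println("UTF-8 bytes:      " + toBytes(text, StandardCharsets.UTF_8).length);
    }
}
```

The same string occupies one byte in ISO-8859-1 and two bytes in UTF-8, which is why the encoding in effect matters whenever character data crosses the JVM boundary.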

Data files for Java programs are stored in the integrated file system. Files in the integrated file system are tagged with a coded character set identifier (CCSID) that identifies the character encoding of the data contained in the file.

When data is read by a Java program, it is expected to be in the character encoding matching file.encoding. When data is written to a file by a Java program, it is written in a character encoding matching file.encoding. This also applies to Java source code files (.java files) processed by the javac command and to data sent and received through Transmission Control Protocol/Internet Protocol (TCP/IP) sockets using the java.net package.
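To illustrate why the data must actually be in the expected encoding, the sketch below encodes text in one charset and decodes it in another. The class EncodingMismatchDemo and the method reinterpret are hypothetical names chosen for this example, and UTF-8 and ISO-8859-1 merely stand in for whatever encodings are involved on a given system.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Encodes text as if written with one charset, then decodes the bytes
    // as if read with another. When the two differ, the text can be corrupted.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] onDisk = text.getBytes(writtenAs);
        return new String(onDisk, readAs);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";

        // Matching encodings: the text survives the round trip.
        String same = reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + same.equals(original));

        // Mismatched encodings: the two UTF-8 bytes of U+00E9 are decoded
        // as two separate ISO-8859-1 characters.
        String garbled = reinterpret(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("length before: " + original.length() + ", after: " + garbled.length());
    }
}
```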

Data read from System.in or written to System.out and System.err is handled differently from other data when these streams are assigned to stdin, stdout, and stderr. Because stdin, stdout, and stderr are normally attached to EBCDIC devices on the IBM® i server, the JVM performs an additional conversion between the file.encoding character encoding and a CCSID matching the IBM i job CCSID. When System.in, System.out, or System.err is redirected to a file or socket instead of stdin, stdout, or stderr, this additional conversion is not performed and the data remains in a character encoding matching file.encoding.

When a Java program must read or write data in a character encoding other than file.encoding, the program can use the Java IO classes java.io.InputStreamReader, java.io.FileReader, java.io.OutputStreamWriter, and java.io.FileWriter. These Java classes allow specifying a character encoding that takes precedence over the default file.encoding property currently in use by the JVM.
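As a minimal sketch of this approach, the example below writes and reads a temporary file through java.io.OutputStreamWriter and java.io.InputStreamReader with an explicitly supplied charset. The class ExplicitEncodingDemo and its methods are hypothetical names, and ISO-8859-1 is only an illustrative choice. Another encoding could be obtained with Charset.forName, assuming the JVM in use supports it.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ExplicitEncodingDemo {
    // Writes text using the given charset instead of the JVM default.
    static void writeText(Path file, String text, Charset charset) throws IOException {
        try (Writer out = new OutputStreamWriter(new FileOutputStream(file.toFile()), charset)) {
            out.write(text);
        }
    }

    // Reads the whole file, decoding its bytes with the given charset.
    static String readText(Path file, Charset charset) throws IOException {
        StringBuilder result = new StringBuilder();
        try (Reader in = new InputStreamReader(new FileInputStream(file.toFile()), charset)) {
            char[] buffer = new char[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                result.append(buffer, 0, count);
            }
        }
        return result.toString();
    }

    // Writes text to a temporary file and reads it back with the same charset.
    static String roundTrip(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".txt");
            try {
                writeText(tmp, text, charset);
                return readText(tmp, charset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Gr\u00fc\u00dfe";
        String back = roundTrip(text, StandardCharsets.ISO_8859_1);
        System.out.println("round trip intact: " + text.equals(back));
    }
}
```

Because the charset is passed to the reader and writer directly, the result does not depend on the file.encoding value the JVM was started with.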

Data transferred to or from the DB2® database is converted to or from the CCSID of the IBM i database through the JDBC APIs.

Data that is transferred to or from other programs through the Java Native Interface (JNI) is not converted.