Method 10 for EUC to PC conversions
The input values for the conversion in this case are four bytes, rather than the two bytes used for the PC input in the previous example. This results in a conversion table construction that is more complex than the PC-to-EUC case. The following description applies equally to all tables dealing with four-byte input values, namely those of EUC and TCP/IP CCSIDs.
There are four levels of tables within the constructed conversion table, where each table corresponds to one input byte value of the four input bytes per character.
- Level 0 tables (B0): Only one table can be constructed at this level. Byte 0 (the first byte) of the input code point is used to index into the B0 table and retrieve a pointer to the B1 level tables. Table B0 is 256 bytes long.
- Level 1 tables (B1): There is one B1 table for each valid entry in the B0 table, plus one table to contain all of the invalid entries for B0. The first four bytes of each B1 table are used as a pointer, b2pt, to a corresponding group of B2 tables. The second byte of the input code point (byte 1) is used as an index into B1 to retrieve the index number for the B2 table within the group of B2 tables pointed to by the b2pt value. Each B1 table is 260 bytes long.
- Level 2 tables (B2): There is one group of B2 tables for each B1 table. The first four bytes of each B2 table are used as a pointer, b3pt, to a corresponding group of B3 tables. The third byte of the input code point (byte 2) is used as an index into B2 to retrieve the index number for the B3 table within the group of B3 tables pointed to by the b3pt value. Each B2 table is 260 bytes long.
- Level 3 tables (B3): There is one group of B3 tables for each B2 table. The fourth byte of the input code point (byte 3, where byte 0 is the first byte) is used as an index into the B3 table to retrieve the final conversion value. Each B3 table is 512 bytes long.
An index value of 0 corresponds to the first table in the group.
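The four-level lookup described above can be sketched as follows. This is a minimal illustration, not the architected implementation. It assumes that the whole conversion table is held in one byte buffer, that b2pt and b3pt are big-endian byte offsets into that buffer, that the B1 tables are stored contiguously starting at a known offset (the hypothetical parameter b1_base), and that each B3 entry is a big-endian two-byte value. The names lookup, b1_base, PTR_TABLE_SIZE, and B3_SIZE are illustrative only.

```python
import struct

PTR_TABLE_SIZE = 260  # B1 and B2: 4-byte pointer followed by 256 one-byte indexes
B3_SIZE = 512         # B3: 256 two-byte output values


def lookup(table: bytes, b1_base: int, code: bytes) -> int:
    """Convert one normalized four-byte input code point to a two-byte output value.

    Assumptions (not stated in the text): pointers are big-endian offsets into
    `table`, and B1 tables are contiguous starting at offset `b1_base`.
    """
    if len(code) != 4:
        raise ValueError("input code point must be normalized to four bytes")
    byte0, byte1, byte2, byte3 = code

    # Level 0: byte 0 selects the B1 table (the B0 table occupies offsets 0..255).
    b1 = b1_base + table[byte0] * PTR_TABLE_SIZE

    # Level 1: the first four bytes are b2pt; byte 1 selects the B2 table in that group.
    (b2pt,) = struct.unpack_from(">I", table, b1)
    b2 = b2pt + table[b1 + 4 + byte1] * PTR_TABLE_SIZE

    # Level 2: the first four bytes are b3pt; byte 2 selects the B3 table in that group.
    (b3pt,) = struct.unpack_from(">I", table, b2)
    b3 = b3pt + table[b2 + 4 + byte2] * B3_SIZE

    # Level 3: byte 3 selects one of the 256 two-byte output values.
    (output,) = struct.unpack_from(">H", table, b3 + 2 * byte3)
    return output
```

In each group, index 0 addresses the first table, so an index entry of 0 in B1 or B2 resolves to the table located exactly at b2pt or b3pt.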
Figure 62. Method 10: EUC to PC Conversion
The method shown in Figure 62 has the following characteristics:
- It is used for the conversion of data from an input EUC CCSID to an output PC CCSID.
- The valid encoding scheme for input data is X'4403', while the valid schemes for output data are X'2100', X'3100', X'2200', X'2300', X'2305', X'3200', and X'3300'.
- The input bytes are always received in a normalized four-byte format.
- The conversion table accepts single-byte, double-byte, or triple-byte code points from the input CS, CP pair, as defined by the EUC encoding scheme, and converts each to a single-byte or double-byte CS, CP code point output value.
- The content of the table will reflect the criteria used for:
  - Matched GCGID priority within the target CS, CP
  - Mismatch management
  - Space character management.
- For most EUC four-byte codes, only a certain range of code point values is valid for the three high-order bytes. The tables are therefore organized as several subtables. Subtable pointer tables contain entries that point to a pool of subtables. The lowest-level subtable points to a series of records containing 256 double-byte code point values used as output.
- Invalid single-byte input values will be mapped to the single-byte SUB character for the PC, which is X'7F'. All other invalid input values will be mapped to the double-byte SUB for the respective PC mixed CCSID.
- Only a triple-byte CS, CP pair will use all four bytes of the input code point.
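Because only limited ranges of the high-order bytes are valid, the number of subtables stays small, and the per-level sizes given earlier yield a rough estimate of the total table size. The sketch below assumes that the constructed table consists only of the B0, B1, B2, and B3 tables, with no header or padding; the function name table_size and its parameters are illustrative.

```python
def table_size(n_b1: int, n_b2: int, n_b3: int) -> int:
    """Estimate the size in bytes of a four-level conversion table.

    Assumes one 256-byte B0 table, 260-byte B1 and B2 tables, and 512-byte
    B3 tables, with no header or padding (an assumption, not stated in the text).
    """
    return 256 + 260 * (n_b1 + n_b2) + 512 * n_b3
```

For example, a table with one B1 table, one B2 table, and two B3 tables would occupy 256 + 260 + 260 + 1024 bytes under these assumptions.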