Assembly differences related to square brackets between TPF 4.1 and z/TPF
Why do z/TPF assemblies generate different object code than TPF 4.1 assemblies for square bracket characters ('[' and ']')? In TPF 4.1, a '[' generates x'AD' and ']' generates x'BD', but on z/TPF, a '[' generates x'BA' and ']' generates x'BB'.
The short explanation is that the High Level Assembler (HLASM) assembles code based on EBCDIC code page 037. This is error prone when the source is handled by editors and utilities, such as ftp, that use a different EBCDIC code page (such as IBM 1047).
The longer explanation is as follows.
This behavior is due to the source editor, the hexadecimal data used to represent the square brackets in the source file, and the code page used by the editor. For TPF 4.1, HLASM is executed on z/OS. In z/OS the program source is in a dataset and it is in EBCDIC. Assume that the following instruction is in the program file:
CONSTANT DC C'[AB]'
When HLASM creates an object for this constant, it simply takes the hexadecimal data that is in the source. Assume that the hexadecimal data for the [AB] is x'ADC1C2BD'. In this case, HLASM puts x'ADC1C2BD' into the object.
So how did the square bracket [ become x'AD' in the source? This is dependent on the code page that was used by the editor. Likely, code page IBM 1047 was used. IBM 1047 results in '[' being represented as x'AD'.
Next, let's discuss how the process works for z/TPF on Linux. First, the source is saved in ASCII. Using the same example, the [AB] is ASCII x'5B41425D'. Next, HLASM on Linux uses EBCDIC the same way that it does on z/OS. This means that the source must be converted from ASCII to EBCDIC before HLASM does its work. This ASCII to EBCDIC conversion is done using EBCDIC code page 037. Going back to the example, the [AB] (ASCII x'5B41425D') is converted to x'BAC1C2BB' before it is input into HLASM on Linux. As mentioned previously, on a DC C'[AB]' HLASM takes the hexadecimal input from the source and puts it into the object. Therefore, x'BAC1C2BB' will be the object.
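The conversion described above can be reproduced outside the assembler. The following is a minimal Python sketch, assuming that Python's built-in cp037 codec matches the code page 037 table used for the ASCII to EBCDIC conversion. The function name to_cp037_hex is made up for this illustration and is not part of HLASM or z/TPF.

```python
# Illustration only: reproduce the ASCII -> EBCDIC 037 step with Python's
# built-in cp037 codec (assumed to match the table applied before HLASM runs).

def to_cp037_hex(text: str) -> str:
    """Return the EBCDIC code page 037 bytes of text as uppercase hex."""
    return text.encode("cp037").hex().upper()

source_constant = "[AB]"

# Bytes as stored in the ASCII source file on Linux
ascii_hex = source_constant.encode("ascii").hex().upper()

# Bytes that HLASM receives after conversion with code page 037
ebcdic_hex = to_cp037_hex(source_constant)

print(ascii_hex)   # 5B41425D
print(ebcdic_hex)  # BAC1C2BB
```

The second value matches the object code that z/TPF assemblies produce, not the x'ADC1C2BD' that results from an editor using IBM 1047.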
How can a square bracket [ in source code generate x'AD' when HLASM is run on Linux? This question was raised with the HLASM team to determine whether a different code page can be used on Linux for the ASCII to EBCDIC conversion. Unfortunately, the answer is no: only EBCDIC code page 037 is used. This means that a square bracket [ in the source always generates x'BA' when HLASM is run on Linux.
So what can you do to solve this incompatibility? One possibility would be to make sure the editor and utilities are using EBCDIC code page 037.
The more robust way to handle this inconsistency is to use hexadecimal data types instead of character data types (so that it is not translated during assembly).
For example, if you need to have a square bracket [ be x'AD' and ] be x'BD', you could use hexadecimal data types as follows. Going back to the previous example, the code can be changed to be:
CONSTANT DC 0CL4
         DC X'AD' square bracket [ for IBM 1047 code page
         DC C'AB'
         DC X'BD' square bracket ] for IBM 1047 code page
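When many constants need this treatment, the split into hexadecimal and character DC statements can be generated mechanically. The following Python sketch is a hypothetical helper, not an IBM-supplied tool. It assumes that only the two square brackets need fixed values (the x'AD' and x'BD' quoted above for IBM 1047) and that every character in the constant occupies one byte. Other characters may also differ between the two code pages and are not handled here.

```python
# Hypothetical helper: rewrite a character constant so that square brackets
# are emitted as hexadecimal DC statements. The x'AD' and x'BD' values are
# the IBM 1047 code points quoted in this document.
import re

BRACKET_HEX = {"[": "AD", "]": "BD"}

def split_constant(label, text):
    """Return HLASM source lines equivalent to: label DC C'text'."""
    if not text:
        raise ValueError("constant must not be empty")
    # Zero-duplication DC gives the label the length of the whole constant.
    lines = [f"{label:<8} DC 0CL{len(text)}"]
    # Split into single bracket characters and runs of everything else.
    for piece in re.findall(r"[\[\]]|[^\[\]]+", text):
        if piece in BRACKET_HEX:
            operand = f"X'{BRACKET_HEX[piece]}'"
        else:
            # Apostrophes and ampersands are doubled inside C'...' operands.
            escaped = piece.replace("'", "''").replace("&", "&&")
            operand = f"C'{escaped}'"
        # Unlabeled statements must not start in column 1.
        lines.append(f"{'':<8} DC {operand}")
    return lines

for line in split_constant("CONSTANT", "[AB]"):
    print(line)
```

For the example constant, the helper prints the 0CL4 header followed by DC X'AD', DC C'AB', and DC X'BD', each indented so that DC is not read as a label.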
Another example would be if there was code such as the following:
It could be changed to:
One other alternative would be to update the source to use the characters that are converted to x'AD' and x'BD' under EBCDIC code page 037. The following example uses the character that generates x'AD' in place of '[' and the character that generates x'BD' in place of ']'.
CONSTANT DC C'ÝAB¨' this is actually [AB]
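The substitute characters can be checked before the source is changed. The following minimal Python sketch assumes that Python's built-in cp037 codec matches the conversion applied ahead of HLASM, and that the source file is stored in an encoding in which these characters survive that conversion. Both are assumptions to verify in your own environment.

```python
# Illustration only: check which bytes code page 037 assigns to the
# substitute characters, using Python's built-in cp037 codec.

# U+00DD (Ý) and U+00A8 (¨) stand in for '[' and ']' in the source.
substitute = "\u00ddAB\u00a8"

# Conversion assumed to be applied before HLASM reads the statement
encoded_hex = substitute.encode("cp037").hex().upper()
print(encoded_hex)  # ADC1C2BD

# Decoding the same bytes with cp037 returns the substitute characters,
# confirming that the mapping is one-to-one for this constant.
round_trip = bytes.fromhex(encoded_hex).decode("cp037")
print(round_trip == substitute)  # True
```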