Understanding libiconv

This section covers the conversion subroutines of the iconv application programming interface (API).

The iconv API consists of the following subroutines that accomplish conversion:

iconv_open
Performs the initialization required to convert characters from the code set specified by the FromCode parameter to the code set specified by the ToCode parameter. The strings specified are dependent on the converters installed in the system. If initialization is successful, the converter descriptor, iconv_t, is returned in its initial state.
iconv
Invokes the converter function using the descriptor obtained from the iconv_open subroutine. The inbuf parameter points to the first character in the input buffer, and the inbytesleft parameter indicates the number of bytes to the end of the buffer being converted. The outbuf parameter points to the first available byte in the output buffer, and the outbytesleft parameter indicates the number of available bytes to the end of the buffer.

For state-dependent encoding, the subroutine is placed in its initial state by a call for which the inbuf value is a null pointer. Subsequent calls with the inbuf parameter as something other than a null pointer cause the internal state of the function to be altered as necessary.

iconv_close
Closes the conversion descriptor specified by the cd variable and releases the resources associated with it.
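
The three subroutines are typically used together in an open, convert, close sequence. The following is a minimal sketch in C, not a definitive implementation. The helper name convert_buffer is hypothetical, and the code set names a caller passes depend on the converters installed on the system. The sketch assumes the POSIX prototype, in which the inbuf argument has type char **; some implementations declare it as const char **.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not part of the iconv API): converts inlen bytes at
 * `in` from code set `from` to code set `to`, writing at most outcap bytes
 * to `out`. Returns the number of bytes written, or (size_t)-1 on error. */
size_t convert_buffer(const char *to, const char *from,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    /* Note the argument order: ToCode first, then FromCode. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1) {
        return (size_t)-1;          /* no such converter installed */
    }

    char *inbuf = (char *)in;       /* iconv advances this pointer */
    size_t inbytesleft = inlen;
    char *outbuf = out;             /* first available output byte */
    size_t outbytesleft = outcap;
    size_t written = (size_t)-1;

    /* A null inbuf places the descriptor in its initial state. This is
     * redundant right after iconv_open and is shown only for illustration. */
    iconv(cd, NULL, NULL, NULL, NULL);

    if (iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft) != (size_t)-1) {
        /* For state-dependent encodings, a null inbuf with a valid outbuf
         * writes any sequence needed to return to the initial state. */
        if (iconv(cd, NULL, NULL, &outbuf, &outbytesleft) != (size_t)-1) {
            written = outcap - outbytesleft;
        }
    }

    iconv_close(cd);
    return written;
}
```

For example, on a system where converters named "ISO-8859-1" and "UTF-8" are installed (an assumption; names vary by platform), converting the 4 ISO-8859-1 bytes of "café" to UTF-8 yields 5 bytes, because the single byte 0xE9 becomes the two bytes 0xC3 0xA9.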

In a network environment, the following factors determine how data should be converted:

  • Code sets of the sender and the receiver
  • Communication protocol (8-bit or 7-bit data)

The following tables outline the conversion methods and recommend how to convert data in different situations. The first table applies when the sender and the receiver use the same code set.

Method to choose   7-bit only protocol   8-bit protocol
as is              not valid             best choice
fold7              OK                    OK
fold8              not valid             OK
uucode             best choice           OK

The following table covers communication with a system that uses a different code set, or cases where the receiver's code set is unknown.

Method to choose   7-bit only protocol   8-bit protocol
as is              not valid             not valid if remote code set is unknown
fold7              best choice           OK
fold8              not valid             best choice
uucode             not valid             not valid

If the sender uses the same code set as the receiver, the following possibilities exist:

  • When the protocol allows 8-bit data, the data can be sent without conversion.
  • When the protocol allows only 7-bit data, the 8-bit code points must be mapped to 7-bit values. Use the iconv interface and one of the following methods:
    uucode
    Provides the same mapping as the uuencode and uudecode commands. This is the recommended method.
    7-bit (fold7)
    Converts internal code sets using 7-bit data. This method passes ASCII without any change.
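
To illustrate how 8-bit code points can be mapped to 7-bit values, the following sketch shows the body mapping used by the traditional uuencode command: three 8-bit bytes are split into four 6-bit groups, and 0x20 is added to each group to produce a printable character. This is an illustration under stated assumptions, not the source of the uucode converter. The function name uu_encode_triplet is hypothetical, and the sketch omits the per-line length character and the header and trailer lines that the uuencode command also writes.

```c
#include <stddef.h>

/* Hypothetical helper: maps three 8-bit bytes to four 7-bit printable
 * characters by splitting the 24 input bits into four 6-bit groups and
 * adding 0x20 to each group. Every result lies in the range 0x20..0x5F. */
void uu_encode_triplet(const unsigned char in[3], char out[4])
{
    out[0] = (char)(0x20 + (in[0] >> 2));
    out[1] = (char)(0x20 + (((in[0] & 0x03) << 4) | (in[1] >> 4)));
    out[2] = (char)(0x20 + (((in[1] & 0x0F) << 2) | (in[2] >> 6)));
    out[3] = (char)(0x20 + (in[2] & 0x3F));
}
```

With this mapping, the bytes of "Cat" become the characters "0V%T". Many uuencode implementations write a grave accent (0x60) instead of a space for a zero-valued group; the sketch does not do this.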

If the sender uses a code set different from the receiver's, there are two possibilities:

  • When the protocol allows only 7-bit data, use the fold7 method.
  • When the protocol allows 8-bit data and you know the receiver's code set, use the iconv interface to convert the data directly. If you do not know the receiver's code set, use the following method:
    8-bit (fold8)
    Converts internal code sets to standard interchange formats. The 8-bit data is transmitted and the information is preserved so that the receiver can reconstruct the data in its code set.
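
The recommendations above can be summarized as a small decision function. This is a sketch of the guidance in this section only; the enum and function names are hypothetical and do not belong to the iconv API. The "7-bit" and "8-bit" methods are assumed to correspond to the fold7 and fold8 entries in the tables.

```c
/* Hypothetical names for the choices described in this section. */
enum method {
    METHOD_AS_IS,    /* send the data without conversion */
    METHOD_UUCODE,   /* same mapping as uuencode and uudecode */
    METHOD_FOLD7,    /* 7-bit interchange format */
    METHOD_FOLD8,    /* 8-bit interchange format */
    METHOD_DIRECT    /* iconv from the sender's to the receiver's code set */
};

/* Each argument is a flag: nonzero means true. */
enum method choose_method(int same_code_set, int eight_bit_protocol,
                          int receiver_code_set_known)
{
    if (same_code_set) {
        /* Same code set: only the protocol width matters. */
        return eight_bit_protocol ? METHOD_AS_IS : METHOD_UUCODE;
    }
    if (!eight_bit_protocol) {
        /* Different code sets over a 7-bit protocol. */
        return METHOD_FOLD7;
    }
    /* Different code sets over an 8-bit protocol. */
    return receiver_code_set_known ? METHOD_DIRECT : METHOD_FOLD8;
}
```

The third flag is consulted only when the code sets differ and the protocol carries 8-bit data, which mirrors the order in which the text presents the cases.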