Introduction to character conversion

In computers, all characters are encoded according to the rules of a particular encoding scheme and code page. If your database and applications handle data from multiple code pages, that data might be converted at certain times from one code page to another. This conversion process is called character conversion.
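As a rough illustration of what a conversion involves, the following Python sketch re-encodes the same text between code pages. It is only a sketch: Python's codec names (`latin-1`, `cp037`, `utf-8`) stand in for code pages, and the `convert` helper is a hypothetical name introduced here, not part of any database product.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded bytes from one code page to another.

    Hypothetical helper for illustration: the bytes are decoded to
    Unicode text and then re-encoded in the target code page.
    """
    text = data.decode(source)
    return text.encode(target)


# The same four characters, stored in three different encodings.
latin1_bytes = "café".encode("latin-1")                    # b'caf\xe9'
ebcdic_bytes = convert(latin1_bytes, "latin-1", "cp037")   # EBCDIC code page 037
utf8_bytes = convert(latin1_bytes, "latin-1", "utf-8")     # b'caf\xc3\xa9'

# The text is unchanged, but the stored byte values differ per code page.
print(latin1_bytes, ebcdic_bytes, utf8_bytes)
```

Every character in this example exists in all three code pages, so the conversion can be reversed without loss.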

You are likely to handle data from multiple code pages if your database and applications contain international data or data from multiple character sets, such as Latin-1 and Katakana. In this situation, character conversions are likely to occur.

Character conversions can degrade performance and can cause data loss when a character in the source code page has no equivalent in the target code page. Therefore, avoid these conversions if possible. One way to avoid them is to store all of your data in one code page. If you use multiple character sets, you might consider using Unicode, which includes the characters of virtually all writing systems in use today. If you use Unicode for all of your data, conversions can be avoided. However, converting all of your data to Unicode is not a simple process.
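The data-loss risk can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of how any database performs conversions: the `shift_jis` codec stands in for a Katakana-capable code page, and `errors="replace"` mimics the use of a substitution character for characters that the target code page cannot represent.

```python
katakana = "カタカナ"

# Data as it might be stored in a Japanese code page.
sjis_bytes = katakana.encode("shift_jis")

# Convert to Latin-1, which has no Katakana characters. Each character
# that cannot be represented is replaced with '?'.
latin1_bytes = sjis_bytes.decode("shift_jis").encode("latin-1", errors="replace")
lossy_text = latin1_bytes.decode("latin-1")

# Convert to Unicode (UTF-8) instead: every character survives.
utf8_bytes = sjis_bytes.decode("shift_jis").encode("utf-8")
round_trip_text = utf8_bytes.decode("utf-8")

print(lossy_text == katakana)       # False: the original text is gone
print(round_trip_text == katakana)  # True: no loss through Unicode
```

Once the substitution has happened, the original characters cannot be recovered from the converted data, which is why a single code page that covers all of your characters avoids the problem.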

This information discusses basic principles about character conversion and general recommendations that you can apply to your environment for optimal performance and storage.