Code pages and CCSIDs

Because computers store only numbers, they represent letters and other characters by assigning a number to each one. Which number maps to which character depends on the code page and CCSID that are associated with the data.

A code page is a mapping of hexadecimal numbers to particular characters. For example, the following table shows code page 37.

Table 1. Code page 37 with CCSID 37
2nd↓ 1st→	4-	5-	6-	7-	8-	9-	A-	B-	C-	D-	E-	F-
-0	(sp)	&	-	ø	Ø	°	µ	^	{	}	\	0
-1	(rsp)	é	/	É	a	j	~	£	A	J	÷	1
-2	â	ê	Â	Ê	b	k	s	¥	B	K	S	2
-3	ä	ë	Ä	Ë	c	l	t	·	C	L	T	3
-4	à	è	À	È	d	m	u	©	D	M	U	4
-5	á	í	Á	Í	e	n	v	§	E	N	V	5
-6	ã	î	Ã	Î	f	o	w	¶	F	O	W	6
-7	å	ï	Å	Ï	g	p	x	¼	G	P	X	7
-8	ç	ì	Ç	Ì	h	q	y	½	H	Q	Y	8
-9	ñ	ß	Ñ	`	i	r	z	¾	I	R	Z	9
-A	¢	!	¦	:	«	ª	¡	[	- (SHY)	¹	²	³
-B	.	$	,	#	»	º	¿	]	ô	û	Ô	Û
-C	<	*	%	@	ð	æ	Ð	¯	ö	ü	Ö	Ü
-D	(	)	_	'	ý	¸	Ý	¨	ò	ù	Ò	Ù
-E	+	;	>	=	þ	Æ	Þ	´	ó	ú	Ó	Ú
-F	|	¬	?	"	±	¤	®	×	õ	ÿ	Õ	(EO)

Within a code page, the hexadecimal number that represents a character is called a code point. To find the code point for a particular character in a code page table, concatenate the column header (the first hexadecimal digit) with the row header (the second hexadecimal digit). For example, find the character 'A' in the preceding code page 37. The character 'A' is in column C and row 1. Therefore, the corresponding code point for the character 'A' is X'C1'. As another example, find the character 'a' in this same code page. The character 'a' is in column 8 and row 1. Therefore, the corresponding code point is X'81'.
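
The same lookup can be reproduced programmatically. The following minimal sketch assumes that Python's built-in cp037 codec corresponds to code page 37; the helper names code_point and table_position are hypothetical and are used only for illustration.

```python
# Sketch: look up code points in code page 37 by using Python's built-in
# "cp037" codec (assumed here to match the code page 37 table).

def code_point(char: str, codec: str = "cp037") -> str:
    """Return the code point of one character in X'..' notation."""
    encoded = char.encode(codec)
    if len(encoded) != 1:
        raise ValueError("expected a character that encodes to a single byte")
    return f"X'{encoded[0]:02X}'"


def table_position(char: str, codec: str = "cp037") -> tuple:
    """Return (column header, row header) as they appear in the table."""
    value = char.encode(codec)[0]
    column = f"{value >> 4:X}-"    # first hexadecimal digit
    row = f"-{value & 0x0F:X}"     # second hexadecimal digit
    return column, row


print(code_point("A"))       # X'C1'
print(code_point("a"))       # X'81'
print(table_position("A"))   # ('C-', '-1')
```

The column header is the high-order four bits of the byte and the row header is the low-order four bits, which is why concatenating the two headers yields the code point.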

A coded character set identifier (CCSID) is a number that identifies an implementation of a code page at a particular point in time. For example, the preceding code page 37, which is the US-English code page, has a CCSID of 37.

CCSIDs are defined by the IBM® Character Data Representation Architecture (CDRA). CDRA is an architecture that aims to achieve consistent representation, processing, and interchange of graphic character data in data processing environments. To achieve this consistency, CDRA defines a set of services, supporting resources, conventions, and identifiers, one of which is the CCSID. IBM maintains a repository of all CCSIDs that are defined by CDRA.

DB2® for z/OS® uses CCSIDs. However, DB2 for Linux, UNIX, and Windows uses code pages. The difference between code pages and CCSIDs is stability. A code page might change over time. However, because a CCSID captures a code page at a particular point in time, the mapping that the CCSID identifies does not change.

For example, consider code page 37. At some point, this code page was changed so that code point X'9F' no longer mapped to the international currency symbol (¤). Instead, this code point was mapped to the euro symbol (€). CCSID 37 refers to the original code page 37. The altered code page has CCSID 1140. CCSID 1140 and CCSID 37 differ by only that one character at code point X'9F'. The following table shows CCSID 1140.

Table 2. Code page 37 with CCSID 1140
2nd↓ 1st→	4-	5-	6-	7-	8-	9-	A-	B-	C-	D-	E-	F-
-0	(sp)	&	-	ø	Ø	°	µ	^	{	}	\	0
-1	(rsp)	é	/	É	a	j	~	£	A	J	÷	1
-2	â	ê	Â	Ê	b	k	s	¥	B	K	S	2
-3	ä	ë	Ä	Ë	c	l	t	·	C	L	T	3
-4	à	è	À	È	d	m	u	©	D	M	U	4
-5	á	í	Á	Í	e	n	v	§	E	N	V	5
-6	ã	î	Ã	Î	f	o	w	¶	F	O	W	6
-7	å	ï	Å	Ï	g	p	x	¼	G	P	X	7
-8	ç	ì	Ç	Ì	h	q	y	½	H	Q	Y	8
-9	ñ	ß	Ñ	`	i	r	z	¾	I	R	Z	9
-A	¢	!	¦	:	«	ª	¡	[	- (SHY)	¹	²	³
-B	.	$	,	#	»	º	¿	]	ô	û	Ô	Û
-C	<	*	%	@	ð	æ	Ð	¯	ö	ü	Ö	Ü
-D	(	)	_	'	ý	¸	Ý	¨	ò	ù	Ò	Ù
-E	+	;	>	=	þ	Æ	Þ	´	ó	ú	Ó	Ú
-F	|	¬	?	"	±	€	®	×	õ	ÿ	Õ	(EO)
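
The claim that CCSID 37 and CCSID 1140 differ at only one code point can be checked with a short script. This sketch assumes that Python's built-in cp037 and cp1140 codecs correspond to CCSID 37 and CCSID 1140; the function name differing_code_points is hypothetical.

```python
# Sketch: compare two single-byte mappings code point by code point.
# Assumption: Python's "cp037" and "cp1140" codecs match CCSID 37 and 1140.

def differing_code_points(codec_a: str = "cp037", codec_b: str = "cp1140"):
    """Return (code point, char in codec_a, char in codec_b) for each mismatch."""
    differences = []
    for value in range(256):
        raw = bytes([value])
        char_a = raw.decode(codec_a)
        char_b = raw.decode(codec_b)
        if char_a != char_b:
            differences.append((f"X'{value:02X}'", char_a, char_b))
    return differences


for code_point, old_char, new_char in differing_code_points():
    print(code_point, old_char, new_char)
```

Under these assumptions, the loop reports a single line: X'9F', with the international currency symbol in the first mapping and the euro symbol in the second.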

The exception to this idea of fixed CCSIDs is the set of CCSIDs that DB2 for z/OS uses for Unicode data. For Unicode data, DB2 for z/OS uses CCSIDs that can grow as the Unicode standard grows. For more information about those CCSIDs, see Unicode CCSIDs.

In DB2 for z/OS, all character data is associated with a CCSID. If the data does not have one, DB2 uses the subsystem defaults. You specify these subsystem default CCSID values when you install DB2. Character conversion is described in terms of CCSIDs of the source and target.
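
As an illustration of conversion that is described by a source CCSID and a target CCSID, the following sketch decodes bytes from the source mapping and re-encodes them in the target mapping. It is not how DB2 performs conversion internally. The CCSID_TO_CODEC table and the convert function are hypothetical, and the pairing of each CCSID with a Python codec name is an assumption made for this example.

```python
# Sketch: character conversion expressed as (source CCSID, target CCSID).
# Assumption: each CCSID below corresponds to the named Python codec.
CCSID_TO_CODEC = {
    37: "cp037",      # EBCDIC, US English
    1140: "cp1140",   # EBCDIC, US English with the euro symbol
    819: "latin-1",   # ISO 8859-1
    1208: "utf-8",    # Unicode UTF-8
}


def convert(data: bytes, source_ccsid: int, target_ccsid: int) -> bytes:
    """Convert character data from the source CCSID to the target CCSID."""
    text = data.decode(CCSID_TO_CODEC[source_ccsid])
    # Raises UnicodeEncodeError if the target has no code point for a character.
    return text.encode(CCSID_TO_CODEC[target_ccsid])


ebcdic = bytes([0xC1, 0x81])              # 'A' and 'a' in CCSID 37
print(convert(ebcdic, 37, 1208))          # b'Aa'
print(convert("€".encode("utf-8"), 1208, 1140))   # b'\x9f'
```

Converting the euro symbol to CCSID 37 instead of CCSID 1140 fails in this sketch, because the original code page 37 has no code point for that character.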