Extended data classification

Extended data classification provides additional information about a given column. When you run a column analysis with Extended Data Classification enabled, you can analyze and classify the data in each column.

Enabling enhanced data classification when you run a column analysis job
When you run a column analysis job, enable the enhanced data classification option to get extended details about your data. The extended data classification details are shown with the column analysis results. To view the results, go to the Investigate workspace in the Home Navigator menu, and click Column Analysis > Open Column Analysis.
Note: You can publish the data classification and column analysis results if required. When you publish the data classification results, you can see the published data class objects in the Investigate model in the approved state for the corresponding columns Data_field.

Data Classes preferences

The following table shows the data classes preference settings. You can select from the list of displayed fields.
Note: A data class is considered enabled when the Enabled check box for that data class is selected.
Table 1. Data Classes preference settings
| Data class | Description |
| --- | --- |
| American Express Card | Infers whether a column can be considered an American Express credit card number. |
| Boolean | Infers whether a column can contain boolean values. The values can be numeric or alpha codes. For example, 0 or 1, True or False, or Yes or No. |
| Canadian Social Insurance Number | Infers whether a column can be considered a Canadian social insurance number (SIN). |
| Code | A column that contains code values that represent a specific meaning. For example, a column with the class of Code might contain data about the area code in a telephone number. |
| Country Code | Infers whether a column can be considered a country or region code. |
| Country Name | Infers whether a column can be considered a country name. |
| Credit Card Number | Infers whether a column can be considered a credit card number. The Credit Card Number class covers all the credit card data classes, for example American Express Card, Diners Club Card, Discover Card, Japan CB, Master Card, and VISA Card. |
| Date | Infers whether a column can be considered chronological data. For example, a column with the class of Date might contain data such as 10/10/07. Note: All date formats are supported, for example MM-DD-YY, MM/DD/YYYY, and DD-MM-YYYY. |
| Diners Club Card | Infers whether a column can be considered a Diners Club credit card number. |
| Discover Card | Infers whether a column can be considered a Discover credit card number. |
| Email Address | Infers whether a column can be considered an e-mail address. |
| French INSEE Number | Infers whether a column can be considered a French National Institute for Statistics and Economic Studies (INSEE) number. |
| Gender | Infers whether a column can be considered a gender code, such as Male/Female or M/F. |
| Computer Host Name | Infers whether a column can be considered a host name. |
| Identifier | A data value that is used to reference a unique entity. For example, a column with the class of Identifier might be a primary key or contain unique information such as a customer number. |
| Indicator | Infers whether a column contains binary values such as M/F, 0/1, True/False, or Yes/No. |
| Internet Protocol Address | Infers whether a column can be considered an IP address. |
| Internet Protocol Version 6 Address | Infers whether a column can be considered an Internet Protocol version 6 (IPv6) address, the IP address format intended to supplement and eventually replace IPv4. |
| International Standard Book Number | Infers whether a column can be considered an International Standard Book Number (ISBN), which is a unique numeric book identifier. |
| International Securities Identification Number | Infers whether a column can be considered an International Securities Identification Number (ISIN), which uniquely identifies a security such as a bond, stock, or warrant. |
| Italian Fiscal Code | Infers whether a column can be considered an Italian fiscal code. |
| Japan CB | Infers whether a column can be considered a Japan CB credit card number. |
| Large Object | A column whose length is greater than the length threshold. A column defined as a large object is not explicitly profiled. Large Object is assigned to columns that have a BLOB data type. |
| Master Card | Infers whether a column can be considered a Master Card credit card number. |
| Notes Email Address | Infers whether a column can be considered an IBM Lotus Notes® e-mail address. |
| Passport Number | Infers whether a column can be considered a passport number. |
| Quantity | A column that contains data about a numerical value. For example, a column with the class of Quantity might contain data about the price of an object. |
| Spanish Fiscal Identification Number | Infers whether a column can be considered a Spanish fiscal identification number (número de identificación fiscal, or NIF). |
| Text | A column that contains unformatted alphanumeric data and special character data. |
| UK National Insurance Number | Infers whether a column can be considered a UK national insurance number (NINO). |
| Universal Product Code | Infers whether a column can be considered a Universal Product Code (UPC). |
| Uniform Resource Locator | Infers whether a column can be considered a Web address. |
| US Phone Number | Infers whether a column can be considered a U.S. or Canadian telephone number. |
| US Social Security Number | Infers whether a column can be considered a U.S. Social Security number (SSN). |
| US State Code | Infers whether a column can be considered a U.S. state code or abbreviation. |
| US Zip | Infers whether a column can be considered a U.S. postal code. |
| VISA Card | Infers whether a column can be considered a Visa credit card number. |
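
To make the idea of "inferring" a data class more concrete, the following Python sketch shows one way a classifier could decide whether a column fits an enabled data class. It is an illustration only and does not describe how the product implements classification. The names `luhn_valid`, `is_card`, `MATCHERS`, and `infer_data_class`, the 0.9 match threshold, and the simplified patterns (for example, only 16-digit Visa numbers) are all assumptions made for this example.

```python
import re


def luhn_valid(number: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def is_card(value: str, pattern: str) -> bool:
    """Check a card-like value against a prefix/length pattern and Luhn."""
    digits = re.sub(r"[ -]", "", value)  # drop spaces and hyphens
    return re.fullmatch(pattern, digits) is not None and luhn_valid(digits)


# Hypothetical matchers keyed by data class name (simplified patterns).
MATCHERS = {
    "American Express Card": lambda v: is_card(v, r"3[47]\d{13}"),
    "VISA Card": lambda v: is_card(v, r"4\d{15}"),
    "Boolean": lambda v: v.strip().lower()
    in {"0", "1", "true", "false", "yes", "no"},
    "Email Address": lambda v: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v)
    is not None,
}


def infer_data_class(values, enabled, threshold=0.9):
    """Return the enabled data class that matches the largest share of
    non-empty values, or None if no class reaches the threshold."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return None
    best_name, best_share = None, 0.0
    for name, matcher in MATCHERS.items():
        if name not in enabled:  # skip data classes that are not enabled
            continue
        share = sum(1 for v in non_empty if matcher(v)) / len(non_empty)
        if share >= threshold and share > best_share:
            best_name, best_share = name, share
    return best_name
```

In this sketch, a call such as `infer_data_class(column_values, {"Boolean", "Email Address"})` only considers the two data classes passed as enabled, which mirrors the role of the Enabled check box in the preference settings. The threshold lets a column be classified even when a small share of its values is malformed.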