Getting the frequency distribution of a column

Use the GET columnAnalysis/frequencyDistribution API command to get the frequency distribution of a column.

Command

GET columnAnalysis/frequencyDistribution

Parameters

projectName
The name of the project that contains the columns.
columnName
The names of the columns for which you want to retrieve the frequency distribution. Each column name must be fully qualified; for example, HOST.DATASOURCE.SCHEMA.TABLE.COLUMN. You can retrieve the frequency distributions of several columns in one operation by providing a comma-separated list of column names or by using wildcard characters. For example, host1.datasource1.schema1.table1.*, host1.datasource1.schema1.table2.* retrieves the distributions for all columns of tables table1 and table2 in schema1, and host1.datasource1.schema1.*.* retrieves the distributions for all columns of all tables in schema1.
maxNbOfValues
The maximum number of values to retrieve. By default, all values are retrieved.
startIndex
The index of the first value to retrieve. The index is 1-based, meaning that 1 is the index of the first value. Applications that present the values in pages and load one page at a time can use this option together with maxNbOfValues.
ordering
The sort order for the returned values. The value of the ordering parameter is one of the following options:
  • ascendingFrequencies
  • descendingFrequencies
  • ascendingValues
  • descendingValues
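
The parameters above can be combined into a request URL as in the following Python sketch. It is illustrative only: the base URL is copied from the example request in this topic, the function names build_frequency_distribution_url and page_start_index are hypothetical helpers, and the code assumes that the server accepts standard percent-encoding of the query string (urlencode encodes characters such as "," and "*").

```python
from urllib.parse import urlencode

# Assumption: base URL taken from the example request in this topic.
BASE_URL = "https://myServer:9443/ibm/iis/ia/api"

# The four ordering values listed in the parameter description.
ORDERINGS = {
    "ascendingFrequencies",
    "descendingFrequencies",
    "ascendingValues",
    "descendingValues",
}


def build_frequency_distribution_url(project_name, column_names,
                                     max_nb_of_values=None, start_index=None,
                                     ordering=None, base_url=BASE_URL):
    """Return the GET URL for columnAnalysis/frequencyDistribution."""
    if ordering is not None and ordering not in ORDERINGS:
        raise ValueError(f"unsupported ordering: {ordering}")
    if start_index is not None and start_index < 1:
        raise ValueError("startIndex is 1-based and must be >= 1")

    params = {
        "projectName": project_name,
        # Several columns are passed as one comma-separated list.
        "columnName": ",".join(column_names),
    }
    # Optional parameters are sent only when the caller sets them.
    if max_nb_of_values is not None:
        params["maxNbOfValues"] = max_nb_of_values
    if start_index is not None:
        params["startIndex"] = start_index
    if ordering is not None:
        params["ordering"] = ordering

    return f"{base_url}/columnAnalysis/frequencyDistribution?{urlencode(params)}"


def page_start_index(page_number, page_size):
    """Return the 1-based startIndex of a page (pages are numbered from 1)."""
    return (page_number - 1) * page_size + 1
```

For example, an application that shows 50 values per page could request the third page with start_index=page_start_index(3, 50) and max_nb_of_values=50.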

Available HTTP methods

Table 1. HTTP API for retrieving the frequency distribution of a column
HTTP method | URI pattern | Data format | Success code | Error codes
GET | columnAnalysis/frequencyDistribution | XML | 200 | 400 (bad request) or 500 (server error)

Example HTTP request and return value

The following example shows how to get the frequency distribution for column COL1 of table TABLE1 in schema SCHEMA1 of data source SOURCE1, in project project1 on the server myServer:
GET https://myServer:9443/ibm/iis/ia/api/columnAnalysis/frequencyDistribution
?projectName=project1&columnName=SOURCE1.SCHEMA1.TABLE1.COL1
The following example shows the return value:
<?xml version="1.0" encoding="UTF-8"?>
<iaapi:Project xmlns:iaapi="http://www.ibm.com/investigate/api/iaapi" 
   name="project1">
  <DataSources>
    <DataSource name="SOURCE1">
      <Schema name="SCHEMA1">
        <Table name="TABLE1">
          <Column name="COL1">
            <ColumnAnalysisResults>
              <FrequencyDistribution>
                <Terms>
                  <Term name="Category1/Term1"/>
                  <Term name="Category1/Term2"/>
                </Terms>
                <Value frequency="150" percent="0.15">0</Value>
                <Value frequency="25" percent="0.025">Value1</Value>
                <Value frequency="1" percent="0.001">Value2</Value>
                <Value frequency="1" percent="0.001">Value3</Value>
                <Value frequency="1" percent="0.001">Value4</Value>
                <Value frequency="1" percent="0.001">Value5</Value>
                <Value frequency="1" percent="0.001">Value6</Value>
                <Value frequency="1" percent="0.001">Value7</Value>
                <Value frequency="1" percent="0.001">Value8</Value>
                <Value frequency="1" percent="0.001">Value9</Value>
                <Value frequency="1" percent="0.001">Value10</Value>
                (...)
              </FrequencyDistribution>
            </ColumnAnalysisResults>
          </Column>
        </Table>
      </Schema>
    </DataSource>
  </DataSources>
</iaapi:Project>
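
A client might read the values out of such a return value as in the following Python sketch. It is a minimal illustration, not a definitive client: the function name parse_frequency_distribution is hypothetical, the sample document is an abbreviated copy of the return value above, and the code assumes that the elements below the root carry no namespace, as in the example.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the example return value (three values only).
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<iaapi:Project xmlns:iaapi="http://www.ibm.com/investigate/api/iaapi"
   name="project1">
  <DataSources>
    <DataSource name="SOURCE1">
      <Schema name="SCHEMA1">
        <Table name="TABLE1">
          <Column name="COL1">
            <ColumnAnalysisResults>
              <FrequencyDistribution>
                <Value frequency="150" percent="0.15">0</Value>
                <Value frequency="25" percent="0.025">Value1</Value>
                <Value frequency="1" percent="0.001">Value2</Value>
              </FrequencyDistribution>
            </ColumnAnalysisResults>
          </Column>
        </Table>
      </Schema>
    </DataSource>
  </DataSources>
</iaapi:Project>
"""

# Path from a Column element down to its Value elements.
VALUE_PATH = "ColumnAnalysisResults/FrequencyDistribution/Value"


def parse_frequency_distribution(xml_bytes):
    """Map each column path to a list of (value, frequency, percent) tuples."""
    root = ET.fromstring(xml_bytes)
    result = {}
    # Walk the hierarchy level by level to rebuild the qualified column name.
    for source in root.iter("DataSource"):
        for schema in source.findall("Schema"):
            for table in schema.findall("Table"):
                for column in table.findall("Column"):
                    key = ".".join(e.get("name")
                                   for e in (source, schema, table, column))
                    result[key] = [
                        (v.text, int(v.get("frequency")), float(v.get("percent")))
                        for v in column.findall(VALUE_PATH)
                    ]
    return result


if __name__ == "__main__":
    for name, values in parse_frequency_distribution(SAMPLE_RESPONSE).items():
        for value, frequency, percent in values:
            print(name, value, frequency, percent)
```

The keys are built from the name attributes of the DataSource, Schema, Table, and Column elements, so a return value that covers several columns yields one entry per column.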