IBM PureData System for Analytics, Version 7.1

Hypothetical set family syntax

The syntax for the hypothetical set family is as follows:
SELECT fn(<exprlist>) 
WITHIN GROUP ( ORDER BY <value_expression> [asc|desc] [nulls {first | 
last}] [,...])) FROM <from_expr>[GROUP BY <group_expr>]
These functions are called hypothetical set functions. They take an argument and an order expression. The order expression is evaluated to produce an ordered collection of values. The argument list produces a row R, which is a hypothetical row in this ordered collection.
dense_rank
Returns the dense rank of a hypothetical row within a group of rows. The ranks are consecutive integers that begin with one. The number of arguments equal the number of expressions in the ORDER BY clause of WITHIN GROUP. The arguments to dense_rank are treated as if they are of the same data type (any necessary casting happens automatically) as that of the corresponding order-by expression of WITHIN GROUP.
rank
Calculates the rank of a hypothetical row within a group of values. Rows with equal values for the ranking criteria receive the same rank.
percent_rank
Calculates the percent rank (0 - 1) of a hypothetical row within a group of values. Returns the percent-rank of row R if it were a hypothetical row in the aggregating group (the rank of row R-1, divided by the number of rows in the augmented collection).
cume_dist
Calculates the cumulative distribution of a hypothetical row within a group of values. Returns the cume_dist of R if it is a hypothetical row in the aggregating group (the number of rows preceding or peers of R, divided by the number of rows in the aggregating group).
The argument expression/sub-expression needs to match the GROUP BY expression exactly. The following example is not exact, and returns an error:
SELECT dense_rank(b+a) WITHIN GROUP( order by col) FROM T GROUP BY a+b; 
The correct expression would be the following:
SELECT dense_rank(a+b) WITHIN GROUP( order by col) FROM T GROUP BY a+b;
The following example is not exact, and returns an error:
SELECT dense_rank(a+b+c) WITHIN GROUP(order by col) FROM T GROUP BY a, 
b+c;
The correct expression would be the following:
SELECT dense_rank(a+(b+c)) WITHIN GROUP(order by col) FROM T GROUP BY 
a, b+c;


Feedback | Copyright IBM Corporation 2014 | Last updated: 2014-02-28