You can use data rule functions to perform specific operations
when you build data rule definitions and working with data rules.
Functions are populated in the Source Data or Reference Data fields in the data rule
logic builder. Select the functions you want to use to perform a particular
action with your data such as COUNT, SUM, or AVG(value). You can choose
from the functions listed under the Functions tab. Below are detailed definitions and use case scenarios for all
of the available functions.
Note: Where applicable, the examples
in the function scenarios are based on data rule definitions. Each
data rule definition source data or reference data component needs
to be bound to a physical database-table-column to create a data rule.
Date and time functions
You can use date
and time functions to manipulate temporal data.
- date ()
- Definition: Returns the system date from the computer as
a date value.
- Use case scenario 1: You want to find order dates that
are no older than 365 days, but not beyond today's date.
- Example 1: dateCol > date()-365 and dateCol < date()
- Use case scenario 2: You want to find all the sales activity
that occurred in the last week.
- Example 2: IF ( date_of_sale > date() - 7days ) )
- datevalue (string,format)
- Definition: Converts the string representation of a date
into a date value.
- string
- The string value to be converted into a date.
- format
- The optional format string that describes how the date is represented
in the string.
- %dd
- Represents the two-digit day (01 – 31)
- %mm
- Represents the two-digit month (01 – 12)
- %mmm
- Represents the three-character month abbreviation (For example,
Jan, Feb, Mar)
- %yy
- Represents the two-digit year (00 – 99)
- %yyyy
- Represents the four-digit year (nn00 – nn99)
If a format is not specified, the function assumes
(%yyyy-%mm-%dd) as the default format.
- Note: You can use this function to convert a string, which
represents date, into its literal date value as part of your data
rule.
- Use case scenario 1: You want to check that the date value
in a column, which is coded as a string, is not older than 365 days
from now.
- Example 1: datevalue(billing_date,'%yyyy%mm%dd') > date()-365
- Use case scenario 2: You want to check that all dates
in your project are older than 01/01/2000.
- Example 2: billing_date > datevalue('2000-01-01')
- day (date)
- Definition: Returns a number that represents the day of
the month for a date that you specify.
- Use case scenario: You are interested in orders placed
on the first day of the month.
- Example: day(sales_order_date) =1
- month (date)
- Definition: Returns a number that represents the month
for a date that you specify.
- Use case scenario: You want to ensure that the month of
the billing date is a valid date.
- Example 1: month(billing_date) >= 1 and month(billing_date)
<= 12
- Example 2: If month(billing_date) = 2 then day(billing_date)
>= 1 and day(billing_date) <= 29
- weekday (date)
- Definition: Returns a number that represents the day of
the week for a specified date, starting with 1 for Sunday.
- Use case scenario: You expect sales orders on Sunday to
be below 1000 entries, so you want to run a function that checks to make
sure that orders on Sunday are below 1000 entries.
- Example: If weekday(sales_order_date) = 1 then count(sales_order_id)
< 1000
- year (date)
- Definition: Returns a number that represents the year for
a date that you specify.
- Use case scenario: You want to collect a focus group of
customers that were born between 1950 and 1955.
- Example: year(date_of_birth)> 1950 AND year(date_of_birth)
< 1955
- time ()
- Definition: Returns the system time (current time) from
the computer as a time value.
- Use case scenario: You want to find all the sales transactions
that occurred within the last four hours.
- Example: IF ( time_of_sale > time() - 4hours )
- timevalue (string,format)
- Definition: Converts the string representation of a time
into a time value.
- string
- The string value to be converted into a time value.
- format
- The optional format string that describes how the time is represented
in the string.
- %hh
- Represents the two-digit hours (00 – 23)
- %nn
- Represents the two-digit minutes (00 – 59)
- %ss
- Represents the two-digit seconds (00 – 59)
- %ss.n
- Represents the two-digit milliseconds (00 – 59) where n = fractional
digits (0 – 6)
If a format is not specified, the function assumes
(%hh:%nn:%ss) as the default format.
- Note: You can use this function to convert a string, which
represents time, into its literal time value as part of your data
rule.
- Use case scenario: You want to make sure that the check-in
time for guests at a hotel is set to a time later than 11 a.m.
- Example: checkin_time>timevalue('11:00:00')
- timestampvalue (value,format)
- Definition: Converts the string representation of a time
into a timestamp value.
- string
- The string value to be converted into a timestamp value.
- format
- The optional format string that describes how the timestamp is
represented in the string.
- %dd
- Represents the two-digit day (01 – 31)
- %mm
- Represents the two-digit month (01 – 12)
- %mmm
- Represents the three-character month abbreviation (For example,
Jan, Feb, Mar)
- %yy
- Represents the two-digit year (00 – 99)
- %yyyy
- Represents a four-digit year (nn00 – nn99)
- %hh
- Represents the two-digit hours (00 – 23)
- %nn
- Represents the two-digit minutes (00 – 59)
- %ss
- Represents the two-digit seconds (00 – 59)
- %ss.n
- Represents the two-digit milliseconds (00 – 59) where n = fractional
digits (0 – 6)
If a format is not specified, the function assumes
(%yyyy-%mm-%dd %hh:%nn:%ss) as the default format.
- Use case scenario: Your sales reports use a non-standard
timestamp for the order time. You need to find all the sales before
a specific time.
- Example: timestampvalue(timestamp_of_sale, '%yyyy %mm
%dd %hh %nn %ss') < timestampvalue('2009-01-01 00:00:00', '%yyyy-%mm-%dd
%hh:%nn:%ss')
- timestamp()
- Definition: Returns the system time (current time) from
the computer as a timestamp value.
- Use case scenario: You want to ensure that no orders have
a future order date.
- Example: order_timestamp < timestamp()
- hours(time)
- Definition: Returns a number that represents the hours
for the time value that you specify.
- Use case scenario: You want to validate that the sale occurred
between midnight and noon.
- Example: 0 < hours(sales_time) AND hours(sales_time)
< 12
- minutes(time)
- Definition: Returns a number that represents the minutes
for the time value that you specify.
- Use case scenario: You want to validate that the sale occurred
in the first 15 minutes of any hour.
- Example: 0 < minutes(sales_time) AND minutes(sales_time)
< 15
- seconds(time)
- Definition: Returns a number that represents the seconds
and milliseconds for the time value that you specified.
- Use case scenario: You want to validate that a sale occurred
in the last 30 seconds of any minute.
- Example: 30 < seconds(sales_time) AND seconds(sales_time)
< 60
- datediff(date1, date2)
- Definition: Returns the number of days between two dates.
Date1 is the most recent date of the two dates. Date2 is the later
of the two dates.
- Use case scenario: You want to determine the number of
days between the billing date and the payment date.
- Example: datediff(pay_date,bill_date)
- timediff (time1, time2)
- Definition: Returns the number of hours, minutes, and seconds
difference between two times. Time1 is the earliest of the two times.
Time2 is the later of the two times. The returned value is a time
value.
- Use case scenario: You want to determine the amount of
time between the start of a task and its completion.
- Example: timediff(end_time, start_time,)
Mathematical functions
Mathematical functions
return values for mathematical calculations.
- abs(value)
- Definition: Returns the absolute value of a numeric value
(For example ABS(-13) would return 13).
- Use case scenario 1: You want to return the absolute value
of the sales price to make sure that the difference between two prices
is less than $100.
- Example 1: abs(price1-price2)<100
- Use case scenario 2: You want to find all the stocks that
changed more than $10 in price.
- Example 2: abs(price1-price2) > 10
- avg(value)
- Definition: An aggregate function that returns the average
of all values within a numeric column.
- Use case scenario: You want to determine the average hourly
pay rate for employees in a division.
- Example: avg(hrly_pay_rate)
- exp(value)
- Definition: Returns the exponential value of a numeric
value.
- Use case scenario: You want to determine the exponential
value of a numeric value as part of an equation.
- Example: exp(numeric_variable)
- max(value)
- Definition: An aggregate function that returns the maximum
value found in a numeric column.
- Use case scenario: You want to determine the highest hourly
pay rate for an employee in a division.
- Example: max(hrly_pay_rate)
- min(value)
- Definition: An aggregate function that returns the minimum
value found in a numeric column.
- Use case scenario: You want to determine the lowest hourly
pay rate for an employee in a division.
- Example: min(hrly_pay_rate)
- sqrt(value)
- Definition: Returns the square root of a numeric value.
- Use case scenario: You want to determine the square root
of a numeric value as part of an equation.
- Example: sqrt(numeric_variable)
- standardize(col,group)
- Definition: Transforms a numerical value into a number
of standard deviation above or under the average value of the column.
You can use the standardize(col,group) function as a simpler
alternative to using this function: (col-avg(col, group))/stddev(col,
group). The function normalizes a numerical value based on the mean
value and standard deviation of the column.
Note: The second parameter
is optional, and can be used for specifying a grouping key.
- Use case scenario 1: You want to review how many employees
at your company have salaries that deviate from the company average.
The outliers are defined to be the values which are above the average
value plus three times the standard deviation.
- Example 1: standardize(salary) < 3
- Use case scenario 2: You want to identify all the same
outliers described in use case scenario 1, but you want to account
for the male and female populations separately.
- Example 2: standardize(salary, gender) < 3
- stddev(col,group)
- Definition: An aggregate function that computes the standard
deviation of a numerical column.
Note: The second parameter is optional,
and can be used for specifying a grouping key.
- Use case scenario 1: You want to review how many employees
at your company have salaries that deviate from the company average.
The outliers are defined to be the values which are above the average
value plus three times the standard deviation
- Example 1: salary < avg(salary)+3*stddev(salary)
- Use case scenario 2: You want to identify all the same
outliers described in use case scenario 1, but you want to account
for the male and female populations separately.
- Example 2: salary < avg(salary, gender)+3*stddev(salary,gender)
- sum(value, sum)
- Definition: An aggregate function that returns the sum
of all the values within a numeric column.
Note: The second parameter
is optional, and can be used for computing the sum of the subgroups.
- Use case scenario: You want to determine the total sales
amount for a store.
- Example: sum(sales_amount)
General functions
You can use this group
of functions to perform various tasks on values in your tables.
- coalesce(value, nullValueReplacement)
- Definition: Searches for null values, and then replaces
any null values found with another value.
- Use case scenario: You want to search a column that contains
department code numbers to see if any of the code numbers are undefined.
If there are any null values, you want to replace the null value fields
with the text 'undefined.'
- Example: coalesce(code, 'undefined'). All values that
are null will be replaced with the text 'undefined.'
- count(column)
- Definition: An aggregate function that provides a count
of the occurrences of a column.
- Use case scenario: You want to provide a frequency count
by a customer US postal code (ZIP code).
- Example: count(customer_ZIP)
- count(col,groupBy)
- Definition: An aggregate function that returns the number
of rows/values in each group of rows defined by the column groupBy.
- Use case scenario: You want to determine the number of
employees in each department of a company.
- Example: count(employeeID,departmentID)
- lookup(value, keyRefCol, valRefCol)
- Definition: Performs a replacement of a value with its
corresponding value in a lookup table. To run this function, there
must be a lookup table that contains a list of the values with a corresponding
column that contains data related to the first column. For example,
if the original value you are looking for is contained in a column
that contains keys, such as product codes, then the related lookup
table contains one column that lists all the possible key values and
another column that contains the corresponding values to use as a
replacement for the keys. This function is generally used in systems
where internal codes are used to represent values that occur in different
places in your data sources.
Note: The first parameter of the function
is the value to look up. The second parameter should be bound to the
reference column that contains the keys in the lookup table. The third
parameter should be bound to the reference column that contains the
values in the lookup table. Both reference columns must be in the
same physical table The result of the function is the value from the
lookup table corresponding to the key given as first parameter.
- Use case scenario: You have a customer table that contains
a column that represents the country that the customer is from. Instead
of storing the country name in the column (for example, Germany),
your company stores a country code (for example, 100) that represents
the country. Your company has a corresponding lookup table that captures
which code represents each country (100=Germany, 101=France, 102=Italy, and so on). You
want to write a data rule that uses the lookup function to find all
the French customers in your data source that have a phone number
made up of 10 digits.
- Example: If lookup(country, countrycodes, countrynames)='France'
then phone matches_format '99.99.99.99.99'
String functions
You can use string functions
to manipulate to strings.
- ascii(char)
- Definition: Returns the ASCII character set value for a
character value.
- Use case scenario: You want to search for all rows where
a column begins with a non-printable character.
- Example: ascii(code) <32
- char(asciiCode)
- Definition: Returns the character value for an ASCII character.
- Use case scenario 1: You want to convert an ASCII character
code to its localized character (For example, ‘35' returns ‘C').
- Example 1: char(‘35')
- Use case scenario 2: You are searching for the next letter
in the alphabet in the following sequence: if col1='a', col2 must
be 'b', if col1='b' col2 must be 'c', and so on.
- Example 2 (a): col2=char(ascii(col1)+1)
- Example 2 (b): ascii(col2)=ascii(col1)+1
- convert(originalString, searchFor, replaceWith)
- Definition: Converts a substring occurrence in a string
to another substring.
- originalString
- The string that contains the substring.
- searchFor
- The substring to be replaced.
- replaceWith
- The new replacement substring.
- Use case scenario 1: After a corporate acquisition, you
want to convert the old company name, "Company A," to the new company
name, "Company B."
- Example 1: If ( convert(old_company_name, 'Company A',
'Company B' ) = new_company_name )
- Use case scenario 2: After a corporate acquisition, you
want to convert the company acronym contained in the acquired product
codes from ‘XX” to ‘ABC'.
- Example 2: convert(product_code, ‘XX', ‘ABC')
- lcase(string)
- Definition: Converts all alpha characters in a string to
lowercase.
- Use case scenario: You need to change all product codes
to use only lowercase letters.
- Example: lcase(product_code)='ab'
- index(string, substring)
- Definition: Returns the index of the first occurrence of
a substring within a string. The result is a zero-based index, so
a zero indicates that the substring was found at the beginning of
a string. Negative one (-1), means that the substring was not found.
- Use case scenario: You want to locate a company code, ‘XX',
within a free form product code.
- Example: index(col, 'XX')>=0
- left(string, n)
- Definition: Returns the first n characters of a string.
- Use case scenario: You want to use the three-digit prefix
of each product code to determine which division is responsible for
the product.
- Example: left(product_code, 3)='DEV'
- len(string)
- Definition: Returns the total number of characters (that
is, the length) in a string.
- Use case scenario: You want to determine the actual length
of each customer's surname in the customer file.
- Example: len(cust_lastname)
- ltrim(string)
- Definition: Removes all space characters at the beginning
of a string.
- Use case scenario: You want to eliminate any leading spaces
in the customer's surname.
- Example: ltrim(cust_lastname)
- pad(string, begin, end)
- Definition: Adds space characters at the beginning and
at the end of a string.
- string
- The string to be converted.
- begin
- The number of spaces to add at the beginning of the string.
- end
- The number of spaces to add at the end of the string.
- Use case scenario: You want to add three spaces at the
beginning and end of each product title.
- Example: pad(product_title, 3)
- lpad(string, n)
- Definition: Adds space characters to the beginning of a
string.
- string
- The string to be converted.
- n
- The number of spaces to add to the beginning of the string.
- Use case scenario: You want to add three spaces at the
beginning of each product title.
- Example: lpad(product_title, 3)
- rpad(string, n)
- Definition: Adds space characters at the end of a string.
- string
- The string to be converted.
- n
- The number of spaces to add at the end of the string.
- Use case scenario: You want to add three spaces at the
end of each product title.
- Example: rpad(product_title, 3)
- right(string, n)
- Definition: Returns the last n characters of a string.
- string
- The string to be converted.
- n
- The number of spaces to return at the end of a string.
- Use case scenario: Use the three-digit suffix of each product
code to determine which division is responsible for the product.
- Example: right(product_code, 3)
- rtrim(string)
- Definition: Removes all space characters at the end of
a string.
- string
- The string to be converted.
- n
- The number of spaces to remove from the end of a string.
- Use case scenario: You want to eliminate spaces at the
end of the customer's surname.
- Example: rtrim(cust_lastname)
- substring(string, begin, length)
- Definition: Returns a substring of a string value.
- string
- The string value.
- begin
- The index of the first character to retrieve (inclusive), 1 being
the index of the first character in the string.
- length
- The length of the substring to retrieve.
- Use case scenario: You want to use the three-digit (actual
character positions four to six) value from each product code to determine
which division is responsible for the product.
- Example: substring(product_code, 4, 3)
- str(string, n)
- Definition: Creates a string of n occurrences of
a substring.
- Use case scenario: You want to create a filler field of
“ABCABCABCABC”.
- Example: str(‘ABC', 4)
- tostring(value, format string)
- Definition: Converts a value, such as number, time, or
date, to its string representation.
You can specify "format (string)"
to describe how the generated string should be formatted. If the
value to convert is a date, time or timestamp, then the format string
can contain the following format tags:
- %dd
- Represents the two-digit day (01 – 31)
- %mm
- Represents the two-digit month (01 – 12)
- %mmm
- Represents the three-character month abbreviation (For example,
Jan, Feb, Mar)
- %yy
- Represents the two-digit year (00 – 99)
- %yyyy
- Represents the four-digit year (nn00 – nn99)
- %hh
- Represents the two-digit hours (00 – 23)
- %nn
- Represents the two-digit minutes (00 – 59)
- %ss
- Represents the two-digit hours
- %ss
- Represents the two-digit seconds (00 – 59)
- %ss.n
- Represents the two-digit milliseconds (00 – 59), where n = fractional
digits (0 – 6)
If the value to convert is numeric, the format
string can contain one of the following format tags:
- %i
- Represents the value to be converted into a signed decimal integer,
such as "123."
- %e
- Represents the value to be converted into a scientific notion
(mantissa exponent), by using an e character such as 1.2345e+2.
- %E
- Represents the value to be converted into a scientific notion
(mantissa exponent), by using an E character such as 1.2345E+2.
- %f
- Represents the value to be converted into a floating point decimal,
such as 123.45.
The tag can also contain optional width and
precision specifiers, for example:
%[width][.precision]tag
For a numeric value, the format string follows the syntax
used by printed formatted data to standard output (printf) in C/C++.
- Use case scenario 1: You want to convert date values into
a string similar to this format: '12/01/2008'
- Example 1: tostring(dateCol, '%mm/%dd/%yyyy')
- Use case scenario 2: You want to convert numeric values
into strings, displaying the value as an integer between brackets.
An example of the preferred output is, "(15)".
- Example 2: val(numeric_col, '(%i)')
- Use case scenario 3: You want to convert a date/time value
to a string value so that you can export the data to a spreadsheet.
- Example 3: tostring(posting_date)
- trim(string)
- Definition: Removes all space characters at the beginning
and end of a string.
- Use case scenario: You want to eliminate any leading or
trailing spaces in the customer's surname.
- Example: trim(cust_lastname)
- ucase(string)
- Definition: Converts all alpha characters in a string to
uppercase.
- Use case scenario: You want to change all product codes
to use only uppercase letters.
- Example: ucase(product_code)
- val(value)
- Definition: Converts the string representation of a number
into a numeric value.
- string
- The string value to convert.
- Use case scenario 1: You want to convert all strings with
the value of 123.45 into a numeric value to do computations.
- Example 1: val('123.45')
- Use case scenario 2: You have a string column that contains numeric
values as strings, and want to make sure that all the values are smaller
than 100.
- Example 2: val(col)<100