Regular expression notation for HTTP Server

This topic provides a general overview of regular expression notation for the IBM® HTTP Server for i Web server.

Regular expression notation specifies a pattern of character strings. One or more regular expressions can be used to create a matching pattern. Certain characters (sometimes called wildcards) have special meanings. The following table describes the commonly used pattern-matching notation.

Regular expression pattern matching

Pattern Description
string A string with no special characters matches any value that contains the string.
[set] Match a single character specified by the set of single characters within the square brackets.
[a-z] Match a character in the range specified within the square brackets.
[^abc] Match any single character not specified in the set of single characters within the square brackets.
{n} Match the preceding element exactly n times.
{n,} Match the preceding element at least n times.
{n,m} Match the preceding element at least n times, but no more than m times.
^ Match the start of the string.
$ Match the end of the string.
. Match any character (except Newline).
* Match zero or more occurrences of the preceding character.
+ Match one or more occurrences of the preceding character.
? Match zero or one occurrence of the preceding character.
string1|string2 Match string1 or string2.
\ Signifies an escape character. When preceding any of the characters that have special meaning, the escape character removes any special meaning from the character. For example, the backslash is useful to remove special meaning from a period in an IP address.
(group) Group part of a regular expression. If a match is found, the text matched by the first group can be accessed using $1, the text matched by the second group using $2, and so on.
(?<name>regex) Named capturing group. Captures the text matched by "regex" into the group "name". The name can contain letters and numbers but must start with a letter.
\1 through \9 Backreference. Substituted with the text matched by the 1st through 9th numbered capturing group.
\10 through \99 Backreference. Substituted with the text matched by the 10th through 99th numbered capturing group.
\w Match a word character (an alphanumeric character or an underscore).
\W Match a character that is not a word character.
\s Match a white-space character.
\S Match a character that is not a white-space character.
\t Tab character.
\n Newline character.
\r Return character.
\f Form feed character.
\v Vertical tab character.
\a Bell character.
\b Match a word boundary.
\B Match a position that is not a word boundary.
\0dd Octal character, for example \076 matches character ">".
Note: d must be between 0 and 7.
\ddd Octal character, for example \101 matches character "A".
Note: d must be between 0 and 7.
\o{ddd..} Octal character, for example \o{123} matches character "S".
Note: d must be between 0 and 7.
\xnn Hex character, for example \x41 matches character "A".
\cx Control character, for example \cJ matches newline character "\n".
Note: x is any printable ASCII character.
\d Match a decimal digit.
\D Match a character that is not a decimal digit.
\Q...\E Escape sequence. Characters between \Q and \E are treated as literals.
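The following is a minimal sketch of how a few of the notations in the table behave: the escaped period, a counted repetition, numbered groups, and a backreference. It uses Python's re module purely as a stand-in for experimentation; it is not the server's own regular expression engine, and some notations differ there (for example, Python writes named groups as (?P<name>regex) and has no \Q...\E, and it reads groups with group(1) rather than $1). The sample strings are invented for illustration.

```python
import re

# Escaping: an unescaped "." matches any character,
# while "\." matches only a literal period.
loose = re.compile(r"10.2.1.9")
strict = re.compile(r"^10\.2\.1\.9$")

loose_hit = bool(loose.search("10.231.98.6"))    # True: "." stands in for "3" and "."
strict_hit = bool(strict.search("10.231.98.6"))  # False: periods must be literal
exact_hit = bool(strict.search("10.2.1.9"))      # True

# Sets and counted repetition: "ibm" followed by exactly three digits.
three_digits = re.compile(r"^ibm[0-9]{3}$")

# Numbered groups: the first parenthesized part is group 1, the second group 2.
parts = re.search(r"^(\w+)-(\d+)$", "ibm-042")
first_group = parts.group(1)   # "ibm"
second_group = parts.group(2)  # "042"

# Backreference: \1 must repeat exactly what group 1 matched.
doubled = re.compile(r"\b(\w+) \1\b")
repeated = doubled.search("the the server")

print(loose_hit, strict_hit, exact_hit)
print(first_group, second_group)
print(repeated.group(0))
```

Anchoring with ^ and $ and escaping each period is what turns the loose pattern into an exact address match.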

Examples of regular expression pattern matching

Pattern Examples of strings that match
ibm ibm01, myibm, aibmbc
^ibm$ ibm
^ibm0[0-4][0-9]$ ibm000 through ibm049
ibm[3-8] ibm3, myibm4, aibm5b
^ibm ibm01, ibm
ibm$ myibm, ibm, 3ibm
ibm... ibm123, myibmabc, aibm09bcd
ibm*1 ibm1, myibm1, aibm1abc, ibmmmmm12
^ibm0.. ibm001, ibm099, ibm0abcd
^ibm0..$ ibm001, ibm099
10.2.1.9 10.2.1.9, 10.2.139.6, 10.231.98.6
^10\.2\.1\.9$ 10.2.1.9
^10\.2\.1\.1[0-5]$ 10.2.1.10, 10.2.1.11, 10.2.1.12, 10.2.1.13, 10.2.1.14, 10.2.1.15
^192\.168\..*\..*$ (All addresses on class B subnet 192.168.0.0)
^192\.168\.10\..*$ (All addresses on class C subnet 192.168.10.0)
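Rows like those in the table can be checked with a short script. The sketch below again uses Python's re module as a stand-in, not the server's engine, and the helper name matching and the candidate lists are hypothetical. It relies on re.search, which succeeds when the pattern matches anywhere in the value, mirroring the "contains" behavior of unanchored patterns described above.

```python
import re

def matching(pattern, candidates):
    """Return the candidates that contain a match for the pattern."""
    compiled = re.compile(pattern)
    return [c for c in candidates if compiled.search(c)]

# Hypothetical sample values, chosen to exercise anchors and wildcards.
hosts = ["ibm", "ibm01", "myibm", "ibm049", "ibm0abcd"]

unanchored = matching(r"ibm", hosts)             # every value contains "ibm"
whole_string = matching(r"^ibm$", hosts)         # only the exact string
two_after_zero = matching(r"^ibm0..$", hosts)    # "ibm0" plus exactly two characters
ranged = matching(r"^ibm0[0-4][0-9]$", hosts)    # ibm000 through ibm049

addresses = ["10.2.1.9", "10.2.1.10", "10.2.1.15", "10.2.1.16", "10.2.1.100"]
last_octet = matching(r"^10\.2\.1\.1[0-5]$", addresses)

print(unanchored)
print(whole_string)
print(two_after_zero)
print(ranged)
print(last_octet)
```

Without the trailing $, the pattern ^ibm0.. would also accept ibm0abcd, because anything after the two wildcard characters is ignored.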