Regular Expression Syntax

The following table shows basic regular expression syntax you can use when configuring Access Control and Post Transfer Actions:

Character     Meaning

.             Matches any single character.

[ ]           Indicates a character class. Matches any character inside the brackets (for example, [abc] matches "a", "b", or "c").

^             If this metacharacter occurs at the start of a character class, it negates the character class. A negated character class matches any character except those inside the brackets (for example, [^abc] matches any character except "a", "b", and "c").

              If ^ is at the beginning of the regular expression, it matches the beginning of the input (for example, ^[abc] matches only input that begins with "a", "b", or "c").

-             In a character class, indicates a range of characters (for example, [0-9] matches any of the digits "0" through "9").

?             Indicates that the preceding expression is optional: it matches once or not at all (for example, [0-9][0-9]? matches "2" and "12").

+             Indicates that the preceding expression matches one or more times (for example, [0-9]+ matches "1", "13", "456", and so on).

*             Indicates that the preceding expression matches zero or more times.

??, +?, *?    Non-greedy versions of ?, +, and *. These match as little as possible, unlike the greedy versions, which match as much as possible (for example, given the input "<abc><def>", <.*?> matches "<abc>" while <.*> matches "<abc><def>").

( )           Grouping operator (for example, (\d+,)*\d+ matches a list of numbers separated by commas, such as "1" or "1,23,456").

\             Escape character: interprets the next character literally (for example, [0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by a plus character). Also used for abbreviations (such as \a for any alphanumeric character; see the abbreviations table under Additional Supported Characters).

              If \ is followed by a number n, it matches the nth match group (starting from 0). Example: <{.*?}>.*?</\0> matches "<head>Contents</head>".

$             At the end of a regular expression, this character matches the end of the input (for example, [0-9]$ matches a digit at the end of the input).

|             Alternation operator: separates two expressions, exactly one of which matches (for example, T|the matches "The" or "the").

!             Negation operator: the expression following ! does not match the input (for example, a!b matches "a" not followed by "b").
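
The operators in the table that are common to most regular expression engines can be tried out before a pattern is placed in a configuration. The following is a minimal sketch that uses Python's re module as a stand-in; it is not the engine the product uses, and first_match is a hypothetical helper name. Only the examples from the table that behave the same way in Python are exercised. The !, {} match-group, and \0 examples are left out because Python handles them differently.

```python
import re

# Python's `re` module is only a stand-in for the product's own engine.
def first_match(pattern, text):
    """Return the first substring of `text` matched by `pattern`, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Character class and negated character class
assert first_match(r"[abc]", "xbz") == "b"
assert first_match(r"[^abc]", "abcd") == "d"

# ^ and $ anchor the match to the beginning and end of the input
assert first_match(r"^[abc]", "cat") == "c"
assert first_match(r"^[abc]", "dog") is None
assert first_match(r"[0-9]$", "file7") == "7"

# ? (optional), + (one or more), * (zero or more)
assert re.fullmatch(r"[0-9][0-9]?", "2")
assert re.fullmatch(r"[0-9][0-9]?", "12")
assert re.fullmatch(r"[0-9]+", "456")
assert re.fullmatch(r"[0-9]*", "")

# Greedy versus non-greedy quantifiers
assert first_match(r"<.*>", "<abc><def>") == "<abc><def>"
assert first_match(r"<.*?>", "<abc><def>") == "<abc>"

# Grouping, and escaping a metacharacter so that it is taken literally
assert re.fullmatch(r"(\d+,)*\d+", "1,23,456")
assert re.fullmatch(r"[0-9]\+", "7+")
```

Each assertion mirrors one example from the table, so a failing line points at the operator whose behavior differs from what the table describes.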

 

Additional Supported Characters

Refer to the following for additional information about regular expressions:

  • Post Transfer Action expressions support Perl syntax, which provides many additional options. For a complete reference, see http://www.boost.org/doc/libs/1_44_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html.
  • Access Control expressions use Basic Regular Expression (BRE) syntax. For these expressions, the following abbreviations are also supported:

    Abbreviation    Matches

    \a              Any alphanumeric character: ([a-zA-Z0-9])
    \b              White space (blank): ([ \\t])
    \c              Any alphabetic character: ([a-zA-Z])
    \d              Any decimal digit: ([0-9])
    \h              Any hexadecimal digit: ([0-9a-fA-F])
    \n              Newline: (\r|(\r?\n))
    \q              A quoted string: (\"[^\"]*\")|(\'[^\']*\')
    \w              A simple word: ([a-zA-Z]+)
    \z              An integer: ([0-9]+)
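
Because each abbreviation is shorthand for the expression shown beside it, a pattern that uses abbreviations can be checked in another engine by substituting the expansions first. The following is a minimal sketch in Python, where several of these escapes have a different meaning (for instance, \b is a word boundary and \w matches word characters). ABBREVIATIONS and expand_abbreviations are hypothetical names, not part of the product. The sketch also assumes that \\t in the table denotes a tab character, and it wraps the \q expansion in one extra group so that it stays a single unit when it is concatenated with other expressions.

```python
import re

# Expansions copied from the table above, written as Python raw strings.
# Assumption: the table's "\\t" means a tab character.
ABBREVIATIONS = {
    "a": r"([a-zA-Z0-9])",
    "b": r"([ \t])",
    "c": r"([a-zA-Z])",
    "d": r"([0-9])",
    "h": r"([0-9a-fA-F])",
    "n": r"(\r|(\r?\n))",
    "q": r"""(("[^"]*")|('[^']*'))""",  # extra outer group added for safe concatenation
    "w": r"([a-zA-Z]+)",
    "z": r"([0-9]+)",
}

def expand_abbreviations(pattern):
    """Replace each backslash abbreviation with its expansion (hypothetical helper).

    Any other escape sequence, including an escaped backslash, is left unchanged.
    """
    def substitute(m):
        return ABBREVIATIONS.get(m.group(1), m.group(0))
    return re.sub(r"\\(.)", substitute, pattern)

# A simple word, one blank, then an integer
assert re.fullmatch(expand_abbreviations(r"\w\b\z"), "file 42")
# One or more hexadecimal digits
assert re.fullmatch(expand_abbreviations(r"\h+"), "1aF")
# An unterminated quote is not a quoted string
assert not re.fullmatch(expand_abbreviations(r"\q"), "'open")
```

Substituting the expansions keeps the check independent of whatever meaning the target engine assigns to the same escape letters.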