I’m about to take a walk through Regular Expressions, widely known as RegEx, in Microsoft Lync Server 2013 and Microsoft Lync Server 2010. As in Microsoft Office Communications Server 2007, Regular Expressions can be used for phone number matching and translation. The Regular Expression pattern syntax is powerful and complex enough to cover almost any dialing criteria. Use of Regular Expressions for phone number manipulation is supported in the following areas:
- Dial Plan Normalization Rules,
- Voice Routes Pattern Match, and
- Trunk Configuration Translation Rules.
In this Article:
Testing Your Regular Expression
Shorthand Character Classes \d, \w, \s
Character Sets [a-z], [A-Z], [0-9]
The table below gives some quick examples of Regular Expressions. Note that:
- Egypt's country code is 20
- Cairo's area code is 2 and local numbers are 8 digits long
- Alexandria's area code is 3 and local numbers are 7 digits long
- Banha's area code is 13 and local numbers are 7 digits long
RuleName | Description | Pattern | Translation | Example |
4digitExtension | Translates 4-digit extensions | ^(\d{4})$ | +2023303$1 | 3045 is translated to +20233033045 |
5digitExtension | Translates 5-digit extensions | ^3(\d{4})$ | +2023308$1 | 33045 is translated to +20233083045 |
8digitcallingCairo | Translates 8-digit numbers to Cairo local number (area code=2) | ^(\d{8})$ | +202$1 | 33033045 is translated to +20233033045 |
7digitcallingAlexandria | Translates 7-digit numbers to Alexandria local number (area code=3) | ^(\d{7})$ | +203$1 | 3303304 is translated to +2033303304 |
7digitcallingBanha | Translates 7-digit numbers to Banha local number (area code=13) | ^(\d{7})$ | +2013$1 | 3303304 is translated to +20133303304 |
LDCallingEG | Translates numbers with LD prefix in Egypt | ^2(\d{10,11})$ | +2$1 | 20233033045 is translated to +20233033045 and 201333033045 is translated to +201333033045 |
IntlCallingEG | Translates numbers with international prefix in Egypt (48) | ^48(\d*)$ | +$1 | 48914412345678 is translated to +914412345678 |
HQOperator | Translates 0 to HQ Operator | ^0$ | +20233033045 | 0 is translated to +20233033045 |
SharkeyaSitePrefix | Translates numbers with on-net prefix (6) and Sharkeya site code (11) | ^611(\d{4})$ | +2055234$1 | 6111234 is translated to +20552341234 |
MansouraSitePrefix | Translates numbers with on-net prefix (6) and Mansoura site code (22) | ^622(\d{4})$ | +2050234$1 | 6221234 is translated to +20502341234 |
LuxorSitePrefix | Translates numbers with on-net prefix (6) and Luxor site code (33) | ^633(\d{4})$ | +2095234$1 | 6331234 is translated to +20952341234 |
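To try a few of these rules outside Lync, here is a minimal Python sketch. It is only an approximation: Lync uses the .NET regex engine, so the `$1` in each translation is converted to Python's `\g<1>` form, and the `RULES` list and `normalize` function are hypothetical names made up for this illustration. The two 7-digit rules are left out because they share one pattern and would normally live in different dial plans.

```python
import re

# (name, pattern, translation) copied from the examples table.
# "$1" is the .NET-style group reference; it is converted for Python below.
RULES = [
    ("4digitExtension", r"^(\d{4})$", "+2023303$1"),
    ("5digitExtension", r"^3(\d{4})$", "+2023308$1"),
    ("8digitcallingCairo", r"^(\d{8})$", "+202$1"),
    ("LDCallingEG", r"^2(\d{10,11})$", "+2$1"),
    ("IntlCallingEG", r"^48(\d*)$", "+$1"),
    ("HQOperator", r"^0$", "+20233033045"),
    ("SharkeyaSitePrefix", r"^611(\d{4})$", "+2055234$1"),
]


def normalize(dialed):
    """Return (rule name, normalized number) for the first matching rule, else None."""
    for name, pattern, translation in RULES:
        if re.search(pattern, dialed):
            # Turn "$1" into "\g<1>", the unambiguous Python group reference.
            py_translation = re.sub(r"\$(\d)", r"\\g<\1>", translation)
            return name, re.sub(pattern, py_translation, dialed)
    return None


for number in ["3045", "33045", "48914412345678", "123"]:
    print(number, "->", normalize(number))
```

The rules are tried top to bottom and the first match wins, which is an assumption of this sketch rather than a statement about how Lync orders its rules.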
Testing Your Regular Expression:
There are many great tools to test your RegEx. Regexhero.net has an excellent online tool for testing your expression at http://regexhero.net/tester/
Now let’s take a deep dive into RegEx.
In regular expressions, all characters match themselves except for the following special characters:
. [ { ( ) * + ? | ^ $ \
The single character ‘.’ when used outside of a character set will match any single character.
Pattern | Description |
. | Matches any single character |
\. | Matches the literal period . character |
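The difference between the wildcard and the escaped period can be checked with a small Python sketch. The patterns and the `matches` helper are made up for this illustration.

```python
import re

# "." matches any single character, so "1x5" is accepted as well as "1.5".
any_char = re.compile(r"^1.5$")

# "\." matches only a literal period.
literal_dot = re.compile(r"^1\.5$")


def matches(regex, text):
    """Return True when the compiled pattern matches the text."""
    return regex.search(text) is not None


for text in ["1.5", "1x5"]:
    print(text, matches(any_char, text), matches(literal_dot, text))
```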
Anchors ^, $
The anchor characters match the position at the beginning or end of a line. They do not consume any characters themselves.
Pattern | Description |
^ | Matches the position at the start of a line, before the first character of the line |
$ | Matches the position at the end of a line, after the last character of the line |
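Anchors matter a lot for dial plans, because an unanchored pattern can match in the middle of a longer number. A minimal Python sketch, with a hypothetical `is_match` helper:

```python
import re

unanchored = re.compile(r"\d{4}")    # four digits anywhere in the input
anchored = re.compile(r"^\d{4}$")    # exactly four digits and nothing else


def is_match(regex, dialed):
    """Return True when the pattern is found in the dialed string."""
    return regex.search(dialed) is not None


print(is_match(unanchored, "33033045"))  # four digits are found inside the 8-digit number
print(is_match(anchored, "33033045"))    # the anchors reject the longer number
print(is_match(anchored, "3045"))
```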
A section beginning with an open parenthesis ( and ending with a close parenthesis ) acts as a Marked Group. The string that matches the group pattern is preserved for later use. Marked Groups can also be repeated, or referred to by a Back-Reference.
Pattern | Description |
( ) | Used to group expressions and to capture a set of characters for use in a back-reference. |
\( | Matches the open parenthesis ( character |
\) | Matches the close parenthesis ) character |
A Marked Group is useful to lexically group part of a regular expression, but has the side effect of producing an extra field in the result. As an alternative, you can lexically group part of a regular expression without generating a marked group by using (?: and ). For example, (?:ab)+ repeats the “ab” match phrase without splitting out a separate marked group.
Pattern | Description |
(?: ) | Used to group expressions without capturing them for a back-reference |
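The extra field produced by a marked group is easy to see in a short Python sketch. The input string and patterns are made up for this illustration.

```python
import re

dialed = "abab3045"

# (ab)+ is a marked group, so it adds an extra captured field.
with_capture = re.search(r"^(ab)+(\d{4})$", dialed).groups()

# (?:ab)+ groups the same text but captures nothing.
without_capture = re.search(r"^(?:ab)+(\d{4})$", dialed).groups()

print(with_capture)
print(without_capture)
```

With the non-capturing form, the digits stay in group 1, so a translation that refers to `$1` keeps working when a prefix group is added in front.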
Shorthand Character Classes \d, \w, \s
These expressions provide a shorthand way to describe a class of characters. For example, \d matches any numeric digit. The capital version of each shorthand expresses its negation. For example, \D matches any non-digit character.
Pattern | Description |
\d | Matches a numeric digit (0 to 9) |
\w | Matches a word character (letters, digits, underscores) |
\s | Matches a whitespace character (space, tab, line breaks) |
\D | Matches a non-numeric character (no number 0 to 9) |
\W | Matches a non-word character (not a letter, digit or underscore) |
\S | Matches a non-whitespace character (not a space, tab or line break) |
The \d shorthand is commonly used with the curly bracket repeater expression to match a specific number of digits, for example: \d{4} to match four consecutive digits.
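As a rough illustration of \d and \D together, the Python sketch below strips everything that is not a digit and then checks for a 4-digit extension. The `digits_only` and `is_extension` helpers are hypothetical names, and Python's `re` module stands in for the .NET engine that Lync uses.

```python
import re


def digits_only(raw):
    """Remove every non-digit character (\\D) from the input."""
    return re.sub(r"\D", "", raw)


def is_extension(dialed):
    """Return True when the input is exactly four digits (\\d{4})."""
    return re.search(r"^\d{4}$", dialed) is not None


print(digits_only("(02) 3303-3045"))
print(is_extension("3045"), is_extension("33045"))
```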
The repeater characters ( *, +, ?, and {} ) enable matching of a character, expression or character class that is repeated.
Pattern | Description |
* | Match the preceding character or expression zero to unlimited times. |
+ | Match the preceding character or expression one to unlimited times. |
{n} | Match the preceding character or expression exactly n times |
{n,m} | Match the preceding character or expression at least n times and at most m times |
{n,} | Match the preceding character or expression at least n times, with no upper limit |
? | Optionally match the preceding character or expression |
* Examples
The * operator matches the preceding atom zero or more times. For example, when the whole input must match, the expression a*b behaves as follows:
Pattern | Input | Match? |
a*b | b | Yes |
a*b | ab | Yes |
a*b | aaaaaaaab | Yes |
a*b | acb | No |
a*b | aaaaaaacb | No |
+ Examples
The + operator matches the preceding atom one or more times. For example, when the whole input must match, the expression a+b behaves as follows:
Pattern | Input | Match? |
a+b | b | No |
a+b | ab | Yes |
a+b | aaaaaaaab | Yes |
a+b | acb | No |
a+b | aaaaaaacb | No |
? Examples
The ? operator matches the preceding atom zero or one times. For example, when the whole input must match, the expression ca?b behaves as follows:
Pattern | Input | Match? |
ca?b | b | No |
ca?b | ab | No |
ca?b | cb | Yes |
ca?b | cab | Yes |
ca?b | caab | No |
{ } Examples
The curly bracket repeaters allow matching of a character or expression a specific number of times.
Pattern | Input | Match? |
h{4} | hhhh | Yes |
h{4} | hh | No |
h{4} | hhhhh | No |
Pattern | Input | Match? |
h{2,5} | hhhh | Yes |
h{2,5} | hh | Yes |
h{2,5} | hhhhh | Yes |
h{2,5} | h | No |
h{2,5} | hhhhhh | No |
Pattern | Input | Match? |
h{3,} | hh | No |
h{3,} | hhh | Yes |
h{3,} | hhhhhh | Yes |
h{3,} | hhhhhhhhhh | Yes |
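A selection of the repeater examples can be replayed with a small Python sketch. It assumes the whole input must match, which is why it uses `re.fullmatch`; the `CASES` list and `whole_match` helper are names made up for this illustration.

```python
import re

# (pattern, input, expected result) taken from the repeater example tables.
CASES = [
    (r"a*b", "b", True),
    (r"a*b", "acb", False),
    (r"a+b", "b", False),
    (r"a+b", "aaaaaaaab", True),
    (r"ca?b", "cb", True),
    (r"ca?b", "caab", False),
    (r"h{4}", "hhhhh", False),
    (r"h{2,5}", "hh", True),
    (r"h{3,}", "hh", False),
]


def whole_match(pattern, text):
    """Return True only when the pattern matches the entire input."""
    return re.fullmatch(pattern, text) is not None


for pattern, text, expected in CASES:
    print(pattern, text, whole_match(pattern, text) == expected)
```

Without anchors or a whole-input match, a*b would still find the trailing "b" inside "acb", which is one more reason to wrap dial plan patterns in ^ and $.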
The normal repeat operators try to match as much input as possible, and so are described as “greedy” expressions. Adding a question mark ? after a repeater symbol alters this matching behavior and makes the expression match as little input as possible while still producing a match. A regular expression altered in this way is sometimes referred to as a “lazy” expression.
Pattern | Description |
*? | Matches the previous character or expression zero or more times, while consuming as little input as possible |
+? | Matches the previous character or expression one or more times, while consuming as little input as possible |
?? | Matches the previous character or expression zero or one times, while consuming as little input as possible |
{n,}? | Matches the previous character or expression n or more times, while consuming as little input as possible |
{n,m}? | Matches the previous character or expression at least n and at most m times, while consuming as little input as possible |
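The difference between greedy and lazy matching shows up when the same input offers more than one possible match. A minimal Python sketch with a made-up input:

```python
import re

text = "a1b2b"

# Greedy: .* grabs as much as it can, so the match runs to the last "b".
greedy = re.search(r"a.*b", text).group()

# Lazy: .*? grabs as little as it can, so the match stops at the first "b".
lazy = re.search(r"a.*?b", text).group()

print(greedy)
print(lazy)
```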
An escape character followed by a digit n, where n is in the range 1-9, matches the same string that was matched by the nth Marked Group. Marked Groups are created with an open and close parenthesis pair ().
Pattern | Description |
\1 | Matches the content of the first capturing marked group |
\2 | Matches the content of the second capturing marked group |
\n | Matches the content of the nth capturing marked group (n must be a number, not the character n) |
Examples
Pattern | Input | Match? |
^(a+)b+\1$ | aaabbaaa | Yes |
^(a+)b+\1$ | aaabba | No |
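Back-references can be tried with a short Python sketch. Both patterns below are hypothetical examples chosen for this illustration: the first requires the same run of "a" characters on both sides of the "b" characters, and the second accepts only four identical digits.

```python
import re


def is_match(pattern, text):
    """Return True when the pattern matches the text."""
    return re.search(pattern, text) is not None


# \1 must repeat exactly what (a+) captured.
print(is_match(r"^(a+)b+\1$", "aaabbaaa"))
print(is_match(r"^(a+)b+\1$", "aaabba"))

# \1{3} repeats the captured digit three more times.
print(is_match(r"^(\d)\1{3}$", "7777"))
print(is_match(r"^(\d)\1{3}$", "7778"))
```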
The | operator matches either the characters or expression to its left or the characters or expression to its right.
Pattern | Description |
| | Match the character or expression on either side of the | character |
Pattern | Input | Match? |
abc|def | abc | Yes |
abc|def | def | Yes |
The alternation operator is typically used with parentheses to separate the alternate expressions from the rest of the matching expression.
Pattern | Description |
( | ) | Match the character or expression on either side of the | character, contained within the parentheses ( ) |
Pattern | Input | Match? |
(abc|def) | abc | Yes |
(abc|def) | def | Yes |
(abc|def) | ab | No |
(abc|def) | ef | No |
a(bc|de)f | abcf | Yes |
a(bc|de)f | adef | Yes |
a(bc|de)f | abcdef | No |
a(bc|de)f | af | No |
a(bc|de)f | acdf | No |
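Alternation inside parentheses can fold several similar rules into one. As a sketch only, the Python code below combines the three on-net site rules from the examples table at the top of this article into a single pattern. The `SITE_PREFIXES` mapping and `translate_on_net` function are hypothetical names, and this is not how Lync itself stores rules.

```python
import re

# Site code -> translation prefix, copied from the on-net rows of the examples table.
SITE_PREFIXES = {"11": "+2055234", "22": "+2050234", "33": "+2095234"}

# On-net prefix 6, then one of the three site codes, then a 4-digit extension.
ON_NET = re.compile(r"^6(11|22|33)(\d{4})$")


def translate_on_net(dialed):
    """Return the translated number, or None when no site code matches."""
    m = ON_NET.search(dialed)
    if m is None:
        return None
    site_code, extension = m.groups()
    return SITE_PREFIXES[site_code] + extension


print(translate_on_net("6221234"))
print(translate_on_net("6441234"))
```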
Character Sets [a-z], [A-Z], [0-9]
A character set is a bracket expression starting with [ and ending with ]. It defines a set of characters, and matches any single character that is a member of that set.
Pattern | Description |
[a-z] | Matches any lowercase letter in the range ‘a’ to ‘z’ |
[A-Z] | Matches any UPPERCASE letter in the range ‘A’ to ‘Z’ |
[a-c] | Matches any character in the range ‘a’ to ‘c’ |
[abc] | Matches any of the characters ‘a’, ‘b’, or ‘c’ |
[0-9] | Matches any number in the range ‘0’ to ‘9’ |
[5-8] | Matches any number in the range ‘5’ to ‘8’ |
[1234] | Matches any of the numbers ‘1’, ‘2’, ‘3’ or ‘4’ |
[0-9a-zA-Z] | Matches any character in the ranges ‘0’ to ‘9’, ‘a’ to ‘z’ or ‘A’ to ‘Z’ |
The caret ^ is used immediately after the opening square bracket to match any character that is not in the range or set of characters. For example, the regular expression [^a-c] matches any character that is not in the range a-c.
Pattern | Description |
[^ ] | Matches any character that is not shown after the caret ^ character |
Pattern | Input | Match? |
[^a-z] | 1 | Yes |
[^a-z] | 9 | Yes |
[^a-z] | B | Yes (since ‘B’ is not lowercase) |
[^1-9] | 4 | No |
[^1-9] | 0 | Yes (since ‘0’ is not in the range 1 through 9) |
[^@] | @ | No |
[^<] | < | No |
[^>] | > | No |
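A negated set is handy for cleaning up a number before matching it. The Python sketch below keeps only digits and the plus sign; the `keep_dialable` helper and the sample input are made up for this illustration.

```python
import re


def keep_dialable(raw):
    """Drop every character that is not a digit or a plus sign."""
    # Inside a character set, + is an ordinary character and needs no escape.
    return re.sub(r"[^0-9+]", "", raw)


print(keep_dialable("+20 (2) 3303-3045"))

# Single-character checks from the table, using a whole-input match.
print(re.fullmatch(r"[^a-z]", "B") is not None)
print(re.fullmatch(r"[^1-9]", "4") is not None)
```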
Any special character preceded by the escape character ‘\’ matches itself. The special characters are as follows:
. [ { ( ) * + ? | ^ $ \
Pattern | Description |
\ | If the following character is a special character, ignore its special meaning and match the literal character. |
\. | Match the ‘.’ character literally |
\[ | Match the ‘[‘ character literally |
\{ | Match the ‘{‘ character literally |
\( | Match the ‘(‘ character literally |
\) | Match the ‘)’ character literally |
\* | Match the ‘*’ character literally |
\+ | Match the ‘+’ character literally |
\? | Match the ‘?’ character literally |
\| | Match the ‘|’ character literally |
\^ | Match the ‘^’ character literally |
\$ | Match the ‘$’ character literally |
\\ | Match the ‘\’ character literally |
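Escaping by hand is easy to get wrong, so when a pattern is built in code it can help to let the regex library do it. This Python sketch uses `re.escape` on a hypothetical star-code prefix, *810. It only illustrates the escaping rule and says nothing about whether a given Lync route accepts the resulting pattern.

```python
import re

prefix = "*810"  # hypothetical dialing prefix containing a special character

# re.escape puts a backslash in front of the special "*" character.
escaped = re.escape(prefix)
pattern = "^" + escaped + r"(\d+)$"

m = re.search(pattern, "*810186612345")
print(escaped)
print(m.group(1) if m else None)
```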
http://en.wikipedia.org/wiki/Regular_expression,
http://www.regular-expressions.info/,
https://support.net.com/display/VXDOC/Understanding+Regular+Expressions
Hi, good explanation, but I think your 5-digit (5digitExtension) translation is incorrect. Shouldn’t 33045 be translated to +202 330 8 3045, or did you transpose the 8 and the 3?
You are correct.
I updated it.
Hi.
Can I configure a route pattern like this: ^\*810(\d+)
It should work.
Thank you for the answer.
But I get error 404 every time. See log below
Trace-Correlation-Id: 881287520
Instance-Id: 1066
Direction: outgoing;source=”local”
Peer: 192.168.219.8:60066
Message-Type: response
Start-Line: SIP/2.0 404 No matching rule has been found in the dial plan for the called number.
From: ;tag=e0373e4402;epid=85f032e261
To: ;tag=582C18E68254F818640CDEF7E3369051
Call-ID: f195ac3b8c3e4ccc98cfd1934515d580
CSeq: 1 INVITE
Via: SIP/2.0/TLS 192.168.219.8:60066;ms-received-port=60066;ms-received-cid=3FA00
Content-Length: 0
ms-diagnostics: 14010;reason=”Unable to find an exact match in the rules set”;source=”VMSLYNCFE1.VIRTUAL.LOCAL”;CalledNumber=”*810186612345″;ProfileName=”DefaultProfile”;appName=”TranslationService”
I also tested it from the “Voice Routing” -> “Test Voice Routing” menu, and the test result is Failed. All other routes work correctly. I only have a problem with the route containing the “*” sign.
When I changed ^\*810(\d+) to ^\+810(\d+), the problem disappeared.
Is it possible to route a called number that includes the “*” symbol?
Theoretically, yes, you can.
You can test it here as well: http://regexhero.net/tester/
Many thanks for this document.
I have an urgent requirement for a rule that matches any input that is not +44. I have been racking my brain over it for a long time, but I can’t figure out the pattern.