Regular expressions, commonly known as regex, are sequences of characters that describe a search pattern for text.
A regular expression lets you search for a specific string inside another string, replace one string with another, and split a string into multiple chunks. Characters such as +, - and ^ look like arithmetic operators, but inside a pattern they act as special symbols that are combined to build complex expressions.
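The three tasks mentioned above (search, replace, split) can be sketched as follows. This is a minimal illustration using PHP's preg functions; the sample text and variable names are assumptions made for the example.

```php
<?php
// Hypothetical sample text, used only for illustration.
$text = "apple, banana; cherry";

// Search: does the text contain "banana"? preg_match() gives 1 on a match.
$found = preg_match('/banana/', $text);

// Replace: swap "banana" for "mango".
$replaced = preg_replace('/banana/', 'mango', $text);

// Split: break the text on a comma or semicolon followed by optional spaces.
$parts = preg_split('/[,;]\s*/', $text);

var_dump($found, $replaced, $parts);
```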
By default, regular expressions are case sensitive.
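As a small sketch of this default, the same pattern can be tried with and without the `i` modifier, which makes the match case insensitive. The subject string is a made-up example.

```php
<?php
// Hypothetical subject string for illustration.
$subject = "Hello World";

// Without a modifier the lowercase pattern does not match "Hello".
$caseSensitive = preg_match('/hello/', $subject);

// With the i modifier, letter case is ignored.
$caseInsensitive = preg_match('/hello/i', $subject);

var_dump($caseSensitive, $caseInsensitive);
```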
Regular expressions are used almost everywhere in current application programming. Some advantages and uses of regular expressions are given below:
You can create complex search patterns by applying a few basic rules. The operators listed in the following table, several of which resemble arithmetic symbols (+, -, ^), are combined to build these patterns.
Operator | Description |
---|---|
^ | It indicates the start of string. |
$ | It indicates the end of the string. |
. | It denotes any single character (by default, any character except a newline). |
() | It shows a group of expressions. |
[] | It finds a range of characters, e.g., [abc] means a, b, or c. |
[^] | It finds the characters which are not in range, e.g., [^xyz] means NOT x, y, or z. |
- | It finds the range between the elements, e.g., [a-z] means a through z. |
\| | It is the alternation (logical OR) operator, used between elements, e.g., a\|b means either a OR b. |
? | It indicates zero or one of the preceding character or element. |
* | It indicates zero or more of the preceding character or element. |
+ | It indicates one or more of the preceding character or element. |
{n} | It denotes exactly n occurrences of the preceding character or element, e.g., a{3} means exactly three a's. |
{n,} | It denotes at least n occurrences of the preceding character or element, e.g., a{2,} means two or more a's. |
{n,m} | It denotes at least n but not more than m occurrences, e.g., a{3,6} means 3 to 6 a's. |
\ | It denotes the escape character. |
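Several of the operators above can be combined in one pattern. The sketch below is only an illustration: the pattern (an optional sign, one or more digits, and an optional fractional part of exactly two digits) and the sample inputs are assumptions chosen for the example.

```php
<?php
// ^ and $ anchor the whole string, [+-]? is an optional sign,
// [0-9]+ is one or more digits, and (\.[0-9]{2})? is an optional
// group made of an escaped dot followed by exactly two digits.
$pattern = '/^[+-]?[0-9]+(\.[0-9]{2})?$/';

// Hypothetical inputs.
$samples = ['42', '-3.50', '3.5', 'abc', ''];

$results = [];
foreach ($samples as $sample) {
    $results[$sample] = preg_match($pattern, $sample) === 1;
}

var_dump($results);
```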
Special Character | Description |
---|---|
\n | It indicates a new line. |
\r | It indicates a carriage return. |
\t | It represents a tab. |
\v | It represents a vertical tab. |
\f | It represents a form feed. |
\xxx | It represents the character with octal code xxx. |
\xhh | It denotes the character with hexadecimal code hh. |
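A short sketch of these escapes in use with PHP's preg functions. The pattern strings are single-quoted, so the backslash sequences are passed through and interpreted by the regex engine; the sample data is made up for illustration.

```php
<?php
// Hypothetical tab-separated line ending in a newline.
$line = "name\tvalue\n";

// \t in the pattern matches the tab character.
$hasTab = preg_match('/\t/', $line);

// \x41 is hexadecimal 41, which is the letter "A".
$hasHexA = preg_match('/\x41/', 'ABC');

// Split the line into fields on the tab after removing the newline.
$fields = preg_split('/\t/', rtrim($line, "\n"));

var_dump($hasTab, $hasHexA, $fields);
```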
PHP offers two sets of regular expression functions: POSIX-style regular expressions and Perl-style regular expressions.
The structure of a POSIX regular expression is similar to that of a typical arithmetic expression: several operators and elements are combined to form more complex expressions.
The simplest regular expression is one that matches a single character inside the string. For example, "g" matches inside the strings "toggle" and "cage". Let's introduce some concepts used in POSIX regular expressions:
Brackets [] have a special meaning when they are used in regular expressions. They match any one character from the set or range listed inside them.
Expression | Description |
---|---|
[0-9] | It matches any decimal digit 0 to 9. |
[a-z] | It matches any lowercase character from a to z. |
[A-Z] | It matches any uppercase character from A to Z. |
[a-zA-Z] | It matches any letter, lowercase a to z or uppercase A to Z. (A single range written as [a-Z] is not valid, because a comes after Z in character order.) |
The above ranges are commonly used. You can use the range values according to your need, like [0-6] to match any decimal digit from 0 to 6.
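The bracket ranges can be tried out with a short sketch. It uses preg_match_all() to collect every character that falls in each range; the input string is a made-up example.

```php
<?php
// Hypothetical input for illustration.
$input = "Room 42B";

// Each call collects all single characters that match the range.
preg_match_all('/[0-9]/', $input, $digits);
preg_match_all('/[a-z]/', $input, $lower);
preg_match_all('/[A-Z]/', $input, $upper);
preg_match_all('/[a-zA-Z]/', $input, $letters);

// Index 0 of each result array holds the full matches.
var_dump($digits[0], $lower[0], $upper[0], $letters[0]);
```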
The frequency or position of bracketed character sequences and single characters can be specified with a special character, each of which has a specific meaning. The +, *, ? and {int range} quantifiers and the $ anchor all follow a character sequence, while the ^ anchor precedes it.
Expression | Description |
---|---|
p+ | It matches any string that contains at least one p. |
p* | It matches any string that contains zero or more p's. |
p? | It matches any string that has zero or one p's. |
p{N} | It matches any string that has a sequence of N p's. |
p{2,3} | It matches any string that has a sequence of two or three p's. |
p{2,} | It matches any string that has a sequence of at least two p's. |
p$ | It matches any string that contains p at the end of it. |
^p | It matches any string that has p at the start of it. |
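The quantifiers and anchors in the table can be checked with a small sketch. The helper function `matches()` and the sample words are hypothetical, introduced only for this illustration; the patterns are run through preg_match().

```php
<?php
// Hypothetical helper: true when the pattern matches the subject.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

$plusOnApple  = matches('/p+/', 'apple');    // "apple" contains at least one p
$plusOnOrange = matches('/p+/', 'orange');   // "orange" has no p
$starOnOrange = matches('/p*/', 'orange');   // zero p's still satisfies p*
$twoOnApple   = matches('/p{2}/', 'apple');  // "pp" is a run of two p's
$threeOnApple = matches('/p{3}/', 'apple');  // there is no run of three p's
$endOnSoup    = matches('/p$/', 'soup');     // "soup" ends with p
$startOnSoup  = matches('/^p/', 'soup');     // "soup" does not start with p
$startOnPear  = matches('/^p/', 'pear');     // "pear" starts with p

var_dump($plusOnApple, $plusOnOrange, $starOnOrange, $twoOnApple,
         $threeOnApple, $endOnSoup, $startOnSoup, $startOnPear);
```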
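PHP provided seven functions to search strings using POSIX-style regular expressions. These functions were deprecated in PHP 5.3 and removed in PHP 7.0, so they appear only in legacy code: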
Function | Description |
---|---|
ereg() | It searches a string pattern inside another string and returns true if the pattern matches otherwise return false. |
ereg_replace() | It searches a string pattern inside the other string and replaces the matching text with the replacement string. |
eregi() | It searches for a pattern inside the other string and returns the length of matched string if found otherwise returns false. It is a case insensitive function. |
eregi_replace() | This function works same as ereg_replace() function. The only difference is that the search for pattern of this function is case insensitive. |
split() | The split() function divides a string into an array by a regular expression. |
spliti() | It is similar to the split() function, but the match is case insensitive. |
sql_regcase() | It creates a regular expression for case insensitive match and returns a valid regular expression that will match the string. |
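On PHP versions where the ereg family is unavailable, the same tasks can be approximated with the preg functions. The sketch below is an assumption about how typical calls might be translated, not an official migration recipe; the sample string is made up. Note that preg patterns need delimiters (here `/`), and case insensitivity is expressed with the `i` modifier.

```php
<?php
// Hypothetical value for illustration.
$value = "Order 66";

// ereg('[0-9]+', $value) could become:
$hasDigits = preg_match('/[0-9]+/', $value) === 1;

// eregi('order', $value) could become:
$hasOrder = preg_match('/order/i', $value) === 1;

// ereg_replace('[0-9]+', '#', $value) could become:
$masked = preg_replace('/[0-9]+/', '#', $value);

// split(' ', $value) could become:
$words = preg_split('/ /', $value);

var_dump($hasDigits, $hasOrder, $masked, $words);
```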
Perl-style regular expressions are very similar to POSIX ones. Most POSIX syntax can be used with the Perl-style regular expression functions, provided the pattern is enclosed in delimiters such as /.../. The quantifiers introduced in the POSIX section can also be used in Perl-style regular expressions.
A metacharacter is an alphabetical character preceded by a backslash, which gives the combination a special meaning.
For example, the '\d' metacharacter can be used to search for large money sums: /([\d]+)000/. Here \d matches any numerical character.
Below is the list of metacharacters that can be used in Perl-style regular expressions:
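The money-sum pattern can be tried with preg_match(). This is a minimal sketch; the sentence is a made-up example. Because `[\d]+` is greedy, it first consumes all the digits and then gives back three so that the literal `000` can match.

```php
<?php
// Hypothetical sentence for illustration.
$sentence = "The bill was 25000 dollars";

// $m[0] holds the whole match, $m[1] the digits captured before "000".
$found = preg_match('/([\d]+)000/', $sentence, $m);

var_dump($found, $m);
```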
Character | Description |
---|---|
. | Matches any single character (except a newline by default) |
\s | It matches a whitespace character like space, newline, tab. |
\S | Non-whitespace character |
\d | It matches any digit from 0 to 9. |
\D | Matches a non-digit character. |
\w | Matches a word character: a-z, A-Z, 0-9, or _ |
\W | Matches a non-word character. |
[aeiou] | It matches any single character in the given set. |
[^aeiou] | It matches any single character except the given set. |
(foo\|baz\|bar) | Matches any of the alternatives specified. |
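A few of these metacharacters in action, as a hedged sketch with made-up sample strings:

```php
<?php
// \D matches any non-digit, so removing every \D leaves only the digits.
$phone = "(555) 123-4567";
$digitsOnly = preg_replace('/\D/', '', $phone);

// \s+ matches a run of whitespace; splitting on it yields the chunks between.
$sentence = "Hello,  regex world";
$chunks = preg_split('/\s+/', $sentence);

// \w+ matches a run of word characters, so the comma is left out.
preg_match_all('/\w+/', $sentence, $words);

// Alternation: matches if any one of the listed words appears.
$hasAlternative = preg_match('/(foo|baz|bar)/', 'a bar of soap');

var_dump($digitsOnly, $chunks, $words[0], $hasAlternative);
```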
Several modifiers are available that make working with regular expressions much easier, for example by controlling case sensitivity or searching across multiple lines.
Below is the list of modifiers used in Perl-style regular expressions:
Character | Description |
---|---|
i | Makes the search case insensitive. |
m | It specifies that if a string has carriage return or newline characters, the ^ and $ operators will match at line boundaries rather than only at the string boundaries. |
o | Evaluates the expression only once (a Perl modifier; PHP's preg functions do not accept it). |
s | It allows the use of . (dot) to match a newline character. |
x | This modifier allows us to use whitespace in the expression for clarity. |
g | It globally searches all matches (a Perl modifier; in PHP, use preg_match_all() instead). |
cg | It allows the search to continue even after the global match fails (a Perl modifier; PHP's preg functions do not accept it). |
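The effect of the `m`, `s` and `x` modifiers can be sketched with PHP's preg functions. The sample text and patterns below are assumptions chosen for illustration.

```php
<?php
// Hypothetical two-line text.
$text = "first line\nSecond line";

// With m, ^ matches at the start of every line.
preg_match_all('/^\w+/m', $text, $lineStarts);

// Without m, ^ matches only at the start of the whole string.
preg_match_all('/^\w+/', $text, $stringStart);

// With s, the dot also matches the newline between the two lines.
$dotWithS = preg_match('/line.Second/s', $text);
$dotWithoutS = preg_match('/line.Second/', $text);

// With x, whitespace and # comments inside the pattern are ignored.
$extended = preg_match('/ \d{3}   # three digits
                          -       # a literal hyphen
                          \d{4}   # four digits
                        /x', 'call 555-1234');

var_dump($lineStarts[0], $stringStart[0], $dotWithS, $dotWithoutS, $extended);
```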
PHP currently provides the following functions to search strings using Perl-style regular expressions:
Function | Description |
---|---|
preg_match() | This function searches for the pattern inside the string and returns 1 if the pattern matches, 0 if it does not, and false on error. |
preg_match_all() | This function matches all the occurrences of the pattern in the string. |
preg_replace() | The preg_replace() function works like the ereg_replace() function, except that it uses Perl-style regular expressions for the search pattern. |
preg_split() | This function works like the split() function, except that it accepts a Perl-style regular expression as the pattern. It divides the string by a regular expression. |
preg_grep() | The preg_grep() function searches all the elements of the input array and returns the elements that match the regular expression pattern. |
preg_quote() | Escapes the characters that have a special meaning in regular expressions. |
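The functions in the table can be seen side by side in a short sketch. The sample text and variable names are assumptions made for the example.

```php
<?php
// Hypothetical sample text.
$text = "cat bat rat";

// preg_match() stops at the first match and stores it in $first.
$matched = preg_match('/[bcr]at/', $text, $first);

// preg_match_all() returns the number of matches and stores all of them.
$count = preg_match_all('/[bcr]at/', $text, $all);

// preg_replace() replaces every match.
$replaced = preg_replace('/[bcr]at/', 'hat', $text);

// preg_split() divides the string wherever the pattern matches.
$pieces = preg_split('/\s+/', $text);

// preg_grep() keeps the array elements that match; original keys are preserved.
$startsWithB = preg_grep('/^b/', $pieces);

// preg_quote() escapes regex special characters such as . and +
$quoted = preg_quote('3.50+tax');

var_dump($matched, $first, $count, $all[0], $replaced, $pieces, $startsWithB, $quoted);
```

Because preg_grep() preserves keys, the single result above keeps index 1 from `$pieces`; array_values() can be applied if a re-indexed array is needed.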