Regular Expression Notes
- Basics
- Extracting out the match
- Matching with multiple patterns
- Matching anything with Wildcard period(.)
- Match single character with multiple possibilities([])
- Match single characters not specified([^])
- Match characters that occur one or more times(+)
- Match characters that occur zero or more times(*)
- Find Character with lazy Matching(?)
- Match Beginning string patterns
- Match Ending string patterns
- Specify upper and lower limit to match({})
- Check for All or None(?)
- Matching multiple patterns (Lookaheads)
- Shorthand Character classes
- Flags
Basics
In JavaScript, a regex literal is written between two forward slashes (/.../), and you don't put any quotes around the pattern you want to search for. For example, to search for "the" in the string "the dog is very good", use /the/.
We can call the test method on a regex and pass it the string we want to check. It returns true if the pattern is found and false otherwise.
let waldoIsHiding = "Somewhere Waldo is hiding in this text."
let waldoRegex = /Waldo/
let result = waldoRegex.test(waldoIsHiding)
// prints true
console.log(result)
Extracting out the match
You can extract the substring that matches a regex by calling the match method on the string and passing it the regex.
let extractStr = "Extract the word 'coding' from this string.";
let codingRegex = /coding/;
let result = extractStr.match(codingRegex); // returns a match array whose first element is "coding"
// prints coding
console.log(result[0])
Matching with multiple patterns
You can search for multiple patterns using the alternation (OR) operator, |.
let petString = "James has a pet cat."
let petRegex = /dog|cat|bird|fish/
let result = petRegex.test(petString)
// prints true
console.log(result)
Matching anything with Wildcard period(.)
The wildcard character . matches any single character except a line terminator such as a newline. It is useful when you don't know which character appears at a given position.
let exampleStr = "Let's have fun with regular expressions!";
let unRegex = /.un/; // this will match fun, gun, sun, mun, tun etc.
let result = unRegex.test(exampleStr);
console.log(result)
Match single character with multiple possibilities([])
Character classes let us define a group of characters we wish to match by placing them inside square brackets, [].
If we want to match bag, bug, beg and big but not bog, we can use /b[aieu]g/. Here [aieu] is a character class that matches exactly one of the characters a, i, e or u.
In a character class (also called a character set) we can use - to specify a range of characters. For example, [a-e] matches any single character from a to e, including both ends.
const regex = /[a-z]/gi // it will match all the letters
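The character class and range described above can be tried out with a short sketch like the following. The variable names and sample words are made up for illustration.

```javascript
// Character class: only a, i, e or u may sit between b and g.
const vowelClassRegex = /b[aieu]g/;
const words = ["bag", "big", "beg", "bug", "bog"];
const matchedWords = words.filter((word) => vowelClassRegex.test(word));
console.log(matchedWords); // ["bag", "big", "beg", "bug"]

// Range: [a-e] matches any single letter from a through e.
const rangeMatches = "badge".match(/[a-e]/g);
console.log(rangeMatches); // ["b", "a", "d", "e"]
```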
Match single characters not specified([^])
To match any character that is not in a set, we use a negated character set. To create one, place a caret (^) right after the opening bracket, before the characters we don't want to match.
let quoteSample = "3 blind mice.";
let myRegex = /[^0-9aeiou]/gi; // matches any character that is neither a digit nor a vowel
let result = quoteSample.match(myRegex);
console.log(result)
Match characters that occur one or more times(+)
If we want to match a character that appears one or more times in a row, we can use +. For example, /a+/g finds a in abc, aa in aabc, and two separate matches [a, a] in abab, because the two a's are not contiguous.
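A quick sketch of the three cases just described, using the match method with the g flag (the variable name plusRegex is only illustrative):

```javascript
const plusRegex = /a+/g;

const inAbc = "abc".match(plusRegex);   // one "a"
const inAabc = "aabc".match(plusRegex); // two a's in a row form one match
const inAbab = "abab".match(plusRegex); // separated a's form two matches

console.log(inAbc);  // ["a"]
console.log(inAabc); // ["aa"]
console.log(inAbab); // ["a", "a"]
```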
Match characters that occur zero or more times(*)
* is used the same way as +, except that it matches a character that appears zero or more times.
For example, /go*/ matches goooo in the string gooool and g in the string gut is roof.
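The same example as a runnable sketch. The third string is an extra, made-up case showing that the g itself is still required even though the o's are optional.

```javascript
const starRegex = /go*/;

const manyOs = "gooool".match(starRegex)[0];     // g followed by four o's
const zeroOs = "gut is roof".match(starRegex)[0]; // g followed by zero o's
const noG = "roof".match(starRegex);              // no g at all, so no match

console.log(manyOs); // "goooo"
console.log(zeroOs); // "g"
console.log(noG);    // null
```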
Find Character with lazy Matching(?)
In regular expressions, a greedy match finds the longest possible part of the string that fits the regex pattern and returns it. The alternative is called lazy matching, which finds the smallest possible part of the string that satisfies the regex.
We can apply the regex /t[a-z]*i/ to the string titanic. This regex describes a pattern that starts with t, ends with i, and has zero or more lowercase letters in between.
Regular expressions are greedy by default, so the match returns ["titani"]. It finds the largest substring that fits the pattern.
However, we can put the ? character after a quantifier to change it to lazy matching. titanic matched against the adjusted regex /t[a-z]*?i/ returns ["ti"].
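To see both behaviours side by side, here is a small sketch of the titanic example (the variable names are illustrative):

```javascript
const greedyRegex = /t[a-z]*i/; // * takes as many letters as it can
const lazyRegex = /t[a-z]*?i/;  // *? takes as few letters as it can

const greedyMatch = "titanic".match(greedyRegex)[0];
const lazyMatch = "titanic".match(lazyRegex)[0];

console.log(greedyMatch); // "titani"
console.log(lazyMatch);   // "ti"
```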
Match Beginning string patterns
Outside a character set, ^ anchors the pattern to the beginning of the string, so the match must start at the first character.
let rickyAndCal = "Cal and Ricky both like racing.";
let calRegex = /^Cal/;
let result = calRegex.test(rickyAndCal);
console.log(result) // prints true
Match Ending string patterns
We can use $ at the end of a regex to match the pattern only at the end of the string.
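A minimal sketch of an end-of-string match, using made-up sample sentences:

```javascript
const endRegex = /caboose$/;

// The word appears at the very end of the string.
const endsWithWord = endRegex.test("The last car on a train is the caboose");
// The word is present, but not at the end.
const wordAtStart = endRegex.test("caboose is the last car on a train");

console.log(endsWithWord); // true
console.log(wordAtStart);  // false
```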
Specify upper and lower limit to match({})
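We can use {lower,upper} after an element to specify the minimum and maximum number of times it may repeat. Write it without a space after the comma, because a space inside the braces stops JavaScript from reading it as a quantifier.
- {lower,upper} - Specify both lower and upper limit.
- {lower,} - Specify only lower limit.
- {exact} - Specify exact number of matches.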
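The three forms can be compared with a sketch like this one. The patterns are anchored with ^ and $ so that the whole string has to fit the limits; the names and test strings are only illustrative.

```javascript
const twoToFour = /^a{2,4}$/; // between 2 and 4 a's
const atLeastTwo = /^a{2,}$/; // 2 or more a's
const exactlyThree = /^a{3}$/; // exactly 3 a's

const limits = {
  tooFew: twoToFour.test("a"),           // false: only one a
  withinRange: twoToFour.test("aaaa"),   // true: four a's
  tooMany: twoToFour.test("aaaaa"),      // false: five a's
  openEnded: atLeastTwo.test("aaaaa"),   // true: no upper limit
  exact: exactlyThree.test("aaa"),       // true
  notExact: exactlyThree.test("aaaa"),   // false
};
console.log(limits);
```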
Check for All or None(?)
We can mark an element as optional with a question mark, ?. It matches zero or one of the preceding element, so you can read it as "the previous symbol is optional".
For example, /colou?r/ matches both color and colour, and /jpe?g/ matches both jpg and jpeg.
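Both examples as a runnable sketch. The misspelled string colouur is an added, made-up case showing that ? allows at most one occurrence.

```javascript
const colorRegex = /colou?r/;
const jpegRegex = /jpe?g/;

const american = colorRegex.test("color");   // u is absent
const british = colorRegex.test("colour");   // u appears once
const doubled = colorRegex.test("colouur");  // u appears twice
const shortExt = jpegRegex.test("photo.jpg");
const longExt = jpegRegex.test("photo.jpeg");

console.log(american, british, doubled); // true true false
console.log(shortExt, longExt);          // true true
```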
Matching multiple patterns (Lookaheads)
Lookaheads are patterns that tell JavaScript to look ahead in your string to check for patterns further along. This is useful when we want to check multiple patterns against the same string.
There are two kinds of lookaheads:
- Positive Lookahead (?=...) - A positive lookahead checks that the pattern inside it is present, without including it in the match. It is written as (?=...), where the ... is the part that must be there.
- Negative Lookahead (?!...) - A negative lookahead checks that the pattern inside it is not present. It is written as (?!...), where the ... is the pattern that we don't want to be there.
Example:-
let password = "abc123";
let checkPass = /(?=\w{3,6})(?=\D*\d)/; // looks for a run of at least 3 word characters and at least one digit (the upper limit of 6 is not enforced without anchors)
checkPass.test(password); // Returns true
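The password example only uses positive lookaheads, so here is a small additional sketch that contrasts the two kinds. The patterns and sample words are made up for illustration.

```javascript
const qBeforeU = /q(?=u)/;    // q only if a u comes next
const qNotBeforeU = /q(?!u)/; // q only if a u does not come next

// The lookahead checks for u but does not include it in the match.
const positiveMatch = "quit".match(qBeforeU)[0];
const negativeOnQuit = qNotBeforeU.test("quit"); // q is followed by u
const negativeOnIraq = qNotBeforeU.test("Iraq"); // q is the last character

console.log(positiveMatch);  // "q"
console.log(negativeOnQuit); // false
console.log(negativeOnIraq); // true
```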
Shorthand Character classes
- \w - Matches alphanumeric characters. This shortcut is equal to /[a-zA-Z0-9_]/. It matches lowercase letters, uppercase letters, digits and _ (underscore).
- \W - Matches non-alphanumeric characters. This shortcut is equal to /[^a-zA-Z0-9_]/.
- \d - Matches digits. This is equal to [0-9].
- \D - Matches non-digit characters. This is equal to [^0-9].
- \s - Matches whitespace. This matches not only spaces but also carriage return, tab, form feed and newline characters. This is similar to [ \r\t\n\f\v].
- \S - Matches non-whitespace characters.
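The shorthand classes above can be compared on one sample string, as in this sketch (the string and variable names are made up):

```javascript
const sample = "Room 42, floor_3!";

const digits = sample.match(/\d/g);        // every digit
const wordRuns = sample.match(/\w+/g);     // runs of letters, digits and _
const nonWord = sample.match(/\W/g);       // everything \w rejects
const spaces = sample.match(/\s/g);        // whitespace characters
const nonSpaceRuns = sample.match(/\S+/g); // runs without whitespace

console.log(digits);       // ["4", "2", "3"]
console.log(wordRuns);     // ["Room", "42", "floor_3"]
console.log(nonWord);      // [" ", ",", " ", "!"]
console.log(spaces);       // [" ", " "]
console.log(nonSpaceRuns); // ["Room", "42,", "floor_3!"]
```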
Flags
- Ignore case: The i flag. It makes the regex ignore case when matching. It is appended to the regex like this: /ignorecase/i. This regex can match ignorecase, IgnoreCase, igNoreCase, etc.
- Global find: By default str.match(regex) finds only the first match and returns it. To find all matches, we use the g flag. It is written as /regex/g.
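A short sketch showing the two flags on their own and combined. The sample text is made up for illustration.

```javascript
const caseRegex = /ignorecase/i;
const mixedCase = caseRegex.test("igNoreCase"); // case differences are ignored

const text = "Repeat, repeat, REPEAT";
const firstOnly = text.match(/repeat/i)[0];  // without g: first match only
const allExact = text.match(/repeat/g);      // g alone: case still matters
const allAnyCase = text.match(/repeat/gi);   // g and i together

console.log(mixedCase);  // true
console.log(firstOnly);  // "Repeat"
console.log(allExact);   // ["repeat"]
console.log(allAnyCase); // ["Repeat", "repeat", "REPEAT"]
```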