00:00
There is a saying that when you see a problem and decide to use a regular expression to solve it, all of a sudden you have two problems. Humor aside, the main objective of a regular expression, which is also called a regex, is to validate, search, and replace text. We can create an instance of a regular expression within JavaScript by using the built-in RegExp constructor, but this is rarely how people create regular expressions, because they are such a core part of JavaScript that the language has special syntax for creating them.
00:33
As far as differences between the two are concerned, they are equivalent. They are both instances of the built-in RegExp class. The string representation of the regex gives us the original special syntax that we would have used to create the regex, and we can get just the source string, which we would have passed to the constructor, by using the source property. One of the simplest methods on the regex is the test method, which takes an input string and returns true if the regex finds a match within that string. Now this particular regex that we have created is searching for the characters a, b, c, d, e, f in that sequence.
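A minimal sketch of the two creation styles and the properties just described. The variable names are illustrative and not taken from the video.

```javascript
// Two equivalent ways to create the same regular expression.
const fromConstructor = new RegExp("abcdef");
const fromLiteral = /abcdef/;

// Both are instances of the built-in RegExp class.
console.log(fromConstructor instanceof RegExp); // true
console.log(fromLiteral instanceof RegExp);     // true

// The string representation is the special literal syntax.
console.log(fromConstructor.toString()); // "/abcdef/"

// The source property is the string we would pass to the constructor.
console.log(fromLiteral.source); // "abcdef"

// test returns true when the regex finds a match in the input string.
console.log(fromLiteral.test("abcdef")); // true
```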
01:08
So of course it matches this particular string. However, by default it is case sensitive, so something like capital ABCDEF will not match, and therefore test returns false. Now the regex is actually searching for a match within that string, so if abcdef exists anywhere within the string, test will return true. In addition to the source, the other portion of configuring a regular expression is the flags, which determine key behaviors of the regular expression. Two common flags are i, which stands for case insensitivity, and g, which stands for global and is required for functions that are designed to work with multiple matches.
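The behavior of test described above can be sketched like this. The input strings are assumptions chosen for illustration.

```javascript
const abcdef = /abcdef/;

// Exact lowercase sequence: matches.
console.log(abcdef.test("abcdef")); // true

// Case sensitive by default: the uppercase version does not match.
console.log(abcdef.test("ABCDEF")); // false

// The regex searches within the string, so surrounding text is fine.
console.log(abcdef.test("xx abcdef yy")); // true

// The characters must appear in sequence without interruption.
console.log(abcdef.test("abc def")); // false
```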
01:45
There are other flags as well, but they are rarely used, so we won't be covering them in this particular tutorial. We can pass the flags to the RegExp constructor as a second argument, or alternatively, when we are using the special syntax, we can provide them after the ending slash. Here we are providing only one flag, which is i for case insensitivity. So this particular regex will match abcdef in lowercase as well as any mixed-case or full-uppercase version of the same characters. Similarly, we can create a regular expression for the characters f o o,
02:18
and then we can create another version with the global flag. With the global flag, the guidance is pretty simple: if you are using a function that does not end with All, do not use the global flag. Otherwise you will get some behavior that you should not care about in modern JavaScript. For our demo of the differences between the global and the non-global version of a regex, we will create a very simple string, table football foosball, and you can see that foo appears twice within this string. With the match function, you should actually use the non-global version, and this will still give you back an array, but as you can see,
02:52
it only contains one match, and that is because we are using the non-global version. If we want to get all the matches, we should use the more modern matchAll function, and if you try to pass it a regex that does not have the global flag, JavaScript will actually throw a runtime error saying that matchAll was called with a non-global regex. The fix of course is pretty simple. We should invoke the All-style function with a regex that has the global flag, and in that case it returns an iterator, and as you can see it has two items inside of it. So it is matching both the football and the foosball foo.
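A sketch of the flags and the global versus non-global guidance above. The variable names are illustrative, and the exact demo string is assumed from the narration.

```javascript
// Flags go in the second constructor argument or after the closing slash.
const viaConstructor = new RegExp("abcdef", "i");
const viaLiteral = /abcdef/i;
console.log(viaConstructor.test("ABCDEF")); // true
console.log(viaLiteral.test("AbCdEf"));     // true

const foo = /foo/;        // non-global
const fooGlobal = /foo/g; // global
const input = "table football foosball";

// match with the non-global regex: an array holding only the first match.
const first = input.match(foo);
console.log(first.length); // 1
console.log(first[0]);     // "foo"

// matchAll with a non-global regex throws a TypeError at runtime.
let errorName = null;
try {
  input.matchAll(foo);
} catch (error) {
  errorName = error.name;
}
console.log(errorName); // "TypeError"

// matchAll with the global regex returns an iterator over every match.
const all = [...input.matchAll(fooGlobal)];
console.log(all.length); // 2
```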
03:26
Sometimes you might need to change the flags on a regex. For example, we have a simple regex with just one flag, which is i, and we can create a different one with both i and g by using the RegExp constructor, passing the source property of the original regex and then the original flags combined with anything else that we want. For example, here we want to make this one global. Methods exist on the regular expression object that take a string, but most commonly you'll be using methods that are built into the string type and pass in the regular expression as the argument. You've already seen the
04:02
regex method test, which takes an input string and returns true if the regex finds a match within that string. A method also exists on the string called search that takes a regex as an input and returns the index of the first match within that string. So here the match for abc exists at index zero, and within _abc, the match for abc exists at index one. Of course, if no match is found, then the search function returns minus one, which is similar to other index-based search functions we have looked at in this course.
04:35
One of the most powerful functions for a regular expression within JavaScript is the string match. If no match is found, then it returns null. So of course abc does not contain any instances of foo. However, within the string games football foosball, foo exists twice. Match is something that you should use without the global flag, and it will give you back the first match at index zero, and another property this match will have is index, which tells you where in the string this particular match was found. In this particular case it was found at index six. If you want to find all of the matches,
05:10
you should use the matchAll function and pass in a regex that contains the global flag. It always returns an iterator, which we can easily convert into an array by using the spread operator, and in this particular case no match is found, so the array is going to be empty. However, when we repeat the same example, games football foosball, with matchAll and foo being case insensitive and with the global flag, we get back an iterator that has two matches, and each of the items is going to be exactly the same as the result for the single match: at the zero index,
05:43
we will find the matched value, and at the index property we will find the index for that match. Similar to the string match and string matchAll duality, string also has replace and replaceAll functions. You should use replace without the global flag, and it will replace the first instance. As an example, here we are using replace to replace the first instance of a with capital A. Alternatively, you can invoke replaceAll with a regex that has the global flag, and here we are replacing all instances of a with capital A. And finally, the split method that exists on a string also accepts a regular expression, and
06:19
this returns an array of the text surrounding the matches. So for this particular input we get the strings that are surrounding the character a. Obviously there is a lot to regular expressions, and as they get large, they get harder to understand. So here are some excellent resources that will help us deep dive into regular expressions. With these you can just pop in your regular expression and they will explain in human language what the regular expression is trying to do. The first resource that I want to point out is regex101, and this is what I tend to use when I'm working with languages beyond just JavaScript.
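Before moving on to the tools, here is a sketch of changing flags and of the string methods covered so far (search, match, matchAll, replace, replaceAll, and split). The variable names and the banana input are assumptions for illustration; the games string follows the narration.

```javascript
// Changing flags: rebuild from the source plus the original flags.
const caseInsensitive = /foo/i;
const caseInsensitiveGlobal = new RegExp(
  caseInsensitive.source,
  caseInsensitive.flags + "g"
);
console.log(caseInsensitiveGlobal.flags); // "gi"

// search: index of the first match, or -1 when there is none.
console.log("abc".search(/abc/));  // 0
console.log("_abc".search(/abc/)); // 1
console.log("xyz".search(/abc/));  // -1

// match (non-global): null, or the first match plus its index.
const games = "games football foosball";
console.log("abc".match(caseInsensitive)); // null
const firstMatch = games.match(caseInsensitive);
console.log(firstMatch[0]);    // "foo"
console.log(firstMatch.index); // 6

// matchAll (global): an iterator, spread here into an array.
const none = [..."abc".matchAll(caseInsensitiveGlobal)];
console.log(none.length); // 0
const found = [...games.matchAll(caseInsensitiveGlobal)].map((m) => [m[0], m.index]);
console.log(found); // [["foo", 6], ["foo", 15]]

// replace (non-global) changes the first instance; replaceAll (global) changes all.
const replacedFirst = "banana".replace(/a/, "A");
const replacedAll = "banana".replaceAll(/a/g, "A");
console.log(replacedFirst); // "bAnana"
console.log(replacedAll);   // "bAnAnA"

// split returns the text surrounding each match.
const pieces = "banana".split(/a/);
console.log(pieces); // ["b", "n", "n", ""]
```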
06:52
On the left you can select the language that you are working with, and right now we are working with JavaScript. On the right you can provide your text, and at the top you can provide the regular expression that you want to match against that text. So here we have the input string a, b, c, d, e, f, a, b, c, and when we pop in the regular expression abc, you can see that it finds two matches, one from index zero to three and the other from index eight to 11. But the most important thing over here is that it is explaining what the regular expression is trying to do. In this simple case, it is simply trying to match abc literally, the characters in that order.
07:28
When I'm exclusively working with JavaScript, perhaps a better resource is regexr.com. Here again, you can provide your input text, and we will use the same text, a, b, c, d, e, f, a, b, c, and on top we can provide a regular expression. So here we are using the regular expression abc, and you can see that it again found two matches and it is explaining what that regular expression is trying to do. In this simple case, it is looking for the character a, followed by the character b, followed by the character c. If all we are doing is searching and replacing simple exact strings,
08:02
JavaScript's built-in string find and replace would be sufficient. But what makes regular expressions really powerful is their support for special characters. So let's uncover the full-blown programming language in itself that is regular expressions. The simplest special characters are the caret and the dollar sign, which are used to match the start and the end of a string. So given this input string, we know that we can find all instances of the characters a, b, c in that order by simply using the regular expression abc. However, if we only want to match abc if it exists at the start of the string,
08:36
we can use the caret character to specify that we only want to match abc when it appears at the beginning of the text. Similarly, if we only want to match abc if it appears at the end, we can use the dollar sign to specify: match me abc only when it is followed by the end of the text. So now a question that you should have is, what if you want to match the caret character by itself? For that purpose you can use the escape character, which is backslash. So backslash caret will match the caret exactly, and similarly backslash dollar will match the dollar character, and backslash backslash will match the
09:13
backslash character itself. In addition to these special characters, we can use the backslash with any of the other special characters that we will look at as well. One of the most powerful features of JavaScript regular expressions is that they support lookahead and lookbehind, which are collectively called lookaround. So given this input string, we know that we can find all instances of the character b by just using b. However, what if we only want to find the b's if they are preceded by the character a? This is called a lookbehind, which we can do with parentheses, question mark, less than to signify behind,
09:48
equals to signify that it should exist, and then whatever we want to match, which in our particular case is simply the character a. And now you can see that only the b's with an a before them have been matched. Instead of equals, we can use the exclamation mark, meaning not, to specify that the sequence, which in our case is just a, should not exist, and now only those b's have been selected where the preceding character was not a. Similar to the positive and negative lookbehind, we have the positive and negative lookahead. So if we want to make sure that we only select the b if an a appears in the
10:23
forward direction, we can use parentheses, question mark, equals followed by the pattern, which is a, and now only those b's have been selected which are followed by a. Similar to the positive lookahead, there is a negative lookahead. We simply replace the equals sign with the not operator, and now it will search for b's such that the next character is not a. In our particular example there is only one such b. Regular expressions also allow you to specify quantifiers, so you can choose to only match text if it appears a certain number of times. So consider the input text, which is a soup of b's followed by o's,
10:59
followed by h's. We can select all instances of b followed by o using bo, but as you can see we have an o without a b. So we can make b optional by using the question mark quantifier, and then we can select zero or more o's by using the star quantifier. And finally we have the h at the end, and again we will select zero or more h's by using h star. Unlike the star, which is zero or more, we can use the plus quantifier, which is one or more. So in this particular case b is optional, o must exist one or more times, and h can exist zero or more times.
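The anchors, escapes, lookarounds, and the question mark, star, and plus quantifiers covered so far can be sketched as follows. The helpers indexesOf and textsOf and all input strings are hypothetical, chosen for illustration rather than copied from the video.

```javascript
// Hypothetical helpers: list the index or the text of every match (regex must be global).
const indexesOf = (input, regex) => [...input.matchAll(regex)].map((m) => m.index);
const textsOf = (input, regex) => [...input.matchAll(regex)].map((m) => m[0]);

// Anchors: ^ is the start of the string, $ is the end.
const anchored = "abc def abc";
console.log(indexesOf(anchored, /abc/g));  // [0, 8]
console.log(indexesOf(anchored, /^abc/g)); // [0]
console.log(indexesOf(anchored, /abc$/g)); // [8]

// Escapes: a backslash makes a special character literal.
console.log(/\^/.test("2^3"));  // true
console.log(/\$/.test("$5"));   // true
console.log(/\\/.test("a\\b")); // true (the input holds a single backslash)

// Lookaround: assert what is before or after without including it in the match.
const letters = "ab bb ba cb";
console.log(indexesOf(letters, /b/g));       // [1, 3, 4, 6, 10]
console.log(indexesOf(letters, /(?<=a)b/g)); // [1]  positive lookbehind
console.log(indexesOf(letters, /(?<!a)b/g)); // [3, 4, 6, 10]  negative lookbehind
console.log(indexesOf(letters, /b(?=a)/g));  // [6]  positive lookahead
console.log(indexesOf(letters, /b(?!a)/g));  // [1, 3, 4, 10]  negative lookahead

// Quantifiers: ? is optional, * is zero or more, + is one or more.
const soup = "boo boohh ooh bh";
console.log(textsOf(soup, /bo/g));     // ["bo", "bo"]
console.log(textsOf(soup, /b?o+h*/g)); // ["boo", "boohh", "ooh"]
console.log(/^b?o*h*$/.test("bh"));    // true: zero o's is allowed
console.log(/^b?o+h*$/.test("bh"));    // false: at least one o is required
```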
11:35
In addition to the question mark, plus, and star quantifiers, we can provide exact numbers as well. As an example, if we want o to appear exactly twice, we can provide that number between curly brackets to specify that the quantity must be two. We can even provide a minimum and a maximum quantity. So as an example, we can specify that o must appear a minimum of one and a maximum of two times, or even something more diverse like a minimum of one and a maximum of ten times. And you can see that we are only matching strings if they have the required quantity of o. Regular expressions
12:10
also support built-in character classes that allow you to easily match a set of predefined characters. The built-in character class word can be used with backslash w, and as you can see it matches all alphanumerics along with the underscore. Another character class is backslash d, which stands for digits, which are the characters zero to nine. To match all of the whitespace, including spaces, tabs, and line breaks, we can use backslash s. We can negate all of these character classes by using capitalized versions of them. For example,
12:41
backslash capital W matches anything that is not a word character, which in our case is just the whitespace. Backslash capital D matches anything that is not a digit, which is everything other than the digits. Backslash capital S matches everything that is not whitespace. A very special character class is the dot, which matches anything except a line break. Its main use case is matching things that surround a particular pattern. For example, we can select any character followed by e, followed by whatever follows e, by using dot e dot,
13:15
which in our example matches meg. In addition to the built-in character classes, you can also define your own custom character sets. We know that we can match all instances of a using a and all instances of z using z, but we can match all instances of a or z by creating a character set between square brackets that only includes the characters a and z. Additionally, we can include the dash between a and z to include all of the characters that exist in the range a to z, which is all of the lowercase alphabet. We can add numbers to this set as well by using zero to nine.
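The counted quantifiers, built-in character classes, and the dot described above can be sketched like this. The helper matchTexts and the input strings are assumptions for illustration.

```javascript
// Hypothetical helper: list the text of every match (regex must be global).
const matchTexts = (input, regex) => [...input.matchAll(regex)].map((m) => m[0]);

// Counted quantifiers: {n} is exactly n, {min,max} is a range.
const words = "bo boo booo";
console.log(matchTexts(words, /bo{2}/g));    // ["boo", "boo"]
console.log(matchTexts(words, /bo{1,2}/g));  // ["bo", "boo", "boo"]
console.log(matchTexts(words, /bo{1,10}/g)); // ["bo", "boo", "booo"]

// Built-in character classes and their capitalized negations.
const sample = "a_1 b-2";
console.log(matchTexts(sample, /\w/g)); // ["a", "_", "1", "b", "2"]
console.log(matchTexts(sample, /\d/g)); // ["1", "2"]
console.log(matchTexts(sample, /\s/g)); // [" "]
console.log(matchTexts(sample, /\W/g)); // [" ", "-"]
console.log(matchTexts(sample, /\D/g)); // ["a", "_", " ", "b", "-"]
console.log(matchTexts(sample, /\S/g)); // ["a", "_", "1", "b", "-", "2"]

// The dot matches any character except a line break.
console.log(matchTexts("omega", /.e./g)); // ["meg"]
console.log(/./.test("\n"));              // false
```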
13:50
And we can even bring in a complete character class, for example all of the whitespace, by using backslash s. Now because the dash is used to provide a range within character sets, if you want to match the dash itself, you must provide it either at the start or at the end of the character set. We can negate the character set as well. So if we only want to match things that are not a part of this character set, we can use the caret as the first character within the character set. So now in our example, it is only matching capital characters, which are not a part of this particular character set.
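A sketch of the custom character sets built up above, one step at a time. The helper setMatches and the input string are assumptions for illustration.

```javascript
// Hypothetical helper: list the text of every match (regex must be global).
const setMatches = (input, regex) => [...input.matchAll(regex)].map((m) => m[0]);

const mixed = "Az-09 bY";

// Only the characters a or z.
console.log(setMatches(mixed, /[az]/g)); // ["z"]

// The range a to z.
console.log(setMatches(mixed, /[a-z]/g)); // ["z", "b"]

// Lowercase letters plus the digits zero to nine.
console.log(setMatches(mixed, /[a-z0-9]/g)); // ["z", "0", "9", "b"]

// A built-in class inside the set: also match whitespace.
console.log(setMatches(mixed, /[a-z0-9\s]/g)); // ["z", "0", "9", " ", "b"]

// A literal dash goes at the start or the end of the set.
console.log(setMatches(mixed, /[a-z0-9\s-]/g)); // ["z", "-", "0", "9", " ", "b"]

// A caret as the first character negates the set.
console.log(setMatches(mixed, /[^a-z0-9\s-]/g)); // ["A", "Y"]
```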
14:21
Another special character worth knowing is the pipe, which is used as the logical or. We know that we can match the simple character sequence abba using abba, or lol using lol. But what if you want to match abba or lol? Well, the answer is pretty simple. Just put a pipe between the two character sequences, and now it is either going to match the first one or the second one, and nothing else is going to get matched. When you want to take a portion of the match and add additional specifiers to that group, we can do that by creating what are called capture groups. As an example, let's say we want to match the characters abba when they appear twice in a
14:59
sequence. We can try to use the quantifier two at the end, but it is only going to quantify the last a, so it matches abbaa. But what we really want is the whole group abba appearing twice. We can do that by adding a grouping before we use the quantifier, and now it is only going to match abbaabba. The reason why groups in regular expressions are called capture groups is because they allow you to get access to what is matched for each group. The groups are essentially captured and handed over to you for further analysis. Consider the string from which we want to extract any multiplication of two
15:36
numbers, which we assume are going to be the width and the height. We can match digits using backslash d, and we want these digits to appear one or more times, which we do with the plus quantifier. Then we want to follow it up with an x and then another digit sequence one or more times with backslash d plus. And it does achieve our objective of matching this particular string. But wouldn't it be great if we could extract the width and the height portions of this particular match into their own groups? And that is why groups are called capture groups. We simply put parentheses around the first digit plus and then parentheses
16:12
around the second digit plus. And now if we hover over the match, you can see that it contains groups, the first one being 1920 and the second one being 1080. So let's take a look at how we can get access to these groups using the JavaScript match and matchAll functions that we looked at before. We create a simple string that contains a resolution value. Now if we were to use a regular expression without any capture groups, we will still get a match, but as you can see it has a length of one, and the item at the zero index is the full portion that got matched. If however,
16:45
we invoke match with separate capture groups for the width and the height, the returned result has a length of three. The item at the zeroth index is still the full resolution, but now it has an item at index one, which is the width, and an item at index two, which is the height. And we can use array destructuring on the result to get the full value, the width, and the height into their own variables. Now let's consider another example which has multiple resolutions embedded inside of it. For this particular case, we would want to use matchAll and then use the global flag within our regular
17:19
expression. matchAll returns an iterable, so we can simply loop over it using for...of. Each of these items is going to be the same as the result we saw for match, so it will contain the full value, the width, and the height at index zero, one, and two. Now that you have completed this masterclass, you have no reason to be afraid of regular expressions, and if you ever come across them, you have access to resources to easily decode them. If you ever want to refresh your memory, you can always just watch this tutorial again. As always, thank you for joining me and I will see you in the next one.
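As a recap of the alternation and capture group walkthrough above, here is a minimal sketch. The variable names, the second alternative lol, and the second resolution 1280x720 are assumptions for illustration.

```javascript
// Alternation: match either sequence.
const either = [..."abba lol abc".matchAll(/abba|lol/g)].map((m) => m[0]);
console.log(either); // ["abba", "lol"]

// A quantifier applies to the last character unless a group is used.
console.log(/^abba{2}$/.test("abbaa"));      // true: only the final a is repeated
console.log(/^abba{2}$/.test("abbaabba"));   // false
console.log(/^(abba){2}$/.test("abbaabba")); // true: the whole group is repeated

// Without capture groups the result only holds the full match.
const resolution = "1920x1080";
const noGroups = resolution.match(/\d+x\d+/);
console.log(noGroups.length); // 1

// With capture groups the width and height get their own slots.
const withGroups = resolution.match(/(\d+)x(\d+)/);
console.log(withGroups.length); // 3
const [full, width, height] = withGroups;
console.log(full, width, height); // "1920x1080" "1920" "1080"

// Multiple resolutions: matchAll with the global flag and a for...of loop.
const sizes = [];
for (const [, w, h] of "1920x1080 and 1280x720".matchAll(/(\d+)x(\d+)/g)) {
  sizes.push([w, h]);
}
console.log(sizes); // [["1920", "1080"], ["1280", "720"]]
```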