31 May Getting Started With Regular Expressions in Python
In this tutorial, we’ll discuss regular expressions in Python. According to Wikipedia, a regular expression is a sequence of characters that specifies a search pattern in text. Put another way, regular expressions are patterns used to match character combinations in strings.
Python provides a built-in module that supports regular expressions (regex). The module is called re, and we import it to use it in our code. We’ll discuss what the module offers and how to use it.
Let’s look at different examples of regex in Python.
In this example, we use the match function to check whether our test string starts with “I” and ends with “texas”.
    import re

    # ^ anchors the start of the string, $ anchors the end
    pattern = '^I.*texas$'
    test_string = 'I am in texas'
    result = re.match(pattern, test_string)

    if result:
        print("Got the search.")
    else:
        print("Search unsuccessful.")
Below is our result:
    Got the search.
Before we look at more examples, let’s discuss the different RegEx functions available in Python.
RegEx Functions
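Since the first example relies on match, it helps to see how match differs from search. The following is a minimal sketch; the sample string and the variable names are made up for illustration and are not part of the original example.

```python
import re

text = "Today I am in texas"  # hypothetical sample string

# re.match only tries to match at the very start of the string.
at_start = re.match(r"I.*texas", text)

# re.search scans the string for the first position where the pattern matches.
anywhere = re.search(r"I.*texas", text)

print(at_start)          # None, because the string starts with "Today"
print(anywhere.group())  # I am in texas
```

In the first example both functions would succeed, because the string itself starts with “I”.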
findall: The findall function can be used to return all non-overlapping matches of the pattern in our string of data.
In this example, we find every run of digits in our text string.
    import re

    # The string of text where the regular expression will be searched.
    string_1 = """Here are some customer's id, Mr Joseph: 396
    Mr Jones: 457
    Mrs Shane: 222
    Mr Adams: 156
    Miss Grace: 908"""

    # The regular expression for finding digits in the string.
    # A raw string (r"...") keeps the backslash in \d from being
    # treated as a Python string escape.
    regex_1 = r"(\d+)"
    match_1 = re.findall(regex_1, string_1)
    print(match_1)
We should have this as our result.
    ['396', '457', '222', '156', '908']
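When the pattern contains more than one capturing group, findall returns a list of tuples, one tuple per match. As a minimal sketch (the shortened record list and the names records and pairs are assumptions made for illustration), we can capture each surname together with its id:

```python
import re

# A shortened, hypothetical version of the customer list.
records = """Mr Joseph: 396
Mrs Shane: 222
Miss Grace: 908"""

# Two capturing groups: the word before the colon and the digits after it.
pairs = re.findall(r"(\w+): (\d+)", records)
print(pairs)  # [('Joseph', '396'), ('Shane', '222'), ('Grace', '908')]
```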
Search: The search function checks whether a particular pattern exists anywhere within a string. If the search succeeds, it returns a match object; if not, it returns None.
Look at the example we have here. We use the search function to look for “school” in our string.
    import re

    s = 'what school did you graduate from?'

    match = re.search(r'school', s)

    # search returns None when nothing matches, so check before using it
    if match:
        print('Start Index:', match.start())
        print('End Index:', match.end())
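It gives us the result below. The start index is the position of the first character of “school” in the string (counting from 0), and the end index is the position just after its last character. We can confirm this by counting the characters in the string.
    Start Index: 5
    End Index: 11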
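Because search returns None when the pattern is not found, it can be convenient to wrap the lookup in a small helper. The sketch below is one possible way to do it; find_word is a hypothetical helper name, not something provided by the re module.

```python
import re

s = "what school did you graduate from?"

def find_word(word, text):
    """Return (start, end) of the first occurrence of word, or None."""
    # re.escape makes sure the word is treated as literal text.
    match = re.search(re.escape(word), text)
    if match is None:
        return None
    return match.span()

print(find_word("school", s))   # (5, 11)
print(find_word("college", s))  # None
```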
Split: The split function splits the string at each occurrence of the regex pattern we specify and returns a list containing the resulting substrings. The example below should make this clearer.
In this example, the pattern matches a hyphen or any other run of non-alphanumeric characters, and maxsplit=2 limits the split to the first two hyphens.
    import re

    # The pattern for a hyphen or any other run of non-alphanumeric characters
    pattern = r'\W+'

    # Our target string
    string = "100-joe-01-10-2022"

    # Split the string at the first 2 hyphens only
    txt_ = re.split(pattern, string, maxsplit=2)

    print(txt_)
Output
    ['100', 'joe', '01-10-2022']
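To see what maxsplit changes, here is a short sketch that splits the same string with no limit and with a limit of one. The variable names all_parts and one_split are chosen for illustration only.

```python
import re

string = "100-joe-01-10-2022"

# Without maxsplit, the string is split at every match of the pattern.
all_parts = re.split(r"\W+", string)
print(all_parts)  # ['100', 'joe', '01', '10', '2022']

# With maxsplit=1, only the first match is used as a split point.
one_split = re.split(r"\W+", string, maxsplit=1)
print(one_split)  # ['100', 'joe-01-10-2022']
```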
Sub: The sub function replaces every occurrence of a pattern in a string with a replacement and returns the new string. The original string is left unchanged.
Let’s see the example below:
    import re

    # Our given string
    s = "Debugging is very important when coding."

    # Performing the sub() operation
    out_1 = re.sub('a', 'x', s)
    # [a,I] is a character set matching 'a', ',' or 'I';
    # only 'a' occurs in this string, so the result equals out_1
    out_2 = re.sub('[a,I]', 'x', s)
    out_3 = re.sub('very', 'not', s)

    # Print output
    print(out_1)
    print(out_2)
    print(out_3)
Output
    Debugging is very importxnt when coding.
    Debugging is very importxnt when coding.
    Debugging is not important when coding.
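The sub function also accepts an optional count argument that limits the number of replacements, and the replacement string can refer back to captured groups with \1, \2 and so on. The following is a minimal sketch of both; the date string and the variable names are assumptions made for illustration.

```python
import re

s = "Debugging is very important when coding."

# count=2 replaces only the first two matches, working left to right.
first_two = re.sub(r"g", "G", s, count=2)
print(first_two)  # DebuGGing is very important when coding.

# \1, \2 and \3 in the replacement refer to the captured groups,
# so this reorders day-month-year into year/month/day.
reordered = re.sub(r"(\d+)-(\d+)-(\d+)", r"\3/\2/\1", "01-10-2022")
print(reordered)  # 2022/10/01
```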
Notice that we used a different pattern with each of the functions above. This brings us to metacharacters.
Python RegEx Meta Characters
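Metacharacters are characters with a special meaning inside a pattern, and they are very useful for defining rules to find the specific pattern we want in a string. The examples above already used several of them: ^, $, ., *, +, square brackets, parentheses, and the backslash sequences \d and \W.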
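To make this concrete, here is a minimal sketch of how the metacharacters from the earlier examples behave. The sample strings and variable names are made up for illustration.

```python
import re

samples = ["I am in texas", "We are in texas", "I am in Ohio"]

# ^ anchors the start, $ anchors the end,
# . matches any character and * repeats it zero or more times.
anchored = [s for s in samples if re.search(r"^I.*texas$", s)]
print(anchored)  # ['I am in texas']

# [] defines a set of characters; + means "one or more".
vowel_runs = re.findall(r"[aeiou]+", "queue")
print(vowel_runs)  # ['ueue']

# \d matches a digit; \W matches a non-alphanumeric character.
digits = re.findall(r"\d+", "Mr Adams: 156")
parts = re.split(r"\W", "a-b c")
print(digits)  # ['156']
print(parts)   # ['a', 'b', 'c']
```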
And that’s about Regular Expressions in Python. Thanks for reading this article. See you in the next post.