Regular Expressions in Python

This tutorial walks you through regular expressions (RegEx) in Python. It covers the re module and its most commonly used functions, each with a short example.

Note: The syntax used here is for Python 3. You may modify it to use with other versions of Python.

What is Regular Expression?

A Regular Expression, or RegEx, is a sequence of characters that defines a search pattern used for matching or searching within strings.

Python Regular Expression Support

In Python, regular expressions are available through the built-in re module. It provides functions for searching, matching, splitting, and replacing text based on patterns.

The re module offers capabilities similar to Perl's RegEx. It comprises functions such as match(), sub(), split(), search(), findall(), etc.

How to use Regular Expression in Python?

To use a regular expression, you first need to import the re module. You also need to know how to pass a raw string (r'expression') as the pattern, and how to interpret the result of a RegEx function.

Import Re Module

When you want to use any function from the re module, you can access it with the syntax below:

import re
re.function_name(list_of_arguments)

Or use this alternative approach.

from re import function_name
function_name(list_of_arguments)

Use Raw String Argument

Pass the pattern as a raw string (prefixed with r) so that Python does not interpret backslashes in the pattern as escape sequences before the re module sees them. The code below shows how to use it.

from re import search
search(r"[a-z]", "yogurt AT 24")
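To see why the r prefix matters, here is a minimal sketch comparing a plain string and a raw string that both look like the pattern \bcat\b. The sample text and variable names are made up for illustration.

```python
import re

# Without the r prefix, Python turns "\b" into a backspace character
# before the re module ever sees the pattern.
plain = "\bcat\b"
raw = r"\bcat\b"

print(len(plain))  # 5 characters: backspace, c, a, t, backspace
print(len(raw))    # 7 characters: the backslashes and the b's survive

text = "the cat sat"
plain_result = re.search(plain, text)  # looks for literal backspaces
raw_result = re.search(raw, text)      # \b is a word boundary here

print(plain_result)        # None
print(raw_result.group())  # cat
```

With the raw string, the re module receives the backslashes intact and treats \b as a word boundary.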

RegEx Function Return Value

If a Python RegEx function (mainly the search() and match() functions) succeeds, then it returns a Match object.

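We can call the group() method on this object to extract the matched string.

The group() method takes an optional numeric argument: with no argument (or 0) it returns the entire matched string, and with a positive number it returns the corresponding parenthesized subgroup.

import re
matchResult = re.search(r"([a-z]+) AT (\d+)", "yogurt AT 24")
print("matchResult.group() : ", matchResult.group())
print("matchResult.group(1) : ", matchResult.group(1))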
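As a further sketch of reading a Match object, the example below uses a hypothetical date string with three parenthesized subgroups. It also checks for None first, since a failed search returns None and calling group() on it would raise an error.

```python
import re

# Hypothetical sample text; the pattern has three capturing groups.
text = "Released on 2024-05-17"
result = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)

if result is not None:           # search() returns None when nothing matches
    whole = result.group()       # same as group(0): the entire match
    year = result.group(1)       # first parenthesized subgroup
    parts = result.groups()      # all subgroups as a tuple
    position = result.span()     # (start, end) indexes of the match
    print(whole)     # 2024-05-17
    print(year)      # 2024
    print(parts)     # ('2024', '05', '17')
    print(position)  # (12, 22)

no_result = re.search(r"\d{4}-\d{2}-\d{2}", "no date here")
print(no_result)     # None
```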

Regular Expression Functions

The two most commonly used functions are search() and match(). When you perform a regular expression search on a string, the regex engine scans it from left to right. If the pattern matches, the function returns a Match object; otherwise, it returns None.

re.search(argument_list)

The search() function scans the string and returns a Match object for the first location where the string pattern matches.

The syntax for regular expression search is:

import re
re.search(string_pattern, string, flags)

Please note that you can use the following metacharacters to form string patterns.

(+ ? . * ^ $ ( ) [ ] { } | \)

Apart from the previous set, there are some more such as:

\A, \n, \r, \t, \d, \D, \w, \Z, etc.
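To make a few of these symbols more concrete, here is a small sketch that tries some of them with findall() and search(). The sample strings are made up for illustration, and the list is far from exhaustive.

```python
import re

text = "Order 66 shipped\nto Alice"

digits = re.findall(r"\d+", text)    # \d matches a digit, + means one or more
words = re.findall(r"\w+", text)     # \w matches letters, digits, underscore
at_start = re.search(r"\AOrder", text)  # \A anchors at the very start
line_starts = re.findall(r"^\w+", text, re.MULTILINE)  # ^ at each line start
spellings = re.findall(r"colou?r", "color colour")     # ? makes 'u' optional

print(digits)            # ['66']
print(words)             # ['Order', '66', 'shipped', 'to', 'Alice']
print(at_start.group())  # Order
print(line_starts)       # ['Order', 'to']
print(spellings)         # ['color', 'colour']
```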

Let’s see the search() example:

from re import search
result = search(r"[a-z]", "yogurt AT 24")
print(result)

The output is as follows (Python 3.7 and later; older versions show _sre.SRE_Match instead of re.Match):

<re.Match object; span=(0, 1), match='y'>
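As a further sketch, the same sample string can be searched with other patterns and with the optional flags argument. re.IGNORECASE is a standard flag of the re module; the variable names are only illustrative.

```python
import re

text = "yogurt AT 24"

first_upper = re.search(r"[A-Z]+", text)           # first run of capitals
first_number = re.search(r"\d+", text)             # first run of digits
ignore_case = re.search(r"at", text, re.IGNORECASE)  # flags change matching
missing = re.search(r"xyz", text)                  # pattern not present

print(first_upper.group(), first_upper.span())  # AT (7, 9)
print(first_number.group())                     # 24
print(ignore_case.group())                      # AT
print(missing)                                  # None
```

Since search() returns None on failure, it is safer to test the result before calling group() on it.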

re.match(argument_list)

The match() function returns a Match object only if the string pattern matches at the very start of the string; it does not look for the pattern further along.

The syntax for regular expression match is:

import re
re.match(string_pattern, string, flags)

Let’s see the match() example:

from re import match
print(match(r"PVR", "PVR Cinemas is the best."))

The output is as follows (Python 3.7 and later; older versions show _sre.SRE_Match instead of re.Match):

<re.Match object; span=(0, 3), match='PVR'>
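The difference between match() and search() is easiest to see side by side. The sketch below reuses the sample sentence and also tries re.fullmatch(), which succeeds only when the whole string matches the pattern.

```python
import re

text = "PVR Cinemas is the best."

at_start = re.match(r"PVR", text)             # pattern is at position 0
not_at_start = re.match(r"Cinemas", text)     # present, but not at the start
found_anywhere = re.search(r"Cinemas", text)  # search() scans the whole string
whole_string = re.fullmatch(r"PVR", text)     # the string has more than "PVR"

print(at_start.group())       # PVR
print(not_at_start)           # None
print(found_anywhere.span())  # (4, 11)
print(whole_string)           # None
```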

re.split(argument_list)

It splits the string at each occurrence of the string pattern and returns the pieces as a list.

The syntax for the split() is:

import re
re.split(string_pattern, string)

Let’s see the split() example:

from re import split
print(split(r"y", "Python"))

The output is as follows:

['P', 'thon']
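Splitting becomes more useful when the separator is a real pattern rather than a single letter. The sketch below, using a made-up list of fruit names, splits on a comma or semicolon followed by optional spaces, and also shows the optional maxsplit argument and the effect of a capturing group.

```python
import re

text = "apples, pears;plums,  figs"

parts = re.split(r"[,;]\s*", text)                # split at every separator
limited = re.split(r"[,;]\s*", text, maxsplit=1)  # split only once
kept = re.split(r"([,;])", "a,b;c")   # a capturing group keeps the separators

print(parts)    # ['apples', 'pears', 'plums', 'figs']
print(limited)  # ['apples', 'pears;plums,  figs']
print(kept)     # ['a', ',', 'b', ';', 'c']
```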

re.sub(argument_list)

It replaces the parts of a string that match the string pattern with a replacement string.

The syntax for the sub() is:

import re
re.sub(string_pattern, replacement, string)

Let’s see the sub() example:

from re import sub
print(sub(r"Machine Learning", "Artificial Intelligence", "Machine Learning is the Future."))

The output is as follows:

Artificial Intelligence is the Future.
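By default sub() replaces every match. As a brief sketch, the example below limits the number of replacements with the optional count argument and reuses captured groups in the replacement through backreferences (\1, \2). The sample sentences are made up for illustration.

```python
import re

text = "Machine Learning is the Future. Machine Learning is here."

all_replaced = re.sub(r"Machine Learning", "AI", text)        # every match
first_only = re.sub(r"Machine Learning", "AI", text, count=1)  # first match only
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")     # swap two words

print(all_replaced)  # AI is the Future. AI is here.
print(first_only)    # AI is the Future. Machine Learning is here.
print(swapped)       # world hello
```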

re.findall(argument_list)

It returns a list of all non-overlapping matches of the string pattern in the string.

The syntax for findall() is:

import re
re.findall(string_pattern, string)

Let’s see the findall() example:

from re import findall
print(findall(r"[a-e]", "I am interested in Python Programming Language"))

The output is as follows:

['a', 'e', 'e', 'e', 'd', 'a', 'a', 'a', 'e']
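findall() behaves slightly differently when the pattern contains capturing groups: it then returns the groups rather than the whole match. The following sketch shows both cases, plus the empty list returned when nothing matches. The key=value string is a made-up example.

```python
import re

text = "I am interested in Python Programming Language"

# No groups: each list item is the full matched text.
capitalized = re.findall(r"[A-Z][a-z]+", text)

# Two groups: each list item is a tuple of the captured groups.
pairs = re.findall(r"(\w+)=(\d+)", "a=1, b=22")

# No match at all: the result is an empty list, not None.
no_digits = re.findall(r"\d", text)

print(capitalized)  # ['Python', 'Programming', 'Language']
print(pairs)        # [('a', '1'), ('b', '22')]
print(no_digits)    # []
```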

re.compile(argument_list)

It compiles a string pattern into a pattern object that you can store and reuse, instead of passing the pattern string to a function each time.

The syntax for compile() is:

import re
re.compile(string_pattern)

Let’s see the compile() example:

import re
future_pattern = re.compile(r"[0-9]")  # A compiled pattern that can be stored for future use.
print(future_pattern.search("1 s d f 2 d f 3 f d f 4 A l s"))
print(future_pattern.match("1 s d f 2 d f 3 f d f 4 "))

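The output is as follows (Python 3.7 and later; older versions show _sre.SRE_Match instead of re.Match):

<re.Match object; span=(0, 1), match='1'>
<re.Match object; span=(0, 1), match='1'>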
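A compiled pattern object offers the same operations as the module-level functions, so one object can be reused for several tasks. Here is a minimal sketch with illustrative variable names:

```python
import re

digit = re.compile(r"[0-9]")  # compile once, reuse below
text = "1 s d f 2 d f 3 f d f 4 A l s"

all_digits = digit.findall(text)   # every digit in the string
masked = digit.sub("#", text)      # replace each digit with '#'
not_at_start = digit.match("s 1")  # match() still anchors at the start

print(all_digits)    # ['1', '2', '3', '4']
print(masked)        # # s d f # d f # f d f # A l s
print(not_at_start)  # None
```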

Further References

To learn more about module re in Python 3, you can visit the following link.

REF: https://docs.python.org/3/library/re.html

The official documentation may be a bit too dense for beginners or intermediate users. However, if you are an advanced user, it is worth a visit.

Best,

TechBeamers