LearnPython.com
26th Apr 2022 7 minutes read

How to Get a Substring of a String in Python

Luke Hande
  • python

Learn how to get a substring of a string in Python.

Learning anything new can be a challenge. The more you work with Python, the more you notice how often strings pop up. String manipulation in Python is an important skill. In this article, we give you an introduction to generating a substring of a string in Python.

Python is a great language to learn, especially if you’re a beginner, as we discuss in this article. We even have a course on working with strings in Python. It contains interactive exercises that start at the basic level and teach you all you need to know about this important data type. Once you’re comfortable working with strings, you can work on some interesting data science problems. Take a look at the Python for Data Science course, which gives you an introduction to this diverse topic.

Slicing and Splitting Strings

The first way to get a substring of a string in Python is by slicing and splitting. Let’s start by defining a string, then jump into a few examples:

>>> string = 'This is a sentence. Here is 1 number.'

You can break this string up into substrings, each of which has the str data type. Even if your string contains only digits, it is still of this data type. You can test this with the built-in type() function. Numbers themselves may be of other types, including the decimal data type, which we discuss here.
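As a minimal sketch of that check (the variable names here are illustrative and not part of the original example), type() reports str for a string of digits, and a different type only after an explicit conversion:

```python
# A string that contains only digits is still a str.
numeric_text = '42'
print(type(numeric_text))   # <class 'str'>

# Converting it with int() produces a value of a different type.
as_int = int(numeric_text)
print(type(as_int))         # <class 'int'>
```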

Much like arrays and lists in Python, strings can be sliced by specifying the start and the end indexes, inside square brackets and separated by a colon. This returns a substring of the original string.

Remember that indexing in Python starts from 0 and that the end index of a slice is not included in the result. To get the first 7 characters from the string, simply do the following:

>>> print(string[:7])
This is

Notice here we didn’t explicitly specify the start index. Therefore, it takes a default value of 0.

By the way, if you want more information about the print() function, check out this article. There’s probably more to it than you realize.

We can also index relative to the end of the string by specifying a negative start value:

>>> print(string[-7:])
number.

Since we didn’t specify an end value, it takes the default value of len(string). If you know the start and the end indexes of a particular word, you can extract it from the string like this:

>>> print(string[10:18])
sentence

However, this is not optimal for extracting individual words from a string since it requires knowing the indexes in advance.
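One way around hard-coded indexes, shown here only as a sketch, is to let Python compute them. The built-in str.find() method is not used elsewhere in this article; it returns the index of the first occurrence of a substring, or -1 if the substring is absent. The variable names are illustrative.

```python
string = 'This is a sentence. Here is 1 number.'
word = 'sentence'

# find() returns the index where the word starts, or -1 if it is not found.
start = string.find(word)
if start != -1:
    end = start + len(word)        # the end index is exclusive
    extracted = string[start:end]
    print(start, end)              # 10 18
    print(extracted)               # sentence
```

This still requires knowing the word you are looking for, so it complements slicing rather than replacing the approaches below.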

Another option to get a substring of the string is to break it into words, which can be done with the str.split() method. This takes two optional arguments: a string defining the separator to split at (defaults to any whitespace), and the maximum number of splits (defaults to -1, which means no limit). As an example, if you want to split at a space, you can do the following, which returns a list of strings:

>>> string.split(' ')
['This', 'is', 'a', 'sentence.', 'Here', 'is', '1', 'number.']

But notice the full stop (point character) is included at the end of the words “sentence” and “number”. We’ll come back to this later in the article when we look at regular expressions.
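The two optional arguments of split() described above can be seen in a short sketch. The variable names are illustrative; the last two lines show how splitting on the default whitespace differs from splitting on a single space when a string contains consecutive spaces.

```python
string = 'This is a sentence. Here is 1 number.'

# No arguments: split on runs of any whitespace.
any_whitespace = string.split()
print(any_whitespace)

# A maxsplit of 2: split at most twice and leave the rest in one piece.
limited = string.split(' ', 2)
print(limited)    # ['This', 'is', 'a sentence. Here is 1 number.']

# With consecutive spaces, the two forms behave differently.
spaced = 'a  b'
print(spaced.split())      # ['a', 'b']
print(spaced.split(' '))   # ['a', '', 'b']
```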

There are plenty of built-in string methods in Python. They allow you to modify a string, test its properties, or search in it. A useful method to generate a more complex substring of a string in Python is the str.join() method. It takes an iterable of strings and joins them, using the string it is called on as the separator. Here’s an example:

>>> print(' and '.join(['one', 'two', 'three']))
one and two and three

With a clever indexing trick, this can be used to print a substring containing every second word from the original:

>>> print(' '.join(string.split(' ')[::2]))
This a Here 1

Since the join() method accepts any iterable of strings, such as a list, you can use a list comprehension to create a substring from all words with a length equal to 4, for example. For those of you looking for a more challenging exercise, try this for yourself. We’ll also show you a different method to do this later in the article. If you want to know how to write strings to a file in Python, check out this article.
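If you would like to check your answer to the exercise, here is one possible solution, given as a sketch rather than the only approach. Note that because the string is split at spaces, "sentence." and "number." keep their full stops and are counted with them.

```python
string = 'This is a sentence. Here is 1 number.'

# Keep only the words that are exactly 4 characters long.
four_letter_words = [word for word in string.split(' ') if len(word) == 4]

# Join the selected words back into a single substring.
substring = ' '.join(four_letter_words)
print(substring)   # This Here
```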

The parse Module

There’s a little-known Python module called parse with great functionality for generating a substring in Python. This module doesn’t come standard with Python and needs to be installed separately. The best way is to run pip install parse from your terminal.

Here’s how to get a substring using the parse() function, which takes two required arguments, a format string and the string to parse:

>>> import parse
>>> substring = parse.parse('This is {}. Here is 1 {}.', 'This is a sentence. Here is 1 number.')
>>> substring.fixed
('a sentence', 'number')

Accessing the fixed attribute of substring returns a tuple with the substrings extracted from the second argument at the position of the curly braces {} in the first argument. For those of you familiar with string formatting, this may look suspiciously familiar. Indeed, the parse module is the opposite of format(). Check this out, which does the opposite of the above code snippet:

>>> print('This is {}. Here is 1 {}.'.format('a sentence', 'number'))
This is a sentence. Here is 1 number.

While we’re talking about the parse module, it’s worth discussing the search function, since searching is a common use case when working with strings. The first argument of search defines what you’re looking for by specifying the search term with curly braces. The second defines where to look.

Here’s an example:

>>> result = parse.search('is a {}.', 'This is a sentence. Here is 1 number')
>>> result.fixed
('sentence',)

Once again, the fixed attribute holds a tuple with the results. If you want the start and the end indexes of the result, use the spans attribute. Using the parse module to search in a string is nice because it’s pretty robust to how you define what you’re searching for (i.e., the first argument).

Regular Expressions

The last Python module we want to discuss is re, which is short for “regex,” which is itself short for “regular expression.” Regular expressions can be a little intimidating – they involve defining highly specialized and sometimes complicated patterns to search in strings.

You can use regex to extract substrings in Python. The topic is too deep to cover here comprehensively, so we’ll just mention some useful functions and give you a feel for how to define the search patterns. For more information on this module and its functionality, see the documentation.

The findall() function takes two required arguments: pattern and string. Let’s start by importing the module and extracting all words from the string we used above:

>>> import re
>>> re.findall(r'[a-z]+', 'This is a sentence. Here is 1 number.', flags=re.IGNORECASE)
['This', 'is', 'a', 'sentence', 'Here', 'is', 'number']

The [a-z] pattern matches any single lowercase letter, the + means one or more of them in a row (so words may be of any length), and the flag tells findall() to ignore the case. Compare this to the result we got above by using split(), and you’ll notice the full stop is not included. The digit 1 is not matched either, since it is not a letter.
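To see what the flag contributes, the following sketch compares three variants of the same call. The variable names are illustrative. Without the flag, the capital letters are not matched, so "This" and "Here" lose their first character.

```python
import re

text = 'This is a sentence. Here is 1 number.'

# Case-insensitive matching via the flag, as in the example above.
with_flag = re.findall(r'[a-z]+', text, flags=re.IGNORECASE)

# The same result with both ranges written out in the pattern.
explicit = re.findall(r'[A-Za-z]+', text)

# Without the flag, uppercase letters are skipped.
lowercase_only = re.findall(r'[a-z]+', text)

print(with_flag == explicit)   # True
print(lowercase_only)          # ['his', 'is', 'a', 'sentence', 'ere', 'is', 'number']
```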

Now, let’s extract all numbers from the string:

>>> re.findall(r'\b\d+\b', 'This is a sentence. Here is 1 number.')
['1']

\b matches a word boundary, here at the start and the end of the pattern, \d matches any decimal digit, and again the + indicates the numbers may be of any length. Word boundaries are useful for other patterns too. For example, we find all words with a length of 4 characters with the following:

>>> re.findall(r'\b\w{4}\b', 'This is a sentence. Here is 1 number.')
['This', 'Here']

\w matches any word character (a letter, a digit, or an underscore), and {4} defines the length of the words to match. To generate a substring, you just need to use str.join() as we did above. This is an alternative approach to the list comprehension we mentioned earlier, which may also be used to generate a substring with all words of length 4.
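Put together, the regex approach might look like the sketch below. The variable names and the second sample string are illustrative; the last lines show that \w also matches digits and underscores, not only letters.

```python
import re

text = 'This is a sentence. Here is 1 number.'

# Find every run of exactly 4 word characters between word boundaries.
four_letter_words = re.findall(r'\b\w{4}\b', text)

# Join the matches into one substring.
substring = ' '.join(four_letter_words)
print(substring)    # This Here

# \w covers letters, digits, and the underscore.
word_chars = re.findall(r'\w+', 'snake_case 42')
print(word_chars)   # ['snake_case', '42']
```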

There are other functions in this module worth taking a look at. match() may be used to determine if the pattern matches at the beginning of the string, and search() scans through the string to look for any location where the pattern occurs.
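The difference between the two can be seen in a short sketch, using the same sentence as before (the variable names are illustrative). Both functions return a match object on success and None otherwise, so it is worth checking the result before using it.

```python
import re

text = 'This is a sentence. Here is 1 number.'

# match() only succeeds if the pattern matches at the very start.
at_start = re.match(r'This', text)
not_at_start = re.match(r'sentence', text)
print(at_start is not None)    # True
print(not_at_start)            # None

# search() scans the whole string for the first occurrence.
found = re.search(r'sentence', text)
if found is not None:
    print(found.group())       # sentence
    print(found.span())        # (10, 18)
    # The span can be used to slice the original string.
    print(text[found.start():found.end()])
```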

Closing Thoughts on Generating Substrings in Python

In this article, we have discussed extracting and printing substrings of strings in Python. Use this as a foundation to explore other topics such as scraping data from a website. Can you define a regex pattern to extract an email address from a string? Or remove punctuation from this paragraph? If you can, you’re on your way to becoming a data wrangler!
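As a starting point for the punctuation challenge, here is one possible sketch. It uses re.sub(), which is not covered in this article, to replace every character that is neither a word character nor whitespace with an empty string. This is just one of several ways to approach it.

```python
import re

text = 'This is a sentence. Here is 1 number.'

# Remove anything that is not a word character (\w) or whitespace (\s).
# Underscores count as word characters, so they would be kept.
no_punctuation = re.sub(r'[^\w\s]', '', text)
print(no_punctuation)   # This is a sentence Here is 1 number
```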

If you also work a lot with tabular data, we have an article that shows you how to pretty-print tables in Python. Slowly adding all these skills to your toolbox will turn you into an expert programmer.
