LearnPython.com
24th Feb 2022 6 minutes read

How to Visualize Sound in Python

Luke Hande
  • python
  • technical

There’s a lot of music and voice data out there. There are also interesting applications to go with them. We show you how to visualize sound in Python.

The analysis of audio data has become ever more relevant in recent times. Popular virtual assistant products have been released by major technology companies, and these products are becoming more common in smartphones and homes around the world. They are largely developed on top of models that analyze voice data and extract information from it.

There is a large range of applications using audio data analysis, and this is a rich topic to explore. In this article, we’re going to focus on a fundamental part of the audio data analysis process – plotting the waveform and frequency spectrum of the audio file.

This article is aimed at people with a bit more background in data analysis. If you’re a beginner and are looking for some material to get up to speed in data science, take a look at this track.

Opening a WAV File

Audio files come in a variety of formats. You’re probably familiar with MP3, which uses lossy compression to store data. Formats such as FLAC use lossless compression, which allows the original data to be perfectly reconstructed from the compressed data. Our audio file is in the WAV (Waveform Audio File) format, which typically stores uncompressed audio. The file sizes can get large as a consequence.

The sound file we’ll look at is an upbeat jingle that starts with a piano. Other sounds like bells and clapping come in throughout the jingle, with a strumming guitar part at two points in the track. It’s worth mentioning these features in the audio recording because we can identify some of these later when we plot the waveform and the frequency spectrum.

To open our WAV file, we use the wave module in Python, which can be imported and called as follows:

>>> import wave
>>> wav_obj = wave.open('file.wav', 'rb')

The 'rb' mode returns a Wave_read object. Using 'wb' to open the file returns a Wave_write object, which has different methods from the former object. You can also open the file in a with statement, which closes it automatically when the block ends. If you’re interested in learning more about how to programmatically handle large numbers of files, take a look at this article.
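As a minimal sketch of the with statement approach, the example below writes a short test tone to an in-memory buffer and reads it back. The buffer, the 440 Hz tone, and the 8000 Hz rate are assumptions made so that the snippet runs without any file on disk; with a real recording you would pass the file name to wave.open() instead.

```python
import io
import math
import struct
import wave

# Build a one-second, 440 Hz mono test tone in memory (a hypothetical
# stand-in for 'file.wav', so the example needs no file on disk).
sample_freq = 8000
samples = [int(10000 * math.sin(2 * math.pi * 440 * k / sample_freq))
           for k in range(sample_freq)]

buffer = io.BytesIO()
with wave.open(buffer, 'wb') as wav_out:
    wav_out.setnchannels(1)            # mono
    wav_out.setsampwidth(2)            # 2 bytes per sample, i.e. 16-bit audio
    wav_out.setframerate(sample_freq)
    wav_out.writeframes(struct.pack('<%dh' % len(samples), *samples))

# Reopen for reading; the with statement closes the reader for us.
buffer.seek(0)
with wave.open(buffer, 'rb') as wav_in:
    n_channels = wav_in.getnchannels()
    frame_rate = wav_in.getframerate()
    n_frames = wav_in.getnframes()
    raw = wav_in.readframes(n_frames)

print(n_channels, frame_rate, n_frames, type(raw).__name__, len(raw))
```

The raw bytes hold two bytes per sample here, so their length is the number of frames times the number of channels times the sample width.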

A sound wave is a continuous quantity that needs to be sampled at some time interval to digitize it. The sampling rate quantifies how many samples of the sound are taken every second. We can access this information using the following method:

>>> sample_freq = wav_obj.getframerate()
>>> sample_freq
44100

In this case, the signal is sampled 44,100 times per second, which corresponds to CD quality. The number of individual frames, where each frame holds one sample per channel, is given by:

>>> n_samples = wav_obj.getnframes()
>>> n_samples
5384326

We can now calculate how long our audio file is in seconds:

>>> t_audio = n_samples/sample_freq
>>> t_audio
122.09356009070295
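For a more readable duration, the same arithmetic can be turned into minutes and seconds. This is just a small sketch that reuses the two values printed above; the formatting choice is ours.

```python
n_samples = 5384326     # value reported by getnframes() above
sample_freq = 44100     # value reported by getframerate() above

# Duration in seconds, then split into whole minutes and leftover seconds.
t_audio = n_samples / sample_freq
minutes, seconds = divmod(t_audio, 60)
duration = f"{int(minutes)}:{seconds:05.2f}"
print(duration)
```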

The audio file is recorded in stereo, that is, in two independent audio channels. This creates the impression of the sound coming from two different directions. We can check the number of channels as follows:

>>> n_channels = wav_obj.getnchannels()
>>> n_channels
2

The next step is to get the values of the signal, that is, the amplitude of the wave at that point in time. To do this, we can use the readframes() method, which takes one argument, n, defining the number of frames to read:

>>> signal_wave = wav_obj.readframes(n_samples)

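This method returns a bytes object. Check for yourself by using the type() built-in function on the signal_wave object. To get signal values from this, we have to turn to numpy. The code below assumes 16-bit samples; wav_obj.getsampwidth() returns the sample width in bytes, which is 2 for 16-bit audio: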

>>> import numpy as np
>>> signal_array = np.frombuffer(signal_wave, dtype=np.int16)

This returns all data from both channels as a 1-dimensional array, with the left and right samples interleaved. If you check the shape of signal_array, you’ll notice it has 10,768,652 elements, which is exactly n_samples * n_channels. Because the samples alternate between channels, we can split the data into individual channels with a clever little array slice trick:

>>> l_channel = signal_array[0::2]
>>> r_channel = signal_array[1::2]

Now, our left and right channels are separated, both containing 5,384,326 integers representing the amplitude of the signal.
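The slicing above is specific to 16-bit stereo audio. As a hedged sketch of a more general approach, the hypothetical helper below (frames_to_channels is our own name, not part of the wave module) picks a dtype from the sample width and reshapes the interleaved samples so that each column is one channel. It only covers 8-, 16-, and 32-bit integer PCM; 24-bit audio is left out.

```python
import numpy as np

# Little-endian dtypes for common integer PCM sample widths (in bytes).
# 8-bit WAV audio is unsigned; 16- and 32-bit samples are signed.
# 24-bit audio is deliberately not handled in this sketch.
PCM_DTYPES = {1: np.dtype('u1'), 2: np.dtype('<i2'), 4: np.dtype('<i4')}

def frames_to_channels(raw, sample_width, n_channels):
    """Hypothetical helper: readframes() bytes -> array of shape (n_frames, n_channels)."""
    if sample_width not in PCM_DTYPES:
        raise ValueError(f"unsupported sample width: {sample_width} bytes")
    samples = np.frombuffer(raw, dtype=PCM_DTYPES[sample_width])
    # Frames are interleaved (L, R, L, R, ...), so each row is one frame.
    return samples.reshape(-1, n_channels)

# Tiny stereo example: three frames with left = 1, 2, 3 and right = -1, -2, -3.
raw = np.array([1, -1, 2, -2, 3, -3], dtype='<i2').tobytes()
channels = frames_to_channels(raw, sample_width=2, n_channels=2)
l_channel, r_channel = channels[:, 0], channels[:, 1]
print(l_channel, r_channel)
```

With a real file, the arguments would come from readframes(), getsampwidth(), and getnchannels().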

Next, we show some examples of how to plot the signal values. We have our data stored in arrays here, but for many data science applications, pandas is very useful. Check out this article about visualizing data stored in a DataFrame.

Plotting the Signal Amplitude

Before we get to plotting signal values, we need to calculate the time at which each sample is taken. Consecutive samples are separated by the total length of the track in seconds divided by the number of samples, which is 1/sample_freq. We can use linspace() from numpy to create an array of timestamps, excluding the endpoint so that the spacing is exact:

>>> times = np.linspace(0, n_samples/sample_freq, num=n_samples, endpoint=False)
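To see how the timestamps relate to the sampling rate, here is a small sketch with a deliberately tiny, made-up rate of 8 samples per second. Sample k is taken at k / sample_freq seconds, and linspace() reproduces that grid only when the endpoint is excluded. For a plot of millions of samples the difference is negligible, but it matters if you use the timestamps in calculations.

```python
import numpy as np

sample_freq = 8        # deliberately tiny rate so the values are easy to read
n_samples = 4

# Sample k is taken at k / sample_freq seconds.
times_arange = np.arange(n_samples) / sample_freq

# linspace gives the same grid only when the endpoint is excluded.
times_open = np.linspace(0, n_samples / sample_freq, num=n_samples, endpoint=False)

# With the endpoint included (the default), the spacing is stretched slightly.
times_closed = np.linspace(0, n_samples / sample_freq, num=n_samples)

print(times_arange)
print(times_open)
print(times_closed)
```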

For plotting, we’re going to use the pyplot module from matplotlib. If you need some background material on plotting in Python, we have some articles. Here’s part 1 and part 2 of an introduction to matplotlib.

For simplicity, we only plot the signal from one channel. Let’s set up the figure, and plot a time series as follows:

>>> import matplotlib.pyplot as plt
>>> plt.figure(figsize=(15, 5))
>>> plt.plot(times, l_channel)
>>> plt.title('Left Channel')
>>> plt.ylabel('Signal Value')
>>> plt.xlabel('Time (s)')
>>> plt.xlim(0, t_audio)
>>> plt.show()

This opens the following figure in a new window:

[Figure: waveform of the left channel, signal value versus time in seconds]

We see the amplitude build up in the first 6 seconds, at which point the bells and clapping effects start. There are two brief pauses in the jingle at 31.5 and 44.5 seconds, which are evident in the signal values. After the second pause, the main instrument alternates between a guitar and a piano, which is roughly seen in the signal, where the guitar part has lower amplitudes. Then, there’s a lower-amplitude outro at the end of the track.

Plotting the Frequency Spectrum

Now, let’s take a look at how the frequency spectrum changes over time, which is shown in a plot known as a spectrogram. This is a visual representation of the signal strength at different frequencies, showing us which frequencies dominate the recording as a function of time:

>>> plt.figure(figsize=(15, 5))
>>> plt.specgram(l_channel, Fs=sample_freq, vmin=-20, vmax=50)
>>> plt.title('Left Channel')
>>> plt.ylabel('Frequency (Hz)')
>>> plt.xlabel('Time (s)')
>>> plt.xlim(0, t_audio)
>>> plt.colorbar()
>>> plt.show()

The following plot opens in a new window:

[Figure: spectrogram of the left channel, frequency in Hz versus time in seconds]

In the plotting code above, vmin and vmax set the limits of the color scale and are chosen to bring out the lower frequencies that dominate this recording. Indeed, the dominant frequencies for the whole track are lower than 2.5 kHz. You can also see the effect of different instruments and sound effects, particularly in the frequency range of about 10 kHz to 15 kHz. Each instrument and sound effect has its own signature in the frequency spectrum.
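To put a number on a dominant frequency, one option is a plain Fourier transform with numpy. The sketch below uses a synthetic signal rather than the jingle (a strong 440 Hz tone plus a weaker 2000 Hz tone, both made up for illustration), so the expected answer is known in advance.

```python
import numpy as np

sample_freq = 8000                          # assumed rate for the synthetic signal
t = np.arange(sample_freq) / sample_freq    # one second of timestamps

# Strong 440 Hz tone plus a weaker 2000 Hz tone.
signal = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)

# Magnitude of each frequency bin, and the frequency (in Hz) of each bin.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_freq)

# The bin with the largest magnitude is the dominant frequency.
dominant_freq = freqs[np.argmax(spectrum)]
print(dominant_freq)
```

Applied to a slice of l_channel, the same steps would give the strongest frequency in that part of the recording.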

Where to Go From Here

Plotting the waveform and frequency spectrum with Python forms a foundation for a deeper analysis of the sound data. Perhaps you can further quantify the frequencies of each part of the recording. What is the average frequency of the guitar part compared to the piano part? And here, we’ve only looked at one channel. Another extension of the material here is to plot both channels and see how they compare. Try plotting the difference between the channels, and you’ll see some new and interesting features pop out of the waveform and the frequency spectrum.
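One caveat when computing the difference between the channels: the samples are 16-bit integers, so subtracting them directly can wrap around when the result falls outside the int16 range. A minimal sketch of a safer approach, using made-up sample values, is to widen both arrays first:

```python
import numpy as np

# Made-up 16-bit samples, including the extremes of the int16 range.
l_channel = np.array([32767, -32768, 100], dtype=np.int16)
r_channel = np.array([-32768, 32767, 40], dtype=np.int16)

# Widen to int32 before subtracting so the result cannot wrap around.
diff = l_channel.astype(np.int32) - r_channel.astype(np.int32)
print(diff)
```

The resulting array can be passed to plt.plot() or plt.specgram() in the same way as a single channel.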
