PyCG 8: Audio Oscilloscope

Oscilloscope music is a unique music genre, developed mainly by Jerobeam Fenderson (see the video below for an example). The idea is that you can not only listen to it but also see it: if you feed the audio signal into an oscilloscope, the patterns cleverly designed by the artist are revealed. This time, let us find out how to visualize the waveform of a piece of oscilloscope music.

XY mode of oscilloscopes

XY mode is a function provided by many oscilloscopes, where two independent input signals are combined into a single trace. The image of oscilloscope music can only be viewed in XY mode. How XY mode works is simple: at any given moment, the instantaneous value of signal 1 gives the x-coordinate of the point, and the instantaneous value of signal 2 gives the y-coordinate. The famous Lissajous curves can be viewed easily in XY mode, with two sine waves as inputs.

Different types of Lissajous curves (source: Wolfram MathWorld)
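To make the XY idea concrete, here is a minimal sketch that computes the points of a Lissajous curve from two sine waves. The function name `lissajous` and its parameters are only illustrative; the frequency ratio `a : b` and the phase shift `delta` decide which of the shapes above you get.

```python
import numpy as np

def lissajous(a, b, delta, n=1000):
    """Return n (x, y) points of x = sin(a*t + delta), y = sin(b*t)."""
    t = np.linspace(0.0, 2.0 * np.pi, n)
    x = np.sin(a * t + delta)  # signal 1 drives the horizontal position
    y = np.sin(b * t)          # signal 2 drives the vertical position
    return np.stack([x, y], axis=1)

# frequency ratio 3:2 with a quarter-period phase shift
points = lissajous(a=3, b=2, delta=np.pi / 2)
```

With `a == b` and `delta = np.pi / 2` the two signals are a cosine and a sine of the same frequency, so the trace is a circle.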

Decoding audio data

The audioread package decodes audio into raw 16-bit signed integer PCM data (signed short, or int16_t in C terms). This is the sample code provided in its documentation:

with audioread.audio_open(filename) as f:
    print(f.channels, f.samplerate, f.duration)
    for buf in f:
        do_something(buf)

The audio file is read buffer by buffer, where each buffer is a chunk of audio data of a particular size (usually 4 KB). The buffer is made up of samples, and the size of each sample is the number of channels times the size of one channel value. In this case, because the audio is stereo and the data type is signed short, the size of each sample is \(2 \times 2~\mathrm{bytes} = 4~\mathrm{bytes}\). Within each sample, the values of the channels are stored side by side (left, then right). In order to make the data more Python-friendly, we can join the buffers into one big array and use numpy to split it into two channels, as shown below.

import audioread
import numpy as np
import openal

audioBuffer = []

sampleRate = None
audioLength = None

soundFile = 'Jerobeam Fenderson - Planets.wav'

# load audio
with audioread.audio_open(soundFile) as inaudio:
    assert inaudio.channels == 2
    sampleRate = inaudio.samplerate
    audioLength = inaudio.duration
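    for buf in inaudio:
        # each buffer holds raw interleaved 16-bit PCM: L0, R0, L1, R1, ...
        data = np.frombuffer(buf, dtype=np.int16)
        audioBuffer.append(data)

# one row per sample, one column per channel (left, right)
dataBuffer = np.concatenate(audioBuffer).reshape((-1, 2)).astype(np.float32)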
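If the reshape step feels opaque, the following small sketch runs the same de-interleaving on a hand-made byte string instead of a real audio file. The sample values are made up purely for illustration.

```python
import numpy as np

# Four stereo samples, interleaved as L0, R0, L1, R1, ...
frames = np.array([1, -1, 2, -2, 3, -3, 4, -4], dtype=np.int16)
raw = frames.tobytes()  # 4 samples * 2 channels * 2 bytes = 16 bytes

# Reinterpret the bytes as int16 and fold them into (sample, channel) rows
samples = np.frombuffer(raw, dtype=np.int16).reshape((-1, 2))
left, right = samples[:, 0], samples[:, 1]
```

Each row of `samples` is one moment in time, so column 0 can be used directly as the x signal and column 1 as the y signal.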

Audio playback

I use the PyOpenAL package for audio playback in the demo. My knowledge of this package is still very limited, so I am using only the most basic APIs it provides. I think more coding is needed to synchronize the video and audio accurately. What is more, PyOpenAL plays only WAV files out of the box, whereas audioread can decode many formats. There should be a way to stream decoded audio data into PyOpenAL so that audio in other formats can be played.

The demo

After acquiring all audio data points, we can convert them into normalized device coordinates (NDC) and load them entirely into graphics memory. We can then use the first and count arguments of glDrawArrays to control which part of the audio we would like to draw. The audio sample that I use is captured from YouTube, so the generated image tends to have a lower quality. You are welcome to buy authentic tracks from Jerobeam Fenderson to examine what they look like.
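The two steps above can be sketched without any OpenGL calls. This is only one possible approach, not necessarily what the demo does: `to_ndc`, `draw_window`, and the `span` parameter (how many seconds of audio stay visible in one frame) are hypothetical names introduced for illustration.

```python
import numpy as np

def to_ndc(samples):
    """Scale int16-range sample values into the NDC range [-1, 1)."""
    return np.asarray(samples, dtype=np.float32) / 32768.0

def draw_window(play_time, sample_rate, total, span=1.0 / 60):
    """Return (first, count) for glDrawArrays: the samples played
    during the last `span` seconds before `play_time`."""
    count = round(sample_rate * span)
    end = min(round(play_time * sample_rate), total)  # clamp to the data
    first = max(end - count, 0)                       # clamp at the start
    return first, end - first

# One second into a 2-second, 48 kHz track, showing the last 20 ms
first, count = draw_window(1.0, 48000, 96000, span=0.02)
```

The returned pair would then be passed as the `first` and `count` arguments of glDrawArrays each frame. A longer `span` keeps more of the trace on screen, at the cost of smearing fast-moving patterns.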

The source code can be found on GitHub.

Video capture of the demo