Pablo

· 07.20.2015 · projects

The Avalanches' Since I Left You

My favorite album of all time is The Avalanches' Since I Left You. It's a masterpiece of "plunderphonics", which essentially means "sound collage"; that is, constructing songs purely out of broadly sourced samples. Since the album came out 15 years ago (!!), there has been promise of a second release. This release has yet to materialize.

In this endless interim I wrote Pablo (named after Pablo's Cruise, which is a song on Since I Left You and was the album's original title), a program for automatically generating plunderphonic music. Its output is certainly nothing of Avalanches' caliber, but it's meant more for rapidly sketching ideas than for generating fully fleshed-out tracks.

The GitHub repo for Pablo has instructions for installation. Here I'll talk a bit more about how it works.

Selecting and preprocessing the songs

Song selection

The approach is pretty simple: you direct Pablo to a directory of music (mp3s or wavs) and it will start analyzing the tracks, approximating their BPMs and keys for later reference. Almost all of Pablo's audio analysis capabilities rely directly on the Essentia library.

Pablo then randomly picks a "focal" song - the song the rest of the mix is based off of. Pablo goes through the other songs available in the directory and randomly chooses a few that are within a reasonable distance from the focal song in both key and BPM (to avoid ridiculous warping, although maybe you want that to happen).
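To make the selection step concrete, here's a rough sketch of it. This isn't Pablo's actual code: the function names, the (bpm, key) tuples, and the thresholds (2 semitones, 10% BPM difference) are all assumptions I've picked for illustration, and it ignores major/minor scale entirely.

```python
import random

# Pitch classes in semitone order, used to measure distance between keys.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_distance(key_a, key_b):
    """Shortest distance between two keys around the 12-semitone circle."""
    diff = abs(PITCH_CLASSES.index(key_a) - PITCH_CLASSES.index(key_b))
    return min(diff, 12 - diff)


def pick_songs(songs, n_others=3, max_semitones=2, max_bpm_ratio=0.1, rng=random):
    """Pick a random focal song plus up to n_others compatible songs.

    `songs` maps a song name to a (bpm, key) tuple (hypothetical layout).
    """
    focal = rng.choice(sorted(songs))
    focal_bpm, focal_key = songs[focal]
    # Keep only songs close enough to the focal song in both BPM and key.
    candidates = [
        name
        for name, (bpm, key) in sorted(songs.items())
        if name != focal
        and abs(bpm - focal_bpm) / focal_bpm <= max_bpm_ratio
        and semitone_distance(key, focal_key) <= max_semitones
    ]
    others = rng.sample(candidates, min(n_others, len(candidates)))
    return focal, others
```

Loosening the two thresholds is how you would get the "ridiculous warping" if you do want it.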

These other songs are pitch-shifted and time-stretched to conform to the focal song. Then beat onset detection is run on each track to identify downbeats. These downbeats are used to slice the tracks into samples of different sizes (each size must be a power of 2 in beats, e.g. 4, 8, 16, 32, etc). By default these samples are 16 and 32 beats long.
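The slicing itself is simple once the beat times are known. Here's a minimal sketch, assuming the beat detector has already produced a sorted list of beat times in seconds; the function name and the (start, end) representation are mine, not necessarily what Pablo uses.

```python
def slice_into_samples(beat_times, beats_per_sample):
    """Return (start, end) times for consecutive chunks of beats_per_sample beats.

    A chunk is only emitted when the beat that closes it is known, so a
    leftover partial chunk at the end of the track is dropped.
    """
    samples = []
    for i in range(0, len(beat_times) - beats_per_sample, beats_per_sample):
        samples.append((beat_times[i], beat_times[i + beats_per_sample]))
    return samples


# 33 beats at 120 BPM (one beat every 0.5 seconds).
beats = [i * 0.5 for i in range(33)]
sixteens = slice_into_samples(beats, 16)
thirty_twos = slice_into_samples(beats, 32)
```

Here `sixteens` holds two 8-second samples and `thirty_twos` holds a single 16-second one.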

A challenge here was beatmatching. For the most part, Pablo does pretty well at identifying proper beat onsets. But some samples become irregular - if there is an instrumental bridge, for instance, or a long intro, then Pablo may not cut that part into properly-sized samples. So there is an additional step where Pablo finds the mode sample duration (in milliseconds) and discards all samples not of this length. That way we have greater assurance that when the samples are finally assembled into tracks, they will align properly (since all samples of the same beat length will have the exact same time duration).
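The mode filter could look something like the following. It's a sketch under the assumption that samples are stored as (start_ms, end_ms) pairs of whole milliseconds, so that durations can be compared exactly; the function name is made up for this example.

```python
from collections import Counter


def keep_modal_duration(samples):
    """Keep only the samples whose duration equals the most common duration.

    `samples` is a list of (start_ms, end_ms) integer pairs.
    """
    if not samples:
        return []
    durations = [end - start for start, end in samples]
    mode, _count = Counter(durations).most_common(1)[0]
    return [s for s, d in zip(samples, durations) if d == mode]


# The third sample is stretched by an irregular section, so it gets dropped.
samples = [(0, 8000), (8000, 16000), (16000, 25500), (25500, 33500)]
kept = keep_modal_duration(samples)
```

This would be run separately for each beat length, since 16-beat and 32-beat samples naturally have different durations.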

Assembling the mix

With all the samples preprocessed, Pablo can begin putting the tracks together. To do so, Pablo just places samples one after another using the stochastic process outlined below.

A song length is specified in beats, which also must be a power of 2. For instance, we may want a song that is 64 beats long. Pablo then recursively goes down by powers of 2 and tries to "fill in" these beats.

For example, say we want a 64-beat song and we have samples of sizes 32, 16, and 8. Pablo first checks to see if a 64-beat sample is available. One isn't, so then it knows it needs two 32-beat samples (i.e. we have two 32-beat slots to fill). A 32-beat sample is available, so for each 32-beat slot Pablo randomly decides whether to use a complete 32-beat sample or to further split that slot into two 16-beat slots. If Pablo chooses the latter, the same process is applied again until the smallest sample size is reached. So in this example, Pablo would stop splitting at 8-beat slots since it does not have any samples smaller than 8 beats.
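The recursion described above can be sketched in a few lines. This is an illustration rather than Pablo's real implementation: `fill` and `split_prob` are names I'm inventing here, and the 50/50 split probability is an assumption.

```python
import random


def fill(n_beats, available_sizes, split_prob=0.5, rng=random):
    """Return a list of slot sizes (in beats) that add up to n_beats.

    Assumes n_beats and every available size are powers of 2.
    """
    smallest = min(available_sizes)
    if n_beats < smallest:
        raise ValueError("cannot fill a slot smaller than the smallest sample")
    # Nothing smaller to split into, so this slot is used whole.
    if n_beats == smallest:
        return [n_beats]
    # A sample of this size exists: randomly keep the slot whole.
    if n_beats in available_sizes and rng.random() >= split_prob:
        return [n_beats]
    # Otherwise split the slot in half and fill each half the same way.
    half = n_beats // 2
    return (fill(half, available_sizes, split_prob, rng)
            + fill(half, available_sizes, split_prob, rng))


layout = fill(64, {32, 16, 8}, rng=random.Random(0))
```

Whatever the random choices, `layout` always sums to 64 and only contains sizes Pablo has samples for. Setting `split_prob` to 0 gives two 32-beat slots, and setting it to 1 gives eight 8-beat slots.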

Selecting samples

Pablo also has some heuristics to ensure a degree of coherence in the tracks it creates. Without this, Pablo's tracks would be too erratic, with sudden cutoffs and changes in samples every bar. So to decide what sample should follow the current sample, Pablo uses a simple Markov chain model which favors staying in the same song and favors repeating the same sample.

Markov Chain Coherence
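One way to express that kind of Markov chain is as a weighted random choice where the weights depend only on the current sample. The sketch below assumes samples are (song, index) tuples, and the specific weights are placeholders I chose, not Pablo's real transition probabilities.

```python
import random


def next_sample(current, library, same_sample_weight=4.0,
                same_song_weight=2.0, other_weight=1.0, rng=random):
    """Pick the sample that follows `current` from `library`.

    Repeating the same sample is most likely, then another sample from
    the same song, then a sample from a different song.
    """
    weights = []
    for candidate in library:
        if candidate == current:
            weights.append(same_sample_weight)
        elif candidate[0] == current[0]:
            weights.append(same_song_weight)
        else:
            weights.append(other_weight)
    return rng.choices(library, weights=weights, k=1)[0]


library = [("song_a", 0), ("song_a", 1), ("song_b", 0)]
following = next_sample(("song_a", 0), library, rng=random.Random(0))
```

Raising `other_weight` makes the track jump between songs more often; lowering it makes Pablo linger on one song.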

Each track is constructed in this way. Pablo creates (by default) two tracks which are then overlaid to form the final mix. However, sometimes samples from the same song are overlaid on each other, which isn't quite what I wanted, so one final heuristic is implemented. When constructing tracks after the first, Pablo will try its best to avoid placing samples from the same song over each other. Sometimes this is unavoidable, such as when there are more tracks being made than there are songs. But for the most part it works out.
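The "try its best" part boils down to filtering with a fallback. Here's a small sketch, again with made-up names and the same (song, index) sample tuples assumed:

```python
import random


def pick_for_slot(candidates, songs_in_use, rng=random):
    """Pick a sample for a slot, avoiding songs already playing in that slot.

    `songs_in_use` holds the songs that earlier tracks placed at this
    position. If every candidate clashes, fall back to any candidate.
    """
    free = [c for c in candidates if c[0] not in songs_in_use]
    return rng.choice(free or candidates)


candidates = [("song_a", 0), ("song_b", 0)]
choice = pick_for_slot(candidates, {"song_a"})
```

With `song_a` already in use, `choice` is the `song_b` sample. If both songs were in use, the function would still return something rather than leaving the slot empty.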

That's really about it in terms of generating the songs. I've also been playing a bit with vocal detection, since hearing multiple people singing can be disorienting, and I would like to add in some genre-identification features as well.

The Infinite Playlist

One final feature is the "crate digging" feature, in which Pablo takes a seed YouTube URL and crawls its related videos, quickly amassing some songs to sample. Eventually I'd like Pablo to become the infinite playlist, crawling YouTube (and perhaps SoundCloud and other sources) and endlessly mixing songs into each other.

An example song

Here's a song that Pablo produced:

It's a bit hectic, but for the most part beats match well and there are some interesting moments in there. Pablo outputs sample listings with times, along with the original samples, so it's easy to recreate the parts that you like.

Source

You can check out how all this is implemented in the GitHub repo, along with installation instructions to run Pablo yourself!