Sunday, 2 October 2011

Algorithmic symphonies from one line of code -- how and why?

Lately, there has been a lot of experimentation with very short programs that synthesize something that sounds like music. I now want to share some information and thoughts about these experiments.

First, some background. On 2011-09-26, I released the following video on Youtube, presenting seven programs and their musical output:

This video gathered a lot of interest, inspiring many programmers to experiment on their own and share their findings. This was further boosted by Bemmu's on-line Javascript utility that made it easy for anyone (even non-programmers, I guess) to jump in the bandwagon. In just a couple of days, people had found so many new formulas that I just had to release another video to show them off.

Edit 2011-10-10: note that there's now a third video as well!

It all started a couple of months ago, when I encountered a 23-byte C-64 demo, Wallflower by 4mat of Ate Bit, that was like nothing I had ever seen on that size class on any platform. Glitchy, yes, but it had a musical structure that vastly outgrew its size. I started to experiment on my own and came up with a 16-byte VIC-20 program whose musical output totally blew my mind. My earlier blog post, "The 16-byte frontier", reports these findings and speculates why they work.

Some time later, I resumed the experimentation with a slightly more scientific mindset. In order to better understand what was going on, I needed a simpler and "purer" environment. Something that lacked the arbitrary quirks and hidden complexities of 8-bit soundchips and processors. I chose to experiment with short C programs that dump raw PCM audio data. I had written tiny "/dev/dsp softsynths" before, and I had even had one in my email/usenet signature in the late 1990s. However, the programs I would now be experimenting with would be shorter and less planned than my previous ones.

I chose to replicate the essentials of my earlier 8-bit experiments: a wave generator whose pitch is controlled by a function consisting of shifts and logical operators. The simplest waveform for /dev/dsp programs is sawtooth. A simple for(;;)putchar(t++) generates a sawtooth wave with a cycle length of 256 bytes, resulting in a frequency of 31.25 Hz when using the the default sample rate of 8000 Hz. The pitch can be changed with multiplication. t++*2 is an octave higher, t++*3 goes up by 7 semitones from there, t++*(t>>8) produces a rising sound. After a couple of trials, I came up with something that I wanted to share on an IRC channel:


In just over an hour, Visy and Tejeez had contributed six more programs on the channel, mostly varying the constants and changing some parts of the function. On the following day, Visy shared our discoveries on Google+. I reshared them. A surprising flood of interested comments came up. Some people wanted to hear an MP3 rendering, so I produced one. All these reactions eventually led me to release the MP3 rendering on Youtube with some accompanying text screens. (In case you are wondering, I generated the screens with an old piece of code that simulates a non-existing text mode device, so it's just as "fakebit" as the sounds are).

When the first video was released, I was still unsure whether it would be possible for one line of C code to reach the sophistication of the earlier 8-bit experiments. Simultaneities, percussions, where are they? It would also have been great to find nice basslines and progressions as well, as those would be useful for tiny demoscene productions.

At some point of time, some people noticed that by getting rid of the t* part altogether and just applying logical operators on shifted time values one could get percussion patterns as well as some harmonies. Even a formula as simple as t&t>>8, an aural corollary of "munching squares", has interesting harmonic properties. Some small features can be made loud by adding a constant to the output. A simple logical operator is enough for combining two good-sounding formulas together (often with interesting artifacts that add to the richness of the sound). All this provided material for the "second iteration" video.

If the experimentation continues at this pace, it won't take many weeks until we have found the grail: a very short program, maybe even shorter than a Spotify link, that synthesizes all the elements commonly associated with a pop song: rhythm, melody, bassline, harmonic progression, macrostructure. Perhaps even something that sounds a little bit like vocals? We'll see.

Hasn't this been done before?

We've had the technology for all this for decades. People have been building musical circuits that operate on digital logic, creating short pieces of software that output music, experimenting with chaotic audiovisual programs and trying out various algorithms for musical composition. Mathematical theory of music has a history of over two millennia. Based on this, I find it quite mind-boggling that I have never before encountered anything similar to our discoveries despite my very long interest in computing and algorithmic sound synthesis. I've made some Google Scholar searches for related papers but haven't find anything. Still, I'm quite sure that at many individuals have come up with these formulas before, but, for some reason, their discoveries remained in obscurity.

Maybe it's just about technological mismatch: to builders of digital musical circuits, things like LFSRs may have been more appealing than very wide sequential counters. In the early days of the microcomputer, there was already enough RAM available to hold some musical structure, so there was never a real urge to simulate it with simple logic. Or maybe it's about the problems of an avant-garde mindset: if you're someone who likes to experiment with random circuit configurations or strange bit-shifting formulas, you're likely someone who has learned to appreciate the glitch esthetics and never really wants to go far beyond that.

Demoscene is in a special position here, as technological mismatch is irrelevant there. In the era of gigabytes and terabytes, demoscene coders are exploring the potential of ever shorter program sizes. And despite this, the sense of esthetics is more traditional than with circuit-benders and avant-garde artists. The hack value of a tiny softsynth depends on how much its output resembles "real, big music" such as Italo disco.

The softsynths used in the 4-kilobyte size class are still quite engineered. They often use tight code to simulate the construction of an analog synthesizer controlled by a stored sequence of musical events. However, as 256 bytes is becoming the new 4K, there has been ever more need to play decent music in the 256-byte size class. It is still possible to follow the constructivist approach in this size class -- for example, I've coded some simple 128-byte players for the VIC-20 when I had very little memory left. However, since the recent findings suggest that an approach with a lot of random experimentation may give better results than deterministic hacking, people have been competing in finding more and more impressive musical formulas. Perhaps all this was something that just had to come out of the demoscene and nowhere else.

Something I particularly like in this "movement" is its immediate, hands-on collaborative nature, with people sharing the source code of their findings and basing their own experimentation on other people's efforts. Anyone can participate in it and discover new, mind-boggling stuff, even with very little programming expertise. I don't know how long this exploration phase is going to last, but things like this might be useful for a "Pan-Hacker movement" that advocates hands-on hard-core hacking to greater masses. I definitely want to see more projects like this.

How profound is this?

Apart from some deterministic efforts that quickly bloat the code up to hundreds of source-code characters, the exploration process so far has been mostly trial-and-error. Some trial-and-error experimenters, however, seem to have been gradually developing an intuitive sense of what kind of formulas can serve as ingredients for something greater. Perhaps, at some time in the future, someone will release some enlightening mathematical and music-theoretical analysis that will explain why and how our algorithms work.

It already seems apparent, however, that stuff like this stuff works in contexts far beyond PCM audio. The earlier 8-bit experiments, such as the C-64 Wallflower, quite blindly write values to sound and video chip registers and still manage to produce interesting output. Media artist Kyle McDonald has rendered the first bunch of sounds into monochrome bitmaps that show an interesting, "glitchy" structure. Usually, music looks quite bad when rendered as bitmaps -- and this applies even to small chiptunes that sound a lot like our experiments, so it was interesting to notice the visual potential as well.

I envision that, in the context of generative audiovisual works, simple bitwise formulas could generate source data not only for the musical output but also drive various visual parameters as a function of time. This would make it possible, for example, for a 256-byte demoscene production to have an interesting and varying audiovisual structure with a strong, inherent synchronization between the effects and the music. As the formulas we've been experimenting with can produce both microstructure and macrostructure, we might assume that they can be used to drive low-level and high-level parameters equally well. From wave amplitudes and pixel colors to layer selection, camera paths, and 3D scene construction. But so far, this is mere speculation, until someone extends the experimentation to these parameters.

I can't really tell if there's anything very profound in this stuff -- after all, we already have fractals and chaos theory. But at least it's great for the kind of art I'm involved with, and that's what matters to me. I'll probably be exploring and embracing the audiovisual potential for some time, and you can expect me to blog about it as well.

Edit 2011-10-29: There's now a more detailed analysis available of some formulas and techniques.