When designing audio we are often thinking of time across a large variety of units: samples, milliseconds, frames, minutes, hours and more. This article is inspired by a conversation I had with Andy Farnell about a year ago at a pub in Edinburgh, right before a sound design symposium, where we discussed time and the role it plays in designing audio.
Like most other audio designers out there, I started twiddling the knobs and sliders well before I had an understanding of the underlying DSP. It was an eye-opening experience to realise that almost every single DSP effect is related to time. So let’s start looking at a few common DSP tools used in everyday sound design and analyse how time and the precedence effect play a role, starting from hundreds of milliseconds all the way down to a single sample.
The precedence effect is a psychoacoustic effect that sheds light on how we localise and perceive sounds. It has helped us understand how binaural audio works, how we localise sounds in space and also understand reverberation and early reflections. From Wikipedia:
The precedence effect or law of the first wavefront is a binaural psychoacoustic effect. When a sound is followed by another sound separated by a sufficiently short time delay (below the listener’s echo threshold), listeners perceive a single fused auditory image; its spatial location is dominated by the location of the first-arriving sound (the first wave front). The lagging sound also affects the perceived location. However, its effect is suppressed by the first-arriving sound.
You might be familiar with this effect if you’ve done any sort of music production or mixing. Quite often a sound is hard panned to one of the two stereo speakers and a delayed copy (10-30ms) of the sound is hard panned to the other speaker. Our ears and brain don’t perceive two distinct sounds, but rather an ambient/wide-stereo sound. It is a cool technique for creating a pseudo-stereo effect from a mono audio source.
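As a rough sketch of this pseudo-stereo trick (my own illustration in Python/NumPy, not taken from any particular tool), delaying one channel by a value inside the precedence window is all it takes:

```python
import numpy as np

def pseudo_stereo(mono, sr, delay_ms=15.0):
    """Create a pseudo-stereo pair from a mono signal by sending the dry
    signal to one channel and a copy delayed into the 10-30 ms precedence
    window to the other."""
    delay = int(sr * delay_ms / 1000.0)
    left = mono
    # Prepend silence, then trim so both channels stay the same length.
    right = np.concatenate([np.zeros(delay), mono])[:len(mono)]
    return np.stack([left, right])

sr = 44100
mono = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
stereo = pseudo_stereo(mono, sr)
```

With `delay_ms` inside the echo threshold, the two channels fuse into one wide image; push it much past 30-40 ms and the copy starts to read as a distinct echo.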
The first 30 seconds in the video below shows an example of the precedence effect in action. The delayed signal smears the original signal with phasing artefacts after which it seems to split from the original signal and become a distinct sound of its own.
Echoes And Reverb
Echoes are distinct delays. Reverberation is made up of early reflections, delayed copies of the sound that arrive at the listener right after the direct sound, followed by a tail consisting of many such delays diffused into a dense cluster. Artificial reverbs are quite often approximated using networks of delays that feed back into each other (convolution reverbs behave differently).
Depending on the size of the space, early reflections would usually make up the first 80 milliseconds or so. The late reverberation, as we know, can last many seconds.
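The basic building block of those delay networks can be sketched as a single feedback comb filter (a minimal illustration of the idea, not a usable reverb; real designs like Schroeder reverbs combine several of these at different delay times):

```python
import numpy as np

def feedback_delay(x, sr, delay_ms=120.0, feedback=0.5):
    """A single feedback comb filter: y[n] = x[n] + feedback * y[n - D].
    Each pass through the loop produces another, quieter echo."""
    d = int(sr * delay_ms / 1000.0)
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n] + (feedback * y[n - d] if n >= d else 0.0)
    return y
```

Feeding it an impulse yields a train of echoes decaying by the feedback factor each repeat; with `feedback` below 1 the tail dies away, which is why such loops can run stably for seconds.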
Dialling down from hundreds of milliseconds, phasing and chorus effects are an extension of the precedence effect. When the delay time is modulated, usually in the tens of milliseconds, we end up with a phaser effect (a chorus and a phaser are similar in theory, the major difference being the choice of delay times, amongst other modulation parameters).
00:30 – 00:50 in the video below shows an example of this.
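A minimal sketch of that modulated-delay idea (names and parameter values are my own; linear interpolation stands in for the smoother interpolation a real effect would use):

```python
import numpy as np

def flanger(x, sr, max_delay_ms=5.0, rate_hz=0.25):
    """Mix the dry signal with a copy whose delay time is swept by an LFO,
    producing the moving comb-filter notches heard as phasing."""
    n = np.arange(len(x))
    # Delay sweeps between 0 and max_delay_ms at rate_hz.
    delay = (max_delay_ms / 1000.0 * sr) * 0.5 * (1.0 + np.sin(2.0 * np.pi * rate_hz * n / sr))
    read = n - delay
    i = np.floor(read).astype(int)
    frac = read - i
    # Clamp read positions at the buffer edges.
    i0 = np.clip(i, 0, len(x) - 1)
    i1 = np.clip(i + 1, 0, len(x) - 1)
    delayed = (1.0 - frac) * x[i0] + frac * x[i1]
    return 0.5 * (x + delayed)
```

Swapping the delay range and adding more modulated voices moves this from flanging/phasing territory towards a chorus, which is exactly the "choice of delay times" distinction above.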
We now take amplitude panning for granted, but there was much experimentation into panning techniques in the early days of stereo (the precedence effect was first proposed in the 1940s). With amplitude panning, we can approximate positions of sounds in a stereo or surround stage by changing the amplitude of the sound across the array or matrix of speakers. In reality our ears and brain use a combination of factors to localise sounds: inter-aural time difference, inter-aural level difference, reflections off the body, early reflections, reverberation, amplitude and more.
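For reference, a standard equal-power amplitude panner can be written in a few lines (a common textbook formulation, sketched in Python; the function name is my own):

```python
import numpy as np

def pan(mono, position):
    """Equal-power amplitude panning: position in [-1, 1],
    -1 = hard left, +1 = hard right. Total power stays constant
    because cos^2 + sin^2 = 1."""
    theta = (position + 1.0) * np.pi / 4.0  # map [-1, 1] to [0, pi/2]
    return np.stack([np.cos(theta) * mono, np.sin(theta) * mono])
```

At centre, both channels carry the signal at about 0.707 of full scale, avoiding the level dip a naive linear crossfade would cause.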
We could construct an alternate stereo panner using only time differences across the speakers, rather than amplitude differences (Tomlinson Holman’s book on this subject, “5.1 Surround Sound: Up And Running”, is worth reading). This is done by delaying the sound coming off one of the speakers over a small range, as little as 0 to 1 ms. I’ve always found this effect to be more convincing. This technique hasn’t been used much because it is quite easy to end up with some nasty phasing issues if the listener isn’t in the sweet spot. That said, time difference is one of the many factors used in binaural synthesis and panning.
00:55 – 1:17 in the video below shows an example of this.
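A sketch of such a time-difference panner, under the 0-1 ms range mentioned above (my own illustration; real implementations would use fractional delays and smoothing to avoid clicks when the position moves):

```python
import numpy as np

def time_pan(mono, sr, position):
    """Pan using inter-channel time difference only: position in [-1, 1].
    The channel opposite the intended direction is delayed by up to 1 ms,
    while both channels keep equal amplitude."""
    max_delay = int(sr * 0.001)  # 1 ms in samples
    d = int(abs(position) * max_delay)
    delayed = np.concatenate([np.zeros(d), mono])[:len(mono)]
    if position >= 0:
        # Image pulled right: delay the left channel.
        return np.stack([delayed, mono])
    return np.stack([mono, delayed])
```

The precedence effect then pulls the image towards the earlier-arriving channel, even though both speakers play at the same level.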
We take filtering for granted these days, with filters being just a click away. All DSP filters are made up of delay lines, ranging from a single sample (such as first-order IIRs or simple low/high-pass filters) to multiple samples (FIRs). A simple low-pass filter can be created by delaying a feedback signal by one sample and cross-fading between the delayed feedback signal and the original signal. The cut-off frequency of the filter is determined by nothing but that cross-fade value. Here’s a quick example implemented using gen~ in Max.
The [mix] object is a cross-fader. A value of 0 sent to its third inlet results in just the signal at the first inlet being output, while a value of 1 results in only the signal at the second inlet being output. A value between 0 and 1 results in a proportional mix between the two signals.
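One reading of that patch in plain code (a Python sketch of the same one-pole low-pass idea; the coefficient here plays the role of the [mix] crossfade value):

```python
import numpy as np

def one_pole_lowpass(x, coeff):
    """Cross-fade between the input and the one-sample-delayed output:
    y[n] = (1 - coeff) * x[n] + coeff * y[n - 1].
    coeff near 0 passes the signal through; near 1 it darkens heavily."""
    y = np.zeros(len(x))
    prev = 0.0  # the single-sample delay line
    for n in range(len(x)):
        prev = (1.0 - coeff) * x[n] + coeff * prev
        y[n] = prev
    return y
```

So the entire filter really is just a one-sample delay plus a crossfade, which is the point of the paragraph above: even at the scale of a single sample, time is doing the work.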
01:19 – 1:44 in the video below shows an example of this.
And, I’ll leave you with this (thanks @lostlab):