I’ve been working on a game project on and off over the past year and a part of the design is of relevance to this month’s theme — animals. The gameplay revolves around creatures of various kinds — some good, some evil, some tiny, some large. I had to conjure a vocalisation system that achieved the following technical and design criteria:
- Actions by the player directly affect the state (and sound) of the creature
- The player must be able to perceive some sort of emotive response from the creature
- A modular system that works across various creature types and characters
- With mobile devices as the primary target, it had to be simple, effective and portable
- Low CPU and memory usage, which translates to maximising the design capabilities of the system with little DSP and few samples
Like many sound designers, I find creature/animal vocalisations easier to design when the source material consists of human or animal vocal sounds. Players (or the audience) make visual and mental connections more readily when they hear something remotely similar to reality, so it was important for me to keep the resulting design as close as possible to what real animals sound like.
I collected sounds that matched the above criteria and then shortlisted them based on recording quality (to ensure maximum quality after subjecting them to DSP mangling), character (sounds that created an image or an emotion in my mind) and frequency content (important when grouping sounds together).
‘Emotion’ is tough to parametrise or quantify. It is a loose descriptor and can mean different things to different people. Instead of chasing specifics, I put down a list of questions to help me make decisions:
- What size does the sound convey? (the relative size of the animal)
- Is it irritating, menacing, timid or defensive? (dogs were a good reference for this)
- Does the sound convey speed and energy? (this is related to the previous question)
- Is there enough content to make the creature expressive and not boring? (player-creature encounters were expected to last a few minutes)
- Is the sound distinctive enough? (it is easy to get lost down the rabbit hole of perfection)
I encourage constraints in my work, but I also enjoy having room to wriggle. We decided to use libpd (the ‘packaged’, embeddable version of Pure Data that can be used as a sound engine) because of its flexibility and suitability for rapid prototyping. It makes it easier to implement non-standard ideas and workflows, which was exactly what this project needed.
Performance and hardware constraints are important with content designed for mobile devices. Having complete control of the audio system gave me the opportunity to experiment with a variety of techniques and closely link the design and technical worlds. It made no sense to treat them as separate concepts. In my head, technical design is a lot like mixing a film — a combination of careful technical and creative choices that do justice to the project and the medium.
Over the past few years I’ve found myself getting comfortable with a brute-force design approach. I spend a large portion of my time on a project throwing ideas, sounds and images together until it all begins to make sense. It can be quite rewarding, once past the initial period of frustrating results.
Before constructing complex DSP techniques or piling on TheNextBestPlugin I try to maximise output by playing around with:
- Pitch: Realtime pitch changes, either varispeed (where pitch and time are coupled) or a time-stretch-like technique. In most cases I prefer varispeed because it changes the amount of energy along with the pitch (more like reality).
- Amplitude: Simple amplitude analysis or automation.
- Time: Reorganising content, trimming samples, granulation or time-stretching.
- Frequency: Low-pass, high-pass or band-pass filtering.
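To illustrate why varispeed couples pitch and energy, here is a minimal resampling sketch. The actual system is a Pure Data patch; this Python function and its names are illustrative assumptions, not the real implementation.

```python
# Sketch of varispeed playback: resampling a buffer at a new rate
# changes pitch and duration together, unlike time-stretching.

def varispeed(samples, ratio):
    """Resample `samples` by `ratio` using linear interpolation.

    ratio > 1.0 raises the pitch and shortens playback;
    ratio < 1.0 lowers the pitch and lengthens it.
    """
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # Linear interpolation between neighbouring samples
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

buf = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]  # toy 8-frame buffer
up = varispeed(buf, 2.0)   # one octave up: half the frames play back
print(len(up))             # 4 frames instead of 8
```

Because the same number of waveform cycles now passes in less time, the result sounds both higher and more energetic, which is what makes varispeed feel closer to a real animal straining its voice.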
A few days of experimentation resulted in this system:
It consists of three components:
- Behaviour/Character control: This section receives input from the game engine and controls the behaviour of the creature by manipulating playback controls and sample assignment. Here it receives the distance of the player from the creature (though it could be any other game parameter). As the player gets closer to the creature, it gets more agitated or excited, and vice versa.
- Behaviour Algorithm: The ‘heart’ of the whole system. It drives the behaviour of the creature and can be fully randomised or fully controlled, and each creature can be assigned a different behaviour.
- Playback behaviour: This translates the output of the behaviour algorithm through a set of transfer functions, which control the type of samples played back as well as their pitch, rate and amplitude.
- Sample Banks and Playback: The very first implementation of this idea used a single sample bank and didn’t offer much flexibility, so I switched to three banks of low-, medium- and high-energy samples. These work best when segregated by frequency content and vocalisation character. Each bank needs about 10 seconds’ worth of samples to work effectively, so each creature character needs only about 30 seconds of mono audio data!
- Equal Power Crossfader: The behaviour algorithm controls the sample bank selection and uses a short equal-power crossfade to switch between sample banks during playback.
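The chain above — game parameter in, bank selection and equal-power gains out — can be sketched in Python. The real system lives in a Pure Data patch, so every name, threshold and the exact distance-to-energy mapping here is an assumption for illustration only.

```python
import math

BANKS = ("low", "medium", "high")  # low/medium/high energy sample banks

def energy_from_distance(distance, max_distance=100.0):
    """Behaviour control: map player distance to an energy value in [0, 1].

    A closer player means a more agitated creature, i.e. higher energy.
    The linear mapping and max_distance are illustrative assumptions.
    """
    d = min(max(distance, 0.0), max_distance)
    return 1.0 - d / max_distance

def bank_and_gains(energy):
    """Transfer function: pick the active bank pair and the equal-power
    crossfade gains from the current energy value."""
    # Scale energy onto the bank index range [0, len(BANKS) - 1]
    pos = energy * (len(BANKS) - 1)
    i = min(int(pos), len(BANKS) - 2)
    frac = pos - i
    # Equal-power crossfade: gains follow a quarter-cycle sine/cosine,
    # so the summed power (g_out^2 + g_in^2) stays constant at 1.
    g_out = math.cos(frac * math.pi / 2)
    g_in = math.sin(frac * math.pi / 2)
    return BANKS[i], g_out, BANKS[i + 1], g_in

# Player 10 units away from the creature: high energy, upper bank pair
bank_a, ga, bank_b, gb = bank_and_gains(energy_from_distance(10.0))
print(bank_a, bank_b)           # medium high
print(round(ga**2 + gb**2, 6))  # 1.0 -- constant power across the fade
```

The equal-power curve matters because a plain linear crossfade dips in loudness at the midpoint; the sine/cosine pair keeps the creature's level steady while its character shifts between banks.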
Here’s what the system sounds like, with a mix of human and animal sounds loaded into the bank and a gradual decrease in the amount of energy:
Extending the system:
With a few tweaks, the system worked well in the game with about five different creature character types. To increase the level of interaction, I decided to add two more controls: ‘health’ and ‘attack’, as the evil creatures could be attacked and destroyed.
- Creature health: At 100% health there is no difference in how the system functions. As the creature begins to lose health, the system plays back random samples from the ‘Wounded Samples’ sound bank. The probability of playback is directly proportional to the health lost, i.e., the lower the health, the greater the chance that one of these samples plays.
- Creature attack: Every successful attack on the creature by the player triggers playback of one of the sounds in the ‘Whine Samples’ sound bank.
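The two rules above reduce to a couple of lines of logic. This is a hedged sketch: the function and bank names are assumptions, and the actual triggering lives in the Pd patch.

```python
import random

def maybe_play_wounded(health_percent, rng=random.random):
    """Return True when a 'Wounded Samples' playback should trigger.

    The probability is proportional to lost health: zero chance at
    100% health, certain at 0% health.
    """
    p = 1.0 - health_percent / 100.0
    return rng() < p

def on_attack_hit(whine_bank, rng=random.choice):
    """A successful attack always triggers one random whine sample."""
    return rng(whine_bank)

print(maybe_play_wounded(100.0))  # False: full health never triggers
```

Injecting the random source (`rng`) as a parameter is just a convenience here so the behaviour can be tested deterministically; a patch would read the probability from a single random object.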
Here’s a clip of a subdued creature with slightly low health:
And a clip of the creature being attacked:
In its current form, this system works quite well in the game. It offers enough flexibility, and each creature character needs only about 30–40 seconds’ worth of audio data loaded into RAM (which equates to a few hundred kilobytes if compressed). The system doesn’t include any complicated processes like multi-voice granulation or convolution and takes up only a fraction of the CPU.
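Those memory figures can be sanity-checked with quick arithmetic. The sample rate, bit depth and codec bitrate below are assumptions (the post doesn’t state them), chosen only to show the order of magnitude.

```python
# Back-of-the-envelope check of the audio memory footprint per creature.

SAMPLE_RATE = 44100        # Hz, assumed CD-quality
BYTES_PER_SAMPLE = 2       # assumed 16-bit mono
COMPRESSED_BPS = 96_000    # assumed ~96 kbps lossy codec

for seconds in (30, 40):
    raw = seconds * SAMPLE_RATE * BYTES_PER_SAMPLE
    compressed = seconds * COMPRESSED_BPS // 8
    print(f"{seconds}s: {raw / 1e6:.1f} MB raw, ~{compressed // 1000} kB compressed")
```

Under these assumptions, 30–40 seconds of mono audio is roughly 2.6–3.5 MB uncompressed and a few hundred kilobytes compressed, which is consistent with the figures in the text.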
Future work will include more options in the behaviour control (controlled randomness, patterns, etc.), possibly more sample banks for finer control and, more importantly, using the system to create bursts of audio that construct words and sentences of gibberish.