Guest Contribution by Carlos Alberto Manrique Clavijo
“The vocation of the sound film is to redeem us
from the chaos of shapeless noise by accepting it
as expression, as significance, as meaning. . . .”[0A]
“(…) When does sound become music?
Above all, in the supreme states of pleasure and displeasure
experienced by the will, as a will which rejoices
or a will which is frightened to death,
in short in the intoxication of feeling: in the shout.”[0B]
As sound designers, we are always driven by passion and curiosity to try to understand that gentle monster that is the sonic language in the audio-visual world it inhabits. We constantly invoke its power to express emotion and to create significance and meaning. Yet its mechanisms aren’t fully understood, and both film industry practitioners and audiences have historically tended to group it as a single entity with a different creature: music. But beyond the simple understanding that both art-forms stimulate and play with our auditory perception, asking ourselves about all the possible points of contact between both expressive languages may lead us to discover rich common ground to nurture our creative processes as ‘aural story-tellers’.
So this is an invitation to continue a conversation that started long ago: long before audiences were amused by sound effects in the nickelodeon era of the early 1900s; long before sound effects were used in the eighteenth-century Kabuki theatre of Japan; before the fear of non-musical sounds in Medieval Christianity; and probably even before Plato’s condemnation of musicians’ inability to reproduce with precision the things of the world (which to him were already poor reproductions of ideas). But before continuing, I’d like to thank all of those giants on whose shoulders we climb every day: sound designers, editors, mixers, directors, writers and researchers whose names I won’t mention because doing so would only expose how little we still know about so many people who keep transforming our art-form. To all of you, thanks for your knowledge and the inspiration you provide.
A short while ago Jack Menhorn posted a text by Karen Collins in which she discussed the evolution of the use of ‘non-musical’ sounds in music, highlighting the explorations of pioneers such as Edgard Varèse, Pierre Schaeffer, Delia Derbyshire, Bernard Herrmann, Jerry Goldsmith and Alan Splet among many others, and finished with some interesting musical uses of sound design in video games. Then, in an interview posted by Shaun Farley, Randy Thom added his experienced voice to the conversation and, amongst other things, discussed the need for a common language between sound designers, composers, directors and producers. And after a fantastic exploration by Martin Stig Andersen of perceptual categories of sound along the lines of Michel Chion, Doron Reizes complemented it by outlining several of the commonalities between sound design and music in terms of shared methodologies, procedures and tools. So what other voices can be added to the conversation? What else can be seen from the vantage point of their shoulders?
Use of musical principles in sound design
When Walter Murch wrote that most sound effects are “half language, half music”, his phrase implicitly suggested two of the main directions in which we can explore the powers inherent to sound design. And although both branches would involve disciplines such as psychology, sociology, philosophy and anthropology, one would be better studied by linguistics, semiotics and the like while the other would involve music theory, music analysis, ethnomusicology, etc. So let us proceed for the time being with some initial observations of the building blocks of music and how they can be, and often are (consciously or not), used in sound design.
In the same way in which we can differentiate between voices, atmospheres/backgrounds/ambiances and effects (be it hard-cut or Foley), we can differentiate several basic building blocks of the western musical language like rhythm, melody, harmony, dynamics, articulation, silence, form, phrasing and orchestration.
In a nutshell, we can say that rhythm is the way in which sounds inhabit musical time. What exactly do we mean by that and how does it happen?
The structure of time in western music is given by the constant, regular repetition of implicit beats or pulses, much like the ticking of a clock or the beating of the heart. And although we don’t always hear it, this structure is always there, and every sound that we hear is heard in reference to that ‘grid’. In some cases the pulses will be repeated at long intervals (a slow pace) and in others at short intervals (a fast pace). That, in music, is called tempo -the speed of the pulses- and it’s measured in beats per minute (bpm). In sound design, time tends to be more flexible and instead of a fixed grid (such as what would arise from considering frame rates as the basic time structure) we have an elastic, ever changing canvas on which sound events happen and evolve. Time in film-sound is measured against a frame of reference woven by three different structures: the visuals (internal, the rhythm of the images within a shot; and external, the pace given by the edit), the story (depending on the ways in which time is handled from a plot based point of view) and other sounds (such as music, per se, or patterns of periodic sound events such as clock ticks, engines, ringtones, bells or even sea-waves). As sound designers, we have the choice of either going along with the pace dictated by the structure, opposing it, or alternating between both. It’s worth mentioning at this point Michel Chion’s idea of empathetic and anempathetic effects. In Audio-vision, Chion explains how “On one hand, music can directly express its participation in the feeling of the scene” (empathetic music) or it “can also exhibit conspicuous indifference to the situation” (anempathetic music). In this regard we should also highlight what Walter Murch states in the prologue to Audio-vision: “(…) the greater the metaphoric distance, or gap, between image and accompanying sound, the greater the value added—within certain limits.” So how does this all connect?
When two or more different art-forms (like music, sound design, cinematography and performance) converge, as is the case with film, they can either support or oppose what the others are conveying. And in the case in which they oppose one another, such ‘cognitive dissonance’ creates a semantic gap that has to be filled by the spectator’s mind, thus engaging them further with the film.
On the subject of tempo, another important aspect to consider is that it can gradually speed up (accelerando) or slow down (rallentando), implicitly creating a sense of direction (or vectorization, in Chion’s terminology, when he refers to sound’s influence on the perception of time in the image): “(…) sound vectorizes or dramatizes shots, orienting them toward a future, a goal, and creation of a feeling of imminence and expectation.” On the other hand, I must stress the importance of patterns in the management of the audience’s expectations. When we create a sense of repetition, several questions arise immediately in the spectator’s mind at an almost subconscious level: will it continue? for how long? will it be broken? if so, how will it be resolved? In the opening of Fellini’s 8 1/2 (1963), we don’t hear many of the sounds we would normally expect to hear during a traffic jam scene. Instead, amongst the few sounds that we do hear, we perceive a regular percussive event, halfway between music and sound design. Its incessant beat creates tension and suspense, through the mechanism we’ve just mentioned, up until the moment when the trapped character is finally able to escape the vehicle.
Continuing with our exploration of rhythm, we must admit that music would be incredibly monotonous if the only thing we had was repeating sounds at constant intervals. So there are two more aspects that influence our perception of rhythm: the notion of accents and the subdivision of time. Firstly, in a way similar to speech, the musical discourse places accents on certain beats. And although there are plenty of exceptions, most often those accents appear at regular intervals, every certain number of beats. This is what gives rise to what we know as meter. And culturally, we are conditioned to associate particular meters with certain genres of music or with specific moods. For decades, for example, meters with accents happening every three beats have been associated with traditional European dances and carols, while meters with accents happening every two beats have been associated with marches. Secondly, the time between beats can be divided in any number of ways, of which the most common are halves, quarters, thirds, eighths and sixths. And so, each note can potentially have the duration of a complete beat, several beats, one of the previously mentioned subdivisions or a combination of them. And although in sound design we don’t have such a regular repetition of beats and accents as in music, as mentioned before, we do have the option of establishing repetitive patterns and placing accents on them. And as with tempo, we also have the chance of following or opposing those patterns with other particular sound events, depending on where we choose to place them (on the beats, on subdivisions of those beats, on the accented beats, on other beats, etc.).
Interestingly, in ‘Speech, Music, Sound’, professor Theo van Leeuwen explains how in many of the so-called developed or first world nations, music tends to affirm the beat, whereas in third world countries or first people’s cultures (which have a more flexible understanding of time), music tends to subvert the beat and the notion of a grid (like ska, reggae, salsa, most Indian music, Maori Haka, etc. -to name a few of the most well known examples). This opposition to the pulse by placing emphasis on the off-beats is what is known as syncopation. As an example of accents and subdivisions in sound design, going back to the Fellini example mentioned above, it’s also interesting to notice the rhythmical interplay between the somewhat musical beat and the rest of the sound effects. The character’s sounds of despair (like his hands rubbing the window or his feet hitting the glass) fight the periodicity of the beat in the same way in which the character struggles to escape the car, which at the same time also parallels the main theme of the film: a director’s inner war against creative stagnation.
In music, as opposed to sound design, sounds have a definite, clearly identifiable pitch given primarily by their fundamental frequency. Melody is what we call the units or musical phrases formed by the succession of such pitches over time. As with rhythm, melody also vectorises the listener’s attention. But unlike rhythm, this is due to the overall contour or shape formed by changes in pitch (like a graph representing pitch on the vertical axis and time on the horizontal one) rather than patterns of durations. This pitch logic is another one of those devices constantly used (again, knowingly or unknowingly) in sound design. And I say pitch logic instead of melody, given that in sound design we work mostly with sounds that don’t necessarily have a high ratio of fundamental frequency loudness to that of their harmonics, making them less tonal and more noise-like. This is not to say that sound effects can’t be ‘tuned’. For example, in Steve Reich’s ‘Different Trains’ (1988), the pitch logic of the voice is mimicked by the musical instruments, turning it into melodies; in Erik Gandini’s ‘Surplus’ (2003), the pitch logic and rhythm of Tania’s speech (when talking about rice and beans) is taken out of context and disembodied to be used with a melodic function in the music track for the rest of the scene.
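For readers who like to see the arithmetic, here is a minimal sketch of the ‘grid’ described above: tempo fixes the length of a beat, subdivision splits each beat into equal parts, and meter marks every n-th beat as accented. The function names and parameters (`beat_duration_seconds`, `grid`, `accent_every`) are my own illustrative choices, not an established tool, and the sketch assumes a perfectly steady tempo, which real performances and film sound rarely have.

```python
def beat_duration_seconds(bpm):
    """Length of one beat in seconds at the given tempo (beats per minute)."""
    return 60.0 / bpm


def grid(bpm, beats, subdivisions=1, accent_every=None):
    """Return a list of (time_in_seconds, is_accented) for every grid point.

    subdivisions: equal parts each beat is split into (2 = halves, 3 = thirds).
    accent_every: accent every n-th beat, which is what gives rise to meter
                  (3 for a waltz-like feel, 2 for a march-like feel).
    """
    step = beat_duration_seconds(bpm) / subdivisions
    points = []
    for i in range(beats * subdivisions):
        on_beat = i % subdivisions == 0       # subdivisions fall between beats
        beat_index = i // subdivisions
        accented = bool(accent_every) and on_beat and beat_index % accent_every == 0
        points.append((i * step, accented))
    return points


# Six beats at 60 bpm with an accent every three beats (a triple meter).
for time_s, accented in grid(60, 6, accent_every=3):
    print(f"{time_s:4.1f}s {'STRONG' if accented else 'weak'}")
```

A sound event placed on one of the accented points goes along with the pattern; one placed on an unaccented subdivision pushes against it.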
In this example from Harán Arámbula (2005) by Lina Pérez, the water droplets in the atmos are tuned and organised to ‘play’ the main melody right before it actually appears in the percussions; almost like a canon between sound effects and music.
On the other hand, through our personal experience, as well as through cultural conditioning (and in a huge way thanks to Treg Brown’s sounds for Wile E. Coyote and the Road Runner), we have come to associate melodic changes with changes in tension or in potential energy (e.g. rising pitch signaling springs being compressed or bow strings being stretched), changes in speed or kinetic energy (such as machines speeding up or being shut down as well as fast passing objects with Doppler effect) and changes in vertical direction (like the sounds of bombs falling from the sky). In the opening of Back to the Future (1985), the rising pitch when Marty turns the knobs of the amplifiers greatly increases the tension already generated by the visuals, while the lowering pitch in the factory fight at the end of Terminator 2, when the T-800’s arm is trapped in the giant cogs, indicates a reduction in the Terminator’s energy levels. In addition, due to a certain degree of naturally occurring acoustic phenomena as well as the frequency response of our hearing mechanism (which doesn’t respond well to bass, so low pitched sounds need more energy to be perceived as loud as others), we have also come to associate pitch with size and weight. Large objects tend to have a lot of low frequencies (favouring low pitch) and small, light objects mostly possess high frequencies, thus allowing us to play with these conventions when using sound design for characterisation.
Aside from these connections, melody also finds its equivalent in sound design in the category of the effects. In music, melody tends to grab the listener’s attention more than harmony, which, put in terms of gestalt psychology, would make melody the figure and harmony the ground. In the same way, effects are at the centre of the spectator’s attention, making atmos sounds the background. But more interestingly, from the point of view of R. Murray Schafer’s study of soundscapes, effects (and melodies) would equate to signals, and atmos (as well as harmony) to keynote sounds. Why does this matter? Mainly because when we refer to signals and keynote sounds, we not only establish hierarchies but also relationships. It’s not only a matter of foreground versus background, but a matter of providing a reference within a semantic relationship: “(…) the ground exists (…) to give the figure its outline and mass. (…) the figure cannot exist without its ground; subtract it and the figure becomes shapeless, nonexistent.”[16a] The hard cut effect of a bird chirp against the atmos of a forest is completely different from the same chirp laid over the atmos of a factory.
Also in the context of melody, it’s important to discuss leitmotifs and polyphony and their relevance for sound designers. Most of us are familiar with the use of musical leitmotifs in film. A motif is a recognisable melodic unit, and it becomes a leitmotif (a device used extensively by Wagner) when it is associated with particular characters, locations, themes, objects or situations: the Imperial March of Star Wars, the pastoral hobbit theme in The Lord of the Rings trilogy, the Superman theme or Glinda’s theme in The Wizard of Oz, to name a few. The reasons behind the way in which leitmotifs work seem to be very Pavlovian:
“A dependency relationship is generated between two stimuli when a neutral stimulus that doesn’t necessarily produce a reaction (e.g. a sound or set of sounds arbitrarily chosen), is presented together with a significant stimulus that does produce some sort of reaction (e.g. a plot event or a particular image that evokes specific emotions). And whomever perceives these two stimuli simultaneously in several occasions will end up having the same reaction to the neutral stimulus as to the significant one even if the neutral one has been removed.” [16b]
But what about leitmotifs in the Elements of the Auditory Setting (using a term coined by Michel Chion)? There are two excellent examples of their use in Christopher Nolan’s ‘The Dark Knight’ (2008). For this film, Hans Zimmer created one of the aforementioned ‘sonic centaurs’ as a leitmotif that accompanies (and often prepares) several of the appearances of the Joker: a highly dissonant sound that lies halfway between music and effects. At the same time, the sound design team, led by Richard King, presented us with a very effective and interesting use of the Shepard tone as a leitmotif for the Batpod.
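For those unfamiliar with the Shepard tone, the following is a minimal sketch of the principle behind it, and in no way a description of how the film’s sound team built theirs. The illusion stacks sine partials one octave apart and fades them in at the bottom and out at the top as they glide upwards, so the overall pitch seems to rise without ever arriving. The function name `shepard_components`, the 27.5 Hz base, the eight octaves and the raised-cosine loudness curve are all assumptions chosen for illustration.

```python
import math


def shepard_components(position, base_hz=27.5, octaves=8):
    """Frequencies and gains of the octave-spaced partials at one point of the glide.

    position: 0.0 is the start of the glide; 1.0 is one octave higher, where
              the set of partials is identical to the one at 0.0 again.
    Returns a list of (frequency_hz, gain) pairs, one per partial.
    """
    components = []
    for k in range(octaves):
        # Partial k sits k octaves above the base and slides up with position,
        # wrapping back to the bottom once it passes the top octave.
        octave_pos = (k + position) % octaves
        freq = base_hz * 2.0 ** octave_pos
        # Raised-cosine gain over log-frequency: silent at both extremes,
        # loudest in the middle, so partials enter and leave unnoticed.
        gain = 0.5 - 0.5 * math.cos(2.0 * math.pi * octave_pos / octaves)
        components.append((freq, gain))
    return components


# The lowest partial starts silent while the middle one is at full level.
for freq, gain in shepard_components(0.0):
    print(f"{freq:8.1f} Hz  gain {gain:.2f}")
```

Sweeping `position` repeatedly from 0 to 1 and summing sine waves at these frequencies and gains would give the endlessly rising effect; a smooth render also needs continuous phase for each partial, which this sketch leaves out.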
Finally, within the language of western music, polyphony refers to the interplay of simultaneous melodies. And if we continue the equivalence between effects and melodies, we come to what appears to be an obvious conclusion: that there is constant use of polyphony in sound design -there seems to be nothing interesting about that. What is interesting though, is the role that psychoacoustics play here. In ‘Dense clarity – clear density’, Walter Murch explains what he calls the “Law of Two-and-a-half”, by stating that
“Somehow, it seems that our minds can keep track of one person’s footsteps, or even the footsteps of two people, but with three or more people our minds just give up – there are too many steps happening too quickly. As a result, each footstep is no longer evaluated individually, but rather the group of footsteps is evaluated as a single entity, like a musical chord. (…) It turns out Bach also had some things to say about this phenomenon in music, relative to the maximum number of melodic lines a listener can appreciate simultaneously, which he believed was three.”
So yes, we can create interesting interplay between different sounds but we can’t have more than three of them simultaneously without compromising clarity. So if it’s three, where does the ‘two-and-a-half’ come from? In the same article, Murch explains how this cognitive limit seems to be less pronounced if the sounds in the foreground of our mental focus belong to different categories of sound design. We could somehow follow two conversations simultaneously but not three. But if the third element was not voice but music or an effect, clarity wouldn’t be so compromised. So I ask myself if this is due to the composite nature of our ‘sonic centaurs’. Is this because speech and music are apparently processed in different regions of the brain and sound effects fall somewhere in-between? In an article titled “Structure and function of auditory cortex”, a team of cognitive scientists present a hypothesis according to which
“(…) the predominant role of the left hemisphere in many complex linguistic functions might have arisen from a slight initial advantage in decoding speech sounds. The important role of the right hemisphere in aspects of musical perception – particularly those involving tonal pitch processing – might then have been in some sense a consequence of, and is complementary to, this specialization of language.” 
If melody is the name we give to the units formed by the succession of pitches over time, harmony is the name given -in western music- to the relationships between simultaneous sounds. The most basic one is that of consonance and dissonance, in which the former refers to combinations of sounds that are ‘pleasant’ and ‘stable’, while the latter refers to the opposite. And although the judgment on which relationships are consonant and which dissonant is largely determined by culture, what matters to us is that, within our current cultural and historical conditions, as sound designers we can tap into these codes to convey particular emotions of pleasure or displeasure to the audience depending on the sounds we choose. But how do we choose? We can either make these choices based on our own subjectivity and intuition, or we can consciously ‘tune’ the fundamental frequencies of particular sounds by shifting their pitch and creating pitch relationships that are known to be either consonant or dissonant to our audiences. For example, current Western musical practices seem to agree on classifying octaves and unisons as the most consonant of musical intervals, then fifths, fourths, thirds and sixths, seconds, sevenths and finally, the most dissonant, augmented fourths (enharmonically the same as diminished fifths), also called tritones.
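As a small worked example of ‘tuning’ one sound against another, the sketch below assumes twelve-tone equal temperament, in which an interval of n semitones corresponds to a frequency ratio of 2^(n/12). Given the fundamental of a reference sound and that of the sound we want to move, it returns how many semitones to shift. The names `SEMITONES`, `interval_ratio` and `shift_to_interval` are hypothetical, and estimating the fundamental of a noisy effect in the first place is a separate problem not covered here.

```python
import math

# Sizes in semitones of a few intervals (major forms where the name is ambiguous).
SEMITONES = {
    "unison": 0,
    "major second": 2,
    "major third": 4,
    "perfect fourth": 5,
    "tritone": 6,
    "perfect fifth": 7,
    "major sixth": 9,
    "major seventh": 11,
    "octave": 12,
}


def interval_ratio(name):
    """Equal-tempered frequency ratio of the named interval."""
    return 2.0 ** (SEMITONES[name] / 12.0)


def shift_to_interval(source_hz, reference_hz, name):
    """Semitones to shift the source so it sits the named interval above the reference."""
    target_hz = reference_hz * interval_ratio(name)
    return 12.0 * math.log2(target_hz / source_hz)


# A hum at 100 Hz against a drone at 110 Hz: how far to move it for each interval?
for name in ("perfect fifth", "tritone"):
    print(name, round(shift_to_interval(100.0, 110.0, name), 2), "semitones")
```

Choosing the fifth gives a relationship most western listeners hear as stable; choosing the tritone gives one they tend to hear as tense.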
The different layers of atmos in the opening sequence of Woods of Charol (2006) by Ana María Méndez S. were processed with vocoders in order to ‘tune’ them and produce chords.
Another device sound designers have taken from harmonic practices is the use of what’s known as a pedal. It consists of playing a note for a long time while overlaying different melodies or harmonies on it; in sound design, this is what we often refer to as a drone (used extensively in David Lynch’s films). Pedals and drones create a kind of stasis due to their unchanging nature while at the same time generating expectations in the listener in a way similar to the rhythmic patterns previously described: after hearing it for a while, the audience will start to unconsciously ask themselves when and how it will end.
On that trajectory, when specific sets of sounds (based on juxtaposed thirds and sixths) are played simultaneously we get chords. And from the Baroque period the use of chord sequences subject to certain rules established a system made up of sets of patterns to which western listeners have become habituated (tonal music). In this system, chords tend to gravitate towards a particular central chord known as the key. So in a way, harmonic practices in tonal music (which include most popular music of the twentieth and twenty-first centuries) are based on a game of riddles in which the listener is teased by presenting well-known patterns and either fulfilling or breaking those expectations -not unlike what directors, editors and sound designers do.
Another result of the classification of pitches into harmonic systems and their corresponding rules is the appearance of modes. Modes are organised sets of notes that relate to one another hierarchically and, as with chords, a sense of tension is created when using notes that are ‘far’ from the tonic (centre). Historically, there have been multiple theories of emotion in music that establish links between each mode and particular emotional states. But even if nowadays there is still extensive use of multiple modes in genres such as Jazz or academic art music (classical music), in popular western music there are two predominant modes: major and minor. Major mode is normally associated with happy, bright, extrovert moods while minor is connected to sadness, introversion and darkness. But human emotions are incredibly more varied and complex than just those two basic states: happy and sad. In affective science it’s not hard to find lists and classifications of tens and hundreds of different emotions like awe, excitement, gladness, happiness, tension, fear, anger, distress, tiredness, boredom, sadness, grief, ease, serenity, pride, courage, embarrassment, love, empathy, gratitude, pity, guilt, apathy, hatred and loneliness, to name a few. But can either music or sound design convey all kinds of emotions? In “Music and Emotion- Seven Questions, Seven Answers” (a must read text), Patrik Juslin explains how
“Recent evidence from a handful of survey studies suggests that music can evoke quite a wide range of affective states. Among the most frequently felt musical emotions, according to these survey studies, are: happiness, calm, nostalgia, love, sadness, interest, hope, excitement, and longing, as well various synonymous emotion terms (…) In sum, the findings from studies so far suggest that music listeners could experience anything from mere arousal, chills, and ‘basic’ emotions (e.g., happiness, sadness) to more ‘complex’ emotions (e.g., nostalgia, pride), and even ‘mixed’ emotions.”
Important as the theories about the mechanisms that arouse emotion in music are, this is quite a complex topic that exceeds the scope of this text. However, this is definitely an open invitation to participate in another conversation. Paraphrasing Juslin: Which emotions can sound design arouse? In what contexts do these emotions occur? How does sound design arouse emotions? We can even start asking ourselves questions like ‘what are the equivalents in sound design to major and minor modes?’
What we can say in the meantime is that many of us would definitely agree that the goal of sound design is to tell stories: communicating ideas, playing with people’s expectations, conveying emotions, generating worlds, giving life to characters. As Randy Thom says, “The biggest myth about composing and sound designing is that they are about creating great sounds.” There also seems to be some consensus in stating that in sound design it is often easier to convey negative feelings than positive ones, and that it’s often easier to create artificial sounding entities rather than organic ones. Thus, Karen Collins confirms in her text “The Sound of Music, or the Music of Sound?” that, “It’s not surprising that many films that blur the distinction between music and sound design tend to be from the science fiction or horror genres”. Furthermore, she mentions how these sounds aren’t often only disconcerting, unfamiliar and unsettling but many other times can become caricatures: genre conventions for slapstick and cartoons.
So even if sound design apparently can’t yet express the same range of emotions as music, it may be because the conventions of music for the expression of emotions have had more time of exploration and conditioning of audiences, which means we need to keep exploring, experimenting, sharing and educating. And while we further inquire into what those tools proper to sound design are, we may need to keep borrowing elements from other languages such as music. We may, for example, have to dig deeper into other areas of study such as psychology, soundscape studies, sociology, semiology and linguistics. So let’s try to understand better, amongst many other things, how archetypal sounds, speech intonation and animal calls can relate to sound design. For example, in the same sequence I previously mentioned from Fellini’s 8 1/2, the pitch variations and ‘intonation’ of the rubbery squeaks of the character’s hands against the window have an astounding resemblance to dog whines, which, in my opinion, goes some way towards explaining why they work so effectively in communicating the idea of defencelessness. Another example of the kinds of seemingly unrelated elements that we can study is the research I’m currently starting on the parallels between certain uses of sound design and ‘figures of speech’ in verbal communication. How do aspects of language such as metaphors, allegories, puns and euphemisms find their way into sound design?
Dynamics and articulation
After that digression into the expressive potential of film sound, let’s return to how music and sound design are related. The next elements to discuss are a lot closer to standard practices in our art-form, but it’s still worth raising awareness of their use within this context. The word dynamics, in music as well as in sound engineering, refers to changes of loudness over time. We constantly use crescendos and decrescendos as means for vectorising the audience’s perception of time. By creating a tendency, we generate expectations. In that famous scene from The Godfather (1972) in which Michael shoots Sollozzo and McCluskey, not only does Walter Murch use the train sound as a metaphor for the turning point in Michael’s story, but he also uses the formal qualities of that particular sound (dramatically increasing volume) to vectorise our attention and expectations towards the moment when Michael finally shoots. In the final airport scene in Twelve Monkeys (1995), on the other hand, the team working with supervising sound editor Peter Joly and re-recording mixer Mick Boggis carefully vectorise the audience’s attention by gradually decreasing the volume (decrescendo) until only a few voices and an alarm are left on a relatively silent base (created by processing the atmos track heavily with the diffuse reflections of a reverberation). Then, after the alarm dies off, spectators are denied the sounds of evident on-screen actions (like the protagonist bumping into someone else), further increasing the expectations until the moment when we finally hear a single sound effect: the gunshot that kills the protagonist.
Aside from crescendos and decrescendos, two other common dynamic resources used by sound designers are sforzando (sounds that are emphasised by playing them louder than other surrounding sounds) and fortepiano (a sound that starts loud and suddenly becomes quiet). The evolution of a sound’s loudness over time: sounds familiar? For sound designers, this is nothing other than an envelope. And although in music those envelopes have more to do with loudness than with frequency (whereas both parameters are considered in sound engineering), there are still alterations to the timbral qualities of instruments when such dynamic elements are applied.
On the topic of envelopes, we should also mention articulations. In music, articulation refers to interpretation techniques in which notes are played either continuously or separated from one another. In the context of sound design this could probably refer to what Tomlinson Holman calls ‘grammatical use of sound’. In “Sound for film and television”, when Holman coins this term, he refers to how “(…) sound provides a form of continuity or connective tissue for films” in a way that wouldn’t differ too much from Chion’s linearization of time.[26a] But if we reflect upon what grammar is, we can expand that term and hence further develop the possibilities that this role of sound can offer. Even for non-linguists such as myself (and I reiterate the invitation for interdisciplinary collaboration), the term grammar implies syntax: sets of rules, structure, order, logical relations. Within this context, sounds then have the potential for linking ideas, separating them, organising, establishing hierarchies and grouping different elements into units (which is a bit closer to Chion’s idea of punctuation).[26b]
Back in the field of music, we can thus mention a few musical articulations that exhibit potential for sound designers: tenuto, marcato, staccato and legato. And although these are not all the articulations that exist in the musical language, and although grammar and syntax weave conceptual webs far more complex than what these few techniques can offer, these examples can help unlock a door that will potentially lead to interesting new explorations. Tenuto consists of emphasising a particular sound through slightly increased duration or loudness. We’re talking about subtleties here, the kind of subtleties that re-recording mixers orchestrate. And it’s also worth mentioning here that sound design is understood in this text as something that comprises both editing and mixing. Following on, marcato means to accentuate something heavily or forcefully. But interestingly, this doesn’t necessarily refer to loudness and duration alone; it also considers character. How many times have we heard gentle sounds that are played loud in the mix to cut through busy atmos tracks, taking them out of context? So accents for us should go hand in hand with character and with the idea of acting and interpretation with which Foley artists so gracefully play. Staccato and legato, on the other hand, refer to either separating sounds by means of shorter durations (without accentuating) or playing them smoothly and connected.
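To make the link between musical dynamics and envelopes concrete, here is a minimal sketch that treats each marking as a gain curve over time and multiplies a signal by it. The linear ramp, the 50 ms loud onset for fortepiano and all the function names are assumptions for illustration only; a real mix would more likely ramp in decibels and smooth the sudden drop to avoid clicks.

```python
def crescendo(t, duration, start_gain=0.1, end_gain=1.0):
    """Gain at time t (seconds) of a linear ramp lasting `duration` seconds.

    Swapping start_gain and end_gain turns it into a decrescendo.
    """
    progress = min(max(t / duration, 0.0), 1.0)   # clamp to the ramp's span
    return start_gain + (end_gain - start_gain) * progress


def fortepiano(t, loud_gain=1.0, quiet_gain=0.2, hold=0.05):
    """Loud onset for `hold` seconds, then an immediate drop to a quiet level."""
    return loud_gain if t < hold else quiet_gain


def apply_envelope(samples, sample_rate, envelope):
    """Scale each sample by the envelope's gain at that sample's time."""
    return [s * envelope(n / sample_rate) for n, s in enumerate(samples)]


# A constant signal shaped first by a two-second crescendo, then by a fortepiano.
signal = [1.0] * 5
print(apply_envelope(signal, 2, lambda t: crescendo(t, 2.0)))
print(apply_envelope(signal, 10, lambda t: fortepiano(t, hold=0.2)))
```

A sforzando, in these terms, is simply one event given a higher gain than its neighbours rather than a curve applied within a single sound.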
These musical terms seem to be referring primarily to micro-structures such as single notes or musical phrases. But we must reiterate that depending on the syntax, as we concluded from our enquiry into the ‘grammatical use of sound’, and on the sound events’ meanings derived from the context in which they appear, we can make use of articulations in sound design within larger parts of the structure of the piece.
We’ve spoken about loudness and how it affects our perception of the sonic sphere. But loudness is never loudness per se. Something is always loud in relation to something else. A particular sound or passage may be perceived as loud depending on the tolerance levels and the hearing health of the listener, but also, to a great extent, depending on the passages that precede and succeed it. The subjectivity of our perception of loudness is greatly reliant on contrast. After we are exposed to sounds of high intensity, the stapedius reflex, or acoustic reflex, contracts the muscles in our middle ear, reducing our perception of loudness. “Filmmakers often ignore the fact that continuous loud sound is no longer perceived as loud by the audience because the aural reflex ‘turns down the volume,’ making the scene less effective than expected.” So silence is crucial for the perception of loudness.
“The sound of nothing is hard to convey by itself. It is best accomplished by contrasting it with the sounds of something that suddenly go away. The more the contrast, either in volume, density, or variety of sounds that completely go away when you enter the room, the greater the sense of emptiness and isolation.”
As many have mentioned before, pure, absolute silence is very rare in film. On the one hand, it tends to be perceived by the audience as a technical fault in the projection system; on the other, it may break the audience’s suspension of disbelief by making them aware of their surroundings. However rare, there are still plenty of examples. In Babel (2006), the sound team led by Martín Hernández and director Alejandro González Iñárritu used absolute silence as a device to convey the subjectivity of Chieko Wataya, a deaf-mute girl. Other interesting and well-known examples are the spacewalk scene in 2001: A Space Odyssey (1968) and the opening sequence in Contact (1997).
There is yet another type of silence that sound designers employ in film: relative silence. This is when there are still sounds left, but they are soft and in the background of our mental representation of the film soundscape. Silence is, in this case, born entirely out of contrast. The scene previously mentioned from Twelve Monkeys is a good example of this. Also, the scene from Polanski’s The Pianist (2002) in which the tank fires at the protagonist uses this technique by filtering out the high frequencies from most sounds after the explosion and leaving us with only one clearly audible sound: the high-pitched tone as signifier for tinnitus, partial deafness, shock and confusion.
So silence is in fact another tool in the palette of sound designers. But it’s not just a matter of muting things. It has to be prepared. One way of preparing it is by adding reverberation with a long decay time to the sounds just before the ‘silent’ section, like in the Twelve Monkeys example mentioned above. Another way of preparing silence is by very gradually reducing the number of sound elements during key emotional plot points, as in one of the scenes at the end of The Godfather Part III (1990). In this sequence, when Mary is shot, all the layers from the atmos tracks are gradually removed until we’re only left with her voice, Michael’s voice and a few Foley sounds that correspond to her actions (e.g. her earrings), immersing us into Michael’s subjective, traumatic experience of the death of his daughter. A third way to introduce silence consists of quickly transitioning from loud to quiet by distracting the audience’s attention with a prominent sound. This is the case of the famous Omaha beach sequence in Saving Private Ryan (1998), where a low frequency sound is pitch bent downwards, introducing us to a sequence dominated by relative silence to illustrate the protagonist’s perception of the situation.
We have explained the types of silence and the methods to prepare it, but what does silence mean? If sound has roles in film, what are the roles of silence? In Western society, silence has multiple connotations. A few of them are on the positive side of the emotional palette (respect, spiritual or philosophical reflection, peacefulness or relaxation and rest) while the majority relate to negative feelings (disinterest, awkwardness, tension, anger and hostility). And all of those associations can be appropriated by sound designers. But on an even broader scale, other than triggering culturally specific emotional conditioning or highlighting the impact of a particular event through contrast (as previously mentioned), silence can be used to allow the audience to reflect upon an important event that has just happened, accentuate something through anempathy and emotional dissonance, or prepare the audience for the representation of hyper-awareness of a character’s ‘point of hearing’ (akin to ‘point of view’). A good example of accentuation through anempathy is the scene where Justin Quayle is shot in The Constant Gardener (2005). After a long preparation through a parallel montage where the killers drive towards him, the story-lines converge without allowing us to see them meet. At that point, from extreme close-up shots of his eyes, we go to a wide shot of a flock of birds flying away as if scared by a gun shot we never hear. The fact that the sound of the guns is omitted does several things: it says that his story doesn’t resonate outside of his own individual experience; no one will ever know. Also, by denying it, the sound crew and the director are forcing our minds to ‘fill in the gap’. And by completing that sound with our own sound, we engage even further with the story. Our experience of the film changes from passive to active; from spectators, we become makers.
Something very similar happens at the end of The Godfather Part III, right after the sequence we previously mentioned. Hurt by the death of his only daughter, Michael screams to the camera in a series of close-up shots. His expression is that of a very loud and painful scream. But we don’t hear it. Silence heightens the intensity of this emotionally significant event. When we see such a confronting image without hearing what we’d expect to hear, our mind is driven by expectations that, in this case, are delayed for several seconds until we finally hear the protagonist scream and breathe and, at last, we are able to catch our breath together with Michael.
Let’s not forget, however, that silence is to be treated with subtlety and, like any other creative device, it is to be employed only when it’s needed.
“I don’t think we should assume that silence in a movie is always a blessed event of artistic genius. (…) it’s easy for silence to be just another sound cliche, as heavy handed as any clunky musical cliche, waving a red flag that says profundity. (…) Of course there are moments in films where silence is appropriate. It could be used more often than it is now. If there is an organic way to work it in, why not?”
The structure of ‘sound design compositions’ (the way in which sounds are organised in time, how they’re grouped into larger units and how those parts relate to one another) is somehow mediated by the plot and the visuals. However, sound designers can analyse and extract information from those two frameworks and shape it in ways that maximise the story-telling potential of sonic language. Most popular songs go from an introduction, to alternations of verses and choruses, to bridges, solos, more repetitions of the musical material already used and then a closing section (or outro, as opposed to intro, for some people). In academic art music, these structures can be far more complex. Take the well-known sonata form, for example. A first set of musical material is presented, followed by a transition and then a secondary set of material that leads to a conclusive section. After this, either all or just portions of the material previously presented are developed and transformed. Subsequently, the original sets of materials (including the transition and conclusive passage) are re-stated, finalising the piece with a coda: an even more conclusive section that helps wind down the momentum of the piece.
What kinds of structures can we then give to our sound, within the context of film? In “Sound Design”, David Sonnenschein suggests the idea of drawing visual maps. In this graph, he suggests that we represent “(…) patterns that will give [us] (…) clues for building the sound design structure.” In his model, Sonnenschein proposes representing time along the horizontal axis and ‘emotional intensity of the story’ on the vertical one. So this kind of graph could be an excellent starting point for graphing and representing the different escalations that drive the plot of the film. But in addition to that, we can also extract elements like the balance between interior vs. exterior scenes, day vs. night, colour palette transformations or timbral qualities of locations that appear repeatedly in the film (water, fire, wood, city, nature, etc.), amongst many others. The possibilities are endless. So then we can think of organising our material onto these structures, following the escalations previously mentioned. A small example of how reiterated elements can be varied effectively, creating a sense of evolution and direction in relation to the structure, is the picnic scene in Citizen Kane (1941). In it, we shift from the exterior of the tent (a socially active setting with a band playing music and subtle walla of people enjoying the celebration) to the more intimate yet stressful couple’s argument. Following the form suggested by the plot, the atmos and effects alternate between loud in the exteriors and quiet in the interiors. But as the tension rises in the couple’s discussion, the sounds of both parts escalate in volume, with the exception of the shot just prior to the slap. At that point, there is a relative silence, followed by women screaming outside the tent as an externalisation of Susan Alexander’s feelings.
In time-based arts, repetition and familiarity tend to convey the idea of ‘part’ or ‘section’. But constantly using and re-using familiar material will inevitably lead to monotony. This is not to say that constant variation, novelty and surprise are a solution. In fact, too much variation won’t allow listeners to have any points of reference in relation to structure and will easily disengage them from the piece. So the way in which a piece is organised in time is really a result of the search for balance between familiarity and surprise. It is a way of establishing hierarchies, much as cinematographers structure the contents of their shots to create order and coherence of meaning through composition.
Some common techniques used by composers to transform musical material are: inversion (ascending pitch movements become descending and vice-versa), retrogradation (comparable with reversing sounds), retrograde-inversion, interpolation, ornamentation, rhythmical augmentation and diminution (which is like compressing or expanding sounds in time), layering and overlapping, and transposition (pitch shifting whole sections while keeping the overall melodic contour). Several of these techniques can even be heard in action in the music of most acousmatic and concrète composers like Pierre Schaeffer, who in his legendary Étude aux chemins de fer (1948) plays with the idea of structure built on repetition and variation.
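For readers who like to tinker, a few of the transformations listed above can be sketched in a handful of lines of Python. This is only an illustration under simplifying assumptions: a motif is represented as a list of (pitch, duration) pairs, pitches are MIDI note numbers, durations are counted in beats, inversion mirrors the motif around its first pitch, and all the function names are hypothetical rather than taken from any existing library. A sound designer could read 'pitch' loosely, for instance as the amount of pitch shift applied to each sound event in a sequence.

```python
from typing import List, Tuple

# A note is (MIDI pitch number, duration in beats); both units are assumptions.
Note = Tuple[int, float]


def invert(motif: List[Note]) -> List[Note]:
    """Mirror every pitch around the first pitch: ascents become descents."""
    axis = motif[0][0]
    return [(2 * axis - pitch, dur) for pitch, dur in motif]


def retrograde(motif: List[Note]) -> List[Note]:
    """Play the motif backwards (comparable to reversing the event order)."""
    return list(reversed(motif))


def retrograde_inversion(motif: List[Note]) -> List[Note]:
    """Invert the motif, then play the result backwards."""
    return retrograde(invert(motif))


def scale_durations(motif: List[Note], factor: float) -> List[Note]:
    """Rhythmic augmentation (factor > 1) or diminution (factor < 1)."""
    return [(pitch, dur * factor) for pitch, dur in motif]


def transpose(motif: List[Note], semitones: int) -> List[Note]:
    """Shift every pitch by the same interval, keeping the contour intact."""
    return [(pitch + semitones, dur) for pitch, dur in motif]


# A made-up four-note motif: C4, E4, G4, F4.
motif = [(60, 1.0), (64, 0.5), (67, 0.5), (65, 2.0)]

inverted = invert(motif)
backwards = retrograde(motif)
augmented = scale_durations(motif, 2.0)
shifted = transpose(motif, 2)
```

Because each function returns a new list, the transformations can be chained, which is roughly how a single idea can be repeated and varied across a piece without ever sounding exactly the same twice.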
Arranging and Orchestration
In the same way in which orchestrators and arrangers assign different instruments to different parts of compositions, sound designers and sound editors use different timbres to convey specific ideas and emotions. When deciding between closed voice harmonies (chords where voices are kept within a narrow range) and open voice harmonies, arrangers are essentially making the same decision a sound designer or sound editor would make when deciding what layers to use for a specific sound. In the case of an explosion for example, it’s more effective to use ‘open arrangements’ in which different layers of sound occupy multiple portions of the frequency spectrum, thus stimulating several regions of the basilar membrane and creating the feeling of a ‘large’ sound event. Whereas in subjective sequences that suggest reduced perceptual skills, ‘closed arrangements’ work better by decreasing the range of frequencies the character would be able to hear.
Arrangers also often ‘edit’ compositions, for example, by moving a main theme to the introduction, extending a bridge or condensing a verse in the same manner in which a sound editor could choose to reveal the source of an apparently non-diegetic sound or anticipate the atmos of a scene before the shots that depict it are seen. Both arranging and composition have to do with the systematic management of many of the elements we’ve discussed throughout the text in ways in which the expressive potential of the piece is maximised. Orchestrators, arrangers and even conductors (who would probably resemble re-recording mixers) are in charge of managing people’s attention through vectorisation, establishment of sonic hierarchies, and managing tension-release patterns. An orchestrator as well as a sound designer could possibly start a scene with just a few sounds, like the parallel montage scene at the end of The Godfather (1972), and gradually increase that number in order to generate a sense of direction.
One last thing to mention with regard to orchestration and arranging is the fact that changes of volume won’t necessarily make a particular sound event stand out. It’s not a matter of volume but of how loud we perceive that sound event. So in order to make things ‘cut through’ the mix, the re-recording mixer or sound designer will filter any unnecessary frequencies from individual sounds so that those areas of the spectrum are free and don’t compete with other elements of the piece.
Is sound design music? Is music sound design?
Like every living organism, our beloved sound design, as well as music, has a pulse – it beats just like the heart. And both languages, as we have seen, have plenty of commonalities. But it seems that we always return (and always will) to the question of ‘what IS music’. And although it is definitely not my intention to try to solve this conundrum, I will propose the following starting point without any intention of oversimplifying its complexity: for our purposes, music can be defined as the phenomenon (experienced sensation, perception and cognition) of sounds and silences organised under culturally specified sets of restrictions, with the aim of conveying moods, emotions and/or ideas, and with the implicit intentionality of being experienced as such. Nevertheless, this definition doesn’t consider an important aspect of music: the word ‘play’ can also be understood as engaging in a game; pure enjoyment. And would anybody dare to say that Foley artists and sound designers don’t play when they’re at work?
When master ‘player’ Randy Thom said that “great sound sequences in movies are sequences that are dominated by one category of sound in each moment”[Y], he also suggested that composers and sound designers should negotiate things like ‘who will use specific parts of the frequency spectrum’ if music and sound design are to share particular sequences. After the explorations of the current text (as another episode of our time-honoured conversations), hopefully there will be many more points of contact and communication between both creative areas in order to enrich those negotiations. Aside from specific parts of the frequency spectrum, we now know that all aspects of rhythm, harmony and melody are up for grabs too.
And as sound designers, the more we expand our language by nurturing it with the seeds of other languages, the more fruitful our audio-visual experiences will be. Because in the end, giving a rather literary spin to Mr. Wittgenstein’s logical proposition, “(…) the limits of the language (…) mean the limits of my world.”[Z]
[0A] Balazs, Bela, “Theory of the Film: Sound”, hosted at https://soma.sbcc.edu/users/DaVega/FILMST_113/FILMST_113_0ld/GENERALTHEORY/Soundtheory_Balzacs.pdf
[0B] NIETZSCHE, Friedrich, “The Dionysiac World View” in “The Birth of Tragedy and Other Writings”, translation information not available, Pg. 136-137, hosted at http://archive.org/stream/Nietzsche-TheDionysianWorldView/Nietzsche-TheDionysianWorldView_djvu.txt (compared with the Spanish translation by Andrés Sánchez Pascual, Alianza Editorial, Madrid, 1981, p.253)
 “Most sound effects (…) fall mid-way: like ‘sound-centaurs,’ they are half language, half music.”
Murch, Walter, “Dense clarity – clear density”, hosted at http://transom.org/?page_id=7006, which I recently re-discovered thanks to Karen Collins’ text, “The Sound of Music, or the Music of Sound?”, posted on March 4, 2013 on https://designingsound.org/2013/03/the-sound-of-music-or-the-music-of-sound/ (some excerpts from this section belong to the author’s book “Playing with Sound”, MIT Press, 2013).
 “Scott Gershin put it well when he called himself an ‘audio storyteller'”
Reizes, Doron, “Sound Design and Music: Diluting the Distinctions, Strengthening the Art Form”, posted on March 13, 2013 https://designingsound.org/2013/03/sound-design-and-music-diluting-the-distinctions-strengthening-the-art-form/
 “(…) a word with a clear modern meaning turns out to be used quite differently in early cinema practice: a pianist is not only someone who plays music on the piano but also the employee responsible in many early theatres for providing sound effects”.
Altman, Rick, “Silent Film Sound”, Columbia University Press, 2004, pg. 209.
 Although Plato mentions poets in this particular passage, within the context of the Greek arts, music was one with poetry, drama and dance.
Plato, “The Republic”, Book X, 595a-b, translation by Benjamin Jowett, at http://oll.libertyfund.org/?option=com_staticxt&staticfile=show.php%3Ftitle=767&chapter=93795&layout=html&Itemid=27
 Collins, Karen, “The Sound of Music, or the Music of Sound?”, posted on March 4, 2013 on https://designingsound.org/2013/03/the-sound-of-music-or-the-music-of-sound/ some excerpts from this section belong to the author’s book “Playing with Sound” (MIT Press, 2013).
 “Collaborating with the Music Team – An Interview with Randy Thom”, Posted by Shaun Farley on March 5, 2013 https://designingsound.org/2013/03/collaborating-with-the-music-team-an-interview-with-randy-thom/
 Andersen, Martin Stig, “Audiovisual Correspondences”, posted on March 12, 2013 on https://designingsound.org/2013/03/audiovisual-correspondences/ extracted from the author’s article “Electroacoustic Sound and Audiovisual Structure in Film” (supervised by Denis Smalley) originally published by eContact.
 Reizes, Doron, “Sound Design and Music: Diluting the Distinctions, Strengthening the Art Form”, posted on March 13, 2013 https://designingsound.org/2013/03/sound-design-and-music-diluting-the-distinctions-strengthening-the-art-form/
 Zatorre, Robert J., et al., “Structure and function of auditory cortex: music and speech”, TRENDS in Cognitive Sciences Vol.6 No.1 January 2002, hosted at http://web.mit.edu/hst.722/www/Topics/LeftRight/Zatorre%20et%20al%202002.pdf
 For more information on how the brain processes speech, I suggest reading the following article: Georgetown University Medical Center, “Scientists Reaching Consensus On How Brain Processes Speech”, ScienceDaily, May 27, 2009. Hosted at http://www.sciencedaily.com/releases/2009/05/090526140733.htm
 Juslin, Patrik N., “Music and Emotion: Seven Questions, Seven Answers”, chapter in the book “Music and the mind: Essays in honour of John Sloboda”, ed. Irene Deliege and Jane Davidson, Oxford University Press, New York, 2011
Hosted by the Department of Psychology, Uppsala University, Sweden at http://www.psyk.uu.se/research/researchgroups/musicpsychology/downloads/?languageId=1 and www.psyk.uu.se/digitalAssets/31/31196_Chapter.pdf
Rath Jr, Don, “Introduction to Violin Articulations” hosted at http://donrathjr.com/introduction-to-violin-articulations/
Chad, “Tenuto and Portato”, March 8th, 2010, hosted at http://blog.twedt.com/archives/56
as well as the two other definitions quoted in the same post:
Schirmer, G., “Pronouncing Pocket-Manual of Musical Terms”, New York, ed. Dr. Theodore Baker, 1947
and “Essential Dictionary of Music”, 2nd edition. Alfred Publishing Company, Inc., Los Angeles, ed. Lindsey Harnsberger, 1997.
Collins English Dictionary – Complete and Unabridged © HarperCollins Publishers 1991, 1994, 1998, 2000, 2003
 Berger, Mark in “What is the sound of nothing?”, edited excerpt from CAS webboard by Charles Deenen (message thread: small room ambience – no sound), hosted at http://filmsound.org/QA/creating-silence.htm