This is a guest contribution by Chris Hegstrom. Chris started his audio career doing live sound for Blue Man Group while studying Music Synthesis at Berklee College of Music. He got into the games industry by way of web audio & shipped console titles like Lord of the Rings: The Two Towers, Star Wars Ep. 3, Burnout Paradise & God of War 3.
In 2010 he joined Microsoft to direct audio for Kinect & then moved on to HoloLens soon after. Chris spent 3 years working on all aspects of HoloLens audio design & direction, from individual experiences to sonic continuity to general audio UX.
Chris left Microsoft in 2015 & is now the owner & creative director of Symmetry Audio & co-founder of AudioVR.
Audio for VR is important, especially for audio people. For us it’s really important, but how important will it be to your everyday developer?
Before we dive into that, a quick back-story:
VR has been around in one form or another for decades, but computing power has only recently caught up to our dreams & given people outside of technology institutes access to head-mounted displays & the chance to experience fabricated worlds in stereoscopic 3D.
& now…the arms race.
It’s an exciting time for technology right now because it feels like everything from camera technology to story-telling techniques to graphics cards has a new goal: Be the VR standard. Be first, be the best & most importantly: be the one that everyone uses.
No one can decide what this new industry needs most but everyone agrees we need:
• Stronger, cheaper, faster, smaller hardware,
• compelling content &
• amazing software tools that can make that content.
Right now, HMDs are too tethered, too bulky, or too low-fi. The untethered HMDs need to improve on display size, fidelity or frame-rate. Tethered HMDs need to lose the cords & improve on the bulky peripherals.
As for software, right now it’s mostly loaners from other industries. Video game engines are scrambling to jump over from screen-based technology into the VR / MxR world, as are video & animation suites.
Internet and broadcast channels are trying to figure out how to stream such beefy content to customers’ VR rigs over terrestrial networks while content makers are trying to figure out how to capture & cast mixed reality content for social consumption.
& everyone is already behind.
So what does any of this have to do with audio? Nothing & everything.
Audio has always been the most mysterious of the artistic disciplines. Lay people understand the least about how it is done & it generally works best if no one notices it.
People are always amazed & perplexed when told that nearly 100% of what they hear in movies is designed after the shoot & that what is recorded on set is almost always unusable.
Live sound is also assumed to ‘just happen’, as if Jack White alone were responsible for delivering that guitar sound to seat 12d & there weren’t a myriad of amplifiers, speakers, outboard gear & voltage controlled oscillators finely tuned to transfer the sound from the stage talent to the patron.
Studio music production dives the deepest into the nuances of audio production, if for no other reason than it is standalone & has no other media (visual or otherwise) to support it.
Game audio is perhaps the most complex example of effortless-sounding intricacy in that the user influences the sound through input, as opposed to the one-way audio flow of passive media.
Or at least it was the most complex before VR. Audio for virtual reality takes all of the aforementioned challenges & adds one new, unexplored exponential complexity: presence.
Presence is the fundamental difference between:
– wow that’s a really cool giant bug idle animation in that video game
&
– holy $#!+ there’s a giant bug towering over me!
In screen based technology (SBT), observation & analysis are objective because it’s a self-contained experience. In VR, gut feelings take over. You can tell yourself that the giant bug isn’t real but thousands of years of survival instinct will make your brain & body react differently.
Reactions are instinctive & self-referencing because you are an elemental part of the experience. It’s the difference between looking at a snow globe on the table in front of you & being inside the snow globe.
If done convincingly presence will make your brain process this virtual information as if it were real….
…which leads us to arguably the most important element of presence done convincingly within VR: audio.
Hearing is wired more directly to human perception than seeing. Our brains receive, recognize & process sound information more quickly & positionally than visual information.
When trying to convince a brain that something is real, sound is the first line of offense. 3D audio not only immerses the user in the virtual space, it anchors them in the individual elements in that space.
For example, when a user is trying to determine if an object is a far away large object or a closer small object, visually, the user could process the relative lighting, the depth of the object within the FOV, the parallax & a plethora of other visual aspects, or the user could move their head around slightly to hear the position of the object.
Survival instinct has trained us to determine the distance of an object by performing dozens of quick micro-movements of the head, almost drawing the sound position based on aural triangulation. Users do the same thing in VR.
Based on volume, reflections and timbre, the relative position of the object becomes much more obvious than it would using visual feedback. The user has an instantaneous understanding of the environment and the elements that populate said environment.
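To make the volume cue concrete, here is a minimal sketch of how loudness falls off with distance. It assumes the textbook inverse-distance law (roughly 6 dB quieter per doubling of distance) & a speed of sound of 343 m/s; the function names are made up for illustration & reflections & timbre are not modelled at all.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at roughly 20 °C


def distance_gain(distance_m, reference_m=1.0):
    """Linear gain under the inverse-distance law.

    Anything closer than the reference distance is clamped to full volume.
    """
    return reference_m / max(distance_m, reference_m)


def gain_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


def propagation_delay_ms(distance_m):
    """Time the sound needs to travel from the source to the listener."""
    return 1000.0 * distance_m / SPEED_OF_SOUND_M_S


for d in (1.0, 2.0, 4.0, 8.0):
    g = distance_gain(d)
    print(f"{d:4.1f} m  gain {g:.3f}  ({gain_db(g):6.2f} dB)  "
          f"delay {propagation_delay_ms(d):5.2f} ms")
```

Each small head movement changes the distance to each ear by a few centimetres, so these numbers shift slightly with every micro-movement. A real spatializer would layer reflections & distance-dependent filtering on top of this.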
Another thing 3D audio does within VR is transcend the aural uncanny valley. If the user is listening to a stereo (or even traditional surround) mix, a bird chirping in the right channel stays in the right channel even when the user moves their head. This breaks the aural illusion because it isn’t how sound would react in the real world.
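The bird example can be sketched in a few lines. This is only an illustration under assumed conventions (+z is forward at yaw 0, +x is to the listener's right, positive yaw means turning right), with hypothetical function names: a world-anchored source changes its angle relative to the head as the head turns, while a plain stereo mix ignores head yaw entirely.

```python
import math


def relative_azimuth_deg(source_xz, head_yaw_deg):
    """Angle of a world-anchored source relative to where the nose points.

    Positive means the source is to the listener's right,
    negative means it is to the left. Result is wrapped to [-180, 180).
    """
    x, z = source_xz
    world_azimuth = math.degrees(math.atan2(x, z))
    return (world_azimuth - head_yaw_deg + 180.0) % 360.0 - 180.0


def head_locked_azimuth_deg(mix_azimuth_deg, head_yaw_deg):
    """A stereo mix played on headphones: head yaw is ignored by design."""
    return mix_azimuth_deg


bird = (1.0, 0.0)  # one metre directly to the right of the starting pose

for yaw in (0.0, 45.0, 90.0, 180.0):
    tracked = relative_azimuth_deg(bird, yaw)
    locked = head_locked_azimuth_deg(90.0, yaw)
    print(f"head yaw {yaw:5.1f}  tracked {tracked:7.1f}  head-locked {locked:5.1f}")
```

With head tracking, turning to face the bird brings it to the centre & turning all the way around puts it on the left. In the head-locked case it stays glued to the right ear, which is exactly the illusion-breaker described above.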
Even if you place a sound on an emitter on a game object, without 3D audio, the user won’t perceive height or depth. This is like hearing sound sources from a flat plane as opposed to a sphere.
This will be interpreted not only as “wrong” but also as muddled & claustrophobic. It would be the visual equivalent of viewing everything pasted onto a flat card instead of placed at various points inside a cube.
Never has audio been so important to an experience as it is in VR. With technology making audio sound clearer, more precise, and more positional than ever before, not having 3D sound will become much more of an obvious omission as users become accustomed to hearing sound “correctly”.
A lot of developers and project leads are raising eyebrows at the processing cost of such audio delivery & it’s true, analyzing the X, Y & Z axes every frame for a 90 fps experience is not cheap.
Work is being done to optimize and streamline audio processing & algorithms are getting faster and more powerful every day but having 2D audio in a reality with 3D visuals and interaction isn’t really an option if the goal is true, transformative immersion.
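To put the 90 fps figure in perspective, here is some back-of-the-envelope arithmetic. Only the frame rate comes from the text. The 48 kHz sample rate, the 10% share of the frame given to audio & the 0.05 ms cost per spatialized source are all made-up assumptions for illustration, not measurements of any real engine.

```python
FRAME_RATE_HZ = 90        # from the text
SAMPLE_RATE_HZ = 48_000   # assumption: a common audio sample rate


def frame_budget_ms(frame_rate_hz):
    """Total time available per rendered frame, in milliseconds."""
    return 1000.0 / frame_rate_hz


def samples_per_frame(sample_rate_hz, frame_rate_hz):
    """Audio samples that elapse during one rendered frame."""
    return sample_rate_hz / frame_rate_hz


def max_sources(budget_ms, audio_share, cost_per_source_ms):
    """Spatialized sources that fit if audio gets `audio_share` of the frame."""
    return int((budget_ms * audio_share) // cost_per_source_ms)


budget = frame_budget_ms(FRAME_RATE_HZ)
print(f"frame budget: {budget:.2f} ms")
print(f"audio samples per frame: "
      f"{samples_per_frame(SAMPLE_RATE_HZ, FRAME_RATE_HZ):.1f}")
# Hypothetical split: audio gets 10% of the frame, 0.05 ms per source.
print("sources that fit:",
      max_sources(budget, audio_share=0.10, cost_per_source_ms=0.05))
```

At 90 fps the whole frame lasts only about 11 ms & audio has to share that with rendering, physics & everything else, which is why per-source cost is the number developers are watching.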
It’s an exciting time for both the VR and the audio industries. When these two industries intersect, the full potential of each becomes even more amazing.
Many thanks to Chris for contributing this month. You can find out more about his work here:
Twitter: @symmetry_audio
Gregg Wilkes says
Chris,
Well done! You detailed the “why” in “3D Audio enables immersion in VR/AR”.