HOA Technical Notes - Introduction to Higher Order Ambisonics

So, what is this Higher Order Ambisonics thing?

Well, it's quite a few things. Primarily, it's a mathematical framework for handling 3D sound scenes or "soundfields". It represents the soundfield as it arrives at a listener placed at the centre of the room. This allows us, for instance, to:

  • Examine what soundfield will be created if a speaker layout is driven in a particular way.
  • Stream a representation of what the 3D soundfield should be.
  • Work out how to drive the speaker layout to reconstruct that 3D soundfield as best we can.

This stream, which describes what the sound scene or soundfield should be, is known as B-Format and is largely specific to the ambisonic family of techniques. B-Format looks like an ordinary multichannel audio stream, but the channels do not correspond to speaker feeds. Instead, they combine mathematically to describe the soundfield. This is useful because it means we can put off choosing a speaker layout until later.

Working with Higher Order Ambisonics typically involves a few stages:

  1. Sounds are "encoded" into HOA B-Format, typically by panning, upmixing or recording.
  2. B-Format can be used to transfer the audio from one place to another, without having to worry about what actual speakers will be used.
  3. B-Format can be manipulated in various ways. For instance, it is easy to rotate the entire audio scene.
  4. B-Format is "decoded" for playback once the actual speaker layout is known.
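
To make these stages a little more concrete, here is a minimal first order sketch in Python. It is only an illustration and not how Rapture3D or the O3A plugins work internally. It assumes the classic first order channel set (W, X, Y, Z) with W scaled by 1/sqrt(2), X pointing forwards, Y pointing left, and azimuth measured anticlockwise from the front. The function names are made up for this example, and the decoder is a very basic one for a regular horizontal ring of speakers.

```python
import math

def encode(sample, azimuth, elevation=0.0):
    """Pan one mono sample into first order B-Format (W, X, Y, Z).

    Angles are in radians. The W scaling and axis directions are the
    assumptions described in the text above.
    """
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)

def rotate_z(bformat, angle):
    """Rotate the whole scene anticlockwise about the vertical axis."""
    w, x, y, z = bformat
    c, s = math.cos(angle), math.sin(angle)
    # W and Z are unchanged; X and Y rotate like an ordinary 2D vector.
    return (w, c * x - s * y, s * x + c * y, z)

def decode_horizontal(bformat, speaker_azimuths):
    """Basic decode to a regular horizontal ring of speakers (Z is ignored)."""
    w, x, y, _ = bformat
    n = len(speaker_azimuths)
    return [
        (math.sqrt(2.0) * w + 2.0 * (x * math.cos(a) + y * math.sin(a))) / n
        for a in speaker_azimuths
    ]

# Stage 1: encode a sound directly in front of the listener.
front = encode(1.0, 0.0)

# Stage 3: rotate the scene by 90 degrees, moving the sound to the left.
left = rotate_z(front, math.pi / 2)

# Stage 4: decode for a square layout, chosen only at this point.
square = [math.radians(d) for d in (45, 135, 225, 315)]
feeds = decode_horizontal(left, square)
```

Note that the speaker layout only appears in the final step. The same B-Format tuple could be passed to a different decoder for a different layout, which is the point of stage 2 above.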

Rapture3D contains a particularly sophisticated HOA decoder, which can be used for playback in the Rapture3D Player. Internally, the Rapture3D game engine performs encoding, based on the game's geometry information, as well as decoding.

The O3A plugins use third order ambisonics, carried in a 16 channel form of B-Format. The plugins cover panning, upmixing and decoding for standard formats, and Rapture3D can be used for non-standard ones. There are also lots of tools to manipulate B-Format, and more.

How Are Ambisonics And HOA Different?

Classic Ambisonics was developed initially in the 1970s by some very smart people including Michael Gerzon of the Mathematical Institute at the University of Oxford. This work focussed on "first order" formats using up to four channels, along with a stereo-compatible format known as "UHJ".

Classic first order ambisonics uses four channels to capture a full periphonic (3D) sound image (or three channels for 2D). It does this stunningly well considering the bandwidth used! Further, there's a mature microphone technology associated with it, with microphones currently available from Soundfield and Core Sound.

HOA was developed later, by which time formats like 5.1 had already established themselves. HOA extends the mathematical formulation to use more channels, capturing spatial detail well beyond what 5.1 or 7.1 carry. The Rapture3D game engine can be run at up to fifth order, at which point 36 channels are being used to carry the spatial information!
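
The channel counts mentioned in these notes follow a simple rule: a full 3D stream of order N uses (N + 1)^2 channels, and a horizontal-only (2D) stream uses 2N + 1. A quick sketch to check the figures, with function names invented for this example:

```python
def channels_3d(order):
    """Number of B-Format channels for full 3D (periphonic) ambisonics."""
    return (order + 1) ** 2

def channels_2d(order):
    """Number of B-Format channels for horizontal-only ambisonics."""
    return 2 * order + 1

# Orders mentioned in these notes: first (classic), third (O3A), fifth (Rapture3D).
counts = {order: channels_3d(order) for order in (1, 3, 5)}
```

So first order needs 4 channels (or 3 for 2D), third order needs 16 and fifth order needs 36.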

Technical Notes

We have a few technical notes here for those interested in finding out a bit more.

  • More information on B-Format including how to encode a mono source.
  • Soundfield Rotations - it's possible to rotate the entire soundfield while it's represented in B-Format.
  • HOA Decoding of B-Format.