HOA Technical Notes - Decoding

In Higher Order Ambisonics, the main audio stream in use is B-Format. This contains an awful lot of information about sound directivity, and it is deliberately unaware of the speakers that you will use for playback. Because of this, it must be "decoded" for playback. Decoding is sometimes known as rendering.

There are lots of different ways to decode a Higher Order Ambisonic stream. Michael Gerzon did provide a relatively formal definition of "Ambisonic Decoding" for first order "Classic" Ambisonics, but many modern decoders do not work this way.

Rapture3D

One of the main features of our Rapture3D software is a high quality Higher Order Ambisonic decoder. The same decoder core is used in the Rapture3D game engine, music playback engine and professional studio plugins. With Rapture3D, the idea is that you configure it once, and then everything "just works".

The "User" and "Game" editions of Rapture3D come with decoders for a collection of preset rigs that correspond to the standard Windows speaker layouts (such as stereo, 5.1, 7.1) and some variants (such as 3D7.1).

Rapture3D uses some very sophisticated techniques for decoding, taking into account soundfield reproduction (which is a form of "acoustic holography"), wavefront curvature, psychoacoustic cues, HRTFs and more. It also supports HRTF-based headphone decoding, output for surround processors and crosstalk cancelled stereo.

The Studio - O3A Plugins

The O3A Core plugin library includes a number of simple studio decoder plugins to get you started with decoding.

For a richer set of decoders, you can try the O3A Decoding plugin library. Most of these decoders are actually generated using the Rapture3D Decoder Generator, which is at the heart of the "Advanced" edition of Rapture3D. This lets you set up decoders for arbitrary speaker layouts and personalised HRTFs.

Reference Decoders

Over the years, quite a few poor decoders have appeared, which have not always given ambisonics a good name. For instance, the "pseudo-inverse" approach is commonly misapplied to irregular speaker layouts. This gives something that looks right, but often won't sound right!

Here, we recommend some simple low-order decoders. We prefer the Rapture3D ones, but if you're writing a new decoder and it sounds worse than these, you're doing something wrong!

These decoders are represented as matrices. To apply them, take the relevant components from each sample frame of your B-Format and multiply them by the matrix to produce the corresponding sample frame to feed to the speakers. Except for the mono and hexagon decoders, these are based on the decoders provided by Csound's bformdec1 opcode, converted to SN3D. [Up-to-date as of 2017-01-05, any errors being our own.]
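
As a minimal sketch of that multiplication, here is one way to write it in plain Python. The function names pick_channels and decode_frame are ours, for illustration only, and the frame is assumed to be a list of B-Format samples indexed by ACN channel number.

```python
def pick_channels(bformat_frame, acn_indices):
    """Select the ACN channels that a decoder matrix expects as its columns."""
    return [bformat_frame[acn] for acn in acn_indices]


def decode_frame(matrix, components):
    """Multiply one frame of B-Format components by a decoder matrix.

    matrix:     one row per speaker, one gain per input component.
    components: the selected B-Format samples, in the matrix's column order.
    Returns one output sample per speaker.
    """
    if any(len(row) != len(components) for row in matrix):
        raise ValueError("each row needs exactly one gain per input component")
    return [sum(gain * sample for gain, sample in zip(row, components))
            for row in matrix]


# Hypothetical example: a two-speaker matrix that uses ACN 0 and ACN 1 only.
EXAMPLE_ACN = [0, 1]
EXAMPLE_MATRIX = [
    [0.5, 0.5],   # speaker 1
    [0.5, -0.5],  # speaker 2
]

frame = [0.8, 0.2, 0.0, 0.4]  # one first order frame: ACN 0, 1, 2, 3
speaker_feeds = decode_frame(EXAMPLE_MATRIX, pick_channels(frame, EXAMPLE_ACN))
```

The same two functions work for any of the matrices below: only the list of ACN indices and the rows of gains change. In practice you would run this over whole buffers rather than one frame at a time.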

Mono

The first B-Format channel provides an omnidirectional response, so this can be used directly to provide a good mono output.

Simple Stereo

This is a simple decode equivalent to an M+S microphone array at the listening point. This works better than a front-facing arrangement for many (but not all) purposes.

Note that there are lots of ways to do stereo decodes of B-Format, including ones that generate "binaural" headphone stereo (for instance, see the notes on our amber HRTF decoder). Also, see more on "Synthetic Microphones" below (this decode is equivalent to two side-facing cardioids, though we've halved the gain).

ACN	0 In	1 In
Left	0.5	0.5
Right	0.5	-0.5
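
The table above can be read as a mid/side decode, with ACN 0 as the "mid" signal and ACN 1 as the "side" signal. A minimal sketch in Python follows. The function name decode_stereo is ours, and the channels are assumed to arrive as two equal-length lists of samples.

```python
def decode_stereo(w_samples, y_samples):
    """Decode the ACN 0 (W) and ACN 1 (Y) channels to left/right stereo.

    Treats W as "mid" and Y as "side", using the 0.5 gains from the table.
    """
    left = [0.5 * w + 0.5 * y for w, y in zip(w_samples, y_samples)]
    right = [0.5 * w - 0.5 * y for w, y in zip(w_samples, y_samples)]
    return left, right


# Two sample frames. In the first, ACN 1 equals ACN 0, which is what we would
# expect for a source hard left under SN3D (an assumption about the encoder).
left, right = decode_stereo([1.0, 0.5], [1.0, -0.5])
```

With those inputs the first frame lands entirely in the left channel and the second entirely in the right.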

First Order Quad

This is a first order "in-phase" decoder.

ACN	0 In	1 In	3 In
Front Left	0.2500	0.1768	0.1768
Back Left	0.2500	0.1768	-0.1768
Back Right	0.2500	-0.1768	-0.1768
Front Right	0.2500	-0.1768	0.1768
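
To see what "in-phase" means in practice, the sketch below encodes a single source in the horizontal plane and runs it through the quad matrix. The encoder is an assumption on our part (SN3D, unit gain, azimuth measured anticlockwise from the front), and the function names are for illustration only.

```python
import math

QUAD_SPEAKERS = ["Front Left", "Back Left", "Back Right", "Front Right"]
QUAD_MATRIX = [  # columns: ACN 0, ACN 1, ACN 3
    [0.2500, 0.1768, 0.1768],
    [0.2500, 0.1768, -0.1768],
    [0.2500, -0.1768, -0.1768],
    [0.2500, -0.1768, 0.1768],
]


def encode_horizontal_first_order(azimuth_deg):
    """Assumed SN3D encoding of a unit source in the horizontal plane.

    Returns [ACN 0, ACN 1, ACN 3]; azimuth is anticlockwise from the front.
    """
    az = math.radians(azimuth_deg)
    return [1.0, math.sin(az), math.cos(az)]


def decode_quad(components):
    """Apply the quad matrix to one frame of [ACN 0, ACN 1, ACN 3]."""
    return [sum(g * c for g, c in zip(row, components)) for row in QUAD_MATRIX]


# A source in the direction of the front left speaker (45 degrees).
feeds = decode_quad(encode_horizontal_first_order(45.0))
```

For this source the front left feed is the largest, the two neighbouring speakers get about half as much, and the opposite speaker is close to silent. No speaker is driven with a significantly inverted signal, which is the defining property of an in-phase decode.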

Second Order 5.0

This is a second order decoder provided by Bruce Wiggins, targeting an ITU 5.0 speaker layout, compatible with DVD 5.1 etc. [Converted from FuMa, any errors our own.]

ACN	0 In	1 In	3 In	4 In	8 In
Front Left	0.2864	0.3100	0.3200	0.1443	0.0981
Front Right	0.2864	-0.3100	0.3200	-0.1443	0.0981
Front Centre	0.0601	0.0000	0.0400	0.0000	0.0520
Surround Left	0.4490	0.2800	-0.3350	0.0924	-0.0924
Surround Right	0.4490	-0.2800	-0.3350	-0.0924	-0.0924

Second Order Hexagon

The speakers here are assumed to be set out anticlockwise, with the first speaker at 11 o'clock and the last one at 1 o'clock. This is an "in-phase" decoder.

ACN	0 In	1 In	3 In	4 In	8 In
Front Left	0.1667	0.1147	0.1987	0.0642	0.0371
Left	0.1667	0.2294	0.0000	0.0000	-0.0742
Back Left	0.1667	0.1147	-0.1987	-0.0642	0.0371
Back Right	0.1667	-0.1147	-0.1987	0.0642	0.0371
Right	0.1667	-0.2294	0.0000	0.0000	-0.0742
Front Right	0.1667	-0.1147	0.1987	-0.0642	0.0371
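
The following sketch sweeps a source around the horizontal plane and decodes it with the hexagon matrix. The second order encoding formulas are our assumption (SN3D, azimuth anticlockwise from the front, horizontal sources only), and the function names are illustrative.

```python
import math

HEXAGON_SPEAKERS = ["Front Left", "Left", "Back Left",
                    "Back Right", "Right", "Front Right"]
HEXAGON_MATRIX = [  # columns: ACN 0, ACN 1, ACN 3, ACN 4, ACN 8
    [0.1667, 0.1147, 0.1987, 0.0642, 0.0371],
    [0.1667, 0.2294, 0.0000, 0.0000, -0.0742],
    [0.1667, 0.1147, -0.1987, -0.0642, 0.0371],
    [0.1667, -0.1147, -0.1987, 0.0642, 0.0371],
    [0.1667, -0.2294, 0.0000, 0.0000, -0.0742],
    [0.1667, -0.1147, 0.1987, -0.0642, 0.0371],
]


def encode_horizontal_second_order(azimuth_deg):
    """Assumed SN3D encoding of a unit horizontal source.

    Returns [ACN 0, ACN 1, ACN 3, ACN 4, ACN 8].
    """
    az = math.radians(azimuth_deg)
    k = math.sqrt(3.0) / 2.0  # SN3D scaling of the horizontal second order terms
    return [1.0, math.sin(az), math.cos(az),
            k * math.sin(2.0 * az), k * math.cos(2.0 * az)]


def hexagon_feeds(azimuth_deg):
    """Speaker feeds for a unit source at the given azimuth."""
    frame = encode_horizontal_second_order(azimuth_deg)
    return [sum(g * c for g, c in zip(row, frame)) for row in HEXAGON_MATRIX]


# Smallest feed seen anywhere on a one degree sweep around the listener.
lowest_feed = min(min(hexagon_feeds(deg)) for deg in range(360))
```

Under these assumptions the feeds always sum to roughly the source amplitude, the speaker nearest the source carries the most signal, and lowest_feed stays at or very close to zero rather than going strongly negative.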

Third Order Octagon

The speakers here are assumed to be set out anticlockwise, with the first speaker roughly at 11 o'clock and the last one roughly at 1 o'clock. This is an "in-phase" decoder.

ACN	0 In	1 In	3 In	4 In	8 In	9 In	15 In
NNW	0.1250	0.0718	0.1732	0.0612	0.0612	0.0146	0.0061
WNW	0.1250	0.1732	0.0718	0.0612	-0.0612	-0.0061	-0.0146
WSW	0.1250	0.1732	-0.0718	-0.0612	-0.0612	-0.0146	0.0061
SSW	0.1250	0.0718	-0.1732	-0.0612	0.0612	0.0061	-0.0146
SSE	0.1250	-0.0718	-0.1732	0.0612	0.0612	-0.0146	-0.0061
ESE	0.1250	-0.1732	-0.0718	0.0612	-0.0612	0.0061	0.0146
ENE	0.1250	-0.1732	0.0718	-0.0612	-0.0612	0.0146	-0.0061
NNE	0.1250	-0.0718	0.1732	-0.0612	0.0612	-0.0061	0.0146

First Order Cube

This is a first order "in-phase" decoder.

ACN	0 In	1 In	2 In	3 In
Front Lower Left	0.1250	0.0722	-0.0722	0.0722
Front Upper Left	0.1250	0.0722	0.0722	0.0722
Back Lower Left	0.1250	0.0722	-0.0722	-0.0722
Back Upper Left	0.1250	0.0722	0.0722	-0.0722
Back Lower Right	0.1250	-0.0722	-0.0722	-0.0722
Back Upper Right	0.1250	-0.0722	0.0722	-0.0722
Front Lower Right	0.1250	-0.0722	-0.0722	0.0722
Front Upper Right	0.1250	-0.0722	0.0722	0.0722
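
The values in this table appear consistent with a gain of 1/8 on ACN 0 plus 1/8 of each speaker's unit direction vector, written in the order Y, Z, X. That is our reading of the numbers rather than a statement of how they were derived. The sketch below rebuilds the rows on that assumption, with X pointing forwards, Y to the left and Z upwards.

```python
import math

# Corner of the cube for each speaker as (x, y, z) signs:
# x = front(+)/back(-), y = left(+)/right(-), z = upper(+)/lower(-).
CUBE_CORNERS = {
    "Front Lower Left": (1, 1, -1),
    "Front Upper Left": (1, 1, 1),
    "Back Lower Left": (-1, 1, -1),
    "Back Upper Left": (-1, 1, 1),
    "Back Lower Right": (-1, -1, -1),
    "Back Upper Right": (-1, -1, 1),
    "Front Lower Right": (1, -1, -1),
    "Front Upper Right": (1, -1, 1),
}


def cube_row(x, y, z):
    """Decoder row [ACN 0, ACN 1, ACN 2, ACN 3] for a speaker at (x, y, z)."""
    length = math.sqrt(x * x + y * y + z * z)
    gain = 1.0 / len(CUBE_CORNERS)
    return [gain, gain * y / length, gain * z / length, gain * x / length]


cube_matrix = {name: [round(v, 4) for v in cube_row(*corner)]
               for name, corner in CUBE_CORNERS.items()}
```

Rounded to four decimal places, each entry in cube_matrix matches the corresponding row of the table.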

Virtual Microphones

It's possible to extract simple microphone responses from the B-Format stream. This is particularly relevant when decoding for stereo.

For an omnidirectional response, simply use the ACN0 (first) channel.
For a figure-of-eight response, take the scalar product of the microphone direction vector written as <Y,Z,X> and the ACN1, ACN2 and ACN3 channels.
For a cardioid response, add the omnidirectional and figure-of-eight responses together (you can vary this to produce hypercardioid responses etc.).
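
The three recipes above can be sketched in Python as follows. The function names are ours, the frame is assumed to hold at least ACN 0 to ACN 3, and the conversion from azimuth and elevation to a direction vector assumes azimuth is measured anticlockwise from the front.

```python
import math


def direction_vector(azimuth_deg, elevation_deg=0.0):
    """Unit vector for the microphone direction, written as (Y, Z, X)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.sin(az) * math.cos(el),
            math.sin(el),
            math.cos(az) * math.cos(el))


def omni(frame):
    """Omnidirectional response: the ACN 0 channel."""
    return frame[0]


def figure_of_eight(frame, direction):
    """Scalar product of the <Y,Z,X> direction with ACN 1, ACN 2 and ACN 3."""
    y, z, x = direction
    return y * frame[1] + z * frame[2] + x * frame[3]


def virtual_mic(frame, direction, omni_weight=1.0, eight_weight=1.0):
    """Weighted sum of omni and figure-of-eight responses.

    Equal weights give a cardioid; other ratios give hypercardioid-style
    patterns. The weights are left to the caller.
    """
    return (omni_weight * omni(frame)
            + eight_weight * figure_of_eight(frame, direction))


# Hypothetical frame for a source hard left (ACN 1 equal to ACN 0).
frame = [1.0, 1.0, 0.0, 0.0]
left_cardioid = virtual_mic(frame, direction_vector(90.0))
right_cardioid = virtual_mic(frame, direction_vector(-90.0))
```

Halving the outputs of the left-facing and right-facing cardioids reproduces the Simple Stereo decode given earlier.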

It is tempting to build multichannel decoders by feeding simple virtual microphone responses in the directions of the various speakers. This does not work well, particularly for speaker layouts that are not regular.

Doing It Properly

If this all seems rather pedestrian, you can use custom layouts with Rapture3D. The decodes used there are typically not just matrices. Frequency-domain processing allows acoustic soundfield reconstruction and psychoacoustic processing for arrays with small or large numbers of speakers.