HOA Technical Notes - SN3D B-Format
B-Format is the main audio format used for Higher Order Ambisonics. This is a multichannel audio format where the individual channels do not correspond directly to speaker feeds. There is no "front left" channel. Instead, the channels contain components of the soundfield that are combined during a later decoding step.
With traditional multichannel audio, the channels do correspond to speaker feeds. This is great when the channels in the audio correspond to the speakers that are going to be used to listen, but can be a major problem when they don't.
B-Format allows multichannel audio to be generated, recorded and transferred from place to place without worrying about the speakers that are going to be used for playback in the end. It's slightly more complex to work with, but it means you'll normally only need to master your content once, rather than separately for stereo, 5.1, 7.1 etc.
B-Format supports full 3D too. It captures essentially equal information in all directions and can be rotated quite easily. This makes it useful for Virtual Reality applications, because B-Format material can be rotated into place before decoding, depending on where the user's head is pointing.
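As a rough illustration of why rotation is cheap, the sketch below rotates a single first-order sample about the vertical axis. It assumes the four channels are held in the order W, Y, Z, X and that X points forwards, Y left and Z up. The function name `rotate_yaw_first_order` is made up for this example and is not part of any plugin API. Higher orders need larger rotation matrices, which are not shown.

```python
import math

def rotate_yaw_first_order(w, y, z, x, angle_rad):
    """Rotate one first-order sample anticlockwise (seen from above) about Z.

    Channel order W, Y, Z, X is an assumption made for this sketch.
    W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    """
    c = math.cos(angle_rad)
    s = math.sin(angle_rad)
    new_x = x * c - y * s
    new_y = x * s + y * c
    return (w, new_y, z, new_x)

# A source straight ahead, turned 90 degrees anticlockwise, ends up on the left.
rotated = rotate_yaw_first_order(1.0, 0.0, 0.0, 1.0, math.pi / 2)
```

For head tracking, the scene would typically be rotated by the opposite of the head's yaw, so that sounds stay fixed in the world as the listener turns.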
B-Format has an "order" which corresponds to the level of spatial detail provided. This determines the number of channels present - new channels are added each time you increase the order. At "zero" order, there is just one mono channel. At first order, there are three additional spatial channels (totalling four), each behaving like a figure-of-eight microphone. At second order, another five channels are added (totalling nine), with rather less straightforward meaning. And so on. For the mathematicians out there, the channels correspond to the "spherical harmonics", which arise in solutions to the Acoustic Wave Equation in spherical polar coordinates.
We use third order for our O3A studio plugins. This needs sixteen channels.
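The channel counts quoted above follow a simple pattern: order n adds 2n + 1 new channels, which gives (n + 1)² channels in total. A small sketch of that arithmetic (the function names are illustrative only):

```python
def new_channels_at_order(order):
    """Number of channels introduced by a given order (1, 3, 5, 7, ...)."""
    return 2 * order + 1

def channel_count(order):
    """Total channels needed for full 3D B-Format up to and including `order`."""
    return (order + 1) ** 2

for n in range(4):
    print(n, new_channels_at_order(n), channel_count(n))
```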
In ambisonics, the coordinate system is normally set up so that X is forwards, Y is to the left and Z is upwards.
There are a few ways to get hold of B-Format content: panning mono sources, recording with a suitable microphone, or converting existing multichannel material.
The easiest way to encode audio into Higher Order Ambisonics is to use mono sources and to pan or "encode" them into the B-Format stream using some standard encoding equations.
There are a few different variants of B-Format which are mathematically equivalent, but not directly compatible. We use "third order SN3D" in our O3A plugins and the encoding equations for this are at the bottom of the page.
Other Formats: FuMa, N3D and SN3D
There are various forms of B-Format in use. The main ones are:
- FuMa was used in our plugins prior to our Version Two release (December 2016) and is an extension of "classic" B-Format from the 1970s. This is in wide use.
- Following our Version Two release, we now use SN3D which is being adopted rapidly (e.g. in YouTube).
- N3D is slightly different to SN3D and is available in some software. It is slightly easier to work with mathematically but is currently rare in the studio. This is probably a good thing as it is easily confused with SN3D.
Information about N3D and SN3D elsewhere on the web describes a number of different channel orders that could be, or have been, used to organize the channels. In practice, all current applications we are aware of support the ACN ordering convention, so the others can and probably should be ignored. If you do find anything using N3D or SN3D that doesn't support ACN, we would be keen to hear about it, as this is a source of potential confusion.
SN3D in the ACN channel ordering convention is used in the AmbiX file format and is sometimes known as AmbiX.
If you need to convert between FuMa and SN3D, the free O3A Core plugin library includes conversion plugins. The conversion is really just a change of convention and is essentially lossless, although levels are slightly different - so pay attention to your noise floor when writing to audio files and watch out for clipping. And be very careful not to mix formats because strange and bad things can happen!
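To give a feel for what the conversion involves, here is a minimal first-order sketch. It assumes the usual published conventions: FuMa channels arrive in the order W, X, Y, Z with W carried about 3 dB lower (a factor of 1/√2), while SN3D in ACN order uses W, Y, Z, X with W at full level. The function name is hypothetical, and higher orders need further per-channel gains and reordering that are not shown here.

```python
import math

def fuma_first_order_to_sn3d(w, x, y, z):
    """Convert one first-order FuMa sample (W, X, Y, Z) to SN3D in ACN order.

    Assumed conventions: FuMa W is scaled by 1/sqrt(2); X, Y, Z have the
    same level in both formats; ACN order is W, Y, Z, X.
    """
    return (w * math.sqrt(2.0), y, z, x)

# A unit-amplitude source straight ahead, as it would appear in FuMa.
fuma_sample = (1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
sn3d_sample = fuma_first_order_to_sn3d(*fuma_sample)
```

The W channel comes out about 3 dB hotter after this conversion, which is one reason to keep an eye on clipping when writing the result to a file.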
The Rapture3D Universal game/VR engine supports all three formats. FuMa is supported from first to third order and N3D and SN3D from first to fifth order.
There are a number of four-channel microphones available that can record first order B-Format. Check their documentation to see whether they produce classic B-Format (which is the same as FuMa at first order) or SN3D and convert if necessary.
If you have material in a multichannel format and wish to convert it to B-Format, there are various ways to do it. The simplest way is to imagine each of the speakers is a sound source and use panning. However, there are also some clever ways to take material in other "conventional" formats like 5.1 and upmix them into B-Format, for instance using the "Inferred" method in our O3A Upmixer plugins.
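The "each speaker is a sound source" approach can be sketched as follows, at first order only to keep it short. The speaker angles are nominal values assumed for a 5.0 layout (the LFE channel is left out because it has no direction), the output is first-order SN3D in the order W, Y, Z, X, and all names are invented for this example.

```python
import math

# Assumed nominal speaker azimuths in degrees, anticlockwise from the front.
SPEAKER_AZIMUTHS_DEG = {"L": 30.0, "R": -30.0, "C": 0.0, "Ls": 110.0, "Rs": -110.0}

def first_order_gains(azimuth_deg, elevation_deg=0.0):
    """First-order SN3D panning gains in the order W, Y, Z, X."""
    a = math.radians(azimuth_deg)
    e = math.radians(elevation_deg)
    return (1.0, math.sin(a) * math.cos(e), math.sin(e), math.cos(a) * math.cos(e))

def speakers_to_b_format(samples):
    """Pan one sample per speaker into a single first-order B-Format sample.

    `samples` maps a speaker name from SPEAKER_AZIMUTHS_DEG to its sample value.
    """
    out = [0.0, 0.0, 0.0, 0.0]
    for name, value in samples.items():
        gains = first_order_gains(SPEAKER_AZIMUTHS_DEG[name])
        for channel, gain in enumerate(gains):
            out[channel] += gain * value
    return out

b_format = speakers_to_b_format({"L": 0.5, "R": 0.5, "C": 0.2, "Ls": 0.0, "Rs": 0.0})
```

This is only the simple panning route; it makes no attempt at the kind of analysis an upmixer performs.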
Now, What Do I Do With It?
Once you have B-Format material there are various manipulations (for instance rotation) that can be applied naturally to material in this form, and our O3A Manipulators plugins provide a useful selection.
But what you probably really want to do is listen to it. For this, you need a Higher Order Ambisonic decoder of some type. These can be found in the O3A Core, O3A Decoding and O3A View libraries, and Rapture3D Advanced.
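Real decoders are considerably more sophisticated than anything that fits here, but one of the simplest possible decodes gives a feel for the idea: point two virtual cardioid microphones left and right and read their outputs as a stereo pair. The sketch assumes first-order SN3D material in the order W, Y, Z, X. It is a textbook technique with made-up function names, not a description of how any of the decoders mentioned above work.

```python
import math

def virtual_cardioid(w, y, x, azimuth_rad):
    """Output of a horizontal virtual cardioid pointing at `azimuth_rad`.

    With SN3D scaling, a plane wave from azimuth a gives
    0.5 * (1 + cos(a - azimuth_rad)), the classic cardioid pattern.
    """
    return 0.5 * (w + x * math.cos(azimuth_rad) + y * math.sin(azimuth_rad))

def decode_stereo_first_order(w, y, z, x, spread_rad=math.pi / 2):
    """Return (left, right) for one first-order sample; Z (height) is ignored."""
    left = virtual_cardioid(w, y, x, spread_rad)
    right = virtual_cardioid(w, y, x, -spread_rad)
    return (left, right)

# A source hard left (W=1, Y=1, Z=0, X=0) lands almost entirely in the left output.
left, right = decode_stereo_first_order(1.0, 1.0, 0.0, 0.0)
```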
B-Format is a great mastering/archive format, as you can come back to it later and produce a stereo mix, or a 5.1 mix, or a 7.1 mix or whatever else is the flavour of the month!
Encoding Equations for Third Order SN3D
These equations are here for reference. You don't need to have any knowledge of them to use the technology!
These are used when panning a mono sound into a sound scene as a plane wave source and define the 16 channels of the B-Format we use. These use SN3D encoding in ACN order.
Please note the equations presented here changed in December 2016 when we switched from FuMa to SN3D encoding for the Version Two release of our studio software.
| ACN | Order | Angle/Elevation Representation | Cartesian Representation |
The table above provides two different ways to pan to B-Format, depending on whether you would prefer to say where the sound should be using angle and elevation, or a Cartesian representation. If you use the former, the angle is measured anticlockwise and elevation upwards. If you use the latter, the x, y, z coordinates must be the components of a unit vector in the direction of the sound (i.e. x² + y² + z² must equal 1).
The equations give gains which can be used to pan a mono (plane wave) source into the B-Format. Multiple panned sounds can be mixed together.
Each order introduces some new channels, but you must include the channels from the earlier orders. So, at second order, you will be including nine channels, 0 to 8. First order contains four channels, 0 to 3.
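For readers who would rather see the panning step as code, the sketch below computes sixteen gains from a unit direction vector and mixes several panned sources together. The formulas are the SN3D spherical harmonics in ACN order as commonly published, written in Cartesian form; they were not copied from the table, so check them against it before relying on them. The function names are invented for this example.

```python
import math

def sn3d_gains(x, y, z):
    """Third-order SN3D panning gains in ACN order for a unit vector (x, y, z)."""
    s3 = math.sqrt(3.0)
    s15 = math.sqrt(15.0)
    s3_8 = math.sqrt(3.0 / 8.0)
    s5_8 = math.sqrt(5.0 / 8.0)
    return [
        1.0,                                # ACN 0  (order 0)
        y,                                  # ACN 1  (order 1)
        z,                                  # ACN 2
        x,                                  # ACN 3
        s3 * x * y,                         # ACN 4  (order 2)
        s3 * y * z,                         # ACN 5
        0.5 * (3.0 * z * z - 1.0),          # ACN 6
        s3 * x * z,                         # ACN 7
        0.5 * s3 * (x * x - y * y),         # ACN 8
        s5_8 * y * (3.0 * x * x - y * y),   # ACN 9  (order 3)
        s15 * x * y * z,                    # ACN 10
        s3_8 * y * (5.0 * z * z - 1.0),     # ACN 11
        0.5 * z * (5.0 * z * z - 3.0),      # ACN 12
        s3_8 * x * (5.0 * z * z - 1.0),     # ACN 13
        0.5 * s15 * z * (x * x - y * y),    # ACN 14
        s5_8 * x * (x * x - 3.0 * y * y),   # ACN 15
    ]

def direction_from_angles(azimuth_rad, elevation_rad):
    """Unit vector for an angle measured anticlockwise and an elevation upwards."""
    ce = math.cos(elevation_rad)
    return (math.cos(azimuth_rad) * ce, math.sin(azimuth_rad) * ce, math.sin(elevation_rad))

def pan_and_mix(sources):
    """Mix panned sources; `sources` is a list of (sample, (x, y, z)) pairs."""
    mix = [0.0] * 16
    for sample, (x, y, z) in sources:
        for channel, gain in enumerate(sn3d_gains(x, y, z)):
            mix[channel] += gain * sample
    return mix

# One source straight ahead and a quieter one directly to the left.
frame = pan_and_mix([(0.5, (1.0, 0.0, 0.0)), (0.25, direction_from_angles(math.pi / 2, 0.0))])
```

To work at a lower order, keep only the leading channels of the result: the first four for first order, or the first nine for second order.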