FMOD and ODIN

Integrating ODIN Voice Chat with the FMOD Audio Solution in Unity.

Introduction

Welcome to this guide on integrating the ODIN Voice Chat Plugin with the FMOD Audio Solution in Unity. The code used in this guide is available on the ODIN-FMOD Git repository.

What You’ll Learn:

  • How the FMODMicrophoneReader and FMODPlaybackComponent scripts work and how to use them in your project
  • How to set up ODIN in Unity when using FMOD as your audio solution
  • How to deal with limitations and potential pitfalls

Warning

Note: This guide assumes that your project has disabled Unity’s built-in audio.

Warning

Disclaimer: The implementation shown here uses Programmer Sounds of the FMOD Engine. While this allows real-time audio data to be streamed, it increases latency by roughly 500 ms.

Getting Started

To follow this guide, you’ll need the following:

  • Basic knowledge of Unity
  • The FMOD Plugin for Unity, which you can get here
  • The ODIN Voice Chat Plugin, available here

To set up FMOD in your project, please follow FMOD’s in-depth integration-tutorial. You can find the tutorial here.

To set up the ODIN Voice Chat Plugin, please take a look at our Getting-Started guide, which you can find here:

Begin ODIN Getting Started Guide

FMODMicrophoneReader

The FMODMicrophoneReader script replaces the default ODIN MicrophoneReader component and takes over microphone input by using FMOD. It reads the microphone data and sends it to the ODIN servers for voice chat.

You can either follow the Usage setup to drop the FMODMicrophoneReader directly into your project, or take a look at how it works to adjust the functionality to your requirements.

Usage

  1. Add the FMODMicrophoneReader script to your OdinManager prefab.
  2. Disable the original MicrophoneReader component.

The OdinManager prefab after adding the FMODMicrophoneReader and disabling the original MicrophoneReader

Warning

If you’re using ODIN Plugin versions older than 1.5.9, do not remove the MicrophoneReader component, as doing so may lead to NullReferenceExceptions.

Info

The script currently doesn’t support automatic device switching or allow for programmatically changing devices. If you’d like to see extensions to this script, feel free to join our Discord server and let us know.

How it works

To read data from the microphone using FMOD, we’ll need to perform the following steps:

  1. Set up and create an FMOD.Sound object, into which FMOD can store the microphone input data.
  2. Start the microphone recording.
  3. Continually read the FMOD microphone data and push it to the ODIN servers.

1. Setup

The setup is performed in Unity’s Start() method.

Retrieve Microphone Info

You need to retrieve details about the microphone, such as the sampling rate and the number of channels. We’ll use this info to configure the FMOD recording sound in the next step and the ODIN microphone streams later on.

FMODUnity.RuntimeManager.CoreSystem.getRecordDriverInfo(_currentDeviceId, out _, 0, out _, 
    out _nativeRate, out _, out _nativeChannels, out _);

Configure Recording Sound Info

After obtaining the input device details, the next action is to set up the CREATESOUNDEXINFO object. This object carries essential metadata that FMOD needs for audio capture.

_recordingSoundInfo.cbsize = Marshal.SizeOf(typeof(CREATESOUNDEXINFO));
_recordingSoundInfo.numchannels = _nativeChannels;
_recordingSoundInfo.defaultfrequency = _nativeRate;
_recordingSoundInfo.format = SOUND_FORMAT.PCMFLOAT;
_recordingSoundInfo.length = (uint)(_nativeRate * sizeof(float) * _nativeChannels);

We use SOUND_FORMAT.PCMFLOAT because ODIN requires this format for microphone data. This avoids the need for audio data conversions later on.

The _recordingSoundInfo.length is set to capture one second of audio. To change the recording duration, you can adjust the formula with a different multiplier.
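
As a rough illustration of that formula, the sketch below computes the buffer size in bytes for an arbitrary duration. The RecordingBufferSize class and its InBytes method are hypothetical helpers written for this guide, not part of FMOD or ODIN, and the sketch assumes interleaved 32-bit float samples as configured above.

```csharp
using System;

// Hypothetical helper, not part of FMOD or ODIN.
public static class RecordingBufferSize
{
    // Bytes needed to hold `seconds` of interleaved 32-bit float PCM audio.
    public static uint InBytes(int sampleRate, int channels, float seconds)
    {
        if (sampleRate <= 0 || channels <= 0 || seconds <= 0f)
            throw new ArgumentOutOfRangeException();

        // Round up so that fractional durations never under-allocate.
        double frames = Math.Ceiling(sampleRate * (double)seconds);

        // One frame holds one float per channel.
        return checked((uint)(frames * channels * sizeof(float)));
    }
}
```

With this helper, RecordingBufferSize.InBytes(_nativeRate, _nativeChannels, 1f) should yield the same value as the formula used for _recordingSoundInfo.length, while a multiplier of 0.5f would halve the buffer.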

Create Recording Sound

To hold the captured audio, FMOD requires us to create an FMOD Sound object as shown below.

FMODUnity.RuntimeManager.CoreSystem.createSound("", MODE.LOOP_NORMAL | MODE.OPENUSER,
    ref _recordingSoundInfo,
    out _recordingSound);

Here, we use the MODE.LOOP_NORMAL | MODE.OPENUSER flags in combination with the previously configured _recordingSoundInfo to initialize the _recordingSound object.

2. Recording

At this point, we’re ready to start capturing audio. To do so, call the recordStart method from FMOD’s core system.

FMODUnity.RuntimeManager.CoreSystem.recordStart(_currentDeviceId, _recordingSound, true);
_recordingSound.getLength(out _recordingSoundLength, TIMEUNIT.PCM);

After initiating the recording, we also get the length of the recorded sound in PCM samples by calling getLength. This value will help us manage the recording buffer in later steps.

3. Continually push microphone data

In the Update() method, we manage the ongoing capture of audio data from the FMOD microphone and its transmission to the ODIN servers. This ensures that the audio stream remains both current and continuously active.

Initialization

The method starts by checking if there is an active OdinHandler with valid connections and rooms. If not, it returns immediately.

if (!OdinHandler.Instance || !OdinHandler.Instance.HasConnections || OdinHandler.Instance.Rooms.Count == 0)
    return;

The next step is to find out how much audio has been recorded since the last check. This way we know how much data to read from the buffer.

FMODUnity.RuntimeManager.CoreSystem.getRecordPosition(_currentDeviceId, out uint recordPosition);
uint recordDelta = (recordPosition >= _currentReadPosition)
    ? (recordPosition - _currentReadPosition)
    : (recordPosition + _recordingSoundLength - _currentReadPosition);

// Abort if no data was recorded
if (recordDelta < 1)
    return;
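
The wrap-around arithmetic can be tried in isolation. The following is a minimal sketch with a hypothetical RingBufferMath class (not part of FMOD or ODIN): Delta mirrors the recordDelta computation above, and Advance shows one way to move a read position forward inside a looping buffer. It assumes that all positions are smaller than the buffer length and that the buffer length is greater than zero.

```csharp
// Hypothetical helper, not part of FMOD or ODIN.
public static class RingBufferMath
{
    // Number of samples written since the last read, taking into account
    // that the write position may have wrapped past the end of the buffer.
    public static uint Delta(uint recordPosition, uint readPosition, uint bufferLength)
    {
        return recordPosition >= readPosition
            ? recordPosition - readPosition
            : recordPosition + bufferLength - readPosition;
    }

    // Moves the read position forward and wraps it back into the buffer.
    public static uint Advance(uint readPosition, uint samplesRead, uint bufferLength)
    {
        return (readPosition + samplesRead) % bufferLength;
    }
}
```

For a buffer of 48000 samples, a read position of 47000 and a record position of 1000, Delta returns 2000, because the recording wrapped around once. Advancing the read position by those 2000 samples places it at 1000.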

If the read buffer is too short to hold the new audio data, its size is updated.

if(_readBuffer.Length < recordDelta)
    _readBuffer = new float[recordDelta];

Read Microphone Data

Microphone data is read from the FMOD sound object and copied into the read buffer using FMOD’s @lock function and the Marshal.Copy function from System.Runtime.InteropServices.

IntPtr micDataPointer, unusedData;
uint readMicDataLength, unusedDataLength;

_recordingSound.@lock(_currentReadPosition * sizeof(float), recordDelta * sizeof(float), out micDataPointer, out unusedData, out readMicDataLength, out unusedDataLength);
uint readArraySize = readMicDataLength / sizeof(float);
Marshal.Copy(micDataPointer, _readBuffer, 0, (int)readArraySize);
_recordingSound.unlock(micDataPointer, unusedData, readMicDataLength, unusedDataLength);

In this implementation, it’s crucial to be aware of the unit differences between FMOD, ODIN, and the system’s Marshal.Copy function. FMOD expects the read position and read length to be specified in bytes. In contrast, both ODIN and Marshal.Copy require the lengths to be represented as the number of samples being copied. Since we’re recording in the SOUND_FORMAT.PCMFLOAT format, we can use sizeof(float) to easily switch between FMOD’s byte-sized units and ODIN’s sample-sized units.
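
To make the unit conversion concrete, here is a small sketch that copies a region described in bytes into a float array. The LockedRegionReader class is a hypothetical helper for illustration only; the source pointer stands in for the pointer returned by @lock, and the sketch assumes the region contains 32-bit float samples.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of FMOD or ODIN.
public static class LockedRegionReader
{
    // Copies `lengthInBytes` bytes of float samples from native memory into
    // `destination`, growing the array if needed. Returns the sample count.
    public static int CopyToManaged(IntPtr source, uint lengthInBytes, ref float[] destination)
    {
        // Native side counts bytes, Marshal.Copy counts array elements.
        int sampleCount = (int)(lengthInBytes / sizeof(float));

        if (destination == null || destination.Length < sampleCount)
            destination = new float[sampleCount];

        if (sampleCount > 0)
            Marshal.Copy(source, destination, 0, sampleCount);

        return sampleCount;
    }
}
```

The returned sample count is the value you would hand to a sample-based API, while the byte length stays with the native call that produced the pointer.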

Push Microphone Data

After reading, if there is any valid data, it is pushed to the ODIN servers and the current microphone read position is updated.

if (readMicDataLength > 0)
{
    foreach (var room in OdinHandler.Instance.Rooms)
    {
        ValidateMicrophoneStream(room);
        if (null != room.MicrophoneMedia)
            room.MicrophoneMedia.AudioPushData(_readBuffer, (int)readArraySize);
    }
}

_currentReadPosition += readArraySize;
if (_currentReadPosition >= _recordingSoundLength)
    _currentReadPosition -= _recordingSoundLength;

When _currentReadPosition reaches or passes the end of the recording buffer, it wraps around to the start by subtracting the buffer length, which keeps it within bounds.

The ValidateMicrophoneStream method ensures that an ODIN microphone stream is set up and configured correctly:

private void ValidateMicrophoneStream(Room room)
{
    bool isValidStream = null != room.MicrophoneMedia &&
        _nativeChannels == (int) room.MicrophoneMedia.MediaConfig.Channels &&
        _nativeRate == (int) room.MicrophoneMedia.MediaConfig.SampleRate;
    if (!isValidStream)
    {
        room.MicrophoneMedia?.Dispose();
        room.CreateMicrophoneMedia(new OdinMediaConfig((MediaSampleRate)_nativeRate,
            (MediaChannels)_nativeChannels));
    }
}

By understanding and implementing these steps, you should be able to continually read FMOD microphone data and push it to the ODIN servers, thereby keeping your audio stream up-to-date.

FMODPlaybackComponent

The FMODPlaybackComponent script replaces the default ODIN PlaybackComponent. It creates and plays back an FMOD audio stream based on the data received from the connected ODIN Media Stream.

You can either follow the setup to use the FMODPlaybackComponent directly in your project, or take a look at how it works to adjust the functionality to your requirements.

Usage

  1. On the OdinHandler script of your OdinManager prefab, switch from Playback auto creation to Manual positional audio.

The OdinHandler’s Manual positional audio setting required for FMODPlaybackComponent to work.

  2. In an OnMediaAdded callback, instantiate a new FMODPlaybackComponent and set the RoomName, PeerId and MediaStreamId values based on the MediaAddedEventArgs input, e.g. like this:
...
void Start()
{
    OdinHandler.Instance.OnMediaAdded.AddListener(OnMediaAdded);
    OdinHandler.Instance.OnMediaRemoved.AddListener(OnMediaRemoved);
    ...
}
...
private void OnMediaAdded(object roomObject, MediaAddedEventArgs mediaAddedEventArgs)
{
    if (roomObject is Room room)
    {
        FMODPlaybackComponent newPlayback = Instantiate(playbackPrefab);
        newPlayback.transform.position = transform.position;
        newPlayback.RoomName = room.Config.Name;
        newPlayback.PeerId = mediaAddedEventArgs.PeerId;
        newPlayback.MediaStreamId = mediaAddedEventArgs.Media.Id;
        _instantiatedObjects.Add(newPlayback);
    }
}
  3. Keep track of the instantiated objects and destroy the matching FMODPlaybackComponent when the OnMediaRemoved callback is invoked, e.g. like this:
private void OnMediaRemoved(object roomObject, MediaRemovedEventArgs mediaRemovedEventArgs)
{
    if (roomObject is Room room)
    {
        for (int i = _instantiatedObjects.Count - 1; i >= 0; i--)
        {
            FMODPlaybackComponent playback = _instantiatedObjects[i];
            if (playback.RoomName == room.Config.Name 
            && playback.PeerId == mediaRemovedEventArgs.Peer.Id 
            && playback.MediaStreamId == mediaRemovedEventArgs.MediaStreamId)
            {
                _instantiatedObjects.RemoveAt(i);
                Destroy(playback.gameObject);
            }
        }
    }
}

For the full implementation details, take a look at the AudioReadData script on our sample project repository.

How it works

To play back data from the microphone stream supplied by ODIN, we’ll need to perform the following steps:

  1. Set up and create an FMOD.Sound object, into which we’ll transfer the audio data received from ODIN. FMOD will then use the Sound object to play back that audio.
  2. Continually read the ODIN media stream data and transfer it to FMOD for playback.

1. Setup

We perform the setup in Unity’s Start() method.

Setup Playback Sound Info

First, populate the CREATESOUNDEXINFO object with the settings FMOD needs to play back the audio streams correctly.

_createSoundInfo.cbsize = Marshal.SizeOf(typeof(FMOD.CREATESOUNDEXINFO));
_createSoundInfo.numchannels = (int) OdinHandler.Config.RemoteChannels;
_createSoundInfo.defaultfrequency = (int) OdinHandler.Config.RemoteSampleRate;
_createSoundInfo.format = SOUND_FORMAT.PCMFLOAT;
_pcmReadCallback = new SOUND_PCMREAD_CALLBACK(PcmReadCallback);
_createSoundInfo.pcmreadcallback = _pcmReadCallback;
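// One second of audio: sample rate * bytes per sample * number of channels
_createSoundInfo.length = (uint)(_createSoundInfo.defaultfrequency * sizeof(float) * _createSoundInfo.numchannels);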

Here, we pull the number of channels and the sample rate from OdinHandler.Config to configure the playback settings. Similar to how the FMODMicrophoneReader operates, we specify the audio format as SOUND_FORMAT.PCMFLOAT. This ensures compatibility between FMOD and ODIN’s sampling units. We also set the playback sound buffer to hold one second’s worth of audio data.

The crucial part of this configuration is setting up the PCM read callback. The PcmReadCallback function is invoked by FMOD whenever it needs fresh audio data, ensuring uninterrupted playback.

Initialize Stream and Trigger Playback

In this case, we opt for the createStream method, which is essentially the createSound function with the MODE.CREATESTREAM flag added.

FMODUnity.RuntimeManager.CoreSystem.createStream("", MODE.OPENUSER | MODE.LOOP_NORMAL, ref _createSoundInfo, out _playbackSound);
FMODUnity.RuntimeManager.CoreSystem.getMasterChannelGroup(out ChannelGroup masterChannelGroup);
FMODUnity.RuntimeManager.CoreSystem.playSound(_playbackSound, masterChannelGroup, false, out _playbackChannel);

To initiate playback, we retrieve the Master Channel Group from FMODUnity.RuntimeManager.CoreSystem and use it along with the stream we’ve just created. Keeping a reference to the returned _playbackChannel allows us to configure the channel for 3D positional audio later on.

2. Reading and Playing Back ODIN Audio Streams

The task of fetching audio data from ODIN and sending it to FMOD is accomplished within the PcmReadCallback method.

PCM Read Callback

[AOT.MonoPInvokeCallback(typeof(SOUND_PCMREAD_CALLBACK))]
private RESULT PcmReadCallback(IntPtr sound, IntPtr data, uint dataLength){
    ...
}

To enable calls between native and managed code, we annotate the method with the [AOT.MonoPInvokeCallback(typeof(SOUND_PCMREAD_CALLBACK))] attribute. FMOD passes three parameters to the callback, of which we only need data and dataLength. The data pointer indicates where to store the ODIN audio data, and dataLength is the size of the requested data in bytes.

Data Validation

Next, we include some validation logic:

int requestedDataArrayLength = (int)dataLength / sizeof(float);
if (_readBuffer.Length < requestedDataArrayLength)
{
    _readBuffer = new float[requestedDataArrayLength];
}

if (data == IntPtr.Zero)
{
    return RESULT.ERR_INVALID_PARAM;
}

Similar to our approach in the FMODMicrophoneReader, we use sizeof(float) to switch between the byte-size units used by FMOD and the sample-based units used by ODIN. If needed, we resize the _readBuffer and check the data pointer for validity.

Read ODIN Data and Transfer to FMOD Stream

In the final step, we read the requested amount of samples from the ODIN media stream into _readBuffer. Then we copy this data to FMOD using the provided data pointer.

if (OdinHandler.Instance.HasConnections && !PlaybackMedia.IsInvalid)
{
    uint odinReadResult = PlaybackMedia.AudioReadData(_readBuffer,  requestedDataArrayLength);
    if (Utility.IsError(odinReadResult))
    {
        Debug.LogWarning($"{nameof(FMODPlaybackComponent)} AudioReadData failed with error code {odinReadResult}");
    }
    else
    {
        Marshal.Copy(_readBuffer, 0, data, requestedDataArrayLength);
    }
}

The AudioReadData method pulls the specified amount of data from the PlaybackMedia stream into _readBuffer. We then use ODIN’s Utility.IsError method to verify the operation’s success. If everything checks out, Marshal.Copy is used to transfer the _readBuffer contents to FMOD’s designated playback location in memory, identified by the data pointer.
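
The final copy step can be exercised without FMOD or ODIN. The sketch below uses a hypothetical NativeBufferWriter class; the destination pointer stands in for the data pointer of the callback, and lengthInBytes for dataLength. As an extra precaution that is an assumption of this sketch rather than part of the script above, the copy is clamped to the size of the source array.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of FMOD or ODIN.
public static class NativeBufferWriter
{
    // Writes float samples from `source` into native memory at `destination`.
    // `lengthInBytes` is the size requested by the native side.
    // Returns the number of samples written, or 0 if nothing could be copied.
    public static int CopyToNative(float[] source, IntPtr destination, uint lengthInBytes)
    {
        if (destination == IntPtr.Zero || source == null)
            return 0;

        // Convert the byte-based request into a sample count.
        int requested = (int)(lengthInBytes / sizeof(float));

        // Never read past the end of the managed array.
        int count = Math.Min(requested, source.Length);

        if (count > 0)
            Marshal.Copy(source, 0, destination, count);

        return count;
    }
}
```

Because the script resizes _readBuffer before reading, the clamp should never take effect there; it only guards against mismatched buffer sizes when the helper is reused elsewhere.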

Accessing the PlaybackMedia

To access a specific ODIN media stream, you’ll need three identifiers: a room name, peer id, and media stream id. You can use these to fetch the media stream with a function like the following:

private PlaybackStream FindOdinMediaStream() => OdinHandler.Instance.Rooms[RoomName]?.RemotePeers[PeerId]?.Medias[MediaStreamId] as PlaybackStream;

The values for RoomName, PeerId, and MediaStreamId can be obtained, for instance, from the OnMediaAdded callback.

FMODPlayback3DPosition

After setting up the basic FMOD playback, you may want to enhance your audio experience by adding 3D positional sound features. The FMODPlayback3DPosition script is designed to handle this.

Usage

To add the 3D audio features to your playback, simply add the component to the same game object your FMODPlaybackComponent is attached to. The FMODPlayback3DPosition behaviour will automatically connect to and set up the playback component, as well as update the 3D position of the FMOD playback channel.

Info

The FMODPlayback3DPosition component currently does not connect to a Rigidbody for velocity determination. If a doppler effect is necessary for your project, consider extending the current implementation to automatically access and update the velocity value as well.
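
One possible starting point for such an extension is to estimate the velocity from successive positions. The sketch below is an assumption-laden illustration, not part of the FMODPlayback3DPosition script: VelocityEstimator is a hypothetical class, and it uses System.Numerics.Vector3 so that it compiles outside of Unity. In a Unity project you would likely use UnityEngine.Vector3 and Time.fixedDeltaTime instead, and convert the result into the velocity vector of the 3D attributes.

```csharp
using System.Numerics;

// Hypothetical helper, not part of FMOD, ODIN or Unity.
public sealed class VelocityEstimator
{
    private Vector3 _lastPosition;
    private bool _hasLastPosition;

    // Returns the velocity in units per second, derived from the distance
    // travelled since the previous call. The first call returns zero.
    public Vector3 Update(Vector3 position, float deltaTime)
    {
        Vector3 velocity = Vector3.Zero;
        if (_hasLastPosition && deltaTime > 0f)
            velocity = (position - _lastPosition) / deltaTime;

        _lastPosition = position;
        _hasLastPosition = true;
        return velocity;
    }
}
```

Calling Update once per physics step keeps the estimate aligned with the positional updates; a Rigidbody, where available, would provide the velocity directly and make the estimate unnecessary.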

How It Works

Setup

To incorporate 3D audio capabilities into your FMOD playback, you’ll need to modify some channel settings. The first step is to ensure that the FMOD Sound and Channel objects are fully initialized by the FMODPlaybackComponent (or a comparable setup).

private IEnumerator Start()
{
    // Wait until the playback component is initialized
    while (!(_playbackComponent.FMODPlaybackChannel.hasHandle() && _playbackComponent.FMODPlaybackSound.hasHandle()))
    {
        yield return null;
    }
    // Initialize 3D sound settings
    _playbackComponent.FMODPlaybackChannel.setMode(MODE._3D);
    _playbackComponent.FMODPlaybackChannel.set3DLevel(1);
    _playbackComponent.FMODPlaybackSound.setMode(MODE._3D);
}

After confirming initialization, we enable 3D audio by applying the MODE._3D flag to both FMODPlaybackChannel and FMODPlaybackSound. Additionally, we set the 3DLevel blend level to 1 to fully engage 3D panning.

Positional Updates

To keep the FMOD sound object’s position in sync with the Unity scene, we convert the Unity GameObject’s transform into FMOD 3D attributes and pass the resulting position and velocity to the playback channel.

private void FixedUpdate()
{
    if (_playbackComponent.FMODPlaybackChannel.hasHandle())
    {
        ATTRIBUTES_3D attributes3D = FMODUnity.RuntimeUtils.To3DAttributes(transform);
        _playbackComponent.FMODPlaybackChannel.set3DAttributes(ref attributes3D.position, ref attributes3D.velocity);
    }
}

This approach ensures that the sound source remains spatially accurate within your Unity environment, enhancing the 3D audio experience.