Make ODIN APM Settings
Creates an APM settings object that can be used to construct a room.
|Name||Type||Description|
|Enable Voice Activity Detection||boolean||When enabled, ODIN will analyze the audio input signal using a smart voice detection algorithm to determine the presence of speech. You can define the probability required to start and to stop transmitting.|
|Attack Probability||float||Voice probability at which the VAD should engage.|
|Release Probability||float||Voice probability at which the VAD should disengage.|
|Enable Volume Gate||boolean||When enabled, the volume gate will measure the volume of the input audio signal to decide whether a user is speaking loudly enough to transmit voice data. You can define the root mean square power (dBFS) at which the gate should engage and disengage.|
|Attack Loudness (D BFS)||float||Root mean square power (dBFS) at which the volume gate should engage.|
|Release Loudness (D BFS)||float||Root mean square power (dBFS) at which the volume gate should disengage.|
|High Pass Filter||boolean||When enabled, the high-pass filter will remove low-frequency content from the input audio signal, thus making it sound cleaner and more focused.|
|Pre Amplifier||boolean||When enabled, the preamplifier will boost the signal of sensitive microphones by taking really weak audio signals and making them louder.|
|Noise Suppression||enum||When enabled, the noise suppressor will remove distracting background noise from the input audio signal. You can control the aggressiveness of the suppression. Increasing the level will reduce the noise level at the expense of higher speech distortion.|
|Transient Suppression||boolean||When enabled, the transient suppressor will try to detect and attenuate keyboard clicks.|
|Return Value||APMSettings||The constructed APM settings object.|
Some settings might be a bit confusing if you are new to audio. Most settings are self-explanatory, but some require you to enter values.
If users are in a loud environment, they don’t want to send background noise to other team members. To prevent that, we provide two different settings:
Voice Activity Detection
This system captures a few milliseconds of audio data and analyzes it to determine whether the user is speaking. We use an advanced AI model for this, so it is good at detecting voice, and you should typically leave it enabled.
As with every AI model, it only returns a probability between 0.0 and 1.0. 0.0 means that the probability is zero that the recorded audio contains voice; 1.0 means that the AI is a hundred percent sure it’s voice. Use the Attack and Release values to set the probability at which the AI should start and stop transmitting.
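The interplay of the two thresholds can be sketched as a small hysteresis gate. This is only an illustration of the idea, not ODIN's implementation: the class `ProbabilityGate`, the comparison operators, and the threshold values used here are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ProbabilityGate:
    """Hypothetical hysteresis gate driven by a voice probability in [0.0, 1.0]."""

    attack: float   # probability at which transmission starts
    release: float  # probability below which transmission stops
    is_open: bool = False

    def update(self, probability: float) -> bool:
        """Feed one probability value and return whether audio is transmitted."""
        if not self.is_open and probability >= self.attack:
            self.is_open = True
        elif self.is_open and probability < self.release:
            self.is_open = False
        return self.is_open


# Illustrative thresholds only; they are assumptions of this sketch.
gate = ProbabilityGate(attack=0.9, release=0.8)
probabilities = [0.2, 0.85, 0.95, 0.85, 0.7, 0.85]
states = [gate.update(p) for p in probabilities]
print(states)  # [False, False, True, True, False, False]
```

Note that the same probability (0.85) keeps an already open gate open but does not open a closed one. Using two thresholds this way avoids rapid toggling when the probability hovers around a single value.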
A good value for attack is 0.9 (the AI is 90% sure that it’s voice), paired with a slightly lower value for release. That works very well in most cases. However, there are some special cases where additional filtering is needed. Consider an open-plan office with many people talking. The AI might be 100% sure that it hears voice, but it cannot tell whether it is the voice of the person connected to the room.
For these cases, an additional filter, the Volume Gate, can be used:
The basic idea of a volume gate is to keep the microphone disabled below a specific volume threshold. The AI may have detected voice, but if it is not very loud, it might come from the background. You can apply a loudness threshold that must be met before the microphone is enabled.
dBFS is not an easy topic to understand, but the Wikipedia article on dBFS is a good starting point.
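To make the unit more concrete, here is a minimal sketch of how a root mean square level in dBFS can be computed from a block of samples. It assumes float samples normalized to the range -1.0 to 1.0 and treats an RMS of 1.0 as 0 dBFS. Conventions differ (some define a full-scale sine wave as 0 dBFS), and which one ODIN uses internally is not stated here. The function name `rms_dbfs` is hypothetical.

```python
import math


def rms_dbfs(samples: list[float]) -> float:
    """Return the RMS level of a block of samples in dBFS.

    Assumes samples are normalized to [-1.0, 1.0] and that an RMS of 1.0
    corresponds to 0 dBFS. Silence (or an empty block) maps to -infinity.
    """
    if not samples:
        return -math.inf
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return -math.inf
    # 10 * log10(mean square) equals 20 * log10(RMS).
    return 10.0 * math.log10(mean_square)


quiet = [0.01, -0.01] * 240  # amplitude 0.01, roughly -40 dBFS
loud = [0.1, -0.1] * 240     # amplitude 0.1, roughly -20 dBFS
print(round(rms_dbfs(quiet), 1))
print(round(rms_dbfs(loud), 1))
```

Every factor of 10 in amplitude corresponds to 20 dB, so a signal at -40 dBFS has one hundredth of the full-scale amplitude.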
Typically, you should leave this setting disabled. If you do enable it, a good starting point is -40 for release and -30 for attack. Please contact support if you need assistance with these values.
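As a rough sketch of how these starting values behave, the following example runs a sequence of loudness levels through a gate that opens at -30 dBFS and closes below -40 dBFS, then combines the result with a voice detection flag. The function `volume_gate`, the sample data, and the assumption that audio is transmitted only when both filters agree are illustrative and not taken from the ODIN implementation.

```python
def volume_gate(levels_dbfs, attack_dbfs=-30.0, release_dbfs=-40.0):
    """Yield one open/closed state per loudness level (hypothetical helper).

    The gate opens once the level reaches attack_dbfs and stays open
    until the level drops below release_dbfs.
    """
    is_open = False
    for level in levels_dbfs:
        if not is_open and level >= attack_dbfs:
            is_open = True
        elif is_open and level < release_dbfs:
            is_open = False
        yield is_open


# Made-up RMS levels (dBFS) for six consecutive audio blocks.
levels = [-55.0, -35.0, -28.0, -35.0, -45.0, -35.0]
gate_states = list(volume_gate(levels))

# Made-up voice detection results for the same blocks.
voice_detected = [True, True, True, False, True, True]

# Assumption: transmit only when voice is detected AND the gate is open.
transmit = [v and g for v, g in zip(voice_detected, gate_states)]
print(gate_states)  # [False, False, True, True, False, False]
print(transmit)     # [False, False, True, False, False, False]
```

In this example, the second block contains voice at -35 dBFS but is not transmitted, because the gate has not yet seen a level of at least -30 dBFS. This is the open-office case: speech that is detected but too quiet to be the local user.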