Voice Activity Detection

The Wake Word detector can now be inhibited when no voice is heard, which limits the resource footprint of the platform. This mechanism is called Voice Activity Detection (VAD), and it relies on the open source WebRTC library.

We optimised the parameters of the VAD component to limit false rejections (cases in which the VAD fails to trigger on speech) while keeping false positives under control. In quiet conditions, and also with music or TV in the background, we observed no false rejections. With other types of background noise, we measured false rejection rates below 2%. The false positive rate is high with music or TV in the background, but significantly lower with other types of background noise (e.g. a vacuum cleaner).

The VAD is turned on by default. To turn it off, set the no_vad_inhibitor parameter to true in the Platform Configuration file.
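
As an illustration, disabling the VAD could look like the fragment below. This is only a sketch: the TOML syntax and the [snips-hotword] section name are assumptions that this page does not state, so check where no_vad_inhibitor belongs in your own Platform Configuration file.

```toml
# Assumed section name; not confirmed by this page.
[snips-hotword]
# true disables the VAD inhibitor, so the Wake Word detector always runs.
no_vad_inhibitor = true
```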

By default, messages related to VAD activation/deactivation are not published. This can be changed in the Platform Configuration file by setting vad_messages to true. In this case, messages are sent on the following MQTT topics:

  • hermes/voiceActivity/<siteid>/vadDown

  • hermes/voiceActivity/<siteid>/vadUp
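
A client that listens to these topics needs to recover the site ID and the event name from each topic string. The following Python sketch shows one possible way to do that. The names VadEvent, parse_vad_topic and VAD_TOPIC_FILTER are hypothetical helpers introduced for illustration and are not part of the Snips Platform. The sketch also assumes that site IDs do not contain a "/" character.

```python
from typing import NamedTuple, Optional


class VadEvent(NamedTuple):
    """Hypothetical container for a parsed VAD topic."""
    site_id: str
    event: str  # either "vadUp" or "vadDown"


# MQTT topic filter matching both VAD topics for every site.
# "+" is the standard MQTT single-level wildcard.
VAD_TOPIC_FILTER = "hermes/voiceActivity/+/+"


def parse_vad_topic(topic: str) -> Optional[VadEvent]:
    """Return a VadEvent for a VAD topic, or None for any other topic."""
    parts = topic.split("/")
    if len(parts) != 4:
        return None
    prefix, category, site_id, event = parts
    if prefix != "hermes" or category != "voiceActivity":
        return None
    if event not in ("vadUp", "vadDown"):
        return None
    return VadEvent(site_id, event)


if __name__ == "__main__":
    print(parse_vad_topic("hermes/voiceActivity/default/vadUp"))
```

With an MQTT client library, one would typically subscribe to VAD_TOPIC_FILTER and pass the topic of each incoming message to parse_vad_topic, ignoring messages for which it returns None.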

These messages are not yet accessible in the SDK versions (Android and iOS) of the Snips Platform.