Speech recognition is a challenging task, especially when you aim to understand a wide vocabulary. Such speech recognition systems generally run on large servers, compromising your users' privacy and implying costly subscriptions. They also require an internet connection for your devices to work.

Snips' Automatic Speech Recognition relies on a Deep Learning model that achieves state-of-the-art performance while running on small connected devices, like a Raspberry Pi 3. Here's how we do it.

Snips ASR

Now available in English, French, German and Japanese. Other languages coming soon.

In order to achieve the highest possible accuracy on such a small device, the Snips ASR is specialised to understand the specific vocabulary and queries that make up your assistant. Hitting the download button after creating your assistant triggers a final training step, called language model adaptation. During this step, we specialise our ASR to your assistant, both optimising its accuracy and ensuring a low memory and computational footprint. Give it a try through our web console, and feel free to contact us if you would like to know more about our enterprise solutions.

Using Google’s Cloud service for other languages

More languages will be supported by Snips ASR in the near future.

If you want to build an assistant in one of these languages today, you can rely on cloud services like Google's Cloud Speech service, which has been integrated into our platform.

You will simply need to select it when configuring your assistant on the web console, and include your Google Cloud Speech credentials on your device.
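Before copying the credentials over, it can help to sanity-check the file: a valid Google Cloud service-account key is a JSON file whose "type" field is "service_account". The snippet below is a sketch; the file name is hypothetical, and the exact location Snips expects the credentials in on the device is documented in the platform reference.

```shell
# Hypothetical file name: the JSON key downloaded from the Google Cloud console.
KEY="googlecredentials.json"

# A service-account key must be valid JSON with "type": "service_account".
if python3 -c 'import json,sys; d=json.load(open(sys.argv[1])); sys.exit(0 if d.get("type")=="service_account" else 1)' "$KEY" 2>/dev/null; then
  status="valid"
else
  status="missing or malformed"
fi
echo "credentials file: $status"
```

Once the file checks out, copy it to the location given in the Snips documentation for your device.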

Note that if you use Google’s API, you will need to pay for it beyond a certain usage, your assistant will not work offline, and it will not follow the principles of privacy by design. We are working hard to extend our support for on-device ASR to other languages as soon as possible 😉.

Using Snips' generic ASR model (English only)

This feature is not commercial grade and will be deprecated soon. We do not provide any kind of support for users of this feature, and we strongly encourage users to stick to the default custom ASR behaviour.

In order to provide assistants that are light and robust, the Snips console trains a custom Language Model for each assistant's ASR. This model is limited to the vocabulary appearing in the assistant, and some built-in entities (numbers, dates, etc).

This gets in the way of applications that need to understand a large vocabulary: general knowledge questions, for example, such as "What's the distance between the earth and the moon?" or "Who invented the television?". Robust, embedded, large-vocabulary ASRs are beyond the current state of the art, but if you are willing to trade robustness for generality, Snips provides an experimental large-vocabulary ASR.

First, install the package. Be aware that it is about 160MB to download, takes about 500MB once installed, and needs about 700MB of free space to get set up.

sudo apt-get update; sudo apt-get install snips-asr-model-en-500mb
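Since setup needs roughly 700MB of free space, a quick check before installing can save a failed run. This is a minimal sketch, using the figures above and assuming the model lands on the root partition:

```shell
# Free space needed during setup, per the figures above (in MB).
required_mb=700

# Fourth column of `df -Pm` is the available space, in 1MB blocks.
available_mb=$(df -Pm / | awk 'NR==2 {print $4}')

if [ "$available_mb" -ge "$required_mb" ]; then
  echo "enough space: ${available_mb}MB free"
else
  echo "not enough space: only ${available_mb}MB free"
fi
```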

Next, you need to override the assistant's model with the generic model: in /etc/snips.toml, go to the snips-asr section and add:

model = "/usr/share/snips/snips-asr-model-en-500MB"
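This line goes under the existing snips-asr section, so the relevant part of /etc/snips.toml ends up looking like:

```toml
[snips-asr]
model = "/usr/share/snips/snips-asr-model-en-500MB"
```

Removing the model line (or pointing it back at your assistant's model) restores the default behaviour.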

Finally, restart the ASR daemon:

sudo systemctl restart snips-asr