Automatic Speech Recognition (ASR), the transcription of speech into text, is a challenging task, especially when you aim to understand wide vocabularies. Such speech recognition systems generally run on large servers, compromising your users' privacy and implying costly subscriptions. They also require an internet connection for your objects to work.
Snips' Automatic Speech Recognition relies on a Deep Learning model that achieves state-of-the-art performance while running on small connected devices, like a Raspberry Pi 3. Here's how we do it.
In order to achieve the highest possible accuracy on such a small device, the Snips ASR is specialized to understand the specific vocabulary and queries that make up your assistant. Hitting the download button after creating your assistant triggers a final training step, called language model adaptation. During this step, we specialize our ASR to your assistant, both optimizing its accuracy and ensuring a low memory and computational footprint. Give it a try through our web console, and feel free to contact us if you would like to know more about our enterprise solutions.
Depending on your needs, you may want to tweak the ASR to fit your use case. Make sure to check Using 3rd party and Generic ASR if you want to change the default ASR module provided by Snips, and Partial Decoding if you want to access the ASR output as it is being processed.
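To give a feel for what consuming partial decoding output can look like, here is a minimal sketch of a handler for incremental transcription messages. It assumes partial results arrive as JSON payloads with `text` and `likelihood` fields; the topic name and payload schema below are illustrative assumptions, not the documented Snips API, so check the platform documentation for the exact message format.

```python
import json

# Hypothetical message topic for partial ASR results; the actual
# topic name on your platform may differ.
PARTIAL_TOPIC = "hermes/asr/partialTextCaptured"

def handle_partial(payload: bytes) -> str:
    """Parse a partial-decoding message and return the text so far.

    The payload shape ({"text": ..., "likelihood": ...}) is an
    assumption for illustration, not a documented schema.
    """
    message = json.loads(payload)
    text = message.get("text", "")
    likelihood = message.get("likelihood")
    if likelihood is not None:
        print(f"partial: {text!r} (likelihood {likelihood:.2f})")
    return text

# Simulated stream of partial results, as they might arrive
# while the user is still speaking:
for payload in [
    b'{"text": "turn on", "likelihood": 0.62}',
    b'{"text": "turn on the lights", "likelihood": 0.91}',
]:
    handle_partial(payload)
```

In a real deployment this handler would be registered as a callback on your messaging layer, letting your application react (for example, updating a display) before the final transcript is available.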