Partial Decoding

1. Presentation

The ASR is the module that turns voice into text. Once you're done speaking, the transcribed text is sent over MQTT. For some applications, it is useful to have access to the text understood by the ASR while the user is still speaking. For this reason, we're introducing a new feature, called ASR Partial Decoding, which provides this kind of intermediate output:

[Asr] is capturing text: "what"
[Asr] is capturing text: "what"
[Asr] is capturing text: "what is"
[Asr] is capturing text: "what is the"
[Asr] is capturing text: "what is the"
[Asr] is capturing text: "what is the weather in"
[Asr] is capturing text: "what is the weather in paris"
[Asr] captured text "what is the weather in paris" in 4.0s

ASR Partial Decoding needs more computational power than a simple decoding. Depending on your assistant size and your device capabilities, it might introduce a delay while retrieving the output of the ASR.

2. How to use

On a Raspberry Pi

ASR partial decoding can be activated when launching the snips-asr binary.

The binary accepts two options that control the behaviour of partial decoding:

  • --partial : Enables partial decoding and sends the partial text captures to the bus. To receive them, subscribe to the topic hermes/asr/partialTextCaptured.

  • --partial-period-ms <PERIOD> : Period, in milliseconds, at which partial captures are sent (default: 250). The lower the period, the more often a partial decoding runs, so more computation is needed and the latency might increase.
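As an illustration, the two options above can be assembled into a launch command from a script. This is a minimal sketch, not part of the Snips tooling: `build_asr_command` and `launch_asr` are hypothetical helpers, and the sketch assumes that `snips-asr` is on the PATH and is started by hand rather than by a service manager.

```python
import subprocess
from typing import List, Optional


def build_asr_command(partial: bool = True,
                      period_ms: Optional[int] = None) -> List[str]:
    """Return the argv list for snips-asr with the partial decoding options."""
    cmd = ["snips-asr"]
    if partial:
        cmd.append("--partial")
        # The period is only passed when partial decoding is enabled.
        if period_ms is not None:
            if period_ms <= 0:
                raise ValueError("period_ms must be a positive number of milliseconds")
            cmd += ["--partial-period-ms", str(period_ms)]
    return cmd


def launch_asr(partial: bool = True,
               period_ms: Optional[int] = None) -> subprocess.Popen:
    """Start snips-asr as a child process (assumes the binary is on the PATH)."""
    return subprocess.Popen(build_asr_command(partial, period_ms))
```

Leaving `period_ms` unset keeps the default period of 250 ms described above.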

These parameters can also be set in the ASR section of the snips.toml file:

[snips-asr]
partial=true
partial_period_ms=250

The platform must then be restarted for the changes to take effect.

The partial decodings are displayed in snips-watch too.
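A client other than snips-watch can also consume the partial captures from the topic hermes/asr/partialTextCaptured. The sketch below covers only the payload handling. It assumes that each message is a UTF-8 JSON object carrying the transcription in a `text` field, which should be checked against the messages on your own bus, and `PartialTracker` is a hypothetical name. It skips consecutive duplicates, which do occur, as in the sample output in the Presentation section.

```python
import json
from typing import Optional


class PartialTracker:
    """Keeps the latest partial transcription and filters out repeats."""

    def __init__(self) -> None:
        self.last_text: Optional[str] = None

    def handle(self, payload: bytes) -> Optional[str]:
        """Return the new partial text, or None if it repeats the previous one.

        Assumption: the payload is UTF-8 JSON with a "text" field.
        """
        message = json.loads(payload.decode("utf-8"))
        text = message.get("text", "")
        if text == self.last_text:
            return None
        self.last_text = text
        return text

    def reset(self) -> None:
        """Forget the last partial text, e.g. once the final text is captured."""
        self.last_text = None
```

With an MQTT client library, `handle` would be called from the message callback for that topic, passing the raw message payload.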

On iOS and Android

The feature is not yet supported.