Azure Speech Recognizer Demo

joreg · February 27, 2019, 2:44am

since i wasn’t happy with the results of the SpeechRecognizer i shared over at VL.Speech i went on to try the azure one. this one is a cloud-service you need to get an api-key for. it does its computation in the cloud and indeed works much better, without any noticeable extra latency. also the full patch necessary to get quick results is quite…manageable:

AzureSpeechRecognizerDemo.vl (19.3 KB)

also have a look at the other available cognitive services sdks (for vision, search, language and knowledge) that should all be similarly easy to access as the above. and then don’t forget to show us what you find…

Desaxismundi · February 27, 2019, 5:41am

Wait, so this implies that Kinect’s Azure features will be pretty much straightforward to access in VL, right?

joreg · February 27, 2019, 10:06am

from what i saw so far, yes.

circuitb · March 1, 2019, 12:39pm

hey looks really promising!
vl beginner question here
not sure why it doesn’t work…no error shown in vl
key, region and microphone input look good
any hint on how to debug it?
thx

sebescudie · March 1, 2019, 12:45pm

@circuitb have you tried with the latest alpha?

circuitb · March 1, 2019, 1:15pm

yes seb, latest alpha here!

sebescudie · March 1, 2019, 1:28pm

ah, weird, worked out of the box with my webcam’s mic here :/
could you provide more info on what’s going wrong? I mean, does it simply output nothing?

joreg · March 1, 2019, 1:36pm

and you got your own subscription key and entered it?

circuitb · March 1, 2019, 4:45pm

yep, i tried both keys… on win 8.1 and win 10 machines…
from a fresh new vvvv alpha install, installing the Microsoft.CognitiveServices.Speech nuget 1.3.0
no luck so far!

mburk · April 25, 2019, 5:48pm

Just tested this and got great results. If you install the newest NuGet, you can also use SpeechSynthesis!

maximesouvestre · February 24, 2022, 11:55am

Hello! Sorry for bringing this topic back up, but it’s still open, and my problem involves the demo patch provided here and matches an issue already described above.

So, same as @circuitb, I’m also having trouble using your demo patch, @joreg.
No errors are reported by Gamma 2021.4.6.
My subscription key and region are correctly copied from the Azure dashboard.

When I bang Start, I can see that vvvv is using my microphone thanks to the mic icon in the Windows system tray. When I bang the Stop IOBox, the icon disappears. So this part looks like it is working correctly.
Nevertheless, when I talk, I get no text output.

If I use the Azure Speech-To-Text service from the command line as explained in the Microsoft documentation (so without using VL), it works perfectly.

maximesouvestre · February 24, 2022, 3:50pm

Coming back here because I’ve found out why the patch was actually not working and this might be useful for future people. Apparently Windows decided to change my default microphone for no reason… Thing is, I thought it was the correct one used by VL because it was the correct one used by other services requiring microphone (Zoom, Google Meet, etc.). I don’t exactly understand why Windows changed my setting all of a sudden, but I don’t want to try understanding that, haha.

joreg · February 24, 2022, 4:35pm

thanks for following up on this. i wonder if the nodes in the audio category would have helped you understand and change the settings:
[screenshot: nodes in the audio category]

maximesouvestre · February 28, 2022, 10:25am

Since I was using another Create (SpeechRecognizer) node that takes an AudioConfig as a second input pin, with a FromDefaultMicrophoneInput node connected to it, I expected that my default microphone input would be used. But since Windows had changed my default mic, that obviously couldn’t work.

But indeed you are totally right, I should have created a CaptureDevice node from the Audio category to check whether the correct input device was used. I will do that from now on!
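
For anyone debugging the same "mic is active but no text comes out" situation outside of VL, here is a rough sketch of the same setup using the Azure Speech SDK's Python bindings (package `azure-cognitiveservices-speech`), not the VL nodes from the demo patch. It shows the two things discussed above: choosing between the default microphone and an explicit capture device, and looking at the result's reason instead of only the text. The helper names `audio_config_kwargs` and `recognize_once` are made up for illustration, the key and region are placeholders, and the exact format of the device name string is platform-specific, so treat that part as an assumption to verify on your machine.

```python
from typing import Optional


def audio_config_kwargs(device_name: Optional[str] = None) -> dict:
    """Keyword arguments for speechsdk.audio.AudioConfig (hypothetical helper).

    No device name -> follow whatever the OS currently treats as default mic.
    A device name  -> pin the recognizer to that specific capture device.
    """
    if device_name:
        return {"device_name": device_name}
    return {"use_default_microphone": True}


def recognize_once(key: str, region: str, device_name: Optional[str] = None) -> str:
    """Listen for one utterance; return the text or a diagnostic message."""
    # Imported lazily so the helper above can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(**audio_config_kwargs(device_name))
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    result = recognizer.recognize_once()

    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.NoMatch:
        # Typical when the "wrong" microphone is recording silence.
        return "no speech recognized (silence or wrong microphone?)"
    if result.reason == speechsdk.ResultReason.Canceled:
        # Typical for a bad key, wrong region, or network problems.
        details = result.cancellation_details
        return f"canceled: {details.reason} {details.error_details}"
    return f"unexpected result: {result.reason}"


if __name__ == "__main__":
    # Placeholders: replace with your own subscription key and region.
    print(recognize_once("YOUR_KEY", "YOUR_REGION"))
```

A "no speech recognized" message while you are clearly talking points at the input device rather than at the key or region, which matches what happened above.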