Many of today’s systems support innovative interfaces for people to interact
with, such as touch screen tablets and multi-touch surfaces. I would like to know to what extent voice recognition is supported inside vvvv. If it works accurately, it would be a great way of interacting with media devices, and also of pulling the lyrics out of a piece of music by separating the vocals despite heavy use of instruments.
The Siri function on the iPhone works quite accurately and suggests that, for certain groups of users, the future will involve more voice commands. On YouTube, the caption option shows text, but it is not accurate; mostly a wrong transcript shows up. Considering accents, I suppose it would be very challenging to get a precise transcript out of audio, but it would be wonderful if it worked.
I don’t know if this is already possible in vvvv. Is anyone thinking about research in that direction?