Author Topic: Access to Google Speech Api (Read 12304 times)

Lem_Jukes · « **on:** August 08, 2017, 12:58:19 PM »

This may already be doable, but a way to use the Google Speech API instead of the Microsoft speech engine would be incredibly useful. I've tested commands side by side, and Google's speech handler is miles ahead of what Microsoft's service is able to recognize. I realize the Google Speech API is a paid cloud service, so I'm not entirely sure how the logistics of integrating it into Voice Attack would work. But at the very least, is there any kind of third-party solution that would use the Google speech recognition API to pass text to Voice Attack to be registered as commands? I realize this is a bit of a long shot, but does anybody have any thoughts on this?

iceblast · « **Reply #1 on:** August 09, 2017, 04:12:52 AM »

Have you tried Microsoft Speech Platform 11? If you are just wanting better Recognition, for me MSP 11 does the job. Just install it, and go into VA's options and switch to it.

The standard speech engine doesn't work very well for me, but MSP 11 is a great deal better, and you don't have to train it.

Google would be great, but it's unlikely that VA will be able to use it.

Exergist · « **Reply #2 on:** August 16, 2017, 09:46:57 AM »

@iceblast - are you referring to this:

...which can be installed by following these instructions from the VA website and then accessed by enabling these options:

(actually use the top option for the recognition and the bottom option for the TTS voice).

Thoughts? Thanks!

iceblast · « **Reply #3 on:** August 17, 2017, 03:15:59 AM »

It's the Microsoft Server Speech Recognition Language - TELE (en-US)

That is the one I use. It works really well for me. No speech recognition is perfect, they all have their flaws, but it's much better than the default one that comes with Windows, and you don't have to train it.

The "use installed speech platform text-to-speech synthesizers" option is for when you want to use a Microsoft voice, I believe. I have the voice Ivona 2 Joey installed, so I use the top one.

svennv · « **Reply #4 on:** September 28, 2017, 03:17:09 AM »

I was hoping for this too.

After having used Google Assistant on my phone and having been astonished at how well it understood me near heavy traffic or even in the middle of a rock concert, I was very disappointed by how bad Microsoft's voice recognition on my big machine is.

So I came here to ask for Google's voice recognition too / see if this is already in discussion.

Having had a quick look, I saw that there are actually two access points to Google's tech.
- Google's Cloud Speech API ( https://cloud.google.com/speech ) should be irrelevant because it is a paid-for online service.
- Luckily there is also the Google Assistant API ( https://developers.google.com/assistant/sdk/overview ) which is a library available for many different programming languages. I'm not sure about their licensing, but this might be an option.

Additional benefit: decoupling Voice Attack from Microsoft's speech recognition should take care of the biggest hurdle to making it multiplatform.

svennv · « **Reply #5 on:** September 28, 2017, 05:51:01 AM »

I am also confused about the versioning of Microsoft's speech engine.
The recognizer installable via the link in Exergist's post is named Version 11, but it was released in 2011.
The engine preinstalled in Windows 10 is labeled version 8, but is more recent.
WTF Microsoft?

My google-fu was insufficient to determine which version is newer (and hopefully better). Can you enlighten me?

Pfeil · « **Reply #6 on:** September 28, 2017, 12:52:00 PM »

The main differences between Speech Platform 11 and the built-in Speech Recognition Engine are noted in the installation instructions:

Quote

First, you have to install MORE software. The files are not huge (and they don't take that long to install), but its just MORE software to manage.

Second, dictation functionality is not provided with this platform. Not a big deal for just about everybody, but for the few that want to use VoiceAttack to accept, 'freeform' speech, they won't be able to do so.

Third, since the dictionary that the speech engine uses is based solely on your commands, you will need to set a high confidence level (greater than 75). Otherwise, the speech engine will try to make a best guess at what you are saying and that will usually not end well.

Fourth, 'unrecognized' commands will not be reported in the log, as the speech engine does not raise any event when a voice command is unrecognized.

Fifth, recognition is very good, however, it is never as good as a speech engine that is trained to your individual voice. Which leads to the silver lining: The speech engines in this platform do not require voice training (I can hear the cheers).

So I'd argue it's less about which is newer, but rather which feature set attracts you most.

Antaniserse · « **Reply #7 on:** September 28, 2017, 01:37:25 PM »

Quote from: svennv on September 28, 2017, 03:17:09 AM

- Luckily there is also the Google Assistant API ( https://developers.google.com/assistant/sdk/overview ) which is a library available for many different programming languages. I'm not sure about their licensing, but this might be an option.

Additional benefit: decoupling Voice Attack from Microsoft's speech recognition should take care of the biggest hurdle to making it multiplatform.

The biggest issue with this is that, if I understand correctly, the SDK still relies on an online service to do the actual processing, and those libraries only act as a bridge towards that.
Having every single command do a network round-trip to be resolved looks like a serious limitation to me if you need a reasonably time-sensitive response.

Also, again I may be reading the documentation incorrectly, but it looks to me like every device needs to be activated with the user's own Google account to access the API, not just by the developer coding the integration with the SDK. That would tie VA to their service, which I'm not sure is Gary's intention.

AlbertoRab · « **Reply #8 on:** November 22, 2017, 09:21:38 AM »

Quote from: Garry on November 22, 2017, 10:29:15 AM

You don't have to pay for the api, but there is an associated cost for using the api.

Do you have to pay for the api?

Gary · « **Reply #9 on:** November 22, 2017, 10:29:15 AM »

You don't have to pay for the api, but there is an associated cost for using the api.

rhradec · « **Reply #10 on:** July 21, 2021, 10:50:29 PM »

I decided to find a way to use google speech with voiceattack, since the microsoft speech "thingy" can't understand what I say AT ALL (I'm originally from Brazil)!! LoL

But Alexa and Google can understand me perfectly! Alexa actually understands me much better than Google, but I couldn't find a Python module that just returns the text of what I say using Amazon Alexa, like the Python SpeechRecognition module does so easily using Google. The SpeechRecognition module actually has a few other backends that I want to try later on, but the Google one is so easy to use and just works!
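For anyone curious what "just returns the text" looks like, here is a minimal sketch using the SpeechRecognition package with its Google backend. It is not taken from the linked repository; the function names `clean_phrase` and `listen_once` are made up for illustration, and it assumes `SpeechRecognition` and `PyAudio` are installed and a default microphone is available.

```python
def clean_phrase(text):
    """Collapse runs of whitespace so the phrase is a single tidy line."""
    return " ".join(text.split())


def listen_once():
    """Capture one phrase from the default microphone.

    Returns the recognized text, or None if nothing usable came back.
    """
    # Imported here so clean_phrase() works even without the package installed.
    import speech_recognition as sr  # pip install SpeechRecognition PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample background noise briefly so quiet speech is not cut off.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        # Sends the audio to Google's web speech service (needs internet).
        return clean_phrase(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        return None  # the audio was not intelligible
    except sr.RequestError:
        return None  # the online service could not be reached


# Usage: phrase = listen_once(); print(phrase)
```

Recognition happens online, so every phrase costs a network round-trip, as discussed earlier in the thread.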

So you can find it here: https://github.com/hradec/google-speechrecognition-for-voiceattack

Currently the script runs "voiceattack.exe -command <text>" to send it to voiceattack, but a much better and more elegant solution would be to connect via a TCP port!
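A rough sketch of that hand-off, assuming only the "-command" switch quoted above. The install path and the helper names (`build_args`, `send_to_voiceattack`) are assumptions for illustration, not the repository's actual code.

```python
import subprocess

# Assumed install location; change it to wherever VoiceAttack.exe lives for you.
VOICEATTACK_EXE = r"C:\Program Files\VoiceAttack\VoiceAttack.exe"


def build_args(phrase, exe=VOICEATTACK_EXE):
    """Return the argument list for 'voiceattack.exe -command <text>'."""
    phrase = phrase.strip()
    if not phrase:
        raise ValueError("empty phrase")
    # Passing a list lets subprocess quote a multi-word phrase as one argument.
    return [exe, "-command", phrase]


def send_to_voiceattack(phrase, exe=VOICEATTACK_EXE):
    """Run VoiceAttack with the recognized phrase and return its exit code."""
    return subprocess.run(build_args(phrase, exe), check=False).returncode


# Usage: send_to_voiceattack("open landing gear")
```

Launching a process per phrase is simple but adds start-up overhead each time, which is part of why a persistent connection is attractive.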

So my question goes to the voiceattack developers I think, since I couldn't find this functionality in the docs:

Would it be possible to have voiceattack bind to some TCP port, so any other software could send text to it via a TCP connection?

My interest in doing it this way, apart from it being a much more elegant solution, is to be able to run my voice capture script on linux (where my headphone/mic is connected) and send the text to my windows machine over the network. I mainly use linux here, but I have a windows machine for gaming that I connect to using moonlight to play.
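The sending half of that idea could be as small as the sketch below. It assumes a made-up wire format (one UTF-8 phrase per line) and a hypothetical listener on the Windows side; VoiceAttack itself is not known to expose such a port, which is exactly the question being asked.

```python
import socket


def send_phrase(host, port, phrase, timeout=2.0):
    """Send one recognized phrase as a single UTF-8 line over TCP."""
    # Keep the phrase on one line so the receiver can split on newlines.
    line = phrase.strip().replace("\n", " ") + "\n"
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line.encode("utf-8"))


# Usage (address and port are placeholders):
#     send_phrase("192.168.1.50", 5005, "open landing gear")
```

A short-lived connection per phrase keeps the sketch simple; a long-lived connection would shave a little latency at the cost of reconnect handling.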

make sense?

Anyhow, if someone wants to try google speech recognition in voice attack now, just download the files from the github repository and run "run.cmd" (I think if you don't have python3 installed, windows will open the windows store for you and you can install from there!)!

Having python3 already installed, run.cmd should install all the required python modules and just start it! Then just speak away!

Please drop an issue on github if you guys have any problem running it and I'll do my best to help!

cheers...
-H

Pfeil · « **Reply #11 on:** July 22, 2021, 06:01:24 AM »

You could write a plugin that facilitates that TCP connection and executes the command using the Command.Execute() method.

C# has a few classes that could aid you in this, such as System.Net.Sockets.TcpListener

There is a plugin example in a subfolder of the installation directory.
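A real plugin would be written in C# against the plugin interface. Purely to illustrate the listen-and-forward shape, here is a sketch in Python that accepts newline-delimited phrases and hands each one to a callback. The name `make_server`, the port number and the line-based format are all assumptions; in an actual plugin the callback is where the command would be executed.

```python
import socketserver


def make_server(host, port, on_phrase):
    """Build a TCP server that calls on_phrase(text) for every line received."""

    class PhraseHandler(socketserver.StreamRequestHandler):
        def handle(self):
            # rfile yields one line at a time until the client disconnects.
            for raw in self.rfile:
                text = raw.decode("utf-8", errors="replace").strip()
                if text:
                    on_phrase(text)

    # One client at a time is enough for a single voice-capture script.
    return socketserver.TCPServer((host, port), PhraseHandler)


# Usage (blocks until interrupted; 5005 is an arbitrary placeholder port):
#     with make_server("0.0.0.0", 5005, print) as server:
#         server.serve_forever()
```

Binding to "0.0.0.0" accepts connections from any machine on the network, so anything that can reach the port can trigger commands; binding to a specific address or adding a firewall rule is worth considering.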

Exergist · « **Reply #12 on:** August 20, 2021, 09:47:47 AM »

I recommend you check out something like SimpleTCP to facilitate a TCP/IP connection with VA.
