Topic: speech to text

paul2978

  • Guest
speech to text
« on: March 12, 2018, 04:32:38 AM »
Can anyone help me put this code into VA?

Code: [Select]
Start Loop While : [{STATE_KEYSTATE:ENTER}] Equals '0'
Start Dictation Mode (Clearing Dictation Buffer)
Start Loop While : [{EXP:{DICTATION} + {STATE_KEYSTATE:ENTER}}] Equals '0'
End Loop
Stop Dictation Mode
Quick Input, '{DICTATION}'
End Loop

I found it in the forum, but I can't implement it in VA.

https://voiceattack.com/SMF/index.php?topic=953.0

Thanks


Paul

Pfeil

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4759
  • RTFM
Re: speech to text
« Reply #1 on: March 12, 2018, 11:25:44 AM »
Quote: "I can't implement it into VA."
Which part are you getting stuck on?

I'll try to walk you through it:


Start off by clicking the edit profile button (which opens the "Edit a Profile" window), then click "New Command".

Enter a phrase that will eventually trigger the command in the "When I say" textbox(I went with "Type as I speak").


Next, we'll start adding actions: Click "Other >", go to "Advanced >", go to "Add a Loop Start >", click "Single Condition (While Loop)".

In the "Loop Start" dialog that just opened, click the "Text" tab if it's not selected already, enter "{STATE_KEYSTATE:ENTER}" into the "Variable Name / Token" textbox, and enter "0"(zero) into the "Text" textbox.

Click "OK" to add the action to your command, you should now see this at the top of the action list:
Code: [Select]
Start Loop While : [{STATE_KEYSTATE:ENTER}] Equals '0'
Adding a loop start action will also create
Code: [Select]
End Loop
All other actions go in between these two actions (the top action should automatically be highlighted, meaning any other actions you add are inserted under it).


Click "Other >", go to "Dictation >", click "Start Dictation Mode".

Check the box next to "Clear dictation buffer before starting dictation mode".

Click "OK" to add the action to your command, you should now see this as the second line in the action list:
Code: [Select]
Start Dictation Mode (Clearing Dictation Buffer)

Click "Other >", go to "Advanced >", go to "Add a Loop Start >", click "Single Condition (While Loop)".

Again in the "Text" tab, which should already be selected(VoiceAttack remembers what you used last), enter "{EXP:{DICTATION} + {STATE_KEYSTATE:ENTER}}" into the "Variable Name / Token" textbox, and enter "0"(zero) into the "Text" textbox.

Click "OK" to add the action to your command, you should now see this as the third line in the action list:
Code: [Select]
Start Loop While : [{EXP:{DICTATION} + {STATE_KEYSTATE:ENTER}}] Equals '0'
Again, adding a loop start will also add
Code: [Select]
End Loop
Because we're not putting anything into this loop(it exists only to make VoiceAttack wait until either something is spoken into the dictation buffer, or the enter key is pressed), you'll want to click on the fourth action in the list, which should be the first "End Loop" action, to ensure the following actions are added below it.
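
As a rough illustration of what that empty inner loop is waiting for, the condition can be written out in Python. This is only a sketch: "keep_waiting" is a hypothetical helper, not something VoiceAttack provides, and it assumes the expression only matches '0' while the dictation buffer is empty and the enter key is up.

```python
def keep_waiting(dictation: str, enter_keystate: int) -> bool:
    """Return True while the empty inner loop should keep spinning.

    Hypothetical model of the loop condition: the loop continues only while
    nothing has been dictated AND the enter key is not held down.
    """
    return dictation == "" and enter_keystate == 0


# All four combinations of "buffer empty?" and "enter pressed?"
for dictation, enter in [("", 0), ("hello", 0), ("", 1), ("hello", 1)]:
    print(repr(dictation), enter, "->", keep_waiting(dictation, enter))
```

Only the first combination keeps the loop going; speech or the enter key (or both) lets the command move on to the actions below the loop.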


Click "Other >", go to "Dictation >", click "Stop Dictation Mode".

Make sure "Clear dictation buffer after stopping dictation mode" is NOT checked(otherwise anything you said will be removed before it can be typed out).

Click "OK" to add the action to your command, you should now see this as the fifth line in the action list:
Code: [Select]
Stop Dictation Mode

Click "Other >", go to "VoiceAttack Action >", click "Quick Input".

Enter "{DICTATION}" into the "Text" textbox.
Into the "Hold keys down for" textbox, you'll want to enter a value that's as fast as your application will accept input, but slow enough that it won't skip any letters. For some applications this can be problematic as they expect a pause between different keypresses, which "Quick Input" does not accommodate.
I use "00.060", which is kind of slow, but seems to work alright for many games.
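
To get a feel for what a given hold time costs, here is a back-of-the-envelope estimate in Python. It assumes each character is one key press held for the full hold time and ignores any other overhead (modifier keys, application lag), so treat the result as a lower bound rather than a measurement. "min_typing_time" is a made-up helper name.

```python
def min_typing_time(text: str, hold_seconds: float) -> float:
    """Lower bound on how long typing `text` takes, in seconds.

    Assumes one key press per character, each held for `hold_seconds`;
    anything else that adds delay is ignored, so real runs may take longer.
    """
    return len(text) * hold_seconds


phrase = "open the cargo bay doors"
print(f"{len(phrase)} characters at 0.060 s each: "
      f"at least {min_typing_time(phrase, 0.060):.2f} s")
```

If the estimate for your typical phrases feels too slow, try lowering the hold time until the target application starts dropping letters, then back off a little.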

Click "OK" to add the action to your command, you should now see this as the sixth line in the action list:
Code: [Select]
Quick Input, '{DICTATION}'

Your action list should now look like this:
Code: [Select]
Start Loop While : [{STATE_KEYSTATE:ENTER}] Equals '0'
Start Dictation Mode (Clearing Dictation Buffer)
Start Loop While : [{EXP:{DICTATION} + {STATE_KEYSTATE:ENTER}}] Equals '0'
End Loop
Stop Dictation Mode
Quick Input, '{DICTATION}'
End Loop

Click "OK" to save your command, then click "Done" at the bottom of the "Edit a Profile" window.
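
If it helps to see the control flow of the finished action list outside of VoiceAttack, here is a small Python simulation. Everything in it is a stand-in: "FakeVoiceAttack" and its methods are hypothetical and only mimic the dictation buffer, the enter key state and Quick Input using a scripted list of events.

```python
from dataclasses import dataclass, field


@dataclass
class FakeVoiceAttack:
    """Hypothetical stand-in for VoiceAttack's state; not a real API."""
    events: list                      # scripted ("say", text) / ("enter", None) events
    buffer: str = ""                  # models {DICTATION}
    enter_down: bool = False          # models {STATE_KEYSTATE:ENTER} being 1
    typed: list = field(default_factory=list)

    def poll(self) -> None:
        """Consume one scripted event, as if some time had passed."""
        if not self.events:
            self.enter_down = True    # script exhausted: behave as if enter was pressed
            return
        kind, value = self.events.pop(0)
        if kind == "say":
            self.buffer = value
        elif kind == "enter":
            self.enter_down = True

    def quick_input(self, text: str) -> None:
        """Models Quick Input; an empty string produces no keystrokes."""
        if text:
            self.typed.append(text)


def type_as_i_speak(va: FakeVoiceAttack) -> list:
    while not va.enter_down:                           # outer Start Loop While
        va.buffer = ""                                 # Start Dictation Mode (clearing buffer)
        while va.buffer == "" and not va.enter_down:   # inner, empty wait loop
            va.poll()
        # Stop Dictation Mode: the buffer is deliberately NOT cleared here
        va.quick_input(va.buffer)                      # Quick Input, '{DICTATION}'
    return va.typed


va = FakeVoiceAttack(events=[("say", "hello"), ("say", "world"), ("enter", None)])
print(type_as_i_speak(va))
```

Each spoken phrase is typed once, and the enter event ends the outer loop without typing anything further.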


Now you can test your command by saying "Type as I speak"(or whichever command phrase you typed in earlier).
You should see "Recognized : 'type as i speak'" with a green icon, and "Dictation buffer cleared" with a blue icon, appear in the log(provided you don't have the main window in compact mode, in which case you'll only see "Dictation buffer cleared" without an icon).

Anything you say should be typed out as soon as you stop speaking momentarily. This will keep going until you press the enter key on your keyboard, or stop all VoiceAttack commands (either by using the button on the main window, or by pressing a hotkey if you have one assigned).


It's worth keeping in mind that when using dictation, the speech recognition engine is guessing at what you're likely to have said without a frame of reference. This means accuracy may not be very high.

Gearz

  • Newbie
  • *
  • Posts: 7
Re: speech to text
« Reply #2 on: June 12, 2018, 04:19:00 PM »
That works great, but I want to be able to say enter or hit the enter key. The reason being, sometimes google guesses what you are searching for before you finish saying the words.

Examples:

1. You have a browser window open and the search field is empty. You start dictation mode as per your script and say youtube. The STT system has a hard time understanding what you are saying but as you say "you" Google auto-fills the search with "youtube.com" before you finish saying "tube".

Now being lazy as all hell, this is where an Enter Command (accompanied by a stop all commands) comes in useful. I tried to do this with your script as is but the enter command won't function because there is another command running. I'm assuming it is the loop started in the Text Entry command (your code). How can I have my cake and eat it too? I'd like to use the enter key if the text is typed correctly with VA or press enter to stop the command and accept what Google has auto-filled for the search text.

2. I also want to add a delete command that will select all the "wrong" text so you can start the search again. This is also cancelled because another command is not allowing it to execute.

I hope that is concise enough to be understood. I myself have had a pretty good knock on the head and learning new stuff can be a challenge. I'm secretly hoping to, someday, be struck by lightning and understand code or play the ukulele effortlessly.

ps: I'm attempting to set this up for a person that has limited use of his hands. So glad VA is on Steam so you can easily gift it to your friends!
« Last Edit: June 12, 2018, 04:22:33 PM by Gearz »

Gearz

Re: speech to text
« Reply #3 on: June 12, 2018, 04:27:06 PM »
I think all I had to do was check "allow other commands to run" while this one is running. It still doesn't stop dictation mode from the initial speech to text command.

Pfeil

Re: speech to text
« Reply #4 on: June 12, 2018, 05:25:34 PM »
Quote: "That works great, but I want to be able to say enter or hit the enter key. The reason being, sometimes google guesses what you are searching for before you finish saying the words."
You can already hit enter to stop the command (in fact you have to). Are you trying to stop the command while the quick input action is still typing?
It should be able to type so fast you'll have trouble stopping it before it finishes, unless you have a long hold down time set up.


Because commands take precedence over dictation, speaking "enter" or "delete" is possible:

Dictation
Code: [Select]
Start Dictation Mode (Clearing Dictation Buffer)
Start Loop While : (Keyboard Key 'Enter' Is Not Pressed AND [endDictation] Equals False)
    Start Loop While : ([{DICTATION}] Equals '' AND Keyboard Key 'Enter' Is Not Pressed AND [endDictation] Equals False)
    End Loop
    Begin Boolean Compare : [endDictation] Does Not Equal True
        Quick Input, '{DICTATION}'
        Clear Dictation Buffer
    End Condition
End Loop
Stop Dictation Mode
Set Boolean [endDictation] to [Not Set]
Write '[Blue] Dictation stopped' to log
VoiceAttack has evolved since the original example, so this uses the new Device State condition to check keyboard keys, and the Condition Builder for "AND" functionality.

Enter
Code: [Select]
Set Boolean [endDictation] to True
Press Enter key and hold for 0,01 seconds and release
This only exists to allow the voice command, it's not supposed to be triggered by the keyboard(the main command handles that, still).

Delete
Code: [Select]
Begin Text Compare : [{DICTATIONON}] Equals '1'
    Press Left Ctrl+A keys and hold for 0,01 seconds and release
    Press Delete key and hold for 0,01 seconds and release
End Condition
This has a safeguard to prevent accidentally deleting things when you don't want to. You could extend this safeguard to the "Enter" command, or make it check a variable set by the main command, in case you use dictation elsewhere and don't want this command enabled.
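
For anyone who finds it easier to follow in a general-purpose language, here is a rough Python model of how the three commands interact. It is a sketch under assumptions, not how VoiceAttack works internally: all names are hypothetical, the main command is a generator that yields wherever the real one would sit in its wait loop, and a Not Set Boolean is assumed to behave like False in the loop conditions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class State:
    """Hypothetical shared state; none of these names are a VoiceAttack API."""
    buffer: str = ""                       # models {DICTATION}
    text_field: str = ""                   # text in the focused application
    enter_key_down: bool = False           # physical enter key state
    end_dictation: Optional[bool] = None   # models [endDictation]; None = Not Set
    dictation_on: bool = False             # models {DICTATIONON}
    sent_keys: list = field(default_factory=list)
    log: list = field(default_factory=list)


def dictation_command(s: State):
    """The main command, written as a generator that yields wherever it waits."""
    s.buffer = ""
    s.dictation_on = True                  # Start Dictation Mode (clearing buffer)
    # Assumption: a Not Set Boolean is treated like False in these conditions.
    while not s.enter_key_down and not s.end_dictation:
        while s.buffer == "" and not s.enter_key_down and not s.end_dictation:
            yield                          # other commands can run here
        if s.end_dictation is not True:
            s.text_field += s.buffer       # Quick Input, '{DICTATION}'
            s.buffer = ""                  # Clear Dictation Buffer
    s.dictation_on = False                 # Stop Dictation Mode
    s.end_dictation = None                 # Set Boolean [endDictation] to [Not Set]
    s.log.append("Dictation stopped")


def enter_command(s: State) -> None:
    s.end_dictation = True
    s.sent_keys.append("Enter")


def delete_command(s: State) -> None:
    if s.dictation_on:                     # the {DICTATIONON} safeguard
        s.text_field = ""                  # Ctrl+A, then Delete


def replay(events) -> State:
    """Feed scripted events to the model, letting the main command poll after each."""
    s = State()
    main = dictation_command(s)
    next(main)                             # run up to the first wait
    for kind, value in events:
        if kind == "say":
            s.buffer = value
        elif kind == "key" and value == "enter":
            s.enter_key_down = True
        elif kind == "command" and value == "enter":
            enter_command(s)
        elif kind == "command" and value == "delete":
            delete_command(s)
        if next(main, "finished") == "finished":
            break
    return s


result = replay([("say", "you"), ("command", "delete"),
                 ("say", "youtube"), ("command", "enter")])
print(result.text_field, result.sent_keys, result.log)
```

In this replay the misheard "you" is wiped by the spoken "Delete", "youtube" is typed, and the spoken "Enter" both sends the key and ends the main command.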

Gearz

Re: speech to text
« Reply #5 on: June 12, 2018, 07:15:18 PM »
Thanks!!!

I'll give that a whirl.

I've got another question for you. Since Windows now has a pretty good dictation feature with remarkably good speech recognition, is there a way to tie it into VA to do text entry? I have a simple profile that opens Dictation in Windows and copies the text to the clipboard. Just wondering if it will shut a game's window down while it's being used. Dictation, that is.

Gearz

Re: speech to text
« Reply #6 on: June 12, 2018, 07:24:04 PM »
Ouch, that is a bit over my head. Not sure what to add as commands. Is there a possibility that you could send over a profile with that in it so I can dig in and try to understand it?

Pfeil

Re: speech to text
« Reply #7 on: June 12, 2018, 08:44:25 PM »
Quote: "Since Windows now has a pretty good dictation feature with remarkably good speech recognition, is there a way to tie it into VA to do text entry?"
"Dictation" in VoiceAttack and "Dictation" in the Windows 10 Fall Creators Update are two different technologies.

One big difference is that VoiceAttack uses a fully offline speech recognition engine, meaning you don't need to be connected to the internet to utilize it, and it doesn't send your data to a remote server.
The "Dictation" app triggered by Win+H uses the same technology as Cortana, which both requires you to be online, and sends everything you say to it to Microsoft.

Cortana technology is not currently usable natively within VoiceAttack.

Quote: "I have a simple profile that opens Dictation in Windows and copies the text to the clipboard. Just wondering if it will shut a game's window down while it's being used. Dictation, that is."
If you're referring to the Fall Creators Update's "Dictation" app, that may affect window focus and cause your game to minimize (like any app pulling focus would, if it does).

The dictation feature integrated into VoiceAttack does not affect window focus(Just like the rest of VoiceAttack doesn't, unless you tell it to).


Quote: "Is there a possibility that you could send over a profile with that in it so I can dig in and try to understand it?"
A profile containing those commands is attached to this post.

Gearz

Re: speech to text
« Reply #8 on: June 12, 2018, 09:07:26 PM »
Downloaded the profile and imported it. Updated to the latest beta version and it's working. Thanks.

I'm not sure if it was a version problem but I'm going to look at your commands and see what I might have done wrong. I wasn't sure if they were text or boolean and tried them both.

I see what the failsafe is. It checks to see if dictationon is true. I'm assuming that means it's looking to see if dictation is still running.

Thanks so much for your help and time!!!

Pfeil

Re: speech to text
« Reply #9 on: June 12, 2018, 09:50:28 PM »
Quote: "I wasn't sure if they were text or boolean"
A variable can only have one type, and it always remains that type (in VoiceAttack at least), so if it starts out as Boolean, any action involving it will either have to use it as Boolean, or explicitly cast (convert) it to text.

So, if you see something like
Code: [Select]
Set Boolean [endDictation] to True
You can deduce "endDictation" is a Boolean variable, and it will stay that way (even conversion does not change the data stored in the source variable; it only retrieves that data and makes a copy of it in another format. Think MP3 file, not Age Of Empires ;)).
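
The "conversion makes a copy" idea can be shown with a loose analogy in Python. This is not VoiceAttack syntax, and unlike VoiceAttack, Python would let you reuse a name for a different type later; the snippet only illustrates that converting a value leaves the source untouched.

```python
end_dictation = True                 # stands in for the Boolean [endDictation]
as_text = str(end_dictation)         # conversion produces a new, separate text value

# The source keeps its original type and value; only the copy is text.
print(type(end_dictation).__name__, end_dictation)
print(type(as_text).__name__, as_text)
```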


Quote: "I see what the failsafe is. It checks to see if dictationon is true. I'm assuming that means it's looking to see if dictation is still running."
Yes. If dictation mode is active (i.e. started by a "Start Dictation Mode" action), the "{DICTATIONON}" token will return "1", otherwise it will return "0".

You can find a description of all currently available tokens in VoiceAttackHelp.pdf, starting at page 126(VoiceAttackHelp.pdf is located in VoiceAttack's installation directory, but can also be accessed by pressing F1 while VoiceAttack has focus, or by right-clicking the title bar of the main window and clicking "Help Document" in the context menu).