Author Topic: Speech to text- single characters (Read 4063 times)

pavsko · « **on:** August 30, 2019, 03:53:05 AM »

Well, the {DICTATION} token does not work very precisely for me, so I am looking for a way to send short texts to the game chat, dictated one character at a time.

I have successfully made a command for dictating numbers:

Command: set number [0..9]
Set Text [mytext] to '{CMD}'
Set integer [mynumber] value to the converted value of {TXTNUM:mytext}
Set Windows clipboard to '{INT:mynumber}'
Press Left Ctrl+V keys and hold for 0,1 seconds and release
Say, '{INT:mynumber}'

but I am not able to do something similar with characters...

The following command works but I have to do one command for every character (not very practical):

Command: set character C
Set Windows clipboard to 'c'
Press Left Ctrl+V keys and hold for 0,1 seconds and release
Say, 'c'

I would welcome any tip or advice on how to make a single command for speech-to-text of single characters...

Pfeil · « **Reply #1 on:** August 30, 2019, 03:49:24 PM »

Assuming you usually don't have a need to dictate a single character (responding with "k" aside), you could make a single command that allows you to speak all the characters, then submit them when you're done (this also means you can't trigger a character command unintentionally).

Something like

Code: [Select]

Set Text [~output] to ''
Start Loop While : [1] Equals [1]
    Wait for spoken response: '[capital;] [a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z];1;2;3;4;5;6;7;8;9;0;space;,;.;!;?;backspace;complete;cancel'
    Begin Text Compare : [~response] Equals 'complete'
        Begin Text Compare : [~output] Does Not Equal ''
            Set Windows clipboard to '{TXT:~output}'
            Press Left Ctrl+V keys and hold for 0,01 seconds and release
            Write [Blue] 'Pasted "{TXT:~output}"' to log
        Else
            Write [Orange] 'Output buffer is empty, exiting without pasting' to log
        End Condition
        Exit Command
    Else If Text Compare : [~response] Equals 'cancel'
        Write [Orange] 'Input cancelled' to log
        Exit Command
    Else If Text Compare : [~response] Equals 'backspace'
        Begin Text Compare : [~output] Does Not Equal ''
            Set Text [~output] to '{TXTSUBSTR:~output:0:{EXP: {TXTLEN:~output} - 1}}'
        End Condition
        Set Text [~response] to ''
    Else If Text Compare : [~response] Starts With 'capital'
        Set Text [~response] to '{TXTUPPER:"{TXTREPLACEVAR:~response:"capital ":""}"}'
    Else If Text Compare : [~response] Equals 'space'
        Set Text [~response] to ' '
    End Condition
    Set Text [~output] to '{TXT:~output}{TXT:~response}'
    Write [Blue] 'Output buffer: "{TXT:~output}"' to log
End Loop

I added some punctuation characters as well.

The while loop just checks a boolean variable named "1" against itself; Make sure "Evaluate 'Not Set' as false" is checked.

The "Wait For Spoken Response" action should not have a timeout (set to 0.0), and should not continue on any speech.
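For anyone who finds the action list hard to follow, the buffer logic of the loop can be sketched in Python. This is only an illustration of the control flow, not something VoiceAttack can run; the function name `dictate` and the idea of passing the recognized responses in as a list are assumptions made for the example.

```python
def dictate(responses):
    """Build the output buffer from an iterable of recognized responses.

    Returns the text that would be pasted, or None if nothing is pasted.
    """
    output = ""
    for response in responses:
        if response == "complete":
            # An empty buffer exits without pasting anything.
            return output or None
        if response == "cancel":
            return None
        if response == "backspace":
            # Slicing an empty string is harmless, so no extra check is needed.
            output = output[:-1]
            continue
        if response.startswith("capital "):
            # Drop the prefix and upper-case the remaining letter.
            response = response[len("capital "):].upper()
        elif response == "space":
            response = " "
        output += response
    # The input ran out before "complete" was spoken.
    return None


print(dictate(["capital h", "i", "space", "x", "backspace", "1", "complete"]))
```

The real command never runs out of input, since the loop waits for the next spoken response until "complete" or "cancel" is heard.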

As an aside, there is no reason in your example to use a variable set to the value of "{CMD}"; You can nest tokens (using double quotes to indicate to the "{TXTNUM:}" token that the input should be treated as a literal value rather than a variable name, as noted in the documentation):

Code: [Select]

Set integer [mynumber] value to the converted value of {TXTNUM:"{CMD}"}
Set Windows clipboard to '{INT:mynumber}'
Press Left Ctrl+V keys and hold for 0,1 seconds and release
Say, '{INT:mynumber}'

pavsko · « **Reply #2 on:** September 02, 2019, 12:42:16 AM »

Thanks very much Pfeil for your expert answer. The script looks like exactly what I need!

Unfortunately, I am not able to duplicate that complex command. I was stuck on the 3rd line - cannot find command " Wait for spoken response:"

BTW: Is there any way how import complex commands from txt to VA?

Pfeil · « **Reply #3 on:** September 02, 2019, 12:45:52 AM »

Quote from: pavsko on September 02, 2019, 12:42:16 AM

cannot find command " Wait for spoken response:"

That action can be found under "Other >", "Advanced", "Get User Input", "Wait For Spoken Response".

Quote from: pavsko on September 02, 2019, 12:42:16 AM

BTW: Is there any way how import complex commands from txt to VA?

No.

pavsko · « **Reply #4 on:** September 02, 2019, 03:03:36 AM »

Thanks again, finally I successfully managed to insert all lines of the command

Well, it works but seems practically unusable for me. I am not a native speaker and VA does not recognize numbers and characters correctly.

E.g. I tried to dictate the numbers 1 to 7 and finish with the word "complete". All except number 3 were heard correctly, but only number 1 was matched and pasted. This is the output:

10:57:31 - Pasted "1"
10:57:29 - Unrecognized : 'seven'
10:57:27 - Unrecognized : 'six'
10:57:25 - Unrecognized : 'five'
10:57:23 - Unrecognized : 'four'
10:57:21 - Unrecognized : 'to being'
10:57:19 - Unrecognized : 'two'
10:57:17 - Output buffer: "1"
10:57:13 - Recognized : 'dictation characters'

As for the letters, not a single one is correctly recognized by VA :-\

Pfeil · « **Reply #5 on:** September 02, 2019, 03:29:55 AM »

Have you tried restarting VoiceAttack?

If so, post the contents of your action list (right-click it, choose "Copy All as Text", paste into code tags here; Click the "#" button to add those tags).

pavsko · « **Reply #6 on:** September 02, 2019, 03:40:33 AM »

OK, downloaded the newest version, installed, restarted, no change.
So here is my command

Code: [Select]

Say, 'Dictation started'
Set Text [~output] to ''
Start Loop While : [1] Equals [1]
    Wait for spoken response: '[capital;] [a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z];1;2;3;4;5;6;7;8;9;0;space;,;.;!;?;backspace;complete;cancel'
    Begin Text Compare : [~response] Equals 'complete'
        Begin Text Compare : [~output] Does Not Equal ''
            Set Windows clipboard to '{TXT:~output}'
            Press Left Ctrl+V keys and hold for 0,1 seconds and release
            Write [Blue] 'Pasted "{TXT:~output}"' to log
        Else
            Write [Orange] 'Output buffer is empty, exiting without pasting' to log
        End Condition
        Say, 'Dictation finished,'
        Exit Command
    Else If Text Compare : [~response] Equals 'cancel'
        Write [Orange] 'Input cancelled' to log
        Say, 'Dictation canceled,'
        Exit Command
    Else If Text Compare : [~response] Equals 'backspace'
        Begin Text Compare : [~output] Does Not Equal ''
            Set Text [~output] to '{TXTSUBSTR:~output:0:{EXP: {TXTLEN:~output} - 1}}'
        End Condition
        Set Text [~response] to ''
    Else If Text Compare : [~response] Equals 'capital'
        Set Text [~response] to '{TXTUPPER:"{TXTREPLACEVAR:~response:"capital ":""}"}'
    Else If Text Compare : [~response] Equals 'space'
        Set Text [~response] to ' '
    End Condition
    Set Text [~output] to '{TXT:~output}{TXT:~response}'
    Write [Blue] 'Output buffer: "{TXT:~output}"' to log
End Loop

Pfeil · « **Reply #7 on:** September 02, 2019, 03:49:48 AM »

That looks good, aside from one minor error:

Code: [Select]

Else If Text Compare : [~response] Equals 'capital'

has to be

Code: [Select]

Else If Text Compare : [~response] Starts With 'capital'

as you can only use "capital" as a prefix.
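To see why the comparison type matters, here is a small Python analogy (the helper names are made up for the example). A response such as "capital a" is never exactly equal to "capital", so an "Equals" branch is skipped, while a "Starts With" test matches it.

```python
def equals_capital(response):
    # Analogous to: Text Compare [~response] Equals 'capital'
    return response == "capital"


def starts_with_capital(response):
    # Analogous to: Text Compare [~response] Starts With 'capital'
    return response.startswith("capital")


for spoken in ["capital a", "a"]:
    print(spoken, equals_capital(spoken), starts_with_capital(spoken))
```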

If you try the "Wait for Spoken Response" action by itself in a command, do the letters get recognized?

I.E. only

Code: [Select]

Wait for spoken response: '[capital;] [a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z];1;2;3;4;5;6;7;8;9;0;space;,;.;!;?;backspace;complete;cancel'

should allow any of the letters to be recognized, though only once per command execution as you don't have the loop in there.
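As a rough illustration of which phrases that response list stands for, the Python sketch below expands the bracket syntax. It assumes that top-level semicolons separate alternative phrases, that a bracketed section offers one choice per semicolon-separated entry, and that an empty entry (as in "[capital;]") makes the section optional. The function names are invented for the example, numeric ranges such as "[0..9]" are not handled, and VoiceAttack's own parser may differ in details.

```python
import itertools
import re


def split_top_level(text):
    """Split on semicolons that are not inside square brackets."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == ";" and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts


def expand(phrase_list):
    """Return every concrete phrase described by the response list."""
    phrases = []
    for phrase in split_top_level(phrase_list):
        choices = []
        # The capture group keeps the bracketed sections in the split result.
        for part in re.split(r"(\[[^\]]*\])", phrase):
            if part.startswith("[") and part.endswith("]"):
                choices.append(part[1:-1].split(";"))
            else:
                choices.append([part])
        for combo in itertools.product(*choices):
            # Collapse the extra space left behind by an omitted optional section.
            phrases.append(" ".join("".join(combo).split()))
    return phrases


print(expand("[capital;] [a;b];1;complete"))
```

Under these assumptions "capital" only ever appears together with a letter, which is why it can only be tested for as a prefix.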

pavsko · « **Reply #8 on:** September 02, 2019, 06:59:20 AM »

Well I tried this one:

Code: [Select]

Say, 'Dictation started'
Set Text [~output] to ''
Wait for spoken response: '[a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z];1;2;3;4;5;6;7;8;9;0;space;,;.;!;?;backspace;complete;cancel'
Set Windows clipboard to '{TXT:~output}'
Press Left Ctrl+V keys and hold for 0,1 seconds and release
Write [Blue] 'Pasted "{TXT:~output}"' to log
Say, 'Dictation finished,'
Exit Command

The "Wait for spoken response" action stores its result in the variable ~output.

Only the digits 1 and 9 are recognized:

14:51:48 - Pasted "9"
14:51:46 - Unrecognized : 'eight'
14:51:45 - Unrecognized : 'seven'
14:51:43 - Unrecognized : 'six'
14:51:41 - Unrecognized : 'five'
14:51:40 - Unrecognized : 'four'
14:51:38 - Unrecognized : 'ring'
14:51:36 - Unrecognized : 'two'
14:51:34 - Recognized : 'dictation single character'
14:51:29 - Pasted "1"
14:51:25 - Recognized : 'dictation single character'

Pfeil · « **Reply #9 on:** September 02, 2019, 07:21:24 AM »

If you make a command with

Code: [Select]

[capital;] [a;b;c;d;e;f;g;h;i;j;k;l;m;n;o;p;q;r;s;t;u;v;w;x;y;z];1;2;3;4;5;6;7;8;9;0;space;,;.;!;?;backspace;complete;cancel

as the "When I say" value, do the characters get recognized properly?

pavsko · « **Reply #10 on:** September 02, 2019, 07:33:29 AM »

Tried it, but the same (only 1 and 9 are recognized properly):

15:32:09 - Recognized : '9'
15:32:06 - Unrecognized : 'eight'
15:32:04 - Unrecognized : 'seven'
15:32:02 - Unrecognized : 'six'
15:32:01 - Unrecognized : 'five'
15:31:59 - Unrecognized : 'four'
15:31:57 - Unrecognized : 'being'
15:31:55 - Unrecognized : 'two'
15:31:53 - Recognized : '1'

Pfeil · « **Reply #11 on:** September 02, 2019, 07:44:26 AM »

There may be a broader issue on your machine. The speech recognition engine does appear to recognize what you're saying (most of the numbers, at least), but it's not getting matched to the command phrases.

You could try a new speech recognition profile; Make sure to train it, at least three times.

pavsko · « **Reply #12 on:** September 03, 2019, 12:37:55 AM »

Thanks very much Pfeil again for your effort to help me.
I will try to train my profile. As I wrote earlier, I am not a native speaker, so maybe that is the problem. On the other hand, I have been using VA for several years to control my PC and I have tons of commands for gaming as well. It all works very well for me, though I have noticed that shorter words are generally understood worse than long words or phrases. So I prefer longer words or two-word commands in my profiles.

But the only serious problems I have encountered are the DICTATION token, which does not recognize my speech correctly at all, and now the recognition of single numbers and letters (which is imho related to the general problem with short words).

BTW: my first script to write numbers to the clipboard (see my first post) works well - all numbers are recognized.
Maybe thanks to the TXTNUM function?

AcesHigh · « **Reply #13 on:** October 12, 2019, 11:30:08 AM »

I just bought Voice Attack and I think I'm having the same issue. When I put "one;two;three" in the "Wait for spoken response" field it works and stores the response in a TXT variable and I can print it in the log. If I put "1;2;3" in the "Wait for spoken response" field it doesn't recognize it.

Maybe I'm missing something? This use case is exactly why I bought this program. If I can work around this, can someone let me know how?

UPDATE: Some numbers are recognized as digits (1, 2, etc.); others are recognized but come back as text ("eight", "nine", etc.) even though the allowed responses are the digits 1,2,3,4,5,6,7,8,9... When I say 1 or 2 and write to the log it shows "1" or "2", but other entries like 8 and 9 show "eight" and "nine" in the log, and I can't run type conversions on those. You can see below that the speech is recognized correctly in all cases, but it is only matched to a digit in certain cases.

2:04:48 PM - 15
2:04:46 PM - Recognized : 'set number'
2:04:16 PM - 12
2:04:14 PM - Recognized : 'set number'
2:04:06 PM - 5
2:04:05 PM -
2:04:05 PM - Recognized : 'set number'
2:04:01 PM - Unrecognized : 'six'
2:04:00 PM - Recognized : 'set number'
2:03:55 PM -
2:03:51 PM - Unrecognized : 'seven'
2:03:50 PM - Recognized : 'set number'
2:03:46 PM -
2:03:42 PM - Unrecognized : 'eight'
2:03:41 PM - Recognized : 'set number'
2:03:39 PM - 10
2:03:37 PM - Recognized : 'set number'

iceblast · « **Reply #14 on:** October 12, 2019, 05:43:25 PM »

Spoken command = [0..9]

Code: [Select]

Set Text [count] to '{CMD}'
Set integer [count] value to the converted value of {TXTNUM:count}
Quick Input, '{TXTNUM:count}'

I just say the number and it types it.

If you want to do letters.

Spoken command = [Letter A;B;C;D;E;F;G;H;Letter I;J;K;L;M;N;Letter O;Letter P;Q;R;S;T;U;V;W;X;Y;Z]

Code: [Select]

Set Text [Letter] to '{CMD}'
Begin Text Compare : [Letter] Equals 'Letter A'
    Quick Input, 'a'
End Condition - Exit when condition met
Begin Text Compare : [Letter] Equals 'Letter I'
    Quick Input, 'i'
End Condition - Exit when condition met
Begin Text Compare : [Letter] Equals 'Letter O'
    Quick Input, 'o'
End Condition - Exit when condition met
Begin Text Compare : [Letter] Equals 'Letter P'
    Quick Input, 'p'
End Condition - Exit when condition met
Quick Input, '{TXT:Letter}'

The speech engine doesn't understand a,i,o,p for me, but if I say Letter a, Letter i, Letter o, Letter p, it will type the letter.

Now, I'm using an Alternate Speech Engine with VA, Microsoft Speech Platform 11, this engine you don't have to train. Not sure how well it works for languages other than English though. I believe you can pick a language pack, but not sure.

I personally could never get the speech engine that came with windows to work for me. Training it didn't seem to make any difference at all, but Microsoft Speech Platform 11 doesn't require training, and can't be trained anyway. If you use Microsoft Speech Platform 11, you can no longer use Dictation in VA, but for me, it never worked anyway, because the speech engine never heard me correctly.

Microsoft Speech Platform 11 hears me correctly the majority of the time.

This is how I personally enter numbers and letters.

Pfeil · « **Reply #15 on:** October 12, 2019, 07:34:50 PM »

Have you tried creating a new speech recognition profile, as described here?

AcesHigh · « **Reply #16 on:** October 12, 2019, 11:12:22 PM »

I’m not sure how creating a new profile and retraining will help, but I may try. In the example I gave, when I say “six” it says it’s unrecognized but is able to spell/echo the word six to the screen. In language, 6 and “six” are equivalent as far as enunciation goes. It baffles me how the voice recognition knows I said six, because it spells it, but it’s unrecognized in the application because I’m looking for “6.” They are the same!

iceblast · « **Reply #17 on:** October 13, 2019, 12:18:50 AM »

Just set it to when it hears six, to type 6.

It tells you the word it's hearing, just use that word for the command. It's how to work around issues like this.

AcesHigh · « **Reply #18 on:** October 13, 2019, 08:43:03 AM »

Again, I’m using the “wait for spoken response” input dialog which just sets a VARIABLE to the spoken response from the allowed responses I provide in the dialog. I have “1;2;3;4;5;6;7;8;9” in the dialog and it works for most of these. When I say “one” the variable is set to “1”. “One” and “1” are equivalent when spoken. I assume it’s setting the value to “1” in this case because some way the Voice Attack software knows I’m looking for “1?” When I say “six” the variable is set to “six” which doesn’t match any of the allowed responses therefore I get “unrecognized response”... BUT 6 and “six” are equivalent just like “1” and “one” and it works for “one”!! I need the variable set appropriately so I can do a text to integer conversion to perform calculations.

This is a bug. Sure, I can put “six” and “6” in the list of acceptable responses and then check for both and set the variable to “6,” but I shouldn’t need to do this.
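The workaround mentioned here (accept both the word and the digit, then normalize to the digit before converting) can be sketched in Python. The table and the function name `normalize_digit` are illustrative only; inside VoiceAttack the same idea would have to be built from text compares or extra allowed responses.

```python
DIGITS = "0123456789"

# Spoken forms the speech engine may hand back instead of the digit itself.
WORD_TO_DIGIT = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalize_digit(response):
    """Return the single digit for a response, or None if it is not one."""
    text = response.strip().lower()
    if len(text) == 1 and text in DIGITS:
        return text
    return WORD_TO_DIGIT.get(text)


# Both spellings end up as the same integer once normalized.
print(int(normalize_digit("six")) + int(normalize_digit("1")))
```

Misheard words such as "ring" still come back as None, so they have to be handled separately.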

Gary · « **Reply #19 on:** October 13, 2019, 09:45:07 AM »

What the guys are trying to say is that, for whatever reason, your speech engine is not able to pick up a specific word. There is zero code in VA that tries to figure out what to do with English numbers - it's all reacting to what the speech engine is interpreting. If you have 1;2;3;4;5;6;7;8;9 in a, 'Wait for spoken response' action, the speech engine is fed exactly that - 1 through 9 (numeric) and not 'one' through 'nine' (alpha). You speak, 'one', the speech engine is able to make the connection as, '1'. You speak, 'six' and the speech engine somehow is not able to make the connection, no matter what. This would be classified as an anomaly of the speech engine, and speech engine anomalies are not uncommon - it could be speaking style, it could be environment, line noise, drivers, hardware, etc. that contribute to how the speech engine behaves. The speech engine also, 'learns' over time, so that could be a factor. I would go with the suggestion of trying a new speech profile to see if that helps at all.

All that said, do you have any other commands in your profile that have the word, 'six' in them? It is possible that the speech engine could be using that to make its decisions. Try typing the word, 'six' in the box at the bottom of your profile edit screen to see if anything comes up (for grins, also try, '6'). If so, you'll want to work with that. If not, you're still at the mercy of the speech engine and will need to adjust your profile accordingly and/or create a new speech profile, as the guys are suggesting (they are long-time VA users).

AcesHigh · « **Reply #20 on:** October 13, 2019, 11:09:55 AM »

Got it. Not trying to be argumentative! Just trying to understand.

There is/was no reference to "six" in my profile, which is what is so confusing. There is a reference to "6," which is expected.

I used the same profile and retrained some more. It is working better. It makes no sense to me how the voice recognition can hear "six," know it is "six" (it echoes it to the log) and not equate "six" with "6" somehow since, as I said, the enunciation is exactly the same. I don't say "six" differently than "6." … and how retraining helped that is beyond me.

iceblast · « **Reply #21 on:** October 13, 2019, 02:03:15 PM »

Yeah, the windows speech engine can be frustrating at times. There are words that sound similar, and no matter how clear you try to say the word, it will always hear the other.

The only thing you can do is, retrain the engine for that word, and if that doesn't help, check what VA is hearing, and see if you can use that word to trigger the command.

Before I started using MSP11, I was using the default speech engine. It never heard me correctly. You should have seen my profile for VA, I had a bunch of random words set to run my commands. It worked, it wasn't perfect, but it got the job done. MSP11 doesn't need training, and everything just seems to work. Though some words still give it trouble, the majority are understood properly.
