Author Topic: Slurred speech into text into text-to-speech? (Read 221 times)

Plum · « **on:** July 04, 2024, 07:07:51 PM »

Hi all, my brother plays a lot of DOTA and was the in-game shotcaller for his group of friends. Unfortunately, over the last couple of years illness has greatly affected his ability to speak clearly, and his speech is now slow and slurred. His friends have set him up with a text-to-speech system on Discord, which he can use to talk while they play games, but obviously flipping to Discord and typing sentences is problematic mid-game.

I remember when setting up VA for myself that I had to spend a while training the system to recognise my words (I think I used the default Windows voice recognition app). Which made me wonder:

1. Is it feasible for him to train that (or some other voice suite) to recognise what he is trying to say?
2. Can that then be combined with VA while he's playing DOTA, in order to pass the words into Discord to trigger the text-to-speech system there?

Ideally, he would be able to speak, have his words converted to text, and have that text converted back to clear speech, all without having to leave his game.

I'm open to other ideas if anyone has solved this in another way. Thanks in advance.

Pfeil · « **Reply #1 on:** July 05, 2024, 02:16:11 AM »

While it is technically possible to have the Microsoft SAPI speech recognition system transcribe freeform speech (I.E. not predefined like normal commands), E.G. using VoiceAttack's dictation mode, getting accurate results requires as close to ideal circumstances as possible. That includes a quiet environment and a microphone well suited to speech recognition, but also clear and consistent pronunciation and enunciation.

They can certainly try speech recognition training, and see whether the resulting accuracy after many training sessions (three is considered the absolute minimum) is anywhere close to what would be practical.

Do also keep in mind that any spoken words must be known to the speech recognition system.
The Microsoft SAPI speech recognition system comes with a default dictionary for the language the engine is intended for, but that can be expanded by manually adding words (and ideally pronunciations) to it.
To add to the dictionary, click the wrench icon on VoiceAttack's main window, and on the "Recognition" tab of the VoiceAttack options window click the "Utilities >" button, then choose "Add/Remove Dictionary Words".

As to text-to-speech, Windows requires applications to have focus to receive normal keyboard input, I.E. you'd need to bring the Discord window to the foreground to be able to type into it.
However, VoiceAttack can perform text-to-speech using SAPI-compliant (or Speech Platform 11) text-to-speech engines.
Combining that with something like a virtual audio cable, or VoiceMeeter (which is unrelated to VoiceAttack), would allow VoiceAttack to directly "speak" into Discord by having audio from a playback device sent to a recording device.
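
To illustrate the difference between freeform dictation and a small set of predefined trigger words, here is a minimal Python sketch of the second approach. It is not how VoiceAttack works internally; it only shows the idea of mapping a short recognised word to a full sentence that a text-to-speech engine could then speak. The `CALLOUTS` table, the example phrases, and the `expand_trigger` function are all hypothetical names made up for this illustration, and the fuzzy matching uses Python's standard `difflib` module as one possible way to tolerate slightly misrecognised words.

```python
import difflib

# Hypothetical trigger words mapped to the full callout that would be spoken.
# These phrases are examples only; the real list would come from the player.
CALLOUTS = {
    "push": "Group up and push mid.",
    "back": "Back off, do not fight.",
    "roshan": "Go to Roshan now.",
    "smoke": "Use smoke and gank.",
}


def expand_trigger(heard, cutoff=0.6):
    """Return the callout for the closest trigger word, or None if nothing is close.

    `heard` is the text the recogniser produced. `cutoff` is the minimum
    similarity (0 to 1) that difflib requires before accepting a match.
    """
    word = heard.strip().lower()
    matches = difflib.get_close_matches(word, list(CALLOUTS), n=1, cutoff=cutoff)
    return CALLOUTS[matches[0]] if matches else None


if __name__ == "__main__":
    # A slightly misrecognised word still resolves to the intended callout.
    print(expand_trigger("rosha"))
    # Something unrelated is rejected rather than guessed at.
    print(expand_trigger("xyzzy"))
```

Keeping the list of trigger words short and making them sound clearly different from one another should make each one easier to tell apart. The returned sentence would still need to be passed to a text-to-speech engine and routed into Discord as described above.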

Plum · « **Reply #2 on:** July 06, 2024, 02:34:43 PM »

Ah, it sounds like it's not so simple as I thought. I think we'll try the first step that you mentioned - just getting it to recognise certain keywords that we've manually added to the dictionary, and if we can get that far then we'll figure out what we can do with that. Even if we can't do full sentences, getting some trigger words to work would be a useful step forward. Maybe use an old laptop that handles the VA and Discord interaction, while he games on a separate machine entirely (not that your other solutions don't sound feasible, but neither of us is super techy and I'm sure someone in the family must have an old laptop going unused).

A big thank you for giving us a direction to try!
