Topic: HDI Migrate a trained Windows SR engine's knowledge to new Windows install?

ralf44

  • Newbie
  • *
  • Posts: 41
I have trained Windows speech recognition on one computer over several weeks: through the Windows speech training applet, by correcting mis-hearings in dictation mode, and by adding words to the SR dictionary (both to recognise and to ignore).

Where does Windows (7/64) store this info and can I just copy a couple of files across to another Win7/64 install to avoid training that one from scratch? The text entries to the dictionary would be most tedious and difficult to redo, the actual VR exercises less of a bother.

Thanks if anyone knows!
(I've searched Google, my HD and the forum already.)

Gangrel

  • Caffeine Fulled Mod
  • Global Moderator
  • Full Member
  • *****
  • Posts: 216
  • BORK FNORK BORD
Quick Google search gave me this:

Transfering WSR profile

Pfeil

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4759
  • RTFM
The links to the tool in that article are dead, and the Easy Transfer doesn't really offer an option to transfer just the profiles.

This tool seems to be similar to the "WSR Profile Tool" mentioned. Judging by its description, it may do what you want, provided the old machine is still functioning, though it's listed as working for XP.

EDIT: The tool errored out on my machine, but it pointed to the files at "%LOCALAPPDATA%\Microsoft\Speech\Files", a folder that also stores the audio from speech training. I don't know whether transferring those files by themselves would work.

EDIT#2: As I expected, there's a reference to those files in the registry.

The key on my machine is "HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens\{1119241B-E581-422C-B9B9-1038E81E8FBD}\{DAC9F469-0C67-4643-9258-87EC128C5941}\Files". It contains a list of entries whose names are prefixed with "AM" and suffixed with ".am", ".am_bak", ".tpc", ".fpr", or ".fpr_bak"; the data field of each entry references a file in the "Microsoft\Speech\Files\MSASR" folder.
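To get an overview of which kinds of files a profile references, the entry names could be grouped by the suffixes listed above. This is only a sketch: the function name and the sample names are made up for illustration, and ".am_bak" is assumed to carry a leading dot like ".fpr_bak".

```python
# Suffixes mentioned in the post for entries under the profile's "Files" key.
# Assumption: ".am_bak" has a leading dot, like ".fpr_bak".
KNOWN_SUFFIXES = (".am_bak", ".fpr_bak", ".am", ".tpc", ".fpr")


def group_by_suffix(names):
    """Group entry names by known suffix; unknown names go under None."""
    groups = {}
    for name in names:
        lowered = name.lower()
        suffix = next((s for s in KNOWN_SUFFIXES if lowered.endswith(s)), None)
        groups.setdefault(suffix, []).append(name)
    return groups
```

Anything that lands under None is a name the post did not mention and may deserve a closer look before copying.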

That key is partially unique: "{1119241B-E581-422C-B9B9-1038E81E8FBD}" is the ID on my machine (it could be the ID of the speech profile itself), and yours will be different. If you navigate to "HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens\", though, there should only be one key anyway, so it's not hard to find.
({DAC9F469-0C67-4643-9258-87EC128C5941} is not unique, as it's the same on my Windows 7 VM.)
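Finding the machine-specific ID could also be scripted. The sketch below uses Python's standard winreg module (Windows only) to list the subkeys under the Tokens key; the function names are made up for illustration, and it assumes there is exactly one profile, as described above.

```python
import sys

# Path below HKEY_CURRENT_USER, taken from the post.
TOKENS_PATH = r"Software\Microsoft\Speech\RecoProfiles\Tokens"


def list_profile_ids():
    """Return the names of all subkeys under the Tokens key (Windows only)."""
    import winreg  # standard library, but only present on Windows

    ids = []
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, TOKENS_PATH) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        for index in range(subkey_count):
            ids.append(winreg.EnumKey(key, index))
    return ids


def pick_single_profile(ids):
    """Return the only profile ID, or raise if there is not exactly one."""
    if len(ids) != 1:
        raise ValueError(f"expected exactly one profile, found {len(ids)}")
    return ids[0]


if __name__ == "__main__" and sys.platform == "win32":
    print(pick_single_profile(list_profile_ids()))
```

If more than one profile exists, the check fails on purpose, since the script cannot know which one to migrate.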

The same key is also found under "HKEY_USERS", but that's because "HKEY_CURRENT_USER" is a reference pointing to the currently active user's ID (you can copy or modify either; they're the exact same data).

There's also a similar key that references the training audio files: "HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens\{1119241B-E581-422C-B9B9-1038E81E8FBD}\{00000000-0000-0000-0000-000000000000}\Files". Its entries are prefixed with "TrainingAudio-" and point to the files in the "TrainingAudio" directories (as you'd expect).
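Since only the first ID differs between machines, both "Files" key paths can be built from it. A small sketch follows; the function name and the labels "model_files" and "training_audio" are made up for illustration, while the two fixed IDs are the ones quoted above.

```python
TOKENS_ROOT = r"HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens"

# Fixed subkey IDs as seen in the post (the first was identical on a Windows 7 VM).
MODEL_FILES_SUBKEY = "{DAC9F469-0C67-4643-9258-87EC128C5941}"
TRAINING_AUDIO_SUBKEY = "{00000000-0000-0000-0000-000000000000}"


def files_keys(profile_id):
    """Build the two 'Files' registry paths for a machine-specific profile ID."""
    return {
        "model_files": "\\".join(
            [TOKENS_ROOT, profile_id, MODEL_FILES_SUBKEY, "Files"]
        ),
        "training_audio": "\\".join(
            [TOKENS_ROOT, profile_id, TRAINING_AUDIO_SUBKEY, "Files"]
        ),
    }
```

Calling files_keys("{1119241B-E581-422C-B9B9-1038E81E8FBD}") reproduces the two paths from my machine.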

Again, "{1119241B-E581-422C-B9B9-1038E81E8FBD}" is unique to my machine.


So in summary: If you export/import the registry keys under "HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens" and copy/paste the files in "%LOCALAPPDATA%\Microsoft\Speech\Files", you could potentially have a working copy of your speech profile on another machine.
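The two steps from the summary could be scripted roughly as follows. This is an untested sketch, not a confirmed migration procedure: the function names are made up for illustration, the registry part relies on the standard "reg export" / "reg import" commands, and the file part is a plain recursive copy (Python 3.8 or newer for dirs_exist_ok).

```python
import shutil
from pathlib import Path

# "HKCU" is the abbreviation reg.exe accepts for HKEY_CURRENT_USER.
TOKENS_KEY = r"HKCU\Software\Microsoft\Speech\RecoProfiles\Tokens"


def reg_export_command(reg_file):
    """Command for the old machine: export the profile keys to reg_file."""
    # /y overwrites an existing export file without prompting.
    return ["reg", "export", TOKENS_KEY, str(reg_file), "/y"]


def reg_import_command(reg_file):
    """Command for the new machine: import the previously exported keys."""
    return ["reg", "import", str(reg_file)]


def copy_speech_files(src_dir, dst_dir):
    """Copy the whole Speech\\Files tree and return the copied relative paths."""
    shutil.copytree(src_dir, dst_dir, dirs_exist_ok=True)
    return sorted(
        p.relative_to(dst_dir).as_posix()
        for p in Path(dst_dir).rglob("*")
        if p.is_file()
    )
```

On the old machine you would pass reg_export_command("speech_profile.reg") to subprocess.run and copy "%LOCALAPPDATA%\Microsoft\Speech\Files" to removable media; on the new machine, do the reverse with reg_import_command. If the data fields in the exported keys hold absolute paths, a different user name on the new machine might mean editing the .reg file first.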



As a side note, there are also references to the "AM" prefixed name under "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MS-1033-80-DESK\Models\1033\L1033\AMs", which in turn point to files in "%windir%\Speech\Engines\SR\en-US\". However, those also exist on a Windows 7 VM I have (which doesn't actually have a trained speech engine), so there shouldn't be a need to copy those files or that key.
« Last Edit: November 19, 2017, 05:34:29 PM by Pfeil »

ralf44

  • Newbie
  • *
  • Posts: 41
Thanks, it seems MS did not make this trivial to do. @Pfeil I forgot to search the Registry. Good catch and detailed detective work as always. Looking at the file suffixes, I can't guess which might be the user dictionary (or dictionaries), so I will try copying everything you identified.

Exergist

  • Global Moderator
  • Sr. Member
  • *****
  • Posts: 405
  • Ride the lightning
Generally speaking, every time you create a WSR profile, a unique ID is created at "HKEY_CURRENT_USER\Software\Microsoft\Speech\RecoProfiles\Tokens\" (and the ID is also in "HKEY_USERS"). Even if you name a profile "doggies", delete it, and then create a new profile called "doggies", the two IDs will be different.

I believe data entered in the Dictionary is stored in .dat files, and Pfeil already commented about the location for the audio files from voice training.

In general, other speech-related files are found in AppData, with folder paths similar to those mentioned previously. Dig through the registry path Pfeil pointed out to see if you can find other files related to the profile. Bear in mind, though, that the profiles may pull other references from the registry, so you may run into difficulties there.

Check out my post here for how to programmatically teach the Dictionary words with pronunciations. I'll be adding more information and functionality to this as I can, but maybe this will help you get up and running faster.