One thing worth mentioning is that, while VA is also capable of dictation in the proper sense, its main goal is to respond to *known* words/phrases in a custom grammar you create with each profile.
This means it doesn't really matter whether it gets every spoken word 100% right, as long as your commands are built in such a way that it can "guess" the right ones.
So, to give a practical example: if you realize (by looking at the log) that it often hears 'lock' when you meant to say 'look', simply add both options to the command in question, and it will fire anyway.
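To make that a bit more concrete, here is a minimal sketch in plain Python. It is not VoiceAttack's API, just an illustration of the idea; the names `ALIASED_COMMANDS`, `build_lookup`, `resolve` and `press_look_key` are all made up for the example.

```python
# Hypothetical sketch: several recognized phrases trigger the same action.
# Nothing here is VoiceAttack's API; all names are invented for illustration.

ALIASED_COMMANDS = {
    # intended phrase plus the mishearing spotted in the log
    ("look", "lock"): "press_look_key",
}


def build_lookup(commands):
    """Flatten {(phrase, ...): action} into {phrase: action}."""
    lookup = {}
    for phrases, action in commands.items():
        for phrase in phrases:
            lookup[phrase.lower()] = action
    return lookup


def resolve(recognized, lookup):
    """Return the action for a recognized phrase, or None if unknown."""
    return lookup.get(recognized.strip().lower())


LOOKUP = build_lookup(ALIASED_COMMANDS)
```

Here `resolve("lock", LOOKUP)` and `resolve("look", LOOKUP)` both give the same action, so the mishearing no longer matters. In VA itself, if I remember correctly, you can list the alternatives in the command's spoken phrase separated by semicolons, but check the manual for the exact syntax.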
Also, while single-word commands may seem faster, using two short words is much more reliable, since the "guessing" done by the speech engine works better with them: 'go look' instead of 'look' usually gives much more consistent results.
Now, since your goal is to use a programming tool, there is inevitably going to be much more work involved: you are not just triggering buttons, menus and shortcuts, but also entering a lot of "freeform" text in code. You will probably have to prepare a good number of code snippets beforehand and devise some sort of standard (typically using VA's prefix/suffix feature, which is very powerful) to group them according to your own preferred logic.
This is an interesting example: it shows a different piece of software than VA, but it applies the same concepts and might give you some starting ideas. I'm pretty sure, however, that it took him quite some time to get everything in place.
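As for grouping snippets with a prefix/suffix scheme, the idea can be sketched roughly like this in plain Python. Again, this is not how VA is configured (there you set it up in the profile editor); the phrases, the action names `type_text` and `copy_to_clipboard`, and the snippets themselves are assumptions made up for the example.

```python
from itertools import product

# Hypothetical prefix phrases -> what to do with the snippet.
PREFIXES = {
    "insert": "type_text",
    "copy": "copy_to_clipboard",
}

# Hypothetical suffix phrases -> the snippet text itself.
SUFFIXES = {
    "for loop": "for item in items:\n    pass\n",
    "main guard": 'if __name__ == "__main__":\n    main()\n',
}


def build_commands(prefixes, suffixes):
    """Combine every prefix with every suffix into one spoken phrase."""
    commands = {}
    for (prefix, action), (suffix, snippet) in product(
        prefixes.items(), suffixes.items()
    ):
        commands[f"{prefix} {suffix}"] = (action, snippet)
    return commands


SNIPPET_COMMANDS = build_commands(PREFIXES, SUFFIXES)
```

Two prefixes and two suffixes already give four spoken commands ('insert for loop', 'copy main guard', and so on), and every phrase ends up being at least two words long, which also helps recognition. Adding a new snippet means adding one suffix, not one command per prefix.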