Good that you got it working.
Individual letters, like short phrases in general, can be difficult to recognize accurately, as there isn't much data to go on. Therefore, having a well-trained speech recognition profile is a necessity.
Now that you have at least some understanding of using tokens, you could potentially optimize your command somewhat.
E.g.,
Say, 'Say, the letter of drive' (and wait until it completes)
Wait for spoken response: 'A;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W'
Set Windows clipboard to '{TXT:Drive}:\'
Press Left Ctrl+V keys and hold for 0,05 seconds and release
would use the letter you spoke, directly.
If you're using a non-zero "Timeout" value for the "Wait For Spoken Response" action, you may still want to use a condition to check whether a value was actually spoken, rather than the action timing out.
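To illustrate that check outside of VoiceAttack, here is a minimal Python sketch of the same logic. It is not VoiceAttack syntax. The function name `build_drive_path` is made up for the example, and using `None` or an empty string to represent a timed-out response is an assumption; what the token actually contains after a timeout isn't modeled here.

```python
from typing import Optional

# The same responses the "Wait for spoken response" step above accepts.
ALLOWED_LETTERS = "A;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W".split(";")


def build_drive_path(spoken: Optional[str]) -> Optional[str]:
    """Return a drive path such as 'D:\\', or None if no usable letter was spoken.

    `spoken` stands in for the value of the {TXT:Drive} token. None or an
    empty string models the action timing out (an assumption of this sketch).
    """
    if not spoken:
        # Nothing was captured, so there is nothing to paste.
        return None
    letter = spoken.strip().upper()
    if letter not in ALLOWED_LETTERS:
        # Anything outside the accepted list is treated like a timeout.
        return None
    # Mirrors the '{TXT:Drive}:\' clipboard text from the command above.
    return f"{letter}:\\"
```

In the command itself, this would roughly correspond to wrapping the clipboard and paste actions in a condition, so they only run when a letter was actually recognized.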