
Tuesday, July 24, 2007

Venturing into Vista: JAWS and Speech Recognition

As I wrote in Sunday’s article, the Speech Recognition system was the next bit of Vista that I would explore.  After logging into my new machine, I went to Control Panel and launched the Speech item.  All three screen readers (System Access, JAWS and Window-Eyes) performed well in the main speech recognition window.

 

One of the links on the Speech Recognition page said something like, “Speech Recognition Tutorial” and, as I hadn’t done anything with the Vista speech functionality yet, I decided that the tutorial might be useful.  I had been running System Access as I am trying to spend a week with it as my primary screen reader in order to get a better feel for the gestalt of the Serotek product.  The tutorial works like a wizard and when the first dialogue came up, SA only spoke the “Next” button.  Switching to the Virtual Mouse Cursor did little to help improve matters.

 

I launched Window-Eyes next.  Using its PC cursor, very little spoke properly.  As I am not a proficient WE user, I’ll reserve judgment on its performance in the Speech Recognition program for now.  I will, however, point to an issue I reported last week about what I feel is a general deficiency of Window-Eyes; namely, that it seems to have no global keyboard settings.  I cannot, in my wildest imagination, figure out why anyone would want the keystrokes that move the Mouse Cursor to change from program to program.  I wrote the other day that I found it annoying that System Access had no way to change one’s keymap but in WE, I seem to be forced to go to its hot key dialogue for every separate program I use.  Maybe because I have used JAWS for so long, the logic behind such a peculiar user interface escapes me and, if any WE user out there can tell me how to more efficiently create a set of keystrokes that will work everywhere, please call or write to me soon.  Of course, I could read the Window-Eyes manual which may tell me how to accomplish this task but, in my opinion, this shouldn’t be so hard.

 

Then, I launched JAWS 8.0.  With its PC cursor, JAWS also saw only the “Next” button but, when I switched to the JAWS Cursor, virtually everything read quite nicely.  I do not have a copy of J-Vist from Brian Hartgen and T&T, so I’m using JAWS in its out-of-the-box configuration.  The JAWS “Read in TAB Order” feature (INSERT+B by default) worked tremendously well for going through the Speech Recognition tutorial.

 

Throughout the tutorial, the user is asked to say something into a microphone.  This changes the text in the dialogue a bit, usually providing the text for what the user should say next or, if one has finished a section, it will instruct the user to say “next” to go to the next dialogue in the tutorial.

 

I have used Dragon NaturallySpeaking for a pretty long time now.  If you have RSI problems as bad as mine, dictation software provides a healthier, albeit slower, means of entering information into a computer.  It also improves your spelling tremendously, as a dictation program takes each word you say straight from its dictionary, where everything is spelled correctly.  I type very quickly, so dictation slows me down a lot, as it is difficult to think while talking (I believe Will Pearson wrote a comment to a BC article a while back explaining why one can think while typing more easily than while talking).  The alternative, though, means that I will find my hands, wrists, forearms and shoulders screaming in pain and I will have to lay off for two or three days.  Thus, speech recognition is very important to me.

 

Using JAWS, I completed the tutorial pretty quickly.  I then started exploring the rest of the Vista Voice Recognition functionality and, in every part of the program I tried, JAWS performed pretty well.  Even without special custom scripts, JAWS works better in every area I tried than it does (again without scripts) in the Dragon product.  The best performance I’ve seen with any screen reader using voice recognition is JAWS with Dragon Pro and jSay from T&T.  In the past, using Microsoft Word in “full screen” view, I have found that turning on “echo all” in JAWS provides a kludgerous way to use it with Dragon.  It works reasonably well with the $99 version of Dragon NaturallySpeaking, which drastically cuts down on the overall cost of using voice recognition in Windows XP and earlier.

 

I must commend the Microsoft people on the quality of their Vista voice recognition facility.  Without having trained the recognition system (something I plan on doing today or tomorrow), the Vista facility works very nicely.  I was able to issue quite a few commands and hear JAWS announce that programs started, that menus activated, etc. 

 

I didn’t return to either Window-Eyes or System Access yesterday as my work time had finished and, even with dictation, I try to keep my computer usage to a scheduled period of time so I can pretend that I have a life but, mostly, so I can read books, listen to the radio and play with the dogs.

 

So, even without special scripts, JAWS won the day in a part of the OS that I find particularly useful.  As I suggest above, Window-Eyes might also work reasonably well in the voice recognition features, but I grew so frustrated trying to understand its keymap editor that I stopped using WE for now.  I will return to it when I feel a bit more patient.

 

Today, I plan on trying out dictation in Word 2007.  Wish me luck…

 

-- End

3 Comments:

Anonymous Anonymous said...

Chris,

First of all, as another software developer, I am surprised you wouldn't take the time to read any of the Window-Eyes help documentation. It is very well written and should answer any questions you might have. The Window-Eyes interface is far from perfect, and having never read the manual, I can see where you would get confused about the keymap. Since you are going off hearsay from others who don't use the program, I will do my best to explain how the system currently functions.

A set file contains the main configuration options for a program. Within the set are entries for what settings should be local and global. Currently, only the voice and verbosity settings can be globally managed; meaning the keyboard settings are all viewed as local to each set. Until this changes, here is something to try. I will keep these steps brief since the manual explains how these utilities work:

1. Pull up the Window-Eyes control panel.
2. From the File menu, choose "Set to Text."
3. Enter the name of the set for the application you modified, e.g. wineyes.000 or word12.000.
4. In the output box, enter something that is easy to find like c:\settings.txt.
5. Press OK to convert the settings to text.
6. Open the resulting text file in a text editor.
7. Look for the line that starts with
; Menu 4 - Hot Key Settings
This should be around line 82 of the file.
8. Delete the text from the beginning of the file to that line.
9. Now look for the line containing
; Menu 5 - Cursor Key Settings
For me, this is line 357.
10. Delete from the beginning of this line to the end of the file. You should now be left with a text file containing only hotkey definitions.
11. Save this file as something like c:\keyboard.txt.
12. Open the Window-Eyes control panel again.
13. From the file menu, choose "text to set."
14. In the input filename dialog, enter the name of the definition file you just saved. In the output field, enter something like *.0?? to apply the keyboard configuration to all set files.
15. Press OK, and wait for about ten seconds.
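
For anyone who would rather not do the trimming in steps 7 through 11 by hand, here is a minimal sketch in Python of the same edit. It is not part of Window-Eyes and makes several assumptions: that the exported text file contains the two marker lines exactly as quoted above, that keeping the "Menu 4" heading line itself is harmless, and that the file can be read and written as Latin-1 without damage. The function names are made up for this illustration.

```python
from pathlib import Path

# Marker lines as quoted in steps 7 and 9 above.
START_MARKER = "; Menu 4 - Hot Key Settings"
END_MARKER = "; Menu 5 - Cursor Key Settings"


def extract_hotkey_section(text: str) -> str:
    """Keep the lines from START_MARKER up to, but not including, END_MARKER."""
    lines = text.splitlines(keepends=True)

    # Step 7: find the line that starts with the hot key heading.
    start = next(
        (i for i, line in enumerate(lines) if line.startswith(START_MARKER)),
        None,
    )
    if start is None:
        raise ValueError("hot key section marker not found")

    # Step 9: find the cursor key heading that follows it.
    end = next(
        (i for i in range(start + 1, len(lines)) if END_MARKER in lines[i]),
        None,
    )
    if end is None:
        raise ValueError("cursor key section marker not found")

    # Steps 8 and 10: drop everything before the first marker and
    # everything from the second marker onward.
    # Assumption: the "Menu 4" heading line itself is kept.
    return "".join(lines[start:end])


def trim_settings_file(source: str, destination: str) -> None:
    """Step 11: write only the hot key definitions to a new file."""
    # Assumption: Latin-1 is used so that every byte round-trips unchanged.
    text = Path(source).read_text(encoding="latin-1")
    Path(destination).write_text(extract_hotkey_section(text), encoding="latin-1")
```

With the example file names used above, the call would be trim_settings_file(r"c:\settings.txt", r"c:\keyboard.txt"), after which steps 12 through 15 proceed as written. Searching for the markers rather than relying on line numbers matters because, as noted in the steps, the line positions differ from one set file to another.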

That should be all you have to do. I realize I will receive flak because this isn't as easy as editing a jkm, and the critics would be absolutely right on that score. However, I need to point out that the approaches GW Micro and Freedom Scientific have taken to their customization greatly differ. The tradeoff to Window-Eyes' fewer settings files per application is obviously that more work needs to be done to apply something across the board. It has been my experience, though, that sharing JFW customizations is no walk in the park. Granted, the last one I was "allowed" to use, according to the brilliant idiots in the Freedom Scientific sales department, was 5.1, so this may have changed since that version of the software.
Additionally, GW Micro has modularized the way one adds a keyboard layout selection to Window-Eyes, so there may be an even easier way to save a layout to a text file and apply it through the keyboard menu. I haven't needed to adjust hotkeys (other than switching between the desktop and laptop layouts) so can't provide comments as to how intuitive the process is for third parties. I would suggest giving the above steps a try and seeing how your new layout works with WE. I will talk to GW Micro and ask about an easier process to do these kinds of things, since I agree this is definitely not the easiest way to go about it. They are working on the next major release of Window-Eyes, and this kind of feature would doubtless appeal to people who, despite all common sense, don't want to read the accompanying documentation. Good luck.

Steve from Texas

9:12 AM  
Anonymous Brian Hartgen said...

Hi Chris.

I was interested to read about your experiences with the speech recognition component of Windows Vista. A few comments if I may from someone who has obviously had to use this tool without scripts in order to find out what I needed to do to write them!

1. I really do not think that using the tutorial in the way you did is the best method for getting the most accurate recognition very quickly. I recommend that you undertake:
A. The lead up to the volume check, i.e. the microphone positioning screens, selecting the headset, etc.
B. The volume check. This requires the reading of three short sentences into the microphone.
C. The ten minute training passage.
While the option for more training is available, as with Dragon best results are achieved by educating the software, either through its correction system or by "feeding" it words through its internal dictionary.

2. As a general point of reference, we do not within J-Vist support any of the commands which are part of Windows Vista speech. We support:
A. The lead up to the volume check as previously described.
B. The volume check.
C. The training.
D. Dynamic echo of dictated text.
E. Correction and spelling.

3. I find from a personal point of view that, using Dragon V9.5 without undertaking the training, I can get close to 95 per cent accuracy, and with training and educating a great deal more than that. I also find the whole dictation experience with Dragon very smooth and easy to do.
I find with Windows Vista Speech I get about 85 per cent with training. I don't believe that results are too favourable without training, and because it only takes a few minutes it may as well be done. To be fair to Windows Vista Speech however, I delivered a lot of demonstrations of it last week in challenging environments and it did not let me down.

4. It is worth saying in conclusion that there would be no difficulty producing a similar product to J-Vist for at least Window-Eyes. What is stopping that happening is that, unlike JAWS, Window-Eyes does not give one the ability to lock the work down to a specific serial number. So the hours of investment time involved would not be worthwhile.

Thank you for reading.

brian@hartgen.org

2:08 PM  
Anonymous Jake said...

Wow! I can't wait to try out this new speech recognition feature, or any of Windows Vista for that matter. I start my job very soon and I might get a chance to at least have a sneak preview of Windows Vista. I was talking with someone yesterday at the office where I will work, and she mentioned that they recently got Windows Vista installed on at least one, maybe two, of their computers. Plus, I may be getting a new home PC as I am moving out of my current apartment in about a month.

10:14 AM  
