Objective: Evaluate the performance of a continuous-speech interface to a decision support system.
Design: The authors performed a prospective evaluation of a speech interface that matches unconstrained utterances of physicians with controlled-vocabulary terms from Quick Medical Reference (QMR). The performance of the speech interface was assessed in two stages. In the first stage, a real-time experiment, physician subjects viewed audiovisual stimuli intended to evoke clinical findings, spoke a description of each finding into the speech interface, and then chose, from a list generated by the interface, the QMR term that most closely matched the finding. Subjects believed that the speech recognizer decoded their utterances; in reality, a hidden experimenter typed the utterances into the interface (Wizard-of-Oz experimental design). In the second stage, the authors replayed the same utterances through the speech recognizer and measured how accurately the utterances were matched to appropriate QMR terms, using the results of the real-time experiment as the "gold standard."
Measurements: The authors measured how accurately the speech-recognition system converted input utterances to text strings (recognition accuracy) and how accurately the speech interface matched input utterances to appropriate QMR terms (semantic accuracy).
Results: Overall recognition accuracy was less than 50%. However, when language-processing techniques were used to match keywords in recognized utterances to keywords in QMR terms, the semantic accuracy of the system was 81%.
Conclusions: Reasonable semantic accuracy was attained when language-processing techniques were used to compensate for speech misrecognition. In addition, the Wizard-of-Oz experimental design offered many advantages for this evaluation. The authors believe that this technique may be useful to future evaluators of speech-input systems.
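
Illustration: The keyword matching and semantic-accuracy measurement summarized above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The stop-word list, the overlap-count scoring, the function names, and the example terms and utterances are all hypothetical, and the sketch assumes an utterance counts as correct when its gold-standard term appears anywhere in the ranked list; the abstract does not specify any of these details.

```python
import re

# Hypothetical stop-word list; the abstract does not say which words were ignored.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "with", "and", "is", "has"}


def keywords(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def rank_terms(utterance, terms, top_n=5):
    """Rank vocabulary terms by the number of keywords shared with the utterance.

    Terms sharing no keyword are dropped; ties are broken alphabetically.
    """
    spoken = keywords(utterance)
    scored = [(len(spoken & keywords(term)), term) for term in terms]
    scored = [(score, term) for score, term in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_n]]


def semantic_accuracy(recognized, gold, terms, top_n=5):
    """Fraction of utterances whose gold-standard term appears in the ranked list."""
    if not gold or len(recognized) != len(gold):
        raise ValueError("recognized and gold must be non-empty and equal in length")
    hits = sum(
        1 for utt, g in zip(recognized, gold) if g in rank_terms(utt, terms, top_n)
    )
    return hits / len(gold)


# Invented example data, not actual QMR terms or study utterances.
TERMS = ["CHEST PAIN SUBSTERNAL", "COUGH PRODUCTIVE", "FEVER"]
RECOGNIZED = [
    "the patient has chess pain substernal",  # "chest" misrecognized as "chess"
    "a productive cough",
    "fee for",                                # "fever" misrecognized entirely
]
GOLD = ["CHEST PAIN SUBSTERNAL", "COUGH PRODUCTIVE", "FEVER"]

if __name__ == "__main__":
    for utt in RECOGNIZED:
        print(utt, "->", rank_terms(utt, TERMS))
    print("semantic accuracy:", semantic_accuracy(RECOGNIZED, GOLD, TERMS))
```

In the first invented utterance, one word is misrecognized, yet the two surviving keywords still retrieve the intended term. This is the sense in which keyword matching can tolerate recognition errors: semantic accuracy can exceed recognition accuracy because only some of the words in an utterance need to be decoded correctly.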