Publication Details

Suominen, H., Basilakis, J., Johnson, M., Dawson, L., Hanlen, L., Kelly, B., Yeo, A. & Sanchez, P. (2012). Clinical speech to text: Evaluation setting. CEUR Workshop Proceedings, 1178, 1-5.


Failures in information flow from clinical handover are the leading cause of sentinel events in the USA and are associated with nearly half of all adverse events and over a tenth of preventable adverse events in Australia. Verbal clinical handover provides a good picture of the background clinical history and current state of clinical management of a group of patients cared for by a nursing team. However, all this valuable verbal information is lost after three consecutive shifts if no notes are taken during handover. When traditional note-taking by hand occurs, less than a third of data is transferred correctly after five shifts.

We propose using an automated approach of cascading speech-to-text conversion, standardisation with respect to controlled thesauri, and structuring in accordance with documentation standards. This transcribes verbal handover information into written drafts for subsequent clinical review, editing, and addition to electronic health records.

In this paper, we introduce the evaluation setting for this technology development in a laboratory environment. It ranks a wide range of recording devices, used alone or in combination with headsets and lapel microphones, based on clinicians' preferences and their accuracy in speech-to-text conversion. The sample consists of four student nurses and four experienced academics with diverse clinical specialties and speaking styles. To simulate realistic nursing clinical handovers, twenty handover scenarios have been scripted. The subsequent evaluation in a clinical environment will address speech-to-text conversion, standardisation, and structuring with the short-listed devices in six hospitals, with a sample of thirty authentic handover situations per hospital.

To compare recorder-microphone combinations across all participants, professional-level recording devices are used to record each participant.
The recordings are then played using professional-level speakers across all recorder-microphone combinations to achieve equivalency in voice input. Statistical accuracy in speech-to-text conversion with noise experimentation is used to determine the most accurate combination. Two speech-to-text systems are compared against transcription by hand.

An eighteen-item pre-experimental survey addresses initial perceptions of using the proposed automated approach in clinical settings. This includes participants' opinions on the improvement of clinical handover with the proposed automated approach, their understanding of the related technologies, and perceived problems with the clinical application. An eleven-item post-experimental survey examines device usability with reference to the specific experimental devices. Each participant is asked to complete both surveys and participate in a one-to-one interview. All participants are videoed while using the recording devices and accessing typical device functions to further examine human-device interactions for usability assessment.

We are seeking additional partners to further develop and evaluate the approach and setting.
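
The comparison of speech-to-text output against transcription by hand can be illustrated with a small sketch. The abstract does not name the accuracy measure used, so this example assumes word error rate (WER), a common choice for such comparisons: the word-level edit distance between the hand transcript and the system output, divided by the number of words in the hand transcript. The function names, system labels, and example sentences are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: WER is used as the accuracy measure; the source does not
    specify which metric the study applies.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def rank_systems(reference: str, outputs: dict[str, str]) -> list[tuple[str, float]]:
    """Return (system name, WER) pairs sorted from most to least accurate."""
    scores = {name: word_error_rate(reference, text) for name, text in outputs.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up example data, for illustration only.
hand_transcript = "the patient is stable"
system_outputs = {
    "system_a": "the patient is stable",
    "system_b": "a patient was stable",
}
ranking = rank_systems(hand_transcript, system_outputs)
```

The same ranking function could be applied per recorder-microphone combination instead of per speech-to-text system, by keying the dictionary on the device combination.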