Burnham, D, Reynolds, J, Vatikiotis-Bateson, E, Yehia, H, Ciocca, V, Morris, R, Haszard, Hill, H, Vignali, G, Bollwerk, S, Tam, H & Jones, C, The perception and production of phones and tones: The role of rigid and non-rigid face and head motion, In Yehia, H (Ed), Proceedings of the 7th International Seminar on Speech Production, 2006, p 1-8, Brazil: CEFALA.


There is evidence, mostly with phones (consonants and vowels), that visual concomitants of articulation facilitate speech perception. Here the visual concomitants of lexical tone are considered. In tone languages, fundamental frequency variations signal lexical meaning. In a word identification experiment with auditory-visual words differing only in tone, Cantonese perceivers performed above chance in a Visual Only condition. A subsequent study showed augmentation of word pair discrimination in noise in an Auditory-Visual compared to an Auditory Only condition for Cantonese speakers, tonal Thai speakers, and even non-tone Australian speakers. The source of this perceptual information was sought in an OPTOTRAK production study of a Cantonese speaker. Functional Data Analysis (FDA) and Principal Component (PC) extraction suggest that the salient PCs for distinguishing tones involve rigid motion of the head rather than non-rigid face motion. Results of a final perception study using OPTOTRAK output, in which rigid or non-rigid motion could be presented independently in tone-differing or phone-differing conditions, suggest that non-rigid motion is most useful for the discrimination of phones, whereas rigid motion is most useful for the discrimination of tones.

