SPEAKER-INDEPENDENT PERCEPTION OF HUMAN SPEECH BY ZEBRA FINCHES
Human speech is a hierarchically organized coding system in which meaningless sounds, called phonemes, are combined into larger meaningful units: words. Formants play an important role in this coding process. They are vocal tract resonances that can be altered rapidly by changing the geometry of the vocal tract with articulators such as the tongue and lips. Changing the formant pattern of an articulation produces a different vowel. Although human voices differ in acoustic parameters such as fundamental frequency and spectral distribution, the relative formant frequencies of an utterance keep speech intelligible regardless of individual variation across speakers. Although it was originally argued that speech is special and uniquely human (Lieberman, 1975), several studies have shown that some aspects of speech perception also apply to other species. Chinchillas, for example, show the same phonetic boundary effect as humans when discriminating between /d/ and /t/ consonant-vowel syllables (Kuhl & Miller, 1975). Nevertheless, there is an ongoing debate about which characteristics of speech production and perception are unique to humans and which are shared with other species (Hauser et al., 2002; Trout, 2003; Pinker & Jackendoff, 2005). In this study (Ohms et al., 2010) we asked whether birds (zebra finches) can distinguish between spoken words that differ minimally in their acoustic features and, if so, which cues they might use to do so. We trained eight zebra finches on a go/no-go operant conditioning task to discriminate between the Dutch words wit (/wɪt/) and wet (/wɛt/), which differ only in their vowels and which were recorded from several native speakers. The results show that zebra finches, when trained to discriminate a single minimal pair, can transfer this discrimination to unfamiliar voices of the same sex and even of the other sex.
When confronted with new voices, the birds' discrimination performance was clearly above chance level from the start. However, our data also revealed a learning process, since performance continued to increase steadily. This suggests that both intrinsic and extrinsic speaker normalization are involved in discriminating between the two words. These results indicate that the capability of normalizing formant patterns across different speakers and sexes is a perceptual trait that occurs not only in humans but also in songbirds.
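The distinction between intrinsic and extrinsic normalization can be illustrated with a minimal sketch. The formant values below are hypothetical round numbers chosen for illustration, not measurements from this study. The two methods shown (a within-token F2/F1 ratio and a per-speaker z-score in the style of Lobanov normalization) are common textbook examples, not a claim about the mechanism the birds actually use. The names `TOKENS`, `intrinsic_ratio`, and `extrinsic_zscores` are likewise assumptions made for this example.

```python
from statistics import mean, pstdev

# Hypothetical (F1, F2) values in Hz for two vowels from two speakers.
# Illustrative numbers only; NOT data from the study.
TOKENS = {
    "speaker_a": {"wit": (400.0, 2000.0), "wet": (600.0, 1800.0)},
    "speaker_b": {"wit": (480.0, 2400.0), "wet": (720.0, 2160.0)},
}


def intrinsic_ratio(f1, f2):
    """Intrinsic normalization: uses only information inside one token."""
    return f2 / f1


def extrinsic_zscores(tokens_for_speaker):
    """Extrinsic normalization: z-score each formant against the same
    speaker's other tokens (Lobanov-style)."""
    words = sorted(tokens_for_speaker)
    normalized = {word: [] for word in words}
    for i in range(2):  # formant index: 0 -> F1, 1 -> F2
        values = [tokens_for_speaker[w][i] for w in words]
        mu, sigma = mean(values), pstdev(values)
        for w in words:
            normalized[w].append((tokens_for_speaker[w][i] - mu) / sigma)
    return {w: tuple(v) for w, v in normalized.items()}


if __name__ == "__main__":
    for speaker, tokens in TOKENS.items():
        ratios = {w: intrinsic_ratio(*f) for w, f in tokens.items()}
        print(speaker, "ratios:", ratios)
        print(speaker, "z-scores:", extrinsic_zscores(tokens))
```

In this toy case the formants of `speaker_b` are those of `speaker_a` scaled by 1.2, so both methods map the two speakers onto the same values. Real voices do not differ by a uniform scale factor, which is one reason speaker normalization is a nontrivial perceptual problem.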
Note from Publisher: This article contains the abstract and references.