This volume comprises refereed papers and abstracts of the 8th International Conference on the Evolution of Language (EVOLANG8), held in Utrecht on 14–17 April 2010. As the leading international conference in the field, the biennial EVOLANG meeting is characterized by an invigorating, multidisciplinary approach to the origins and evolution of human language, and brings together researchers from many subject areas, including anthropology, archaeology, biology, cognitive science, computer science, genetics, linguistics, neuroscience, palaeontology, primatology and psychology.
The latest theoretical, experimental and modelling research on language evolution is presented in this collection, including contributions from many leading scientists in the field.
Sample Chapter(s)
Chapter 1: Is Grammaticalization Glossogenetic?
https://doi.org/10.1142/9789814295222_fmatter
The following sections are included:
https://doi.org/10.1142/9789814295222_0001
It has recently been suggested that grammaticalization can be fruitfully explained by the glossogenetic mechanisms for language evolution and historical change. Contrary to this position, it is here argued that the incorporation of grammaticalization processes in the glossogenetic ontology is far from unproblematic.
https://doi.org/10.1142/9789814295222_0002
Languages are far more complex than they need to be for one-to-one communication. This paper attempts to answer the question as to why that should be. The answer, it is suggested, lies in the evolution of story-telling, legend and myth as culturally-important means of expression. Myth may not mark the dawn of proto- or rudimentary language, or even the beginnings of full language, but its existence accounts at least in part for the evolution of linguistic complexity. Language co-evolved with mythology in symbolic frameworks which extended, to the limits of cognition, the capacity for verbal expression.
https://doi.org/10.1142/9789814295222_0003
Present-day people derive pleasure from rhymes, rhythms and repetitive visual patterns, that is, from instances of similarity. Similarity is the basis for grouping items into categories and so setting up abstract general concepts such as ripeness or weight. In present times, such grouping by similarity is a source of pleasure; the current plethora of concepts and words denoting them derives partly from pleasure in forming them. Then the question arises: how far back in prehistory has this pleasure been a motivation? Both beads and handaxes suggest by their symmetry that hominins may have derived this pleasure-in-the-head or internal reward as far back as Acheulian time: at this time, the motivation to construct abstract general concepts and thus expand language may have already been present. The timing is open to question but the pattern of inference connecting symmetry to language is particularly direct.
https://doi.org/10.1142/9789814295222_0004
Understanding language evolution in terms of cultural transmission across generations of language users raises the possibility that some of the processes that have shaped language evolution can also be observed in historical language change. In this paper, we explore how constraints on production may affect the cultural evolution of language by analyzing the emergence of the Romance languages from Latin. Specifically, we focus on the change from Latin's flexible but OV (Object-Verb) dominant word order with complex case marking to fixed SVO (Subject-Verb-Object) word order with little or no noun inflections in Romance Languages. We suggest that constraints on second language learners' ability to produce sentences may help explain this historical change. We conclude that historical data on linguistic change can provide a useful source of information relevant to investigating the cognitive constraints that affect the cultural evolution of language.
https://doi.org/10.1142/9789814295222_0005
Among the many puzzling questions about language, two are salient: First, why are there any languages at all, evidently unique to the human lineage? Second, why are there so many languages? These are in fact the basic questions of origin and variation that so occupied Darwin and other evolutionary thinkers and comprise modern biology's explanatory core: why do we observe this particular array of living forms in the world and not others – the key problem of reconciling the underlying unity of organisms with their apparent diversity, invariance and variation. Here we examine these two questions from the viewpoint of modern linguistics, biology, and dynamical systems theory.…
https://doi.org/10.1142/9789814295222_0006
Human language is the result of a cascade of consequences from an initial mutation which provided a new "representational" capacity to some mirror neurons. This initial mutation has high evolvability. The mutation coincidentally allowed representations of the two substances of signs to meet in human brains, thus accounting directly for signs. Recursivity is a result of the self-organization triggered by the chaotic system that emerged from this system of signs.
https://doi.org/10.1142/9789814295222_0007
Humans acquire far more of their behaviour from conspecifics via culture than any other species. Our culture is larger because it accumulates, whereas other species' cultures seem to stay approximately the same size (Tomasello, 1999). This chapter attempts to clarify the problem of cultural accumulation by distinguishing between the size of a culture that can be transmitted from one generation to the next, and the extent of culture transmitted. A culture's size is determined largely by ecological constraints, and certainly hominins (and some other species) show adaptations to facilitate this. But I claim that the exponential accumulation hypothesised by Tomasello (1999) cannot be accounted for this way; rather, it is a consequence of increasing information value in semantic components. This process can be achieved through memetics: semantics will be selected for which transmits the most information. Thus cultural evolution achieves compression of information, generating increased extent in culture even while maintaining a fixed size. I support my argument with evidence from simulations explaining the size of culture (Čače and Bryson, 2007), and simulations demonstrating selection for increased extent (Kirby, 1999).
https://doi.org/10.1142/9789814295222_0008
Language learning is an iterative process, with each learner learning from other learners. Analysis of this process of iterated learning with chains of Bayesian agents, each of whom learns from one agent and teaches the next, shows that it converges to a distribution over languages that reflects the inductive biases of the learners. However, if agents are taught by multiple members of the previous generation, who potentially speak different languages, then a single language quickly dominates the population. In this work, we consider a setting where agents learn from multiple teachers, but are allowed to learn multiple languages. We show that if agents have a sufficiently strong expectation that multiple languages are being spoken, we reproduce the effects of inductive biases on the outcome of iterated learning seen with chains of agents.
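The chain result referred to above can be illustrated with a toy simulation (a minimal sketch under assumed parameters, not the authors' model): two candidate languages, a noisy teacher, and a learner who samples a language from its Bayesian posterior. For such "sampling" learners, the stationary distribution of the chain over languages matches the learners' prior.

```python
import random

random.seed(0)

# Assumed toy parameters: two languages (0 and 1), noisy production.
PRIOR = [0.7, 0.3]   # learner's prior over the two languages
NOISE = 0.1          # probability a teacher utters the "wrong" form
N_DATA = 1           # utterances each learner observes

def produce(lang):
    """Teacher emits an utterance; with probability NOISE it comes from the other language."""
    return lang if random.random() > NOISE else 1 - lang

def learn(data):
    """Bayesian learner: compute the posterior over languages, then sample from it."""
    post = list(PRIOR)
    for d in data:
        for h in (0, 1):
            post[h] *= (1 - NOISE) if d == h else NOISE
    z = post[0] + post[1]
    return 0 if random.random() < post[0] / z else 1

def iterate(generations=20000):
    """Run a single transmission chain and record how often each language is spoken."""
    lang = 0
    counts = [0, 0]
    for _ in range(generations):
        data = [produce(lang) for _ in range(N_DATA)]
        lang = learn(data)
        counts[lang] += 1
    return [c / generations for c in counts]

freqs = iterate()
# For a sampling learner, the long-run frequencies approximate the prior.
```

Running the chain long enough, `freqs` comes out close to `PRIOR`, which is the sense in which iterated learning "reveals" the learners' inductive biases.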
https://doi.org/10.1142/9789814295222_0009
This paper questions the assumption that subject-verb (SV) structures are basic and primary and shows instead that these apparently simple structures are quite complex informationally, intonationally, semantically, and syntactically. In contrast, we point out that verb-subject (VS) structures, particularly those involving unaccusative verbs and sentence focus, are simpler and better candidates for primary structures from an evolutionary point of view. From this perspective, Agent-first (SV) structures, which have been mentioned as examples of protolinguistic "fossils" (e.g. Jackendoff 2002), are not as basic as previously thought.
https://doi.org/10.1142/9789814295222_0010
For the last two decades, a major question for paleoanthropologists has been the origins of modernity and modern thinking. Explanations such as symbolic culture, fully syntactic language, or abstract reasoning are all too often proffered without clear or adequate operationalizations. It is the purpose of the present paper to suggest both an evolutionary cognitive basis for one aspect of modern thinking and modern language, metaphor, and to offer a potential neurological substrate.
In our attempt to trace the evolution of a more circumscribed component of modern cognition, we think the candidate trait should be shared, at least in part, with our closest nonhuman primate relatives. The trait should also be evident early in human ontogeny, and there should be some specifiable and demonstrable neurological substrate. Finally, there should be evidence that the trait unambiguously sets a foundation for modern thinking. We think this trait is numerosity, i.e., the ability to think about and reason with numbers.
https://doi.org/10.1142/9789814295222_0011
A computational language game model is presented that shows how a population of language users can evolve from a brightness-based to a brightness+hue-based color term system. The shift is triggered by a change in the communication challenges posed by the environment, comparable to what happened in English during the Middle English period in response to the rise of dyeing and textile manufacturing c. 1150–1500. In a previous model that is able to explain such a shift, these two color categorization strategies were explicitly represented. This is not needed in our model. Instead, whether a population evolves a brightness- or a hue-based system is an emergent phenomenon that depends only on environmental factors. In this way, the model provides an explanation of how such a shift may come about without introducing additional mechanisms that would require further explanation.
https://doi.org/10.1142/9789814295222_0012
This paper presents a preliminary description and analysis of prosodic features (amplitude, duration and rhythm) observed in northern muriqui vocalizations. The northern muriqui (Brachyteles hypoxanthus) is an endangered primate species which lives in the Atlantic forests of Minas Gerais and Espírito Santo, Brazil.
https://doi.org/10.1142/9789814295222_0013
Individuals devote one third of their language time to mentioning unexpected events. We try to make sense of this universal behaviour within the Costly Signalling framework. By systematically using language to point to the unexpected, individuals send a signal that advertises their ability to anticipate danger. This shift in display behaviour, as compared with typical displays in primate species, may result from the use by hominins of artefacts to kill.
https://doi.org/10.1142/9789814295222_0014
Language is a defining characteristic of the biological species Homo sapiens. But Chomskian Universal Grammar is not what is innate about language; Universal Grammar requires magical thinking about genes and genetics. Constraints of universal grammar are better explained in an evolutionary context by processes inherent in symbols, and by such processes as syntactic carpentry, metaphor, and grammaticalization. We present an evolutionary timeline for language, with biological evidence for the long-term evolution of the human capacity for language, and for the co-evolution of language and the brain.
https://doi.org/10.1142/9789814295222_0015
The following sections are included:
https://doi.org/10.1142/9789814295222_0016
The following sections are included:
https://doi.org/10.1142/9789814295222_0017
There is currently considerable disagreement about the value of different theoretical accounts that have been employed to explain the evolution of communication. In this paper, we review some of the core tenets of the 'adaptationist' and the 'informational' accounts. We argue that the former has its strength mainly in explaining the evolution of signals and the maintenance of honest signaling, while the latter is indispensable for understanding the cognitive mechanisms underpinning signal usage, structure, and comprehension. Importantly, an informational account that incorporates linguistic concepts is a necessary prerequisite for identifying which design features of language are shared with nonhuman primates or other animals, and which ones constitute derived traits specific to the human lineage.
https://doi.org/10.1142/9789814295222_0018
Because language doesn't fossilize, it is difficult to unambiguously time the evolutionary events leading to language in the human lineage using traditional paleontological data. Furthermore, techniques from historical linguistics are generally seen to have insufficient time depth to tell us anything about the nature of pre-modern-human language. Thus hypotheses about early stages of language evolution have often been seen as untestable "fairy tales". However, the discovery of human-unique alleles associated with different aspects of language offers a way out of this impasse. If an allele has been subjected to powerful selection, reaching or nearing fixation, statistical techniques allow us to approximately date the timing of the selective sweep. This technique has been employed to date the selective sweep associated with FOXP2, our current best example of a gene associated with spoken language. Although the dates themselves are subject to considerable error, a series of different dates for different language-associated genes provides a powerful means of testing evolutionary models of language, if those models are explicit and span the complete time period from our separation from chimpanzees to the present. We illustrate the potential of this approach by deriving explicit timing predictions from four contrasting models of "protolanguage." For example, models of musical protolanguage suggest that vocal control came early, while gestural protolanguage sees speech as a late addition. Donald's mimetic protolanguage argues that these should appear at the same time, and further suggests that this was associated with Homo erectus.
Although there are too few language-associated genes currently known to resolve the issue now, recent progress in the genetic basis for dyslexia and autism offers considerable hope that a suite of such genes will soon be available, and we offer this theoretical framework both in anticipation of this time, and to spur those developing hypotheses of language evolution to make them explicit enough to be integrated within such a hypothesis-testing framework.
https://doi.org/10.1142/9789814295222_0019
Slavic aspect has remained a mystery for centuries and continues to fascinate linguists. The genesis of this intricate grammatical category is an even greater puzzle. This paper aims at computationally reconstructing the prerequisite for aspect – the emergence of a system of markers for Aktionsarten. We present an experiment in which artificial language users develop a conventional system as the consequence of their distributed choices in locally situated communicative acts.
https://doi.org/10.1142/9789814295222_0020
This paper reviews findings on comparative primate and animal cognition. It suggests that although modern human linguistic, cognitive, and motor behaviors differ profoundly from those of great apes, primarily with respect to required mental constructional skills, an early hominin with ape-like capacities could have used non-innate, referential signals. To determine the most probable selective agents that may have motivated these first steps towards language evolution, it is necessary to look beyond the non-human primates to a wider range of animal species. When this is done, foraging adaptations emerge as the most probable selective agents for cooperative breeding and for the cognitive and behavioral suite that would eventually lead to language.
https://doi.org/10.1142/9789814295222_0021
This paper adopts the category game model, which simulates the coevolution of categories and their word labels, to explore the effect of social structure on linguistic categorization. Instead of detailed social connections, we adopt social popularities, the probabilities with which individuals participate in language games, to denote quantitatively the general characteristics of social structures. The simulation results show that a certain degree of social scaling can accelerate the categorization process, while a much higher degree of social scaling will greatly delay it.
https://doi.org/10.1142/9789814295222_0022
Cultural transmission is the primary medium of linguistic interaction. We propose an acquisition framework that involves the major forms of cultural transmission: vertical, oblique and horizontal. By manipulating the ratios of these forms of transmission in the total number of transmissions across generations of individuals, we analyze their roles in language evolution, based on a lexicon-syntax coevolution model. The simulation results indicate that all these forms of transmission collectively lead to a dynamic equilibrium of language evolution across generations.
https://doi.org/10.1142/9789814295222_0023
When evolutionary biologists and epistemologists investigate the evolution of life, they deconstruct the problem into three research areas: they search for the units, levels and mechanisms of life's evolution. Here, it is investigated how a similar approach can be applied to evolutionary linguistics. A methodology is proposed that allows us to identify and further investigate the units, levels and mechanisms of language evolution.
https://doi.org/10.1142/9789814295222_0024
Through a constructive study of grammaticalization as a potentially important process of language evolution, we report two findings. One is that linguistic analogy, which extends the application of linguistic rules, is critical for language acquisition and meaning change. The other is that inferences based on the recognition of similarity and contingency among particular meanings can realize unidirectional meaning change, a remarkable characteristic of grammaticalization. We discuss the significance of these findings in the context of the origin and evolution of language, especially the role of linguistic analogy in creativity. Based on the discussion, a hypothetical scenario of the origin and evolution of language is proposed.
https://doi.org/10.1142/9789814295222_0025
In the early seventies, the bio-mathematician George Price developed a simple and concise mathematical description of evolutionary processes that abstracts away from the specific properties of biological evolution. In this talk I argue that Price's framework is well-suited to modelling various aspects of the cultural evolution of language. The first part of the talk describes Price's approach in some detail. In the second part, case studies of its application to language evolution are presented.
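For reference, Price's result can be stated compactly in its standard form (quoted here as general background, not from the talk itself). For a trait with population mean $\bar{z}$ and mean fitness $\bar{w}$, the change in the mean across one generation is

```latex
\[
\bar{w}\,\Delta\bar{z} \;=\; \operatorname{Cov}(w_i, z_i) \;+\; \mathbb{E}\!\left[\,w_i\,\Delta z_i\,\right]
\]
```

where $w_i$ is the fitness of individual $i$ and $z_i$ its trait value. The covariance term captures selection; the expectation term captures transmission bias. In a cultural reading, $w_i$ can be interpreted as the number of learners who adopt variant $i$, which is what makes the framework attractive for modelling language change.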
https://doi.org/10.1142/9789814295222_0026
It has been suggested that human language emerged as either a new, critical faculty to handle recursion, which linked two other existing systems in the brain, or as an exaptation of an existing mechanism, which had been used for a different purpose to that point. Of these two theories, the latter appears more parsimonious, but, somewhat surprisingly, has attracted less attention among researchers in the field. Navigation is a prime candidate for a task that may benefit from being able to handle recursion, and we give an account of the possible transition from navigation to language. In the described context, it appears plausible that the transition adding the crucial component of human language was promoted by kin selection. We show that once language is present among its speakers, it reinforces the mechanisms of kin selection, boosting behaviours that benefit one's kin, and any such behaviour in turn boosts the use of language. The article also describes a mechanism through which language is used in lieu of kin markers to promote altruistic behaviour between potentially large communities of unrelated individuals.
https://doi.org/10.1142/9789814295222_0027
Recent studies showed that three-year-old children learned novel words better when the form and meaning of the words were sound-symbolically related. This was the case both for children learning a language with a rich sound-symbolic lexicon (Japanese) and for those learning one without (English). From the robustness of this sound-symbolic facilitation, it was inferred that children's ability to use sound symbolism in word learning is a vestige of a protolanguage consisting largely of sound-symbolic words. We argue that a sound-symbolic protolanguage would have been able to refer to a wide range of information (not just auditory events). It had the added advantage that it was relatively easy to develop a shared open-class lexicon, and it provided a stepping stone from a holophrastic protolanguage to a combinatorial protolanguage.
https://doi.org/10.1142/9789814295222_0028
In this paper I consider the possibility that language is more strongly grounded in sensorimotor cognition than is normally assumed—a scenario which would be providential for language evolution theorists. I argue that the syntactic theory most compatible with this scenario, perhaps surprisingly, is generative grammar. I suggest that there may be a way of interpreting the syntactic structures posited in one theory of generative grammar (Minimalism) as descriptions of sensorimotor processing, and discuss the implications of this for models of language evolution.
https://doi.org/10.1142/9789814295222_0029
In this paper we offer arguments for why modeling in the field of artificial language evolution can benefit from the use of real robots. We propose that robotic experimental setups lead to more realistic and robust models, that real-world perception can provide the basis for richer semantics, and that embodiment itself can be a driving force in language evolution. We discuss these proposals by reviewing a variety of robotic experiments that have been carried out in our group, and argue for the relevance of the approach.
https://doi.org/10.1142/9789814295222_0030
The received view is that the first distinct word types were noun and verb (Heine & Kuteva, 2002; Hurford, 2003a). Heine and Kuteva (2007) have suggested that the first words were noun-like entities. The present paper submits ten new arguments that support this claim. The arguments are novel implications of the reviewed evidence which is made to bear on the evolution of the linguistic predicate/argument (e.g. noun/verb) structure. The paper concludes that the evidence for noun-like entities antedating other word types is overwhelming.
https://doi.org/10.1142/9789814295222_0031
This paper presents a model of lexical alignment in communication. The aim is to provide a reference model for simulating dialogs in naming game-related simulations of language evolution. We introduce a network model of alignment to shed light on the law-like dynamics of dialogs in contrast to their random counterpart. That way, the paper provides evidence on alignment to be used as reference data in building simulation models of dyadic conversations.
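The naming-game setting invoked above can be sketched in miniature (an assumed textbook-style setup, not the authors' alignment model): agents repeatedly pair up to negotiate a name for a single object, with successful interactions collapsing both inventories to the shared word.

```python
import random

random.seed(1)

N_AGENTS = 20  # assumed population size

def naming_game(max_rounds=20000):
    """Run pairwise naming interactions until all agents share one name."""
    inventories = [set() for _ in range(N_AGENTS)]
    next_word = 0
    for t in range(max_rounds):
        speaker, hearer = random.sample(range(N_AGENTS), 2)
        if not inventories[speaker]:           # speaker invents a word if it has none
            inventories[speaker].add(next_word)
            next_word += 1
        word = random.choice(tuple(inventories[speaker]))
        if word in inventories[hearer]:        # success: both collapse to that word
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:                                  # failure: hearer adopts the word
            inventories[hearer].add(word)
        if all(inv == {word} for inv in inventories):
            return t + 1                       # rounds until global consensus
    return max_rounds

rounds = naming_game()
# The population converges on a single shared name after a burst of
# competing inventions, which is the baseline dynamic that alignment
# models refine with dialog-level structure.
```

Such runs provide the "random counterpart" against which law-like dialog dynamics can be compared.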
https://doi.org/10.1142/9789814295222_0032
We examine the evolution of major grammatical forms and constructions as linguistic manifestations of human cognitive ability, based on historical data from English. We show that the complex linguistic system has arisen as more and more grammaticalized forms have accumulated. Word order and case go back to the earliest language. Tense, aspect, modality, gender, questions, negations, parataxis can be traced back to Proto-Indo-European, and they may go back further. Most crucial is the rise of embedding recursion and its product, the VO word order in Old English. This brought about the transition from the syntactic organization of the clause interwoven with discourse organization to the more strictly syntactic organization of the clause. With this transition, the periphrastic constructions of progressive, perfect and pluperfect, modal auxiliaries, periphrastic do and definite article arose due to speakers' desire to be more specific than was possible with the older forms. We also show the role of high-frequency words in the evolution of the grammatical forms.
https://doi.org/10.1142/9789814295222_0033
The last decade has been a very productive one for our knowledge of our closest extinct relative, Homo neanderthalensis. A wide variety of studies has focused on various aspects of the Neandertal skeletal record and how to read it (e.g. Hublin 2009; Weaver 2009), on the chemical composition of their bones and how that might inform us on their diet (e.g. Richards and Trinkaus 2009), on their geographical distribution and their archaeological record (e.g. Roebroeks 2008) and, very importantly, on their genetic characteristics (e.g. Green et al. 2008; Briggs et al. 2009). Genetic studies indicate that modern humans and Neandertals shared a common ancestor only 500,000 to 700,000 years ago, which is also the picture emerging from studies of their physical remains (Hublin 2009). Building on the same Bauplan, two different hominin lineages emerged, in Africa the ancestors of modern humans, and in western Eurasia the Neandertals, who vanished from the record around 35,000 radiocarbon years ago. Integration of genetic data with the other lines of evidence promises to yield major breakthroughs in our understanding of the differences and similarities between these two groups of hominins in the very near future …
https://doi.org/10.1142/9789814295222_0034
The purpose of language is to encode information so that it can be communicated. Both the producer and the comprehender of a communication want the encoding to be simple. However, they have competing concerns as well: the producer desires conciseness and the comprehender desires fidelity. This paper argues that the Minimum Description Length (MDL) principle captures these two pressures on language. A genetic algorithm is used to evolve languages that take the form of finite-state transducers, using MDL as a fitness metric. The languages that emerge are shown to generalize beyond their initial training scope, suggesting that in selecting to satisfy MDL one is implicitly selecting for compositional languages.
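The intuition behind an MDL fitness metric can be shown with a toy two-part score (an illustrative calculation with assumed numbers, not the paper's transducer encoding): total cost = bits to describe the grammar plus bits to encode a corpus under it. A compositional lexicon pays less for its grammar while encoding the same corpus.

```python
from math import log2

# Assumed toy setup: 16 composite meanings (4 values in each of 2 slots),
# words drawn from a 26-letter alphabet, a corpus of 100 utterances.
CORPUS = 100

# Holistic language: 16 unanalyzed 4-letter words, one per meaning.
holistic_model_bits = 16 * 4 * log2(26)
holistic_bits = holistic_model_bits + CORPUS * log2(16)

# Compositional language: 8 one-letter morphemes (4 per slot), 2-letter words.
compositional_model_bits = 8 * 1 * log2(26)
compositional_bits = compositional_model_bits + CORPUS * (log2(4) + log2(4))

# The data costs are identical (log2(16) = log2(4) + log2(4) = 4 bits per
# utterance), so the smaller grammar gives the smaller total description
# length, i.e. the better MDL fitness.
assert compositional_bits < holistic_bits
```

This is the sense in which optimizing for MDL implicitly rewards compositional structure: shared morphemes amortize the model cost across many meanings.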
https://doi.org/10.1142/9789814295222_0035
The evolution of language capabilities is closely linked to the evolution of human brain structures. Human brain auditory cortices are anatomically and functionally asymmetrical. Studies at the microscopic level have found a thinner cortex and more widely spaced neuronal columns in the left (dominant) hemisphere, which reasonably correlate with its greater ability to discriminate speech sounds. The nature of these differences is consistent with a "balloon model" of brain growth, which states that as the brain white matter grows, it stretches the overlying cortex. Thus, the amount and duration of brain growth is an important factor in acquiring the ability to perceive speech. Humans have a much longer brain maturation time than any other primates (or animals). This "extended maturation time" allows language capabilities to evolve in the brain over time, rather than requiring them to be present at birth. The extended maturation time also must have a genetic basis, but not one specific to language, and the HAR1, G72 and FOXP2 genes might well be examples of genes which affect cortical and white matter growth. Finally, if this neuronal system can learn language without depending on specific language genes, then what could be the origin of universal grammar? Natural human grammars, like object-oriented software programs, are constrained to describe our experiential universe – an idea mooted also by "the early Wittgenstein" and others. Insofar as humans mostly share the same experiential universe, our descriptions of it (our languages, some branches of mathematics) share many features; these common features can appear as a "universal grammar."
https://doi.org/10.1142/9789814295222_0036
Segmentation and combination is a fundamental and ubiquitous feature of modern human languages. Here we explore its development in newly emergent language systems. Previous work has shown that manner and path are segmented and sequenced in the early stages of Nicaraguan Sign Language (NSL) but, interestingly, not in the gestures produced by Spanish speakers in the same community; gesturers conflate manner and path into a single unit. To explore the missing step between gesturers' conflated expressions and signers' sequenced expressions, we examined the gestures of homesigners: deaf children not exposed to a sign language who develop their own gesture systems to communicate with hearing family members. Seven Turkish child homesigners were asked to describe animated motion events. Homesigners resembled Spanish-speaking gesturers in that they often produced conflated manner+path gestures. However, the homesigners produced these conflated gestures along with a segmented manner or path gesture and, in this sense, also resembled NSL signers. A reanalysis of the original Nicaraguan data uncovered this same transitional form, primarily in the earliest form of NSL. These findings point to an intermediate stage that may bridge the transition from conflated forms that have no segmentation to sequenced forms that are fully segmented.
https://doi.org/10.1142/9789814295222_0037
In this study, we tested the circumstances under which cultural evolution might lead to regularisation, even in the absence of an explicit learning bottleneck. We used an artificial language experiment to evaluate the degree of structure preservation and the extent of a bias for regularisation during learning, using languages which differed both in their initial levels of regularity and their frequency distributions. The differential reproduction of regular and irregular linguistic items, which may signal the existence of a systematicity bias, is apparent only in languages with skewed distributions: in uniformly distributed languages, reproduction fidelity is high in all cases. Regularisation does happen despite the lack of an explicit bottleneck, and is most significant in infrequent items from an otherwise highly regular language.
https://doi.org/10.1142/9789814295222_0038
How can we explain the enormous amount of creativity and flexibility in spatial language use? In this paper we detail computational experiments that try to capture the essence of this puzzle. We hypothesize that flexible semantics which allow agents to conceptualize reality in many different ways are key to this issue. We will introduce our particular semantic modeling approach as well as the coupling of conceptual structures to the language system. We will justify the approach and show how these systems play together in the evolution of spatial language using humanoid robots.
https://doi.org/10.1142/9789814295222_0039
Evans & Levinson (2009) argue that language diversity is more robust than linguistic homogeneity, and also suggest that explanations for recurring patterns in language are not the product of an innate, evolved language faculty. I examine various kinds of evidence in favour of a specialized language faculty, and argue against the claim that typologically distinct languages must have distinct parsing systems.
https://doi.org/10.1142/9789814295222_0040
Recent iterated language learning studies have shown that artificial languages evolve over the generations towards regularity. This trend has been explained as a reflection of the learners' biases. We test whether this learning bias for regularity is affected by culturally acquired knowledge, specifically by familiarity and literacy. The results of non-iterated learning experiments with miniature artificial musical and spoken languages suggest that familiarity helps us learn and reproduce the signals of a language, but literacy is required for regularities to be faithfully replicated. This in turn indicates that, by modifying human learning biases, literacy may play a role in the evolution of linguistic structure.
https://doi.org/10.1142/9789814295222_0041
Negated sentences in Dutch child language are analyzed. It is argued that the child's acquisition procedure, rather than an innate UG structure, explains a temporary rise and fall of negative concord. It is further suggested that natural preferences of the acquisition procedure are a substantive source for grammatical universals. This avoids the assumption that the evolution of the human brain as such has already produced an innate repertoire of grammatical universals.
https://doi.org/10.1142/9789814295222_0042
The art of ranking things in genera and species is of no small importance and very much assists our judgment as well as our memory. You know how much it matters in botany, not to mention animals and other substances, or again moral and notional entities as some call them. Order largely depends on it, and many good authors write in such a way that their whole account could be divided and subdivided according to a procedure related to genera and species. This helps one not merely to retain things, but also to find them. And those who have laid out all sorts of notions under certain headings or categories have done something very useful.
Gottfried Wilhelm von Leibniz, New Essays on Human Understanding (Leibniz, 1704)
https://doi.org/10.1142/9789814295222_0043
Pronouns form a particularly interesting part-of-speech for evolutionary linguistics because their development often lags behind other changes in their language. Many hypotheses on pronoun evolution exist – both for their initial resilience to change and for why they eventually cave in to evolutionary pressures – but so far no one has proposed a formal model that operationalizes these explanations in a unified theory. This paper therefore presents a computational model of pronoun evolution in a multi-agent population, and argues that pronoun evolution is best understood as an interplay between the level of language strategies, which are the procedures for learning, expanding and aligning particular features of language, and the level of the specific language systems that instantiate these strategies in terms of concrete words, morphemes and grammatical structures. This claim is supported by a case study on Spanish pronouns, which are currently evolving from a case-based to a referential system, the latter of which exists in multiple variants (called leísmo, laísmo and loísmo depending on the type of change).
https://doi.org/10.1142/9789814295222_0044
According to recent developments in (computational) Construction Grammar, language processing occurs through the incremental buildup of meaning and form according to constructional specifications. If the number of available constructions becomes large, however, the resulting search process quickly becomes cognitively unfeasible without additional guiding principles. One of the main mechanisms the brain recruits (across all sorts of tasks) to optimize processing efficiency is priming. Priming in turn requires a specific organisation of the constructions. Processing efficiency must therefore have been one of the main evolutionary pressures driving the organisation of linguistic constructions. In this paper we show how constructions can be organized in a constructional dependency network in which constructions are linked through semantic and syntactic categories. Using Fluid Construction Grammar, we show how such a network can be learned incrementally in a usage-based fashion, and how it can guide processing by priming the suitable constructions.
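As a concrete illustration of this kind of organisation, a dependency network can link constructions whenever one supplies a semantic or syntactic category that another requires; applying a construction then primes its linked neighbours so that they are tried first. The sketch below is a toy version under these assumptions: the class, the construction names and the scoring scheme are invented for illustration and are not the Fluid Construction Grammar implementation.

```python
from collections import defaultdict

class ConstructionNetwork:
    """Toy constructional dependency network: two constructions are linked
    when one provides a category that the other requires (all names here
    are hypothetical)."""

    def __init__(self):
        self.provides = {}                 # construction -> categories it supplies
        self.requires = {}                 # construction -> categories it needs
        self.primed = defaultdict(float)   # activation from recent use

    def add(self, name, requires, provides):
        self.requires[name] = set(requires)
        self.provides[name] = set(provides)

    def successors(self, name):
        # constructions whose required categories are supplied by `name`
        return {c for c in self.requires
                if self.provides[name] & self.requires[c]}

    def apply(self, name):
        # using a construction primes the constructions it links to
        for c in self.successors(name):
            self.primed[c] += 1.0

    def candidates(self):
        # consider primed constructions first, then the rest
        return sorted(self.requires, key=lambda c: -self.primed[c])

net = ConstructionNetwork()
net.add("noun-cxn",  requires=[],          provides=["NP"])
net.add("verb-cxn",  requires=[],          provides=["V"])
net.add("trans-cxn", requires=["NP", "V"], provides=["S"])
net.apply("noun-cxn")        # primes trans-cxn via the shared NP category
print(net.candidates()[0])   # → trans-cxn: tried first in later processing
```

This only mimics the guiding role of priming; learning the links incrementally from usage, as the paper proposes, would require updating `provides`/`requires` during processing.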
https://doi.org/10.1142/9789814295222_0045
This paper discusses problems associated with the "moving target argument" (cf. Christiansen & Chater 2008, Chater et al. 2009, see also Deacon 1997: 329, Johansson 2005:190). According to this common argument, rapid language change renders biological adaptations to language unlikely. However, studies of rapid biological evolution, varying rates of language change and recent simulations pose problems for the underlying assumptions of the argument. A critique of these assumptions leads to a richer view of language-biology co-evolution.
https://doi.org/10.1142/9789814295222_0046
A consideration of the communicative abilities of other animals makes it clear not only that no other species has a system with the essential properties of a human natural language, but also that there is no reason to believe that such systems are even accessible to animals lacking our specific cognitive capacities (Anderson, 2004). In every animal species that has been seriously investigated, it is clear that the specific properties of its communicative mechanisms are tightly grounded in specific properties of its biology. There must be some aspect of our biological nature, therefore, which has been distinctively shaped in the course of evolution to subserve our ability to acquire and use the languages we do (Pinker & Bloom, 1990). Let us call this aspect of human biology the "Language Faculty," without prejudice as to whether components of it might have other roles to play as well.
Some have argued that it is plausible to suggest that there is very little about the Language Faculty that is unique to humans and to its role in language: perhaps only the capacity for recursive elaboration of structure (Hauser, Chomsky, & Fitch, 2002). This claim has provoked heated debate between those who maintain a highly structured species-specific capacity devoted to the acquisition and use of language (Pinker & Jackendoff, 2005; Jackendoff & Pinker, 2005) and those who argue that most of what makes language possible in humans has substantive parallels in other domains and/or other species (Fitch, Hauser, & Chomsky, 2005; Samuels, 2009). I maintain that the participants in this discussion are largely talking past one another: while it is clear that analogs and even homologues of components of human biology relevant to language exist in other species, and in other cognitive domains, it is also clear that these components have been shaped distinctively in humans by their role in language.
Within Linguistics, there has similarly been argument over whether our ability to acquire and use language is the product of a distinctive faculty, or simply due to the confluence of capacities equally relevant to other domains. A component of that discussion has been controversy about whether the regularities we find across the whole range of human languages result from a distinctive and highly specific faculty, or are simply the inevitable outcome of external processes shaping language use and language change. This has made the process of discovering linguistic universals, and attributing them to the substantive character of the Language Faculty, particularly difficult: merely demonstrating that every language in the world conforms to a given generalization (even ignoring the problem of showing that this would also be true for all possible languages) does not support attributing that generalization to such a faculty if an alternative account in terms of external factors of usage and change is available.
Anderson (2008) suggests that this difficulty is more apparent than real. Supposing that external forces conspire to shape languages in particular ways, independent of the precise nature of the cognitive capacity underlying their acquisition and use, we should still expect that precisely these recurrent regularities would be incorporated into our biological nature by Baldwinian evolution, given the central role played by language in our ecological niche and the concomitant value of an ability to acquire the language of the surrounding community quickly and without excessive effort. Nativist and externalist accounts of linguistic regularities are therefore complementary, not contradictory. On this understanding, a highly specific Language Faculty is just what we predict if external forces of usage and change really can drive the emergence of recurrent cross-linguistic regularities.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0047
The Category Game is a computational model designed to investigate how a population of individuals can develop a shared repertoire of linguistic categories, i.e. co-evolve their own system of symbols and meanings, by playing elementary language games (Puglisi, Baronchelli, & Loreto, 2008). Consensus is reached through the emergence of a hierarchical category structure made of two distinct levels: a basic layer, responsible for fine discrimination of the environment, and a shared linguistic layer that groups together perceptions to guarantee communicative success. The only parameter of the model is the Just Noticeable Difference (JND) of the agents, defined as the smallest detectable difference between two stimuli. Remarkably, the number of linguistic categories turns out to be finite and small, as observed in natural languages, even in the limit of an infinitesimally small JND. Finally, as in pioneering work on the coevolution of language and meaning (Steels & Belpaeme, 2005), the shared categorization is reached through pure cultural negotiation, but in the Category Game the individuals are additionally able to categorize a continuous environment. The analogy with color categorization is therefore natural (Steels & Belpaeme, 2005; Puglisi et al., 2008), even though computational modeling implies a large number of (even drastic) simplifications.
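The role the JND plays in the basic perceptual layer can be sketched as follows. This is a drastically simplified, single-agent caricature (discrimination only, with no naming or cultural negotiation), whose details are invented for illustration rather than taken from Puglisi et al. (2008).

```python
import random

def discrimination_game(boundaries, jnd=0.01, rng=random):
    """One discrimination episode over the perceptual continuum [0, 1).

    Stimuli closer than the JND are indistinguishable, so the pair is
    resampled until the two differ by at least one JND.  If both stimuli
    fall into the same perceptual category, the agent refines perception
    by splitting that category at the midpoint between them.
    """
    while True:
        a, b = rng.random(), rng.random()
        if abs(a - b) > jnd:
            break

    def category(x):
        # index of the interval between successive boundaries containing x
        return sum(1 for edge in boundaries if edge <= x)

    if category(a) == category(b):
        boundaries.append((a + b) / 2)   # discrimination failure: refine
        boundaries.sort()

rng = random.Random(42)
bounds = []                      # a single category covering all of [0, 1)
for _ in range(500):
    discrimination_game(bounds, jnd=0.01, rng=rng)
print(len(bounds) + 1)           # perceptual categories after 500 games
```

In the full model it is the second, linguistic layer (negotiated through naming games between agents) that groups these fine perceptual categories into the small, finite shared repertoire the abstract describes.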
Here we focus on the much debated question (Lakoff, 1987) of the origins of universal (i.e. shared) categorization patterns across cultures. In particular, we report on an in silico experiment pointing out that cultural and linguistic interactions can induce universal patterns in categorization provided that the human perceptual system is taken into account (Baronchelli, Gong, Puglisi, & Loreto, 2009). We simulate, through the Category Game model, a certain number of non-interacting populations each developing its own synthetic language. We find universal categorization patterns among populations whose individuals are endowed with the human JND function, describing the resolution power of the human eye to variations in the wavelength of the incident light (Long, Yang, & Purves, 2006). We furthermore show that, on the contrary, populations whose individuals' JND is uniform do not exhibit any signature of universality. In particular, we repeat the same statistical analysis performed in Kay and Regier (2003) and find that the difference between these two classes of simulated populations is in striking agreement with the difference between the experimental World Color Survey data and their randomized counterparts.
Remarkably, the model we present (i) incorporates a true feature of human perception (i.e. the human hue JND), and produces results that are (ii) testable against and (iii) in agreement with experimental data. Our work not only corroborates the findings of Kay and Regier (2003), but also validates the hypothesis that the universal properties of the human visual system are probably involved in the regularities of the color nomenclatures of the world's languages.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0048
How one language strategy - a set of instructions for structuring and expressing one particular subarea of meaning - allows a population of agents to self-organise a language system that lets them reach communicative goals in routinised interactions is relatively well understood. How agents can align on which strategy to use when several strategies are available has been explored far less (Steels, 2010). This paper investigates, in a concrete case study, how linguistic selection based on communicative success might serve this purpose.
The starting point of this case study is an observation in the history of English colour terms. In Old English the colour terms primarily had a brightness sense, but in the transition to Middle English most colour terms shifted simultaneously to a hue sense. In the history of the term "yellow", for example, the OE term "geolo" meant "to shine", whereas ME "yelou" referred to the hue of specific objects, such as yolk, ripe corn or discoloured paper. Colour terms introduced after this shift never had a brightness sense (Casson, 1997).
We propose a language game model based on the colour naming game to model this observation. In this game, the communicative goal of the speaker is to describe one of the objects in a shared context to the hearer using only a single term that describes the colour of the object.
Each meaning sense is implemented as a different language strategy. Colour categories are represented by their prototype, which is the most prototypical colour of that category. Each prototype is a single point in a three-dimensional perceptual colour space (CIE L*u*v*), of which the L* dimension corresponds to the brightness of a colour. In the brightness strategy only the L* dimension is taken into account during categorisation, whereas in the hue strategy all dimensions are taken into account.
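Categorisation under the two strategies can be illustrated as a nearest-prototype lookup in which the brightness strategy measures distance in L* alone while the hue strategy measures it in the full L*u*v* space. The prototype names and coordinates below are hypothetical, chosen only to make the contrast visible; they are not taken from the model.

```python
import math

# Hypothetical colour prototypes as (L*, u*, v*) points (illustrative values).
PROTOTYPES = {
    "bright": (90.0,  10.0,  20.0),
    "dark":   (20.0,   5.0,  10.0),
    "yellow": (80.0,  20.0,  80.0),
    "blue":   (40.0, -10.0, -40.0),
}

def categorise(colour, strategy):
    """Return the name of the prototype nearest to `colour`.

    The brightness strategy compares only the L* dimension; the hue
    strategy compares all three dimensions of CIE L*u*v*.
    """
    L, u, v = colour
    def distance(proto):
        pL, pu, pv = proto
        if strategy == "brightness":
            return abs(L - pL)
        return math.dist((L, u, v), (pL, pu, pv))
    return min(PROTOTYPES, key=lambda name: distance(PROTOTYPES[name]))

sample = (88.0, 15.0, 75.0)                 # a light, saturated yellow
print(categorise(sample, "brightness"))     # → bright (nearest in L* alone)
print(categorise(sample, "hue"))            # → yellow (nearest in L*u*v*)
```

The same stimulus is thus named differently depending on which strategy an agent applies, which is exactly what makes strategy alignment across the population non-trivial.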
To model linguistic selection of language strategies, each agent keeps track of the communicative success of each strategy for each linguistic item in its inventory. The most successful strategy for an item becomes the default used in production or interpretation when that item is used. When the stored associations fail in a specific communicative challenge, the items are re-interpreted using the other strategies available to the agent, starting from the strategy the agent considers generally most fit. When the language system needs to be expanded, the agents prefer the language strategy they consider generally most successful in previous interactions and suitable for the current communicative challenge.
The mechanisms proposed in this paper allow the agents to align on the strategies they use and give rise to some interesting dynamics. In some runs one strategy clearly becomes dominant, whereas in others a shift occurs from one strategy to the other and both strategies continue to co-exist in the same language system (Fig. 1). The usage of the strategies reflects their fitness within the system.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0049
If language is an emergent product of both cognitive biases and human social interactions, then the structure of language and how it changes over time may provide valuable information about these biases, potentially bringing us closer to understanding their origins (Hruschka et al., 2009). Modelling and simulation is one way to discover the consequences of repeated interactions between agents, and how they may end up amplifying (or, indeed suppressing) these biases.
There has recently been a massive burst of activity within the statistical physics community, whereby the emergent properties of many repeated socially-inspired interactions between agents have been investigated (Castellano, Fortunato, & Loreto, 2009). Many researchers have studied whether one of two variant forms with the same function is ultimately adopted as a convention by a community, and if so, which one wins out. Despite the insights into complex interacting systems that physicists bring with them, a valid criticism of this work is that many models invoke ad-hoc rules that are simplistic in their treatment of cognition and lack empirical contact (Castellano et al., 2009).
A more systematic approach is possible, by thinking beyond specific rules and focussing on the type of bias that they imply. We consider three superficially distinct models, one in which speakers sample interlocutors' utterances randomly (Baxter, Blythe, Croft, & McKane, 2006), another in which agents negotiate referents for an object by eliminating other potential referents when a communicative act is deemed successful (Baronchelli, Felici, Caglioti, Loreto, & Steels, 2006), and another that investigates competition between two languages but permits the possibility of bilinguals (Castelló, Eguíluz, & San Miguel, 2006).
We have previously shown that these distinct models can be unified. An analysis of the relevant stochastic equations of motion allows one to identify three distinct biases operating within them (Blythe, 2009). One is a maximising bias, where agents seek to eliminate the minority variant. This quickly brings the community to consensus on the global majority variant. The second behaviour is pure sampling: both variants then fluctuate in frequency before one of them goes extinct. Both behavioural biases have been observed experimentally, e.g., in the work of Hudson Kam and Newport (2005); a formal cognitive basis for such behaviour has also been provided within a Bayesian learning framework (Reali & Griffiths, 2009). Finally, the model demonstrates the logical possibility of a third type of behaviour, where the two variants coexist indefinitely in equal numbers.
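The contrast between the first two biases can be made concrete with a deliberately crude simulation; the update rule below is invented for illustration and is not the stochastic equations of motion analysed in the unified treatment. A maximising learner adopts the current majority variant outright and so drives the population to consensus on it, while a linear-sampling learner adopts variant A with probability equal to its current frequency, producing an unbiased drift toward eventual fixation or extinction.

```python
import random

def evolve(freq, n_agents, bias, steps, rng):
    """Frequency of variant A under a crude two-variant update rule.

    Each step, one agent re-learns from the population: a 'maximise'
    learner adopts the current majority variant, a 'sample' learner
    adopts A with probability equal to its current frequency.  One
    randomly chosen agent's old variant (A with probability p) is
    discarded in turn, keeping the population size fixed.
    """
    count = round(freq * n_agents)           # agents currently using A
    for _ in range(steps):
        p = count / n_agents
        adopt_a = (p > 0.5) if bias == "maximise" else (rng.random() < p)
        drops_a = rng.random() < p
        count += (1 if adopt_a else 0) - (1 if drops_a else 0)
        count = max(0, min(n_agents, count))
    return count / n_agents

rng = random.Random(0)
print(evolve(0.6, 100, "maximise", 5000, rng))  # → 1.0: consensus on the majority
print(evolve(0.6, 100, "sample", 5000, rng))    # unbiased drift; may fix at 0 or 1
```

An "anti-maximising" rule (preferring the minority variant) would yield the third outcome, indefinite coexistence of the two variants around equal frequencies.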
Here we further argue that these three emergent outcomes are generic in models that invoke maximising, linear sampling or "anti-maximising" behaviour, in various combinations and implementations. Whether individuals are variable or categorical users of a variant; whether the bias is in production, or perception, or both; whether they interact with many or few other members of the community; all are 'details' as regards the qualitatively distinct emergent outcomes that can be expected. Unfortunately, all are problematic from the point of view of observed language change. We will discuss consequences, e.g., for the systematisation of a holistic protolanguage (Kirby, Dowman, & Griffiths, 2007), and propose cultural evolutionary mechanisms that may plausibly describe actual instances of language change, speculating as to the kinds of cognitive processes that may underpin them.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0050
A growing number of studies have revealed that, to some extent, parallels can be found between nonhuman primates' vocal communication and human language (e.g. conversation-like vocal exchanges, referential communication and call combination). At the same time, nonhuman primate calls differ from speech in their very limited flexibility. This is intriguing for researchers interested in the evolution of communicative abilities and in the phylogenetic origin of language. It is now commonly accepted in the scientific community that social life played an essential role in this evolution. Interestingly, vocal plasticity in nonhuman primates is certainly limited but not nonexistent, and they are good models for testing the influence of social factors on acoustic variability. Several studies have shown that changing a group's social composition triggers changes in individual vocal signatures. For example, replacing the harem male in a captive Campbell's monkey group leads to a rearrangement of social networks and, in parallel, to fine acoustic modifications of the females' contact calls, leading to vocal sharing between preferential partners. Here we propose two ways to test the potential influence of social factors on acoustic variability. Our questions were: 1) Does the social context of calling influence the level of intra/inter-individual variability? 2) Does the social system of the species play a role in the shaping of the vocal repertoire's variability?
We recorded calls in two primate species presenting interesting social differences. Red-capped mangabeys (Cercocebus torquatus) live in the wild in large multi-male, multi-female groups, and their social organisation, like that of most baboons and macaques, is strongly based on frequent peaceful and agonistic interactions. Campbell's monkeys (Cercopithecus campbelli campbelli) live in the wild in small harem groups, and their social organisation, like that of most forest guenons, is based on rare physical interactions and a discrete hierarchy. We investigated whether the level of intra/inter-individual variability depended on the call's social function. We hypothesized that, as in several other animal species, the level of acoustic variability, notably the level of individual distinctiveness, would be higher in calls with a high social value (e.g. contact call types) than in calls with a low social value (e.g. alarm call types), the former being involved in affiliative dyadic interactions and the latter consisting of communication at the group level. In humans, individual distinctiveness can also be accentuated or attenuated by speakers according to the macro context of communication, depending on the size and composition of the audience. We therefore hypothesized that the variability found in affiliative versus agonistic calls would depend on the species' social system.
We studied six call types in mangabeys and five in guenons, to which we could assign a high, intermediate or low social value. Among highly social calls we compared, in both species, affiliative contact calls and agonistic threat calls. Intra- and inter-individual acoustic variability was assessed by measuring a set of temporal and frequency parameters in the same way for all call types. We measured 1416 calls from 14 mangabey individuals and 1348 calls from 6 guenon individuals. We found that the degree of a call's social value predicted its level of variability. For instance, in both species we found a much higher potential for identity coding in affiliative calls used during dyadic exchanges than in less social calls like alarm calls (two to four times more variable). A strong difference between the two species was found in social calls associated with a negative value. While the level of variability of threat calls was almost as high as that of affiliative calls in mangabeys, it was as low as that of alarm calls in Campbell's monkeys, supporting the social system effect hypothesis.
This study highlights the determinant role played by social factors in the structuring of vocal repertoires and individual acoustic variability. It opens new perspectives for comparative research in animals aimed at understanding how human language evolved, and supports the general theory of a social-vocal co-evolution.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0051
Evans & Levinson (2009) claim that language is a "biocultural hybrid" of gene:culture co-evolution, and that a crucial fact for understanding the place of language in human cognition is its diversity, or fundamental variation in form and content. A plausible case for language diversity must explain (1) how language systems can evolve and diversify as socio-cultural products under cognitive constraints on learning, and (2) how children can learn, and adults use, any one of the alternative language systems. It is argued here that language is fundamentally pragmatic, and that this explains language diversity given a certain understanding of how behavior and environment specify each other (Brinck 2007; Clark 2008). From a system dynamics perspective it is considered how methods in, e.g., the theory of situated learning (Lane & Husemann 2008) might be used to explore the hypothesis.
(1) Evans' & Levinson's conception of evolution relates to a broader notion of context-dependence that places cultural and technological adaptation at centre stage. Evolution involves continuous interaction between species and environment, leading to new artifacts and behavior, potentially with wide application, as in the case of language. Communication is a situated practice, controlled by local (causal) and global (socio-cultural) contextual features (Garfinkel & Sacks 1970). Internalized skills interact with the environment to produce action, whereas emerging forms of behavior induce changes among existing conditions. In a constantly developing feedback loop (spiral), our ancestors' environment influenced the evolution of language, while language shaped the environment. Given that every alternative language has evolved in a particular ecological niche, diversity is no surprise.
(2) Assuming cognitive constraints on learning, the multi-layered nature of language explains how verbal communication can be prolific and occur with such ease. Language relies for its proper functioning on a variety of cognitive, affective, and conative processes. Some occur in nonverbal communication among nonhuman primates and human infants. Consider reference. Data from comparative, cognitive, and developmental psychology show that nonverbal reference can take many forms, some intentional, others automatic or reflexive (Brinck, 2008; Leavens et al., 2009; Senju & Csibra, 2008; Zlatev et al., 2008). Mechanisms for producing and responding to attention, imitation, gesture, and emotion occur on different processing levels in different formats, simultaneously working towards a common goal, thus enabling multi-layered communication. To optimize performance, behavior is tuned to local properties, which explains why, in spite of diversity, language is accessible. To exemplify, referential skills are cue-driven and tuned to action contexts. Actions are encoded in detail, and apparently identical contexts may be handled differently by the same agent, because behavioral competence depends on local affordances and constraints.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0052
The study of language evolution has a long tradition of connecting gestures to language origins (Condillac, 1746; Hewes, 1973). Modern theories point to gesture as the solution to a central problem: the emergence of symbolic communication. Prominent versions (Arbib, 2005; Corballis, 2002; Tomasello, 2008) share three critical features: i) early forms of communication consisted of pointing and pantomiming; ii) these gestures then became conventionalised and arbitrary, or symbolic; iii) at some point, the symbolic channel 'switched' and vocalisations became the dominant channel for symbolic communication. I agree that i) is a plausible stage in language evolution but contend that points ii) and iii) are less likely, as they do not follow the evolutionary principles of parsimony and continuity, nor do they provide a satisfactory explanation for the relationship between speech and gesture as it exists today (McNeill, 2005). In addition, arguments for this scenario rely on questionable assumptions regarding early hominid gestural and vocal abilities, the vocal channel's greater potential for creating arbitrary symbols and the role of speech in the instruction of manufacturing techniques.
Although these accounts recognise the powerful representational potential of gesture and consider the advantages of an additional, distinct modality of communication, they do not appear to fully appreciate the synergistic potential of both modalities together nor the limitations of a single modality on its own. If mimetic gestures became symbolic as postulated, the power of their 'natural' meaning would have been lost. Moreover, distributing meaning expression between symbolic and nonarbitrary forms provides cognitive and communicative benefits in language production and comprehension (Goldin-Meadow et al., 2001; Kelly et al., 1999), an advantage that would be sacrificed if gestures transitioned into arbitrary symbols. Though it is not claimed that nonarbitrary gestures disappeared during this transition, this scenario does not allow for the same simultaneous nonarbitrary-and-arbitrary signaling distributed across modalities that would enable the cognitively demanding task of forming symbols.
Another problem for these theories is the 'switch' to vocalisations as the dominant vehicle for symbolic communication. If a symbolic gestural system arose, it would have been hugely advantageous and caused evolutionary forces to move toward manual signed language, thus making it very unlikely for speech to evolve (Emmorey, 2005). In addition, an evolutionary scenario in which signaling types shift between modalities entails multiple and significant evolutionary transitions.
A careful consideration of gesture research and the nonarbitrary nature of human communication can contribute substantially to our understanding of language origins. The representational power of gesture alone is not sufficient to explain how arbitrary forms came to carry meaning, as claimed in current gestural origins theories. It is the coordinated multimodality of human expression that provides the opportunity for bodily manifestations of meaning to be transferred to co-occurring vocal signals. If nonarbitrary gestures co-occurred with vocalisations early in hominid history, it presents an opportunity for sounds to become symbolic while preserving gesture's 'natural' meaning and retaining the cognitive and communicative benefits of gesturing. In this view, symbols arose in the modality in which they still occur today, thus obviating a 'switch' in symbolic channel in the course of human evolution.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0053
Research has demonstrated that captive apes use gestures flexibly in different contexts and in accordance with some understanding of their recipients (e.g. Cartmill & Byrne, 2007; Pika, Liebal, Call, & Tomasello, 2005). This flexibility shows that ape gestures are intentional rather than automatic responses to external stimuli, but gives no indication that they can be used to communicate specific meanings. Discussions of meaning in ape gestures have largely focused on contextual flexibility or cataloguing the number of functions per gesture (e.g. Genty, Breuer, Hobaiter, & Byrne, 2009; Pika et al., 2005). The greater flexibility of gestural vs. vocal communication in great apes demonstrates that the system is less rigid and, in some ways, a seemingly better precursor to language (see Arbib, Liebal, & Pika, 2008; Pollick & de Waal, 2007). However, if gestures are used so flexibly that there is no predictable relationship between form and meaning, then they cannot be said to communicate anything in particular. By placing too much weight on the flexible use of gestures, researchers risk underestimating gesture meaning.
Researchers studying ape gesture should adopt a more probabilistic and systematic approach to meaning: one that incorporates our understanding of apes' flexibility in strategic communication but focuses on whether gestures are used predictably to elicit particular behaviors. Such an approach should identify cases where the final outcome of an interaction fulfills the apparent goals of some of the gestures in the exchange. Examples where the gesture's goal appears to be satisfied can be described as having goal-outcome matches. One can then attribute meaning to gestures that occur predictably with single matches and test these attributions of meaning using the rest of the dataset. Previous studies have assessed the "function" or "goal" of gestures by correlating form with context (e.g. Genty et al., 2009; Pika et al., 2005). The present approach extends this work significantly by using a subset of data to predict meanings and then testing those meanings using the whole dataset to determine what apes do when recipient responses do not match their gestures' attributed meanings.
I applied this methodology to 64 gestures produced by 28 captive orangutans (Pongo pygmaeus and P. abelii) to conspecifics housed in 3 European zoos. Out of a total of 1344 examples of "intentional" gesture, 698 had goal-outcome matches (most of the others received no reaction). Of the examples of gestures with goal-outcome matches, 29 gestures occurred more than 70% of the time with a single match, and an additional seven gestures occurred more than 50% of the time with a single match. Each gesture that met either of these meaning thresholds was found to have one of six meanings: Affiliate/Play, Stop action, Look at/Take object, Share food/object, Co-locomote, and Move away.
These attributions of meaning were tested by assuming that every example of a gesture (including those without goal-outcome matches) had the same meaning. I examined whether orangutans were more likely to persist following recipient reactions that did not match the meaning attributed to the initial gesture used. The type of recipient reaction (matching or not matching) significantly affected the gesturer's probability of persisting (χ² = 63.35, df = 1, p < 0.001). This supported the attributions of gesture meaning.
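For readers unfamiliar with the test reported above, a 2×2 Pearson chi-square of reaction type (matching vs. non-matching) against persistence can be computed as below. The counts are invented for illustration and do not reproduce the study's data:

```python
def chi2_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table
    [[a, b], [c, d]], without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row = [a + b, c + d]  # row marginals
    col = [a + c, b + d]  # column marginals
    obs = [[a, b], [c, d]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            exp = row[i] * col[j] / n  # expected count under independence
            chi2 += (obs[i][j] - exp) ** 2 / exp
    return chi2

# Invented counts: rows = recipient reaction (non-matching, matching),
# columns = gesturer persisted (yes, no).
print(round(chi2_2x2([[80, 20], [30, 70]]), 2))  # → 50.51
```

With df = 1, any statistic above 10.83 is significant at p < 0.001, so a value like the reported 63.35 comfortably rejects independence.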
The results indicate that the gestural communication system of orangutans is composed of both ambiguous and meaningful gestures. Most meaningful gestures do not show a one-to-one correspondence between form and meaning, which allows them to be used with some degree of flexibility. Flexibility and semanticity therefore need not be mutually exclusive, and by redirecting the discussion of ape gesture from flexibility to meaning researchers can use attributions of meaning to make specific predictions about communication strategies and open up new comparisons to human language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0054
Why is language the way it is, and how did it come to be that way? Answering these questions requires postulating genetic constraints on language. A key challenge for language evolution research is therefore to explain whether such genetic constraints are specific to language or whether they might be more general in nature. In this talk, I argue that traditional notions of universal grammar as a biological endowment of abstract linguistic constraints can be ruled out on evolutionary grounds (Chater, Reali & Christiansen, 2009; Christiansen, Chater & Reali, 2009). Instead, the fit between the mechanisms employed for language and the way in which language is acquired and used can be explained by processes of cultural evolution shaped by the human brain. On this account, language evolved by 'piggy-backing' on pre-existing neural mechanisms, constrained by socio-pragmatic considerations, the nature of our thought processes, perceptuo-motor factors, and cognitive limitations on learning, memory and processing (Christiansen & Chater, 2008). Using behavioral, computational and molecular genetics methods, I then explore how one of these constraints—the ability to learn and process sequentially presented information—may have played an important role in shaping language through cultural evolution (Reali & Christiansen, 2009). I conclude by drawing out the implications of this viewpoint for understanding the problem of language acquisition, which is cast in a new, and much more tractable, form (Chater & Christiansen, in press).
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0055
Chomsky's viewpoint on the evolution of language is quite controversial (Pinker & Bloom, 1990; Newmeyer, 1998; Jenkins, 2000; Bickerton, 2005; Pinker & Jackendoff, 2005). On the one hand, many supporters believe that Chomsky, drawing heavily on recent developments in evolutionary theory, especially non-selectionist concepts such as "spandrel", "exaptation", and "known or unknown physical laws", provides a reasonable speculation on the origin and evolution of language (Jenkins, 2000), one which might be termed "neo-neo-Darwinism" (Piattelli-Palmarini, 1989: 9). On the other hand, some critics argue that the "by-product" concept held by Chomsky is "utterly implausible" (Newmeyer, 1998: 313), and that attributing the evolution of syntax to a very lucky "hopeful monster" mutation places it outside biology (Számadó, 2009: 18).
From Chomsky's own words, we can see that although "the specificity and richness of the language faculty" assumed in the early phases of generative grammar posed serious barriers to inquiry into how this faculty might have evolved, the Principles and Parameters (P&P) approach, which makes "a sharp distinction between process of acquisition and the format of the internal theory of a language", removes "a crucial conceptual barrier to the study of evolution of language" (Chomsky, 2007, pp. 13-14). According to Chomsky, the idea of the P&P approach derives both from the intensive study of various languages and from an analogy with a recent development in biology, evo-devo theory. Chomsky (2005b) has discussed the application of evo-devo to the study of language, treating the origin and evolution of language as a simple thesis of evo-devo. The impression Chomsky gives is that exploring the evolution of the language faculty within the P&P framework can make full use of this latest development in evolutionary theory. But can evo-devo save Chomsky from the evolutionary paradox of quietly bypassing natural selection and attributing the birth of the "merge" operation directly to a magic random mutation?
The form of every animal is the product of two processes: development from an egg and evolution from its ancestors. Evo-devo focuses on the close relationship between these two processes. Its fundamental principles are that evolution is intimately tied to development, and that the regulatory genes controlling the development of different organisms are highly conserved, even over long evolutionary periods (Carroll, 2005). Chomsky (2005a, b, 2007) seems to have grasped the core of evo-devo correctly and makes two analogies in linguistics: from the relationship between development and evolution, Chomsky (2005a) identifies three factors that can influence the development of language in an individual; and from the regulatory-gene mechanism, he derives the idea that the interaction between invariant principles and alternative parameters can determine the nature of language and language acquisition. Although these ideas, as Chomsky has said, have their origins in parallel developments in biology, they are merely conceptual analogies without any direct relationship to evo-devo theory. We cannot find in evo-devo any evolutionary mechanism for Chomsky's "hopeful monster" random mutation giving birth to the "merge" operation. From the evo-devo perspective, more attention should be paid to epigenetic factors in language development and evolution, rather than attributing linguistic phenomena so heavily to genetic instructions (Deacon, 2003a, b, 2005). Meanwhile, the conservation of regulatory genes means that the reappropriation of old genes to new uses is far more typical for organisms than the creation of new genes; instead of expecting many new "language genes" produced by a magic mutation, research on language evolution needs mechanisms, such as relaxed selection (Deacon, 2010), to reduce its heavy burden of genetic requirements.
From the analysis above, we can conclude that (1) although evolutionary linguistics can make great use of evo-devo, Chomsky makes only a few analogies at the conceptual level, and there is no direct relation between his arguments about the magic mutation and the content of evo-devo theory; (2) Chomsky's viewpoint on language evolution does not find sufficient support in evo-devo, and in fact some of his arguments contradict it; and (3) what Chomsky says about the origin and evolution of language derives mainly from his philosophy of language and is not soundly based on biological phenomena.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0056
The evolution of human language can be investigated from different points of view, including from an ontogenetic perspective. By presenting a number of arguments, including original data of our own, we aim to demonstrate the relevance of studying the development of communication in infancy to better understand the evolution of language.
Both neuroanatomical and behavioural studies have shown that gestures and speech are closely linked during human ontogeny (e.g., Bates & Dick, 2002; Iverson & Thelen, 1999; Rowe & Goldin-Meadow, 2009), and given the cerebral lateralization of speech in humans, the investigation of manual laterality for communicative gestures can provide valuable clues regarding the nature of these speech-gesture links. Experimental and observational studies have reported that the production of communicative gestures, especially pointing gestures, is lateralized to the left cerebral hemisphere (e.g., Cochet & Vauclair, submitted). Interestingly, this right-hand bias for pointing gestures was reported to be stronger than for manipulative actions, whether hand use in simple reaching or in bimanual activities (Bates, O'Connel, Vaid, Sledge, & Oakes, 1986; Vauclair & Imbault, 2009). Hand preference for communicative gestures thus appears to be independent of handedness for manipulative actions. This finding has led researchers to postulate the existence of a specific communication system in the left cerebral hemisphere, controlling both gestural and vocal communication, which may differ from the system involved in purely motor activities.
Different patterns of laterality between communicative gestures and non-communicative actions have also been observed in non-human primates (in baboons: Meguerditchian & Vauclair, 2009; in chimpanzees: Hopkins, Russel, Freeman, Buehler, Reynolds, & Schapiro, 2005). Altogether, these results support the gestural theory of the origin of speech. The communication system in the left cerebral hemisphere, likely located in Broca's area (Gentilucci & Dalla Volta, 2007), might have served as a substrate for the evolution from a gestural communication system to a vocal one (Corballis, 2009).
To investigate this hypothesis in greater depth, we focused on the function of pointing gestures in addition to their handedness. So far, three main functions have been described: children produce imperative pointing to request an object, declarative expressive pointing to share interest with the adult about a specific referent, and declarative informative pointing to help someone by providing him/her with needed information (Tomasello, Carpenter, & Liszkowski, 2007). Declarative pointing gestures, whose frequency increases as children grow older (Cochet & Vauclair, submitted), are thought to reflect more complex cognitive skills related to the understanding of others as attentional and intentional agents. We set up three experimental designs at day nurseries to elicit these three different pointing gestures in 48 toddlers between 15 and 30 months of age. A unimanual reaching task was also administered. The main results revealed that declarative gestures were more frequently accompanied by vocalizations than imperative gestures, and that gaze alternation was more frequent in informative pointing than in imperative and expressive situations. Furthermore, the difference in the degree of manual preference between manipulative actions and pointing gestures was strongest for informative pointing.
As non-human primates point imperatively to request food, but not declaratively (except for a few language-trained apes), studying the emergence of declarative communication in human infants may reveal some socio-cognitive prerequisites for the emergence of language. In this regard, investigating the development of declarative informative pointing is particularly relevant, as this gesture seems to benefit only the recipient of the signal, opening a window onto the development of cooperative abilities. Our results thus suggest that such cooperative gestures may have played an important role in the evolution of human language and its cerebral lateralization.
This research was supported by a French National Research Agency (ANR) grant reference ANR-08-BLAN-0011_01.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0057
How do new linguistic elements emerge? How are they changed from the pre-linguistic raw materials of their origin? To catch the very earliest stages of such a transformation, we turned to a young language – Nicaraguan Sign Language (NSL). Because NSL is only 30 years old, the pioneering generation that created it is around today, able to show us what the earliest Nicaraguan signing looked like. Members of different age cohorts today represent a living "fossil record" of the language. Elements that served as likely linguistic precursors are also still observable, in present-day co-speech gestures and homesigns.
Here we take a humble gesture – the point – and follow its transformation into a linguistic element. This basic gesture often accompanies speech to indicate real-world locations and objects. As it transformed into a sign, we found an increase in its use to identify the participants in events rather than referring to locations or real-world objects. With this shift, points took on new linguistic functions, including indicating the subject of a verb, and serving as a pronoun.
Most deaf signers in Nicaragua learned to sign through social contact when they entered special education centers in Managua that were newly established and expanded rapidly starting in the late 1970s. Educators did not sign, but they did not prohibit their students from gesturing, and the children began to create a common communication system. What began as gesturing among fifty individuals became a full, rich sign language used by a community of over 1000.
To capture different periods in the language's emergence, we grouped participants into cohorts: Children who arrived in the late 1970s and early 1980s (now adults) form the first cohort, those who arrived in the mid- to late-1980s (now adolescents) form the second cohort, and those who arrived in the 1990s (now children) form the third cohort.
Going one level deeper in the fossil record, we compared these NSL signers to four deaf homesigners who never entered the programs in Managua. These homesigners have not acquired NSL; none has a regular communication partner who signs NSL, none uses NSL vocabulary or grammar. They represent the communication systems of deaf Nicaraguans before NSL developed.
We compared the same signed story (based on a cartoon) from four participants from each group along this continuum of language emergence: adult homesigners who never acquired a conventional sign language, and NSL signers who acquired the language at three successive periods during its emergence. How does the form and function of pointing change along this continuum?
We classified each instance of a point according to its endpoint (e.g., a nearby object, the signer's chest, or an empty space in front of the signer). We then determined the meaning and function of points and categorized them into locatives, which referred to locations (such as 'overhead' or 'to the left') and nominals, which referred to persons or objects (such as 'Tweety the bird' or 'the cage'). Note that both types entail a displacement of the referent from the real world and real objects. Such displacement is a fundamental symbolic characteristic of language that allows reference to entities and locations that are not in the here-and-now. As the points took on this symbolic function, we also examined whether and how they combined with other signs to form phrases.
Comparisons across our continuum revealed a shift in the use of the manual point, starting with mostly concrete, locative meanings, and later taking on more symbolic, abstract, and displaced nominal (and possibly pronominal) functions. While the frequency of locative points remained constant across participant groups, the frequency of nominal points increased significantly. The nominal forms also became integrated into the syntax of NSL. Thus, modern NSL points participate in constructions that give them a more categorical, less context-bound flavor than the co-speech forms that are their origin.
These uses of pointing differ strikingly from those of gestures accompanying speech. Indeed, the more the sign-like uses develop, the less they show the spatial and locative meaning associated with typical pointing gestures. In accord with findings from spoken language grammaticalization, a crucial step in the transformation of pointing gestures into abstract, recombinable linguistic elements seems to be the loss of locative semantic content. Thus, from the earliest to the most developed form of NSL, we find more points that refer not to locations but to entities, and that increasingly serve linguistic functions.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0058
Iterated language learning has recently emerged as a method for experimentally exploring processes of cultural transmission in language evolution amongst humans. In this approach, miniature artificial languages are acquired by participants, and then transmitted to the next participant following a diffusion chain paradigm. This technique has revealed how an initially unstructured language can become structured and progressively easier to acquire over time (Kirby et al., 2008), adapting in a systematic way alongside the meanings being conveyed (Cornish et al., 2009). Importantly, participants are only seeking to reproduce the language given to them: there is no communication with other learners, and changes arise purely as a result of the cultural transmission process, not through deliberate invention by participants.
This previous work has focused exclusively on the cultural transmission of meanings and signals, and in particular, how the presence of structured meanings gives rise to compositional linguistic structures. The question remains, though, whether other types of cognitive constraints, in the absence of structured meanings, may affect cultural transmission via iterated learning, as suggested by Christiansen & Chater (2008). Accordingly, in this talk, we present the results from a novel iterated learning paradigm that seeks to investigate experimentally whether biases in sequence memory lead to the cultural evolution of structure, independent of any language-like task.
To isolate the effect of sequence memory constraints on cultural transmission, we ran an iterated version of a simple artificial grammar learning (AGL) task. Although AGL tasks have been primarily used to study implicit learning, the cognitive mechanisms employed in this task are likely the same as those used for artificial language learning (Perruchet & Pacton, 2006). Indeed, AGL tasks have been shown to activate the same part of Broca's area that is also involved in language (Petersson et al., 2004). Crucially, unlike previous iterated language learning tasks (e.g. Kirby et al., 2008), our iterated AGL task did not involve the transfer of meanings, only signals. Participants were exposed to a series of fifteen different consonant strings one by one, and asked to accurately reproduce them after a short delay (3000ms) following each presentation. Each string was seen six times in total, at which point the participants were asked to recall all fifteen strings in any order. The output from this final recall test was then recoded to eliminate potential typing biases and/or use of anagrams, and this became the input string-set for the next generation.
The initial string-sets used to begin each chain were carefully constructed to contain very little structure. Our predictions were two-fold: that by the end of the chains the string-sets would become 1) easier to acquire, and 2) more structured. These predictions were confirmed by standard information-theoretical measures of structure, such as entropy, and also by techniques drawn from the AGL literature (see Conway & Christiansen, 2005, and references within). In particular, measures of associative chunk strength (the amount of repetition of sub-sequences) and anchor strength (the specific sub-sequences at the beginning and end of strings) show that over time certain distributional patterns emerge in the string-sets, which facilitate learning. Strings within a set also evolve to become more similar to one another, again, facilitating better learning and recall. Thus, distributional structure relevant for language acquisition (Perruchet & Pacton, 2006) emerges across chains of learners.
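As a rough illustration of the kinds of measures mentioned above (entropy and associative chunk strength), the following sketch computes simplified proxies over toy string-sets. These are not the exact formulations used in the AGL literature (e.g. Conway & Christiansen, 2005), only illustrative stand-ins:

```python
import math
from collections import Counter

def entropy(strings):
    """Shannon entropy (bits) of the distribution over whole strings.
    Lower entropy indicates a more predictable, structured set."""
    counts = Counter(strings)
    n = len(strings)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def chunk_strength(strings, k=2):
    """Mean frequency of each string's k-grams across the whole set:
    a crude proxy for associative chunk strength (sub-sequence repetition)."""
    grams = Counter(s[i:i + k] for s in strings for i in range(len(s) - k + 1))
    per_string = []
    for s in strings:
        ks = [s[i:i + k] for i in range(len(s) - k + 1)]
        per_string.append(sum(grams[g] for g in ks) / len(ks))
    return sum(per_string) / len(per_string)

# Toy string-sets: the "structured" set reuses the bigrams XK and QV.
unstructured = ["XKQV", "TPLM", "RWZD"]
structured = ["XKXK", "XKQV", "QVXK"]
print(chunk_strength(structured) > chunk_strength(unstructured))  # → True
```

On this proxy, a string-set whose members share sub-sequences scores higher chunk strength, which is the direction of change the abstract reports across generations of learners.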
The results from our new iterated AGL task thus demonstrate that constraints on sequence memory alone, amplified by processes of cultural transmission, may have been an important factor in shaping linguistic structure during the cultural evolution of language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0059
Some of the earliest and most simplistic theories of the evolution of language have been dubbed "bow-wow" theories, positing that language sprang from utterances (forms) directly and iconically related to what they depict (meanings). In the relatively recent explosion of interest in language evolution, theories of this type have been largely forgotten or dismissed for two major reasons: (i) natural language is arbitrary, not iconic; and (ii) given that we use the auditory modality for communication, we can only iconically express meanings related to sound. I will argue that the former objection rests on uninformed assumptions about language and language evolution, while the latter may find a solution in the study of cross-modality: connections between the senses. Moreover, an iconic protolanguage would offer a compelling solution to Harnad's (1990) symbol grounding problem: linguistic symbols are ultimately grounded in our perceptual system.
Objection (i), that natural language is arbitrary, makes two major errors. First, it assumes that because all languages make use of arbitrariness, they are exclusively arbitrary. Non-arbitrariness and iconicity in language are historically understudied, but it is uncontroversial among those who do study them that almost all languages take advantage of some form of iconicity (Nuckolls, 1999; Tamariz, 2005). Directly iconic forms are most obvious in onomatopoeia, but also occur in ideophones: words vividly depicting sensory events for native speakers (Dingemanse, 2009).
Second, this objection assumes that language has always been as it is now. The concept of protolanguage allows for a smaller system preceding language. Simulations show that iconicity is favoured in small systems, but that arbitrariness has greater advantages as the system expands (Gasser, 2004). Communication emergence experiments (Theisen, Oberlander & Kirby, in press) and documented change in sign languages (Pietrandrea, 2002) demonstrate a move from iconic to arbitrary as languages grow and are used.
This leaves the second objection to "bow-wow" theories: language occurs in the auditory modality, and utterances can only be iconic through the imitation of sound. However, this objection ignores the growing literature on the complementary role gesture plays in language (Tomasello, 2008). Although the auditory modality is dominant, the gestural modality is also available for iconic expression.
Moreover, humans experience non-random cross-modal associations (e.g., Simner & Ludwig, 2009) allowing for all sensory experiences to be expressed through a single modality. These shared cross-modal biases allow both iconic expression and understanding: combined with the cognitive assets of shared intentionality and theory of mind (Tomasello, 2008), we are equipped and motivated to make inferences about the cross-modal nature of others' utterances.
In this paper, we will review the evidence for cross-modality relating to language, and present experimental evidence specifying the precise nature of the cross-modal biases involved. Ultimately, we will argue that these consistent biases provided a scaffold for an iconic spoken protolanguage.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0060
Recently, several prominent studies have developed human experimental methods to explore the emergence and evolution of simple linguistic systems. In one line of research, often termed iterated learning (Kalish, Griffiths, & Lewandowsky, 2007; Kirby, Cornish, & Smith, 2008), human participants see word-picture pairings in an invented mini-language. The learner then reproduces these pairings, which become the training material for the next learner. Because learning is imperfect, subsequent generations produce increasingly stable, systematic versions of the "language." Another line of work has explored how social coordination between participants produces simplified linguistic systems. Simon Garrod, Nicholas Fay and colleagues (Garrod, Fay, Oberlander, Lee, & MacLeod, 2007; Fay, Garrod, & Roberts, 2008) had participants work in pairs, creating line drawings to identify a referent picture among a list of candidates. After multiple rounds of interaction, participants create a simplified set of symbolic representations for images (see also Galantucci, 2005).
In the present work, we have loosely integrated both empirical approaches, and taken them in a new direction in a multiplayer communication game we call Squiggle. Players connect to the game engine via Internet, and create and interpret visual signs for real-world objects. Successful communication occurs when other players are able to match a previously created sign to its referent. As gameplay proceeds, we track the evolution of individual signs and study in real-time the transition from iconic to symbolic communication.
The Squiggle game consists of speaking trials, in which a player is presented with a picture (common objects, faces, and places) and has 4 seconds to draw a squiggle (a black-and-white line drawing created on a computer interface) such that another person would be able to match the squiggle with the picture. On a listening trial, a player is shown a previously created squiggle along with two pictures and has to select the picture they think the squiggle refers to. They are then given accuracy feedback. On speaking trials, a picture is randomly chosen to be "squiggled." On listening trials, an evolutionary algorithm, factoring in the novelty and previous comprehensibility of a squiggle in the database, determines whether a squiggle is presented to the community of users for a randomly selected picture. The most successful squiggles remain in game play, while less successful squiggles (those not reliably understood) gradually disappear. Initial data were collected from 60 players who produced about 1,400 squiggles and participated in 4,100 listening trials. Many players report the game to be very entertaining (even addictive), and several played for an hour or more.
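A selection rule of the kind described, weighting squiggles by past comprehensibility while giving novel squiggles a chance, might be sketched as follows. The field names, weights and novelty bonus are all invented, since the abstract does not specify the algorithm:

```python
import random

def selection_weight(squiggle, novelty_bonus=0.5):
    """Weight = past comprehension rate, plus a bonus that decays
    as a squiggle accumulates listening trials (hypothetical rule)."""
    trials = squiggle["shown"]
    if trials == 0:
        return novelty_bonus  # never tested: give it a chance
    return squiggle["correct"] / trials + novelty_bonus / (1 + trials)

def pick_squiggle(pool, rng=random):
    """Fitness-proportional draw of the squiggle shown on a listening trial."""
    weights = [selection_weight(s) for s in pool]
    return rng.choices(pool, weights=weights, k=1)[0]

pool = [
    {"id": 1, "shown": 10, "correct": 9},   # reliably understood
    {"id": 2, "shown": 10, "correct": 2},   # rarely understood
    {"id": 3, "shown": 0,  "correct": 0},   # novel, untested
]
print(pick_squiggle(pool)["id"])
```

Under any rule of this general shape, poorly understood squiggles are sampled ever less often and effectively disappear from play, which is the dynamic the abstract reports.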
Basic findings suggest that the same patterns observed in previous work occur in this large-scale online game. First, squiggles get simplified: the average size of a squiggle shrinks over gameplay. Second, the evolutionary algorithm produced stability for most images, despite opportunities for novel squiggles to replace them during listening trials. Third, while squiggles are drawn highly iconically at first, participants gradually use simplified squiggles that distinguish pictures from others in the same domain. Finally, we describe a referent set in which compositionality may be emerging (akin to Kirby et al., 2008).
Currently, we are extending this game-based approach to a massively multiplayer environment for the iPhone in the hope of getting thousands to play. This will permit explorations of language evolution hitherto inaccessible to human experimentation, including questions regarding social network connectivity, small-world network structure, and the emergence and interaction between human dialects. The approach therefore holds promise by allowing us to study the emergence of linguistic communities in real-time at a very large scale.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0061
In contemporary research on the origins of human communication, games are used as a methodological tool (Galantucci, 2005; De Ruiter, Noordzij, Newman-Norlund, Hagoort, & Toni, 2007; Scott-Phillips, Kirby, & Ritchie, 2009). It has turned out that the ample degrees of freedom these games allow can impede drawing firm conclusions from the empirical findings: the creativity and flexibility of experimental participants leads to behavior that is hard to predict up front, permitting only post-hoc analysis (cf. Galantucci, 2005).
In the research reported here, an attempt is made to improve this situation by designing a game that can be played by a human player against a software agent. We hypothesize that this will provide an experimental setting that is sufficiently constrained for the design of experiments on the emergence of a communication system in which the test person's behavior is quantitatively measurable.
To test our hypothesis, a game has been defined that is sufficiently complex to allow for interesting communicative interaction to arise, but that is at the same time sufficiently simple to allow the design of software agents playing different strategies in the game. By the use of software agents, there is control over one of the players in the game, effectively reducing the range of expected behaviour of the second (human) player. The Embodied Communication Game (ECG) by Scott-Phillips et al. (2009) served as starting point for the design of our game. Current results are the design, analysis and implementation of a game and software agents with different strategies. The implemented strategies were derived from theoretical considerations and from the observation of human players playing the game.
In our game, there are a Sender and a Receiver (cf. De Ruiter et al., 2007). The Sender controls a stick man situated in a box of four squares, each in one of four colours (red, yellow, blue, green). A colour might occur several times, or not at all. The stick man can travel from square to square. The Sender's goal is to communicate to the Receiver the colour of the square on which he ends his turn. The Receiver watches a replay of the Sender's moves, with all timing information removed and with the squares grayed out. After watching the replay, the Receiver has to decide on which of the four colours the Sender ended his turn. If the Receiver chooses the correct colour, both players gain a point. The players' goal is to score the highest number of points in succession.
Preliminary experiments have been performed to test whether the game indeed provides an experimental platform allowing the effective measuring of the behaviour of human players. We performed a 24 person experiment and tested two hypotheses. First, a less efficient signal is expected to be easier to recognize as a meaningful signal. Second, a highly repetitive signal is expected to be easier to detect than a signal without repetitions. We predicted higher scores for players paired with an inefficient, highly repetitive agent than for players paired with a highly efficient, not repetitive agent.
Inefficiency was measured as the number of moves used to travel from the starting square to the ending square, minus the minimal number of moves required. Repetitiveness was measured with an algorithm that detects oscillations, loops, corner oscillations and U-shapes in the Sender's moves. The hypotheses were tested by creating four agents, all of which took on the signalling role. The agents used strategies differing in inefficiency and repetitiveness, realized through different signalling methods. One agent moved as efficiently as possible; the others used combinations of oscillations and circles. Contrary to expectations, we found no significant differences in scores between the different agents. It seems that the agents' strategies were too difficult to understand; only one participant was able to score higher than would be expected by chance.
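The inefficiency measure can be illustrated with a toy version of the board. The 2×2 layout and the assumption of horizontal/vertical moves only are guesses, since the abstract does not describe the board's geometry:

```python
# Hypothetical 2x2 layout of the four coloured squares.
POS = {"red": (0, 0), "yellow": (0, 1), "blue": (1, 0), "green": (1, 1)}

def minimal_moves(start, end):
    """Minimum moves between two squares, assuming only horizontal and
    vertical moves are allowed (Manhattan distance on the 2x2 grid)."""
    (r1, c1), (r2, c2) = POS[start], POS[end]
    return abs(r1 - r2) + abs(c1 - c2)

def inefficiency(path):
    """Moves actually used minus the minimum required (the abstract's measure)."""
    used = len(path) - 1
    return used - minimal_moves(path[0], path[-1])

# A Sender oscillating between red and yellow before settling on green:
print(inefficiency(["red", "yellow", "red", "yellow", "green"]))  # → 2
```

An efficient agent scores 0 on this measure, while an oscillating, highly repetitive agent scores higher, which is exactly the contrast the four agent strategies were built around.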
We expected that replacing one participant with an agent and using a predefined signalling system would make the task easier. In fact, comparison with results from the ECG indicates that this made the task harder. This suggests that interaction is critical in constructing a shared signalling system. One possible reason is that the embodied behaviours of the signaller only make sense once the receiver has actually played that embodied role themselves. If so, researchers looking at these games may need to explore the space of designs more widely to find out what aspects of their games are crucial and what are not. Future work may need smarter agents that play both signalling and receiving roles.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0062
This paper presents work on air sacs that extends the work presented by de Boer (2008a). In that paper, and earlier (Fitch, 2000), air sacs were identified as a likely feature of our evolutionary ancestors that may have been lost because of the evolution of speech. In the meantime, a more accurate understanding of air sac acoustics has been achieved (de Boer, 2008b; Riede et al., 2008). Ape-like air sacs modify the acoustics of a vocal tract in three ways: they add a low-frequency resonance (near the resonance frequency of the air sac itself), they shift up the resonances of the vocal tract without the air sac, and they shift these resonances closer together. The question that is addressed in the present paper is how these changes influence perception of the difference between vocalizations…
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0063
For any cognitive system to behave properly within a dynamic, ever-changing natural environment, the ability to work with concepts is quite important. Humans display an ability to talk about, reason over, infer and generate new concepts with remarkable ease, but for artificial agents this is far from trivial. In the case of human concept learning, it has been acknowledged by several authors that the formation of new concepts is heavily influenced by language. As such, the words used to describe a stimulus govern the way in which it will be integrated into existing conceptual structures.
Several studies have shown that young children, in addition to learning directly from sensory exploration, rely on linguistic labels to acquire new concepts. Xu (2002), for example, demonstrated how linguistic labels help 9-month-old infants to establish a representation for different objects. Learning without linguistic labels, or with the presence of tones, sounds or emotional expressions, is not as effective. Plunkett, Hu, and Cohen (2008) came to the same conclusion in a controlled experiment in which they demonstrated how category formation in 10-month-old infants is influenced by linguistic labels. Linguistic labels also have an effect on category learning in adults; adults who learned a new category did so significantly faster and showed more robust category recall when the learning experience was accompanied by novel linguistic labels (Lupyan, Rakison, & McClelland, 2007). This shows that linguistic labels facilitate category acquisition, both in pre-linguistic infants and in adults. These insights tie in with linguistic relativism, which gained renewed attention as a series of experiments demonstrated how perception of stimuli and use of categories is influenced by language (Gilbert, Regier, Kay, & Ivry, 2006; Majid, Bowerman, Kita, Haun, & Levinson, 2004).
In order to capture these insights, we developed a computational model in which a learning agent is able to learn new concepts through linguistic interaction with a teacher. To do so, we adapted interaction based on Language Games (Steels & Belpaeme, 2005) to a teacher-learner scenario. This allows for the use of language as a steering mechanism in the acquisition of conceptual knowledge and associated meaning. A learning agent engages in a series of Language Games with a teacher, and thus gradually builds a repertoire of word-meaning mappings. To represent conceptual knowledge, we use a Conceptual Space (Gärdenfors, 2000), which consists of a geometrical representation within a number of quality dimensions with a metric, allowing for similarity measurement. Within a Conceptual Space, concepts can be stored as prototypes and associated with a lexicon of word labels. Newly perceived stimuli can be matched to existing conceptual prototypes and an appropriate linguistic expression can be found.
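Prototype matching in a Conceptual Space can be sketched as nearest-prototype classification under the space's metric. The quality dimensions, prototype values, and labels below are invented for illustration; the authors' model is considerably richer than this sketch.

```python
# Minimal sketch of matching a stimulus to the nearest stored
# prototype in a Conceptual Space with a Euclidean metric.
# Dimensions and prototype values are hypothetical examples.
import math

def nearest_label(stimulus, prototypes):
    """Return the word label whose prototype is closest to the
    stimulus under the Euclidean metric of the quality dimensions."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(prototypes, key=lambda label: dist(stimulus, prototypes[label]))

# Toy colour space: (hue, saturation, lightness) quality dimensions.
prototypes = {"red": (0.0, 0.9, 0.5), "green": (0.33, 0.8, 0.4)}
```

In a Language Game, the learner would use such a match to name a perceived stimulus and then adjust the winning prototype toward the stimulus when the teacher confirms or corrects the choice.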
We propose that the model exhibits properties comparable to how young children learn new concepts, namely language-driven acquisition, fast mapping and overgeneralization. The model is based on previous work as reported in Greeff, Delaunay, and Belpaeme (2009), in which we studied the effect of adding interactive features to the learning process. We then augmented the model with a Spreading Activation layer (Rumelhart, McClelland, & PDP Research Group, 1986) which allows for association between concepts, even when they are not perceptually similar. Typically the colour domain is used as a test case, but the model characteristics are general enough to be applied in any domain. We argue that our model is a feasible way of acquiring conceptual knowledge in a linguistic relativism spirit.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0064
Duality of patterning (DoP) refers to the reuse and recombination of a small set of meaningless units to create a near-limitless set of meaningful morphemes. We develop an experimental paradigm to explore the development of DoP. In this paradigm, human participants (each representing one generation) learn a lexicon of random visual symbols, such that influence from their own language is minimized. Using a communication system similar to that of Galantucci (2005), participants use a digital stylus to draw symbols on a computer display but, critically, the mapping from the stylus to the screen is restricted in order to prevent the use of orthographic characters or pictographs. Participants recreate the set of symbols, and these are transmitted to the next participant in a diffusion chain through a process of iterated learning. This paradigm allows us to observe the evolution of a cultural behavior such that no single participant is the driver of innovation and selection; instead the behavior is cumulatively developed across individuals.
A sample of the results is presented in Figure 1 (multiple diffusion chains were created, each consisting of 5 or more generations). Here we note that few items remain unaltered by the fifth generation. Most symbols undergo small changes that accumulate over time (e.g., item #7). Lost symbols are generally replaced with novel symbols composed of segments present in other items in the lexicon. As items are modified and replaced in this way, a small set of subunits begins to pervade the entire lexicon. After several generations, items become more similar, and these similarities lead to, and result from, the reuse and recombination of smaller units.
In subsequent experiments, we study how the pairing of meaning with each symbol influences the development of subunits based on semantic categories (such as abstract shapes and animate creatures), similarly to studies by Thiesen et al. (to appear). Results of this paradigm suggest that an interaction of two or more factors may be partly responsible for the emergence of DoP: articulation and memory constraints. As each generation alters the symbols in response to tension between the production and perception systems, symbols become more similar and more difficult to recall. To overcome this, each "generation" modifies the set via small, often unintended, innovations in order to form distinctions within the set. In Figure 1 above, we see that by Gen 5 several symbol pairs differ only in a single feature. For example, items #2 and #3 are mirror images of one another along the y-dimension, while items #9 and #3 differ in the direction of the line segments beneath the symbol.
Recently, Sandler et al. (to appear) argued that DoP is still emerging in a spontaneously created new sign language, Al-Sayyid Bedouin Sign Language (ABSL), whereas established older sign languages have contrastive phonological systems. This may be because sign languages have large articulatory spaces and can exploit iconicity, or a transparent form-meaning mapping. This paradigm may allow us to test articulatory constraints (by modifying the pad and stylus's ability to recreate symbols) and memory constraints (by varying the number of symbols in a set) in order to evaluate effects on subjects' output forms across generations.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0065
Consideration of the origin of language was deemed futile by the Société de Linguistique de Paris in 1866, and all discussion on this topic was banned. This ruling reflects the fact that spoken language leaves no trace, precluding direct study of language origins. Recently there has been a resurgence of interest in the science of language evolution (Fitch, 2007). Discussion of the origin of language has also been revived, although it has been largely speculative.
Three theoretical accounts of the origin of language have been proposed: vocalization, manual gesture and vocalization plus gesture. A vocalization account proposes that spoken language evolved from non-human primate alarm calls (MacNeilage, 1998). On this account spoken language emerged via the repeated association between sounds and their referents. In contrast, and in light of primates' greater manual (as opposed to vocal) dexterity, it has been argued that spoken language arose out of manual gestures (Corballis, 2003). On this account spoken language systems emerged from pantomime, where gestures were used to iconically communicate their referents. Unlike a situation where spoken language arose out of a full-blown gestural language, a combined account argues that gesture and vocalization supported each other's development until a fully-fledged spoken language was established (Arbib, 2005).
How are we to test the veracity of each account given that we have no fossil record? One way of overcoming the lack of linguistic fossils is by recreating a simplified historical record under laboratory conditions. That is, by having modern humans communicate a set of recurring concepts to a partner using vocalization, gesture, or vocalization plus gesture. Prohibiting participants from using their existing language system allows us to examine how different communication modalities lend themselves to the establishment of effective and efficient communication systems.
Forty-eight undergraduate students participated in exchange for payment or partial course credit. Participants completed the task in pairs. In each condition (vocalization only, gesture only, vocalization plus gesture) one participant (the director) tried to communicate a list of concepts (18 targets plus 6 distracters) to their partner (the matcher), such that their partner could identify each item from their (unordered) list. The experimental items fell into one of three categories: Object (rock, fruit, predator, water, tree, hole, mud, rain), Action (fleeing, sleeping, fighting, throwing, chasing, washing, eating, hitting) and Emotion (tired, pain, angry, hungry, disgust, danger, happy, ill). Participants played six games, using the same item set on each game (presented in a different random order). Participants were not permitted to use spoken language. All communication was recorded audiovisually.
Communication accuracy (% correct) increased across games 1-6 in each condition. However, accuracy was higher in the gesture (81.3% at game 1 and 94.4% at game 6) and vocalization plus gesture conditions (88.2% at game 1 and 95.8% at game 6) when compared to the vocalization only condition (37.5% at game 1 and 52.1% at game 6). Communication success was mediated by item type in the vocalization only condition (emotion > action > object), but not in the gesture and vocalization plus gesture conditions, where all items were communicated equally well. Communication efficiency (time to successfully communicate each item) improved (i.e., decreased) across games 1-6 in each condition. There was no difference between conditions.
In conclusion, our results indicate that gesture is a more effective means of establishing a communication system where none exists. Our findings, albeit compromised by using modern humans, lend support to the theoretical position that spoken language arose out of manual gestures.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0066
Science does not often have the opportunity to observe the genesis of a natural language. However, in the case of deaf individuals born into hearing families, language genesis can be observed as it happens. Deaf children born into non-signing hearing families often spend their early years unable to access their world's ambient language. In the most extreme case, deaf children who are unable to hear the spoken language surrounding them and who are not exposed to a conventional sign language develop their own gesture systems: homesigns. These homesign systems bear many of the hallmarks of mature languages, including stable word order, arbitrariness, and displaced talk (Goldin-Meadow, 2003, 2005) and grammatical categories (Coppola and Newport, 2005). When circumstances cause homesigners to come together for extended periods of time, the situation is ripe for the creation of a new full-blown language.
The new sign language in Nicaragua is the product of just this sort of situation. Deaf children brought their individual homesign systems with them to new schools founded in the 1970's (Polich, 2005). As they began to gesture and communicate with one another, they converged on a common system: Nicaraguan Sign Language was born. The language is now thriving, with nearly one thousand native users (Senghas & Coppola, 2001). But why did this new language grow so quickly? Though children create homesign systems in the absence of a conventional language, these systems do not arise in a vacuum. The raw materials for a homesign are the child's brain and the only accessible linguistic input in their environment: the gestures of the hearing people around them. Children around the world are born with equivalent mental resources, but the gestural input homesigners have to work with might differ.
In western cultures, hearing parents who have chosen to educate their deaf children orally are advised to use their voices whenever they communicate with their children; as a result, they rarely produce gestures without also producing speech. In contrast, hearing parents not committed to training their deaf children orally might be more open to producing gestures in the absence of speech. And gestures produced without speech have been found to display the linguistic properties of segmentation and hierarchical combination, properties not found in gestures produced along with speech (Goldin-Meadow, McNeill & Singleton, 1996). Homesigners with this more language-like gestural input might create different, perhaps more linguistically complex, gesture systems than children exposed only to gestures that co-occur with speech. Our study takes the first step in exploring this possibility by examining gestural input to homesigners in four different cultures: Nicaragua, Turkey, the USA and China.
Four deaf homesigning children from each culture were observed at play in their homes with members of their family. Sessions were videotaped and coded. All vocal and gestural utterances directed toward the child by a family member were recorded. Utterances were classified based on whether they contained gesture without speech, gesture with speech, or speech alone. All children received utterances of all types, but the proportion varied by culture. The majority of the utterances that the Nicaraguan homesigners received contained gesture without speech. In contrast, almost all of the gestures that the children in the other three cultures received were produced along with speech. The American and Turkish children also received a sizeable number of utterances containing only speech.
Because the gestures they see are produced without speech, the Nicaraguan homesigners may be getting substantially richer input than the homesigners in other cultures. These gesture-alone utterances may be complex enough to serve as the raw linguistic materials for language creation, allowing the Nicaraguan homesigners to then create more complex homesign systems than those seen in other cultures. If so, the Nicaraguan environment may have been a particularly fertile environment for the birth of a new language, which could help to explain the rapid emergence and development that has been documented in the language over the past few decades.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0067
Human language is characterized by an arbitrary association between signal and meaning and by the combinatorial assembly of larger units of meaning from smaller referential units (Lachmann, Számadó, & Bergstrom, 2001). The stabilization of honest signaling in such systems on evolutionary time scales remains an important theoretical challenge. Lachmann et al. (2001) propose that arbitrariness can be preserved by social enforcement of honest signaling, while combinatoriality requires that the punishment of deception be associated with whole messages rather than their components. Scott-Phillips (2008) suggests that social exclusion, fueled by gossip, provides a cheap mechanism for punishing deceptive signalers. But gossip about reputation can be deceptive, as all teenagers know, and hence it seems unlikely in the absence of modeling evidence that language can guarantee its own reliability and ensure honest signaling.
We propose that the reliability of signals in human language can be preserved by a group of largely involuntary, non-linguistic signals, which we call tells; as in poker, a tell is (often) a sign of deception. The sensitivity of humans to kinesic and paralinguistic channels and the use of these channels in mediating mammalian social relationships is well-known (Bateson, 2000). More recent work demonstrates that by measuring these largely involuntary channels one can predict the outcome of many social interactions in modern humans (Pentland, 2008). The effect of such involuntary channels on the evolution and stability of honest signaling in another channel has not been studied; we take a first step by analyzing the effect of tells on three game theoretic models of signaling, two standard and one combinatorial.
In the first game, two players decide whether to escalate a conflict (Gintis, 2009); the tell reveals whether the first player has signaled its relative strength honestly. The first player is under evolutionary pressure to minimize the probability p of giving a tell, and the second to become sensitive to smaller tells. The second game begins in an honest signaling equilibrium: the signaler provides information about the world via an arbitrary signal. We then change the payoffs to incentivize signaler deception. The tell alone cannot change the outcome; punishment following tells, however, preserves the signaling equilibrium for values of p > pmin, depending on the incentive and punishment for deception.
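The stability condition in the second game can be sketched with a simple expected-payoff calculation. The payoff names below (incentive D for successful deception, punishment C following a tell) and the resulting threshold are an illustrative reconstruction under the assumption that every tell is followed by punishment, not the authors' exact model.

```python
# Hedged sketch: when does deception stop paying, given that a tell
# occurs with probability p_tell and is always punished?
# D (incentive) and C (punishment) are hypothetical payoff parameters.

def deception_pays(p_tell, incentive, punishment):
    """Expected extra payoff of deceiving: gain the incentive when no
    tell is given, suffer the punishment when one is."""
    return (1 - p_tell) * incentive - p_tell * punishment

def p_min(incentive, punishment):
    """Smallest tell probability at which deception no longer pays,
    i.e. the threshold above which honest signaling is preserved."""
    return incentive / (incentive + punishment)
```

Under these assumptions, honesty is stable whenever p exceeds D / (D + C): a larger punishment or a smaller deception incentive lowers the tell probability needed to keep the signaling equilibrium honest.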
The third game is a model of combinatorial signaling. Lachmann and Bergstrom (2004) show that combinatorial signaling has a unique vulnerability to deception: individual signals can have a negative value of information (the receiver is worse off for having received that signal). This results from the fixed mechanism for combining signal components into signal meanings. Lachmann and Bergstrom (2004) suggest that this vulnerability explains why combinatorial communication has evolved so rarely. We find that honest combinatorial signaling can be stabilized when the probability of the signaler giving a tell exceeds a threshold set by the receiver's response strategy.
These games suggest that the combination of multiple signaling channels has rich strategic implications, worthy of further investigation; tells, in particular, provide a mechanism for stabilizing combinatorial communication against deception, and probably do so in concert with social enforcement. As primates, human ancestors possessed highly developed mechanisms for extracting information about social relationships from non-vocal and perhaps largely involuntary behaviors. The ability to read these behaviors could be recruited to stabilize combinatorial signaling in the primary channel, freeing our ancestors to gossip with confidence – and to talk about much more.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0068
The following sections are included:
https://doi.org/10.1142/9789814295222_0069
Given the contextual flexibility of gestural communication by apes relative to their more context-bound vocalizations (Tomasello et al., 1994), many theorists have inferred that language was initially gestural (e.g. Hewes, 1973). If language emerged from gestures, the potential to use gestures to acquire words should be present not only in young humans, but also in juvenile chimpanzees and bonobos (when they are raised in language-enriched environments). This study examines whether deictic gestures supported the vocabulary development of a chimpanzee (Panpanzee) and a bonobo (Panbanisha) as well as analyzing their developing ability to pair gestures with signals of communicative intent.
Both deaf and hearing children's gestures are initially context bound (deictic) before becoming progressively more decontextualized (Caselli, 1983). Toddlers are significantly more likely to first indicate an object through deictic gesture and only later to name it than to first name and later gesture towards an object (Iverson & Goldin-Meadow, 2005). Do juvenile apes also progress from indicating objects through gesture to decontextualized reference through lexigrams (arbitrary visual symbols signifying words)? Qualitative analysis suggests that human children and apes combine a deictic gesture and a word before combining two words (Greenfield et al., 2008). Using a sampling method developed with human children (Iverson & Goldin-Meadow, 2005), the present study aims to determine if gesture plays a role in the emergence of language across the clade. We predict that like human toddlers, apes will first rely on gesture and only later use words to name objects.
Sampling from video footage of the daily interactions of Panpanzee and Panbanisha from 10 to 24 months of age, we are coding all gestures and lexigrams that are demonstrated during 2 hours a month for each ape. As in observational studies of human linguistic development, this sampling method provides an index of when gestures emerge relative to words. Once we have coded 30 hours of video for each ape, we will use chi-square analyses to determine whether the number of items first referenced by deictic gesture exceeds the number first evidenced as lexigrams. Preliminary analyses based on 5 hours of video coding for each ape suggest that deixis may indeed precede lexigram use by language-trained apes. Panpanzee first referred to 7 items with deictic gesture and 2 items with lexigrams, while Panbanisha referred to 5 items first with deictic gesture and 3 with lexigrams.
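The planned chi-square analysis can be illustrated with the preliminary counts above. This is only a sketch of a one-sample goodness-of-fit test against a uniform expectation; the study's full analysis on 30 hours of coding may use a different test or correction.

```python
# Illustrative goodness-of-fit statistic for gesture-first vs
# lexigram-first item counts, against a uniform expected split.

def chi_square_uniform(counts):
    """Chi-square statistic: sum of (observed - expected)^2 / expected,
    with the expected count uniform across categories."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Panpanzee's preliminary counts: 7 gesture-first, 2 lexigram-first.
stat = chi_square_uniform([7, 2])  # about 2.78
```

With one degree of freedom the 5% critical value is 3.84, so on these 5-hour preliminary counts alone the gesture-first advantage is suggestive but not yet significant, which is consistent with the authors' decision to code 30 hours before testing.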
Because deixis may be a mechanism that helps members of the clade learn to use words, understanding the ways that deixis is marked as communicative across phylogeny and ontogeny is also important. Intentional communication is often defined by the presence of one of the following behaviors: attention getting behaviors, gaze alternation, or persistence. Because human deixis only gradually takes on communicative elements across ontogeny (Masur, 1983), we are also analyzing how often deictic gestures and lexigrams emitted by Panpanzee and Panbanisha co-occur with communicative signals during the sampled time frames.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0070
For the last thirty years, species-comparative approaches to the study of the evolution of language have shown a strong bias toward uncovering differences between apes and humans in their symbolic and communicative capacities. Flying in the face of the well-established fact that chimpanzees and humans share 99% of their genes, researchers and scholars seem motivated to emphasize the uniqueness of human language at the expense of understanding its origins and foundations in primate communication. Another tendency in the field is to assume that only spontaneously manifest symbolic capacities are of interest and to ignore learning mechanisms that rely on social or nonlinguistic stimuli. In order to counter these trends, I will be presenting an alternative theoretical and methodological framework.
Its first tenet is that cladistic analysis (comparative study of species descended from a common ancestor) is an important tool for understanding the primate foundation of human language and communication. This is because similarity of a characteristic across a clade (a phylogenetically related group of species with a common ancestor) indicates that it is likely to be an ancestral trait that is part of the genetic heritage of all members of the clade. Therefore, comparing behaviors across a clade is a research strategy that can uncover ancestral behavioral capabilities that served as the foundation for modern capabilities – in this case for language and communication. This theoretically based methodological point is particularly important because language, like other behavioral capabilities, does not leave fossils.
The second tenet is that earlier stages of development are more similar among members of a clade than are later stages of development. Therefore, comparing behaviors in very young members of each species across the clade is most likely to uncover evolutionary foundations for a particular system, in this case, language and communication.
Third, later capabilities build on earlier ones in both ontogeny and phylogeny. Therefore, the analysis of developmental transitions from one stage to another is useful in understanding both ontogeny and phylogeny. We therefore will examine behavioral development – in this case the development of language and communication – over time in each species. But our analysis does not focus on time or age per se; instead it emphasizes the transitional learning mechanisms that drive symbolic and communicative development from one level to the next.
Fourth, earlier stages of development are more universal within a species than are later stages of development. This is because both ontogeny and phylogeny build on what is already there rather than erasing it, so both speciation and individual differences in development tend to be found in later rather than earlier stages of development. Therefore, sample selection and large sample size within each species are less important than when studying older members of a species: cross-species similarities early in life are likely to be robust across a wide range of species members.
Fifth, language evolved for communication. Therefore, the social stimuli provided by conversation should provide important developmental learning mechanisms.
Sixth, language evolved out of nonlinguistic behaviors and capacities. Therefore, nonlinguistic communication, specifically gesture, is a good candidate for a developmental learning mechanism.
These principles have animated a series of cross-species comparative studies that demonstrate common transitional mechanisms in the early development of symbolic communication and representation across the clade consisting of bonobo, chimpanzee, and human. According to the principle of cladistic analysis, these transitional mechanisms then become good candidates for mechanisms that lie at the very foundation of language evolution. While we rely heavily on a small number of members of each species – mainly because highly rare symbol-enculturated apes are at the heart of the research designs – the focus on early development means that our findings may not only constitute an existence proof, but also index species-typical capabilities held in common by all three species in the clade. We will provide evidence that three kinds of transitional mechanisms - gesture, dialogue, and social scaffolding - are utilized across the clade in the ontogeny of symbolic and communicative capacities. More specifically, we will show how, in bonobo, chimpanzee, and child, each of the three mechanisms leads to a similar developmental progression in symbolic representation or communication. We see these commonalities as a foundation from which human language, communication, and representation evolved after the phylogenetic split of the three species five million years ago. Rather than trying to figure out where human language went in its evolution, we are trying to figure out where it started. Learning where it started can give us critical information about the evolution of its most basic and robust characteristics. Learning where it started is also essential for understanding where human language has gone in the last five million years. Thus, it is not an either-or situation. Instead, the study of cross-species similarities complements and provides a context for the study of cross-species differences in language evolution.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0071
There are many fascinating routes to studying the evolution of language, including explorations of the fossil record, game theoretic modeling, historical analyses, and comparative experiments on nonhuman animals. All have contributed in various ways to our understanding of the origins and subsequent evolution of language. And yet, there is a nagging feeling that much of the evidence is circumstantial, and that we will never deeply understand how our capacity for language started, and what selective pressures, if any, led to subsequent transformations in both structure and function.
In this talk I explore the evolution of our linguistic competence, and in particular, the kinds of capacities that animals may have in the absence of a comparable capacity to use these abilities in communication. I then use these results, and others from the field, to argue that though we have rich theoretical frameworks for exploring problems of language evolution, our methods are woefully deficient. As such, we need to carefully consider whether we will ever have the empirical goods to address the theoretical perspectives sketched.
I begin in Part 1 of the talk by laying out a theoretical framework that I find useful for thinking about questions of evolutionary origins and change. In Part 2 I turn to a set of comparative results that I believe are relevant to this framework. In particular, I discuss recent studies of nonhuman primates focusing on spontaneously available (i.e., untrained) competences for understanding symbols and simple rules, capacities that did not evolve for language but were subsequently used by the language faculty. In Part 3, I turn to a new set of findings on humans, showing the interface between semantics and syntax, and in particular, revealing the constraints that evolutionarily ancient, domain-general computations impose on domain-specific knowledge, and in particular, language-specific computations. In Part 4, I end by critically evaluating the kinds of contributions that comparative studies — including especially, my own — have provided to our understanding of language evolution. I end with a rather pessimistic conclusion: due to methodological limitations, comparative studies cannot presently answer most of the fundamental questions concerning the syntactic and semantic structures that are ubiquitous in language, and consequently, we may forever be stuck in mystery and speculation concerning the origin and subsequent evolution of language.
Part 1. The theoretical framework I develop is based on the original paper I wrote with Chomsky and Fitch, but with subsequent clarifications and extensions. In particular, I argue that from an evolutionary comparative perspective, it makes sense to ask which aspects of our language faculty are shared with other animals and which are unique, and of those capacities that are unique to our species, which are unique to language. This framework, when properly articulated, is not committed to any particular view of language, including theories targeting representational structure (e.g., minimalism) and evolutionary processes (e.g., adaptation and functional design).
Part 2. I begin by discussing new experiments on rhesus monkeys that explore how our capacity to comprehend and use symbols may have emerged, and in particular, how our capacity to understand the duality of pictures as both physical objects and representations of something else arose in evolution and develops in human ontogeny. Based on a highly simplified task, results show that rhesus discriminate pictures of food from real food and pictures of food from pictures of non-food, and that this discrimination is based on visual inspection alone (i.e., no contact with the pictures) and occurs in the absence of prior experience with pictures, training, or reward. Thus, important aspects of our competence to recognize the symbolic duality of pictures evolved before our uniquely human capacity to create symbols. I then turn to a series of experiments that explore the capacity of primates to extract rule-like regularities from a structured input, with the aim of targeting some of the core syntactic properties of language. In one experiment with captive tamarins, I show that they have the capacity to acquire a simple affixation rule, distinguishing structures that contain specific prefixes from those that contain specific suffixes. In a second experiment, I show that captive chimpanzees share with human adults the capacity to spontaneously extract both category information and ordering information from a structured input. More specifically, both species primarily encoded positional information from the sequence (i.e., items that occurred at the sequence edges), but generally failed to encode co-occurrence statistics. This suggests that a mechanism to encode positional information from sequences is present in both chimpanzees and humans, and may represent the default in the absence of training and with brief exposure.
As many grammatical regularities exhibit properties of this mechanism, it may be recruited by language, and constrain the form certain grammatical regularities take.
Part 3. When children acquire language, they acquire not only the syntactic categories (e.g., nouns, verbs, determiners), but the rules by which such categories are ordered and arranged. Here I present the results of an experiment with human adults looking at the capacity to acquire a simple duplication rule (e.g., AAB or ABB), where the categories are syntactic (i.e., nouns and verbs). Although subjects readily processed the categories and learned repetition-patterns over non-syntactic categories (e.g., animal-animal-clothes), they failed to learn the repetition-pattern over syntactic categories (e.g., Noun-Noun-Verb, as in "apple-car-eat"), even when explicitly instructed to look for it. Further experiments revealed that subjects successfully learned the repetition-patterns only when they were consistent with syntactically possible structures, irrespective of whether these structures were attested in English or in other languages unknown to the participants. When the repetition-patterns did not match such syntactically possible structures, subjects failed to learn them. Results suggest that when human adults hear a string of nouns and verbs, their syntactic system obligatorily attempts an interpretation (e.g., in terms of subjects, objects and predicates). As a result, subjects fail to perceive the simpler pattern of repetitions --- a form of syntax-induced pattern deafness that is reminiscent of how other perceptual systems force specific interpretations upon sensory input.
Part 4. I conclude by summarizing the results presented, and critically evaluating my own comparative studies. In brief, though I think it is probably accurate to say that animals have evolved domain-general competences that are shared with humans, and these competences are relevant to the faculty of language in the broad sense, current methods for extracting the specific mechanisms in play are weak, and thus, not up to the job of testing between competing hypotheses. This puts us in a difficult position because comparative studies may simply be unable to generate the kind of data that are necessary to explore the evolution of language, and in particular, those aspects that are part of the broad language faculty and those which are part of the narrow faculty.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0072
Human languages are in a constant state of change. New words are constantly being invented and old ones lost. Words change their meanings and pronunciations over time. In the extreme case, whole new languages are created, either through the merging of unrelated languages or the splitting of one language into many variants. These processes are known as creolisation and dialect formation respectively. The former is perhaps best known from those creoles which originated in the period of European colonization and slave trade and which still exist today in the Caribbean. In contrast, the latter process can be observed in a great many dialects and languages worldwide, such as the descent of the Romance languages from Latin.
The use of computer modelling to study aspects of the evolution of language has become established over the past couple of decades. In particular, the naming games introduced by Steels (1995) have been very influential. Using this model, Steels showed how a shared vocabulary can emerge in a group of agents through a series of conversations involving two agents. The more times a word is used for an object, the stronger the association between the object and the word will become. New words can be added and words that do not have communicative success are removed.
Subsequently, Steels (1997) performed a limited evaluation of the role of distance in his game. In that paper, the agents are divided into clusters and eventually develop a stable vocabulary within each cluster while at the same time becoming familiar with the words used in the other clusters. In effect they become bilingual. Our work extends this model to study what happens when clusters merge or divide, simulating the conditions in which creoles or dialects form respectively.
In order to introduce multiple populations into the model of Steels, we place agents in a two dimensional environment in which agents are more likely to talk to agents closer to them. Additionally, we allow for this environment to be split in half and ban all communication between the halves, creating two subpopulations which are linguistically isolated from each other.
Simulations of language contact were performed by beginning in a split environment and then, after allowing each group to interact for a specific amount of time, letting the entire population interact together. For small populations, such simulations did indeed lead to the formation of "creole" languages, with words in the final combined language taken from both the languages spoken by the initial subpopulations. However, as the groups of agents become larger, the agents have trouble converging to a single language. Neither name dominates: in conversations about an object, the agents end up using one of the two names developed in the groups, but they never settle on which one they prefer. This may be seen as more analogous to the creation of bilingualism in the agents.
By reversing the earlier process, first allowing the whole population to interact before splitting it into two subpopulations, the divergence of a common language can be studied. The results show that the similarity of the resulting group languages is strongly related to the amount of time the agents have spent together in a single group at the beginning: the longer they have spent together, the higher the probability that they end up using the same language.
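The convergence, split, and contact dynamics described above can be sketched with a minimal naming game. The sketch below is an illustration under simplifying assumptions (a single shared object, a win-stay/prune update on success, no two-dimensional environment), not Steels' actual model; all function and variable names are hypothetical.

```python
import random

class Agent:
    """Holds word -> association score for a single shared object."""
    def __init__(self):
        self.lexicon = {}

    def preferred_word(self):
        # Strongest-associated word, or None before the first conversation.
        return max(self.lexicon, key=self.lexicon.get) if self.lexicon else None

def play(speaker, hearer):
    """One conversation: reinforce on success, adopt weakly on failure."""
    word = speaker.preferred_word()
    if word is None:                        # speaker invents a new word
        word = f"w{random.randrange(10**6)}"
        speaker.lexicon[word] = 1
    if word in hearer.lexicon:              # success: reinforce, prune rival words
        speaker.lexicon = {word: speaker.lexicon.get(word, 0) + 1}
        hearer.lexicon = {word: hearer.lexicon[word] + 1}
    else:                                   # failure: hearer weakly adopts the word
        hearer.lexicon[word] = 1

def rounds(population, n):
    for _ in range(n):
        speaker, hearer = random.sample(population, 2)
        play(speaker, hearer)

random.seed(1)
group_a = [Agent() for _ in range(10)]
group_b = [Agent() for _ in range(10)]
rounds(group_a, 20000)    # isolated phase: each group converges on its own word
rounds(group_b, 20000)
words_a = {a.preferred_word() for a in group_a}
words_b = {b.preferred_word() for b in group_b}
merged = group_a + group_b
rounds(merged, 20000)     # contact phase: the two group vocabularies compete
final = {agent.preferred_word() for agent in merged}
```

With small groups such as these, the contact phase typically ends with one of the two group words winning out; the bilingual deadlock reported in the abstract for larger populations would require scaling up the group sizes.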
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0073
It has been shown that larynx lowering is probably not the key factor leading to the emergence of speech in humans. Rather, it has been argued that larynx lowering is a process which allows males to produce lower-frequency sounds, thereby giving females the impression of a bigger (more powerful?) potential sexual partner (Fitch 2000, 2002, 2005; Ohala, 2000). Although the process of larynx lowering is found in a number of mammal species (deer, lions) with apparently a similar function, that of providing males with "impressive" vocalizations exaggerating their size (Fitch and Reby, 2001), it is not found in our close non-human primate relatives. Why did our non-human primate cousins not develop a lowered larynx? One possible explanation is that these species possessed air sacs which fulfilled the same function. These laryngeal air sacs have the capacity to produce lower-frequency sounds (de Boer, 2008; Gautier, 1971), but they are also able to produce louder sounds.
Our hypothesis is that the male common ancestor of non-human and human primates had laryngeal air sacs which were replaced by a lowered larynx in the line which led to Homo sapiens. Why did air sacs disappear? We propose an ecologically induced explanation. In a forest environment, it is very important to produce loud sounds for two reasons. First, forest environments often render difficult the visual identification of conspecifics. Second, sound propagation is dampened in forest environments. In such an ecological context air sacs were quite appropriate to produce both lower-frequency and louder sounds. When our ancestors left the forested environment, loud sounds were no longer necessary. In a savannah type of environment sounds propagate much more efficiently than in a forest and, in addition, it is easy to visually perceive other individuals of your group. However, in order to preserve the "exaggerated male size" signal, the disappearing air sacs had to be replaced by another mechanism. The ancestors of Homo sapiens selected a process used by a number of mammal species: the lowering of the larynx. Hyoid bone fossil data seem to indicate that Australopithecus afarensis had air sacs (Alemseged et al. 2006) but Homo heidelbergensis (Martinez et al. 2008) and Neandertals did not.
We will provide supporting evidence for our hypothesis through an examination of the relationship between the presence versus absence of vocal sacs among related species in contrasting ecological environments. Air sacs are present in species inhabiting forest environments, while they have disappeared in closely related species inhabiting savannah environments. The role of sexual dimorphism will also be discussed.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0074
Duality of patterning – the existence of a meaningless phonological level as well as a meaningful level of morphemes and words – is a fundamental design feature of human language (Hockett 1960). It seems reasonable to speculate that a combinatorial system of meaningless elements emerged from holistic forms, facilitating the creation of a sizable vocabulary of perceptually distinct words.
The first deaf signers of Al-Sayyid Bedouin Sign Language (ABSL) were born about 75 years ago, and the language is now used by a community of about 150 deaf people and many of the 4,000 hearing people in the village. This language has regular word order (Sandler et al 2005) and prosodically marked phrasal constituents (Sandler et al 2008). However, we have not yet encountered minimal pairs, and we observe glaring variation in sign production across signers, of a kind that would blur phonological category boundaries of more established sign languages. It appears that signers aim for holistic iconic prototypes, and do not rely on discrete, meaningless combinatorial units to form signs.
The present study supports these observations by carefully measuring variation across signers in ABSL, and comparing it with sign productions in two other, more established sign languages, American Sign Language (ASL), and Israeli Sign Language (ISL), both of which have been shown to have phonological organization (Stokoe 1960; Meir and Sandler 2008). 47 features of the three major categories of sign formation – hand shape, location on or near the body, and type of hand movement – are coded for 15 signs as signed by 10 signers in each language. The results show more variation in nearly every subcategory in ABSL than in the other two languages (Israel 2009). Taken together with other criteria for phonological organization, the results support the claim that this new language does not yet have a level of structure consisting of discrete, meaningless, combinatorial units.
The cline of variation is consistently ABSL > ISL > ASL. ASL is the oldest of the three languages with the largest community, while ISL is about the same age as ABSL but developed through creolization in a larger and more diverse community. Our results indicate that the emergence of duality of patterning is gradual, and depends in part on social factors such as age, size, and diversity of the language community. If its emergence is gradual in a modern human community, it is reasonable to infer that the same was true in evolution.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0075
Deacon (2003) proposed a novel mechanism by which a biological property gains complexity. In his theory, the term "masking" is used to refer to an environmental change that masks a particular selection pressure. Similarly, the term "unmasking" is used to indicate a process by which a selection pressure becomes effective. Birdsong conveys two messages: species identification and individual vigor. To ensure the former, the signal should satisfy one or more species-specific features. To ensure the latter, the signal should reflect individual characteristics related to vigor. The Bengalese finch is a domesticated strain of the white-rumped munia, an endemic finch of East Asia. The process of domestication occurred about 250 years ago, comprising approximately 500 generations. Bengalese finches sing syntactically and phonologically complex songs, whereas white-rumped munias sing simpler songs in both domains (Okanoya, 2004). Female Bengalese finches engage in more breeding behaviors when stimulated by complex songs, suggesting that song complexity in Bengalese finches evolved by sexual selection (Okanoya, 2004). However, the model of Ritchie and Kirby (2005) demonstrated the possibility that domestication alone could account for the evolution of song complexity by relaxing selection pressures. In their study, domestication was used as a condition to disable selection pressure for species identification. If a system arises in which the need for species identification exhibits natural variation, that system can be used to directly test Deacon's theory.
When several species of birds with similar plumage share the same environment (sympatric environment), birdsong should faithfully convey the species identification signal to avoid infertile hybridization. In Taiwan, white-rumped munias form mixed colonies with a closely related species, the spotted munia. We hypothesized that the rate of sympatry would affect song complexity. We conducted a field study at three locations in Taiwan: Huben (H), Mataian (M), and Taipei (T), where natural populations of white-rumped munias occur. During the summers of 2006-2008, we captured white-rumped munias using mist nets and recorded male songs. Totals of 30 (H), 23 (M), and 17 (T) male white-rumped munias were captured. The number of spotted munias at each location was also counted. Song linearity, an index of song simplicity, was calculated as (the number of song notes)/(the number of song note transition types). The value of this index equals one when the song sequence is completely linear and decreases when the song is less deterministic. The index is 1/N, where N is the number of song notes, when the song is completely random. The rate of sympatry was calculated as (the number of mixed flocks)/(the number of total flocks), where mixed flocks were those containing both white-rumped and spotted munias. We found that the rate of sympatry was lowest at Huben and higher at Taipei and Mataian (H < T, M; Fig. 1a). Song linearity was lowest (more complex) at Huben and greater at Taipei and Mataian (H < T, M; Fig. 1b). Therefore, the rate of sympatry corresponded with song linearity. These results were consistent with the prediction that lower pressure for species identification leads to higher complexity in birdsong.
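As a concrete reading of the linearity formula, the sketch below computes the index from a note sequence. It assumes that "number of song notes" means distinct note types, that transitions are counted as distinct ordered note pairs, and that a start marker is prepended so a perfectly linear song scores exactly 1.0; these bookkeeping choices are not specified in the abstract and are illustrative assumptions.

```python
def song_linearity(song):
    """Linearity = (# distinct note types) / (# distinct transition types).
    Equals 1.0 for a fully deterministic (linear) song and approaches
    1/N for a fully random song over N note types."""
    seq = ["<start>"] + list(song)          # start marker: an assumption here
    note_types = set(song)
    transition_types = set(zip(seq, seq[1:]))
    return len(note_types) / len(transition_types)

print(song_linearity("abcd"))   # perfectly linear song -> 1.0
print(song_linearity("abba"))   # less deterministic song -> 0.5
```

Under this reading, a lower index means more distinct transition types per note type, i.e. a less deterministic and hence more complex song, matching the Huben result.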
Although other factors including sexual selection (Okanoya, 2004) are undoubtedly involved, the process of masking, as demonstrated here, may account for some proportion of signal evolution in Bengalese finches, as suggested by Deacon (2003). Similar mechanisms should be considered when examining the evolution of human language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0076
The idea that language first emerged as a form of visible bodily action – the "gesture first" view – has attracted supporters since the eighteenth century. Since the modern revival of this idea by Gordon Hewes in 1973, it has gained many adherents, with Michael Corballis and Michael Tomasello (2008) among the most recent. It is attractive, perhaps, because we can observe, through such processes as the conventionalisation of pantomime, the emergence of language-like systems, as may be seen when primary sign languages come into being. However, the problem a "gesture first" theory always confronts is that modern languages are spoken and humans are highly specialised for speech. This means that "gesture first" advocates have to account for a switch from gesture to speech. None of the modern "gesture first" theorists offer a satisfactory account of this, while opponents of this position have argued that a "gesture stage" is unnecessary. Advocates of a "speech first" position, on the other hand, overlook or downplay the fact that when speakers engage in utterance, always to some extent, and often to a considerable extent, they produce an ensemble of speech and gesture. Studies of the way in which gesture is involved in utterance construction show that gestures function like partners with speech in the creation of coherently meaningful utterances. Furthermore, gestures enter into the fashioning of utterances in many different ways. They express emphasis and emotion, but they also express concepts, often by means similar to those found in sign languages. They also express meta-discursive and pragmatic aspects of the utterance (Kendon 2004). Neither "gesture first" advocates nor "speech only" advocates have considered seriously this co-involvement of speech and visible bodily action in utterance construction.
In this paper, this partnership of speech and gesture in utterance will be our starting point. Looking at the various forms of gestural expression, we note that referential gestures, whether deictic or representational, and gestures with pragmatic functions can often be seen as derived from manipulatory actions, as if the speaker is acting on, or in relation to, objects in a virtual physical environment, including other animate beings as objects. Appealing to MacNeilage's (2008) suggestion that the actions of speech are exaptations of the mouth actions of chewing and other eating actions, and considering the evidence from many different studies that suggest hand-mouth synergies of great phylogenetic antiquity (Gentilucci and Corballis 2006), we propose that the action systems used in speaking and gesturing are descended by modification from the hand-mouth action ensembles employed as the animal manipulates, modifies or appropriates and ingests parts of its environment, as it seeks for and grasps and processes food, or manages, manipulates and modifies its environment. These systems were recruited to symbolic functions when practical actions acted out in a "vicarious way" (as when an animal must decide between more than one course of action) were recognised by conspecifics as "as if" actions. It was this that made symbolic dialogues possible (Kendon 1991). On this view, important aspects of language are only secondarily communicative in origin, and the co-involvement of hands and mouth in utterance production is accounted for by the development of shared control networks which originated in strategies that developed in relation to food getting and environmental manipulation. Language emerged, thus, from an ensemble of oral and manual actions, an ensemble still observed in utterance production in modern humans.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0077
The following sections are included:
https://doi.org/10.1142/9789814295222_0078
The 'cooperative breeding' hypothesis (Hrdy, 2009) and the 'grandmother' hypothesis (O'Connell, Hawkes and Blurton Jones, 1999) have combined to establish a new orthodoxy that Homo erectus social structures were fundamentally female kin-bonded (and see Knight, 2008; Opie and Power, 2008). What are the implications for the evolution of language?
Hrdy accounts for the evolution of special hypersocial tendencies in genus Homo by arguing that strategies of cooperative breeding give rise to intersubjectivity – a willingness to share what I am thinking with you, and seek to know what you are thinking of my thoughts (Tomasello et al., 2005). Other great apes, especially Machiavellian chimpanzees, are capable of mind-reading, but lack any intention of cooperating in allowing their own minds to be read. Precisely when an evolving hominin mother lets others take her baby off her hands, says Hrdy, selection pressures for two-way mind-reading are set up. The mother must be socially adept to elicit support and judge motivations of the alloparent towards her offspring; the baby, once handed over, must be monitoring carefully 'where's mum gone?', at the same time as probing for signs about the intentions of the new carer; while the alloparent, necessarily a relative in the original scenario, adopts a quasi-maternal role. A whole array of behaviours sprang up to help this variegated triad of mum, baby and allocarer to keep in contact: mutual gazing, babbling, kissfeeding. Hyperpossessive great ape mothers never needed such elaborate bonding mechanisms. The only other primates which have been heard babbling are tiny South American marmosets and tamarins, whose breeding systems are fully cooperative, involving shared care and provisioning of infants by allocarers. The ancestors of the first language-users, says Hrdy, 'were already far more interested in others' intentions and needs than chimpanzees are' (2009: 38).
While the cooperative breeding/grandmothering model offers a plausible explanation for language-ready brains, these conditions may be necessary but not sufficient for the historical evolution of actual languages. In proposing 'emotional modernity' among H. erectus, Hrdy makes clear that she is discussing the prequel to language and symbolic culture, refraining from speculation about the transition to full cognitive and linguistic modernity. What are the problems that are not solved by this scenario? Shared infant care will not increase fitness unless the offspring's psychological development prepares it for gaining reproductive success in adult life. Hrdy draws on the analogy of cooperative breeding systems among callitrichids, examining male caring behaviours, but offers little discussion of how males among early Homo would have integrated into this framework of mutual understanding. How did problems of male dominance, violence and competition for mates get resolved? A cooperative childcare system implies quasi-sibling solidarity among children. Unless that principle of solidarity can be extended into adulthood, encompassing sexual relationships, selection for intersubjectivity will be limited and language will not evolve.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0079
Brain imaging techniques have greatly improved our understanding of cerebral language processing. Nevertheless, finding the neural correlates of linguistic operations seems to be no less a quest than in the field of consciousness, for which the mapping between psychological experiences and neural activation has been labelled the hard problem (Chalmers, 1995).
Poeppel and Embick (2005) argued that psycholinguistics and the neurosciences face a granularity mismatch and an ontological incommensurability problem. This lack of a common basis concerns both the granularity levels at which the two disciplines investigate processes and the fundamental elements used, and consequently prevents the formulation of theoretically motivated, biologically grounded and computationally explicit descriptions of language processes in the brain. To better align the two areas, Poeppel and Embick suggest using computational models whose operations are plausibly executable by neural assemblies and represent subroutines of linguistic computation.
We appreciate Poeppel and Embick's analysis and support their proposal to identify computationally explicit processes. Resorting to computational linguistics is, however, no guarantee that biologically plausible implementations will be identified. Artificial neural networks can be built to mimic human behaviour without being based on a biology-inspired architecture, as exemplified by the Rumelhart & McClelland (1986) model of English past tense acquisition.
In order to aid the identification of the relevant atomic processes involved in language perception and production, we evaluate a model of language evolution against findings from agrammatism research. Johansson (2005) suggested working backwards from current grammars through a sequence of possible protogrammars by removing pivotal structural features. Johansson's model consists of a four-step hierarchy, which supposedly reflects the appearance of fundamental properties in human language, for the existence of which there is wide agreement across different grammar theories: the emergence of structural constraints, most notably with respect to word order; the emergence of hierarchies, i.e. the occurrence of structured units within larger-scale structures; the emergence of flexibility in the transformational sense, allowing structures to be moved around; and lastly, the occurrence of recursion, proposed to be the only domain-specific computational capacity involved in language processing (Hauser, Chomsky & Fitch, 2002, but see also e.g. Kinsella, 2009).
We hypothesise that the biologically oldest capacities recruited for language processing are least likely to suffer selective impairments from brain injuries as they presumably evolved pre-linguistically. More recent abilities, notably the proposed capacity for recursion, are more likely to have been selected specifically for linguistic purposes. In contrast to their biologically older counterparts, such capacities may be more vulnerable to break-down and affect specific aspects of language production or perception in isolation.
We use findings from studies investigating the patterns of grammar deficits in aphasia patients to assess the proposed hypothesis and to demonstrate how far aphasiology can contribute to the quest of defining linguistic capacities that are evolutionarily layered and real in both linguistic-computational and neurological terms.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0080
In attempting to explain how words first emerged in language evolution, it is often assumed that words map passively onto a pre-existing static conceptual system (e.g., Hurford, 2007). However, human thought is very flexible: we can conceptualise the same referent in many different ways, depending on the context and our goal. For instance, we may conceptualise a lion as a dangerous beast, a thrilling sight, an unusual dinner, etc. Moreover, concepts may have different scope for different people. As a result, one of the main functions of words may be to provide an efficient way to share and coordinate conceptualisations with others. Indeed, to the extent that people do conceptualise things differently, it is not easy to imagine how such gaps could be bridged without the help of words and language, and previous work has shown that verbal labels can enhance category learning (Lupyan, Rakison, & McClelland, 2007). Either way, looking into this issue could contribute to our understanding of how much the advent of public symbols transformed human cognition (Deacon, 1997). The question under empirical investigation here is thus the importance of words to conceptual coordination.
Before this issue can be addressed experimentally, there is an immediate methodological challenge. In particular, in order to assess the role of words in conceptual coordination, we need a way of getting at experimental participants' concepts without relying on words. The solution that will be adopted here is the use of free classification tasks (Malt, Sloman, Gennari, Shi, & Wang, 1999): participants partition a set of items into groups, and these groups are assumed to be referential snapshots of their concepts.
I present here an experiment within the free classification framework which pits the importance of words against that of referential information. Pairs of native English speakers conducted a sequence of thirty free classification tasks involving a fluid domain of triangle-like stimuli. In each task, participants had to individually sort a set of eleven stimuli into two categories, and to label their categories. Their goal was to partition the stimuli into the same (or as similar as possible) two groups as their partner (irrespective of the labels used). Participants could not interact or communicate freely during the experiment, but they did receive feedback at the end of each task. All participants were shown their partnership's joint task score, and, depending on the condition that they were assigned to, either their partner's category groupings, category labels, both or neither (thus the experiment had a 2×2 between-pairs design).
The results revealed several patterns. First, although label agreement within pairs correlated with higher task scores, there were plenty of exceptions, where the participants used the same labels but differed in their category groupings, or achieved identical groupings despite using different labels. Second, averaging across the tasks, grouping feedback resulted in significantly higher scores, and these were higher still if accompanied by label feedback as well. On the other hand, label feedback on its own did not result in higher scores. Third, although there was a lot of fluctuation in scores even within pairs, they tended to go up over time, except in the groupings-only condition, where they stayed about the same. Due to these last two patterns, by the end of the experiment, the scores in the "both" condition were significantly higher than in the other three conditions. Thus it was only in the condition with both kinds of feedback that pairs started off with relatively high scores and improved over time.
Together, these results suggest that rich referential information is more useful than words for conceptual coordination, but that high levels of coordination can only be achieved when both types are available. Of course, it would be absurd to suggest that early hominins huddled together and explicitly sorted things into categories to coordinate their concepts. However, the current results shed light on the extent to which language may have revolutionised human cognition. Language does not seem to be crucial for conceptual coordination, but it does enhance it.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0081
Human language is too complex to have emerged in the absence of any evolutionary precursors, which suggests that primitive forms of pre-linguistic communication can be found in animals. Whether this was based on acoustic or gestural communication is an ongoing debate. An argument against a vocal origin is the absence of vocal flexibility and complexity in non-human primates. However, human language is primarily a vocal behaviour, and vocal flexibility – as assessed by vocal plasticity, semanticity, compositionality, and intentional signalling – is not a uniquely human trait. Our research focuses on the precursors of the various types of vocal flexibility in forest guenons (Cercopithecus spp.). The vocal tract of nonhuman primates is in principle capable of producing speech-like sounds (Riede et al. 2005), and one puzzle is why nonhuman primates do not make greater use of this capacity. Instead, primates produce a finite range of calls that develop under strong genetic control. Within some call types, however, some flexibility can be seen at the level of call morphology, as demonstrated, for example, by socially-determined vocal plasticity and vocal sharing in Campbell's monkey contact calls used in conversation-like, socially controlled vocal exchanges (Lemasson & Hausberger 2004; Lemasson et al. 2010). Second, many primates produce acoustically distinct calls to specific external events, including Diana and Campbell's monkeys. In both species, adult males and females produce acoustically different alarm calls to the same predator (Ouattara et al. 2009a), but the calls are meaningful to others, both within and between species. Alarm calls are not only predator-specific but also vary depending on the modality by which the predator is discovered, i.e. the visual or acoustic domain. In Campbell's monkeys, females produce a complex alarm call repertoire, although differences were found between captive and wild individuals.
Captive individuals did not produce predator-specific calls but had a unique call to humans (Ouattara et al. 2009a). For males, we found a repertoire of six call types, which could be classified into different morphs according to the frequency contour and whether calls were trailed by an acoustically invariable suffix. Suffixed calls carried a broader meaning than unsuffixed ones (Ouattara et al. 2009b). The six calls were concatenated into context-specific call sequences, following basic combinatorial principles (Ouattara et al. 2009c). In sum, the vocal abilities of guenons go significantly beyond the currently assumed default case for nonhuman primates. Flexibility can be seen at all relevant levels, including limited control over call morphology, conversational rules, the ability to produce context-specific calls, and some basic combinatorial properties. The data are at odds with a gestural-origins theory of language. Gestural signals do not appear to play a key role in these species, while vocal flexibility is seen in all key components, despite the fact that they split from the human lineage about 30 million years ago. Field playback experiments will be needed to confirm whether receivers utilise these rich patterns to guide their behavioural decisions. But even in the absence of such evidence, the data suggest that a strong dichotomy between human language and nonhuman primate communication may no longer be tenable in the vocal domain. The visually dense forest habitat may have played a key role in the evolution of advanced vocalisation skills.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0082
It is well-known that primates – as opposed to, say, songbirds (Marler, 1970; Nottebohm, 1999) or cetaceans (Sayigh et al., 1990; Miller et al., 2004) – are poor vocal imitators, lacking volitional control over their vocal signals. It is equally evident that in this particular respect, humans are a puzzling aberration: evolution seems to have aligned us much more closely with ravens or parrots than with monkeys or apes! To explain this, it is not satisfactory to work backwards from speech. Speech in its modern form is a massively complex expression of the ability to control vocal signals. During earlier stages in the evolution of language, we would expect less developed capacities, more closely related to the situation among our primate cousins. We would also expect communicative reliance on manual and other visible gestures. This is simply because all primates (for obvious functional reasons) must be neurally equipped to cognitively sequence and control movements of their limbs, fingers and hands.
As and when enhanced capacities for vocal imitation and control evolved in the human lineage, we would expect the initial selection pressures not to have been for speech. Instead, on the model of songbirds and cetaceans, we might expect innovations serving purposes associated with vocal chorusing, song transmission, mimicry, deception and play. The evolutionary emergence of speech depends on the antecedent evolution of such vocal capacities; it cannot be invoked retrospectively to explain them.
There are good Darwinian reasons why non-human primates find it difficult to subject vocal communication to volitional control. Much more than facial or manual gestures, which are primarily useful in face-to-face interactions, the more complex sound signals of primates must carry over distances. This means that receivers are less in a position to immediately check veracity, hence more under pressure to insist on signal reliability. A vocal signal which can be manipulated at will is one which can easily be faked; if primate vocalizations are overwhelmingly 'hard-to-fake' it is because receivers over evolutionary time have listened to nothing else. Vocal signals whose acoustic properties suggested that they might possibly be fakes were ignored.
So what changed during the evolution of humans? One context in which hominins might routinely have deployed vocal manipulations could have been hunting, entailing deception of other species. Among Central African Forest hunters, sound signatures of the forest are systematically faked to lure prey. In this paper, we outline a model in which successful deception of non-humans, unable to develop resistance, is redeployed within human groups as the basis for mimetic storytelling, ritual and vocal play. In this perspective, 'language' is interpreted broadly, going beyond narrowly defined speech to consider all the ways that sounds can be strung together to effectively convey meanings.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0083
Certain subsystems of human languages can be profitably studied as self-organizing or emergent systems. In this presentation we will show that the birth of productive affixes from borrowed vocabulary can be treated as an emergent system, using modern databases and the Internet. We present two types of evidence. The first addresses the question of how a language borrows affixes at all. We examine the history of two suffixes in English, -ment and -ity, both of which came into English from French, using the OED on CD-ROM as a database. Both suffixes had their origins in words borrowed individually from French. Before 1600, the great majority of new words in -ity were borrowed; beginning around 1600, the percentage of coinages increased rapidly, soon reaching 80% and plateauing at 95% by 1800. Since 1900, only two words ending in -ity have been borrowed, while close to 500 have been coined. We thus have evidence that -ity became a productive suffix once there were enough exemplars for a tipping point to be reached, which is typical of self-organizing systems. The history of -ment shows that the story is not so simple. Here, although newly coined words outnumbered borrowed words two-to-one by 1600, paralleling -ity, -ment never became a truly productive suffix. Since 1900, only 30 words ending in -ment have been coined, as opposed to almost 500 in -ity, showing that the number of exemplars is not the only factor in determining whether a system self-organizes.
The second study addresses the synonymous suffix pair -ic and -ical, which we analyze in greater detail. The overall goal in studying such pairs is to discover why a language would systematize synonymous productive affixes. English exhibits a large number of doublets containing both rival suffixes, e.g. geographic and geographical; but with many pairs one member is strongly preferred over the other: statistical is much more common than statistic, while heuristic is much more common than heuristical. First, we ask whether both suffixes are in fact productive and whether one suffix is more productive than the other overall. In this study, we measure productivity as the total number of Google hits for every word ending in each suffix: for every word, we do a Google search on the exact word and list the number of hits, which provides a much larger sample than if we measured frequency within a preexisting corpus.
From Merriam-Webster's Second International Dictionary, available online, we identified 11966 stems of English words ending in either -ic or -ical. For each stem, one or both derivatives were listed in the dictionary. We then searched for words ending in both -ic and -ical for each of the 11966 stems. For some stems, a search records hits for both the -ic and -ical words (e.g. historic and historical), while for some, only one is found (e.g. transoceanic vs. *transoceanical, but pseudopsychological vs. *pseudopsychologic). We then determined for each pair whether -ic or -ical had more hits. The one with more hits is the 'winner' for that pair.
We then put this database of -ic and -ical pairs and associated numbers of Google hits into an Excel spreadsheet and subjected it to analysis. Overall, we identified 10613 -ic winners vs. 1353 -ical winners, an overall ratio of 7.84 in favor of -ic. This demonstrates that overall -ic is more productive than -ical. Finer-grained analysis, however, reveals a subtler story. First, we sorted all the items in our database into reverse-alphabetical neighborhoods of three to seven letters, not including final -al. When we sort the words in this way, the only set of words ending in -ical with a neighborhood >100 in size is -ological; for this subset only, -ical is the winner over -ic (e.g. psychological over psychologic), by a total ratio of 8.30, almost the exact reverse of the ratio for the full set (7.84 in favor of -ic). In other words, although overall -ic is more productive than -ical, the reverse is true for words ending in -ological. Another set of words for which -ical is more productive than expected comprises those based on nouns ending in -ics (e.g. physics, physical, physic). For these, adjectives in -ic still outnumber those in -ical, but -ical words are twice as common as normal, outnumbered only four to one by -ic instead of the normal 7.84. Overall, and somewhat surprisingly, English derivational morphology, especially when it involves the emergence of productive affixes from sets of borrowed words (in which English is especially rich), is a fertile proving ground for the study of self-organizing systems in languages, in part because of the databases that electronic resources provide.
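The 'winner' tally described above can be sketched as follows. The hit counts here are invented placeholders, since the real figures came from exact-phrase Google searches over the 11966 stems; only the counting logic is illustrated.

```python
# Hypothetical miniature of the -ic/-ical "winner" tally.
# Hit counts are invented, not real Google figures.
hits = {
    "historic": 100_000_000, "historical": 60_000_000,
    "psychologic": 300_000, "psychological": 40_000_000,
    "heuristic": 9_000_000, "heuristical": 20_000,
}

def winners(hits):
    """For each stem attested in both -ic and -ical, pick the suffix with more hits."""
    tally = {"ic": 0, "ical": 0}
    for word, n in hits.items():
        if not word.endswith("ical"):          # visit each pair once, from its -ic member
            ical = word[:-2] + "ical"          # historic -> historical
            if ical in hits:
                tally["ic" if n > hits[ical] else "ical"] += 1
    return tally

print(winners(hits))  # {'ic': 2, 'ical': 1}
```

Dividing the two tallies gives the kind of productivity ratio reported above (7.84 in favor of -ic for the full set).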
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0084
Understanding the extra-communicative functions of language (e.g., Clark, 1998) is central for constraining theories of language evolution. If, in addition to its communicative functions, language has extra-communicative functions—affecting nonverbal cognitive and perceptual processes—then an important force in the evolution of language may have been the effect language had and continues to have on such processes. Such reasoning helps to address a question central to the evolution of language: what adaptive benefits did early language users derive from rudimentary linguistic systems?
In the present work I will present an overview of the past 4 years of research in which we find that, across a range of paradigms, language exerts a rapid and automatic influence on basic visual processes. These results provide (indirect) evidence that even in rudimentary forms, language systems may have conferred basic perceptual (and non-communicative) benefits to their users.
A critical aspect of human development is the development of conceptual and perceptual categories—learning that things with feathers tend to fly, that animals possessing certain features are dogs, and that foods of a certain color and shape are edible (Rogers & McClelland, 2004; Keil, 1992; Carey, 1987). This conceptual acquisition is, in principle, separable from the acquisition of language (a child can have a conceptual category of "dog" without having a verbal label associated with the category). However, in practice the two processes appear to be intimately linked. Not only does conceptual development shape linguistic development (Snedeker & Gleitman, 2004), but linguistic development, specifically learning words for things, appears to impact conceptual development (Gentner & Goldin-Meadow, 2003; Lupyan, Rakison, & McClelland, 2007; Waxman & Markow, 1995). The empirical findings in the present work argue that such effects of language are not limited to long-term effects on conceptual development. It is argued that language exerts an on-line modulatory role on even the most basic perceptual processes.
The data come from standard paradigms from the vision sciences: visual search, mental rotation, cued target-detection, simple detection, and picture verification. These experiments license the following broad conclusions:
1. Hearing a category label such as "chair" facilitates the visual processing of the named category compared to trials on which participants know the relevant object category but do not actually hear its name. In some instances, producing the verbal label has similar facilitatory effects (Lupyan, 2007, 2008b, under review).
2. The above effects are transient, having a characteristic temporal profile, and are heavily modulated by the typicality of the visual exemplar. Visual processing of more typical items is more facilitated by hearing their name (Lupyan, 2007; 2008b; under review).
3. Hearing a label increases the perceptual saliency of the named category, enabling people to detect objects that are otherwise invisible (Lupyan & Spivey, 2008; under review).
4. Very brief amounts of training can alter the associations between labels and object categories suggesting that, at least in adults, such linguistic modulation of perception is highly flexible (Lupyan, 2007; Lupyan, Thompson-Schill, & Swingley, in press).
Ongoing work is showing that verbal labels evoke associated perceptual representations faster and more reliably than nonverbal stimuli. For example, people activate the visual properties of a cat faster when they hear the word "cat" than when they hear a meowing sound. Specifically, it appears that verbal labels come to have a special status of being able to activate categorical representations.
This linguistic modulation of perception may have important consequences for higher-level cognition such as the learning of new categories (Lupyan et al., 2007), memory (Lupyan, 2008), and conceptually grouping items along a particular dimension (e.g., color) (Lupyan, 2009) as well as inference in reasoning.
Theories of language evolution have maintained an almost exclusive focus on the communicative aspects of language. The present findings show that simple word-object pairings can modulate even basic visual processes, providing support for the idea that even in its early stages, languages may have conferred cognitive and perceptual benefits on their users.
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0085
Recent discussions about the evolution of communication have stressed a perceived qualitative distinction between humans and our closest evolutionary relatives, the great apes, wherein human nature is described as uniquely cooperative relative to the more competitive great apes (e.g. Tomasello, 2007). One major argument at the root of this cooperative hypothesis of the origin of language is the relative lack of declarative as opposed to imperative (request-driven) communication demonstrated by apes. For instance, it has been reported in multiple studies that apes cannot, as a rule, glean information from a human's declarative cue in a cooperative paradigm, although other, arguably more cooperative species can do so (see Lyn & Hopkins, 2009 for a review).
We recently tested apes reared in different environments on a declarative comprehension object choice task (Lyn, Russell, & Hopkins, in press). Significantly higher scores on the task were obtained from the two groups of apes that were reared in a socio-linguistically complex environment (as part of a language project at the Language Research Center, Atlanta, GA (LRC)) compared to the two standard-reared groups (F(3, 57) = 6.54, p<.01). This rearing difference was seen in both proximal and distal pointing tasks. The results further showed that bonobos, an allegedly more cooperative species, did not outperform chimpanzees (t(59) = 1.78, p>.05). Our results demonstrated that environmental factors, specifically access to a socio-linguistically rich environment, directly influence apes' ability to comprehend declarative signals. To determine the ability of great apes to produce declaratives, we compared the lexigram and gesture utterances of two bonobos and one chimpanzee reared at the LRC to the verbal utterances of two normally-developing children. All three apes made declarative gestures and lexigram utterances. These utterances fell into most of the same categories as the declaratives made by the children, including declaratives about past events, future behavior, and naming concrete and nonconcrete items, among others. Apes and children did differ in the frequency of declaratives (χ²(1, n=110074) = 8315, p<.001) and the frequencies of certain declarative types (e.g. the children produced more declaratives that were concrete naming (χ²(1, n=7451) = 283.8, p<.001), and the apes produced more declaratives about future plans (χ²(1, n=7451) = 572.3, p<.001)).
According to our results, both chimpanzees and bonobos are capable of utilizing the environment to support declarative communication. As the end result is a distinction of quantity, we must look elsewhere for a qualitative communicative difference between ourselves and our nearest evolutionary relatives, for instance, in the formation of social structures that support these abilities.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0086
The following sections are included:
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0087
Although chimpanzees have an evolutionary history as long as our own since the last common ancestor of the two species, their morphology is often taken as a closer approximation of the ancestral condition. Early chimpanzee language-training experiments failed to elicit language-like output in the vocal channel, for reasons that may relate both to neural control systems and to the morphology of the vocal tract itself. It therefore remains relevant to ask whether the geometry and muscle architecture of the chimpanzee vocal tract could enable the stable deformations required to produce a human-like phonological repertoire. If not, then it also becomes relevant to ask at what point in human evolution the hominin vocal tract evolved a shape consistent with such articulatory manoeuvres.
We obtained CT-scans of an ontogenetic series of chimpanzee cadavers from the Zurich Anthropology Institute collections, and isolated their vocal tracts for morphological analysis and acoustic/articulatory modelling. We analysed these vocal tracts in three dimensions, and identified a set of anatomical features that define characteristic constrictions and expansions along their lengths. Acoustic modelling of these tracts in a multi-tube configuration (which best approximated the observed morphology) enabled us to predict their passive or resting-state acoustic potential, and also enabled us to digitally perturb this resting-state configuration to explore their phonetic potential in relation to the human vowel triangle. We will compare our anatomical results with those of Nishimura (e.g. 2005), and our acoustic results with those of Lieberman, Crelin and Klatt (1972).
We also analysed this chimpanzee head-and-neck CT scan series and a sample of 20 adult human head-and-neck CT scans, to identify hard-tissue landmarks and inter-landmark relationships that constrain the position of soft-tissue features of the vocal tract in each of these extant species. We then applied these relationships to predict hyoid position in Neanderthals using an additional set of 3D-scanned fossil skulls from this extinct species. We report a potential envelope of hyoid positions for adult Neanderthals based on linear regression analysis, in which we predict their hyoid positions from inter-landmark distances of skull and mandible in three dimensions (using human and chimpanzee as alternative reference models). The human models result in anatomically viable hyoid positions for Neanderthals, whereas the chimpanzee models result in less anatomically plausible Neanderthal hyoid positions. The Neanderthal hyoid bone most likely occupied positions in relation to the skull and mandible (and the vertebral column) similar to those observed in modern humans, which means that it cannot be used to discount the presence of human-like speech in this species.
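The regression step can be sketched as follows: a minimal, hedged illustration of predicting one hyoid coordinate from inter-landmark distances by ordinary least squares. All numbers are invented stand-ins, not the actual CT-scan measurements, and the real analysis would be run per coordinate and per reference model (human vs. chimpanzee).

```python
import numpy as np

# Invented training data: 3 inter-landmark distances for 20 reference individuals.
rng = np.random.default_rng(0)
X = rng.normal(50, 5, size=(20, 3))
true_w = np.array([0.4, -0.2, 0.7])           # fabricated "anatomical" relationship
y = X @ true_w + 10 + rng.normal(0, 0.5, 20)  # one hyoid coordinate, with noise

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Apply the fitted relationship to a (fabricated) fossil's measurements.
fossil = np.array([1.0, 55.0, 48.0, 52.0])    # leading 1.0 for the intercept
predicted = fossil @ coef
print(predicted)
```

Repeating this with resampled or per-individual reference data yields the "envelope" of predicted positions described above, which can then be checked for anatomical plausibility.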
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0088
The structure of language is the result of a complex interaction between biologically given prior biases, the arena of language use, and the space of meanings that language users convey (see, e.g., Hurford, 1990). Recent research has attempted to understand these interactions by focussing on the way in which languages persist over time by repeated cycles of learning and use – a process called iterated learning (e.g. Kirby et al., 2007).
Previous work has used a combination of artificial language learning and diffusion chain methodologies to observe the iterated learning of language in human participants in laboratory conditions (Kirby et al., 2008). The results from these studies demonstrate that compositional structure gradually emerges as an adaptation by language to various pressures placed on it during cultural transmission.
A noticeable feature of this work, however, is that the language structure that emerges is bound to reflect precisely the fixed (and finite) structure of the set of meanings that experimenters have pre-specified. In particular, the meanings used in these kinds of studies have typically been made up of a finite set of features each taking one of a fixed finite set of values.
Of course, in reality we use language to convey meanings from a set that is neither finite nor parceled neatly into features and values. Instead, the world is by-and-large characterised by continuously variable dimensions which permit a wide range of possible categorisations. Put crudely, although language is digital, the world it refers to is analogue.
If the broad conclusion of previous work on iterated learning is correct – that language structure is an adaptive consequence of being culturally transmitted – then we should expect languages to evolve whose signals reflect structure in meanings even if those meanings are drawn from a continuous space. To test this, we ran a modified form of Kirby et al.'s (2008) cultural transmission experiment in which participants were asked to try and learn strings in an "alien" language that was in fact the previous participant's output at test (with the first participant being exposed to a completely unstructured, random seed language). Whereas Kirby et al. (2008) used pictures to represent the alien meanings, which contained one of three shapes in one of three colours moving in one of three possible ways, we used pictures drawn from a continuous space of shapes that smoothly vary between triangles and rectangles (see Fig. 1).
The results are striking: just as in the previous work, the languages adapt to become more learnable as they are passed down each of the different chains of participants in the study. The measure of learnability is determined by the transmission error in the language from one participant to the next in each chain of participants. In addition, as before, linguistic structure emerges out of the random initial state. However, in this experiment there are no clear dimensions or pre-specified category boundaries for language to adapt to. Instead, each chain of participants in the experiment gives rise to a language that supports a distinct way of conceptualising the space. That is, we find chain-specific variation in the categories that emerge in each language (e.g., some categories account for rotation, while others do not). The particular scheme a language uses emerges gradually in the experiment, often out of a competition between different ways in which similarity can be defined between shapes, and is reliably maintained after the sixth generation (of a total of ten generations).
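Transmission error in iterated-learning studies is typically computed as the mean normalized edit distance between the strings that successive participants produce for the same meanings. The sketch below assumes Levenshtein distance with length normalization, a common choice in this literature, though not necessarily the exact measure used in this experiment.

```python
def levenshtein(a, b):
    """Edit distance between two strings (standard dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[-1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))  # substitution / match
        prev = cur
    return prev[-1]

def transmission_error(lang_in, lang_out):
    """Mean normalized edit distance between the strings two successive
    participants produced for the same meanings (0 = perfect transmission)."""
    total = 0.0
    for meaning, s_in in lang_in.items():
        s_out = lang_out[meaning]
        total += levenshtein(s_in, s_out) / max(len(s_in), len(s_out), 1)
    return total / len(lang_in)

# Toy two-meaning languages at successive generations (invented strings):
gen3 = {"shape1": "wuki", "shape2": "miho"}
gen4 = {"shape1": "wuki", "shape2": "muho"}
print(transmission_error(gen3, gen4))  # 0.125
```

A language becomes "more learnable" across generations precisely when this error between adjacent participants in a chain decreases.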
This study supports the general conclusions of the ongoing work on iterated learning, that language structure arises from the cycle of repeated learning by individuals in a population. In addition, it demonstrates that iterated learning can be used to explore the emergence not only of features like compositionality, but also of categorisation and conceptualisation.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0089
Human language is an extraordinary and unique means of communication involving left-hemispheric specialization for both production and comprehension, and specific properties such as intentionality, flexibility, categorization, referential properties, etc. Since nonhuman primates are very close to humans from a phylogenetic standpoint, research on their communicative systems might provide elements for inferring the features of our ancestral communicative systems. There is considerable debate, within theoretical frameworks about the origins of language, as to whether precursors of language may be found in the gestural or in the vocal communicative systems of our primate cousins (e.g., Ghazanfar & Hauser, 1999; Corballis, 2002).
In the present paper, we propose that the advocates of gestural vs. vocal origins of language might be reconciled if we distinguish the origins of the linguistic perceptive system from those of the human speech production system. In fact, it turns out that most of the arguments proposed by proponents of the vocal hypothesis come from findings specifically related to the perception of vocalizations, which might involve amodal processes of categorization and understanding of the external world, rather than the vocal modality itself (Meguerditchian & Vauclair, 2008). It is widely accepted that conspecific vocalizations are referential, since listeners are able to extract information from vocal signals such as the identity of the caller, the nature of social relationships among conspecifics, matrilineal kin, and dominance rank. In congruence with such a potential behavioural continuity with speech comprehension, behavioural asymmetry and neurobiological studies showed that the perception of vocalizations involves a left-hemispheric dominance and a cerebral circuit that might be related to Wernicke's area in humans (involved in language comprehension). Concerning the production of vocal signals, contrary to speech production, although a certain degree of audience effect and of plasticity of the vocal system has been demonstrated in nonhuman primates: (1) vocalizations fail to be dissociated from their appropriate emotional context and are addressed to a whole group rather than a specific recipient; (2) the vocal repertoire remains inextensible across groups of a given species, and the subtle inter- or intra-group structural variations reported in some vocal signals concern only existing species-specific vocalizations of this repertoire; (3) there is no evidence that vocal production involves left-hemispheric specialization and homologues of language areas, but rather subcortical areas and the limbic system (related to emotions in humans).
By contrast, as in human language production, the use of communicative gestures appears to be (1) much more flexible, within an extensible gestural repertoire; (2) independent of a specific social context; (3) clearly intentional, being exclusively directed to a specific recipient (Pika, 2008); and (4) to involve left-hemispheric specialization and homologues of Broca's area in chimpanzees, according to both behavioural asymmetry studies and neuroanatomical and neurofunctional imaging studies (Taglialatela et al., 2008). In conclusion, we suggest that the abilities of nonhuman primates to process meaningful vocalizations are better related to their remarkable capacity to understand and categorize the external world, including vocalizations and visual events (Cheney & Seyfarth, 1990), than to their specific vocal production system. Such features might thus constitute the precursor of the representational processes involved in the comprehension of language in humans, whereas the properties of gestural communication appear more convincing for inferring the prerequisites of the speech production system.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0090
The following sections are included:
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0091
Behavioural and brain asymmetries at the population level have historically been considered unique to human evolution and exclusively associated with the emergence of speech. But the recent demonstration of similar asymmetries in numerous species of vertebrates questions the validity of this hypothesis (see Rogers & Andrews, 2002; Vallortigara & Rogers, 2005; Hopkins, 2007 for reviews) and opens new debates concerning the neurobiological origin of language. This very controversial topic opposes two basic ways of thinking: (1) the gestural origin of language (see Meguerditchian & Vauclair, 2008 for a review) and (2) the vocal origin of language (e.g. Zuberbühler, 2005). However, an increasing number of studies converge on the hypothesis that language is the product of the evolution of gestural rather than vocal communication (Corballis, 2002; Arbib, 2005). Because nonhuman primates are phylogenetically close to humans and show behavioural asymmetries at the population level, they appear to be an ideal model for investigating the precursors of brain hemispheric specialization in humans. Studies on great apes (Hopkins, 1995; see also Pika, Liebal, Call, & Tomasello, 2005 for a review) and other nonhuman primates (Vauclair, Meguerditchian, & Hopkins, 2005) showed a significant hand preference at the group level for a coordinated bimanual task and, with a stronger effect, for a communicative gesture. This latter result indicates that gestural communication in those species could involve a specific cerebral system, different from the one involved in non-communicative bimanual coordinated tasks. In this context, the study of hand specialization in gestural communication appears necessary for investigating the precursors of speech. Because the existence of such a specific system involving a higher degree of lateralization is still uncertain, more comparative studies between tasks and species, using similar experimental procedures, are needed.
The present study aims to provide such evidence using an adaptation of Bishop's Quantifying Hand Preference task (QHP; Hill & Bishop, 1998), which allows precise and comparable measurement of the degree of handedness across tasks. We used the QHP test with two species of non-human primates, olive baboons (Papio anubis) and rhesus macaques (Macaca mulatta), in three experiments: (1) a simple reaching task, (2) a bimanual coordinated task and (3) a gestural communication task, in which subjects have to point at an object in order to obtain it. Results from the simple reaching task revealed the crucial influence of item position on handedness in both species, even for positions only 30° from the subject's midsagittal plane. For this task, although numerous baboons and macaques were individually lateralized, we did not observe population-level handedness for the central position. By contrast, preliminary results of the gestural communication task showed population-level right-handedness for the same central position. For the other positions, individuals tended to use their right hand more when pointing at an object than when reaching for it. Data from the bimanual coordinated task are currently under analysis. These results will be discussed within the theoretical framework of hemispheric specialization for object manipulation, communicative gestures and the origin of language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0092
Among the many theories of language evolution, the Gestural Origins Hypothesis (Hewes, 1973; Corballis, 2002) makes clear and testable predictions about common neural substrates for language and complex body actions. In particular, there is growing evidence that the left hemisphere is responsible for action planning (Frey, 2008). While the archaeological data are silent about language origins, they clearly show an increase in tool complexity at 1.7 million years ago with the beginning of the Acheulean industries. A co-evolution of complex language and hand actions would predict shared neural areas for language and Acheulean tool-making.
This paper will present the first ever study to directly compare cerebral blood flow lateralisation for a language task and a complex stone-tool-making task, using functional transcranial Doppler ultrasound. This non-invasive method measures relative blood flow changes between the cerebral hemispheres at all moments during task execution, allowing the study of precise temporal events (Deppe et al., 2004). In addition, it is well suited to studying stone tool-making because it does not restrict participants' natural postures or movements.
According to previous studies using PET (Stout & Chaminade, 2007, 2009; Stout et al., 2008), complex stone tool making (Acheulean) caused increased lateralisation to right Broca's homolog, but no activation of action planning circuits was found. However, according to Kroliczak & Frey (2009) there should be left hemisphere activation for tool-use planning. Our experiment was designed to isolate the complex planning component of Acheulean handaxe production, so that the motor execution of the task was controlled for. In the tool-making task we contrasted cerebral blood flow during handaxe production with a control condition in which hammerstones were struck together. In the language task we used the standard paradigm of silent word-generation.
We will present the results from ten subjects and discuss their implications for the Gestural Origins Hypothesis of language evolution. Our previous work on body action and speech recognition (Meyer et al., 2009) showed that the two tasks invoke shared circuitry in the brain, consistent with the view that complex action planning and speech may have co-evolved. Our functional transcranial Doppler ultrasound data isolate the motor planning components required for the creation of the earliest complex tools and therefore allow a direct test of whether complex action planning and speech are both left-lateralised. We show highly correlated (p < 0.05) relative lateralisation changes in the target condition compared with the rest condition.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0093
One of the most prominent questions when working on dialogue is why speakers of all languages have different ways to express the same idea depending on different communicative circumstances. Research on information structure addresses exactly this question. This paper presents an ongoing study of the evolution of information structure through computational experiments in which autonomous communicative agents engage in a series of language games.
Information structure can be formally represented in various ways, such as through prosody, grammatical markers, or the order of syntactic constituents in the form of complex grammatical constructions, depending on which language strategy is adopted by the language community (Steels, to appear). A language strategy consists of a set of procedures that help members of a community become and remain successful in communication. More specifically, these are procedures for acquiring or inventing new concepts, words or grammatical constructions as demanded by the context. Additionally, a language strategy provides a mechanism for aligning the language systems of interacting agents.
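The three ingredients of a language strategy in this sense (invention, adoption and alignment of conventions) can be illustrated with a minimal naming game in the style of Steels' language-game models. The sketch below is not the FCG implementation used in this paper; population size, number of objects and number of rounds are illustrative assumptions.

```python
import random

def naming_game(n_agents=20, n_objects=5, rounds=20000, seed=42):
    """Minimal naming game: agents invent, adopt and align word conventions."""
    rng = random.Random(seed)
    # Each agent's lexicon maps an object to a set of competing words.
    lexicons = [{o: set() for o in range(n_objects)} for _ in range(n_agents)]
    successes = []
    for _ in range(rounds):
        speaker, hearer = rng.sample(range(n_agents), 2)
        obj = rng.randrange(n_objects)
        if not lexicons[speaker][obj]:
            # Invention: coin a new word for a still-nameless object.
            lexicons[speaker][obj].add("w%06d" % rng.randrange(10**6))
        word = rng.choice(sorted(lexicons[speaker][obj]))
        if word in lexicons[hearer][obj]:
            # Alignment: on success, both agents prune competing words.
            lexicons[speaker][obj] = {word}
            lexicons[hearer][obj] = {word}
            successes.append(1)
        else:
            # Adoption: on failure, the hearer stores the speaker's word.
            lexicons[hearer][obj].add(word)
            successes.append(0)
    return successes
```

Because successful pairs prune their competitors, the population drifts toward a shared convention and communicative success rises from near zero toward ceiling, which is the alignment dynamic the strategy is meant to provide.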
This research presents a computational implementation of the language strategies that are needed for investigating the evolution of information structure through a case study of German, which expresses information structure through word order influenced by case, focus and determination constraints (Lenerz, 1977). Speakers of German have to establish knowledge about those constraints to produce and understand utterances correctly. Typologically speaking, German is an interesting case because its combination of syntax and focus structure does not nicely fit the cross-linguistic tendency of either having a rigid syntactic structure combined with a flexible focus structure or a rigid focus structure combined with a flexible syntax.
In the first phase of the experiments, we operationalized the German language system for expressing information structure by reverse engineering the necessary production and comprehension procedures in Fluid Construction Grammar (FCG) (De Beule & Steels, 2005). This formalization effort includes the ontology, lexicon and grammatical constructions that are necessary for handling German declarative sentences including intransitive, transitive and ditransitive verbs. In the next step, we designed a language game for studying the function of this language system in symbolic communication. We propose a Question-Answering (QA) game, in which two agents are randomly picked from the population. One agent asks a wh-question about a jointly observed scene. Joint attention between the agents is required and assumed. The other agent has to respond to the question accordingly.
The next set of experiments investigates the learning mechanisms that agents need in order to acquire information structure in communication. The population includes 'tutor agents' and 'learning agents', which do not possess the same grammatical proficiency. Interlocutor-1 (I-1) chooses one of the observed events and picks a topic, which corresponds to one of the event participants. I-1 produces a question by using the FCG rules and expects an appropriate answer from Interlocutor-2. All agents can play both interlocutor roles. The experiments use local measures for steering linguistic behavior: communicative success and cognitive effort. The results show that even though explicit focus-marking is not necessary for reaching communicative success, learners acquire the correct rules for emphasizing the participant asked about in order to reduce cognitive effort. Cognitive effort is measured by counting the number of processes needed to retrieve the topic of the question (e.g. case comparison). As argued by e.g. Selkirk (1986), questions control which syntactic constituent in the answer has to be emphasized; in the current scenario this means that it has to be focus-marked by a grammatical rule.
The final set of experiments investigates the emergence of information structure, by allowing the agents to innovate if they experience too much processing effort or if they want to avoid ambiguity. It is shown that given the right language strategy agents can develop language systems for conventionalizing the position of the topic in an utterance, on top of existing determination and case systems.
The current set-up only involves one language strategy, but future research includes how multiple strategies, such as fronting, intonation and word order, compete with each other in a single population, and how this might lead to different kinds of language systems for expressing information structure.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0094
Within the growing perspective of cultural transmission as a pivotal component of language evolution (e.g., Kirby, Christiansen & Chater, 2009; Tomasello, 2008), it has been proposed that language has been substantially shaped by the human brain, constrained in its evolution by socio-pragmatic, perceptuo-motor, cognitive and other not specifically linguistic factors defining the nature of human learning and processing biases (Christiansen & Chater, 2008). As one such factor, pre-existing learning mechanisms for the processing of sequential structure may have played a substantial role in the evolution of human language. This hypothesis is supported by artificial grammar/language learning studies and by computational simulations examining the relationship between sequential learning biases and the structure of evolved languages (see, e.g., Kirby et al., 2009), but few studies directly test, within individuals, for an empirical link between such learning and language.
A clear prediction of the above theoretical view would be that observed variation in language processing performance should be associated with variation in sequential learning abilities. We investigated this hypothesis, using a within-subjects design in which 50 monolingual native English speakers were assessed on both sequence learning and on-line language processing. In our sequence-learning task, an artificial language (Gómez, 2002) was instantiated within an adapted serial reaction time (SRT) task, thereby providing continuous reaction-time (RT) measures of learning as it unfolded. The group learning trajectory revealed a gradually emerging sensitivity to nonadjacent dependencies in the artificial language. Learning was further confirmed by an offline standard grammaticality judgment post-test in which scores were significantly above chance. Crucial to our study aim, we calculated a learning score for each participant by subtracting their RT performance in the initial training block of string-trials from that in the final training block, with resulting scores reflecting substantial individual differences in pattern-specific sequential learning.
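The per-participant learning score described above is a simple difference of block means. A minimal sketch, using invented illustrative reaction times rather than the study's data, is:

```python
from statistics import mean

def learning_score(initial_block_rts, final_block_rts):
    # Score = mean RT in the final training block minus mean RT in the
    # initial block, following the abstract's wording; assuming practice
    # speeds responses, more negative scores indicate more learning.
    return mean(final_block_rts) - mean(initial_block_rts)

# Hypothetical RTs in milliseconds for one participant (not study data):
score = learning_score([500, 520, 510], [400, 410, 420])  # -100
```

The sign convention (initial subtracted from final) follows the abstract; the resulting per-participant scores then serve as the individual-differences measure correlated with sentence-processing performance.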
To determine whether good sequential learners are also good at tracking the long-distance dependencies characteristic of natural language, the same participants completed a word-by-word self-paced reading task involving sentences with center-embedded subject- (SR) and object-relative (OR) clauses. Individual differences in processing these sentences are well documented, with ORs eliciting longer reading times, especially at the main verb (King & Just, 1991). We found a positive relationship between continuous individual differences in sequential learning and better processing performance for the relative clauses at the main verb. Additionally, when classifying learners as "good"/"poor" based on scores from the sequence learning task, good learners displayed reading patterns characteristic of more proficient language processors, with less difficulty at the OR main verb and less of a divergence in processing patterns for the two clause-types.
These findings thus provide an empirical association between individuals' on-line sequential learning of nonadjacencies and their on-line processing of complex, long-distance dependencies in natural language. They also dovetail with recent molecular genetics findings implicating FOXP2 in sequential learning on an SRT task: common allelic variation in FOXP2 was associated with differences in sequential learning patterns, which in turn were linked to variation in grammatical abilities (Tomblin et al., 2007). This suggests that FOXP2 may have served as a pre-adaptation for human sequential learning mechanisms, while providing further evidence for a key role of such abilities in the cultural transmission of language. By empirically connecting sequential learning and language, these studies thus offer a heretofore missing link in the cultural evolution of language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0095
In 1866, the Société de Linguistique de Paris met to issue a moratorium on papers concerning language evolution. Less well known is that the same meeting also prohibited papers discussing universal languages. These universal language studies were speculative attempts to rediscover the pre-Babel tongue, and several proposed that the arbitrariness of the mapping between the spoken form and the meaning of words was an indication of imperfection in human communication: if one could describe a perfectly systematic language in terms of form-meaning mappings, one could rediscover the universal language (see Eco, 1995, for a review). This ideology led to artificially constructed languages such as Wilkins' "Analytical Language" (1668), in which similar concepts had orthographically similar names. However, systematicity results in confusions, because similar-sounding words are used in similar contexts, with consequences for survival: "edible plants could be confused with poisonous ones, and animals that attack be confused with benign ones" (Corballis, 2002, p. 186).
We contend, instead, that the cultural evolution of language has resulted in a crucial balance between arbitrariness and systematicity (see also Tamariz, 2008). Though arbitrariness may be advantageous for individuating referents, it impairs recognition of similarities among words. Indeed, form-syntactic category mappings demonstrate a high degree of systematicity in natural language (Monaghan et al., 2007). In a series of computational simulations and experiments we tested the extent to which arbitrariness benefits individuation and systematicity benefits categorization in a language learning task.
In two simulations, we trained a feedforward connectionist model to learn mappings between form and meaning representations for 12 words referring to 6 actions and 6 objects. The representations were either correlated (systematic) or uncorrelated (arbitrary). The second simulation had additional contextual information as a part of the model's input by indicating whether the word was an action or an object, more realistically representing the multiple contextual cues available in the language learner's environment. In two experiments, we trained undergraduate participants on the same 12 words and pictures of actions and objects, either with or without contextual information in the form of distinct words preceding either the object or the action word. We tested performance at four points in training in terms of learning to: (1) individuate the meanings of the words and (2) categorise the words into objects or actions.
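As a simplified illustration of the contrast between conditions, rather than a reproduction of the authors' feedforward model, the sketch below constructs systematic (category-correlated) versus arbitrary (uncorrelated) form spaces for the 12 words and trains a single logistic unit to categorise them as actions versus objects. Dimensionalities, noise level and learning rate are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def make_forms(systematic, n_per_cat=6, dim=8):
    # Forms for 12 words (6 actions, 6 objects) as random vectors.
    # Systematic: forms share a category prototype, so same-category
    # words sound alike. Arbitrary: forms are unrelated to category.
    cats = np.repeat([0, 1], n_per_cat)
    if systematic:
        proto = rng.normal(size=(2, dim))
        forms = proto[cats] + 0.3 * rng.normal(size=(2 * n_per_cat, dim))
    else:
        forms = rng.normal(size=(2 * n_per_cat, dim))
    return forms, cats

def categorisation_accuracy(forms, cats, epochs=300, lr=0.5):
    # A single logistic unit trained by gradient descent to label each
    # word form as action (0) versus object (1).
    w, b = np.zeros(forms.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(forms @ w + b)))
        grad = p - cats
        w -= lr * forms.T @ grad / len(cats)
        b -= lr * grad.mean()
    preds = (forms @ w + b > 0).astype(int)
    return float((preds == cats).mean())
```

In the systematic condition, same-category forms cluster around a shared prototype, so the category boundary is easy to learn; this is the sense in which systematicity benefits categorization, while the similarity of same-category forms is what impairs individuation.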
Without context, in both the simulation and the experiment, the systematic condition showed an advantage for both individuation and categorisation: learners exploited the generalizations afforded by the systematic relationships. With context, the systematic advantage for categorization persisted in both the simulation and the experiment. For individuation, however, there was a significant interaction between training time and the systematic/arbitrary condition: an initial systematic advantage became an arbitrary advantage later in training.
Our studies illustrate the importance of both arbitrariness and systematicity, and indicate how each contributes to learning different aspects of language, interacting in complex ways with contextual information in the environment. We suggest that because word-learning involves not only discovering its form-meaning mapping but also how to use it in syntactic contexts, language has evolved to balance arbitrariness and systematicity.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0096
Human speech is a hierarchically organized coding system in which meaningless sounds, called phonemes, are combined into larger meaningful units: words. An important role in the coding process is played by formants, vocal tract resonances that can be altered rapidly by changing the geometry of the vocal tract using different articulators such as the tongue and lips. Changing the formant pattern of an articulation results in a different vowel. Although human voices differ in acoustic parameters such as fundamental frequency and spectral distribution, the relative formant frequencies of an utterance keep speech intelligible regardless of individual variation across speakers. Although it was originally argued that speech is special and uniquely human (Lieberman, 1975), several studies have shown that some aspects of speech perception also apply to other species. Chinchillas, for example, show the same phonetic boundary effect as humans when discriminating between /d/ and /t/ consonant-vowel syllables (Kuhl & Miller, 1975). Nevertheless, there is still an ongoing debate about which characteristics of speech production and perception are unique to humans and which are shared with other species (Hauser et al., 2002; Trout, 2003; Pinker & Jackendoff, 2005). In this study (Ohms et al., 2010) we addressed the question of whether birds (zebra finches) are able to distinguish between spoken words with a minimal difference in acoustic features and, if so, which cues they might use to do so. We trained eight zebra finches on a go/no-go operant conditioning task to discriminate between the Dutch words wit (/wɪt/) and wet (/wɛt/), which differ only in their vowels and which were recorded from several native speakers. The results show that zebra finches trained to discriminate a single minimal pair can transfer this discrimination to unfamiliar voices of the same and even of the other sex.
When confronted with new voices, discrimination performance was immediately well above chance level. However, our data also revealed a learning process, since performance increased steadily. This suggests that both intrinsic and extrinsic speaker normalization are involved in discriminating between the two words. These results indicate that the capability of normalizing formant patterns across different speakers and sexes is a perceptual trait that occurs not only in humans but also in songbirds.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0097
Most language evolution research focuses on primates, positing a hominid transitional link with the beginnings of learned vocal communication. Interest in primate models of language evolution increased after apes, humans' closest genetic relatives—although incapable of acquiring full, complex human language—learned elements of human communication systems. But how vocal language, and vocal learning, developed from what was likely a precursor gestural communication system is still a matter of speculation. Other species, however, phylogenetically distant from primates, notably Grey parrots (Psittacus erithacus) and cetaceans, acquire human-like communication skills comparable to those of great apes (Hillix & Rumbaugh, 2003), and, unlike present-day nonhuman primates, engage in vocal learning. Many studies have also demonstrated striking parallels between both the ontogeny and the neurological underpinnings of vocal communication in birds and humans (e.g., Jarvis et al. 2005). Recently, an avian species, once thought to be incapable of vocal learning, has shown elements of such acquisition (Kroodsma, 2005; Saranathan et al. 2007), suggesting that it might be a living avian model for the transitional link between our nonvocal-learning and vocal-learning hominid ancestors. This paper explores the data supporting use of such an avian model for language evolution.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0098
Chimpanzee vocalizations are referential, but only functionally so (Mitani & Brandt, 1994; Slocombe & Zuberbühler, 2006). Contrary to human linguistic signs, which are used to influence what others know, think, believe, or desire (Grice, 1957), they do not intentionally provide conspecifics with information (Seyfarth & Cheney, 2003). On the other hand, research on the gestural abilities of human-reared or language-trained great apes has shown that signalers use gestures in intentional and referential ways (Gardner et al., 1989; Savage-Rumbaugh et al., 1986). Furthermore, Pika and Mitani (2006) recently described a distinct gesture, the 'directed scratch', used by adult male chimpanzees in the wild to indicate just where on their bodies they wished to be groomed.
The present study aims to provide further insight into the meaning and function of 'directed scratches' and related grooming gestures by distinguishing between the perspectives of signalers and receivers (Smith, 1965). Analyses are based on behavioral observations of ~100 grooming sessions among twenty adult male chimpanzees, collected during June and July 2008 in the Ngogo community, Kibale National Park, Uganda.
The results reveal that chimpanzee signalers use their grooming gestures in flexible, manifold ways to request an intended meaning, which is understood by chimpanzee recipients. "Answers" to these requests, however, vary in relation to rank, age and the strength of social bonds between signalers and recipients. These results will be discussed with a special focus on recent theories of gesture acquisition, signal evolution, and cooperation.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0099
The crucial component uniting contemporary theories that posit a role for sexual selection in language evolution is the postulation that language, or some aspect of it, served as an honest indicator of an individual's genetic or phenotypic quality. Okanoya (2002) and Mithen (2005) have proposed the existence of an ancestral 'protolanguage' consisting of elaborately structured vocalisations and body movements, driven in part by sexual selection.
An interesting question that follows from this hypothesis concerns the nature of the relationship between such a protolanguage and underlying genetic quality. The answer may lie in the developmental stress hypothesis (DSH) which has been formulated as a result of findings from experimental studies involving observations or manipulations of early stress conditions in avian species. Environmental stress affects the development of forebrain structures necessary for the production of song features shown to be important to females when selecting mates, and also affects other aspects of male phenotypic quality (Nowicki et al. 1998). There is also some evidence that certain genotypes fare better than others in conferring resistance to developmental stressors (see Buchanan et al. 2004). Given the striking similarities between birdsong and human language, it is important that the DSH is acknowledged, investigated, and applied to understanding language evolution. As Ritchie, Kirby & Hawkey (2008) suggest, the DSH may provide an explanation for the evolutionary maintenance of vocal learning in both songbirds and humans. Perhaps at some stage in language evolution, the ability to learn and produce structurally complex vocalisations served as an indicator of an individual's ability to cope with the costs involved in the growth of neural substrates underlying fine motor control in the face of environmental stress.
A concomitant issue concerns whether developmental stress influences language development in contemporary humans. One hypothesis that can be investigated empirically is that individuals use linguistic adeptness to evaluate the developmental stability of potential mates, or to maximize indirect benefits by choosing mates who possess high-quality genotypes or genotypes that confer resistance to environmental stressors. The first main prediction to be tested is that developmental stress affects brain development and linguistic competence. The second is that developmental stress affects the ability to produce linguistic parameters important in mate choice. A recent developmental study following a natural disaster provides support for the former (Laplante et al., 2008). Longitudinal extensions of such studies are encouraged, to determine whether linguistic deficits persist into later life, and it may be fruitful to involve subjects in studies of human mating preferences that focus on linguistic parameters in order to investigate the latter prediction. Moreover, close examination of individual differences in groups exposed to similar levels of stress may reveal whether certain genotypes cope better with environmental stress than others. This may help to ascertain whether linguistic competence functions as a marker of developmental stability or is intrinsically related to underlying genetic quality. These large-scale studies can be complemented by experiments investigating possible relationships between linguistic ability and proposed morphological measures of developmental stability.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0100
Human populations are organised in social networks, which are structurally distinct from other types of network (Newman & Park, 2003). Social structure provides a constraint on the transmission of language, such that language is influenced by the structure of the population over which it is transmitted (Chambers, 2003), a relationship which is often ignored in models of language evolution. Furthermore, social networks and languages might co-evolve: human social networks may have the form they do in part due to the transmission of language over those networks. We present a model of the co-evolution of social network structure and language by means of learning and interaction among agents in a dynamic population. This study models network growth and language evolution using a plausible mechanism, namely homophily, the tendency of individuals to establish and maintain social bonds based on similarity (McPherson et al., 2001).
Boguñá et al. (2004) have shown that the presence of communities in a social space is sufficient to trigger social structure when the agents show a preference for similarity. However, they treat communities as predefined static entities. Centola et al. (2007) demonstrate the influence of homophily in the co-evolution of both network structure and cultural diversity.
We define a model of cultural transmission over a dynamic network using homophily and cultural learning. The model consists of a population of agents. Each agent is randomly assigned a position in the social space, represented by a vector of continuous real numbers, analogous to a set of linguistic traits that an individual possesses. The social distance between two agents is defined as the sum of the differences between their corresponding traits. The population is represented by an unweighted, undirected network of vertices. It is initially unconnected and evolves by three mechanisms: attachment to similar vertices, detachment from dissimilar vertices, and learning from adjacent vertices. These correspond to homophily and to cultural learning events such as language learning.
Evolved networks (Fig. 1) possess the characteristic measures of social networks: assortativity, transitivity and community structure (Newman & Park, 2003). Social distance shows a positive correlation with network distance, yielding emergent social clusters of similar languages; individuals' communicative success therefore decreases with social distance. The rate of learning affects the size, density and linguistic diversity of the communities that form. The model demonstrates that the existence of a learnable language and a preference for establishing and maintaining connections with similar individuals can naturally lead to social structure. The co-evolutionary dynamic of homophily and learning influences both the topology of the resulting network and the type and distribution of the emergent languages.
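A minimal sketch of the described dynamic (trait vectors, social distance as the sum of trait differences, and attachment/detachment/learning on an initially unconnected network) might look as follows. Thresholds, rates and population size are illustrative assumptions, not the parameters of the reported model.

```python
import random

def social_distance(a, b):
    # Social distance: sum of the differences between corresponding traits.
    return sum(abs(x - y) for x, y in zip(a, b))

def simulate(n_agents=30, n_traits=5, steps=3000, attach=1.0,
             detach=2.5, learn_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Each agent holds a vector of continuous linguistic traits.
    traits = [[rng.random() for _ in range(n_traits)]
              for _ in range(n_agents)]
    edges = set()  # unweighted, undirected network, initially unconnected
    for _ in range(steps):
        i, j = rng.sample(range(n_agents), 2)
        d = social_distance(traits[i], traits[j])
        if d < attach:        # homophily: attach to similar agents...
            edges.add(frozenset((i, j)))
        elif d > detach:      # ...and detach from dissimilar ones
            edges.discard(frozenset((i, j)))
        # Cultural learning: drift toward a randomly chosen neighbour.
        neighbours = [k for e in edges if i in e for k in e if k != i]
        if neighbours:
            k = rng.choice(neighbours)
            traits[i] = [t + learn_rate * (u - t)
                         for t, u in zip(traits[i], traits[k])]
    return traits, edges
```

Connected agents both start out similar (attachment requires small social distance) and grow more similar through learning, so the evolved network clusters agents with similar "languages", mirroring the correlation between social and network distance reported above.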
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0101
Social markers appear to originate, at least partly, in the need to protect cooperative networks from outsiders (Nettle & Dunbar, 1997). Simulations suggest that the use of language as a source of such markers introduces a selective pressure—social selection—to language evolution, which contributes significantly to the development of linguistic diversity (Nettle & Dunbar, 1997; Nettle, 1999). Other simulations, however, have challenged these findings (Livingstone, 2002), suggesting that a high level of diversity can emerge through variation in the frequency of interaction, without the need for social selection. Similar disagreements exist in sociolinguistics (e.g. Labov, 2001; Trudgill, 2008; Baxter et al., 2009).
To investigate this question experimentally, a study was carried out in which participants played an economic game that involved negotiating anonymously, on an instant-messenger-style program, to exchange resources. The experiment involved 80 participants and had a 2 × 2 design, with five games of four players each in every condition (see Table 1). Each game consisted of a series of rounds in which every player was partnered with one of the other three players; no two players were paired up for more than two rounds in a row. Each player began the game with 28 points of resources and, during each round, negotiated to exchange resources by typing messages to their partner in an artificial 'alien language' of twenty randomly generated words (e.g. seduki, kago). All four players were trained on the same language. After negotiation, players could give resources away to their partners; any resource given was worth double to the receiver. In the cooperative conditions, all four players in a game belonged to one team, and the object was to accumulate resources for the team. In the competitive conditions, players were divided into two teams of two, and the object was to acquire more resources for one's own team than the opposing team. It was thus advantageous to give gifts to team-mates, but not to opponents. Except in the first round, however, players were not told whether they were negotiating with a team-mate or opponent until the end of the round. Since the alien language contained only twenty basic lexical items (e.g. I, want, meat, thanks etc.), there was little scope for developing explicit strategies, and players had to rely on linguistic cues to identify team-mates.
The frequency with which players were paired was also manipulated (see Table 1); players were not made aware of the frequency or order of pairings.
Players in the high-frequency competitive condition did significantly better than chance at recognising when they were paired with team-mates (S = 75 incorrect guesses out of 285; p < .001). In addition, the alien language diverged significantly into team 'dialects' in this condition only (p < .001), based on how often different players used different variant forms in the alien language (new variants having arisen chiefly through error). This suggests that a combination of frequent interaction and a pressure to mark identity can lead to divergence over a short time period. Neither factor, however, was sufficient on its own.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0102
Language is considered the great divide between the cognitive and social abilities of humans and those of other animals. How did language emerge and evolve into such a complex system? A promising approach is to compare and contrast language with animal signal systems, which involve key substrates for language. We focus on birdsong development from the perspective that language is a learned vocal behavior. Although birdsong differs from language in many ways, the two share biological foundations for vocal learning. Birdsong research has already yielded significant implications for language evolution; for example, the cultural evolution of birdsong (Fehér, Wang, Saar, Mitra, & Tchernichovski, 2009).
Here we study the development of phonology and syntax in Bengalese finch song. Adults sing complex songs that consist of a number of chunks, which in turn consist of a few patterned notes (Okanoya, 2004). Juveniles learn individually distinct songs by imitating adult males. To track song development in its entirety, 24-hour recordings were made of 16 juveniles every 4 to 5 days after hatching. During recording, each juvenile was kept in a soundproof box with a microphone, and all singing activity was recorded. From all the recordings, we computed six acoustic features of notes, such as note duration, mean pitch, and mean Wiener entropy. Figure 1 shows (a) phonological development in the acoustic feature space, and (b) syntactic development. At day 50, all notes were acoustically similar to one another. At day 60, notes with longer duration emerged abruptly, and then notes with harmonics diverged from the residuals, which gradually differentiated with development. Finally, eight types of notes emerged from a single acoustic stem-cluster. The recorded songs were converted to text by assigning a letter to each note type, and then a grammatical inference method (Kakishita, Sasahara, Nishino, Takahasi, & Okanoya, 2009) was applied. Before day 70, no syntax could be extracted due to the transitional instability of notes. Song notes stabilized with development, forming patterned chunks, and the transitions also gradually stabilized. After day 100, the song syntax was crystallized.
The results demonstrated a co-developmental process of phonology and syntax in birdsong: raised in a social environment, juveniles developed note types as well as sequential and sub-sequential structures. Despite the striking developmental parallels between birdsong and language, there is a significant difference. It is known that human infants develop words with the aid of contextual semantic cues. Words thus develop based not only on phonological rules but also on semantic constraints. Song chunks resemble words in form, but they develop without any atomic meanings. Birdsong therefore lacks 'double articulation,' by which small meaningless sound units combine into larger meaningful units. These observations suggest that a precursor of syntax, like song syntax, could emerge from a learned vocal behavior, evolving relatively independently of semantics; however, for syntax with double articulation to arise, semantic constraints are indispensable in development. What adaptive mechanism is required for this syntax-semantics entanglement to come into existence? Our findings raise further questions to be addressed.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0103
The following sections are included:
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0104
Chimpanzees, like many other species, produce specific vocalisations when they encounter food (Goodall, 1986). In chimpanzees these are called rough grunts. Listening individuals often approach the caller and share the food source, imposing a cost on the caller. In order for such a seemingly altruistic behaviour to be an evolutionarily stable strategy, the caller must accrue some benefits to offset these costs. Chimpanzees, unlike many other smaller bodied species, do not benefit from a reduction in predation risk or vigilance costs when attracting others to a food source. Instead, we propose that in chimpanzees this vocal behaviour fulfils a similar social function to grooming: the benefits lie in strengthening affiliative relationships with socially significant individuals.
Manual grooming serves to establish and maintain strong affiliative social relationships between individuals, but it is time consuming and can only be conducted during rest periods. It has been proposed that as group size increases it becomes increasingly difficult to satisfactorily service all relationships with grooming (Dunbar 1996). It is suggested that in evolution when humans reached such a critically large group size, our own species supplemented physical grooming with 'vocal grooming', to maintain social cohesion (Dunbar 1996). The production of such affiliative social vocal signals, that functioned to increase social bonds between individuals, may have been one of the earliest driving forces behind the evolution of language (Dunbar 1996). In order to test the hypothesis that chimpanzee food-associated calls function as 'vocal grooming' signals we conducted a systematic study of the factors that determine whether chimpanzees produce food-associated calls or not. We predicted that chimpanzees should be sensitive to the composition of their audience and preferentially produce calls in the presence of socially significant others, such as grooming partners.
We collected data on 9 free ranging adult male chimpanzees of the Budongo Forest, Uganda for 9 months. We collected ecological data about the food source, including an estimate of patch quality (cumulative feeding time of all chimps present). We also recorded the arrival and departure of all individuals to the focal individual's feeding tree and the timing of all food-associated calls along with the identity of the callers. In addition we recorded social data on the focal individual, including all grooming interactions.
We found that chimpanzees produced calls in just over half of all feeding bouts, indicating that calls are not simply an involuntary emotional reaction to food. Instead, in line with our predictions, we found that chimpanzees were more likely to produce food-associated calls when individuals they chose to groom were present, rather than absent. The presence of such a grooming partner was the factor that accounted for most variance in whether rough grunts were produced or not. The quality of the food patch also influenced the likelihood of call production, with higher quality patches eliciting more grunts; however, this explained less variance than the social factor. The presence of oestrus females and the number of individuals in the party did not influence calling behaviour.
These findings show that chimpanzees selectively produce food-associated calls in the presence of individuals they groom. Basic factors such as the number of individuals in the vicinity did not influence calling behaviour, indicating that simple mechanisms such as social facilitation cannot explain our results. Feeding takes up a considerable proportion of a wild chimpanzee's day (Goodall, 1986), and during this time they cannot engage in grooming, the primary mechanism we know of for strengthening social bonds between individuals. Our data suggest that rough grunts may represent a kind of 'vocal grooming' that allows individuals to maintain positive relationships with important others in the feeding context. This may represent an empirical example of vocal signals functioning to maintain relationships, which Dunbar (1996) argues could be one of the driving forces behind the evolution of human language.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0105
The evolution of language is a special case of the evolution of behavior. Evolutionary biologists have long recognized that behavioral change drives biological change, rather than the other way around (Mayr 1978). This has recently been highlighted specifically with respect to language evolution (e.g., Christiansen and Chater 2008).
In the context of human evolution, this means that cultural evolution will, to a large extent, drive biological evolution. The transition from quadrupedalism to bipedalism, for example, was driven by behavioral changes (Hunt 1994). We didn't evolve bipedal anatomy first, only to stumble upon its usefulness later. The spread of agriculture led to selection for sickle-cell alleles (Livingstone 1958). The domestication of dairying animals led to selection for continued lactase production (Durham 1991).
Applying this logic to language evolution, for every generation in which greater facility at communication was adaptive, individuals would have used pre-existing cognitive abilities to communicate as best they could. Genetic changes would have therefore been strongly biased towards those that modified pre-existing abilities, rather than entirely new neural circuits devoted exclusively to language. This also means that we should expect homologs of human language circuits in non-human primate brains (Schoenemann 1999).
Homologs of Broca's and Wernicke's areas have in fact been located in primates (Striedter 2005), and finding out what they use them for is critical to understanding the coevolutionary process that led to language in humans. One fruitful approach is to identify non-language abilities that are also processed in human language areas. Broca's area in humans has been implicated in non-linguistic sequential processing (Christiansen and Ellefson 2002; Petersson et al. 2004), hand/tool manipulation (e.g., Binkofski et al. 2000; Higuchi et al. 2009), and non-verbal auditory processing (e.g., Muller et al. 2001). Because these are non-linguistic, their functional localization can also be explored in non-human primates. If they also activate Broca's area homologs, this would support the view that language adapted to pre-existing cognitive architectures, rather than requiring the creation of completely new, language-specific brain areas.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0106
In the process of acquiring a second language outside the classroom, adult learners go through a stage that has been characterized as being (1) determined by a small number of organizational principles, (2) largely independent of the source or target language of the learner and (3) simple but successful for communication (Klein & Perdue, 1997). This stage is called the Basic Variety (henceforth BV). In the BV, a speaker constructs relatively short sentences, and a striking characteristic of these sentences is that there is no inflection. Examples of organizational principles of this variety are FocusLast ('put the information that is in focus, new information, at the end of the sentence') and AgentFirst ('the NP referent with the highest control comes first'). The BV is thus not seen as an imperfect version of the target language, but as an independent linguistic system.
In the talk I focus on the expression of temporal displacement (reference to past and future) in the BV. Languages generally have sophisticated ways to express temporal structure (tense and aspect), quite often through inflection on the verb. In the BV, verbs are used but usually not inflected. Still, people refer to past and future, and the way they do it seems a very effective and robust strategy, as in the following example from Starren (2001):
(1) 'Gisteren ik bergen gaan naar' (p. 149)
Yesterday I mountains go to
Yesterday, I went to the mountains
In this example a temporal adverb is fronted to indicate that the event described took place in the past. This strategy is observed in learners of different languages (even when it is highly marked or ungrammatical), as well as speakers of homesign (Benazzo, 2009).
The fact that strategies like the above are structurally found in the BV, and that they are largely independent from source and target language, plus the observation that there are similarities between the BV and other 'restricted linguistic systems' like homesign and pidgin, makes it interesting for the debate about the emergence and evolution of language. Evolutionary claims have been made on the basis of observations from the BV. E.g., Jackendoff (2002) hypothesises that the principles that govern the BV are fossil principles from protolanguage.
Data from the BV would be very welcome as a source of evidence in the language evolution debate, especially because a lot of data is available from learners of different source and target languages (Perdue, 1993). But to avoid mere speculations, we need to formulate precisely what the structures in the BV tell us about which aspects of the evolution of language, and why. Unfortunately, not many people have concentrated on these questions, although a general framework is sketched in Botha (2005). In the presentation, I concentrate on the hypothesis that data from the BV reveals information about early human language forms, and justify this hypothesis on the basis of two strategies that seem implicitly present in recent literature on restricted linguistic systems.
One strategy is to claim that the sentence structures found in BV utterances are direct reflections of cognitive biases, and that these biases were already present in our evolutionary ancestors. If we were to choose this strategy, we would have to explain why the cognitive structures that were relevant in our evolutionary ancestors are still relevant in speakers of the BV.
Another strategy becomes relevant once we take the claim seriously that utterances in the BV are shaped by communicative needs. The structure of the utterances in the BV might not be simply a reflection of the cognitive structures of their speakers, but be shaped indirectly by their usage of the structures in communication, and whether they reach communicative success.
I argue that, in order to arrive at a good justification for evolutionary claims on the basis of BV utterances, both strategies need to be taken into account, and I sketch a way to combine the two, by taking the second strategy as a basis, and showing that the role of cognitive biases can be incorporated in this approach.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0107
Ambiguity is commonplace and indeed inevitable in everyday language; an utterance produced in one context can have a quite different meaning in another context. Despite this, listeners almost always converge upon the speaker's intended meaning. How is this achieved? Grice's cooperative principle (Grice, 1975) provides a still widely accepted answer. It comprises four maxims of conversation: quality (tell the truth), quantity (do not say too much or too little), relation (be relevant) and manner (be clear and concise). It is, according to Grice, because listeners assume that speakers follow these maxims that they are able to interpret utterances in a contextually sensible way.
Since Grice's seminal contribution, numerous refinements, additions and extensions to his work have been proposed (e.g. Horn, 1984; Levinson, 1983). The Gricean foundation, however, remains widely accepted. This acceptance means that the neo-Gricean framework has also been influential in several related disciplines, including psycholinguistics (Clark, 1996), the philosophy of language (Lycan, 2008), and indeed language evolution (Cheney & Seyfarth, 2005; Gärdenfors, 2006; Haiman, 1996; Hurford, 2007). One alternative is Relevance Theory (Sperber & Wilson, 1995), which supplants the four maxims with a single notion of relevance, which is claimed to be more basic than the Gricean maxims. As such, Relevance Theory constitutes "an ambitious bid for a paradigm-change in pragmatics" (Levinson, 1989, p.469).
One way in which we can choose between competing theories in linguistics is to use evolutionary considerations (Kinsella, 2009). This presentation (which is based upon Scott-Phillips, in press) will describe a very basic and simple evolutionary game-theoretic model of the evolution of communication. It assumes only that listeners maximise their payoffs, and that speakers do the same, given that listeners will do this. Two entirely general statements about the evolution of communication are generated. These are functional descriptions that will apply to all evolved communication systems. It is then asked what they imply for linguistic communication in particular.
The answer is that they predict, quite precisely, the two principles of relevance that lie at the heart of Relevance Theory: that listeners will seek to maximise relevance, and that the very production of an utterance brings with it a guarantee of relevance. This suggests that something like Relevance Theory, and the cognitive mechanisms that it posits, must be correct; and hence that Relevance Theory, rather than the Gricean paradigm, should be the default framework for pragmatics.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0108
The following sections are included:
Note from Publisher: This article contains the abstract only.
https://doi.org/10.1142/9789814295222_0109
The following sections are included:
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0110
Theoretical accounts of language evolution usually argue exclusively for a vocal or gestural origin of language. Such theories draw heavily on comparative evidence of the communicative competencies of extant primates. Similarities between language and primate communication, in terms of gestures being flexible, intentional and generative and vocalisations being referential and following simple combinatorial rules, are often highlighted by the respective theories. Currently many theories also place significant weight on the comparison of primate abilities across the vocal and gestural modalities (e.g. Tomasello 2008; Corballis 2002). Arguments directly comparing the evidence from the two modalities, and using the absence of a certain facet in the opposing modality as evidence in favour of the other, are common. Does the evidence warrant such comparisons?
We present a systematic review of comparative communication studies published in peer-reviewed journals between 1960 and 2008. We suggest that cross-modal comparisons are problematic due to inherent biases in the methodological approach, study species and focus of unimodal research. For instance, three-quarters of gestural studies focus on great apes, compared with just one in ten vocal studies. The relative number of studies conducted with wild and captive primates also shows considerable divergence depending on the modality being studied. The proportion of experimental and observational studies in each modality is also inconsistent. Finally, partly due to methodological constraints, vocal and gestural researchers have tended to focus on communication in different contexts (evolutionarily urgent vs. relaxed social contexts). The review demonstrates that 95% of studies focus on only one modality. Given the stark differences in the profile of the research in each modality, many direct cross-modal comparisons appear problematic. In particular, such comparisons are hampered by the relative lack of vocal research on apes, of gestural work conducted in the wild and with monkeys, and of facial research that is either conducted in the wild or experimental in nature.
We question the validity of arguments favouring one modality over the other in terms of language evolution that are based on such an incomplete comparative dataset. We argue therefore that absence of evidence for a certain competence in a particular modality cannot be cited as absence of ability: it may simply reflect the inherent biases in the methodological approaches used to study each modality. We propose the way forward is firstly to focus research effort on filling the critical gaps in our knowledge. Secondly, as a complement to unimodal research, integrated multimodal research should be encouraged. This could offer a number of advantages. First, by studying modalities side by side we will use comparable methods and contexts for each modality, which is vital for generating data suitable for direct comparison. Second, since human language is customarily exchanged in a multimodal format, this may be the more appropriate comparison to language. Communication is important not simply to transfer information about the external environment, but also to learn about others' emotions and motivations. By examining how vocal, gestural and facial communication interact, we may be able to enhance our understanding of emotional and cognitive integration in communication and consequently better understand the phylogenetic precursors to human language.
In conclusion, we urge proponents of unimodal theories of language origin to consider the validity of critically comparing the competencies of primates in different modalities without consideration of the inherent biases and gaps in our present understanding of each modality. We propose that empirical primate research should focus on addressing the gaps in our knowledge, and that integrated multimodal research should be encouraged.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0111
Natural languages do not differ arbitrarily, but are constrained so that certain properties recur across the languages of the world. These constraints presumably arise, at least in part, from the nature of the human brain, but the nature of the mapping from brain to language structure is unclear. One theory argues that strong (or even absolute) constraints are built into the language faculty and imposed by individual learners (Chomsky, 1965). An alternative suggestion (e.g. Kirby, Dowman & Griffiths, 2007) is that the same typological distributions could arise given only weak biases in individual learners, as a consequence of cultural transmission within populations. This debate has profound implications for theories of the origins and evolution of language, because the culturally-mediated mapping between learner biases and language structure complicates the biological evolution of the language faculty (see e.g. Smith & Kirby, 2008).
A test-case for the relationship between cognitive biases of individuals and structural properties of language is linguistic variation. Variation in language tends to be predictable: in general, no two linguistic forms will occur in precisely the same environments and perform precisely the same functions. Instead, usage of alternate forms is conditioned in accordance with phonological, semantic, pragmatic or sociolinguistic criteria. Experimental studies (e.g. Hudson Kam & Newport, 2005) show that, given a language in which two forms are in free variation, adult learners tend to probability match (i.e. produce each variant according to its frequency in the input), whereas children are more likely to regularize, suggesting that unpredictable variation is absent from natural languages simply because it cannot be acquired by children.
We show here that iterated learning can produce linguistically-conditioned stable variability (see also Reali & Griffiths, 2009). 25 adult participants were trained on an artificial language exhibiting unpredictable variation (plurality could be marked using two forms, which alternated freely). These participants showed no evidence of having eliminated this variability, i.e. they appeared to probability match. Ten of these participants were then used as the first generation for ten independent iterated learning chains, with the language produced by the first generation on test being used to train the second generation, and so on. Variability was preserved across five generations in seven of the ten chains. However, the predictability of that variability gradually increased, until nine of the ten chains exhibited entirely predictable plural marking: the choice of marker became conditioned on the noun being marked. This demonstrates that adult learners have a relatively weak bias against unpredictable variation (not detectable in a sample of 25 individual learners), which nonetheless becomes apparent through iterated learning. The predictability of variation in natural language might therefore be explained as a consequence of either strong learner biases against unpredictability (in children), or the repeated application of far weaker biases (in adults, or children, or both).
Cultural transmission may act to amplify weak biases, and therefore obscure the relationship between learner biases and the linguistic consequences of those biases. This implies that we cannot simply read off the biases of learners from population-level behaviour, nor extrapolate with confidence from individual-based experiments to population-level phenomena. Furthermore, evolutionary pressures acting on the language faculty face a similarly opaque mapping between the structure of languages (over which selection presumably acts) and the cognitive traits that produce those linguistic structures.
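The amplification of a weak individual bias through repeated transmission can be illustrated with a toy simulation. This sketch is not the authors' experimental design; the learner model (probability matching plus a small nudge toward the nearer extreme) and all parameter values are illustrative assumptions.

```python
import random

def iterate_chain(p0, generations, n_obs, bias, rng):
    """Transmit a two-variant free-variation marker down a chain of learners.

    Each learner observes n_obs uses of plural marking, estimates the
    frequency of variant A by probability matching, then shifts that
    estimate by a weak regularization bias toward the nearer extreme.
    The learner's estimate becomes the next generation's input frequency.
    """
    p = p0
    for _ in range(generations):
        k = sum(rng.random() < p for _ in range(n_obs))
        est = k / n_obs                      # probability matching
        target = 1.0 if est >= 0.5 else 0.0  # nearer extreme
        p = est + bias * (target - est)      # weak regularization bias
    return p

# Ten independent chains, each starting from fully free variation (p = 0.5)
finals = [iterate_chain(0.5, generations=100, n_obs=50, bias=0.05,
                        rng=random.Random(i)) for i in range(10)]
```

With a bias this small, a single learner's output is barely distinguishable from probability matching, yet over many generations most chains tend to drift toward categorical (fully predictable) marking, mirroring the amplification effect described above.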
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0112
The use and testing of computational models has become a core methodology for formalising, operationalising and testing theories across the natural sciences. Applied to language origins research, such methods can make explicit the assumptions needed to make a particular theory work, leading to conclusions potentially useful to evolutionary linguists, psychologists, archaeologists and anthropologists.
A frequent objection to the computational models and experiments developed by Steels and his team (e.g. Steels 2009) is that they are biologically implausible. Steels' robotic agents spontaneously evolving lexicons and grammars as they repeatedly interact are not Darwinian organisms. Lacking selfish genes, they don't fight or deploy their signalling capacities for purposes of deception. Since their signals need not demonstrate honesty or reliability, the strategic costs of producing an effective signal in the animal world (Maynard Smith & Harper 2003) are absent by experimental design. Many of the problems likely to have been encountered by ancestral humans attempting to establish linguistic communication are consequently not represented.
Steels' work is useful to the extent that it differentiates problems already solved from those which cry out to be addressed using other methods and assumptions. Symbolic communication by its very nature presupposes intentional honesty and communal coherence. Speakers might occasionally cheat once a linguistic code is in place, but a shared code cannot be established unless honesty is the default. It is not difficult to release robots into a community free of competition or conflict. Signallers may then communicate on the basis of infinite trust. In fact, levels of trust beyond anything biologically plausible have been shown to be optimal for linguistic self-organization across a community (Steels 2009). This is an interesting result, confronting Darwinians with a theoretical challenge. Can natural selection design minds to expect 'infinite trust'?
What if Steels' robots could be designated male and female, each male seeking out fertile females and calculating whether to invest in his current partner's offspring or abandon her in favour of mating opportunities elsewhere? Female robots seeking to attract investment for their offspring would develop strategies aimed at maximizing the costs of philandering. The Female Cosmetic Coalitions (FCC) model (Power 2009) sets out from assumptions in Darwinian behavioural ecology. Instead of invoking principles such as kin selection or reciprocal altruism in the abstract, it accounts for distinctively human ultrasociality under specified conditions, distinguishing female fitness-enhancing strategies from male ones, differentiating between adjacent generations and connecting the logic at all stages to palaeoanthropological and archaeological data. It posits costly communal ritual as the mechanism capable of enforcing cooperation across whole communities, and explains why such ritual should be focused on initiation, especially female initiation timed to coincide with first menstruation. It explains how the experience of initiation transposes language-ready minds from 'brute' reality into 'virtual' or 'institutional' reality – a world of patent fictions collectively taken on trust. It makes fine-grained predictions testable in the light of archaeological data – predictions recently vindicated by finds of Middle Stone Age ochre pigments at Blombos Cave and comparable South African sites. Until a rival hypothesis emerges, FCC seems the most promising way of connecting Steels' experiments and findings with the available archaeological and other empirical data.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0113
Different models of language change and evolution place emphasis on different factors driving evolution. Croft (2000), for instance, highlights the effects of social prestige as a selective pressure behind language change. This paper outlines a new methodology to quantitatively assess whether a proposed factor exerts selective pressure on the evolution of linguistic variants, or whether evolution is neutral with respect to that factor. The method involves running simulations of the spread of linguistic variants (using Pólya urn dynamics, see below) and then applying the Price equation, a tool from evolutionary biology (Price, 1970) recently applied to models of language evolution (Jaeger, 2008), to the simulation outcomes. A Pólya urn contains a number of tokens of different variant types. At each time step a token is drawn at random and then it is returned to the urn and n tokens of the same type are added to the urn. Urns represent agents and the tokens are exemplars of cultural variants. Drawing a token stands for production of an exemplar and addition of new tokens represents storage of perceived exemplars. The variant population evolves as the relative proportions of the types change.
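The urn dynamics just described can be sketched in a few lines of code. This is an illustrative reconstruction, not the authors' implementation; the variant names, counts, and reinforcement values are assumptions chosen to match the prestige manipulation described below (three tokens added for a high-prestige variant, one for a low-prestige variant).

```python
import random

def polya_urn(initial, steps, reinforcement, rng=None):
    """Simulate Pólya urn dynamics over cultural variant types.

    initial: dict mapping variant type -> starting token count
    reinforcement: dict mapping variant type -> number of tokens of that
        type added to the urn each time it is drawn
    Drawing a token models production of an exemplar; adding tokens
    models storage of perceived exemplars. Returns final token counts.
    """
    rng = rng or random.Random()
    urn = dict(initial)
    for _ in range(steps):
        types = list(urn)
        weights = [urn[t] for t in types]
        drawn = rng.choices(types, weights=weights)[0]  # production
        urn[drawn] += reinforcement[drawn]              # storage
    return urn

# Prestige condition: high-prestige variant reinforced with 3 tokens, low with 1
urn = polya_urn({"hi": 10, "lo": 10}, steps=100,
                reinforcement={"hi": 3, "lo": 1}, rng=random.Random(0))
```

The "no prestige" condition corresponds to setting every reinforcement value to 1, so that each type's growth depends only on how often it happens to be drawn.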
The Price equation (Eqn. 1) quantifies the respective contributions of selection (the covariance term in Eqn. 1) and transmission error (the expectation term in Eqn. 1) to change in a quantifiable feature z of the tokens (Δz in Eqn. 1). In this paper we focus on selection; if we find that the covariance term differs from zero, we can infer that the feature constitutes a selective pressure. The proposed methodology is illustrated by examining whether the factor "variant prestige" exerts selective pressure on variant evolution in the Pólya urn simulations. In a simulation with variant prestige in place, when a high-prestige variant is selected, three tokens of that type are added to the urn (modeling production of a high-prestige variant having a high impact on hearers); conversely, when a low-prestige variant is drawn, only one token of that type is added to the urn. In the "no prestige" condition, one token is added regardless of which type is drawn.
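The abstract refers to the Price equation as Eqn. 1 without reproducing it. In its standard form (Price, 1970), with w_i the fitness of type i, z_i its feature value, and bars denoting population averages, it reads:

\[
\Delta \bar{z} \;=\; \underbrace{\frac{\mathrm{Cov}(w_i, z_i)}{\bar{w}}}_{\text{selection}} \;+\; \underbrace{\frac{\mathrm{E}(w_i\,\Delta z_i)}{\bar{w}}}_{\text{transmission}}
\]

The first (covariance) term is the selection component examined in this abstract; the second (expectation) term captures transmission error.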
The Price equation is applied at every timestep in the simulation by comparing the state of the urn at the current and the previous timestep. Fig. 1 shows the average covariance values with and without prestige. Positive covariance in the right-hand plot indicates that prestige level covaries with fitness; prestige therefore exerts a positive selective pressure on variant evolution.
This is a simple illustration of a methodology that can be extended in multiple ways: by examining the second term of the Price equation, transmission error can be investigated; the Pólya urn simulation can be extended to include social network structure, learning algorithms, generation turnover, random or directed mutation, etc.; the feature of interest can be not only social prestige, but also novelty value, resilience to noise, ease of production etc.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0114
Inspired by the assumption that all evolutionary systems are instantiations of the same process of random variation and differential retention of variants (Hull, 1988), a number of scholars have based their models of language change and evolution on evolutionary biology. The convention in evolutionary anthropology and archaeology tends to be that knowledge, skills and values residing in people's brains constitute the cultural genotype, while artifacts and behaviours constitute the phenotype (e.g. Boyd & Richerson, 1995). Some models of language evolution also see the genotype in mental entities such as rules (Kirby, 1999) or information in neural assemblies (Ritt, 2004). Others, however, define mental phenotypes: grammars, or sets of learned rules (Croft, 2000; Mufwene, 2008) and public, behavioural genotypes residing in usage data (utterances). This paper offers theoretical support and evidence for the latter option.
The relationship between phenotype and genotype is an asymmetric one: the central dogma of molecular biology (Crick, 1970) states that information cannot flow back from protein to gene. Generalising, phenotypic features, acquired during development, cannot be encoded in the genotype and therefore cannot be inherited. Genotypic information acquired during replication or mutation, on the other hand, is indeed heritable.
Let us consider two examples from language change. First, the ongoing collapse of a three-gender system into a two-gender system in Dutch. Dutch masculine and feminine nouns take the definite article de (while neuter nouns take the article het), and speakers in some communities cannot tell the gender of etymologically masculine or feminine words. When prompted to produce an utterance where a de-noun requires a (gendered) possessive, some speakers will apply the feminine possessive and others the masculine. This is evidence that, from the same usage data, some learners induce the masculine rule for a given noun while others induce the feminine rule. If the rules are the genotype, which replicates when speakers induce rules from data, and the data is the phenotype, we have here a case where genotypic information is not being faithfully replicated from generation to generation: different people have different rules in their grammars, but this is not noticed during communication because de-nouns seldom have gendered modifiers. However, if the data is the genotype and the rules are the phenotype that develops from the interaction between the data and the learner's brain, the different rules induced by different learners are simply phenotypes with different developmental trajectories. These different rules nevertheless faithfully replicate, as expected, the genotypic information (de followed by noun) in the data they produce.
Second, during the process of degemination, where a double consonant becomes a single consonant (e.g. Latin cuppa becomes Spanish copa), speakers before the change had a rule that distinguished between double and single consonants. After the change, speakers have a new rule for pronouncing the consonants that does not include such a distinction. During the transition, speakers with the distinction rule produced data where double and single consonants were barely distinguishable; learners exposed to such data must have induced the no-distinction rule. If the rule is the genotype and the data is the phenotype that develops from the interaction between the rule and the social communicative environment, the loss of distinction between double and single consonants is acquired during development of the phenotype (production of data). The central dogma would not allow that information to be encoded into the genotype (the learners' rule); yet here phenotypic information does precisely that and continues to be inherited over subsequent generations. This is why it has been proposed that cultural evolution is Lamarckian, as it allows inheritance of acquired characters. A solution to this problem that does not require appealing to Lamarckism is to take the data to be the genotype and the rule to be the phenotype that develops from the interaction between the data and the brain during social communication. Now, the loss of distinction is an error in replication of the genotype (a mutation), which is, as expected, heritable.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0115
The following sections are included:
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0116
According to a controversial hypothesis, a characteristic unique to human language is recursion (Hauser, Chomsky and Fitch, 2002). Contradicting this hypothesis, it has been claimed that the starling, one of the two animal species tested for this ability to date, is able to distinguish between acoustic stimuli based on the presence or absence of an abstract, center-embedded recursive structure, i.e. they distinguished between song fragments (A, B, etc.) structured as AABB (recursive) and ABAB (non-recursive) (Gentner et al., 2006). In our experiment (van Heijningen et al., 2009) we show that another songbird species, the zebra finch, can also discriminate between artificial song stimuli with these structures. Zebra finches are able to generalize this to new songs constructed using novel elements belonging to the same categories, similar to starlings, i.e. new A and B type exemplars. However, to demonstrate that this is based on the ability to detect the putative recursive structure, it is critical to test whether the birds can also distinguish songs with the same structure consisting of elements belonging to novel, unfamiliar categories, in this case C's and D's. We performed this test and show that seven out of eight zebra finches failed it. This suggests that the acquired discrimination was based on phonetic rather than syntactic generalization. The eighth bird, however, must have used more abstract, structural cues. Nevertheless, further probe testing showed that the results of this bird, as well as those of the others, could be explained by simpler rules than recursive ones. The eighth bird, for instance, used 'recency xx', meaning that it seemed to respond to structures ending with a repeated element, and so made the distinction between AABB and ABAB instead of using all elements.
Although our study casts doubts on whether the rules used by starlings and zebra finches really provide evidence for the ability to detect recursion as present in 'context-free' syntax, it does provide clear evidence for abstract learning of vocal structure in a songbird.
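To make the point about simpler rules concrete, the 'recency xx' heuristic attributed to the eighth bird reduces to a check on the final two elements. This is a toy sketch only: the actual stimuli were acoustic song motifs, represented here as letter strings for illustration:

```python
def recency_xx(song):
    """'Recency xx' heuristic: respond only to strings ending in a
    repeated element category, which separates AABB from ABAB without
    any sensitivity to the full (putatively recursive) structure."""
    return len(song) >= 2 and song[-1] == song[-2]

print(recency_xx("AABB"), recency_xx("ABAB"))  # True False
```

A bird applying this rule would pass the AABB/ABAB discrimination while processing only the last two elements, which is why such heuristics undercut the recursion interpretation.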
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0117
The critical period for language acquisition is often assumed to be nothing more than a by-product of development. However, evolutionary computer simulations show that it can be explained as a result of biological evolution (Hurford, 1991). In the present study the aim is not to explain how and why this age sensitivity evolved but to investigate the consequences of this individual-level disadvantage on a culturally evolving vowel system as a whole. Using two different agent-based computer models it will be argued that a difference in learning ability between children and adults can improve the stabilization and preservation of complexity of vowel systems in a changing population.
The first model is a re-implementation of the one described by de Boer and Vogt (1999), which consists of a population of agents that interact through imitation games using realistic mechanisms for the production and perception of vowels. The agents have a vowel memory in which they store learned prototypes of vowels, and in response to their interactions with other agents they update their memory and learn new sounds. Analogous to the results of de Boer and Vogt (1999), the model shows that, in a population in which new members are born and old members die, a critical period stabilizes vowel systems over the generations. In this case the adults provide the learners with a stable target, facilitating the acquisition process. Figure 1 shows the difference in the changes of the vowel system after transmission in a population with and without age structure.
The second model is a variation on the first which integrates the linguistic paradigm of Optimality Theory (OT). In this version of the model, the agents imitate each other using their own bidirectional stochastic OT grammar (Boersma & Hamann, 2008) consisting of a ranked set of articulatory and cue constraints. To produce or perceive a speech signal, a set of possible candidate forms is evaluated by the grammar. The candidate that violates the fewest highly ranked constraints is selected. In response to their interactions with other agents they learn by adjusting the ranking values in their grammar. This new approach replicates the stabilizing effects on the emerged vowel systems.
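The evaluation step of such a stochastic OT grammar can be sketched in a few lines. This is an illustrative sketch only, not the authors' model: the constraint names, ranking values and noise level below are invented, and the learning step (adjusting ranking values after interactions) is omitted:

```python
import random

def stochastic_ot_select(candidates, rank_values, violations, noise_sd=2.0):
    """Stochastic OT evaluation (after Boersma): perturb each constraint's
    ranking value with Gaussian noise, order the constraints by the noisy
    values, then pick the candidate whose violation profile, read with the
    highest-ranked constraint first, is lexicographically smallest."""
    noisy = {c: v + random.gauss(0, noise_sd) for c, v in rank_values.items()}
    order = sorted(noisy, key=noisy.get, reverse=True)  # highest rank first
    return min(candidates,
               key=lambda cand: tuple(violations[cand][c] for c in order))

# Hypothetical toy tableau: final devoicing, *VOICED-CODA >> IDENT(voice).
rank_values = {"*VOICED-CODA": 100.0, "IDENT(voice)": 90.0}
violations = {
    "bed": {"*VOICED-CODA": 1, "IDENT(voice)": 0},
    "bet": {"*VOICED-CODA": 0, "IDENT(voice)": 1},
}

random.seed(0)
print(stochastic_ot_select(["bed", "bet"], rank_values, violations))
```

With the noise set to zero the grammar is deterministic and always selects "bet" here; with noise, occasional ranking reversals produce the variable outputs that drive gradual learning in the stochastic OT framework.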
The results suggest that the critical period might be more than just an unfortunate consequence of development since its influence can be beneficial at the population level. If complexity can be more faithfully transmitted from one generation to the next, there is less need for new agents to reinvent structures that were already present. A cultural behavior can lead to biological adaptations for this behavior (the Baldwin effect). However, in language, one of the obstacles involved in this process is change (Christiansen & Chater, 2008). Biological evolution takes a long time to evolve adaptations, so the more stable the linguistic environment, the higher the expected role of the Baldwin effect. We propose that the age structure plays a role in the evolution of adaptations for functional features of language. It may provide the stability needed for the evolution of learning biases that favor the acquisition of more complex speech (and language).
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0118
The following sections are included:
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0119
To gain insight into possible scenarios for evolution from what we know about the development of language, it is useful to distinguish what is 'old' from what is 'new' in the species. I take just two properties of human language to be critically 'new': first, the wide diversity of sound patterns made possible by our uniquely flexible speech production mechanism – whereas the corresponding perceptual apparatus, widely shared by other mammals, is evolutionarily ancient (Hauser, 1996); and second, our capacity for declarative memory, which makes it possible to rapidly retain arbitrary sound-meaning pairings and thus to encode symbolic meaning (Deacon, 1997) – whereas the slow mechanisms of procedural memory are broadly characteristic of biological species (O'Reilly & Norman, 2002).
To understand ontogenetic development from (i) the production of speech-like or 'canonical' syllables, which emerge quite suddenly in the middle of the first year of life, through (ii) interpersonal discourse, which supports the construction of meaning, to (iii) symbolic word use and the beginnings of phonological systematicity in the second year, just two more characteristics of human brain function must be added: the Mirror Neuron response (di Pellegrino et al., 1992), which makes the actions of others salient sources of imitative behavior once a child's own repertoire includes those actions (Vihman, 2002), and the rhythmic underpinning of emergent motor skills (Thelen, 1981), including canonical babbling (Oller, 2000). One key element in this developmental profile cannot be projected back to evolutionary time – namely, 'interpersonal discourse'. However, Knight (2000) argued persuasively that vocal exploration in the safety of the mother-child relationship, with instinctive turn-taking and mutual imitation, is a highly plausible source for the 'discovery' of the potential of distinctive sound patterns for carrying referential meaning.
The first vocal symbols take on meaning from their use in consistent situational or affective contexts, whether learned from adults or, in the case of 'protowords', 'invented' as a response to the expressive impulse (Vihman, 1996). The next step is the formation of categories of prosodic shapes or 'templates' based on distributional learning over the early vocal forms (or exemplars), as is found in both first and second language learning (Ellis, 2005; Vihman & Croft, 2007). Here the principles of rhythm, alliteration, assonance, etc. which underlie adult poetic practice (and which serve a mnemonic as well as an aesthetic function) must be at work. Since subtle templatic patterns have been identified in both Semitic (McCarthy & Prince, 1995) and non-Semitic adult languages (Scheer, 2004), this patterning in the service of memory, which we see as the origins of system in the child, could have arisen in a similar way in prehistory, through cycles of declarative and procedural learning – once vocal exploration had led to the first symbolic expression.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0120
The heterogeneous category of phenomena covered by the term body language (roughly equivalent to nonverbal communication, NVC), although essential to human day-to-day communication, is also largely dissociable from human verbal behaviour. As such, it has received little attention in the area of evolution of language research. In this paper we point to an important factor – signal reliability (honesty) as an elementary constraint on communication as an evolutionarily stable strategy (ESS) – which shows promise of restoring the relevance of broadly construed body language to the evolution of language.
Contemporary research on the emergence of language-like communication has tended to target the language-related cognitive capacities, with relatively less focus on the fundamental game-theoretic constraints dictated by evolutionary logic. Communication, in order to remain an ESS, must be honest, i.e. signals must be reliably correlated with those aspects of the environment for which they are shorthand. Despite suggestions of possible mechanisms (e.g. Scott-Phillips 2008), the origin of honest, cooperative signalling in human phylogeny remains among the least understood aspects of the evolution of language.
It has been compellingly argued that the evolution of communication in nonhuman animals is reception-driven, i.e. it is the receivers that are selected to "acquire information from signalers who do not, in the human sense, intend to provide it" (Seyfarth & Cheney 2003: 168). Body language is characterised by similar properties, that is the transfer of information not intentionally provided by the signaller. Crucially, it is this last property that makes body language resistant to manipulation, and thus endows it with relatively high signal reliability (honesty). At the same time, in mimetic (Donald 1991) creatures, body language can be brought under limited voluntary control by the signaller, with its elements selected as self-contained individual communicative segments. Consequently, although lacking continuity in most other respects, in this respect body language becomes continuous with language-like communication. This fact is most clearly reflected in gesture studies where gesticulations are placed on a continuum, through pantomime and emblems, to linguistic signs (McNeill 2005).
We argue that the set of phenomena subsumed under the term 'body language' is very likely to have played an essential role at the critical bootstrapping stages of (proto)language evolution by attenuating its initial fragility. At a minimum, body language could have provided a reliable frame of reference to check against during exchanges of the first language-like messages (e.g. Laver & Hutcheson 1972 for examples from modern human communication). More boldly, however, it can be proposed that microbehaviours originating in body language could have themselves been taken over and employed as segments in a qualitatively new communicative system. This possibility is relevant to increasingly popular 'gesture-first' theories (e.g. Corballis 2002), and still more relevant to 'gesture-together-with-speech' theories (e.g. McNeill 2005), providing a noteworthy alternative to the assumption that the first signs had their origins, through ritualisation or otherwise, in instrumental action. We offer this last suggestion merely as an interesting conjecture, which nevertheless has the merit of pointing to an as yet unexplored research area.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0121
In humans, facial expressions, whether or not associated with speech, constitute a means of communication in themselves. By contrast, facial and vocal productions in nonhuman primates are essential communicative signals that form an integral part of inter-individual social interactions. Measures of asymmetrical facial expressions in humans show a right hemi-face bias when speaking and a left-side bias for expressing facial emotions (Graves & Landis, 1990). These findings support the right hemispheric dominance theory for the control of emotions whatever the emotional valence (e.g., Borod et al., 1997). However, Davidson et al. (1990) have formulated a hypothesis, named the valence theory, according to which negative emotions are controlled by the right cerebral hemisphere and positive ones by the left cerebral hemisphere. Few studies have investigated hemispheric lateralization for vocal and facial productions in nonhuman primates. In chimpanzees and in rhesus monkeys, a significant leftward bias (hence right hemisphere dominance) for emotional expressions was found, whereas a significant left bias for positive emotions and a right bias for negative ones was reported in marmosets (for a review, see Hopkins & Fernandez-Carriba, 2002). Reynold Losin et al. (2008) found a right hemi-face bias for producing learned vocal signals (atypical sounds intentionally produced by captive chimpanzees). In view of these few studies, one of our primary interests is to study the presence of asymmetrical oro-facial productions in baboons, in order to determine whether these communicative signals reflect cerebral control related to homologues of the language areas or are instead of a purely emotional nature. According to the emotional hemispheric dominance theory, it is expected that the left side of the baboon's face, and thus the right hemisphere, would be more involved in the production of vocal and facial expressions whatever the emotional valence.
However, in compliance with the valence theory, we should observe an involvement of the right hemisphere for negative emotions and a left hemisphere specialization for positive emotions. Still images of full expressions were obtained from videos of a sample of 73 captive baboons (Papio anubis), covering affiliative (lipsmack, copulation calls) and agonistic behaviors (screeching, eyebrow raising). To analyze the still pictures, a line is drawn between the inner corners of the eyes and compared to the horizontal lines on a fixed grid in order to rotate the face into a vertical position. A perpendicular vertical line drawn at the midpoint of the line between the inner corners of the eyes splits the face into two halves. To measure each hemi-mouth's area, a freehand line is drawn along the inner side of each hemi-mouth and the surface (in pixels) is calculated for the two hemi-mouths. A Facial Asymmetry Index (FAI) is calculated by subtracting the left hemi-mouth area from the right hemi-mouth area and dividing by the sum of the right and left measures. The results are still under analysis, but some interesting findings are already available. For screeching, an agonistic behavior, a significant leftward bias appeared (t(48) = -0.07, p < .01). The results will be discussed in the light of the available literature concerning asymmetrical facial and vocal emotional productions in nonhuman primates and hypotheses regarding language evolution from the perspective of our primate heritage.
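The FAI computation described above is a simple ratio; the sketch below uses hypothetical pixel counts purely to illustrate the sign convention (negative values indicate a larger left hemi-mouth, i.e. a leftward bias):

```python
def facial_asymmetry_index(left_area, right_area):
    """FAI = (right - left) / (right + left), on hemi-mouth areas in
    pixels; negative values indicate a leftward (left hemi-mouth) bias."""
    return (right_area - left_area) / (right_area + left_area)

# Hypothetical areas: a larger left hemi-mouth yields a negative FAI.
print(facial_asymmetry_index(left_area=1200, right_area=1000))
```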
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0122
Several insights of evolutionary biologist Richard Dawkins have contributed to the study of language evolution. One often cited is the meme – a culturally-transmitted replicator (Dawkins, 1976). In addition, the work of Krebs and Dawkins (1984) on animal signaling is widely referenced. One less-cited Dawkins concept is the extended phenotype (Dawkins, 1982). Kirby observed that language is "part of our extended phenotype" (1998, emphasis his), and Levinson and Evans state that humans have "a very highly developed 'extended phenotype'" (2009), but how might this apply to language evolution?
It is the view of Dawkins that the somatic phenotype is incomplete, that the effects of genes can also constitute an extended phenotype ("EP") in the form of artifacts or effects on the behavior of others. This concept was introduced as part of his larger project of developing a "gene-centric" view of evolution, so his book on the topic (Dawkins, 1982) does not discuss language as such.
For example, consider the spider's web. Although an artifact, the web is as much a part of her phenotype as her legs or eyes, "a huge extension of the effective catchment area of her predatory organs" (Dawkins, 1982). The work of beavers is a related artifactual example, although there is a key difference. The web is transient; its functional value disappears with the death of the spider. A dam, on the other hand, can outlive the beaver(s) that built it and continue to function for future populations.
Dawkins also considers phenotypes extended not by construction but by instruction. Many examples can be found in the somewhat gruesome world of animal parasites. At some stage in their often complex life cycles, many parasites use chemical signals to control the behavior of their intermediate hosts, often to the hosts' detriment (Moore, 2002). This is a different sort of EP, not artifactual but behavioral.
A more complex behavioral example is the alarm system of vervet monkeys (Cheney & Seyfarth, 1990). They use three calls–one for each of three predator classes and each yielding a specific evasive behavior. When one monkey sees a predator, it emits the characteristic alarm and nearby monkeys engage in the appropriate behavior. The "lookout" is part of the EP of every other monkey, but the others also are part of the lookout's EP. The monkeys appear to share a common, socially-constructed EP, but their audible behavioral controls work only in the present time and in the local environment.
Do these examples clarify whether the EP is relevant to language evolution? Two thoughts come to mind. First, language is an artifact like a spider's web or a beaver's dam. It persists as long as new speakers come into being, and its functionality outlives its creators. But it is also an artifact that can continue to influence the behavior of others – not only a construction, but a construction that embodies instruction. Through language, the human phenotype can extend without limit spatially and temporally.
One constraint is that the EP is limited to the effects of language on others' behavior, and while these effects contribute raw material to natural selection (Waddington, 1972), they are not the entire story. Also, it is debatable whether a strict "gene-centric" reading of Dawkins includes artifacts and influences that are not under direct genetic control. Perhaps when Kirby and others speak of the EP, they are speaking metaphorically. But metaphor or not, the EP appears worthy of further analysis for thinking about language evolution.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0123
In all human languages, the link between forms and meanings is highly systematic. Particular forms correspond to particular meanings, and particular compositions of forms correspond to particular compositions of meanings (compositionality). The conditions under which compositional languages may evolve have been extensively studied, often employing computational models (Briscoe, 2002). Such conditions are for instance the presence of a learning bottleneck (Kirby, 2002) and the presence of certain innate or acquired learning biases (Smith, 2003).
Smith (2003) developed a model to investigate which learning biases are required for a compositional language to evolve in a population of agents. In the model, each agent is modelled as an association network linking signals to meanings. The algorithms for signal production and signal interpretation strongly resemble encoding and decoding. For instance, interpreting a signal amounts to retrieving the meaning to which the signal is most strongly associated.
Smith (2003) concluded that two learning biases are necessary for the emergence of a highly compositional language. First, the agents in the population need a bias in favour of one-to-one mappings between signals and meanings. Second, the agents need a bias in favour of decomposing signals and meanings into smaller parts. In a population of agents lacking one or both of these biases, a compositional language cannot be maintained through a learning bottleneck.
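The encoding/decoding character of such an association network can be sketched minimally as follows. This is an illustrative sketch in the spirit of Smith's model, not a reproduction of it: the meanings, signals and weight-update rule are invented, and the structured one-to-one and decomposition biases discussed above are not implemented:

```python
class Agent:
    """Minimal association network: a weight for every <meaning, signal>
    pair; production and interpretation simply read off the strongest
    association, i.e. pure encoding and decoding."""

    def __init__(self, meanings, signals):
        self.meanings, self.signals = meanings, signals
        self.w = {(m, s): 0.0 for m in meanings for s in signals}

    def learn(self, meaning, signal, dw=1.0):
        """Strengthen the association for an observed pair."""
        self.w[(meaning, signal)] += dw

    def produce(self, meaning):
        """Encode: emit the signal most strongly associated with the meaning."""
        return max(self.signals, key=lambda s: self.w[(meaning, s)])

    def interpret(self, signal):
        """Decode: retrieve the meaning most strongly associated with the signal."""
        return max(self.meanings, key=lambda m: self.w[(m, signal)])

# Hypothetical observed <meaning, signal> pairs from a teacher.
agent = Agent(["m1", "m2"], ["sa", "sb"])
for m, s in [("m1", "sa"), ("m2", "sb")]:
    agent.learn(m, s)
print(agent.interpret("sa"))  # -> m1
```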
This conclusion holds, at least, when the agents involved do not possess any inferential capabilities. The model in Smith (2003) is based on the code model of communication, which assumes that signal production and interpretation can be fully described as a matter of encoding and decoding (Shannon & Weaver, 1949). The central position of the code model in many computational models of language evolution has been a recurring topic of discussion at previous editions of the Evolang conference. What happens to the necessity of certain learning biases, as proposed by Smith (2003), if the assumptions underlying the code model of communication are dropped?
I will present a re-implementation of the model by Smith (2003) that addresses this question in two ways. First, following Langacker (1987), the model attempts to embed linguistic knowledge in the more general framework of conceptual knowledge. Utterance production and interpretation are regarded as the same cognitive process, rather than as the two opposed processes of encoding and decoding. Additionally, following Hoefler (2009), the (cognitive) distinction between signals and meanings is dropped entirely.
Second, the model incorporates the inferential account of communication as formulated within Relevance Theory (Sperber & Wilson, 1995). The inferential account of communication describes utterance interpretation as an inferential process, rather than a mere decoding. Relevance Theory grounds inference in the fundamental principle that all cognitive processes tend to the maximisation of relevance, i.e. obtaining the highest effect, which is context-dependent, with the least effort. Context, finally, is modelled using the notions of ignorable and inferable information as introduced by Hoefler (2009).
Abandoning the code model as such has resulted in a synthesis of the model developed by Smith (2003) to simulate learning biases and the model developed by Hoefler (2009) to simulate the role of context. The results obtained with this synthesis confirm the findings of Smith (2003) and Hoefler (2009) individually. However, exploring the parameter space further, by combining different learning biases with different kinds of contexts, has so far revealed an interesting and complex interplay of language learning and language use.
I will explain how the learning bottleneck, learning biases and inference may together determine the evolution of systematic languages. Based on the results obtained, I will discuss the validity of the code model of communication for simulations of language evolution.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0124
Over the last decade, the potential explanatory power of cultural evolution has been widely promoted in the field of language evolution. However, a recent study reports an intriguing case of cultural evolution (Fehér, Wang, Saar, Mitra, & Tchernichovski, 2009). Birds reared in a deprived environment, without exposure to singing adult males, acquire songs that differ from the wild-type. Interestingly, if the offspring of these birds are exclusively exposed to songs within this lineage, their songs converge back to the wild-type within a few generations. This suggests that cultural evolution may not complexify a cognitive system if it is strongly genetically biased.
Deacon (2003) has proposed that the complexification of a cognitive system is often triggered by the degradation of genetic biases. Masked from natural selection, a given cognitive system would be unharnessed from its genetic biases: it would accept a novel information flow from various neural modules, and synergistically exhibit a new property. He termed this "genetic redistribution." Given this, Deacon hypothesizes that the perplexing case of the song evolution of the Bengalese finch (Okanoya, 2004) is due to the degradation of the genetic bias of song learning. He proposes that the domestication of the species allows novel neural modules to affect its song learning, and consequently complexifies the song pattern of the finch compared to that of its feral ancestor.
We model this hypothesis within the Iterated Learning Framework. Agents are represented as Jordan recurrent neural networks (Figure 1). The network is specifically chosen as it can naturally model sequential song production based on auditory feedback. During the learning period, a learning agent receives song inputs from an adult. Before the experiment begins, a neural network is trained to master a simple, linear song, and its weight configuration is then transferred to the first agent as the genetic bias. Although this bias is inherited by the agent's descendants in order to model a masked situation, no selection is introduced. Thus, mutations gradually erode the original weight configuration over the generations, and degrade the genetic bias. Finally, to model the redistributional nature of the masking process, an extra set of input nodes is designed to provide an additional source of information flow, and random noise is constantly added to these nodes. In the early generations, inputs from these nodes may be ignored, as the network is trained to focus only on inputs fed back from the output layer in order to acquire the simple song provided during the training mode.
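One production step of such a Jordan-style network, with output fed back as input alongside the extra noise nodes, can be sketched as follows. The dimensions, weights and noise level are arbitrary illustrative choices, and training, mutation and the iterated-learning loop are omitted:

```python
import math
import random

def jordan_step(x_feedback, x_noise, W_hid, W_out):
    """One step of a minimal Jordan-style recurrent net: the previous
    output (state feedback) concatenated with the extra noise inputs
    feeds a tanh hidden layer, which drives the next output."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x_feedback + x_noise)))
              for row in W_hid]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in W_out]

random.seed(0)
n_fb, n_noise, n_hid, n_out = 2, 2, 3, 2  # output size must match feedback size
W_hid = [[random.uniform(-1, 1) for _ in range(n_fb + n_noise)] for _ in range(n_hid)]
W_out = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]

# Produce a short "song" by iterating the network on its own output; the
# extra input nodes inject the constant noise described in the text.
song, out = [], [0.0] * n_fb
for t in range(5):
    noise = [random.gauss(0, 0.1) for _ in range(n_noise)]
    out = jordan_step(out, noise, W_hid, W_out)
    song.append(out)
print(len(song))  # 5 output vectors, one per song position
```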
Our results demonstrate that as the genetic bias degrades over the generations, agents start to acquire more complex songs. This tendency is further strengthened by the iteration of learning, as later generations receive the deformed songs as their inputs. By and large, the result is on a par with that of Ritchie and Kirby (2005). However, we find that when noise is removed from the network, the birdsong tends to increase in single-note repetition, and hence decreases in complexity. The result supports Deacon's hypothesis.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0125
In the mid-1990s, Chomsky started a new research program called "the Minimalist Program". He abductively infers that the kernel of the language faculty (the faculty of language in the narrow sense, FLN) would be perfect in an ontological sense: a perfection similar to that found in physical systems such as snowflakes or soap films (Chomsky, 2004). These physical entities can spontaneously optimize their states against restrictions acting as their boundary conditions. Along the same lines, Chomsky writes that the peculiar properties of FLN are mostly described as the result of its spontaneous reaction to constraints imposed by the other neural modules in which FLN is embedded. In its strongest form, Minimalism postulates that FLN is fully comparable with physical, non-adaptive principles.
One of the secondary effects of this thesis is its implication for language evolution: if the thesis is proven, Chomsky claims, a teleonomic explanation of the ontological emergence of FLN becomes unnecessary (Chomsky, 2004). Instead, one can consider the emergence of FLN as a result of non-organismal processes in living organisms. Mayr (1974) pointed out that apparently end-directed processes also exist in physical systems (such as a pendulum always coming to rest at the plumb line). He called such a propensity in physics "teleomatic", although he did not himself believe that teleomatic properties have causal power in evolution. Yet this line of thinking has been elaborated on recently. For instance, Kauffman (1989) assumes that self-organization, a teleomatic process in non-equilibrium systems, has been utilized a number of times during the history of biological evolution. Minimalism follows this avenue, and this is partly why self-organization has become one of the key terms in the program.
We believe, however, that this derived conclusion is somewhat far-fetched. For example, while acknowledging his creative contribution to evolutionary biology, Gould (1971) refuted D'Arcy Thompson's view of physico-mathematical regularities in living organisms. Thompson held that such regularities found in various forms of organisms are traces of physical principles working on them, and that there is no need to invoke adaptive explanations. However, this does not necessarily mean that there is no teleonomic cause involved. Consider the honeycomb, one of the most cited examples of perfect physico-mathematical regularity in living organisms. If its ontological emergence is purely teleomatic, how do we explain its obvious functions as a nest (e.g., structural strength, minimal usage of resources, and/or maximized capacity with minimized surface occupation)? If these functions are not teleofunctional, they must somehow have been exapted. However, this claim is implausible, as it suggests that ancestral honeybees had somehow started to create honeycomb (with its perfect form) for no purpose, and only later began to use it as a nest.
The problem we identify with the program is that it vests teleomatic processes with too much explanatory power. The strong minimalist thesis seems like an attempt to expel all possible teleonomic factors from the kernel of the language faculty. To rectify this view (while allowing for the possibility of optimality in FLN), we hypothesize that FLN has taken advantage of teleomatic properties at various stages in its evolution. In other words, teleomatic processes are subsumed within teleonomic processes in evolution. Adaptive evolution is an optimization process over a number of parameters distributed on a higher-order space. As Kauffman (1989) has eloquently expressed, teleomatic processes enable organisms to establish order for free. Together with the fact that breaking physical stabilities would be a costly option, evolution may sometimes leave organisms at local minima in adaptation (somewhat as historical contingencies often interfere with optimal adaptation), while carving physico-mathematical traces on their forms. This is why, we believe, parts of the language faculty appear as though they defy the potential adaptive values of the system. We also assume that cultural evolution is a part of this synergy of teleonomic and teleomatic processes: by cascading information through learning, the parasitic knowledge of language becomes self-organized so as to accommodate constraints imposed by the brain. Therefore, boundary conditions mostly hold in the acquisition process, as, for example, in the poverty of the stimulus.
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_0126
We aim to add some clarity to the ongoing debate between gesture-primacy and speech-primacy theories of language evolution by addressing the questions: (1) What is meant by "gestural primacy"? (2) What kind of evidence can be adduced for (or against) it? With respect to (1), we must distinguish between theories of (1a) an evolutionary stage of gestural language (Corballis 2002), (1b) gestural protolanguage ("protosign") (Arbib 2005) and (1c) gestural, and more generally mimetic prerequisites for (proto)language (Donald 1991, 1999; Zlatev 2003, 2008).
Concerning (2), a wealth of convergent evidence in favor of gestural primacy has been presented over the past decade: (2a) the ubiquity and universality of gesticulation (which differs from signed languages by not being fully conventional), in both speakers and signers; (2b) the overlap between the cortical regions involved in action, gesture and speech, with BA 45 standing out as a late specialization for the latter; (2c) the fact that human non-verbal communication is multi-modal, involving the whole body, and largely preserved in aphasia; (2d) the primacy of iconic and pointing gestures (and joint attention) with respect to speech in ontogenetic development; (2e) paleontological and archeological evidence showing adaptations in early Homo for tool use, but not for anatomical structures that have been associated specifically with speech, such as an enlarged hypoglossal canal and thoracic vertebral canal, evidence for improved motor control of the tongue and breathing, respectively; (2f) the greater flexibility (of a limited range) of gestures in non-human apes compared to vocalizations.
While all of these can be (and have been) debated, taken together they constitute a strong case that early Homo communication was carried out with the whole body, serving as a basis for the gradual recruitment of voluntary vocal signs (i.e. speech), which came to overlie bodily communication as the main channel of language in hearing people without replacing it; that is, a case for theories of type (1c).
If speech had evolved first, as a specific adaptation similar to birdsong, there would be no rationale for the late evolution of a multi-modal communication system. If a purely manual signed language (1a), or even protolanguage (1b), had evolved first, the evolution of speech remains problematic, as the constant recurrence of the counter-argument "why then don't we all use signed languages?" testifies. If speech and gesture evolved simultaneously and constitute an inseparable "single system", then evidence of type (2b-2f), and especially the preservation of whole-body communication in speech breakdown, becomes extremely difficult to account for.
The remaining alternative is that of the title: not "from hand to mouth" (Corballis 2002), but from whole-body communication, supported by species-specific adaptation(s) for bodily mimesis, to the multi-modal system of linguistic communication which we use today, involving both speech and "gesture", in a wide sense of the term (Zlatev and Andrén 2009).
Note from Publisher: This article contains the abstract and references.
https://doi.org/10.1142/9789814295222_bmatter
Author Index