Genes, Peoples and Languages

Luca Cavalli-Sforza
Stanford University
Genetics Department

My interest in languages began together with my interest in cultural evolution. I was attracted by the obvious potential for quantitative investigation of language change. Trees of languages were constructed even before those of biological species (the most famous being the tree of Indo-European languages, Schleicher 1865), and the dating of language separations was attempted before similar applications in genetics (glottochronology, Swadesh and Lees, 1951). The geographic analysis of the spread of word changes led to the “wave theory” (Schmidt), which was introduced as an alternative to tree analysis. There is a similar contrast in genetics, where trees are useful for analysing the history of old separations among populations, while among populations that are close neighbors migration generates enough exchange that one finds a strong correlation between geographic and genetic distances. The same applies to linguistic distances between languages that are close geographically and genetically, and their correlation with geographic distances is a compact way of testing wave theory. One can thus show that the rate of change varies strongly among individual words, and that the rate of change of a word in space is correlated with its rate of change in time (LCS and Wang 1986). This type of analysis readily shows the usefulness of standard population genetics theory for the study of linguistic evolution.
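
As a rough illustration of the correlation between geographic and linguistic distances described above, the following sketch computes both distances for every pair of localities and then their Pearson correlation. Everything in it is an assumption made for illustration: the four villages, their coordinates, and the word-variant strings are invented toy values, and the fraction of differing word variants is only one possible linguistic distance, not necessarily the measure used in the cited work.

```python
import math
from itertools import combinations

# Hypothetical toy data (invented for illustration only).
# Coordinates are in km; each string has one letter per meaning,
# and the letter says which variant of the word the village uses.
coords = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (20.0, 5.0), "D": (40.0, 10.0)}
variants = {"A": "aaaaaa", "B": "aaaaab", "C": "aaabbb", "D": "abbbbb"}


def geographic_distance(p, q):
    """Straight-line distance between two points on a plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def linguistic_distance(u, v):
    """Fraction of meanings for which two villages use different variants."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# One entry per unordered pair of villages.
pairs = list(combinations(sorted(coords), 2))
geo = [geographic_distance(coords[i], coords[j]) for i, j in pairs]
ling = [linguistic_distance(variants[i], variants[j]) for i, j in pairs]

r = pearson(geo, ling)
print(f"correlation between geographic and linguistic distance: {r:.3f}")
```

Because the pairwise distances are not independent of one another, the significance of such a correlation is usually judged with a permutation procedure (for example a Mantel test) rather than with the standard test for a correlation coefficient.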



Some anthropologists (e.g. Bateman et al. 1990) have reacted very negatively to our study of the similarity between the genetic tree of populations and the tree of linguistic families as reconstructed today (LCS, Piazza, Menozzi and Mountain, PNAS 1988). The wide disagreements among linguistic taxonomists and the incompleteness of linguistic classifications are in part responsible for this reaction. More recent efforts to build higher superfamilies seem to be quite favorable to the correlation.



Of the two major mechanisms of language spread postulated by Renfrew (1987), demic diffusion and language replacement mediated by elite dominance (essentially, conquest), the first can be equated to vertical and the second to horizontal cultural transmission. When languages are transmitted vertically (from parent to child or, more generally, within the family) one can expect a correlation between genetic and linguistic trees. Recent history shows clear examples of language replacement (also called shift) due to military conquest. Its incidence in Paleolithic times is difficult to assess, but I suspect it was less important. Language replacement tends to destroy the correlation. As most linguistic families recognized today may have originated in Paleolithic times or shortly thereafter, one will find the correlation between linguistic and genetic trees more easily for families than for individual languages, but in many cases it is still clear even at the level of single languages. The similarity between the evolutionary patterns of certain vertically transmitted viruses (e.g. hepatitis B, papilloma) and bacteria (Helicobacter pylori) and the human pattern shows that we can expect all traits transmitted vertically to follow a pattern very similar to that of humans. Until the advent of writing, which started only 5000 years ago and only in a few places, and of schools, most language learning must have occurred in the family or in a small social circle.



Genetic trees of haploid DNA, essentially the non-recombinant Y chromosome (NRY) and, somewhat less clearly, mtDNA, are today the best indicators of human evolution. They clearly suggest three major radiations: the earliest one to Africa, around 100,000 years ago; a somewhat later one to Asia, probably by the coastal route of S. Asia, which spread further south from S.E. Asia to Oceania and New Guinea, but also north towards China and Japan; and the latest, around 40,000-50,000 years ago, to central Asia, and from there to Europe, East and N.E. Asia, and from there to America. It is reassuring to note that the 1988 tree obtained from 110 autosomal genes is in good agreement with those of NRY and mtDNA. A conclusion that comes uniquely from haploid DNA studies is that the originally expanding population was in E. Africa, and was a very small one. This suggests that, even if different languages were already spoken in Africa at that time, the expanding population spoke only one language. This language must already have been as sophisticated as all languages spoken today. Although the phonology and the lexicon of living languages are extremely varied, structure is more conserved and is quite similar in all of them. The ability to communicate by such a sophisticated language must have given a tremendous advantage to the radiating founder population. Together with other innovations, such as nautical means, however primitive, and the new set of tools appearing at the time of the expansion (Aurignacian), it may have been among the major promoters of the expansion that settled the whole world and replaced earlier human types.



The great speed of differentiation of languages makes it difficult to recognize the unity of origin of all spoken languages, but various genetic considerations make it very likely. Apart from the small size of the founder population, another consideration is that there must have been several mutational steps that generated the biological changes that make human speech possible. For two-way communication to be possible, mutations involving communication must be present in both the speaker and the listener, so that they can exchange roles. So the same mutation must be present in the two individuals who communicate. Most improvements in communication must give their full potential advantage only in pairs of individuals of the same family, like parent and child or sibs, who carry the same mutation. The mutation can then spread to a small population in a few generations. This is true, in general, of all phenomena of cooperation (Eshel and LCS). Thus, most advances in communication (and cooperation) must have evolved in a small population, and must have given it a considerable selective advantage.
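
The argument about pairs of relatives can be made concrete with a small calculation. The sketch below is not the model of the cited work; it uses a standard textbook approximation, stated here as an assumption: if a speaker carries a rare inherited mutation whose carrier frequency in the population is p, a listener with coefficient of relatedness r to the speaker also carries it with probability of about r + (1 - r) p. The function name and the value p = 0.001 are hypothetical and chosen only for illustration.

```python
def listener_carries(p, r):
    """Approximate probability that the listener carries a rare mutation,
    given that the speaker carries it.

    p: carrier frequency of the mutation in the population (assumed small)
    r: coefficient of relatedness between speaker and listener
       (0 for an unrelated individual, 0.5 for parent-child or full sibs)
    """
    return r + (1.0 - r) * p


p = 0.001  # hypothetical carrier frequency of a new, still rare mutation

unrelated = listener_carries(p, 0.0)     # random member of the population
close_kin = listener_carries(p, 0.5)     # parent and child, or full sibs

print(f"unrelated listener: {unrelated:.4f}")
print(f"parent, child or sib: {close_kin:.4f}")
print(f"ratio: {close_kin / unrelated:.1f}")
```

Under these assumptions, a mutation that pays off only when both partners carry it finds a suitable partner far more often inside the family than among unrelated individuals, which is why a small, closely related population is a favorable setting for its initial spread.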

