This volume presents the lexical and grammatical evidence that deﬁnes the Amerind linguistic family. The evidence is presented in terms of 913 etymologies, arranged alphabetically according to the English gloss. Each etymology begins with the English gloss followed by a hypothetical phonetic form from which the individual Amerind forms are presumed to have derived. Within the body of each etymology the evidence is arrayed in terms of the thirteen branches of Amerind in a roughly north to south (or sometimes west to east) order: Almosan, Keresiouan, Penutian, Hokan, Central Amerind, Chibchan, Paezan, Andean, Equatorial, Macro-Tucanoan, Macro-Carib, Macro-Panoan, Macro-Ge.
Of course not all of the etymologies presented here derive from Proto-Amerind; many are later innovations within the Amerind family and deﬁne
Amerind sub-groups, but it is often diﬃcult to distinguish the taxonomic level of each etymology because data on Amerind languages are far from complete. An example of this problem is Greenberg’s Amerind etymology 125, GIRL. Greenberg cited 17 forms from Almosan, Penutian, Hokan, and Central Amerind, some of which I have eliminated. The distribution of Greenberg’s forms could be interpreted either as an Amerind etymology whose forms never made it to South America, or as an innovation within Northern Amerind branches after the South American Amerind groups had already split oﬀ.
Subsequent research has, however, clariﬁed this issue, as can be seen in etymology 699 in this book. This etymology now contains over 800 forms and
is found in all 13 branches of Amerind, leaving little doubt that it derives from Proto-Amerind. Moreover, this complex pattern was an invention of the
Amerind people, and not a trait that was inherited from their Asian ancestors, for this pattern does not exist elsewhere in the world.
The Amerind pronoun pattern na/ma ‘I/you’ (etymologies 399 and 781), which is widespread in Amerind and virtually absent elsewhere in the world,
also almost certainly derives from Proto-Amerind, though it is not found in every branch. There is, however, another way of showing that a particular
etymology derives from Proto-Amerind even though it may be found in only a few Amerind branches. If a root is found in the Old World and in Amerind, the presumption is that it was brought into the Americas with the Amerind migration and was therefore part of Proto-Amerind. Examples of such roots in this book are indicated in comments following the etymology (e.g. 53, 100, 101, 127, 137, 156, 163, 168, 172, 192, 267, 268, 272, 326, 357, 395, 796, 824).
Each of the thirteen Amerind subgroups begins with the subgroup name, followed by the data from diﬀerent Amerind groups (or languages) that are arranged genetically where the genetic structure is known (i.e., groups or languages closest to one another are listed side by side). The complete classiﬁcation of all Amerind languages considered is given in the back of the book. Some readers will note that both Almosan-Keresiouan and Chibchan-Paezan are here broken up into their two branches, rather than treated as single branches, as in Language in the Americas. This is not to be interpreted as a retraction of the claim that Almosan and Keresiouan, and Chibchan and Paezan, are both valid higher-level taxa within the Amerind family. It is done merely as a clearer way of arraying the evidence.
Furthermore, the internal structure of the Amerind family is a vastly more diﬃcult problem than the simple delineation of the family as a whole.
Despite claims often made to the contrary, higher-level families are often much easier to discern than intermediate subgroups, as Indo-European, Austronesian, and Amerind all attest. This is especially important for Amerind. It is often claimed that as one goes back in time words are lost in all languages at a gradual rate, so that the taxonomic picture gets dimmer and dimmer and finally turns to black. While it may seem obvious that things get dimmer as one goes back in time, this is not always true. Sometimes as one goes back in time things become clearer. The failure to understand this fundamental taxonomic principle is the source of much of the current confusion in historical linguistics. Linguistics, unlike biology, was from its inception strictly cladistic; that is, each node in a phylogenetic tree is defined by one or more innovations. However, whether such innovations develop, thus revealing the internal subgrouping of a family, depends on the rate of disintegration of that family, as illustrated in Fig. 1.
Let us assume that there is a population on island A. At some point in time part of that population moves to island B. Later part of that population
moves to island C, and ﬁnally part of that population moves to island D. Let us further assume that there is no further contact between these populations. The correct phylogenetic tree will then be as in 1a and many historical linguists believe that F3 will be more obvious, and better supported, than F2, and F2 more obvious than F1. Whether this is true or not depends on the amount of time between the separation of the populations. Let us assume that some people on island A moved to B on Monday, then part of that population to C on Tuesday and part of that population to D on Wednesday, and then 500 years pass. Under these circumstances the phylogenetic tree, based on linguistic evidence, will be as in 1b since there will not have been any time for innovations to develop which would distinguish F2 and F3. The language on each island would have been identical at the start and each would have then gone its own way, in a process similar to genetic drift.
If, however, the time of the separations was 500 years, not one day, then both F2 and F3 will be well deﬁned by innovations that have accumulated
during those 500 years. In both scenarios, however, F1 will be well deﬁned by those words that have been preserved on islands A, B, C, and D. Thus,
whether the intermediate nodes can be identified or not depends on the rate of separation. With a rapid disintegration it is only the highest level node that will be clear, the intermediate nodes often being unclear or even invisible. An example of this principle is the initial peopling of the Americas by the Amerind family. Archaeological evidence indicates that the initial peopling of the New World was a very rapid migration that filled two empty continents with people in 1,000 years. Under these circumstances it is the highest level node, Proto-Amerind, that will be the most obvious, as the n/m pronoun pattern (see etymologies 399 and 781) and the t’ina/t’ana/t’una root (see etymology 699) clearly show, as does the rest of this dictionary.
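The island scenario can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions, not a model taken from this book: each language is a list of word slots, every slot is replaced by a brand-new form at a constant yearly `rate`, and the function names, the number of slots, the rate, and the seed are all hypothetical choices. It further assumes that F1 is the node uniting all four islands, F2 the node uniting B, C, and D, and F3 the node uniting C and D. A `gap_years` of 0 stands in for the Monday, Tuesday, Wednesday case.

```python
import itertools
import random


def evolve(lang, years, rate, rng, fresh):
    """Return lang after `years` of isolation.

    Each word slot is replaced by a never-before-seen form with
    probability 1 - (1 - rate) ** years (at least one replacement).
    """
    p_change = 1.0 - (1.0 - rate) ** years
    return [next(fresh) if rng.random() < p_change else form for form in lang]


def simulate(gap_years, final_years=500, slots=2000, rate=0.0005, seed=1):
    """Count the evidence for F1, F2, and F3 after three successive splits."""
    rng = random.Random(seed)
    fresh = itertools.count(slots)   # IDs >= slots mark innovations
    a = list(range(slots))           # island A starts with the ancestral forms

    b = list(a)                      # split 1: part of A moves to B
    a, b = [evolve(x, gap_years, rate, rng, fresh) for x in (a, b)]

    c = list(b)                      # split 2: part of B moves to C
    a, b, c = [evolve(x, gap_years, rate, rng, fresh) for x in (a, b, c)]

    d = list(c)                      # split 3: part of C moves to D
    a, b, c, d = [evolve(x, final_years, rate, rng, fresh) for x in (a, b, c, d)]

    f1 = f2 = f3 = 0
    for wa, wb, wc, wd in zip(a, b, c, d):
        if wa == wb == wc == wd:
            f1 += 1                  # form retained on all four islands
        elif wb == wc == wd and wb >= slots:
            f2 += 1                  # innovation shared by B, C, and D only
        elif wc == wd and wc >= slots:
            f3 += 1                  # innovation shared by C and D only
    return {"F1": f1, "F2": f2, "F3": f3}


print(simulate(gap_years=0))    # rapid separation, then 500 years
print(simulate(gap_years=500))  # 500 years between successive separations
```

With no time between the separations, no innovation can arise on the internal branches, so the counts for F2 and F3 are necessarily zero while F1 is still supported by shared retentions. With 500 years between separations, all three nodes receive support.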
In fact, the t’ina/t’ana/t’una etymology itself gives linguistic support for the rapid expansion through all of North and South America because this entire complex system can be reconstructed on the South American evidence as well as on the North American evidence, indicating that the entire system made it intact into South America before breaking up, as it has in all extant Amerind languages. There is also genetic evidence that the Americas were peopled by a small population in a short period of time. Cavalli-Sforza et al. (1988) found in their seminal work on the genetic structure of human populations that New World populations were divided into the same three groups that Greenberg had identified on linguistic evidence: Eskimo-Aleut, Na-Dene, and Amerind. However, subsequent research on the correlation of genes and language within Amerind often found virtually no correlation at all.
Does this mean that genes and languages do not correlate after all? Not at all. What it shows is genetic drift at work. When small populations break up into even smaller populations the distribution of genes can diverge radically in any direction, thus quickly obscuring the original unity and making subgrouping difficult, or impossible, to discern. This is exactly what happened with the initial Amerind population. This small population entered the Americas, split into numerous smaller groups, and peopled two continents in 1,000 years, in what was probably the greatest experiment in genetic drift the world has ever seen. The fact that virtually all Amerind people are of blood group O is sometimes explained as a consequence of genetic drift.
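The effect of population size on drift can be illustrated with a minimal neutral Wright-Fisher simulation. This is a generic textbook model used here as an assumption, not an analysis from the studies cited above; the function names, population sizes, number of generations, and seed are hypothetical. Each replicate population starts with an allele at frequency 0.5, and the spread of the final frequencies across replicates measures how far isolated groups of a given size tend to diverge.

```python
import random
import statistics


def drift(pop_size, generations, p0, rng):
    """Final frequency of one allele after neutral Wright-Fisher drift."""
    copies = 2 * pop_size            # diploid population: 2N gene copies
    p = p0
    for _ in range(generations):
        # Each copy in the next generation is drawn at random from the current pool.
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
    return p


def spread(pop_size, generations=50, p0=0.5, replicates=30, seed=1):
    """Standard deviation of the final allele frequency across replicates."""
    rng = random.Random(seed)
    finals = [drift(pop_size, generations, p0, rng) for _ in range(replicates)]
    return statistics.pstdev(finals)


print("small groups (N=10): ", spread(10))
print("large groups (N=200):", spread(200))
```

Under these assumptions the small groups end up with much more widely scattered allele frequencies than the large ones, and many of them lose or fix the allele entirely, which is the kind of outcome invoked above for blood group O.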
The question of the precise date of the ﬁrst migration into the New World has long been a focus of dispute, with some archaeologists, linguists, and geneticists maintaining that there were people in the New World over 30,000 years ago, while other archaeologists, linguists, and geneticists have argued that there were no people in the New World until the ice-free corridor through Canada opened around 13,500 years ago, allowing one small population to ﬁll two continents in 1,000 years. Recently, Mark Seielstad et al. have found a mutation on the Y chromosome (M242) that lies between two mutations that are known to have occurred in Asia (M45, M74) and a mutation that arose in the Americas (M3) (Seielstad et al. 2003). They have dated the M242 mutation to the range 15,000–18,000 BP and have argued that this date provides a fairly certain upper bound on the time of the ﬁrst entry into the Americas. This date corresponds well with the archaeological date.
There is also other archaeological evidence that supports a late, rather than early, entry into the Americas. The archaeological record shows that
between 13,000 BP and 11,000 BP there was suddenly a rapid extinction of numerous species of large animals, an extinction that is often attributed to
the ﬁrst appearance of humans in the Americas. These animals had never seen humans before and thus had no fear of them, to their detriment.
In addition, it is now believed that dogs were first domesticated in East Asia around 15,000 years ago (Wade 2006). Since the first Americans brought domesticated dogs with them, they could not have left Asia before 15,000 BP or they would have had no dogs. It thus appears that the first entry into North America took place not long after the domestication of dogs. For unknown reasons all of these Asian dogs brought to America have gone extinct, replaced by European dogs that arrived much later.
As far as the internal subgrouping of Amerind is concerned, Greenberg suggested that Almosan-Keresiouan, Chibchan-Paezan, Equatorial-Tucanoan, and Ge-Pano-Carib probably form valid higher-level subgroups. Greenberg also proposed that Almosan-Keresiouan, Hokan, and Penutian form a higher-level group which he called Northern Amerind. Cavalli-Sforza and I also found support for this Northern Amerind group, as well as evidence for a Southern Amerind group which would contain the eight South American branches (Ruhlen 1991).
The Central Amerind group is remarkable for its broad distribution, from Southern Mexico to Utah. Central Amerind has three branches: Oto-Manguean, Uto-Aztecan, and Tanoan. Oto-Manguean is found in Southern Mexico, Uto-Aztecan from Southern Mexico to Utah, and Tanoan in New Mexico. Whorf and Trager (1937) proposed that Uto-Aztecan and Tanoan formed a higher-level group which they called Aztec-Tanoan. It is clear that the Oto-Manguean homeland was in Southern Mexico, but where was the Aztec-Tanoan homeland? It was originally thought that the Uto-Aztecan homeland was in the north and the expansion was from north to south. That Tanoan is in the north would support this conclusion. However, it is now known that the Uto-Aztecan expansion was south to north, from Southern Mexico to Utah, and thus Tanoan appears to have been an offshoot of this south to north migration. Looking at the structure of Amerind from this perspective would lead one to the conclusion that Central Amerind represents an expansion back into territory that was originally Northern Amerind.
My current view of the phylogenetic structure of the Amerind family is shown in Fig. 2, though this will inevitably evolve, in unpredictable ways, as further research is carried out. Possibly Andean will turn out to be closer to North-Central Amerind than to the other South American groups. The precise position of Chibchan-Paezan is also problematical.
This book is essentially the first step in the Comparative Method, the identification of morphemes in different languages that are similar in sound and meaning and can therefore be presumed to have “sprung from some common source, which, perhaps, no longer exists,” as Sir William Jones put it in 1786. This first stage is simply taxonomy. The second step in the Comparative Method is what is commonly called historical linguistics, the attempt to analyze these similar words, to ascertain the original form from which all descend, and to figure out the sound correspondences that explain the differences. The time and place of the original language are also investigated. It should be obvious that this second stage must follow the first.
Each etymology in this book concludes with an indication, in brackets, of the source of the etymology and, in some cases, a comment on the etymology follows the indication of its source. Etymologies that have no source indicated are presented here for the first time. In addition, etymologies attributed solely to Greenberg have often been modified by the inclusion or exclusion of certain forms, so they are often different from the etymologies given in Language in the Americas. In indicating the source of the etymology (the vast majority of which derive from Language in the Americas), the following abbreviations are used for Amerind and its branches: A: Amerind, AL: Almosan, KS: Keresiouan, P: Penutian, H: Hokan, CA: Central Amerind, CH: Chibchan, PZ: Paezan, AN: Andean, E: Equatorial, MT: Macro-Tucanoan, MC: Macro-Carib, MP: Macro-Panoan, MG: Macro-Ge.
In addition to this somewhat different arrangement of the data, as compared to that in Language in the Americas, I have also modiﬁed the spelling of certain language names, where a different spelling seemed preferable (e.g. “Uitoto” has been changed to “Witoto”).
The phonetic forms have also been slightly modified from the earlier work, most notably by the symbols used for ejectives (p’, t’, k’ in this book, replacing p $, t$, k $). In addition, ts is now c, yod (the high front semivowel) is indicated by y, rather than by j, the front rounded vowels are now indicated by ü and ö rather than y and ø, respectively, and ü is now ï.
These changes seem more in keeping with Americanist practice.
Following the etymologies the reader will find a summary of the classification, maps indicating the distribution of various language groups, and a list of references. No attempt is made to list every source for every word cited; indeed, no etymological dictionary has ever attempted to provide this level of detail for all sources. Greenberg’s notebooks, which indicate sources for all of his data, are publicly available. The sources I have used in revising Greenberg’s book are given in the References.