There is no question in Indian prehistory that has caused more heat and dust than this one: ‘when and how did Indo-European language speakers, who called themselves Aryans, reach the Indian subcontinent?’ This is curious because no similar and extreme controversy surrounds questions like ‘when did the first inhabitants reach India?’ or ‘when did Dravidian language speakers reach India?’ or ‘when did the Mundari, Khasi or Meitei language speakers reach India?’ It provokes no one’s ire when it is said that the original inhabitants of India came from Africa; that Proto-Dravidian is related to the Elamitic language of Iran; or that Mundari, Khasi and Meitei speakers came from east Asia. All of this is taken with a shrug because, after all, there is no nation in the world today that has not been shaped by repeated mass migrations. Europe has seen its demography upturned at least two times through mass migrations. The Americas saw at least three major migrations that shaped their demography and these were even before the first European set foot there. East Asia has seen at least three major migrations, while central Asia and west Asia have been the sites of so many invasions and migrations that it is difficult to keep count.
And it is not as if Indians have not ventured out and influenced other regions massively either, especially in the early centuries of the Common Era. All of southeast Asia, from today’s Vietnam and Cambodia to Burma, Thailand and Indonesia, once fell within the ambit of India’s cultural pre-eminence. Even China came under the spell of India for a while. Occasionally this may have involved invasion, but more often it involved the ceaseless efforts of Buddhist missionaries keen to spread their religion, and very often it had to do with merchants out to make a profit and protect and further their interests. That Buddhism has 488 million adherents around the world, with only a minority of them in India today, is testimony to the impact India made outside of its natural boundaries.
So what accounts for this special sensitivity to the question about the arrival of Indo-European language speakers? The answer is simple: it is the unstated but underlying assumption that Indian culture is identical or synonymous with ‘Aryan’, ‘Sanskrit’ or ‘Vedic’ culture. Therefore, to ask when Indo-European languages reached India would be seen to be the same as asking, ‘when did we import our culture?’
This is ridiculous on two counts. First of all, Indian culture is not synonymous with, or identical to, ‘Aryan’ or ‘Sanskrit’ or ‘Vedic’ culture. ‘Aryan’ culture was an important stream that contributed to creating the unique Indian civilization as we know it today, but by no means was it the only one. There were other streams that have contributed equally to making Indian civilization what it is. Second, to say that Indo-European languages reached India at a particular historical juncture is not the same as suggesting that the ‘Vedas’ or ‘Sanskrit’ or the ‘Aryan’ culture was imported flat-packed and then reassembled here. ‘Aryan’ culture was most likely the result of interaction, adoption and adaptation among those who brought Indo-European languages to India and those who were already well-settled inhabitants of the region.
So to come back to the question: did Indo-European language speakers, who called themselves Aryans, come from elsewhere, and if they did, when did they do so?
Out of India is out of the reckoning
Until recently, there was some room for debate on the question whether the spread of Indo-European languages around the world could be explained by people moving out of India with an early version of Sanskrit rather than people moving into India with an early version of Sanskrit. But genetic studies, especially those based on ancient DNA, are rapidly closing the door on that debate. Here is how they are doing it. About three-quarters of Indians today speak an Indo-European language such as Hindi, Gujarati, Bengali, Punjabi or Marathi. So does about 40 per cent of the world, with Spanish, English, French, Portuguese, Persian, Russian and German being some of the other widely spoken Indo-European languages. The Indian subcontinent forms the easternmost limit of the Indo-European language family range, there being no large populations speaking any Indo-European languages to our east. So the natural question arises: how did this language family become dominant in India? There are only two possible answers: either it came to India from the outside sometime in the past, or it went from India to the rest of the world that is west of it.
Let us consider the second possibility first: that a large number of Sanskrit- or Proto-Sanskrit-speaking Indians once ventured west and they and their descendants spread out over vast regions all the way from Iran to central Asia to west Asia to eastern Europe and western Europe, thus spawning versions of Indo-European languages along the route. What would you then expect to see in the genetic record of all those regions? A fair sprinkling of the genetic signature of the First Indians, the descendants of the Out of Africa migrants. As we saw in the previous chapters, the first migrants had spread all over the subcontinent and were part of the population that built the Harappan Civilization as well. So if significant emigrations from India any time before or after the Harappan Civilization were responsible for the spread of Indo-European languages, we would have to see their genetic footprints all the way from central Asia to western Europe. Is there such a large signature across this region? No. On the contrary, as we saw in chapter 1, the descendants of the First Indians have no close relatives anywhere else in the world. So the idea of an Out of India migration that spread Indo-European languages around the world is a non-starter.
There is one exception to this that proves the rule – the Roma, a relatively small itinerant ethnic group living mostly in Europe and the Americas who were earlier known as Gypsies. Genetic studies have confirmed that they come from a single ethnic group that left north-western India (the regions of Punjab, Sindh, Rajasthan and Haryana) some 1500 years ago – long after Indo-European languages became well established in Europe and elsewhere. So as they migrated west, did they carry the typical genetic signature of the First Indians? Yes, they did.
According to a study titled ‘Reconstructing the Indian Origin and Dispersal of the European Roma: A Maternal Genetic Perspective’ published in 2011:
Two different groups of lineages could be distinguished among the Roma. The European/Middle Eastern haplogroups accounted for 65% to 94% in different Roma groups, whereas the rest of the lineages belonged to haplogroup M. This last haplogroup is common in East Africa and Asia, but is rarely found in Europe. Within haplogroup M, all lineages were of clear Asian origin except one East African M1a1 sequence found in two Portuguese Roma. The main Asian subhaplogroups found were M5a1, M18 and M35b, which have been reported to have an Indian origin.
In other words, when there was an emigration of people from the Indian subcontinent towards the west and all the way to Europe, a genetic signature indeed went with them – of the descendants of the first inhabitants of south Asia: the deep-rooted maternal haplogroup M, which is ‘rarely found in Europe’. Since the Roma couldn’t have introduced Indo-European languages to Europe and since there is no other significant south Asian genetic signature in Europe or central Asia, we have to consider the case for Out of India as closed.
This takes us to the next question: if migrations into India led to the spread of Indo-European languages in the Indian subcontinent, when did they happen, and where did the migrants come from?
Genetic signature of the Aryans
This question, which has vexed scholars and animated partisans for over a century, is now being settled by genetic evidence made possible by new techniques for extracting and analysing ancient DNA which allow us to see how people moved and how demography changed over time. By analysing ancient DNA from the same location at different periods, or from the same period at different locations, geneticists can answer what changed when.
But before exploring further the question regarding the migration of Indo-European language speakers to India, let us take a step back and answer a variant of the question that we asked while discussing the Out of India hypothesis: if Indo-European languages are spread over a large area of Eurasia, is there a genetic signature visible across this geography? Yes, there is: the Y-chromosome haplogroup R1a or, more specifically, its subclade R1a-M417, which accounts for almost all the R1a lineages in the world today. A map of R1a-M417 distribution would show it extending from Scandinavia to south Asia, covering almost all of the Indo-European language-speaking world.
Steppe migrations: Step by step
The study that put the question of Indo-European language speakers migrating to India to rest was released as recently as 31 March 2018, and was titled ‘The Genomic Formation of South and Central Asia’. This path-breaking work, which for the first time had access to ancient DNA from south Asia, Kazakhstan and eastern Iran, and its galaxy of eminent authors were discussed in detail in chapter 2. So here we will limit our recap of that study to the parts that deal specifically with ‘Aryan’ migration.
The study says there was indeed a southward migration of pastoralists from the Kazakh Steppe – first towards southern central Asian regions, that is, present-day Turkmenistan, Uzbekistan and Tajikistan, after 2100 bce; and then towards south Asia throughout the second millennium bce (2000 bce to 1000 bce). On their way, they impacted the Bactria–Margiana Archaeological Complex (BMAC, a civilization that thrived between 2300 bce and 1700 bce, centred on the Oxus river and covering today’s northern Afghanistan, southern Uzbekistan and western Tajikistan), but mostly bypassed it to move further down towards south Asia. Here they mixed with the existing people of the Harappan Civilization, thus creating one of the two main sources of population in India today: Ancestral North Indians, or ANI, the other being Ancestral South Indians, or ASI, who were formed by the mixing of the people of the Harappan Civilization with the First Indians in southern India around the same time.
The study arrived at these conclusions after detecting signals of the migrations in the ancient DNA. To quote: ‘Our analysis shows no evidence of Steppe pastoralist ancestry in groups surrounding BMAC sites prior to 2100 bce, but suggests that between 2100 to 1700 bce, the BMAC communities were surrounded by peoples carrying such ancestry.’ This shows there was a migration of the people of the Steppe to the BMAC region around 2100 bce.
Also, as mentioned earlier, among the ancient DNA from BMAC sites – as well as among the ancient DNA from the eastern Iranian site of Shahr-i-Sokhta – the study made some surprising discoveries with major consequences: three outlier individuals dated to between 3100 bce and 2200 bce, with an ancestry profile that was unique. Unlike other ancient DNA samples from the same sites, these had 14 to 42 per cent ancestry from the First Indians, in addition to ancestry from Zagros agriculturists. The Harappan Civilization was known to have had contacts with both the BMAC and Shahr-i-Sokhta, so the study concluded that the outlier individuals were recent migrants from the Harappan Civilization.
These individuals, like others around them, had no Steppe ancestry whatsoever. This fits with the view that the Steppe pastoralists started migrating southward only around 2100 bce.
But the clincher was yet to come. The scientists also had access to ancient DNA from the Swat valley of Pakistan, dated to between 1200 bce and 1 ce, more than a thousand years later than the Shahr-i-Sokhta and BMAC samples. The Swat valley samples were genetically very similar to the three outliers from Shahr-i-Sokhta and the BMAC and, like them, had ancestry from the First Indians and Zagros agriculturists. But there was one crucial and telling difference: they also had Steppe ancestry of about 22 per cent. The study says: ‘This provides direct evidence for Steppe ancestry being integrated into South Asian groups in the second millennium bce, and is also consistent with the evidence of southward expansions of the Steppe groups through Turan at this time.’
The study then notes that a great majority of the people speaking Indo-European languages in Europe and Asia today carry ancestry that is related to Steppe pastoralists known as the Yamnaya (more on the Yamnaya in the next section). This is in accordance with the long-held theory that the Yamnaya spoke late Proto-Indo-European and that they spread the Indo-European languages both to Europe and to Asia. Earlier genetic studies had documented the westward movement of the Yamnaya into Europe beginning 3000 bce, but until this study was released there was no direct ancient DNA evidence of the chain of transmission of Steppe ancestry to south Asia. The authors of the study believe that their documentation of large-scale movement of Steppe groups southwards in the second millennium bce now provides this missing evidence.
There’s more. Remember we said the present-day Indian population is a product of the mingling between the ANI [Harappans (First Indians + Zagros agriculturists) + Steppe pastoralists] and ASI (Harappans + First Indians)? When the geneticists tested whether the ANI–ASI mixture model fits 140 present-day population groups in south Asia, ten groups stood out – each of them being poor fits because they had much more than the expected levels of Steppe ancestry. The strongest signals of elevated Steppe ancestry were in two groups that were of traditionally priestly status, expected to be custodians of texts written in Sanskrit. What could explain this? According to the study, one possibility is that the migration of Steppe pastoralists into south Asia created different groups with different proportions of Steppe ancestry. And those with higher levels of Steppe ancestry seem to have had a central role in sustaining or spreading early Vedic culture.
Strong rules of endogamy (marrying within one’s own community) among some population groups may have resulted in this excess Steppe ancestry persisting to this day.
Excerpted from Early Indians by Tony Joseph, published by Juggernaut, Rs 531