Steppe ancestry spread into South Asia 2,300–1,500 BCE

Narasimhan et al. 2019 (523 individuals): Steppe ancestry spread 2,300–1,500 BCE and makes up as much as 30% of ancestry in some modern South Asian groups. IVC-related people who mixed with Steppe groups formed ANI; those who mixed with peninsular hunter-gatherers formed ASI.

Confirmed

Detailed Analysis

The Narasimhan et al. 2019 study in Science represents the most comprehensive ancient DNA survey of Central and South Asian populations to date. Analyzing 523 ancient individuals spanning the Mesolithic to the Iron Age, the study reconstructed the genetic history of the region with unprecedented resolution.

The key findings for South Asian history are as follows. First, the Indus-Saraswati Civilization population (represented by 'Indus Periphery' individuals from Gonur and Shahr-i-Sokhta, plus the Rakhigarhi individual reported in the companion study by Shinde et al. 2019) was a mixture of ancient Iranian-related ancestry and South Asian hunter-gatherer ancestry, with no Steppe component. Second, Steppe-related ancestry (associated with the Yamnaya horizon and its descendants, particularly the Sintashta-Andronovo cultures) began appearing in Central Asian populations by 2,300 BCE and in South Asian populations between 2,000 and 1,500 BCE. Third, later mixing produced the two ancestral components that anchor the modern South Asian cline: the Ancestral North Indian (ANI) component, a mixture of IVC-related and Steppe ancestry that is highest in northwestern and upper-caste groups, and the Ancestral South Indian (ASI) component, a mixture of IVC-related and peninsular hunter-gatherer ancestry that is highest in southern and tribal groups.

The Steppe contribution to modern South Asian ancestry ranges from negligible in some southern tribal groups to approximately 30% in some northwestern populations. The gradient is both geographic (north to south) and social (correlating with traditional caste rank), a pattern first identified by Reich et al. (2009) and confirmed with ancient DNA by Narasimhan et al.

The implications for the Aryan Migration debate are significant but not fully resolved. The genetic evidence confirms that a substantial population movement out of the Central Asian steppe occurred between 2,300 and 1,500 BCE, reaching South Asia by roughly 2,000 to 1,500 BCE. That window overlaps with the decline of the Indus-Saraswati Civilization and the beginning of the Vedic period as conventionally dated. The correlation of Steppe ancestry with Indo-European language speakers in both Europe and South Asia is striking. However, genes and languages do not always travel together. The incoming Steppe-related groups could have adopted local languages, brought new languages with them, or some combination of both. The genetic data alone cannot distinguish between these scenarios.

Proponents of indigenous origins argue that Vedic culture predates the Steppe admixture, pointing to the Rakhigarhi DNA and IVC-Vedic continuity evidence. Proponents of migration argue that the timing, direction, and linguistic correlation make language transmission the most parsimonious explanation.

What the genetic evidence does establish: (1) the IVC population was indigenous and carried no Steppe ancestry; (2) Steppe-related ancestry entered South Asia after 2,300 BCE; (3) this ancestry is now distributed in a cline that correlates with geography and social structure; (4) the genetic mixing was real and substantial, not marginal.

Methodology

Ancient DNA was extracted and genome-wide data generated for 523 individuals from archaeological sites across Central and South Asia (Narasimhan et al. 2019); the study enriched for roughly 1.2 million targeted SNP positions rather than sequencing whole genomes. Population genetic modeling used qpAdm, ADMIXTURE, and f-statistics. Skeletal samples were radiocarbon dated for chronological control. Results were compared with modern South Asian genome data from the Indian Genome Variation Consortium and the 1000 Genomes Project.
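To make the idea of an ancestry proportion concrete, the sketch below estimates a two-source mixing fraction by least squares on allele frequencies. It is a deliberately simplified illustration, not the study's method: qpAdm fits mixture weights to f4-statistics measured against a set of reference populations rather than to raw frequencies. The function name `mixture_proportion`, the six SNP frequencies, and the 30% target are all invented for the example and are not data from the paper.

```python
def mixture_proportion(target, source_a, source_b):
    """Least-squares alpha for target ~ alpha*source_a + (1 - alpha)*source_b.

    Each argument is a list of allele frequencies at the same SNPs.
    """
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("sources are identical; the proportion is undefined")
    return num / den


# Synthetic allele frequencies at six SNPs (made up for illustration only).
steppe_like = [0.10, 0.80, 0.35, 0.60, 0.05, 0.90]
ivc_like = [0.40, 0.20, 0.55, 0.30, 0.25, 0.50]

# Construct a target population that is exactly 30% / 70% by design.
true_alpha = 0.30
target = [true_alpha * s + (1 - true_alpha) * i
          for s, i in zip(steppe_like, ivc_like)]

alpha_hat = mixture_proportion(target, steppe_like, ivc_like)
print(f"estimated Steppe-like fraction: {alpha_hat:.2f}")
```

Because the toy target is built without noise or drift, the estimate recovers the designed fraction. Real data add sampling error and shared drift, which is why methods based on f-statistics and outgroups are used in practice.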

Counter-Arguments & Responses

Challenge

The correlation between Steppe ancestry and Indo-European languages could be coincidental. Language shift can occur without genetic replacement — see the adoption of Turkish in Anatolia with minimal Central Asian genetic contribution.

Response

Language shift without genetic replacement is well documented (Turkey, Hungary, etc.). However, the South Asian case differs in showing substantial genetic admixture (up to 30%), not just elite cultural influence. The question is whether a Steppe contribution of up to 30% is sufficient to explain language shift, or whether additional mechanisms were involved.

Source: Lazaridis et al. (2016). Nature 536, 419-424.

Challenge

If Steppe ancestry brought Indo-Aryan languages, why do Dravidian-speaking populations also carry some Steppe ancestry?

Response

Millennia of admixture, later held in place by endogamy, have blurred the original distribution. Steppe ancestry is highest in Indo-Aryan-speaking northwestern groups and lowest in Dravidian-speaking southern groups, so the gradient is real even if not absolute. Partial admixture across linguistic boundaries is expected over roughly 4,000 years of geographic proximity.

Source: Narasimhan et al. (2019). Science 365(6457).

Falsifiability Criteria

If ancient DNA from pre-2,300 BCE South Asian contexts revealed significant Steppe ancestry, the 'post-2,300 BCE arrival' timeline would be falsified. If Steppe ancestry were found in IVC-era individuals at core IVC sites, the indigenous-IVC model would need revision.
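The first criterion can be written as a simple check over dated samples. The sketch below is only an illustration of the logic: the `Sample` record, the three-standard-error threshold for "significant", and the two example records are assumptions made for the example, not values or samples from the study.

```python
from dataclasses import dataclass

ARRIVAL_CUTOFF_BCE = 2300   # timeline boundary stated in the claim
Z_THRESHOLD = 3.0           # assumed cutoff for "significant" ancestry


@dataclass
class Sample:
    label: str
    date_bce: int            # calibrated date in years BCE
    steppe_fraction: float   # estimated Steppe-related ancestry (0 to 1)
    steppe_std_err: float    # standard error of that estimate


def contradicts_timeline(sample: Sample) -> bool:
    """True if a South Asian sample predates the cutoff yet carries
    Steppe ancestry exceeding Z_THRESHOLD standard errors."""
    predates_cutoff = sample.date_bce > ARRIVAL_CUTOFF_BCE
    significant = sample.steppe_fraction > Z_THRESHOLD * sample.steppe_std_err
    return predates_cutoff and significant


# Hypothetical records, invented purely to exercise the check.
consistent = Sample("hypothetical_A", date_bce=2500,
                    steppe_fraction=0.01, steppe_std_err=0.02)
falsifying = Sample("hypothetical_B", date_bce=2600,
                    steppe_fraction=0.15, steppe_std_err=0.03)

print(contradicts_timeline(consistent))
print(contradicts_timeline(falsifying))
```

A single well-dated, uncontaminated sample of the second kind would be enough to challenge the timeline, which is what makes the criterion testable.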

Supporting Media & Resources

Narasimhan et al. (2019). The Formation of Human Populations in South and Central Asia. Science 365(6457) · paper