What ancient DNA does and does not prove about the Aryan question
Confirmed: the sampled IVC genome carries no Steppe ancestry. Confirmed: Steppe-related ancestry arrived around and after the IVC decline (~2,000–1,500 BCE). Confirmed: agriculture in South Asia developed without gene flow from Iranian farmers. Open: whether genetic mixing necessarily means language replacement.
Detailed Analysis
Ancient DNA has transformed the debate about the origins of South Asian populations and the spread of Indo-Aryan languages. However, popular discussions frequently conflate what genetics proves with what it does not.

**What ancient DNA confirms**:

1. **IVC people had no Steppe ancestry (~2,500 BCE)**: The Rakhigarhi individual (Shinde et al. 2019, Cell) had no Steppe pastoralist ancestry and no Iranian farmer ancestry. Her genetic profile is 'the primary ancestry source in South Asia today.'
2. **Steppe-related ancestry arrived after IVC decline**: The Narasimhan et al. (2019) study of 523 individuals shows Steppe-related ancestry spreading into South Asia between roughly 2,000 and 1,500 BCE, a window that overlaps with and follows the IVC's decline (~1,900 BCE). This ancestry contributes up to about 30% of ancestry in modern South Asian populations.
3. **Agriculture in India developed independently**: The genetic data indicate that South Asian agriculture emerged without requiring gene flow from Iranian farmers. The IVC people were genetically distinct from Iranian Neolithic populations.
4. **ANI/ASI formation**: After IVC decline, IVC people mixed with Steppe groups to form Ancestral North Indians (ANI), and with peninsular groups to form Ancestral South Indians (ASI). This two-population model (Reich et al. 2009) has been refined but broadly confirmed.

**What ancient DNA does NOT prove**:

1. **What language the migrants spoke**: DNA carries no linguistic information. Linking Steppe-related ancestry to the spread of Indo-Iranian languages is an inference based on geographic and temporal overlap, not direct proof.
2. **Whether language shift preceded, accompanied, or followed genetic mixing**: It is possible that Indo-Aryan languages were already present in South Asia before Steppe genetic admixture (the Out-of-India hypothesis), or that language spread preceded significant genetic mixing (elite dominance model), or that language and genes spread together (mass migration model). DNA alone cannot distinguish between these scenarios.
3. **Cultural identity of the migrants**: 'Steppe ancestry' is a genetic label, not a cultural or linguistic one. The people carrying this ancestry may or may not have identified as 'Aryan' or spoken an Indo-European language.
4. **Whether the Aryan Migration Theory (AMT) is true or false**: The genetic data is consistent with an external origin for a component of South Asian ancestry. It is also consistent with several alternative models. The binary framing ('AMT proved/disproved') does not match the nuance of the data.
Methodology
This analysis synthesizes major ancient DNA studies: Shinde et al. (2019, Cell), Narasimhan et al. (2019, Science), Reich et al. (2009, Nature), and Lazaridis et al. (2016). The genetic methods those studies rely on include ADMIXTURE analysis, qpAdm modeling, and f-statistics. Linguistic correlations are drawn from comparative Indo-European studies.
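To make the f-statistics mentioned above more concrete, here is a minimal sketch of an f4-ratio admixture estimate on synthetic allele frequencies. Everything in it is an illustrative assumption: the population labels (`A`, `B`, `C`, `O`, `X`), the helper names (`f4`, `f4_ratio`, `drift`, `simulate`), and the simulated frequencies are hypothetical and are not taken from any of the cited studies. Published analyses work on real genotype data with many reference populations and block-jackknife standard errors, none of which is reproduced here.

```python
import random


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    n = len(p_a)
    return sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)) / n


def f4_ratio(p_a, p_o, p_x, p_b, p_c):
    """Estimate the share of B-related ancestry in X, modelled as a mix of B and C.

    alpha = f4(A, O; X, C) / f4(A, O; B, C),
    where A is a close relative of B and O is an outgroup.
    """
    return f4(p_a, p_o, p_x, p_c) / f4(p_a, p_o, p_b, p_c)


def clamp(p):
    """Keep an allele frequency inside [0, 1]."""
    return min(1.0, max(0.0, p))


def drift(freqs, sd, rng):
    """Perturb allele frequencies to mimic genetic drift (toy model only)."""
    return [clamp(p + rng.gauss(0.0, sd)) for p in freqs]


def simulate(alpha, n_snps=5000, seed=0):
    """Build toy frequencies where X is exactly alpha * B + (1 - alpha) * C."""
    rng = random.Random(seed)
    p_o = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]  # outgroup O
    p_c = drift(p_o, 0.05, rng)  # source population C
    p_b = drift(p_o, 0.05, rng)  # source population B
    p_a = drift(p_b, 0.02, rng)  # A shares B's drift, plus a little of its own
    # No drift is added to X after mixing, so the estimate is essentially exact.
    p_x = [alpha * b + (1 - alpha) * c for b, c in zip(p_b, p_c)]
    return p_a, p_o, p_x, p_b, p_c


p_a, p_o, p_x, p_b, p_c = simulate(alpha=0.3)
estimate = f4_ratio(p_a, p_o, p_x, p_b, p_c)
print(f"estimated admixture proportion: {estimate:.3f}")
```

Because the toy model adds no drift or sampling noise to `X`, the estimate recovers the input proportion of 0.3; real data would not be this clean. The output is an ancestry proportion and nothing more: no step of the calculation encodes which language any of the populations spoke.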
Counter-Arguments & Responses
Counter-argument: The correlation between Steppe ancestry and Indo-Iranian language spread is strong enough to be considered proof of the Aryan Migration Theory.
Response: Correlation is not causation. The temporal and geographic overlap is significant, but DNA does not carry linguistic tags. Elite dominance scenarios (a small migrating group imposing its language without a proportionate genetic contribution) are consistent with the same data. The genetic evidence constrains the timing but does not determine the mechanism of language spread.
Counter-argument: The Out-of-India hypothesis is falsified by the genetic data showing external Steppe ancestry.
Response: The genetic data shows Steppe-related admixture in South Asia, but this is a genetic observation, not a linguistic one. The Out-of-India hypothesis could accommodate external genetic admixture if it argues that Indo-Aryan languages were already in India before the Steppe-related populations arrived. The genetic data makes the hypothesis less parsimonious but does not logically falsify it.
Falsifiability Criteria
If ancient DNA from IVC sites consistently showed Steppe ancestry at dates before 2,000 BCE, the migration timeline would need fundamental revision. If, hypothetically, a method were developed to extract linguistic information from DNA, the language question could be resolved directly.
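The first criterion can be read as a simple check over dated samples. The sketch below is only an illustration under stated assumptions: the `Sample` record, the `contradicts_timeline` function, the 5% noise threshold, and every sample value are hypothetical, and only the 2,000 BCE cutoff comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    site: str
    date_bce: int            # calendar date in years BCE (larger means older)
    steppe_fraction: float   # estimated share of Steppe-related ancestry, 0..1
    is_ivc_site: bool


def contradicts_timeline(samples, cutoff_bce=2000, min_fraction=0.05):
    """Return IVC-site samples older than the cutoff that carry Steppe ancestry.

    min_fraction is an assumed threshold so that estimates that are
    indistinguishable from zero are not counted as Steppe ancestry.
    """
    return [
        s for s in samples
        if s.is_ivc_site
        and s.date_bce > cutoff_bce
        and s.steppe_fraction >= min_fraction
    ]


# Hypothetical records for illustration only, not real measurements.
samples = [
    Sample("IVC site 1", 2500, 0.00, True),
    Sample("Later site", 1200, 0.20, False),
]
flagged = contradicts_timeline(samples)
print(len(flagged))
```

The criterion asks for a consistent pattern, so a single flagged sample would call for scrutiny of its dating and contamination controls rather than immediate revision of the timeline.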