South Asia Genetic Chart

ArainGang
Sep 11, 2021

Using South Asian genetic samples run through the Harappa Calculator, I’ve constructed averages for various ethnic groups and castes across the Indian Subcontinent. Below is a quick guide on how to interpret the figures.

S Indian: A signal representing the Ancestral Indians, who were the indigenous hunter-gatherers of India. Their ancestry is most commonly seen in Dravidians, Adivasis, and Dalits.

Baloch: A signal representing pastoralists from eastern Iran who migrated to India thousands of years ago. Their ancestry is found most commonly in the Brahui and Baloch of Pakistan.

Caucasian: A signal representing ancient western Iranian populations. It is found most often in populations near the Caucasus Mountains.

NE Euro: A signal representing ancient eastern Europeans. It is often used as a proxy for Aryan ancestry and is most common among northeastern Europeans today.

SE Asian: A signal representing ancestry from early southeast Asian populations, most commonly seen in populations like Cambodians, Malays, and Southern Chinese.

Siberian: A signal representing ancestry from northern Asian populations, most commonly seen in Turks, Mongols, and Tungusic peoples.

NE Asian: A signal representing ancestry from northeast Asia, seen most commonly in Japanese and Northern Chinese.

Papuan: A signal representing ancestry found in Papuan peoples and neighboring islanders in southeast Asia and Oceania.

American and Beringian: Signals representing ancient northeast Asian populations, most commonly seen among indigenous Americans.

Mediterranean: A signal representing early European ancestry, most commonly seen in modern Mediterranean populations.

SW Asian: A signal representing early Middle-Eastern ancestry, seen most commonly in Bedouin Arabs.

San: A signal indicating southern African ancestry.

E African: A signal indicating east African ancestry.

Pygmy: A signal indicating Pygmy-related African ancestry.

W African: A signal indicating west African ancestry.

Note that genetic calculators are not perfect and can give glitchy results at the extremes of certain population clusters. They are, however, generally reliable, and are best used alongside tools like G25.

All samples were drawn from publicly available forums like Anthrogenica, Eupedia, Reddit, and Genoplot.
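
As a rough illustration of the averaging step mentioned at the top, here is a minimal Python sketch that groups per-sample component percentages by label and takes the mean of each component. The sample data, the group names, and the `group_averages` helper are all hypothetical, and treating a missing component as 0 is an assumption, not necessarily how the chart was built.

```python
from collections import defaultdict

# Hypothetical samples: (group label, {component: percent}).
# These numbers are made up for illustration and are not real results.
samples = [
    ("Group A", {"S Indian": 30.0, "Baloch": 40.0, "NE Euro": 10.0}),
    ("Group A", {"S Indian": 34.0, "Baloch": 36.0, "NE Euro": 12.0}),
    ("Group B", {"S Indian": 60.0, "Baloch": 25.0, "NE Euro": 2.0}),
]


def group_averages(samples):
    """Return {group: {component: mean percent}} over each group's samples."""
    totals = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for group, components in samples:
        counts[group] += 1
        for name, pct in components.items():
            totals[group][name] += pct
    # Dividing by the group's sample count means a component that is
    # missing from a sample is treated as 0 for that sample (an assumption).
    return {
        group: {
            name: round(total / counts[group], 2)
            for name, total in comps.items()
        }
        for group, comps in totals.items()
    }


averages = group_averages(samples)
for group, comps in averages.items():
    print(group, comps)
```

With only a handful of samples per group, a single outlier can shift the mean noticeably, so it helps to keep the sample count next to each average.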
