User:Wikiality123/Sandbox

Paper 1

Genetic affinities among the lower castes and tribal groups of India: Inference from Y chromosome and mitochondrial DNA

Authors

Thanseem, I.a , Thangaraj, K.a , Chaubey, G.a b , Kumar Singh, V.a , Bhaskar, L.V.K.S.a , Reddy, B.M.c , Reddy, A.G.a , Singh, L.a

a Centre for Cellular and Molecular Biology, Uppal Road, Hyderabad- 500 007, India
b Estonian Biocentre, Riia 23, Tartu- 51010, Estonia
c Biological Anthropology Unit, Indian Statistical Research Institute, Habsiguda, Hyderabad, India

Abstract

Background: India is a country with enormous social and cultural diversity due to its positioning on the crossroads of many historic and pre-historic human migrations. The hierarchical caste system in the Hindu society dominates the social structure of the Indian populations. The origin of the caste system in India is a matter of debate with many linguists and anthropologists suggesting that it began with the arrival of Indo-European speakers from Central Asia about 3500 years ago. Previous genetic studies based on Indian populations failed to achieve a consensus in this regard. We analysed the Y-chromosome and mitochondrial DNA of three tribal populations of southern India, compared the results with available data from the Indian subcontinent and tried to reconstruct the evolutionary history of Indian caste and tribal populations. Results: No significant difference was observed in the mitochondrial DNA between Indian tribal and caste populations, except for the presence of a higher frequency of west Eurasian-specific haplogroups in the higher castes, mostly in the north western part of India. On the other hand, the study of the Indian Y lineages revealed distinct distribution patterns among caste and tribal populations. The paternal lineages of Indian lower castes showed significantly closer affinity to the tribal populations than to the upper castes. The frequencies of deep-rooted Y haplogroups such as M89, M52, and M95 were higher in the lower castes and tribes, compared to the upper castes. Conclusion: The present study suggests that the vast majority (>98%) of the Indian maternal gene pool, consisting of Indo-European and Dravidian speakers, is genetically more or less uniform. Invasions after the late Pleistocene settlement might have been mostly male-mediated. However, Y-SNP data provides compelling genetic evidence for a tribal origin of the lower caste populations in the subcontinent. 
Lower caste groups might have originated with the hierarchical divisions that arose within the tribal groups with the spread of Neolithic agriculturalists, much earlier than the arrival of Aryan speakers. The Indo-Europeans established themselves as upper castes among this already developed caste-like class structure within the tribes.

Paper 2

Presence of three different paternal lineages among North Indians: A study of 560 Y chromosomes

Authors

Zhao, Z.a b c , Khan, F.d , Borkar, M.d , Herrera, R.d , Agrawal, S.d e

a Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, United States
b Department of Human Genetics, Virginia Commonwealth University, Richmond, VA, United States
c Center for the Study of Biological Complexity, Virginia Commonwealth University, Richmond, VA, United States
d Department of Medical Genetics, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, (UP) 226014, India
e Department of Medical Genetics, Sanjay Post Graduate Institute of Medical Sciences, Raebareli Road, Lucknow (UP) 226014, India

Abstract

Background: The genetic structure, affinities, and diversity of the 1 billion Indians hold important keys to numerous unanswered questions regarding the evolution of human populations and the forces shaping contemporary patterns of genetic variation. Although there have been several recent studies of South Indian caste groups, North Indian caste groups, and South Indian Muslims using Y-chromosomal markers, overall, the Indian population has still not been well studied compared to other geographical populations. In particular, no genetic study has been conducted on Shias and Sunnis from North India. Aim: This study aims to investigate genetic variation and the gene pool in North Indians. Subjects and methods: A total of 32 Y-chromosomal markers in 560 North Indian males collected from three higher caste groups (Brahmins, Chaturvedis and Bhargavas) and two Muslim groups (Shia and Sunni) were genotyped. Results: Three distinct lineages were revealed based upon 13 haplogroups. The first was a Central Asian lineage harbouring haplogroups R1 and R2. The second lineage was of Middle-Eastern origin represented by haplogroups J2*, Shia-specific E1b1b1, and to some extent G* and L*. The third was the indigenous Indian Y-lineage represented by haplogroups H1*, F*, C* and O*. Haplogroup E1b1b1 was observed in Shias only. Conclusion: The results revealed that a substantial part of today's North Indian paternal gene pool was contributed by Central Asian lineages who are Indo-European speakers, suggesting that extant Indian caste groups are primarily the descendants of Indo-European migrants. The presence of haplogroup E in Shias, first reported in this study, suggests a genetic distinction between the two Indo Muslim sects. The findings of the present study provide insights into prehistoric and early historic patterns of migration into India and the evolution of Indian populations in recent history.

Paper 3

Genetic heterogeneity among the Hindus and their relationships with the other 'Caucasoid' populations: New data on Punjab-Haryana and Rajasthan Indian States

Authors

Tartaglia, M., Scacchi, R., Corbo, R.M., Pompei, F., Rickards, O., Ciminelli, B.M., Sangatramani, T., Vyas, M., Dash, S., Modiano, G.

Dipartimento di Biologia, Universita 'Tor Vergata', Via della Ricerca Scientifica S.N.C., 00133 Roma, Italy

Abstract

The genetic structure of Rajasthan Hindus and Punjab-Haryana Hindus and Sikhs has been studied for ABO, RH, APOC2, C6, C7, F13A, F13B, HP, ORM1, ACP1, ADA, AK1, ESD, GLO1, PGD, PGM1 subtyping, and PGP. This is the first genetic survey on Hindus of Rajasthan. Furthermore, many of these markers have never been studied in Hindus before (APOC2, C6, C7, F13A, F13B, ORM1, PGP). These data, together with those previously available for Hindus, have been utilized to analyze the within-Hindus genetic heterogeneity by R(ST) statistic and correspondence analysis. The genetic relationships of Hindus to other Caucasoid populations were also investigated. In the first analysis, two eastern states (Orissa and Andhra Pradesh) were found to be quite separate from each other and clearly distinct from the northwestern and western states. Out of the markers which could not be utilized in this analysis, PGM1 subtyping turned out to discriminate between the Dravidian-speaking and the Indo-Aryan-speaking Hindus. The second analysis shows a clear-cut separation of Hindus from Europeans, with Near Eastern and Middle Eastern populations genetically in an intermediate position.

Paper 4

Distribution of HLA (class I and class II) antigens in the native Dravidian Hindus of Tamil Nadu, south India.

Authors

Subramanian, V.S., Selvaraj, P., Narayanan, P.R., Prabhakar, R., Damodaran, C.

Tuberculosis Research Centre (ICMR), Madras, India.

Abstract

The HLA-A, B, C, DR, DQ antigen profile of South Indian Tamil-speaking Hindus of Dravidian descent was studied. Phenotype, gene and haplotype frequencies were calculated and compared with the literature. There was a complete lack of A23, A25 and A32 antigens in the sample presently monitored. Except for minor differences (higher incidence of Cw6 and DR10 antigens), the Dravidian Hindus show similarity to North Indo-Aryan and other Hindu samples. The haplotypes A1, B17; A2, B5; A2, B51; A1, DR7; B12, DR7; B13, DR2; B17, DR7; DR2, DQ1; DR3, DQ2; DR4, DQ3; DR5, DQ3; DR7, DQ2; and DR11, DQ3 show significant positive linkage disequilibrium, whereas A1, DR2; DR2, DQ2; and DR7, DQ1 show significant negative linkage disequilibrium in the Dravidian Hindus.
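
For readers unfamiliar with the term, positive and negative linkage disequilibrium can be illustrated with the standard two-locus coefficient D = h_AB - p_A * p_B, where h_AB is the haplotype frequency and p_A, p_B are the frequencies of the two antigens. The sketch below is not the authors' procedure: the function name is hypothetical and the frequencies are invented for illustration, not taken from the study.

```python
def linkage_disequilibrium(hap_freq, freq_a, freq_b):
    """Return D = h_AB - p_A * p_B for one two-locus haplotype."""
    for value in (hap_freq, freq_a, freq_b):
        if not 0.0 <= value <= 1.0:
            raise ValueError("frequencies must lie in [0, 1]")
    return hap_freq - freq_a * freq_b


# Hypothetical frequencies, NOT data from the paper.
# Expected haplotype frequency under independence is 0.20 * 0.25 = 0.05.
d_positive = linkage_disequilibrium(0.08, 0.20, 0.25)  # haplotype in excess
d_negative = linkage_disequilibrium(0.02, 0.20, 0.25)  # haplotype in deficit

print(f"D (excess):  {d_positive:+.3f}")
print(f"D (deficit): {d_negative:+.3f}")
```

A positive D means the two antigens occur together more often than their separate frequencies predict; a negative D means less often. Whether a given D is statistically significant needs a separate test, which this sketch does not attempt.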

Paper 5

The Eurasian Heartland: A continental perspective on Y-chromosome diversity PNAS August 28, 2001 vol. 98 no. 18 10244-10249

R. Spencer Wellsa,b, Nadira Yuldashevaa,c, Ruslan Ruzibakievc, Peter A. Underhilld, Irina Evseevae, Jason Blue-Smithd, Li Jinf, Bing Suf, Ramasamy Pitchappang, Sadagopal Shanmugalakshmig, Karuppiah Balakrishnang, Mark Readh, Nathaniel M. Pearsoni, Tatiana Zerjalj, Matthew T. Websterk, Irakli Zholoshvilil, Elena Jamarjashvilil, Spartak Gambarovm, Behrouz Nikbinn, Ashur Dostievo, Ogonazar Aknazarovp, Pierre Zallouaq, Igor Tsoyr, Mikhail Kitaevs, Mirsaid Mirrakhimovs, Ashir Charievt, and Walter F. Bodmera,u

From discussion

Intriguingly, the population of present-day Iran, speaking a major Indo-European language (Farsi), appears to have had little genetic influence from the M17-carrying Indo-Iranians. It is possible that the pre-Indo-European population of Iran—effectively an eastern extension of the great civilizations of Mesopotamia—may have reached sufficient population densities to have swamped any genetic contribution from a small number of immigrating Indo-Iranians. If so, this may have been a case of language replacement through the “elite-dominance” model (29). Alternatively, an Indo-Iranian language may have been the lingua franca of the steppe nomads and the surrounding settled populations, facilitating communication between the two. Over time, this language could have become the predominant language in Persia, reinforced and standardized by rulers such as Cyrus the Great and Darius in the mid-first millennium B.C. Whichever model is correct, the Iranians sampled here (from the western part of the country) appear to be more similar genetically to Afro-Asiatic-speaking Middle Eastern populations than they are to Central Asians or Indians. This finding contrasts with a recent analysis of Eastern Iranian populations, which have high frequencies of Y-chromosome haplogroup 3, defined by the M17 analogue SRY-1532A (30). It is likely that the Dasht-e Kavir and Dasht-e Lut deserts in the center of the country have acted as significant barriers to gene flow.

Paper 6

Phylogeography of mtDNA haplogroup R7 in the Indian peninsula

Chaubey, G.a b , Karmin, M.a , Metspalu, E.a , Metspalu, M.a , Selvi-Rani, D.b , Singh, V.K.b , Parik, J.a , Solnik, A.a , Naidu, B.P.b , Kumar, A.b , Adarsh, N.b , Mallick, C.B.b , Trivedi, B.b , Prakash, S.b , Reddy, R.b , Shukla, P.b , Bhagat, S.b , Verma, S.b , Vasnik, S.b , Khan, I.b , Barwa, A.b , Sahoo, D.b , Sharma, A.b , Rashid, M.b , Chandra, V.b , Reddy, A.G.b , Torroni, A.c , Foley, R.A.d , Thangaraj, K.b , Singh, L.b , Kivisild, T.a d , Villems, R.a

a Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
b Centre for Cellular and Molecular Biology, Hyderabad, India
c Dipartimento di Genetica e Microbiologia, Università di Pavia, Via Ferrata 1, 27100 Pavia, Italy
d Leverhulme Centre of Human Evolutionary Studies, Henry Wellcome Building, University of Cambridge, Fitzwilliam Street, Cambridge, CB2 1QH, United Kingdom

Abstract

Background. Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East. Results. Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ∼12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1) is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups have acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda-speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. 
Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages. Conclusion. Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India is not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity. © 2008 Chaubey et al; licensee BioMed Central Ltd.

Paper 7

Genomic inferences on peopling of south Asia

Partha P. Majumder

Human Genetics Unit, Indian Statistical Institute, 203 B.T. Road, Kolkata 700108, India

Available online 11 August 2008.

South Asia has been a major corridor for the geographic dispersal of modern humans from out-of-Africa to other regions of the world. Genomic markers have provided key information for tracing trails of human migration. An overall view of these trails has emerged, though there are still many contentious issues. The degree of genomic differentiation in south Asia is high, resulting from a combination of admixture and isolation.

A brief history of south Asia

South Asia, comprising present day Pakistan, India, and Bangladesh, has served as a major corridor for the dispersal of modern humans that started from out-of-Africa about 100 000 years ago [1••]. The date of entry of modern humans into this region remains uncertain. However, modern human remains dating back to the late Pleistocene (55 000–25 000 years before present, ybp) have been found [2] and by the middle Paleolithic period (50 000–20 000 ybp), humans appear to have spread to many parts of India [3]. There is evidence of cultural contacts between people of the Indian subcontinent and other regions, near and far, for a very long period of time, possibly going back even to the prehistoric period. This region has also experienced a large number of invasions [4]. These cultural contacts and invasions have resulted in a high degree of genetic and cultural differentiation among the people of India. There is also evidence of a complex civilization, the Indus valley (Harappan) civilization, that is approximately 4500 years old, but disappeared about 3500 ybp. Almost simultaneously with the disappearance of the Harappan civilization, there was a conquest of this region by nomadic people from Central Asia, who spoke Indo-European languages. This conquest by Indo-European speakers introduced a social structure that is hierarchical (the caste system), and persists even to this day. Among other social rules, this social restructuring resulted in the creation of strict rules governing mate-exchange that have resulted in the formation of endogamous population groups and have further accelerated genetic differentiation. It is believed that speakers of Dravidian languages were widespread throughout India before the arrival of Indo-European speakers [5••]. Subsequent to their arrival, the Indo-European speakers exercised dominance over the pre-existing populations, and currently the Dravidian speakers are confined primarily to the southern parts of India. 
Northern India has become home to the Indo-European speakers. The north-eastern region of India is predominantly occupied by speakers of the Tibeto-Burman languages (of the Sino-Tibetan family). The other language family found in India is the Austro-Asiatic and speakers of this family of speech are spread in different pockets of India, primarily east and central India (where the Mundari sublineage of the Austro-Asiatic family is spoken), but also northeast India (where the Mon-Khmer sublineage is spoken).

There are thousands of extant endogamous population groups in India. The vast majority belongs to the caste system. There are over 4000 such groups [6]. There are about 400 tribal groups, and about 100 groups who belong to neither the caste nor the tribal folds, but are religious (Muslim, Christian, Buddhist, etc.) or migrant groups (Parsee, Irani, etc.) [6]. While Indo-European, Dravidian, and Tibeto-Burman speakers belong to both caste and tribal folds, the Austro-Asiatic speakers are exclusively tribal.

Major components of the structure of south Asian populations inferred from ‘classical’ genetic markers

The most comprehensive analysis and synthetic inferences based on allele frequencies of classical genetic markers (blood groups, serum proteins, and red-cell enzymes) in various populations of south Asia, drawn from a large number of earlier publications of various researchers, were made by Cavalli-Sforza et al. [7••]. Their analyses showed that populations of south Asia were positioned between populations of west and southeast Asia. Within south Asia, allele frequency distributions over geographical space for the vast majority of genetic marker systems were ‘patchy’, but showed a high degree of microgeographic variation. A major contributor to this variation is undoubtedly the influx of genes through invasions. Another contributor is the relative isolation of population groups due to endogamy that resulted in significant genetic drift within these groups. Principal components analysis of allele frequencies revealed that there are four major components of the genetic structure of India, representing firstly, a substrate of Paleolithic people, perhaps the indigenous tribals; secondly, early farmers from the fertile crescent region, who brought in the technology of agriculture to this region; thirdly, the Indo-European speakers, who arrived about 3500 ybp from central Asia; and fourthly, the Austro-Asiatic and Tibeto-Burman speakers. Linguistic differences among the people of the Indian subcontinent accounted for the highest fraction of genetic diversity, though it is important to emphasize that there is a high degree of confounding between language and geography (because Indo-European speakers are almost exclusively confined to the north, the Dravidians to the south, and the Tibeto-Burmans to the northeast).

South Asia has played a pivotal role in geographic dispersal of modern humans

Reconstruction of the process of peopling is problematic using unlinked autosomal (biparental) markers. Mitochondrial DNA (mtDNA) and Y-chromosomal markers have proved to be very useful in reconstructing patterns and tracing trails of human migration, in spite of the limitation that these genomic regions do not undergo recombination and hence are essentially transmitted as single markers. Another limitation of inferences based on mtDNA and Y-chromosomal studies is that since these are uniparental markers, the impact of genetic drift on these marker systems is greater because of reduced effective population size compared to biparental (autosomal) markers. In spite of these limitations, uniparental markers have provided considerable information for tracing trails of human migration.
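
The point about drift can be made concrete with a rough calculation under an idealized Wright-Fisher model, which the review itself does not invoke. Assuming N breeding adults with an equal sex ratio, autosomes are carried as 2N gene copies, whereas the Y chromosome (N/2 males, one copy each) and mtDNA (N/2 transmitting females) are each carried as N/2 copies. The one-generation sampling variance of a frequency p is p(1 - p) divided by the number of copies. The function names and numbers below are illustrative assumptions.

```python
def gene_copies(n_adults, marker):
    """Number of transmitted gene copies among n_adults breeding adults.

    Assumes an equal sex ratio (an idealization, not a claim from the review).
    """
    if marker == "autosomal":
        return 2.0 * n_adults      # two copies in every adult
    if marker in ("mtDNA", "Y"):
        return n_adults / 2.0      # one copy in each adult of one sex
    raise ValueError(f"unknown marker type: {marker}")


def drift_variance(p, copies):
    """One-generation Wright-Fisher sampling variance of frequency p."""
    return p * (1.0 - p) / copies


n_adults = 1000   # hypothetical population size
p = 0.3           # hypothetical allele (or haplogroup) frequency

var_autosomal = drift_variance(p, gene_copies(n_adults, "autosomal"))
var_uniparental = drift_variance(p, gene_copies(n_adults, "Y"))
ratio = var_uniparental / var_autosomal

print(f"drift variance ratio (uniparental / autosomal): {ratio:.1f}")
```

Under these assumptions the ratio is four regardless of N and p, which is the sense in which uniparental markers have a smaller effective population size. Real populations depart from the equal-sex-ratio and random-mating assumptions, so the factor is only a guide.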

Basal mutations that are shared by clusters of phylogenetic lineages of each of these uniparental marker systems are called haplogroups. These haplogroups usually have a strong geographical patterning with respect to their presence and their frequencies. Sub-Saharan African lineages belong to haplogroups L1, L2, and L3. All of the mtDNAs outside of Africa are derivatives of just two haplogroups M and N, which arose from L3. mtDNAs of European, north African, and western Asian Caucasians belong to haplogroups H, I, J, K, T, U, V, W, and X; and haplogroups A, B, C, D, E, F, G, and M encompass the Asian, Oceanian, and native American mtDNA lineages [8]. The two founding clades M and N (including its daughter haplogroup R) gave rise to a large number of subclades that are dispersed throughout Eurasia. Of these subclades, the most deep-rooting ones (coalescence time between 40 000 and 60 000 ybp) are found almost exclusively in south Asia, indicating that south Asia has played a pivotal role in the out-of-Africa colonization and dispersal of modern humans [[8] and [9]].

Human dispersals from Africa to Eurasia: one or two waves?

Contrasting patterns of usage of stone-tool technology found in Eurasia led to the proposition [10••] that there were at least two separate waves of human dispersal from northeastern Africa. The ‘northern’ wave of dispersal extended northward via the Nile Valley and the Sinai Peninsula into southwestern Asia, and eventually into Europe. An independent wave of dispersal, via the ‘southern exit route’, extended from the Horn of Africa across the mouth of the Red Sea along the coasts of southern and southeastern Asia into Australia [10••]. The stone-tool technologies found along the southern exit route were classified as ‘simpler’ than those found along the northern dispersal route, and therefore it was believed that the southern wave predated the northern wave. Whether there were two waves of dispersal has been debated by both archaeologists and molecular geneticists. The debate continues. On the basis of new archaeological evidence and reinterpretation of older evidence, Mellars [11] has strongly favored a single ‘southern’ wave of dispersal. The strongest genomic evidence in support of the ‘southern exit route’ comes from the analysis of mtDNA. All non-African mtDNA lineages belong to a subset of African mtDNA haplogroup L3. The L3 haplogroup differentiated into M and N haplogroups. The distribution of M lineages and estimate of their coalescent age (65 000 ybp [12•]) favor the early presence of this haplogroup in south Asia. Haplogroup M is present in Ethiopia, but primarily in south and east Asia. 
This haplogroup has not been found along the northern exit trail, but is found in high frequency in India [13], the first major dispersal point along the southern route. Further, all of the earliest genetic branches of M are found in India and at high diversity. Two recent mtDNA studies conducted in ‘relict’ populations of southeast Asia [14] and the Andaman and Nicobar Islands [15] also point to human dispersals through the southern exit route. Further, Y-chromosomal haplogroup data are indicative of dispersal through the southern route; haplogroups C and D are found only in the Asian continent and Oceania [[9] and [16]] and not in western Eurasia and north Africa. It may be pertinent to mention that there is mtDNA evidence of a second wave of human dispersal from Africa through the northern route ranging 43 000–53 000 ybp [8], which reached central Asia and radiated from there to north and east Asia carrying the prominent mtDNA haplogroups A and B. Later expansions can be detected by the presence of subclades of haplogroup U in India and Europe [[17] and [18•]]. Thus, genetic evidence points to the older southern route as having been the more significant one for human dispersal than the later northern route.

Genetic evidence for early settlers

The tribal populations of India possess higher genetic diversity than caste populations, indicating that tribals are the older settlers of India [[18•] and [19•]]. It has been debated whether the Austro-Asiatic speaking tribals are the original inhabitants of India [[5••], [20] and [21]]. Maternally inherited mtDNA studies strongly support the view that they are the earliest inhabitants of India. They possess the highest frequencies of the ancient haplogroup M and exhibit the highest genomic diversity within a fast evolving segment (HVS1) of the mtDNA [18•]. They also have the highest frequency of subhaplogroup M2 (20% [[17], [18•], [22] and [23]]), which has the highest HVS1 nucleotide diversity compared with other subhaplogroups and therefore possibly marks the earliest settlers (the estimated coalescence time is 63 000 ± 6000 ybp [17]). Recent results [24] also indicate that haplogroup O-M95 originated in the Indian Austro-Asiatic populations 65 000 ybp and that their ancestors carried it further to southeast Asia via the northeast Indian corridor. These findings are consistent with Renfrew's [25] observation that the present distribution of the Austric language group is owing to the initial dispersal process out of Africa, whereas later agricultural dispersal can account for the Elamo-Dravidian or Sino-Tibetan (to which family Tibeto-Burman languages belong) distributions.
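
Nucleotide diversity, the measure referred to above for HVS1, is conventionally the average number of differences per site between all pairs of sequences in a sample. The following is a minimal sketch of that textbook definition, not the procedure used in the cited studies. The function name is hypothetical and the three short sequences are invented toy data, far shorter than a real HVS1 alignment.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site across aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(seq1, seq2)) for seq1, seq2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Invented toy alignment, NOT real mtDNA data.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
pi = nucleotide_diversity(toy_alignment)
print(f"nucleotide diversity: {pi:.4f}")
```

A lineage that has been accumulating mutations for longer tends to show a higher value, which is why high HVS1 diversity is read as a sign of antiquity; population size changes and drift also affect it.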

The tribal groups who speak Tibeto-Burman languages are concentrated in the northeastern part of India; presumably they arrived on waves of migration from southern China through the northeastern corridor of India. The Tibeto-Burman tribals can be distinguished from the Austro-Asiatic tribals on the basis of frequencies of specific genetic variations found on the Y-chromosome [18•] and mtDNA [26]. Cordaux et al. [27] have opined, by studying mtDNA and Y-chromosomal variations in populations inhabiting the northeast Indian region, that after humans entered this region they remained land-locked in it. The narrow strip of land that connects the northeast Indian region to the remaining part of India has served as a bottleneck to dispersal. They [27] have also suggested that the Austro-Asiatic and Tibeto-Burman speakers may be recent immigrants. Thus, the issue of identifying the earliest settlers remains unresolved, though genetic evidence overwhelmingly points to the Austro-Asiatic speaking tribal groups being the most ancient inhabitants of south Asia.

Genetic diversity among tribals and their relationships to castes

Within India, the caste and tribal groups are significantly differentiated [[18•] and [19•]]. There is also considerable genetic heterogeneity among the Austro-Asiatic, Tibeto-Burman, and Dravidian tribal groups, indicating that they have diverse origins and population histories [[18•], [26] and [28••]]. The origins of some tribal groups, and the relationship between the genetic origins of tribals and castes have been highly debated. Two rival models, based primarily on mtDNA and Y-chromosome data, have been proposed. One model suggests that the tribes and castes share considerable Pleistocene heritage, with limited recent gene flow between them [9], whereas an exactly opposite view concluded that caste and tribes have independent origins [27]. Another analogous debate concerns the origins of the hypothetical proto-Elamo-Dravidian language, which is thought to be the precursor of Tamil. It has been proposed that the proto-Elamo-Dravidian language spread eastward from southwest Persia into South Asia with agriculture [29], and the argument is bolstered by the existence of a solitary Dravidian-speaking group, the Brahui, in Pakistan [30]. On the basis of a recent extensive genetic study of a large number of Y-chromosomes sampled from ethnic populations representing the cultural (tribe and caste), linguistic, and geographical diversity of India, we [28••] and others [31] have recently concluded that the influence of Central Asia on the pre-existing Indian gene pool was minor. This conclusion is significant because a widely prevalent view is that the extent of contribution of west and central Asians to the Indian gene pool was large, because the Indo-European speakers who entered India have had a major influence on the cultural landscape of India, including the introduction of the caste system. 
We found that the ages of accumulated microsatellite variation (a specific type of genetic variation that is very informative in inferring population origins and histories) in the majority of Indian haplogroups exceed 10 000–15 000 years, which attests to the antiquity of regional differentiation, because human immigration from central Asia has been more recent. Observed Y-chromosomal variation, especially of haplogroups R1a1 and R2, in India is consistent with diverse origins of Indian tribals. Associated microsatellite analyses of the high-frequency haplogroup R1a1 chromosomes indicate independent recent histories of the Indus Valley and the peninsular Indian region [28••]. Our data are also more consistent with a peninsular origin of Dravidian speakers than a source with proximity to the Indus and with significant genetic input resulting from human dispersal associated with agriculture. Thus, pre-Holocene and Holocene-era — not Indo-European — expansions have shaped the distinctive South Asian, including India, Y-chromosome landscape. Before the arrival of the Indo-European speakers into India, who established the caste system in India, considerable genetic variation pre-existed. The immigrants added to the pre-existing diversity, but the extent of their genetic contribution was minor. The genetic patterns of the extant caste groups therefore indicate an overlay of some predominant genetic signatures found in central Asia (Indo-European speakers) with signatures that are indigenous to India. There is a declining proportion of Indo-European admixture with decrease in rank of caste groups, since genetic distances between extant European populations and Indian caste groups increase with decrease in rank of the caste groups [22]. The pattern of genetic variation indicates that Dravidian speakers in India may have been much more widespread before the Indo-European speakers came into India. 
A large section of Dravidian speakers possibly retreated to their present habitat (southern India) to avoid Indo-European dominance [18•].

Conclusions and an additional comment

Genomic reconstruction of migration trails of modern humans indicates that south Asia has served as a major corridor for the geographic dispersal of humans from out-of-Africa. The most significant route of this dispersal seems to have been from the Horn of Africa across the mouth of the Red Sea along the coasts of southern and southeastern Asia into Australia. The extant population of south Asia comprises a large number of isolated groups of varying sizes. This, coupled with the facts that Indian populations have had early contacts with populations outside of the region and that there have been many invasions of India, has resulted in a high degree of genetic diversity in this region. No clear geographical patterning of the distribution of allele frequencies of genetic markers can be observed within this region. The entry of the Indo-European speakers from central Asia to south Asia resulted in a massive social restructuring. This social restructuring has left genetic imprints, and has also resulted in a significant genetic differentiation between caste and tribal populations. Although the tribal groups carry genetic signatures of being early settlers, it has been difficult to identify which specific subgroup of tribals may have been the earliest settlers. The overwhelming evidence points to the Austro-Asiatic speaking tribals being the earliest settlers. The contribution of the Indo-European speakers to the south Asian gene pool was overestimated; this contribution appears to have been small.

How are genome diversity studies helpful in the context of research on human disease? In addition to providing insights into human evolutionary processes, human genome diversity studies have started to yield insights into various central biological questions, such as the extent and nature of variability in recombination rates in the human genome [32••, 33, 34••] and the way natural selection has shaped the variability found in the human genome [35••, 36••]. Data from genome diversity research on small and recently admixed populations have been helpful in identifying genomic regions with strong linkage disequilibrium. In recently admixed populations, disequilibrium can be observed over chromosomal areas as large as 5–10 cM, which is about 2–5% of the estimated linkage length of an average human chromosome [37]. These findings have helped the design of genetic studies and have proven useful in localizing genes that are associated with or cause various disorders [38–40]. Genome diversity studies, especially those pertaining to human migration, have also been helpful in tracing the origin and understanding spatial patterns of disease-related genetic variants, such as the 3-basepair deletion ΔF508 that causes cystic fibrosis [41••, 42], or the origin, distribution and relative proportions of alleles in relation to the spread of agriculture or infectious diseases through demic diffusion, such as that of the ABO blood groups in relation to malaria [43] or Δccr5 in relation to HIV-1 [44]. Therefore, the understanding of the processes of peopling has a direct bearing on the understanding of epidemiology and genetics of diseases.
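
The persistence of disequilibrium over several centimorgans in recently admixed populations follows from a standard population-genetic result: under random mating, linkage disequilibrium between two loci shrinks by a factor of (1 − r) each generation, where r is the recombination fraction between them. The following is a minimal sketch of that calculation, not a method taken from the studies cited above. The function names, the choice of Haldane's map function to convert map distance into r, and the distances and generation counts are all illustrative assumptions.

```python
import math


def recombination_fraction(distance_cm: float) -> float:
    """Convert a map distance in centimorgans to a recombination fraction.

    Uses Haldane's map function (an assumption; other map functions exist).
    """
    return 0.5 * (1.0 - math.exp(-2.0 * distance_cm / 100.0))


def ld_remaining(distance_cm: float, generations: int) -> float:
    """Fraction of the initial disequilibrium left after random mating.

    Applies the textbook decay D_t = D_0 * (1 - r) ** t.
    """
    r = recombination_fraction(distance_cm)
    return (1.0 - r) ** generations


if __name__ == "__main__":
    # Hypothetical distances (cM) and times since admixture (generations).
    for distance in (0.1, 5.0, 10.0):
        for generations in (10, 100):
            share = ld_remaining(distance, generations)
            print(f"{distance:>5} cM, {generations:>3} generations: {share:.3f}")
```

Under these assumptions, loci 5 cM apart retain roughly 60% of their initial disequilibrium 10 generations after admixture but under 1% after 100 generations, which is why long-range disequilibrium is characteristic of recent admixture.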

References and recommended reading

Papers of particular interest, published within the period of review, have been highlighted as:

• of special interest

•• of outstanding interest

References

1•• R.L. Cann, Genetic clues to dispersal of human populations: retracing the past from the present, Science 291 (2001), pp. 1742–1748. A succinct review of inferences on the spread of humans from out-of-Africa based on genomic data.

2 K.A.R. Kennedy, S.U. Deraniyagala, W.J. Roertgen, J. Chiment and T. Disotell, Upper Pleistocene fossil hominids from Sri Lanka, Am J Phys Anthropol 72 (1987), pp. 441–461.

3 V.N. Misra, Stone age in India: an ecological perspective, Man Environ 14 (1992), pp. 17–64.

4 A. Danielou, A Brief History of India, Inner Traditions India, Vermont (2003).

5•• R. Thapar, A History of India, vol. 1, Penguin, Middlesex (1966). An authentic history of the Indian subcontinent, including a chapter on prehistorical evidence of human settlements in India.

6 K.S. Singh, People of India: An Introduction, Anthropological Survey of India, Calcutta (1992).

7•• L.L. Cavalli-Sforza, P. Menozzi and A. Piazza, The History and Geography of Human Genes, Princeton University Press, Princeton (1994). The most detailed reconstruction of human evolution based on classical genetic markers, including a compilation of data and spatial maps of allele frequencies.

8 N. Maca-Meyer, A.M. Gonzalez, J.M. Larruga, C. Flores and V.M. Cabrera, Major genomic mitochondrial lineages delineate early human expansions, BMC Genet 2 (2001), pp. 13–20.

9 T. Kivisild, S. Rootsi, M. Metspalu, S. Mastana, K. Kaldma, J. Parik, E. Metspalu, M. Adojaan, H.-V. Tolk and V. Stepanov et al., The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations, Am J Hum Genet 72 (2003), pp. 313–332.

10•• M.M. Lahr and R.A. Foley, Towards a theory of modern human origins: geography, demography and diversity in recent human evolution, Yearbook Phys Anthropol 41 (1998), pp. 137–176. A very detailed review of inferences on human evolution, primarily based on nongenetic data.

11 P. Mellars, Going East: new genetic and archaeological perspectives on the modern human colonization of Eurasia, Science 313 (2006), pp. 796–800.

12• L. Quintana-Murci, O. Semino, H.-J. Bandelt, G. Passarino, K. McElreavey and A.S. Santachiara-Benerecetti, Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa, Nat Genet 23 (1999), pp. 437–441. The first genetic evidence that humans may have used the coastal southern exit route from out-of-Africa.

13 S. Roychoudhury, S. Roy, A. Basu, R. Banerjee, H. Vishwanathan, M.V. Usha Rani, S.K. Sil, M. Mitra and P.P. Majumder, Genomic structures and population histories of linguistically distinct tribal groups of India, Hum Genet 109 (2001), pp. 339–350.

14 V. Macaulay, C. Hill, A. Achilli, C. Rengo, D. Clarke, W. Meehan, K. Blackburn, O. Semino, R. Scozzari and F. Cruciani et al., Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes, Science 308 (2005), pp. 1034–1036.

15 K. Thangaraj, G. Chaubey, T. Kivisild, A.G. Reddy, V.K. Singh, A.A. Rasalkar and L. Singh, Reconstructing the origin of Andaman islanders, Science 308 (2005), p. 996.

16 P.A. Underhill, G. Passarino, A.A. Lin, P. Shen, M.M. Lahr, R.A. Foley, P.J. Oefner and L.L. Cavalli-Sforza, The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations, Ann Hum Genet 65 (2001), pp. 43–62.

17 T. Kivisild, M.J. Bamshad, K. Kaldma, M. Metspalu, E. Metspalu, M. Reidla, S. Laos, J. Parik, W.S. Watkins and M. Dixon et al., Deep common ancestry of Indian and western-Eurasian mitochondrial DNA lineages, Curr Biol 9 (1999), pp. 1331–1334.

18• A. Basu, N. Mukherjee, S. Roy, S. Sengupta, S. Banerjee, M. Chakraborty, B. Dey, M. Roy, B. Roy and N.P. Bhattacharyya et al., Ethnic India: a genomic view, with special reference to peopling and structure, Genome Res 13 (2003), pp. 2277–2290. A detailed study based on mitochondrial, autosomal, and Y-chromosomal genetic markers pertaining to the peopling and genetic structure of the ethnic groups of India.

19• The Indian Genome Variation Consortium, Genetic landscape of the people of India: a canvas for disease gene exploration, J Genet 87 (2008), pp. 3–20. The most comprehensive study on Indian populations based on SNPs in autosomal genes.

20 D.P. Pattanayak, The language heritage of India. In: D. Balasubramanian and N.A. Rao, Editors, The Indian Human Heritage, Universities Press, Hyderabad (1998), pp. 95–99.

21 B.S. Guha, The racial affinities of the people of India, Census of India, 1931, Part III — Ethnographical, Government of India Press, Simla (1935).

22 M. Bamshad, T. Kivisild, W.S. Watkins, M.E. Dixon, C.E. Ricker, B.B. Rao, J.M. Naidu, B.V.R. Prasad, P.G. Reddy and A. Rasanayagam et al., Genetic evidence on the origins of Indian caste populations, Genome Res 11 (2001), pp. 994–1004.

23 P. Endicott, M.T.P. Gilbert, C. Stringer, C. Lalueza-Fox, E. Willerslev, A.J. Hansen and A. Cooper, The genetic origin of Andaman islanders, Am J Hum Genet 72 (2003), pp. 178–184.

24 V. Kumar, A.N. Reddy, J.P. Babu, T.N. Rao, B.T. Langstieh, A.G. Reddy, L. Singh and B.M. Reddy, Y-chromosome evidence suggests a common paternal heritage of Austro-Asiatic populations, BMC Evol Biol 7 (2007), p. 47.

25 C. Renfrew, Archaeology, genetics and linguistic diversity, Man 27 (1992), pp. 445–487.

26 R. Cordaux, N. Saha, G.R. Bentley, R. Aunger, S.M. Sirajuddin and M. Stoneking, Mitochondrial DNA analysis reveals diverse histories of tribal populations from India, Eur J Hum Genet 11 (2003), pp. 253–264.

27 R. Cordaux, G. Weiss, N. Saha and M. Stoneking, The Northeast Indian passageway: a barrier or corridor for human migrations?, Mol Biol Evol 21 (2004), pp. 1525–1533.

28•• S. Sengupta, L.A. Zhivotovsky, R. King, S.Q. Mehdi, C.A. Edmonds, T.C.-E. Chow, A.A. Lin, M. Mitra, S.K. Sil and A. Ramesh et al., Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists, Am J Hum Genet 78 (2006), pp. 202–221. A comprehensive study using a very large number of Y-chromosomal SNPs and microsatellites to arrive at inferences regarding peopling of the Indian, southeast and west Asian regions, some of which are at variance with inferences of some earlier studies.

29 D.W. McAlpin, Proto-Elamo-Dravidian: the evidence and its implications, Trans Am Philos Soc 71 (1981), pp. 3–155.

30 C. Renfrew, Language families and the spread of farming. In: D.R. Harris, Editor, The Origins and Spread of Agriculture and Pastoralism in Eurasia, Smithsonian Institution Press, Washington, DC (1996), pp. 70–92.

31 S. Sahoo, A. Singh, G. Himabindu, J. Banerjee, T. Sitalaximi, S. Gaikwad, R. Trivedi, P. Endicott, T. Kivisild and M. Metspalu, A prehistory of Indian Y chromosomes: evaluating demic diffusion scenarios, Proc Natl Acad Sci U S A 103 (2006), pp. 843–848.

32•• A. Kong, D.F. Gudbjartsson, J. Sainz, G.M. Jonsdottir, S.A. Gudjonsson, B. Richardsson, S. Sigurdardottir, J. Barnard, B. Hallbeck and G. Masson et al., A high-resolution recombination map of the human genome, Nat Genet 31 (2002), pp. 241–247. An insightful paper that provides estimates not only of recombination rates, but also of the possible processes that lead to variability in recombination rates.

33 G.A.T. McVean, S.R. Myers, S. Hunt, P. Deloukas, D.R. Bentley and P. Donnelly, The fine-scale structure of recombination rate variation in the human genome, Science 304 (2004), pp. 581–584.

34•• S.B. Gabriel, S.F. Schaffner, H. Nguyen, J.M. Moore, J. Roy, B. Blumenstiel, J. Higgins, M. DeFelice, A. Lochner and M. Faggart et al., The structure of haplotype blocks in the human genome, Science 296 (2002), pp. 2225–2229. A large-scale study that introduces the concept of a haplotype block.

35•• P.C. Sabeti, P. Varilly, B. Fry, J. Lohmueller, E. Hostetter, C. Cotsapas, X. Xie, E.H. Byrne, S.A. McCarroll and R. Gaudet et al., Genome-wide detection and characterization of positive selection in human populations, Nature 449 (2007), pp. 913–918. A significant study to detect signatures of positive selection in the human genome using the HapMap data.

36•• C.D. Bustamante, A. Fledel-Alon, S. Williamson, R. Nielsen, M.T. Hubisz, S. Glanowski, D.M. Tanenbaum, T.J. White, J.J. Sninsky and R.D. Hernandez et al., Natural selection on protein-coding genes in the human genome, Nature 437 (2005), pp. 1153–1157. An early large-scale study using sequence data on 11 000 protein-coding genes to discover the impact of natural selection on the human genome.

37 N.E. Morton, Parameters of the human genome, Proc Natl Acad Sci U S A 88 (1991), pp. 7474–7476.

38 M.L. Freedman, C.A. Haiman, N. Patterson, G.J. McDonald, A. Tandon, A. Waliszewska, K. Penney, R.G. Steen, K. Ardlie and E.M. John et al., Admixture mapping identifies 8q24 as a prostate cancer risk locus in African-American men, Proc Natl Acad Sci U S A 103 (2006), pp. 14068–14073.

39 X. Zhu, A. Luke, R.S. Cooper, T. Quertermous, C. Hanis, T. Mosley, C.C. Gu, H. Tang, D.C. Rao and N. Risch et al., Admixture mapping for hypertension loci with genome-scan markers, Nat Genet 37 (2005), pp. 177–181.

40 M.M. Carrasquillo, J. Zlotogora, S. Barges and A. Chakravarti, Two different connexin 26 mutations in an inbred kindred segregating non-syndromic recessive deafness: implications for genetic studies in isolated populations, Hum Mol Genet 6 (1997), pp. 2163–2172.

41•• N. Morral, J. Bertranpetit, X. Estivill, V. Nunes, T. Casals, J. Gimenez, A. Reis, R. Varon-Mateeva, M. Macek and L. Kalaydjieva et al., The origin of the major cystic fibrosis mutation (ΔF508) in European populations, Nat Genet 7 (1994), pp. 169–175. A very interesting and detailed study that traces the origin and spread of the ΔF508 mutation.

42 O. Lao, A.M. Andre, E. Mateu, J. Bertranpetit and F. Calafell, Spatial patterns of cystic fibrosis mutation spectra in European populations, Eur J Hum Genet 11 (2003), pp. 385–394.

43 C.M. Cserti and W.H. Dzik, The ABO blood group system and Plasmodium falciparum malaria, Blood 110 (2007), pp. 2250–2258.

44 P.P. Majumder and B. Dey, Absence of the HIV-1 protective Δccr5 allele in most ethnic populations of India, Eur J Hum Genet 9 (2001), pp. 794–796.