Friday, April 27, 2012
Scientists from Uppsala University have managed to extract genome-wide markers from the early Neolithic remains of three hunter-gatherers and one farmer from southern Sweden. They only pulled a few thousand SNPs from each sample, but that was enough to successfully compare the ancient remains to modern Europeans. The results of their study, published in Science today, reveal that Poles top the allele-sharing list with the hunter-gatherers. Interestingly, Poles also show higher allele sharing with the farmer than Swedes do, but not as high as Cypriots and Greeks. The figure below illustrates this clearly.
But why is it that Poles show higher similarity to these Neolithic Scandinavians than Swedes do? Firstly, it's important to realize that the differences aren't that great. Note, for instance, that Swedes are the second most similar population to the hunter-gatherers after Poles. However, clearly, the data suggests that there had to be other population movements into Scandinavia after the late Neolithic. These also likely affected Poland, but to a lesser degree.
No one yet knows what these were exactly, but if I had to guess, I'd say the Bell Beaker folk of the Copper Age represented one of the major waves (see figure below from "Europe during the third millennium BC and Bell Beaker culture phenomenon: peopling history through dental non-metric traits study" by Jocelyne Desideri). Another factor might be that the hunter-gatherers tested by Skoglund et al. belonged to the Pitted Ware culture, which arrived in Scandinavia from the Eastern Baltic.
Anyway, I'm absolutely delighted with the results from this study. The reason is that they correlate very closely with the experiments I've been running with ADMIXTURE, aimed at untangling the story of the peopling of Europe. Note, for instance, the close correlation between the STRUCTURE plot above, and the results from my Hunter-Gatherer vs. Farmer analysis (see here). All you have to do is add up the blue and purple components from the STRUCTURE graph, and you'll basically get my "Baltic hunter-gatherer" cluster. Also, the orange component is very similar to my "Mediterranean farmer" cluster.
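The "add up the blue and purple components" step can be sketched in a few lines. This is only a minimal illustration: the population names and proportions below are invented placeholders, not values from Skoglund et al. or from my own runs, and the helper names are hypothetical.

```python
# Hypothetical per-population STRUCTURE proportions (placeholder numbers,
# NOT taken from the paper). Each row sums to 1.
structure = {
    "Pop A": {"blue": 0.40, "purple": 0.25, "orange": 0.35},
    "Pop B": {"blue": 0.10, "purple": 0.15, "orange": 0.75},
}


def combine(proportions, parts):
    """Sum the named components for one population."""
    return sum(proportions[p] for p in parts)


def hunter_gatherer_like(table):
    """Blue + purple per population, rounded to 4 decimal places."""
    return {
        pop: round(combine(props, ("blue", "purple")), 4)
        for pop, props in table.items()
    }


if __name__ == "__main__":
    for pop, value in hunter_gatherer_like(structure).items():
        print(f"{pop}: {value:.0%}")
```

The combined figure for each population could then be set side by side with its score in a separately estimated hunter-gatherer cluster to see how closely the two track each other.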
If Skoglund et al. had access to more prehistoric samples, then it's likely these would create their own clusters. That's because the four Neolithic individuals they tested, especially the hunter-gatherers, seem to fall outside the range of modern European genetic variation, as can be seen on some of the PCA plots below. The appearance of ancient clusters wouldn't invalidate the current results, because such clusters would no doubt show a close relationship to those created by modern samples. However, I'm pretty sure they'd give us a better idea of how much hunter-gatherer ancestry survives in modern Europeans, because they wouldn't be affected by such factors as genetic drift since the Neolithic. So that's something to look forward to in the future.
Skoglund et al., Origins and Genetic Legacy of Neolithic Farmers and Hunter-Gatherers in Europe, Science 27 April 2012: Vol. 336 no. 6080 pp. 466-469 DOI: 10.1126/science.1216304
Saturday, April 21, 2012
Basically, the first map below reveals the answer. It shows the spread of a European-specific cluster from a global ADMIXTURE analysis at K=8 (eight ancestral populations assumed), which I call "North European". Thus, genetically, the most European populations are found around the Baltic Sea, and in particular in the East Baltic region. In my genome collection, samples from Lithuania clearly and consistently score the highest percentages in ADMIXTURE clusters specific to Europe. However, I suspect that if I had Latvians with no known foreign ancestry going back more than four generations, they'd come out the "most European". Hopefully we can test that in the near future.
Below are the fifteen Eurogenes samples that scored the highest percentage levels of membership in the North European cluster. The list only includes groups with five or more individuals present in the analysis, so some populations, like Estonians or Danes, weren't included, even though they easily made the cut. The spreadsheet with all the results from this run can be seen here. A table of Fst (genetic) distances between the eight clusters is available here.
Kargopol Russians 68%
HapMap Utah Americans (CEU) 63%
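For anyone who wants to build the same kind of ranking from a table of individual results, here's a minimal sketch. It assumes that each individual has a population label and a membership fraction per cluster, and that groups are ranked by their mean membership. The averaging step, the function name and the parameter names are my assumptions for illustration, not a description of the exact spreadsheet workflow.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 5  # groups with fewer individuals are left off the list


def rank_populations(samples, cluster, min_size=MIN_GROUP_SIZE, top_n=15):
    """Rank populations by mean membership in one cluster.

    samples: iterable of (population, {cluster_name: fraction}) pairs,
             one pair per individual.
    Returns up to top_n (population, mean) tuples, highest mean first.
    """
    by_pop = defaultdict(list)
    for population, memberships in samples:
        by_pop[population].append(memberships[cluster])

    # Keep only groups that meet the minimum sample size.
    means = {
        pop: sum(values) / len(values)
        for pop, values in by_pop.items()
        if len(values) >= min_size
    }
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Invented placeholder individuals, not real Eurogenes data.
    demo = (
        [("Pop X", {"North European": 0.5})] * 5
        + [("Pop Y", {"North European": 0.9})] * 4  # too few, dropped
        + [("Pop Z", {"North European": 0.25})] * 6
    )
    for pop, mean in rank_populations(demo, "North European"):
        print(f"{pop} {mean:.0%}")
```

Note how the size filter works in the demo: "Pop Y" has the highest scores but only four individuals, so it's left out, just as Estonians and Danes were left out of the list above.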
So why did I pick the results from K=8, and not some other K, like 2, 10, or 25? Well, it's not possible to evaluate who is more European without a European-specific cluster (i.e. modal in Europeans, with a low frequency outside of Europe). Provided that a decent number and range of global and West Eurasian samples are used in the analysis, such clusters begin appearing at around K=5 or K=6, and start breaking up into local clusters from about K=9. I found that runs below K=8 produced European clusters that spilled too generously outside of the borders of Europe. On the other hand, runs above K=8 produced European clusters that weren't representative of enough European groups (i.e. too localized). But the European cluster from K=8 was pretty much perfect, and I think that's obvious from the map. In fact, I can hardly believe how well it fits the modern geographic concept of Europe - north of the Mediterranean and west of the Urals. Amazing stuff.
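The "European-specific" test described above (modal in Europeans, low frequency outside Europe) can be written down as a rough check to apply to each run. This is a sketch under stated assumptions: the 10% ceiling for non-European populations is an arbitrary value picked for illustration, the function name is hypothetical, and in practice I judged the clusters by eye from the maps and spreadsheets.

```python
def european_specific_clusters(pop_means, european, max_outside=0.10):
    """Return clusters that peak in a European population and stay rare elsewhere.

    pop_means:   {population: {cluster: mean membership fraction}}
    european:    set of population names treated as European
    max_outside: assumed ceiling on the mean membership outside Europe
    """
    clusters = list(next(iter(pop_means.values())).keys())
    selected = []
    for cluster in clusters:
        # Population in which this cluster reaches its highest level.
        peak_pop = max(pop_means, key=lambda pop: pop_means[pop][cluster])
        outside = [
            means[cluster]
            for pop, means in pop_means.items()
            if pop not in european
        ]
        mean_outside = sum(outside) / len(outside) if outside else 0.0
        if peak_pop in european and mean_outside <= max_outside:
            selected.append(cluster)
    return selected


if __name__ == "__main__":
    # Invented placeholder values for a two-cluster run.
    run = {
        "Eur1": {"A": 0.80, "B": 0.20},
        "Eur2": {"A": 0.60, "B": 0.40},
        "Out1": {"A": 0.05, "B": 0.95},
        "Out2": {"A": 0.10, "B": 0.90},
    }
    print(european_specific_clusters(run, {"Eur1", "Eur2"}))
```

Running the same check on the output for each K would flag the runs where the European cluster spills too far outside Europe. Spotting a cluster that has become too localized needs a second test, for example counting how many European groups carry it at a meaningful level.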
There are two other clusters that show up across Europe in non-trivial amounts - Mediterranean and Caucasus (see maps below). These can also be thought of as native European clusters, since they've been on the continent for thousands of years. However, their peak frequencies are found in West Asia, so they're not particularly useful signals of European-specific ancestry.
So what do these three clusters show exactly? They represent certain allele frequencies in modern populations, and in fact, these can change fairly rapidly due to admixture, selection, and genetic drift. So claiming that such clusters represent pure ancient populations is unlikely to be true in most cases, if ever. However, I don't think there's anything wrong in saying that, when robust enough, they can be thought of as signals of ancestry from relatively distinct ancestral groups.
Indeed, anyone who's read up on the prehistory of Europe knows that there are three general Neolithic archeological waves to consider when trying to untangle the story of the peopling of Europe. These are Mediterranean Neolithic, Anatolian Neolithic and Forest Neolithic (for example, see here).
Mediterranean Neolithic refers to a series of migrations from West Asia via the Mediterranean and its coasts. The areas most profoundly affected by these movements include the islands of Sardinia and Corsica, and the Southwest European mainland. Anatolian Neolithic describes migrations into Europe from modern day Turkey, mostly into the Balkans, but also as far as Germany and France. At the moment, Forest Neolithic of Northeastern Europe is something of a mystery. However, the general opinion is that it was largely the result of native Mesolithic hunter-gatherers adopting agriculture.
Obviously, it's very difficult to dismiss the correlations between these three broad archeological groups and the European and two European/West Asian clusters produced in my K=8 ADMIXTURE analysis. Is it a coincidence that the Mediterranean cluster today peaks in Sardinia, which has been largely shielded from foreign admixture since the Neolithic, and today forms a very distinct Southern European isolate? Why does the North European cluster show the highest peaks in classic Forest Neolithic territory? And why does the Caucasus cluster radiate in Europe from the southeast, which is where Anatolian farmers had the greatest impact? These can't all be coincidences, and I'm willing to bet that none of them are. I'm convinced that the three clusters from my K=8 run are strong signals from the Neolithic, and the North European cluster also from the Mesolithic.
Eventually, these issues will be settled with ancient DNA data, in a much more comprehensive way than is possible using modern genomes. We've already seen some preliminary results, mostly from Mesolithic, Neolithic and Bronze Age sites around Europe, so perhaps it's useful to ask whether my ADMIXTURE analysis and commentary here mirror these early findings. I think they do. For instance, here's an interesting conclusion regarding the East Baltic area from a study on ancient Scandinavian mtDNA by Malmström et al.:
Through analysis of DNA extracted from ancient Scandinavian human remains, we show that people of the Pitted Ware culture were not the direct ancestors of modern Scandinavians (including the Saami people of northern Scandinavia) but are more closely related to contemporary populations of the eastern Baltic region. Our findings support hypotheses arising from archaeological analyses that propose a Neolithic or post-Neolithic population replacement in Scandinavia. Furthermore, our data are consistent with the view that the eastern Baltic represents a genetic refugia for some of the European hunter-gatherer populations.
I suppose there will be people wondering why I didn't take Sub-Saharan African, East Asian, and South Asian admixtures into account in my analysis. The reason is that I wasn't looking at which group was most West Eurasian, or Caucasoid. Based on everything I've seen to date, in my own work as well as elsewhere, the most West Eurasian group would probably be the French Basques from the HGDP. However, the differences between them and certain groups from Northeastern Europe, like Northern Poles and Lithuanians, really wouldn't be that great anyway. I might do a write-up about that at some point.
- Maps by Eurogenes project member FR7
- Additional stats by Eurogenes project member DESEUK1
Helena Malmström et al., Ancient DNA Reveals Lack of Continuity between Neolithic Hunter-Gatherers and Contemporary Scandinavians, Current Biology, 24 September 2009, doi:10.1016/j.cub.2009.09.017
Noreen von Cramon-Taubadel and Ron Pinhasi, Craniometric data support a mosaic model of demic and cultural Neolithic diffusion to outlying regions of Europe, Proc. R. Soc. B published online 23 February 2011, doi: 10.1098/rspb.2010.2678
Wednesday, April 11, 2012
We recently learned that many of the typically East Eurasian mtDNA lineages present in Europe today arrived there during the Neolithic, and perhaps in some cases even the Mesolithic (see here and here). It now seems that a sizeable share (around 35%) of the Sub-Saharan African mtDNA lineages found in Europe are also of prehistoric origin, going back as far as 11,000 years. However, most (around 65%) appear to have come "rather recently", as a result of contacts between Europe and Africa during the Roman Empire, the Trans-Atlantic slave trade, and so on.
Mitochondrial DNA (mtDNA) lineages of macro-haplogroup L (excluding the derived L3 branches M and N) represent the majority of the typical sub-Saharan mtDNA variability. In Europe, these mtDNAs account for <1% of the total but, when analyzed at the level of control region, they show no signals of having evolved within the European continent, an observation that is compatible with a recent arrival from the African continent. To further evaluate this issue, we analyzed 69 mitochondrial genomes belonging to various L sublineages from a wide range of European populations. Phylogeographic analyses showed that ∼65% of the European L lineages most likely arrived in rather recent historical times, including the Romanization period, the Arab conquest of the Iberian Peninsula and Sicily, and during the period of the Atlantic slave trade. However, the remaining 35% of L mtDNAs form European-specific subclades, revealing that there was gene flow from sub-Saharan Africa toward Europe as early as 11,000 yr ago.
Maria Cerezo et al., Reconstructing ancient mitochondrial DNA links between Africa and Europe, Published in Advance March 27, 2012, doi: 10.1101/gr.134452.111