Literature DB >> 31517044

Population structure of modern-day Italians reveals patterns of ancient and archaic ancestries in Southern Europe.

A Raveane^1,2, S Aneli^2,3,4, F Montinaro^2,5, G Athanasiadis⁶, S Barlera⁷, G Birolo^3,4, G Boncoraglio^8,9, A M Di Blasio¹⁰, C Di Gaetano^3,4, L Pagani^5,11, S Parolo¹², P Paschou¹³, A Piazza^3,14, G Stamatoyannopoulos¹⁵, A Angius¹⁶, N Brucato¹⁷, F Cucca¹⁶, G Hellenthal¹⁸, A Mulas¹⁹, M Peyret-Guzzon²⁰, M Zoledziewska¹⁶, A Baali²¹, C Bycroft²⁰, M Cherkaoui²¹, J Chiaroni^22,23, J Di Cristofaro^22,23, C Dina²⁴, J M Dugoujon¹⁷, P Galan²⁵, J Giemza²⁴, T Kivisild^5,26, S Mazieres²², M Melhaoui²⁷, M Metspalu⁵, S Myers²⁰, L Pereira^28,29, F X Ricaut¹⁷, F Brisighelli³⁰, I Cardinali³¹, V Grugni¹, H Lancioni³¹, V L Pascali³⁰, A Torroni¹, O Semino¹, G Matullo^3,4, A Achilli¹, A Olivieri¹, C Capelli².

Abstract

European populations display low genetic differentiation as the result of long-term blending of their ancient founding ancestries. However, it is unclear how the combination of ancient ancestries related to early foragers, Neolithic farmers, and Bronze Age nomadic pastoralists can explain the distribution of genetic variation across Europe. Populations in natural crossroads like the Italian peninsula are expected to recapitulate the continental diversity, but have been systematically understudied. Here, we characterize the ancestry profiles of Italian populations using a genome-wide dataset representative of modern and ancient samples from across Italy, Europe, and the rest of the world. Italian genomes capture several ancient signatures, including a non-steppe contribution derived ultimately from the Caucasus. Differences in ancestry composition, as the result of migration and admixture, have generated in Italy the largest degree of population structure detected so far in the continent, as well as shaping the amount of Neanderthal DNA in modern-day populations.

Entities: CellLine Chemical Disease Species

Year: 2019 PMID： 31517044 PMCID： PMC6726452 DOI： 10.1126/sciadv.aaw3492

Source DB: PubMed Journal: Sci Adv ISSN： 2375-2548 Impact factor: 14.136

INTRODUCTION

Our understanding of the events that shaped European genetic variation has been redefined by the availability of ancient DNA (aDNA). In particular, it has emerged that, in addition to the contributions of early hunter-gatherer populations, major genetic components can be traced back to Neolithic and Bronze Age expansions (). Extensive gene flow across the continent over the last few thousand years (, ) has further contributed to the correlation between geography and genetic variation observed in modern Europe (). The arrival of farming in Europe led to admixture between incoming Anatolian farmers and autochthonous hunter-gatherers, a process that generated individuals genetically close to present-day Sardinians (, ). During the Bronze Age, the dispersal of a population related to the pastoralist nomadic Yamnaya from the Pontic-Caspian steppe further markedly affected the genetic landscape of the continent (, ). This migration, supported by archeological and genetic data, has also been linked to the spread of the Indo-European languages in Europe and the introduction of several technological innovations in peninsular Eurasia (). Genetically, ancient steppe populations have been described as a combination of Eastern and Caucasus hunter-gatherer/Iran Neolithic (EHG and CHG/IN) ancestries (). However, the analysis of aDNA from Southern East Europe has identified the existence of additional contributions ultimately from the Caucasus (, ) and suggests a more complex ancient ancestry composition for Europeans (). The geographic location of Italy, enclosed between continental Europe and the Mediterranean Sea, makes the Italian people relevant for the investigation of continent-wide demographic events to complement and enrich the information provided by aDNA studies. To characterize the genetic variability of modern-day European populations and their relationship to early European foragers, Neolithic farmers, and Bronze Age nomadic pastoralists, we investigated the population structure of present-day Italians and other Europeans in terms of their ancestry composition as the result of migration and admixture, both ancient and historical. To do this, we assembled and analyzed a comprehensive genome-wide single nucleotide polymorphism (SNP) dataset composed of 1616 individuals from all 20 Italian administrative regions and more than 140 worldwide reference populations to give a total of 5192 modern-day samples (fig. S1, A and B, and data file S1), to which we added genomic data available for ancient individuals (data file S1).

RESULTS

Distinctive genetic structure in Italy

We initially investigated patterns of genetic differentiation in Italy and surrounding regions by exploring the information embedded in the SNP-based haplotypes of modern samples [full modern dataset (FMD) including 218,725 SNPs]. The phased genome-wide dataset was analyzed using the ChromoPainter (CP) and fineSTRUCTURE (fS) pipeline (see the Supplementary Materials) (, ) to generate a tree of groups of individuals with similar “copying vectors” (clusters; Fig. 1A). The fraction of pairs of individuals placed in the same cluster across multiple runs was, on average, 0.95 for Italian clusters and 0.96 across the whole set of clusters (see Materials and Methods and the Supplementary Materials). Non-European clusters were pooled into larger groups in subsequent analyses (see Materials and Methods and the Supplementary Materials).

Fig. 1

Genetic structure of the Italian populations.

Genetic structure of the Italian populations.

(A) Simplified dendrogram of 3057 Eurasian samples clustered by the fS algorithm using the CP output (complete dendrogram in fig. S1C). Each leaf represents a cluster of individuals with similar copying vectors. Clusters with more than five individuals are labeled in black. Italian clusters are color coded. Gray labels ending in the form <>_D refer to clusters containing less than five individuals or individuals of uncertain origin that have been removed in the following analyses. (B) Principal components analysis (PCA) based on the CP chunkcount matrix [colors as in (A)]. The centroid of the individuals belonging to non-Italian clusters is identified by the label for each cluster. The plot was rotated to the left by 90° to highlight the correspondence with the geography of the Italian samples. (C) Pie charts summarizing the relative proportions of inferred fS genetic clusters for all the 20 Italian administrative regions [colors as in (A)]. (D) Between-cluster FST estimates within European groups. Clusters were generated using only individuals belonging to the population analyzed (see Materials and Methods and the Supplementary Materials). The number of genetic clusters analyzed for each population is reported within brackets. For the comparisons across Europe, the cluster NEurope1 containing almost exclusively Finnish individuals was excluded (FST estimates for Italian and European clusters are in data file S2). FST distributions statistically lower than the Italian one are in colors other than green. (E) Estimated effective migration surfaces (EEMS) analysis in Southern Europe. Colors represent the log10 scale of the effective migration rate, from low (red) to high (yellow). Italian clusters separated into three main groups: Sardinia, Northern (North/Central-North Italy), and Southern Italy (South/Central-South Italy and Sicily); the first two were close to populations originally from Western Europe, while the last was closer to Middle Eastern groups (Fig. 1, A and B; figs. S1D and S2, A to C; and data file S1). These observations were confirmed using a subset of the dataset genotyped for a larger number of SNPs [high-density dataset (HDD) including 591,217 SNPs; see Materials and Methods and the Supplementary Materials; fig. S1D and data file S1]. To highlight the geographic distribution of the identified clusters along the Italian peninsula, we reconstructed the cluster composition of the various administrative regions of Italy by using the best sampling origin information available for each individual in our dataset (Fig. 1C and data file S1). Recent migrants and admixed individuals, as identified on the basis of their copying vectors (fig. S3, data file S2), were removed in subsequent CP/fS analyses (see Materials and Methods and the Supplementary Materials). A sharp north-south division in cluster distribution was detected, the separation between northern and southern areas being shifted north along the peninsula (Fig. 1B) (). The reported structure dismissed the possibility that the Central Italian populations differentiated from the Northern and Southern Italian groups (Fig. 1A) (). Individuals from Central Italy were, in fact, assigned mostly to the Southern Italian clusters, except for samples from Tuscany, which grouped instead with the Northern Italian clusters (Fig. 1, A and B) (). Contrary to previous results, no outliers were detected among the Northern Italian clusters (). We evaluated the impact of geography in shaping the genomic variability of Italy by testing the correlation between geographic and genetic coordinates, applying a Procrustes analysis (Fig. 1B; Materials and Methods). A significant correlation was observed in our dataset, in agreement with reports for Italy and Europe (, ) (correlation in Procrustes rotation, 0.77 and 0.78, P < 0.05 including or excluding Sardinia, respectively). Intracluster variation within Southern and Northern Italian clusters was comparable (data file S2). Sardinian clusters were characterized by a high proportion of genome copied by individuals from the same cluster (self-copying), in agreement with previous indications of drift in Sardinian groups (, ). Southern Italian samples showed higher among-cluster FST values than the Northern Italian ones, but lower TVD (total variation distance) values (data file S2; Materials and Methods) (, ). Sardinian clusters showed both high TVD and FST intercluster variation, combining the effects of drift and variation in ancestry composition. We compared the degree of variation among genetic clusters in Italy with those in several European countries (, –) and across the whole of Europe (Fig. 1D). Among-cluster variability in Italy was significantly higher than in any other country examined (FST; median Italy, 0.004; data file S2; range medians for listed countries, 0.0001 to 0.002) and showed differences comparable with estimates across European clusters (median European clusters, 0.004; Materials and Methods and the Supplementary Materials), as previously suggested using unilinear data (, ). The analysis of the migration surfaces [estimated effective migration surfaces (EEMS)] () highlighted several barriers to gene flow within and around Italy, but also suggested the existence of migration corridors in the southern part of the Adriatic and Ionian seas and between Sardinia, Corsica, and continental Italy (Fig. 1E and fig. S4) ().

Multiple ancient ancestries in Italian clusters

To further characterize the observed genetic structure, we investigated the ancestry composition of modern clusters. We tested different combinations of ancient putative sources using a “mixture fit” approach [non-negative least square (NNLS) algorithm] (, ). We applied this approach to ancient samples using the “unlinked” mode implemented in CP, similar to other routinely performed analyses based on genotype data, such as qpAdm and ADMIXTURE. In addition, data from modern individuals (from the FMD dataset) were harnessed as donor populations (Materials and Methods and Supplementary Materials). Following Lazaridis et al. (), we performed two separate CP/NNLS analyses, “ultimate” and “proximate,” referring to the least and the most recent putative sources, respectively (Fig. 2 and fig. S5, A to E). In the ultimate analysis, all the Italian clusters were characterized by relatively high amounts of Anatolian Neolithic (AN) contributions, ranging from 56% (SItaly1) to 72% (NItaly4), distributed along a north-south cline (Spearman ρ = 0.52, P < 0.05; Fig. 2, A to C, fig. S5A, and data file S3), with Sardinians showing values above 80%, as previously suggested (, ). A closer affinity of Northern Italian than Southern Italian clusters to AN was also supported by D-statistics (fig. S6A). The remaining ancestry was mainly assigned to WHG (western hunter-gatherer), CHG, and EHG. In particular, the first two components were more present in populations from the South of Italy (P < 0.05, Student’s t test), while the latter was higher in Northern Italian clusters (P < 0.05, Student’s t test). These observations suggest the existence of different secondary source contributions to the two edges of the peninsula, with the north affected more by EHG-related populations and the south by CHG-related groups. IN ancestry was detected in Europe only in Southern Italy (Fig. 2 and fig. S5A).

Fig. 2

Ancient ancestries in Western Eurasian modern-day clusters and Italian ancient samples.

Ancient ancestries in Western Eurasian modern-day clusters and Italian ancient samples.

CP/NNLS analysis on all Italian and European clusters using as donors different sets of ancient samples and two modern clusters (NAfrica1, North Africa; EAsia2, East Asia) [full results in fig. S5 (A and B)]. (A) Ultimate sources: AN, Anatolian Neolithic (Bar8); WHG, western hunter-gatherer (Bichon); CHG, Caucasus hunter-gatherer (KK1); EHG, Eastern hunter-gatherer (I0061); IN, Iranian Neolithic (WC1). (B) EHG and (C) CHG ancestry contributions in Western Eurasia, as inferred in (A) and figs. S8A and S5A. (D) Same as in (A), using proximate sources: WHG, western hunter-gatherer (Bichon); EEN, European Early Neolithic (Stuttgart); SBA, Bronze Age from steppe (I0231); ABA, Bronze Age from Anatolia (I2683). (E) SBA and (F) ABA ancestry contributions, as inferred in (D) and fig. S5B. Triangles refer to the location of ancient samples used as sources (data file S1). (G) Ratio of the residuals in the NNLS analysis (see Materials and Methods and the Supplementary Materials) for all the Italian and European clusters when ABA was excluded and included in the set of proximate sources; (H) as in (G), but excluding/including SBA instead of ABA. (I) Ancient Italian and other selected ancient samples projected on the components inferred from modern European individuals. Labels are placed at the centroid of the individuals belonging to the indicated clusters. North-south differences across Italy were also detected in the proximate analysis. When proximate sources were evaluated, significantly higher ABA (Anatolia Bronze Age) and SBA (Steppe Bronze Age) ancestries were detected in Southern and Northern Italy, respectively (Fig. 2, D to F, and fig. S5B; P < 0.05, Student’s t test; P < 0.05, Wilcoxon rank sum test; Supplementary Materials), in line with the results based on the D-statistics (fig. S6, A and B) and mirroring the CHG and EHG patterns, respectively (Fig. 2, A to C). Contrary to previous reports (), the occurrence of CHG as detected by our CP/NNLS analysis did not mirror the presence of SBA, with several populations testing positive for the latter but not for the former (Fig. 2 and fig. S5, A and B). When we compared this analysis and the one using a different CHG sample (SATP) (), the two were highly correlated (Spearman ρ = 0.972, P < 0.05; fig. S5F). We therefore speculate that our approach might, in general, underestimate the presence of CHG across the continent; however, we note that even considering this scenario, the excess of Caucasus-related ancestry detected in the south of the European continent, and in Southern Italy in particular, is notable and unexplained by currently proposed models for the peopling of the continent. The different impact of ABA and SBA across Italy was additionally supported by the reduced fit of the NNLS (sum of the squared residuals; Materials and Methods and Supplementary Materials) when the proximate analysis was run excluding one of these two sources. The residuals were almost twice as much for Southern Italians when ABA was not included as a source, while a substantial increase in the residual values was observed for Northern Italians when SBA was removed from the panel of proximate sources (Fig. 2, G and H). The different affinities of Southern and Northern Italians for ABA and SBA were also highlighted in the principal components analysis (PCA) and ADMIXTURE analysis on ancient and modern samples (Fig. 2I and fig. S7). These results were confirmed by the qpAdm analysis, where all the analyzed Italian clusters could be modeled as a combination of ABA, SBA, and European Middle-Neolithic/Chalcolithic populations, their contributions mirroring the pattern observed in the CP/NNLS analysis (fig. S5G and data file S4). Sardinian clusters were consistently modeled as AN + WHG + CHG/IN across runs, with the inclusion of North Africa and SBA when a different number of sources were considered. The qpAdm analyses of Italian HDD clusters generated similar results (Materials and Methods, Supplementary Materials, and data file S4). We noted that in the Balkan peninsula, signatures related to ABA were present but were less evident than in Southern Italy across modern-day populations, possibly being masked by historical contributions from Central Europe (Figs. 2 and 3 and fig. S5B) (, , ). Overall, SBA and ABA appeared to have different distribution patterns in Italy and to reflect those present in Europe as a whole: more common in North Italy and across the continent in the case of the former, more localized in the south of Europe and Italy in the case of the latter. Similar results were obtained when other Southern European ancient sources replaced ABA in the proximate analysis (see fig. S5, C to E, Materials and Methods, and the Supplementary Materials).

Fig. 3

Admixture events inferred by GT.

(A) Dates of the events inferred in the GT noItaly analysis on all the Italian clusters (labels as in Fig. 1A and data file S1; full results in fig. S8 and data file S5; see Materials and Methods and the Supplementary Materials); lines encompassed the 95% confidence interval. GT events were distinguished in “one date” (black squares; 1D in data file S5) and “one date multiway” (white squares; 1 MW). (B) Correlation values between copying vectors of first source(s) identified by GT and the best proxy in the noItaly analysis (circles) or the best proxy among Italian clusters (diamonds). (C) Same as in (B), referring to second source(s) copying vectors. Empty symbols refer to additional first (B) and second (C) sources detected in multiway events. African best proxies in (B) for clusters SItaly1 and SItaly2 were plotted on the 0.90 boundary for visualization only, the correlation values being 0.78 and 0.87, respectively. The symbols referring to the best Italian proxies for the African sources identified for clusters SItaly1, SItaly2, Sicily1, and Sardinia3 in (B) are not included as the correlation values are lower than the African ones and below the threshold used in the figure. The colors of the symbols refer to the ancestry to which proxies were assigned (see Materials and Methods and the Supplementary Materials).

Admixture events inferred by GT.

Modeling the ancestry composition of ancient Italian samples

To obtain temporal insights into the emergence of the differences between Northern and Southern Italy in relation to SBA and ABA ancestries, we performed the same qpAdm analysis on post-Neolithic/Bronze Age Italian individuals (data file S4). Iceman and Remedello, the oldest Italian samples included here [3400 to 2800 Before Common Era (BCE)], were composed of high proportions of AN (74 and 85%, respectively). The Bell Beaker samples of Northern Italy (2200 to 1930 BCE) were modeled as ABA and AN + SBA and WHG. Although ABA estimates in these samples were characterized by large standard errors (SE), the detection of steppe ancestry, at approximately 14%, was more robust. In contrast, Bell Beaker samples from Sicily (2500 to 1900 BCE) were modeled almost exclusively as ABA, with less than 5% SBA (data file S4). Despite the fact that the small number of SNPs and prehistoric individuals tested prevents the formulation of conclusive results, differences in the occurrence of AN ancestry, and possibly also of Bronze Age–related contributions, are suggested to be present between ancient samples from North and South Italy. Differences across ancient Italian samples were also supported by their projections on the PCA of modern-day data (Fig. 2I). Remedello and Iceman clustered with European Early Neolithic samples, together with one of the three Bell Beaker individuals from North Italy, as previously reported (), and modern-day Sardinians. The other two Bronze Age North Italian samples clustered with modern North Italians, while the Bell Beaker sample from Sicily was projected in between European Early Neolithic, Bronze Age Southern European, and modern-day Southern Italian samples (Fig. 2I). These results suggest a differentiation in ancient ancestry composition between different areas of Italy, dating at least in part back to the Bronze Age.

Historical admixture

To investigate the contribution of historical admixture events in shaping the modern distribution of ancient ancestries and the observed population structure, we characterized admixture events for Italian and European populations using GLOBETROTTER (GT) (Fig. 3, fig. S8, and data file S5)(). We discuss here the results based on the nonlocal () (GT “noItaly”) analysis of the FMD (data file S5) as it provides a wider coverage at the population level; concordant results were found when the HDD was similarly analyzed (data file S5). We ran the analysis excluding the Italians as donors to reduce copying between highly similar groups (Fig. 3). The events detected in Italy occurred mostly between 1000 and 2000 years ago (ya) and extended to 2500 ya in the rest of Europe (Fig. 3A and fig. S8). The profile of inferred sources varied across Italy. Clusters from the Caucasus and North-West Europe were identified for many Italian clusters as best proxies for the admixing sources in agreement with previous studies (), while Middle Eastern and African groups were detected for Southern Italy and Sardinia (Fig. 3, B and C). When Italian clusters were included among putative sources, they were as good as, or better proxies than, clusters from the Caucasus and the Middle East. On the other hand, North-West European and African clusters were mostly confirmed as better proxies than groups from any other area (Fig. 3, B and C), as observed when GT was run including all clusters as donors (“GTall” analysis; data file S5). Overall, these results supported a scenario in which gene flow mostly occurred between Italian and African/other Eurasian populations. SBA and ABA ancestries were detected in Italian and non-Italian best proxies (Figs. 2D and 3 and data file S5), which suggests that part of these ancestries arrived from outside Italy in historical times (), but also that these components were already present in Italian groups at the time of these admixture events. The timing of the admixture events and the sources involved differ between Northern and Southern Italian clusters, pointing to different admixture histories for the two areas. Episodes of gene flow were also detected in Sardinia, combining signals from both the African continent and North-West Europe. MALDER results for the more recent episodes replicated the admixture pattern identified by GT (fig. S8 and data file S5).

The Neanderthal legacy across Italy and Europe

The variation in ancestry composition reported across Italy and Europe is expected to influence other aspects of the genetic profiles of European populations, including the presence of archaic genetic material (). We investigated the degree of Neanderthal ancestry in Italian and other Eurasian populations (data file S1) by focusing on SNPs tagging Neanderthal introgressed regions (). SNPs were pruned for linkage disequilibrium (LD), and a final set of 3969 SNPs was used to estimate the number of Neanderthal alleles in samples genotyped for the Infinium Omni2.5-8 Illumina Beadchip. Asian and Northern European populations had significantly more Neanderthal alleles than European and Southern European groups, as previously reported (, ), with significant differences also highlighted within Italy (Fig. 4, A and B). Contributions from the African groups may have influenced these patterns, particularly in Southern European populations () (Figs. 2 and 3). However, differences within Europe and Italy were still present once individuals belonging to clusters with African contributions were removed (see fig. S9, A and B, Materials and Methods, and the Supplementary Materials). Ancient samples have been reported to differ in their amount of Neanderthal DNA partially because of the variation in proportion of the “Basal Eurasian” lineage, which harbors only a negligible fraction of Neanderthal ancestry (). Consistent with this (), we found the estimated amounts of Basal Eurasian and Neanderthal to be negatively correlated across modern-day European clusters and populations (Fig. 4C and fig. S9, E to H). Our estimation of Basal Eurasian ancestry might be affected by non-Eurasian contributions, which could therefore be partially responsible for the observed Neanderthal patterns. The correlation between the Basal Eurasian component and Neanderthal allele sharing was still present once populations with African and East Asian contributions (Fig. 3 and fig. S8) were removed (Fig. 2 and fig. S9D). These results suggest that the pattern of Neanderthal ancestry observed in Italy and across Europe has been shaped by different factors: variation in the amount of Basal Eurasian present in ancient sources, variation in the ancient ancestry composition of modern samples, and variation in the historical contribution from Africa and East Asia.

Fig. 4

Neanderthal ancestry distribution in Eurasian populations.

Neanderthal ancestry distribution in Eurasian populations.

(A) Neanderthal allele counts in individuals from Eurasian populations, sorted by median values on 3969 LD-pruned Neanderthal tag-SNPs. CEU, Utah Residents with Northern and Western European ancestry; GBR, British in England and Scotland; FIN, Finnish in Finland; IBS, Iberian Population in Spain; TSI, Tuscans from Italy; ITN, Italians from Northern Italy; ITC, Italians from Central Italy; ITS, Italians from Southern Italy; SAR, Italians from Sardinia; CHB, Han Chinese. (B) Matrix of significances based on Wilcoxon rank sum test between pairs of populations including (lower triangular matrix) and removing (upper) outliers (see Materials and Methods and Supplementary Materials; dark blue, adjusted P < 0.05; light blue, adjusted P > 0.05). Colored squares at the sides of the heatmap refer to the populations compared, as per Fig. 4A. (C) Correlation between Neanderthal ancestry proportions and the amount of Basal Eurasian ancestry in European clusters (see Materials and Methods and the Supplementary Materials). (D and E) Neanderthal allele frequency (AF) for selected SNPs within the indicated genes: (D) high-frequency alleles in Europe and (E) North-South Europe divergent alleles. (F) Comparisons between Northern European and Italian populations (excluding Sardinia). Bars refer to comparison for reported pairs of populations; the number of NTT SNPs is reported within bars. Each section of the circos represents a tested chromosome; points refer to NTT SNPs. Colors are the same as for bars; igr, intergenic region variant. The variation in Neanderthal ancestry was also evident when SNP frequencies were evaluated. A total of 144 SNPs were identified among the Neanderthal-tag SNPs showing the largest differences in allelic frequency in genome-wide comparisons across Eurasian and African populations (Materials and Methods; Supplementary Materials, Neanderthal-tag SNPs within the top 1% of the genome-wide distributions of each of the 55 pairwise population comparisons—NTT SNPs; fig. S9I, data file S6). The top 1% of each distribution was significantly depleted in Neanderthal SNPs (Materials and Methods, Supplementary Materials, and data file S6), in agreement with a scenario of Neanderthal mildly deleterious variants being removed more efficiently in human populations (). The 50 genes containing NTT SNPs were enriched for phenotypes related to facial morphology, body size, metabolism, and muscular diseases (Materials and Methods, Supplementary Materials, and data file S6). A total of 34 NTT SNPs were found to have at least one known phenotypic association (data file S6) (). Among these, we found Neanderthal alleles associated with increased gene expression in testis and in skin after sun exposure (SNPs within the IP6K3 and ITPR3 genes), susceptibility to cardiovascular and renal conditions (AGTR1), and Brittle cornea syndrome (PRDM5) (). NTT SNPs between European and Asian/African populations included previously reported variants in BNC2 () and SPATA18 genes (, ) (Materials and Methods, Supplementary Materials, and Fig. 4D), while 80 NTT SNPs were involved in at least one comparison between Northern (CEU, GBR, and FIN) and Southern European populations (IBS and Italian groups). Among these SNPs, three mapped to the Neanderthal introgressed haplotype hosting the PLA2R1 gene (), the archaic allele at these positions reaching frequencies of at least 43% in Northern European and at most 35% in Southern European populations (Fig. 4, E and F). Ten SNPs showed an opposite frequency gradient: seven mapped to one Neanderthal-introgressed region spanning the OR51F1, OR51F2, and OR52R1 genes (Fig. 4, E and F), and the other three identified regions hosting the AKAP13 gene, within one of the high-frequency European Neanderthal introgressed haplotypes recently reported (Fig. 4, E and F) (). The locus-specific differences explored here might be the result of selection-based evolutionary dynamics combined with the demographic events that shaped the more general genome-wide patterns present across Italy and Europe.

DISCUSSION

The pattern of variation reported across Italian groups appears geographically structured across three main regions in Italy: Southern, Northern, and Sardinia. Similarly to Europe, the genetic structure reflects isolation by distance following population movements since prehistoric times () and historical admixture from the fringes of the continent (). The analysis of both modern and ancient data suggests that in Italian populations, ancestries related to CHG and EHG derive from at least two sources. One is the well-characterized steppe (SBA) signature associated with nomadic groups from the Pontic-Caspian steppes. This component reached Italy from mainland Europe at least as long ago as the Bronze Age, as suggested by its presence in Bell Beaker samples from North Italy (data file S4). The other contribution is ultimately associated with CHG ancestry, as previously suggested (), and predominantly affected Southern Italy, where it represents a substantial component of the ancestry profile of local modern populations. Although the details of the origins of this signature are still uncharacterized, it may have been present as early as the Bronze Age in Southern Italy (data file S4). The very low presence of CHG signatures in Sardinia and in older Italian samples (Remedello and Iceman), but its occurrence in modern-day Southern Italians, might be explained by different scenarios not mutually exclusive: (i) population structure among early foraging groups across Italy, reflecting different affinities to CHG; (ii) the presence in Italy of different Neolithic contributions, characterized by a different proportion of CHG-related ancestry; (iii) the combination of a post-Neolithic, prehistoric CHG-enriched contribution with a previous AN-related Neolithic layer; and (iv) a substantial historical contribution from Southeastern Europe across the whole of Southern Italy. No major structure has been highlighted so far in pre-Neolithic Italian samples (). An arrival of the CHG-related component in Southern Italy from the Southern part of the Balkan Peninsula, including the Peloponnese, is compatible with the identification of genetic corridors linking the two regions (Fig. 1E) () and the presence of Southern European ancient signatures in Italy (Fig. 2). The temporal appearance of CHG signatures in Anatolia and Southern East Europe in the Late Neolithic/Bronze Age suggests its relevance for post-Neolithic contributions (). Our results suggest contributions from ancestries additional to the three “canonical” ones considered so far in the literature (WHG, AN, and SBA). The differential distribution of these ancestries contributed to the differentiation observed between Northern and Southern Italian clusters. Additional analyses of aDNA samples from around this time in Italy are expected to clarify what ancient scenario might best support the structure related to ancient ancestry composition presented here. Historical events possibly involving continental groups at the end of the Roman Empire and African contributions following the establishment of Arab kingdoms in Southern Europe around 1300 to 1200 ya (, , , , ) played a role in further shaping the ancestry profiles and population structure of Italians (Fig. 3). In particular, African contributions might have contributed to the increased diversity detected among clusters in Southern Italy and Sardinia (Fig. 3 and data file S2) (). Significantly, despite Sardinia being confirmed as the most closely related population to Early European Neolithic farmers (Fig. 2, D and I), there is no evidence for a simple genetic continuity between the two groups. Populations in Sardinia were not completely isolated and, like the rest of Italy, experienced historical episodes of gene flow (Figs. 2 and 3 and data file S4) that contributed to the further dispersal of ancient ancestries and the introduction of other components, including African ones. It has been previously reported that variation in effective population size might explain differences in the amount of Neanderthal DNA detected in European and Asian populations (, ). Additional Neanderthal introgression events in Asia and gene flow from populations with lower Neanderthal ancestry in Europe may provide further explanations for differences in Neanderthal occurrence across populations. The spatial heterogeneity of Neanderthal legacy within Europe reported here appears to be the result of ancient and historical events that brought together in different combinations groups harboring different amounts of Neanderthal genetic material. While these events have shaped the overall continental distribution of Neanderthal DNA, locus-specific differences in the occurrence of Neanderthal alleles are also expected to reflect selective pressures acting on these variants since their introgression in the populations (). The Bronze Age migration from the Pontic-Caspian steppe region has been linked to the arrival of the Indo-European languages in mainland Europe. Our identification of an additional substantial component in Italy possibly arriving in the Bronze Age raises the possibility of multiple Indo-European waves into the continent. Similarly, the persistence in Italy of non-Indo European languages into historical times (e.g., Etruscan) could be linked to reduced SBA penetration along the peninsula. These associations, while fascinating, will require dedicated and multidisciplinary approaches to be properly explored and validated. In particular, ancient samples spanning diachronic, geographic, and cultural transects in the Italian peninsula and nearby regions will need to be analyzed to complement the interpretative framework proposed here for the processes that have shaped Southern European variation.

MATERIALS AND METHODS

Analysis of modern samples

Dataset

Two hundred twenty-two samples are presented here. Of these, 167 Italians and 6 Albanians were specifically selected and sequenced for this project with two versions (1.2 and 1.3) of the Infinium Omni2.5-8 Illumina beadchip, while 49 Italians and Europeans were genotyped with the Human660W-Quad BeadChip in the frame of another research (data file S1) (). Two separate worldwide datasets were prepared. The FMD included 4852 samples (, , , –) (1589 Italians) and 218,725 SNPs genotyped with Illumina arrays; the HDD contained 1651 samples (, , , , , , , ) (524 Italians) and 591,217 SNPs genotyped with the Illumina Omni array (Supplementary Materials). The merging, the removal of ambiguous C/G and A/T and triallelic markers, the exclusion of related individuals, and the discarding of SNPs in LD were performed using PLINK 1.9 (). Only autosomal markers were considered.

Haplotype analysis (CP and fS)

We investigated patterns of genetic differentiation in Italy by exploring the information provided by SNP-based haplotypes. Phased haplotypes were generated using SHAPEIT 2 () and by applying the HapMap b37 genetic map. CP was used to generate a matrix of recipient individuals “painted” as a combination of donor samples (copying vector). Three runs of CP were done for each dataset generating three different outputs: (i) a matrix of all the individuals painted as a combination of all the individuals, for cluster identification and GT analysis; (ii) a matrix of all Italians as a combination of all Italians, for FST analysis; and (iii) a matrix of all the samples as a combination of all the other samples but excluding Italians, for noItaly GT analysis. The median numbers of SNPs per painted fragment were 13.7 and 31.6 for the FMD and HDD, respectively. Clusters were inferred using fS. After an initial search based on the “greedy” mode, the dendrogram was processed by visual inspection (, ) according to the geographical origin of the samples. The robustness of the cluster was obtained by processing the Markov Chain Monte Carlo (MCMC) pairwise coincidence matrix (Supplementary Materials).

Cluster self-copy analysis

Recently admixed individuals were identified as those copying from members of the cluster to which they belong less than the amount of cluster self-copying for samples with all four grandparents from the same geographic region (Supplementary Materials).

FSTand TVD within and between Italian clusters

To have a comprehensive overview of the genetic diversity in Italy, we estimated the pairwise FST within Italian clusters using smartpca implemented in EIGENSOFT (). TVD estimates were obtained using the TVD function () on the CP chunklength matrix. Pairwise genetic diversity among clusters was computed estimating pairwise FST and TVD metrics on Italian clusters belonging to the same or to different macroareas. In detail, the NItaly macroarea comprised clusters named as NItaly and NCItaly; the SItaly macroarea included SItaly, SCItaly, and Sicily named clusters; while the Sardinia macroarea included only the Sardinia clusters.

FSTestimates among clusters

Pairwise FST estimates among newly generated Italian clusters and originally generated European clusters (Supplementary Materials) were inferred using the smartpca software implemented in the EIGENSOFT package (). Comparisons between the FST distributions were performed using a Wilcoxon rank sum test in the R programming language environment.

Principal components analysis

PCA was performed on the CP chunkcount matrix (Supplementary Materials) and was generated using the prcomp() function on R software. Allele frequency PCA was performed using smartpca implemented in EIGENSOFT () after pruning the datasets for LD.

Procrustes analysis

To validate the correlation observed between the haplotype-based PCA (Fig. 1) and the cluster distribution (Fig. 1C), we performed a symmetric Procrustes analysis with 100,000 permutations. In detail, we used the values of the first two PCs of the PCA estimated on the CP chunkcount matrix generated using only Italian individuals for which the place of origin (administrative region) was available along with the geographic coordinates of the administrative center (“capoluogo di regione”) of the region to which they were assigned to on the basis of available information (data file S1).

Characterization of the migration landscape (EEMS analysis)

We highlighted the spatial patterns of genetic differentiation by EEMS analysis (). This was performed estimating the average pairwise distances between populations using the bed2diffs tool, and the resulting output was visualized using the Reems package ().

ADMIXTURE analysis

ADMIXTURE 1.3.0 software () was used, performing 10 different runs using a random seed. The results were combined with CLUMPAK () using the largeKGreedy algorithm and random input orders with 10,000 repeats. Distruct for many K’s implemented in CLUMPAK () was then used to identify the best alignment of CLUMPP results. Results were processed using the R statistical software.

The time and the sources of admixture events (GT and MALDER analyses)

Times of admixture events were investigated using the GLOBETROTTERv2 software. GT was utilized using two approaches: complete and nonlocal (referred to as noItaly; Supplementary Materials) in default modality (, ). The difference between the two approaches was the inclusion or the exclusion respectively of all the Italian clusters as donors in the CP matrix used as the input file. To improve the precision of the admixture signals, the “null.ind: 1” parameter was set (). Unclear signals were corrected using the default parameters, and a total of 100 bootstraps were performed. MALDER () uses allele frequencies to dissect the time of admixture signals. The best amplitude was identified and used to calculate a Z score (Supplementary Materials). A Z score equal to or lower than 2 identifies not significantly different amplitude curves (Supplementary Materials) (). Sources for both GT and MALDER were grouped in different ancestries as indicated in the legend of Fig. 3 and fig. S8. The expression [1950 − (g + 1)*29], where g is the number of generation, was used to convert the GT and MALDER results into years.

Analyses including ancient samples

To explore the extent to which the European and Italian genetic variation has been shaped by ancient demographic events, we merged modern samples from FMD with 63 ancient samples selected from recent studies (data file S1) (, , , , , –).

Principal component analysis (PCA)

To visualize the genetic affinities of ancient and modern samples, we performed two PCAs with the EIGENSOFT () smartpca software and the “lsqproject” and “shrinkmode” option, projecting the ancient samples onto components inferred from modern European, West Asian, and Caucasian individuals and then only on the modern European clusters. To evaluate the potential impact of DNA damage in calling variants from aDNA samples, we repeated the PCA with the 63 ancient samples and the modern European, Caucasian, and West Asian samples by removing transition polymorphisms and recorded significant correlations for the localization of ancient samples along PC1 and PC2 (Pearson r > 0.99, P < 0.05). We explored the genetic relationships between modern and ancient samples by projecting the ancient samples on the previously inferred ancestral allele frequencies from 10 ADMIXTURE () runs on modern samples (see the “Analysis of modern samples” section and the Supplementary Materials). We used CLUMPP for merging the resulting matrices and distruct for the visualization in CLUMPAK program ().

D-statistics

We tested for admixture using the D-statistics as implemented in the qpDstat tool in the software ADMIXTOOLS v4.2 (). We performed the D-statistics analyses evaluating the relationship of the Italian cluster with AN, ABA, and SBA. In detail, we performed the D-statistics D(Ita1,Ita2,AN/ABA/SBA,Mbuti), where Ita1 and Ita2 are the different clusters composed mainly of Italian individuals as inferred by fS.

CP/NNLS analysis

We used an approach based on the CP software () and a slight adaptation of the NNLS function (, ) to estimate the proportions of the genetic contributions from ancient population to our modern clusters. We ran CP using the unlinked mode () with the same Ne and θ parameters of the modern dataset, painting both modern and ancient individuals and using modern samples as donors (, ). We analyzed the output of CP by solving an appropriately formulated NNLS problem, reconstructing the modern clusters in terms of the ancients. We applied this combined approach on different sets of ancient samples (ultimate and proximate sources). Goodness of fit was measured evaluating the residuals of the NNLS analysis. In detail, we focused on the proximate sources and compared the sum of squared residuals when ABA or SBA was included/excluded as putative sources. Furthermore, for the ultimate and proximate analyses, we estimated the SEs by applying a weighted jackknife bootstrap (data file S3), estimating the mixture profile removing one chromosome at time and averaging the values taking into account the total number of markers analyzed for iteration ().

qpAdm analysis

We used the ancestral reconstruction method qpAdm, which harnesses different relationships of populations related to a set of outgroups (e.g., f4[Target, O1, O2, O3]) () to model the ancestry composition of modern and ancient Italian samples as different combinations of ancient sources. In detail, for each tested cluster of the FMD and HDD, we evaluated all the possible combinations of N “left” sources with N = {2..5} and one set of right/left outgroups (see the Supplementary Materials) (). For each of the tested combinations, we used qpWave to evaluate whether the set of chosen outgroups is able to (i) discriminate the combinations of sources and (ii) establish if the target may be explained by the sources. We used a P value threshold of 0.01. Last, we used qpAdm to infer the admixture proportions and reported it and the associated SEs in data file S4. In addition, we performed the same analysis with N = {2..4} for Iceman, Remedello, and Bell Beaker individuals from Sicily and North Italy (data file S4).

Archaic contribution

We assembled an additional HDD by retaining only samples genotyped on the Illumina Infinium Omni2.5-8 BeadChip from our larger modern dataset. In particular, we included seven populations from the 1000 Genomes Project: the five European populations (Northern European from Utah, CEU; England, GBR; Finland, FIN; Spain, IBS; Italy from Tuscany, TSI), one from Asia (Han Chinese, CHB), and one from Africa (Yoruba from Nigeria, YRI). We also retained 466 Italian samples, whose four grandparents were born in the same Italian region. The Italian samples were broadly clustered according to their geographical origin into Northern, Central, and Southern Italians, and Sardinians, while TSI samples from the 1000 Genome Project formed a separate cluster (data file S1). From this dataset, we extracted 7164 Neanderthal SNPs tagging Neanderthal introgressed regions (). To select which allele was inherited from Neanderthals, we chose the one from the Altai Neanderthal () genome when it was homozygous and the minor allele in YRI when it was heterozygous.

Number of Neanderthal alleles in present-day human populations

To provide a direct (relative) estimate of the Neanderthal DNA present in different individuals, we initially pruned Neanderthal tagging SNPs in LD and counted the number of Neanderthal alleles considering all the tag-SNP across all samples. Then, we compared the distribution of Neanderthal allele counts across populations with the two-sample Wilcoxon rank sum test. We repeated the same analyses after removing outlier individuals.

Basal Eurasian ancestry and Neanderthal contribution

To infer the proportion of Basal Eurasian present in European populations and investigate its impact in shaping variation in the Neanderthal legacy across populations (, ), we used the f4 ratio implemented in the ADMIXTOOLS package () in the form f4(Target, Loschbour, Ust_Ishim, Kostenki14)/f4(Mbuti, Loschbour, Ust_Ishim, Kostenki14). We repeated this approach to infer the Neanderthal ancestry in the form f4(Mbuti, Chimp Target, Altai)/f4(Mbuti, Chimp, Dinka, Altai) (fig. S9, E to H). We then performed the same analyses by grouping the modern individuals according to the CP/fS inferred clusters (see the “Analysis of modern samples” section) and retained only clusters with at least 10 samples (Fig. 4C).

African ancestry and Neanderthal legacy

The impact of African contributions in shaping the amount of Neanderthal occurrence was evaluated by exploring how the removal of the clusters showing African gene flow as detected by the GT analysis (Fig. 3), and of individuals belonging to these clusters, affected the correlation between Basal Eurasian/Neanderthal estimates and the degree of population differentiation in the amount of Neanderthal alleles, respectively (Supplementary Materials and fig. S9, A to D).

Comparison of Neanderthal allele frequencies across modern populations

We explored significant differences in the frequencies of Neanderthal alleles across populations by computing the allele frequency differences for every SNP for each of the possible pairs of the 11 populations in our dataset, thus obtaining 55 distributions (see the Supplementary Materials). Then, the NTT SNPs, i.e., the Neanderthal-tag SNPs in the top 1% of each distribution, were selected (data file S6).

The biological implications of Neanderthal introgression

Given the list of genes overlapping the Neanderthal introgressed regions harboring the NTT SNPs and the list of genes directly harboring the NTT SNPs, we performed different enrichment tests with the online tool EnrichR (). Particularly, we searched for significant enrichments compared to the human genome using the EnrichR collection of database, e.g., dbGaP, Panther 2016, HPO, and KEGG 2016 (data file S6). We then investigated known direct associations between the Neanderthal alleles of the NTT SNPs and phenotypes by looking in the GWAS and PheWAS catalogs (https://phewascatalog.org/phewas) and applying the PheGenI tool (https://www.ncbi.nlm.nih.gov/gap/phegeni) (data file S6). We used the circos representation as in Kanai et al. () to highlight different sets of NTT SNPs (Fig. 4F).

61 in total

1. The genome-wide structure of the Jewish people.

Authors: Doron M Behar; Bayazit Yunusbayev; Mait Metspalu; Ene Metspalu; Saharon Rosset; Jüri Parik; Siiri Rootsi; Gyaneshwer Chaubey; Ildus Kutuev; Guennady Yudkovsky; Elza K Khusnutdinova; Oleg Balanovsky; Ornella Semino; Luisa Pereira; David Comas; David Gurwitz; Batsheva Bonne-Tamir; Tudor Parfitt; Michael F Hammer; Karl Skorecki; Richard Villems
Journal: Nature Date: 2010-06-09 Impact factor: 49.962

2. The phenotypic legacy of admixture between modern humans and Neandertals.

Authors: Corinne N Simonti; Benjamin Vernot; Lisa Bastarache; Erwin Bottinger; David S Carrell; Rex L Chisholm; David R Crosslin; Scott J Hebbring; Gail P Jarvik; Iftikhar J Kullo; Rongling Li; Jyotishman Pathak; Marylyn D Ritchie; Dan M Roden; Shefali S Verma; Gerard Tromp; Jeffrey D Prato; William S Bush; Joshua M Akey; Joshua C Denny; John A Capra
Journal: Science Date: 2016-02-12 Impact factor: 47.728

3. Genetic analysis of quantitative traits in the Japanese population links cell types to complex human diseases.

Authors: Masahiro Kanai; Masato Akiyama; Atsushi Takahashi; Nana Matoba; Yukihide Momozawa; Masashi Ikeda; Nakao Iwata; Shiro Ikegawa; Makoto Hirata; Koichi Matsuda; Michiaki Kubo; Yukinori Okada; Yoichiro Kamatani
Journal: Nat Genet Date: 2018-02-05 Impact factor: 38.330

4. Worldwide human relationships inferred from genome-wide patterns of variation.

Authors: Jun Z Li; Devin M Absher; Hua Tang; Audrey M Southwick; Amanda M Casto; Sohini Ramachandran; Howard M Cann; Gregory S Barsh; Marcus Feldman; Luigi L Cavalli-Sforza; Richard M Myers
Journal: Science Date: 2008-02-22 Impact factor: 47.728

5. Massive migration from the steppe was a source for Indo-European languages in Europe.

Authors: Wolfgang Haak; Iosif Lazaridis; Nick Patterson; Nadin Rohland; Swapan Mallick; Bastien Llamas; Guido Brandt; Susanne Nordenfelt; Eadaoin Harney; Kristin Stewardson; Qiaomei Fu; Alissa Mittnik; Eszter Bánffy; Christos Economou; Michael Francken; Susanne Friederich; Rafael Garrido Pena; Fredrik Hallgren; Valery Khartanovich; Aleksandr Khokhlov; Michael Kunst; Pavel Kuznetsov; Harald Meller; Oleg Mochalov; Vayacheslav Moiseyev; Nicole Nicklisch; Sandra L Pichler; Roberto Risch; Manuel A Rojo Guerra; Christina Roth; Anna Szécsényi-Nagy; Joachim Wahl; Matthias Meyer; Johannes Krause; Dorcas Brown; David Anthony; Alan Cooper; Kurt Werner Alt; David Reich
Journal: Nature Date: 2015-03-02 Impact factor: 49.962

6. Tracing the route of modern humans out of Africa by using 225 human genome sequences from Ethiopians and Egyptians.

Authors: Luca Pagani; Stephan Schiffels; Deepti Gurdasani; Petr Danecek; Aylwyn Scally; Yuan Chen; Yali Xue; Marc Haber; Rosemary Ekong; Tamiru Oljira; Ephrem Mekonnen; Donata Luiselli; Neil Bradman; Endashaw Bekele; Pierre Zalloua; Richard Durbin; Toomas Kivisild; Chris Tyler-Smith
Journal: Am J Hum Genet Date: 2015-05-28 Impact factor: 11.025

7. Genetic origins of the Minoans and Mycenaeans.

Authors: Iosif Lazaridis; Alissa Mittnik; Nick Patterson; Swapan Mallick; Nadin Rohland; Saskia Pfrengle; Anja Furtwängler; Alexander Peltzer; Cosimo Posth; Andonis Vasilakis; P J P McGeorge; Eleni Konsolaki-Yannopoulou; George Korres; Holley Martlew; Manolis Michalodimitrakis; Mehmet Özsait; Nesrin Özsait; Anastasia Papathanasiou; Michael Richards; Songül Alpaslan Roodenberg; Yannis Tzedakis; Robert Arnott; Daniel M Fernandes; Jeffery R Hughey; Dimitra M Lotakis; Patrick A Navas; Yannis Maniatis; John A Stamatoyannopoulos; Kristin Stewardson; Philipp Stockhammer; Ron Pinhasi; David Reich; Johannes Krause; George Stamatoyannopoulos
Journal: Nature Date: 2017-08-02 Impact factor: 49.962

8. The genetic history of Ice Age Europe.

Authors: Qiaomei Fu; Cosimo Posth; Mateja Hajdinjak; Martin Petr; Swapan Mallick; Daniel Fernandes; Anja Furtwängler; Wolfgang Haak; Matthias Meyer; Alissa Mittnik; Birgit Nickel; Alexander Peltzer; Nadin Rohland; Viviane Slon; Sahra Talamo; Iosif Lazaridis; Mark Lipson; Iain Mathieson; Stephan Schiffels; Pontus Skoglund; Anatoly P Derevianko; Nikolai Drozdov; Vyacheslav Slavinsky; Alexander Tsybankov; Renata Grifoni Cremonesi; Francesco Mallegni; Bernard Gély; Eligio Vacca; Manuel R González Morales; Lawrence G Straus; Christine Neugebauer-Maresch; Maria Teschler-Nicola; Silviu Constantin; Oana Teodora Moldovan; Stefano Benazzi; Marco Peresani; Donato Coppola; Martina Lari; Stefano Ricci; Annamaria Ronchitelli; Frédérique Valentin; Corinne Thevenet; Kurt Wehrberger; Dan Grigorescu; Hélène Rougier; Isabelle Crevecoeur; Damien Flas; Patrick Semal; Marcello A Mannino; Christophe Cupillard; Hervé Bocherens; Nicholas J Conard; Katerina Harvati; Vyacheslav Moiseyev; Dorothée G Drucker; Jiří Svoboda; Michael P Richards; David Caramelli; Ron Pinhasi; Janet Kelso; Nick Patterson; Johannes Krause; Svante Pääbo; David Reich
Journal: Nature Date: 2016-05-02 Impact factor: 49.962

9. Characterization of the biological processes shaping the genetic structure of the Italian population.

Authors: Silvia Parolo; Antonella Lisa; Davide Gentilini; Anna Maria Di Blasio; Simona Barlera; Enrico B Nicolis; Giorgio B Boncoraglio; Eugenio A Parati; Silvia Bione
Journal: BMC Genet Date: 2015-11-09 Impact factor: 2.797

10. Insular Celtic population structure and genomic footprints of migration.

Authors: Ross P Byrne; Rui Martiniano; Lara M Cassidy; Matthew Carrigan; Garrett Hellenthal; Orla Hardiman; Daniel G Bradley; Russell L McLaughlin
Journal: PLoS Genet Date: 2018-01-25 Impact factor: 5.917

24 in total

1. Ancient Rome: A genetic crossroads of Europe and the Mediterranean.

Authors: Margaret L Antonio; Ziyue Gao; Hannah M Moots; Ron Pinhasi; Jonathan K Pritchard; Michaela Lucci; Francesca Candilio; Susanna Sawyer; Victoria Oberreiter; Diego Calderon; Katharina Devitofranceschi; Rachael C Aikens; Serena Aneli; Fulvio Bartoli; Alessandro Bedini; Olivia Cheronet; Daniel J Cotter; Daniel M Fernandes; Gabriella Gasperetti; Renata Grifoni; Alessandro Guidi; Francesco La Pastina; Ersilia Loreti; Daniele Manacorda; Giuseppe Matullo; Simona Morretta; Alessia Nava; Vincenzo Fiocchi Nicolai; Federico Nomi; Carlo Pavolini; Massimo Pentiricci; Philippe Pergola; Marina Piranomonte; Ryan Schmidt; Giandomenico Spinola; Alessandra Sperduti; Mauro Rubini; Luca Bondioli; Alfredo Coppa
Journal: Science Date: 2019-11-08 Impact factor: 47.728

2. Archaeogenomic distinctiveness of the Isthmo-Colombian area.

Authors: Marco Rosario Capodiferro; Bethany Aram; Alessandro Raveane; Nicola Rambaldi Migliore; Giulia Colombo; Linda Ongaro; Javier Rivera; Tomás Mendizábal; Iosvany Hernández-Mora; Maribel Tribaldos; Ugo Alessandro Perego; Hongjie Li; Christiana Lyn Scheib; Alessandra Modi; Alberto Gòmez-Carballa; Viola Grugni; Gianluca Lombardo; Garrett Hellenthal; Juan Miguel Pascale; Francesco Bertolini; Gaetano Salvatore Grieco; Cristina Cereda; Martina Lari; David Caramelli; Luca Pagani; Mait Metspalu; Ronny Friedrich; Corina Knipper; Anna Olivieri; Antonio Salas; Richard Cooke; Francesco Montinaro; Jorge Motta; Antonio Torroni; Juan Guillermo Martín; Ornella Semino; Ripan Singh Malhi; Alessandro Achilli
Journal: Cell Date: 2021-03-23 Impact factor: 41.582

3. Late-onset neonatal sepsis: genetic differences by sex and involvement of the NOTCH pathway.

Authors: Timothy H Ciesielski; Xueyi Zhang; Scott M Williams; Giorgio Sirugo; Alessandra Tacconelli; Irja Lutsar; Vincent Meiffredy de Cabre; Emmanuel Roilides; Cinzia Ciccacci; Paola Borgiani; William K Scott
Journal: Pediatr Res Date: 2022-07-14 Impact factor: 3.953

4. Helena's Many Daughters: More Mitogenome Diversity behind the Most Common West Eurasian mtDNA Control Region Haplotype in an Extended Italian Population Sample.

Authors: Martin Bodner; Christina Amory; Anna Olivieri; Francesca Gandini; Irene Cardinali; Hovirag Lancioni; Gabriela Huber; Catarina Xavier; Maria Pala; Alessandro Fichera; Lisa Schnaller; Mario Gysi; Stefania Sarno; Davide Pettener; Donata Luiselli; Martin B Richards; Ornella Semino; Alessandro Achilli; Antonio Torroni; Walther Parson
Journal: Int J Mol Sci Date: 2022-06-16 Impact factor: 6.208

5. Revealing the recent demographic history of Europe via haplotype sharing in the UK Biobank.

Authors: Edmund Gilbert; Ashwini Shanmugam; Gianpiero L Cavalleri
Journal: Proc Natl Acad Sci U S A Date: 2022-06-13 Impact factor: 12.779

6. First Bronze Age Human Mitogenomes from Calabria (Grotta Della Monaca, Southern Italy).

Authors: Francesco Fontani; Elisabetta Cilli; Fabiola Arena; Stefania Sarno; Alessandra Modi; Sara De Fanti; Adam Jon Andrews; Adriana Latorre; Paolo Abondio; Felice Larocca; Martina Lari; David Caramelli; Emanuela Gualdi-Russo; Donata Luiselli
Journal: Genes (Basel) Date: 2021-04-25 Impact factor: 4.096

7. The genetic structure of the Turkish population reveals high levels of variation and admixture.

Authors: M Ece Kars; A Nazlı Başak; O Emre Onat; Kaya Bilguvar; Jungmin Choi; Yuval Itan; Caner Çağlar; Robin Palvadeau; Jean-Laurent Casanova; David N Cooper; Peter D Stenson; Alper Yavuz; Hakan Buluş; Murat Günel; Jeffrey M Friedman; Tayfun Özçelik
Journal: Proc Natl Acad Sci U S A Date: 2021-09-07 Impact factor: 11.205

8. The mitogenome portrait of Umbria in Central Italy as depicted by contemporary inhabitants and pre-Roman remains.

Authors: Alessandra Modi; Hovirag Lancioni; Irene Cardinali; Marco R Capodiferro; Nicola Rambaldi Migliore; Abir Hussein; Christina Strobl; Martin Bodner; Lisa Schnaller; Catarina Xavier; Ermanno Rizzi; Laura Bonomi Ponzi; Stefania Vai; Alessandro Raveane; Bruno Cavadas; Ornella Semino; Antonio Torroni; Anna Olivieri; Martina Lari; Luisa Pereira; Walther Parson; David Caramelli; Alessandro Achilli
Journal: Sci Rep Date: 2020-07-01 Impact factor: 4.379

9. Genome-wide analysis of Corsican population reveals a close affinity with Northern and Central Italy.

Authors: Erika Tamm; Julie Di Cristofaro; Stéphane Mazières; Erwan Pennarun; Alena Kushniarevich; Alessandro Raveane; Ornella Semino; Jacques Chiaroni; Luisa Pereira; Mait Metspalu; Francesco Montinaro
Journal: Sci Rep Date: 2019-09-19 Impact factor: 4.379

10. Dutch population structure across space, time and GWAS design.

Authors: Ross P Byrne; Wouter van Rheenen; Leonard H van den Berg; Jan H Veldink; Russell L McLaughlin
Journal: Nat Commun Date: 2020-09-11 Impact factor: 14.919