Laurits Skov, Ruoyun Hui, Vladimir Shchur, Asger Hobolth, Aylwyn Scally, Mikkel Heide Schierup, Richard Durbin.
Abstract
Human populations outside of Africa have experienced at least two bouts of introgression from archaic humans, from Neanderthals and Denisovans. In Papuans there is prior evidence of both these introgressions. Here we present a new approach to detect segments of individual genomes of archaic origin without using an archaic reference genome. The approach is based on a hidden Markov model that identifies genomic regions with a high density of single nucleotide variants (SNVs) not seen in unadmixed populations. We show using simulations that this provides a powerful approach to identifying segments of archaic introgression with a low rate of false detection, provided that data are available from a suitable outgroup population, one that lacks the archaic introgression but contains a majority of the variation that arose since the initial separation from the archaic lineage. Furthermore, our approach is able to infer admixture proportions and the times both of admixture and of initial divergence between the human and archaic populations. We apply the model to detect archaic introgression in 89 Papuans and show how the identified segments can be assigned to likely Neanderthal or Denisovan origin. We report more Denisovan admixture than previous studies and find a shift in the size distribution of fragments of Neanderthal and Denisovan origin that is compatible with a difference in admixture time. Furthermore, we identify small amounts of Denisova ancestry in South East Asians and South Asians.
Year: 2018 PMID: 30226838 PMCID: PMC6161914 DOI: 10.1371/journal.pgen.1007641
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Overview of the model.
Illustration on small test dataset. a) An archaic segment introgresses into the ingroup population at time T_admix with admixture proportion a. Segments in the ingroup have a mean coalescence time T_ingroup with a segment from the outgroup, and an archaic segment has a mean coalescence time T_archaic with a segment from the outgroup. Removing all variants found in the outgroup (light orange points) should remove all the variants that arose in the common ancestor of ingroup and outgroup, leaving only private variants that occurred either on the ingroup branch (dark orange) or on the archaic branch (dark blue). This gives the archaic segment a higher density of private variants, because T_archaic exceeds T_ingroup. The genome is then binned into windows of length L (here 1000 bp) and the number of private variants is counted in each window. These counts are the observations, and the hidden states are the Ingroup state and the Archaic state. Decoding finds the most likely path of states through the sequence. b) The transition matrix between the Archaic state and the Ingroup state. c) The emission probabilities are modelled as Poisson distributions with means λ_Ingroup and λ_Archaic. More private variants are expected in a window in the Archaic state than in the Ingroup state.
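The binning and decoding steps described in the caption can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the starting probabilities, the transition matrix, the Poisson means and the variant positions are all made-up values chosen only to show the mechanics.

```python
import math

def bin_private_variants(positions, genome_length, window=1000):
    """Count private variants in consecutive windows of fixed length."""
    n_bins = (genome_length + window - 1) // window
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // window] += 1
    return counts

def poisson_logpmf(k, lam):
    """Log probability of observing k events under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def viterbi(counts, start, trans, lambdas):
    """Most likely state path; state 0 = Ingroup, state 1 = Archaic."""
    n_states = len(lambdas)
    log_trans = [[math.log(p) for p in row] for row in trans]
    score = [math.log(start[s]) + poisson_logpmf(counts[0], lambdas[s])
             for s in range(n_states)]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best previous state for arriving in state s at this window.
            best = max(range(n_states), key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best)
            new_score.append(score[best] + log_trans[best][s]
                             + poisson_logpmf(k, lambdas[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical private-variant positions on a 10 kb toy sequence.
positions = [150, 2500, 4100, 4300, 4700, 5100, 5200, 5600, 5900,
             6100, 6400, 6800, 9000]
counts = bin_private_variants(positions, genome_length=10_000, window=1000)

# Illustrative parameters (assumptions, not estimates from the paper).
start = [0.98, 0.02]
trans = [[0.99, 0.01],
         [0.10, 0.90]]
lambdas = [0.2, 3.0]   # Poisson means for Ingroup and Archaic windows

path = viterbi(counts, start, trans, lambdas)
print(counts)
print(path)
```

With these toy values the three windows that carry three or four private variants are the only ones assigned to the Archaic state. Isolated single variants stay in the Ingroup state because leaving it carries a transition cost.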
Fig 2. Evaluation of the model on simulated data.
a) The estimated parameters T_admix, a, T_ingroup and T_archaic are shown for different admixture proportions in simulated data with varying recombination rate and missing data. We also show the sensitivity and precision for different admixture proportions. For sensitivity and precision we show the values with a posterior probability cutoff of 0.5 (the average posterior probability, over all bins in a segment, of belonging to the archaic state). b) Sensitivity and precision for the Sstar methods, Sprime and the HMM on different datasets. For the Sstar and Sprime methods the different points correspond to segment score cutoffs of 50,000, 100,000, 150,000 and 200,000, as in Browning et al 2018. For the HMM the cutoffs are 0.5, 0.6, 0.7, 0.8 and 0.9. c) When there is no admixture the model is not in agreement with itself: the admixture proportion estimated from the transition matrix does not match the amount of sequence classified as belonging to the archaic state.
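To make the cutoff-based evaluation concrete, the sketch below computes precision and sensitivity over a list of candidate segments at several mean-posterior cutoffs. It is a simplified, segment-level illustration with made-up data. It assumes every truly archaic segment appears among the candidates, and it does not reproduce how the paper scored its simulations.

```python
def precision_sensitivity(segments, cutoff):
    """segments: list of (mean_posterior, is_truly_archaic) pairs.

    A segment is kept when its mean posterior exceeds the cutoff.
    Precision   = true archaic kept / all kept.
    Sensitivity = true archaic kept / all true archaic candidates.
    """
    kept = [is_archaic for post, is_archaic in segments if post > cutoff]
    true_pos = sum(kept)
    total_true = sum(is_archaic for _, is_archaic in segments)
    precision = true_pos / len(kept) if kept else float("nan")
    sensitivity = true_pos / total_true if total_true else float("nan")
    return precision, sensitivity

# Hypothetical candidate segments: (mean posterior, truly archaic in simulation?)
segments = [
    (0.95, True), (0.85, True), (0.65, True),
    (0.55, False), (0.72, False), (0.52, True),
]

for cutoff in (0.5, 0.6, 0.7, 0.8, 0.9):
    precision, sensitivity = precision_sensitivity(segments, cutoff)
    print(f"cutoff={cutoff:.1f} precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

Raising the cutoff trades sensitivity for precision, which is why each method appears as a series of points rather than a single point in panel b.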
Fig 3. Application of model to Papuan genomes.
a) Relationship between modern and archaic humans with the outgroup branches (Sub-Saharan Africans) colored in red. The average coalescence times between ingroup and outgroup (T_ingroup) and between archaic and outgroup (T_archaic) are shown. The admixture proportion a and admixture time T_admix are shown for segments that are shared with other non-African populations. b) The outgroup colored in red is now all non-Papuans, and the new demographic parameters are shown. c) The segments that are shared with other non-Africans share more variation with the Vindija Neanderthal than they do with the Altai Denisova. Segments that are unique to Papuan individuals share more variation with the Altai Denisova than they do with the Vindija Neanderthal. d) Archaic segments that are shared with other non-African populations are shorter than segments that are unique to Papuans (segments with a mean posterior probability > 0.5 are kept).
Amount of sequence of different origins.
For different methods and populations, the amounts of sequence (in Mb) are shown in putative archaic segments that share equal numbers of private variants with the Denisova and Vindija Neanderthal (Both), more with Denisova, none with either, or more with Vindija Neanderthal. Neither Sstar nor CRF labels segments that do not share variants with the archaic reference genomes. For CRF, segments had to be either more similar to Neanderthal than Denisova or vice versa, so segments that match both equally well are not reported. For Sstar the comparison to Denisova was only made for Papuans. Note that the Papuan individuals used in Sstar are admixed with East Asians.
| Method | Population | Both (Mb) | Denisova (Mb) | Neither (Mb) | Vindija Neanderthal (Mb) | Total (Mb) |
| --- | --- | --- | --- | --- | --- | --- |
| HMM | Papuan | 4.35 | 83.11 | 11.54 | 71.70 | 170.7 |
| | eastasia | 1.48 | 5.69 | 9.96 | 61.37 | 78.49 |
| | southasia | 1.62 | 5.85 | 10.12 | 51.36 | 68.95 |
| | westeurasia | 1.47 | 2.39 | 10.14 | 43.95 | 57.94 |
| Sstar | Papuan | 26.5 | 43.11 | - | 49.21 | 118.82 |
| | eastasia | - | - | - | 65.02 | 65.02 |
| | southasia | - | - | - | 55.18 | 55.18 |
| | westeurasia | - | - | - | 51.23 | 51.23 |
| CRF | Papuan | - | 58.17 | - | 84.72 | 142.89 |
| | eastasia | - | 3.21 | - | 72.92 | 76.14 |
| | southasia | - | 2.79 | - | 61.36 | 64.15 |
| | westeurasia | - | 0.68 | - | 57.29 | 57.97 |
| Sprime | Papuan | 1.04 | 38.98 | 13.48 | 27.85 | 81.36 |
| | eastasia | 0.89 | 4.29 | 14.14 | 60.49 | 79.81 |
| | southasia | 0.76 | 4.60 | 15.09 | 53.83 | 74.29 |
| | westeurasia | 0.76 | 1.68 | 14.02 | 52.22 | 68.70 |
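The four categories used in the table follow directly from the counts of private variants a segment shares with each archaic genome. The sketch below shows one way to express that rule and to sum sequence per category. The function names and the example segments are hypothetical, and the rule is read off the table description rather than taken from the authors' code.

```python
def classify_segment(shared_denisova, shared_vindija):
    """Assign a putative archaic segment to one of the four table categories."""
    if shared_denisova == 0 and shared_vindija == 0:
        return "Neither"
    if shared_denisova == shared_vindija:
        return "Both"
    if shared_denisova > shared_vindija:
        return "Denisova"
    return "Vindija Neanderthal"

def sequence_by_origin(segments):
    """segments: iterable of (length_bp, shared_denisova, shared_vindija).

    Returns the amount of sequence in Mb per category, plus the total.
    """
    totals = {"Both": 0.0, "Denisova": 0.0, "Neither": 0.0,
              "Vindija Neanderthal": 0.0}
    for length_bp, n_den, n_vin in segments:
        totals[classify_segment(n_den, n_vin)] += length_bp / 1e6
    totals["Total"] = sum(totals.values())
    return totals

# Hypothetical segments: (length in bp, variants shared with Denisova, with Vindija)
example_segments = [
    (500_000, 12, 3),   # more with Denisova
    (250_000, 0, 9),    # more with Vindija Neanderthal
    (250_000, 4, 4),    # equal numbers: Both
    (500_000, 0, 0),    # shares nothing with either genome: Neither
    (500_000, 1, 7),    # more with Vindija Neanderthal
]

print(sequence_by_origin(example_segments))
```

Under this rule the Total column is simply the sum of the four category columns, which matches the rows of the table up to rounding.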