Guo-Dong Wang1,2, Bao-Lin Zhang1,3, Wei-Wei Zhou1,4, Yong-Xin Li1,3, Jie-Qiong Jin1, Yong Shao1, He-Chuan Yang5, Yan-Hu Liu6, Fang Yan1, Hong-Man Chen1, Li Jin7, Feng Gao8, Yaoguang Zhang7, Haipeng Li2,8, Bingyu Mao1,2, Robert W Murphy1,9, David B Wake10, Ya-Ping Zhang11,2, Jing Che11,2,4.
Abstract
Tibetan frogs, Nanorana parkeri, are differentiated genetically but not morphologically along geographical and elevational gradients in a challenging environment, presenting a unique opportunity to investigate processes leading to speciation. Analyses of whole genomes of 63 frogs reveal population structuring and historical demography, characterized by highly restricted gene flow in a narrow geographic zone lying between matrilines West (W) and East (E). A population found only along a single tributary of the Yalu Zangbu River has the mitogenome only of E, whereas nuclear genes of W comprise 89-95% of the nuclear genome. Selection accounts for 579 broadly scattered, highly divergent regions (HDRs) of the genome, which involve 365 genes. These genes fall into 51 gene ontology (GO) functional classes, 14 of which are likely to be important in driving reproductive isolation. GO enrichment analyses of E reveal many overrepresented functional categories associated with adaptation to high elevations, including blood circulation, response to hypoxia, and UV radiation. Four genes, including DNAJC8 in the brain, TNNC1 and ADORA1 in the heart, and LAMB3 in the lung, differ in levels of expression between low- and high-elevation populations. High-altitude adaptation plays an important role in maintaining and driving continuing divergence and reproductive isolation. Use of total genomes enabled recognition of selection and adaptation in and between populations, as well as documentation of evolution along a stepped cline toward speciation.
Keywords: gene flow; hybridization; natural selection; population genomics; speciation
Year: 2018 PMID: 29760079 PMCID: PMC5984489 DOI: 10.1073/pnas.1716257115
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. Sampling sites and population structure of the Tibetan frog (N. parkeri). (A) Sampling locations (ArcGIS 10.2; esri). Site numbers refer to . Colors denote the five main groups recovered from population structure and phylogenetic analyses. Gray indicates hybrid populations (25–27). (B) PCA of all high-coverage samples. (C) Principal component plot of E samples only. (D) ML tree based on concatenated sequences. W, green; E1, purple; E2, yellow; E3, light blue; E4, red.
Fig. 2. Population structure plots with the number of ancestral clusters (K) = 2–5.
Admixture signatures from D-statistic tests
| Test | D statistic | Z score |
| W, | −0.247 | −50.868 |
| W, | −0.264 | −61.376 |
| W, E1; E4, E2 | −0.009 | −2.354 |
| W, | 0.117 | 21.667 |
| E1, | 0.260 | 78.181 |
Populations with gene flow are denoted in boldface.
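As an illustration of how the D statistics and Z scores in the table above are usually read, the following is a minimal sketch. It assumes the standard ABBA-BABA definition of D and a conventional cutoff of |Z| > 3 for significance; neither the sign convention nor the cutoff is stated in this record, and the function names are hypothetical.

```python
import math

def d_statistic(n_abba, n_baba):
    """Patterson's D from counts of ABBA and BABA site patterns.

    Assumed convention: D = (ABBA - BABA) / (ABBA + BABA).
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative sites")
    return (n_abba - n_baba) / total

def two_sided_p(z):
    """Two-sided p-value for a Z score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))

def is_significant(z, cutoff=3.0):
    """Assumed rule of thumb: |Z| above the cutoff indicates admixture."""
    return abs(z) > cutoff

# Z scores taken from the table above
for z in (-50.868, -61.376, -2.354, 21.667, 78.181):
    print(z, is_significant(z))
```

Under this assumed cutoff, only the test with Z = −2.354 (W, E1; E4, E2) would fail to indicate gene flow.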
Fig. 3. Demographic inference. (A) Demographic history inferred by G-PhoCS. Widths of branches are proportional to Ne. Horizontal dashed lines denote posterior estimates for divergence times, associated mean values are shown in bold, and 95% credible intervals are shown in parentheses. Arrows indicate the direction of gene flow, and associated figures indicate the estimates of total migration rates. (B) Distribution of FST(W, E1) values in the observed data and in the simulated data under different migration scenarios between W and E. The full model shows simulation with the full set of demographic parameters inferred from G-PhoCS. The no_E234 refers to the simulation without the postdivergence migration from E234 to W. The no_E refers to the simulation without current and postdivergence migration from E to W.
Fig. 4. Genomic divergence associated with species formation. (A) FST distribution between W and E1 and their landscape of genomic divergence measured by the df. Both statistics are measured from 50-kb nonoverlapping windows. Scaffolds are concatenated to reveal the whole-genomic divergence pattern. (B) FST distribution between E1 and E2 and their landscape of genomic divergence. (C) FST distribution between E3 and E4 and their landscape of genomic divergence.
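The 50-kb nonoverlapping windows mentioned in the Fig. 4 caption can be sketched as follows. This is only an illustration under stated assumptions: per-site FST is taken to be available as a numerator and denominator pair, the window value is computed as a ratio of sums, and positions are 1-based. The estimator actually used in the paper is not given in this record, and `windowed_fst` is a hypothetical helper.

```python
from collections import defaultdict

WINDOW = 50_000  # window size in bp, as stated in the caption

def windowed_fst(sites, window=WINDOW):
    """Aggregate per-site FST components into nonoverlapping windows.

    sites: iterable of (scaffold, position, numerator, denominator),
    with 1-based positions (an assumption). Returns a dict mapping
    (scaffold, window_index) to the ratio of summed numerators to
    summed denominators.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for scaffold, pos, num, den in sites:
        key = (scaffold, (pos - 1) // window)
        sums[key][0] += num
        sums[key][1] += den
    # Skip windows with no information (zero denominator)
    return {k: n / d for k, (n, d) in sums.items() if d > 0}

# Made-up example values, for illustration only
example = [
    ("s1", 1, 0.1, 0.5),
    ("s1", 50_000, 0.1, 0.5),
    ("s1", 50_001, 0.3, 0.6),
]
print(windowed_fst(example))
```

Keying windows by scaffold keeps windows from spanning scaffold boundaries, so concatenating scaffolds for plotting does not mix sites from different scaffolds.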
Fig. 5. Identification of HDRs. (A) Distribution of FST-based statistic of PBSW. HDRs in top 2.5% PBS distribution, light blue; outside HDRs, gray. (B) Comparisons of HDRs of PBSW in terms of FST, π, dxy, and DAF versus the genomic background. (C) Distribution of PBSE1. (D) Comparisons of HDRs of PBSE1 with the genomic background. Asterisks designate levels of significance between HDRs and outside HDRs by a two-tailed Mann–Whitney test (*P < 0.01; **P < 1e-8; ***P < 2.2e-16).
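The FST-based PBS statistic and the top 2.5% cutoff named in the Fig. 5 caption can be illustrated with a short sketch. It assumes the commonly used population branch statistic, in which each pairwise FST is transformed to T = −log(1 − FST) and the focal branch length is (T_AB + T_AC − T_BC) / 2. Which populations serve as the two comparison groups for PBSW and PBSE1 is not stated in this record, and the function names are hypothetical.

```python
import math

def branch_length(fst):
    """Log-transform a pairwise FST into an additive divergence, T."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return -math.log(1.0 - fst)

def pbs(fst_ab, fst_ac, fst_bc):
    """Population branch statistic for focal population A.

    B and C are the two comparison populations (assumed roles).
    """
    return (branch_length(fst_ab) + branch_length(fst_ac)
            - branch_length(fst_bc)) / 2.0

def top_fraction_cutoff(values, fraction=0.025):
    """Smallest value that still falls in the top `fraction` of values."""
    ranked = sorted(values, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[k - 1]

def flag_hdr_windows(pbs_by_window, fraction=0.025):
    """Return the windows whose PBS is at or above the cutoff."""
    cutoff = top_fraction_cutoff(list(pbs_by_window.values()), fraction)
    return {w for w, v in pbs_by_window.items() if v >= cutoff}
```

A window with high FST between the focal population and both comparison populations, but low FST between the comparison populations themselves, gets a large PBS, which is the pattern expected for divergence specific to the focal lineage.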
Fig. 6. Distribution of observed genome-wide top 2.5% PBSW values compared with the simulated PBSW values under the full model.
GO analysis of genes located in regions that strongly differentiated W from all four subpopulations of E
| Category | Term | No. of genes | |
| Cluster 1 | GO:0048538∼thymus development | 5 | 0.00 |
| | GO:0048534∼hemopoietic or lymphoid organ development | 12 | 0.01 |
| | GO:0002520∼immune system development | 12 | 0.01 |
| Cluster 2 | GO:0003006∼reproductive developmental process | 15 | 0.00 |
| | GO:0048610∼reproductive cellular process | 11 | 0.00 |
| | GO:0048609∼reproductive process in a multicellular organism | 18 | 0.01 |
| | GO:0032504∼multicellular organism reproduction | 18 | 0.01 |
| | GO:0019953∼sexual reproduction | 17 | 0.01 |
| | GO:0007281∼germ cell development | 7 | 0.01 |
| | GO:0007276∼gamete generation | 15 | 0.01 |
| | GO:0048232∼male gamete generation | 12 | 0.02 |
| | GO:0007283∼spermatogenesis | 12 | 0.02 |
| Cluster 3 | GO:0035270∼endocrine system development | 6 | 0.01 |
| | GO:0030325∼adrenal gland development | 3 | 0.01 |
Annotation clusters with an enrichment score of ≥2 are shown.
Fig. 7. Morphological and physiological changes associated with elevation. (A) Bar plots of the differences in numbers of granular glands in the middorsal skin between low-elevation (2,968 m, E1) and high-elevation (4,859 m, E4) populations (two-tailed test: P < 0.05). (B) Hb levels (grams per deciliter) at different elevations. Red lines show the best-fit regression line based on a third-order polynomial equation. The 95% confidence interval is shown in gray. (C) TOC (μmol/L) in low (E1) and high (E4) elevations. (D) Expression level of DNAJC8 in the brain, TNNC1 and ADORA1 in the heart, and LAMB3 in the lung from low-elevation (E1) and high-elevation (E4) populations of E, respectively. Eight replicates were performed for each group. Statistically significant differences in differential expression are indicated by asterisk(s) (two-tailed t test: *P < 0.05; **P < 0.01).