Yihan Wang, Weimei Jiang, Wenqing Ye, Chengxin Fu, Matthew A Gitzendanner, Pamela S Soltis, Douglas E Soltis, Yingxiong Qiu.
Abstract
BACKGROUND: Tetrastigma hemsleyanum is of great medicinal importance and is used as a model system to address the evolutionary history of warm-temperate evergreen (WTE) forest biomes in East Asia over Neogene time scales. However, further studies on the neutral and adaptive divergence processes of T. hemsleyanum are currently impeded by a lack of genomic resources. In this study, we de novo assembled and annotated reference transcriptomes for two cpDNA lineages (Central-South-East vs. Southwest) of T. hemsleyanum. We further used comparative genomic and multilocus coalescent approaches to investigate the tempo and mode of lineage diversification in T. hemsleyanum.
Keywords: K a/K s; Tetrastigma hemsleyanum; coalescent-based analyses; demographic history; gene flow; single-copy nuclear gene; transcriptome
Year: 2018 PMID: 30249188 PMCID: PMC6154912 DOI: 10.1186/s12870-018-1429-8
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Summary statistics for the transcriptomes of CSE and SW lineages of Tetrastigma hemsleyanum.
| | CSE lineage | SW lineage |
|---|---|---|
| Total number of clean reads | 47,880,822 | 122,587,340 |
| Total length of clean reads (bp) | 4,309,273,980 | 12,258,734,000 |
| Total numbers of contigs | 101,421 | 138,294 |
| N50 value of contigs (bp) | 1001 | 926 |
| Mean length of contigs (bp) | 413 | 377 |
| Q20 percentage (%) | 97.98% | 98.51% |
| GC percentage (%) | 45.13% | 46.10% |
| Total numbers of unigenes | 52,838 | 65,197 |
| N50 value of unigenes (bp) | 1667 | 1841 |
| Mean length of unigenes (bp) | 1034 | 1095 |
Notes: Q20 percentage denotes the percentage of sequences with a sequencing error rate lower than 1%; N50 is the contig length such that 50% of the entire assembly is contained in contigs equal to or longer than this value; bp = base pair.
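The N50 definition in the notes can be made concrete with a short sketch. The contig lengths below are hypothetical toy values, not data from the study, and the Phred relation Q = -10·log10(p) used for the Q20 threshold is the standard sequencing convention rather than something stated in this record.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (bp).

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half of the assembly.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def phred_error_probability(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


# Hypothetical toy assembly (not from the study): total length 1500 bp.
toy_contigs = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_contigs)

# A Phred score of 20 corresponds to the 1% error threshold in the notes.
q20_error = phred_error_probability(20)
```

For the toy assembly, the two longest contigs (500 + 400 bp) already cover at least half of the 1500 bp total, so the N50 is the length of the second of them.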
Annotation results of assembled unigenes from CSE and SW lineages.
| Lineage | Statistic | Nr | Nt | Swiss-Prot | KEGG | COG | GO | All (functional annotations) | Homolog (CDS annotations) | | All (CDS annotations) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CSE | Number (N) | 38,451 | 47,387 | 24,383 | 22,887 | 14,520 | 28,064 | 48,697 | 49,915 | 1056 | 50,971 |
| CSE | N/All annotated (%) | 78.96 | 97.31 | 50.07 | 47.00 | 29.82 | 57.63 | 100 | | | |
| CSE | N/All-unigene (%) | 72.77 | 89.68 | 46.15 | 43.32 | 27.48 | 53.11 | 92.16 | 94.47 | 2.00 | 96.47 |
| SW | Number (N) | 43,172 | 56,208 | 27,246 | 25,659 | 16,932 | 30,177 | 58,622 | 54,373 | 1449 | 55,822 |
| SW | N/All annotated (%) | 73.64 | 95.88 | 46.48 | 43.77 | 28.88 | 51.48 | 100 | | | |
| SW | N/All-unigene (%) | 66.22 | 86.21 | 41.79 | 39.36 | 25.97 | 46.29 | 89.92 | 83.40 | 2.22 | 85.62 |
List of candidate orthologs potentially under positive selection in the transcriptomes of CSE and SW lineages
| Gene ID (CSE lineage) | Gene ID (SW lineage) | Ka/Ks | P-value | Description |
|---|---|---|---|---|
| Unigene21284 | CL1668_Contig1 | 4.13 | 0.002 | Type 2 ribosome-inactivating protein Nigrin l precursor |
| CL1485_Contig2 | CL876_Contig13 | 1.94 | 0.017 | TMV resistance protein N-like |
| Unigene9215 | CL4926_Contig1 | 0.55 | 0.001 | GDSL esterase/lipase EXL3 |
| CL7288_Contig1 | CL7933_Contig2 | 0.5 | 0.006 | IAA-amino acid hydrolase ILR1-like 4 |
| CL2482_Contig1 | CL3284_Contig1 | 0.51 | 0.015 | Probable disease resistance protein RDL6/RF9-like |
| Unigene1115 | Unigene8463 | 0.53 | 0.022 | Proline-rich receptor-like protein kinase PERK9 |
| Unigene22333 | CL1306_Contig1 | 0.52 | 0.029 | Probable glycosyltransferase At3g07620-like |
| CL291_Contig2 | CL4212_Contig1 | 0.59 | 0.036 | Armadillo repeat-containing protein 7 |
| CL6365_Contig1 | Unigene18910 | 0.52 | 0.046 | Glutamyl-tRNA(Gln) amidotransferase subunit A |
| CL1077_Contig1 | Unigene25101 | 0.52 | 0.049 | Probable disease resistance protein At1g15890 |
Test results of polymorphism and neutrality for each SCNG screened in representative individuals of T. hemsleyanum
| Locus ID | Number of polymorphic sites | Number of haplotypes (N) | Haplotype diversity | Nucleotide diversity | Tajima's D | Fu and Li's D* | Fu and Li's F* |
|---|---|---|---|---|---|---|---|
| Th-41 | 22 | 19 | 0.960 | 0.00944 | -0.586 | -0.457 | -0.590 |
| ThR-3 | 41 | 32 | 0.991 | 0.01185 | -0.684 | 0.932 | 0.451 |
| ThR-6 | 41 | 31 | 0.989 | 0.01527 | -0.014 | 0.980 | 0.759 |
| ThR-7 | 40 | 15 | 0.923 | 0.01682 | 1.049 | -0.183 | 0.275 |
| ThR-11 | 23 | 25 | 0.979 | 0.00798 | -1.228 | -0.548 | -0.910 |
| ThR-28 | 31 | 21 | 0.960 | 0.00946 | -0.819 | -0.697 | -0.870 |
| ThR-31 | 24 | 19 | 0.919 | 0.01380 | -0.172 | -0.754 | -0.663 |
| ThR-34 | 25 | 27 | 0.981 | 0.01027 | -0.559 | 0.871 | 0.458 |
Notes: the tests of neutrality are Tajima's D and the two Fu and Li statistics; N = number of haplotypes.
Fig. 1 Distribution of gene ontology (GO) term categories in the transcriptomes of CSE and SW lineages. The left y-axis indicates the percentage of unigenes and the right y-axis the number of unigenes per (sub-)category. Black bar, CSE lineage; grey bar, SW lineage.
Maximum-likelihood estimates (MLE) and 90% highest posterior density (HPD) intervals of demographic parameters of T. hemsleyanum based on IMa multi-locus analyses.
| Estimates | ΘCSE | ΘSW | ΘA | mCSE-SW | mSW-CSE | t | Ne, CSE (individuals) | Ne, SW (individuals) | Ne, ancestral (individuals) | 2NCSEMCSE-SW | 2NSWMSW-CSE | Divergence time (years) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MLE | 8.518 | 6.718 | 2.728 | 0.048 | 0.073 | 1.494 | 845,282 | 666,590 | 270,737 | 0.201 | 0.232 | 2,964,251 |
| 90% HPD lower | 4.963 | 1.888 | 1.188 | 0.000 | 0.000 | 0.803 | 492,425 | 187,295 | 117,835 | 0.000 | 0.000 | 1,592,630 |
| 90% HPD upper | 12.69 | 12.96 | 4.412 | 0.133 | 0.226 | 2.362 | 1,258,971 | 1,286,259 | 437,849 | 0.562 | 0.738 | 4,688,583 |
ΘCSE, ΘSW, and ΘA represent the scaled effective population sizes (Ne) of the CSE lineage, the SW lineage of T. hemsleyanum, and the ancestral population, respectively. mCSE-SW and mSW-CSE refer to the scaled migration rates forward in time from the CSE to the SW lineage and vice versa. t is the time since the ancestral population split, in mutational units. 2NCSEMCSE-SW and 2NSWMSW-CSE are the effective migration rates (number of migrants per generation). All estimates include the per gene mutation rate V (geometric mean of the mutation rates of all the loci). Parameters in the first six columns are scaled by the mutation rate, while the rest are expressed in demographic units (individuals, migrants per generation, or years).
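The relationship between the mutation-scaled and demographic columns can be checked with a short sketch. It assumes the standard isolation-with-migration parameterization (Θ = 4Nu, scaled migration m = M/u, scaled time t = Tu with T in generations), which this record does not spell out. The per-gene mutation rate and the generation time are not given here, so both are back-calculated from the MLE row and should be read as implied values, not as figures reported by the authors.

```python
# MLE row of the table above.
theta = {"CSE": 8.518, "SW": 6.718, "A": 2.728}    # scaled population sizes
m_scaled = {"CSE-SW": 0.048, "SW-CSE": 0.073}      # scaled migration rates
t_scaled = 1.494                                   # split time, mutational units
ne = {"CSE": 845282, "SW": 666590, "A": 270737}    # effective sizes (individuals)
t_years = 2964251                                  # split time (years)


def implied_mutation_rate(theta_value, ne_value):
    """Assumed relation theta = 4 * N * u, solved for u (per gene, per generation)."""
    return theta_value / (4 * ne_value)


def population_migration_rate(theta_value, m_value):
    """Assumed relation 2NM = theta * m / 2 (migrants per generation)."""
    return theta_value * m_value / 2


# The three populations should imply roughly the same mutation rate.
u_by_population = {k: implied_mutation_rate(theta[k], ne[k]) for k in theta}
u_mean = sum(u_by_population.values()) / len(u_by_population)

# Split time in generations, and the generation time implied by the years column.
t_generations = t_scaled / u_mean
implied_generation_time = t_years / t_generations

# Products of the individual MLEs; these need not equal the tabulated 2NM
# values exactly, because the table reports MLEs of the composite parameter.
two_nm_cse = population_migration_rate(theta["CSE"], m_scaled["CSE-SW"])
two_nm_sw = population_migration_rate(theta["SW"], m_scaled["SW-CSE"])
```

Under these assumptions the three population sizes imply nearly the same mutation rate (about 2.5 × 10^-6 per gene per generation), and the years column is consistent with a generation time close to five years.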
Fig. 2 Each circle represents the number of SCNG loci homologous to the gene datasets of (a) APVO or (b) Vitaceae species, or detected by (c) the software MarkerMiner. The grey and black overlapping areas of the three circles correspond to the numbers of SCNGs shared between the results of two and three approaches, respectively.
Fig. 3 Marginal posterior probability (MPP) distributions of IMa model parameters between the Central-South-East (CSE) and Southwest (SW) lineages identified in T. hemsleyanum: (a) the time (t) since the ancestral population split, in mutational units; (b) the scaled effective population sizes of both lineages (ΘCSE, ΘSW) and of the ancestral population (ΘA); (c) the scaled migration rates from the CSE to the SW lineage (mCSE-SW) and vice versa (mSW-CSE).