Lu Hou, Yanhong Cui, Xiang Li, Wu Chen, Zhiyong Zhang, Xiaoming Pang, Yingyue Li.
Abstract
Thuja koraiensis Nakai is an endangered conifer of high economic and ecological value in Jilin Province, China. However, studies on its population structure and conservation genetics have been limited by the lack of genomic data. Here, 37,761 microsatellites (simple sequence repeats, SSRs) were detected in 875,792 de novo-assembled contigs obtained using a restriction-associated DNA (RAD) approach. Of these SSRs, 300 were randomly selected to test for polymorphism, and 96 loci amplified a fragment of the expected size. Twelve polymorphic SSR markers were developed and used to analyze the genetic diversity and population structure of three natural populations. High genetic diversity (mean NA = 5.481, HE = 0.548) and moderate population differentiation (pairwise Fst = 0.048–0.078, Nm = 2.940–4.958) were found in this species. Analysis of molecular variance suggested that most of the variation (83%) existed within populations. Combining the results of STRUCTURE, principal coordinate, and neighbor-joining analyses, the 232 individuals were divided into three genetic clusters that generally correlated with their geographical distributions. Finally, appropriate conservation strategies were proposed to protect this species. This study provides genetic information for the natural resource conservation and utilization of T. koraiensis and will facilitate further studies of the evolution and phylogeography of the species.
Keywords: Thuja koraiensis Nakai; conservation genetics; population genetics; restriction-associated DNA (RAD) sequencing; simple sequence repeat (SSR) markers
Year: 2018 PMID: 29673217 PMCID: PMC5924560 DOI: 10.3390/genes9040218
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Geographic locations of the three natural populations of Thuja koraiensis. The map was created using the ‘maps’ and ‘mapdata’ packages in R (v3.3.1, https://www.r-project.org/).
Location and sampling site characteristics for the three Thuja koraiensis populations.
| Populations | Location | Number | Longitude (E) | Latitude (N) | Altitude/m |
|---|---|---|---|---|---|
| LGZ | Lenggouzi, Changbai, Jilin | 155 | 126°28′ | 41°37′ | 1108 |
| SDG | Sandaogou, Baishan, Jilin | 26 | 126°28′ | 41°51′ | 483 |
| DJG | Dajinggou, Baishan, Jilin | 51 | 126°44′ | 41°52′ | 1172 |
LGZ: Lenggouzi; SDG: Sandaogou; DJG: Dajinggou.
Summary statistics of the restriction-associated DNA (RAD) tags sequencing.
| Feature | Value |
|---|---|
| Number of contigs | 875,792 |
| No. of contigs (≥200 bp) | 805,284 |
| No. of contigs (≥500 bp) | 2227 |
| No. of contigs (≥1000 bp) | 95 |
| N50 (bp) | 274 |
| GC (%) | 41.38 |
| Total contig length (bp) | 230,095,060 |
| Max. contig length (bp) | 4265 |
| Min. contig length (bp) | 150 |
| Average contig length (bp) | 262 |
N50: weighted median statistic such that 50% of the entire assembly is contained in contigs equal to or greater than this value.
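The N50 definition above can be made concrete with a minimal sketch. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study; the 274 bp value in the table cannot be recomputed here because the individual contig lengths are not listed.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are sorted from longest to shortest and summed until the
    running total reaches half of the assembly size; the length of the
    contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly (not data from the study): total = 2000 bp.
toy_contigs = [150, 200, 250, 300, 400, 700]
print(n50(toy_contigs))  # 700 + 400 = 1100 >= 1000, so this prints 400
```

Because the statistic is weighted by length, a few long contigs can raise the N50 well above the median contig length.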
Figure 2. Gene ontology (GO) classification of T. koraiensis contigs.
Summary of simple sequence repeats (SSRs) identified in T. koraiensis.
| Searching Item | Number |
|---|---|
| Sequences examined | 875,792 |
| Size of examined sequences (bp) | 230,095,060 |
| Identified SSRs | 37,761 |
| SSR-containing sequences | 30,102 |
| Sequences containing more than one SSR | 5212 |
| SSRs present in compound formation | 7253 |
| Mononucleotide SSRs | 16,802 |
| Dinucleotide SSRs | 14,795 |
| Trinucleotide SSRs | 5119 |
| Tetranucleotide SSRs | 478 |
| Pentanucleotide SSRs | 133 |
| Hexanucleotide SSRs | 434 |
Characterization of 12 polymorphic SSRs.
| SSR Name | Contigs | Primer Sequence (5′→3′) | Motif | Tm (°C) | Size (bp) | NA | HO | HE | PIC | HWE |
|---|---|---|---|---|---|---|---|---|---|---|
| BFTK-39 | contig_120265 | F: TGTTCACTCCTCATCCACCG | (TG)11 | 60 | 200 | 4 | 1.000 | 0.653 | 0.599 | ns |
| BFTK-51 | contig_169311 | F: TCATTGGAGTTGTATGGTGTCA | (CT)34 | 59 | 159 | 3 | 0.571 | 0.561 | 0.465 | ns |
| BFTK-68 | contig_220530 | F: ACAACAAAGCGGTGGTAAACC | (GA)14 | 60 | 198 | 3 | 0.375 | 0.461 | 0.398 | ns |
| BFTK-123 | contig_387595 | F: TGCTTGCACTTGGATGTTGTG | (TG)19 | 60 | 175 | 3 | 0.800 | 0.540 | 0.466 | ns |
| BFTK-136 | contig_412632 | F: CCCCCGGGCATAGATCAAAT | (TA)10 | 60 | 183 | 3 | 1.000 | 0.594 | 0.511 | ns |
| BFTK-167 | contig_487894 | F: TGAAGTCCCCATCTACATGTCA | (TG)16 | 59 | 169 | 3 | 0.125 | 0.664 | 0.590 | ** |
| BFTK-187 | contig_551058 | F: AGGACACAGAACAGAGCAGC | (AAC)10 | 60 | 147 | 2 | 0.750 | 0.469 | 0.359 | ns |
| BFTK-194 | contig_576296 | F: TACCTCGGAGATCAACCCCA | (GA)13 | 60 | 109 | 2 | 0.143 | 0.133 | 0.124 | ns |
| BFTK-199 | contig_589312 | F: ATAGGGCACGACTAGCTTGC | (AC)15 | 60 | 132 | 4 | 0.667 | 0.667 | 0.620 | ns |
| BFTK-261 | contig_680582 | F: AGAGGTGGGGAAGAGGAGAC | (AG)9 | 60 | 142 | 5 | 0.286 | 0.653 | 0.602 | * |
| BFTK-273 | contig_689976 | F: TCCCATGTTTGTGGTCTCAGT | (TTG)9 | 60 | 209 | 7 | 1.000 | 0.820 | 0.798 | ns |
| BFTK-295 | contig_111296 | F: CGCAAGTCCAAATCAGCAAC | (CAA)6 | 59 | 146 | 2 | 0.750 | 0.500 | 0.375 | ns |
| Mean | | | | | | 3.417 | 0.622 | 0.559 | 0.492 | |
Tm: annealing temperature; NA: number of alleles; HO: observed heterozygosity; HE: expected heterozygosity; PIC: polymorphism information content; HWE: Hardy–Weinberg equilibrium; ns: not significant; * p < 0.05; ** p < 0.01.
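As an illustration of how expected heterozygosity and PIC relate to allele frequencies, the sketch below uses the standard textbook formulas (HE = 1 − Σp², and the Botstein et al. form of PIC). Which estimator and software the authors actually used is not stated in this record, so this is an assumption; the allele frequencies are hypothetical, since per-locus frequencies are not reported here.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity, HE = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


# Hypothetical locus with two equally frequent alleles.
two_equal_alleles = [0.5, 0.5]
print(round(expected_heterozygosity(two_equal_alleles), 3))  # 0.5
print(round(pic(two_equal_alleles), 3))                      # 0.375
```

These values coincide with the HE (0.500) and PIC (0.375) listed for the two-allele locus BFTK-295. That is consistent with two alleles at equal frequency under these formulas, although the record does not confirm the underlying frequencies.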
Genetic diversity of the three populations of T. koraiensis.
| Populations | NA | NE | I | HO | HE | F | PIC |
|---|---|---|---|---|---|---|---|
| LGZ | 7.667 | 2.557 | 1.091 | 0.766 | 0.572 | −0.279 | 0.503 |
| SDG | 3.667 | 2.750 | 1.024 | 0.629 | 0.582 | −0.068 | 0.504 |
| DJG | 5.111 | 2.235 | 0.908 | 0.537 | 0.491 | −0.036 | 0.428 |
| Mean | 5.481 | 2.514 | 1.008 | 0.644 | 0.548 | −0.128 | 0.478 |
NE: effective number of alleles; I: Shannon’s information index; F: fixation index.
Results of the analysis of molecular variance (AMOVA) for three populations of T. koraiensis.
| Source | df | SS | MS | Percentage of Variation (%) |
|---|---|---|---|---|
| Among Populations | 2 | 164.500 | 82.250 | 17% |
| Within Populations | 229 | 1489.435 | 6.504 | 83% |
| Total | 231 | 1653.935 | 88.754 | 100% |
df: degrees of freedom; SS: sum of squares; MS: mean of squares.
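The percentages in the AMOVA table are shares of variance components, not ratios of sums of squares. The sketch below recomputes them from the tabulated mean squares and the population sample sizes (155, 26, 51) using the standard two-level variance-component formulas. The function name is hypothetical, and the record does not say which software produced the table, so treat this as a consistency check under that assumption.

```python
def amova_two_level(ms_among, ms_within, group_sizes):
    """Return (among, within) fractions of total variance for a two-level AMOVA.

    n0 is the weighted average group size used when groups are unequal:
    n0 = (N - sum(n_i^2) / N) / (k - 1)
    """
    k = len(group_sizes)
    n_total = sum(group_sizes)
    n0 = (n_total - sum(n * n for n in group_sizes) / n_total) / (k - 1)
    var_within = ms_within                      # within-population component
    var_among = (ms_among - ms_within) / n0     # among-population component
    total = var_among + var_within
    return var_among / total, var_within / total


# Mean squares from the AMOVA table; sample sizes from the sampling table.
among, within = amova_two_level(82.250, 6.504, [155, 26, 51])
print(f"{among * 100:.1f}% among, {within * 100:.1f}% within")
# prints: 16.9% among, 83.1% within
```

Rounded to whole percentages, this reproduces the 17% and 83% reported in the table.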
Pairwise genetic differentiation index (Fst) values between the three populations.
| Populations | LGZ | SDG | DJG |
|---|---|---|---|
| LGZ | | | |
| SDG | 0.061 | | |
| DJG | 0.048 | 0.078 | |
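The Nm range quoted in the abstract (2.940–4.958) can be compared with Wright's island-model approximation, Nm ≈ (1/Fst − 1)/4, applied to the pairwise values in this table. That the authors derived Nm this way is an assumption; the record does not state the formula, and the helper name below is hypothetical.

```python
def nm_from_fst(fst):
    """Approximate number of migrants per generation under Wright's island model."""
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# Pairwise Fst values taken from the table above.
pairwise_fst = {
    ("LGZ", "SDG"): 0.061,
    ("LGZ", "DJG"): 0.048,
    ("SDG", "DJG"): 0.078,
}
for (pop_a, pop_b), fst in pairwise_fst.items():
    print(pop_a, pop_b, round(nm_from_fst(fst), 3))
# LGZ SDG 3.848
# LGZ DJG 4.958
# SDG DJG 2.955
```

The largest value matches the reported upper bound of 4.958. The smallest, 2.955, is close to the reported 2.940; the gap could come from the Fst values being rounded to three decimals in the table, though that is not confirmed by the source.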
Figure 3. Estimates of gene flow among the three populations obtained with Migrate-n [43]. The width of each line and the number shown next to each arrow indicate the gene flow (Nm) value. LGZ: Lenggouzi; SDG: Sandaogou; DJG: Dajinggou.
Figure 4. Results of STRUCTURE analysis based on microsatellite data: (a) estimation of the number of clusters using LnP(D)-derived ΔK, with the cluster number (K) ranging from 1 to 8; (b) estimated genetic structure of the three populations based on STRUCTURE analysis with K = 3.
Figure 5. Principal coordinate analysis (PCoA) (a) and neighbour-joining (NJ) tree (b) of the 232 T. koraiensis individuals. PC1: the first principal coordinate; PC2: the second principal coordinate.