Mingliu Yang, Nanyu Han, Heng Li, Lihua Meng.
Abstract
Halenia elliptica is a popular Chinese medicinal herb used to treat jaundice and viral hepatitis, and its wild populations have recently declined significantly because of overharvesting. However, effective conservation has not been possible because genomic information and genetic markers are lacking. In this study, a de novo transcriptome of H. elliptica was sequenced on the Illumina next-generation sequencing (NGS) platform, and 132 695 unigenes longer than 200 bp (base pairs) were obtained. Among them, a total of 32 109 unigenes were scanned to develop simple sequence repeats (SSRs). Based on the NCBI (National Center for Biotechnology Information) nonredundant (Nr) database, these SSR sequences were annotated and assigned to gene ontology categories. In addition, we designed 126 pairs of SSR primers for polymerase chain reaction amplification, of which 12 pairs were polymorphic among 40 individuals from 8 populations. We then used the 12 polymorphic SSRs to construct a UPGMA dendrogram of the 40 individuals. A significant correlation between genetic relationship and geographic distance was also found, suggesting phylogeographic structure in H. elliptica. Moreover, 2 of these SSRs were successfully amplified in the related species Veratrilla baillonii, suggesting cross-species transferability. Overall, the highly polymorphic SSR markers identified in this study provide valuable genetic resources and represent an initial step toward exploring the genetic diversity and population histories of H. elliptica and its related species.
Keywords: Halenia elliptica; SSR; polymorphisms; transcriptome
Year: 2018 PMID: 30083050 PMCID: PMC6073823 DOI: 10.1177/1176934318790263
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Summary of assembly and annotation results for Halenia elliptica using Trinity.
| Results | Number |
|---|---|
| Total no. of raw reads | 19 668 659 |
| Total no. of clean reads | 19 426 614 |
| Total no. of contigs | 158 076 |
| Total size of contigs, bp | 117 982 598 |
| Mean length of contigs, bp | 746 |
| N50 value of contigs, bp | 1230 |
| Length range of contigs, bp | 201–18 947 |
| Total no. of unigenes | 132 695 |
| GC content | 42.1% |
| Total no. of identified SSRs | 32 109 |
| SSR-containing sequences with BLASTX hit | 17 973 (56.0%) |
| SSR-containing sequences with annotation | 13 825 (43.1%) |
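
The N50 and mean contig length reported in the table can be illustrated with a minimal Python sketch. It uses the common definition of N50 (the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size); the source does not state how its value was computed. The function name `n50` and the toy lengths are assumptions for illustration only, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Common definition: sort lengths from longest to shortest and return the
    length at which the running total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical, not from the study).
toy_lengths = [1000, 600, 500, 400, 300, 200]
print(n50(toy_lengths))  # 600

# Consistency check on the reported table values:
# total contig size divided by contig count gives the mean length.
total_size_bp = 117_982_598
contig_count = 158_076
print(round(total_size_bp / contig_count))  # 746
```

The last two lines only confirm that the reported mean length agrees with the reported total size and contig count.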
Figure 1. Number of sequences for all 158 076 transcriptome contigs for Halenia elliptica.
Frequency of mono- to hexanucleotide repeat motifs in Halenia elliptica.
| Repeats | Counts |
|---|---|
| Mononucleotide | 21 013 |
| A/T | 20 344 |
| C/G | 669 |
| Dinucleotide | 4718 |
| AC/GT | 678 |
| AG/CT | 900 |
| AT/AT | 3090 |
| CG/CG | 50 |
| Trinucleotide | 5569 |
| AAC/GTT | 365 |
| AAG/CTT | 1005 |
| AAT/ATT | 940 |
| ACC/GGT | 643 |
| ACG/CGT | 210 |
| ACT/AGT | 87 |
| AGC/CTG | 700 |
| AGG/CCT | 621 |
| ATC/ATG | 608 |
| CCG/CGG | 390 |
| Tetranucleotide | 339 |
| Pentanucleotide | 235 |
| Hexanucleotide | 235 |
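
The paired labels in the table (for example AG/CT or AGC/CTG) group each motif with its cyclic rotations and its reverse complement, because a repeat such as (TC)n read on one strand is the same locus as (AG)n read on the other. The following Python sketch shows one way to derive such labels. The helper names are hypothetical, and the source does not state which SSR-detection tool or grouping rule was used, so this is an assumed convention that happens to reproduce the labels shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.translate(COMPLEMENT)[::-1]


def rotations(motif):
    """Return all cyclic rotations of a motif, e.g. AGC -> AGC, GCA, CAG."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class_label(motif):
    """Label a motif by its strand-independent, rotation-independent class.

    The first part is the alphabetically smallest string among all rotations
    of the motif and of its reverse complement; the second part is the
    smallest rotation of that string's reverse complement.
    """
    motif = motif.upper()
    forward = min(rotations(motif) + rotations(reverse_complement(motif)))
    reverse = min(rotations(reverse_complement(forward)))
    return f"{forward}/{reverse}"


# Motifs as they might appear in individual sequences.
for observed in ["TC", "GA", "TGC", "CAT", "T"]:
    print(observed, "->", motif_class_label(observed))
```

Under this rule TC and GA both fall into AG/CT, TGC into AGC/CTG, CAT into ATC/ATG, and T into A/T, matching the row labels in the table.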
Figure 2. GO classification of SSRs in coding regions. GO indicates gene ontology; SSRs, simple sequence repeats.
Results of primer screening across 40 diverse accessions of Halenia elliptica.
| Locus | Repeat | Forward primer (5′-3′) | Reverse primer (5′-3′) | Ta, °C | Size, bp | Na | Ne | Ho | He | PIC |
|---|---|---|---|---|---|---|---|---|---|---|
| FR11 | (AAAGAA)11 | TCCAGTTTGTTTTCTTGGGC | AATTGAAGCGTGGAAATTGG | 56 | 496–538 | 6 | 1.837 | 0.100 | 0.383 | 0.7263 |
| FR166 | (CTCTTC)8 | ATGAAGGTTGAGCTTGGTGG | AGGTGTGGTTGGACTTGGAC | 54 | 219–261 | 6 | 1.821 | 0.200 | 0.378 | 0.7852 |
| FR197 | (ATT)14 | TTGCCTCATTCCTCTCTCGT | GGGTGTTCTCCCTTCTTTTT | 49.3 | 203–221 | 3 | 1.027 | 0.025 | 0.023 | 0.2019 |
| FR200 | (ATCC)6 | TACTTCCCGAAATACCC | ACCTCCATTCTTGATAG | 43.5 | 179–191 | 3 | 1.233 | 0.000 | 0.140 | 0.1736 |
| FR202 | (CATA)6 | CCTTCTTTTTTTTCTTC | ATCCTCTGGAGCGTTAT | 47 | 411–459 | 10 | 2.231 | 0.175 | 0.473 | 0.7099 |
| FR248 | (TGC)10 | TTGGTTGATGACTCG | CAATGACTGGGGCTA | 49.3 | 375–387 | 5 | 1.208 | 0.025 | 0.133 | 0.4980 |
| FR265 | (GGAA)6 | AAAGTGTCCATCAAATCA | CCGTTCAGTTCACAATCC | 48.5 | 328–336 | 3 | 1.086 | 0.025 | 0.063 | 0.0714 |
| FR268 | (AAAT)6 | AGAAACAGAGAGACGAGG | ATAAGATGGGTAAGAGGC | 57.8 | 368–404 | 6 | 2.554 | 0.400 | 0.530 | 0.7398 |
| FR283 | (TC)12(TA)10 | CCCAAATGCCATAGTG | AGGAAGGGAAAACAGA | 52 | 125–131 | 4 | 1.686 | 0.400 | 0.343 | 0.5638 |
| FR288 | (CGG)7 | TAACGAATGAAGACACG | ATCAGGAAGACTATGCT | 52 | 184–190 | 3 | 1.208 | 0.025 | 0.133 | 0.3764 |
| FR295 | (ACC)9 | ACATTCTCGTAAGTAT | CAACCAGTATTCGGCT | 52 | 285–297 | 5 | 1.204 | 0.025 | 0.143 | 0.4234 |
| FR299 | (AT)13 | GTAACAAGAAGAGAAGG | GATAAATGGGAAGTAGA | 52 | 175–191 | 5 | 1.567 | 0.150 | 0.330 | 0.5498 |
Abbreviations: He, expected heterozygosity; Ho, observed heterozygosity; Na, number of alleles; Ne, effective number of alleles; PIC, polymorphism information content; size, size of cloned allele; Ta, annealing temperature.
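
The per-locus statistics abbreviated above have standard textbook definitions in terms of allele frequencies, sketched below in Python. This is a minimal illustration under stated assumptions: the function name `diversity_stats` and the example frequencies are hypothetical, He is computed as uncorrected gene diversity, and PIC follows the usual Botstein et al. formula. The source does not say which software, sample-size corrections, or per-population averaging produced the table, so the values there need not satisfy these simple relations exactly.

```python
from itertools import combinations


def diversity_stats(freqs):
    """Compute Na, Ne, He, and PIC from allele frequencies at one locus.

    freqs: iterable of allele frequencies that sum to 1.
    Na  = number of alleles with nonzero frequency
    Ne  = 1 / sum(p_i^2)                      (effective number of alleles)
    He  = 1 - sum(p_i^2)                      (expected heterozygosity)
    PIC = He - sum over i<j of 2 p_i^2 p_j^2  (polymorphism information content)
    """
    freqs = [p for p in freqs if p > 0]
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    he = 1.0 - homozygosity
    pic = he - sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return {"Na": len(freqs), "Ne": 1.0 / homozygosity, "He": he, "PIC": pic}


# Hypothetical locus with two equally frequent alleles.
print(diversity_stats([0.5, 0.5]))
```

Observed heterozygosity (Ho) is not derived from allele frequencies; it is the fraction of genotyped individuals that are heterozygous at the locus.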
Figure 3. UPGMA dendrogram constructed among 40 individuals from 8 populations based on 12 SSR markers developed in this study. SSR indicates simple sequence repeats.
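
UPGMA (unweighted pair group method with arithmetic mean) builds a dendrogram by repeatedly merging the two closest clusters and averaging their distances to every other cluster, weighted by cluster size. The Python sketch below shows the procedure on a toy distance matrix. The function name `upgma`, the labels, and the distances are hypothetical; the source does not state which genetic distance measure or software was used for Figure 3.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Cluster items by UPGMA.

    labels: list of item names.
    matrix: symmetric list-of-lists of pairwise distances.
    Returns (tree, merges): tree is a nested tuple of labels, and merges is a
    list of (subtree, height) in merge order, where height is half the
    distance between the two merged clusters.
    """
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    dist = {frozenset(p): matrix[p[0]][p[1]] for p in combinations(range(n), 2)}
    next_id = n
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda p: dist[p])
        a, b = sorted(pair)
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted mean of the two old distances.
        for c in clusters:
            dist[frozenset((next_id, c))] = (
                size_a * dist[frozenset((a, c))]
                + size_b * dist[frozenset((b, c))]
            ) / (size_a + size_b)
        subtree = (tree_a, tree_b)
        clusters[next_id] = (subtree, size_a + size_b)
        merges.append((subtree, dist[pair] / 2))
        next_id += 1
    tree, _ = next(iter(clusters.values()))
    return tree, merges


# Toy example: A and B are close to each other, C is distant from both.
toy_labels = ["A", "B", "C"]
toy_matrix = [[0, 2, 6],
              [2, 0, 6],
              [6, 6, 0]]
tree, merges = upgma(toy_labels, toy_matrix)
for subtree, height in merges:
    print(subtree, height)
```

In the toy run A and B merge first at height 1.0, and C joins them at height 3.0.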