Lei Wang, Zhengkun Wang, Jianbing Chen, Chunyan Liu, Wanlong Zhu, Liuyang Wang, Lihua Meng.
Abstract
Veratrilla baillonii Franch is an important Chinese medicinal herb for treating liver-related diseases that has been over-collected in recent decades. However, effective conservation and related population genetic studies have been hindered by the lack of genome sequences and genetic markers for natural populations. We conducted RNA-seq on V. baillonii and performed de novo assembly of these data to characterize its transcriptome, resulting in 133,019 contigs >200 bp in length. These contigs were annotated using the NCBI nonredundant database and Gene Ontology (GO) terms. From these contigs, we developed novel microsatellite simple sequence repeat (SSR) markers, identifying a total of 40,885 SSRs. SSRs with repeat motifs of 1-4 bp (mono-, di-, tri-, and tetranucleotides) accounted for 99.8% of all SSRs, with mononucleotide repeats most common, followed by dinucleotide (16.2%) and trinucleotide repeats (14.7%). We selected 151 SSRs for experimental validation, of which 74 were confirmed by polymerase chain reaction. Fourteen SSRs were determined to be polymorphic by screening 40 individuals from six distant populations. The number of alleles per locus ranged from two to four, and the expected heterozygosity varied from 0.2637 to 0.8571, suggesting that these SSR markers are highly polymorphic and effective for further genetic analysis of natural populations. In addition, we explored the genetic structure of V. baillonii using five SSRs in four geographic populations and found that the identified genotypes clustered into two phylogenetic clades: the Mekong River clade and the Jinsha River clade. This result indicates that these two regions may harbor highly divergent genetic lineages and enriched genetic diversity. The de novo transcriptome sequences and new SSR markers developed in this study provide an initial step toward understanding the population genetics of V. baillonii and a valuable resource for effective conservation management.
Keywords: Illumina RNA-Seq; Veratrilla baillonii; microsatellite (SSR) markers; transcriptome
Year: 2015 PMID: 26244005 PMCID: PMC4498661 DOI: 10.4137/EBO.S20942
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
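The SSR identification step summarized in the abstract can be illustrated with a small scanner for perfect repeats. This is only a minimal sketch: the record does not name the tool or the repeat-count thresholds that were used, so the `MIN_REPEATS` values, the function names, and the test sequence below are all assumptions made for illustration.

```python
import re

# ASSUMED minimum repeat counts per motif length (1-6 bp).
# The source does not state the thresholds actually used.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    size = len(motif)
    return all(
        motif != motif[:d] * (size // d)
        for d in range(1, size)
        if size % d == 0
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Hypothetical contig fragment containing a (TA)9 and an (AAG)6 repeat.
example = "GGC" + "TA" * 9 + "CCGT" + "AAG" * 6 + "C"
ssrs = find_ssrs(example)
```

The primitive-motif check keeps a run such as (AT)11 from also being reported as a tetranucleotide (ATAT) repeat. Compound and imperfect repeats are outside the scope of this sketch.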
Summary of assembly and annotation results for V. baillonii.
| PARAMETER | VALUE |
|---|---|
| Total number of high-quality reads | 28483317 |
| Total number of contigs | 133019 |
| Total size of contigs (bp) | 168009542 |
| Mean length of contigs (bp) | 1263 |
| N50 value of contigs (bp) | 2104 |
| Length range of contigs (bp) | 201–13830 |
| GC content | 40.6% |
| Total number of identified SSRs | 40885 |
| SSRs containing sequences with BLASTx hit | 28912 (70.7%) |
| SSRs containing sequences with annotation | 11148 (27.3%) |
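For readers unfamiliar with the N50 statistic reported in the table above, the following sketch shows how it is conventionally computed from a list of contig lengths. The function name and the six example lengths are hypothetical; they are not the assembly's actual contigs.

```python
def n50(lengths):
    """Largest length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths (bp), chosen only to illustrate the calculation.
toy_lengths = [201, 500, 800, 1500, 2104, 3000]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

Because N50 weights contigs by the bases they contain, it is typically larger than the mean contig length, as is also the case for the values in the table (2104 bp versus 1263 bp).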
Counts of various SSR types with different repeat motifs in V. baillonii.
| REPEATS | COUNTS |
|---|---|
| A/T | 26360 |
| C/G | 776 |
| AC/GT | 1631 |
| AG/CT | 1739 |
| AT/AT | 3245 |
| CG/CG | 5 |
| AAC/GTT | 215 |
| AAG/CTT | 1828 |
| AAT/ATT | 572 |
| ACC/GGT | 612 |
| ACG/CGT | 63 |
| ACT/AGT | 197 |
| AGC/CTG | 1036 |
| AGG/CCT | 768 |
| ATC/ATG | 594 |
| CCG/CGG | 125 |
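The class percentages quoted in the abstract can be checked against the counts in the table above. The sketch below sums the counts by motif length and divides by the reported total of 40,885 SSRs; the variable and function names are illustrative only.

```python
# Counts copied from the table above (mono-, di-, and trinucleotide motifs only).
SSR_COUNTS = {
    "A/T": 26360, "C/G": 776,
    "AC/GT": 1631, "AG/CT": 1739, "AT/AT": 3245, "CG/CG": 5,
    "AAC/GTT": 215, "AAG/CTT": 1828, "AAT/ATT": 572, "ACC/GGT": 612,
    "ACG/CGT": 63, "ACT/AGT": 197, "AGC/CTG": 1036, "AGG/CCT": 768,
    "ATC/ATG": 594, "CCG/CGG": 125,
}
TOTAL_SSRS = 40885  # total number of identified SSRs reported in the summary table


def class_totals(counts):
    """Sum SSR counts by motif length (1 = mono-, 2 = di-, 3 = trinucleotide)."""
    totals = {}
    for label, n in counts.items():
        unit_len = len(label.split("/")[0])
        totals[unit_len] = totals.get(unit_len, 0) + n
    return totals


totals = class_totals(SSR_COUNTS)
shares = {k: round(100 * v / TOTAL_SSRS, 1) for k, v in totals.items()}
# shares -> {1: 66.4, 2: 16.2, 3: 14.7}
```

The dinucleotide and trinucleotide shares reproduce the 16.2% and 14.7% given in the abstract, and mononucleotide repeats make up roughly two thirds of all SSRs.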
Figure 1. Length distribution of all 133,019 V. baillonii transcriptome contigs >200 bp.
Figure 2. GO classification of SSRs in coding regions. The x-axis shows the three functional classes. The y-axis indicates the percentage (left) and number (right) of SSR-containing genes in each functional class.
Fourteen SSR primers, allele sizes, and summary statistics across four populations in V. baillonii.
| LOCUS | REPEAT | FORWARD PRIMER (5′–3′) | REVERSE PRIMER (5′–3′) | Ta (°C) | SIZE (BP) | Na | He | Ho |
|---|---|---|---|---|---|---|---|---|
| HQJ19 | (TA)6 | TTTGCTTACCGTTTGTCC | AATGCTTCCAGCCTATCC | 52 | 187–192 | 2 | 0.3310 | 0.0000 |
| HQJ34 | (TGG)6 | CGTTACGGTCTTTCCTTG | AATACCTCACTCCTCCACAT | 58 | 191–197 | 4 | 0.6095 | 0.1111 |
| HQJ35 | (TAAAA)4 | CCGAACAAACAACTCATT | TCCTGTATTCACCCTCCT | 54 | 163–178 | 2 | 0.3692 | 0.0000 |
| HQJ37 | (AAGA)5 | GCTCGTTTCGTTTGTTTC | GTCGGTTATGAGATTCCATC | 58 | 105–129 | 4 | 0.6841 | 0.1667 |
| HQJ40 | (GAT)7 | AGCGTCTATTGGGCAGTG | AAAAGCAGAGTGAAGAAACATC | 58 | 52–149 | 4 | 0.5270 | 0.3333 |
| HQJ45 | (AT)6 | CAGCCTCACGCTCAACAA | CGACGGCCTACCATCTTT | 55 | 195–199 | 3 | 0.7750 | 0.3750 |
| HQJ56 | (TA)9 | CTAAAAATGATGAACTCCCGAAAAA | ACTGAGCAGCACAGCACAAC | 58 | 98–106 | 4 | 0.7033 | 0.8751 |
| HQJ63 | (TA)5 | ACGGAGGACATCACGAGC | TGGCAGGGCAAACCATAT | 52 | 116–118 | 2 | 0.2637 | 0.0000 |
| HQJ79 | (AGA)6 | CAGCTTGCGAGGATACGG | CTTCCCAAACTGCGAGGC | 58 | 157–162 | 2 | 0.6667 | 0.0000 |
| HQJ99 | (GCC)6 | GAGCAATCAGGAGGAGGG | GGGAAATGAACAGCGACTT | 60 | 156–168 | 2 | 0.3556 | 0.0000 |
| HQJ103 | (ACTC)5 | TGACTCCTTGACTGACCCTC | TGCAGCAGCTTGCTTTAT | 58 | 122–133 | 3 | 0.5333 | 0.0000 |
| HQJ115 | (AT)10 | GTTCTGTTGCTACCTGTG | TTGTCTCATTTTGCTTTC | 56 | 200–209 | 2 | 0.8571 | 1.0000 |
| HQJ134 | (TA)9 | TCCTCCTCCTTTATCACA | GTGCAGTATTAAGCGTTG | 57 | 365–377 | 3 | 0.5455 | 1.0000 |
| HQJ137 | (AC)9 | TTTCACGCTCATCTTTTA | CCTTTTGGCAGTCATTAT | 52 | 231–233 | 2 | 0.5455 | 1.0000 |
Abbreviations: Size, size of cloned allele; Ta, annealing temperature; Na, number of alleles; He, expected heterozygosity; Ho, observed heterozygosity.
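The Na, He, and Ho columns can be derived from diploid genotype calls at a locus. The sketch below assumes Nei's unbiased estimator for expected heterozygosity; the source does not say which estimator was used, and the genotypes shown are invented for illustration (only the allele sizes 231 and 233 are borrowed from the size range listed for HQJ137).

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Na, He, Ho) for a list of diploid genotypes given as (allele, allele) tuples.

    He uses Nei's unbiased estimator, 2n/(2n-1) * (1 - sum(p_i^2)),
    which is an ASSUMPTION about the method, not stated in the source.
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes given")
    allele_counts = Counter(a for g in genotypes for a in g)
    gene_copies = 2 * n
    sum_p2 = sum((c / gene_copies) ** 2 for c in allele_counts.values())
    he = gene_copies / (gene_copies - 1) * (1 - sum_p2)
    ho = sum(1 for a, b in genotypes if a != b) / n
    return len(allele_counts), he, ho


# Hypothetical sample: six individuals, all heterozygous for alleles 231 and 233.
toy_genotypes = [(231, 233)] * 6
na, he, ho = locus_stats(toy_genotypes)
# na == 2, round(he, 4) == 0.5455, ho == 1.0
```

A locus where Ho is far below He, as for several rows in the table, indicates fewer heterozygotes than expected from the allele frequencies, while Ho above He indicates an excess.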
Figure 3. UPGMA dendrogram constructed from eight genotypes from four representative populations and five SSR markers developed in this study. Two clusters were identified, generally corresponding to two geographic locations: the Mekong River and the Jinsha River (YN, Yunnan Province; SC, Sichuan Province).
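UPGMA, the clustering method behind the dendrogram in Figure 3, repeatedly merges the two closest clusters, where the distance between clusters is the arithmetic mean of all pairwise distances between their members. The sketch below is a minimal pure-Python version under stated assumptions: the sample labels and the distance matrix are hypothetical and are not the genotype data from the study.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Average-linkage (UPGMA) clustering.

    labels: list of sample names.
    matrix: symmetric list of lists of pairwise distances.
    Returns the merges in order as (members_a, members_b, distance).
    """
    # Each active cluster is a tuple of indices into labels.
    clusters = [(i,) for i in range(len(labels))]
    merges = []

    def avg_dist(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        return sum(matrix[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg_dist(*pair))
        merges.append(
            ([labels[i] for i in c1], [labels[i] for i in c2], avg_dist(c1, c2))
        )
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical samples and distances, for illustration only.
labels = ["Mekong_1", "Mekong_2", "Jinsha_1", "Jinsha_2"]
matrix = [
    [0.0, 0.2, 0.8, 0.9],
    [0.2, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.1],
    [0.9, 0.8, 0.1, 0.0],
]
merges = upgma(labels, matrix)
```

With these toy distances the two Jinsha samples merge first, then the two Mekong samples, and the two pairs join last at a much larger distance, which is the two-clade pattern a dendrogram like Figure 3 displays.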