Xiao-Jun Zhou, Yue-Yue Wang, Ya-Nan Xu, Rong-Shan Yan, Peng Zhao, Wen-Zhe Liu.
Abstract
Tapiscia sinensis Oliv. (Tapisciaceae) is an endangered species native to China and is known for its androdioecious breeding system. However, genomic and transcriptome data for this species are lacking. In this study, transcriptomes from T. sinensis flower buds of the two sex types (male and hermaphrodite) were sequenced. A total of 97,431,176 clean reads were assembled into 52,169 unigenes with an average length of 1116 bp. Through similarity comparison with known protein databases, 36,662 unigenes (70.27%) were annotated. A total of 10,002 unigenes (19.17%) were assigned to 124 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Additionally, 10,371 simple sequence repeats (SSRs) were identified in 8608 unigenes, and 16,317 primer pairs were designed for them. Of these, 150 primer pairs were chosen for further validation, and 68 pairs (45.5%) produced clear polymorphic bands. Six polymorphic SSR markers were used for Bayesian clustering analysis of 51 T. sinensis individuals. This is the first report to provide transcriptome information and to develop large-scale SSR molecular markers for T. sinensis. The study provides a valuable resource for future conservation genetics and functional genomics research on T. sinensis.
Keywords: EST-SSRs (Expressed sequence tag simple sequence repeats); Illumina sequencing; Tapiscia sinensis; transcriptome
Year: 2015 PMID: 26057749 PMCID: PMC4490475 DOI: 10.3390/ijms160612855
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Length distributions of the unigenes in the assembled transcriptome.
Figure 2. Unigenes showing homology with sequences in the Nr, Swiss-Prot, COG (Clusters of Orthologous Groups of proteins) and KEGG (Kyoto Encyclopedia of Genes and Genomes) databases.
Figure 3. Gene Ontology assignments of the unigenes.
Figure 4. COG classifications of the unigenes.
Figure 5. Male and hermaphrodite flowers of T. sinensis. (A,C) Flowers from a male individual; (B,D) bisexual flowers from a hermaphrodite individual. The anatomical images (C,D) show how the pistils differ between male and hermaphrodite flowers: (C) shows a degenerated pistil and (D) a well-developed pistil.
Summary of the EST-SSRs identified in the transcriptome sequences.
| Item | Number |
|---|---|
| Total number of sequences examined | 52,169 |
| Total size of examined sequences (bp) | 58,253,075 |
| Total number of identified SSRs | 10,371 |
| Number of SSR containing sequences | 8608 |
| Number of sequences containing more than 1 SSR | 1436 |
| Number of SSRs present in compound formation | 733 |
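The counts in this summary table imply a few derived figures that the record does not state directly, such as the share of unigenes carrying at least one SSR (about 16.5%) and the average spacing between SSRs (roughly one per 5.6 kb). The short sketch below reproduces that arithmetic from the table values only; the dictionary keys and the function name are illustrative choices, not names from the source.

```python
# Counts copied from the EST-SSR summary table above.
SUMMARY = {
    "sequences": 52_169,           # total unigenes examined
    "total_bp": 58_253_075,        # total size of examined sequences
    "ssrs": 10_371,                # total SSRs identified
    "ssr_sequences": 8_608,        # unigenes containing at least one SSR
    "multi_ssr_sequences": 1_436,  # unigenes containing more than one SSR
    "compound_ssrs": 733,          # SSRs present in compound formation
}


def derived_stats(s):
    """Return simple ratios derived from the summary counts."""
    return {
        "pct_sequences_with_ssr": 100 * s["ssr_sequences"] / s["sequences"],
        "kb_per_ssr": s["total_bp"] / s["ssrs"] / 1000,
        "mean_sequence_length_bp": s["total_bp"] / s["sequences"],
        "pct_ssr_sequences_with_multiple": 100 * s["multi_ssr_sequences"] / s["ssr_sequences"],
        "pct_ssrs_compound": 100 * s["compound_ssrs"] / s["ssrs"],
    }


stats = derived_stats(SUMMARY)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The mean sequence length computed this way (total size divided by number of sequences) agrees with the 1116 bp average unigene length reported in the abstract.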
The distribution of the identified EST-SSRs in sequences using the MISA (MIcroSAtellite identification) software.
| Repeat number | Di- | Tri- | Tetra- | Penta- | Hexa- | Total | Repeat number (%) |
|---|---|---|---|---|---|---|---|
| 4 | 0 | 0 | 368 | 158 | 266 | 792 | 7.64 |
| 5 | 0 | 1319 | 101 | 15 | 12 | 1447 | 13.95 |
| 6 | 2026 | 703 | 52 | 5 | 11 | 2797 | 26.97 |
| 7 | 1416 | 411 | 0 | 0 | 4 | 1831 | 17.66 |
| 8 | 1164 | 49 | 1 | 1 | 2 | 1217 | 11.73 |
| 9 | 1256 | 0 | 0 | 0 | 0 | 1256 | 12.11 |
| 10 | 823 | 1 | 0 | 0 | 0 | 824 | 7.95 |
| 11 | 195 | 1 | 0 | 0 | 0 | 196 | 1.89 |
| >11 | 8 | 2 | 0 | 0 | 1 | 11 | 0.11 |
| Total | 6888 | 2486 | 522 | 179 | 296 | 10,371 | 100 |
| Motif (%) | 66.42 | 23.97 | 5.03 | 1.73 | 2.85 | | |
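To illustrate how perfect SSRs of this kind can be detected, here is a minimal regex-based sketch in Python. It is not MISA and does not reproduce its behaviour (for example, it does not detect compound SSRs). The minimum repeat counts (6 for di-, 5 for tri-, 4 for tetra-, penta- and hexanucleotides) are an assumption inferred from the smallest non-zero repeat number per motif class in the table above; the record itself does not state the search parameters. The names `MIN_REPEATS`, `is_reducible` and `find_ssrs` are hypothetical.

```python
import re

# Assumed thresholds: motif length -> minimum number of repeats.
# Inferred from the distribution table, not stated in the source.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_reducible(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT', 'AAA')."""
    n = len(motif)
    return any(
        n % k == 0 and motif == motif[:k] * (n // k)
        for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs; start is 0-based."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_reducible(motif):
                continue  # skip homopolymers and shorter motifs matched at a longer unit
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence containing a (TCA)6 repeat, the motif listed for primer TS053.
print(find_ssrs("GG" + "TCA" * 6 + "CC"))
```

Filtering out reducible motifs keeps a dinucleotide run from also being reported as a tetra- or hexanucleotide repeat.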
Figure 6. Frequency distribution of SSRs based on the motif types.
Figure 7. SSR-PCR results that produced clear bands. M is the DNA size marker pBR322 DNA/MspI. Numbers 1 to 24 represent twenty-four individuals of T. sinensis. The products of primer TS061 are single bands without polymorphism, while those of TS062 are clear polymorphic bands.
Characteristics of the six primer pairs used for the phylogenetic analysis of the 51 individuals.
| Primer | Motif | K | Ho | He | PIC | HW |
|---|---|---|---|---|---|---|
| TS053 | (TCA)6 | 9 | 0.706 | 0.835 | 0.804 | ND |
| TS126 | (TAC)7 | 6 | 0.706 | 0.768 | 0.723 | NS |
| TS103 | (AAGA)5 | 5 | 0.549 | 0.631 | 0.573 | NS |
| TS149 | (TTG)11 | 6 | 0.608 | 0.769 | 0.725 | NS |
| TS060 | (TGT)6 | 6 | 0.529 | 0.501 | 0.464 | NS |
| TS013 | (AGA)8 | 5 | 0.51 | 0.653 | 0.592 | NS |
K, number of alleles; Ho, observed heterozygosity; He, expected heterozygosity; PIC, polymorphism information content; HW, Hardy-Weinberg equilibrium test; NS, not significant; ND, not done.
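The statistics named in this footnote follow standard definitions, sketched below in Python. The record does not say which software or estimator was used, so this is only an illustration: expected heterozygosity is computed here as 1 − Σp², without any small-sample correction, and PIC uses the commonly cited Botstein et al. formula. The allele frequencies and genotypes are hypothetical, not data from the study. The observed heterozygosity values in the table are consistent with counts of heterozygotes among 51 individuals (for example, 36/51 ≈ 0.706).

```python
from itertools import combinations


def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles differ."""
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)


def expected_heterozygosity(freqs):
    """1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical locus: five equally frequent alleles.
freqs = [0.2] * 5
# Hypothetical sample: 36 heterozygotes and 15 homozygotes (51 individuals).
genotypes = [(1, 2)] * 36 + [(1, 1)] * 15

print(round(observed_heterozygosity(genotypes), 3))
print(round(expected_heterozygosity(freqs), 3))
print(round(pic(freqs), 3))
```

Because the cross term is never negative, PIC cannot exceed expected heterozygosity at the same locus, which matches the ordering of the He and PIC columns in the table.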