Lin Wei, Shenghua Li, Shenggui Liu, Anna He, Dan Wang, Jie Wang, Yulian Tang, Xianjin Wu.
Abstract
BACKGROUND: Houttuynia cordata Thunb. is an important traditional medical herb in China and other Asian countries, with high medicinal and economic value. However, a lack of available genomic information has become a limitation for research on this species. Thus, we carried out high-throughput transcriptomic sequencing of H. cordata to generate an enormous transcriptome sequence dataset for gene discovery and molecular marker development.
Year: 2014 PMID: 24392108 PMCID: PMC3879290 DOI: 10.1371/journal.pone.0084105
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of sequencing output statistics.
| Samples | Total Raw Reads | Total Clean Reads | Total Clean Nucleotides* | Q20% | N% | GC% |
| H. cordata | 56,668,324 | 51,973,070 | 4,677,576,300 | 97.83% | 0.00% | 50.62% |
*Total Clean Nucleotides = Total Clean Reads1 × Read1 size + Total Clean Reads2 × Read2 size.
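The footnote's formula can be checked against the table as a quick sanity test. The sketch below reads the nucleotide total as 4,677,576,300 and derives the mean read length from it. The even split of clean reads between the two mates and the helper name `total_clean_nucleotides` are illustrative assumptions, not details stated in the source.

```python
def total_clean_nucleotides(reads1: int, read1_size: int,
                            reads2: int, read2_size: int) -> int:
    """Apply the footnote formula: reads1 * size1 + reads2 * size2."""
    return reads1 * read1_size + reads2 * read2_size


# Values taken from the sequencing summary table.
TOTAL_CLEAN_READS = 51_973_070
TOTAL_CLEAN_NT = 4_677_576_300

# Mean read length implied by the two table values.
mean_read_length = TOTAL_CLEAN_NT / TOTAL_CLEAN_READS

# Assumption: clean reads are split evenly between read 1 and read 2,
# and both mates share the implied read length.
reads1 = TOTAL_CLEAN_READS // 2
reads2 = TOTAL_CLEAN_READS - reads1
recomputed = total_clean_nucleotides(
    reads1, int(mean_read_length), reads2, int(mean_read_length)
)

print(f"implied mean read length: {mean_read_length:.1f} nt")
print(f"recomputed total matches table: {recomputed == TOTAL_CLEAN_NT}")
```

Under these assumptions the table values are internally consistent, which is why the nucleotide total is read with standard thousands grouping.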
Figure 1. Length distribution of H. cordata Unigenes.
Figure 2. Comparison of unigene length with or without hits.
Figure 3. Characteristics of homology search of Illumina sequences against the nr database.
(A) E-value distribution of BLAST hits for each unique sequence with a cut-off E-value of 1.0E-5. (B) Similarity distribution of the top BLAST hits for each sequence. (C) Species distribution is shown as a percentage of the total homologous sequences with an E-value no greater than 1.0E-5.
Figure 4. Gene ontology classification of assembled unigenes.
Figure 5. Clusters of orthologous groups (COG) classification.
The top 20 pathways with highest sequence numbers.
| Number | Pathway | All genes with pathway annotation (24,434) | Pathway ID |
| 1 | Metabolic pathways | 6,718 (27.49%) | ko01100 |
| 2 | Biosynthesis of secondary metabolites | 2,448 (10.02%) | ko01110 |
| 3 | Endocytosis | 2,112 (8.64%) | ko04144 |
| 4 | Glycerophospholipid metabolism | 1,974 (8.08%) | ko00564 |
| 5 | Ether lipid metabolism | 1,853 (7.58%) | ko00565 |
| 6 | Plant-pathogen interaction | 1,319 (5.40%) | ko04626 |
| 7 | Plant hormone signal transduction | 1,111 (4.55%) | ko04075 |
| 8 | RNA transport | 1,058 (4.33%) | ko03013 |
| 9 | Spliceosome | 894 (3.66%) | ko03040 |
| 10 | Starch and sucrose metabolism | 869 (3.56%) | ko00500 |
| 11 | Purine metabolism | 697 (2.85%) | ko00230 |
| 12 | mRNA surveillance pathway | 665 (2.72%) | ko03015 |
| 13 | Pyrimidine metabolism | 637 (2.61%) | ko00240 |
| 14 | Protein processing in endoplasmic reticulum | 602 (2.46%) | ko04141 |
| 15 | Pentose and glucuronate interconversions | 535 (2.19%) | ko00040 |
| 16 | Ubiquitin mediated proteolysis | 503 (2.06%) | ko04120 |
| 17 | Ribosome | 462 (1.89%) | ko03010 |
| 18 | RNA polymerase | 431 (1.76%) | ko03020 |
| 19 | Ribosome biogenesis in eukaryotes | 426 (1.74%) | ko03008 |
| 20 | RNA degradation | 426 (1.74%) | ko03018 |
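The percentages in the pathway table follow directly from the gene counts and the 24,434 annotated genes. The sketch below recomputes them. The names `TOP20_COUNTS` and `share` are illustrative, and the closing remark about genes mapping to more than one pathway is an inference from the arithmetic rather than a statement from the source.

```python
# Gene counts per KEGG pathway ID, copied from the table above.
TOTAL_ANNOTATED = 24_434
TOP20_COUNTS = {
    "ko01100": 6718, "ko01110": 2448, "ko04144": 2112, "ko00564": 1974,
    "ko00565": 1853, "ko04626": 1319, "ko04075": 1111, "ko03013": 1058,
    "ko03040": 894,  "ko00500": 869,  "ko00230": 697,  "ko03015": 665,
    "ko00240": 637,  "ko04141": 602,  "ko00040": 535,  "ko04120": 503,
    "ko03010": 462,  "ko03020": 431,  "ko03008": 426,  "ko03018": 426,
}


def share(count: int, total: int = TOTAL_ANNOTATED) -> str:
    """Format a count as a percentage of all pathway-annotated genes."""
    return f"{100 * count / total:.2f}%"


shares = {ko: share(n) for ko, n in TOP20_COUNTS.items()}
top20_sum = sum(TOP20_COUNTS.values())

for ko, pct in shares.items():
    print(ko, TOP20_COUNTS[ko], pct)
print("sum of top-20 counts:", top20_sum)
```

The top-20 counts sum to more than the number of annotated genes, so the percentages are not expected to add up to 100%. That is consistent with a single gene being assigned to several pathways.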
Summary of SSR searching results.
| Searching Item | Numbers |
| Total number of sequences examined | 63,954 |
| Total size of examined sequences (bp) | 43,395,361 |
| Total number of identified SSRs | 4,800 |
| Number of SSR containing sequences | 4,413 |
| Number of sequences containing more than 1 SSR | 357 |
| Number of SSRs present in compound formation | 164 |
| Mono-nucleotide | 1,313 |
| Di-nucleotide | 1,278 |
| Tri-nucleotide | 1,994 |
| Tetra-nucleotide | 39 |
| Penta-nucleotide | 66 |
| Hexa-nucleotide | 110 |
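A few derived figures make the SSR summary easier to interpret: the average spacing between SSRs, the fraction of sequences carrying at least one SSR, and the share of each motif class. The sketch below computes them from the table values only. The variable names are illustrative and the derived numbers are not quoted from the source.

```python
# Values copied from the SSR summary table.
SEQUENCES_EXAMINED = 63_954
TOTAL_BP = 43_395_361
TOTAL_SSRS = 4_800
SSR_SEQUENCES = 4_413
MOTIF_COUNTS = {
    "mono": 1313, "di": 1278, "tri": 1994,
    "tetra": 39, "penta": 66, "hexa": 110,
}

# The six motif classes should account for every identified SSR.
assert sum(MOTIF_COUNTS.values()) == TOTAL_SSRS

bp_per_ssr = TOTAL_BP / TOTAL_SSRS                       # average spacing
ssr_seq_share = 100 * SSR_SEQUENCES / SEQUENCES_EXAMINED  # % of sequences

print(f"one SSR per {bp_per_ssr / 1000:.2f} kb on average")
print(f"{ssr_seq_share:.2f}% of examined sequences contain an SSR")
for motif, n in MOTIF_COUNTS.items():
    print(f"{motif:>5}: {n:5d}  ({100 * n / TOTAL_SSRS:.2f}%)")
```

Tri-nucleotide repeats form the largest class in this table, followed by mono- and di-nucleotide repeats.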
Length distribution of SSRs based on the number of repeat units.
| Number of repeat units | Mono- | Di- | Tri- | Tetra- | Penta- | Hexa- | Total |
| 4 | - | - | - | - | 57 | 94 | 151 |
| 5 | - | - | 1,275 | 34 | 9 | 10 | 1,328 |
| 6 | - | 446 | 481 | 5 | 0 | 2 | 934 |
| 7 | - | 284 | 208 | 0 | | | |
| 9 | - | 145 | 3 | 0 | 0 | 0 | 148 |
| 10 | - | 119 | 0 | 0 | 0 | 0 | 119 |
| 11 | - | 78 | 2 | 0 | 0 | 0 | 80 |
| 12 | 604 | 4 | 0 | 0 | 0 | 0 | 608 |
| 13 | 305 | 0 | 1 | 0 | 0 | 0 | 306 |
| 14 | 169 | 0 | 0 | 0 | 0 | 0 | 169 |
| 15 | 108 | 0 | 0 | 0 | 0 | 0 | 108 |
| 16 | 38 | 0 | 1 | 0 | 0 | 1 | 40 |
| 17 | 21 | 0 | 0 | 0 | 0 | 0 | 21 |
| 18 | 3 | 0 | 1 | 0 | 0 | 0 | 4 |
| 19 | 7 | 0 | 0 | 0 | 0 | 0 | 7 |
| ≥20 | 58 | 0 | 1 | 0 | 0 | 0 | 59 |
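The rows for 7 and 8 repeat units are incomplete in the table as reproduced here, so a column-sum check against the per-motif totals in the SSR summary table shows how much is unaccounted for. The sketch below treats `-` cells as 0 and marks missing cells as `None`. The residuals it prints are only what the arithmetic implies if the two tables are mutually consistent. They are not values reported in the source.

```python
# Columns: mono, di, tri, tetra, penta, hexa.
# "-" cells are entered as 0; None marks a cell missing from the table.
ROWS = {
    "4":   (0, 0, 0, 0, 57, 94),
    "5":   (0, 0, 1275, 34, 9, 10),
    "6":   (0, 446, 481, 5, 0, 2),
    "7":   (0, 284, 208, 0, None, None),  # row is truncated
    # the row for 8 repeat units is absent
    "9":   (0, 145, 3, 0, 0, 0),
    "10":  (0, 119, 0, 0, 0, 0),
    "11":  (0, 78, 2, 0, 0, 0),
    "12":  (604, 4, 0, 0, 0, 0),
    "13":  (305, 0, 1, 0, 0, 0),
    "14":  (169, 0, 0, 0, 0, 0),
    "15":  (108, 0, 0, 0, 0, 0),
    "16":  (38, 0, 1, 0, 0, 1),
    "17":  (21, 0, 0, 0, 0, 0),
    "18":  (3, 0, 1, 0, 0, 0),
    "19":  (7, 0, 0, 0, 0, 0),
    ">=20": (58, 0, 1, 0, 0, 0),
}
MOTIFS = ("mono", "di", "tri", "tetra", "penta", "hexa")
# Per-motif totals from the SSR summary table.
SUMMARY_TOTALS = (1313, 1278, 1994, 39, 66, 110)


def visible_column_sums(rows):
    """Sum each motif column, skipping cells that are missing (None)."""
    sums = [0] * len(MOTIFS)
    for cells in rows.values():
        for i, value in enumerate(cells):
            if value is not None:
                sums[i] += value
    return sums


residual = [t - s for t, s in zip(SUMMARY_TOTALS, visible_column_sums(ROWS))]
for motif, r in zip(MOTIFS, residual):
    print(f"{motif:>5}: {r} SSRs not covered by the visible cells")
```

A residual of zero means the visible cells already account for that motif class. A non-zero residual belongs to the incomplete rows and cannot be split between them from the table alone.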
Figure 6. Frequency distribution of SSRs based on motif sequence types.
Characterization of 43 SSRs in H. cordata.
| Primer | SSRs | Forward primer (5′-3′) | Reverse primer (5′-3′) | No. of alleles | Observed heterozygosity (Ho) | Expected heterozygosity (He) | Polymorphism information content (PIC) |
| HM_1 | (TGA)5 | | | 3 | 1.00 | 0.59 | 0.49 |
| HM_2 | (AGA)6 | | | 7 | 1.00 | 0.86 | 0.84 |
| HM_3 | (AC)7 | | | 6 | 0.89 | 0.93 | 0.93 |
| HM_4 | (GGA)5 | | | 4 | 1.00 | 0.71 | 0.63 |
| HM_5 | (AG)10 | | | 6 | 1.00 | 0.84 | 0.79 |
| HM_6 | (CTC)5 | | | 8 | 0.60 | 0.79 | 0.68 |
| HM_7 | (GCT)5 | | | 5 | 0.91 | 0.90 | 0.89 |
| HM_8 | (TCC)5 | | | 4 | 1.00 | 0.73 | 0.66 |
| HM_9 | (TGA)6 | | | 7 | 1.00 | 0.79 | 0.75 |
| HM_10 | (AG)6 | | | 8 | 0.67 | 0.80 | 0.72 |
| HM_11 | (TGG)5 | | | 10 | 0.65 | 0.82 | 0.73 |
| HM_12 | (TTC)6 | | | 4 | 1.00 | 0.76 | 0.70 |
| HM_13 | (TGA)7 | | | 4 | 1.00 | 0.75 | 0.68 |
| HM_14 | (TA)7 | | | 5 | 0.80 | 0.69 | 0.65 |
| HM_15 | (CT)6 | | | 8 | 1.00 | 0.89 | 0.84 |
| HM_16 | (GCG)6 | | | 7 | 0.63 | 0.82 | 0.65 |
| HM_17 | (TTC)5 | | | 6 | 1.00 | 0.85 | 0.80 |
| HM_18 | (ACT)5 | | | 6 | 0.54 | 0.65 | 0.63 |
| HM_19 | (CCA)5 | | | 4 | 0.58 | 0.96 | 0.96 |
| HM_20 | (TCA)5 | | | 7 | 0.88 | 0.79 | 0.78 |
| HM_21 | (GA)8 | | | 8 | 0.64 | 0.83 | 0.76 |
| HM_22 | (TTC)5 | | | 4 | 0.50 | 0.67 | 0.62 |
| HM_23 | (TCT)6 | | | 4 | 1.00 | 0.68 | 0.60 |
| HM_24 | (GAA)5 | | | 4 | 1.00 | 0.65 | 0.57 |
| HM_25 | (GCG)6 | | | 7 | 0.40 | 0.61 | 0.52 |
| HM_26 | (CT)7 | | | 8 | 0.67 | 0.76 | 0.65 |
| HM_27 | (GA)8 | | | 5 | 0.60 | 0.80 | 0.73 |
| HM_28 | (GGA)5 | | | 7 | 0.59 | 0.80 | 0.71 |
| HM_29 | (AT)8 | | | 4 | 1.00 | 0.73 | 0.66 |
| HM_30 | (GCT)5 | | | 4 | 1.00 | 0.75 | 0.68 |
| HM_31 | (GCCCCA)4 | | | 5 | 0.75 | 0.63 | 0.59 |
| HM_32 | (TGT)7 | | | 8 | 1.00 | 0.87 | 0.82 |
| HM_33 | (TCC)5 | | | 6 | 0.68 | 0.78 | 0.67 |
| HM_34 | (GAA)7 | | | 5 | 0.52 | 0.71 | 0.64 |
| HM_35 | (GGCGAT)7 | | | 3 | 1.00 | 0.63 | 0.54 |
| HM_36 | (TC)9 | | | 3 | 0.86 | 0.91 | 0.90 |
| HM_37 | (AGA)5 | | | 5 | 0.78 | 0.80 | 0.77 |
| HM_38 | (GA)7 | | | 3 | 0.91 | 0.90 | 0.89 |
| HM_39 | (TC)6 | | | 7 | 1.00 | 0.85 | 0.80 |
| HM_40 | (GA)9 | | | 5 | 0.80 | 0.69 | 0.65 |
| HM_41 | (ACC)6 | | | 6 | 1.00 | 0.79 | 0.75 |
| HM_42 | (TAT)5 | | | 9 | 0.90 | 0.89 | 0.83 |
| HM_43 | (CAC)7 | | | 8 | 1.00 | 0.86 | 0.81 |
| Mean | | | | 5.74 | 0.83 | 0.78 | 0.72 |
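For readers unfamiliar with the diversity statistics in this table, the sketch below shows how expected heterozygosity and polymorphism information content are commonly computed from allele frequencies, and rechecks the mean number of alleles from the per-locus counts. The formulas are the standard textbook definitions and are an assumption about what the authors used. The allele frequencies in the example are hypothetical, since the table does not report them.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), the standard (uncorrected) definition."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies for a three-allele locus (not from the source).
example_freqs = [0.5, 0.3, 0.2]
he = expected_heterozygosity(example_freqs)
pic = polymorphism_information_content(example_freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")

# Number of alleles per locus, HM_1 to HM_43, copied from the table.
ALLELES = [3, 7, 6, 4, 6, 8, 5, 4, 7, 8, 10, 4, 4, 5, 8, 7, 6, 6, 4, 7, 8, 4,
           4, 4, 7, 8, 5, 7, 4, 4, 5, 8, 6, 5, 3, 3, 5, 3, 7, 5, 6, 9, 8]
mean_alleles = sum(ALLELES) / len(ALLELES)
print(f"mean alleles per locus: {mean_alleles:.2f}")
```

Under these definitions PIC can never exceed He for the same allele frequencies, because it subtracts an additional non-negative term.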
Figure 7. The number of chromosomes and karyotype of H. cordata.