Xiu Huang1, Hai-Dong Yan1, Xin-Quan Zhang1, Jian Zhang2, Taylor P Frazier3, De-Jun Huang2, Lu Lu4, Lin-Kai Huang1, Wei Liu1, Yan Peng1, Xiao Ma1, Yan-Hong Yan1.
Abstract
Hemarthria R. Br. is an important genus of perennial forage grasses that is widely used in subtropical and tropical regions. Hemarthria grasses have made remarkable contributions to the development of animal husbandry and agro-ecosystem maintenance; however, comprehensive genomic data are currently lacking for these species. In this study, we used Illumina high-throughput deep sequencing to characterize two agriculturally important Hemarthria materials, H. compressa "Yaan" and H. altissima "1110." Sequencing runs that used each of four normalized RNA samples from the leaves or roots of the two materials yielded more than 24 million high-quality reads. After de novo assembly, 137,142 and 77,150 unigenes were obtained for "Yaan" and "1110," respectively. In addition, a total of 86,731 "Yaan" and 48,645 "1110" unigenes were successfully annotated. After consolidating the unigenes for both materials, 42,646 high-quality SNPs were identified in 10,880 unigenes and 10,888 SSRs were identified in 8330 unigenes. To validate the identified markers, high-quality PCR primers were designed for both SNPs and SSRs. We randomly tested 16 of the SNP primers and 54 of the SSR primers and found that the majority of these primers successfully amplified the desired PCR product. In addition, high cross-species transferability (61.11-87.04%) of SSR markers was achieved for four other Poaceae species. The RNA sequencing data generated for these two Hemarthria species greatly increase the genomic information available for Hemarthria, and the SSR and SNP markers identified in this study will facilitate further advancements in genetic and molecular studies of the Hemarthria genus.
Keywords: Hemarthria R. Br.; RNA-Seq; de novo assembly; marker development; transcriptome
Year: 2016 PMID: 27148320 PMCID: PMC4834353 DOI: 10.3389/fpls.2016.00496
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
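Summary of the transcripts for the two Hemarthria materials.

| Length range (nt) | H. compressa "Yaan" | H. altissima "1110" |
| --- | --- | --- |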
| 200–300 | 76,195 (28.22%) | 38,384 (22.51%) |
| 300–500 | 53,478 (19.81%) | 28,688 (16.82%) |
| 500–1000 | 55,885 (20.70%) | 35,220 (20.65%) |
| 1000–2000 | 55,309 (20.49%) | 42,683 (25.03%) |
| 2000+ | 29,105 (10.78%) | 25,575 (15.00%) |
| Total number | 269,972 | 170,550 |
| Total length (nt) | 241,059,806 | 180,748,233 |
| N50 length (bp) | 1504 | 1716 |
| Mean length (nt) | 892.91 | 1059.80 |
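
The N50 values in the table are given without a definition. Under the usual convention, N50 is the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal Python sketch of that convention follows; the function name `n50` and the toy lengths are illustrative assumptions, not data from the study, and the assembler used by the authors may handle edge cases differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths, not from the study).
toy_lengths = [1000, 900, 800, 700, 600, 500, 400, 300, 200]
print(n50(toy_lengths))  # 800: 1000 + 900 + 800 = 2700, half of 5400
```

Because N50 weights sequences by length, it can sit well above the mean when many short sequences are present, which is the pattern seen in both columns of the table.
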
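Summary of unigenes for the two Hemarthria materials.

| Length range (nt) | H. compressa "Yaan" | H. altissima "1110" |
| --- | --- | --- |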
| 200–300 | 62,639 (45.67%) | 30,958 (40.13%) |
| 300–400 | 23,145 (16.88%) | 11,936 (15.47%) |
| 400–500 | 11,839 (8.63%) | 6339 (8.22%) |
| 500–600 | 7198 (5.25%) | 4014 (5.20%) |
| 600–700 | 4961 (3.62%) | 2831 (3.67%) |
| 700–800 | 3679 (2.68%) | 2327 (3.02%) |
| 800–900 | 2925 (2.13%) | 1932 (2.50%) |
| 900–1000 | 2441 (1.78%) | 1661 (2.15%) |
| 1000–1100 | 1965 (1.43%) | 1435 (1.86%) |
| 1100–1200 | 1706 (1.24%) | 1333 (1.73%) |
| 1200–1300 | 1535 (1.12%) | 1198 (1.55%) |
| 1300–1400 | 1327 (0.97%) | 1066 (1.38%) |
| 1400–1500 | 1256 (0.92%) | 958 (1.24%) |
| 1500–1600 | 1071 (0.78%) | 958 (1.24%) |
| 1600–1700 | 948 (0.69%) | 880 (1.14%) |
| 1700–1800 | 940 (0.69%) | 759 (0.98%) |
| 1800–1900 | 827 (0.60%) | 723 (0.94%) |
| 1900–2000 | 735 (0.54%) | 640 (0.83%) |
| 2000–2100 | 686 (0.50%) | 582 (0.75%) |
| 2100–2200 | 560 (0.41%) | 475 (0.62%) |
| 2200–2300 | 573 (0.42%) | 477 (0.62%) |
| 2300–2400 | 464 (0.34%) | 411 (0.53%) |
| 2400–2500 | 391 (0.29%) | 404 (0.52%) |
| 2500–2600 | 387 (0.28%) | 318 (0.41%) |
| 2600–2700 | 341 (0.25%) | 294 (0.38%) |
| 2700–2800 | 270 (0.20%) | 266 (0.34%) |
| 2800–2900 | 210 (0.15%) | 234 (0.30%) |
| 2900–3000 | 228 (0.17%) | 185 (0.24%) |
| 3000–3100 | 186 (0.14%) | 147 (0.19%) |
| >3100 | 1709 (1.25%) | 1409 (1.83%) |
| Total number | 137,142 | 77,150 |
| Total length (nt) | 77,665,144 | 52,404,585 |
| N50 length (bp) | 826 | 1189 |
| Mean length (nt) | 566.31 | 679.26 |
Functional annotation of the Hemarthria unigenes.

| Database | H. compressa "Yaan" | H. altissima "1110" |
| --- | --- | --- |
| Annotated in Nt | 58,661 (42.77%) | 40,408 (52.38%) |
| Annotated in Nr | 75,478 (55.04%) | 41,262 (53.48%) |
| Annotated in Swiss-Prot | 46,906 (34.20%) | 28,513 (36.96%) |
| Annotated in Pfam | 44,285 (32.29%) | 25,190 (32.65%) |
| Annotated in GO | 50,638 (36.92%) | 30,538 (39.58%) |
| Annotated in KOG | 42,047 (30.66%) | 21,867 (28.34%) |
| Annotated in KEGG | 18,148 (13.23%) | 8434 (10.93%) |
| Annotated in all databases | 7628 (5.56%) | 4250 (5.51%) |
| Annotated in at least one database | 86,731 (63.24%) | 48,645 (63.05%) |
| Total unigenes | 137,142 (100.00%) | 77,150 (100.00%) |
Figure 1. The length distribution of CDSs mapped to known genes.
Figure 2. The length distribution of CDSs unmapped to known genes.
Figure 3. Distribution of the 17 most abundant Pfam function classifications for Hemarthria.
Figure 4. Comparative distribution of GO categories for Hemarthria.
Figure 5. Comparative distributions of KOG categories for Hemarthria.
Figure 6. KEGG classification of Hemarthria.
Summary of putative SNPs identified from the two Hemarthria materials.

| SNP type or category | Number |
| --- | --- |
| A/C | 3734 |
| A/T | 2752 |
| C/G | 5183 |
| G/T | 3636 |
| A/G | 13,630 |
| C/T | 13,711 |
| Total SNPs | 42,646 |
| Number of unigenes containing SNPs | 10,880 |
| Number of annotated unigenes containing SNPs | 10,815 |
| Number of SNPs in CDS | 32,166 |
| Number of SNPs in non-CDS | 10,480 |
| Number of SNPs in non-synonymous | 12,707 |
| Number of SNPs in synonymous | 19,459 |
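
The substitution counts above can be grouped into transitions (A/G and C/T) and transversions (the other four classes). The following sketch checks the table totals and derives a transition/transversion ratio. The ratio and the non-synonymous fraction are computed here from the table and are not figures reported in this extract; the variable names are illustrative.

```python
# SNP counts copied from the table above.
snp_counts = {
    "A/C": 3734,
    "A/T": 2752,
    "C/G": 5183,
    "G/T": 3636,
    "A/G": 13630,
    "C/T": 13711,
}

# Transitions swap purine<->purine or pyrimidine<->pyrimidine.
TRANSITION_TYPES = {"A/G", "C/T"}

transitions = sum(n for t, n in snp_counts.items() if t in TRANSITION_TYPES)
transversions = sum(n for t, n in snp_counts.items() if t not in TRANSITION_TYPES)
ts_tv_ratio = transitions / transversions

# Coding-region SNPs split into non-synonymous and synonymous changes.
non_synonymous, synonymous = 12707, 19459
non_syn_fraction = non_synonymous / (non_synonymous + synonymous)

print(transitions, transversions)        # 27341 15305
print(round(ts_tv_ratio, 2))             # 1.79
print(round(100 * non_syn_fraction, 1))  # 39.5 (% of CDS SNPs)
```

The two groups sum to the 42,646 total SNPs, and the non-synonymous and synonymous counts sum to the 32,166 SNPs located in CDS, so the table is internally consistent.
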
Summary of simple sequence repeats (SSRs) identified from the combined Hemarthria unigenes.

| Item | Number |
| --- | --- |
| Total number of sequences examined | 182,842 |
| Total number of identified SSRs | 10,888 |
| Number of sequences containing SSRs | 8330 |
| Number of known genes containing SSRs | 7701 |
| Number of sequences containing more than 1 SSR | 1936 |
| Number of SSRs present in compound formation | 575 |
| Mono-nucleotide repeats | 4777 |
| Di-nucleotide repeats | 1387 |
| Tri-nucleotide repeats | 4513 |
| Tetra-nucleotide repeats | 156 |
| Penta-nucleotide repeats | 28 |
| Hexa-nucleotide repeats | 27 |
Frequencies of different SSR repeat motif types in relation to the number of repeat units in Hemarthria.
| Di | – | 591 | 317 | 181 | 129 | 94 | 75 | 1387 | 22.70 |
| Tri | 3079 | 1096 | 300 | 33 | – | 2 | 3 | 4513 | 73.85 |
| Tetra | 123 | 23 | 4 | 3 | – | – | 3 | 156 | 2.55 |
| Penta | 22 | 3 | 3 | – | – | – | – | 28 | 0.46 |
| Hexa | 14 | 6 | 5 | 1 | 1 | – | – | 27 | 0.44 |
| Total | 3238 | 1719 | 629 | 218 | 130 | 96 | 81 | 6111 | – |
| % | 52.99 | 28.13 | 10.30 | 3.57 | 2.13 | 1.57 | 1.32 | – | – |
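Statistics of repeat motifs.

| Repeat type | Most frequent motif | Second | Third | Fourth |
| --- | --- | --- | --- | --- |
| Di | AG/TC (49.68%) | AC/TG (22.93%) | AT/TA (15.14%) | CG/GC (12.26%) |
| Tri | CCG/GGC (48.19%) | AGC/TCG (14.45%) | AGG/TCC (11.30%) | ACG/TGC (7.93%) |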