Wenlan Tian, Dev Paudel, Wagner Vendrame, Jianping Wang.
Abstract
Jatropha (Jatropha curcas L.) is an economically important species with great potential for biodiesel production. To enrich the jatropha genomic databases and resources for microgravity studies, we sequenced and annotated the transcriptome of jatropha and developed SSR and SNP markers from the transcriptome sequences. In total, 1,714,433 raw reads with an average length of 441.2 nucleotides were generated. De novo assembly and clustering resulted in 115,611 uniquely assembled sequences (UASs), including 21,418 full-length cDNAs and 23,264 new jatropha transcript sequences. The whole set of UASs was annotated: 59,903 (51.81%) were assigned gene ontology (GO) terms, 12,584 (10.88%) had orthologs in Eukaryotic Orthologous Groups (KOG), and 8,822 (7.63%) were mapped to 317 pathways in six categories in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The set also contained 3,588 putative transcription factors. From the UASs, 9,798 SSRs were discovered, with AG/CT the most frequent SSR motif type (45.8%). A further 38,693 SNPs were detected, of which 7,584 remained after filtering. This UAS set has enriched the current jatropha genomic databases and provided a large number of genetic markers, which can facilitate jatropha genetic improvement and many other genetic and biological studies.
Year: 2017 PMID: 28154822 PMCID: PMC5244023 DOI: 10.1155/2017/8614160
Source DB: PubMed Journal: Int J Genomics ISSN: 2314-436X Impact factor: 2.326
Summary statistics of the transcript sequence reads assembly.
| | Raw sequence | Assembled contigs (Newbler) | UASs (>100 bp) (Newbler + CD-HIT) | Contigs (>100 bp) |
|---|---|---|---|---|
| Number of sequences | 1,714,433 | 27,897 | 115,611 | 20,624 |
| Min length | 40 | 1 | 100 | 100 |
| Max length | 956 | 11,929 | 11,929 | 11,929 |
| Median length | 464 | 839 | 491 | 1168.5 |
| Average length | 441.2 | 1082.2 | 621.0 | 1380.9 |
| N25 | — | 2,481 | 1,435 | 2,554 |
| N50 | — | 1,679 | 556 | 1,746 |
| N75 | — | 1,036 | 471 | 1,150 |
| N90 | — | 607 | 396 | 751 |
| GC% | 41.67 | 40.45 | 39.54 | 40.56 |
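
The N25, N50, N75, and N90 rows above follow the usual Nx convention: the length L at which sequences of length L or longer account for at least x% of all assembled bases. The sketch below illustrates that convention on made-up toy lengths. The function name `nx` and the input list are hypothetical, and this is not necessarily how the authors' tools computed the reported values.

```python
def nx(lengths, x):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    contain at least x percent of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in the range (0, 100]")

    total = sum(lengths)
    threshold = total * x / 100
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Toy lengths for illustration only (not data from the paper).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
for x in (25, 50, 75, 90):
    print(f"N{x} = {nx(toy_lengths, x)}")
```

Because the cumulative sum starts from the longest sequence, Nx decreases as x grows, which matches the ordering N25 > N50 > N75 > N90 in every column of the table.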
Figure 1. Length distribution of the assembled contigs from Newbler.
Figure 2. Contig depth distribution.
Figure 3. Gene index distribution for Arabidopsis, cassava, and castor bean.
Figure 4. Classification distribution of the transcripts in different transcription factor families.
Figure 5. Classification distribution of the annotated GO terms of the UASs.
Identified UASs involved in oil biosynthesis and oil metabolic pathways.
| Name | Definition | EC | KO |
|---|---|---|---|
| Fatty acid biosynthesis | | | |
| K01964 | Acetyl-CoA/propionyl-CoA carboxylase | 6.4.1.2; 6.4.1.3 | K01964 |
| accC | Acetyl-CoA carboxylase, biotin carboxylase subunit | 6.4.1.2; 6.3.4.14 | K01961 |
| fabH | 3-Oxoacyl-[acyl-carrier-protein] synthase III | 2.3.1.180 | K00648 |
| fabG | 3-Oxoacyl-[acyl-carrier protein] reductase | 1.1.1.100 | K00059 |
| fabA | 3-Hydroxyacyl-[acyl-carrier-protein] dehydratase | 4.2.1.59 | K01716 |
| fabZ | 3-Hydroxyacyl-[acyl-carrier-protein] dehydratase | 4.2.1.59 | K02372 |
| fabI | Enoyl-[acyl-carrier protein] reductase I | 1.3.1.9; 1.3.1.10 | K00208 |
| fabD | [Acyl-carrier-protein] S-malonyltransferase | 2.3.1.39 | K00645 |
| DESA1 | Acyl-[acyl-carrier-protein] desaturase | 1.14.19.2 | K03921 |
| FATA | Fatty acyl-ACP thioesterase A | 3.1.2.14; 3.1.2.- | K10782 |
| FATB | Fatty acyl-ACP thioesterase B | 3.1.2.14; 3.1.2.- | K10781 |
| Linoleic acid metabolism | | | |
| PLA2G, SPLA2 | Secretory phospholipase A2 | 3.1.1.4 | K01047 |
| LOX1_5 | Linoleate 9S-lipoxygenase | 1.13.11.58 | K15718 |
| LOX2S | Lipoxygenase | 1.13.11.12 | K00454 |
Identified UASs involved in cold stress responses.
| Gene | Protein | UAS hits |
|---|---|---|
| HOS1 | E3 ubiquitin-protein ligase HOS1 | Contig 15459 |
| ICE1 | Transcription factor ICE1 | Contig 18116, contig 17481 |
| ICE2 | Inducer of CBF expression 2 | Contig 18116, contig 17481 |
| CBF1/DREB1b | Dehydration-responsive element-binding protein 1B | Contig 05683, contig 19154 |
| RAV1 | AP2/ERF and B3 domain-containing transcription factor RAV1 | Contig 09872, contig 21221 |
| Rd22 | Dehydration-responsive protein RD22 | Contig 20575 |
Summary statistics of SSR detection and validation.
| | Di- | Tri- | >3 | Total |
|---|---|---|---|---|
| SSRs in the database | 8,175 | 3,485 | 384 | 12,044 |
| Ordered SSR primer pairs | 88 | 174 | — | 262 |
| Amplified SSRs | 60 | 142 | — | 202 (77.1%) |
| Polymorphic SSRs | 11 | 22 | — | 33 |
| Polymorphic rate | 18.3% | 15.5% | — | 16.3% |
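
As a rough illustration of how di- and tri-nucleotide SSRs like those counted above can be located in a transcript sequence, the following minimal sketch uses regular expressions. The function name `find_ssrs`, the minimum repeat counts (6 for di-, 5 for tri-nucleotide motifs), and the example sequence are all assumptions for illustration. This section does not state which tool or thresholds the authors used.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect di- and tri-nucleotide repeats in a DNA sequence.

    Returns a sorted list of (start, motif, repeat_count) tuples,
    with 0-based start positions.
    """
    if min_repeats is None:
        # Assumed thresholds, not taken from the paper.
        min_repeats = {2: 6, 3: 5}

    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One captured motif followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                # Skip homopolymer runs such as AAAAAA.
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Made-up example sequence: (AG)7 and (GAA)5 embedded in flanking bases.
example = "TTGC" + "AG" * 7 + "CCTC" + "GAA" * 5 + "TC"
print(find_ssrs(example))
```

A production SSR finder would also handle longer motifs, compound repeats, and reverse-complement grouping (for example, counting AG and CT together), none of which this sketch attempts.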
Summary statistics of SNP detection and validation.
| | Depth | Number |
|---|---|---|
| SNPs discovered in the database | 0 | 38,693 |
| | ≥4 | 7,584 |
| | ≥10 | 4,767 |
| Chosen SNPs for validation | 4–10 | 96 |
| | 11–20 | 90 |
| | 21+ | 89 |
| Primers designed flanking the chosen SNPs | | 21 |
| Total expected SNPs chosen | | 275 |
| Amplified primers for Sanger sequencing | | 19 |
| SNPs for Sanger sequencing | | 240 |
| Primers succeeded for sequencing | | 17 |
| SNPs succeeded for sequencing | | 214 |
| Primers validated containing SNPs | | 5 (29.4%) |
| Matched SNPs with sequencing results | | 28 (13.1%) |
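
The percentages in the SNP validation table can be reproduced from the counts in the same table. The short sketch below does that arithmetic. The helper name `percent` and the dictionary keys are hypothetical labels, while the numbers are copied from the table.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Counts copied from the SNP detection and validation table.
snp_counts = {
    "chosen_depth_4_10": 96,
    "chosen_depth_11_20": 90,
    "chosen_depth_21_plus": 89,
    "primers_sequenced": 17,
    "primers_validated": 5,
    "snps_sequenced": 214,
    "snps_matched": 28,
}

# The three depth classes add up to the total number of chosen SNPs.
total_chosen = (
    snp_counts["chosen_depth_4_10"]
    + snp_counts["chosen_depth_11_20"]
    + snp_counts["chosen_depth_21_plus"]
)
print("Total chosen SNPs:", total_chosen)

# Validation rates are relative to what was successfully sequenced.
print("Primer validation rate (%):",
      percent(snp_counts["primers_validated"], snp_counts["primers_sequenced"]))
print("SNP match rate (%):",
      percent(snp_counts["snps_matched"], snp_counts["snps_sequenced"]))
```

Both rates use the successfully sequenced primers (17) and SNPs (214) as denominators, not the originally designed 21 primers or the 275 chosen SNPs.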