Yogesh Gupta, Ashish K Pathak, Kashmir Singh, Shrikant S Mantri, Sudhir P Singh, Rakesh Tuli.
Abstract
BACKGROUND: Annona squamosa L., a popular fruit tree, is the most widely cultivated species of the genus Annona. The lack of transcriptomic and genomic information limits the scope of genome investigations in this important shrub. It bears aggregate fruits with numerous seeds, although a few rare accessions with very few seeds have been reported for Annona. Massive pyrosequencing (Roche 454 GS FLX+) of the transcriptome from early stages of fruit development (0, 4, 8 and 12 days after pollination) was performed to produce expression datasets for two genotypes, Sitaphal and NMK-1, that contrast in the number of seeds set in their fruits. The data reported here are the first source of genome-wide differential transcriptome sequence for two genotypes of A. squamosa, and they identify several candidate genes related to seed development.
RESULTS: Approximately 1.9 million high-quality clean reads, with an average length of about 568 bp, were obtained from the cDNA library of the developing fruits of both genotypes. The quality reads were assembled de novo into 2074 to 11004 contigs per developing fruit sample, depending on the stage of development. The contig sequences from all four stages of each genotype were combined into larger units, resulting in 14921 (Sitaphal) and 14178 (NMK-1) unigenes with a mean size of more than 1 kb. Assembled unigenes were functionally annotated by querying against the protein sequences of five public databases (NCBI non-redundant, Prunus persica, Vitis vinifera, Fragaria vesca, and Amborella trichopoda) with an E-value cut-off of 10⁻⁵. A total of 4588 (Sitaphal) and 2502 (NMK-1) unigenes did not match any known protein in the NR database. These sequences could be genes specific to Annona sp. or could belong to untranslated regions. Several unigenes representing pathways related to primary and secondary metabolism, and to seed and fruit development, were expressed at a higher level in Sitaphal, the densely seeded cultivar, than in the poorly seeded NMK-1. A total of 2629 (Sitaphal) and 3445 (NMK-1) simple sequence repeat (SSR) motifs were identified in the two genotypes. These could be potential candidates for transcript-based microsatellite analysis in A. squamosa.
CONCLUSION: The present work provides an early-stage, fruit-specific transcriptome sequence resource for A. squamosa. This repository will serve as a useful resource for investigating the molecular mechanisms of fruit development and for improving fruit-related traits in A. squamosa and related species.
Year: 2015 PMID: 25766098 PMCID: PMC4336476 DOI: 10.1186/s12864-015-1248-3
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Mature fruits of Sitaphal (a) and NMK-1 (b), showing densely seeded and nearly seedless ripened carpels, respectively (scale bar: 2 cm). The bar diagram (c) shows the difference in fruit seed number between the two genotypes. Error bars indicate the standard error for thirty mature fruits harvested from three different plants (10 fruits per plant) of each genotype.
Figure 2. Early-stage developing fruits (0, 4, 8, and 12 DAP) in Sitaphal and NMK-1.
Summary of the sequencing reads, assembly and functional annotation (using the NCBI NR database) of the A. squamosa transcriptome
| Genotype | Stage | Number of reads | Average read length (bp) | Number of contigs | Annotated contigs (NR) |
|---|---|---|---|---|---|
| Sitaphal | 0 DAP | 227,732 | 635 | 10403 | 8176 |
| | 4 DAP | 198,269 | 512 | 2074 | 1808 |
| | 8 DAP | 219,057 | 574 | 6850 | 6023 |
| | 12 DAP | 292,212 | 578 | 7394 | 6512 |
| NMK-1 | 0 DAP | 288,216 | 592 | 8645 | 7401 |
| | 4 DAP | 287,824 | 584 | 11004 | 9038 |
| | 8 DAP | 272,750 | 589 | 7001 | 6003 |
| | 12 DAP | 143,649 | 479 | 2078 | 1886 |
The details of the contigs are given in Additional file 3.
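
The read counts and read lengths in the table can be checked against the figures quoted in the abstract. The sketch below is only an arithmetic cross-check on the values as printed; the variable names are illustrative. The section does not say which kind of average was used, so both a simple mean of the eight per-sample lengths and a read-weighted mean are computed. The simple mean (567.875 bp) is the one that matches the "about 568 bp" in the abstract, while the read-weighted mean comes out a few bp higher.

```python
# (number of reads, average read length in bp) per sample, copied from the table
samples = {
    ("Sitaphal", "0 DAP"): (227_732, 635),
    ("Sitaphal", "4 DAP"): (198_269, 512),
    ("Sitaphal", "8 DAP"): (219_057, 574),
    ("Sitaphal", "12 DAP"): (292_212, 578),
    ("NMK-1", "0 DAP"): (288_216, 592),
    ("NMK-1", "4 DAP"): (287_824, 584),
    ("NMK-1", "8 DAP"): (272_750, 589),
    ("NMK-1", "12 DAP"): (143_649, 479),
}

# Total number of reads across both genotypes and all four stages
total_reads = sum(reads for reads, _ in samples.values())

# Unweighted mean of the eight per-sample average lengths
simple_mean_length = sum(length for _, length in samples.values()) / len(samples)

# Mean weighted by the number of reads in each sample
weighted_mean_length = (
    sum(reads * length for reads, length in samples.values()) / total_reads
)

print(total_reads)                     # 1929709
print(simple_mean_length)              # 567.875
print(round(weighted_mean_length, 1))
```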
Figure 3. GO classifications of assembled unigenes having sequence homology with UniProt proteins, assigned to 51 functional groups.
Number and percentage (in brackets) of unigenes in A. squamosa genotypes (Sitaphal and NMK-1) from BLASTx searches against public protein databases of fruit crops and a closely related genus
| Database | Sitaphal | NMK-1 | Sitaphal | NMK-1 |
|---|---|---|---|---|
| NCBI non redundant | 10333 (69.25%) | 11676 (82.35%) | 1928 (12.92%) | 2825 (19.92%) |
| | 9187 (61.57%) | 11126 (78.47%) | 1557 (10.43%) | 2289 (16.14%) |
| | 9152 (61.33%) | 11108 (78.34%) | 1554 (10.41%) | 2313 (16.31%) |
| | 9218 (61.77%) | 11206 (79.03%) | 968 (6.48%) | 1469 (10.36%) |
| | 9003 (60.33%) | 10963 (77.32%) | 1701 (11.40%) | 2577 (18.17%) |
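
The NR row of this table ties back to the abstract: subtracting the first two counts from the unigene totals (14921 for Sitaphal, 14178 for NMK-1) gives the number of unigenes with no NR match. The following is a small consistency check, not part of the authors' pipeline. It assumes the first two value columns are the Sitaphal and NMK-1 hit counts, which the printed percentages support.

```python
# Unigene totals from the abstract and NR hit counts from the table's first row
unigene_totals = {"Sitaphal": 14921, "NMK-1": 14178}
nr_hits = {"Sitaphal": 10333, "NMK-1": 11676}

summary = {}
for genotype, total in unigene_totals.items():
    hits = nr_hits[genotype]
    summary[genotype] = {
        "percent_with_hit": round(100 * hits / total, 2),
        "without_hit": total - hits,  # unigenes with no NR match
    }

print(summary)
```

The result reproduces the 69.25% and 82.35% in the table and the 4588 and 2502 unmatched unigenes reported in the abstract.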
Summary of hormone-related unigenes identified in the transcriptome of A. squamosa genotypes (Sitaphal and NMK-1)
| Hormone | Number of unigenes |
|---|---|
| Brassinosteroid | 43 |
| Auxin | 31 |
| Abscisic acid | 28 |
| Gibberellin | 20 |
| Ethylene | 16 |
| Cytokinin | 10 |
The details of the unigenes are given in Additional file 5.
Summary of transcription factor-related unigenes identified in the transcriptome of A. squamosa genotypes (Sitaphal and NMK-1)
| Transcription factor family | Number of unigenes |
|---|---|
| bHLH | 34 |
| C3H | 25 |
| MYB | 17 |
| MYB-related | 17 |
| bZIP | 15 |
| NAC | 15 |
| HD-ZIP | 14 |
| C2H2 | 15 |
| HB-other | 10 |
| GRAS | 10 |
| ARF | 11 |
| WRKY | 14 |
| Trihelix | 8 |
| MIKC | 8 |
| FAR1 | 8 |
| G2-like | 8 |
| ARR-B | 6 |
| ERF | 6 |
| TALE | 6 |
| SBP | 6 |
| CO-like | 5 |
| HSF | 5 |
The table includes only those transcription factor families that are represented by at least 5 unigenes in the transcriptome data. The details of the unigenes are given in Additional file 6.
AgriGO categories (false discovery rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1
| GO term | Ontology | Description | Number of genes (Sitaphal) | Number of genes (NMK-1) |
|---|---|---|---|---|
| GO:0003006 | P | reproductive developmental process | 76 | 0 |
| GO:0005975 | P | carbohydrate metabolic process | 63 | 0 |
| GO:0005996 | P | monosaccharide metabolic process | 23 | 0 |
| GO:0006066 | P | alcohol metabolic process | 32 | 0 |
| GO:0006082 | P | organic acid metabolic process | 70 | 0 |
| GO:0006091 | P | generation of precursor metabolites and energy | 30 | 0 |
| GO:0006457 | P | protein folding | 33 | 0 |
| GO:0006461 | P | protein complex assembly | 19 | 0 |
| GO:0006508 | P | proteolysis | 69 | 0 |
| GO:0006511 | P | ubiquitin-dependent protein catabolic process | 36 | 0 |
| GO:0006519 | P | cellular amino acid and derivative metabolic process | 58 | 0 |
| GO:0006605 | P | protein targeting | 19 | 0 |
| GO:0006810 | P | transport | 128 | 0 |
| GO:0006886 | P | intracellular protein transport | 40 | 0 |
| GO:0006950 | P | response to stress | 147 | 77 |
| GO:0006996 | P | organelle organization | 64 | 0 |
| GO:0007017 | P | microtubule-based process | 17 | 0 |
| GO:0007275 | P | multicellular organismal development | 138 | 0 |
| GO:0008104 | P | protein localization | 52 | 0 |
| GO:0008152 | P | metabolic process | 591 | 230 |
| GO:0008610 | P | lipid biosynthetic process | 40 | 0 |
| GO:0009266 | P | response to temperature stimulus | 41 | 23 |
| GO:0009408 | P | response to heat | 0 | 11 |
| GO:0009628 | P | response to abiotic stimulus | 123 | 53 |
| GO:0009790 | P | embryonic development | 45 | 0 |
| GO:0009791 | P | post-embryonic development | 87 | 28 |
| GO:0009987 | P | cellular process | 696 | 251 |
| GO:0010035 | P | response to inorganic substance | 28 | 0 |
| GO:0010154 | P | fruit development | 45 | 0 |
| GO:0010876 | P | lipid localization | 0 | 7 |
| GO:0015031 | P | protein transport | 49 | 0 |
| GO:0016043 | P | cellular component organization | 97 | 0 |
| GO:0016192 | P | vesicle-mediated transport | 29 | 0 |
| GO:0019318 | P | hexose metabolic process | 18 | 0 |
| GO:0019538 | P | protein metabolic process | 241 | 0 |
| GO:0019752 | P | carboxylic acid metabolic process | 70 | 0 |
| GO:0019941 | P | modification-dependent protein catabolic process | 36 | 0 |
| GO:0022414 | P | reproductive process | 78 | 0 |
| GO:0022607 | P | cellular component assembly | 32 | 0 |
| GO:0032501 | P | multicellular organismal process | 144 | 0 |
| GO:0032502 | P | developmental process | 156 | 0 |
| GO:0033036 | P | macromolecule localization | 61 | 22 |
| GO:0033365 | P | protein localization in organelle | 14 | 0 |
| GO:0034613 | P | cellular protein localization | 42 | 0 |
| GO:0034621 | P | cellular macromolecular complex subunit organization | 24 | 0 |
| GO:0034637 | P | cellular carbohydrate biosynthetic process | 21 | 0 |
| GO:0034641 | P | cellular nitrogen compound metabolic process | 52 | 0 |
| GO:0042180 | P | cellular ketone metabolic process | 71 | 0 |
| GO:0043170 | P | macromolecule metabolic process | 377 | 0 |
| GO:0043436 | P | oxoacid metabolic process | 70 | 0 |
| GO:0043632 | P | modification-dependent macromolecule catabolic process | 36 | 0 |
| GO:0043933 | P | macromolecular complex subunit organization | 30 | 14 |
| GO:0044085 | P | cellular component biogenesis | 54 | 0 |
| GO:0044106 | P | cellular amine metabolic process | 41 | 0 |
| GO:0044237 | P | cellular metabolic process | 506 | 197 |
| GO:0044238 | P | primary metabolic process | 507 | 0 |
| GO:0044248 | P | cellular catabolic process | 66 | 0 |
| GO:0044257 | P | cellular protein catabolic process | 36 | 0 |
| GO:0044260 | P | cellular macromolecule metabolic process | 337 | 0 |
| GO:0044262 | P | cellular carbohydrate metabolic process | 43 | 0 |
| GO:0044265 | P | cellular macromolecule catabolic process | 50 | 0 |
| GO:0044267 | P | cellular protein metabolic process | 206 | 0 |
| GO:0045184 | P | establishment of protein localization | 49 | 0 |
| GO:0046907 | P | intracellular transport | 54 | 0 |
| GO:0048316 | P | seed development | 44 | 0 |
| GO:0048513 | P | organ development | 61 | 0 |
| GO:0048608 | P | reproductive structure development | 75 | 0 |
| GO:0048731 | P | system development | 61 | 0 |
| GO:0048856 | P | anatomical structure development | 115 | 0 |
| GO:0050896 | P | response to stimulus | 257 | 114 |
| GO:0051179 | P | localization | 131 | 0 |
| GO:0051234 | P | establishment of localization | 128 | 0 |
| GO:0051603 | P | proteolysis involved in cellular protein catabolic process | 36 | 0 |
| GO:0051641 | P | cellular localization | 62 | 0 |
| GO:0051649 | P | establishment of localization in cell | 56 | 0 |
| GO:0051716 | P | cellular response to stimulus | 62 | 0 |
| GO:0065003 | P | macromolecular complex assembly | 28 | 0 |
| GO:0070271 | P | protein complex biogenesis | 19 | 0 |
| GO:0070727 | P | cellular macromolecule localization | 43 | 0 |
The details of the unigenes are given in Additional file 9.
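
The table applies a false discovery rate cut-off of 0.05, but this section does not say which multiple-testing correction produced it. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up adjustment, one common choice, on made-up p-values. The function name and the inputs are hypothetical, and AgriGO may well have been run with a different procedure.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up enrichment p-values for four hypothetical GO terms
raw_pvalues = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(raw_pvalues)

# Keep only the terms that pass the FDR <= 0.05 threshold used in the table
significant = [i for i, q in enumerate(q_values) if q <= 0.05]
print(q_values, significant)
```

With these toy inputs only the first term passes the 0.05 threshold, even though three of the four raw p-values are below 0.05.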
Pathway assignment based on KEGG (false discovery rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1
| Pathway | Number of genes (Sitaphal) | Number of genes (NMK-1) |
|---|---|---|
| Metabolic pathways | 152 | 63 |
| Biosynthesis of plant hormones | 47 | 0 |
| Spliceosome | 23 | 12 |
| Biosynthesis of alkaloids derived from terpenoid and polyketide | 28 | 0 |
| Biosynthesis of terpenoids and steroids | 31 | 0 |
| Biosynthesis of alkaloids derived from shikimate pathway | 28 | 0 |
| Biosynthesis of alkaloids derived from ornithine, lysine and nicotinic acid | 27 | 0 |
| Proteasome | 15 | 0 |
| Citrate cycle (TCA cycle) | 14 | 0 |
| Biosynthesis of alkaloids derived from histidine and purine | 24 | 0 |
| Biosynthesis of phenylpropanoids | 31 | 0 |
| Inositol phosphate metabolism | 10 | 0 |
| Amino sugar and nucleotide sugar metabolism | 14 | 0 |
| Endocytosis | 12 | 0 |
| Aminoacyl-tRNA biosynthesis | 10 | 0 |
| Ribosome | 19 | 0 |
| Oxidative phosphorylation | 13 | 0 |
The details of the unigenes are given in Additional file 9.
Figure 4. Differential accumulation (≥2 fold, 8 DAP vs. 0 DAP) of transcripts for embryogenesis-related putative genes in early-stage fruits of Sitaphal and NMK-1. The orthologous genes give a defective embryo and/or seed phenotype in Arabidopsis mutants. The details of the differentially expressed transcripts are given in Additional file 8.
Figure 5. Quantitative RT-PCR analyses and RPKM expression values of 5 randomly selected candidate genes for seed development in Sitaphal and NMK-1 at 8 DAP. (a) Quantitative RT-PCR analyses. Error bars indicate the standard error of three biological replicates (*p ≤ 0.05). Details of the primers are given in Additional file 10. (b) The qRT-PCR fold changes are comparable with the RPKM values in the transcriptome data.
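
RPKM (reads per kilobase of transcript per million mapped reads) is the expression measure named in this caption, and the tables use a ≥2-fold cut-off. A minimal sketch of both calculations follows, using the standard RPKM definition. The read counts, transcript length and library size are invented for illustration and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def fold_change(expression_a, expression_b):
    """Ratio of two expression values (expression_b must be non-zero)."""
    return expression_a / expression_b


# Hypothetical unigene of 1500 bp in two libraries of 200,000 mapped reads each
rpkm_a = rpkm(read_count=120, transcript_length_bp=1500, total_mapped_reads=200_000)
rpkm_b = rpkm(read_count=30, transcript_length_bp=1500, total_mapped_reads=200_000)

ratio = fold_change(rpkm_a, rpkm_b)
is_upregulated = ratio >= 2  # the >=2-fold threshold used in the tables
print(rpkm_a, rpkm_b, ratio, is_upregulated)
```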
Statistics of SSRs identified in the transcripts of A. squamosa genotypes (Sitaphal and NMK-1)
| Parameter | Sitaphal | NMK-1 |
|---|---|---|
| Total number of sequences examined | 14921 | 14178 |
| Total size of examined sequences | 13101288 | 15606312 |
| Total number of identified SSRs | 2629 | 3445 |
| Number of SSR containing sequences | 2045 | 2678 |
| Number of sequences containing more than one SSR | 417 | 541 |
| Number of SSRs present in compound formation | 324 | 428 |
| Frequency of SSRs | 4.53 kb / SSR | 4.98 kb / SSR |
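
The frequency row can be recomputed as total examined sequence divided by the number of SSRs. The sketch below does this with the values as printed, taking the first value column as Sitaphal and the second as NMK-1 (consistent with the unigene counts 14921 and 14178). With these figures the quotient is about 4.98 kb per SSR for Sitaphal and about 4.53 kb per SSR for NMK-1, the reverse of the printed frequency row. Either the size row or the frequency row may therefore be transposed in this copy.

```python
# Values copied from the SSR statistics table (total examined size in bp, SSR count)
ssr_stats = {
    "Sitaphal": {"total_bp": 13_101_288, "ssr_count": 2629},
    "NMK-1": {"total_bp": 15_606_312, "ssr_count": 3445},
}


def kb_per_ssr(total_bp, ssr_count):
    """Average amount of examined sequence per SSR, in kilobases."""
    return total_bp / ssr_count / 1000


density = {name: round(kb_per_ssr(**row), 2) for name, row in ssr_stats.items()}
print(density)
```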
Classes of SSR repeat motifs in the transcriptome of A. squamosa genotypes (Sitaphal and NMK-1)
| Repeat type | Sitaphal | NMK-1 |
|---|---|---|
| Mono-nucleotides | 939 (35.71%) | 1518 (44.06%) |
| Di-nucleotides | 812 (30.88%) | 880 (25.54%) |
| Tri-nucleotides | 769 (29.25%) | 912 (26.47%) |
| Tetra-nucleotides | 81 (3.08%) | 87 (2.53%) |
| Penta-nucleotides | 12 (0.46%) | 18 (0.52%) |
| Hexa-nucleotides | 16 (0.61%) | 30 (0.87%) |
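
The repeat classes above (mono- to hexa-nucleotide) can be illustrated with a small regular-expression scan. This is a minimal sketch, not the tool or the settings used in the study: the function name and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, and compound SSRs are not detected.

```python
import re

# Minimum number of repeats per motif length: illustrative assumptions only,
# not the thresholds used in the study.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "AGAG")
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            repeat_count = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeat_count))
    return sorted(hits)


# Toy sequence with one mono-, one di- and one tri-nucleotide repeat
example = "GC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
print(ssrs)
```

Counting the returned motifs by length would give a class breakdown of the same shape as the table, although the actual counts depend on the thresholds chosen.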