Literature DB >> 25766098

De novo assembly and characterization of transcriptomes of early-stage fruit from two genotypes of Annona squamosa L. with contrast in seed number.

Yogesh Gupta1, Ashish K Pathak2, Kashmir Singh3, Shrikant S Mantri4, Sudhir P Singh5, Rakesh Tuli6,7.   

Abstract

BACKGROUND: Annona squamosa L., a popular fruit tree, is the most widely cultivated species of the genus Annona. The lack of transcriptomic and genomic information limits the scope of genome investigations in this important shrub. It bears aggregate fruits with numerous seeds. A few rare accessions with very few seeds have been reported for Annona. A massive pyrosequencing (Roche, 454 GS FLX+) of transcriptome from early stages of fruit development (0, 4, 8 and 12 days after pollination) was performed to produce expression datasets in two genotypes, Sitaphal and NMK-1, that show a contrast in the number of seeds set in fruits. The data reported here is the first source of genome-wide differential transcriptome sequence in two genotypes of A. squamosa, and identifies several candidate genes related to seed development. <br> RESULTS: Approximately 1.9 million high-quality clean reads were obtained in the cDNA library from the developing fruits of both the genotypes, with an average length of about 568 bp. Quality-reads were assembled de novo into 2074 to 11004 contigs in the developing fruit samples at different stages of development. The contig sequence data of all the four stages of each genotype were combined into larger units resulting into 14921 (Sitaphal) and 14178 (NMK-1) unigenes, with a mean size of more than 1 Kb. Assembled unigenes were functionally annotated by querying against the protein sequences of five different public databases (NCBI non redundant, Prunus persica, Vitis vinifera, Fragaria vesca, and Amborella trichopoda), with an E-value cut-off of 10(-5). A total of 4588 (Sitaphal) and 2502 (NMK-1) unigenes did not match any known protein in the NR database. These sequences could be genes specific to Annona sp. or belong to untranslated regions. Several of the unigenes representing pathways related to primary and secondary metabolism, and seed and fruit development expressed at a higher level in Sitaphal, the densely seeded cultivar in comparison to the poorly seeded NMK-1. A total of 2629 (Sitaphal) and 3445 (NMK-1) Simple Sequence Repeat (SSR) motifs were identified respectively in the two genotypes. These could be potential candidates for transcript based microsatellite analysis in A. squamosa. <br> CONCLUSION: The present work provides early-stage fruit specific transcriptome sequence resource for A. squamosa. This repository will serve as a useful resource for investigating the molecular mechanisms of fruit development, and improvement of fruit related traits in A. squamosa and related species.

Entities:  

Mesh:

Year:  2015        PMID: 25766098      PMCID: PMC4336476          DOI: 10.1186/s12864-015-1248-3

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Annona squamosa L., commonly known as sugar-apple (or sweetsop or custard-apple), is a popular fruit throughout the tropics, mainly southern Mexico, Antilles, Central and South America, tropical Africa, Australia, India, Indonesia, Polynesia and US (Hawaii and Florida) [1]. It is native to the tropical America and West Indies. In India, it was introduced by the Spanish and Portuguese in the 16th century [1,2]. It is known by several names in India: ata, aarticum, shareefa, sitaphal, seethaphal or seetha pazham, aathachakka and atna kothal etc. [1]. Annona sp. belongs to family Annonaceae which is the largest living family among magnoliids (primitive angiosperms). The genus Annona contains about 166 species [3], out of which six produce edible fruits; A. squamosa, A. reticulata, A. cherimola, A. muricata, A. atemoya and A. diversifolia [4]. A. squamosa is the most widely cultivated species [5]. The flower of A. squamosa comprises of a gynoecium of several loosely cohering carpels, surrounded by an androecium of numerous stamens, encircled by three small, inconspicuous sepals, and three green colored fleshy petals [6]. It is an apocarpous flower i.e. carpels are separate in individual pistils. Fruit is a syncarpium i.e. formed by amalgation of many ripened pistils and a fleshy receptacle. Each carpel has a single anatropous ovule that may develop into a single seed. The pulp is creamy white to light yellow, sweetly aromatic, and tastes like custard. The pulp is of nutritional and medicinal value [7,8], rich in calories, vitamin C, and minerals [1,9,10]. Annona fruits have been mentioned as ‘one of the most delicious fruits known to man’ and as ‘aristocrat of fruits’, considering its nutritional and medicinal value [11,12]. There have been very few genomic studies on A. squamosa, as only 158 and 12 sequences are available in nucleotide and protein databases, respectively, in NCBI GenBank as on 20th December, 2014 (http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=301693). Next generation sequencing (NGS) technologies have facilitated rapid investigation of transcriptome [13-16]. The GS FLX+ platform is a high-throughput system, which can generate long sequence reads (up to ~1 kb), with high accuracy (http://454.com/products/gs-flx-system). We report de novo assembly and transcriptome catalogue from A. squamosa. The data provides an important resource for gene discovery, gene expression, functional analysis, molecular breeding, and comparative genomic analysis of A. squamosa and related species. In most angiosperms, including A. squamosa, ovule and ovary develop into seed and fruit, respectively. This transition is a complex physiological process with coordinated development of maternal and filial tissues. Understanding the early phase of fruit development is important, since the molecular and biochemical pathways of seed and fruit set, soon after fertilization, determine seed number, fruit size, and other fruit quality traits such as accumulation of sugar and organic acids [17-19]. Less number of seeds in fruit or seedlessness is important to consumers, both for fresh fruit consumption and fruit processing, especially when the seeds in Annona are hard and have a bad taste. Differences in fruit related traits, such as seed number have been reported among the Annona species and cultivars [9]. The presence of parthenocarpic fruits has not been reported in Annona sp. However, absence of the outer integument and change in ovule structure have been suggested as the causes for failure in seed formation due to interruption in the reproductive program in a spontaneous mutant of A. squamosa (Thai seedless) [11]. In India, some accessions have been reported with significantly reduced number of seeds, as compared to the common sugar-apple, Sitaphal [20,21]. In order to gain molecular insight into early-stage fruit development and to create groundwork for molecular characterization of fruit development, it is desirable to profile the transcriptome of developing fruits of A. squamosa. In the present study, a massive pyrosequencing of transcriptome from early stages of fruit development was performed in two Annona genotypes (Sitaphal and NMK-1), showing significant difference in fruit seed number, using NGS technology (Roche 454 GS FLX+). De novo transcriptome assembly, functional annotation, and in silico discovery of potential molecular markers have been described here. Various genes, related to hormone, seed and fruit development, transcription factors, and metabolic pathways were identified. The information will be helpful in functional genomic studies and in furthering the understanding of molecular mechanisms of fruit development in Annona sp.

Methods

Plant material and RNA extraction

Two Annona genotypes with contrast in fruit seed number (Figure 1), Sitaphal and NMK-1, were used in this study. Sitaphal is a well known cultivar of A. squamosa [22]. NMK-1 was developed by selection for desirable characteristics from a population of Annona genotypes [21]. However, systematic information on the development of the cultivars is not available. Phylogenetic analysis, using two marker sequences (rbcl and LMCH10) in seventeen species of Annona, placed both the genotypes close to A. squamosa (Additional file 1: Figure S1). The two genotypes were collected from the field of Madhuban Nursery (17.68° N 75.92° E coordinates, at an elevation of 457 m), Solapur, Maharshtra, India, where these are clonally propagated.
Figure 1

Mature fruits of Sitaphal (a) and NMK-1 (b), showing densely seeded and nearly seedless ripened carpels (Scale 2 cm), respectively. Bar diagram shows the difference between the two genotypes in fruit seed number (c). The error bars indicate standard error in thirty mature fruits, harvested from three different plants (10 fruits from each plant) of each genotype.

Mature fruits of Sitaphal (a) and NMK-1 (b), showing densely seeded and nearly seedless ripened carpels (Scale 2 cm), respectively. Bar diagram shows the difference between the two genotypes in fruit seed number (c). The error bars indicate standard error in thirty mature fruits, harvested from three different plants (10 fruits from each plant) of each genotype. Pollens were collected from flowers, in male stage, as described by Jalikop and Kumar [4]. The flowers, in female stage, were hand self-pollinated, using freshly collected pollens, in the morning (06.00 and 10.00 h). All the flowers were pollinated at the same time to avoid confounding effect of environment on fruit development. In each pollinated flower, the floral tube was plugged with cotton to prevent contamination of outside pollen. Flowers in similar stages were tagged and left as un-pollinated controls to examine seed numbers in fruits, developed from hand self-pollination and natural open-pollination (Figure 1c). The experiment was performed on three plants (three biological replicates) each of both the genotypes, during July, 2012. Developing fruits were harvested at 4, 8, and 12 days after pollination (DAP) (Figure 2). The gynoecium comprising of unfertilized ovules (0 DAP) was harvested. All the stamens were removed surrounding the gynoecium before harvesting. The 0, 4, 8 and 12 DAP samples were surface sterilized by using absolute ethanol before harvesting. The samples were frozen in liquid nitrogen immediately after harvest, and stored at −80°C until use.
Figure 2

Early-stage developing fruits (0, 4, 8, and 12 DAP) in Sitaphal and NMK-1.

Early-stage developing fruits (0, 4, 8, and 12 DAP) in Sitaphal and NMK-1. Total RNA was isolated from the developing fruits (hand self-pollinated) using RNA isolation kit (Sigma) following the manufacturer’s instructions. For RNA extraction, at least three developing fruits of same stage (0, 4, 8, and 12 DAP) were taken for each sample. The RNA was extracted in three biological replicates for each genotype. The quality of RNA was confirmed by using 2100 Bioanalyzer (Agilent). For sequencing, in case of each sample (0, 4, 8, and 12 DAP of Sitaphal or NMK-1), the equivalent quantity of total RNA of three biological replicates was pooled.

Library preparation and 454 pyrosequencing

Approximately 1 μg total RNA was used for preparing mRNA sequencing library of each sample. Poly (A+) RNA was isolated from total RNA mixture by using NEBNext® Poly(A) mRNA Magnetic Isolation Module (New England Biolabs), following the manufacturer’s protocol. The purified Poly (A+) RNA was sheared and cDNA library was prepared by using NEBNext® mRNA Library Prep Reagent Set for 454™ (New England Biolabs) as per the instruction manual. The cDNA libraries were sequenced on a 454 Genome Sequencer FLX+ platform (Roche, USA).

De novo assembly

The raw data obtained from 454 GS FLX+ was filtered with a quality cut-off value of 40. The adaptor and primer sequences were removed using NGS QC Tool kit and GS Run processor. The low-quality sequences and sequences with less than 40 bp were removed before contig assembly. De novo contig assembly of the reads was performed using GS De Novo Assembler v2.6 (Roche, USA). In the assembled transcript sequence data at the four developmental stages (0, 4, 8 and 12 DAP), repeat sequences were identified with at least 40 bp overlap and 95% overlap identity. The repeat sequences of the transcriptome data from four libraries (0, 4, 8 and 12 DAP) were combined into larger units- unigenes, by using CAP3 [23].

Functional annotation

Accelerated large scale functional annotation of all contigs was done using WImpiBLAST tool [24] on high performance computing cluster. For functional annotation, the assembled transcript sequences were used as queries against the non-redundant (NR) protein database using NCBI BLASTx algorithm [25]. The BLAST E-value threshold was set at 10−5.

Gene ontology analysis

The gene ontology (GO) annotation was derived for the unigenes, using Uniprot annotation [26]. The categorization and visualization of GO terms was done using WEGO web tool [27].

Comparative analysis of A. squamosa unigenes

Comparative analysis was performed using the unigenes as queries against the protein databases of some fruit crops such as, Prunus persica (http://www.rosaceae.org/species/prunus_persica/genome_v1.0), Vitis vinifera (ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_02_5/Fasta/proteome.vvi.tfa.gz) and Fragaria vesca (ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_02_5/Fasta/proteome.fve.tfa.gz), and a primitive angiosperm, Amborella trichopoda (http://amborella.huck.psu.edu/).

Detection of sequences associated with hormone related signalling pathways, transcription factors and seed development

The unigene sequences were used to blast (BLASTx) against the transcription factor (http://planttfdb.cbi.edu.cn), seed development (http://www.seedgenes.org/) and hormone related (http://molbio.mgh.harvard.edu/sheenweb/Ara_pathways.html) protein sequence database of Arabidopsis thaliana, at the criteria of E-value ≤ 10−5 and query coverage ≥ 50%.

Single nucleotide polymorphism (SNP) analysis

Reads from the transcriptome libraries were mapped on unigenes of the respective genotype using program 'clc_ref_assemble_long' of CLC Assembly Cell version 3.2.2. Variants were detected using 'find_variations' program. SNP with read depth of more than five for each allele was only considered as heterozygous.

Detection of simple sequence repeats (SSRs)

The unigene sequences were searched for SSRs using the perl script program MISA (MIcroSAtellite; http://pgrc.ipk-gatersleben.de/misa/). The repeats of mono- to hexa-nucleotide motifs with a minimum of five repetitions were considered as search criteria in MISA script [15].

Web resource

A web resource, comprising of entire transcriptome contigs, has been developed using open source sequence server [28], and was hosted on Linux server.

Identification of differentially expressed unigenes

A total 5689 unigenes common between the two genotypes with more than 80% query coverage, were used as reference sequences for mapping individual reads from each library. The calculation of reads per kilobase of transcript per million mapped reads (RPKM) was done by using the program 'clc_ref_assemble_long' of CLC Assembly Cell version 3.2.2. Enrichment of Gene Ontology (GO) terms in the differentially expressed genes was performed using AgriGO analysis tool (http://bioinfo.cau.edu.cn/agriGO), with Fisher tests and Bonferroni multiple testing correction (False Discovery Rate ≤ 0.05). Kyoto Encyclopedia of Genes and Genomes (KEGG) categories was assigned by the plant gene set enrichment analysis toolkit (http://structuralbiology.cau.edu.cn/PlantGSEA/analysis.php) with fisher test function (False Discovery Rate ≤ 0.05).

Quantitative real time PCR

First strand cDNA was synthesized using cDNA Synthesis Kit RT-PCR (Roche, USA), with oligodT anchored primers following the manufacturer’s instructions. Gene-specific primers were designed using Primer Express software. QuantiTect SYBR Green RT-PCR Master mix (Qiagen) was used to perform real time PCR assay in an ABI 7700 Sequence Detector Real-Time PCR system (Applied Biosystems, USA). Three biological replications were conducted for each transcript for both the genotypes. The expression data was analyzed using ABI PRISM 7700 Sequence Detection System software (Applied Biosystems). The expression values were normalized with respect to Actin gene from A. squamosa. Dissociation curves confirmed the presence of a single amplicon in each PCR reaction. Relative expression was calculated as described previously [29].

Results and discussion

454 pyrosequencing, sequence assembly and annotation

In total, 1,801,608 and 1,901,179 raw reads were produced in the four cDNA library preparations of developing fruits (0, 4, 8 and 12 DAP) from the two genotypes of A. squamosa- Sitaphal and NMK-1 (Figure 2), respectively, with an average length of 568 bp (Additional file 2). The raw reads were filtered by removing low-quality reads, adapters, primer sequences, and sequences of less than 40 bp. Finally, 9,37,270 and 9,92,439 quality reads were obtained in the four cDNA library preparations (0, 4, 8 and 12 DAP) of Sitaphal and NMK-1, respectively. The average number of reads produced for each library was 0.24 million (Table 1). The filtered raw reads (sff files) were deposited in the NCBI Short Read Archive (SRA) database (accession number SRP042646). The quality-reads were assembled, giving 2074 to 11004 contigs, with more than 200 bp length, in the eight different cDNA libraries (Table 1). The contig sequences were searched against the known sequences in NCBI non redundant (NR) database, using BLASTx algorithm. At the E-value ≤ 10−5, 1808 to 9038 contigs were annotated across different libraries (Table 1, Additional file 3). The results provide sequence information for genes expressed during early developmental stages of fruits of A. squamosa.
Table 1

Summary of the sequencing-reads, assembly and functional annotation (using NCBI NR database) of the . transcriptome

Genotype Developmental stage Total high quality reads Average read length Total number of contigs (≥200 bp) Total number of annotated contigs (≥200 bp)
Sitaphal0 DAP227,732635104038176
4 DAP198,26951220741808
8 DAP219,05757468506023
12 DAP292,21257873946512
NMK10 DAP288,21659286457401
4 DAP287,824584110049038
8 DAP272,75058970016003
12 DAP143,64947920781886

The details of the contigs are given in Additional file 3.

Summary of the sequencing-reads, assembly and functional annotation (using NCBI NR database) of the . transcriptome The details of the contigs are given in Additional file 3. The contig sequence data in the four stages of fruit development (0, 4, 8, and 12 DAP) was combined into larger units, mentioned here as unigenes, by using CAP3. A total of 14921 (Sitaphal) and 14178 (NMK-1) unigenes were obtained. Out of the 14921 unigenes in Sitaphal, 2905 were ≥ 500 bp, 5239 were ≥ 1000 bp, 3663 were ≥ 1500 bp and 3114 were ≥ 2000 bp in length. Out of the 14178 unigenes in NMK-1, 2697 were ≥ 500 bp, 4883 were ≥ 1000 bp, 3516 were ≥ 1500 bp and 3082 were ≥ 2000 bp in length. The average lengths of the unigenes were 1086 bp and 1100 bp for Sitaphal and NMK-1, respectively. The sequence information is a useful resource for identification, cloning and functional genomic studies in future. The 14178 unigenes of NMK-1 were mapped over 14921 unigenes of Sitaphal. A total of 5689 unigenes were common between the two genotypes with more than 80% query coverage. Single nucleotide polymorphism was investigated in the 1160 unigenes with at least 500 bp length and showing at least 95% similarity between the two genotypes. The SNP analysis estimated about 0.35 and 0.33% heterozygosity in Sitaphal and NMK-1, respectively, after examining about 2.2 and 1.3 million nucleotide positions. The low level of heterozygosity agrees with the previous reports, notifying the development of true-to-type and uniform seedlings in A. squamosa [30,31].

Functional categorization by GO annotation

In total, 5401 (Sitaphal) and 6421 (NMK-1) unigenes, having sequence homology with uniprot annotations, were subjected to GO assignments for biological processes, cellular components and molecular functions categories. In the category of biological processes, unigenes related to metabolic processes (49.2% in Sitaphal and 75.3% in NMK-1), cellular processes (42.9% in Sitaphal and 77.3% in NMK-1), and response to stimulus (8.4% in Sitaphal and 26.2% in NMK-1) were predominant. In cellular components, genes related to cell parts (39.2% in Sitaphal and 81.8% in NMK-1) and organelles (23.5% in Sitaphal and 62.4% in NMK-1) were the most abundant classes. In molecular functions, genes involved in binding (38.1% in Sitaphal and 60.8% in NMK-1) and catalytic activities (38.3% Sitaphal and 49.1% in NMK-1) were abundantly expressed (Figure 3).
Figure 3

GO classifications of assembled unigenes, having sequence homology with uniprot proteins, assigned to 51 functional groups.

GO classifications of assembled unigenes, having sequence homology with uniprot proteins, assigned to 51 functional groups.

Comparative analysis with available public databases

The assembled unigenes were functionally annotated using BLASTx algorithm against the protein sequences of five public databases: NCBI NR, Prunus persica, Vitis vinifera and Fragaria vesca, and a primitive species, Amborella trichopoda, with an E-value cut-off of 10−5. The number and percentage of annotated unigenes of A. squmosa genotypes are given in Table 2. Of the 14921 (Sitaphal) and 14178 (NMK-1) unigenes, 10333 (69.25%) and 11676 (82.35%), respectively, showed significant similarity to NCBI NR protein database (Additional file 4). Furthermore, 60.33% to 61.57% (Sitaphal) and 77.32% to 78.47% (NMK-1) of the unigenes showed significant homology with the four plant species (Table 2). We obtained 1928 (12.92%) and 2825 (19.92%) unigenes with more than 90% subject coverage, suggesting quasi-full length genes in Sitaphal and NMK-1, respectively. However, 4588 (Sitaphal) and 2502 (NMK-1) unigenes did not match with any known protein in the NR database. These un-assigned transcripts may be novel genes or belong to untranslated regions, and could play specific roles in A. squamosa. The unigene sequence information would be useful as a reference for molecular biology research on A. squamosa and its related species.
Table 2

Number and percentage (in bracket) of unigenes in . genotypes (Sitaphal and NMK-1) from BLASTx searches against public protein databases of fruits crop and a closely related genus

Database Number of unigenes annotated in Sitaphal Number of unigenes annotated in NMK-1 Number of unigenes with ≥90 subject coverage in Sitaphal Number of unigenes with ≥90 subject coverage in NMK-1
NCBI non redundant10333 (69.25%)11676 (82.35%)1928 (12.92%)2825 (19.92%)
Vitis vinifera 9187 (61.57%)11126 (78.47%)1557 (10.43%)2289 (16.14%)
Prunus persica 9152 (61.33%)11108 (78.34%)1554 (10.41%)2313 (16.31%)
Fragaria vesca 9218 (61.77%)11206 (79.03%)968 (6.48%)1469 (10.36%)
Amborella trichopoda 9003 (60.33%)10963 (77.32%)1701 (11.40%)2577 (18.17%)
Number and percentage (in bracket) of unigenes in . genotypes (Sitaphal and NMK-1) from BLASTx searches against public protein databases of fruits crop and a closely related genus

Detection of transcript sequences related to hormone pathway, transcription factors and seed development

Fruit development is a complex process and involves numerous physiological and biochemical events which are initiated and regulated by hormonal signals [32]. Plant hormones, such as auxins, gibberellins, cytokinins, abscisic acid, ethylene, and brassinosteroids, play important role in fruit set and development [17,33]. Brassinosteroids are important for early fruit development [34], and the regulation of seed size [35] and number [36]. A total of 148 unigenes encoding putative hormone related genes were identified in A. squamosa (Table 3, Additional file 5), by BLASTx searches against the protein database of hormone pathway genes of A. thaliana.
Table 3

Summary of hormone related unigenes identified in the transcriptome of . genotypes (Sitaphal and NMK-1)

Hormones Number of unigenes
Brassinosteroid43
Auxin31
Abscisic acid28
Gibberellin20
Ethylene16
Cytokinin10

The details of the unigenes are given in Additional file 5.

Summary of hormone related unigenes identified in the transcriptome of . genotypes (Sitaphal and NMK-1) The details of the unigenes are given in Additional file 5. Transcription factors (TFs) control gene expression quantitatively, spatially and temporally [37]. It is desirable to identify the gene regulatory networks responsible for programming of early fruit development. The unigene sequences were annotated against the Plant-TFDB database of A. thaliana, to identify TFs which express during early phases of fruit development in A. squamosa. The BLASTx searches revealed a total of 319 unigenes matched with putative homologs of Arabidopsis TFs (Table 4, Additional file 6). The three most abundant families of transcription factors were bHLH, MYB and MYB-related, and C3H, represented by 34, 34 and 25 unigenes, respectively. Basic helix–loop–helix (bHLH) proteins participate in the regulation of a myriad of essential developmental and physiological processes, including reproductive development, determination of plant organ size, fruit and seed development [38,39]. The interplay of MYB factors, apart from transcription control on many crucial biological processes, regulates fruit and seed development [40,41]. Some of the C3H type TFs are embryo specific and play regulatory role in seed development [42].
Table 4

Summary of transcription factor related unigenes identified in the transcriptome of . genotypes (Sitaphal and NMK-1)

Transcription factor family Numbers of unigenes
bHLH34
C3H25
MYB17
MYB-related17
b ZIP15
NAC15
HD-ZIP14
C2H215
HB-other10
GRAS10
ARF11
WRKY14
Trihelix8
Mikc8
FAR18
G2-like8
ARR-B6
ERF6
TALE6
SBP6
CO-like5
HSF5

The table includes only those transcription factors which represent at least 5 unigenes in the transcriptome data. The details of the unigenes are given in Additional file 6.

Summary of transcription factor related unigenes identified in the transcriptome of . genotypes (Sitaphal and NMK-1) The table includes only those transcription factors which represent at least 5 unigenes in the transcriptome data. The details of the unigenes are given in Additional file 6. The BLAST search on the transcriptome data, using Arabidopsis protein sequences obtained from SeedGenes Project (http://www.seedgenes.org/), identified 379 transcripts associated with the development of seeds (Additional file 7). The sequence information on TFs, hormone and seed development related putative genes will be useful in examining the differential expression in the two genotypes of A. squamosa, with contrasting trait related to fruit and seed development.

Differentially expressed unigenes

The transcript abundance profile was examined for the 5689 unigenes common between the two genotypes, in the developing fruits at 0 and 8 DAP. At these stages, comparable numbers of contigs were identified in the two genotypes (Table 1). A total of 5504 unigenes were differentially expressed between the two genotypes in at least one time point (0 or 8 DAP) (Additional file 8). Among these, 1792 and 721 unigenes were up- and down-regulated, respectively, by ≥ 2 fold in Sitaphal at 8 DAP. By using the information of BLASTx searches against the protein database of A. thaliana, the differentially expressed unigenes (≥2 fold, 8 DAP) were mapped to terms in AgriGO and KEGG databases [43,44]. The GO enrichment patterns showed a disproportionate representation of unigenes involved in the biological process of reproductive structure, embryo, seed and fruit development in the two genotypes (Table 5, Additional file 9). The ontology analysis based on KEGG revealed the abundance of transcripts related to hormones, alkaloids, terpanoids, steroids, phenylpropanoids, spliceosome and other metabolic pathways in Sitaphal (Table 6, Additional file 9). The results indicate a distinctly more active primary and secondary metabolism in the early-stage fruits of Sitaphal as compared to the less seeded NMK-1. Hence, development of multiple seeds in Sitaphal was accompanied by a higher rate of metabolism in developing fruits.
Table 5

AgriGO categories (False discovery rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1

GO Term Ontology Description Sitaphal NMK-1
GO:0003006Preproductive developmental process760
GO:0005975Pcarbohydrate metabolic process630
GO:0005996Pmonosaccharide metabolic process230
GO:0006066Palcohol metabolic process320
GO:0006082Porganic acid metabolic process700
GO:0006091Pgeneration of precursor metabolites and energy300
GO:0006457Pprotein folding330
GO:0006461Pprotein complex assembly190
GO:0006508Pproteolysis690
GO:0006511Pubiquitin-dependent protein catabolic process360
GO:0006519Pcellular amino acid and derivative metabolic process580
GO:0006605Pprotein targeting190
GO:0006810Ptransport1280
GO:0006886Pintracellular protein transport400
GO:0006950Presponse to stress14777
GO:0006996Porganelle organization640
GO:0007017Pmicrotubule-based process170
GO:0007275Pmulticellular organismal development1380
GO:0008104Pprotein localization520
GO:0008152Pmetabolic process591230
GO:0008610Plipid biosynthetic process400
GO:0009266Presponse to temperature stimulus4123
GO:0009408Presponse to heat011
GO:0009628Presponse to abiotic stimulus12353
GO:0009790Pembryonic development450
GO:0009791Ppost-embryonic development8728
GO:0009987Pcellular process696251
GO:0010035Presponse to inorganic substance280
GO:0010154Pfruit development450
GO:0010876Plipid localization07
GO:0015031Pprotein transport490
GO:0016043Pcellular component organization970
GO:0016192Pvesicle-mediated transport290
GO:0019318Phexose metabolic process180
GO:0019538Pprotein metabolic process2410
GO:0019752Pcarboxylic acid metabolic process700
GO:0019941Pmodification-dependent protein catabolic process360
GO:0022414Preproductive process780
GO:0022607Pcellular component assembly320
GO:0032501Pmulticellular organismal process1440
GO:0032502Pdevelopmental process1560
GO:0033036Pmacromolecule localization6122
GO:0033365Pprotein localization in organelle140
GO:0034613Pcellular protein localization420
GO:0034621Pcellular macromolecular complex subunit organization240
GO:0034637Pcellular carbohydrate biosynthetic process210
GO:0034641Pcellular nitrogen compound metabolic process520
GO:0042180Pcellular ketone metabolic process710
GO:0043170Pmacromolecule metabolic process3770
GO:0043436Poxoacid metabolic process700
GO:0043632Pmodification-dependent macromolecule catabolic process360
GO:0043933Pmacromolecular complex subunit organization3014
GO:0044085Pcellular component biogenesis540
GO:0044106Pcellular amine metabolic process410
GO:0044237Pcellular metabolic process506197
GO:0044238Pprimary metabolic process5070
GO:0044248Pcellular catabolic process660
GO:0044257Pcellular protein catabolic process360
GO:0044260Pcellular macromolecule metabolic process3370
GO:0044262Pcellular carbohydrate metabolic process430
GO:0044265Pcellular macromolecule catabolic process500
GO:0044267Pcellular protein metabolic process2060
GO:0045184Pestablishment of protein localization490
GO:0046907Pintracellular transport540
GO:0048316Pseed development440
GO:0048513Porgan development610
GO:0048608Preproductive structure development750
GO:0048731Psystem development610
GO:0048856Panatomical structure development1150
GO:0050896Presponse to stimulus257114
GO:0051179Plocalization1310
GO:0051234Pestablishment of localization1280
GO:0051603Pproteolysis involved in cellular protein catabolic process360
GO:0051641Pcellular localization620
GO:0051649Pestablishment of localization in cell560
GO:0051716Pcellular response to stimulus620
GO:0065003Pmacromolecular complex assembly280
GO:0070271Pprotein complex biogenesis190
GO:0070727Pcellular macromolecule localization430

The details of the unigenes are given in Additional file 9.

Table 6

Pathway assignment based on KEGG (False Discovery Rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1

Description Sitaphal NMK-1
Metabolic pathways15263
Biosynthesis of plant hormones470
Spliceosome2312
Biosynthesis of alkaloids derived from terpenoid and polyketide280
Biosynthesis of terpenoids and steroids310
Biosynthesis of alkaloids derived from shikimate pathway280
Biosynthesis of alkaloids derived from ornithine, lysine and nicotinic acid270
Proteasome150
Citrate cycle (TCA cycle)140
Biosynthesis of alkaloids derived from histidine and purine240
Biosynthesis of phenylpropanoids310
Inositol phosphate metabolism100
Amino sugar and nucleotide sugar metabolism140
Endocytosis120
Aminoacyl-tRNA biosynthesis100
Ribosome190
Oxidative phosphorylation130

The details of the unigenes are given in Additional file 9.

AgriGO categories (False discovery rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1 The details of the unigenes are given in Additional file 9. Pathway assignment based on KEGG (False Discovery Rate ≤ 0.05) for putative genes up-regulated (≥2 fold) in early-stage fruits (8 DAP) of Sitaphal and NMK-1 The details of the unigenes are given in Additional file 9. The transcript level of several unigenes associated with hormones, transcription factors, and seed development were also differentially expressed between the two genotypes at 0 and 8 DAP (Additional file 8). Many of the putative orthologous genes which give a defective embryo and/or seed phenotype in Arabidopsis mutants [45], showed reduced expression in NMK-1 at 8 DAP (Figure 4, Additional file 8). Lower level of expression of the embryogenesis-related genes (Figure 4) could be indicative of aberrant embryo development, eventually affecting seed development in NMK-1 plants. The underlying transcriptional changes need to be validated with the accompanying anatomical and metabolic changes in the developing ovules. Moreover, further in-depth RNA-sequencing is required to generate comprehensive transcriptional profile for each developing stage of fruits.
Figure 4

Differential accumulation (≥2 fold, 8 DAP 0 DAP) of transcripts for embryogenesis related putative genes in early-stage fruits of Sitaphal and NMK-1. The orthologous genes give a defective embryo and/or seed phenotype in Arabidopsis mutants. The details of the differentially expressed transcripts are given in Additional file 8.

Differential accumulation (≥2 fold, 8 DAP 0 DAP) of transcripts for embryogenesis related putative genes in early-stage fruits of Sitaphal and NMK-1. The orthologous genes give a defective embryo and/or seed phenotype in Arabidopsis mutants. The details of the differentially expressed transcripts are given in Additional file 8.

Real time PCR

To validate the usefulness of the transcript sequences identified in the transcriptome resource, expression of five randomly selected unigenes was examined by real time PCR and compared with the RPKM expression values in the transcriptome data of 5689 unigenes. qRT-PCR was performed in the developing fruit (8 DAP) of Sitaphal and NMK-1, in three biological replicates. At 8 DAP initial cell division takes place in the zygote, which leads to the formation of the embryo [46]. Interestingly, the qRT-PCR analysis (Figure 5a; Additional file 10) suggested preferentially lower expression of the orthologous genes such as Clavata-3 (regulates seed formation [47]), Abnormal Suspensor-2 (involved in embryogenesis [48]), Embryo Defective-1144 (role in embryo development [49]), Embryo Defective-2742 (role in embryo development [50]), and Ovule abortion-9 (role in ovule development [51]). The qRT-PCR fold change was comparable to the RPKM values in the transcriptome data (Figure 5b). Thus, transcriptome data for the two contrasting Annona genotypes presented here is useful for identifying candidate genes for the development of less seeded fruits.
Figure 5

Quantitative RT-PCR analyses and RPKM expression value of 5 randomly selected candidate genes for seed development in Sitaphal and NMK-1, at 8 DAP. Quantitative RT-PCR analyses (a). Each bar indicates standard error in three biological replicates (*p ≤ 0.05). A detail of the primers is given in Additional file 10. The qRT-PCR fold change is comparable with RPKM values in transcriptome data (b).

Quantitative RT-PCR analyses and RPKM expression value of 5 randomly selected candidate genes for seed development in Sitaphal and NMK-1, at 8 DAP. Quantitative RT-PCR analyses (a). Each bar indicates standard error in three biological replicates (*p ≤ 0.05). A detail of the primers is given in Additional file 10. The qRT-PCR fold change is comparable with RPKM values in transcriptome data (b).

Mining of SSRs

Identification of SSRs was carried out to generate information for the development of molecular markers for future studies on genetic diversity in A. squamosa. In total, 2629 and 3445 SSR motifs were identified in 2045 and 2678 transcripts for Sitaphal and NMK-1, respectively (Table 7). Out of the transcripts analyzed, 417 and 541 contained more than one SSR, whereas, 324 and 428 were in compound form in Sitaphal and NMK-1, respectively. The mono-nucleotide SSRs represented the largest fraction of SSRs (35.71% in Sitaphal and 44.06% in NMK-1), followed by di-nucleotide (30.88% in Sitaphal and 25.54% in NMK-1) and/or tri-nucleotide (29.25% in Sitaphal and 26.47% in NMK-1) SSRs. Tetra-, penta- and hexa-nucleotide SSRs were identified in a small fraction (0.030-0.004%) (Table 8). The 5689 unigenes, common between the two genotypes, were examined for the presence of SSRs with differences in length. In total, 18 SSRs were identified with variable number of tandem repeat loci between the two genotypes (Additional file 11). The SSR motifs could be potential candidates for transcript based microsatellite marker development, useful in determining functional genetic variation in A. squamosa [52].
Table 7

Statistics of SSRs identified in the transcripts of . genotypes (Sitaphal and NMK-1)

Statistics of SSRs Sitaphal NMK-1
Total number of sequences examined1492114178
Total size of examined sequences1310128815606312
Total number of identified SSRs26293445
Number of SSR containing sequences20452678
Number of sequences containing more than one SSR417541
Number of SSRs present in compound Formation324428
Frequency of SSRs4.53 kb / SSR4.98 kb / SSR
Table 8

Classes of SSR repeat motifs in the transcriptome of . genotypes (Sitaphal and NMK-1)

Motifs Sitaphal NMK-1
Mono-nucleotides939 (35.71%)1518 (44.06%)
Di-nucleotides812 (30.88%)880 (25.54%)
Tri-nucleotides769 (29.25%)912 (26.47%)
Tetra-nucleotides81 (0.030%)87 (0.025%)
Penta-nucleotides12 (0.004%)18 (0.005%)
Hexa-nucleotides16 (0.006%)30 (0.008%)
Statistics of SSRs identified in the transcripts of . genotypes (Sitaphal and NMK-1) Classes of SSR repeat motifs in the transcriptome of . genotypes (Sitaphal and NMK-1)

Annona transcriptome web resource

A web resource has been developed where entire assembled transcripts are available in BLAST enabled search format (www.annonatranscriptome.nabi.res.in). The web resource is useful for researchers in data-mining and to access pre-computed annotations.

Conclusion

The study provides transcriptome information on A. squamosa. We report sequencing, de novo assembly and analysis of early-stage fruit transcriptome of two genotypes with contrasting level of seed number in fruits. Orthologous genes related to hormone pathways, transcription factors and seed development were determined in the early-stage fruit tramscriptome. Differentially expressed unigenes were identified between the two genotypes. Several of such unigenes were related to seed and fruit related traits, and expressed at a higher level in the densely seeded genotype, Sitaphal. Additionally, a large number of SSRs were identified, which will be a useful resource in marker development for future genetic studies in Annona sp. This repository will serve as a useful resource for investigating the molecular mechanisms of fruit development, and improvement of fruit related traits in A. squamosa and related species.

Availability of supporting data

The RNA-seq data is available in the NCBI Sequence Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/sra), under accession number SRP042646.
  36 in total

1.  Differential expression of structural genes for the late phase of phytic acid biosynthesis in developing seeds of wheat (Triticum aestivum L.).

Authors:  Kaushal Kumar Bhati; Sipla Aggarwal; Shivani Sharma; Shrikant Mantri; Sudhir P Singh; Sherry Bhalla; Jagdeep Kaur; Siddharth Tiwari; Joy K Roy; Rakesh Tuli; Ajay K Pandey
Journal:  Plant Sci       Date:  2014-04-18       Impact factor: 4.729

2.  Gene and metabolite regulatory network analysis of early developing fruit tissues highlights new candidate genes for the control of tomato fruit composition and development.

Authors:  Fabien Mounet; Annick Moing; Virginie Garcia; Johann Petit; Michael Maucourt; Catherine Deborde; Stéphane Bernillon; Gwénaëlle Le Gall; Ian Colquhoun; Marianne Defernez; Jean-Luc Giraudel; Dominique Rolin; Christophe Rothan; Martine Lemaire-Chamley
Journal:  Plant Physiol       Date:  2009-01-14       Impact factor: 8.340

3.  agriGO: a GO analysis toolkit for the agricultural community.

Authors:  Zhou Du; Xin Zhou; Yi Ling; Zhenhai Zhang; Zhen Su
Journal:  Nucleic Acids Res       Date:  2010-04-30       Impact factor: 16.971

4.  BR signal influences Arabidopsis ovule and seed number through regulating related genes expression by BZR1.

Authors:  Hui-Ya Huang; Wen-Bo Jiang; Yu-Wei Hu; Ping Wu; Jia-Ying Zhu; Wan-Qi Liang; Zhi-Yong Wang; Wen-Hui Lin
Journal:  Mol Plant       Date:  2012-08-22       Impact factor: 13.164

5.  Laser capture microdissection for the analysis of gene expression during embryogenesis of Arabidopsis.

Authors:  Stuart Casson; Matthew Spencer; Katherine Walker; Keith Lindsey
Journal:  Plant J       Date:  2005-04       Impact factor: 6.417

6.  AceView: a comprehensive cDNA-supported gene and transcripts annotation.

Authors:  Danielle Thierry-Mieg; Jean Thierry-Mieg
Journal:  Genome Biol       Date:  2006-08-07       Impact factor: 13.583

7.  The progamic phase of an early-divergent angiosperm, Annona cherimola (Annonaceae).

Authors:  J Lora; J I Hormaza; M Herrero
Journal:  Ann Bot       Date:  2009-11-19       Impact factor: 4.357

8.  A dynamic interplay between phytohormones is required for fruit development, maturation, and ripening.

Authors:  Peter McAtee; Siti Karim; Robert Schaffer; Karine David
Journal:  Front Plant Sci       Date:  2013-04-17       Impact factor: 5.753

9.  PlantGSEA: a gene set enrichment analysis toolkit for plant community.

Authors:  Xin Yi; Zhou Du; Zhen Su
Journal:  Nucleic Acids Res       Date:  2013-04-30       Impact factor: 16.971

10.  A role of brassinosteroids in early fruit development in cucumber.

Authors:  Feng Qing Fu; Wei Hua Mao; Kai Shi; Yan Hong Zhou; Tadao Asami; Jing Quan Yu
Journal:  J Exp Bot       Date:  2008-05-31       Impact factor: 6.992

View more
  4 in total

1.  Molecular diversity of Annona species and proximate fruit composition of selected genotypes.

Authors:  Hirdayesh Anuragi; Haresh L Dhaduk; Sushil Kumar; Jitendra J Dhruve; Mithil J Parekh; Amar A Sakure
Journal:  3 Biotech       Date:  2016-09-23       Impact factor: 2.406

2.  Transcriptome Analysis and Identification of Genes Associated with Floral Transition and Flower Development in Sugar Apple (Annona squamosa L.).

Authors:  Kaidong Liu; Shaoxian Feng; Yaoling Pan; Jundi Zhong; Yan Chen; Changchun Yuan; Haili Li
Journal:  Front Plant Sci       Date:  2016-11-09       Impact factor: 5.753

3.  Transcriptional changes during ovule development in two genotypes of litchi (Litchi chinensis Sonn.) with contrast in seed size.

Authors:  Ashish K Pathak; Sudhir P Singh; Yogesh Gupta; Anoop K S Gurjar; Shrikant S Mantri; Rakesh Tuli
Journal:  Sci Rep       Date:  2016-11-08       Impact factor: 4.379

4.  Transcriptome Analysis of Soursop (Annona muricata L.) Fruit under Postharvest Storage Identifies Genes Families Involved in Ripening.

Authors:  Yolotzin Apatzingan Palomino-Hermosillo; Guillermo Berumen-Varela; Verónica Alhelí Ochoa-Jiménez; Rosendo Balois-Morales; José Orlando Jiménez-Zurita; Pedro Ulises Bautista-Rosales; Mónica Elizabeth Martínez-González; Graciela Guadalupe López-Guzmán; Moisés Alberto Cortés-Cruz; Luis Felipe Guzmán; Fernanda Cornejo-Granados; Luigui Gallardo-Becerra; Adrian Ochoa-Leyva; Iran Alia-Tejacal
Journal:  Plants (Basel)       Date:  2022-07-07
  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.