
Genome-wide cataloging and analysis of alternatively spliced genes in cereal crops.

Xiang Jia Min1,2, Brian Powell3, Jonathan Braessler3, John Meinken4,3,5, Feng Yu3, Gaurav Sablok6.   

Abstract

BACKGROUND: Protein functional diversity at the post-transcriptional level is regulated through spliceosome-mediated alternative splicing (AS) of pre-mRNA, which has been widely demonstrated to be a key regulator of functional diversity in plants. Identification and analysis of AS genes in cereal crop plants are critical for crop improvement and for understanding regulatory mechanisms.
RESULTS: We carried out a comparative analysis of the functional landscape of AS using consensus assemblies of expressed sequence tags and available mRNA sequences in four cereal plants. We identified a total of 8,734 AS genes in Oryza sativa subspecies (ssp) japonica, 2,657 in O. sativa ssp indica, 3,971 in Sorghum bicolor, and 10,687 in Zea mays. Among the identified AS events, intron retention was the dominant type, accounting for 23.5 % of events in S. bicolor and up to 55.8 % in O. sativa ssp indica. We identified a total of 887 AS genes that were conserved among Z. mays, S. bicolor, and O. sativa ssp japonica, and 248 AS genes that were conserved among all four studied species or subspecies. Furthermore, we identified 53 AS genes conserved with Brachypodium distachyon. Gene Ontology classification of the AS genes assigned them to many biological processes with diverse molecular functions.
CONCLUSIONS: AS is common in cereal plants. The AS genes identified in four cereal crops in this work provide the foundation for further studying the roles of AS in regulation of cereal plant growth and development. The data can be accessed at Plant Alternative Splicing Database (http://proteomics.ysu.edu/altsplice/).

Year:  2015        PMID: 26391769      PMCID: PMC4578763          DOI: 10.1186/s12864-015-1914-5

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Spliceosome-mediated post-transcriptional modifications are among the biggest challenges in understanding and predicting the complexity of proteome diversity [1, 2]. One of the most important mechanisms contributing to the diversity of protein isoforms is alternative splicing (AS), which modulates protein function by joining the functional units (exons and introns) of a transcript in different combinations [3]. In addition to the basic AS types, namely exon skipping (ES), alternative donor (AltD) or acceptor (AltA) site, and intron retention (IR), various complex types can be formed by combinations of these basic events [4, 5]. Apart from the four basic events, alternative transcripts may arise through alternative transcription initiation, alternative transcription termination, and alternative polyadenylation [2]. AS isoforms may encode distinct functional proteins, or may be nonfunctional because they harbor a premature termination codon. These nonfunctional isoforms, generated through a process called “regulated unproductive splicing and translation”, are degraded by nonsense-mediated decay [6]. Previous reports estimated that around 90 % of human genes containing multiple exons are alternatively spliced [7, 8]. In line with these reports in humans, AS has been shown to be a major contributor to plant proteome diversity, with 60 % of Arabidopsis thaliana multi-exon genes undergoing AS [9]. Genome-wide identification and physiological implications of AS have been reported in a number of model and non-model plant species, including A. thaliana [10-13], Oryza sativa [14], Nelumbo nucifera (sacred lotus) [15], Vitis vinifera [16], and Brachypodium distachyon [5, 17].
AS transcripts are generally generated through three pathways: (1) IR in the mature mRNA; (2) alternative exon usage (AEU), resulting in ES; and (3) the use of cryptic splice sites that elongate or shorten an exon, generating an AltD site, an AltA site, or both [14, 17]. Approximately 60–75 % of AS events occur within the protein coding regions of mRNAs, resulting in changes in binding properties, intracellular localization, protein stability, and enzymatic and signaling activities [18]. In plants, IR has been shown to be the dominant form, and the reported proportion of intron-containing genes undergoing AS ranges from ~30 % to >60 % depending on the depth of available transcriptome data [4, 5]. In contrast, recent reports suggest down-regulation of IR events and up-regulation of alternative donor/acceptor site (AltDA) and ES events under heat stress in the model moss Physcomitrella patens [19]. With the advent of next generation sequencing (NGS) based approaches, fine-scale studies have revealed AS as a prominent mechanism that modulates microRNA-mediated gene regulation by increasing the complexity of alternative mRNA processing in Arabidopsis [20]. Complex networks of gene expression regulation and variation in AS have played a major role in the adaptation of plants to their environments and in coping with environmental stresses [13]. Rice (O. sativa ssp japonica and indica), maize (Zea mays), and sorghum (Sorghum bicolor) are important cereal crops and major sources of food in many countries. Several approaches have previously been used to identify quantitative trait loci, genes, and proteins linked to functional grain content in these species [21]. However, a major portion of gene functional diversity is controlled by spliceosome-regulated AS.
AS has been shown to be a critical regulator in the grass clade, with several genes involved in flowering and abiotic stress responses undergoing alternative splicing [4, 17, 22]. Identifying AS genes in these cereal plants is the first step toward understanding the functions and regulation of these genes in plant development and in abiotic or biotic stress resistance. Previously, using a homology-based mapping approach and expressed sequence tags (ESTs) representing functional transcripts, we identified a total of 941 AS genes in B. distachyon, a model temperate grass [5, 17]. Previous and recent reports on the identification and prevalence of AS events in O. sativa [4, 23], S. bicolor [24], and Z. mays [25] have characterized functional diversity through EST and RNA-seq approaches. A previous report by Ner-Gaon et al. suggested a 3.7-fold difference in AS rates between O. sativa and S. bicolor based on gapped alignment of EST pairs [26]. Given the lack of comparative analyses of AS events in cereal plants and the importance of these food crops under a changing climate, we carried out a large-scale analysis using the currently available EST and mRNA data to identify species-specific and conserved AS events across cereal plants. In this work, we compared the AS event landscape and the functional diversity of AS genes in O. sativa ssp japonica and indica, S. bicolor, and Z. mays, with much deeper coverage of the identified AS events, and compared these AS genes with those identified in B. distachyon to reveal conserved patterns of AS across the grass species. The identified AS events will allow experimental characterization of the AS genes involved in important physiological processes.
Investigation of genome-wide conserved AS events across different species will shed light on the evolution of functional diversity in cereal plants for crop improvement.

Methods

Sequence datasets and sequence assembly

To identify putative functional transcriptional changes across the Panicoideae lineage, we systematically queried and downloaded expressed sequence tags (ESTs) and mRNA sequences of O. sativa ssp japonica and indica, S. bicolor, and Z. mays from dbEST and the nucleotide repository of the National Center for Biotechnology Information (NCBI; www.ncbi.nlm.nih.gov). Prior to aligning the ESTs/mRNAs to the corresponding genomic sequences, we applied a stringent cleaning procedure: 1) EST and mRNA sequences were trimmed of polyA or polyT ends using the EMBOSS “trim” tool; 2) the trimmed sequences were searched with BLASTN against the UniVec and E. coli databases to remove vector and E. coli contaminants; 3) the sequences were searched with BLASTN against a plant repeat database built from TIGR gramineae repeat data and species-specific repeat data for sorghum, maize, and rice, available from ftp://ftp.plantbiology.msu.edu/pub/data/TIGR_Plant_Repeats/. Following cleaning, we assembled the rice and sorghum EST and mRNA sequences using CAP3 with the following parameters: -p 95 -o 50 -g 3 -y 50 -t 1000 [27]. For maize, the large number of available ESTs makes a single assembly difficult, so we followed an alternative strategy. We first mapped ESTs and mRNA sequences to each individual chromosome of the maize genome using GMAP with default settings [28], and then assembled the ESTs and mRNAs mapped to each chromosome separately using CAP3 with the parameters given above. The unmapped data and the assembled data from each individual assembly were combined and re-assembled using CAP3 to generate a final consensus assembly for identification of AS events. The raw data and assembled data for each organism are summarized in Table 1.
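The CAP3 settings quoted above can be captured in a small command builder. This is a minimal sketch, assuming a `cap3` executable on the PATH and a hypothetical input file name (`cleaned_ests.fasta`); it only assembles the argument list and does not run the assembler or reproduce the authors' scripts.

```python
from typing import List

# CAP3 options as quoted in the text: -p 95 -o 50 -g 3 -y 50 -t 1000
CAP3_OPTIONS = {"-p": 95, "-o": 50, "-g": 3, "-y": 50, "-t": 1000}


def build_cap3_command(fasta_path: str, executable: str = "cap3") -> List[str]:
    """Return the argument list for one CAP3 run (nothing is executed here)."""
    command = [executable, fasta_path]
    for flag, value in CAP3_OPTIONS.items():
        command.extend([flag, str(value)])
    return command


if __name__ == "__main__":
    # "cleaned_ests.fasta" is a hypothetical file name used for illustration.
    print(" ".join(build_cap3_command("cleaned_ests.fasta")))
```

For the per-chromosome maize strategy, the same builder could be called once per chromosome-specific FASTA file and once more for the combined re-assembly.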
For the prediction of the AS events, genome sequences, predicted protein coding DNA sequences (CDS), and related GFF data of O. sativa ssp japonica, Z. mays, and S. bicolor were downloaded from Phytozome database (http://www.phytozome.net/) [29-32]. The genome sequences and CDS data of O. sativa ssp indica (strain 93–11) were downloaded from BGI database (http://rise2.genomics.org.cn/page/rice/index.jsp) [33].
Table 1

Summary of raw sequence data and assembled data in each organism

Species | ESTs | mRNAs | Total Sequences | Cleaned Sequences | Total PUTs | Average Length (bp)
O. sativa ssp japonica | 987327 | 82451 | 1069868 | 1053842 | 163778 | 783
O. sativa ssp indica | 207012 | 11953 | 219065 | 212768 | 102424 | 751
S. bicolor | 209835 | 33248 | 243083 | 241690 | 60189 | 1002
Z. mays | 2019524 | 91990 | 2111514 | 1822653 | 488243 | 466

PUTs putative unique transcripts


Putative unique transcripts to genome mapping, identification and functional annotation of AS isoforms

Given the genome duplication events in Z. mays and S. bicolor, accurate prediction of AS events is a long-standing concern. In our study, the EST and mRNA assemblies, i.e. putative unique transcripts (hereafter referred to as PUTs), were mapped to the corresponding genomic sequences using an in-house developed algorithm, ASFinder (http://proteomics.ysu.edu/tools/ASFinder.html/) [34]. ASFinder uses the SIM4 program [35] to map PUTs to the corresponding genome and then identifies PUTs that map to the same genomic location but have variable exon-intron boundaries as AS isoforms. To avoid calling spurious AS events, we required a minimum of 95 % identity between the aligned PUT and the genomic sequence, a minimum aligned length of 80 bp, and >75 % of the PUT sequence aligned to the genome [17]. These identity and aligned-length thresholds reduce the chance of false positive AS calls resulting from genome duplication events. The output file (AS.gtf) of ASFinder was then submitted to the AStalavista server (http://genome.crg.es/astalavista/) for AS event analysis [36]. The percentage of AS genes was estimated as the number of predicted gene models having AS PUT isoforms divided by the total number of gene models having at least one PUT; the results are presented in Table 2.
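The three alignment thresholds described above can be expressed as a simple filter. The sketch below is illustrative only: the `PutAlignment` record and its field names are hypothetical and are not taken from ASFinder or SIM4 output formats.

```python
from dataclasses import dataclass


@dataclass
class PutAlignment:
    """Hypothetical summary of one PUT-to-genome alignment."""
    identity_pct: float  # percent identity of the aligned region
    aligned_bp: int      # aligned length in bp
    put_length: int      # full length of the PUT in bp


def passes_thresholds(aln: PutAlignment,
                      min_identity: float = 95.0,
                      min_aligned_bp: int = 80,
                      min_coverage: float = 0.75) -> bool:
    """Apply the thresholds from the text: >=95 % identity,
    >=80 bp aligned, and >75 % of the PUT aligned."""
    if aln.put_length <= 0:
        return False
    coverage = aln.aligned_bp / aln.put_length
    return (aln.identity_pct >= min_identity
            and aln.aligned_bp >= min_aligned_bp
            and coverage > min_coverage)


if __name__ == "__main__":
    print(passes_thresholds(PutAlignment(96.0, 400, 500)))  # coverage 0.8
```

Keeping the cutoffs as keyword arguments makes it easy to see how sensitive the AS calls are to each threshold.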
Table 2

Percentage of alternative splicing genes

Species | Total mapped PUTs (%) | PUT match to gene model | Total unique genes | AS genes | AS (%)
O. sativa ssp japonica | 104447 (63.8) | 71830 | 26191 | 7883 | 30.1
O. sativa ssp indica | 47843 (46.7) | 36467 | 17402 | 2414 | 13.9
S. bicolor | 50224 (83.4) | 38654 | 26540 | 3580 | 13.5
Z. mays | 207332 (42.5) | 119418 | 28698 | 9689 | 33.8

AS Alternative splicing
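As a worked check of the AS percentage defined for Table 2 (AS genes divided by the gene models with at least one PUT), the short sketch below recomputes the last column from the counts in the table. The dictionary layout and function name are illustrative choices, not part of the original pipeline.

```python
# (total unique genes with PUT evidence, AS genes), taken from Table 2
TABLE2_COUNTS = {
    "O. sativa ssp japonica": (26191, 7883),
    "O. sativa ssp indica": (17402, 2414),
    "S. bicolor": (26540, 3580),
    "Z. mays": (28698, 9689),
}


def as_percentage(genes_with_put: int, as_genes: int) -> float:
    """Percentage of PUT-supported gene models that have AS isoforms."""
    return round(100.0 * as_genes / genes_with_put, 1)


if __name__ == "__main__":
    for species, (genes, as_genes) in TABLE2_COUNTS.items():
        print(f"{species}: {as_percentage(genes, as_genes)} %")
```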

We further queried the coding potential and corresponding coding frame of each PUT using ORFPredictor [37], and assessed full-length transcript coverage using TargetIdentifier [38], as previously described. Functional classification was assigned to the PUTs by performing BLASTX searches with an E-value threshold of 1E-5 against UniProtKB/Swiss-Prot. Predicted protein sequences from ORFPredictor were further annotated using rpsBLAST against the PFAM database (http://pfam.xfam.org/). Gene Ontology (GO) terms were assigned on the basis of the functional homology obtained from the BLASTX searches against UniProtKB/Swiss-Prot. The GO categories were further analyzed using GO SlimViewer with plant-specific GO terms [39]. To assess the functional coverage of the assembled PUTs, we compared the PUTs against the predicted gene primary transcripts using BLASTN with a cutoff E-value of 1E-10, ≥ 95 % identity, and a minimum aligned length of 80 bp.

Conserved alternatively spliced genes in cereal plants and visualization of AS

To identify potentially conserved AS genes among O. sativa ssp japonica and indica, Z. mays, and S. bicolor, reciprocal BLASTP searches (cutoff E-value 1E-10) were performed using the longest (or longer) ORF of the AS PUT isoforms to classify conserved AS pairs between species or subspecies. Venn diagrams of the conserved AS pairs were generated using the R programming language (http://www.r-project.org/). Visualization of AS events with genome tracks is important for two reasons: (1) to inspect the corresponding genomic coordinates and the associated genic functional changes; and (2) to extract the spliced region of interest for designing primers for putative AS events. With these points in mind, the AS events identified in this study, along with the integrated genomic tracks, are available from the Plant Alternative Splicing Database (http://proteomics.ysu.edu/altsplice/) [15, 17]. The pages for the cereal plants offer several end-user functions, such as querying by PUT ID, gene ID, keywords in the functional annotation, PFAM, or AS event type. Additionally, the identified AS events can be visualized and compared with predicted gene models using GBrowse. We also deployed BLASTN functionality to search the PUTs and AS isoforms. The data analyzed in the present research, along with the GO and PFAM annotations, are publicly available at: http://proteomics.ysu.edu/publication/data/.
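The reciprocal BLASTP step can be illustrated with a small reciprocal-best-hit routine. This is a sketch under stated assumptions: hits are given as simple (query, subject, E-value) tuples, the best hit is taken as the one with the lowest E-value (the text does not say how ties or bit scores were handled), and all identifiers in the example are made up.

```python
from typing import Dict, Iterable, List, Tuple

Hit = Tuple[str, str, float]  # (query ID, subject ID, E-value)


def best_hit_per_query(hits: Iterable[Hit], max_evalue: float = 1e-10) -> Dict[str, str]:
    """Keep, for each query, the subject with the lowest E-value at or below the cutoff."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue in hits:
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit],
                         max_evalue: float = 1e-10) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hit_per_query(a_vs_b, max_evalue)
    best_ba = best_hit_per_query(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


if __name__ == "__main__":
    # Hypothetical ORF identifiers for two species, A and B.
    a_vs_b = [("a1", "b1", 1e-50), ("a1", "b2", 1e-20),
              ("a2", "b2", 1e-30), ("a3", "b3", 1e-5)]
    b_vs_a = [("b1", "a1", 1e-48), ("b2", "a1", 1e-25), ("b3", "a3", 1e-40)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Genes conserved across three or four species can then be found by intersecting the pairwise results, which is what a Venn diagram of the conserved pairs summarizes.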

Results and discussion

EST assembly and annotation

Optimizing the assembly parameters and mapping functionally annotated PUTs are key to robust identification and classification of AS events. Table 1 presents the assembly information, including the EST and mRNA counts for each species, the cleaned sequences used for assembly, the number of assembled consensus sequences, and their average length. We assembled a total of 163,778 PUTs in O. sativa ssp japonica, 102,424 PUTs in O. sativa ssp indica, 60,189 PUTs in S. bicolor, and 488,243 PUTs in Z. mays. The average length (N50) of assembled PUTs was 783 bp in O. sativa ssp japonica, 751 bp in O. sativa ssp indica, 1,002 bp in S. bicolor, and 466 bp in Z. mays. To check the coverage of the assembled functional transcriptome, all assembled PUTs were structurally and functionally annotated, including putative open reading frame (ORF) prediction, coding region full-length prediction, and putative function and PFAM prediction, which supports the reliability of the assembly strategies for large, complex genomes that have undergone whole genome duplication events. PUTs were mapped to their corresponding genomes, and predicted gene models were also visualized using GBrowse.

Gapped alignments of PUTs to genome, detection and classification of alternative splicing events

Following sequence assembly, the resulting unique PUTs were mapped onto their corresponding genomic sequences using gapped alignments as implemented in SIM4, which is integrated into ASFinder [34]. The numbers of mapped PUTs and matched gene models, as well as the numbers of observed AS genes, are presented in Table 2. A relatively larger proportion of PUTs in S. bicolor (83.4 %) and O. sativa ssp japonica (63.8 %) aligned to their genomes compared with the other cereal plants. We identified a total of 8,734 AS genes in Oryza sativa subspecies (ssp) japonica, 2,657 in O. sativa ssp indica, 3,971 in Sorghum bicolor, and 10,687 in Zea mays (Table 3). The percentage of AS genes was estimated as the proportion of predicted gene models having AS PUT isoforms among the total gene models having EST (PUT) evidence (Table 2). The percentages of AS genes vary among cereal plants: they are relatively high in O. sativa ssp japonica (30.1 %) and Z. mays (33.8 %), and relatively low in O. sativa ssp indica (13.9 %) and S. bicolor (13.5 %). The differences in mapping rate and AS rate might be due to differences in the number of ESTs available for the respective species. Previous reports on AS in B. distachyon clearly illustrate that the availability of more ESTs/mRNAs leads to a more complete prediction of the AS landscape [5, 17].
Table 3

Alternative splicing events in different cereal species

Species | IR (%) | AltD (%) | AltA (%) | ES (%) | Complex event (%) | Total events | Total AS genes
O. sativa ssp japonica | 8288 (42.0) | 1245 (6.3) | 1950 (9.9) | 762 (3.9) | 7447 (37.8) | 19692 | 8734
O. sativa ssp indica | 2193 (55.8) | 332 (8.5) | 576 (14.7) | 161 (4.1) | 665 (16.9) | 3927 | 2657
S. bicolor | 4448 (23.5) | 1072 (5.7) | 1230 (6.5) | 507 (2.6) | 11681 (61.7) | 18938 | 3971
Z. mays | 11048 (40.4) | 2080 (7.6) | 3314 (11.4) | 1568 (5.7) | 5576 (20.4) | 23386 | 10687

IR Intron Retention, AltD Alternative donor, AltA Alternative acceptor, ES exon skipping
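The percentages in Table 3 are each event type's share of the total events for a species. As a minimal sketch, the following recomputes them for the O. sativa ssp indica row from the counts in the table; the function name and dictionary layout are illustrative.

```python
from typing import Dict

# Event counts for O. sativa ssp indica, taken from Table 3
INDICA_EVENTS = {"IR": 2193, "AltD": 332, "AltA": 576, "ES": 161, "Complex": 665}


def event_percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Share of each AS event type among all events, in percent (one decimal)."""
    total = sum(counts.values())
    return {event: round(100.0 * n / total, 1) for event, n in counts.items()}


if __name__ == "__main__":
    print(sum(INDICA_EVENTS.values()))       # total events
    print(event_percentages(INDICA_EVENTS))  # per-type percentages
```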

Recent reports using RNA-seq technology revealed that AS is common in plants: around 61 % of multi-exonic genes in A. thaliana are alternatively spliced under normal growth conditions [12], and ~40 % of intron-containing genes undergo AS in maize [25]. The classification of the AS events observed in the cereal plants is given in Table 3, which shows IR as the major splicing type, with a frequency as high as 55.8 % in O. sativa ssp indica and as low as 23.5 % in S. bicolor. The high frequency of IR in the mature mRNA is in line with the previously observed frequencies of IR (30–50 %) in the AS landscapes of A. thaliana and O. sativa [14]. It is worth mentioning that the plant spliceosomal machinery supports the intron definition model, in which introns are recognized for pre-mRNA splicing, as opposed to the exon definition model that predominates in mammals. Previous work has argued for the benefits of retaining introns, either as potential cytoplasmic translatable transcripts [26] or as mediators of increased gene expression, a process widely described as intron-mediated enhancement (IME) of gene expression [40]. The abundance of IR as a major AS event is consistent with previous reports, including Medicago truncatula (39 %), Populus trichocarpa (34 %), A. thaliana (56 %), O. sativa (54 %), Chlamydomonas reinhardtii (50 %), Z. mays (58–62 %), and B. distachyon (55.5 %) [14, 17, 25, 41, 42]. In contrast, IR was recently found to be remarkably repressed under elevated temperature in P. patens [19]. Alternative acceptor (AltA) and alternative donor (AltD) events are the next most abundant basic classes of AS events, with AltA showing a relatively higher frequency than AltD (Table 3).
Although ES events have been described as the rarest events in plants, which is in line with the results of this study, they have recently been proposed as candidates for transgene regulation using conditional splicing [43]. We noted that 61.7 % of events in sorghum are complex events, which contain more than one basic event in the compared PUT pairs. This is likely related to the relatively longer PUTs in the sorghum assembly. Recent reports suggest differential up-regulation of alternative donor/acceptor site (AltDA) and ES events, highlighting the importance of these events as indicators of early heat stress [19]. Our data clearly show that the number of AS genes and the percentage of genes with AS differ among crops (Tables 2 and 3). However, this observation only reflects the current state of knowledge in these plants based on the available data. Our previous analysis of AS in B. distachyon clearly demonstrated that more AS genes were identified as more EST/mRNA data became available [5, 17]. This is also consistent with the increasing reported frequency of AS in Arabidopsis over time, which reflects the accumulation of available transcriptome data: only 1.2 % of Arabidopsis genes were reported to undergo AS in 2003, whereas over 60 % of intron-containing genes are now estimated to undergo AS [13].

Features of exons and introns in protein coding genes: indicators of gene evolution

Understanding the patterns of gene evolution and identifying signatures of convergent and divergent evolution are of paramount importance, especially when addressing genome complexity in terms of gene evolution. Exon-intron properties such as length distribution and GC content have previously been used to study gene evolution [44]. Additionally, longer introns, compared with short introns, have been shown to play an important role in gene expression [40, 45]. However, Yang [46] reported a negative correlation between long introns and expression levels in A. thaliana and O. sativa. Given the importance of exon-intron features in evolution and physiological responses, we extracted and plotted the length distributions of all internal exons and introns from each plant; the results are summarized in Table 4, Fig. 1, and Fig. 2. Interestingly, the average internal exon lengths in O. sativa ssp indica and Z. mays are similar and are much shorter than those in O. sativa ssp japonica and S. bicolor. On the other hand, Z. mays had the longest average intron length (554 bp) and showed wider variation in intron lengths than the other three cereal plants in this study, whose average intron lengths ranged from 422 to 440 bp. Further analysis of the exon and intron size distributions showed that Z. mays and O. sativa ssp indica had a much higher proportion of small internal exons (<120 bp) (Fig. 1). The observed frequency of internal exon lengths below 300 bp was 0.93 in Z. mays, 0.95 in O. sativa ssp indica, 0.89 in S. bicolor, and 0.90 in O. sativa ssp japonica. S. bicolor and O. sativa ssp japonica displayed more exons of relatively large size, whereas Z. mays displayed a higher number of long introns (Fig. 2).
Intron richness, and specifically the presence of long introns, has previously been shown to be associated with increased expression of the Adh1, Sh1, Bz1, Hsp82, actin, and GapA1 genes in Z. mays [47-51] and the salT, Act1, and tpi genes in rice [52, 53]. Additionally, a relatively higher proportion of short introns was observed in S. bicolor. We also observed that ~2 % of introns in maize and a small number of introns (<0.5 %) in the other plants were >10 kb in size. However, taking into account possible errors in PUT and genome assembly, these long introns were not included in the calculation of the average intron size. It is worth mentioning that the average internal exon size (180 bp) and intron size (440 bp) in O. sativa ssp japonica obtained in this work were close to the exon (193 bp) and intron (433 bp) sizes obtained previously in O. sativa, which supports the robustness of the implemented approach [14].
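The exclusion of very long introns from the average can be sketched as follows. This is an illustrative example only: the 10,000 bp cutoff follows the ">10 kb" statement above, and the intron lengths in the usage example are made up.

```python
from statistics import mean
from typing import Iterable


def mean_intron_size(lengths: Iterable[int], max_bp: int = 10_000) -> float:
    """Average intron length after dropping introns longer than max_bp."""
    kept = [n for n in lengths if n <= max_bp]
    if not kept:
        raise ValueError("no introns at or below the length cutoff")
    return mean(kept)


if __name__ == "__main__":
    # Hypothetical intron lengths (bp); the 25,000 bp intron is excluded.
    print(mean_intron_size([100, 300, 500, 25_000]))
```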
Table 4

Exon and intron size in cereal plants

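Species | Exon sample size | Exon average size (bp) | Exon SD | Intron sample size | Intron average size (bp) | Intron SD
O. sativa ssp japonica | 127627 | 180 | 261 | 180575 | 440 | 695
O. sativa ssp indica | 52330 | 133 | 113 | 79735 | 434 | 703
S. bicolor | 106753 | 179 | 222 | 144860 | 422 | 747
Z. mays | 137020 | 142 | 133 | 209139 | 554 | 1057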

SD Standard deviation

Fig. 1

Distribution of internal exon size: The x-axis indicates the size of internal exons. Bin sizes are right inclusive (e.g., bin 100 comprises sequences of lengths 1–100 bp). The y-axis indicates the frequency of internal exons. The inset shows a detailed distribution of small internal exons

Fig. 2

Distribution of intron size: The x-axis indicates the size of introns. Bin sizes are right inclusive (e.g., bin 100 comprises sequences of lengths 1 –100 bp). The y-axis indicates the frequency of introns. The inset shows a detailed distribution of small introns
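The right-inclusive binning used in Figs. 1 and 2 (bin 100 holds lengths 1–100 bp) can be reproduced with integer arithmetic. The sketch below assumes a bin width of 100 bp, inferred from the caption example, and uses made-up lengths.

```python
from collections import Counter
from typing import Dict, Sequence


def right_inclusive_bin(length: int, width: int = 100) -> int:
    """Label of the right-inclusive bin: lengths 1..100 -> 100, 101..200 -> 200, ..."""
    if length < 1:
        raise ValueError("length must be a positive number of bp")
    return ((length + width - 1) // width) * width


def size_frequencies(lengths: Sequence[int], width: int = 100) -> Dict[int, float]:
    """Fraction of sequences falling into each right-inclusive bin."""
    counts = Counter(right_inclusive_bin(n, width) for n in lengths)
    total = len(lengths)
    return {label: counts[label] / total for label in sorted(counts)}


if __name__ == "__main__":
    # Hypothetical exon lengths in bp.
    print(size_frequencies([50, 100, 101, 250]))
```

Summing the frequencies of the bins up to 300 gives the kind of cumulative figure quoted in the text for internal exons, up to the boundary convention used there.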


Functional classification of AS genes

AS and gene regulation can be observed at almost all levels of biological interactions [54]. The AS transcripts identified in the present study were functionally annotated with Gene Ontology (GO) terms and putative protein domains by performing a BLASTX search of all PUTs against the UniProt/Swiss-Prot database. The ORFs of PUTs were identified using the ORFPredictor webserver [37]. The protein families of the AS genes were predicted by rpsBLAST searches of the PFAM database using the longest ORF of each AS gene. Among the predicted ORFs of these AS genes, 6,900 in Z. mays, 4,939 in O. sativa ssp japonica, 1,362 in O. sativa ssp indica, and 2,890 in S. bicolor were classified with a putative protein family (Table 5, Additional file 1: Table S1). The AS gene products were further classified into 2,030 unique protein families in Z. mays, 1,708 in O. sativa ssp japonica, 757 in O. sativa ssp indica, and 1,194 in S. bicolor. The protein functions encoded by these AS genes include protein kinase domain, RNA recognition motif, protein tyrosine kinase, ring finger domain, cytochrome P450, Myb-like DNA-binding domain, WRKY DNA-binding domain, thioredoxin, and protein phosphatase 2C (Table 5). A complete list of all the protein families encoded by AS genes is shown in Additional file 1: Table S1. Our analysis demonstrated that AS genes in cereal plants encode diverse protein families that play important roles in various biological processes. A classical example is the WRKY DNA-binding domain family, one of the largest and most functionally diverse families of transcription factors in plants, which plays a major role in developmental and physiological processes. Previous studies have demonstrated the presence of alternative ORFs in WRKY genes [55, 56]. Yang et al. [57] and Feng et al. [58] have highlighted the role of alternative splicing of WRKY genes in plant immunity.
Previous functional studies have shown splicing of the R-type and V-type introns in O. sativa WRKY genes and functionally correlated them with plant immunity [59]. MYB domains play an important role in plant defense mechanisms, and genes encoding MYB or MYB-related proteins are regulated by alternative splicing in A. thaliana and O. sativa [60]. Alternative splicing of the MYB-related genes MYR1 and MYR2 has been shown to change protein dimerization and folding, thereby affecting transcriptional sensitivity in light-mediated responses [61].
Table 5

Protein family classification of alternative genes in cereal plants

PFAM | Domain | Z. mays | O. sativa ssp japonica | O. sativa ssp indica | S. bicolor | Putative Functions
pfam00069Pkinase2052285574Protein kinase domain
pfam00076RRM_1112613243RNA recognition motif
pfam07714Pkinase_Tyr88791225Protein tyrosine kinase
pfam13639zf-RING_25328815Ring finger domain
pfam00067p4504556437Cytochrome P450
pfam00481PP2C45251111Protein phosphatase 2C
pfam00249Myb_DNA-binding4414722Myb-like DNA-binding domain
pfam00179UQ_con4317107Ubiquitin-conjugating enzyme
pfam00010HLH41517Helix-loop-helix DNA-binding domain
pfam00071Ras3820117Ras family
pfam00141peroxidase37301131Peroxidase
pfam00153Mito_carr3524714Mitochondrial carrier protein
pfam01559Zein35000Zein seed storage protein
pfam01490Aa_trans331217Transmembrane amino acid transporter protein
pfam02365NAM3333814No apical meristem (NAM) protein
pfam00125Histone31935Core histone H2A/H2B/H3/H4
pfam01370Epimerase3126822NAD dependent epimerase/dehydratase family
pfam00083Sugar_tr3022710Sugar (and other) transporter
pfam00847AP230939AP2 domain
pfam00106adh_short29251015short chain dehydrogenase
pfam00657Lipase_GDSL295116GDSL-like Lipase/Acylhydrolase
pfam00085Thioredoxin2814611Thioredoxin
pfam00226DnaJ281899DnaJ domain
pfam03151TPT27926Triose-phosphate Transporter family
pfam00004AAA2618614ATPase family associated with various cellular
pfam00270DEAD242158DEAD/DEAH box helicase
pfam00504Chloroa_b-bind24191120Chlorophyll A-B binding protein
pfam02309AUX_IAA241355AUX/IAA family
pfam00149Metallophos239313Calcineurin-like phosphoesterase
pfam00134Cyclin_N22924Cyclin
pfam00450Peptidase_S102118618Serine carboxypeptidase
pfam03106WRKY212277WRKY DNA -binding domain
pfam13041PPR_22130112PPR repeat family
Total: 6900 (Z. mays), 4939 (O. sativa ssp japonica), 1362 (O. sativa ssp indica), 2890 (S. bicolor)

Note: a complete list is shown in Additional file 1: Table S1

GO analysis of biological processes and molecular functions revealed that AS genes are represented in all the major categories (Table 6; Table 7). Interestingly, even though the data we collected were pooled from the public domain, i.e., not from a strictly controlled experiment, our GO analysis revealed that, relative to the average AS percentage, a higher percentage of genes involved in response to abiotic stimulus, photosynthesis, carbohydrate metabolic process, and cell death undergo AS in cereal plants. In contrast, genes involved in multicellular organismal development and reproduction had a lower percentage of AS (Table 6). GO molecular function analysis revealed that genes encoding proteins with DNA binding, sequence-specific DNA binding transcription factor activity, or nuclease activity had a lower percentage of AS, whereas genes encoding proteins with protein binding or kinase activity had a higher percentage of AS in the majority of plants (Table 7). Our results are consistent with the literature recently reviewed by Reddy et al. [4] and Staiger and Brown [22], which indicates that AS is involved in most plant processes and plays regulatory roles in plant development and stress responses.
Table 6

Classification of biological processes based on Gene Ontology (GO)

Table 7

Classification of molecular functions based on Gene Ontology (GO)


Conserved alternatively spliced genes

Classification of the conserved alternative splicing events provides a framework for understanding the evolution of functional genes and their genic regulation at the transcriptional level, and may reveal cross-talk between the evolution of genes under AS, the transcriptional environment, and ecological adaptation. To identify conserved AS pairs, the longest ORFs of the AS genes in each studied species were compared using BLASTP (cutoff E-value 1E-10), and reciprocal best hits were taken as conserved pairs. In total, we identified 1,558 AS genes conserved between O. sativa ssp japonica and indica, 3,246 AS genes conserved between O. sativa ssp japonica and Z. mays, and 1,967 AS genes conserved between S. bicolor and Z. mays (Additional file 2: Table S3). A total of 887 AS genes are conserved among Z. mays, S. bicolor, and O. sativa ssp japonica. More importantly, we identified 248 AS genes conserved among all four plants (Fig. 3). Furthermore, using the same approach, we identified a total of 53 AS genes that are also conserved in B. distachyon, which belongs to the BEP clade of grasses. These 53 conserved co-orthologous AS genes are listed in Table 8. The set of 248 co-orthologous AS genes conserved in the four plants, 53 of which are also conserved in B. distachyon, is provided in Additional file 3: Table S2 (downloadable at http://proteomics.ysu.edu/altsplice/). Interestingly, one of the candidates among the conserved genes is Drought-induced protein (Di19). It has been previously suggested that the presence of a retained intron within the coding sequence may give rise to nonsense-mediated decay (NMD) [62]. Recent studies highlight the role of cycloheximide in introducing premature termination codons (PTCs) and NMD in A. thaliana Di19, indicating the splicing mechanism in Di19 [63].
Identifying Di19-mediated splicing will be of critical importance for increasing the drought resistance or the yield of cereal plants, which are major food sources under climate change. Because the current analysis was based on the pooled EST/mRNA sequences available in the public domain, more biologically and functionally conserved AS genes will be identified as more transcriptome data are collected with improved technologies and from various environmental conditions, developmental stages, and tissues in these cereal crops. The present data have immense potential for experimental validation and highlight the role and biological significance of AS in plant growth, development, and environmental regulation, which remains a standing challenge under climate change.
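The reciprocal-best-hit step described above (BLASTP on the longest ORFs, E-value cutoff 1E-10) can be sketched in Python as follows. This is a minimal illustration rather than the authors' pipeline: it assumes the two BLASTP runs were saved in the standard 12-column tabular format (`-outfmt 6`), it ranks hits by bit score (the text does not state the ranking criterion), and the sequence IDs `a1`, `b1`, etc. and all scores are invented toy values.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-10  # cutoff stated in the text


def best_hits(blast_tsv, cutoff=E_VALUE_CUTOFF):
    """Map each query to its top-scoring subject in BLAST tabular output.

    Columns (outfmt 6): qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    best = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > cutoff:
            continue  # hit is weaker than the E-value cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy input with hypothetical IDs and scores (species A vs B, then B vs A).
A_VS_B = "\n".join([
    "a1\tb1\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-50\t300",
    "a1\tb2\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-45\t260",
    "a2\tb2\t85.0\t150\t22\t0\t1\t150\t1\t150\t1e-40\t250",
    "a3\tb3\t30.0\t80\t56\t0\t1\t80\t1\t80\t1e-5\t40",
])
B_VS_A = "\n".join([
    "b1\ta1\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-50\t300",
    "b2\ta1\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-45\t260",
    "b2\ta2\t85.0\t150\t22\t0\t1\t150\t1\t150\t1e-40\t250",
    "b3\ta3\t30.0\t80\t56\t0\t1\t80\t1\t80\t1e-5\t40",
])

if __name__ == "__main__":
    pairs = reciprocal_best_hits(io.StringIO(A_VS_B), io.StringIO(B_VS_A))
    print(pairs)
```

In the toy input, `a2` has `b2` as its best hit, but `b2` scores higher against `a1`, so that pair is not reciprocal; `a3`/`b3` fail the E-value cutoff. Genes conserved across three or four species could then be obtained by chaining such pairwise results, which is an assumption about the procedure rather than a documented detail.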
Fig. 3

Conserved alternative splicing genes in rice (Oryza sativa) ssp japonica, rice ssp indica, sorghum (Sorghum bicolor), and maize (Zea mays) plants

Table 8

Conserved alternative splicing genes among five monocot plants

| O. sativa ssp indica | O. sativa ssp japonica | Z. mays | S. bicolor | B. distachyon | CDD/Pfam |
|---|---|---|---|---|---|
| Osi19962 | Osj954 | Zm92934 | Sb6267 | Bd2565 | pfam03171 2OG-FeII_Oxy 2OG-Fe(II) oxygenase superfamily |
| Osi18787 | Osj44013 | Zm40020 | Sb12294 | Bd28385 | pfam00004 AAA ATPase family associated with various cellular |
| Osi6875 | Osj22392 | Zm8831645969421 | | Bd7352 | pfam00248 Aldo_ket_red Aldo/keto reductase family |
| Osi9356 | Osj41340 | Zm162 | Sb17314 | Bd6214 | pfam00248 Aldo_ket_red Aldo/keto reductase family |
| CX100091 | Osj15328 | Zm35072 | Sb10885 | Bd29210 | pfam00439 Bromodomain Bromodomain |
| Osi12568 | Osj24409 | Zm100060 | Sb8817 | Bd24009 | pfam05042 Caleosin Caleosin related protein |
| CT843009 | Osj14649 | Zm32705 | Sb6709 | Bd10918 | pfam00571 CBS CBS domain |
| Osi524 | CT828785.1 | Zm73067 | Sb4586 | Bd10523 | pfam04733 Coatomer_E Coatomer epsilon subunit |
| Osi21096 | Osj16673 | FL103380 | 2.42E + 08 | Bd4031 | pfam07876 Dabb Stress responsive A/B Barrel Domain |
| Osi8549 | Osj47391 | Zm69871 | Sb334 | Bd7166 | pfam05605 Di19 Drought induced 19 protein (Di19) |
| CT833644.1 | CI258157 | Zm20082 | Sb10226 | Bd7036 | pfam05057 DUF676 Putative serine esterase (DUF676) |
| Osi21136 | Osj16693 | Zm46142 | Sb13903 | Bd7810 | pfam05623 DUF789 Protein of unknown function (DUF789) |
| Osi19974 | Osj14932 | Zm70017 | Sb10575 | Bd3731595 | pfam00676 E1_dh Dehydrogenase E1 component |
| Osi1759 | Osj22934 | Zm35625 | Sb15873 | Bd7027 | pfam01370 Epimerase NAD dependent epimerase/dehydratase family |
| CT842225 | Osj27697 | Zm91971 | Sb4930 | Bd268 | pfam00316 FBPase Fructose-1-6-bisphosphatase |
| Osi20900 | Osj16392 | Zm58947 | Sb3303 | Bd7531597 | pfam00210 Ferritin Ferritin-like domain |
| Osi339 | Osj20205 | Zm20714 | Sb12056 | Bd6374 | pfam00762 Ferrochelatase Ferrochelatase |
| Osi11082 | Osj491 | Zm59942 | Sb3313 | Bd27405 | pfam00125 Histone Core histone H2A/H2B/H3/H4 |
| Osi13655 | Osj36042 | Zm81325 | Sb15256 | Bd28446 | pfam00403 HMA Heavy-metal-associated domain |
| Osi11360 | Osj36865 | Zm38497 | Sb20674 | Bd9583 | pfam00447 HSF_DNA-bind HSF-type DNA-binding |
| Osi17520 | Osj35947 | Zm27347 | Sb9471 | Bd7833 | pfam01156 IU_nuc_hydro Inosine-uridine preferring nucleoside |
| Osi13902 | Osj25885 | Zm23750 | Sb12436 | Bd28318 | pfam00013 KH_1 KH domain |
| Osi11280 | Osj28328 | Zm35841 | Sb9907 | Bd13744 | cd00116 LRR_RI Leucine-rich repeats (LRRs) |
| CT844279 | CB642464 | Zm3338 | Sb7337 | Bd28467 | pfam01717 Meth_synt_2 Cobalamin-independent synthase |
| Osi1437 | Osj37916 | Zm4695 | Sb5119 | Bd7994 | pfam00635 Motile_Sperm MSP (Major sperm protein) domain |
| Osi231 | Osj25397 | Zm37411 | Sb10332 | Bd28960 | pfam14360 PAP2_C PAP2 superfamily C-terminal |
| Osi8815 | Osj32580 | Zm61468 | Sb11226 | Bd6619 | pfam01195 Pept_tRNA_hydro Peptidyl-tRNA hydrolase |
| Osi16666 | Osj19199 | Zm104454 | Sb12015 | Bd16056 | pfam00450 Peptidase_S10 Serine carboxypeptidase |
| Osi12736 | Osj14309 | Zm22618 | Sb7927 | Bd8683 | pfam00141 peroxidase Peroxidase |
| Osi833 | Osj39350 | Zm92939 | Sb19533 | Bd5931597 | pfam00069 Pkinase Protein kinase domain |
| Osi3301 | Osj17780 | Zm29726 | Sb14730 | Bd29285 | pfam00069 Pkinase Protein kinase domain |
| Osi6061 | Osj15126 | Zm59883 | Sb673 | Bd15932 | PLN02756 PLN02756 S-methyl-5-thioribose kinase |
| Osi6187 | Osj42201 | Zm39790 | Sb2138 | Bd8363 | pfam00348 polyprenyl_synt Polyprenyl synthetase |
| Osi13092 | Osj21144 | Zm33939 | Sb5787 | Bd23758 | pfam14299 PP2 Phloem protein 2 |
| Osi11891 | NM_001070568.2 | Zm87952 | Sb2001 | Bd2595 | pfam00854 PTR2 POT family |
| Osi20788 | Osj19691 | Zm3938430944654 | | Bd10083 | pfam07992 Pyr_redox_2 Pyridine nucleotide-disulphide |
| CT837906.1 | Osj7689 | Zm101865 | Sb5520 | Bd25885 | pfam00719 Pyrophosphatase Inorganic pyrophosphatase |
| Osi21504 | Osj17274 | Zm6058 | Sb11340 | Bd21664 | pfam00072 Response_reg Response regulator receiver domain |
| Osi15366 | Osj24220 | Zm5068 | Sb10671 | Bd23705 | pfam02453 Reticulon Reticulon |
| Osi9029 | Osj47510 | Zm24118 | Sb11323 | Bd8231593 | pfam03214 RGP Reversibly glycosylated polypeptide |
| Osi5643 | Osj25267 | Zm80771 | Sb227 | Bd11010 | pfam01246 Ribosomal_L24e Ribosomal protein L24e |
| Osi8310 | Osj36859 | Zm101179 | Sb11303 | Bd6311 | pfam00076 RRM_1 RNA recognition motif |
| Osi1456 | Osj43479 | Zm371 | Sb12579 | Bd15819 | pfam00076 RRM_1 RNA recognition motif |
| Osi773 | Osj43052 | Zm24001 | Sb2305 | Bd20070 | pfam00464 SHMT Serine hydroxymethyltransferase |
| Osi9812 | Osj14203 | Zm39491 | 2.42E + 08 | Bd28258 | pfam01406 tRNA-synt_1e tRNA synthetases class I (C) catalytic |
| Osi9653 | Osj44577 | Zm33457 | Sb5144 | Bd6360 | pfam00443 UCH Ubiquitin carboxyl-terminal hydrolase |
| Osi2251 | Osj35805 | Zm98577 | 2.42E + 08 | Bd20683 | pfam12076 Wax2_C WAX2 C-terminal domain |
| Osi15508 | Osj14495 | Zm95 | Sb10474 | Bd4536 | pfam05495 zf-CHY CHY zinc finger |
| Osi21052 | Osj4519 | Zm39479 | Sb9831 | Bd24331 | No Pfam predicted |
| Osi8778 | Osj195 | Zm100142 | Sb1070 | Bd2265 | No Pfam predicted |
| Osi20728 | Osj16408 | Zm3429457806619 | | Bd19455 | No Pfam predicted |
| Osi17233 | Osj20996 | Zm100171 | Sb1504 | Bd16477 | No Pfam predicted |
| CT830510.1 | Osj3010 | Zm96019 | Sb2210 | Bd12504 | No Pfam predicted |

Conclusions

In the present work, we investigated the functional landscape of AS in the four most important cereal plants, O. sativa ssp indica and japonica, S. bicolor, and Z. mays, using the updated EST and mRNA sequences available in NCBI, thus bridging the knowledge gap and updating the conserved AS catalog with functional elucidation. The availability of the conserved AS genes among the four cereal plants will facilitate understanding of the regulation of alternative physiological processes in global climate change biology and their subsequent impact on genic-environmental interactions.

Availability of supporting data

The data described in the work can be searched or downloaded at the Plant Alternative Splicing Database (http://proteomics.ysu.edu/altsplice/). Other detailed analysis data can be downloaded at http://proteomics.ysu.edu/publication/data/CerealAS/.