Christine M Gault1,2, Federico Martin1,2, Wenbin Mei3, Fang Bai2, Joseph B Black2, W Brad Barbazuk1,3,4, A Mark Settles5,2,4. 1. Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL 32611. 2. Horticultural Sciences Department, University of Florida, Gainesville, FL 32611. 3. Department of Biology, University of Florida, Gainesville, FL 32611. 4. Genetics Institute, University of Florida, Gainesville, FL 32611. 5. Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL 32611; settles@ufl.edu.
Abstract
RNA splicing of U12-type introns functions in human cell differentiation, but it is not known whether this class of introns has a similar role in plants. The maize ROUGH ENDOSPERM3 (RGH3) protein is orthologous to the human splicing factor, ZRSR2. ZRSR2 mutations are associated with myelodysplastic syndrome (MDS) and cause U12 splicing defects. Maize rgh3 mutants have aberrant endosperm cell differentiation and proliferation. We found that most U12-type introns are retained or misspliced in rgh3. Genes affected in rgh3 and ZRSR2 mutants identify cell cycle and protein glycosylation as common pathways disrupted. Transcripts with retained U12-type introns can be found in polysomes, suggesting that splicing efficiency can alter protein isoforms. The rgh3 mutant protein disrupts colocalization with a known ZRSR2-interacting protein, U2AF2. These results indicate conserved function for RGH3/ZRSR2 in U12 splicing and a deeply conserved role for the minor spliceosome to promote cell differentiation from stem cells to terminal fates.
Most eukaryotic transcripts contain introns that are removed by dynamic ribonucleoprotein complexes known as spliceosomes (1). Spliceosomes include hundreds of RNA-splicing factors that influence splice site selection. Many eukaryotic lineages have two different spliceosomes (2). The major spliceosome removes the common U2-type introns, and the minor spliceosome removes rare U12-type introns. U2- and U12-type introns are recognized by different consensus sequences at the splice sites and branch point (3). The significance of maintaining separate splicing machinery for these rare introns is not well defined. U12-type intron splicing efficiency increases in response to cell stress signaling in HeLa cells (4). Reduction in U12-type splicing causes developmental defects in Arabidopsis, Drosophila, zebrafish, and humans (5–11).
Human ZRSR2 is required for the second transesterification reaction in U2 splicing and takes part in the initial assembly of the minor spliceosome in human cell extracts (12, 13). Somatic mutations in ZRSR2 are found in patients with myelodysplastic syndrome (MDS) (14–16). MDS is a blood cell differentiation disorder that causes an increase in undifferentiated blasts, abnormal myeloid cells, and a decrease in fully differentiated myeloid cell types (17). MDS subtypes with ZRSR2 mutations more frequently progress to acute myeloid leukemia, and consequently, loss of ZRSR2 is considered a driver toward cancer (17). RNA-sequencing (RNA-seq) analysis of ZRSR2 mutants found reduced U12 splicing with U2-type introns largely unaffected (18).
The hypomorphic rough endosperm3 (rgh3) allele in maize disrupts the ZRSR2 ortholog, seed development, and plant viability (19). Endosperm cell differentiation is defective and delayed in rgh3 mutants, allowing mutant cells to proliferate in tissue culture at late developmental stages when WT endosperm is unable to grow (19) (Fig. S1). Based on a survey of alternatively spliced transcripts in maize, only a few genes have been identified with altered transcript isoform abundance in rgh3 mutants (19).
Fig. S1.
Extended proliferation of rgh3 endosperm tissue cultures. (A) Normal sibling and rgh3 mutant endosperm culture plates at 30–35 d of culture. Endosperm tissues were plated from kernels at 7, 10, and 16 d after pollination (DAP). (B) Frequency of callus growth from rgh3 and normal sibling endosperm tissues after 30 d of culture. Results are from 50 endosperm tissues for each developmental stage and genotype. These data are independent replicates of figure 4 and figure S2 of ref. 19.
Results
Missplicing of U12-Type Introns in rgh3.
We determined the genome-wide effect of rgh3 on mRNA splicing with RNA-seq and isoform expression analysis from homozygous rgh3 and WT sibling root and shoot tissues. Cufflinks (20) predicted 46 genes had altered isoform use in rgh3 (Table S1). Localized differences in read coverage depth between rgh3 and WT libraries were used to design semiquantitative RT-PCR assays for nine of the genes identified as differentially spliced (Fig. 1 and Fig. S2). Sequencing of the amplified products revealed that intron retention occurs more frequently in rgh3 mutants at seven of these loci (Fig. 1 and Figs. S3 and S4). Some rgh3 amplifications exhibited slow-migrating bands due to heteroduplex formation of differently sized RT-PCR products (Fig. S5). Six of the validated splicing differences have U12-type introns as defined by the ERISdb plant splice site database (21). The intron retained in the seventh gene, GRMZM2G133028, is likely to be a U12-type intron. Although the 5′-splice site and branch point in GRMZM2G133028 are somewhat diverged from the U12-type consensus, the six additional plant species analyzed within ERISdb have U12-type introns at the orthologous intron position in this conserved, plant-specific gene (Fig. S6).
Table S1.
Genes with differentially expressed transcript isoforms based on Cufflinks statistics
| Gene | ERISdb annotation | WT vs. rgh3 comparison | (JSD)^1/2 | Cufflinks test statistic | FDR-corrected P value | RT-PCR |
|---|---|---|---|---|---|---|
| GRMZM2G000665 | U12 intron | Shoots | 0.831 | 4.59E-05 | 1.86E-03 | |
| GRMZM2G007981 | | Shoots | 0.833 | 2.11E-04 | 1.86E-03 | |
| GRMZM2G012841 | | Shoots | 0.401 | 3.12E-12 | 1.86E-03 | |
| GRMZM2G059671 | | Shoots | 0.831 | 1.73E-10 | 1.86E-03 | |
| GRMZM2G061596 | | Shoots | 0.832 | 1.49E-06 | 1.86E-03 | |
| GRMZM2G083655 | | Shoots | 0.832 | 1.16E-08 | 1.86E-03 | |
| | | Roots | 0.831 | 8.87E-09 | 2.97E-03 | |
| GRMZM2G096600 | | Shoots | 0.831 | 7.01E-05 | 1.86E-03 | No difference |
| GRMZM2G097170 | U12 intron | Shoots | 0.833 | 4.42E-09 | 1.86E-03 | |
| | | Roots | 0.832 | 2.67E-04 | 2.97E-03 | |
| GRMZM2G098423 | | Shoots | 0.831 | 0.00E+00 | 1.86E-03 | |
| GRMZM2G110277 | | Shoots | 0.801 | 1.10E-11 | 1.86E-03 | |
| GRMZM2G111954 | | Shoots | 0.833 | 2.03E-11 | 1.86E-03 | |
| GRMZM2G119640 | | Shoots | 0.683 | 2.22E-16 | 1.86E-03 | |
| GRMZM2G130432 | U12 intron | Shoots | 0.733 | 8.71E-12 | 1.86E-03 | Misspliced |
| | | Roots | 0.656 | 2.99E-10 | 2.65E-02 | |
| GRMZM2G137847 | | Shoots | 0.833 | 3.75E-12 | 1.86E-03 | |
| GRMZM2G175398 | | Shoots | 0.828 | 3.40E-08 | 1.86E-03 | |
| GRMZM2G350312 | | Shoots | 0.830 | 4.38E-06 | 1.86E-03 | |
| GRMZM2G476538 | | Shoots | 0.833 | 1.78E-08 | 1.86E-03 | |
| GRMZM2G480607 | | Shoots | 0.682 | 1.05E-09 | 1.86E-03 | |
| GRMZM5G820727 | U12 intron | Shoots | 0.685 | 4.80E-14 | 1.86E-03 | Misspliced |
| GRMZM2G000842 | | Roots | 0.797 | 2.47E-09 | 2.97E-03 | |
| GRMZM2G007453 | | Roots | 0.832 | 0.00E+00 | 2.97E-03 | |
| GRMZM2G048846 | | Roots | 0.220 | 9.10E-08 | 2.97E-03 | |
| GRMZM2G057646 | | Roots | 0.184 | 3.67E-08 | 2.97E-03 | |
| GRMZM2G077233 | | Roots | 0.827 | 8.49E-12 | 2.97E-03 | |
| GRMZM2G127911 | | Roots | 0.833 | 1.17E-07 | 2.97E-03 | |
| GRMZM2G138566 | | Roots | 0.831 | 1.23E-08 | 2.97E-03 | |
| GRMZM2G336533 | | Roots | 0.831 | 4.81E-07 | 2.97E-03 | |
| GRMZM2G454550 | | Roots | 0.831 | 3.51E-10 | 2.97E-03 | |
| GRMZM2G408305 | U12 intron | Shoots | 0.725 | 5.60E-09 | 3.54E-03 | Misspliced |
| GRMZM2G068710 | | Shoots | 0.588 | 1.69E-09 | 3.85E-03 | |
| GRMZM2G097568 | U12 intron | Shoots | 0.777 | 8.50E-08 | 3.85E-03 | Misspliced |
| | | Roots | 0.783 | 2.31E-10 | 4.09E-03 | |
| GRMZM2G113619 | | Shoots | 0.288 | 3.05E-05 | 3.85E-03 | |
| GRMZM2G166659 | | Roots | 0.120 | 6.45E-08 | 6.29E-03 | |
| GRMZM2G045257 | | Shoots | 0.535 | 7.73E-07 | 6.64E-03 | |
| GRMZM5G848692 | | Roots | 0.282 | 1.51E-06 | 1.52E-02 | |
| GRMZM2G306935 | U12 intron | Roots | 0.716 | 2.18E-10 | 1.63E-02 | Misspliced |
| GRMZM2G011636 | U12 intron | Roots | 0.664 | 1.88E-10 | 2.00E-02 | Misspliced |
| GRMZM2G021549 | | Roots | 0.770 | 9.30E-08 | 2.00E-02 | |
| GRMZM2G041418 | | Roots | 0.789 | 6.74E-06 | 2.00E-02 | |
| GRMZM2G133028 | U2 intron* | Shoots | 0.687 | 6.75E-07 | 2.62E-02 | Misspliced |
| | | Roots | 0.731 | 4.63E-07 | 2.65E-02 | |
| GRMZM2G103152 | | Roots | 0.656 | 7.43E-10 | 2.65E-02 | No difference |
| GRMZM2G111912 | | Shoots | 0.433 | 1.58E-06 | 2.72E-02 | |
| GRMZM2G036837 | | Roots | 0.322 | 2.62E-08 | 3.71E-02 | |
| GRMZM2G032348 | | Roots | 0.706 | 5.83E-07 | 3.77E-02 | |
| GRMZM2G014709 | | Roots | 0.580 | 1.03E-10 | 3.88E-02 | |
| GRMZM2G065655 | | Shoots | 0.771 | 3.84E-03 | 4.66E-02 | |
JSD is the Jensen–Shannon divergence.
*The retained intron in GRMZM2G133028 has a 5′-splice site similar to the U12-type consensus and a potential U12-type branch point (Fig. S6).
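Table S1 reports isoform changes as the square root of the Jensen–Shannon divergence (JSD). A minimal sketch of that quantity for two isoform-abundance vectors is given here. The function name and the example vectors are hypothetical, and the use of natural logarithms is an assumption: it gives a maximum of sqrt(ln 2), about 0.833, which is in line with the largest values in the table, but the exact Cufflinks formulation is not stated in this section.

```python
from math import log, sqrt


def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two
    isoform-abundance vectors (natural logarithm, an assumption)."""
    if len(p) != len(q):
        raise ValueError("vectors must list the same isoforms")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each vector needs a positive total abundance")
    # Normalize abundances to relative isoform proportions.
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution used by the Jensen-Shannon divergence.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute 0.
        return sum(x * log(x / y) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return sqrt(max(jsd, 0.0))


# Hypothetical isoform abundances (spliced, intron-retained) for one gene.
wt_isoforms = [95.0, 5.0]
rgh3_isoforms = [20.0, 80.0]
distance = js_distance(wt_isoforms, rgh3_isoforms)
```

Under this formulation a complete switch between two isoforms gives the maximum distance, and identical isoform proportions give zero.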
Fig. 1.
Minor splicing is compromised in rgh3. (A) WT and rgh3 root RNA-seq read depth for GRMZM2G011636. Transcript models show the U12-type intron with increased read depth in rgh3. A brace symbol indicates the region amplified in B. (B) RT-PCR of normal (Rgh3 +) and rgh3 mutant (Rgh3 −) RNA from root, shoot, kernel, embryo, starchy endosperm (SE), basal endosperm transfer cell layer (BETL), and endosperm culture (EC) tissues. Schematic shows sequenced amplification products with forward and reverse primers indicated by arrows. (C–F) Intron read depth and PSO metrics with U2-type introns indicated in orange and U12-type introns indicated in blue. (C) Scatterplot showing intron read depth normalized for gene expression. Black diagonal line indicates equivalent read depth, and dotted lines are twofold differences. (D) Distribution of Welch’s t test P values. Lines were fit with a sliding window average of 0.3 log units. Gray line, P = 0.05. (E) Scatterplot of PSO metrics. Introns with less than 10 exon–exon junction reads in both WT and rgh3 are not plotted. (F) Distribution of splicing differences between WT and rgh3 based on ΔPSO metrics.
Fig. S2.
Genes identified by Cufflinks with differential splicing in rgh3 mutants. Root RNA-seq read depth for three WT and three rgh3 mutant replicates are shown. Annotated transcript models have U12-type introns labeled. A brace symbol indicates the region amplified in RT-PCR validation experiments shown in Fig. S3. Partial gene models are shown for (A) GRMZM2G408305, (B) GRMZM2G097568, and (D) GRMZM2G133028 to adequately resolve intron read depth for the region amplified. Full gene models are shown for (C) GRMZM2G306935, (E) GRMZM2G130432, and (F) GRMZM5G820727.
Fig. S3.
RT-PCR validation of rgh3 splicing defects predicted by Cufflinks in diverse tissue types. Gene-specific primers amplified cDNA from roots, shoots, whole kernels, embryos, starchy endosperm (SE), endosperm tissue enriched for the basal endosperm transfer cell layer (BETL), and endosperm culture (EC). For root and shoot, the Rgh3 + labels indicate homozygous WT Rgh3 tissues. The Rgh3 + labels for all other tissues indicate phenotypically WT samples that are either homozygous or heterozygous for the WT Rgh3 gene. Rgh3 – labels indicate homozygous mutant samples. Panel order is the same as in Fig. S2. (A–G) Transcript diagrams and RT-PCR products are shown for the following: (A) GRMZM2G408305, (B) GRMZM2G097568, (C) GRMZM2G306935, (D) GRMZM2G133028, (E) GRMZM2G130432, (F) GRMZM5G820727, and (G) a ubiquitin control. All transcript diagrams are based on cloned, sequenced products and drawn to the same scale as in A. Arrows indicate forward and reverse primers used for RT-PCR.
Fig. S4.
Genes identified by Cufflinks as having differential splicing in rgh3 mutants that failed to validate in RT-PCR experiments. (A and B) RNA-seq read depth for three WT and three rgh3 mutant replicates from root libraries are shown for the full gene models of (A) GRMZM2G096600 and (B) GRMZM2G103152. A brace symbol indicates the region amplified in RT-PCR validation experiments. (C) RT-PCR analysis of the same RNA used for two of the three RNA-seq libraries. Size markers as well as the expected size of the annotated transcript variants and a genomic fragment are indicated.
Fig. S5.
Mixed PCR products from alternatively spliced cDNA form slow-migrating heteroduplex products. Purified plasmids containing inserts of transcript fragments with the intron spliced out (S) and the intron retained (R) were amplified using gene-specific primers to show expected sizes. The S and R plasmids for each gene also were combined in a 1:1 ratio as template for the S/R reaction. The S/R templates amplify two bands of expected size as well as additional slow-migrating, heteroduplex PCR products similar to those observed in RT-PCRs that have significant levels of misspliced or intron-retention products such as GRMZM2G011636 (Fig. 1) or GRMZM2G106613 (Fig. 2).
Fig. S6.
Intron affected in GRMZM2G133028 is a U12-type intron. Sequence comparison of the U12-like intron in GRMZM2G133028 with U12-type introns of orthologs from ERISdb. The last five codons of the upstream exon, the 5′-splice site, the branch point, and the downstream exon are shown for the following: Zea mays GRMZM2G133028 (Zm), Oryza sativa LOC_Os01g15790 (Os), Arabidopsis thaliana AT4G36440 (At), Vitis vinifera Vv04s0023g02560 (Vv), Glycine max GLYMA02G05950 (Gm), Selaginella moellendorffii SELMODRAFT_176312/ SELMODRAFT_102953 (Sm), and Physcomitrella patens PP1S6_51V6 /PP1S83_251V6 (Pp). Peptide sequences are shown below the exons. The U12-type 5′-splice site is underlined, and the U12-type branch point in ERISdb is in red text. The homologous branch point in Zea mays is in blue text. Additional sequences consistent with a U12-type branch point are underlined and bold.
Fig. 2.
Diverse rgh3 splicing defects are associated with U12-type introns. (A–C) WT and rgh3 root RNA-seq read depth for (A) GRMZM2G106613, (B) GRMZM2G083620, and (C) GRMZM2G587327. Transcript models show U12-type introns with a brace symbol indicating the regions amplified in D–F. (D–F) RT-PCR of normal (Rgh3 +) and rgh3 mutant (Rgh3 −) RNA from root, shoot, kernel, embryo, starchy endosperm (SE), basal endosperm transfer cell layer (BETL), and endosperm culture (EC) tissues. Schematics show sequenced amplification products with forward and reverse primers indicated by arrows. (D) GRMZM2G106613 showed U12-type intron retention in rgh3. (E) GRMZM2G083620 had adjacent U12- and U2-type intron retention. (F) GRMZM2G587327 activated cryptic U2-type splice sites in rgh3.
U12-type intron-containing genes account for <2% of the B73 filtered gene set. With 20% of Cufflinks predictions involving U12-type introns, we hypothesized that U12-type introns are misspliced on a global scale in rgh3. To allow for novel transcript isoforms to be detected, we analyzed intron read counts normalized by gene expression for all introns in the filtered gene set of the maize genome (Fig. 1). Of the 446 nonredundant U12-type introns in ERISdb, 340 were expressed in the seedling libraries. Significantly more RNA-seq reads mapped to 240 U12-type introns in rgh3 libraries vs. WT libraries (Fig. 1 and Dataset S1). This represents 71% of the maize U12-type introns tested. Only three U12-type introns showed significantly fewer reads in rgh3 libraries, which is below the expected false-positive rate for the number of t tests completed. By contrast, 4% of the remaining 113,345 nonredundant introns within expressed genes had differences in the number of normalized reads mapped in rgh3 vs. WT (Fig. 1 and Dataset S2). Globally, fewer U2-type introns showed significant differences than the expected number of false-positive t tests, indicating that rgh3 specifically affects genes with U12-type introns.
To quantify the extent of intron splicing defects, we calculated percent spliced out (PSO) for individual introns using exon–exon junction and intron reads (Fig. 1). Based on Fisher’s exact tests of read counts, splicing defects were detected for 77% of U12-type introns (Dataset S1). The median difference between WT and rgh3 PSO (ΔPSO) was 62%, indicating extensive retention of U12-type introns in rgh3 (Fig. 1). More than 13% of all other introns also showed statistically significant increased retention in rgh3, but the median ΔPSO for these introns was only 4%, indicating minor impacts on U2 splicing (Dataset S2).
An additional nine genes showing significant differences in U12-type intron read depth and PSO were randomly selected for RT-PCR (Fig. 2 and Fig. S7). All of these genes had splicing defects in rgh3 with three patterns of altered splicing: intron retention (Fig. 2 and Fig. S8), missplicing of the U12-type intron concomitant with retention of a downstream U2-type intron (Fig. 2 and Fig. S8), and activation of cryptic, U2-type, 5′- and 3′-splice sites at the U12-type intron (Fig. 2 and Fig. S8). There were no differences in splice site consensus sequences between the U12-type introns that are misspliced in rgh3 and those with no significant difference in rgh3 mutants (Fig. S9). The range of splicing defects found in rgh3 is also observed in human bone marrow samples from MDS patients with ZRSR2 mutations (18). We conclude that rgh3 mutants, like human ZRSR2 mutants, are impaired in minor spliceosome function.
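The PSO comparison described above can be sketched as follows. This section does not spell out the PSO formula, so the sketch assumes PSO = junction reads / (junction reads + intron reads), expressed as a percentage. The function names and read counts are hypothetical, and the two-sided Fisher's exact test is a plain standard-library implementation for illustration, not the authors' pipeline.

```python
from math import comb


def percent_spliced_out(junction_reads, intron_reads):
    """PSO as a percentage (assumed definition: spliced junction reads
    over junction plus intron reads)."""
    total = junction_reads + intron_reads
    if total == 0:
        raise ValueError("no reads cover this intron")
    return 100.0 * junction_reads / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(low, high + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical read counts for one U12-type intron.
wt_junction, wt_intron = 90, 10
rgh3_junction, rgh3_intron = 30, 70

wt_pso = percent_spliced_out(wt_junction, wt_intron)
rgh3_pso = percent_spliced_out(rgh3_junction, rgh3_intron)
delta_pso = wt_pso - rgh3_pso
p_value = fisher_exact_two_sided(wt_junction, wt_intron, rgh3_junction, rgh3_intron)
```

A positive ΔPSO with a small P value corresponds to increased intron retention in the mutant; for a genome-wide screen the P values would additionally need multiple-testing correction.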
Fig. S7.
RNA-seq read depth for experimentally validated genes with significantly more U12-type intron normalized expression. All panels show RNA-seq read depth for three WT and three rgh3 mutant replicates from root libraries. Annotated transcript models are shown with the U12-type intron labeled. A brace symbol indicates the region amplified in RT-PCR validation experiments shown in Fig. S8. Full gene models are shown for the following: (A) GRMZM2G131321, (B) GRMZM2G074015, (C) GRMZM2G040401, and (D) GRMZM2G033430. Partial gene models are shown for (E) GRMZM2G416751 and (F) GRMZM2G153434 to illustrate intron read depth adequately in the region amplified.
Fig. S8.
RT-PCR validation of rgh3 U12 splicing defects identified by normalized intron read depth analysis. (A–G) Gene-specific primers (arrows) amplified cDNA from roots, shoots, whole kernels, embryos, starchy endosperm (SE), endosperm tissue enriched for the basal endosperm transfer cell layer (BETL), and endosperm culture (EC). The seedling cDNA was derived from the same RNA used for RNA-seq as in Fig. 1. The cDNA for the seed and EC tissues are the same as in Fig. S3. Transcript diagrams and RT-PCR products are shown for (A) GRMZM2G131321, (B) GRMZM2G074015, (C) GRMZM2G040401, (D) GRMZM2G033430, (E) GRMZM2G416751, (F) GRMZM2G153434, and (G) ubiquitin. All transcript diagrams are based on cloned, sequenced RT-PCR product and are drawn to the same scale with 100 bp indicated in A. Arrows on the gene model schematics show forward- and reverse-primer binding sites.
Fig. S9.
Consensus splice sites for U12-type introns showing differential splicing in rgh3. Sequence logos are shown for 5′-splice sites, branch point sequences, 3′-splice sites, and exons downstream of U12-type introns that had significantly increased intron read depth in rgh3 (A) and nonsignificant U12-type introns (B). The last 3 bp of the upstream exon did not show any conserved nucleotides and is not shown.
Predicted Functions of rgh3 Misspliced Genes.
Pfam domains in the 230 maize genes with increased intron read depth in rgh3 showed no significant enrichment or deenrichment compared with all 308 expressed U12-type intron-containing genes in the RNA-seq experiment (Dataset S3). These analyses suggest most biological processes that are dependent upon U12 splicing are affected in rgh3. Many U12-type introns are found within genes involved in DNA replication, DNA repair, transcription, RNA processing, and translation (22, 23). Arabidopsis U12-type introns were used for prior cross-kingdom comparisons, but only 58% of maize U12-type intron-containing genes have an Arabidopsis ortholog that also contains a U12-type intron (21).
Pfam domains were analyzed at a global level to identify U12-dependent biological processes affected in rgh3 (Fig. 3 and Dataset S3). More than one-half of the domains with annotated roles in translation, endomembrane dynamics, and unknown functions were enriched in the 230 rgh3 misspliced genes relative to all genes tested for intron read depth differences. When all U12-type intron-containing genes are compared genome-wide, more than one-half of domains with roles in cell cycle, RNA processing, and protein folding/degradation also are enriched (Fig. 3). These analyses support conserved functions for maize U12-type intron-containing genes with U12 spliceosome targets in other species.
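An enrichment comparison of this kind can be illustrated with a one-sided hypergeometric test. This section does not state which test was used for the Pfam analysis, so the choice of test, the function name, and the per-domain counts below are assumptions for illustration. Only the totals of 230 misspliced genes among 308 expressed U12-type intron-containing genes come from the text.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` carry the domain."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Totals taken from the text.
EXPRESSED_U12_GENES = 308
MISSPLICED_U12_GENES = 230

# Hypothetical counts for a single Pfam domain.
domain_genes_expressed = 12
domain_genes_misspliced = 11

p_enriched = hypergeom_enrichment_p(
    EXPRESSED_U12_GENES,
    domain_genes_expressed,
    MISSPLICED_U12_GENES,
    domain_genes_misspliced,
)
```

Because about three-quarters of the expressed U12-type intron-containing genes are misspliced, most domains will have many misspliced members by chance alone, which is one way to read the absence of significant enrichment within this background set.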
Fig. 3.
U12 splicing defects in rgh3-affected genes involved in cell differentiation and growth. (A) Heat map of Pfam domains found in U12-type intron-containing genes. Red domains are enriched in genes with rgh3 U12 splicing defects relative to all maize genes tested for splicing defects. Blue indicates additional domains enriched in U12-type intron-containing genes relative to all maize genes. Gray and white indicate no enrichment of the domain relative either to genes tested for splicing defects or to all maize genes, respectively. (B) Cell cycle schematic showing human homolog gene symbols for maize U12-type intron-containing genes. Bold indicates U12 splicing defects in rgh3. Asterisks indicate splicing defects in human ZRSR2 mutants. (C–E) Endosperm expression of rgh3 misspliced genes (24). Cluster analysis of all genes (open triangles) and rgh3 misspliced genes (blue squares) is plotted for embryo (C) and endosperm (D). E/L, early/late developmental expression. Const., constitutive expression. (E) Example endosperm expression profiles for maize/human homologs with U12-type introns. (F) Cumulative frequency plot for U12-type intron positions in maize proteins (orange circles) with human homologs (blue squares) containing U12-type introns. Orange (maize) and blue (human) lines plot the expected normal distribution of each protein set. (G and H) Scatterplots of expression levels for misspliced genes with human homologs that also have U12-type introns (orange circles) and all other misspliced genes (blue squares). Dotted black line shows 1:1 ratio of WT to rgh3 expression. Dotted gray lines show a twofold ratio change.
U12 splicing defects in rgh3-affected genes involved in cell differentiation and growth. (A) Heat map of Pfam domains found in U12-type intron-containing genes. Red domains are enriched in genes with rgh3U12 splicing defects relative to all maize genes tested for splicing defects. Blue indicates additional domains enriched in U12-type intron-containing genes relative to all maize genes. Gray and white indicate no enrichment of the domain relative either to genes tested for splicing defects or to all maize genes, respectively. (B) Cell cycle schematic showing human homolog gene symbols for maizeU12-type intron-containing genes. Bold indicates U12 splicing defects in rgh3. Asterisks indicate splicing defects in humanZRSR2 mutants. (C–E) Endosperm expression of rgh3 misspliced genes (24). Cluster analysis of all genes (open triangles) and rgh3 misspliced genes (blue squares) is plotted for embryo (C) and endosperm (D). E/L, early/late developmental expression. Const., constitutive expression. (E) Example endosperm expression profiles for maize/human homologs with U12-type introns. (F) Cumulative frequency plot for U12-type intron positions in maize proteins (orange circles) with human homologs (blue squares) containing U12-type introns. Orange (maize) and blue (human) lines plot the expected normal distribution of each protein set. (G and H) Scatterplots of expression levels for misspliced genes with human homologs that also have U12-type introns (orange circles) and all other misspliced genes (blue squares). Dotted black line shows 1:1 ratio of WT to rgh3 expression. Dotted gray lines show a twofold ratio change.Endosperm cells in rgh3 show aberrant differentiation of embryo surrounding region (ESR) and basal endosperm transfer layer (BETL) cells (19), which is analogous to blood cell differentiation defects in MDSpatients (18). BETL and ESR differentiation occurs early in seed development. 
To examine developmental expression of rgh3-affected genes, we reanalyzed a transcriptome profile of maize seed development (24). Based on reported clusters of expression profiles, the 230 misspliced genes in rgh3 were overrepresented in early endosperm but not in early embryo development (Fig. 3). Approximately 78% of misspliced genes have a peak of expression between 6 and 12 d after pollination (Fig. 3). These data are consistent with U12-type intron-containing genes playing a role in endosperm cell differentiation.
To explore mechanistic similarities between U12-dependent cell differentiation pathways in maize and human, we identified human homologs of maize genes with U12-type introns. BLASTP searches of human RefSeq proteins identified 233 maize U12-type intron-containing genes with a human homolog (Dataset S4). The 233 maize genes correspond to 154 human genes due to differences in gene redundancy in the maize and human genomes. We found two biological processes that account for 35% of the U12-type intron-containing maize genes with human homologs. First, 50 maize U12-type intron-containing genes have roles in cell cycle. Of these cell cycle genes, 29 are differentially spliced in rgh3, corresponding to 24 human homologs (Fig. 3). Second, 33 maize genes with U12-type introns have predicted roles in protein glycosylation, including synthesis and transport of UDP-xylose, synthesis of dolichyl-diphosphooligosaccharide, secretion of glycosylated protein complexes, and the unfolded protein response pathway (Table S2). Twenty-five of the protein glycosylation genes are misspliced in rgh3 and correspond to 10 human homologs.
Table S2.
Maize–human homologs with predicted roles in protein glycosylation
| Maize gene | U12 splicing defect in rgh3 | Human gene symbol | Predicted function |
| --- | --- | --- | --- |
| UDP-xylose synthesis and transport | | | |
| GRMZM2G032003 | Yes | UGP2 | UDP-glucose pyrophosphorylase 2 |
| GRMZM2G098370 | | UGP2 | |
| GRMZM2G007195 | Yes | UXS1 | UDP-glucuronate decarboxylase 1 |
| GRMZM2G007404 | Yes | UXS1 | |
| GRMZM2G347717 | Yes | UXS1 | |
| GRMZM2G359234 | Yes | UXS1 | |
| GRMZM2G370048 | Yes | UXS1 | |
| GRMZM2G381473 | Yes | UXS1 | |
| GRMZM2G000632 | Yes | GALE | UDP-galactose-4-epimerase |
| GRMZM2G040397 | Yes | GALE | |
| GRMZM2G145460 | Yes | GALE | |
| GRMZM5G830983 | Yes | GALE | |
| GRMZM2G301172 | | UMPS | Uridine monophosphate synthetase |
| GRMZM2G063253 | Yes | SLC35E3 | UDP-xylose transporter |
| GRMZM2G063511 | Yes | SLC35E3 | |
| GRMZM2G068714 | Yes | SLC35E3 | |
| GRMZM2G081848 | Yes | SLC35E3 | |
| GRMZM2G116053 | Yes | SLC35E3 | |
| GRMZM2G122618 | Yes | SLC35E3 | |
| GRMZM2G048434 | | SLC35E3 | |
| GRMZM5G828581 | | SLC35E3 | |
| Protein glycosylation | | | |
| GRMZM2G426275 | | MGAT1 | UDP-N-acetylglucosamine:α-3-d-mannoside β-1,2-N-acetylglucosaminyltransferase I |
| | | | Phosphatidylinositol glycan anchor biosynthesis, class M |
| GRMZM2G000937 | Yes | PIGB | Phosphatidylinositol glycan anchor biosynthesis, class B |
| GRMZM2G164175 | Yes | PIGB | |
| Glycoprotein turnover and secretion | | | |
| GRMZM2G117388 | Yes | DERL2 | Degradation of misfolded glycoproteins |
| GRMZM2G143817 | | DERL2 | |
| GRMZM2G061922 | Yes | TRAPPC2 | Sedlin, collagen secretion |
| GRMZM2G097568 | Yes | TRAPPC2 | |
To find genes with altered splicing in both rgh3 and human ZRSR2 mutants, we identified the current gene symbols for human U12-intron–containing genes in U12DB (Dataset S5) (25). This revealed 36 human genes that are homologs of 57 maize genes with U12-type introns in both species. Maize/human homology enriches for U12 splicing defects in both rgh3 and ZRSR2 mutants (Table S3). Fifty of these maize genes were tested in the RNA-seq analysis, and 96% (48 of 50) had evidence of splicing defects in rgh3, as determined by intron read depth or junction read tests. Of the 758 human genes in U12DB, only 216 (28%) genes had significant splicing defects reported in ZRSR2 mutants, but 47% (17 of 36) of human U12-type intron-containing genes with maize homologs have splicing defects (18).
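The enrichment described in the preceding paragraph can be checked with a short calculation. The sketch below uses only the counts quoted in the text; the variable names and the fold-enrichment ratio are illustrative additions, not part of the original analysis, and no statistical test is implied.

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole

# Counts quoted in the text (variable names are illustrative).
maize_tested, maize_defective = 50, 48        # maize homologs tested by RNA-seq
u12db_genes, u12db_defective = 758, 216       # all human genes in U12DB
homolog_genes, homolog_defective = 36, 17     # human genes with maize U12 homologs

maize_rate = percent(maize_defective, maize_tested)
background_rate = percent(u12db_defective, u12db_genes)
homolog_rate = percent(homolog_defective, homolog_genes)

# Ratio of the defect rate among homologs to the U12DB-wide rate
# (an illustrative summary, not a statistic reported in the paper).
fold_enrichment = homolog_rate / background_rate

print(f"rgh3 defects in tested maize homologs: {maize_rate:.0f}%")
print(f"ZRSR2 defects across U12DB: {background_rate:.1f}%")
print(f"ZRSR2 defects in genes with maize homologs: {homolog_rate:.1f}%")
print(f"fold enrichment: {fold_enrichment:.2f}")
```

Under these counts, the defect rate among human genes with maize U12 homologs is roughly 1.7 times the rate across all U12DB genes.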
Table S3.
Human–maize gene pairs both containing U12-type introns that are misspliced in rgh3
| Human gene | Maize gene | Maize–human U12 intron position | Splicing defect in ZRSR2 MDS (18) | Biological process |
| --- | --- | --- | --- | --- |
| GPN2 | GRMZM2G093716 | Both U12 introns at same residues | Yes | Embryonic development |
| ALG12 | GRMZM2G152194 | Same residue | Yes | Protein glycosylation |
| MAEA | GRMZM2G177026 | Same residue | Yes | Cell differentiation |
| SACM1L | GRMZM2G047894 | Same residue | Yes | Metabolism |
| SACM1L | GRMZM2G171080 | Same residue | Yes | Metabolism |
| SACM1L | GRMZM2G418916 | Same residue | Yes | Metabolism |
| TAPT1 | GRMZM2G347645 | Same residue | Yes | Embryonic development |
| TRAPPC2 | GRMZM2G061922 | Same residue | Yes | Endomembrane |
| TRAPPC2 | GRMZM2G097568 | Same residue | Yes | Endomembrane |
| SLC9A8 | GRMZM2G067747 | First U12 intron, same residue | Yes | Transport |
| WDR91 | GRMZM2G158179 | First U12 intron, same residue | Yes | Unknown |
| SMYD2 | GRMZM2G457881 | Second U12 intron, conserved position, divergent protein motif | Yes | Cell cycle |
| DERL2 | GRMZM2G117388 | Divergent | Yes | Protein processing |
| E2F3 | GRMZM2G041701 | Divergent | Yes | Cell cycle |
| E2F3 | GRMZM2G052515 | Divergent | Yes | Cell cycle |
| EXO1 | GRMZM2G096920 | Divergent | Yes | Cell cycle |
| FRA10AC1 | GRMZM2G001444 | Divergent | Yes | RNA processing |
| IPO4 | GRMZM2G408305 | Divergent | Yes | Protein targeting |
| IPO9 | GRMZM2G457415 | Divergent | Yes | Protein targeting |
| PIGB | GRMZM2G000937 | Divergent | Yes | Protein glycosylation |
| PIGB | GRMZM2G164175 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G048434 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G063253 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G063511 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G068714 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G081848 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G116053 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G122618 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM5G828581 | Divergent | Yes | Protein glycosylation |
| NAPG | GRMZM2G145175 | First U12 intron, same residue | | Endomembrane |
| BRCC3 | GRMZM2G096491 | Same residue | | Cell cycle |
| BRCC3 | GRMZM2G152436 | Same residue | | Cell cycle |
| EXOSC1 | GRMZM5G841900 | Same residue | | RNA processing |
| EXOSC5 | GRMZM2G083620 | Same residue | | RNA processing |
| FAM96B | GRMZM2G159389 | Same residue | | Metabolism |
| FAM96B | GRMZM2G162266 | Same residue | | Metabolism |
| POLE2 | GRMZM2G154267 | Same residue | | Cell cycle |
| XRCC5 | GRMZM2G137968 | Same residue | | Cell cycle |
| PQLC2 | GRMZM2G024733 | Shifted by 1 codon | | Transport |
| PQLC2 | GRMZM2G153434 | Shifted by 1 codon | | Transport |
| BTAF1 | GRMZM2G168096 | Divergent | | Transcription |
| GTF2H3 | GRMZM2G027209 | Divergent | | Transcription |
| NCBP2 | GRMZM2G034804 | Divergent | | RNA processing |
| NCBP2 | GRMZM2G052341 | Divergent | | RNA processing |
| SETD2 | GRMZM2G033694 | Divergent | | Chromatin structure |
| SETD2 | GRMZM2G130910 | Divergent | | Chromatin structure |
| SMC3 | GRMZM2G456570 | Divergent | | Cell cycle |
| SMYD3 | GRMZM2G080462 | Divergent | | Cell cycle |
Three of these genes, GPN2, MAEA, and TAPT1, have documented roles in animal and plant development. The Arabidopsis homolog of GPN2 encodes the QQT1 protein and is required for embryos to complete periclinal divisions to establish epidermal and internal cell layers (26). The mouse ortholog of MAEA is required for final differentiation of erythrocytes (27). The vertebrate TAPT1 gene is required for skeletal patterning and normal function of the primary cilium (28). The Arabidopsis TAPT1 ortholog, POD1, is required for apical–basal patterning of the early embryo and endomembrane protein sorting (29).
Our analysis also revealed divergent loss of U12-type introns for genes with conserved biological functions (Datasets S4 and S5). The protein glycosylation enzymes, ALG12 and PIGB, have U12-type introns in both maize and human. U12-type introns are also found in other protein glycosylation genes, including ALG3, ALG6, ALG8, PIGN, and PIGP in human as well as the maize homologs of DDOST, MPDU1, MGAT1, RFT1, and PIGM. The DNA origin of replication complex subunit, ORC3, in humans contains a U12-type intron, whereas the maize ORC4 homolog contains a U12-type intron. For ubiquitin-specific peptidases, U12-type introns are found in human USP7, USP10, and USP14 as well as in maize homologs of USP36 and USP42. Similarly, molybdenum cofactor (Moco) biosynthesis genes contain U12-type introns in maize. In humans, two Moco-requiring enzymes, aldehyde oxidase (AOX1) and xanthine dehydrogenase (XDH), contain U12-type introns. Loss of Moco in maize does not affect endosperm development but does cause seedling lethality (30, 31). 
There are at least 25 additional examples where different members of protein complexes or biochemical pathways have U12-type introns in human and maize including TRAPP and adaptor related protein complexes, Rab, importins, DNA polymerase, activating signal cointegrator-1, TFIIA, RNA polymerase III, the exosome, ribosomal proteins, and the autophagy pathway. These homologies suggest a common genetic architecture of minor spliceosome targets in human and maize where splicing defects in U12-type introns disrupt conserved biological processes and result in stem cell-like phenotypes in ZRSR2 mutant MDS cells and rgh3 endosperm cells.
Conservation of U12-Type Intron Positions in Maize and Human.
Divergence of gene sets with U12-type introns in a given species is almost exclusively due to loss of U12-type introns or mutation toward a U2-type intron (22, 32). Protein alignments of the human and maize isoforms were used to identify U12-type intron positions for maize/human homologs (Fig. S10 and Dataset S4). Approximately one-half of the maize/human homologs (19 of 36 human genes and 26 of 57 maize genes) have at least one U12-type intron in a conserved position within the protein coding sequence (Fig. S11 and Table S3). The U12-type introns are randomly distributed within the coding sequences of both maize and human homologs (Fig. 3). The intron positions fit the normal distribution (Shapiro–Wilk P > 0.05) with very little skew (human = −0.04; maize = −0.03) but have a low kurtosis (human = −0.53; maize = −0.91). The near-uniform distribution suggests a diversity of coding sequence impacts, with most defective transcripts expected to trigger nonsense-mediated decay (NMD). However, rgh3 splicing defects do not appear to significantly alter transcript abundance. For the rgh3 misspliced genes, 86 and 79% of shoot and root transcripts accumulate within a twofold range of WT, respectively (Fig. 3). Near-equivalent transcript levels were also observed for misspliced genes with a U12DB human homolog.
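To make the skew and kurtosis values above easier to interpret, the following sketch computes both moments for a set of relative intron positions. It is an illustration only: the evenly spaced positions are hypothetical stand-ins for real data, and the population (biased) moment estimators are an assumption, since the text does not state which estimator was used. For reference, a normal distribution has an excess kurtosis of 0 and a continuous uniform distribution has an excess kurtosis of −1.2, so the reported values (−0.53 and −0.91) fall between the two.

```python
from statistics import fmean

def central_moment(values, order):
    """Population central moment of the given order."""
    mean = fmean(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Population skewness: third central moment over variance^1.5."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Population excess kurtosis: fourth moment over variance^2, minus 3."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

# Hypothetical data: 101 intron positions spread evenly along the
# coding sequence, expressed as a fraction of protein length (0 to 1).
positions = [i / 100 for i in range(101)]

print(f"skew = {skewness(positions):.2f}")
print(f"excess kurtosis = {excess_kurtosis(positions):.2f}")
```

With perfectly even spacing the skew is zero and the excess kurtosis is close to −1.2, the uniform limit; real position data would be substituted for the hypothetical list.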
Fig. S10.
Protein sequence alignment of maize and human E2F3 homologs reveals U12-type intron position divergence. Clustal Omega was used to align the maize and human protein isoforms as well as sequences truncated at the last codon 5′ of the U12-type intron. Truncated proteins were removed from the alignment, and the 5′-codon is indicated by an underlined residue in the alignment. The position of the human U12-type intron is indicated by the blue arrow callout (Hs), and the maize intron is indicated by the orange arrow callout (Zm).
Fig. S11.
Protein sequence alignment of maize and human MAEA homologs identifies conserved U12-type intron positions. Clustal Omega was used to align the maize and human protein isoforms as well as sequences truncated at the last codon 5′ of the U12-type intron. Truncated proteins were removed from the alignment, and the 5′-codon is indicated by an underlined residue in the alignment. The position of the human U12-type intron is indicated by the blue arrow callout (Hs), whereas the maize intron is indicated by the orange arrow callout (Zm).
RT-PCR from purified nuclei and polysomes indicates that some misspliced transcripts in rgh3 are likely to be translated. Intron retention and misspliced transcripts copurified with polysomes when the U12-type intron is the last intron of the transcript, such as in maize homologs of the E2F3 cell-cycle transcription factor and TRAPPC2 (Fig. 4). The U12 splicing defects in GRMZM2G033430 and PQLC2 also copurify with polysomes in rgh3 mutants (Fig. 4 and Fig. S12). These splice variants may not be NMD targets, because the termination codons are relatively close to the last exon–exon junction. By contrast, transcripts with multiple exon–exon junctions downstream of the U12 splicing defect as well as control transcripts expected to be retained in the nucleus were enriched in nuclei and excluded from polysomes (Fig. 4 and Fig. S12).
Fig. 4.
A subset of rgh3 misspliced transcripts may be translated. (A–H) RT-PCR analysis using RNA extracted from total, nuclei, and polysome fractions of normal and rgh3 samples. U12-type intron regions were amplified from the following: (A) GRMZM2G052515, (B) GRMZM2G097568, (C) GRMZM2G033430, (D) GRMZM2G177026, and (E) GRMZM2G093716. The larger two mRNA variants of Rsp31B (F) and mir156 premiRNA (G) are expected to be excluded from polysomes, whereas actin (H) serves as loading control. All transcript diagrams are drawn to the same scale as in A with U12-type introns indicated. Arrows show forward and reverse primers used for RT-PCR. (I) Transcript and protein diagrams for maize E2F3 genes and their human E2F3a homolog. Retention of the maize U12-type introns introduces premature termination codons (red octagon) that would have minimal effect on E2F protein domains. Retention and cryptic splicing of the U12-type intron in human are predicted NMD targets, whereas skipping exons 4–5 is predicted to produce a nonfunctional protein.
Fig. S12.
U12-type intron retention transcripts are primarily retained in the nucleus. Semiquantitative RT-PCR analysis using RNA extracted from total tissue, nuclear enriched, and polysome fractions of normal and rgh3 seedlings. (A) GRMZM2G153434, (B) GRMZM2G106613, (C) GRMZM2G083620, (D) GRMZM2G131321, and (E) GRMZM2G074015. Actin (F) was used as loading control. All transcript diagrams are based on cloned, sequenced RT-PCR products and are drawn to the same scale with 100 bp indicated in A. Arrows on the gene model schematics show forward- and reverse-primer binding sites.
Missplicing of E2F3 was identified as a candidate for mediating human myeloid malignancies observed in ZRSR2 mutant cells (18). U12-type intron retention in the maize E2F3 homologs truncates the C-terminal transactivation domain, similar to the endogenous GRMZM2G041701_T02 isoform (Fig. 4). 
The predicted E2F proteins from the intron retention transcripts contain all domains necessary for cell proliferation and the endocycle in plants (33). By contrast, the splicing defects observed in human E2F3 are primarily expected to be NMD targets, except for a transcript that skips exons 3–4, which flank the U12-type intron (18). If translated, the human exon skip transcript would produce a nonfunctional E2F3 protein lacking the DP-dimerization and transactivation domains. This difference in U12-type intron position may contribute to the contrasting cell proliferation phenotypes with rgh3 being proliferative in culture and ZRSR2 mutants arresting the cell cycle.
Mutant RGH3 Proteins Disrupt Localization with U2AF2.
It is surprising that rgh3 disrupts a larger proportion of U12-type introns than ZRSR2 mutations in MDS patients. The rgh3 mutant is predicted to encode a weak allele, whereas most ZRSR2 mutations are loss-of-function alleles (14–16, 19). The rgh3-umu1 allele has a Mutator transposon insertion that partially splices from the transcript to delete 12 aa and insert 47 aa coded by the transposon sequence in the N-terminal acidic domain of the RGH3 protein (Fig. 5). The normal Rgh3 allele produces multiple splice variants. Only the Rgh3α isoform encodes a full-length ortholog of ZRSR2 (19). Both the mutant and splice isoforms coding for protein truncations are likely to be expressed as protein in maize, because RGH3 antibodies cross-react with multiple proteins that migrate similarly to in vitro transcribed/translated cDNA clones of the alternatively spliced variants (Fig. S13). We investigated subcellular localization of RGH3 protein variants to gain additional insight into the nature of the rgh3 mutant allele.
Fig. 5.
Multiple RGH3 protein domains are necessary for colocalization with U2AF2. (A) Protein domain schematic of RGH3 isoforms and UHM domain deletion tested in colocalization assays. (B–D) Transient colocalization of U2AF2 with mutant RGH3umu1α allele (B), RGH3ΔUHM domain deletion (C), and the RGH3ε isoform (D). (E–H) Transient expression of BiFC constructs: cYFP-U2AF2 with RGH3α-nYFP (E), nYFP-U2AF2 with U2AF1-cYFP (F), nYFP-U2AF2 with cYFP-RGH3umuα (G), and cYFP-U2AF2 with nYFP-RGH3ΔUHM (H). White arrowheads point to nucleolus. [Scale bar: 5 µm (in all microscopy images).]
Fig. S13.
Alternatively spliced Rgh3 variants produce truncated proteins in vivo. (A) Schematic of RGH3 protein isoforms based on full-length cDNA sequences. Predicted protein molecular weights are given on the Left. (B) Western blot analysis with an N-terminal anti-RGH3 peptide antibody. The panel on the Left shows in vitro transcription/translation reactions charged with full-length cDNA clones coding for different RGH3 protein isoforms. The arrowhead points to RGH3α and RGH3umu1α full-length proteins. The arrow points to a nonspecific, cross-reactive protein in the wheat germ extract. All RGH3 isoforms migrate through SDS/PAGE slower than predicted. The panel on the Right shows 24 DAP seed tissue protein extracts. Multiple protein bands cross-react with the anti-RGH3 antibody with the arrowhead indicating proteins correlating in mobility to RGH3α and RGH3umu1α in vitro-transcribed/translated isoforms. The asterisk indicates potential RGH3 truncated isoforms. The RGH3β and RGH3γ isoforms are the most common alternative splice variants expressed from the rgh3 locus and migrate similarly to the major low–molecular-weight proteins recognized by the anti-RGH3 antibody.
Human ZRSR2 interacts with the major spliceosome subunit, U2AF2, through a U2AF homology motif (UHM) that is related to RNA recognition motifs but mediates protein–protein interactions (13, 34). These interactions are likely conserved in maize (19). 
Colocalization is observed when maize RGH3α and U2AF2 are transiently expressed in Nicotiana benthamiana as GFP and RFP fusions (Fig. S14). U2AF2 is the large subunit of U2 auxiliary factor (U2AF), and maize U2AF2 also colocalizes with the small subunit of U2AF, U2AF1 (Fig. S14). These data are consistent with maize RGH3α and U2AF2 acting in the same subnuclear compartment.
Fig. S14.
Colocalization analysis of multiple RGH3 natural and engineered isoforms with U2AF2. Engineered and truncated protein variants of RGH3 were fused to GFP or RFP and transiently coexpressed with GFP-U2AF2 or U2AF2-RFP in N. benthamiana. (A) RGH3α colocalizes with U2AF2 throughout the nucleoplasm as well as the nucleolus (white arrowhead). This is an independent experiment from figure 9D in ref. 19. (B) U2AF1 and U2AF2 subunits colocalize in the nucleoplasm. (C and D) RGH3umu1α and RGH3ΔUHM show intermittent colocalization with U2AF2 within structures of the nucleoplasm (red arrowhead). (E and F) WT, truncated RGH3 protein variants are concentrated in the nucleolus and fail to colocalize with U2AF2-RFP. [Scale bar: 5 µm (in all images).]
The mutant protein, RGH3umu1α, has aberrant subnuclear localization with low levels of diffuse signal in the nucleoplasm (Fig. 5) instead of localizing to spliceosomal speckles as seen for RGH3α (Fig. S14). When RGH3umu1α is coexpressed with U2AF2, the proteins typically localize to different subnuclear compartments. Some cells showed partial overlap of RFP and GFP fusions, indicating reduced colocalization (Fig. S14). An in-frame deletion of the UHM domain (RGH3ΔUHM) showed intermittent colocalization like RGH3umu1α (Fig. 5 and Fig. S14). Splice isoforms coding for protein truncations of the RS domain (RGH3ε) or the UHM and RS domain (RGH3β, RGH3γ) did not show overlap with U2AF2 (Fig. 5 and Fig. S14).
Bimolecular fluorescence complementation (BiFC) assays of U2AF2 with RGH3α, U2AF1, RGH3umu1α, and RGH3ΔUHM all resulted in YFP signal in the nucleus (Fig. 5 and Fig. S15). Similar to colocalization experiments, U2AF2 and RGH3α showed YFP signal in nuclear speckles and the nucleolus, whereas U2AF1 and U2AF2 had YFP signal in nuclear speckles. By contrast, BiFC signals from U2AF2 with RGH3umu1α or RGH3ΔUHM appear aggregated in larger subnuclear foci. 
Reconstitution of YFP in BiFC assays is irreversible, so even transient interactions can give a stable YFP signal (35). These data show that U2AF1, U2AF2, and RGH3 colocalize as predicted from human protein–protein interaction studies. Mutations or truncations affecting the acidic, UHM, or RS domains all disrupt the dynamic colocalization of RGH3 with U2AF2 equivalently. Combined with the RNA-seq results, these localization experiments suggest that rgh3-umu1 is more likely a strong allele and indicate an important role for the acidic domain of ZRSR2/RGH3 in U12 splicing.
Fig. S15.
Additional BiFC images supporting colocalization of RGH3 variants and U2AF2. BiFC signal is observed when the N- and C-terminal segments of the split YFP are swapped in two alternate combinations of the RGH3α and U2AF2 fusions as well as one alternate combination of RGH3ΔUHM and U2AF2 fusions. (Scale bars: 5 µm.)
Discussion
RGH3/ZRSR2 Are U12 Splicing Factors in Vivo.
Our data provide an independent genetic analysis of ZRSR2 function in a distantly related species from humans. RGH3/ZRSR2 has a conserved role in splicing U12-type intron-containing genes. ZRSR2 and rgh3 mutants exhibit U12-type intron retention, activation of cryptic 5′- and 3′-splice sites within U12-type introns, and retention of U2-type introns that are adjacent to misspliced U12-type introns.
A near-exclusive in vivo function for ZRSR2/RGH3 in U12 splicing contradicts biochemical experiments that conclusively show ZRSR2 copurifies and is required in both spliceosomes (12, 13). It is possible that copurification of ZRSR2 and U2AF in human cell extracts is due to independent binding of common pre-mRNA species. However, ZRSR2 interacts with U2AF2 in yeast two-hybrid assays (13). Yeast lacks U12-type introns, suggesting direct protein–protein interactions are more likely between these spliceosome subunits. In maize, colocalization of RGH3 with U2AF2 appears to be an indicator of RGH3 function in the minor spliceosome. Potentially, interactions between the major and minor spliceosomes promote efficient splicing of either class of intron.
Minor Splicing as a Regulatory Process.
Minor splicing factors are at low abundance, and U12 splicing can be a rate-limiting step to produce protein-coding mRNA (36). For example, U6atac levels in HeLa cells affect splicing efficiency as well as expression level of genes with U12-type introns, and the minor spliceosome was proposed to regulate cellular responses to stimuli (4). Under this model, minor spliceosome activity determines the balance of coding mRNA vs. NMD or translation into alternative protein isoforms (4). We found little evidence of splicing defects resulting in expression level changes of maize U12-type intron-containing genes. Similarly, Drosophila U6atac mutants have little impact on U12-type intron-containing gene expression levels (37). In maize, reduced U12 splicing generally leads to predicted NMD targets being retained in the nucleus. A smaller subset of U12-type intron retention transcripts in rgh3 are associated with polysomes and are likely to be translated into alternative protein isoforms.
Although different molecular mechanisms seem to act downstream of U12 splicing efficiency in maize and humans, there are a substantial number of homologous genes with U12-type introns. We found 233 maize U12-type intron-containing genes with easily identified human homologs representing 154 human genes. Nearly 25% of the maize–human homolog pairs contain U12-type introns in both species (57 of 233 for maize genes or 36 of 154 human homologs). There are also overlapping functions among nonhomologous genes subject to U12 splicing. These nonhomologous overlapping genes are subunits of the same protein complexes, members of conserved gene families, or members of the same metabolic pathways. These overlaps suggest selection for specific biological processes, such as cell cycle and protein glycosylation, to be dependent upon U12 splicing efficiency and support the idea that the minor spliceosome could be regulatory.
Minor Splicing Is Required for Cell Differentiation.
ZRSR2 and rgh3 mutants both disrupt cell differentiation programs. MDS patients with ZRSR2 mutations accumulate myeloid blast precursors (18). Maize rgh3 endosperm retains proliferative capacity and shows cell fate switching to aleurone in the basal endosperm transfer cell layer and the embryo-surrounding region (19). Mutations in other minor spliceosome factors lead to developmental defects in many species. In humans, Taybi–Linder syndrome is caused by mutations in U4atac, resulting in severe bone abnormalities and microcephaly (6, 7). Reduced U12 splicing efficiency may also be the primary molecular cause of spinal muscular atrophy, which affects the peripheral nervous system (38, 39). Zebrafish mutations in RNPC3 lead to defects in endodermal organ development and aberrant intestinal epithelium morphology (8). Drosophila mutants in U6atac and U12 snRNA are lethal in third-instar larvae, when adult metamorphosis occurs (5). Artificial microRNA (amiRNA) down-regulation of Arabidopsis U12 splicing factors results in defective leaf morphology and arrested inflorescence development (9–11). These phenotypes all suggest that reduced U12 splicing is not immediately lethal to the cell but rather disrupts essential developmental processes.
Despite the many U12 splicing phenotypes reported, no unifying developmental function has been ascribed to minor splicing. By focusing on individual tissues such as endosperm or blood, a common function for Rgh3 and ZRSR2 in cell differentiation becomes clear (18, 19). Extending to a more general interpretation of U12 splicing mutant phenotypes, the data suggest a role for U12 splicing to promote differentiation of a subset of cell types in both plants and animals. It is unlikely for U12 splicing to be needed in all differentiation processes, because the minor spliceosome has been lost in multiple eukaryotic lineages (2).
Conservation of Minor Splicing During Evolution.
The losses of minor splicing during evolution raise the possibility that the minor spliceosome evolved in a convergent manner to function in plant and animal cell differentiation. However, there appears to be more selective pressure to maintain U12 splicing in multicellular eukaryotes. Although the minor spliceosome is missing in some lineages with multicellular development, a larger fraction of unicellular eukaryotic genomes have lost U12 splicing (2, 40). The few unicellular species that retain the minor spliceosome, such as Acanthamoeba castellani, tend to have amoeboid cellular organization (2, 40). The A. castellani genome sequence revealed a high frequency of horizontal gene transfer, including potential eukaryote-to-eukaryote gene transfer that would confound deep evolutionary comparisons (41).
Volvocine green algae illustrate convergent evolutionary innovation in multicellular development. Volvocines do not have a minor spliceosome and independently evolved multicellular species relative to higher plants (2, 42). Recent genome comparisons within this clade found that cell cycle genes are expanded in multicellular species with mutation of retinoblastoma as the primary change needed for unicellular species to become multicellular (41). In addition, genes encoding glycosylated extracellular matrix proteins are expanded in Volvox carteri, which has two distinct cell types. Both processes are highly represented in maize U12-type intron-containing genes. However, neither retinoblastoma nor the volvocine-type extracellular matrix proteins have U12-type introns in any species (21, 25).
Importantly, the U12-type introns affected in rgh3 and ZRSR2 mutants were maintained since the divergence of plants and animals (32). The number of U12-type introns in the last eukaryotic common ancestor is not known, but all lineages with a minor spliceosome have a small fraction of genes with U12-type introns. 
Assuming random loss of introns, the number of homologous genes that still have U12-type introns in human and maize is large, especially with one-half of the homologs having divergent U12-type intron positions. These observations argue for selective pressure to maintain a subset of genes with U12-type introns to carry out essential roles in cell differentiation.
Materials and Methods
RNA-Seq.
Normal and rgh3 kernels in the W22 inbred background were sown in soil (29 °C/20 °C, 12 h/12 h, day/night). Individual seedlings were harvested when about 5 cm high. Normal seedlings were genotyped to select homozygous WT individuals. Total mRNA was extracted from three biological replicates of rgh3 and WT roots and shoots using TRIzol Reagent (Life Technologies) with RNase-free glycogen (Fermentas) as an RNA carrier. Nonstrand-specific TruSeq (Illumina) cDNA libraries were built from 2 μg of RNA input with a 200-bp median insert length. Twelve libraries were pooled using the KAPA Library Quantification Kit (Kapa Biosystems) and sequenced on the HiSeq 2000 platform with 100-bp paired-end reads.
Read Mapping.
Bases with a quality score <20 were trimmed with the FASTQ/A utility in the FASTX toolkit (hannonlab.cshl.edu/fastx_toolkit/). Low-quality reads were discarded if >20% of bases had a quality score <20. Reads from repetitive elements in the MIPS database (43) were removed using Bowtie (44), version 0.12.7. Single-nucleotide polymorphisms (SNPs) between W22 and the B73 RefGen_v2 genome were identified by mapping WT libraries to the B73 RefGen_v2 genome (45) (Dataset S6). Mosaik 2, version 2.1.33, was run with a hash size of 14, up to 100 hash positions per seed, and allowing 2% mismatch (46). Duplicate reads were removed using Picard, version 1.54 (broadinstitute.github.io/picard/). W22 SNPs were identified with FreeBayes, version 0.9.5, assuming 2% pairwise differences with B73 and requiring a 99% confidence statistic that an individual variant exists (47). FreeBayes parameters included at least 10 mapped reads with >40% of reads supporting the variant and at least one read having a minimum quality score of 20.
GSNAP, version 2012-5-24, with an IIT file containing W22/B73 SNPs, was used to uniquely align all RNA-seq libraries with B73 RefGen_v2 (48). Alignment parameters allowed 2% mismatch and mate pairs to deviate 800 bp from the expected paired-end length of 200 bp. GSNAP was guided by the ZmB73_5b filtered gene set annotations but was permitted to discover novel splicing events. Transcript isoforms were assembled using Cufflinks, version 1.3.0 (20). Each library was assembled independently with a minimum intron size of 10 bp and an overlap radius of 10 bp. Isoform expression analysis was restricted to the ZmB73_5b filtered gene set. Differential splicing was tested for roots and shoots independently using Cuffdiff with the bias detection and correction algorithm and upper-quartile normalization.
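The read-level discard rule stated at the start of this section (drop a read when more than 20% of its bases have a quality score below 20) can be illustrated with a minimal sketch. This is not the FASTX toolkit implementation; the function names are hypothetical, and Phred+33 quality encoding is an assumption about the FASTQ files.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    The offset of 33 (Phred+33 encoding) is an assumption, not stated in the text.
    """
    return [ord(ch) - offset for ch in quality_string]


def keep_read(quality_string, min_quality=20, max_low_fraction=0.20, offset=33):
    """Return False when more than max_low_fraction of bases score below min_quality."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return False  # an empty read carries no usable sequence
    low = sum(1 for q in scores if q < min_quality)
    return low / len(scores) <= max_low_fraction


# 'I' decodes to Q40 and '#' to Q2 under Phred+33.
print(keep_read("IIIIIIII##"))  # exactly 20% low-quality bases: kept
print(keep_read("IIIIIII###"))  # 30% low-quality bases: discarded
```

The thresholds are exposed as parameters so the same check can be reused with other cutoffs.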
Tests for Altered Splicing of Individual Introns.
Nonredundant introns were identified from the ZmB73_5b annotation. HTSeq, version 0.6.1p1 (49), tallied intron read counts, which were normalized by the fragments per kilobase of transcript per million mapped reads (FPKM) of the gene model in each library. Differences in intron reads were determined by a two-tailed t test with the Welch modification for degrees of freedom with no correction for multiple testing. Genes that were not expressed in every RNA-seq library were excluded from the analysis. Genomic coordinates for U12-type introns were identified by BLASTN searches against the B73 RefGen_v2 genome using sequences from the ERISdb database (21, 45).
PSO was calculated from exon–exon junction reads and reads mapping to the shortest nonredundant intron for each unique 5′- and 3′-splice site (50). Read counts were calculated with the intersectBed command from Bedtools, version 2.24.0 (51). Fisher’s exact test was calculated for each intron with the summed reads across all WT and all mutant libraries, and a false-discovery rate of 0.05 was applied using the Benjamini–Hochberg method.
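The per-intron Fisher's exact test followed by Benjamini–Hochberg correction can be sketched in plain Python as below. This is an illustrative sketch, not the code used in the study. The 2×2 table layout (rows: WT and mutant; columns: exon–exon junction reads and intron reads) is an assumption, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are WT and mutant, columns are junction and intron reads.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed table.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical read counts: (WT junction, WT intron, mutant junction, mutant intron).
introns = {"intron_A": (8, 2, 1, 5), "intron_B": (5, 5, 5, 5)}
raw = {name: fisher_exact_two_sided(*counts) for name, counts in introns.items()}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
significant = [name for name, q in adj.items() if q <= 0.05]
print(raw, adj, significant)
```

Enumerating every table is slow for very deep read counts; a statistics library would normally be used for a full dataset.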
RT-PCR.
RNA from 12 d after pollination (DAP) whole kernels was isolated by freezing rgh3 and normal kernels in liquid nitrogen with extraction using TRIzol Reagent followed by RNeasy RNA cleanup with on-column DNase I digestion (Qiagen). Embryo, starchy endosperm, and basal endosperm transfer cell-enriched tissue was hand dissected from 12 DAP whole kernels at −18 °C (52). RNA from these tissues was extracted with the Arcturus PicoPure RNA isolation kit (Applied Biosystems) with on-column DNase I digestion. Endosperm cultures were established as described (19). RNA was extracted from endosperm culture, seedling roots, and seedling shoots using the RNeasy Plant Mini Kit (Qiagen) with on-column DNase I digestion. Total RNA was reverse-transcribed using the SuperScript III First-Strand Synthesis System (Invitrogen) with oligo(dT)20 as primer.
Gene-specific primers were designed to distinguish multiple splice variants from individual genes using one primer pair. For genes predicted by Cufflinks to have splicing differences in rgh3, annotated alternative transcripts along with raw read coverage depth, visualized by the Integrated Genome Browser (53), were used to target specific regions for RT-PCR. Amplification conditions consisted of the following: 1-min denaturation at 95 °C; 30–31 cycles of 95 °C for 1 min, 55 °C for 1 min, 72 °C for 30 s, and a final extension step of 72 °C for 5 min. RT-PCR products were cloned into the pCR4-TOPO vector for Sanger sequencing.
Conserved Protein Domain Enrichment Analysis.
All predicted protein isoforms in the maize filtered gene set were queried with HMMER 3.0 against the Pfam database using an inclusion threshold of 0.01 (54, 55). Hypergeometric P values were calculated for three samples and populations (Dataset S3): (i) rgh3 misspliced genes relative to all expressed U12-type intron-containing genes, (ii) rgh3 misspliced genes relative to all expressed maize genes, and (iii) all U12-type intron-containing genes relative to all genes in the filtered gene set. A gene was considered expressed if at least one intron was tested for normalized intron read depth differences between rgh3 and WT samples. Pfam domains were manually categorized into biological functions based on the annotation of each domain.
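As an illustration of the hypergeometric enrichment calculation, the sketch below computes the probability of seeing at least k genes with a given Pfam domain in a sample drawn from a population. Treating the test as a one-sided upper tail is an assumption, and the function name and example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(k, sample_size, domain_in_population, population_size):
    """P(X >= k) when sample_size genes are drawn without replacement from
    population_size genes, of which domain_in_population carry the domain.

    The one-sided upper tail is an assumption about how enrichment was scored.
    """
    upper = min(sample_size, domain_in_population)
    denom = comb(population_size, sample_size)
    numer = sum(
        comb(domain_in_population, x)
        * comb(population_size - domain_in_population, sample_size - x)
        for x in range(k, upper + 1)
    )
    return numer / denom


# Toy numbers: 10 genes in the population, 4 with the domain,
# 3 genes sampled, 2 of them with the domain.
print(hypergeom_enrichment_p(2, 3, 4, 10))
```

For comparison (i) in the text, the sample would be the rgh3 misspliced genes and the population all expressed U12-type intron-containing genes.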
Human Homolog Analysis.
The B73 RefGen_v3 maize protein isoforms for the 408 genes in ERISdb with U12-type introns were downloaded from MaizeGDB (21, 56). Each protein isoform was queried against the human RefSeq protein database using BLASTP. Human homologs were defined as the best match with a minimum bit score of 80 (Dataset S4). National Center for Biotechnology Information (NCBI) Gene database unique identifiers were retrieved for each human RefSeq accession along with the corresponding HUGO Gene Nomenclature Committee (HGNC) human gene symbols (57). Human gene symbols containing U12-type introns were identified using the “intron FASTA” query at U12DB to download the complete set of U12-type introns (25). ENSEMBL gene identifiers were parsed from the FASTA file, and identifiers that were still current were converted to NCBI Gene database unique identifiers and human gene symbols using DAVID (58, 59). Predicted protein sequences from retired ENSEMBL gene identifiers were used to query the human RefSeq protein database using BLASTP. Intron sequences that did not have an ENSEMBL identifier were queried against the NCBI human genomic plus transcript database using BLASTN. The complete list of U12DB intron identifiers with current NCBI Gene database unique identifiers and human gene symbols is given in Dataset S5. The human gene symbols from maize–human homologs were used to cross-reference the human gene symbols identified from U12DB to determine maize and human genes that both contained a U12-type intron (Table S3).
To determine U12-type intron positions relative to maize protein sequences, the 5′-exon at ERISdb was queried with BLASTX against the maize protein isoforms from the B73 RefGen_v3 annotation. Human protein isoforms were downloaded from the Consensus CDS protein set at the NCBI CCDS database (60). The 5′-exon sequence given at U12DB was used for pattern match searches of the consensus transcripts. The relative amino acid positions of the U12-type intron in the maize and human homologs were determined using Clustal Omega (61). All maize protein isoforms related to each human protein were aligned along with protein isoforms that were truncated at the last complete codon in the exon 5′ of the U12-type intron. The U12-type intron position was scored as identical if the C-terminal residues of the maize and human truncated isoforms were at the same position in the alignment. Normalized U12-type intron positions were determined based on the amino acid residues for each protein isoform and then averaged by gene. When a protein had more than one U12-type intron, normalized positions were calculated independently for each intron. U12-type intron retention models were constructed for E2F gene family members using the first annotated transcript and protein isoform for each gene.
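The normalization and per-gene averaging of intron positions can be sketched as follows. The exact formula is not given in the text; this sketch assumes the normalized position is the index of the last residue encoded before the intron divided by the isoform length. The record layout and function names are hypothetical.

```python
from collections import defaultdict


def normalized_position(last_residue_before_intron, protein_length):
    """Fractional position of the intron along a protein isoform (0 to 1).

    Dividing by isoform length is an assumed normalization.
    """
    if protein_length <= 0 or not 0 <= last_residue_before_intron <= protein_length:
        raise ValueError("residue index must lie within the protein length")
    return last_residue_before_intron / protein_length


def mean_position_by_gene(records):
    """Average normalized positions over isoforms, separately for each intron.

    records: iterable of (gene_id, intron_id, last_residue_before_intron, protein_length).
    Returns a dict keyed by (gene_id, intron_id).
    """
    grouped = defaultdict(list)
    for gene_id, intron_id, residue, length in records:
        grouped[(gene_id, intron_id)].append(normalized_position(residue, length))
    return {key: sum(vals) / len(vals) for key, vals in grouped.items()}


# Hypothetical gene with two isoforms sharing one intron and a second intron.
records = [
    ("geneA", "intron1", 50, 100),
    ("geneA", "intron1", 30, 120),
    ("geneA", "intron2", 90, 100),
]
print(mean_position_by_gene(records))
```

Keying the result by gene and intron keeps multiple U12-type introns in one protein independent, as described above.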
Isolation of Polysomes.
Normal and rgh3 seedlings were collected at the 3- to 5-cm stage. For each biological replicate, seedlings from two self-pollinated families were combined, flash frozen, and pulverized in liquid nitrogen. Total, nuclear, and polysomal RNA fractions were extracted from 2.5–3 mL of ground tissue using the conventional isolation of plant polysomes protocol as described (62). Initial pellets to clarify the tissue extract were saved as the nuclear fraction, and 500 µL of the clarified supernatant was saved as the total RNA fraction. RNA from all fractions was extracted with Qiagen RNeasy kits and treated with Ambion TURBO DNA-free kit (Thermo Fisher Scientific) following the manufacturer’s instructions. RT-PCR was completed from 1 μg of RNA using M-MLV reverse transcriptase (Promega) to synthesize cDNA as described (63). PCR cycles were optimized for each gene.
Subcellular Localization.
The RGH3-GFP and U2AF2 fluorescent protein fusion constructs were previously described (19). Rgh3ΔUHM was cloned from Rgh3α by amplifying an N-terminal fragment from the start codon to the end of the first zinc finger using primers Rgh3ΔUHM-1 and Rgh3ΔUHM-2 (Table S4). A C-terminal fragment was amplified from the second zinc finger to the C-terminal stop codon using primers Rgh3ΔUHM-3 and Rgh3ΔUHM-4. The fragments were ligated by overlap extension PCR (64). U2AF2 and U2AF1 were amplified from leaf cDNA of normal 14-d-old seedlings. The RT-PCR products were subcloned into pENTR vector (Invitrogen) and then cloned into pB7-WGF2, pB7-RWG2, or pB7-FWG2 (65, 66) as described (19). BiFC constructs were created by transferring coding sequences from pENTR vector clones to pSAT4-DEST or pSAT5-DEST vectors (67, 68) by LR clonase reactions following the manufacturer’s instructions (Invitrogen). The recombined transcription cassettes were digested with I-CeuI or I-SceI and ligated into the pPZP-RCS2 binary vector (67). Transient expression in Nicotiana benthamiana and microscopy analyses were completed as described with some modifications (19). Binary vectors were transformed into Agrobacterium tumefaciens strain GV3101. Protein expression was visualized 24–48 h after transient transformation. YFP was excited at 514 nm and detected with an emission band of 525–565 nm.
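Table S4. Primers used in this study

| Primer set | Primer name | Sequence |
| --- | --- | --- |
| RT-PCR primers | Actin-L | CATGAGGCCACGTACAACTCCATC |
| RT-PCR primers | Actin-R | TCATACTCTCCCTTGGAGATCCAC |
| RT-PCR primers | GRMZM2G011636-L | CTTCCATTGTCGGAGGGATTAG |
| RT-PCR primers | GRMZM2G011636-R | GGTCACACAAAGTCAAATAGCAAAC |
| RT-PCR primers | GRMZM2G021272-L | GGATTTGTTTGTCGCCAACT |
| RT-PCR primers | GRMZM2G021272-R | AGCCCGTATAGCATCATTGC |
| RT-PCR primers | GRMZM2G033430-L | CATGTGCGTGTCAGTTACAAATATC |
| RT-PCR primers | GRMZM2G033430-R | GATGAGAGGTCCACCATCC |
| RT-PCR primers | GRMZM2G040401-L | ATGCTTTCATAGGTTTAGGCTTCC |
| RT-PCR primers | GRMZM2G040401-R | CCTTGGCGCTTGTCTTTTC |
| RT-PCR primers | GRMZM2G052515-L | CTCCTCCAAGGCCTACACAG |
| RT-PCR primers | GRMZM2G052515-R | ATCCTCCGCGTTGAATTTCT |
| RT-PCR primers | GRMZM2G074015-L | TGGGAGTCAGCCATTCTTCTATA |
| RT-PCR primers | GRMZM2G074015-R | CCTGGAGCAAAGTACTGGATAC |
| RT-PCR primers | GRMZM2G083620-L | GTTGTGGGTGATGATGGTTCT |
| RT-PCR primers | GRMZM2G083620-R | CTGCTCTTCTGCTCTGCTAG |
| RT-PCR primers | GRMZM2G093716-L | GGAAAGTTGCGGTTGTCAAT |
| RT-PCR primers | GRMZM2G093716-R | TCTGGCATTTGAGTGAAGGA |
| RT-PCR primers | GRMZM2G096600-L | CAAGTTGCCTGAAGAAATACTGC |
| RT-PCR primers | GRMZM2G096600-R | TCTGGTGGAAAGAAGACTCCT |
| RT-PCR primers | GRMZM2G097568-L | TGTTCAGGATCTAGCATGGACA |
| RT-PCR primers | GRMZM2G097568-R | GCAGGTAAAGCGGGTTGA |
| RT-PCR primers | GRMZM2G103152-L | TGTCTTTGCTGAAAAGGAGAACTTA |
| RT-PCR primers | GRMZM2G103152-R | ATGCTCAATGCCATCAATAACAGA |
| RT-PCR primers | GRMZM2G106613-L | CCCAGAAAAACACCAATGCTTC |
| RT-PCR primers | GRMZM2G106613-R | ATGGCACGCATCTTTGCT |
| RT-PCR primers | GRMZM2G130432-L | GCATTTAGGCGGCGTGT |
| RT-PCR primers | GRMZM2G130432-R | AGTTATCTGTGTCAGCACATTGATC |
| RT-PCR primers | GRMZM2G131321-L | AGCGAAGGTTACCCCAAAG |
| RT-PCR primers | GRMZM2G131321-R | GGAGGGTGCGTGAATAGG |
| RT-PCR primers | GRMZM2G133028-L | AGTTACTGCTATCATCGTTGTTCC |
| RT-PCR primers | GRMZM2G133028-R | ACCTGAACCAGTTAGAAAAGAGTG |
| RT-PCR primers | GRMZM2G153434-L | TAGAATCGGGAGGATGTGTTTTG |
| RT-PCR primers | GRMZM2G153434-R | CCAAGCAGCAACCAGTGA |
| RT-PCR primers | GRMZM2G177026-L | AGAATTGCAGCGTGTCACAG |
| RT-PCR primers | GRMZM2G177026-R | AGCTTCGAGTGGTGCTGTTT |
| RT-PCR primers | GRMZM2G306935-L | GTTTGTCGGTGGTTTATTTGTCAG |
| RT-PCR primers | GRMZM2G306935-R | CCTTTTGTGCCAGCAATGTG |
| RT-PCR primers | GRMZM2G408305-L | GTGGACATTGATGATGCCGATA |
| RT-PCR primers | GRMZM2G408305-R | AGCCTGACATCTTCATGGAAATAAC |
| RT-PCR primers | GRMZM2G416751-L | GACTGGACCTGGTCTGTG |
| RT-PCR primers | GRMZM2G416751-R | CCATTTGAAGCATCCTCAAGC |
| RT-PCR primers | GRMZM2G587327-L | CGACTCCTGGGTCTTCAC |
| RT-PCR primers | GRMZM2G587327-R | CCTTTGCCTGACGACGTT |
| RT-PCR primers | GRMZM5G820727-L | CTTGAGCGAAGAGCTTCAGAA |
| RT-PCR primers | GRMZM5G820727-R | CGTTTCATTGTTGTCTGTAACTTCC |
| RT-PCR primers | mir156-L | GCACACACACAACCTGTTCA |
| RT-PCR primers | mir156-R | CAGATGGGCTTGATGAGTGA |
| RT-PCR primers | Rsp31B-L | GGATTTGTTTGTCGCCAACT |
| RT-PCR primers | Rsp31B-R | AGCCCGTATAGCATCATTGC |
| RT-PCR primers | Ubiquitin-L | TAAGCTGCCGATGTGCCTGCG |
| RT-PCR primers | Ubiquitin-R | CTGAAAGACAGAACATAATGAGCACA |
| Recombinant cloning | Rgh3ΔUHM-1 | ACAAGATGGAAGGCGGCCATATGCGG |
| Recombinant cloning | Rgh3ΔUHM-2 | CCGCATATGGCCGCCTTCCATCTTGTTGATTTATCAGGGTAAAAGTG |
| Recombinant cloning | Rgh3ΔUHM-3 | CCCTGATAAATCAACAAGATGGAAGGCGGCC |
| Recombinant cloning | Rgh3ΔUHM-4 | TGATTTATCAGGGTAAAAGTG |
| Recombinant cloning | U2AF1-L | CACCATGGCTGAGCATCTTGCGTCCATCTTTG |
| Recombinant cloning | U2AF1-R | TTTCACCTGAGCTGCCTCCCGCTCGCGGTTC |
| Recombinant cloning | U2AF2-L | CACCATGTCCGAGTACGACGAGCGCTACC |
| Recombinant cloning | U2AF2-R | TTTAAGCCCGATTGCCAATATTGCATAGGG |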
Protein Analyses.
Rabbit polyclonal N-terminal RGH3 antibodies were raised against the synthetic peptide SAQEVLDKVAQETPNFGTE (Bio Synthesis). Total protein from seed tissue was extracted as described (69) except that the fresh tissue extracts were cleared with a single filtration step. For the in vitro transcription/translation, ORFs from described Rgh3 variants (19) were amplified with Phusion polymerase (Finnzymes) using primers containing an N-terminal 6×His tag and restriction sites for SgfI (Promega) and PmeI (New England Biolabs). PCR products were digested and ligated into pF3AWG vectors (Promega) and expressed using the TnT SP6 High-Yield Wheat Germ Protein Expression System (Promega) following the manufacturer’s instructions.