Yusuke Yagi (1), Makoto Tachikawa (2), Hisayo Noguchi (1), Soichirou Satoh (2), Junichi Obokata (2), Takahiro Nakamura (3)
1. Faculty of Agriculture, Kyushu University, Fukuoka, Japan.
2. Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto, Japan.
3. Faculty of Agriculture, Kyushu University, Fukuoka, Japan; Biotron Application Center, Kyushu University, Fukuoka, Japan.
Abstract
C-to-U RNA editing has been widely observed in organellar RNAs in terrestrial plants. Recent research has revealed the significance of a large, plant-specific family of pentatricopeptide repeat (PPR) proteins for RNA editing and other RNA processing events in plant mitochondria and chloroplasts. PPR protein is a sequence-specific RNA-binding protein that identifies specific C residues for editing. Discovery of the RNA recognition code for PPR motifs, including verification and prediction of the individual RNA editing site and its corresponding PPR protein, expanded our understanding of the molecular function of PPR proteins in plant organellar RNA editing. Using this knowledge and the co-expression database, we have identified two new PPR proteins that mediate chloroplast RNA editing. Further, computational target assignment using the PPR RNA recognition codes suggests a distinct, unknown mode-of-action, by which PPR proteins serve a function beyond site recognition in RNA editing.
RNA Editing in Plant Chloroplasts and Mitochondria
RNA metabolism plays a particularly important role in organelle gene expression, mediating a number of distinct steps: RNA cleavage, splicing, translation, RNA stability and RNA editing. During RNA editing, RNA sequences are modified by insertion, deletion and conversion. A substantial portion of plant organellar RNAs undergo C-to-U editing, in which a C in the genomic sequence is converted to U in the transcript. Arabidopsis contains 34 and 488 C-to-U editing sites in RNAs transcribed from the chloroplast and mitochondrial genomes, respectively. Recently, a strand-specific deep sequencing data set revealed nine additional sites with a low extent of editing in Arabidopsis chloroplasts.
How a specific C residue is recognized for editing was first revealed by a plastid transformation technique in tobacco. This technique showed that the RNA region required for editing is included within a 22-nucleotide segment comprising 16 nucleotides upstream and five nucleotides downstream of the psbL editing site. Later, in organello assays for mitochondria and in vitro editing systems for chloroplasts and mitochondria were used to identify the cis-regulatory elements required for RNA editing. RNA editing in chloroplasts and mitochondria requires cis-elements of almost identical lengths and positions. Protein factors, rather than RNA, mediate recognition of the cis-elements for RNA editing in chloroplasts.
PPR Protein as Site-Recognition Factor for Organellar RNA Editing
The first identified trans-acting factor was CRR4, which was isolated by a forward genetic screen. The CRR4 gene encodes a pentatricopeptide repeat (PPR) protein that is essential for editing of a site on the ndhD RNA. CRR4 specifically interacts in vitro with the cis-element in the region from -25 to +10 spanning the editing site. After the discovery of CRR4, various forward genetic screens identified additional trans-acting factors for other chloroplast RNA editing sites. An important observation is that all the trans-factors identified in these studies belong to the E and DYW subgroups of the PLS subfamily of the PPR protein family. PPR motif-containing proteins are characterized by the presence of the degenerate 35-amino acid repeat for which this family of proteins is named. PPR protein genes are highly expanded in terrestrial plants; for instance, there are 450 members in Arabidopsis, but fewer than 30 in other eukaryotes. Most PPR proteins are thought to be targeted to mitochondria or chloroplasts. PPR proteins are sequence-specific RNA-binding proteins that mediate all aspects of RNA processing in plant organelles, including RNA editing. The PPR family contains two subfamilies, P (classical P) and PLS. Editing-type PPR proteins belong to the plant-specific PLS subfamily. The typical PLS motif organization and the presence of a chloroplast-targeting signal enabled a systematic reverse genetic screen that identified six additional PPR proteins as trans-acting editing factors.
An alternative strategy involving the use of natural editing variation in Arabidopsis identified MEF1, the first trans-acting factor associated with mitochondrial editing. Establishment of a high-throughput method to screen editing variations facilitated the identification of other mutants for mitochondrial RNA editing. Most of the proteins involved in mitochondrial RNA editing are PLS-type PPR proteins, as found in chloroplasts.
RNA Recognition of PPR Protein
We and another group recently discovered the principle underlying RNA recognition for PPR proteins involved in RNA editing. Editing PPR proteins recognize the 5′ region upstream of the editable C residue: the last PPR motif of the editing PPR protein is located four nucleotides before the editable C (Fig. 1). One PPR motif corresponds to one nucleotide, and amino acid variation at three positions (residues 1, 4 and ii) confers RNA target specificity. Two of the three residues have also been shown by the other group to determine PPR RNA recognition specificity (6 and 1′; residues 4 and ii, respectively, in our numbering in ref. 32).
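The alignment geometry described above (one motif per nucleotide, last motif at position -4) can be sketched as a simple window extraction. This is a minimal illustration under stated assumptions: the function name, its parameters and the example sequence are hypothetical and are not taken from the study.

```python
def ppr_target_window(rna: str, edit_index: int, n_motifs: int, offset: int = 4) -> str:
    """Return the nucleotides contacted by the PPR motifs, read 5'->3'.

    rna        : RNA sequence written 5'->3'
    edit_index : 0-based index of the editable C in `rna`
    n_motifs   : number of PPR motifs (one motif contacts one nucleotide)
    offset     : distance from the last motif to the editable C (4 in the model)
    """
    if rna[edit_index] != "C":
        raise ValueError("edit_index does not point at a C")
    end = edit_index - offset + 1   # exclusive slice end; last motif sits at -offset
    start = end - n_motifs          # first motif sits n_motifs - 1 nucleotides upstream
    if start < 0:
        raise ValueError("sequence too short upstream of the editing site")
    return rna[start:end]


# Hypothetical sequence: the editable C is at index 12, so position -4 is index 8.
print(ppr_target_window("GGAUUACGUAAUCAG", edit_index=12, n_motifs=5))
```

With five motifs the window spans positions -8 to -4 relative to the editable C, so the three nucleotides immediately upstream of the C are left uncovered by the PPR tract.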
Figure 1. Proposed model for RNA editing site recognition by PPR protein. An RNA-editing PPR protein contains tandem repeats of the PPR motif (box) and auxiliary motif(s) (E and DYW motifs; diamond) at the C terminus. The PPR protein interacts with RNA in a 1-motif to 1-nucleotide ratio in a linear, contiguous fashion. The last PPR motif is located four nucleotides before the editable C residue. Amino acid variation at three particular positions (residues 1, 4 and ii) within the motif determines recognition of a specific RNA in a predictable manner.
During RNA recognition, PPR protein residue 4 appears to be most important, discriminating purine or pyrimidine groups (A/G or U/C). The next most important residue is the “ii” residue, which recognizes amino or keto groups (A/C or G/U). Residue 1 is less important, but provides an additional constraint to binding nucleotides. The RNA recognition code facilitated computational assignment of PPR proteins to their corresponding sites of interaction.
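The two discriminations named above are enough, in principle, to single out one nucleotide: the purine/pyrimidine class and the amino/keto class intersect in exactly one base. The sketch below illustrates only that set logic. It deliberately takes the two classes as inputs, because the mapping from specific amino acids at residues 4 and ii to these classes is not given in this text; the function name is hypothetical.

```python
# Nucleotide classes as stated in the text.
PURINES = {"A", "G"}
PYRIMIDINES = {"U", "C"}
AMINO = {"A", "C"}
KETO = {"G", "U"}


def decode(residue4_class: str, residue_ii_class: str) -> str:
    """Combine the class read by residue 4 with the class read by residue ii."""
    ring = {"purine": PURINES, "pyrimidine": PYRIMIDINES}[residue4_class]
    group = {"amino": AMINO, "keto": KETO}[residue_ii_class]
    (nucleotide,) = ring & group   # the intersection always holds exactly one base
    return nucleotide


for r4 in ("purine", "pyrimidine"):
    for rii in ("amino", "keto"):
        print(r4, rii, "->", decode(r4, rii))
```

Residue 1 is left out of this sketch; in the model it acts as an additional constraint rather than as a primary discriminator.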
Identification of New PPR Proteins for Chloroplast Editing
Various approaches have been used to identify PPR proteins involved in RNA editing. To identify additional PPR gene(s) for chloroplast RNA editing, we focused on the expression patterns of PPR transcripts. PPR gene expression has received little attention, because many PPR genes are assumed to be expressed ubiquitously and in low abundance. Based on the assumption that functionally related genes (e.g., those for chloroplast gene expression) will be co-expressed, co-regulated PPR genes were surveyed in the ATTED-II gene co-expression database (atted.jp/). The database was searched using characterized chloroplast editing PPR genes as queries; we found that several PPR genes converged into the same co-expression cluster. The co-expression profiles for editing PPR genes were further analyzed by hierarchical clustering of the 177 PLS subfamily PPR genes in Arabidopsis for which expression patterns have been registered in the database.
We focused on a single co-expression cluster (Fig. 2A) that is enriched in chloroplast RNA editing PPR genes such as CRR21 (At5g55740), CRR28 (At1g59720), LPA66 (At5g48910), OTP84 (At3g57430) and OTP85 (At2g02980). The in silico target assignments were performed for uncharacterized PLS gene products in the co-expression cluster against all 34 chloroplast editing sites. Significant p values (p < 10⁻³) were obtained for several candidate PPR genes. Among them, At3g14330 and At5g66520 exhibited significant p values against the editing sites of psbE (64109) and ndhB (95225), respectively, and were selected for further analyses (Figs. 2B and 3).
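The matching step of an in silico target assignment can be pictured as scoring a per-motif nucleotide probability matrix against the upstream window of each editing site. The sketch below is illustrative only: it uses a plain log-odds score with an assumed uniform background, and the matrix values and site names are hypothetical. It does not reproduce the statistical test or p value calculation of the cited method.

```python
import math

# Assumed uniform background frequency (an illustration, not a measured value).
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25}

# Hypothetical three-motif matrix; the middle motif has no decodable residues,
# so it is set to the background and contributes nothing to the score.
EXAMPLE_MATRIX = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    dict(BACKGROUND),
    {"A": 0.1, "C": 0.4, "G": 0.1, "U": 0.4},
]

# Hypothetical upstream windows, one nucleotide per motif.
EXAMPLE_SITES = {"site_x": "AGU", "site_y": "CGA"}


def log_odds_score(matrix, window, background=BACKGROUND):
    """Sum of log2(decoded frequency / background frequency) over all motifs."""
    if len(matrix) != len(window):
        raise ValueError("window length must equal the number of motifs")
    return sum(math.log2(col[nt] / background[nt]) for col, nt in zip(matrix, window))


def rank_sites(matrix, site_windows):
    """Return (site, score) pairs sorted from best to worst match."""
    scores = {name: log_odds_score(matrix, w) for name, w in site_windows.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for name, score in rank_sites(EXAMPLE_MATRIX, EXAMPLE_SITES):
    print(f"{name}: {score:.2f}")
```

A candidate protein would be assigned to a site only if its best score stood out clearly from its scores against the remaining sites, which is the role played by the p values in the analysis.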
Figure 2. Identification of CREF3 and CREF7 as site recognition factors for chloroplast RNA editing. (A) Hierarchical clustering of PPR genes based on the co-expression profile in ATTED-II (atted.jp/). Hierarchical clustering was constructed for 177 Arabidopsis PLS-type PPR genes using Ward's method. A co-expression cluster enriched for chloroplast RNA editing PPR protein genes is shown. The newly identified editing PPR genes, CREF3 (At3g14330) and CREF7 (At5g66520), are underlined. (B) In silico target assignment for the PPR proteins. The target assignment was conducted as described in reference 27, using CREF3 (At3g14330) and CREF7 (At5g66520) against 34 editing sites in Arabidopsis chloroplasts. The deduced nucleotide frequencies for CREF3 and CREF7 are shown in Figure 3. Each diamond represents the p value for the matching score against an editing site. The red diamond indicates the editing site for which a deficiency was observed in the mutant. (C) RNA-editing defects in the T-DNA insertion lines. Editing status of the 34 chloroplast editing sites was determined by RT-PCR and direct sequencing. Sequence chromatograms for psbE (64109) and ndhB (95225) in the T-DNA mutants Salk_077977 (At3g14330-deficient strain) and Salk_078415 (At5g66520-deficient strain) are shown. The open arrow indicates the edited U (T in cDNA); the closed arrow indicates the unedited C in the RNA.
Figure 3. PPR context and the decoded nucleotide frequencies. The nucleotide-specifying residues (NSRs; residues 1, 4 and “ii”) were extracted from the PPR motifs of CREF3, CREF7, RARE1 and AtECB2, according to Figure 1. The NSRs were converted into a probability matrix that indicates the decoded nucleotide frequency according to the PPR code. The probability matrix is also shown as a sequence logo (weblogo.threeplusone.com/create.cgi). The NSRs that do not match known PPR codes (shown as ***) were converted to the background frequency. The asterisks indicate any amino acid. The target RNA sequence (Seq) is shown: psbE (64109) for CREF3, ndhB (95225) for CREF7 and accD (57868) for RARE1 and AtECB2.
The candidate gene At3g14330 encodes a 710-amino acid protein belonging to the DYW subgroup and containing 10 PPR motifs (www.uniprot.org/). The T-DNA insertion line Salk_077977 was obtained as a putative At3g14330 disruptant from the ABRC stock center. PCR analysis confirmed insertion of the T-DNA in the coding region in this line, suggesting that the mutant allele is null. The mutant line and wild-type (Columbia) seeds were grown on half-MS medium containing 1% sucrose under continuous white light for 2 wk at 23°C. Then, the plants were grown in soil for an additional 2 wk. Under our growth conditions, the T-DNA line displayed no aberrant macroscopic phenotype in germination, growth, leaf color or seed setting ratio. RNA was extracted from the leaves of 4-wk-old plants. Editing status of the 34 chloroplast editing sites was determined by using gene-specific primers. The At3g14330-deficient line Salk_077977 showed an RNA editing defect only for psbE (64109), the site that received the highest prediction score, indicating that the At3g14330 gene product is likely the site recognition factor for psbE (64109) (Fig. 2C). Therefore, At3g14330 was designated Chloroplast RNA Editing Factor 3, CREF3.
Another candidate, At5g66520, encodes a 620-amino acid protein in the DYW subgroup carrying 10 PPR motifs. The T-DNA insertion line Salk_078415 was obtained. PCR analysis verified a T-DNA insertion in the coding region, suggesting that the mutant is a null allele of At5g66520. This mutant also displayed no aberrant visible phenotype. Experimental verification of the chloroplast editing status of Salk_078415 revealed a deficiency in RNA editing of ndhB (95225), with no changes in the other 33 sites. Thus, the At5g66520 gene was designated Chloroplast RNA Editing Factor 7, CREF7, for editing of ndhB (95225).
The in silico target assignments gave non-significant p values for the nine recently identified editing sites, suggesting that CREF3 and CREF7 are not involved in editing them, although experimental verification is required. In addition, further analysis will be required to characterize the physiological and molecular phenotypes of CREF3- or CREF7-deficient lines.
Several Editing PPR Proteins Might Not Mediate Site Recognition
To date, dozens of PPR proteins have been identified for chloroplast and mitochondrial RNA editing sites in Arabidopsis. The PPR protein CRR4 interacts specifically with the proximal region (-20 to +10 nt) of the ndhD editing site. This observation suggests a multi-component model for plastid RNA editing in which the PPR protein recognizes the 5′ cis-acting element, providing the scaffold for RNA editing and facilitating access of the putative RNA editing enzyme. The in silico assignment tests have shown that, in many cases, this model can be applied to other PPR proteins involved in RNA editing in chloroplasts and mitochondria, because the deduced nucleotide frequencies for the PPR tract are appropriate for specific interaction with the genetically identified editing site(s). This analysis further suggested that the PPR tract aligns with the 5′ region, with the last PPR motif fitted to the nucleotide at position -4 relative to the editable C residue. The putative RNA editing enzyme, in turn, might interact with the vicinity of the editing site (-3 to +10). The identity of the RNA editing enzyme remains unknown. The DYW motif is putatively responsible for the catalytic reaction, due to sequence similarities with cytidine deaminases and the strict phylogenetic correlation with RNA editing. This hypothesis is under debate. RNA deamination activity has not been observed for the DYW motif, and genetic analysis has indicated that the DYW motifs of CRR22 and CRR28 are dispensable in vivo. In contrast, the importance of the DYW motif has been shown for DYW1, identified as an RNA editing factor acting specifically on the plastid ndhD editing site recognized by CRR4. DYW1 consists of a DYW motif with no PPR motifs and mediates RNA editing via physical interaction with CRR4. This finding suggests that some DYW-containing PPR proteins may not be involved in site recognition, but act in RNA editing together with other canonical PPR protein(s) that provide the site recognition.
Indeed, the in silico target assignment gave contradictory results for several PPR proteins. For example, AtECB2 (also known as VAC1) and RARE1 were identified as editing PPR proteins targeting the same accD (57868) site in Arabidopsis chloroplasts. A RARE1-deficient strain has no visible physiological defects, while loss of AtECB2 causes aberrant profiles of various RNAs, including the loss of multiple editing events, and severe developmental defects. The in silico target assignment test showed that RARE1, but not AtECB2, is clearly assigned to the accD (57868) site (Fig. 4A). This analysis suggests RARE1 is indeed a site-recognition factor for accD (57868), whereas AtECB2 is required for accD editing but might not be required for site recognition. However, the ATTED-II co-expression profile indicated that AtECB2 and RARE1 are highly related. This may suggest that the two PPR proteins are cooperatively involved in accD editing, and that AtECB2 may interact with RARE1, as DYW1 does with CRR4. It is also possible that AtECB2 interacts with the RNA in an unusual alignment and/or orientation. Therefore, interaction of AtECB2 with the accD editing site was analyzed in various configurations. In the conventional configuration for site recognition during organellar RNA editing, the PPR protein in the N to C terminus orientation interacts with RNA in the 5′ to 3′ orientation. The last motif is located four nucleotides before the editable C residue, as shown by the significant p value at position -4 for RARE1 (Fig. 4B). In contrast to RARE1, AtECB2 yielded no significant p values at any position (-20 to +20 in Fig. 4B). The in silico test was extended to PPR-RNA alignment with the protein in the opposite orientation (Fig. 4C). Significance was not observed for either RARE1 or AtECB2 in this anomalous alignment. The in silico assignment tests suggest that the AtECB2 PPR context does not match, in any configuration tested, the RNA sequence in the vicinity of the accD editing site.
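The configuration test described above amounts to sliding the PPR tract along the RNA and scoring each placement of the last motif from -20 to +20, in both orientations. The following is a minimal sketch under stated assumptions: the scoring is a simple log-odds sum against a uniform background rather than the p value calculation used in the study, the matrix and sequence are hypothetical, and for the opposite orientation it is assumed that the preceding motifs extend 3′ of the last motif.

```python
import math


def score(columns, window):
    """Log-odds sum against an assumed uniform background of 0.25."""
    return sum(math.log2(col[nt] / 0.25) for col, nt in zip(columns, window))


def scan_positions(matrix, rna, edit_index, lo=-20, hi=20, reverse=False):
    """Score each placement of the last PPR motif at positions lo..hi.

    Position 0 is the editable C. With reverse=False the tract runs N->C along
    the RNA 5'->3' (Model A). With reverse=True the last motif keeps its
    position and the earlier motifs extend downstream (assumed Model B).
    """
    n = len(matrix)
    results = {}
    for pos in range(lo, hi + 1):
        last = edit_index + pos                 # index contacted by the last motif
        if reverse:
            start, cols = last, matrix[::-1]
        else:
            start, cols = last - n + 1, matrix
        if start < 0 or start + n > len(rna):
            continue                            # placement runs off the sequence
        results[pos] = score(cols, rna[start:start + n])
    return results


def _col(best):
    """Hypothetical motif strongly preferring one nucleotide."""
    return {nt: (0.7 if nt == best else 0.1) for nt in "ACGU"}


EXAMPLE_MATRIX = [_col("A"), _col("G"), _col("U")]
EXAMPLE_RNA = "CCCCCCAGUCCCCCCCCC"   # hypothetical; editable C at index 12
EXAMPLE_EDIT_INDEX = 12

forward = scan_positions(EXAMPLE_MATRIX, EXAMPLE_RNA, EXAMPLE_EDIT_INDEX)
opposite = scan_positions(EXAMPLE_MATRIX, EXAMPLE_RNA, EXAMPLE_EDIT_INDEX, reverse=True)
print("best forward position:", max(forward, key=forward.get))
print("best opposite-orientation score:", round(max(opposite.values()), 2))
```

In this toy case the forward scan peaks at position -4, the pattern reported for RARE1, while no placement in the opposite orientation scores above background. A protein that peaks nowhere in either scan would correspond to the pattern reported for AtECB2.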
Figure 4. In silico target assignment for RARE1 and AtECB2, required for editing of accD (57868). (A) Conventional in silico target assignment was performed for RARE1 and AtECB2 against 34 RNA editing sites in Arabidopsis chloroplasts. The editing site of accD (57868) is highlighted in red. The deduced nucleotide frequencies for RARE1 and AtECB2 are shown in Figure 3. (B) The target assignment test was performed in various configurations. Model A depicts the protein aligned in the N to C terminus orientation with the 5′ to 3′ nucleotide orientation, as representative of editing PPR proteins. Position 0 indicates that the last PPR motif is located at the editable C residue. The p values were estimated by fitting the last PPR motif into the -20th to +20th nucleotide around the editable C residues. (C) Model B, the protein was aligned to the nucleotide, but in the opposite orientation.
Similar contradictory observations have been made for several PPR proteins that are involved in RNA editing but contain only short PPR stretches, such as MEF8 (5 PPRs) and MEF8S (5 PPRs) in Arabidopsis and OGR1 in rice (6 PPRs). The short PPR tract may not be sufficient to discriminate specific C residues. Indeed, the in silico assignment test failed to find the genetically identified editing sites for these PPR molecules. Thus, these PPR proteins may mediate RNA editing as DYW1 does, by interacting with other PPR proteins, or via previously unexpected modes of action.
Perspective
Recent studies have added to our understanding of the molecular mechanism of plant organellar RNA editing, particularly the role of PPR proteins. Most PLS proteins may act as site-recognition factors, as verified by in vitro and in silico assignment tests. However, we described several PLS-type PPR proteins, such as AtECB2, that may have unusual roles in RNA editing. These molecular mechanisms should be further analyzed together with the newly identified non-PPR editing component MORF.
This study, combining co-expression profiling and in silico target assignment based on the PPR code, facilitated the identification of two PPR proteins for uncharacterized RNA editing sites in chloroplasts. This method can be applied to PPR proteins in mitochondrial RNA editing, deficiencies in which result in various physiological abnormalities, such as embryonic lethality and abnormal abiotic and biotic stress responses. Co-expression analysis suggests that the same nuclear signaling pathways coordinately regulate multiple organelle genes via PPR proteins. Indeed, co-expression of several PPR genes involved in mitochondrial RNA processing has been observed during Arabidopsis germination. Further analysis of co-expressed PPR genes could allow us to define the regulatory network of nuclear-cytoplasmic interaction in terrestrial plants.
RNA editing has received particular attention because it challenges the central dogma of molecular biology by changing genetic information at the transcript level. The raison d’être of RNA editing in plant organelles has been under debate, with proposals including a consequence of genetic antagonism between organellar and nuclear genomes, or correction of organellar genome mutations. Further study of PPR proteins will facilitate the understanding of this peculiar process and of the large expansion of this protein family in terrestrial plants.