
PPR-SMRs: ancient proteins with enigmatic functions.

Sheng Liu, Joanna Melonek, Laura M Boykin, Ian Small, Katharine A Howell.

Abstract

A small subset of the large pentatricopeptide repeat (PPR) protein family in higher plants contain a C-terminal small MutS-related (SMR) domain. Although few in number, they figure prominently in the chloroplast biogenesis and retrograde signaling literature due to their striking mutant phenotypes. In this review, we summarize current knowledge of PPR-SMR proteins focusing on Arabidopsis and maize proteomic and mutant studies. We also examine their occurrence in other organisms and have determined by phylogenetic analysis that, while they are limited to species that contain chloroplasts, their presence in algae and early branching land plant lineages indicates that the coupling of PPR motifs and an SMR domain into a single protein occurred early in the evolution of the Viridiplantae clade. In addition, we discuss their possible function and have examined conservation between SMR domains from Arabidopsis PPR proteins with those from other species that have been shown to possess endonucleolytic activity.

Keywords:  Arabidopsis thaliana; Zea mays; chloroplast; endonuclease; genomes uncoupled; mitochondria; pentatricopeptide repeat protein; plastid; small MutS-related domain

Year:  2013        PMID: 24004908      PMCID: PMC3858433          DOI: 10.4161/rna.26172

Source DB:  PubMed          Journal:  RNA Biol        ISSN: 1547-6286            Impact factor:   4.652


Introduction

The pentatricopeptide repeat (PPR) protein family was serendipitously discovered as a result of computational analysis of the then incomplete Arabidopsis thaliana genome sequence for gene products likely to be targeted to plastids and mitochondria. While subsequent analysis revealed that these proteins are ubiquitous in eukaryotes, they were found to be particularly prevalent in terrestrial plants (e.g., 450 members in Arabidopsis). Since their discovery, a plethora of genetic, molecular, and biochemical evidence has suggested that PPR proteins bind RNA in a highly specific manner and facilitate events such as cleavage, editing, splicing, turnover, and translation of their target organellar transcript(s). PPR proteins are defined by the presence of tandem repeats of degenerate 31–36 amino acid motifs and can be classified based on motif structure and the presence of additional C-terminal domains. The P subfamily consists of PPR proteins with orthodox 35 amino acid PPR (P) motifs, while the PLS subfamily includes PPR proteins with additional long (L) or short (S) motif variants and derives its name from the characteristic tandem arrays of P-L-S motif triplets. PLS PPR proteins are further classified, based on their C-terminal domain(s), into the E, E+, and DYW subgroups. In addition, while not yet formally recognized as subgroups, P-class PPR proteins can also be categorized by the presence of additional domains, such as the small MutS-related (SMR) domain. Searching the Arabidopsis genome reveals that eight proteins contain both PPR motifs and an SMR domain (Fig. 1). Despite the relatively small size of this subgroup, there has been sustained interest in this type of PPR protein since the revelation that genomes uncoupled 1 (GUN1) encodes a PPR protein with a C-terminal SMR domain.
GUN1 is a central regulator of plastid retrograde signaling, where the developmental and/or functional state of the plastid exerts control on the expression of nuclear genes encoding plastid-localized proteins, such as photosynthesis-associated nuclear genes (PhANGs). Despite this important role, we still do not understand the precise molecular mechanisms of GUN1 and other proteins with similar domain architecture and what specific role, if any, the SMR domain plays in their function. This review will focus on this small but important group of PPR proteins that contain an SMR domain by summarizing our current knowledge from studies performed in higher plants, examining their presence in other organisms and discussing the possible role of the SMR domain.

Figure 1. Proteins containing an SMR domain in the model plant Arabidopsis thaliana. A non-redundant set of 12 proteins was identified by searching the Universal Protein knowledgebase (UniProt; www.uniprot.org) for Arabidopsis proteins that contain the InterPro domain IPR002625 (Smr protein/MutS2 C-terminal domain). Proteins are denoted by their corresponding Arabidopsis Genome Identifier (AGI; ATXGXXXXX) and, if applicable, followed by their common name (e.g., GUN1). Protein domain structure is shown alongside each AGI to demonstrate presence and location of the pentatricopeptide repeat (PPR), small MutS-related (SMR), domain of unknown function (DUF) 1771, and MutS domains. The schematics of protein domain structure were created by combining TPRpred to predict PPR domains (those with P > 0.01 were excluded), InterProScan to identify other domains and DOG 1.0 for visualization of their respective positions.


The SMR Domain—What Is It?

MutS proteins are key enzymes involved in repair of mismatched DNA bases produced during biological processes such as DNA replication. The SMR domain was originally identified in the C-terminal region of the MutS2 protein from the cyanobacterium Synechocystis. MutS2 proteins suppress homologous recombination by endonucleolytic digestion of branched DNA structures formed early in this process, and the nuclease activity of these proteins has been specifically attributed to the SMR domain. Moreover, while not all organisms have MutS2 orthologs, proteins containing SMR domains are widespread in bacterial and eukaryotic species. In a recent review, Fukui and Kuramitsu introduced a classification system for proteins containing SMR domains. Subfamily 1 consists of MutS2 orthologs and is restricted to proteins from bacterial and plant species; subfamily 2 includes proteins with domains in addition to the SMR domain, which are usually found only in eukaryotes; and subfamily 3 comprises “stand-alone” SMR domain proteins from both prokaryotes and eukaryotes. In Arabidopsis, 12 proteins are found to contain an SMR domain (Fig. 1). As can be seen from the domain structure shown, AT1G65070 belongs to subfamily 1 (MutS2-like) while the remaining 11 proteins are classified as subfamily 2 SMR proteins. Of the 11 subfamily 2 SMR proteins, eight contain PPR motifs. Consistent with nomenclature used recently, we will refer to proteins with this domain architecture as PPR-SMR proteins (PPR-SMRs). When BLAST searches are performed using PPR-SMRs, other plant PPRs are identified with C-terminal domains that potentially represent a highly degenerate SMR domain (e.g., AT3G18110/EMB1270). However, their relationship to bona fide PPR-SMRs remains to be clarified and these will not be discussed further.
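The subfamily rules described above can be written as a small decision function. The following is a minimal sketch only: the function name and the domain labels ("SMR", "MutS", "PPR", "DUF1771") are illustrative assumptions, and the presence of a MutS domain is used as a simple stand-in for "MutS2 ortholog", which is a simplification of the published classification.

```python
def smr_subfamily(domains):
    """Assign an SMR domain-containing protein to subfamily 1, 2, or 3.

    `domains` is an iterable of domain labels for one protein.
    The labels are hypothetical names chosen for this sketch.
    """
    domains = set(domains)
    if "SMR" not in domains:
        raise ValueError("protein has no SMR domain")
    if "MutS" in domains:
        # MutS2-like architecture (assumption: MutS domain implies MutS2 ortholog)
        return 1
    if domains - {"SMR"}:
        # SMR domain combined with any other domain, e.g. PPR motifs
        return 2
    # stand-alone SMR domain
    return 3


# Illustrative architectures only
print(smr_subfamily(["MutS", "SMR"]))  # MutS2-like, as described for AT1G65070
print(smr_subfamily(["PPR", "SMR"]))   # PPR-SMR architecture, as described for GUN1
print(smr_subfamily(["SMR"]))          # stand-alone SMR domain
```

Under these rules every PPR-SMR protein falls into subfamily 2, in line with the classification given in the text.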

PPR-SMRs—What Do We Know So Far?

Characterization of PPR-SMRs has focused on higher plant models, such as Arabidopsis and maize. Data collected thus far is derived from a combination of proteomic and mutant analyses and is summarized in Table 1 for the Arabidopsis PPR-SMRs and their maize orthologs.

Table 1. Summary of current knowledge of PPR-SMR proteins in the dicot and monocot plant models, Arabidopsis and maize

Arabidopsis locus ID (name) | Arabidopsis subcellular localization | Arabidopsis mutant phenotype | Maize locus ID (name) | Maize subcellular localization | Maize mutant phenotype
AT2G31400 (GUN1) | Plastid; YFP (ref. 7) | Normal gross phenotype in dark or light growth conditions but shows defective de-etiolation response (refs. 27, 28); unable to repress PhANGs upon treatment with NF (ref. 27) or lincomycin (refs. 7, 29) | GRMZM2G432850 (N/A) | Possibly plastid (ref. 57) | ?
AT1G74850 (pTAC2) | Plastid; MS: chloroplast (refs. 21, 58, 59), stromal megadalton complex (ref. 16), nucleoids (ref. 17), TAC (ref. 19) | Seedling lethal, requires exogenous carbon source for further growth, cannot produce seeds, PEP promoter usage affected (ref. 19) | GRMZM2G122116 (Zm-pTAC2) | Plastid; MS: proplastid (ref. 18), nucleoids (ref. 18) | Very pale yellow green (PML: Zm-ptac2-1); ivory (PML: Zm-ptac2-2, -3)
AT4G16390 (SVR7) | Plastid; GFP (ref. 6); MS: chloroplast (refs. 21, 59), stromal megadalton complex (ref. 16), nucleoids (ref. 17) | Slower growth with reduced chlorophyll concentration (refs. 25, 26), plastid rRNA processing defects (ref. 25), reduced translation of plastid ATP synthase subunits (ref. 26) | GRMZM2G128665 (ATP4) | Plastid; MS: proplastid (ref. 18), nucleoids (ref. 18) | Pale green, reduced translation of ATP synthase subunits (ref. 12) and stability of dicistronic rpl16-rpl14 RNAs (ref. 36)
AT5G46580 (N/A) | Plastid; MS: chloroplast (ref. 59), envelope (ref. 21), stromal megadalton complex (ref. 16), nucleoids (ref. 17) | ? | GRMZM2G438524 (PPR53) | Plastid; MS: proplastid (ref. 18), nucleoids (ref. 18) | Very pale yellow green-virescent (PML: ppr53-1); ivory-virescent (PML: ppr53-2, -3)
AT2G17033 (N/A) | Plastid (ref. 60) | ? | GRMZM2G164202 (PPR-SMR4) | Plastid; MS: nucleoids (ref. 18) | WT-like (PML: ppr-smr4-1)
AT1G79490 (EMB2217) | Mitochondrion; GFP (ref. 13) | Embryo defective, developmental arrest occurs at globular stage (ref. 15) | GRMZM2G345667 (N/A) | Mitochondrion (ref. 57) | ?
AT1G74750 (N/A) | Mitochondrion (ref. 14) | ? | GRMZM2G475897 (N/A) | Mitochondrion (ref. 57) | ?
AT1G18900 (N/A) | Mitochondrion (ref. 14) | ? | GRMZM2G475897 (N/A; same entry as the row above) | Mitochondrion (ref. 57) | ?

Localization data was derived from the SUBA3 database for Arabidopsis proteins and by manual curation of proteomic data sets for maize proteins. Subcellular localizations in italics are based on predictions while those in bold are based on experimental evidence (GFP/YFP, GFP/YFP fusion studies; MS, identified by mass spectrometry of protein samples). In some cases proteins were identified from a sample corresponding to a specific suborganellar location as specified (e.g., stroma, envelope, nucleoids, TAC). Mutant phenotype descriptions are based on manual curation of the literature and, where available, seedling phenotype descriptions from the maize photosynthetic mutant library (PML; http://pml.uoregon.edu/photosyntheticml.html). For PML descriptions, note that these mutants have not yet been analyzed in detail and the effect of the mutation on the expression of the gene still needs to be determined before definitive phenotypes are assigned.


PPR-SMRs in higher plants are localized to both mitochondria and plastids

Of the eight Arabidopsis PPR-SMRs, three have either been found (AT1G79490) or are predicted (AT1G74750 and AT1G18900) to be localized to mitochondria. The corresponding maize orthologs are also predicted to be localized to the mitochondria. For the confirmed mitochondrial-localized PPR-SMR AT1G79490, it is also known that mutant lines have an embryo lethal phenotype (EMB2217), with developmental arrest occurring at the globular stage. Moreover, the corresponding gene has been reported to have transient, germination-specific expression at early stages of Arabidopsis seed germination, consistent with an important role in early plant development. However, this is the extent of the information available from the current literature for Arabidopsis mitochondrial PPR-SMRs. The five remaining PPR-SMRs all have experimental evidence indicating that they localize to the other endosymbiotically derived organelle, the plastid (Table 1). Extensive proteomic data are available for three of these (pTAC2, SVR7, and AT5G46580) and for the corresponding maize orthologs (Zm-pTAC2, ATP4, and PPR53). Specifically, these proteins were found in proteomic studies of Arabidopsis chloroplast stromal megadalton complexes as well as Arabidopsis and maize plastid nucleoids. In addition, pTAC2 and an ortholog of AT5G46580 were found in preparations of plastid transcriptionally active chromosomes (pTAC) from Arabidopsis and spinach, respectively. A minor fraction of AT5G46580 was also found in a plastid envelope-enriched sample. This places the AT5G46580 protein in three different compartments of the plastid: the thylakoid membranes (which nucleoids/TAC are associated with), the stroma, and the envelope. While it may be that this protein localizes to all of these plastid sub-compartments, it is possible that this protein may only be loosely associated with the nucleoids and easily removed during preparation of various sub-compartmental plastid fractions. 
Furthermore, while no experimental data exists for the Arabidopsis protein, the maize ortholog of AT2G17033, PPR-SMR4, has also recently been detected in nucleoid-enriched fractions. Finally, despite its central role in plastid retrograde signaling pathways, proteomic data for GUN1 is absent and its plastid localization is based on microscopic analysis of transiently expressed fluorescent protein fusions. GUN1-yellow fluorescent protein (YFP) was shown to accumulate in chloroplasts in a punctate pattern overlapping the patterns of pTAC2-cyan fluorescent protein, indicating co-localization of GUN1 and pTAC2 in actively transcribed sites of plastid nucleoids. However, given that more recent studies suggest that processes such as mRNA cleavage, splicing, and editing, as well as ribosome assembly, take place in association with the nucleoids, it is not clear whether GUN1 is specifically bound to plastid DNA and/or RNA in vivo. Apart from the lack of GUN1 protein detected, the fact that other plastid PPR-SMRs are routinely detected in plastid proteomic studies indicates they are more abundant than other PPR proteins, which are generally considered to be low abundance proteins. For example, SVR7 was the only PPR protein (out of 450) that could be reliably detected in whole leaf protein samples, where its abundance was found to decrease with increasing leaf age. In addition, recent proteomic analyses have allowed relative quantitation of protein abundance to be estimated using spectral counts derived from mass spectrometry (MS) analysis. This approach is based on the observation that the number of MS/MS acquisitions of peptides coming from a protein shows a positive correlation to the relative concentration of the protein in the sample. These available data sets have allowed us to assess PPR-SMR protein abundance relative to other PPR proteins as summarized in Table 2.
In general, in both Arabidopsis and maize, PPR-SMRs dominate the protein mass that can be attributed to PPR proteins, contributing 26–53% of the total PPR protein mass in samples ranging from total leaf protein extracts to purified nucleoids. While the exact reason for the high abundance of these proteins remains to be determined, we speculate that this could reflect binding to multiple targets (e.g., ATP4, see below) and/or that their targets are highly abundant (e.g., rRNA, see SVR7 below). More specifically, in maize, Zm-pTAC2 was found to be the most abundant PPR-SMR in all samples analyzed (13–34% of total PPR protein mass), with PPR53 the next most abundant PPR-SMR (6.5–21%). In Arabidopsis, pTAC2 was also the most abundant PPR-SMR in nucleoids (34%) but in other samples (total leaf and high molecular weight stromal fractions) SVR7 was consistently found to be the dominant PPR-SMR protein present (24–34%). Interestingly, the maize ortholog of SVR7, ATP4, while detected in all samples, was always found at lower levels (1–3% of total PPR protein mass) indicating different expression levels of these orthologs in the monocot and dicot lineages. It remains to be determined whether this difference underlies their reported functional divergence (see below).

Table 2. The relative abundance of PPR-SMR proteins in different Arabidopsis and maize protein samples based on normalized adjusted spectral counts as an estimate of protein abundance

Reference | Protein fraction description | No. proteins identified | No. PPR proteins identified | No. PPR-SMR proteins identified | % of total protein mass attributed to PPR proteins | % of total PPR protein mass attributed to PPR-SMR proteins
24 | Total leaf protein (Arabidopsis, Col-0, rosette leaves) | 3424 | 17 | 3 | 0.05 | 48% (32% SVR7, 14% pTAC2, 2% AT5G46580)
17 | Total leaf protein (Arabidopsis, no information) | 815 | 9 | 3 | 0.02 | 50% (34% SVR7, 14% AT5G46580, 2% pTAC2)
16 | Stromal fraction: low molecular weight (Arabidopsis, Col-0, rosette leaves, 55 d old) | 398 | 0 | 0 | 0 | 0
16 | Stromal fraction: high molecular weight A (Arabidopsis, Col-0, rosette leaves, 55 d old) | 293 | 9 | 3 | 0.46 | 33% (24% SVR7, 4.5% pTAC2, 4.5% AT5G46580)
16 | Stromal fraction: high molecular weight B (Arabidopsis, Col-0, rosette leaves, 55 d old) | 230 | 6 | 3 | 0.47 | 53% (28% SVR7, 19% pTAC2, 6% AT5G46580)
17 | Nucleoids (Arabidopsis, Col-0, young seedlings) | 1026 | 26 | 3 | 1.04 | 47% (34% pTAC2, 12% AT5G46580, 1% SVR7)
17 | Proplastids (maize, B73, third leaf blade of 8−9 d old seedlings) | 2242 | 32 | 3 | 0.67 | 48% (24% Zm-pTAC2, 21% PPR53, 3% ATP4)
18 | Proplastids (maize, third leaf blade of 8−9 d old seedlings) | 1717 | 17 | 3 | 0.41 | 53% (34% Zm-pTAC2, 17% PPR53, 2% ATP4)
23 | Chloroplasts (maize, WT-T43, third leaf blade of 12−14 d old seedlings) | 1428 | 5 | 0 | 0.002 | 0
18 | Nucleoids, average from base-tip-young samples (maize) | 1092 | 63 | 4 | 4.65 | 29% (16% Zm-pTAC2, 11% PPR53, 1.5% ATP4, 0.5% PPR-SMR4)
18 | Nucleoids, leaf base (maize, third leaf blade of 8−9 d old seedlings) | 678 | 46 | 4 | 4.89 | 27% (13% Zm-pTAC2, 12% PPR53, 1% ATP4, 1% PPR-SMR4)
18 | Nucleoids, leaf tip (maize, third leaf blade of 8−9 d old seedlings) | 710 | 35 | 3 | 2.68 | 26% (18% Zm-pTAC2, 6.5% PPR53, 1.5% ATP4)
18 | Nucleoids, young leaves (maize, leaf blades of 7- to 8-d-old seedlings) | 827 | 55 | 4 | 6.38 | 32% (18% Zm-pTAC2, 12% PPR53, 1.5% ATP4, 0.5% PPR-SMR4)

For quantitation of protein mass, each protein accession is scored for total MS/MS spectral counts (SPC), unique SPC (uniquely matching to an accession), and adjusted SPC (adjSPC). AdjSPC is the sum of unique SPCs and SPCs from shared peptides across accessions with SPC distributed in proportion to their unique SPC. The normalized adjSPC (NadjSPC) for each protein is calculated through division of adjSPC by the sum of all adjSPC values for the proteins from the sample (e.g., per gel lane or protein extract). Thus, NadjSPC provides a relative protein abundance measure by mass. For example, a protein with NadjSPC = 0.01 contributes approximately 1% of the protein mass of the analyzed sample. NadjSPC values were obtained from the publications indicated and used to calculate the relative abundance of PPR and PPR-SMR proteins.
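The quantitation described above can be illustrated with a short worked example. This is a minimal sketch under stated assumptions, not the pipeline used in the cited studies: the function names, the accession labels "A", "B", and "C", and all counts are made up for illustration, and the equal split applied when no member of a shared group has unique spectral counts is an assumption not specified in the text.

```python
def adjusted_spc(unique_spc, shared_groups):
    """Compute adjSPC: unique SPC plus a proportional share of shared SPC.

    unique_spc    -- dict mapping accession -> unique spectral counts
    shared_groups -- list of (accessions, spc) pairs for shared peptides
    """
    adj = {acc: float(n) for acc, n in unique_spc.items()}
    for accessions, spc in shared_groups:
        total_unique = sum(unique_spc[a] for a in accessions)
        for a in accessions:
            if total_unique > 0:
                share = unique_spc[a] / total_unique
            else:
                # Assumption: split equally when no unique counts exist
                share = 1 / len(accessions)
            adj[a] += spc * share
    return adj


def normalized_adj_spc(adj):
    """Compute NadjSPC: each adjSPC divided by the sample total."""
    total = sum(adj.values())
    return {acc: value / total for acc, value in adj.items()}


def ppr_smr_share(nadj, ppr, ppr_smr):
    """Fraction of the PPR protein mass attributed to PPR-SMR proteins."""
    ppr_mass = sum(nadj[a] for a in ppr)
    return sum(nadj[a] for a in ppr_smr) / ppr_mass


# Toy sample: A and B are PPR proteins (A is a PPR-SMR), C is unrelated.
unique = {"A": 30, "B": 10, "C": 60}
shared = [(("A", "B"), 8)]  # 8 spectra from peptides shared by A and B

adj = adjusted_spc(unique, shared)
nadj = normalized_adj_spc(adj)
print(adj)
print(nadj)
print(ppr_smr_share(nadj, ppr={"A", "B"}, ppr_smr={"A"}))
```

In this toy sample the shared spectra are split 3:1 between A and B, following their unique counts, and the last value corresponds to the quantity reported in the final column of Table 2.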


PPR-SMR mutant analysis reveals diverse phenotypes and putative targets

Genetic approaches show that, despite their similarity in protein architecture, the gross and molecular mutant phenotypes for plastid PPR-SMRs differ dramatically. For example, at the level of plant vitality and growth, Arabidopsis mutant phenotypes range from seedling lethal (ptac2) to moderately slower growth and paler leaves (svr7) to a normal, wild-type-like phenotype (gun1) under normal growth conditions. Similarly, this is the case for maize PPR-SMR protein orthologs with seedling phenotypes also ranging from wild-type-like (ppr-smr4) to very pale yellow-green (Zm-ptac2 and ppr53; Table 1). “Genomes uncoupled” (GUN) refers to the mutant phenotype where nuclear and plastid gene expression is uncoupled. Twenty years ago, gun mutants were identified from a mutagenized collection of plants containing the GUS reporter gene driven by the promoter of a gene encoding a light harvesting complex protein, LHCB1.2. Mutants impaired in plastid-to-nuclear signaling were identified by screening seedlings in the presence of the carotenoid biosynthesis inhibitor, norflurazon (NF). The initial publication from this screen identified three gun mutants (gun1, gun2, gun3), in which LHCB1.2 expression was not repressed after NF treatment, compared with the control line. Since then, these and other gun mutants have been characterized, but it was not until 2007 that GUN1 was found to be a plastid-localized PPR-SMR protein. As well as the classical “genomes uncoupled” phenotype, characterized by the inability to repress PhANG gene expression when plastid function is inhibited, gun1 mutants are also retarded in their ability to de-etiolate, indicating that GUN1 plays a role in the transition from heterotrophic to photoautotrophic growth. Moreover, gun1 is unique among the gun mutants in that impaired repression of PhANGs occurs when the seedlings are subjected to treatment with either NF or plastid translation inhibitors, such as lincomycin.
This indicates that GUN1 is required for a retrograde signaling pathway involving plastid gene expression as well as another pathway involving carotenoid biosynthesis. For detailed information and further discussions on GUN1 and plastid retrograde signaling, we direct the reader to recent reviews in this area. PTAC2 was identified as one of 18 novel components of plastid transcriptionally active chromosomes (pTACs). The ptac2 mutant is only viable when an exogenous carbon source is available and, when this is provided, it develops yellow cotyledons and pale green primary leaves, but is unable to proceed to reproductive growth. Examination of the ultrastructure of the plastids in the ptac2 mutant indicates that plastid development is severely impaired. Analysis of transcript abundance of plastome-encoded genes suggests an involvement of pTAC2 in plastid-encoded-polymerase (PEP)-dependent transcription and processing of chloroplast RNAs as the ptac2 mutant plants showed a strongly reduced accumulation of transcripts generated by PEP. The svr7 mutant was identified during a screen for suppressors of var2 variegation. VAR2 encodes a plastid protease (FtsH), and in its absence, leaves develop a characteristic variegated pattern, including white sectors where chloroplasts fail to develop. However, the svr7/var2 double mutant lacks these white sectors. Processing of 23S, 16S, and 4.5S rRNA is perturbed in svr7. In addition, a specific reduction in the accumulation of the ATP synthase subunits A, B, E, and F and reduced ribosome association of atpB/E and rbcL mRNAs in the svr7 mutant has also been observed, indicating that SVR7 is involved in translational activation of these transcripts. Given its similarity to GUN1, the authors also investigated if the svr7 mutant displays a “gun” phenotype by testing PhANG responses upon treatment with NF.
These experiments indicated that the svr7 mutant is, like wild-type, able to repress PhANG expression upon inhibition of chloroplast function and, thus, does not display a “gun” phenotype. ATP4, the maize ortholog of SVR7, has also been characterized. RNA co-immunoprecipitation assays identified the dicistronic plastid atpB/E mRNA as a ligand for ATP4 in vivo. As for the svr7 mutant, polysome analysis indicates that translation of the atpB/E transcript is perturbed in the atp4 mutant. However, atp4 also shows reduced translation of the atpA transcript and exhibits a more extreme phenotype compared with svr7 with apparent loss of the plastid ATP synthase complex. Also, in contrast to svr7, the accumulation of processed atpF and psaJ transcripts and the stabilization of dicistronic rpl16-rpl14 RNAs is affected in the atp4 mutant. Thus, the phenotypes of atp4 and svr7 mutants suggest that the functions of these orthologs are not strictly conserved. Furthermore, while over-accumulation of plastid rRNA precursors is observed in the atp4 mutant, as seen for svr7, the authors note that these differences are likely to be secondary as they are not specific to atp4 and are also observed in other mutants impaired in plastid gene expression and/or ATP synthase activity.

PPR-SMRs—Which Organisms Have Them?

All of the studies undertaken to date that have identified PPR-SMR proteins or investigated their function have done so using higher plant species and have focused on single Arabidopsis or maize proteins, and there is little information on the evolutionary relationships between PPR-SMR proteins within or across species. What has not been previously examined is the extent to which this protein architecture, where PPR and SMR domains are coupled into a single protein, is present in other organisms. The growing number of whole genome sequences available for organisms representing diverse lineages provides an opportunity to examine the presence of PPR-SMR proteins in a wide range of organisms and to determine their origins and diversification. PPR-SMR protein sequences were collated in two ways: by searching for proteins containing both PPR and SMR domains in UniProt using the InterPro identifiers IPR002625 and IPR002885, and by BLAST using the SMR domains of the Arabidopsis members of this PPR subgroup. Sequences obtained were manually curated so that major clades were represented, where possible, by organisms for which complete genome sequences were available, and truncated and redundant sequences were removed. It is already known that proteins containing SMR domains are found in both prokaryotic and eukaryotic organisms while PPR motifs are confined to eukaryotes. Thus, it is not surprising that PPR-SMR proteins are essentially only found in eukaryotic organisms and, interestingly, are largely confined to the Viridiplantae clade. One major exception is the PPR-SMR sequences found in two strains of Legionella longbeachae. However, given the paucity of PPR proteins encoded in other bacterial species, it is likely that these sequences are remnants of a horizontal gene transfer event, as has been previously suggested to explain PPR genes identified in an isolated number of bacterial species.
PPR-SMR proteins were also found in heterokont species (brown algae, diatoms). Until recently, heterokont chloroplasts were thought to be derived from the secondary endosymbiosis of an ancestral red alga by a eukaryotic host (the “chromalveolate hypothesis”). However, we were unable to find PPR-SMR sequences in red algal genomes (Chondrus crispus and Cyanidioschyzon merolae). This suggests that the PPR-SMR proteins found in heterokonts may be derived from an ancestral endosymbiont from the green algal lineage, supporting the more recent hypothesis that endosymbiosis of a green alga into the ancestral host cell preceded the engulfment of a red alga. Our Bayesian phylogenetic analysis (Fig. 2) also reveals that orthologs of all Arabidopsis proteins are present in species representing the major angiosperm clades, including both dicots and monocots. However, the putative mitochondrial PPR-SMRs AT1G18900 and AT1G74750 are represented by a single ortholog in most other flowering plants, with Arabidopsis lyrata the only exception, indicating that a recent gene duplication event accounts for the extra protein present in Arabidopsis species. Five PPR-SMRs were identified in the lycophyte and bryophyte models, Selaginella moellendorffii and Physcomitrella patens, respectively. Homologs of GUN1, pTAC2, SVR7/AT5G46580, EMB2217, and AT1G18900/AT1G74750 were found in Selaginella while homologs of GUN1, pTAC2, AT2G17033, and AT1G18900/AT1G74750 were found in Physcomitrella. This suggests that the SVR7/AT5G46580 and EMB2217 clades arose when tracheophytes evolved. However, the discovery of PPR-SMR proteins in chlorophytes (Micromonas, Chlorella, and Ostreococcus) suggests that this type of PPR protein emerged early in the evolution of the PPR protein family in chloroplast-containing lineages.
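The presence/absence reasoning for the lycophyte and bryophyte models can be made explicit with set operations. The sketch below only restates the clade lists given in the text; the variable names are illustrative, and absence from a genome is treated as absence of the clade, which is an assumption because it could also reflect gene loss or incomplete annotation.

```python
# PPR-SMR clades with a homolog reported in each species (from the text)
selaginella = {
    "GUN1", "pTAC2", "SVR7/AT5G46580", "EMB2217", "AT1G18900/AT1G74750",
}
physcomitrella = {
    "GUN1", "pTAC2", "AT2G17033", "AT1G18900/AT1G74750",
}

# Clades found in both the lycophyte and the bryophyte
shared = selaginella & physcomitrella

# Clades found in the lycophyte (a tracheophyte) but not in the bryophyte
lycophyte_only = selaginella - physcomitrella

# Clades found in the bryophyte but not in the lycophyte
bryophyte_only = physcomitrella - selaginella

print(sorted(shared))
print(sorted(lycophyte_only))
print(sorted(bryophyte_only))
```

The lycophyte-only set contains the SVR7/AT5G46580 and EMB2217 clades, which is the observation behind the suggestion that these clades arose with the tracheophytes.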

Figure 2. Bayesian phylogenetic tree of PPR-SMR protein sequences from a range of different species. Sequences of PPR-SMR proteins were obtained from BLAST searches and InterPro domain searches (IPR002625 and IPR002885) and aligned using MUSCLE. A phylogenetic tree was constructed using MrBayes version 3.2.1 which employs Markov Chain Monte Carlo (MCMC) sampling to approximate the posterior probabilities of phylogenies (shown above the branches). MrBayes 3.2.1 was run in parallel on the Fornax supercomputer (located at iVEC@UWA) utilizing the BEAGLE library with a mixed model of molecular evolution (determined using jModelTest), utilizing 12 chains for 50 million generations and trees sampled every 1000 generations. All runs reached a plateau in likelihood score, which was indicated by the standard deviation of split frequencies (0.0015), and the potential scale reduction factor was close to one, indicating the MCMC chains converged. Sequences are color shaded based on their lineage as indicated.


The SMR Domain of PPR-SMRs—What Is Its Function?

The function of the SMR domain in PPR-SMR proteins has not yet been comprehensively explored. Currently, the only examination of the specific role of the SMR domain that has been published has come from the characterization of GUN1, which reported DNA-binding activity of the SMR domain, using a non-specific substrate (calf thymus DNA). However, studies of SMR domain-containing proteins in other organisms have focused on its specific role as a nuclease and evidence now exists for endonucleolytic activity in members representing all three subfamilies of SMR-domain-containing proteins. Functional characterization of the C-terminal domain of the human BCL3 binding protein (a subfamily 2 SMR domain) provided the first evidence for endonuclease activity of the SMR domain. The recombinant domain was found to non-specifically incise a supercoiled plasmid DNA to generate an open circular form of the plasmid, demonstrating the nicking endonuclease activity of the protein. A specific role for the SMR domain of this protein in binding DNA was later demonstrated. Nuclease and DNA-binding activity was also confirmed for the subfamily 1 SMR domain of the MutS2 protein from Thermus thermophilus and for a subfamily 3 “stand-alone” SMR domain-containing protein, YdaL, from Escherichia coli. Interestingly, the Leishmania donovani mRNA cycling sequence-binding protein (LdCSBP) containing a CCCH Zn-finger RNA-binding domain and a subfamily 2 SMR domain has been reported to possess RNA endonuclease activity. The SMR domain of LdCSBP alone exhibits both DNA and RNA endonuclease activity, but the full-length protein shows only sequence-specific RNA cleavage activity. Given these reported activities of SMR domain-containing proteins, it is tempting to speculate on possible functions of PPR-SMR proteins.
One possibility would be that PPR-SMRs are factors with a dual function in both DNA and RNA metabolism, whereby the PPR motifs confer RNA-binding activity while the SMR domain confers DNA-binding activity. This would be consistent with a role in transcription (e.g., pTAC2). Alternatively, by analogy to LdCSBP, the observation that a protein containing RNA-binding and SMR domains can act as an RNA endonuclease raises the question of whether PPR-SMR proteins can act as sequence-specific RNA endonucleases, where the PPR motifs confer RNA sequence specificity and the SMR domain confers endonuclease activity. This possibility would be consistent with a role in mRNA processing (e.g., SVR7 and ATP4). While these proposed functions require rigorous experimental verification, given that endonucleolytic activity has been reported for four SMR domain-containing proteins, a comparison of the SMR domains from these proteins with those from Arabidopsis PPR-SMRs was undertaken to determine if conserved residues are present (Fig. 3). From an examination of SMR domains belonging to the different SMR subfamilies, conserved motifs specific to each subfamily have been identified. Subfamilies 1 and 3 have a characteristic HGXG motif located centrally within the SMR domain. In contrast, subfamily 2 contains a TGXG motif at the same position. These motifs are perfectly conserved in those proteins known to have endonuclease activity (Fig. 3). For the Arabidopsis proteins, five of the eight SMR domains linked to PPR motifs have a TGXG motif at this position. The three SMR domains that diverge from this motif are in pTAC2, EMB2217, and AT2G17033. Subfamily 2 SMR domains are also characterized by an LDXH motif toward the N terminus of the SMR domain. This is conserved in the subfamily 2 SMR domains already verified to confer DNA and RNA endonuclease activity (human B3BP protein and the Leishmania CSBP protein). 
For the Arabidopsis proteins, only two of the eight SMR domains linked to PPR motifs have LDXH at this position, namely GUN1 and AT2G17033. Thus, the only Arabidopsis PPR-SMR containing both conserved motifs is GUN1.
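
The motif comparison described above can be sketched as a simple pattern search, where "X" in HGXG, TGXG, and LDXH stands for any residue. The following is a minimal illustration only: the function name `find_motifs` and the three toy sequences are invented placeholders, not real SMR domain sequences, and the search ignores alignment position, whereas the analysis in Fig. 3 compares motifs at equivalent positions in a MUSCLE alignment.

```python
import re

# "X" in the motif names means any amino acid, written as "." in the patterns.
CENTRAL_MOTIF = re.compile(r"[HT]G.G")  # HGXG (subfamilies 1 and 3) or TGXG (subfamily 2)
LDXH_MOTIF = re.compile(r"LD.H")        # N-terminal motif described for subfamily 2


def find_motifs(domain_seq):
    """Report positions (0-based) of the two motifs in one SMR domain sequence.

    Alignment gaps ("-") are removed first, so positions refer to the
    ungapped sequence. Hypothetical helper for illustration only.
    """
    seq = domain_seq.upper().replace("-", "")
    central = [(m.start(), m.group()) for m in CENTRAL_MOTIF.finditer(seq)]
    ldxh = [(m.start(), m.group()) for m in LDXH_MOTIF.finditer(seq)]
    return {"central": central, "ldxh": ldxh, "both": bool(central) and bool(ldxh)}


# Synthetic placeholder sequences (NOT real protein sequences).
toy_domains = {
    "toy_both": "MKLDLHGAAVEEALTGKGNHS",
    "toy_central_only": "MKAALHGAAVEEALTGKGNHS",
    "toy_neither": "MKAALAGAAVEEALSAKANAS",
}

for name, seq in toy_domains.items():
    print(name, find_motifs(seq))
```

Applied to real domain sequences, a search of this kind would only flag candidates; whether a match sits at the conserved position still has to be checked against the alignment.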

Figure 3. SMR domain alignment to assess amino acid sequence conservation. The SMR domains of the eight Arabidopsis PPR-SMR proteins were aligned with SMR domains from proteins that have been experimentally demonstrated to have endonucleolytic activity. The sequences are denoted by the SMR subfamily type (1_, 2_, or 3_) followed by the AGI (for Arabidopsis proteins) or alternative identifier (Tt_MutS2 – Thermus thermophilus MutS2 protein; Hs_B3BP – Homo sapiens BCL3 binding protein; Ld_CSBP – Leishmania donovani cycling sequence binding protein; Ec_YdaL – Escherichia coli YdaL protein), and the length of the SMR domain (e.g., /1–93). Alignment was performed using MUSCLE and visualized using Jalview (www.jalview.org) with ClustalX coloring by conservation. The positions of previously described conserved regions are indicated on the alignment: the LDXH motif present in subfamily 2 SMR domains and the centrally located HGXG/TGXG (subfamilies 1 and 3/subfamily 2) are bounded by the red and blue boxes, respectively.


Conclusions and Perspectives

PPR proteins that contain a C-terminal SMR domain represent a small but enigmatic subset of the PPR protein family whose members in higher plants show diverse protein abundance and varied putative functions in organellar RNA metabolism. Phylogenetic analysis indicates that PPR-SMRs are confined to green plants and algae but that they are ancient proteins that have modestly diversified during angiosperm evolution. Despite their ancient origins and important roles in extant plant species, we still lack knowledge of their specific roles in organelle gene expression: we do not know their exact RNA-binding sites or their mechanism of action, or whether the reported effects on RNA processing, accumulation, and translation are direct or indirect. While the exact identity of their target RNAs remains to be confirmed, recent breakthroughs in predicting binding sites and identifying binding site “footprints” should enable more rapid progress in this area. Also, given that PPR proteins have been suggested to be specific factors for effector proteins and several have been identified in protein complexes, identifying interaction partners may also shed light on their specific functions. The role of the SMR domain, which makes these proteins unique, remains elusive. While we have shown that amino acids in the SMR domain are conserved in some Arabidopsis PPR-SMR proteins compared with SMR domains with known endonucleolytic activity, future experiments that test the endonuclease activity of an SMR domain derived from a PPR protein, and that target specific amino acids to identify catalytic residues, would be invaluable. Finally, if RNA endonuclease activity can be confirmed for an SMR domain, coupling this with PPR motifs that have been designed to target a specific transcript could enable “engineering” of sequence-specific RNA endonucleases. 
However, furthering our basic understanding of this small but unique subset of the PPR protein family is essential for these potentially exciting applications to come to fruition.