Interactome Analysis and Docking Sites of MutS Homologs Reveal New Physiological Roles in Arabidopsis thaliana.

Mohamed Ragab AbdelGawwad1, Aida Marić2, Abdullah Ahmed Al-Ghamdi3, Ashraf A Hatamleh3.   

Abstract

Due to their sessile lifestyle, plants are constantly exposed to different stress stimuli. Stress comes in a variety of forms: factors such as radiation, free radicals, replication errors, polymerase slippage, and chemical mutagens result in genotoxic or cytotoxic damage. To cope with base oxidation and DNA replication stress, plants have developed many sophisticated mechanisms. One of them is the DNA mismatch repair (MMR) pathway, whose main component is the MutS homologue (MSH) protein family. The genome of Arabidopsis thaliana encodes at least seven homologues of the MSH family: AtMSH1, AtMSH2, AtMSH3, AtMSH4, AtMSH5, AtMSH6, and AtMSH7. Despite their importance, the functions of the AtMSH homologs have not been fully investigated. In this work, bioinformatics tools were used to obtain a better understanding of MSH-mediated DNA repair mechanisms in Arabidopsis thaliana and to understand the additional biological roles of AtMSH family members. In silico analysis, including phylogeny tracking, prediction of 3D structure, interactome analysis, and docking site prediction, suggested interactions with proteins that are important for the physiological development of A. thaliana. The MSH homologs interacted extensively with both TIL1 and TIL2 (DNA polymerase epsilon catalytic subunits), proteins involved in cell fate determination during plant embryogenesis and in flowering time repression. Additionally, interactions with the RECQ protein family (helicase enzymes) and with proteins of the nucleotide excision repair pathway were detected. Taken together, the results presented here confirm the important role of AtMSH proteins in mismatch repair and suggest important new physiological roles.

Keywords:  DNA mismatch repair; MSH; docking site; interactome

Year:  2019        PMID: 31288414      PMCID: PMC6651420          DOI: 10.3390/molecules24132493

Source DB:  PubMed          Journal:  Molecules        ISSN: 1420-3049            Impact factor:   4.411


1. Introduction

Living organisms are exposed to different damaging factors at all times. Therefore, maintenance of genome stability and integrity is one of the key roles of a cell. DNA damaging factors jeopardize the integrity of the DNA and can come from endogenous and exogenous sources [1]. Just as there are many damaging factors, organisms have developed diverse pathways to counter deleterious DNA damage and to retain genomic stability [2,3,4,5,6,7,8]. Plants are in special need of effective DNA repair machinery. They do not have a continuous germ line; instead, meristematic cells give rise to the gametes. These meristematic cells divide and potentially accumulate mutations during the lifetime of the plant. Without repair, these mutations will be passed on to the next generation. While DNA repair pathways are well understood in yeast and mammals, our knowledge in plants falls far behind. Therefore, there is a need for more research that will shed additional light on this interesting and powerful part of plant genomes. This work focuses on the mismatch repair (MMR) pathway in Arabidopsis thaliana. MMR is a post-replicative DNA repair mechanism. It is able to recognize non-Watson–Crick pairing as well as insertion/deletion loops (IDLs) [9]. Additionally, MMR has several other functions; it controls homologous recombination (HR) and most probably prevents synapse formation between divergent sequences [9]. Together with DNA polymerases and exonucleolytic proofreading, MMR maintains high fidelity of DNA replication, with only one mispair every 10^10 bases [10]. The mismatch distorts the DNA helix and is recognized by MutS or its eukaryotic counterparts, the MutS homolog (MSH) proteins. The MSH recruits downstream proteins that make a nick in the new strand. An exonuclease is then recruited to cut out the part of the DNA strand surrounding the mismatch. The gap is finally filled in by DNA polymerase and sealed with DNA ligase. 
Because proofreading exonucleases have limited capabilities, IDLs are mostly repaired by the MMR machinery. Besides its role in post-replicative point mutation repair, MMR plays an important dual role in homologous recombination. On the one hand, MMR recognizes mismatches in recombination intermediates; on the other hand, it is able to prevent recombination between diverged sequences and excessive exchange of their genetic material [11,12]. Extensive duplication events enabled the MSH proteins of MMR to specialize and recognize a variety of mismatches [13,14]. The MSH proteins are present throughout all kingdoms of life, suggesting conservation of MMR throughout evolution [15]. The versatility of MSH proteins in eukaryotes enables MMR to recognize a surprising number of different mutations. This work will focus on MutS homologue (MSH) proteins in plants. Arabidopsis thaliana encodes seven MSH homologs. AtMSH1 is thought to be the only MSH protein that is not located in the nucleus. It is dually targeted to mitochondria and chloroplasts and plays a very important role in maintaining the stability of their genomes [16]. AtMSH1 mutants show an increase in the reorganization of the mitochondrial genome, which results in decreased abiotic stress response, fluctuation in growth dynamics, extended flowering and maturity, and reduced heat tolerance and sterility [17,18]. The AtMSH2 protein is involved in the initiation of MMR and the recognition of mismatches. In addition, it is involved in the control of DNA HR and prevents recombination of divergent strands [19]. AtMSH2 is also part of the repair pathway for UV-induced DNA damage [20]. AtMSH3 is an MMR protein that works in conjunction with AtMSH2. Together, they form the MutSβ heterodimer, which recognizes damage and initiates repair of DNA loops of different sizes [14]. AtMSH4 is a somewhat different member of the plant MSH family, as it is not directly involved in MMR. 
Instead, MSH4 regulates meiotic recombination and keeps it at a normal level. AtMSH4 is only present in floral tissues, which is in line with its role in reproduction [21]. AtMSH5 works in association with AtMSH4. It is expressed in flower tissue and promotes proper segregation during chiasma formation in prophase I. AtMSH5 mutation leads to serious fertility reduction [22,23]. AtMSH6 forms a MutS alpha heterodimer with MSH2 that recognizes base–base mismatches and short (trinucleotide) IDLs. AtMSH7 is a plant-specific MSH protein. With AtMSH2, it forms a MutS gamma heterodimer that recognizes only T/G mismatch and initiates mismatch repair. Due to the complexity of the topic and limited amount of information available on AtMSH proteins, the aim of this work is to shed additional light on the function of AtMSHs, leaning on the predicted structure, detailed interactome analysis of the proteins, and docking prediction.

2. Results

2.1. Multiple Sequence Alignment

Clustal Omega was used to align the sequences of the seven AtMSH proteins; the results are shown in Supplementary Materials Figure S1. Residues were colored based on their physicochemical properties (small and hydrophobic residues are in red; acidic in blue; basic in magenta; and hydroxyl, sulfhydryl, and amine in green). These results are quantified and available in the form of a percent identity matrix in Table 1. Multiple sequence alignment (MSA) showed the highest identity between MSH6 and MSH7. This is in line with previous research that suggested MSH7 diverged from MSH6 [14]. The second highest scoring pair was MSH2–MSH3, which form a dimer. The highest divergence was noticed for the MSH1 protein. This is in line with the expected results, since MSH1 is a mitochondrial protein.
Table 1

Percent identity matrix showing the similarities among AtMSH proteins.

Protein | MSH1   | MSH4   | MSH5   | MSH6   | MSH7   | MSH2   | MSH3
MSH1    | 100.00 | 19.08  | 18.03  | 22.86  | 21.49  | 19.09  | 20.36
MSH4    | 19.08  | 100.00 | 22.51  | 24.53  | 25.29  | 24.56  | 22.88
MSH5    | 18.03  | 22.51  | 100.00 | 23.60  | 22.33  | 22.93  | 24.16
MSH6    | 22.86  | 24.53  | 23.60  | 100.00 | 33.30  | 23.95  | 26.30
MSH7    | 21.49  | 25.29  | 22.33  | 33.30  | 100.00 | 25.37  | 25.56
MSH2    | 19.09  | 24.56  | 22.93  | 23.95  | 25.37  | 100.00 | 27.93
MSH3    | 20.36  | 22.88  | 24.16  | 26.30  | 25.56  | 27.93  | 100.00
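
The ranking described above can be reproduced directly from the values in Table 1. The following is a minimal sketch, not part of the original analysis; the function names `ranked_pairs` and `mean_identity_to_others` are illustrative, and the mean identity to the other six proteins is used here only as one simple, assumed measure of divergence.

```python
# Percent identity values transcribed from Table 1 (row/column order as in the table).
names = ["MSH1", "MSH4", "MSH5", "MSH6", "MSH7", "MSH2", "MSH3"]
identity = [
    [100.00, 19.08, 18.03, 22.86, 21.49, 19.09, 20.36],
    [19.08, 100.00, 22.51, 24.53, 25.29, 24.56, 22.88],
    [18.03, 22.51, 100.00, 23.60, 22.33, 22.93, 24.16],
    [22.86, 24.53, 23.60, 100.00, 33.30, 23.95, 26.30],
    [21.49, 25.29, 22.33, 33.30, 100.00, 25.37, 25.56],
    [19.09, 24.56, 22.93, 23.95, 25.37, 100.00, 27.93],
    [20.36, 22.88, 24.16, 26.30, 25.56, 27.93, 100.00],
]


def ranked_pairs(names, matrix):
    """Return all distinct protein pairs, sorted from highest to lowest identity."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only, diagonal skipped
            pairs.append((matrix[i][j], names[i], names[j]))
    return sorted(pairs, reverse=True)


def mean_identity_to_others(names, matrix):
    """Mean identity of each protein to the remaining proteins (self-identity excluded)."""
    means = {}
    for i, name in enumerate(names):
        others = [value for j, value in enumerate(matrix[i]) if j != i]
        means[name] = sum(others) / len(others)
    return means


pairs = ranked_pairs(names, identity)
means = mean_identity_to_others(names, identity)
most_divergent = min(means, key=means.get)

print("Two highest-scoring pairs:", pairs[:2])
print("Lowest mean identity to the other homologs:", most_divergent)
```

Under these assumptions, the two top-ranked pairs and the most divergent protein can be compared with the statements in Section 2.1.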

2.2. Phylogenetic Profile Rendering

Results retrieved from Phylogeny.fr for the A. thaliana MSH proteins are shown in the cladogram (Figure 1). During the long course of evolution, MutS genes of endosymbiotic bacteria gave rise to a specialized group of MSH genes. This was achieved through multiple duplication events [24]. The phylogenetic tree supports the theory that MSH1 was originally a mitochondrial gene. This will be further elaborated in the Discussion.
Figure 1

Cladogram (phylogenetic tree) representing the evolutionary relationships of AtMSH proteins.

2.3. Protein 3D Structure Prediction and Refinement

Three-dimensional modeling is a cornerstone of modern structural biology. Determination of the protein structure is the most important step towards the determination of its function, determining possible ligands and docking sites, and finding conserved motifs and domains. The structures here are the result of a bioinformatics approach and are based on homology modeling. The results of the 3D structure prediction are given in Figure 2.
Figure 2

MSH1-MSH7 proteins’ 3D structure prediction visualized by PyMOL (left) and validated by Ramachandran plot (right). N-terminus and C-terminus are indicated.

2.4. Protein 3D Structure Validation

Both experimental and in silico models of a 3D structure have to be validated before they can be considered acceptable. Bioinformatic tools use different references to validate a model: bond distances, bond energy, torsion angles, B-factor, free energy of the molecule, etc. The results of the Ramachandran plot assessment for each AtMSH protein model are shown in Figure 2. Further validation was done with Model Quality Assessment Programs (MQAPs). PROCHECK was used to check the stereochemical properties of the models, and the free energy scoring tool dDFIRE was used to assess energy functions by ab initio refolding of fully unfolded terminal segments with secondary structures while keeping the rest of the protein fixed in its native conformation [25]. A summary of the results from all the validation tools is shown in Table 2. All tools indicate that the MSH models are reliable.
Table 2

3D structure prediction verification tools of MSH proteins.

Protein | RAMPAGE (Residues in Allowed Region) | PROCHECK (G-Factor) | dDFIRE
MSH1    | 98.1%                                | −0.28               | −1755.42
MSH2    | 98.9%                                | −0.05               | −2105.67
MSH3    | 98.0%                                | −0.18               | −2052.20
MSH4    | 98.1%                                | −0.16               | −1753.06
MSH5    | 98.7%                                | −0.23               | −1797.04
MSH6    | 98.1%                                | −0.18               | −2093.36
MSH7    | 98.3%                                | −0.19               | −1841.97
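
To make the reading of Table 2 explicit, the scores can be screened against simple acceptance cutoffs. The sketch below is illustrative only: the cutoff values (`MIN_ALLOWED_PCT`, `MIN_G_FACTOR`) are assumptions and are not taken from this work. The G-factor cutoff follows the commonly cited PROCHECK guideline that values below −0.5 are unusual; dDFIRE has no cutoff here and is only compared between models (lower energy is more favorable).

```python
# Validation scores transcribed from Table 2.
models = {
    "MSH1": {"allowed_pct": 98.1, "g_factor": -0.28, "ddfire": -1755.42},
    "MSH2": {"allowed_pct": 98.9, "g_factor": -0.05, "ddfire": -2105.67},
    "MSH3": {"allowed_pct": 98.0, "g_factor": -0.18, "ddfire": -2052.20},
    "MSH4": {"allowed_pct": 98.1, "g_factor": -0.16, "ddfire": -1753.06},
    "MSH5": {"allowed_pct": 98.7, "g_factor": -0.23, "ddfire": -1797.04},
    "MSH6": {"allowed_pct": 98.1, "g_factor": -0.18, "ddfire": -2093.36},
    "MSH7": {"allowed_pct": 98.3, "g_factor": -0.19, "ddfire": -1841.97},
}

# Illustrative cutoffs (assumptions, not values stated in the paper).
MIN_ALLOWED_PCT = 98.0  # minimum share of residues in allowed Ramachandran regions
MIN_G_FACTOR = -0.5     # PROCHECK G-factors below this are usually flagged as unusual


def passes_screen(scores):
    """True if a model meets both illustrative stereochemistry cutoffs."""
    return (scores["allowed_pct"] >= MIN_ALLOWED_PCT
            and scores["g_factor"] > MIN_G_FACTOR)


report = {name: passes_screen(scores) for name, scores in models.items()}
lowest_energy = min(models, key=lambda name: models[name]["ddfire"])

for name, ok in report.items():
    print(f"{name}: {'pass' if ok else 'check'}")
print("Lowest dDFIRE energy:", lowest_energy)
```

Stricter or looser cutoffs would change which models are flagged, so the thresholds should be chosen to match the validation protocol actually in use.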

2.5. Protein Domain Identification

Domains are conserved regions of a protein that can be a strong indicator of its function. Conserved regions of MSH proteins, as detected by SMART (Simple Modular Architecture Research Tool), are listed in Table 3. All proteins share a MUTSac domain [26]. This is the ATPase domain of the MSH proteins, located at the C-terminus [27]. Although detailed information is not available for the eukaryotic MUTSac domain, the prokaryotic model suggests that only one monomer of the MSH dimer binds ADP through the MUTSac domain. Mismatch recognition initiates ATP binding, which results in a conformational change of the dimer and its movement along the DNA. Another domain that was present in all MSH proteins except the mitochondrial MSH1 is MUTSd.
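Table 3

Domains of MSH proteins.

Domain and Accession   | Position | AtMSH1 | AtMSH2 | AtMSH3 | AtMSH4 | AtMSH5 | AtMSH6 | AtMSH7
MUTSac (SM000534)      | Start    | 761    | 659    | 810    | 546    | 562    | 1076   | 846
MUTSac (SM000534)      | End      | 947    | 855    | 1006   | 733    | 757    | 1268   | 1043
MUTSd (SM000533)       | Start    | -      | 314    | 440    | 190    | 211    | 716    | 573
MUTSd (SM000533)       | End      | -      | 642    | 793    | 531    | 547    | 1056   | 822
Pfam:MutS_I (PF01624)  | Start    | 125    | 22     | 105    | -      | -      | 380    | 268
Pfam:MutS_I (PF01624)  | End      | 228    | 129    | 218    | -      | -      | 496    | 382
Pfam:MutS_II (PF05188) | Start    | -      | 142    | 235    | -      | -      | 505    | 388
Pfam:MutS_II (PF05188) | End      | -      | 284    | 361    | -      | -      | 676    | 542
Pfam:GIY-YIG (PF01541) | Start    | 1024   | -      | -      | -      | -      | -      | -
Pfam:GIY-YIG (PF01541) | End      | 1091   | -      | -      | -      | -      | -      | -
TUDOR (PF00567)        | Start    | -      | -      | -      | -      | -      | 121    | -
TUDOR (PF00567)        | End      | -      | -      | -      | -      | -      | 179    | -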
MUTSd is a DNA-binding domain of the MutS family and a core domain made up of two subdomains that bind the DNA as levers. This domain is homologous to domain III of MutS in Thermus aquaticus [28]. Both the MutS_I and MutS_II domains were identified by Pfam. They are homologous to domain II of Thermus aquaticus [29]. These domains functionally resemble the RNase H domain, which is responsible for RNA digestion and related to reverse transcriptase action. Similarly, MutS_II corresponds to domain II of MutS of Thermus aquaticus and is involved in DNA binding by MutS. Of all MSH homologues in A. thaliana, the GIY-YIG domain is present only in MSH1. This is a catalytic domain present at the N-terminus of endonucleases [30], and its connection to DNA repair has already been inferred [31]. Finally, the TUDOR domain is present only in MSH6. Proteins that contain the Tudor domain are associated with histone modification and are categorized as chromatin remodeling proteins. Gene expression and DNA replication are greatly affected by histone modifications and chromatin remodeling, but how these processes are integrated has not been fully investigated. It is clear that TUDOR domain proteins are developmental regulators whose functions in plants have not yet been described [32].
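
The domain coordinates in Table 3 can be turned into a presence/absence profile and domain lengths with a few lines of code. This is a minimal sketch under two assumptions that are not stated in the source: start and end positions are treated as inclusive residue numbers, and the dictionary layout and helper names (`domain_length`, `shared_by_all`) are purely illustrative.

```python
# Domain boundaries (start, end) transcribed from Table 3; absent domains are omitted.
domains = {
    "MUTSac": {"AtMSH1": (761, 947), "AtMSH2": (659, 855), "AtMSH3": (810, 1006),
               "AtMSH4": (546, 733), "AtMSH5": (562, 757), "AtMSH6": (1076, 1268),
               "AtMSH7": (846, 1043)},
    "MUTSd": {"AtMSH2": (314, 642), "AtMSH3": (440, 793), "AtMSH4": (190, 531),
              "AtMSH5": (211, 547), "AtMSH6": (716, 1056), "AtMSH7": (573, 822)},
    "MutS_I": {"AtMSH1": (125, 228), "AtMSH2": (22, 129), "AtMSH3": (105, 218),
               "AtMSH6": (380, 496), "AtMSH7": (268, 382)},
    "MutS_II": {"AtMSH2": (142, 284), "AtMSH3": (235, 361),
                "AtMSH6": (505, 676), "AtMSH7": (388, 542)},
    "GIY-YIG": {"AtMSH1": (1024, 1091)},
    "TUDOR": {"AtMSH6": (121, 179)},
}
proteins = [f"AtMSH{i}" for i in range(1, 8)]


def domain_length(span):
    """Length in residues, assuming inclusive start and end coordinates."""
    start, end = span
    return end - start + 1


# Domains detected in every homolog.
shared_by_all = [name for name, hits in domains.items()
                 if all(protein in hits for protein in proteins)]

# Homologs lacking the MUTSd DNA-binding domain.
missing_mutsd = [protein for protein in proteins if protein not in domains["MUTSd"]]

# Length of the C-terminal ATPase domain in each homolog.
mutsac_lengths = {protein: domain_length(span)
                  for protein, span in domains["MUTSac"].items()}

print("Shared by all homologs:", shared_by_all)
print("Without MUTSd:", missing_mutsd)
print("MUTSac lengths:", mutsac_lengths)
```

With these coordinates, the MUTSac domain spans a little under 200 residues in every homolog.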

2.6. Interactome Analysis

Before interactome analysis, the proteins were assessed for solvent accessibility using the Protein Predict server. The results showed high solvent accessibility for all homologs, which indicated that extensive interactions and interactome profiles could be expected. The interactome analysis was a keystone of this work. It provided valuable information about proteins and protein families that interact with the MSH homologues of A. thaliana (Table 4; Supplementary Materials Figure S9). All AtMSH homologues interact with the three core proteins MLH1, MLH3, and PMS1 (Postmeiotic Segregation 1). This was expected, since these are the plant counterparts of bacterial MutL and play important roles in MMR. All the AtMSH proteins, with the exception of AtMSH4, interacted with PCNA1 and PCNA2 (proliferating cell nuclear antigen), which play important roles in DNA replication as sliding clamps that enable elongation of leading strands [33]. Four out of seven homologues (MSH1, MSH2, MSH6, and MSH7) interact with TIL1 and/or TIL2, proteins with important physiological roles in plant growth and development. Extensive interactions were noticed between MSH homologues and both RECQSIM and ERCC1. These proteins are an important part of other DNA damage repair pathways. Six of the ten MSH4 interactors listed in Table 4 (RAD51, MUS81, ATSPO11-1, RCK, DMC1, and SPO11-2) were not seen for any other homologue, which indicates a specific role for this protein. MSH5 extensively interacted with DNA helicases (RECQ4A, RECQSIM, RecQI3, RECQI1, RECQ4B).
Table 4

Interactors of AtMSH proteins as retrieved by interactome analysis in STRING.

MSH | Interactor name | Accession | Function | CV
AtMSH1 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH1 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.996
AtMSH1 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH1 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.992
AtMSH1 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.991
AtMSH1 | MSH2 | AT3G18524.1 | MUTS homolog 2; (see Introduction for detailed description) | 0.981
AtMSH1 | MSH5 | AT3G20475.1 | MUTS-homolog 5; (see Introduction for detailed description) | 0.981
AtMSH1 | RECA3 | AT3G10140.1 | RECA homolog 3; plays a role in recombination and DNA strand transfer | 0.980
AtMSH1 | TIL1 | AT1G08260.1 | TILTED 1; DNA polymerase II; involved in DNA replication. Important physiological role (timing and determination of cell fate during plant embryogenesis and root pole development; required for proper shoot (SAM) and root apical meristem (RAM) function; required for flowering repression) | 0.974
AtMSH1 | TIL2 | AT2G27120.1 | TILTED 2; DNA polymerase II; involved in DNA replication, promotes cell cycle and cell type patterning. Contributes to flowering time repression | 0.974
AtMSH2 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH2 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.997
AtMSH2 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH2 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.997
AtMSH2 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.998
AtMSH2 | MSH7 | AT3G24495.1 | MUTS homolog 7; (see Introduction for detailed description) | 0.997
AtMSH2 | MSH6 | AT4G02070.1 | MUTS homolog 6; (see Introduction for detailed description) | 0.983
AtMSH2 | UVH1 | AT5G41150.1 | DNA repair endonuclease UVH1; probably involved in NER and repair of UV light damage, and oxidative damage. In vitro, repairs DSBs and is required for homologous recombination | 0.991
AtMSH2 | TIL1 | AT1G08260.1 | TILTED 1; DNA polymerase II; involved in DNA replication. Important physiological role (timing and determination of cell fate during plant embryogenesis and root pole development; required for proper shoot (SAM) and root apical meristem (RAM) function; required for flowering repression) | 0.974
AtMSH2 | RECQSIM | AT5G27680.1 | RECQ helicase SIM; involved in DNA repair; 3′-5′ helicase specific for plants | 0.991
AtMSH2 | ERCC1 | AT3G05210.1 | DNA excision repair protein ERCC-1; involved in NER. In vitro, repairs DSBs and is required for homologous recombination. UVH1/RAD1-ERCC1/RAD10 complex acts as endonuclease | 0.990
AtMSH3 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH3 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.997
AtMSH3 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH3 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.955
AtMSH3 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.968
AtMSH3 | AT2G02550 | AT2G02550.2 | PIN domain-containing protein; nuclease | 0.965
AtMSH3 | AT1G29630 | AT1G29630.2 | Exonuclease 1; dsDNA exonuclease. May be involved in DNA mismatch repair (MMR) | 0.965
AtMSH3 | AT1G18090 | AT1G18090.1 | 5′-3′ exonuclease family protein | 0.965
AtMSH3 | ERCC1 | AT3G05210.1 | DNA excision repair protein ERCC-1; involved in NER. In vitro, repairs DSBs and is required for homologous recombination. UVH1/RAD1-ERCC1/RAD10 complex acts as endonuclease | 0.953
AtMSH3 | RECQSIM | AT5G27680.1 | RECQ helicase SIM; involved in DNA repair; 3′-5′ helicase specific for plants | 0.863
AtMSH4 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH4 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH4 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.998
AtMSH4 | MSH5 | AT3G20475.1 | MUTS-homologue 5; (see Introduction for detailed description) | 0.987
AtMSH4 | RAD51 | AT5G20850.1 | DNA repair protein RAD51-like 1; binds ss- and dsDNA; DNA-dependent ATPase; repair of meiotic DSBs generated by AtSPO11-1 and in homologous recombination. Important for vegetative growth and root mitosis | 0.984
AtMSH4 | MUS81 | AT4G30870.1 | MMS and UV sensitive 81; part of endonuclease complex. Involved in DNA repair and homologous recombination (HR) in somatic cells | 0.980
AtMSH4 | ATSPO11-1 | AT3G13170.1 | Meiotic recombination protein SPO11-1; part of meiotic recombination. Cleaves DNA to make DSB and start meiotic recombination | 0.970
AtMSH4 | RCK | AT3G27730.1 | ROCK-N-ROLLERS; DNA helicase important for meiosis | 0.964
AtMSH4 | DMC1 | AT3G22880.1 | Disruption of meiotic control 1; may participate in meiotic recombination | 0.958
AtMSH4 | SPO11-2 | AT1G63990.1 | Sporulation 11-2; involved in meiotic recombination. Cleaves DNA to make DSB and start meiotic recombination | 0.931
AtMSH5 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH5 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH5 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH5 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.993
AtMSH5 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.993
AtMSH5 | RECQ4A | AT1G10930.1 | ATP-dependent DNA helicase Q-like 4A; DNA helicase possibly involved in repair of DNA | 0.994
AtMSH5 | RECQSIM | AT5G27680.1 | RECQ helicase SIM; involved in DNA repair; 3′-5′ helicase specific for plants | 0.991
AtMSH5 | RecQI3 | AT4G35740.1 | ATP-dependent DNA helicase Q-like 3; DNA helicase; possible role in DNA repair. Mediates DNA strand annealing | 0.991
AtMSH5 | RECQI1 | AT3G05740.1 | RECQ helicase L1; DNA helicase; possible role in DNA repair | 0.991
AtMSH5 | RECQ4B | AT1G60930.1 | RECQ helicase L4B; DNA helicase; possible role in DNA repair; promotes crossovers | 0.991
AtMSH6 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH6 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.998
AtMSH6 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH6 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.995
AtMSH6 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.996
AtMSH6 | MSH5 | AT3G20475.1 | MUTS-homologue 5; (see Introduction for detailed description) | 0.984
AtMSH6 | MSH2 | AT3G18524.1 | MUTS homolog 2; (see Introduction for detailed description) | 0.983
AtMSH6 | TIL1 | AT1G08260.1 | TILTED 1; DNA polymerase II; involved in DNA replication. Important physiological role (timing and determination of cell fate during plant embryogenesis and root pole development; required for proper shoot (SAM) and root apical meristem (RAM) function; required for flowering repression) | 0.974
AtMSH6 | TIL2 | AT2G27120.1 | TILTED 2; DNA polymerase II; involved in DNA replication, promotes cell cycle and cell type patterning. Contributes to flowering time repression | 0.974
AtMSH6 | RECQSIM | AT5G27680.1 | RECQ helicase SIM; involved in DNA repair; 3′-5′ helicase specific for plants | 0.970
AtMSH7 | MLH1 | AT4G09140.1 | MUTL-homologue 1; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH7 | MLH3 | AT4G35520.1 | MUTL protein homolog 3; correcting IDLs in MMR coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.997
AtMSH7 | PMS1 | AT4G02460.1 | Postmeiotic segregation 1; correcting non-Watson–Crick base pairing and IDLs in MMR; coming from DNA replication, DNA damage or heterologous recombination in meiosis | 0.999
AtMSH7 | PCNA1 | AT1G07370.1 | Proliferating cellular nuclear antigen 1; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.996
AtMSH7 | PCNA2 | AT2G29570.1 | Proliferating cell nuclear antigen 2; auxiliary protein of DNA polδ; controls eukaryotic DNA replication | 0.995
AtMSH7 | MSH5 | AT3G20475.1 | MUTS-homologue 5; (see Introduction for detailed description) | 0.984
AtMSH7 | MSH2 | AT3G18524.1 | MUTS homolog 2; (see Introduction for detailed description) | 0.997
AtMSH7 | TIL1 | AT1G08260.1 | TILTED 1; DNA polymerase II; involved in DNA replication. Important physiological role (timing and determination of cell fate during plant embryogenesis and root pole development; required for proper shoot (SAM) and root apical meristem (RAM) function; required for flowering repression) | 0.976
AtMSH7 | TIL2 | AT2G27120.1 | TILTED 2; DNA polymerase II; involved in DNA replication, promotes cell cycle and cell type patterning. Contributes to flowering time repression | 0.976
AtMSH7 | RECQSIM | AT5G27680.1 | RECQ helicase SIM; involved in DNA repair; 3′-5′ helicase specific for plants | 0.968
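
The overlaps between the interactor lists in Table 4 can be summarized with simple set operations. The sketch below is an illustration, not the procedure used to build the table: it uses only the interactor names listed in Table 4, ignores the confidence values, and the variable names are hypothetical.

```python
# Interactor names per homolog, transcribed from Table 4 (confidence values omitted).
interactors = {
    "AtMSH1": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2",
               "MSH2", "MSH5", "RECA3", "TIL1", "TIL2"},
    "AtMSH2": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2", "MSH7",
               "MSH6", "UVH1", "TIL1", "RECQSIM", "ERCC1"},
    "AtMSH3": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2", "AT2G02550",
               "AT1G29630", "AT1G18090", "ERCC1", "RECQSIM"},
    "AtMSH4": {"MLH1", "MLH3", "PMS1", "MSH5", "RAD51", "MUS81",
               "ATSPO11-1", "RCK", "DMC1", "SPO11-2"},
    "AtMSH5": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2", "RECQ4A",
               "RECQSIM", "RecQI3", "RECQI1", "RECQ4B"},
    "AtMSH6": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2",
               "MSH5", "MSH2", "TIL1", "TIL2", "RECQSIM"},
    "AtMSH7": {"MLH1", "MLH3", "PMS1", "PCNA1", "PCNA2",
               "MSH5", "MSH2", "TIL1", "TIL2", "RECQSIM"},
}

# Interactors common to every homolog.
core = set.intersection(*interactors.values())

# Homologs that do not list both PCNA1 and PCNA2.
without_pcna = [msh for msh, partners in interactors.items()
                if not {"PCNA1", "PCNA2"} <= partners]

# Homologs that list TIL1 and/or TIL2.
with_til = [msh for msh, partners in interactors.items()
            if partners & {"TIL1", "TIL2"}]

# AtMSH4 interactors that appear for no other homolog.
others = set().union(*(partners for msh, partners in interactors.items()
                       if msh != "AtMSH4"))
msh4_only = interactors["AtMSH4"] - others

print("Core interactors:", sorted(core))
print("No PCNA1/PCNA2:", without_pcna)
print("TIL1 and/or TIL2:", with_til)
print("Specific to AtMSH4:", sorted(msh4_only))
```

The same pattern extends to any other group of interest, for example the RECQ helicases or the nucleotide excision repair proteins ERCC1 and UVH1.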

2.7. Protein Subcellular Localization

Organelles are in charge of different cellular processes and hold different sets of proteins. Therefore, protein localization represents an important step in deciphering protein function, but is also suggested to be key to functional diversity [34]. Localization of AtMSH proteins is given in Table 5.
Table 5

Subcellular localization of AtMSH proteins.

Protein | Subcellular Localization   | Subnuclear Localization
MSH1    | Mitochondrion, Chloroplast | -
MSH2    | Nucleus                    | Nucleolus
MSH3    | Nucleus                    | Nucleolus
MSH4    | Nucleus                    | Nucleolus
MSH5    | Nucleus                    | Nucleolus
MSH6    | Nucleus                    | Nucleolus
MSH7    | Nucleus                    | Nucleolus

2.8. Docking Site Prediction

Proteins develop their functionality through interactions with other macromolecules (DNA, RNA, other proteins, etc.). Therefore, understanding protein–protein interactions is crucial for the elucidation of protein function and for the analysis of the whole proteome. Each AtMSH protein was docked against its most interesting interactors. The results from ClusPro are shown in Supplementary Materials Figures S2–S8. In order to confirm the results, the docking was repeated in SPPIDER (http://sppider.cchmc.org/). The results coincided with the ClusPro analysis and were accompanied by tables indicating the active sites of each AtMSH protein and its interactor (Supplementary Materials Table S1).

3. Discussion

DNA, like all organic molecules, can undergo chemical changes. However, structural changes in DNA have much larger and more far-reaching consequences. DNA mutations can arise as a result of DNA replication slips or spontaneous chemical changes stemming from exposure to different damaging factors. Cells therefore had to evolve mechanisms to cope with this damage. One of them is the mismatch repair pathway, whose most important proteins are the MutS homologs (MSHs). Evolutionary conservation of the MMR pathway and MSH orthologs is higher between the plant and animal kingdoms than with their bacterial counterparts, which allows us to transfer knowledge from A. thaliana to animals. Plants carry seven homologs of MSH, compared to five homologs in humans and six in yeast. Therefore, it was very interesting to examine the functions of plant MSH homologs in MMR, estimate potential redundancy in their roles, and look for new avenues in which they function. Although the majority of MutS homologs belong to the same protein family, a certain degree of functional diversity among MSH proteins was observed. Phylogenetic analysis of plant MutS homologs confirmed this functional specialization. This is especially striking in the example of MSH6 and MSH7, which diverged recently but have different functions. It would be of great importance to find out which amino acids are responsible for this functional specialization. It is also informative to look at genes and their occurrence from an evolutionary perspective, which is why the first step was to examine the MSA and a phylogenetic tree of MSH proteins. MSA revealed a ~200 amino acid long conserved region at the C-terminus, suggesting a core domain of the AtMSH protein family and a common ancestor of these proteins. This is in line with previous studies [35]. As indicated by Culligan et al., the first duplication event enabled one copy to encode the mitochondrial MSH1 protein, while the other copy gave rise to a diversified MSH family. 
The second nuclear duplication gave rise to the ancestors of (i) MSH6 and MSH7 and (ii) MSH2, MSH3, MSH4, and MSH5. Further duplication and specialization enabled the common MSH6/MSH7 ancestor to form the MSH6 and MSH7 subfamilies. On the other hand, duplication gave rise to the meiosis-specific MSH4 subfamily and the ancestor of MSH2, MSH3, and MSH5. By further duplication, the latter three diverged into separate genes. This diversification was followed by specialization in function. Further analysis showed that the conserved region identified as a core domain by MSA is the MUTSac domain. Some regions are conserved only in subfamilies of MSHs, so they can confer specific functions to these proteins. Pfam:MutS_I (PF01624) is a domain with unknown function, but judging by its presence in all MSH proteins except MSH4 and MSH5, which do not have DNA mismatch binding ability, it is reasonable to hypothesize that MutS_I is a DNA-binding domain, as suggested by studies of the corresponding domains of Thermus aquaticus. The interactome of MSH proteins presented here shows that these proteins have crucial roles in plants: (i) they maintain the stability of nuclear and organellar DNA and (ii) they control numerous physiological processes. This is not the first time that proteins of DNA repair mechanisms were found to influence physiological characteristics of plants [36]. Subcellular localization indicates that MSH homologs are predominantly located in the nucleus. The only exception is MSH1, which is localized in mitochondria and chloroplasts. This localization was shown to be essential for substoichiometric shifting in plant mitochondria, stability of the plastid genome, and consequently for plant growth, through the interactions described below [37]. The interaction of MLH1 and MLH3 with MSH homologs suggests similar roles for the plant homologues. It is important to note that MLH1 mutants in A. thaliana exhibit reduced fertility [38,39]. This is another important physiological trait directly influenced by MSH homologs. 
The proposed mechanisms of MLH1 function and its interaction with MSH homologs are certainly worth investigating further in plants. MSH homologs, with the exception of MSH4, extensively interact with the replication factors PCNA1 and PCNA2. This is further evidence of their importance for the maintenance of DNA integrity, as these proteins have already been extensively studied in relation to DNA repair [35]. Additionally, these proteins have been correlated with the control of shoot differentiation and meristem organization, indicating another avenue influenced by MSH proteins [40]. MSH1 interacts extensively with proteins involved in replication and recombination. The results shown here support the hypothesis that replication initiation is mediated by recombination, which would explain the interaction with both groups of proteins. One of the physiological roles influenced by MSH homologs is plant growth, which is controlled by mitochondrial genome rearrangements through the interaction of the MSH1 and RECA3 proteins. RECA3 is a protein involved in recombination and strand transfer activity, whose mutants show altered plant growth, leaf variegation, and altered leaf morphology [41,42,43]. The MSH1–RECA3 interaction is supported by the same subcellular localization and involvement in the same substoichiometric shifting process. RECA3 interacts with MSH1 through the AAA domain, but a hole is visible in the docking site (Supplementary Materials Figure S2). Based on the results obtained in the BindN program, which identified DNA-binding residues in the interacting site (Arg212, Ser213, Arg214, Gly216, Ser94, Thr95), we can hypothesize that this is the area where DNA binds. Extensive interaction of MSH proteins (MSH1, MSH2, MSH6, and MSH7) was observed with DNA polymerase epsilon catalytic subunit A (TIL1/POL2a/TILTED1) and/or DNA polymerase epsilon catalytic subunit B (TIL2/POL2b/TILTED2) [44]. 
Partial interaction through domains (MUTSac, Pfam:MutS_II, Pfam:MutS_I) was supported by interactions through electrostatic charges. TIL1 and TIL2 proteins affect root and shoot development, repression of flowering, homologous recombination, abscisic acid signaling, and cell cycling [41,43]. In this way, MSH homologs could be involved in the control of physiological characteristics of plants. Thus far, TIL1 mutants (abo4-1) have been found to have higher rates of homologous recombination and to display pleiotropic defects in both vegetative and reproductive development, but the mechanism behind this has not been investigated [45]. Another interesting protein found in the interactome was RECQSIM. Important information about it comes from the work of Bagherieh-Najjar et al., who indicated that RECQSIM has a role in DNA repair and recombination but did not propose the mechanism behind this repair [46]. The interactions of MSH2, MSH3, MSH5, MSH6, and MSH7 with RECQSIM indicated here (through the MUTSac, Pfam:MutS_II, and Pfam:MutS_I domains, mostly supported by electrostatic charges) suggest a possible model by which RECQSIM can contribute to DNA repair and genome stability and, consequently, influence plant growth and development. MSH5 interacts with six members of the RECQ family. This extensive MSH5–RECQ interaction indicates the important role of the RECQ family for plant DNA stability and fertility. ERCC1 and UVH1 are proteins involved in nucleotide excision repair and in mitotic homologous recombination [47,48]. AtMSH homologs were found to interact with AtRAD1 and AtRAD10 at the highest confidence value. AtMSH2 interacts with AtERCC1 and AtUVH1 through the C-terminal domain MUTSac, between the 659th and 855th residues, while AtMSH3 interacts with UVH1 through MUTSd, a DNA-binding domain. Therefore, we can assume that MutSβ (the MSH2–MSH3 heterodimer) is responsible for HR. This is in line with research done in yeast [49]. MSH4 is a special member of the MSH protein family. 
It has a role exclusively in meiotic recombination and not in MMR. It has a special interactome and a different domain profile: MSH4 shares only MUTSac (ATPase domain) and MUTSd (DNA-binding domain) with other MSH proteins and does not contain any MMR-related domains. As a meiosis-specific protein, MSH4 interacts with the AtSPO11-1 and AtSPO11-2 proteins through the MUTSac domain. AtSPO11-1 and AtSPO11-2 are components of topoisomerase 6, responsible for the formation of DSBs [50]. Localization of MSH5 is dependent on the occurrence of MSH4; therefore, they do not have a redundant role [22]. Instead, MSH5 plays an important role in stabilizing chiasmata during meiosis and directly influences the fertility of the plant. MSH5 extensively interacts with RECQ homologs. In E. coli and S. cerevisiae, mutation in RECQ was found to lead to increased levels of recombination. This functional link to recombination, together with the interaction of RECQ homologs with AtMSH5, suggests that they could have the same function in plants. The exact mechanism of interaction between MSH5 and RECQ is a line of research worth exploring further. AtRECQ2 disrupts D-loops and prevents non-productive recombination events or the channeling of repair pathways into non-productive recombination. Given that AtMSH5 is involved in meiosis regulation, the AtMSH5–AtRECQ2 interaction is potentially very important for the fertility of plants, but to our knowledge, this has not been explored yet.

4. Materials and Methods

4.1. Sequence Retrieval and Multiple Sequence Alignment (MSA)

The amino acid sequences of seven AtMSH homologs were first retrieved from the National Center for Biotechnology Information (NCBI) [51] and the Arabidopsis Information Resource (TAIR) [52] (Table 6). The obtained sequences were aligned using the Clustal Omega tool [53,54,55,56].
Table 6

AtMSH protein accession numbers from the National Center for Biotechnology Information (NCBI) and the Arabidopsis Information Resource (TAIR).

| Protein Name | Accession Number (NCBI) | Accession Number (TAIR) | Sequence Length |
|---|---|---|---|
| AtMSH1 | Q84LK0.1 | AT3G24320.1 | 1118 aa |
| AtMSH2 | O24617.1 | AT3G18524.1 | 937 aa |
| AtMSH3 | O65607.2 | AT4G25540.1 | 1081 aa |
| AtMSH4 | F4JP48.1 | AT4G17380.1 | 792 aa |
| AtMSH5 | F4JEP5.1 | AT3G20475.1 | 807 aa |
| AtMSH6 | O04716.2 | AT4G02070.1 | 1324 aa |
| AtMSH7 | Q9SMV7.1 | AT3G24495.1 | 1109 aa |
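
As an illustration of the retrieval step, the accessions in Table 6 can be held in a small lookup structure and turned into download addresses. The sketch below is not the authors' procedure. It assumes that the accessions in the NCBI column are UniProt-style identifiers and that the address pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` applies; the helper names `fasta_url` and `longest_protein` are hypothetical. The code only builds URLs and performs no network request.

```python
# Accessions and sequence lengths copied from Table 6:
# name -> (NCBI column accession, TAIR accession, length in amino acids)
ATMSH = {
    "AtMSH1": ("Q84LK0.1", "AT3G24320.1", 1118),
    "AtMSH2": ("O24617.1", "AT3G18524.1", 937),
    "AtMSH3": ("O65607.2", "AT4G25540.1", 1081),
    "AtMSH4": ("F4JP48.1", "AT4G17380.1", 792),
    "AtMSH5": ("F4JEP5.1", "AT3G20475.1", 807),
    "AtMSH6": ("O04716.2", "AT4G02070.1", 1324),
    "AtMSH7": ("Q9SMV7.1", "AT3G24495.1", 1109),
}


def fasta_url(accession):
    """Build a FASTA download URL (assumed UniProt REST address pattern)."""
    primary = accession.split(".")[0]  # drop the sequence-version suffix
    return f"https://rest.uniprot.org/uniprotkb/{primary}.fasta"


def longest_protein(table):
    """Return the name of the entry with the largest sequence length."""
    return max(table, key=lambda name: table[name][2])


if __name__ == "__main__":
    for name, (acc, tair, length) in ATMSH.items():
        print(name, tair, length, fasta_url(acc))
```

The downloaded FASTA records could then be concatenated into a single file and submitted to an alignment tool such as Clustal Omega.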

4.2. Phylogenetic Profile Rendering

Sequences of AtMSH proteins were submitted for phylogenetic analysis in Phylogeny.fr. One Click mode was used here to construct a phylogenetic tree of AtMSH homologues using the neighbor joining method [57,58].
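
To make the neighbor joining idea concrete, the sketch below implements only its pair-selection step: for n taxa, the pair (i, j) minimizing Q(i, j) = (n − 2)·d(i, j) − Σd(i, ·) − Σd(j, ·) is joined first. This is a generic textbook illustration, not the Phylogeny.fr implementation; the taxon labels and distances are toy values and are not derived from the AtMSH sequences.

```python
def nj_pick_pair(names, dist):
    """Return (Q, a, b) for the pair with the smallest neighbor joining Q value.

    `dist` maps frozenset({a, b}) to the pairwise distance between a and b.
    """
    n = len(names)
    # Sum of distances from each taxon to all others.
    totals = {
        a: sum(dist[frozenset((a, b))] for b in names if b != a)
        for a in names
    }
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            q = (n - 2) * dist[frozenset((a, b))] - totals[a] - totals[b]
            if best is None or q < best[0]:
                best = (q, a, b)
    return best


# Toy distance matrix for five hypothetical taxa.
NAMES = ["a", "b", "c", "d", "e"]
TOY = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
DIST = {frozenset(pair): d for pair, d in TOY.items()}

if __name__ == "__main__":
    print(nj_pick_pair(NAMES, DIST))
```

In a full implementation, the selected pair is replaced by a new internal node, distances to that node are recomputed, and the step is repeated until the tree is resolved.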

4.3. Protein 3D Structure Prediction and Refinement

The 3D structure of MSH homologues in A. thaliana has not been determined yet. Therefore, homology modeling was used to predict their 3D structure. Amino acid sequences of MSH proteins were submitted to the Phyre2 (Protein Homology/analogY Recognition Engine V 2.0) portal [59]. In order to obtain structures closer to the native state, the .pdb files retrieved from Phyre2 were refined using the protein structure refinement server 3Drefine [60,61]. 3Drefine is a free web server that uses a two-step approach: first, hydrogen bonds are optimized, and second, energy at the atomic level is minimized. The Critical Assessment of Techniques for Protein Structure Prediction (CASP), which is used as the gold standard for the assessment of bioinformatics tools, recognized 3Drefine as a tool that brings structural improvement at the global and local levels of protein structure. Three-dimensional visualization of the protein surface was done using PyMOL software [62]. Additional visualization of 3D structure was done in DeepView-Swiss-PdbViewer, available at the ExPASy Bioinformatics Resource Portal [63]. DeepView is a powerful tool for macromolecular modeling that enables visualization of electrostatic potentials of proteins [64].

4.4. Protein 3D Structure Validation

After 3D structures were predicted as described above, the models were validated. The RAMPAGE tool was used for assessment of the Ramachandran plot [65]. A Ramachandran plot displays, for each residue, the backbone dihedral angle ψ (rotation about the Cα–C bond) against φ (rotation about the N–Cα bond) [66]. This is arguably the best assessment of the 3D structure prediction.
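
The angles plotted in a Ramachandran plot are ordinary dihedral angles between four consecutive backbone atoms. The sketch below shows one common way to compute them from Cartesian coordinates. It is an illustration only, not the RAMPAGE implementation; the function names are hypothetical, and φ and ψ are defined here by the usual atom quadruples C(i−1)–N–Cα–C and N–Cα–C–N(i+1).

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def phi(c_prev, n, ca, c):
    """Backbone phi: rotation about the N-CA bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi: rotation about the CA-C bond."""
    return dihedral(n, ca, c, n_next)
```

Computing (φ, ψ) for every residue of a refined model and checking which pairs fall into favoured, allowed, or outlier regions is the essence of a Ramachandran assessment.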

4.5. Protein Domain Identification

In order to identify functional domains of AtMSH proteins, SMART (Simple Modular Architecture Research Tool) was used in Normal mode [67,68]. In addition to the default SMART HMMER search, which uses hidden Markov models, Pfam domains were included.

4.6. Interactome Analysis

Following domain identification, proteins underwent interactome analysis in order to identify which proteins interact with AtMSH proteins. For this analysis, the STRING (functional protein association network) database was used [69]. STRING integrates information scattered over multiple databases in order to report on physical and functional protein–protein interactions. Protein sequences were submitted to STRING, and parameters were set to show the 10 interactors of highest confidence (>0.900).
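
The selection rule described above (keep at most 10 interactors with a confidence score above 0.900) can be sketched as a simple filter. The record layout, the function name `top_interactors`, and the partner names and scores below are hypothetical; they are not STRING output and are not results from this study.

```python
def top_interactors(records, min_score=0.900, limit=10):
    """Keep partners scoring strictly above `min_score`, best first, at most `limit`."""
    kept = [r for r in records if r[1] > min_score]
    kept.sort(key=lambda r: r[1], reverse=True)
    return kept[:limit]


# Hypothetical (partner, combined score) pairs on a 0-1 scale.
EXAMPLE = [
    ("partner_A", 0.999),
    ("partner_B", 0.900),  # not strictly above the cutoff, so excluded
    ("partner_C", 0.950),
    ("partner_D", 0.700),
]

if __name__ == "__main__":
    for partner, score in top_interactors(EXAMPLE):
        print(partner, score)
```

With real data, the same filter would be applied separately to the interaction list of each AtMSH protein.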

4.7. Protein Localization

Localization is an important indication of protein function and biological interaction. Subcellular localization was determined using online tools and literature. The PSI-predictor (Plant Subcellular Localization integrative predictor) was used [70]. It combines group voting and a neural network to integrate data from 11 independent predictors and outperforms all of them individually. For AtMSH proteins that were localized to the nucleus, further characterization was done in order to check in which part of the nucleus they are localized. This was done using Nuc-Ploc, a tool for predicting protein subnuclear localization [71].

4.8. Docking Site Prediction

The AtMSH homologs and their corresponding interactors were submitted to docking site prediction tools in order to visualize regions responsible for interaction. Two docking tools were used in this study: ClusPro and SPPIDER [72,73,74,75]. When it was published in 2004, ClusPro was the first completely automated program for computational protein docking. ClusPro creates over 70,000 possible conformations, evaluates complexes, and selects the ones with the highest surface complementarity and optimal electrostatic characteristics. ClusPro showed very good results in the Critical Assessment of Prediction of Interactions (CAPRI) and confirmed that cluster size-based ranking is reliable for identification of near-native conformations [76]. Solvent accessibility-based Protein–Protein Interface identification and Recognition (SPPIDER) offers a user-friendly website and is able to detect protein interfaces in two ways [77]; its approach differs from that of other interface prediction tools. It uses relative solvent accessibility (RSA) as a reference point and calculates the RSA of an amino acid in (a) the predicted and (b) the unbound state. Here, RSA loss was set to at least 4% after the formation of the complex.
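
The RSA-loss criterion can be illustrated with a short sketch that flags a residue as part of the interface when its RSA drops by at least 4 percentage points on complex formation. This is a simplified reading of the criterion, not the SPPIDER algorithm; treating the 4% threshold as an absolute difference in percentage points is an assumption, and the residue numbers and RSA values are hypothetical.

```python
def interface_residues(rsa_unbound, rsa_complex, min_loss=4.0):
    """Return residue numbers whose RSA (in %) falls by at least `min_loss`.

    Residues missing from `rsa_complex` are treated as unchanged.
    """
    return sorted(
        res
        for res, rsa in rsa_unbound.items()
        if rsa - rsa_complex.get(res, rsa) >= min_loss
    )


# Hypothetical RSA values (%) keyed by residue number.
RSA_UNBOUND = {10: 45.0, 11: 30.0, 12: 12.0, 13: 60.0}
RSA_COMPLEX = {10: 20.0, 11: 28.5, 12: 8.0, 13: 60.0}

if __name__ == "__main__":
    print(interface_residues(RSA_UNBOUND, RSA_COMPLEX))
```

Raising `min_loss` makes the interface definition stricter and returns fewer residues.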