New insights on thioredoxins (Trxs) and glutaredoxins (Grxs) by in silico amino acid sequence, phylogenetic and comparative structural analyses in organisms of three domains of life.

Soumila Mondal, Shailendra P Singh.

Abstract

Thioredoxins (Trxs) and glutaredoxins (Grxs) regulate several cellular processes by controlling the redox state of their target proteins. Trxs and Grxs belong to the thioredoxin superfamily and possess the characteristic Trx/Grx fold. Several phylogenetic, biochemical and structural studies have contributed to our overall understanding of Trxs and Grxs. However, a comparative study of closely related Trxs and Grxs in organisms of all domains of life was missing. Here, we conducted an in silico comparative structural analysis, combined with amino acid sequence and phylogenetic analyses, of 65 Trxs and 88 Grxs from 12 organisms of the three domains of life to gain insights into the evolutionary and structural relationship of the two proteins. The outcomes suggested that, despite the diversity in their amino acid composition in distantly related organisms, both Trxs and Grxs strictly conserved functionally and structurally important residues. The position of these residues was also highly conserved in all studied Trxs and Grxs. Notably, when a substitution occurred during evolution, amino acids with similar chemical properties were preferred. Trxs and Grxs differed more from each other in eukaryotes than in prokaryotes because of an altered helical conformation. The surface of Trxs was negatively charged, whereas the surface of Grxs was positively charged; however, the active site was constituted by uncharged amino acids in both proteins. In addition, phylogenetic analysis of Trxs and Grxs in the three domains of life supported the endosymbiotic origins of chloroplasts and mitochondria and suggested the usefulness of these proteins in molecular systematics. We also report previously unknown catalytic motifs of the two proteins and discuss in detail the effect of the abovementioned parameters on the overall structural and functional diversity of Trxs and Grxs.
© 2022 The Author(s).

Keywords:  3D structure comparison; Glutaredoxins; Glutathione; Oxidoreductase; Phylogeny; Thioredoxins

Year: 2022; PMID: 36203893; PMCID: PMC9529593; DOI: 10.1016/j.heliyon.2022.e10776

Source DB: PubMed; Journal: Heliyon; ISSN: 2405-8440


Introduction

Thioredoxins (Trxs) and glutaredoxins (Grxs) are heat-stable, small (∼9–16 kDa), redox-controlling thiol-disulphide oxidoreductases that share a di-cysteine active-site motif (CXXC) and a common Trx/Grx fold [1, 2]. Trxs and Grxs are found in all domains of life, where the two proteins are responsible for maintaining cellular redox homeostasis [1, 2, 3, 4]. The Trx/Grx fold is characterized by four β strands and three flanking α helices. The β strands are arranged in a 4312 order in which the 3rd strand is antiparallel to the other β strands [2, 5]. Although Trxs and Grxs belong to the same superfamily and share the Trx/Grx fold, the two proteins differ in their source of reducing power. Trxs are reduced by thioredoxin reductase (TR) in an NADPH-dependent reaction, whereas reduced glutathione (GSH) acts as the source of reducing equivalents for Grxs [1, 2, 3]. After catalyzing the reduction of their substrates, oxidized Grxs are reduced by GSH, which generates oxidized glutathione (GSSG). GSSG is in turn reduced by NADPH-dependent glutathione reductase (GR) to give GSH. Together, Grxs, GSH, GSSG, and NADPH-dependent GR constitute the glutaredoxin system [4]. Trx was first discovered in Escherichia coli (E. coli) as an electron donor for the reduction of the ribonucleotide reductase (RNR) enzyme [6, 7]. Later, Grx was identified as a backup system of Trx in E. coli for the reduction of the RNR enzyme [8]. Subsequent studies in different organisms established the importance of Trxs and Grxs in development, protection of proteins from oxidative damage, signal transduction, protein folding, photosynthesis, abiotic stress resulting in reactive oxygen species (ROS), programmed cell death (PCD), and cardiac, neurodegenerative and cancerous diseases [1, 2, 9, 10, 11, 12]. Thus, Trxs and Grxs regulate a diverse range of cellular functions and affect the overall fitness and development of different organisms by controlling the redox state of their target proteins.
Different eukaryotic organisms possess Trxs and Grxs that are targeted to different cellular compartments. However, all Trxs essentially retain the catalytic CGPC motif and the Trx/Grx fold despite their different intracellular locations and substrate specificities [5, 13]. In contrast, Grxs are broadly categorized into two groups, i.e., monothiol and dithiol Grxs, based on the number of cysteine residues present in their catalytic site [1, 14, 15]. Grxs can also be divided into six classes with the motif sequences CXX[C/S] (Class I), CGFS (Class II), CC-type (CCXX, CXXC, CCXS; Class III), CXX[C/S] with a DER or DUF 547 domain (Class IV), CPWG with an extended C-terminus (Class V) and CPW[C/S] with one additional DUF 236 domain at the N-terminus (Class VI) [14, 15]. The CC-type class III Grxs are found only in higher plants, while classes V and VI are present only in a few marine cyanobacteria [14, 15]. Grxs catalyze the forward reaction of glutathionylation via a dithiol mechanism similar to that of Trxs; however, they can also act through the monothiol mechanism, which is required for deglutathionylation of proteins [1, 4]. Besides controlling the redox state of proteins, Grxs, specifically monothiol Grxs, play a significant role in iron homeostasis by participating in the biosynthesis and targeting of iron-sulfur clusters [16, 17]. Earlier studies focused on the biochemical and structural characterization of Trxs and Grxs. In addition to classifying them by catalytic motif, computational studies established the evolutionary relationship of Grxs or Trxs from different organisms [1, 5, 14, 15]. Here, we conducted an in silico comparative structural analysis combined with sequence and phylogenetic analyses of Trxs and Grxs in 12 organisms of the three domains of life to gain better insights into their evolutionary and structural relationship.
The results suggested that substitutions with amino acids having similar chemical properties helped Trxs and Grxs to conserve their Trx/Grx fold and function during evolution. Trxs and Grxs are structurally more similar in prokaryotes than in eukaryotes, although the two proteins have opposite electrostatic surface potentials. However, the catalytic motifs are constituted by uncharged amino acids in both proteins. The results of the phylogenetic analysis suggested the usefulness of Trx and Grx sequences in establishing an evolutionary lineage.

Materials and methods

Experimental organisms and sequence retrieval from biological databases

A total of 12 organisms, namely Archaeoglobus veneficus, Escherichia coli K12, Synechococcus elongatus PCC 7942, Saccharomyces cerevisiae S288c, Arabidopsis thaliana, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Xenopus laevis, Gekko japonicus, Gallus gallus and Homo sapiens, were selected as representatives of archaea, bacteria, cyanobacteria, fungi, plants, nematodes, arthropods, fish, amphibians, reptiles, birds and mammals, respectively. These organisms are commonly used as model biological systems to study various biological processes including, but not limited to, signal transduction, gene regulation, metabolism, the function of a particular protein or gene, redox and iron homeostasis, various diseases and developmental processes. Trx and Grx amino acid sequences from the abovementioned organisms were manually retrieved from the NCBI Genome Database (https://www.ncbi.nlm.nih.gov/genome/?term) and the UniProt Database (https://www.uniprot.org/) [18]. Duplicate, truncated and misannotated sequences were eliminated manually. A total of 153 Trx and Grx sequences were used for further analysis. The retrieved sequences were sorted into their respective classes based on their active site motif and their location within the cell [19]. The sub-cellular localization of Trxs and Grxs in different organisms was predicted using the CELLO (http://cello.life.nctu.edu.tw/) [20] and WoLF PSORT (https://www.genscript.com/wolf-psort.html) [21] servers.
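
The motif-based sorting of sequences can be illustrated with a minimal sketch. It assumes that a candidate active site is any C-x-x-[C/S] stretch and that a motif is labelled dithiol when its fourth residue is cysteine and monothiol when it is serine. The function names and the toy sequence are hypothetical and are not part of the original workflow, which relied on database annotation and manual curation.

```python
import re

# Assumption: a candidate active site is C-x-x-C or C-x-x-S.
# The lookahead lets overlapping candidates be reported as well.
CANDIDATE_MOTIF = re.compile(r"(?=(C[A-Z]{2}[CS]))")


def find_candidate_motifs(sequence):
    """Return (0-based position, motif) pairs for every C-x-x-[C/S] match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CANDIDATE_MOTIF.finditer(seq)]


def thiol_type(motif):
    """Label a four-residue motif by its last residue (illustrative rule only)."""
    motif = motif.upper()
    if len(motif) != 4 or motif[0] != "C":
        raise ValueError("expected a four-residue motif starting with C")
    if motif[3] == "C":
        return "dithiol"
    if motif[3] == "S":
        return "monothiol"
    return "other"


if __name__ == "__main__":
    toy = "MKWCGPCKA"  # hypothetical sequence, not taken from the paper
    for pos, motif in find_candidate_motifs(toy):
        print(pos, motif, thiol_type(motif))
```

A pattern this permissive will also match C-x-x-[C/S] stretches outside the catalytic site, so in practice the hits would still need to be checked against domain annotation.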

Primary sequence analysis

Physicochemical properties such as theoretical isoelectric point (pI), molecular weight (MW), extinction coefficient (EC) and peptide length were analyzed using the ExPASy ProtParam server (http://web.expasy.org/protparam/) [22]. The catalytic site residues and protein domains were identified with the NCBI Conserved Domain Database (CDD) (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Protein sequences were subjected to multiple sequence alignment (MSA) using Clustal W with default parameter settings and Gonnet as the protein weight matrix [23]. The WebLogo (http://weblogo.berkeley.edu/logo.cgi) diagrams for Trxs and Grxs were built from the MSA files [24]. The percentage amino acid composition of Trxs and Grxs was computed from the MSA files using Molecular Evolutionary Genetics Analysis software version X (MEGA-X) [25].
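
As an illustration of what a percentage amino acid composition means, the following minimal sketch pools residues from a set of sequences and reports each of the 20 standard amino acids as a percentage of the total. It is not the MEGA-X procedure; the function name and the handling of gap characters (simply ignored) are assumptions made for this example.

```python
from collections import Counter

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_percent(sequences):
    """Percentage of each standard amino acid over all given sequences.

    Characters that are not standard residues (gaps, unknowns) are ignored,
    which is an assumption of this sketch.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in STANDARD_AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: 100.0 * counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


if __name__ == "__main__":
    # Hypothetical aligned fragments, used only to show the output format.
    for aa, pct in composition_percent(["WCGPCK-", "WCGPCR-"]).items():
        if pct > 0:
            print(f"{aa}: {pct:.1f}%")
```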

Phylogenetic analysis

Phylogenetic analysis was performed using MEGA-X software [25]. The evolutionary tree was built using the maximum likelihood method [26], and evolutionary distances were computed using the JTT method [27]. All positions containing gaps and missing data in the MSA were eliminated during the construction of the phylogenetic tree. Bootstrap analysis was performed with 500 replicates [28]. The tree was drawn to scale, with branch lengths measured in the number of substitutions per site. The branch length in the phylogenetic tree is directly proportional to the rate of amino acid substitution and the evolutionary distance.
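
Two of the steps mentioned here, elimination of gapped positions and bootstrap resampling, can be sketched in a few lines. This is a simplified illustration rather than the MEGA-X implementation: the function names are hypothetical, `-` and `?` are assumed to mark gaps and missing data, and a bootstrap replicate is taken to be alignment columns drawn with replacement.

```python
import random


def complete_deletion(alignment, missing="-?"):
    """Drop every alignment column that contains a gap or missing-data symbol."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    kept = [col for col in zip(*alignment)
            if not any(ch in missing for ch in col)]
    if not kept:
        return ["" for _ in alignment]
    return ["".join(chars) for chars in zip(*kept)]


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(row[i] for i in picks) for row in alignment]


if __name__ == "__main__":
    toy = ["AC-G", "A-TG", "ACTG"]  # hypothetical alignment
    cleaned = complete_deletion(toy)
    print(cleaned)
    rng = random.Random(0)  # fixed seed so the example is repeatable
    print(bootstrap_replicate(cleaned, rng))
```

In a full analysis, a tree would be inferred from each of the 500 replicates and the support for a branch would be the fraction of replicate trees that contain it.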

Comparative structural analysis

The high-resolution 3D protein structures of the selected organisms available in the Protein Data Bank (PDB) were retrieved for structural analysis [29]. The solved tertiary structures of Trxs and Grxs of E. coli, S. cerevisiae, Synechocystis sp. PCC 6803, D. melanogaster, A. thaliana and H. sapiens were retrieved from the PDB. The Trx structures with PDB IDs 1SRX, 1THX, 3F3R, 1XWC, 1ERT and 1XFL and the Grx structures with PDB IDs 2WCI, 4MJE, 5J3R, 2WUL and 3IPZ were used for structural analysis in this study. The unsolved structure was modelled with the Swiss Model server using a suitable template and validated through the Ramachandran plot [30]. Structures were analyzed using UCSF Chimera 1.14 based on electrostatic surface potential and hydrophobicity index [31]. Structural heterogeneity was computed on the basis of root mean square deviation (RMSD) and percentage similarity. The topology diagrams of the proteins were generated using the Pro-origami online server (http://munk.csse.unimelb.edu.au/pro-origami/) [32].
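
For N paired atoms with coordinates a_i and b_i, RMSD is sqrt((1/N) Σ |a_i − b_i|²). The minimal sketch below evaluates this formula and nothing more: it assumes the two structures have already been superposed and that atoms (for example Cα atoms) have been paired one-to-one, steps that structure-analysis software normally handles. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superposed and atoms are paired by index.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, used only for illustration.
    model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(round(rmsd(model_a, model_b), 3))
```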

Results

Distribution and characteristics of Trxs and Grxs: Grxs are more versatile than Trxs

We included E. coli K12, A. veneficus, S. elongatus PCC 7942, S. cerevisiae S288c, A. thaliana, C. elegans, D. melanogaster, D. rerio, X. laevis, G. japonicus, G. gallus and H. sapiens in this study. These organisms were selected as representatives of the three domains of life to study the characteristics and relationship of Trxs and Grxs (Table 1). A total of 153 protein sequences were manually retrieved from biological databases using the names of the abovementioned organisms. Of these 153 proteins, 88 were Grxs and 65 were Trxs. Active site motif analysis revealed that Trxs generally possess a conserved CGPC active site motif in all studied organisms except A. thaliana. We report active site motifs such as CGPC, CGGC, CPPC, CVPC, CASC, CRKC and CGSC in Trxs of A. thaliana (Table 2).
Table 1

The number of thioredoxins and glutaredoxins found in studied organisms.

Class | Organism | Thioredoxin | Glutaredoxin
Archaea | Archaeoglobus veneficus | 5 | 3
Cyanobacteria | Synechococcus elongatus PCC 7942 | 3 | 2
Bacteria | Escherichia coli K-12 | 4 | 4
Fungi | Saccharomyces cerevisiae S288C | 3 | 8
Plant | Arabidopsis thaliana | 27 | 35
Nematode | Caenorhabditis elegans | 7 | 6
Arthropod | Drosophila melanogaster | 3 | 3
Fish | Danio rerio | 3 | 9
Amphibian | Xenopus laevis | 4 | 2
Reptile | Gekko japonicus | 1 | 5
Bird | Gallus gallus | 2 | 4
Mammal | Homo sapiens | 3 | 7
Total | | 65 | 88
Table 2

Sub-cellular localization and primary sequence analysis of thioredoxins (Trxs) based on active site, attached protein domain, theoretical pI, amino acid length, molecular weight (Dalton) and extinction coefficient (M−1 cm−1). Asterisks indicate non-CGPC active sites identified in this study. sv: splice variants of the same gene (gene ID 25828).

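Organism | Location | Accession No | Name of protein | Active site | Domain attached | pI | Length | Molecular weight | Extinction coefficient
Homo sapiens | Mitochondria | XP_005261565.1 | Trx isoform X1 (sv) | CGPC | Trx | 8.85 | 197 | 21728.05 | 15470
Homo sapiens | Mitochondria | XP_006724289.1 | Trx isoform X2 (sv) | CGPC | Trx | 8.46 | 166 | 18383.30 | 13980
Homo sapiens | Cytosol | NP_003320.2 | Trx isoform 1 | CGPC | Trx | 4.82 | 105 | 11737.50 | 6990
Danio rerio | Mitochondria | NP_991204.1 | Trx | CGPC | Trx | 8.50 | 166 | 18458.31 | 8480
Danio rerio | Cytosol | Q7ZUI4 | Trx1 | CGPC | Trx | 4.69 | 108 | 12014.59 | 9970
Danio rerio | Cytosol | NP_001002461.1 | Trx2 | CGPC | Trx | 5.30 | 107 | 11874.56 | 8480
Drosophila melanogaster | Cytosol | NP_572212.1 | TrxT | CGPC | Trx | 4.23 | 157 | 17488.54 | 12950
Drosophila melanogaster | Cytosol | NP_523526.1 | Trx2 | CGPC | Trx | 4.23 | 106 | 17488.54 | 12950
Drosophila melanogaster | Cytosol | P47938 | Trx1 | CGPC | Trx | 4.73 | 107 | 11736.66 | 8480
Caenorhabditis elegans | Cytosol | NP_001256207.1 | Trx7 | CGPC | Trx | 9.45 | 119 | 12384.48 | 11460
Caenorhabditis elegans | Cytosol | NP_503440.2 | Trx6 | CGPC | Trx | 4.73 | 142 | 11736.66 | 8480
Caenorhabditis elegans | Cytosol | NP_500961.2 | Trx5 | *CGHC | PDIa Family | 4.42 | 136 | 12938.52 | 6990
Caenorhabditis elegans | Cytosol | NP_500578.2 | Trx4 | CGPC | Trx | 4.88 | 158 | 16172.37 | 21430
Caenorhabditis elegans | Cytosol | NP_001021885.1 | Trx1 | CGPC | Trx | 5.74 | 115 | 15672.72 | 38960
Caenorhabditis elegans | Cytosol | NP_001021886.1 | Trx2 | CGPC | Trx | 7.56 | 114 | 18338.15 | 28420
Caenorhabditis elegans | Cytosol | NP_491142.1 | Trx3 | CGPC | Trx | 5.59 | 107 | 13323.48 | 11460
Escherichia coli K12 | Cytosol | NP_418228.2 | Trx1 | CGPC | Trx | 4.69 | 109 | 12967.04 | 9970
Escherichia coli K12 | Cytosol | NP_417077.1 | Trx2 | CGPC | Trx | 4.66 | 139 | 12103.94 | 6990
Escherichia coli K12 | Cytosol | WP_074455222.1 | TrxA | CGPC | Trx | 4.67 | 109 | 11806.62 | 13980
Escherichia coli K12 | Cytosol | SQD02739.1 | TrxC | CGPC | Trx | 5.00 | 139 | 15554.77 | 16500
Saccharomyces cerevisiae S288C | Cytosol | NP_011725.3 | Trx2 | CGPC | Trx | 4.79 | 104 | 11203.88 | 11460
Saccharomyces cerevisiae S288C | Cytosol | NP_013144.1 | Trx1 | CGPC | Trx | 4.79 | 103 | 11234.98 | 9970
Saccharomyces cerevisiae S288C | Mitochondria | 5YKW_A | Trx3 | CGPC | Trx | 9.08 | 127 | 14432.10 | 11460
Archaeoglobus veneficus | Cytosol | WP_013682912.1 | Trx1 | CGPC | Trx | 5.37 | 106 | 12081.09 | 6990
Archaeoglobus veneficus | Cytosol | WP_083809303.1 | Trx2 | *CPYC | Trx | 6.13 | 124 | 14289.68 | 7450
Archaeoglobus veneficus | Cytosol | WP_013683699.1 | Trx3 | CGPC | Trx | 7.70 | 134 | 15242.81 | 19480
Archaeoglobus veneficus | Cytosol | WP_013683913.1 | Trx4 | *CPYC | Trx | 4.93 | 81 | 9034.64 | 2980
Archaeoglobus veneficus | Cytosol | WP_013684269.1 | Trx5 | *CPSC | Trx | 5.65 | 196 | 21997.26 | 22920
Synechococcus elongatus PCC 7942 | Cytosol | WP_011244574.1 | Trx1 | CGPC | Trx | 4.90 | 107 | 11648.40 | 13980
Synechococcus elongatus PCC 7942 | Cytosol | WP_208672674.1 | Trx2 | CGPC | Trx | 7.73 | 111 | 12680.62 | 26470
Synechococcus elongatus PCC 7942 | Cytosol | WP_011378192.1 | Trx3 | CGPC | Trx | 4.49 | 107 | 11924.69 | 20970
Arabidopsis thaliana | Chloroplast | NP_849585.1 | TrxM1 | CGPC | Trx | 9.14 | 179 | 19664.58 | 22460
Arabidopsis thaliana | Cytosol | NP_001117249.1 | Trx4 CYS HIS rich | *CGGC | Trx | 6.88 | 177 | 19372.15 | 5960
Arabidopsis thaliana | Mitochondria | NP_564371.1 | TrxO2 | CGPC | Trx | 9.01 | 159 | 17623.18 | 18450
Arabidopsis thaliana | Chloroplast | NP_175021.2 | TrxY2 | CGPC | Trx | 8.45 | 167 | 18592.32 | 16960
Arabidopsis thaliana | Cytosol | NP_175128.1 | TrxH5 | *CPPC | Trx | 5.19 | 118 | 13122.32 | 11000
Arabidopsis thaliana | Cytosol | OAP18792.1 | TrxH4 | *CPPC | Trx | 9.06 | 182 | 19647.73 | 12950
Arabidopsis thaliana | Cytosol | NP_564566.1 | TrxX | CGPC | Trx | 7.80 | 129 | 14531.75 | 13980
Arabidopsis thaliana | Cytosol | NP_176182.1 | TrxH7 | CGPC | Trx | 7.80 | 119 | 14531.75 | 13980
Arabidopsis thaliana | Cytosol | NP_177146.1 | TrxH8 | CGPC | Trx | 8.97 | 148 | 17250.19 | 26470
Arabidopsis thaliana | Cytosol | NP_001325846.1 | TrxH10 | *CVPC | Trx | 8.75 | 154 | 17370.92 | 24980
Arabidopsis thaliana | Chloroplast | NP_177802.2 | TrxY1 | CGPC | Trx | 9.03 | 172 | 19250.19 | 11460
Arabidopsis thaliana | Mitochondria | NP_001078006.1 | TrxO1 | CGPC | Trx | 9.45 | 194 | 21191.29 | 16960
Arabidopsis thaliana | Cytosol | NP_186922.1 | TrxF1 | CGPC | Trx | 9.12 | 178 | 19325.43 | 18450
Arabidopsis thaliana | Cytosol | NP_187329.1 | TrxZ | CGPC | Trx | 5.65 | 183 | 20670.08 | 11460
Arabidopsis thaliana | Cytosol | NP_001325992.1 | TrxH9 | CGPC | Trx | 5.12 | 140 | 15334.37 | 16500
Arabidopsis thaliana | Chloroplast | Q9SEU8 | TrxM2 | CGPC | Trx | 9.35 | 186 | 20312.43 | 20970
Arabidopsis thaliana | Chloroplast | Q9SEU7 | TrxM3 | CGPC | Trx | 8.65 | 173 | 19500.30 | 18450
Arabidopsis thaliana | Chloroplast | NP_188155.1 | TrxM4 | CGPC | Trx | 9.62 | 193 | 21172.28 | 19480
Arabidopsis thaliana | Cytosol | NP_190672.1 | TrxH1 | CGPC | Trx | 5.64 | 114 | 12672.73 | 16500
Arabidopsis thaliana | Cytosol | NP_001330106.1 | Trx5 CYS HIS rich | *CGGC | Trx | 8.17 | 186 | 20427.38 | 15470
Arabidopsis thaliana | Cytosol | OAO89812.1 | TrxH3 | *CPPC | Trx | 5.06 | 118 | 13109.28 | 11000
Arabidopsis thaliana | Cytosol | Q38879 | TrxH2 | CGPC | Trx | 5.74 | 133 | 14675.85 | 11000
Arabidopsis thaliana | Cytosol | NP_198811.1 | Trx2 | CGPC | Trx | 5.74 | 134 | 14732.90 | 11000
Arabidopsis thaliana | Cytosol | NP_194346.1 | Trx1 CYS HIS rich | *CGSC | Trx | 8.72 | 221 | 24352.51 | 23950
Arabidopsis thaliana | Cytosol | NP_567831.1 | Trx2 CYS HIS rich | *CASC | Trx | 9.06 | 235 | 25843.64 | 20970
Arabidopsis thaliana | Cytosol | AED90720.1 | Trx2 WCRKC | *CRKC | Trx | 8.32 | 192 | 21836.13 | 25440
Arabidopsis thaliana | Chloroplast | OAO91968.1 | TrxF2 | CGPC | Trx | 9.06 | 185 | 19999.20 | 18450
Xenopus laevis | Cytosol | NP_001080066.1 | Trx2L homoeolog | CGPC | Trx | 7.71 | 170 | 18584.59 | 5500
Xenopus laevis | Cytosol | NP_001088487.1 | TrxL homoeolog | CGPC | Trx | 5.34 | 105 | 11755.55 | 9970
Xenopus laevis | Cytosol | A2VDE6 | Trx1 | CGPC | Trx | 4.96 | 105 | 11864.72 | 6990
Xenopus laevis | Cytosol | NP_001085522.1 | Trx2 | CGPC | Trx | 5.34 | 105 | 11755.55 | 9970
Gekko japonicus | Mitochondria | XP_015265683.1 | Trx | CGPC | Trx | 9.27 | 174 | 19212.44 | 6990
Gallus gallus | Cytosol | NP_990784.1 | Trx | CGPC | Trx | 5.10 | 105 | 11700.52 | 6990
Gallus gallus | Mitochondria | NP_001026581.1 | Trx | CGPC | Trx | 9.34 | 140 | 15170.64 | 12490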
A. thaliana possessed the largest number (27) and the most types (f, h, m, o, x and y) of Trxs among the studied organisms (Table 2). The mitochondrion of A. thaliana has only the ‘o’ type Trx, while the chloroplast contains the m, y, x and f Trxs. The absence of these Trxs from the cytoplasm indicated their specificity towards cellular organelles. Notably, organelle-specific Trxs typically had the CGPC active site motif (Table 2). Because it lacks a signal peptide, Trx ‘h’ was considered a cytoplasmic Trx. This proposal is supported by the fact that ‘h’ Trx is found in the phloem sap of rice and other plants, where it can translocate through plasmodesmata owing to its small size [33]. A. veneficus Trxs were found to have CGPC, CPSC, and CPYC active site motifs. The Trxs analyzed in this study were approximately 140 amino acids long, had a single Trx family domain, and showed a wide range of theoretical pI, i.e., 4.2 to 9.5 (Table 2). The frequency of occurrence of different amino acids at different positions of the peptide chain dictates the diversity of the protein sequence. Trxs had a very high percentage of the non-polar amino acids alanine, valine and leucine, of the polar amino acid serine, and of charged amino acids such as aspartic acid and lysine (Figure 1). Valine was the most abundant amino acid, while tryptophan, tyrosine, histidine, arginine, methionine and glutamine were the least abundant amino acids in Trxs (Figure 1). The frequency of phenylalanine was considerably higher than that of the other aromatic amino acids in all Trxs.
Trxs contained ∼1.5% cysteine, while some amino acids were completely absent from the Trxs of certain organisms. For example, arginine and histidine were absent in yeast, histidine was absent in Drosophila, and arginine was absent in the Trxs of fish, birds and mammals (Figure 1; Supplementary Table 1). The cysteine residues present in the active sites of Trxs were highly conserved (Supplementary Fig. 1). In addition, a cis-proline residue located five residues after the catalytic site and its neighboring threonine residue, together with two glycine, one phenylalanine, one valine and two aspartic acid residues, were highly conserved in all studied Trxs (Supplementary Fig. 1).
Figure 1

Frequency of occurrence of different amino acids in 65 thioredoxins (Trxs) of different organisms of three domains of life. The frequencies of different amino acids in Trxs of different organisms are shown on the x-axis. The twenty different amino acids are marked with different color codes in the diagram.

Contrary to Trxs, Grxs showed diversity in their active site sequence. In addition to the commonly known CPFC, CPYC and CGFS active site motifs, we report CSYC, CGYC, CPYS, CSYS, and CFYC active sites in Grxs of the studied organisms. Importantly, diversity in the active site was more common in eukaryotes (Table 3). A. thaliana possessed 17 monothiol and 14 dithiol Grxs. Unlike Trxs, Grxs were either monomeric or multimeric proteins, in agreement with previous reports [14, 15, 33]. The multidomain Grxs had either a PICOT domain or multiple Grx domains, and they were commonly found in eukaryotic systems. Grxs were approximately 100–150 amino acids long and had theoretical pI values ranging from 4.5 to 9.5 (Table 3). Notably, Trxs and Grxs had an almost similar range of theoretical pI values despite the difference in their amino acid composition. The wide range of theoretical pI was the result of the diversity of amino acids present in the two proteins. Grxs had a comparatively higher percentage of cysteine residues (∼2%) than Trxs. Grxs possessed a high percentage of leucine, glutamic acid, glycine, alanine, serine and valine (Figure 2). The proportion of tryptophan and histidine was less than 1%, while that of phenylalanine and tyrosine was higher than 2.5% (Figure 2).
Table 3

Subcellular localization and primary sequence analysis of glutaredoxins (Grxs) based on active site, attached protein domain, theoretical pI, amino acids length, molecular weight (Dalton) and extinction coefficient (M−1 cm−1).

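Organism | Location | Accession No | Name of protein | Active site | Domain attached | pI | Length | Molecular weight | Extinction coefficient
Synechococcus elongatus PCC 7942 | Cytosol | WP_011244089.1 | Grx3 | CPFC | Grx | 5.50 | 87 | 9474.79 | 8480
Synechococcus elongatus PCC 7942 | Cytosol | WP_208672557.1 | Grx4 | CGFS | Grx | 4.31 | 108 | 12064.98 | 9970
Homo sapiens | Cytosol | NP_001230587.1 | Grx1 | CPYC | Grx | 8.33 | 106 | 11775.74 | 2980
Homo sapiens | Mitochondria | NP_057150.2 | Grx2 isoform 1 | CSYC | Grx | 9.54 | 165 | 18723.46 | 14440
Homo sapiens | Mitochondria | NP_001306220.1 | Grx2 isoform 3 | CSYC | Grx | 8.34 | 124 | 14127.21 | 7450
Homo sapiens | Mitochondria | XP_016856886.1 | Grx2 isoform X1 | CSYC | Grx | 8.34 | 124 | 14127.21 | 7450
Homo sapiens | Cytosol | NP_001186797.1 | Grx3 isoform 1 | CGFS | 2 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.31 | 335 | 37432.03 | 33920
Homo sapiens | Cytosol | NP_001308909.1 | Grx3 isoform 2 | CGFS | Grx | 4.96 | 189 | 21497.71 | 21430
Homo sapiens | Cytosol | XP_016870963.1 | Grx3 isoform X1 | CGFS | Grx | 4.96 | 189 | 21497.71 | 21430
Xenopus laevis | Cytosol | XP_018080656.1 | Grx3 | CGFS | 2 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.38 | 326 | 36608.99 | 32430
Xenopus laevis | Mitochondria | AAH70695.1 | Grx5 | CGFS | Grx | 7.91 | 150 | 16642.00 | 16960
Drosophila melanogaster | Cytosol | Q9W2D1 | Grx1 | CPYC | Grx | 8.82 | 116 | 13024.94 | 7450
Drosophila melanogaster | Cytosol | NP_609641.1 | Grx4 | CGFS | 1 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 4.80 | 216 | 23630.97 | 19940
Drosophila melanogaster | Mitochondria | Q8SXQ5 | Grx5 | CGFS | Grx | 7.73 | 159 | 17668.34 | 9970
Danio rerio | Cytosol | NP_001005942.1 | Grx | CPYC | Grx | 7.63 | 105 | 11348.26 | 4470
Danio rerio | Cytosol | AAH96952.1 | Grx | CSYC | Grx | 6.54 | 105 | 11379.24 | 4470
Danio rerio | Cytosol | NP_001005942.1 | Grx1 | CSYC | Grx | 6.54 | 105 | 11337.16 | 4470
Danio rerio | Cytosol | XP_005161791.1 | Grx2 isoform X1 | CPYC | Grx | 8.14 | 157 | 17185.56 | 8480
Danio rerio | Cytosol | XP_005161792.1 | Grx2 isoform X2 | CPYC | Grx | 8.61 | 152 | 16632.90 | 5960
Danio rerio | Cytosol | XP_005161790.1 | Grx2 isoform X3 | CPYC | Grx | 7.50 | 170 | 18420.04 | 4470
Danio rerio | Cytosol | XP_009294644.1 | Grx2 isoform X4 | CPYC | Grx | 6.79 | 134 | 14408.31 | 2980
Danio rerio | Cytosol | NP_001005950.1 | Grx3 | CGFS | 1 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.18 | 326 | 36335.44 | 28420
Danio rerio | Mitochondria | NP_998186.1 | Grx5 | CGFS | Grx | 8.71 | 155 | 17376.94 | 15470
Caenorhabditis elegans | Cytosol | NP_490812.1 | Grx1 | CPYC | Grx | 8.46 | 105 | 11297.93 | 9970
Caenorhabditis elegans | Cytosol | NP_001040891.1 | Grx2 | CTFC | GrxC | 5.76 | 119 | 13130.91 | 7450
Caenorhabditis elegans | Cytosol | NP_001040892.1 | Grx3 | CTFC | GrxC | 5.71 | 96 | 10693.07 | 7450
Caenorhabditis elegans | Cytosol | NP_499610.1 | Grx4 | CGFS | Grx | 6.73 | 142 | 15767.99 | 9970
Caenorhabditis elegans | Cytosol | NP_001033391.1 | Grx5 | CGYC | Grx | 7.65 | 131 | 14619.74 | 10430
Caenorhabditis elegans | Mitochondria | NP_001023757.1 | Grx6 | CGFS | 1 Grx_PICOT; 1 Trx_PICOT (N-terminal); 1 Grx domain (C-terminal) | 5.19 | 342 | 37939.14 | 23950
Escherichia coli K12 | Cytosol | MBI0727830.1 | GrxA | CPYC | Grx | 4.81 | 85 | 9684.85 | 11460
Escherichia coli K12 | Cytosol | STF72101.1 | GrxB | CPYC | Grx | 7.72 | 215 | 24350.21 | 22920
Escherichia coli K12 | Cytosol | MBA1844136.1 | GrxC | CPYC | Grx | 6.71 | 83 | 9137.49 | 4470
Escherichia coli K12 | Cytosol | 2WCIA | GrxD | CGFS | Grx | 4.69 | 115 | 12878.76 | 18450
Saccharomyces cerevisiae S288C | Cytosol | NP_009895.1 | Grx1 | CPYC | Grx | 4.98 | 110 | 12380.19 | 5960
Saccharomyces cerevisiae S288C | Cytosol | NP_010801.1 | Grx2 | CPYC | Grx | 6.73 | 143 | 15861.47 | 4470
Saccharomyces cerevisiae S288C | Cytosol | NP_010383.4 | Grx3 | CGFS | 1 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 4.37 | 250 | 28261.34 | 18450
Saccharomyces cerevisiae S288C | Cytosol | NP_011101.3 | Grx4 | CGFS | 1 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 4.57 | 244 | 27492.90 | 16960
Saccharomyces cerevisiae S288C | Mitochondria | NP_015266.1 | Grx5 | CGFS | GrxD | 4.85 | 150 | 16931.45 | 11460
Saccharomyces cerevisiae S288C | Cytosol | NP_010274.1 | Grx6 | CSYS | Grx | 6.01 | 231 | 25783.28 | 15930
Saccharomyces cerevisiae S288C | Cytosol | P38068 | Grx7 | CPYS | Grx | 5.64 | 203 | 22565.20 | 14440
Saccharomyces cerevisiae S288C | Cytosol | Q05926 | Grx8 | CPDC | GrxC | 7.78 | 109 | 12519.40 | 24980
Archaeoglobus veneficus | Cytosol | WP_013684320.1 | Grx1 | CPHC | 2 HEAT repeat (C-terminal) | 4.71 | 315 | 10459.27 | 12950
Archaeoglobus veneficus | Cytosol | F2KMX5 | Grx2 | CPYC | Grx | 4.93 | 81 | 9034.64 | 2980
Archaeoglobus veneficus | Cytosol | WP_013683032.1 | Grx3 | CPKC | Grx | 5.17 | 95 | 34962.43 | 1490
Gekko japonicus | Cytosol | AAW51391.1 | Grx | CGFS | 1 Grx_PICOT; 1 Grx domain (C-terminal) | 9.24 | 162 | 18454.83 | 16960
Gekko japonicus | Cytosol | XP_015283682.1 | Grx1 | CPYC | Grx | 6.81 | 103 | 11533.46 | 5960
Gekko japonicus | Cytosol | XP_015281914.1 | Grx2 isoform X1 | CSYC | Grx | 9.66 | 145 | 16353.52 | 9970
Gekko japonicus | Cytosol | XP_015281914.1 | Grx2 isoform X2 | CSYC | Grx | 8.86 | 128 | 14280.24 | 4470
Gekko japonicus | Cytosol | XP_015262026.1 | Grx3 | CGFS | 2 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.55 | 335 | 37312.74 | 39420
Gallus gallus | Cytosol | NP_990491.1 | Grx1 | CPYC | Grx | 8.40 | 101 | 11397.45 | 2980
Gallus gallus | Cytosol | XP_004943187.1 | Grx2 | CFYC | Grx | 7.67 | 123 | 13390.37 | 2980
Gallus gallus | Cytosol | XP_422200.3 | Grx2 X1 | CFYC | Grx | 9.42 | 137 | 15164.57 | 2980
Gallus gallus | Cytosol | NP_001264313.1 | Grx3 | CGFS | 2 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.47 | 328 | 36573.94 | 33920
Arabidopsis thaliana | Cytosol | NP_566522.1 | Grx4 | CGFS | Grx | 5.91 | 169 | 18734.21 | 9970
Arabidopsis thaliana | Cytosol | NP_568962.1 | GrxC1 | CGYC | Grx | 5.39 | 125 | 13610.61 | 9970
Arabidopsis thaliana | Cytosol | Q29PZ1 | GrxC10 | CCMC | Grx | 5.23 | 148 | 15709.22 | 6990
Arabidopsis thaliana | Cytosol | NP_191854.1 | GrxC11 | CCMC | Grx | 8.86 | 103 | 11312.19 | 8480
Arabidopsis thaliana | Cytosol | NP_001318444.1 | GrxC12 | CCMC | Grx | 7.74 | 103 | 11261.06 | 8480
Arabidopsis thaliana | Cytosol | O82255 | GrxC13 | CCLC | Grx | 7.63 | 102 | 11275.32 | 4470
Arabidopsis thaliana | Cytosol | NP_191855.1 | GrxC14 | CCLC | Grx | 7.65 | 102 | 11283.22 | 2980
Arabidopsis thaliana | Cytosol | Q9FNE2 | GrxC2 | CPYC | Grx | 6.71 | 111 | 11756.42 | 8480
Arabidopsis thaliana | Cytosol | KAG7590455.1 | GrxC3 | CPYC | Grx | 5.78 | 130 | 14247.38 | 4470
Arabidopsis thaliana | Cytosol | Q8LFQ6 | GrxC4 | CPYC | Grx | 5.68 | 135 | 14827.11 | 11460
Arabidopsis thaliana | Chloroplast | Q8GWS0 | GrxC5 | CSYC | Grx | 9.12 | 174 | 18813.59 | 12950
Arabidopsis thaliana | Chloroplast | Q8L9S3 | GrxC6 | CCMC | Grx | 4.95 | 144 | 15566.01 | 6990
Arabidopsis thaliana | Cytosol | Q96305 | GrxC7 | CCMC | Grx | 6.88 | 136 | 14201.60 | 15470
Arabidopsis thaliana | Cytosol | Q8LF89 | GrxC8 | CCMC | Grx | 8.78 | 140 | 14969.60 | 9970
Arabidopsis thaliana | Cytosol | Q9SGP6 | GrxC9 | CCMC | Grx | 5.71 | 137 | 14758.19 | 8480
Arabidopsis thaliana | Cytosol | NP_171801.1 | GrxS1 | CCMS | Grx | 5.75 | 102 | 11067.88 | 8480
Arabidopsis thaliana | Cytosol | Q9LIF1 | GrxS10 | CCMS | Grx | 8.54 | 102 | 11056.05 | 13980
Arabidopsis thaliana | Cytosol | Q9M9Y9 | GrxS11 | CCLS | Grx | 8.30 | 99 | 10889.98 | 2980
Arabidopsis thaliana | Chloroplast | Q8LBS4 | GrxS12 | CSYS | Grx | 8.37 | 179 | 19124.90 | 20970
Arabidopsis thaliana | Cytosol | Q84TF4 | GrxS13 | CCLG | Grx | 8.95 | 150 | 16365.85 | 20970
Arabidopsis thaliana | Chloroplast | Q84Y95 | GrxS14 | CGFS | Grx | 8.67 | 173 | 19309.40 | 9970
Arabidopsis thaliana | Mitochondria | NP_001030704.1 | GrxS15 | CGFS | Grx | 5.91 | 169 | 18734.21 | 9970
Arabidopsis thaliana | Chloroplast | KAG7569621.1 | GrxS16 | CGFS | 1 Grx_PICOT; 1 GIY-YIG Domain (N-terminal) | 7.72 | 293 | 32205.72 | 26930
Arabidopsis thaliana | Cytosol | Q9ZPH2 | GrxS17 | CGFS | 3 Grx_PICOT; 1 Trx_PICOT (N-terminal) | 5.00 | 488 | 53115.26 | 39420
Arabidopsis thaliana | Cytosol | NP_197361.1 | GrxS2 | CCMS | Grx | 6.06 | 102 | 11039.09 | 8480
Arabidopsis thaliana | Cytosol | O23421 | GrxS3 | CCMS | Grx | 7.76 | 102 | 11168.20 | 6990
Arabidopsis thaliana | Cytosol | O23419 | GrxS4 | CCMS | Grx | 8.54 | 102 | 11161.25 | 6990
Arabidopsis thaliana | Cytosol | O23420 | GrxS5 | CCMS | Grx | 6.71 | 102 | 11205.21 | 6990
Arabidopsis thaliana | Cytosol | NP_191852.1 | GrxS6 | CCMS | Grx | 8.55 | 102 | 11111.03 | 6990
Arabidopsis thaliana | Cytosol | Q6NLU2 | GrxS7 | CCMS | Grx | 8.54 | 102 | 11244.34 | 6990
Arabidopsis thaliana | Cytosol | NP_193301.1 | GrxS8 | CCMS | Grx | 7.73 | 102 | 11311.38 | 8480
Arabidopsis thaliana | Cytosol | NP_180612.1 | GrxS9 | CCMS | Grx | 6.80 | 102 | 11080.08 | 1490
Arabidopsis thaliana | Cytosol | NP_186849.1 | ROXY1 | CCMC | Grx | 6.88 | 136 | 14201.60 | 15470
Arabidopsis thaliana | Cytosol | NP_174170.1 | ROXY19 | CCMC | Grx | 5.71 | 137 | 14758.19 | 8480
Arabidopsis thaliana | Cytosol | KAG7550582.1 | ROXY2 | CCMC | Grx | 8.78 | 140 | 14969.60 | 9970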
Figure 2

Frequency of occurrence of different amino acids in 88 glutaredoxins (Grxs) of different organisms of three domains of life. The frequencies of different amino acids in Grxs of different organisms are shown on the x-axis. The twenty different amino acids are marked with different color codes in the diagram.

Other amino acids were moderately present in all Grxs (Supplementary Table 2). Similar to Trxs, several amino acids were absent from Grxs of certain organisms. For example, histidine was absent in some Grxs of archaea, while tryptophan was absent in Grxs of the fungus, nematode, reptile and mammal. Grxs of fish lacked aspartate and tryptophan, and histidine and tryptophan were absent in the Grxs of the bird (Figure 2; Supplementary Table 2). Although Grxs showed diversity in their amino acid composition, some residues were highly conserved. The N-terminal cysteine residue of the active site motif, the cis-proline and the two consecutive glycine residues were highly conserved in all Grxs (Supplementary Fig. 2). Notably, the amino acid residues of the glutathione binding sites were not conserved; instead, substitution by residues from a similar group of amino acids was observed at these sites in Grxs (Supplementary Fig. 2).
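
The idea of substitution within a similar group of amino acids can be made concrete with a small sketch. The grouping below is a common textbook partition by side-chain chemistry and is an assumption of this example; the source does not state which grouping was used, and the function names are hypothetical.

```python
# Assumption: a textbook partition of the 20 standard residues by side chain.
GROUPS = {
    "nonpolar aliphatic": "GAVLIMP",
    "aromatic": "FWY",
    "polar uncharged": "STCNQ",
    "positively charged": "KRH",
    "negatively charged": "DE",
}
RESIDUE_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(residue_a, residue_b):
    """True when both residues fall into the same physicochemical group."""
    return RESIDUE_TO_GROUP[residue_a.upper()] == RESIDUE_TO_GROUP[residue_b.upper()]


def column_is_conserved_by_group(column):
    """True when every residue in an alignment column shares one group."""
    return len({RESIDUE_TO_GROUP[ch] for ch in column.upper()}) == 1


if __name__ == "__main__":
    print(is_conservative("D", "E"))             # both negatively charged
    print(is_conservative("K", "D"))             # opposite charges
    print(column_is_conserved_by_group("ILVV"))  # hypothetical alignment column
```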

Trxs and Grxs phylogeny support endosymbiotic theory for origin of chloroplast and mitochondria

A phylogenetic tree consisting of 65 Trx and 88 Grx protein sequences from 12 different organisms was constructed using the maximum likelihood method to decipher the evolutionary relationship between the two groups of proteins (Figure 3). Trxs and Grxs diverged from a common ancestor at the very base of the phylogenetic tree, resulting in two individual groups of proteins (Figure 3). The Trx group was differentiated into several clusters. The mitochondrial Trxs clustered together and formed a separate group with the Trxs of bacteria and archaea (Figure 3). The chloroplast-specific Trxs clustered with the Trxs of cyanobacteria and nematode. However, cytoplasmic Trxs of A. thaliana formed distinct clusters and shared ancestral relationships with those of nematodes and amphibians (Figure 3). Trxs of higher organisms shared common clusters in the phylogenetic tree (Figure 3). In contrast, Grxs were divided into three major subgroups. One subgroup contained the monothiol CGFS-type Grxs, while another contained the CC-type Grxs of A. thaliana, which formed a separate cluster that diverged at the base of the tree (Figure 3). The third subgroup consisted mainly of dithiol Grxs from all organisms. It is worth mentioning that, similar to Trxs, Grxs located in the chloroplast and mitochondria shared common clusters with Grxs of cyanobacteria and bacteria, respectively (Figure 3).
Figure 3

Evolutionary relationship of thioredoxins (Trxs) and glutaredoxins (Grxs). The evolutionary history was inferred by using the Maximum Likelihood method and JTT matrix-based model. The tree with the highest log likelihood (-3169.91) is shown. The tree was drawn to scale and branch length in the tree is directly proportional to the rate of amino acid substitution and evolutionary distance. The analysis involved 65 Trxs and 88 Grxs of 12 different organisms representing three domains of life.

Evolutionary relationship of Trxs in three domains of life

We further explored the evolutionary relationship among the Trxs of the 12 distantly related organisms (Supplementary Fig. 3). The phylogenetic tree of Trxs was divided into four major subgroups. Subgroup 1 comprised Trxs of bacteria, archaea and cyanobacteria together with mitochondrial and chloroplastic Trxs, while subgroup 2 contained the cytoplasmic Trxs of eukaryotes. Within subgroup 2, A. thaliana Trxs formed an independent cluster that shared common ancestry with the cytoplasmic Trxs of amphibian, mammal, fish and bird (Supplementary Fig. 3). Notably, Trxs f1 and f2 of A. thaliana shared close ancestry with Trx4 and Trx5 of A. veneficus, which suggested that plant Trx f has an archaebacterial origin. Trxs of bird, mammal, amphibian, fish, and nematode formed a separate cluster. Two cytosolic Trxs of A. thaliana (Table 2; accession numbers NP_198811.1 and NP_190672.1) shared an ancestral relationship with the three Trxs of S. cerevisiae and formed a small cluster in subgroup 2. The cysteine-histidine-rich Trxs of A. thaliana formed the small subgroup 3. Two Trxs of A. veneficus and three Trxs of A. thaliana (two Trxs f and one WCRKC Trx) formed subgroup 4. Importantly, the two Trxs of A. veneficus diverged at the very beginning from an inner node of the tree (Supplementary Fig. 3).

Evolutionary relationship of Grxs in three domains of life

The phylogenetic tree of 88 Grxs showed three major subgroups (Supplementary Fig. 4). Subgroup 1 contained Grxs of eukaryotic organisms and was further divided into two groups. The first group mostly contained Grxs of plants, in which monothiol Grxs formed a cluster (S1 to S8) separate from dithiol Grxs. The second group was largely dominated by the Grxs of bird, fish, mammal and fungus, but Grxs of nematode, arthropod and some plant Grxs also clustered in this group (Supplementary Fig. 4). Notably, Grxs of fungus and plant were in close proximity and separated at the beginning of the inner node, while Grxs of other organisms appeared as descendants of the fungal and plant Grxs. The close proximity of fungal and plant Grxs suggested a common ancestral relationship. Subgroup 2 contained Grxs of both prokaryotes and eukaryotes, whereas Grxs of archaea separated from those of other organisms at the very beginning (Supplementary Fig. 4). The mitochondrial Grxs shared ancestral relationships with the Grxs of bacteria, while Grx3, Grx4 and Grx5 of different organisms formed small out-groups. Subgroup 3, the smallest subgroup, contained a few Grxs of both prokaryotes and eukaryotes and was further divided into two distinct out-groups. One group contained Grxs of archaea, bacteria and fungus while the other primarily contained Grxs of eukaryotes along with bacterial and cyanobacterial Grxs. Importantly, subgroup 3 did not contain Grxs of plant and arthropod.

Trxs and Grxs have opposite electrostatic surface potential

We conducted a comparative analysis of Trxs and Grxs of distantly related organisms to better understand their evolutionary and structural relationship. To this end, Trxs and Grxs of E. coli, S. cerevisiae, Synechocystis sp. PCC 6803, D. melanogaster, A. thaliana, and H. sapiens were selected because their solved structures were available in the database, except for the Grx of D. melanogaster. We therefore modelled the Grx of D. melanogaster from a suitable template using the Swiss Model server [30]. We also modelled the Trx structure of A. thaliana, because the structure available in the PDB database was a 2D overlapped structure solved by NMR and structural analysis is difficult with such structures. The modelled structures were validated by Ramachandran plot analysis, in which 99% of residues fell within the allowed regions (data not shown). Similar to previous findings [15, 34], the core of Trxs and Grxs was made up of β strands while the catalytic motif was positioned on the surface of the two proteins (Figures 4 and 5; Supplementary Fig. 5). The anchor residues, which are crucial for the thermodynamic and redox properties of proteins and their catalytic activity [35], were present on the surface in close proximity to the active site (Figure 4). The surface of Trxs was dominated by negatively charged and neutral amino acids, while Grxs had neutral and positively charged amino acids on their surface. However, in both cases, the active site was constituted by neutral (uncharged) amino acids (Figure 4). The catalytic site of Trxs was surrounded by negatively charged amino acids, while positively charged amino acids surrounded the catalytic site in Grxs (Figure 4). The surface of both Trxs and Grxs was dominated by hydrophilic amino acids together with a few hydrophobic residues (Figure 5). However, the catalytic site in all Trxs and Grxs comprised both hydrophilic and hydrophobic amino acid residues (Figure 5). 
The presence of hydrophilic residues on the surface of all studied Trxs and Grxs suggested their soluble nature. Thus, Trxs and Grxs have similar protein surface properties except for their electrostatic potential.
Figure 4

Electrostatic surface potential views of thioredoxins (Trxs) and glutaredoxins (Grxs) structures of different organisms representing three domains of life. The positive surface potential is denoted by blue color and negative surface potential is shown in red color. The protein surface possessing a neutral charge is shown in white color. Anchor residues are shown on the surface using a single letter code.

Figure 5

Hydrophobicity surface views of thioredoxins (Trxs) and glutaredoxins (Grxs) structures of different organisms representing three domains of life. The hydrophilic regions present on the surface of the proteins are shown in cyan color while a hydrophobic portion of the protein surface is shown in orange color. Anchor residues are shown on the surface using a single letter code.


Trxs maintain domain architecture and topology which is not conserved in Grxs

Trxs and Grxs showed similar topology and had a conserved Trx/Grx fold, which was primarily composed of four β strands and at least three α helices (Supplementary Fig. 5). For better understanding, we divided the Trx/Grx fold into two domains, i.e., the N-terminal domain shown in orange and the C-terminal domain shown in green (Supplementary Fig. 5). The N-terminal domain of both Trxs and Grxs was composed of two parallel β strands and one α helix, while the C-terminal domain had two anti-parallel β strands and one α helix. It is important to mention that β1 and α1 are extra secondary structures found in all Trxs that were not part of the conserved fold (Supplementary Fig. 5a). The N-terminal and C-terminal domains of the two proteins were connected by an α helix, specifically the α3 helix in Trxs and the α2 helix in Grxs. The α2 and α4 helices of Trxs were located on one side of the central β-sheet while the α3 helix was located on the opposite side. Similarly, the α1 and α3 helices of Grxs were located on one side of the central β-sheet while the α2 helix was located on the opposite side (Supplementary Fig. 5b). The α3 helix of Trxs was oriented perpendicularly to the α2 and α4 helices, and the α2 helix of Grxs was oriented perpendicularly to the α1 and α3 helices (Supplementary Fig. 5). All studied Trx structures typically contained five β strands and four α helices. Similarly, all studied Grx structures possessed at least one α helix in addition to the three α helices of the Trx/Grx fold (Supplementary Fig. 6). For example, Grxs of E. coli, D. melanogaster, H. sapiens and A. thaliana had two additional α helices, one at the N-terminal and another at the C-terminal end. Cyanobacterial Grxs had one additional α helix at their C-terminal end, while yeast Grxs had two additional α helices, one at the N-terminal and one at the C-terminal end, as well as two antiparallel β strands (Supplementary Fig. 6). 
In summary, topology analysis suggested that all Trxs strictly maintained domain architecture and topology which was not conserved in Grxs.

Trxs and Grxs are structurally more different in eukaryotes than prokaryotes due to altered helical conformation

To decipher structural differences between Trxs and Grxs, we calculated structural similarity and computed the root mean square deviation (RMSD) between equivalent atom positions after optimal superimposition of the different structures. The global RMSD value of superimposed Trx and Grx structures of the same organism was higher in eukaryotes than in prokaryotes (Figure 6). This finding suggested that the structures of Trxs and Grxs are more similar in prokaryotes than in eukaryotes. The high RMSD values of superimposed Trx and Grx structures of eukaryotes suggested structural differences and involvement in distinct cellular processes (Figure 6). Importantly, Trxs and Grxs shared a common conformation of core β strands with minimum local RMSD values (data not shown). However, the two proteins differed largely in the conformation of α helices, which resulted in a high RMSD score (Figure 6). For example, the conformation of the α1 helix was different in the Trx and Grx of Arabidopsis, Drosophila, human, and yeast, while the two proteins showed different conformations of the α2 and α3 helices in E. coli and the cyanobacterium. Also, eukaryotic Trxs and Grxs showed a higher conformational change in the helices than their prokaryotic equivalents (Figure 6). We also calculated percent similarity, which corresponds to the number of residues or percentage of total residues matched in the aligned structures. The percent similarity value depends on both sequence and structural conservation of the proteins, and therefore, proteins having similar sequence and secondary structure possess higher percent identity. The percentage identities were higher for prokaryotic Trx and Grx than for eukaryotic ones, which suggested that the two proteins are more conserved in terms of their sequence and structure in prokaryotes than in eukaryotes (Figure 6).
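The RMSD after optimal superimposition described above can be illustrated with a short sketch. The text does not state which superposition algorithm or software was used, so the Kabsch algorithm shown here is an assumption chosen for illustration, and the function name kabsch_rmsd is hypothetical. It expects two equally sized sets of equivalent atom coordinates (for example, matched Cα atoms), which in practice would come from a prior structural alignment.

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Row i of P and row i of Q must describe equivalent atoms.
    Hypothetical helper for illustration; not the authors' pipeline.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # SVD of the 3x3 covariance matrix yields the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T
    # Root mean square of the per-atom distances after superposition.
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))
```

A global RMSD would use all matched atoms, whereas a local RMSD (such as the one quoted for the core β strands) would pass only the coordinates of the residues of interest.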
Figure 6

Comparative structure analysis of thioredoxins (Trxs) and glutaredoxins (Grxs). Secondary structure alignment and comparison between Trxs and Grxs in six different organisms representing three domains of life were made using superimposed structure. The structures of Trx and Grx from the same organism were superimposed and structural similarities were computed based on RMSD and percentage identity. Trx structures are shown in brown color while Grx structures are shown in cyan color.


Trxs differ in distantly related organisms but conserved conformation of anchor residues

To assess the structural and functional diversity of Trxs, we conducted a comparative structural analysis of Trxs of different organisms based on percent similarity and RMSD value. All studied Trxs of prokaryotes and eukaryotes had conserved conformation of core β strands, helices and loops (Figure 7). However, RMSD values of superimposed Trx structures were higher for distantly related organisms and lower for closely related organisms (Figure 7). The E. coli Trx had its minimum RMSD value and maximum percent similarity with cyanobacterial Trx. Similarly, cyanobacterial Trx showed higher similarity to E. coli Trx than to any other studied structure (Figure 7). Yeast Trx showed maximum similarity to human Trx; Drosophila Trx was similar to human Trx and vice versa, and A. thaliana Trx shared maximum similarity with human Trx (Figure 7).
Figure 7

Matrix-based comparative structure analysis of thioredoxin (Trx) and glutaredoxin (Grx) structures of different organisms. Secondary structure alignment and comparison of Trxs and Grxs in six different organisms representing three domains of life were made using superimposed structures. The structures of Trxs and Grxs found in the different organisms were superimposed based on the matrix and structural similarities were computed based on RMSD and percentage identity. Trx and Grx structures of different organisms are shown in different colors as given below the image.

In addition to the conserved fold, Trx structures have amino acid residues known as anchor residues, which include the catalytic residues C32, G33, P34 and C35 along with D26, V25, F27, A29, W31, P40, F42, D61, P76, T77, and G92 [35]. We report that some of the anchor residues in the studied Trxs were replaced by other residues having similar chemical properties (Figure 1; Supplementary Fig. 7). The 42nd position in E. coli Trx was occupied by leucine. In cyanobacteria, tyrosine, isoleucine and alanine were present at the 26th, 42nd and 77th positions, while leucine replaced V25 and F42 in D. melanogaster. Similarly, F42 was replaced by I42 in the Trx of yeast. We analyzed the conformation of these anchor residues by superimposition of the structures (Supplementary Fig. 7). A low local RMSD score for anchor residues (data not shown) suggested that Trxs strictly conserved the conformations of structurally and functionally important residues even in distantly related organisms, despite substitutions and other structural differences.
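The catalytic residues named above (C32, G33, P34, C35) form a Cys-X-X-Cys motif. As a minimal sketch of how such motifs could be located in a one-letter amino acid sequence, the snippet below scans for every CXXC occurrence with a regular expression. The function name find_cxxc_motifs and the toy sequence in the usage note are hypothetical, and positions are reported relative to the supplied sequence, not in the reference numbering used for the anchor residues.

```python
import re

# Cys-X-X-Cys where X is any of the 20 standard amino acids (one-letter codes).
# The lookahead makes overlapping motifs visible as separate hits.
CXXC = re.compile(r"(?=(C[ACDEFGHIKLMNPQRSTVWY]{2}C))")

def find_cxxc_motifs(sequence):
    """Return a list of (1-based start position, motif) for each CXXC hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC.finditer(seq)]
```

For the made-up sequence "MKWCGPCKAACPYCT", the scan reports a CGPC motif starting at position 4 and a CPYC motif starting at position 11.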

Grxs possess flexible active site but strictly conserved conformation of cis-proline and GG-kink

The structural conformation of Grxs varied from organism to organism; however, the structural fold was the same in all studied organisms (Figure 7). All Grxs shared a common conformation of core β strands and possessed more loop regions than Trxs (Figure 7; Supplementary Fig. 6). The higher number of loop regions made the structure of Grxs more flexible than that of Trxs. However, the conformation of helices and loops was less conserved in Grxs than in Trxs. The conformation of eukaryotic Grxs was more conserved than that of prokaryotic Grxs (Figure 7). Interestingly, E. coli Grx showed higher similarity with Drosophila, human and Arabidopsis Grxs. Similar to Trxs, Grxs possessed structurally and functionally important anchor residues, which included catalytic residues, glutathione (GSH) binding residues, cis-proline, and the GG-kink [15]. We analyzed the 3D conformation of catalytic residues, cis-proline, and GG-kink by superimposition of the studied Grx structures, and did not include the GSH binding site due to variability in its amino acid residues (Supplementary Fig. 8). However, the position of amino acids in the GSH binding site was strictly conserved in all Grxs. In prokaryotes, GSH binding site residues were located at two positions, i.e., one upstream and the other downstream of the catalytic site, while eukaryotes had GSH binding site residues at three positions, i.e., one upstream and two downstream of the catalytic site. Results of the 3D conformational analysis of superimposed structures indicated that Grxs have a more flexible active site than Trxs, while cis-proline and GG-kink residues conserved their 3D conformation during evolution (Supplementary Figs. 7 and 8). Also, the active site conformation of E. coli Grx was analogous to that of human and Arabidopsis Grxs, while Grxs of human and Arabidopsis had similar conformation of catalytic residues (Supplementary Fig. 8). 
Overall, structural analyses suggested that the conformation of active sites of Trxs is highly conserved in different organisms despite variation in their overall structural conformation. In contrast, the active sites of Grxs were more flexible in terms of their structural conformation. This suggested that Trxs are more specific for their substrates than Grxs. Thus, Grxs could target different substrates and show functional plasticity. This could be also a possible explanation for Grxs to act as a backup system for Trxs depending on their electrostatic surface potential, especially, in prokaryotes where Trxs and Grxs are structurally more similar [5, 8].

Discussion

We conducted in silico amino acid sequence, phylogenetic, and comparative structural analyses to decipher the evolutionary and structural relationship between Trxs and Grxs of 12 organisms from three domains of life. In total, 153 protein sequences, including 88 Grxs and 65 Trxs of distantly related organisms, were analyzed, and the outcomes suggested that Trxs possess a more rigid and conserved catalytic site than Grxs (Tables 2 and 3; Supplementary Figs. 7 and 8). This observation supported the view that Trxs are more specific for their substrates than Grxs, which have flexible active sites due to variation in their amino acid composition [2, 5, 16]. However, the N-terminal cysteine residue of the active site was conserved in all Grxs, which supported its proposed essentiality for initiating the reaction [15, 36]. In addition to active site residues, several other amino acid residues in Trxs and Grxs were found to be conserved during MSA analysis (Supplementary Fig. 1). Notably, these residues were more abundant in Trxs than in Grxs and could be required for maintaining the 3D structure and redox properties of the two proteins; however, this proposition needs to be experimentally validated. Generally, Trxs have a conserved CGPC active site motif; however, we report Trxs with different catalytic motifs [37]. The maximum variation in the active sites of Trxs was found in A. thaliana, which had 2 CGGC (Trx4 and Trx5), 3 CPPC (Trx h3, h4 and h5), 1 CVPC (Trx h10), 1 CGSC (Trx1), 1 CASC (Trx2) and 1 CRKC (Trx2) Trxs in the cytosol (Table 2). Similarly, Trx2 and Trx5 of A. veneficus had CPYC and CPSC active sites, respectively, and C. elegans had a CGHC active site motif in Trx5 (Table 2). We report the presence of a protein disulfide isomerase (PDI) family domain in Trx5 of C. elegans. PDI is a redox-active protein commonly found in eukaryotes that catalyzes oxidative protein folding in the endoplasmic reticulum (ER) [38, 39]. 
PDI can also reduce disulfide bonds, prevent protein aggregation, and facilitate the folding of newly synthesized proteins by acting as a chaperone [40]. These proteins usually contain multiple redox-active Trx domains with a CXXC active site motif and may also possess one or more redox-inactive Trx-like domains [38]. Based on domain information, it is predicted that Trx5 of C. elegans could participate in oxidative protein folding; however, further investigation is required to confirm the proposed role of Trx5 in this organism. Notably, multimodular Grxs were observed in higher organisms while prokaryotes such as E. coli, A. veneficus and S. elongatus PCC 7942 had single-module Grxs (Table 3). However, it is important to mention that cyanobacterial class V and VI Grxs are multimodular proteins that were absent in freshwater S. elongatus PCC 7942. S. elongatus PCC 7942 possesses class I and II Grxs while class V and VI Grxs are exclusively found in a few marine cyanobacteria [15]. We report several multimodular Grxs in which monothiol Grxs of eukaryotes contain one N-terminal Trx domain and one or more Grx domains, similar to PICOT proteins (Table 3). PICOT from the plant contains three repeats of the Grx-like domain, metazoans other than insects have two repeats, while fungus contains only one Grx domain (Table 3). PICOT proteins are glutaredoxin-3 or Protein Kinase C (PKC) interacting proteins that show homology to Trxs and Grx-PICOT-like proteins [41, 42]. We identified two HEAT repeats in Grx1 of A. veneficus and a GIY-YIG domain in Grx S16 of A. thaliana (Table 3). The HEAT repeat is a tandem repeat of a 37–47 amino acid long module found in several cytoplasmic proteins [43]. HEAT repeat-containing proteins are involved in intracellular transport processes where the HEAT repeat domain facilitates protein-protein interaction [44]. 
The GIY-YIG domain-containing proteins are involved in several cellular processes such as DNA repair and recombination, transfer of mobile genetic elements, genomic stability and restriction of foreign DNA [45, 46, 47]. Thus, the presence of HEAT and GIY-YIG domains in Grxs suggests a role in intracellular transport and maintenance of genomic DNA, respectively. However, further experimental evidence is required to support this proposal. Together, these observations suggested that the prevalence of additional domains and the flexibility of catalytic sites confer higher functional diversity on Grxs. The analysis of amino acid composition indicates how the frequency of occurrence of different amino acids in a family of proteins changed during evolution [48]. Also, the amino acid composition of a protein plays an important role in determining its structure, biological function and cellular localization. Flexibility in the frequency of occurrence of different amino acids was observed in both Trxs and Grxs; however, the two proteins preserved approximately the same amount of non-polar amino acids in their structures (Figures 1 and 2; Supplementary Tables 1 and 2). The straight-chain non-polar amino acids are required for helix formation [49], and therefore, it is proposed that the three helices of the Trx/Grx fold were maintained during evolution by preserving non-polar amino acids in both Trxs and Grxs of different organisms. Similarly, Trxs and Grxs had the lowest amount of aromatic amino acids, particularly tryptophan (Figures 1 and 2; Supplementary Tables 1 and 2). Phenylalanine, tyrosine and tryptophan are typically hydrophobic, but compared to common hydrophobic residues such as leucine and valine, aromatic amino acids play an important role in structural conformation [50]. For example, tyrosine and tryptophan contribute to hydrogen bond formation while tryptophan is involved in the cation-π interaction. 
It is a strong non-covalent binding interaction that contributes to the secondary structure of proteins and protein-ligand interactions [50]. Thus, the low frequency of occurrence of aromatic amino acids is the characteristic feature of Trxs and Grxs proteins. The specific requirement of cysteine for catalytic reaction [15, 36] could be responsible for their conserved frequency in both Trxs and Grxs (Figures 1 and 2; Supplementary Table 1 and 2). All studied Trxs and Grxs had almost similar percentage of different amino acids in their sequence, however, the presence or absence of one or more charged amino acids in their sequence resulted in a wide range of theoretical pI (Figures 1 and 2; Tables 2 and 3; Supplementary Tables 1 and 2). The MSA analysis confirmed a similar percentage of different amino acids in Trxs and Grxs but their position in the peptide chain was different (Figures 1 and 2; Supplementary Tables 1 and 2; Supplementary Figs. 1 and 2). This observation is important as Trxs and Grxs are found in subcellular compartments [1, 3, 5]. The change in the frequency of the amino acids and their position leads to a change in the theoretical pI value which could help proteins to adjust with the environment of subcellular compartments. However, a further experimental investigation by point mutation studies is required to support the proposition that the position of certain amino acids in a peptide chain can help proteins in adjusting to environment of their intracellular location. The phylogenetic analysis of 153 Trxs and Grxs from 12 different organisms suggested that two proteins originated from a common ancestor and diverged later during evolution to form two groups of proteins (Figure 3). Also, monothiol and dithiol Grxs appeared as two separate clades which indicated their ancestral relationship, and based on our analysis, we propose that monothiol Grxs originated from dithiol Grxs (Figure 3). 
This proposal is supported by a previous study in which the phylogenetic tree of Grxs was clearly divided into two distinct groups of dithiol and monothiol Grxs [51]. However, it is important to mention that in the previous study the two groups of Grxs separated from a common ancestor at the very beginning of the phylogenetic tree, in contrast with the present study, where Trxs were also part of the analysis (Figure 3). Further detailed but separate phylogenetic analyses of Trxs and Grxs (Supplementary Figs. 3 and 4) supported the endosymbiotic theory, which suggests that the mitochondria and chloroplasts of a eukaryotic cell originated from endosymbiosis of alpha-proteobacteria and cyanobacteria, respectively [52]. Also, the results of the phylogenetic study suggested that Trx and Grx sequences can be used to establish an evolutionary lineage in molecular systematics. Plants possess substrate-specific Trxs such as the f, m, x, y, h, and o types that carry out diverse cellular functions in different cellular compartments [53]. Trx f, Trx m, Trx x and Trx y are plastidial Trxs, Trx o is mitochondrial, while Trx h is a cytoplasmic Trx. Trx f activates fructose 1,6-bisphosphatase (FBPase), Trx m activates malate dehydrogenase (MDH), and Trxs x and y reduce 2-Cys peroxiredoxins (Prx) and PrxQ. Trx o regulates the activity of enzymes involved in the TCA cycle, while the specific substrate(s) of Trx h are still unknown; due to the absence of any signal sequence, Trx h is considered to control the redox level in the cytoplasm [53]. Trxs f1 and f2 of A. thaliana were found close to Trx4 and Trx5 of A. veneficus in the phylogenetic analysis of Trxs, and therefore, it appears that plant Trx f has an archaebacterial origin (Supplementary Fig. 3). In contrast, other Trxs of A. thaliana, except Trx y and Trx m, appeared in the same clade as eukaryotic Trxs, which suggested their eukaryotic origin (Supplementary Fig. 3). 
Trx y and Trx m were found in the chloroplast and showed a close relationship with cyanobacterial Trxs, which supported the endosymbiotic origin of the chloroplast from cyanobacteria. However, it is noticeable that some of the mitochondrial Trxs of S. cerevisiae and A. thaliana shared a eukaryotic origin (Supplementary Fig. 3). This observation suggested that these mitochondrial Trxs evolved later and that their coding sequences were incorporated into the mitochondrial genome. It was proposed earlier that both f- and h-type Trxs are of archaebacterial origin [54]; however, here we report that the h-type Trx of A. thaliana showed eukaryotic ancestry (Supplementary Fig. 3). Similarly, six divergent sequences of archaebacterial Trxs were reported to have originated from animal and eubacterial ancestors [54]. However, we report that not all but only a few archaebacterial Trxs, such as Trx1, Trx2 and Trx3, share an ancestral relationship with animals and eubacteria (Supplementary Fig. 3). The phylogenetic analysis of Grxs suggested that CC-type Grxs of A. thaliana are of eukaryotic origin, while Grx3 of cyanobacteria shared a close relationship with Grx1 of archaebacteria as well as with eukaryotic Grxs. Owing to the presence of two prokaryotic sequences in this clade, it is proposed that these Grxs share a common prokaryotic origin (Supplementary Fig. 4). The presence of chloroplastidial and mitochondrial Grxs in the same clade supported the endosymbiotic theory and their prokaryotic origin [51]. Similar to Trxs, some Grxs of A. veneficus shared an ancestral relationship with eukaryotes and eubacteria, which suggested their common origin. The structural conformation of proteins impacts their function, and therefore, proteins of the same family conserve their overall structural conformation despite differences in their amino acid sequence. All protein structures studied were found to have two regular states, i.e., α-helix and β-strand. 
The remaining unassigned regions, known as the irregular state (coil), corresponded to a large number of different conformations (Supplementary Figs. 5 and 6). Notably, all studied Trxs and Grxs had a common structural Trx/Grx fold despite differences in their amino acid sequences (Figures 1 and 2). The robust structure of the proteins despite differences in their amino acids suggested that errors during gene replication events permitted variation in proteins during evolution [55, 56, 57, 58]. However, the two cysteine residues C32 and C35 of the active site and their structural conformation were highly conserved in all Trxs (Figures 4 and 5; Supplementary Fig. 7). Similarly, three conserved prolines were present in all studied Trxs (Figures 4 and 5; Supplementary Fig. 7). The first proline was situated in the catalytic CGPC motif (P34) and is the key residue for reducing power, as its substitution by a serine or a threonine affected the redox and stability properties of Trxs [59, 60, 61]. We report substitution at the 34th position of Trxs by residues such as tyrosine, serine, glycine, or lysine (Table 2). Notably, the substitution of P34 by amino acids having similar properties to proline did not alter the overall structure of the protein; however, the effect of these substitutions on the function of Trxs needs to be investigated. The second conserved proline, i.e., P40, introduces a kink in the α2 helix that separates the active site CGPC motif from the rest of the helix (Figures 4 and 5; Supplementary Fig. 7). Thus, P40 helps in the proper positioning of the catalytic site in the α2 helix. However, substitution of P40 destabilizes the structure of the protein without affecting the redox properties [62, 63]. The third proline, i.e., P76, was positioned on the opposite side of the CGPC active site motif and was always found in cis-conformation (Figures 4 and 5; Supplementary Fig. 7). 
P76 is essential for maintaining the conformation of the active site and the redox potential of the protein, and its substitution by alanine resulted in decreased catalytic efficiency of Trx [64]. The conserved threonine (T77) situated next to P76 is involved in structuring the area opposite to the CGPC active site motif [65]. The G33 of the active site motif CGPC helps in maintaining the conformation of the active site and also influences the redox potential of Trxs [65]. G33 and P34 collectively provide a flat surface around the active site due to the absence of protruding side chains in these amino acids. G84 and G92 determine the length of the β5 strand, while F12, present in the N-terminal α1 helix, is required for correct positioning of the α1 helix (Figures 4 and 5; Supplementary Fig. 7). Also, F12 together with F27, found at the C-terminal end of the β2 strand, and isoleucine and valine residues of the central β sheet create a hydrophobic site which was proposed to act as a site for interaction with other proteins [65]. The W31 residue is important for the thermodynamic stability of Trxs and interacts with A29, located in a turn, through van der Waals interaction (Supplementary Fig. 7) [66]. Owing to the small size of alanine, A29 is known to prevent a shift in the position of the indole side chain of W31. However, it is important to mention that substitution of W31 by alanine resulted in a domain-swapped dimer that caused loss of biochemical activities of Trx-fold containing proteins [66]. The conserved D26 and K57 residues are part of a charged region present between the core β sheet and the kinked α2 helix, and D26 is the key residue that is considered to activate the nucleophilic activity of C35 of the active site motif [61, 68]. All studied Grxs shared a common Trx/Grx fold with Trxs, but in this study we report the presence of additional secondary structures in Grxs that varied in different organisms (Supplementary Fig. 6). 
This observation suggested that the topology of Grxs varies from organism to organism but domain architecture is always maintained. Also, dithiol and monothiol Grxs conserved the residues that were important for their structural conformation and function (Figures 4, 5, 6, and 7; Supplementary Fig. 8). These residues included the catalytic site motif, glutathione binding site, cis-proline and GG-kink, which were conserved in all classes of Grxs (Supplementary Fig. 8). Importantly, the residues constituting the glutathione binding site may vary among different classes of Grxs; however, their position in the peptide chain remains highly conserved. The conserved cis-proline and GG-kink are signature residues present in all Grxs; however, the structural and functional importance of these residues in Grxs of different organisms is still not well studied (Figures 4 and 5; Supplementary Fig. 8). The cis-proline is known to play a significant role in protein folding and redox dynamics, while the two glycines in the GG-kink are required for proper orientation of the α3 helix [15, 69]. The substitution of either Gly115 or Gly116 with valine or serine residues resulted in the loss of yeast Grx5 function [69]. Generally, Trxs and Grxs were found to have cysteine residues only in their catalytic site; however, human cytosolic Trx1 contains three additional structural cysteine residues (C62, C69, and C73). These structural cysteine residues play an important role in substrate recognition and dimerization, and regulate the activity of thioredoxin reductase [70]. In summary, despite variation in the amino acid composition, anchor residues and their conformation are highly conserved in Trxs and Grxs of distantly related organisms. This observation suggests that the specific position of highly conserved anchor residues is important for proper functioning of Trxs and Grxs. 
Trxs and Grxs were characterized by the presence of the conserved Trx/Grx fold and anchor residues; however, the two proteins surprisingly had opposite electrostatic surface potentials (Figure 4). All Trxs had a negative electrostatic surface potential while Grxs had a positive one, and in both proteins this surface surrounded a catalytic site constituted by uncharged amino acids (Figure 4). The electrostatic surface potential of a protein dictates the binding of substrates and/or ligands, and recently, Trxs and Grxs were classified based on their surface electrostatic charges [5]. Interestingly, distantly related Trxs and Grxs clustered together when automated clustering of electrostatic surface potential properties was performed for their functional classification and function prediction [5]. Here, it is proposed that recognition of target proteins could be regulated by attractive and repulsive electrostatic surface potential. However, further studies targeting substrates of Trxs and Grxs are required to validate this proposal. It should be noted that electrostatic surface potential is an emerging global property of Trxs and Grxs [5], which can be used together with structural similarity (Figures 6 and 7) for functional classification and for explaining their substrate specificity and/or redundancy.
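As a rough, sequence-only illustration of the sign difference described above, the following Python sketch counts formally charged residues. This is a deliberately crude proxy and an assumption of this example: a true electrostatic surface potential is computed from a three-dimensional structure (typically with a Poisson-Boltzmann solver), which this sketch does not attempt. The function name `net_formal_charge` and both toy sequences are hypothetical and are not taken from the study.

```python
def net_formal_charge(sequence):
    """Return (K + R) minus (D + E) for a protein sequence.

    Histidine, the termini, and any structural context are ignored,
    so this is only a first-pass indicator of the overall charge sign.
    """
    seq = sequence.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


# Hypothetical, made-up sequences (NOT real Trx or Grx sequences).
TOY_SEQUENCES = {
    "toy_acidic": "MSDEIEELCGPCDDE",
    "toy_basic": "MSKKIRKLCPYCKRA",
}

if __name__ == "__main__":
    for name, seq in TOY_SEQUENCES.items():
        print(name, net_formal_charge(seq))
```

A negative value hints at an overall acidic protein and a positive value at a basic one, but only a structure-based calculation can show how charge is distributed around the catalytic site.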

Conclusions

Trxs and Grxs show variation in their amino acid sequences; however, this sequence diversity does not alter their structural fold, even in distantly related organisms. The structural conformation of Trxs, along with their anchor residues, is conserved throughout evolution, whereas the structure and active site of Grxs are more flexible. The dynamic catalytic site and the presence of an additional module in Grxs could permit them to exhibit versatile substrate catalysis and reaction mechanisms. Also, flexibility in the catalytic site and overlapping electrostatic surface potential could permit Grxs to act as a backup system for Trxs, especially in prokaryotes, where the two proteins are more similar than in eukaryotes.

Declarations

Author contribution statement

Soumila Mondal: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Wrote the paper. Shailendra P. Singh: Conceived and designed the experiments; Analyzed and interpreted the data; Wrote the paper.

Funding statement

Dr Shailendra P. Singh was supported by Science and Engineering Research Board [ECR/2016/000578]. This work was supported by the funding from the Institute of Eminence incentive grant, Banaras Hindu University (R/Dev/D/IOE/Incentive/2021-2022/32399).

Data availability statement

Data included in article/supp. material/referenced in article.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.
  66 in total

1.  WebLogo: a sequence logo generator.

Authors:  Gavin E Crooks; Gary Hon; John-Marc Chandonia; Steven E Brenner
Journal:  Genome Res       Date:  2004-06       Impact factor: 9.043

2.  Inhibition of the c-Jun N-terminal kinase/AP-1 and NF-kappaB pathways by PICOT, a novel protein kinase C-interacting protein with a thioredoxin homology domain.

Authors:  S Witte; M Villalba; K Bi; Y Liu; N Isakov; A Altman
Journal:  J Biol Chem       Date:  2000-01-21       Impact factor: 5.157

3.  The rapid generation of mutation data matrices from protein sequences.

Authors:  D T Jones; W R Taylor; J M Thornton
Journal:  Comput Appl Biosci       Date:  1992-06

Review 4.  Thioredoxin and glutaredoxin systems in plants: molecular mechanisms, crosstalks, and functional significance.

Authors:  Yves Meyer; Christophe Belin; Valérie Delorme-Hinoux; Jean-Philippe Reichheld; Christophe Riondet
Journal:  Antioxid Redox Signal       Date:  2012-06-08       Impact factor: 8.401

5.  HEAT repeats in the Huntington's disease protein.

Authors:  M A Andrade; P Bork
Journal:  Nat Genet       Date:  1995-10       Impact factor: 38.330

6.  Structural and functional characterization of the mutant Escherichia coli glutaredoxin (C14----S) and its mixed disulfide with glutathione.

Authors:  J H Bushweller; F Aslund; K Wüthrich; A Holmgren
Journal:  Biochemistry       Date:  1992-09-29       Impact factor: 3.162

7.  Mutation of conserved residues in Escherichia coli thioredoxin: effects on stability and function.

Authors:  F K Gleason
Journal:  Protein Sci       Date:  1992-05       Impact factor: 6.725

8.  Evolution and diversity of glutaredoxins in photosynthetic organisms.

Authors:  Jérémy Couturier; Jean-Pierre Jacquot; Nicolas Rouhier
Journal:  Cell Mol Life Sci       Date:  2009-06-09       Impact factor: 9.261

Review 9.  Endosymbiotic theory for organelle origins.

Authors:  Verena Zimorski; Chuan Ku; William F Martin; Sven B Gould
Journal:  Curr Opin Microbiol       Date:  2014-10-10       Impact factor: 7.934

10.  Substrate specificity of thioredoxins and glutaredoxins - towards a functional classification.

Authors:  Manuela Gellert; Md Faruq Hossain; Felix Jacob Ferdinand Berens; Lukas Willy Bruhn; Claudia Urbainsky; Volkmar Liebscher; Christopher Horst Lillig
Journal:  Heliyon       Date:  2019-12-17