
In silico analysis of suitable signal peptides for secretion of a recombinant alcohol dehydrogenase with a key role in atorvastatin enzymatic synthesis.

Mortaza Taheri-Anganeh, Seyyed Hossein Khatami, Zeinab Jamali, Amir Savardashtaki, Younes Ghasemi, Zohreh Mostafavi-Pour.

Abstract

An elevated cholesterol level might lead to cardiovascular disease (CVD). Statins block the cholesterol synthesis pathway in the liver. Atorvastatin is the most widely used statin worldwide, and its chemical synthesis requires toxic catalysts, resulting in environmental pollution. Hence, enzymatic synthesis of atorvastatin is desirable. This process could be performed by Lactobacillus kefir alcohol dehydrogenase (LKADH). Secretion of the recombinant enzyme by Escherichia coli using signal peptides (SPs) might allow easy production and purification. To achieve this objective, we used several online bioinformatics web servers to evaluate suitable SPs for translocation of LKADH into the extracellular space. "Signal Peptide Website" and "UniProt" were used to retrieve the SP and LKADH sequences. "SignalP 4.1" was used to identify SPs and the location of their cleavage sites, and the results were rechecked with "Philius". Physicochemical features of the SPs were evaluated with "ProtParam", and the solubility of their fusions with LKADH was then assessed with "Protein-sol". Finally, the secretion pathway and sub-cellular localization of the selected stable and soluble LKADH fusions were predicted with "PRED-TAT" and "ProtCompB". Amongst the 41 evaluated SPs, only LPTA_ECOLI, SUBF_BACSU, CHIS_BACSU, SACB_BACAM, CDGT_BACST and AMY_BACLI could translocate LKADH out of the cytoplasm. These six SPs were suitable for designing a soluble secretory LKADH, which could accelerate its scale-up production and might be useful in future experimental research.

Keywords:  Alcohol dehydrogenase; Atorvastatin; In silico; Signal peptide

Year:  2019        PMID: 31528640      PMCID: PMC6510209          DOI: 10.22099/mbrc.2019.31801.1372

Source DB:  PubMed          Journal:  Mol Biol Res Commun        ISSN: 2322-181X


INTRODUCTION

Cardiovascular disease (CVD) is one of the major causes of morbidity and mortality worldwide [1]. An increased level of low-density lipoprotein cholesterol (LDL-C) is considered the main risk factor for CVD. 3-Hydroxy-3-methylglutaryl-coenzyme A (HMG-CoA) reductase inhibitors (‘statins’) were introduced in the late 1980s to prevent cardiovascular disease. Statins are potent inhibitors of HMG-CoA reductase; they block the cholesterol synthesis pathway in the liver and reduce major cardiovascular events [2]. Statins might also interfere with other biological pathways and have several potential therapeutic effects [3]. Atorvastatin is a statin that reduces LDL cholesterol, which lowers the mortality rate due to coronary heart disease. Lipitor® (atorvastatin calcium) is one of the best-selling drugs in the world. The side chain of atorvastatin has two chiral centers, whose synthesis is a critical step in the atorvastatin synthesis process [4,5]. The chemical synthesis of atorvastatin needs expensive catalysts and causes severe environmental pollution. Therefore, suitable enzymes can reduce costs and prevent environmental toxicity. Lactobacillus kefir alcohol dehydrogenase (LKADH) was introduced in 1990 by Hummel et al. LKADH has been shown to be a beneficial enzyme for the industrial production of atorvastatin, since it can synthesize the side chain of atorvastatin by reducing tert-butyl 6-chloro-3,5-dioxohexanoate to tert-butyl (S)-6-chloro-5-hydroxy-3-oxohexanoate, using nicotinamide adenine dinucleotide phosphate (NADPH) in a fed-batch system. NADPH is an expensive reagent and the limiting factor in the production of the atorvastatin side chain by LKADH; hence, its regeneration is of significant importance, especially at industrial scale.
On the other hand, LKADH as a single-enzyme system can effectively regenerate NADPH using a cost-efficient solvent such as ethanol while synthesizing the product. This dual function is considered a significant advantage [6,7]. Recombinant DNA technology can help the industrial production of proteins by reducing the cost and increasing the efficiency of bioprocesses. Escherichia coli (E. coli) is one of the most widely used expression hosts for producing heterologous recombinant proteins [8]. High-level expression of recombinant proteins in E. coli can result in extensive aggregation of insoluble, misfolded proteins in the cytoplasm, known as inclusion bodies. These aggregated intermediates lack suitable biological activity. Hence, the inclusion bodies have to be refolded to obtain the appropriate soluble proteins [9]. One solution is to synthesize secretory recombinant proteins, which are exported into the periplasmic space or culture medium. Protection from protease attack, easier purification, correct formation of disulfide bonds and accurate protein folding are the advantages of secretory production of recombinant proteins over cytoplasmic expression [10]. For secretory production, recombinant proteins can be guided to the periplasmic space or culture medium by fusing suitable signal sequences to their N-terminus. There are several common translocation systems in E. coli, including the Sec system, the signal recognition particle-dependent (SRP-dependent) pathway, and the twin-arginine translocation (TAT) system. It is possible to increase the efficiency of a translocation system using alternative signal peptides (SPs), which might be obtained from heterologous species. Using SPs can significantly increase protein production at a commercial level. An SP has various motifs that are necessary to target a specific protein to the extra-cytoplasmic spaces.
It is located at the N-terminus of the immature protein and can be removed by signal (leader) peptidase. SPs are usually 15-30 amino acids long and include three distinct regions. Generally, the n-region consists of 5-8 positively charged residues, the h-region is composed of 8-12 hydrophobic residues, and the c-region contains 5-7 polar residues, which include the cleavage site at the carboxyl terminus [11]. Various computational approaches have been applied to predict suitable N-terminal SPs, and different bioinformatics tools have been used to predict the presence of SPs and their locations [12]. In the present study, several online web servers were used to investigate suitable SPs for secretory production of LKADH. To the best of our knowledge, there are no previous in silico studies on the secretory production of LKADH.

MATERIALS AND METHODS

Dataset retrieval: “Signal Peptide Website” (http://www.signalpeptide.de/) was used to retrieve 41 appropriate SPs. The SPs were chosen according to two criteria: they were marked as confirmed in this database, and they belonged to bacterial secretory proteins. The collected data were validated against the experimental evidence recorded in the “UniProt” server (http://www.uniprot.org/). The amino acid sequence of LKADH was retrieved from UniProt. The SP sequences are listed in Table 1.
Table 1

Amino acid sequences of the signal peptides

Accession No. (UniProt) | Signal peptide | Protein name | Source | Amino acid sequence
P00634 | PPB_ECOLI | Alkaline phosphatase | Escherichia coli (strain K12) | MKQSTIALALLPLLFTPVTKA
P02932 | PHOE_ECOLI | Outer membrane pore protein E | Escherichia coli (strain K12) | MKKSTLALVVMGIVASASVQA
P0A910 | OMPA_ECOLI | Outer membrane protein A | Escherichia coli (strain K12) | MKKTAIAIAVALAGFATVAQA
P02931 | OMPF_ECOLI | Outer membrane protein F | Escherichia coli (strain K12) | MMKRNILAVIVPALLVAGTANA
P09169 | OMPT_ECOLI | Protease 7 | Escherichia coli (strain K12) | MRAKLLGIVLTTPIAISSFA
P06996 | OMPC_ECOLI | Outer membrane protein C | Escherichia coli (strain K12) | MKVKVLSLLVPALLVAGAANA
P13811 | ELBH_ECOLX | Heat-labile enterotoxin B chain | Escherichia coli | MNKVKFYVLFTALLSSLCAHG
P02943 | LAMB_ECOLI | Maltoporin | Escherichia coli (strain K12) | MMITLRKLPLAVAVAAGVMSAQAMA
P0AEX9 | MALE_ECOLI | Maltose-binding periplasmic protein | Escherichia coli (strain K12) | MKIKTGARILALSALTTMMFSASALA
P0AEG4 | DSBA_ECOLI | Thiol:disulfide interchange protein dsbA | Escherichia coli (strain K12) | MKKIWLALAGLVLAFSASA
P0AEE5 | DGAL_ECOLI | D-galactose-binding periplasmic protein | Escherichia coli (strain K12) | MNKKVLTLSAVMASMLFGAAAHA
P38683 | TORT_ECOLI | Periplasmic protein torT | Escherichia coli (strain K12) | MRVLLFLLLSLFMLPAFS
P0A855 | TOLB_ECOLI | Protein tolB | Escherichia coli (strain K12) | MKQALRVAFGFLILWASVLHA
P22542 | HSTI_ECOLX | Heat-stable enterotoxin II | Escherichia coli (strain K12) | MKKNIAFLLASMFVFSIATNAYA
P62593 | BLAT_ECOLX | Beta-lactamase TEM | Escherichia coli | MSIQHFRVALIPFFAAFCLPVFA
P00805 | ASPG2_ECOLI | L-asparaginase 2 | Escherichia coli (strain K12) | MEFFKKTALAALVMGFSGAALA
A2TJI4 | CEXE_ECOLX | Protein cexE | Escherichia coli | MKKYILGVILAMGSLSAIA
P05458 | PTRA_ECOLI | Protease 3 | Escherichia coli (strain K12) | MPRSTWFKALLLLVALWAPLSQA
P45523 | FKBA_ECOLI | FKBP-type peptidyl-prolyl cis-trans isomerase | Escherichia coli | MKSLFKVTLLATTMAVALHAPITFA
P69776 | LPP_ECOLI | Major outer membrane lipoprotein | Escherichia coli (strain K12) | MKATKLVLGAVILGSTLLAG
P31550 | THIB_ECOLI | Thiamine-binding periplasmic protein | Escherichia coli (strain K12) | MLKKCLPLLLLCTAPVFA
Q47537 | TAUA_ECOLI | Taurine-binding periplasmic protein | Escherichia coli (strain K12) | MAISSRNTLLAALAFIAFQAQA
P23857 | PSPE_ECOLI | Phage shock protein E | Escherichia coli (strain K12) | MFKKGLLALALVFSLPVFA
P07102 | PPA_ECOLI | Periplasmic appA protein | Escherichia coli (strain K12) | MKAILIPFLSLLIPLTPQSAFA
P34210 | OMPP_ECOLI | Outer membrane protease ompP | Escherichia coli (strain K12) | MQTKLLAIMLAAPVVFSSQEASA
P24093 | DRAA_ECOLX | Dr hemagglutinin structural subunit | Escherichia coli (strain K12) | MKKLAIMAAASMVFAVSSAHA
P0A915 | OMPW_ECOLI | Outer membrane protein W | Escherichia coli (strain K12) | MKKLTVAALAVTTLLSGSAFA
P0AFI5 | PBP7_ECOLI | D-alanyl-D-alanine endopeptidase | Escherichia coli (strain K12) | MPKFRVSLFSLALMLAVPFAPQAVA
P33590 | NIKA_ECOLI | Nickel-binding periplasmic protein | Escherichia coli (strain K12) | MLSTLRRTLFALLACASFIVHA
P0ADV1 | LPTA_ECOLI | Lipopolysaccharide export system protein lptA | Escherichia coli (strain K12) | MKFKTNKLSLNLVLASSLLAASIPAFA
P16397 | SUBF_BACSU | Bacillopeptidase F | Bacillus subtilis | MRKKTKNRLISSVLSTVVISSLLFPGAAGA
Q02113 | CWBA_BACSU | Amidase enhancer | Bacillus subtilis | MKSCKQLIVCSLAAILLLIPSVSFA
P34957 | QOX2_BACSU | Quinol oxidase subunit 2 | Bacillus subtilis | MVIFLFRALKPLLVLALLTVVFVLGG
O07921 | CHIS_BACSU | Chitosanase | Bacillus subtilis | MKISMQKADFWKKAAISLLVFTMFFTLMMSETVFA
P21130 | SACB_BACAM | Levansucrase | Bacillus amyloliquefaciens | MNIKKIVKQATVLTFTTALLAGGATQAFA
P39824 | BLAC_BACSU | Beta-lactamase | Bacillus subtilis | MKLKTKASIKFGICVGLLCLSITGFTPFFNSTHAEA
P07980 | GUB_BACAM | Beta-glucanase | Bacillus amyloliquefaciens | MKRVLLILVTGLFMSLCGITSSVSA
P31797 | CDGT_BACST | Cyclomaltodextrin glucanotransferase | Bacillus stearothermophilus | MRRWLSLVLSMSFVFSAIFIVSDTQKVTVEA
P06874 | THER_BACST | Thermolysin | Bacillus stearothermophilus | MNKRAMLGAIGLAFGLLAAPIGASA
P00808 | BLAC_BACLI | Beta-lactamase | Bacillus licheniformis | MKLWFSTLKLKKAAAVLLFSCVALAG
P06278 | AMY_BACLI | Alpha amylase | Bacillus licheniformis | MKQQKRLYARLLTLLFALIFLLPHSAAAA

The amino acids in the n-region are boldfaced and the underlined amino acids show the c-region.

Prediction of signal peptides presence and their cleavage site location: “SignalP 4.1” (http://www.cbs.dtu.dk/services/SignalP/) is an online web server that uses artificial neural networks (ANNs) to distinguish the three regions of SPs and to estimate the probability that an SP is present in a target protein. SignalP was upgraded to version 4.1 in 2012, with a cut-off value applied to a score named the D-score. The D-score is used for the final decision on whether an SP is present at the N-terminus of the input amino acid sequence. In this study, a sequence with a D-score higher than 0.57 was considered an SP. SignalP results for each amino acid sequence consist of three scores based on the neural networks. SP cleavage sites are determined using the C-score (raw cleavage site score). The S-score (signal peptide score) distinguishes the SP sequence from the target protein sequence and from proteins without SPs. The Y-score (combined cleavage site score) is the geometric average of the C-score and the slope of the S-score, and gives a better cleavage site prediction than the raw C-score alone [13]. SignalP results were rechecked with “Philius” (http://www.yeastrc.org/philius/). A sequence with a type confidence of more than 0.5 was considered a signal peptide.
Evaluation of signal peptides physico-chemical properties and solubility: “ProtParam” is an online server at (http://web.expasy.org/), which was used to predict different physico-chemical properties of the SPs, including amino acid composition, molecular weight, theoretical pI (isoelectric point), positively charged residues, instability index, aliphatic index, and grand average of hydropathicity (GRAVY). ProtParam computes these features from the protein sequence [14]. The instability index was evaluated for each SP alone and for each SP fused to LKADH.
The solubility of the SP-LKADH fusions was predicted using “Protein-sol”, a free online web server (http://protein-sol.manchester.ac.uk). Protein-sol reports a predicted scaled solubility in the 0-1 range so that results are easy to interpret [15]. Unstable and insoluble fusion proteins were removed before the next step.
Prediction of signal peptides secretion pathway and subcellular localization: The “PRED-TAT” online server (http://www.compgen.org/tools/PRED-TAT) was used to predict the secretion pathway of the SP-LKADH fusions. PRED-TAT differentiates Sec from Tat targeting SPs and predicts their cleavage sites, providing a reliability score in the 0-1 range. Its prediction method is based on Hidden Markov Models (HMMs), with an architecture that covers both Sec and Tat SPs [16]. Sub-cellular localization of the SP-LKADH fusions was evaluated using the “ProtCompB” online server (http://www.softberry.com). Its localization prediction is based on neural networks and a database of homologous proteins with known localization. The average accuracy of “ProtCompB” is between 86% and 100% [17, 18].
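The fusion constructs submitted to these servers are simply each SP placed in front of the LKADH sequence. The following is a minimal Python sketch of that step, not the authors' actual procedure. The LKADH sequence is not given in this text, so `LKADH_PLACEHOLDER` is a hypothetical stand-in: only its first two residues ("MT") are taken from the cleavage-site column of Table 2, and the rest must be replaced with the real UniProt sequence. The function names are illustrative.

```python
def build_fusion(signal_peptide: str, mature_protein: str) -> str:
    """Return the precursor: signal peptide followed by the mature protein."""
    return signal_peptide + mature_protein


def cleavage_label(signal_peptide: str, mature_protein: str) -> str:
    """Describe the junction in the notation of Table 2, e.g. 'TKA-MT (21,22)'.

    The label shows the last three SP residues, the first two residues of the
    mature protein, and the 1-based positions flanking the cleavage site.
    """
    n = len(signal_peptide)
    return f"{signal_peptide[-3:]}-{mature_protein[:2]} ({n},{n + 1})"


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record for pasting into a web server."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# SP sequence from Table 1.
PPB_ECOLI = "MKQSTIALALLPLLFTPVTKA"
# Hypothetical placeholder: only the leading "MT" comes from Table 2.
LKADH_PLACEHOLDER = "MT" + "X" * 10

fusion = build_fusion(PPB_ECOLI, LKADH_PLACEHOLDER)
print(cleavage_label(PPB_ECOLI, LKADH_PLACEHOLDER))
print(to_fasta("PPB_ECOLI_LKADH", fusion), end="")
```

With the real LKADH sequence substituted for the placeholder, the same record can be generated for each of the 41 SPs in Table 1.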

RESULTS

Evaluation of signal peptides three regions and probability: The n-, h- and c-regions of the selected SPs were 2-10, 10-20 and 3-9 amino acids long, respectively. The most critical parameter for identifying the presence of an SP was the D-score. An SP with a D-score higher than 0.57 was considered an appropriate SP for the target protein. The in silico analysis indicated that the highest D-score belonged to CWBA_BACSU (0.916). QOX2_BACSU (0.217), CHIS_BACSU (0.313), LPP_ECOLI (0.342), BLAC_BACSU (0.377), CDGT_BACST (0.388), BLAT_ECOLX (0.443), GUB_BACAM (0.463), TOLB_ECOLI (0.471), THER_BACST (0.510) and OMPT_ECOLI (0.524) were not suitable SPs for the secretion of LKADH, since they had D-scores below 0.57 (Table 2). These signal peptides were removed in the next step.
Table 2

In silico prediction of signal peptides for LKADH

Columns n-region through D-score are from the SignalP analysis; Type confidence is from the Philius analysis. Region entries give the residue range with the length in parentheses.

Signal peptide | n-region | h-region | c-region | Cleavage site | C-score | Y-score | S-score | S-mean | D-score | Type confidence
PPB_ECOLI | 1-4 (4) | 5-15 (11) | 16-21 (6) | TKA-MT (21,22) | 0.393 | 0.506 | 0.868 | 0.700 | 0.597 | 0.99
PHOE_ECOLI | 1-4 (4) | 5-17 (13) | 18-21 (4) | VQA-MT (21,22) | 0.688 | 0.738 | 0.901 | 0.804 | 0.769 | 0.99
OMPA_ECOLI | 1-3 (3) | 4-16 (13) | 17-21 (5) | AQA-MT (21,22) | 0.751 | 0.794 | 0.954 | 0.861 | 0.825 | 0.99
OMPF_ECOLI | 1-4 (4) | 5-18 (14) | 19-22 (4) | ANA-MT (22,23) | 0.805 | 0.817 | 0.937 | 0.864 | 0.839 | 0.99
OMPT_ECOLI* | - | - | - | - | 0.307 | 0.427 | 0.808 | 0.633 | 0.524 | 0.98
OMPC_ECOLI | 1-4 (4) | 5-16 (10) | 17-20 (4) | ANA-MT (21,22) | 0.784 | 0.829 | 0.956 | 0.901 | 0.863 | 0.99
ELBH_ECOLX | 1-3 (3) | 4-17 (14) | 18-21 (4) | AHG-MT (21,22) | 0.750 | 0.734 | 0.912 | 0.746 | 0.740 | 0.99
LAMB_ECOLI | 1-7 (7) | 8-19 (12) | 20-25 (6) | AMA-MT (25,26) | 0.709 | 0.760 | 0.964 | 0.877 | 0.815 | 0.96
MALE_ECOLI | 1-4 (4) | 5-19 (15) | 20-26 (7) | ALA-MT (26,27) | 0.630 | 0.754 | 0.981 | 0.923 | 0.834 | 0.98
DSBA_ECOLI | 1-3 (3) | 4-14 (11) | 15-19 (5) | ASA-MT (19,20) | 0.565 | 0.723 | 0.959 | 0.922 | 0.816 | 0.99
DGAL_ECOLI | 1-4 (4) | 5-16 (10) | 17-23 (7) | AHA-MT (23,24) | 0.690 | 0.781 | 0.967 | 0.910 | 0.841 | 0.98
TORT_ECOLI | 1-2 (2) | 3-14 (12) | 15-18 (4) | AFS-MT (18,19) | 0.517 | 0.694 | 0.957 | 0.928 | 0.804 | 0.99
TOLB_ECOLI* | - | - | - | - | 0.457 | 0.469 | 0.607 | 0.474 | 0.471 | 0.99
HSTI_ECOLX | 1-3 (3) | 4-18 (15) | 19-23 (5) | AYA-MT (23,24) | 0.820 | 0.824 | 0.956 | 0.877 | 0.849 | 0.97
BLAT_ECOLX* | - | - | - | - | 0.536 | 0.454 | 0.549 | 0.425 | 0.443 | 0.99
ASPG2_ECOLI | 1-6 (6) | 7-18 (12) | 19-22 (4) | ALA-MT (22,23) | 0.805 | 0.773 | 0.941 | 0.770 | 0.772 | 0.99
CEXE_ECOLX | 1-3 (3) | 4-14 (11) | 15-19 (5) | AIA-MT (19,20) | 0.639 | 0.712 | 0.943 | 0.799 | 0.753 | 0.97
PTRA_ECOLI | 1-6 (6) | 7-20 (14) | 21-25 (5) | SQA-MT (23,24) | 0.753 | 0.833 | 0.964 | 0.920 | 0.874 | 0.99
FKBA_ECOLI | 1-6 (6) | 7-20 (14) | 21-25 (5) | TFA-MT (25,26) | 0.522 | 0.607 | 0.957 | 0.819 | 0.706 | 0.99
LPP_ECOLI* | - | - | - | - | 0.148 | 0.251 | 0.689 | 0.497 | 0.342 | 0.99
THIB_ECOLI | 1-4 (4) | 5-14 (10) | 15-18 (4) | VFA-MT (18,19) | 0.515 | 0.656 | 0.919 | 0.845 | 0.744 | 0.99
TAUA_ECOLI | 1-7 (7) | 8-17 (10) | 18-22 (5) | AQA-MT (22,23) | 0.775 | 0.741 | 0.919 | 0.775 | 0.757 | 0.99
PSPE_ECOLI | 1-4 (4) | 5-15 (11) | 16-19 (4) | VFA-MT (19,20) | 0.850 | 0.858 | 0.952 | 0.876 | 0.866 | 0.99
PPA_ECOLI | 1-2 (2) | 3-13 (11) | 14-22 (9) | AFA-MT (22,23) | 0.728 | 0.734 | 0.896 | 0.791 | 0.761 | 0.99
OMPP_ECOLI | 1-4 (4) | 5-15 (11) | 16-23 (8) | ASA-MT (23,24) | 0.517 | 0.577 | 0.853 | 0.724 | 0.646 | 0.95
DRAA_ECOLX | 1-3 (3) | 4-13 (10) | 14-21 (8) | AHA-MT (21,22) | 0.622 | 0.735 | 0.946 | 0.895 | 0.810 | 0.99
OMPW_ECOLI | 1-3 (3) | 4-13 (10) | 14-21 (8) | AFA-MT (21,22) | 0.759 | 0.816 | 0.949 | 0.889 | 0.850 | 0.99
PBP7_ECOLI | 1-3 (3) | 4-17 (14) | 18-25 (8) | AVA-MT (25,26) | 0.589 | 0.695 | 0.972 | 0.886 | 0.785 | 0.99
NIKA_ECOLI | 1-7 (7) | 8-19 (12) | 20-22 (3) | VHA-MT (22,23) | 0.775 | 0.804 | 0.925 | 0.861 | 0.831 | 0.99
LPTA_ECOLI | 1-4 (4) | 5-23 (19) | 24-27 (4) | AFA-MT (27,28) | 0.828 | 0.846 | 0.977 | 0.912 | 0.877 | 0.99
SUBF_BACSU | 1-4 (4) | 5-23 (19) | 24-30 (7) | AGA-MT (30,31) | 0.581 | 0.669 | 0.955 | 0.859 | 0.758 | 0.99
CWBA_BACSU | 1-5 (5) | 6-19 (18) | 20-25 (6) | SFA-MT (25,26) | 0.853 | 0.894 | 0.976 | 0.940 | 0.916 | 0.99
QOX2_BACSU* | - | - | - | - | 0.136 | 0.170 | 0.372 | 0.296 | 0.217 | 0.99
CHIS_BACSU* | - | - | - | - | 0.357 | 0.331 | 0.536 | 0.283 | 0.313 | 0.86
SACB_BACAM | 1-8 (8) | 9-24 (16) | 25-29 (5) | AFA-MT (29,30) | 0.558 | 0.596 | 0.951 | 0.772 | 0.679 | 0.97
BLAC_BACSU* | - | - | - | - | 0.274 | 0.302 | 0.752 | 0.504 | 0.377 | 0.95
GUB_BACAM* | - | - | - | - | 0.307 | 0.398 | 0.715 | 0.575 | 0.463 | 0.98
CDGT_BACST* | - | - | - | - | 0.130 | 0.280 | 0.706 | 0.572 | 0.388 | 0.99
THER_BACST* | - | - | - | - | 0.257 | 0.400 | 0.813 | 0.699 | 0.510 | 0.98
BLAC_BACLI | 1-2 (2) | 3-22 (20) | 23-26 (4) | ALA-GM (25,26) | 0.756 | 0.824 | 0.965 | 0.914 | 0.866 | 0.99
AMY_BACLI | 1-10 (10) | 11-22 (12) | 23-29 (7) | AAA-AM (28,29) | 0.452 | 0.590 | 0.889 | 0.665 | 0.618 | 0.99

SignalP 4.1 output includes several scores. The C-score and S-score were used to determine cleavage sites and signal peptide positions, respectively. The Y-score is the geometric average of the C-score and a smoothed derivative of the S-score. S-mean is the arithmetic average of the S-score from the N-terminus to the position where the Y-score is at its maximum. The D-score is a weighted average of S-mean and Y-max and discriminates secretory from non-secretory proteins; sequences with a D-score > 0.57 were considered signal peptides. * D-score below the cut-off; no regions or cleavage site are reported.
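The D-score filtering step can be illustrated with a short Python sketch. The D-scores below are copied from a few rows of Table 2 and the 0.57 cut-off is the one stated in the Methods; the function name and the data layout are assumptions made for illustration only.

```python
# Cut-off used in this study (see Methods).
D_SCORE_CUTOFF = 0.57

# A few D-scores copied from Table 2 (not the full table).
d_scores = {
    "PPB_ECOLI": 0.597,
    "OMPA_ECOLI": 0.825,
    "OMPT_ECOLI": 0.524,
    "CWBA_BACSU": 0.916,
    "QOX2_BACSU": 0.217,
    "THER_BACST": 0.510,
    "AMY_BACLI": 0.618,
}


def split_by_d_score(scores, cutoff=D_SCORE_CUTOFF):
    """Split SPs into accepted (D-score > cutoff) and rejected lists.

    Accepted SPs are ordered from highest to lowest D-score;
    rejected SPs are ordered alphabetically.
    """
    accepted = sorted(
        (name for name, d in scores.items() if d > cutoff),
        key=scores.get,
        reverse=True,
    )
    rejected = sorted(name for name, d in scores.items() if d <= cutoff)
    return accepted, rejected


accepted, rejected = split_by_d_score(d_scores)
print("accepted:", accepted)
print("rejected:", rejected)
```

Note that OMPT_ECOLI (0.524) and THER_BACST (0.510) fall between 0.5 and 0.57, so the choice of cut-off decides whether they are kept.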

Physicochemical properties and solubility of signal peptides: Various physico-chemical features of the SPs are shown in Table 3. We chose SP lengths in the range of 18-30. The net positive charge of n-region was 0 for OMPP_ECOLI, 1 for TORT_ECOLI, BLAT_ECOLX, ASPG2_ECOLI, TAUA_ECOLI and PPA_ECOLI, 3 for MALE_ECOLI, LPTA_ECOLI, SACB_BACAM and BLAC_BACSU, 4 for BLAC_BACLI and AMY_BACLI, 5 for SUBF_BACSU and 2 for the other 20 SPs. The grand average of hydropathy (GRAVY) is defined as the sum of the hydropathy values of all amino acids divided by the number of residues in the sequence, and is used to compare overall hydropathy [14]. SUBF_BACSU (0.497) and TORT_ECOLI (2.061) had the lowest and highest GRAVY values, respectively. The aliphatic index reflects the relative volume occupied by aliphatic side chains (i.e., alanine, valine, isoleucine, and leucine) in a protein sequence and serves as an indicator of hydrophobicity [14]. QOX2_BACSU (198.46) had the highest aliphatic index, whereas the lowest belonged to BLAC_BACSU (92.22). The instability of the signal peptides, alone and fused to LKADH, was predicted by the instability index. An instability index below 40 indicates a stable protein, and a value above 40 an unstable one. Based on our results, BLAC_BACSU (15.03) fused to LKADH was the most stable fusion protein.
Table 3

Prediction of signal peptides physico-chemical properties

Signal peptide | Amino acid length | MW | pI | Net positive charge | Aliphatic index | GRAVY | Instability index without protein | Instability index with protein | Solubility (probability)
PPB_ECOLI | 21 | 2256.82 | 10.00 | 2 | 139.52 | 0.971 | Unstable (56.02) | Unstable (19.61) | Soluble (0.661)
PHOE_ECOLI | 21 | 2104.59 | 10.00 | 2 | 130.00 | 1.195 | Stable (1.44) | Stable (15.42) | Soluble (0.688)
OMPA_ECOLI | 21 | 2046.50 | 10.00 | 2 | 121.43 | 1.295 | Stable (9.52) | Stable (16.04) | Soluble (0.697)
OMPF_ECOLI | 22 | 2266.83 | 11.00 | 2 | 150.91 | 1.259 | Unstable (67.18) | Stable (20.64) | Soluble (0.678)
OMPT_ECOLI | 20 | 2102.61 | 11.00 | 2 | 146.50 | 1.290 | Stable (2.62) | Stable (15.55) | Soluble (0.672)
OMPC_ECOLI | 21 | 2078.63 | 10.00 | 2 | 171.90 | 1.552 | Stable (14.37) | Stable (16.41) | Soluble (0.695)
ELBH_ECOLX | 21 | 2358.84 | 9.10 | 2 | 111.43 | 0.695 | Stable (26.85) | Stable (17.37) | Soluble (0.608)
LAMB_ECOLI | 25 | 2545.22 | 11.00 | 2 | 125.20 | 1.332 | Unstable (42.97) | Stable (18.96) | Soluble (0.668)
MALE_ECOLI | 26 | 2698.34 | 11.17 | 3 | 113.08 | 1.012 | Stable (2.85) | Stable (15.30) | Soluble (0.657)
DSBA_ECOLI | 19 | 1990.48 | 10.00 | 2 | 144.21 | 1.416 | Stable (11.50) | Stable (16.22) | Soluble (0.662)
DGAL_ECOLI | 23 | 2362.89 | 10.00 | 2 | 102.17 | 0.952 | Stable (14.15) | Stable (16.38) | Soluble (0.668)
TORT_ECOLI | 18 | 2111.72 | 9.50 | 1 | 173.33 | 2.061 | Stable (26.66) | Stable (17.25) | Soluble (0.640)
TOLB_ECOLI | 21 | 2371.92 | 11.00 | 2 | 139.52 | 1.219 | Unstable (43.26) | Stable (18.63) | Soluble (0.618)
HSTI_ECOLX | 23 | 2552.09 | 9.70 | 2 | 102.17 | 1.026 | Stable (32.43) | Stable (17.91) | Soluble (0.608)
BLAT_ECOLX | 23 | 2626.22 | 8.02 | 1 | 110.43 | 1.539 | Unstable (56.40) | Stable (19.91) | Soluble (0.573)
ASPG2_ECOLI | 22 | 2274.76 | 8.35 | 1 | 93.64 | 1.136 | Stable (-1.15) | Stable (15.16) | Soluble (0.683)
CEXE_ECOLX | 19 | 1979.51 | 9.70 | 2 | 154.21 | 1.411 | Stable (29.75) | Stable (17.50) | Soluble (0.678)
PTRA_ECOLI | 23 | 2613.20 | 11.00 | 2 | 131.74 | 0.857 | Unstable (51.93) | Stable (19.54) | Soluble (0.604)
FKBA_ECOLI | 25 | 2676.31 | 10.00 | 2 | 121.20 | 1.212 | Stable (14.37) | Stable (16.38) | Soluble (0.649)
LPP_ECOLI | 20 | 1956.46 | 10.00 | 2 | 161.00 | 1.400 | Stable (10.64) | Stable (16.14) | Soluble (0.707)
THIB_ECOLI | 18 | 1974.60 | 8.89 | 2 | 157.22 | 1.589 | Unstable (65.64) | Stable (19.85) | Soluble (0.646)
TAUA_ECOLI | 22 | 2308.72 | 9.50 | 1 | 120.45 | 1.055 | Stable (34.41) | Stable (18.01) | Soluble (0.659)
PSPE_ECOLI | 19 | 2065.63 | 10.00 | 2 | 148.95 | 1.711 | Stable (17.37) | Stable (16.64) | Soluble (0.643)
PPA_ECOLI | 22 | 2384.99 | 8.50 | 1 | 155.45 | 1.405 | Unstable (53.16) | Stable (19.52) | Soluble (0.645)
OMPP_ECOLI | 23 | 2406.88 | 5.75 | 0 | 114.78 | 0.904 | Unstable (44.47) | Stable (18.91) | Soluble (0.697)
DRAA_ECOLX | 21 | 2135.63 | 10.00 | 2 | 98.10 | 1.162 | Stable (16.49) | Stable (16.57) | Soluble (0.674)
OMPW_ECOLI | 21 | 2093.55 | 10.00 | 2 | 125.71 | 1.210 | Stable (1.44) | Stable (15.42) | Soluble (0.694)
PBP7_ECOLI | 25 | 2705.36 | 11.00 | 2 | 117.20 | 1.228 | Unstable (57.99) | Stable (20.32) | Soluble (0.611)
NIKA_ECOLI | 22 | 2434.99 | 10.35 | 2 | 137.73 | 1.350 | Unstable (60.45) | Stable (20.10) | Soluble (0.634)
LPTA_ECOLI | 27 | 2849.47 | 10.30 | 3 | 130.37 | 0.881 | Stable (17.32) | Stable (16.65) | Soluble (0.628)
SUBF_BACSU | 30 | 3145.80 | 12.02 | 5 | 117.00 | 0.497 | Stable (20.75) | Stable (17.02) | Soluble (0.634)
CWBA_BACSU | 25 | 2649.35 | 8.89 | 2 | 160.00 | 1.596 | Unstable (42.97) | Stable (18.96) | Soluble (0.634)
QOX2_BACSU | 26 | 2843.68 | 11.00 | 2 | 198.46 | 2.242 | Stable (8.20) | Stable (15.80) | Soluble (0.646)
CHIS_BACSU | 35 | 4094.03 | 9.52 | 2 | 83.71 | 0.800 | Stable (17.79) | Stable (16.73) | Soluble (0.577)
SACB_BACAM | 29 | 3008.61 | 10.30 | 3 | 107.93 | 0.710 | Stable (26.18) | Stable (17.57) | Soluble (0.642)
BLAC_BACSU | 36 | 3875.66 | 9.30 | 3 | 92.22 | 0.628 | Stable (4.18) | Stable (15.03) | Soluble (0.575)
GUB_BACAM | 25 | 2640.30 | 9.50 | 2 | 148.00 | 1.508 | Unstable (45.44) | Stable (19.18) | Soluble (0.657)
CDGT_BACST | 31 | 3562.25 | 8.50 | 1 | 116.13 | 0.887 | Unstable (46.86) | Stable (19.90) | Soluble (0.632)
THER_BACST | 25 | 2414.95 | 11.00 | 2 | 121.60 | 1.100 | Stable (33.02) | Stable (18.06) | Soluble (0.678)
BLAC_BACLI | 26 | 2810.54 | 10.04 | 4 | 131.54 | 1.192 | Stable (15.30) | Stable (16.46) | Soluble (0.607)
AMY_BACLI | 29 | 3299.07 | 11.10 | 4 | 141.72 | 0.752 | Unstable (50.51) | Stable (20.08) | Soluble (0.577)

MW (molecular weight), pI (isoelectric point), Instability index, GRAVY (grand average of hydropathicity). Proteins with an instability index of more than 40 were considered unstable.
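Three of the Table 3 columns can be recomputed directly from the sequences in Table 1. The sketch below is an illustration, not ProtParam itself: it uses the standard Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The net charge is computed here as (Lys + Arg) − (Asp + Glu) over the whole SP, which is an assumption; it reproduces the "Net positive charge" column for the peptides checked below. The instability index is omitted because it needs a 400-entry dipeptide weight table.

```python
# Kyte-Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def net_charge(seq: str) -> int:
    """(Lys + Arg) minus (Asp + Glu), counted over the whole sequence."""
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative


# Sequences copied from Table 1.
TORT_ECOLI = "MRVLLFLLLSLFMLPAFS"
SUBF_BACSU = "MRKKTKNRLISSVLSTVVISSLLFPGAAGA"
OMPP_ECOLI = "MQTKLLAIMLAAPVVFSSQEASA"

for name, seq in [("TORT_ECOLI", TORT_ECOLI), ("SUBF_BACSU", SUBF_BACSU)]:
    print(name, round(gravy(seq), 3), round(aliphatic_index(seq), 2),
          net_charge(seq))
```

For TORT_ECOLI this gives a GRAVY of about 2.061 and an aliphatic index of about 173.33, in agreement with Table 3.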

Prediction of secretion pathway and sub-cellular localization: The results of the PRED-TAT web server revealed that all the remaining stable and soluble SPs belonged to the Sec pathway, except QOX2_BACSU, which was predicted as a transmembrane segment. These SPs can translocate fused LKADH to different compartments. Sub-cellular localization analysis with the ProtCompB server indicated that, amongst the SPs in this step, LPTA_ECOLI and SACB_BACAM can localize LKADH in the periplasmic space; SUBF_BACSU, CHIS_BACSU, CDGT_BACST and AMY_BACLI can translocate this heterologous protein into the extracellular space; and the other SPs direct it to the cytoplasm (Table 4).
Table 4

Evaluation of secretion pathways and sub-cellular location of SPs

Signal peptide | Secretion pathway | Reliability score | Cytoplasmic | Membrane | Secreted (extracellular) | Periplasmic | Final prediction site
PHOE_ECOLI | Sec | 0.995 | 9.91 | 0.03 | 0.00 | 0.06 | Cytoplasmic
OMPA_ECOLI | Sec | 0.999 | 9.82 | 0.09 | 0.03 | 0.06 | Cytoplasmic
OMPF_ECOLI | Sec | 0.998 | 9.90 | 0.04 | 0.00 | 0.06 | Cytoplasmic
OMPT_ECOLI | Sec | 0.991 | 9.93 | 0.02 | 0.00 | 0.05 | Cytoplasmic
OMPC_ECOLI | Sec | 0.990 | 9.84 | 0.09 | 0.00 | 0.07 | Cytoplasmic
ELBH_ECOLX | Sec | 0.985 | 9.80 | 0.12 | 0.02 | 0.06 | Cytoplasmic
LAMB_ECOLI | Sec | 0.998 | 9.92 | 0.02 | 0.00 | 0.06 | Cytoplasmic
MALE_ECOLI | Sec | 0.997 | 9.92 | 0.06 | 0.00 | 0.02 | Cytoplasmic
DSBA_ECOLI | Sec | 0.994 | 9.83 | 0.11 | 0.00 | 0.05 | Cytoplasmic
DGAL_ECOLI | Sec | 0.999 | 9.83 | 0.11 | 0.00 | 0.06 | Cytoplasmic
TORT_ECOLI | Sec | 0.990 | 9.96 | 0.02 | 0.00 | 0.03 | Cytoplasmic
TOLB_ECOLI | Sec | 0.991 | 9.85 | 0.11 | 0.00 | 0.04 | Cytoplasmic
HSTI_ECOLX | Sec | 0.999 | 9.62 | 0.10 | 0.24 | 0.04 | Cytoplasmic
BLAT_ECOLX | Sec | 0.996 | 9.89 | 0.00 | 0.07 | 0.03 | Cytoplasmic
ASPG2_ECOLI | Sec | 0.999 | 9.80 | 0.13 | 0.01 | 0.06 | Cytoplasmic
CEXE_ECOLX | Sec | 0.993 | 9.82 | 0.06 | 0.07 | 0.05 | Cytoplasmic
PTRA_ECOLI | Sec | 0.995 | 9.82 | 0.12 | 0.00 | 0.05 | Cytoplasmic
FKBA_ECOLI | Sec | 0.992 | 9.81 | 0.15 | 0.00 | 0.04 | Cytoplasmic
LPP_ECOLI | Sec | 0.968 | 9.89 | 0.05 | 0.00 | 0.06 | Cytoplasmic
THIB_ECOLI | Sec | 0.997 | 9.86 | 0.09 | 0.00 | 0.06 | Cytoplasmic
TAUA_ECOLI | Sec | 0.998 | 9.90 | 0.04 | 0.00 | 0.06 | Cytoplasmic
PSPE_ECOLI | Sec | 0.999 | 9.72 | 0.10 | 0.13 | 0.05 | Cytoplasmic
PPA_ECOLI | Sec | 1.000 | 9.82 | 0.12 | 0.00 | 0.06 | Cytoplasmic
OMPP_ECOLI | Sec | 0.997 | 9.93 | 0.02 | 0.00 | 0.05 | Cytoplasmic
DRAA_ECOLX | Sec | 1.000 | 9.85 | 0.08 | 0.00 | 0.06 | Cytoplasmic
OMPW_ECOLI | Sec | 1.000 | 9.82 | 0.10 | 0.02 | 0.06 | Cytoplasmic
PBP7_ECOLI | Sec | 0.997 | 9.83 | 0.12 | 0.00 | 0.05 | Cytoplasmic
NIKA_ECOLI | Sec | 0.996 | 9.88 | 0.07 | 0.00 | 0.06 | Cytoplasmic
LPTA_ECOLI | Sec | 1.000 | 0.00 | 0.60 | 1.12 | 8.28 | Periplasmic
SUBF_BACSU | Sec | 0.996 | 2.22 | 1.85 | 5.77 | 0.16 | Secreted
CWBA_BACSU | Sec | 0.999 | 9.94 | 0.00 | 0.02 | 0.04 | Cytoplasmic
QOX2_BACSU | TM segment | 0.788 | 9.95 | 0.00 | 0.02 | 0.03 | Cytoplasmic
CHIS_BACSU | Sec | 0.975 | 2.20 | 0.84 | 6.77 | 0.19 | Secreted
SACB_BACAM | Sec | 0.999 | 1.04 | 0.00 | 0.00 | 8.96 | Periplasmic
BLAC_BACSU | Sec | 0.995 | 2.36 | 0.00 | 2.41 | 5.23 | Membrane-bound periplasmic
GUB_BACAM | Sec | 0.998 | 9.93 | 0.00 | 0.03 | 0.04 | Cytoplasmic
CDGT_BACST | Sec | 0.958 | 1.37 | 1.69 | 6.94 | 0.00 | Secreted
THER_BACST | Sec | 0.997 | 9.74 | 0.03 | 0.19 | 0.04 | Cytoplasmic
BLAC_BACLI | Sec | 0.992 | 9.91 | 0.03 | 0.00 | 0.06 | Cytoplasmic
AMY_BACLI | Sec | 0.999 | 0.00 | 1.57 | 8.43 | 0.00 | Secreted
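One way to read Table 4 is that the final prediction site is the compartment with the highest ProtCompB score, which holds for every row of the table. The following Python sketch applies that reading to a few rows; it is an interpretation of the table rather than a description of ProtCompB's internal method, and the names `predicted_site` and `exported_fusions` are illustrative.

```python
# Column order of the four ProtCompB scores in Table 4.
COMPARTMENTS = ("Cytoplasmic", "Membrane", "Secreted", "Periplasmic")

# A few rows copied from Table 4 (not the full table).
protcomp_scores = {
    "PHOE_ECOLI": (9.91, 0.03, 0.00, 0.06),
    "LPTA_ECOLI": (0.00, 0.60, 1.12, 8.28),
    "SUBF_BACSU": (2.22, 1.85, 5.77, 0.16),
    "SACB_BACAM": (1.04, 0.00, 0.00, 8.96),
    "AMY_BACLI": (0.00, 1.57, 8.43, 0.00),
}


def predicted_site(scores):
    """Return the compartment with the highest score (assumed decision rule)."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return COMPARTMENTS[best]


def exported_fusions(table):
    """Keep only SPs whose fusion is predicted outside the cytoplasm."""
    sites = {name: predicted_site(row) for name, row in table.items()}
    return {name: site for name, site in sites.items()
            if site in ("Secreted", "Periplasmic")}


for name, site in exported_fusions(protcomp_scores).items():
    print(name, site)
```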

DISCUSSION

Overexpression of recombinant proteins in the intracellular space of E. coli is usually accompanied by extensive inclusion body aggregation; hence, it is essential to establish a method for periplasmic or extracellular secretion of proteins [9]. Sec, SRP and TAT are some of the protein secretion pathways used by prokaryotes. The role of these pathways is to direct proteins into the periplasmic space according to their SPs. Thus, choosing a proper SP is a critical step in designing secretory recombinant proteins [11]. A new era in medicine and biology has begun with the advent of disciplines such as computational biology and bioinformatics. Using bioinformatics programs before launching an experimental study reduces costs and increases the accuracy and validity of the experimental research [19]. Many online bioinformatics tools can be used to find suitable SPs. The parameter that identifies a peptide as an SP is the D-score of the SignalP server; hence, the D-score was used to sort all SPs in the first step. According to the D-scores (Table 2), 31 out of 41 selected SPs were identified as SPs for LKADH, but more features were needed to evaluate a suitable SP. Some important physico-chemical characteristics of SPs, including instability index, GRAVY, net positive charge and h-region length, have to be considered for effective protein secretion. A crucial region in an SP is the n-region, which can confer the ability for translocation on the desired secretory protein. The existence of one or more basic residues causes the n-region to be positively charged. The positive charges facilitate the interaction between SPs and phospholipids, which helps the protein to translocate through the membrane [20]. Therefore, any substitution that replaces the basic residues with neutral or acidic ones in the signal peptide sequence can reduce the rate of protein synthesis and secretion [21].
The results of the present study showed a range of 0-5 for the positive charges of all SPs. As all the selected SPs are in a suitable range of positively charged residues for the n-region, other characteristics also need to be considered when selecting a suitable SP. The h-region is another key region of an SP, whose hydrophobicity has an essential role in membrane targeting and extracellular secretion of proteins [22]. Hydrophobicity is therefore the most crucial factor for the activity of this region, and the length of the h-region is also a determinant of hydrophobicity. Consequently, increasing the length of the h-region can raise the level of hydrophobicity, helping to promote the protein secretion rate. Aliphatic index and GRAVY are two major parameters that determine hydrophobicity, and any increase in these parameters can lead to elevated hydrophobicity [18, 22]. As shown in Table 3, all 31 SPs evaluated in this step of the study had high aliphatic indexes, and their hydrophobicity is suitable for secretion. During protein transport into the extracellular space, signal peptidases cleave the signal peptide at the cleavage site and produce a mature protein. The cleavage site, located in the c-region, is often less hydrophobic and includes a signal sequence that is recognized by the signal peptidase. According to the -1, -3 rule, a residue with a small neutral side chain, such as alanine, serine or glycine, should be located at the -1 and -3 positions. The resulting Ala-X-Ala (AXA) box can be recognized and cleaved by signal peptidase [23]. As shown in Table 1, alanine is the most common residue found at the -1 and -3 positions, and the other cleavage sites are approximately similar to the AXA box.
Two common translocation pathways in both gram-negative and gram-positive bacteria that direct proteins to the extracellular space are Sec and Tat [24]. In E. coli, about 50% of all total proteins are excreted, with more than 90% secreted via the Sec pathway. Using the Sec pathway, unfolded proteins can translocate across the membrane and target the extracellular space, either co-translationally (SRP pathway) or post-translationally. On the other hand, fully folded proteins leave the cytoplasm by the Tat pathway, a process that uses the Tat translocation complex. Folding of proteins in the cytoplasm can result in their aggregation and degradation by cellular proteases; thus, it seems that the Sec and SRP pathways are more suitable than Tat for the secretory production of proteins [25-27]. As indicated in Table 4, all SPs in this step were specific for the Sec pathway with reliability scores of more than 0.9. For this reason, none of them were omitted at this stage, and all were passed on to the sub-cellular localization analysis. In the end, it was determined that amongst 30 stable and soluble SP-LKADH fusions directed toward the Sec pathway, 26 were predicted to remain in the cytoplasm, and only 4 to translocate LKADH into the periplasmic or extracellular space.
As far as we know, this is the first study aiming to investigate suitable SPs in fusion with LKADH by analyzing their potential effects on the secretion of this protein. Selecting a suitable and accurate SP for a given protein can reduce the cost and time of the production and purification of recombinant proteins. This study evaluated 41 different SPs in order to select the most applicable ones for secreting the recombinant LKADH protein out of the E. coli host. The results of this work indicated that the LPTA_ECOLI, SUBF_BACSU, CHIS_BACSU, SACB_BACAM, CDGT_BACST and AMY_BACLI SPs could theoretically be considered suitable candidates for LKADH secretion. However, further experimental investigations should be carried out to validate these results.
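The -1, -3 rule described in the Discussion can be checked directly on the SP sequences of Table 1. The Python sketch below is a deliberately strict illustration: it accepts only the three residues named in the text (alanine, serine, glycine) at positions -1 and -3, and treats the last residue of the SP as position -1. Both choices are assumptions made here; real signal peptidase specificity is broader, so a peptide failing this check is not necessarily uncleavable.

```python
# Small neutral residues named in the text for positions -1 and -3.
SMALL_NEUTRAL = set("ASG")


def follows_minus1_minus3_rule(signal_peptide: str) -> bool:
    """True if the residues at -1 and -3 are both Ala, Ser or Gly."""
    if len(signal_peptide) < 3:
        return False
    return (signal_peptide[-1] in SMALL_NEUTRAL
            and signal_peptide[-3] in SMALL_NEUTRAL)


def has_axa_box(signal_peptide: str) -> bool:
    """True if the SP ends in a strict Ala-X-Ala motif."""
    return (len(signal_peptide) >= 3
            and signal_peptide[-1] == "A"
            and signal_peptide[-3] == "A")


# Sequences copied from Table 1.
examples = {
    "OMPA_ECOLI": "MKKTAIAIAVALAGFATVAQA",  # ends in AQA
    "TORT_ECOLI": "MRVLLFLLLSLFMLPAFS",     # ends in AFS
    "PPB_ECOLI": "MKQSTIALALLPLLFTPVTKA",   # ends in TKA
}

for name, seq in examples.items():
    print(name, seq[-3:], follows_minus1_minus3_rule(seq), has_axa_box(seq))
```

In this strict form, OMPA_ECOLI satisfies both checks, TORT_ECOLI satisfies the -1, -3 rule without a strict AXA box, and PPB_ECOLI fails because threonine at -3 is not in the residue set named in the text.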