
SPSED: A Signal Peptide Secretion Efficiency Database.

Chong Peng, Yixue Guo, Shaodong Ren, Cen Li, Fufeng Liu, Fuping Lu.


Keywords:  bacteria; database; recombinant protein; secretion efficiency; signal peptide

Year:  2022        PMID: 35118058      PMCID: PMC8804277          DOI: 10.3389/fbioe.2021.819789

Source DB:  PubMed          Journal:  Front Bioeng Biotechnol        ISSN: 2296-4185



Introduction

Signal peptides (SPs) are short amino acid sequences that direct the proteins they are attached to into the secretory pathway. SPs are found at the N-terminus of proteins in virtually all organisms and are removed by signal peptidases after translocation. They are usually 16–30 amino acids long and consist of a positively charged n-region, a hydrophobic h-region, and a c-region that contains the signal peptidase recognition site (von Heijne, 1990; 1998). Signal peptides are important in fields ranging from the study of protein secretion mechanisms to disease diagnosis, and especially in recombinant protein production (Freudl, 2018; Owji et al., 2018). In the production of industrial enzymes in bacterial cell factories, signal peptide-guided secretion of the synthesized target protein yields active, stable enzymes and allows a cost-effective downstream recovery process. In practice, signal peptides differ considerably in their ability to drive secretion of a target protein, and no single signal peptide is optimal for every recombinant protein: the best signal peptide for one protein can be inefficient for another, and vice versa. Systematic screening of a high-capacity signal peptide library has proven to be a powerful method for identifying the optimal signal peptide for a target protein (Brockmeier et al., 2006; Mathiesen et al., 2009; Peng et al., 2019). Admittedly, finding the optimal signal peptide for a target protein experimentally is time-consuming and labor-intensive. In silico identification of the best-performing signal peptide for a given protein, however, is still not easy to implement. It remains unclear how signal peptides influence the secretion efficiency of recombinant proteins, and researchers can gain only a partial understanding of this phenomenon from the fragmented, non-integrated data in the literature.
Two existing signal peptide databases, SPdb (Choo et al., 2005) and Signal Peptide Website (available at http://www.signalpeptide.com/), contain only signal peptide sequences and provide no information about secretion capacity. A comprehensive collection of signal peptide secretion efficiency data is urgently needed as a reference for selecting well-performing signal peptides in recombinant protein production. We therefore manually collected data on signal peptide secretion efficiency for specific target proteins and constructed the Signal Peptide Secretion Efficiency Database (SPSED). SPSED is freely available at http://www.spsed.com/ and supports all major browsers. The database provides a user-friendly interface for browsing, searching, and downloading SPSED records. Users can also BLAST a query sequence against SPSED to find a homologous secreted protein or signal peptide. We believe that SPSED is a valuable resource for recombinant protein production and for research into the mechanism of signal peptide-mediated secretion.

Data Retrieval

Screening a library of signal peptides fused to the secretion target is an effective way to optimize export of the target protein. The signal peptide secretion efficiency data stored in SPSED for alkaline active xylanase (Zhang et al., 2016), alkaline protease (Liu et al., 2019), aminopeptidase (Guan et al., 2016), subtilisin BPN’ (Degering et al., 2010), cutinase (Brockmeier et al., 2006; Hemmerich et al., 2016), natto phytase (Tsuji et al., 2015), nattokinase (Cai et al., 2016), nuclease (Mathiesen et al., 2009), and α-amylase (Fu et al., 2018) were obtained by this method. Brockmeier et al. (2006) constructed a signal peptide library containing 173 predicted SPs from Bacillus subtilis 168, with cutinase from Fusarium solani pisi as the reporter protein and B. subtilis TEB1030 as the expression host. The screening revealed dramatic differences in the lipolytic activity of the culture supernatants. In the same study, the metagenomic esterase EstCL1 was also used as a target protein with a subset of the SPs in the library. Intriguingly, there was no correlation between the secretion capacity of the signal peptides for cutinase and for esterase (Brockmeier et al., 2006). A comprehensive analysis of Lactobacillus plantarum signal peptide functionality confirmed this conclusion. In that study, a library of 76 predicted signal peptides from L. plantarum WCFS1 was constructed. The signal peptides in the library varied considerably in their ability to drive secretion of staphylococcal nuclease (NucA). To test their general usefulness, a selected set of SPs was used to direct the secretion of lactobacillal amylase (AmyA). The effects of the signal peptides on the secretion of AmyA and NucA were not consistent (Mathiesen et al., 2009). An optimal match between the SP and the mature part of the target protein is therefore essential for efficient protein secretion.
We searched PubMed, Google Scholar, and CNKI (China National Knowledge Infrastructure) for articles that optimized protein secretion by signal peptide screening. Taking the target proteins in these articles as objects, we manually extracted the protein yields obtained with different signal peptides. The nucleotide and amino acid sequences of the target proteins were then retrieved from UniProt (Bateman et al., 2019) and GenBank (Benson et al., 2013). The types and sources of the SPs were taken from the original articles. Signal peptide sequences were taken directly from the articles when they were provided. When they were not, we first downloaded the sequence of the protein from which each signal peptide originates and then excised the SP sequence according to the reported SP length or by using the signal peptide prediction server SignalP (Armenteros et al., 2019). From these sequences, we calculated the charge and hydrophobicity of the signal peptides and drew the hydrophobicity plots.
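The sequence-processing step described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' pipeline: the function names `excise_signal_peptide` and `net_charge` are hypothetical, the example sequence is made up, and the charge rule (K and R count +1, D and E count -1, histidine and the termini ignored) is an assumed simplification, since the article does not state how charge was calculated.

```python
def excise_signal_peptide(precursor, sp_length):
    """Return the first sp_length residues of a precursor protein sequence."""
    if not 0 < sp_length <= len(precursor):
        raise ValueError("sp_length must be between 1 and the protein length")
    return precursor[:sp_length].upper()


def net_charge(peptide):
    """Net charge under an assumed simple rule: K/R = +1, D/E = -1."""
    peptide = peptide.upper()
    positive = sum(peptide.count(res) for res in "KR")
    negative = sum(peptide.count(res) for res in "DE")
    return positive - negative


if __name__ == "__main__":
    # Made-up precursor sequence, used only to exercise the functions.
    precursor = "MKKRLLSAVALLAAGAASAAEVTPQ"
    sp = excise_signal_peptide(precursor, 19)
    print(sp, net_charge(sp))
```

When the signal peptide length is not reported, the cleavage position would instead come from a predictor such as SignalP, and the same slicing step would apply to the predicted length.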

Database Description

SPSED is built on SQLite, which allows rapid data retrieval and makes the resource easy to maintain. Figure 1 shows a snapshot of the SPSED database interface. A global navigation bar at the top of every page enables quick switching between pages (Figure 1A). Each entry in the database corresponds to the secretion yield of a specific target protein guided by a specific signal peptide, and each record is assigned a unique SPSED identification number. The nucleotide sequence, amino acid sequence, UniProt link, GenBank link, and expression host of the secreted target protein are displayed on both the ‘Secreted Protein Detail’ page and the ‘Detail Information’ page. The source, type, and sequence of each signal peptide are provided, together with the DNA and protein sequences of the protein from which the signal peptide originates. The secretion performance of each signal peptide is expressed as the ratio of its yield to the highest yield. We also calculated the hydrophobicity of the whole signal peptide and of the h-region using the Kyte-Doolittle hydrophobicity scale (Kyte and Doolittle, 1982). The hydrophobicity plot of the whole signal peptide sequence is given on the signal peptide detail page. The charge of the whole signal peptide and the charge of the n-region are also calculated and provided in the database. Users can browse the secreted proteins and the secreted enzyme yields driven by different signal peptides (Figures 1B–D).
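The two derived quantities described above, the yield ratio and the Kyte-Doolittle hydrophobicity, can be illustrated with a short sketch. The scale values are those published by Kyte and Doolittle (1982); everything else is an assumption made for illustration: the function names are hypothetical, the article does not say whether the per-peptide hydrophobicity is a mean or a sum (a mean is used here), and the window size for the plot is not given (5 is an arbitrary default).

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle value over all residues (assumed definition)."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    values = [KYTE_DOOLITTLE[res] for res in peptide.upper()]
    return sum(values) / len(values)


def hydropathy_profile(peptide, window=5):
    """Sliding-window averages, one value per full window, for plotting."""
    values = [KYTE_DOOLITTLE[res] for res in peptide.upper()]
    if not 0 < window <= len(values):
        raise ValueError("window must be between 1 and the peptide length")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def relative_efficiency(yields):
    """Scale each yield by the highest yield measured for the same target."""
    best = max(yields.values())
    if best <= 0:
        raise ValueError("the highest yield must be positive")
    return {sp: value / best for sp, value in yields.items()}
```

Applying `mean_hydropathy` to a slice of the sequence would give the h-region value, provided the region boundaries are known; the article does not describe how those boundaries were assigned.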
FIGURE 1

A screenshot of the SPSED database interface, illustrating the relationships among the main pages. (A) The global navigation bar, located at the top of every page. (B) The browse page of the SPSED database. (C) Database browse interface for accessing the detailed information of target proteins and signal peptides. (D) A representative view of a record in SPSED. (E) The advanced search page of SPSED. (F) The advanced search result page of the database. (G) The BLAST search page of SPSED. (H) The BLAST result page of SPSED.

The database also provides an “Advanced Search” page for customizable searches of SPSED records. Users can choose options from five select boxes, named “Secreted Protein,” “Expression Host,” “Signal Peptide Type,” “Source Organism of the Signal Peptide,” and “Screening of Secretory Efficiency,” to filter the records they are interested in (Figures 1E,F). In addition, the BLAST program (Altschul et al., 1997) is installed locally. When the protein or signal peptide of interest is not in SPSED, users can BLAST the query sequence against the database to find a homologous secreted protein or signal peptide. The query amino acid sequence must be pasted into the text box in FASTA format. When the alignment is ready, a BLAST result page with links to the database records is provided (Figures 1G,H). Users are encouraged to submit new data via the ‘Submit’ page. Submitted data will be manually reviewed and incorporated into a release of the SPSED database. The “Links” page lists tools for signal peptide prediction and web resources related to protein secretion. Target protein yields driven by different signal peptides are packaged by expression host and by target protein, and the compressed data are available on the ‘Download’ page for batch downloading. The SPSED database is available online at http://www.spsed.com/ and requires no registration.
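Because SPSED is stored in SQLite, the kind of filtering offered by the “Advanced Search” page can be sketched as a parameterized query. This is only an illustration under stated assumptions: the actual SPSED schema is not published in the article, so the table name `records`, all column names, and the placeholder rows are hypothetical, and only four of the five select boxes are modeled.

```python
import sqlite3

# Hypothetical filter columns; the real SPSED schema is not described.
FILTER_COLUMNS = ("secreted_protein", "expression_host", "sp_type", "sp_source")


def make_demo_db():
    """Build an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (spsed_id INTEGER PRIMARY KEY, "
        "secreted_protein TEXT, expression_host TEXT, sp_type TEXT, "
        "sp_source TEXT, signal_peptide TEXT, relative_yield REAL)"
    )
    conn.executemany(
        "INSERT INTO records VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, "protein_X", "host_1", "Sec", "organism_1", "SP_a", 0.40),
            (2, "protein_X", "host_1", "Sec", "organism_1", "SP_b", 1.00),
            (3, "protein_Y", "host_2", "Tat", "organism_2", "SP_a", 0.75),
        ],
    )
    return conn


def search_records(conn, **filters):
    """Return (id, signal peptide, relative yield) rows matching all filters."""
    unknown = set(filters) - set(FILTER_COLUMNS)
    if unknown:
        raise ValueError(f"unknown filter(s): {sorted(unknown)}")
    sql = "SELECT spsed_id, signal_peptide, relative_yield FROM records"
    if filters:
        # Column names come from the whitelist above; values are bound with ?.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += " ORDER BY relative_yield DESC"
    return conn.execute(sql, tuple(filters.values())).fetchall()


if __name__ == "__main__":
    for row in search_records(make_demo_db(), secreted_protein="protein_X"):
        print(row)
```

Sorting by relative yield puts the best-performing signal peptide for the selected target first, which mirrors how a user would read the search results when choosing a signal peptide.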

Conclusion and Perspectives

In conclusion, we have developed SPSED, a signal peptide secretion efficiency database. To our knowledge, it is the first attempt to provide the yields of industrial enzymes obtained with different signal peptides, which reflect the secretion capacity of each signal peptide for the target protein. The current version of SPSED contains 1025 signal peptide secretion efficiency records collected from 20 experiments. The database has grown slowly and linearly because all of its records are manually curated. Literature mining requires manual selection of articles that report detailed enzyme activity data, and the target protein, expression host, signal peptide, and enzyme activity data in each article must also be collected by hand. At present, it is difficult to develop an automated process for gathering SPSED data in batches. Encouraging users to submit their own signal peptide secretion efficiency data could speed up the inclusion of new records. In any case, maintenance and revision of the SPSED database will continue. We believe that SPSED will facilitate enzyme production and studies of the mechanism of signal peptide-mediated secretion. Biotechnologists working on recombinant protein production can use the database to select a well-performing signal peptide for their target protein, and microbiologists can investigate the mechanism of efficient protein secretion by analyzing the data in SPSED.
References (first 10 of 21)

1.  G von Heijne. Life and death of a signal peptide. Nature. 1998-11-12.

2.  S F Altschul, T L Madden, A A Schäffer, J Zhang, Z Zhang, W Miller, D J Lipman. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997-09-01.

3.  J Kyte, R F Doolittle. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982-05-05.

4.  Gang Fu, Jinlan Liu, Jinshan Li, Beiwei Zhu, Dawei Zhang. Systematic Screening of Optimal Signal Peptides for Secretory Production of Heterologous Proteins in Bacillus subtilis. J Agric Food Chem. 2018-12-07.

5.  Weiwei Zhang, Mingming Yang, Yuedong Yang, Jian Zhan, Yaoqi Zhou, Xin Zhao. Optimal secretion of alkali-tolerant xylanase in Bacillus subtilis by signal peptide screening. Appl Microbiol Biotechnol. 2016-05-25.

6.  Chengran Guan, Wenjing Cui, Jintao Cheng, Rui Liu, Zhongmei Liu, Li Zhou, Zhemin Zhou. Construction of a highly active secretory expression system via an engineered dual promoter and a highly efficient signal peptide in Bacillus subtilis. N Biotechnol. 2016-01-25.

7.  Khar Heng Choo, Tin Wee Tan, Shoba Ranganathan. SPdb--a signal peptide database. BMC Bioinformatics. 2005-10-13.

8.  Chong Peng, Chaoshuo Shi, Xue Cao, Yu Li, Fufeng Liu, Fuping Lu. Factors Influencing Recombinant Protein Secretion Efficiency in Gram-Positive Bacteria: Signal Peptide and Beyond. Front Bioeng Biotechnol. 2019-06-11.

9.  Geir Mathiesen, Anita Sveen, May Bente Brurberg, Lasse Fredriksen, Lars Axelsson, Vincent Gh Eijsink. Genome-wide analysis of signal peptide functionality in Lactobacillus plantarum WCFS1. BMC Genomics. 2009-09-10.

10.  Dennis A Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, David J Lipman, James Ostell, Eric W Sayers. GenBank. Nucleic Acids Res. 2012-11-27.

