CAMPR3: a database on sequences, structures and signatures of antimicrobial peptides.

Faiza Hanif Waghu1, Ram Shankar Barai1, Pratima Gurung1, Susan Idicula-Thomas2.   

Abstract

Antimicrobial peptides (AMPs) are known to have family-specific sequence composition, which can be mined for discovery and design of AMPs. Here, we present CAMPR3, an update to the existing CAMP database, available online at www.camp3.bicnirrh.res.in. It is a database of sequences, structures and family-specific signatures of prokaryotic and eukaryotic AMPs. Family-specific sequence signatures comprising patterns and Hidden Markov Models were generated for 45 AMP families by analysing 1386 experimentally studied AMPs. These were further used to retrieve AMPs from online sequence databases. More than 4000 AMPs could be identified using these signatures. AMP family signatures provided in CAMPR3 can thus be used to accelerate and expand the discovery of AMPs. CAMPR3 presently holds 10247 sequences, 757 structures and 114 family-specific signatures of AMPs. Users can also use the sequence optimization algorithm for rational design of AMPs. The database, integrated with tools for AMP sequence and structure analysis, will be a valuable resource for family-based studies on AMPs.
© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2015        PMID: 26467475      PMCID: PMC4702787          DOI: 10.1093/nar/gkv1051

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Antimicrobial peptides (AMPs) are host defense molecules produced by a wide range of organisms, including bacteria and protozoa as well as animals, in which they are produced by the innate immune system (1). AMPs kill microbes via various mechanisms, such as destruction of the microbial membrane and inhibition of macromolecule synthesis (2–4). Because of these multiple mechanisms of action, it is more difficult for microbes to gain resistance against AMPs than against conventional antibiotics. Besides their antimicrobial activity, a few naturally occurring AMPs have also been observed to play other physiological roles, such as anti-inflammatory activity, angiogenesis and wound healing (5,6). Advances in sequencing technology have accelerated the availability of genomic and proteomic data of various organisms in public sequence repositories. Annotating AMPs in these large data sets using wet-lab methods is cost- and resource-intensive. AMPs belong to various AMP families. These families exhibit distinctive sequence composition, such as cysteine conservation in defensins (7), abundance of histidines in histatins (8), and conservation of unusual amino acids such as aminoisobutyric acid in peptaibols (9) and lanthionine in bacteriocins (lantibiotics) (10). This family-specific sequence conservation can be exploited to identify AMPs from a large pool of sequence data. Family-based signatures such as patterns and Hidden Markov Models (HMMs) can be powerful tools to retrieve and annotate sequences available in sequence databases. Sequence signatures (patterns and HMMs) present in 1386 experimentally studied AMPs representing 45 families were generated and used to fetch AMPs from sequence databases. These data have been collated and presented as an update to the CAMP database. CAMPR3 currently holds 10247 sequences, 757 structures and 114 signatures from 45 AMP families.

MATERIALS AND METHODS

Data collection and organization

Sequences, structures and family information of AMPs

To update the existing CAMP database (11), protein data deposited in the NCBI (12), UniProtKB (13) and PDB (14) databases after 2013 were queried using appropriate keywords such as ‘antimicrobial’, ‘antibacterial’, ‘antifungal’, ‘antiviral’ and ‘antiparasitic’. The obtained hits were manually curated to extract information on sequence, structure, protein definition, accession numbers, reference literature, activity, taxonomy of the source organism, target organisms with minimum inhibitory concentration (MIC) values, hemolytic activity of the peptide and protein family descriptions. This information is made available in CAMPR3. Links to UniProtKB, PDB, PubMed (12) and other databases dedicated to AMPs are also made available for the benefit of the users.

Signatures of AMPs

Experimentally validated AMPs whose family information is available in CAMP (11) were used to generate family-based signatures. Families containing at least two members were considered for signature creation. In total, 1386 sequences representing 45 AMP families were used to generate patterns and HMMs. The PRATT tool (15) was used for generation of patterns. Multiple sequence alignments of each AMP family were created using Clustal Omega (16,17), and these were used as input to build HMMs using the ‘hmmbuild’ program of the HMMER 3.1b1 package (18). A heuristically determined fitness value of 26 or above was used as a threshold for selecting patterns for retrieval of sequences. Since length is an important parameter for sequence alignment, length-based patterns and HMMs were also created. The generated patterns and HMMs were queried against the protein databases of NCBI and UniProtKB using the ScanProsite tool (19) and the jackhmmer tool of the HMMER web server (20), respectively, to retrieve hits. The HMM searches were run until convergence or stopped after three iterations. Sequences retrieved using HMMs with an e-value below 0.005 were considered for further screening. The retrieved hits were curated based on their AMP definitions. For each retrieved AMP, information related to sequence, protein definition, accession numbers, activity, source organism, target organisms and protein family descriptions, and links to databases like UniProtKB and PubMed, are provided in CAMPR3 along with the generated signatures. Protein sequences whose definition suggested antimicrobial activity and that had at least one supporting literature reference in PubMed demonstrating antimicrobial activity by wet-lab methods were included in the Experimentally Validated data set. In addition, 590 sequences were retrieved from APD2 (21). These sequences are integrated in the Experimentally Validated or Predicted data set based on the annotation provided by APD2.
AMPs that have annotations indicating antimicrobial activity but do not have supporting PubMed reference literature were included in the Predicted data set. These sequences are predicted to be antimicrobial either based on their GO (22)/Pfam (23)/InterPro (24)/UniProtKB/NCBI annotations or because they were retrieved using the AMP family signatures.
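
The two numeric thresholds used above (a PRATT fitness of 26 or above for patterns, and an e-value below 0.005 for HMM hits) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes, purely for illustration, that HMM hits are available as whitespace-separated lines in the layout of HMMER's tabular per-target output (target name in the first column, full-sequence E-value in the fifth), and that patterns arrive as (pattern, fitness) pairs. The function names are hypothetical.

```python
from typing import Iterable, List, Tuple

EVALUE_CUTOFF = 0.005   # e-value threshold stated in the text
FITNESS_CUTOFF = 26.0   # PRATT fitness threshold stated in the text


def filter_hmm_hits(tblout_lines: Iterable[str],
                    cutoff: float = EVALUE_CUTOFF) -> List[Tuple[str, float]]:
    """Return (target, e-value) pairs whose full-sequence E-value is below cutoff.

    Assumes the HMMER tabular layout: column 1 is the target name and
    column 5 is the full-sequence E-value; lines starting with '#' are comments.
    """
    kept = []
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue < cutoff:
            kept.append((target, evalue))
    return kept


def select_patterns(patterns: Iterable[Tuple[str, float]],
                    cutoff: float = FITNESS_CUTOFF) -> List[str]:
    """Keep only the patterns whose fitness is at or above cutoff."""
    return [pattern for pattern, fitness in patterns if fitness >= cutoff]
```

Hits that pass these filters would still need the manual curation step described in the text; the thresholds only narrow the candidate list.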

Algorithm for rational design of AMPs

An in-house Perl script was created to generate all possible single-residue substitutions of the user-defined sequence(s). These sequences are then run through the prediction models (Support Vector Machines (SVMs), Random Forests (RF) and Discriminant Analysis (DA)) generated for, and available in, the previous release of the CAMP database (11).
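
The enumeration step can be sketched as follows. This is a Python illustration of the idea, not the in-house Perl script, and it assumes that substitutions are restricted to the 20 standard amino acids (the text does not state the alphabet used). The `predict` argument is a hypothetical stand-in for any scoring function; the SVM, RF and DA models of CAMP are not reproduced here.

```python
from typing import Callable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: the 20 standard residues


def single_substitutions(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, original, replacement, variant) for every single-residue change.

    Positions are 1-based, following the usual convention for peptide sequences.
    """
    seq = seq.upper()
    for i, original in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield i + 1, original, aa, seq[:i] + aa + seq[i + 1:]


def rank_variants(seq: str,
                  predict: Callable[[str], float]) -> List[Tuple[float, str]]:
    """Score every variant with a user-supplied function and sort best first."""
    scored = [(predict(variant), variant)
              for _, _, _, variant in single_substitutions(seq)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

For a peptide of length L made of standard residues, this yields 19 × L variants, so the library grows linearly with sequence length and remains small enough to score exhaustively.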

Database architecture

The database is built using MySQL Server 5.1.33 as the back-end, and the front-end is built using PHP, HTML, JavaScript, Open Flash Chart 2 and Perl. The database is hosted on Apache web server 2.2.11. Statistical software R version 2.9.1 (25) was used for development of the prediction server. JSmol viewer (http://wiki.jmol.org/index.php/JSmol) has been integrated for AMP structure visualization. A brief description of the user interface of CAMPR3 is provided as follows.

Home: the home page provides information about various features of the database.

Databases: the data is divided into four databases, namely sequence, structure, patents and the newly incorporated signature database.

Tools: the database includes the following tools for analysis. The AMP prediction tool has been developed in-house. Access to various publicly available tools relevant to sequence/structure analysis has also been provided in CAMPR3 for the benefit of the users.

AMP prediction: users can (i) predict AMPs; (ii) predict antimicrobial regions within peptides and (iii) rationally design AMPs by generating an exhaustive combinatorial library of sequences for a user-defined sequence and predicting the effect of single residue substitutions on antimicrobial activity using SVMs, RF and DA.

BLAST: users can use the BLAST tool (26) to query protein sequence(s) against various data sets of CAMPR3, which include the entire database and the sequence, structure, patent, experimentally validated, predicted and predicted-based-on-signature data sets, to find homologous sequences, structures and other relevant information.

Clustal Omega: users can use the Clustal Omega tool of EMBL-EBI to obtain multiple sequence alignments of peptides.

Vector Alignment Search Tool: users can identify similar protein structures and distant homologs that cannot be identified by sequence comparison using VAST of NCBI (27).

PRATT: users can generate AMP family-specific patterns using this tool from ExPASy.

ScanProsite: using this tool from the Swiss Institute of Bioinformatics, users can (i) scan proteins against the PROSITE collection of PSSMs/patterns; (ii) scan patterns against protein sequence, structure or user-defined database(s) and (iii) scan user-defined patterns against a set of protein sequences.

PHI-BLAST: users can use PHI-BLAST (28) to find AMPs similar to the query based on a family-specific pattern.

jackhmmer: users can iteratively search a protein sequence/structure database using a set of protein sequences, a multiple sequence alignment or an HMM as input to find homologs using this tool from EMBL-EBI.

Search: basic and advanced search options are available for searching AMP families, sequences, structures and signatures.

Links: links to other online AMP databases are provided.

Statistics: information on CAMPR3 statistics can be viewed.

Help: a detailed description of the various features and tools incorporated in the database, and of their use, is provided for the benefit of the users.
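
To make concrete what scanning a user-defined pattern against a set of protein sequences involves, the following Python sketch converts a PROSITE-style pattern into a regular expression and reports matches. It is an illustration only, not the ScanProsite implementation: it supports the common syntax elements (`x`, `[..]`, `{..}`, repeat counts `(n)` and `(n,m)`, and the `<` and `>` anchors at the pattern ends), it reports non-overlapping matches only, and the sequences in the usage note are invented toy data.

```python
import re
from typing import Dict, List, Tuple

# One pattern element: x, a residue, [allowed] or {forbidden},
# optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(r"^(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into Python regular expression syntax."""
    pattern = pattern.strip().rstrip(".")
    prefix = suffix = ""
    if pattern.startswith("<"):          # anchored at the N-terminus
        prefix, pattern = "^", pattern[1:]
    if pattern.endswith(">"):            # anchored at the C-terminus
        suffix, pattern = "$", pattern[:-1]
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return prefix + "".join(parts) + suffix


def scan(pattern: str, sequences: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (name, 1-based start, matched segment) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = []
    for name, seq in sequences.items():
        for match in regex.finditer(seq):
            hits.append((name, match.start() + 1, match.group()))
    return hits
```

For example, `scan("C-x(2)-C", {"p1": "AACKKCAA", "p2": "AAAA"})` reports a single hit in `p1` starting at position 3 with the matched segment `CKKC`.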

RESULTS AND DISCUSSION

CAMPR3 provides comprehensive information on AMPs and their families as represented by their sequences, structures, activity, signatures, source and target organisms. The unique feature of CAMPR3 as compared to other AMP databases is that information on family-specific signatures has been provided for a large set of both eukaryotic and prokaryotic AMPs. It presently contains 114 AMP family-specific sequence signatures (36 patterns and 78 HMMs). Using these signatures, a total of 4222 AMPs were identified, of which 2739 were absent from the previous CAMP database. Signatures are particularly useful for retrieving sequences that would otherwise have to be queried specifically by their definitions. For example, AMPs such as thionin-2.1 (UniProt ID: Q42596) and varv peptide A/kalata-B1 (UniProt ID: Q5USN7) could not be retrieved from the UniProtKB database using search keywords such as ‘antimicrobial’ but could be retrieved using their family signatures. CAMPR3 currently holds 10247 AMP sequences, of which 4857 are experimentally validated and 5390 are predicted. Of these, 3491 have been recently identified. The structure database has also been updated to include 757 antimicrobial structures. Sequence composition is an important determinant of antimicrobial activity. Antimicrobial assays of AMPs and their analogues have demonstrated that minor variations in peptide sequence can drastically alter antimicrobial activity (29). The prediction algorithm for AMPs available in CAMPR3 now includes an additional feature for rational design of AMPs. This feature can be used to predict the effect of single residue substitutions on antimicrobial activity. The features incorporated in CAMPR3 will significantly promote AMP family-based studies. AMPs belonging to a particular AMP family can be readily obtained using the family-based search. This feature, along with the family signatures and tools available in CAMPR3 for sequence and structure analysis, will allow users to study the various AMP families independently and effectively.

CONCLUSION

The database is available for retrieval of sequences, structures, patents, signatures and families of AMPs. A comparison of CAMPR3 with existing databases dedicated to AMPs is presented in Table 1. AMPs that are not easily retrievable using simple keyword search have been identified and retrieved from public sequence databases using family signatures.
Table 1. Comparison of CAMPR3 with a few of the existing AMP databases

| Database | Sequences | Structures | Signatures | Nature of data | Reference |
|---|---|---|---|---|---|
| CAMPR3 | 10247 | 757 | 114 (36 Patterns and 78 HMMs) | General | - |
| APD2 | 2604 | 350 | Absent | General | (21) |
| AMPer | 1298 | Absent | 186 HMMs | Eukaryotic AMPs | (30) |
| LAMP | 5548 | Present (a) | Absent | General | (31) |
| BACTIBASE | 228 | 72 | Present (a) | Bacteriocins | (32) |
| YADAMP | 2525 | Present (a) | Absent | General | (33) |
| PhytAMP | 273 | 39 | Present (a) | Plant AMPs | (34) |
| Peptaibiotics database | 1344 | Absent | Absent | Peptaibols | (35) |
| Defensins Knowledgebase | 566 | Present (a) | Absent | Defensins | (36) |

(a) Difficult to retrieve total count.

The highlights of this updated database are (i) a massive update on AMP sequences and structures (10247 AMP sequences and 757 AMP structures); (ii) family-specific signatures of eukaryotic and prokaryotic AMPs and (iii) a sequence optimisation prediction algorithm for antimicrobial activity. CAMPR3 has been developed with the objective of expanding and accelerating research on AMPs.
REFERENCES (10 of 35 shown)

Loredana Frasca; Roberto Lande. Role of defensins and cathelicidin LL37 in auto-immune and auto-inflammatory diseases. Curr Pharm Biotechnol. 2012-08.

Hervé Duclohier. Antimicrobial peptides and peptaibols, substitutes for conventional antibiotics. Curr Pharm Des. 2010.

Z Zhang; A A Schäffer; W Miller; T L Madden; D J Lipman; E V Koonin; S F Altschul. Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res. 1998-09-01.

J F Gibrat; T Madej; S H Bryant. Surprising similarities in structure comparison. Curr Opin Struct Biol. 1996-06.

J Cruz; C Ortiz; F Guzmán; R Fernández-Lafuente; R Torres. Antimicrobial peptides: promising compounds against pathogenic microorganisms. Curr Med Chem. 2014.

Evan F Haney; Alexandra P Petersen; Cheryl K Lau; Weiguo Jing; Douglas G Storey; Hans J Vogel. Mechanism of action of puroindoline derived tryptophan-rich antimicrobial peptides. Biochim Biophys Acta. 2013-04-02.

Christopher D Fjell; Robert E W Hancock; Artem Cherkasov. AMPer: a database and an automated discovery tool for antimicrobial peptides. Bioinformatics. 2007-03-06.

Xiaowei Zhao; Hongyu Wu; Hairong Lu; Guodong Li; Qingshan Huang. LAMP: A Database Linking Antimicrobial Peptides. PLoS One. 2013-06-18.

Hamish McWilliam; Weizhong Li; Mahmut Uludag; Silvano Squizzato; Young Mi Park; Nicola Buso; Andrew Peter Cowley; Rodrigo Lopez. Analysis Tool Web Services from the EMBL-EBI. Nucleic Acids Res. 2013-05-13.

Alex Mitchell; Hsin-Yu Chang; Louise Daugherty; Matthew Fraser; Sarah Hunter; Rodrigo Lopez; Craig McAnulla; Conor McMenamin; Gift Nuka; Sebastien Pesseat; Amaia Sangrador-Vegas; Maxim Scheremetjew; Claudia Rato; Siew-Yit Yong; Alex Bateman; Marco Punta; Teresa K Attwood; Christian J A Sigrist; Nicole Redaschi; Catherine Rivoire; Ioannis Xenarios; Daniel Kahn; Dominique Guyot; Peer Bork; Ivica Letunic; Julian Gough; Matt Oates; Daniel Haft; Hongzhan Huang; Darren A Natale; Cathy H Wu; Christine Orengo; Ian Sillitoe; Huaiyu Mi; Paul D Thomas; Robert D Finn. The InterPro protein families database: the classification resource after 15 years. Nucleic Acids Res. 2014-11-26.
