CAMP: Collection of sequences and structures of antimicrobial peptides.

Faiza Hanif Waghu, Lijin Gopi, Ram Shankar Barai, Pranay Ramteke, Bilal Nizami, Susan Idicula-Thomas.

Abstract

Antimicrobial peptides (AMPs) are gaining importance as anti-infective agents. Here we describe the updated Collection of Antimicrobial Peptides (CAMP) database, available online at http://www.camp.bicnirrh.res.in/. The 3D structures of peptides are known to influence antimicrobial activity. Although several databases of AMPs exist, they contain limited information on AMP structures. CAMP is manually curated and currently holds 6756 sequences and 682 3D structures of AMPs. Sequence and structure analysis tools have been incorporated to enhance the usefulness of the database.

Year:  2013        PMID: 24265220      PMCID: PMC3964954          DOI: 10.1093/nar/gkt1157

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Antimicrobial peptides (AMPs) are widely studied as potential alternatives to antibiotics. The surge in research on AMPs has led to the development of several databases and prediction tools. Some of these are general databases, such as APD2 (1), DAMPD (2) and LAMP (3), whereas others are specialized: AMSdb (http://www.bbcm.units.it/~tossi/pag1.htm) contains AMPs from only plant and animal sources; RAPD (4) provides information on recombinant methods to generate AMPs; PhytAMP (5) and BACTIBASE (6) are dedicated to AMPs from plant and bacterial sources, respectively; Defensins knowledgebase (7) and PenBase (8) are devoted to AMPs from the defensin and penaeidin families, respectively; Peptaibol Database (9) is a database of peptaibols (an unusual class of peptides); BAGEL (10) is a database of bacteriocins; and HIPdb (11) is a database of experimentally validated HIV-inhibiting peptides. The enormous amount of data on AMPs motivated us to develop a general database, Collection of Antimicrobial Peptides (CAMP) (12), which included a sequence-based prediction tool for AMPs.

While all these databases provide comprehensive information on sequences of AMPs, information on structures of AMPs is limited. The topological features of peptides play a crucial role in dictating antimicrobial activity (13). Although many sequence-based prediction algorithms are available, the knowledge of 3D structural features of known AMPs has not been exploited to develop prediction algorithms. The lack of structural databases of AMPs is probably one of the main impediments in this direction. Presently, structural information for several AMPs is available in the Protein Data Bank (PDB) (14). However, retrieving information on structures of AMPs from structural databases such as PDB is not a trivial task; for example, the structures may have additional chains that are non-AMPs, and these have to be filtered out by manual curation. The structures may also not be easily retrieved from structure databases by simple keyword searches such as ‘antibacterial’ or ‘antifungal’. The current release of CAMP has been developed to address these shortcomings.

MATERIALS AND METHODS

Data collection and organization

Sequence and structural information of AMPs was retrieved from the protein databases of NCBI, UniProtKB (15) and PDB using combinations of keywords such as ‘antimicrobial’, ‘antibacterial’, ‘antifungal’, ‘antiviral’ and ‘antiparasitic’. The following manually curated information is made available to users: sequence, structure, protein definition, accession numbers, reference literature, activity, taxonomy of the source organism, target organisms with minimum inhibitory concentration (MIC) values, hemolytic activity of the peptide, functional and structural classifications, protein family descriptions and links to external databases such as UniProtKB, PDB, PubMed and other AMP databases.

Database architecture

The updated CAMP database is built on Apache HTTP server 2.0.59. MySQL Server 5.0 is used at the back-end, whereas the front-end is built using PHP, HTML, JavaScript, Perl and Open Flash Chart 2. Below is a brief description of the user interface of CAMP:

Home: The CAMP database along with its various features is described in this section.

Databases: Data are sectioned into sequence, structure and patent databases.

Tools: The following analysis tools are available to the users.

AMP prediction: Users can predict AMPs and/or scan for antimicrobial regions within the peptides using Support Vector Machine (SVM), Random Forests (RF) and Artificial Neural Network (ANN) models.

Feature calculator: Amino acid composition, secondary structural propensities and physicochemical properties of the peptides, such as net charge and hydrophobicity, can be calculated.

BLAST: Users can run BLAST (16) against the sequence or structure database of CAMP to find homologous sequences or structures, respectively.

ClustalW: Multiple sequence alignment of the peptides can be obtained using the ClustalW (17) tool from EMBL-EBI.

Vector Alignment Search Tool: Similar protein structures can be identified using this NCBI tool (18).

PRATT: This tool from ExPASy can be used to find patterns in a set of related AMPs (19,20).

Helical wheel: Alpha-helical AMPs can be studied using the helical wheel Java applet created by Edward K. O'Neil and Charles M. Grisham (University of Virginia in Charlottesville, Virginia).

PDB2PQR: This clone server can be used for converting PDB files into the PQR file format, which could be used for further structural studies (21,22). PQR files are PDB files in which the occupancy and B-factor columns have been replaced by per-atom charge and radius, respectively.

Search: Users can search for sequences and/or structures of AMPs using basic and advanced search options. Links to other available AMP databases have been provided.

Statistics: Coverage of the database based on the nature of data, taxonomy of source organism and activity has been depicted using pie charts and a Venn diagram.

Help: A detailed explanation of the features and tools available in the database has been provided in this section.
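The kind of output a feature calculator produces can be illustrated with a minimal Python sketch. It is not CAMP's implementation: the function names are hypothetical, the net charge is a crude count of Lys/Arg against Asp/Glu that ignores histidine, termini and pH, and the Kyte-Doolittle hydropathy scale is assumed here because the text does not state which hydrophobicity scale CAMP uses. The example peptide is invented for illustration.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def amino_acid_composition(seq):
    """Return the percentage of each residue type that occurs in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def net_charge(seq):
    """Crude net charge: +1 per Lys/Arg, -1 per Asp/Glu (His, termini and pH ignored)."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over all residues of seq."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


peptide = "GKKLLKELLKKG"  # hypothetical peptide, not a CAMP entry
print(amino_acid_composition(peptide))
print(net_charge(peptide))
print(mean_hydropathy(peptide))
```

A cationic, moderately hydrophobic profile of this kind is what the composition analysis later in the paper reports for AMPs in general.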

Prediction algorithm

Dataset creation

The positive dataset consisted of 3010 AMP sequences. These were obtained from the patent and experimentally validated datasets of CAMP, after removing sequences that (i) are redundant (100% similarity cut-off), (ii) have non-standard amino acids or (iii) have length >100. The CD-HIT server was used for removing redundant sequences (23). The negative dataset consisted of 4011 sequences, generated in our previous work (12). It includes experimentally proven non-antimicrobial sequences, arbitrary sequences generated using random numbers and protein sequences retrieved from the UniProt database that are not annotated as ‘antimicrobial’. These sequences had lengths in approximately the same range as the positive dataset. The CD-HIT program (23) was used to eliminate sequences with >90% identity. Both datasets were randomly divided into training (70%) and test (30%) datasets.
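The filtering and splitting steps for the positive dataset can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: exact-duplicate removal stands in for the CD-HIT run at a 100% cut-off, and the function names, the fixed random seed and the toy sequences are all hypothetical.

```python
import random

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def filter_sequences(seqs, max_len=100):
    """Keep unique sequences made only of the 20 standard residues, length <= max_len."""
    kept, seen = [], set()
    for s in seqs:
        s = s.strip().upper()
        if not s or s in seen:
            continue  # empty line or exact duplicate (stand-in for CD-HIT at 100%)
        if len(s) > max_len or not set(s) <= STANDARD_AA:
            continue  # too long or contains a non-standard residue
        seen.add(s)
        kept.append(s)
    return kept


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle reproducibly and split into training and test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


raw = ["KKLL", "kkll", "KKXL", "A" * 101, "GIGK"]  # toy input
clean = filter_sequences(raw)
train, test = train_test_split(clean)
print(clean, train, test)
```

Fixing the seed makes the 70/30 partition reproducible, which matters when several classifiers are compared on the same split.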

Model generation

The 64 best peptide descriptors, ranked by RF Gini score, were used for developing SVM-, RF- and ANN-based prediction models. All the models were evaluated using Matthews correlation coefficient (MCC), prediction accuracy and 10-fold cross-validation accuracy on training and test datasets. The prediction models were developed using the implementations of SVM, RF and ANN in R (version 2.15.3) (24).
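For reference, the two confusion-matrix metrics named above can be computed as in the following Python sketch. It uses the standard definitions of accuracy and MCC for binary labels (1 = AMP, 0 = non-AMP); the function names and the toy labels are illustrative and are not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 marks the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # toy labels
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]  # toy predictions
counts = confusion_counts(y_true, y_pred)
print(accuracy(*counts), mcc(*counts))  # 0.75 0.5
```

MCC is reported alongside accuracy because it stays informative when the positive and negative classes differ in size, as they do here (3010 versus 4011 sequences).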

SVM

The kernlab package in R was used to train the SVM classifier (25). In this study, we used a polynomial kernel function. The values of the hyperparameters were set as follows: degree = 4, scale = 0.01 and offset = 1.
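To make the role of these hyperparameters concrete, the sketch below evaluates a polynomial kernel of the form k(x, y) = (scale * <x, y> + offset)^degree, which is the form used by kernlab's polynomial kernel. The Python function is an illustrative re-implementation with the reported values as defaults, not the authors' code, and the vectors stand in for (much longer) descriptor vectors.

```python
def polynomial_kernel(x, y, degree=4, scale=0.01, offset=1.0):
    """Polynomial kernel: (scale * <x, y> + offset) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (scale * dot + offset) ** degree


def gram_matrix(vectors, **kernel_params):
    """Pairwise kernel values for a list of descriptor vectors."""
    return [[polynomial_kernel(u, v, **kernel_params) for v in vectors]
            for u in vectors]


# Toy descriptor vectors; real inputs would hold the 64 selected descriptors.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(polynomial_kernel(x, y))
print(gram_matrix([x, y]))
```

With a small scale such as 0.01, the inner product is damped before being raised to the fourth power, which keeps kernel values in a moderate range when descriptors are not normalized.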

RF

The ‘randomForest’ package in R was used to train the RF classifier with a maximum of 1500 trees (26).

ANN

The ‘nnet’ package in R was used for building the ANN-based prediction model (27).

RESULTS AND DISCUSSION

The updated CAMP is a comprehensive database of sequences and structures of AMPs. It currently holds 6756 AMP sequences (2602 experimentally validated, 2438 predicted and 1716 from patents), of which 2736 are recently identified AMP sequences. Information on the sequence, AMP family, source, target organism and activity is captured in the database. As can be seen in Figure 1A–C, CAMP covers a wide range of AMP families, source organisms and activities.
Figure 1. (A) Pie chart of AMP families in CAMP, (B) pie chart of source organisms of AMPs in CAMP, (C) Venn diagram of classification of AMP activity in CAMP and (D) relative amino acid composition of experimentally validated and predicted sequences of AMPs in CAMP as compared with Swiss-Prot composition.

CAMP presently contains 682 AMP structures. Multiple structures of an AMP, if available in PDB, are also integrated in the database. Although structural information on AMPs is available in databases such as APD2 and LAMP, those databases do not allow the structures to be viewed directly, whereas in CAMP they can be viewed directly using the Jmol viewer. Direct viewing of structures is also available in Defensins knowledgebase, PhytAMP, HIPdb and BACTIBASE. However, these databases cater to specific classes of AMPs.

Another interesting feature of the current release of CAMP is that users can selectively retrieve information on specific families of AMPs of interest, e.g. cathelicidins, defensins and cecropins. The AMP family information for the peptides has been annotated manually using information from Pfam (28), InterPro (29) and associated literature. The distribution of the AMP families in the database can be seen in Figure 1A.

The prediction algorithm for AMPs has been updated using the new sequence information. Supplementary Table S1 shows the prediction accuracy, MCC and cross-validation accuracy of the prediction models. Users can predict the antimicrobial activity of proteins and/or scan regions (with user-defined lengths) within proteins for antimicrobial activity. Tools that aid in sequence and structure analysis, such as the feature calculator, PRATT, ClustalW, Vector Alignment Search Tool, BLAST and PDB2PQR, have also been incorporated in CAMP. The effect of mutations on the structure of AMPs and/or their analogs can be visualized using the Jmol visualizer integrated in the database. Helicity is known to influence antimicrobial activity (30); therefore, a tool for helical wheel projection is also available.

AMPs are known to be rich in hydrophobic and cationic amino acids. The ratio of the percentage frequency of amino acids in CAMP to the percentage frequency of amino acids in the UniProtKB/Swiss-Prot protein knowledgebase (Release 2013_08 of 24 July 2013) is plotted in Figure 1D. As expected, AMPs were observed to be enriched in positively charged and hydrophobic residues such as Arg, Lys, Gly, Cys, Trp and Val.
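The ratio plotted in Figure 1D can be computed as in the following Python sketch. It is a minimal illustration with hypothetical function names: the sequences are toy examples and the background percentages are placeholders, not actual UniProtKB/Swiss-Prot values, which would have to be taken from the release statistics.

```python
from collections import Counter


def percent_composition(seqs):
    """Percentage frequency of each residue across a pooled set of sequences."""
    counts = Counter("".join(seqs).upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def enrichment_ratios(seqs, background_percent):
    """Ratio of dataset frequency to background frequency, per residue."""
    comp = percent_composition(seqs)
    return {aa: comp.get(aa, 0.0) / bg
            for aa, bg in background_percent.items() if bg > 0}


toy_amps = ["KKLL", "GKLK"]                        # toy sequences
toy_background = {"K": 10.0, "L": 20.0, "G": 5.0}  # placeholder percentages
print(enrichment_ratios(toy_amps, toy_background))
```

A ratio above 1 indicates that a residue is over-represented in the AMP set relative to the background, and a ratio below 1 indicates depletion.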

CONCLUSIONS

The current release of CAMP substantially expands the collection of AMP sequences and incorporates several tools relevant to the design of AMPs. The 3D conformations of peptides are known to be critical determinants of antimicrobial activity. The prominent feature of the current release of CAMP is the addition of experimentally derived structures of AMPs, which can be directly viewed using the Jmol viewer. The update also facilitates family-based study of AMPs. A detailed comparison of CAMP with the existing databases on AMPs is presented in Table 1. The information, presented in an easily searchable and downloadable form, is envisaged to accelerate sequence–structure–activity studies on AMPs.
Table 1. Comparison of CAMP with existing AMP databases

| Database | Type | Total number of entries | Prediction algorithm | Structural information | Search based on AMP family | MIC values | Separate searches for experimental and predicted datasets | Tools |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RAPD | Specific (Recombinantly produced AMPs only) | 179 | Absent | Absent | Present | Absent | Absent | DNA translator, peptide calculator, DNA sequence convertor |
| PhytAMP | Specific (Plant AMPs only) | 273 | Present | Present | Present | Present | Absent | BLAST, FASTA, Smith-Waterman search, ClustalW, Muscle, physicochemical profile |
| BACTIBASE second release | Specific (Bacteriocins only) | 220 | Present | Present | Absent | Present | Absent | BLAST, FASTA, Smith-Waterman search, ClustalW, Muscle, T-coffee, physicochemical profile, MODELLER |
| Defensins knowledgebase | Specific (Defensin family AMPs only) | 566 | Absent | Present | Present | Present | Absent | BLAST and ClustalW |
| PenBase | Specific (Penaeidin family AMPs only) | 28 | Absent | Absent | Absent | Absent | Absent | BLAST and ClustalW |
| Peptaibol database | Specific (Peptaibols only) | 317 | Absent | Present (a) | Absent | Absent | Absent | Absent |
| AMSDb | Specific (Eukaryotic AMPs only) | 895 | Absent | Present (a) | Present | Present | Absent | HydroMCalc and HydroPlot |
| HIPdb | Specific (HIV inhibiting peptides only) | 1068 | Absent | Present | Present | Present | Absent | HIPdb map, HIPdb BLAST |
| APD2 | General | 2307 | Present | Present (a) | Absent | Present | Absent | AMP designer |
| DAMPD | General | 1232 | Present | Present (a) | Present | Present | Absent | BLAST, ClustalW, NJPLOT, HMMER, hydrocalculator, signalp, graphical views |
| LAMP | General | 5547 | Absent | Present (a) | Absent | Present | Present | BLAST |
| CAMP | General | 7438 | Present | Present | Present | Present | Present | ClustalW, PRATT, helical wheel, Vector Alignment Search Tool, BLAST, PDB2PQR, feature calculator |

(a) The PDB IDs are available. Structures cannot be directly viewed.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.

FUNDING

This work [RA/18-09/2013] was supported by grants from Department of Science and Technology, Government of India [SB/S3/CE/028/2013]; and Indian Council of Medical Research. Funding for open access charge: Waived by Oxford University Press. Conflict of interest statement. None declared.
  26 in total

Review 1.  Host-defense antimicrobial peptides: importance of structure for activity.

Authors:  N Sitaram; R Nagaraj
Journal:  Curr Pharm Des       Date:  2002       Impact factor: 3.116

2.  PDB2PQR: an automated pipeline for the setup of Poisson-Boltzmann electrostatics calculations.

Authors:  Todd J Dolinsky; Jens E Nielsen; J Andrew McCammon; Nathan A Baker
Journal:  Nucleic Acids Res       Date:  2004-07-01       Impact factor: 16.971

3.  PenBase, the shrimp antimicrobial peptide penaeidin database: sequence-based classification and recommended nomenclature.

Authors:  Yannick Gueguen; Julien Garnier; Lorenne Robert; Marie-Paule Lefranc; Isabelle Mougenot; Julien de Lorgeril; Michael Janech; Paul S Gross; Gregory W Warr; Brandon Cuthbertson; Margherita A Barracco; Philippe Bulet; André Aumelas; Yinshan Yang; Dong Bo; Jianhai Xiang; Anchalee Tassanakajon; David Piquemal; Evelyne Bachère
Journal:  Dev Comp Immunol       Date:  2006       Impact factor: 3.636

4.  RAPD: a database of recombinantly-produced antimicrobial peptides.

Authors:  Yifeng Li; Zhengxin Chen
Journal:  FEMS Microbiol Lett       Date:  2008-12       Impact factor: 2.742

Review 5.  Surprising similarities in structure comparison.

Authors:  J F Gibrat; T Madej; S H Bryant
Journal:  Curr Opin Struct Biol       Date:  1996-06       Impact factor: 6.809

6.  Finding flexible patterns in unaligned protein sequences.

Authors:  I Jonassen; J F Collins; D G Higgins
Journal:  Protein Sci       Date:  1995-08       Impact factor: 6.725

7.  LAMP: A Database Linking Antimicrobial Peptides.

Authors:  Xiaowei Zhao; Hongyu Wu; Hairong Lu; Guodong Li; Qingshan Huang
Journal:  PLoS One       Date:  2013-06-18       Impact factor: 3.240

8.  DAMPD: a manually curated antimicrobial peptide database.

Authors:  Vijayaraghava Seshadri Sundararajan; Musa Nur Gabere; Ashley Pretorius; Saleem Adam; Alan Christoffels; Minna Lehväslaiho; John A C Archer; Vladimir B Bajic
Journal:  Nucleic Acids Res       Date:  2011-11-21       Impact factor: 16.971

9.  The Pfam protein families database.

Authors:  Marco Punta; Penny C Coggill; Ruth Y Eberhardt; Jaina Mistry; John Tate; Chris Boursnell; Ningze Pang; Kristoffer Forslund; Goran Ceric; Jody Clements; Andreas Heger; Liisa Holm; Erik L L Sonnhammer; Sean R Eddy; Alex Bateman; Robert D Finn
Journal:  Nucleic Acids Res       Date:  2011-11-29       Impact factor: 16.971

10.  PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations.

Authors:  Todd J Dolinsky; Paul Czodrowski; Hui Li; Jens E Nielsen; Jan H Jensen; Gerhard Klebe; Nathan A Baker
Journal:  Nucleic Acids Res       Date:  2007-05-08       Impact factor: 16.971
