
CancerPPD: a database of anticancer peptides and proteins.

Atul Tyagi, Abhishek Tuknait, Priya Anand, Sudheer Gupta, Minakshi Sharma, Deepika Mathur, Anshika Joshi, Sandeep Singh, Ankur Gautam, Gajendra P S Raghava.

Abstract

CancerPPD (http://crdd.osdd.net/raghava/cancerppd/) is a repository of experimentally verified anticancer peptides (ACPs) and anticancer proteins. Data were manually collected from published research articles, patents and other databases. The current release of CancerPPD consists of 3491 ACP and 121 anticancer protein entries. Each entry provides comprehensive information about a peptide, such as its source of origin, nature, anticancer activity, N- and C-terminal modifications and conformation. Additionally, CancerPPD provides information on around 249 types of cancer cell lines and 16 different assays used for testing the ACPs. In addition to natural peptides, CancerPPD contains peptides having non-natural, chemically modified residues and D-amino acids. Besides this primary information, CancerPPD stores predicted tertiary structures as well as peptide sequences in SMILES format. Tertiary structures of peptides were predicted using the state-of-the-art method PEPstr, and secondary structural states were assigned using DSSP. To assist users, a number of web-based tools have been integrated, including keyword search, data browsing, and sequence and structural similarity search. We believe that CancerPPD will be very useful in designing peptide-based anticancer therapeutics.
© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2014        PMID: 25270878      PMCID: PMC4384006          DOI: 10.1093/nar/gku892

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Cancer is one of the most devastating diseases, accounting for millions of deaths worldwide every year (1). Conventional chemotherapy remains the principal mode of cancer treatment, but its effectiveness is limited by adverse effects on normal cells and the frequent development of multidrug resistance by cancer cells (2). This grim situation underscores the urgent need to develop novel therapeutic means to tackle this deadly disease. In this context, small peptides known as anticancer peptides (ACPs) have shown tremendous potential, as many ACPs exhibit cancer-selective toxicity and thus may avoid the shortcomings of conventional chemotherapy (3). With several advantages, including high specificity, low intrinsic toxicity, high tissue penetration and ease of modification, peptides have become the preferred choice as therapeutics compared to small molecules and antibodies (4,5). Over the last decade, peptide-based therapeutics have revolutionized the pharmaceutical market (4). The number of approved peptide-based drugs has been increasing over the last few decades (4), which reflects the potential of peptides as therapeutics. Cancer is one of the diseases for which peptide therapeutics have been extensively explored over the years (6–10). ACPs are small peptides (fewer than 50 amino acids) and are often cationic in nature (consisting of basic and hydrophobic residues) (11). Most ACPs are derived from antimicrobial peptides (AMPs), and thus AMPs and most ACPs have similar characteristics (11,12). Like that of bacterial cells, the surface of cancer cells is negatively charged, and because of this many AMPs show broad-spectrum toxicity against both bacteria and cancer cells. The electrostatic interaction of ACPs with negatively charged components of the plasma membrane of cancer cells is believed to play a crucial role in the cancer-selective toxicity of ACPs (13). Based on their cytotoxicity profiles, ACPs can be grouped into two main categories (14).
The first category consists of ACPs that are toxic to bacterial and cancer cells but not to normal cells. The second category includes ACPs that are toxic to bacterial, cancer and normal cells alike (14). Though ACPs have been studied extensively, their mechanism of action is not yet fully understood. Many studies have concluded that most ACPs, like AMPs, exhibit a membrane-lytic mode of action (12–14). However, a few ACPs have also shown other modes of action, such as induction of apoptosis via disruption of the mitochondrial membrane. It has been reported that the selective toxicity of many ACPs toward cancer cells is attributable to differences in the lipid content and other components of biological membranes between normal and cancer cells (11,12). Despite the huge therapeutic importance of ACPs, no dedicated repository of ACPs and anticancer proteins has been developed to date. A few AMP databases, like the Antimicrobial Peptide Database V2 (15), the Collection of Antimicrobial Peptides Database (16) and the Database of Anuran Defense Peptides (17), contain information on the anticancer activities of some AMPs; however, the information related to ACPs is not comprehensive. The last decade witnessed exponential growth in ACP-based research, which led to the discovery of hundreds of novel ACPs and their derivatives. Many of these ACPs have shown promising results in various pre-clinical studies, and a few ACP-based formulations have already reached clinical trials (3). This important information is scattered in the literature and thus very difficult to access. To assist the scientific community engaged in developing anticancer drugs, we have attempted to collect and compile the scattered information related to ACPs and anticancer proteins and have developed a repository, CancerPPD. We believe that CancerPPD will be helpful for both bioinformatics and experimental researchers working in the field of ACP-based therapeutics.

MATERIALS AND METHODS

Data collection and compilation

To develop a comprehensive resource, an extensive search was carried out to collect information on ACPs and anticancer proteins. First, research articles and patents providing information related to ACPs were retrieved using search engines such as PubMed, Google Scholar and Patent Lens. Specific searches were carried out using combinations of keywords like ‘ACPs’, ‘antitumor peptides’, ‘anti-angiogenic peptides’, ‘anti-metastatic peptides’ and ‘host defense peptides’. This exhaustive search yielded around 750 research articles and 30 patents. From these articles and patents, only experimentally verified ACPs and other relevant experimental information were extracted manually. In addition to ACPs, information related to anticancer proteins was extracted from UniProt and PubMed by a text search using the keywords ‘anticancer protein’ OR ‘antitumor protein’. Finally, 624 ACPs and 121 anticancer proteins, along with detailed information such as the nature and origin of the peptides, sequence, modifications, assay types and cell lines tested, were compiled systematically. Since we did not want to lose any information, multiple entries have been made for a single peptide if the identical peptide has been tested on different cell lines or evaluated by different assays. Therefore, the total number of peptide entries in CancerPPD is 3491.
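The one-entry-per-experiment rule described above can be illustrated with a minimal Python sketch. The `Entry` type is hypothetical and the sequences are made-up placeholders, not records from CancerPPD; the cell line and assay names are only borrowed from the examples mentioned in the Data statistics section.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One database entry: a peptide tested on one cell line with one assay."""
    sequence: str
    cell_line: str
    assay: str


# Hypothetical experiments (placeholders, not CancerPPD data).
experiments = [
    ("PEPTIDEA", "HeLa", "MTT"),
    ("PEPTIDEA", "A549", "MTT"),
    ("PEPTIDEA", "HeLa", "LDH"),
    ("PEPTIDEK", "MCF-7", "MTT"),
]

# Every (peptide, cell line, assay) combination becomes its own entry.
entries = [Entry(*e) for e in experiments]

# Unique peptides are recovered by collapsing entries on the sequence.
unique_peptides = {e.sequence for e in entries}

print(len(entries), "entries for", len(unique_peptides), "unique peptides")
```

In this toy data set, four entries collapse to two unique peptides, which is the same relationship as the 3491 entries and 624 unique peptides reported for the database.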

Database architecture and web interface development

After the collection and compilation of all the information, the database was launched using the Apache HTTP server on a Linux platform. MySQL, a relational database management system, was used to manage all data in the back-end; it provides commands to store data in and retrieve data from the database. HTML, PHP and JavaScript were used to develop the front-end web interface. All common gateway interface and database interfacing scripts were written in the PHP and Perl programming languages. The architecture of CancerPPD is shown in Figure 1.
Figure 1.

Architecture of CancerPPD.

DATABASE CONTENT

CancerPPD contains two types of information: primary and secondary. Primary information was manually curated from the literature and consists of various fields, including (i) PMID, (ii) peptide sequence, (iii) name of the peptide, (iv) length of the peptide, (v) configuration (linear or cyclic), (vi) chirality (L/D/Mix), (vii) chemical modification, (viii) N-terminal modification, (ix) C-terminal modification, (x) origin of the peptide, (xi) anticancer activity of the peptide, (xii) tested cell lines, (xiii) assay types, (xiv) test times, (xv) cancer types and (xvi) target tissues. Secondary information, like tertiary structures and simplified molecular-input line-entry system (SMILES) strings, is derived from the primary information. To make CancerPPD a comprehensive resource, we have compiled structures and SMILES of all the peptides. To provide structural information, all the peptides were searched against and mapped onto Protein Data Bank (PDB) (18) sequences, and if an exact match was obtained, we assigned the same structure to that peptide. A total of 32 peptide structures were obtained from PDB. The structures of peptides that could not be mapped onto PDB sequences were predicted using the PEPstr algorithm (19), a state-of-the-art algorithm for predicting the tertiary structure of peptides. Briefly, PEPstr uses PSIPRED (20) and BetaTurns (21) to predict secondary structure and types of beta-turns, respectively. It then uses the ideal dihedral angles of these intermediate predicted states as restraints to build an initial structure. Following energy minimization and short molecular dynamics of the initial structure using AMBER (22), the final structure is given as the predicted peptide tertiary structure. PEPstr handles peptides with only natural residues, ranging in length from 7 to 25 residues. In the present work, we relaxed the length criterion to 5 to 40 residues. Using PEPstr, we were able to predict the structure of 211 peptides.
A total of 9 sequences longer than 40 residues were treated as proteins, and their structures were predicted using the I-TASSER (23,24) server, which was among the best template-based protein structure prediction servers in the CASP10 (25) assessment. To handle three peptides shorter than five residues, we generated an initial structure with a linear conformation using phi and psi dihedral angles of 180°. Molecular dynamics was performed on the initial structure using AMBER, and the lowest-energy structure in the trajectory was taken as the final predicted structure of the peptide. Many ACPs also contain chemically modified residues and, therefore, cannot be handled directly by PEPstr. The unavailability of any algorithm (either online or standalone) for handling non-natural or chemically modified residues prompted us to extend the PEPstr algorithm to handle these modifications. Terminal modifications (acetylation/amidation), N-to-C cyclization, disulfide bridges between cysteine residues and the flipping of the stereochemistry of a residue from L- to D-form were carried out using inbuilt functions in AMBER11. Special force field libraries (26–28) available for standard molecular dynamics packages like AMBER and GROMACS (29) were used to incorporate other terminal modifications (beta-alanine, hydroxylation) or chemical modifications (ornithine, naphthyl alanine, etc.). In this way, a total of 617 peptide structures were obtained. Due to the lack of force field libraries, the structures of a few peptides having complex modifications were not predicted. Although the PEPstr algorithm incorporates only very short molecular dynamics (25 picoseconds) on the initial structure, we extended the simulation of all structures to 1 nanosecond.
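The choice of structure source described above can be summarized as a small decision function. This is a minimal sketch, not the authors' pipeline: the names `Route` and `choose_route`, the two boolean flags and the order in which the checks are applied are assumptions made for illustration; only the length thresholds (fewer than 5, 5 to 40, more than 40 residues) come from the text.

```python
from enum import Enum


class Route(Enum):
    PDB = "experimental structure taken from PDB"
    EXTENDED_MD = "extended chain (phi = psi = 180 degrees), then AMBER dynamics"
    PEPSTR = "PEPstr prediction (extended for modified residues)"
    ITASSER = "I-TASSER prediction"
    NOT_PREDICTED = "no structure (force field parameters unavailable)"


def choose_route(length: int,
                 exact_pdb_match: bool = False,
                 parameters_available: bool = True) -> Route:
    """Pick a structure source for a peptide of the given length.

    The check order is an assumption: PDB matches win, then peptides whose
    modifications lack force field libraries are skipped, then length decides.
    """
    if exact_pdb_match:
        return Route.PDB
    if not parameters_available:
        return Route.NOT_PREDICTED
    if length < 5:
        return Route.EXTENDED_MD
    if length <= 40:
        return Route.PEPSTR
    return Route.ITASSER


for n in (3, 5, 40, 41):
    print(n, "->", choose_route(n).value)
```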
After obtaining the predicted structures of the peptides, we assigned the eight types of secondary structure (H = alpha helix; B = beta-bridge; E = extended strand; G = 3/10 helix; I = pi helix; T = turn; S = bend; C = loop) using the DSSP software (30). DSSP takes input in PDB file format and assigns the secondary structure based on hydrogen bonding patterns and geometrical features between amino acids. We also represent the predicted structures in SMILES format. The Open Babel software (31) was used to convert the 3D structures into SMILES notation.
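As an illustration of how the eight-state assignment listed above can be consumed, the following Python sketch computes the fraction of each state in a per-residue string. The function name `state_fractions` and the example string are hypothetical; the sketch does not run DSSP itself and assumes the assignment has already been written using the letter codes given in the text.

```python
from collections import Counter
from typing import Dict

# Eight-state codes exactly as listed in the text.
DSSP_STATES = {
    "H": "alpha helix",
    "B": "beta-bridge",
    "E": "extended strand",
    "G": "3/10 helix",
    "I": "pi helix",
    "T": "turn",
    "S": "bend",
    "C": "loop",
}


def state_fractions(assignment: str) -> Dict[str, float]:
    """Return the fraction of residues in each secondary structure state."""
    unknown = set(assignment) - set(DSSP_STATES)
    if unknown:
        raise ValueError(f"unknown state codes: {sorted(unknown)}")
    counts = Counter(assignment)
    return {DSSP_STATES[code]: n / len(assignment) for code, n in counts.items()}


# Hypothetical 12-residue assignment (placeholder, not a CancerPPD record).
fractions = state_fractions("CCHHHHHHTTCC")
print(fractions)
```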

RESULTS

Implementation

A user-friendly web interface (Figure 2) has been designed to query the database. Various tools have been integrated to facilitate data retrieval, search and analysis. The options available in CancerPPD are briefly described below.
Figure 2.

Schematic representation of CancerPPD web interface.

Data retrieval tools

Simple search

This option provides a basic facility to retrieve data from the database. It allows users to perform a keyword search on any field of the database, such as PMID, origin of peptide, cancer cell line or nature of peptide. Users can select which fields are displayed. Keywords must not contain spaces.

Complex query

This option is for users who wish to perform a complex search to extract the desired information from CancerPPD. It offers a multiple-query system: by default it allows four queries at a time, and the user can perform a keyword search on any selected field. The server supports standard comparison operators (e.g. =, >, < and LIKE). Users can combine the output of different queries using the logical operators ‘AND’ and ‘OR’. Queries can also be added to or removed from the set to be executed.
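The way such a multi-condition query could be translated into SQL is sketched below in Python. This is an illustration under stated assumptions, not the CancerPPD implementation: the table name `entries`, its columns and the rows are hypothetical, and the standard-library `sqlite3` module stands in for the MySQL back-end. Field and operator names are checked against whitelists and values are passed as bound parameters.

```python
import sqlite3

ALLOWED_FIELDS = {"name", "length", "cell_line", "assay"}   # hypothetical columns
ALLOWED_OPERATORS = {"=", ">", "<", "LIKE"}
ALLOWED_COMBINERS = {"AND", "OR"}


def build_query(conditions, combiner="AND"):
    """Turn [(field, operator, value), ...] into a parameterized SELECT."""
    if not conditions:
        raise ValueError("at least one condition is required")
    if combiner not in ALLOWED_COMBINERS:
        raise ValueError(f"unsupported combiner: {combiner}")
    clauses, params = [], []
    for field, operator, value in conditions:
        if field not in ALLOWED_FIELDS or operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported condition: {field} {operator}")
        clauses.append(f"{field} {operator} ?")
        params.append(value)
    sql = ("SELECT name FROM entries WHERE "
           + f" {combiner} ".join(clauses) + " ORDER BY name")
    return sql, params


# In-memory table with placeholder rows (not CancerPPD data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (name TEXT, length INTEGER, cell_line TEXT, assay TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [("pepA", 12, "HeLa", "MTT"),
     ("pepB", 25, "A549", "MTT"),
     ("pepC", 30, "HeLa", "LDH")],
)

conditions = [("length", ">", 20), ("cell_line", "=", "HeLa")]
sql_and, params_and = build_query(conditions, "AND")
rows_and = [row[0] for row in conn.execute(sql_and, params_and)]
sql_or, params_or = build_query(conditions, "OR")
rows_or = [row[0] for row in conn.execute(sql_or, params_or)]
print(rows_and, rows_or)
```

Combining the two conditions with AND returns only the entry that satisfies both, while OR returns every entry that satisfies either one.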

Peptide sequence search

This option allows users to search a given peptide sequence against the sequences of all peptides available in CancerPPD. It provides two modes, exact and substring search. In an exact search, the server retrieves those peptides whose amino acid sequence is identical to that of the query peptide. In a substring search, the server retrieves those peptides whose sequence contains the query sequence.
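The difference between the two search modes can be shown with a short Python sketch. It assumes that "substring search" means the query occurs as a contiguous run of residues inside a stored sequence; the function names and the sequences are hypothetical placeholders rather than the server's code or data.

```python
from typing import List


def exact_search(query: str, peptides: List[str]) -> List[str]:
    """Return stored peptides whose sequence is identical to the query."""
    q = query.upper()
    return [p for p in peptides if p.upper() == q]


def substring_search(query: str, peptides: List[str]) -> List[str]:
    """Return stored peptides that contain the query as a contiguous substring."""
    q = query.upper()
    return [p for p in peptides if q in p.upper()]


# Placeholder sequences (not CancerPPD records).
stored = ["KLAKLAK", "GLFDIIKK", "AKLAKLAKKL"]

exact_hits = exact_search("KLAKLAK", stored)
substring_hits = substring_search("klak", stored)
print(exact_hits, substring_hits)
```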

SMILES search

The above options allow users to search the peptide sequences at the amino acid level. Sometimes researchers wish to know whether a particular atom, bond or group is more frequent in ACPs. To help users understand the properties of ACPs at the atom/bond level, we maintain the structures of peptides in SMILES format. The server allows users to search the database with their chemical structures in SMILES format.

Browsing tools

We have developed a browsing facility that supports the retrieval of information in a classified form. A user-friendly interface has been integrated for browsing on the major fields, which include ACPs, anticancer proteins, type of cell line tested, year of discovery, assay classes and peptide length. Users can browse the ACPs and anticancer proteins using these fields. In addition, users can browse ACPs based on their chemical modifications, such as N-terminal and C-terminal modifications. CancerPPD compiles information on 249 types of cancer cell lines originating from around 21 types of tissues. Users can browse ACPs evaluated against a particular type of cancer cell line. Further, to make it a more informative resource, cell lines were linked with the Catalogue of Somatic Mutations in Cancer (COSMIC) (32) and the Cancer Cell Line Encyclopedia (CCLE) (33).

Analysis tools

CancerPPD integrates web-based tools for analyses such as sequence similarity search, multiple sequence alignment and structure alignment. We have integrated the Basic Local Alignment Search Tool (BLAST) (34), which allows users to perform a BLAST search against the peptides in the database and thereby identify ACPs that have high sequence similarity with a query peptide sequence. Additionally, we have integrated another similarity search tool based on the Smith–Waterman algorithm (35). The peptide-mapping tool allows users to identify ACPs within their proteins of interest: the server searches for ACPs and maps them onto the query protein submitted by the user.
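For readers unfamiliar with the Smith–Waterman algorithm mentioned above, a minimal Python sketch of its scoring recurrence is given below. It is a simplified illustration, not the tool integrated in CancerPPD: it uses a flat match/mismatch score and a linear gap penalty (all three values are arbitrary assumptions), whereas practical implementations normally use an amino acid substitution matrix and affine gap penalties. The sequences are placeholders.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -1) -> int:
    """Return the best local alignment score between sequences a and b."""
    previous = [0] * (len(b) + 1)   # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]               # first column is always zero
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(
                0,                      # start a new local alignment
                diagonal,               # align a[i-1] with b[j-1]
                previous[j] + gap,      # gap in b
                current[j - 1] + gap,   # gap in a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


# The shared core "KLAK" gives four matches at +2 each.
print(smith_waterman_score("GGKLAKGG", "TTKLAKTT"))
```

Because scores are clamped at zero, unrelated flanking residues do not reduce the score of the shared core, which is what makes the alignment local.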

Data statistics

CancerPPD consists of 3491 peptide entries corresponding to 624 unique peptides. While searching for ACPs in the literature, we noted that most ACPs have been tested on various cancer cell lines, showing different IC50 values. Therefore, to preserve all this information, multiple entries have been made for a single ACP if the identical peptide has been tested on different cell lines (e.g. HeLa, A549, MCF-7, etc.) or by different assays (e.g. MTT, WST-1, LDH, etc.). Information on the cancer cell lines used as model systems to evaluate the anticancer activity of ACPs is very important; thus, information on a total of 249 cancer cell lines corresponding to 21 tissue types (Figure 3) was compiled and linked with databases such as CCLE and COSMIC. Since peptide stability is one of the major concerns in developing therapeutic peptides, many ACPs and their analogs with different chemical modifications or non-natural amino acids have been designed and evaluated for their anticancer activity. This information has also been compiled, and a total of 290 peptide entries with different chemical modifications have been made. Apart from this, entries of peptides having L-amino acids (3274), D-amino acids (26) and both L- and D-amino acids (178) have been compiled. The most common assay for determining the anticancer activity of ACPs is the cell viability assay, which measures the viability of cells using substrates such as MTT. However, various other assays have also been reported in the literature for determining the anticancer activity of these peptides. The current version of CancerPPD holds information on 16 different types of assays used to test the anticancer potency of peptides.
Figure 3.

Representation of ACPs (entries) evaluated against various cancer cell lines originating from different tissues.

DISCUSSION

The therapeutic peptide market emerged almost 40 years ago, in the 1970s. For much of the time since, peptides were not very popular as drug candidates; only in the recent past has the pharmaceutical industry shown interest in therapeutic peptides and invested heavily in peptide-based drugs (3,36). This renewal of interest could be due to the various limitations of conventional drugs, including frequent development of drug resistance, non-specificity and poor delivery. The present manuscript describes a repository of ACPs named CancerPPD, which is an important and much-needed resource. ACPs belong to an important class of therapeutic peptides that has received significant attention over the years. The ability of many ACPs to selectively kill cancer cells without affecting normal cells makes them attractive alternative candidates for cancer therapy (3,13). The last decade has seen progressive growth in peptide-based research, particularly in therapeutic peptides, which is exemplified by hundreds of research articles and by the development of various databases of therapeutic peptides, including CPPsite (37), TumorHope (38), Hemolytik (39), ParaPep (40), Brainpeps (41) and Quorumpeps (42). CancerPPD will be an important addition to this group of knowledge resources. Over the last decade, a large amount of data on ACPs has been generated; systematic cataloging of these data will therefore be important for understanding the properties of ACPs and delineating the features responsible for their anticancer activity. Such analysis will further help to design and predict better ACPs. CancerPPD has therefore been built with the aim of providing comprehensive information related to ACPs. Apart from ACP sequences, CancerPPD contains predicted tertiary structures of ACPs and ACPs in SMILES format.
Users can make the best use of CancerPPD in the following ways: (i) users can search for the best ACPs, (ii) users can check whether their peptide of interest already exists in CancerPPD, (iii) users can analyze the extent to which their peptides are similar to the existing ACPs, (iv) users can map their query sequences, as well as the structures of their peptides of interest, onto the structure of any ACP available in CancerPPD, (v) structures available in CancerPPD can be used for docking and various membrane simulation studies, (vi) CancerPPD offers a large and up-to-date data set of ACPs, which can be used for the development of various prediction methods for ACPs and (vii) the SMILES of ACPs can be used to develop QSAR models. We hope that CancerPPD will be a useful resource for researchers working in the area of cancer therapeutics.

LIMITATIONS AND UPDATE OF CancerPPD

In CancerPPD, we have stored the predicted tertiary structures of most of the ACPs, including peptides with modified amino acids. However, one limitation is that the tertiary structures of a few peptides have not been predicted because of complex chemical modifications such as lanthionine bridges. Force field libraries for such residue modifications are presently not available. In the future, we will try to predict the structures of such peptides. One critical task in database development is to update the database in a timely manner with newly generated information. Therefore, we have integrated an updating facility into this database. The scientific community may submit ACPs to CancerPPD using the online submission form. Our team will update the database regularly from online submissions as well as from newly published literature.

AVAILABILITY

CancerPPD is freely available at http://crdd.osdd.net/raghava/cancerppd/. The mobile version of this database can be accessed at http://crdd.osdd.net/raghava/cancerppd/mobile/cancerppd/.
REFERENCES (10 of 42 listed)

1.  Protein secondary structure prediction based on position-specific scoring matrices.

Authors:  D T Jones
Journal:  J Mol Biol       Date:  1999-09-17       Impact factor: 5.469

2.  Peptide-based drug design: here and now.

Authors:  Laszlo Otvos
Journal:  Methods Mol Biol       Date:  2008

Review 3.  Brainpeps: the blood-brain barrier peptide database.

Authors:  Sylvia Van Dorpe; Antoon Bronselaer; Joachim Nielandt; Sofie Stalmans; Evelien Wynendaele; Kurt Audenaert; Christophe Van De Wiele; Christian Burvenich; Kathelijne Peremans; Hung Hsuchou; Guy De Tré; Bart De Spiegeleer
Journal:  Brain Struct Funct       Date:  2011-12-29       Impact factor: 3.270

4.  GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.

Authors:  Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl
Journal:  Bioinformatics       Date:  2013-02-13       Impact factor: 6.937

5.  Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features.

Authors:  W Kabsch; C Sander
Journal:  Biopolymers       Date:  1983-12       Impact factor: 2.505

Review 6.  Promises of apoptosis-inducing peptides in cancer therapeutics.

Authors:  David Barras; Christian Widmann
Journal:  Curr Pharm Biotechnol       Date:  2011-08       Impact factor: 2.837

7.  Assessment of template-based protein structure predictions in CASP10.

Authors:  Yuanpeng J Huang; Binchen Mao; James M Aramini; Gaetano T Montelione
Journal:  Proteins       Date:  2014-02

8.  COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer.

Authors:  Simon A Forbes; Nidhi Bindal; Sally Bamford; Charlotte Cole; Chai Yin Kok; David Beare; Mingming Jia; Rebecca Shepherd; Kenric Leung; Andrew Menzies; Jon W Teague; Peter J Campbell; Michael R Stratton; P Andrew Futreal
Journal:  Nucleic Acids Res       Date:  2010-10-15       Impact factor: 16.971

Review 9.  From antimicrobial to anticancer peptides. A review.

Authors:  Diana Gaspar; A Salomé Veiga; Miguel A R B Castanho
Journal:  Front Microbiol       Date:  2013-10-01       Impact factor: 5.640

10.  Forcefield_NCAA: ab initio charge parameters to aid in the discovery and design of therapeutic proteins and peptides with unnatural amino acids and their application to complement inhibitors of the compstatin family.

Authors:  George A Khoury; James Smadbeck; Phanourios Tamamis; Andrew C Vandris; Chris A Kieslich; Christodoulos A Floudas
Journal:  ACS Synth Biol       Date:  2014-01-14       Impact factor: 5.110
