
Search and analysis of identical reverse octapeptides in unrelated proteins.

Abstract

For the past few decades, intensive studies have been carried out to understand how the amino acid sequences of proteins encode their three-dimensional structures and thereby their specific functions. To understand the sequence-structure relationship of proteins, several sub-sequence search studies in non-redundant sequence-structure databases have been undertaken and have yielded useful clues. In our earlier work, we analyzed a set of 3124 non-redundant protein sequences from the Protein Data Bank (PDB) and retrieved 30 identical octapeptides having different secondary structures. These octapeptides were characterized using different computational procedures. This prompted us to explore the presence of octapeptides with reverse sequences and to analyze whether these octapeptides adopt structures similar to those of their parent octapeptides. Our identical reverse octapeptide search found eight octapeptide pairs (octapeptide and reverse octapeptide) with similar secondary structure and 23 octapeptide pairs with different secondary structures. In the present work, the geometrical and biophysical characteristics of identical reverse octapeptides were explored and compared with those of unrelated octapeptide pairs using various computational tools. We conclude that proteins containing identical reverse octapeptides are not very abundant and that residues in the octapeptide pairs do not contribute to the stability of the protein. Furthermore, compared to unrelated octapeptides, identical reverse octapeptides do not show specific biophysical and geometrical properties.

Year: 2013 PMID: 23523652 PMCID: PMC4357837 DOI: 10.1016/j.gpb.2012.11.005

Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691

Introduction

With the vast amount of knowledge gained from the three-dimensional (3D) structures of proteins, one might expect sequence-structure-function relationships to be easily understood. However, this is not the case, and many unsolved problems remain. Various attempts have been made to study the stability and folding of proteins and peptides by using reduced representations of proteins as well as by reading the protein sequences backward. Several theoretical and experimental studies of retro proteins have been carried out, with contradictory results [1]. Skolnick and coworkers made a systematic study by generating retro proteins and found very few similar 3D structures between inversely similar proteins; they also reported that short helices keep their conformations even when the sequence is inverted [2]. A systematic search in public domain sequence databases suggested that a protein with identical composition but with a backward-read primary structure should fold under native conditions into a structure similar to that of the original sequence [3]. Since the retro protein has the same amino acid composition and hydrophobicity profile as the native protein, all structure prediction methods based on amino acid composition may fail. The assumption that the retro protein would adopt the mirror image of the native protein is very unlikely, because right-handed helices would have to be replaced by left-handed helices. Changes in the secondary structure of the retro protein are significant because the packing of the native topology is similar to the packing of the original protein [4]. Nebel et al. proposed that the abundance of retro proteins can primarily be explained by the fact that they display the same repeat structures and amino acid propensities as existing proteins [5].
In contrast to proteins, studies of smaller peptide fragments with inverted sequences have provided useful clues toward understanding protein sequence-structure relationships. Several studies on reversed proteins and peptides consisting of D-amino acids, which form the mirror-image structures of the respective L-amino acid proteins, have been carried out [6-8] and validated experimentally by synthesizing D-human immunodeficiency virus protease, which showed reciprocal chirality [9]. There are several other interesting studies of retro peptides. One example is a hairpin peptide mimic of the FcεRI chain that inhibits IgE-FcεRI interactions in both its native and retro forms and has potential for the treatment of allergic disorders [10]. Rai used the S peptide from ribonuclease S as a model system and explored the relationship between the native and reverse conformations of the peptide using circular dichroism (CD) [7]. Huang and coworkers synthesized a novel peptide by reversing the sequence of the human metallothionein-2α domain to analyze its chemical and spectroscopic properties. Their results indicate that reversing the amino acid sequence of the α domain does not change its foldability or metal-binding capacity, suggesting that the order of its sequence is not essential for the formation of a critical metal-tetrathiolate nucleus [11]. In our previous work, we performed a systematic analysis of the role of long-range contacts in homologous families of proteins in determining the final native structure [12]. Our analysis indicated the importance of cooperative long-range interactions in determining the final conformations of two proteins with 88% sequence identity that adopt different folds and functions [13]. The importance of cooperative long-range interactions for protein stability was further verified experimentally by studying fragments of barley chymotrypsin inhibitor 2 [14].
In our previous paper, we characterized 30 identical octapeptides adopting different conformations by computing the difference in the number of long-range contacts of residues [15]. In the present work, we performed a systematic search for identical reverse octapeptides in the sequences of the Protein Data Bank (PDB) [16] to explore the structural and functional features of protein fragments in reversed form. The results of our analysis imply that eight-residue fragments of a given protein whose reversed form occurs in another, unrelated protein are not very abundant. The geometric and biophysical properties of such octapeptides, such as volume, surface area, dipole and quadrupole moments and the continuous symmetry measure, were examined in detail. The gene ontology (GO) terms of the proteins containing reverse octapeptides were also extracted to understand the structure-function relationship.

Results

Reverse octapeptide search

While searching for the reverse form of the overlapping octapeptide segments in each of the 3124 PDB sequences, we found 8 octapeptide pairs (octapeptide and reverse octapeptide) with similar secondary structure (Table S1) and 23 octapeptide pairs with different secondary structures (Table 1). Proteins containing identical reverse octapeptides with similar and different secondary structures are rare, accounting for 0.5% and 1.4% of the dataset, respectively, while 98% of proteins do not contain identical reverse octapeptides. These data demonstrate the low abundance of such inverse fragments in a large number of unrelated protein sequences. However, it is not surprising that an octapeptide and its reverse sequence can adopt similar secondary structures, given their identical amino acid composition and similar biophysical properties such as hydrophobicity and hydrophilicity. Interestingly, the eight octapeptide pairs with similar secondary structure are mainly made up of helices.
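The reported percentages can be approximately recovered with simple arithmetic. The sketch below assumes (this is not stated explicitly in the text) that each octapeptide pair contributes two distinct proteins, so that 8 pairs correspond to 16 proteins and 23 pairs to 46 proteins out of 3124. The variable and function names are illustrative only.

```python
# Minimal arithmetic sketch; assumes two distinct proteins per octapeptide pair.
TOTAL_PROTEINS = 3124
SIMILAR_PAIRS = 8       # pairs with similar secondary structure
DIFFERENT_PAIRS = 23    # pairs with different secondary structures


def percent_of_dataset(n_pairs, total=TOTAL_PROTEINS):
    """Percentage of proteins involved, counting two proteins per pair."""
    return 100.0 * (2 * n_pairs) / total


pct_similar = percent_of_dataset(SIMILAR_PAIRS)
pct_different = percent_of_dataset(DIFFERENT_PAIRS)
pct_none = 100.0 - pct_similar - pct_different

print(f"similar: {pct_similar:.2f}%  different: {pct_different:.2f}%  none: {pct_none:.2f}%")
```

Under this assumption the values come out close to the 0.5%, 1.4% and 98% quoted above; small differences may reflect rounding or proteins that appear in more than one pair.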

Table 1

Biophysical and geometrical properties of octapeptides with inverse sequences having different secondary structures

No.	Peptide	PDB ID	Octapeptide	Secondary structure	Stabilizing residues	Hphil (%)	Hpho (%)	Others (%)	Attribute	RMSD (Å)	CSM	Volume (Å³)	Surface area (Å²)	Quadrupole (Debyes)
1	P1	1A8D_A(52-59)	VPGINGKA	EECCCCEE	No	12	25	62	Basic	4.3	96.655	971	621.9	81
1	P2	1YZY_A(88-95)	AKGNIGPV	CCCCHHHH	No	12	25	62	Basic	4.3	94.398	949	630.15	42
2	P1	1D8W_A(418-425)	YEKEILSR	HHHHCCCC	No	30	20	50	Neutral	0.6	83.482	1260	718.53	100
2	P2	2E52_A(244-251)	RSLIEKEY	HHHHHHHH	No	30	20	50	Neutral	0.6	85.753	1305	772.85	24
3	P1	1DM9_B(103-110)	KLNALTMP	HHHHCCCC	No	38	0	62	Basic	3.0	99.401	959	580.7	71
3	P2	1W8K_A(137-144)	PMTLANLK	CEEHHHHH	No	38	0	62	Basic	3.0	92.286	1118	653.28	45
4	P1	1ED1_A(47-54)	AESLLENK	CHHHHHCH	No	20	13	67	Acidic	1.6	80.533	1117	740.55	163
4	P2	1R6X_A(20-27)	KNELLSEA	HHHHHHHH	No	20	13	67	Acidic	1.6	77.565	1129	719.77	110
5	P1	1G4M_A(106-113)	KKLGEHAY	HHCCCCEE	No	50	25	25	Basic	1.0	90.681	1178	802.44	248
5	P2	2UVK_A(276-283)	YAHEGLKK	CCCCCCCC	No	50	25	25	Basic	1.0	99.514	1102	757.62	27
6	P1	1GJW_A(2-9)	LLREINRY	HHHHHHHH	No	38	50	12	Basic	3.5	81.161	1378	849.18	43
6	P2	2ODI_A(231-238)	YRNIERLL	EEEEEECC	No	38	50	12	Basic	3.5	59.256	1363	973.29	234
7	P1	1ITX_A(354-361)	QTCTGGSS	CCCCEECC	No	0	0	100	Neutral	1.4	96.532	864	659.05	88
7	P2	2V3I_A(20-27)	SSGGTCTQ	CCCCCEEE	No	0	0	100	Neutral	1.4	87.245	960	637.51	109
8	P1	1JL1_A(139-146)	AAAMNPTL	HHHHCCCC	No	40	20	40	Neutral	0.04	95.226	972	660.34	71
8	P2	2NVO_A(383-390)	LTPNMAAA	CCHHHHHH	No	40	20	40	Neutral	0.04	93.055	967	580.17	80
9	P1	1N1B_A(481-488)	YHDILCLA	CCHHHHHH	No	9	18	73	Neutral	2.7	86.665	1140	686.48	105
9	P2	2FA1_A(142-149)	ALCLIDHY	EEEEEEEE	No	9	18	73	Neutral	2.7	90.691	1154	797.82	53
10	P1	1O2D_A(131-138)	VVEIPTTA	EEEEECCC	132V	20	27	53	Acidic	1.6	93.583	1023	724.32	87
10	P2	1TG7_A(626-633)	ATTPIEVV	CCEEEEEE	No	20	27	53	Acidic	1.6	93.519	1026	735.4	49
11	P1	1QOY_A(177-184)	AYAGAAAG	HHHCCCCC	No	0	12	88	Neutral	3.2	99.551	787	550.79	57
11	P2	1RWR_A(209-216)	GAAAGAYA	CCCCCCEE	No	0	12	88	Neutral	3.2	99.532	776	590.71	67
12	P1	1QUS_A(277-284)	GQAPGLPN	CCCCCCCC	No	20	20	60	Neutral	2.5	67.633	918	649.24	134
12	P2	1QWO_A(286-293)	NPLGPAQG	CCCCHHHH	No	20	20	60	Neutral	2.5	75.013	886	558.93	91
13	P1	1SU8_A(269-276)	IVSVSKEM	HHHHHHHC	No	19	25	56	Neutral	3.4	81.048	1126	667.56	53
13	P2	1W77_A(75-82)	MEKSVSVI	CCCCEEEE	80S, 82I	19	25	56	Neutral	3.4	89.778	1112	786.99	103
14	P1	1SVF_A(177-184)	SPAITAAN	HHHHHHCC	No	0	12	88	Neutral	4.1	90.993	907	555.91	23
14	P2	1YNF_A(110-117)	NAATIAPS	HCEEEECC	No	0	12	88	Neutral	4.1	84.159	896	667.46	64
15	P1	1UA4_A(409-416)	IKEGIGEV	CCCCEEEE	No	38	38	25	Acidic	1.9	52.912	1048	737.5	48
15	P2	1X2I_A(48-55)	VEGIGEKI	CCCCCHHH	No	38	38	25	Acidic	1.9	46.389	1043	706.78	19
16	P1	2ANE_A(68-75)	LFTVGTVA	CCCEEEEE	73T, 75A	0	50	50	Neutral	1.5	67.694	1000	724.57	170
16	P2	3B8D_A(263-270)	AVTGVTFL	CCCEEEEC	268T	0	50	50	Neutral	1.5	80.077	1012	734.76	76
17	P1	2BKX_A(202-209)	KAEAVRKL	HHHHHHHH	No	50	25	25	Basic	3.6	66.488	1204	797.6	44
17	P2	2BZ1_A(3-10)	LKRVAEAK	EEEEEEEE	No	50	25	25	Basic	3.6	47.348	1173	873.21	260
18	P1	2CHH_A(4-11)	GVFTLPAN	CEEECCCC	No	0	38	62	Neutral	2.2	99.601	1000	716.51	97
18	P2	2Q2R_A(21-28)	NAPLTFVG	CCCEEEEE	No	0	38	62	Neutral	2.2	99.613	1010	712.11	154
19	P1	2ISB_A(111-118)	EEVVEAMR	HHHHHHHC	No	47	20	33	Acidic	2.9	93.375	1092	668.75	39
19	P2	2JDJ_A(85-92)	RMAEVVEE	HHEEEEEE	No	47	20	33	Acidic	2.9	99.661	1002	773.7	117
20	P1	2NY1_A(253-260)	PVVSSQLL	CCCCCCEE	257S, 258Q	0	50	50	Neutral	4.1	99.461	1087	756.28	14
20	P2	2A40_B(220-227)	LLQSSVVP	HHHHHCCC	No	0	50	50	Neutral	4.1	99.342	1076	688.67	39
21	P1	2ZBL_A(14-21)	EQETDRIF	HHHHHHHH	No	21	14	64	Acidic	1.1	82.528	1230	707.03	39
21	P2	1M0W_A(271-278)	FIRDTEQE	EECCCCCE	No	21	14	64	Acidic	1.1	84.776	1221	736.33	133
22	P1	1R6X_A(20-27)	KNELLSEA	HHHHHHHH	No	29	29	43	Acidic	1.6	77.567	1129	719.77	110
22	P2	1ED1_A(47-54)	AESLLENK	CHHHHHCH	No	29	29	43	Acidic	1.6	80.533	1117	740.55	163
23	P1	2EIY_A(171-178)	KMEAVAAG	HHHHHHCC	No	25	25	50	Neutral	0.4	93.512	941	619.3	125
23	P2	1TWD_A(232-239)	GAAVAEMK	HHHHHHHH	No	25	25	50	Neutral	0.4	91.508	805	545.36	30

Note: P1 and P2 refer to the octapeptide and its reverse octapeptide, respectively. The Hphil, Hpho and Others columns indicate the occurrence (in percentage) of hydrophilic residues, hydrophobic residues and other residues in the octapeptide, respectively. "No" in the Stabilizing residues column indicates that there are no stabilizing residues; if a stabilizing residue exists, its residue number and single-letter amino acid code are given. CSM, continuous symmetry measure; RMSD, root mean square deviation.

The occurrence of each of the 20 amino acid residues in reverse octapeptides having similar and different conformations is shown in Figure 1. Leucine (L) and glutamic acid (E), which are generally considered helix-favoring residues, appear with high percentage in octapeptide pairs with similar secondary structure, whereas in peptides adopting different secondary structures their occurrence was relatively low (∼10% and 9%, respectively). Since glutamic acid and its salts perform important functions such as neurotransmission and flavor enhancement, understanding the role of glutamic acid in octapeptides may provide useful insights for peptide and biomaterials design [17]. The low occurrence of tryptophan (W) may be explained by the fact that the L-stereoisomer of tryptophan is present in proteins, but the D-stereoisomer is only occasionally found in naturally produced peptides [18].

Figure 1

Occurrence of each amino acid residue in identical reverse octapeptide pairs with similar and different conformations

The octapeptide pairs with similar secondary structure consist mainly of hydrophilic residues, which are generally surrounded by very few contacting residues [19-20]. Since hydrophobic residues tend to be buried in the core of proteins, the low occurrence of hydrophobic residues in octapeptide pairs with similar secondary structure suggests that these octapeptide fragments are found on the protein surface and contribute to various interactions with other molecules. On the other hand, the majority of the residues in octapeptide pairs with different secondary structures are neither hydrophilic (D, E, R, K and H) nor hydrophobic (F, I, L, M, V, W, A and P). Instead, these octapeptide pairs are mainly made up of residues such as G, S, T, C, N, Q and Y.

Volume and surface area of octapeptide pairs

The functional specificity of a peptide arises from its unique physicochemical properties such as volume and surface area, which are important geometrical quantities associated with macromolecular structures and motions. Hence, we performed a systematic analysis of the surface area and volume of octapeptide pairs with similar conformations (Table S1) and those with different conformations (Table 1). The correlation coefficients between the volumes of the octapeptides and their reverses with similar and different secondary structures are 0.93 and 0.91, respectively. Similarly, the correlation coefficients between the surface areas of the identical reverse octapeptide pairs with similar and different secondary structures are 0.95 and 0.73, respectively. However, the correlation coefficients for the volume and surface area of unrelated octapeptide pairs with similar and different secondary structures are very poor (−0.04 and −0.03) (Table S2). This contrast with unrelated octapeptide pairs implies that an octapeptide and its reverse octapeptide show similar geometric properties such as volume and surface area. Hence, from our results, we suggest that octapeptides and their reverses have similar spatial arrangements, like related octapeptide pairs, whereas unrelated octapeptide pairs do not have similar volumes and surface areas. The average Cα root mean square deviation (RMSD) of identical reverse octapeptides is 2.27 Å, compared with 3.01 Å for unrelated octapeptide pairs. This implies that the Cα atoms of neither the unrelated octapeptides nor the identical reverse octapeptides can be accurately superimposed on their partners, probably due to the influence of several other factors such as conformation angle preferences, surrounding residues and the position of the octapeptide in the sequence.
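As an illustration of the correlation analysis described above, the following minimal sketch computes a Pearson correlation coefficient between the volumes of P1 and P2 taken from the Volume column of Table 1. The use of the Pearson coefficient is an assumption, since the text does not name the type of correlation coefficient, and the function name `pearson` is illustrative.

```python
import math

# Volumes (Å³) transcribed from Table 1, pairs 1 to 23 (P1 = octapeptide, P2 = its reverse).
VOLUME_P1 = [971, 1260, 959, 1117, 1178, 1378, 864, 972, 1140, 1023, 787, 918,
             1126, 907, 1048, 1000, 1204, 1000, 1092, 1087, 1230, 1129, 941]
VOLUME_P2 = [949, 1305, 1118, 1129, 1102, 1363, 960, 967, 1154, 1026, 776, 886,
             1112, 896, 1043, 1012, 1173, 1010, 1002, 1076, 1221, 1117, 805]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_volume = pearson(VOLUME_P1, VOLUME_P2)
print(f"volume correlation (P1 vs P2): {r_volume:.2f}")
```

The same function could be applied to the Surface area column. The result need not match the reported coefficients exactly if the authors used a different estimator or rounding.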

Continuous symmetry of octapeptide pairs

Chirality and symmetry are key descriptors of molecules. Continuous symmetry is a quantitative measure used to track the changes in molecules with similar composition during dynamic processes. The values of the continuous symmetry measure (CSM) are high for most of the octapeptide pairs with different secondary structures (Table 1) or with similar secondary structure (Table S1), which implies that the symmetry of the molecules is distorted. The difference between the CSM of an octapeptide and that of its reverse varies among the octapeptide pairs. Since molecular symmetry can explain many chemical properties, the high CSM values in both cases (octapeptide pairs with similar or different secondary structures) imply that it is difficult to interpret the physical and chemical properties of octapeptide pairs in terms of their conformations.

Dipole and quadrupole moments of octapeptide pairs

The net charge and dipole moment of a protein may affect its binding properties and function [21]; hence, we computed the net charge and dipole moment of each octapeptide and its reverse and examined the correlation between them. No obvious correlation was observed between net charges and dipole moments. The correlation coefficients between the dipole moments of octapeptides and their reverses were −0.75 and −0.52 for pairs with similar secondary structure and with different secondary structures, respectively. The difference in dipole moments of the identical reverse octapeptides with different conformations is shown in Figure 2. The results reveal the importance of the dipole moment in the adoption of different conformations by octapeptide pairs, even though they possess similar residue composition, hydrophobicity, hydrophilicity and volume. A similar trend is observed for the quadrupole moments of the octapeptides and their reverses. These analyses suggest that the dipole and quadrupole moments may play a vital role in protein fragments involved in various binding processes.

Figure 2

Difference in dipole moment of identical reverse octapeptides with different conformations

Role of stabilizing residues in reverse octapeptides

It is important to know how proteins are stabilized by various residues and their interactions. To understand the role of stabilizing residues in identical reverse octapeptides, we predicted the stabilizing residues using the SRide server [22]. The results reveal that residues present in octapeptide pairs with similar secondary structure did not contribute to the stability of the protein. Among octapeptide pairs with different secondary structures, four pairs contained stabilizing residues and thus contributed to the stability of the protein (Table 1). Our results suggest that octapeptide pairs with similar secondary structure may not contribute to stability, whereas identical reverse octapeptides with different secondary structures may be significant for the stability of the protein. Understanding the mechanism by which protein stability is achieved is a challenging task and may help in protein design experiments [23].

GO terms of protein pairs

The descriptions of proteins containing identical reverse octapeptides with similar and different secondary structures and their corresponding GO terms are given in Tables S3 and S4, respectively. Most of the proteins containing reverse octapeptides are enzymes. Interestingly, 12 hydrolase classes of enzymes contain identical reverse octapeptides. Reverse octapeptides are also observed in transferases, isomerases and toxins. From the GO terms extracted for octapeptide pairs with similar secondary structure, 37.5% of the proteins are involved in binding (DNA binding, metal ion binding, GTP and ATP binding), and 18.75% of the proteins are hypothetical (Figure 3). One synthetically designed protein, the GCN4 leucine zipper protein, whose biophysical properties have been extensively studied [24], is also observed.

Figure 3

Distribution of GO terms of protein containing identical reverse octapeptides with similar and different secondary structures

In the case of octapeptide pairs with different secondary structures, 60.86% of the proteins carry GO terms related to binding, including zinc ion binding, manganese binding, DNA/RNA binding, cation binding, metal ion binding, carbohydrate binding, cellulose binding, magnesium ion binding, protein binding, ATP binding and copper ion binding. A further 6.52% of the proteins are annotated as transferases acting on phosphate-containing groups. This group also includes two hypothetical proteins and one synthetic protein (Figure 3). Among the 46 proteins, 11 belong to the family of hydrolases and 6 belong to the family of transferases. These results suggest that reverse octapeptides are mostly found in enzymes, particularly in the hydrolase family.

Discussion

In recent years, several methods based on similarity and dissimilarity comparison have advanced rapidly to help understand the structure, function and properties of proteins [25-28], paving the way to the design of novel proteins. For example, Alexander et al. [29] designed two proteins with identical sequences apart from one residue mismatch and found that these two proteins adopt completely different folds and functions. Despite this progress, it remains unclear whether the information gained from such studies is strong enough to predict structure with 100% accuracy. Therefore, we believe that inverse sequence similarity comparisons of protein or peptide sequences are also important for gaining valuable insights into the protein sequence-structure relationship. Since an identical match of up to eight amino acids may not imply structural similarity [30], we performed an identical reverse octapeptide search to determine whether inverse octapeptide sequences are present in unrelated proteins. Although such analyses of oligopeptides have been actively pursued previously [31,32], no previous reports were found in the literature regarding reverse peptide sequence searches. This is the first such analysis on polypeptides. Our analysis examined a recently expanded dataset of experimentally determined, high-resolution PDB sequences and structures (3124 unrelated proteins) and revealed the following observations. We find very low occurrences of octapeptides and their reverses in PDB sequences. Because a protein molecule has directionality along its sequence, a protein with the reversed sequence is not simply the opposite of the original. Thus, it is not surprising that pairs of certain octapeptides and their reverse octapeptides sometimes have similar secondary structure. The amino acid composition of octapeptides and their reverse octapeptides reveals a relatively high proportion of L and E and null occurrence of W and Y.
Also, the high proportion of hydrophilic residues in octapeptide pairs implies that these residues may form few contacts and contribute to binding processes rather than to the stability of the protein. Therefore, our analysis indicates that the low abundance of identical reverse octapeptides in unrelated proteins is due to amino acid composition. From the volume and surface area computations, the identical reverse octapeptide pairs have similar volumes and surface areas, whereas unrelated octapeptide pairs do not. Also, the higher continuous symmetry measure of identical reverse octapeptides indicates a high distortion of the symmetry of the molecules. From the analysis of GO terms, we found that the proteins containing octapeptide pairs are mostly enzymes and are involved mainly in binding and transport processes. Physical properties such as dipole moment and charge are important factors that can be related to the laws of protein folding and to protein function. Differences in the surfaces of protein fragments and differences in dipole and quadrupole moments are the major reasons for the octapeptide pairs adopting different structures and functions. Although a systematic search for the reverse form of the overlapping octapeptide segments is an interesting topic, only 8 octapeptide pairs (octapeptide and reverse octapeptide) with similar secondary structures and 23 octapeptide pairs with different secondary structures were identified in 3124 PDB sequences. Although the number of octapeptide pairs seems too low to provide insight into the sequence-structure relationship of proteins, the current study indicates that proteins containing identical reverse octapeptides are not very abundant and that they do not display specific biophysical and geometric properties as compared to unrelated octapeptides.

Materials and methods

Dataset

In our previous work [15], we compiled 3124 non-redundant protein sequences and their corresponding three-state secondary structure assignments from the PDB [16] to perform various computational studies. The protein sequences in the dataset were reversed using a string reverse algorithm implemented in an in-house Perl program. Overlapping octapeptide segments of each of the reversed protein sequences were searched against the 3124 non-redundant PDB sequences. The search yields the reversed form of an octapeptide that occurs in an unrelated protein. The octapeptide pairs (octapeptide and its reverse octapeptide) thus obtained were grouped into pairs with similar secondary structures and pairs with different secondary structures. The percentage occurrence of amino acid residues in each of the groups was also computed. Additionally, to determine whether reverse octapeptides having different secondary structures display specific properties compared to unrelated octapeptides, we created a control group of unrelated octapeptides by randomly shuffling the 23 pairs of identical reverse octapeptides with different conformations for a comparative study.
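The search procedure described above can be sketched as follows. This is a minimal illustration in Python, not the authors' in-house Perl program: it indexes every overlapping eight-residue window and then looks up the reversed window in a different protein. The function names (`index_kmers`, `find_reverse_pairs`) and the toy sequences are hypothetical.

```python
from collections import defaultdict

K = 8  # octapeptide length


def index_kmers(sequences, k=K):
    """Map each overlapping k-residue window to the (protein_id, start) sites where it occurs."""
    index = defaultdict(list)
    for protein_id, seq in sequences.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((protein_id, i + 1))  # 1-based start position
    return index


def find_reverse_pairs(sequences, k=K):
    """Return (peptide, site, reversed_peptide, site) tuples found in two different proteins."""
    index = index_kmers(sequences, k)
    pairs = []
    for peptide, sites in index.items():
        reverse = peptide[::-1]
        # Skip palindromic windows and report each peptide/reverse pair only once.
        if reverse not in index or peptide >= reverse:
            continue
        for site in sites:
            for reverse_site in index[reverse]:
                if site[0] != reverse_site[0]:  # the reverse must lie in another protein
                    pairs.append((peptide, site, reverse, reverse_site))
    return pairs


# Toy input (hypothetical sequences embedding VPGINGKA and its reverse AKGNIGPV).
toy_sequences = {
    "protA": "MKVPGINGKAQW",
    "protB": "TTAKGNIGPVLL",
    "protC": "GGGGSSSSGGGG",  # contains a window and its reverse within the same protein
}
print(find_reverse_pairs(toy_sequences))
```

Indexing all windows once keeps the search roughly linear in the total sequence length, instead of comparing every reversed window against every sequence. Grouping the hits by secondary structure would additionally require the three-state assignments, which are not modeled here.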

Computation of percentage of hydrophobic, hydrophilic and charged residues

The percentages of hydrophobic (F, I, L, M, V, W, A and P), hydrophilic (D, E, R, K and H) and other residues (G, S, T, C, N, Q and Y) were computed for the octapeptide pairs adopting similar and different secondary structures. We also computed the net charge and attributes (acidic, basic and neutral) of the octapeptide pairs. To perform these computations, we used a peptide property calculator web server, which is freely available at https://www.genscript.com/ssl-bin/site2/peptide_calculation.cgi. The structures of identical reverse octapeptide pairs and unrelated octapeptides were superimposed on each other to obtain the RMSD value using the Superimpose server [33].
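The residue classification given above can be turned into a short composition calculation. The sketch below is only an illustration based on the three residue classes listed in this section; it is not the GenScript calculator, so its output need not agree with the percentages reported in Table 1. The function name `composition` is hypothetical.

```python
# Residue classes as listed in this section.
HYDROPHOBIC = set("FILMVWAP")
HYDROPHILIC = set("DERKH")
OTHER = set("GSTCNQY")


def composition(peptide):
    """Return the percentage of hydrophilic, hydrophobic and other residues in a peptide."""
    if not peptide:
        raise ValueError("empty peptide")
    counts = {"hydrophilic": 0, "hydrophobic": 0, "other": 0}
    for residue in peptide.upper():
        if residue in HYDROPHILIC:
            counts["hydrophilic"] += 1
        elif residue in HYDROPHOBIC:
            counts["hydrophobic"] += 1
        elif residue in OTHER:
            counts["other"] += 1
        else:
            raise ValueError(f"unknown residue: {residue!r}")
    return {name: 100.0 * n / len(peptide) for name, n in counts.items()}


# An octapeptide and its reverse always share the same composition.
print(composition("VPGINGKA"))
print(composition("VPGINGKA"[::-1]))
```

Because composition ignores residue order, any difference between an octapeptide and its reverse has to come from order-dependent or structure-dependent properties, which is why the geometric and electrostatic measures are computed separately.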

Computation of volume and surface area of identical reverse octapeptides

To find the difference in the physical properties of octapeptide pairs (octapeptide and its reverse), we computed volume and surface area using the structure analysis web servers of the Helix Systems at Scientific Super Computing at the National Institutes of Health, accessible at http://helixweb.nih.gov/structbio/basic.html.

Computation of continuous symmetry measure

To understand how a change of symmetry affects the chemical and physical properties of a protein fragment, we computed the CSM by using a web server accessible at http://www.csm.huji.ac.il/new/?cmd=symmetry. The measure is based on the distance between the investigated molecule and the nearest structure that has the desired symmetry. The value of this measure ranges from 0 to 100 and higher CSM values indicate the change in the physical shape of octapeptide fragments [34].

Computation of dipole and quadrupole moments

Dipole and quadrupole moments of molecules might play a significant role in affecting the properties and activities of proteins. Hence, we computed the dipole and quadrupole moments of octapeptide fragment pairs using a web server located at the Weizmann Institute, Israel [21].
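For orientation, the dipole moment of a set of point charges can be written as the sum of each charge multiplied by its position vector. The sketch below illustrates this textbook definition only; it is not the algorithm of the server used in this study. The choice of the geometric center as origin, the input format and the function name `dipole_moment` are assumptions made for illustration.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # 1 e·Å expressed in Debye


def dipole_moment(charges, coords):
    """Dipole vector and magnitude (Debye) for point charges (in e) at coords (in Å).

    Positions are taken relative to the geometric center (an assumed origin);
    for a molecule with non-zero net charge the result depends on this choice.
    """
    if len(charges) != len(coords) or not coords:
        raise ValueError("charges and coords must be non-empty and equally long")
    n = len(coords)
    center = [sum(c[axis] for c in coords) / n for axis in range(3)]
    mu = [0.0, 0.0, 0.0]
    for q, position in zip(charges, coords):
        for axis in range(3):
            mu[axis] += q * (position[axis] - center[axis])
    mu = [E_ANGSTROM_TO_DEBYE * component for component in mu]
    magnitude = math.sqrt(sum(component ** 2 for component in mu))
    return mu, magnitude


# Two opposite unit charges 1 Å apart give a dipole of about 4.8 Debye.
vector, magnitude = dipole_moment([+1.0, -1.0], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
print(vector, magnitude)
```

Since the dipole depends on where the charges sit in space, two peptides with identical composition can have quite different dipole moments once they adopt different conformations.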

Prediction of stabilizing residues in identical reverse octapeptides

In order to identify whether the residues in an octapeptide and its reverse octapeptide play a vital role in stabilizing the protein molecule, we predicted the stabilizing residues using the SRide server [22]. This server predicts stabilizing residues using several structure-based parameters such as surrounding hydrophobicity [35], long-range order [36], conservation score, etc.

Extraction of GO terms

To find out what types of proteins or enzymes possess identical reverse octapeptides, the GO terms of the proteins containing octapeptide pairs (octapeptide and its reverse) were extracted from PDB annotations and stored in an appropriate format. The extraction was fully automated using in-house Java scripts on a Sun Ultra 40 M2 workstation.

Authors’ contributions

SS and KMS designed the research. KMS performed computation. SS and KMS analyzed data. KMS wrote the manuscript. Both authors read and approved the final manuscript.

Competing interests

The authors declare no competing interests.

References

K Saraboji; M Michael Gromiha; M N Ponnuswamy. Relative importance of secondary structure and solvent accessibility to the stability of protein mutants. A case study with amino acid properties and energetics on T4 and human lysozymes. Comput Biol Chem, 2005-02.

A A Bogan; K S Thorn. Anatomy of hot spots in protein interfaces (review). J Mol Biol, 1998-07-03.

S Sudarsanam. Structural diversity of sequentially identical subsequences of proteins: identical octapeptides can have different conformations. Proteins, 1998-02-15.

R Preissner; A Goede; E Michalski; C Frömmel. Inverse sequence similarity in proteins and its relation to the three-dimensional fold. FEBS Lett, 1997-09-08.

P K Pallaghy; A P Melnikova; E C Jimenez; B M Olivera; R S Norton. Solution structure of contryphan-R, a naturally occurring disulfide-bridged octapeptide containing D-tryptophan: comparison with protein loops. Biochemistry, 1999-08-31.

Jagdish Rai. Retroinverso mimetics of S peptide. Chem Biol Drug Des, 2007-11-10.

Csaba Magyar; M Michael Gromiha; Gerard Pujadas; Gábor E Tusnády; István Simon. SRide: a server for identifying stabilizing residues in proteins. Nucleic Acids Res, 2005-07-01.

Raphael A Bauer; Philip E Bourne; Arno Formella; Cornelius Frömmel; Christoph Gille; Andrean Goede; Aysam Guerler; Andreas Hoppe; Ernst-Walter Knapp; Thorsten Pöschel; Burghardt Wittig; Valentin Ziegler; Robert Preissner. Superimpose: a 3D structural superposition server. Nucleic Acids Res, 2008-05-20.

Clifford E Felder; Jaime Prilusky; Israel Silman; Joel L Sussman. A server and database for dipole moments of proteins. Nucleic Acids Res, 2007-05-25.

Kim Henrick; Zukang Feng; Wolfgang F Bluhm; Dimitris Dimitropoulos; Jurgen F Doreleijers; Shuchismita Dutta; Judith L Flippen-Anderson; John Ionides; Chisa Kamada; Eugene Krissinel; Catherine L Lawson; John L Markley; Haruki Nakamura; Richard Newman; Yukiko Shimizu; Jawahar Swaminathan; Sameer Velankar; Jeramia Ory; Eldon L Ulrich; Wim Vranken; John Westbrook; Reiko Yamashita; Huanwang Yang; Jasmine Young; Muhammed Yousufuddin; Helen M Berman. Remediation of the protein data bank archive. Nucleic Acids Res, 2007-12-11.
