Vijayaraj Nagarajan, Navodit Kaushik, Beddhu Murali, Chaoyang Zhang, Sanyogita Lakhera, Mohamed O Elasri, Youping Deng.
Abstract
BACKGROUND: Naturally occurring antimicrobial peptides are being explored as candidate peptide drugs. Since antimicrobial peptides are part of the innate immune system of every living organism, new candidate peptides can be discovered from the available genomic and proteomic data. High-throughput computational techniques could also be used to virtually scan the entire peptide space for new candidate antimicrobial peptides.

RESULTS: We have identified a unique indexing method based on biologically distinct characteristic features of known antimicrobial peptides. Fourier transformation of the entries in the antimicrobial peptide databases, encoded with our indexing method, revealed a distinct peak in their power spectrum. We have developed a method that mines genomic and proteomic data for peptides with potential antimicrobial activity by looking for this distinct peak. We also used the Euclidean metric to rank candidate peptides by their potential antimicrobial activity. We have parallelized our method so that virtually any given protein space can be mined for antimicrobial peptides.
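The spectral step described in the abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical one-property index (1 for residues in an illustrative hydrophobic set, 0 otherwise) in place of the authors' multi-property index, which this record does not specify. The names `encode`, `power_at_period`, and `dominant_period` are chosen for illustration, and mean-centring the signal is a choice made here, not something stated in the source.

```python
import cmath

# Hypothetical single-property index (an assumption, not the authors' index):
# 1.0 for residues in an illustrative hydrophobic set, 0.0 otherwise.
HYDROPHOBIC = set("AILMFVWC")

def encode(peptide):
    """Map a peptide string to a numeric signal, one value per residue."""
    return [1.0 if aa in HYDROPHOBIC else 0.0 for aa in peptide.upper()]

def power_at_period(values, period):
    """Power |X(f)|^2 of the mean-centred signal at frequency f = 1/period."""
    n = len(values)
    mean = sum(values) / n
    f = 1.0 / period
    x = sum((v - mean) * cmath.exp(-2j * cmath.pi * f * i)
            for i, v in enumerate(values))
    return abs(x) ** 2

def dominant_period(values, periods):
    """Return the candidate period with the largest spectral power."""
    return max(periods, key=lambda p: power_at_period(values, p))

if __name__ == "__main__":
    # Synthetic signal that repeats every 5 positions.
    signal = [1.0, 0.0, 0.0, 0.0, 0.0] * 4
    print(dominant_period(signal, [2, 3, 4, 5, 10]))
```

Evaluating the transform directly at frequency 1/period avoids requiring the sequence length to be a multiple of the period, which matters for short peptides of arbitrary length.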
Year: 2006 PMID: 17118141 PMCID: PMC1683563 DOI: 10.1186/1471-2105-7-S2-S2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Property-based coding method. The spectrum of the individual components (dftH-hydrophobicity, dftC-charge, dftP-polarity, dftS-cysteine, dftD-amino acid distribution) and the power spectrum (Total) of 6 different antimicrobial sequences. The power spectrum (Total) shows a clear peak at period 5 in all 6 sequences; this peak is more distinct, sharper, and higher in magnitude than the same peak in dftH. The noise level is also much lower in the Total spectrum than in dftH. The Total spectrum appears to capture the combined contribution of all the properties taken into consideration.
Figure 2. Power spectrum of new hits. Plot of the reference power spectrum and the power spectra of the three hits. A distinct peak is clearly seen at period 5.
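Comparing candidate spectra with a reference spectrum using the Euclidean metric, as the abstract mentions, could look like the sketch below. It assumes all spectra are sampled on the same grid of periods and that a smaller distance means a better rank; both points are assumptions, and the function names and example values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long spectra."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled on the same grid")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_distance(reference, candidates):
    """Return (name, distance) pairs, closest to the reference first."""
    scored = [(name, euclidean(reference, spectrum))
              for name, spectrum in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])

if __name__ == "__main__":
    # Made-up spectra for illustration only; not data from the paper.
    reference = [0.0, 1.0, 4.0, 1.0]
    candidates = {"candA": [0.0, 1.0, 3.0, 1.0],
                  "candB": [2.0, 2.0, 0.0, 2.0]}
    for name, dist in rank_by_distance(reference, candidates):
        print(name, round(dist, 3))
```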
Sequence analysis details for APD predictions
| | G\|P (#) | Naa (#) | Paa (#) | HR (%) | NetC | BI (kcal/mol) |
| hit1 | 0 | 1 | 4 | 31 | 3 | 3.75 |
| hit2 | 1 | 1 | 3 | 62 | 2 | 0.79 |
| hit3 | 2 | 4 | 2 | 31 | -2 | - |
G|P is the number of Glycine|Proline residues, Naa is the number of negatively charged amino acids, Paa is the number of positively charged amino acids, HR is the hydrophobicity ratio, NetC is the net charge, and BI is the Boman Index in kcal/mol.
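The count-based columns of the table can be reproduced from a sequence with a short sketch. In every row above, NetC equals Paa minus Naa, and the code uses that relation. Counting K and R as positive, D and E as negative, and reading G|P as the combined glycine and proline count are assumptions, since this record does not define them. HR and BI are omitted because their exact definitions are not given here.

```python
def composition_summary(peptide, positive="KR", negative="DE"):
    """Count-based descriptors of a peptide.

    The residue sets are assumptions: K/R as positive and D/E as negative.
    G|P is read here as the combined number of glycine and proline residues.
    """
    seq = peptide.upper()
    gp = sum(seq.count(aa) for aa in "GP")
    naa = sum(seq.count(aa) for aa in negative)
    paa = sum(seq.count(aa) for aa in positive)
    return {"G|P": gp, "Naa": naa, "Paa": paa, "NetC": paa - naa}

if __name__ == "__main__":
    # Arbitrary toy sequence, not one of the reported hits.
    print(composition_summary("GKKDPR"))
```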
BLAST similarity search results against APD
| | Similar hits | % similarity |
| hit1 | AP00402 | 30 |
| | AP00403 | 25 |
| | AP00163 | 24.13 |
| hit2 | AP00511 | 38.88 |
| | AP00214 | 38.09 |
| | AP00303 | 35.29 |
| hit3 | AP00335 | 31.57 |
| | AP00288 | 26.08 |
| | AP00072 | 25.92 |
Figure 3. Parallel processing performance. Plot of computation time vs. problem size for different numbers of processing elements during the virtual data mining of 165 random sequences.
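The idea of spreading a scan over several processing elements can be sketched as follows. This record does not say which parallel framework the authors used, so the thread pool below is only a stand-in, and `cationic_fraction` is a hypothetical placeholder score rather than the spectral method itself.

```python
from concurrent.futures import ThreadPoolExecutor

def cationic_fraction(peptide):
    """Placeholder score (hypothetical): fraction of K/R residues."""
    seq = peptide.upper()
    if not seq:
        return 0.0
    return sum(seq.count(aa) for aa in "KR") / len(seq)

def scan(peptides, workers=4, scorer=cationic_fraction):
    """Score every peptide with a pool of workers; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(scorer, peptides))

if __name__ == "__main__":
    print(scan(["KKAA", "AAAA", "RKRK"], workers=2))
```

For CPU-bound scoring in Python, `ProcessPoolExecutor` from the same module offers the same `map` interface and would be the more natural choice; threads are used here only to keep the sketch simple.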