Varughese Deepthi1, Vineetha V I Nair2, Vipin Thomas1, Navya Raj1, Shidhi P Ramakrishnan1, Juveria Khan3, Monika Kaushik3, Pawan K Dhar4, Achuthsankar S Nair1.
Abstract
De novo gene emergence, the most fundamental form of genetic diversity, is attracting the attention of the scientific community. The recent identification of short open reading frames (sORFs) in the non-coding regions of different genomes has driven this line of thought. The coding potential of these newly identified sORFs has been investigated through experimental and computational approaches in recent studies. In the present work, we attempted to derive peptides from intergenic sequences of the D. melanogaster genome, with a view to therapeutic applications. Towards this goal, we found strong computational evidence for 145 conformationally stable peptides derived from these intergenic sequences. The structures of these unique peptides were predicted using an ab initio method, and functional annotation was carried out using this structural information. The newly generated proteins were categorised as DNA-, protein- or ion-binding proteins and electron transporters, with a very few categorised as enzymes. Experimental studies can provide validation of these preliminary findings. This work provides further evidence of the untapped potential of the non-coding genome.
Keywords: Non-coding; antimicrobial peptides; de novo peptides; junk DNA; short ORFs
Year: 2016 PMID: 28149056 PMCID: PMC5267965 DOI: 10.6026/97320630012202
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1. In silico strategy designed for identifying potential peptides from non-coding DNA
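The record does not spell out the ORF-calling rules behind this strategy. As a minimal sketch only, assuming a conventional ATG-to-stop definition, forward-strand scanning in three frames, and hypothetical length bounds (`min_aa`, `max_aa`, neither taken from the source), candidate peptides could be pulled from an intergenic sequence along these lines:

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO
    )
}


def find_sorfs(dna, min_aa=10, max_aa=100):
    """Return (start_index, peptide) for ATG..stop ORFs on the forward strand.

    min_aa and max_aa are illustrative thresholds, not values from the paper.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] != "ATG":
                i += 3
                continue
            residues, j, stopped = [], i, False
            while j + 3 <= len(dna):
                aa = CODON_TABLE.get(dna[j:j + 3], "X")  # X for ambiguous codons
                if aa == "*":
                    stopped = True
                    break
                residues.append(aa)
                j += 3
            # Keep only ORFs closed by a stop codon and within the length bounds.
            if stopped and min_aa <= len(residues) <= max_aa:
                hits.append((i, "".join(residues)))
            i = j + 3  # resume scanning after the stop codon
    return hits


# Toy sequence (made up, not from D. melanogaster).
print(find_sorfs("CCATGAAATTTTAGGG", min_aa=3))
```

A fuller pipeline would also scan the reverse complement and screen the resulting peptides against known proteins before any structure prediction.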
Figure 2. Predicted protein patterns from the novel peptides / proteins
Sequence profiles predicted from EKA sequence dataset (145 peptides)
| Protein sequence Profiles | Frequency |
| Microbodies C-terminal targeting signal | 7 |
| Histidine-rich region profile | 1 |
| Cysteine-rich region profile | 2 |
| Phenylalanine rich region profile | 1 |
| Leucine zipper pattern | 8 |
| Bipartite nuclear localization signal profile | 1 |
| Prenyl group binding site (CAAX box) | 4 |
| Cell attachment sequence | 3 |
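Several of the profiles listed above have simple sequence signatures. The sketch below assumes PROSITE-style definitions for two of them (leucine zipper as L-x(6)-L-x(6)-L-x(6)-L, cell attachment sequence as R-G-D); the record does not state which tool or pattern definitions were actually used, and the toy peptide is invented for illustration, not an EKA sequence.

```python
import re

# Assumed PROSITE-style signatures, written as regular expressions.
MOTIFS = {
    "Leucine zipper pattern": re.compile(r"L.{6}L.{6}L.{6}L"),
    "Cell attachment sequence": re.compile(r"RGD"),
}


def scan_motifs(peptide):
    """Map each motif name to the 0-based start positions of its matches."""
    peptide = peptide.upper()
    return {
        name: [m.start() for m in rx.finditer(peptide)]
        for name, rx in MOTIFS.items()
    }


# Hypothetical peptide: four leucines spaced seven residues apart, then RGD.
toy = "M" + "LAAAAAA" * 3 + "L" + "RGD"
print(scan_motifs(toy))
```

Short patterns such as RGD match by chance quite often, so hits of this kind are usually treated as candidates rather than confirmed functional sites.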
EKA sequences showing sequence similarity with antimicrobial peptides
| Peptide ID | AMPs | Activity |
| EKA-26 | WAM2 | Gram+ & Gram- |
| EKA-31 | Latarcin 1 | Gram+ & Gram-, Fungi, Mammalian cells |
| EKA-35 | BMAP-27 | Gram+ & Gram-, Virus, Fungi, Parasites |
| EKA-56 | Ranatuerin 2 | Gram+ & Gram- |
| EKA-66 | Hyfl G | Unknown |
| EKA-80 | Cliotide T1 | Mammalian cells, Cancer cells |
WAM2 = wallaby antimicrobial 2
Predicted structural properties of the novel EKA peptides (best results shown)
| Peptide ID | I-TASSER (C-score) | Total energy | No. of stability centres | Instability index | Non-covalent interactions | Non-canonical interactions |
| EKA-1 | 2.646 | -1780.711 | 6 | 37.9 | 84 | 12 |
| EKA-7 | -2.435 | -1219.2 | 5 | 11.63 | 78 | 115 |
| EKA-55 | -1.498 | -955.09 | 6 | 38.62 | 100 | 13 |
| EKA-95 | -2.048 | -1494.83 | 6 | 29.48 | 84 | 16 |
| EKA-109 | -2.422 | -1731.463 | 7 | 25.81 | 44 | 20 |
| EKA-143 | -2.18 | -1953.243 | 8 | 29.91 | 35 | 10 |
| EKA-149 | 1.55 | -5624.295 | 15 | 29.53 | 264 | 378 |
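Two of these columns have commonly quoted reference ranges: an instability index below 40 is conventionally read as predicting a stable protein, and I-TASSER C-scores typically fall between -5 and 2. The sketch below applies both as a sanity check to the values tabulated above. The thresholds are general conventions, not criteria stated in this record.

```python
# (peptide, C-score, instability index), copied from the table above.
ROWS = [
    ("EKA-1", 2.646, 37.9),
    ("EKA-7", -2.435, 11.63),
    ("EKA-55", -1.498, 38.62),
    ("EKA-95", -2.048, 29.48),
    ("EKA-109", -2.422, 25.81),
    ("EKA-143", -2.18, 29.91),
    ("EKA-149", 1.55, 29.53),
]

STABLE_BELOW = 40.0          # conventional instability-index cutoff (assumed)
C_SCORE_RANGE = (-5.0, 2.0)  # typical I-TASSER C-score range (assumed)


def review(rows):
    """Return (peptides predicted stable, peptides with atypical C-scores)."""
    stable = [pid for pid, _, ii in rows if ii < STABLE_BELOW]
    lo, hi = C_SCORE_RANGE
    atypical = [pid for pid, c, _ in rows if not lo <= c <= hi]
    return stable, atypical


stable, atypical = review(ROWS)
print("Predicted stable:", stable)
print("C-score outside typical range:", atypical)
```

Under these assumptions, all seven peptides fall below the instability cutoff, and any C-score reported outside the typical range is worth checking against the original publication.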
Function Annotation and Localization predictions using Gene Ontology Terms
| Peptide ID | Peptide length | GO-molecular function predicted | GO Location predicted | Structural homologs with known function | RMSD |
| EKA-21 | 43 | lipid binding | integral to membrane | GTP binding Protein | 1.1 |
| EKA-27 | 33 | voltage-gated potassium channel activity | voltage-gated potassium channel complex | Glutathione-regulated potassium-efflux system | 1.7 |
| EKA-36 | 33 | DNA binding | nucleus | GCN4 (general control protein) | 0.55 |
| EKA-91 | 35 | adenyl ribonucleotide binding | plasma membrane | RecX, DNA binding protein | 1.5 |
| EKA-97 | 46 | DNA binding | nucleus | Ligand Binding | 2.2 |
| EKA-115 | 40 | Ion binding | Proteasome accessory complex | Proteasome activator | 1.1 |
| EKA-124 | 43 | Ion binding | cell | RAB1 domain | 1.4 |
EKA peptide sequences and their structural homologs
| Peptide ID | Structural homolog | Z-score | RMSD | Total residues | Aligned residues | Similar molecule |
| EKA-3 | 2HY6-D | 3.9 | 1.7 | 32 | 32 | General control protein gcn4 |
| EKA-21 | 3AHA-F | 4.1 | 1 | 34 | 34 | Transmembrane protein gp41 |
| EKA-36 | 4NJ2-A | 4.6 | 1.2 | 33 | 33 | General control protein gcn4 |
| EKA-50 | 3W8V-A | 3.1 | 2.8 | 32 | 30 | Gcn4n coiled coil peptide |
| EKA-63 | 4CBJ-G | 4.8 | 2.1 | 69 | 49 | ATP synthase subunit c |
| EKA-81 | 3HTU-H | 3.8 | 1.8 | 36 | 36 | Vacuolar protein-sorting-associated protein |
| EKA-95 | 2KHH-A | 4.6 | 2.6 | 57 | 57 | mRNA export factor mex67 |
| EKA-115 | 1EC5-A | 5.5 | 1.8 | 48 | 40 | Protein (four-helix bundle model) |
| EKA-117 | 2R2V-H | 3.6 | 2.2 | 31 | 30 | Gcn4 leucine zipper |
| EKA-145 | 4OJK-C | 5 | 1 | 37 | 37 | Ras-related protein rab-11b |
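RMSD values are easier to compare when read alongside the fraction of residues that were actually aligned. As a rough sketch, assuming the "total" column is the length over which each alignment was attempted (the record does not define it), alignment coverage can be computed from the table above as aligned residues divided by total residues:

```python
# (peptide, total residues, aligned residues), copied from the table above.
HOMOLOGS = [
    ("EKA-3", 32, 32),
    ("EKA-21", 34, 34),
    ("EKA-36", 33, 33),
    ("EKA-50", 32, 30),
    ("EKA-63", 69, 49),
    ("EKA-81", 36, 36),
    ("EKA-95", 57, 57),
    ("EKA-115", 48, 40),
    ("EKA-117", 31, 30),
    ("EKA-145", 37, 37),
]


def coverage(total, aligned):
    """Fraction of residues included in the structural alignment."""
    return aligned / total


# Peptides whose alignment does not span every residue.
partial = [pid for pid, total, aligned in HOMOLOGS if coverage(total, aligned) < 1.0]

for pid, total, aligned in HOMOLOGS:
    print(f"{pid}: {coverage(total, aligned):.2f}")
print("Partial alignments:", partial)
```

A low RMSD over a partial alignment says less about overall fold similarity than the same RMSD over a full-length alignment, which is why the two numbers are best reported together.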
Figure 3. EKA peptide structures predicted using I-TASSER: (A) EKA-27, (B) EKA-33, (C) EKA-36 with the structural homolog GCN4 superimposed. The GCN4 monomer is shaded red and the EKA-36 peptide blue.