Achuthsankar S Nair, Sivarama Pillai Sreenadhan.
Abstract
This paper presents a revision of the existing genomic signal processing method for locating exons, which employs four binary indicator sequences. The existing method relies on the pronounced period-three peaks observed in the Fourier power spectrum of exon regions, which are absent in non-coding regions. The authors abandon the four sequences altogether and adopt a single 'EIIP indicator sequence', formed by substituting the electron-ion interaction pseudopotential (EIIP) of each nucleotide (A, G, C and T) into the DNA sequence, which reduces the computational overhead by 75%. The power spectrum of this sequence reveals period-three peaks for exon regions. In addition, a number of exons have been identified that exhibit period-three peaks when mapped to the EIIP indicator sequence but not when the binary indicator sequences are employed. Better discrimination between exon and non-coding areas was obtained for a number of genomes when the sequences were mapped to EIIP indicator sequences and their power spectra were taken in a sliding Kaiser window, compared with the existing method, which uses binary indicator sequences and a rectangular window.
Year: 2006 PMID: 17597888 PMCID: PMC1891688
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Electron-ion interaction pseudopotentials (EIIP) of nucleotides
| Nucleotide | EIIP |
|---|---|
| A | 0.1260 |
| G | 0.0806 |
| C | 0.1340 |
| T | 0.1335 |
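The mapping described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the DFT is evaluated directly from its definition at the single bin k = N/3 rather than with an FFT. The binary variant needs four transforms (one per nucleotide) where the EIIP variant needs one, which is the source of the 75% saving mentioned above.

```python
import cmath

# EIIP values taken from the table above.
EIIP = {"A": 0.1260, "G": 0.0806, "C": 0.1340, "T": 0.1335}

def eiip_sequence(dna):
    """Map a DNA string to its single EIIP indicator sequence."""
    return [EIIP[base] for base in dna.upper()]

def binary_indicators(dna):
    """Four 0/1 indicator sequences, one per nucleotide (the existing method)."""
    dna = dna.upper()
    return {b: [1.0 if x == b else 0.0 for x in dna] for b in "ACGT"}

def dft_power(x, k):
    """|X[k]|^2 of a real sequence x, straight from the DFT definition."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(x))
    return abs(coeff) ** 2

def period3_power_eiip(dna):
    """Power at bin N/3 using one transform of the EIIP sequence."""
    return dft_power(eiip_sequence(dna), len(dna) // 3)

def period3_power_binary(dna):
    """Power at bin N/3 summed over the four binary indicator sequences."""
    k = len(dna) // 3
    return sum(dft_power(seq, k) for seq in binary_indicators(dna).values())
```

On a toy sequence with exact period three, such as `"ATG" * 30`, both functions return a clearly non-zero value at bin N/3, while bins that are not multiples of N/3 are numerically zero.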
Exons from selected genes where EIIP indicator sequence gives better N/3 peaks compared to binary indicator sequences
| Serial No. | Accession number | Description of gene | Length of sequence | Exon area & length (N) | Comments: (a) using binary indicators; (b) using EIIP indicator |
|---|---|---|---|---|---|
| 1 | AF019074 | EKLF, Mus musculus erythroid Krüppel-like factor gene | 6350 | 3761-4574 (814) | Peak in (a) at 131 (not near N/3), Peak in (b) at 272 (near N/3). |
| 2 | AB009589 | Human gene for Osteomodulin | 12414 | 10624-10949 (326) | Peak in (a) at 8 (not near N/3), Peak in (b) at 110 (near N/3). |
| 3 | AF065986 | Human keratocan gene | 7659 | 6638-6810 (173) | Peak in (a) at 40 (not near N/3), Peak in (b) at 53 (near N/3). |
| 4 | AF015224 | Human mammoglobin gene | 4206 | 1713-1900 (188) | Peak in (a) at 18 (not near N/3), Peak in (b) at 63 (near N/3). |
| 5 | AB016625 | Human OCTN2 gene | 25871 | 15591-15792 (172) | Peak in (a) at 76 (not near N/3), Peak in (b) at 56 (near N/3). |
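The "near N/3" judgement in the Comments column can be made explicit with a small check. The sketch below uses only the peak positions and exon lengths listed in the table; the 3% relative tolerance and the function name are assumptions chosen for illustration, since the section does not state how close a peak must be to count as near N/3.

```python
def is_near_period3(peak_bin, exon_length, rel_tol=0.03):
    """True if peak_bin lies within rel_tol * N of N/3 (rel_tol is an assumed threshold)."""
    return abs(peak_bin - exon_length / 3.0) <= rel_tol * exon_length

# (accession, N, peak with binary indicators, peak with EIIP indicator), from the table.
ROWS = [
    ("AF019074", 814, 131, 272),
    ("AB009589", 326, 8, 110),
    ("AF065986", 173, 40, 53),
    ("AF015224", 188, 18, 63),
    ("AB016625", 172, 76, 56),
]

for accession, n, binary_peak, eiip_peak in ROWS:
    print(accession,
          "binary near N/3:", is_near_period3(binary_peak, n),
          "| EIIP near N/3:", is_near_period3(eiip_peak, n))
```

With this tolerance, every EIIP peak in the table is classified as near N/3 and every binary peak is not, which agrees with the comments as tabulated.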
Figure 1. Power spectrum of HUMELAFIN (D13156) obtained using binary indicators

Figure 2. Power spectrum of HUMELAFIN (D13156) obtained using EIIP indicator
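A sliding-window spectrum of the kind plotted in these figures can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the window length (351) and Kaiser shape parameter (beta = 8.0) are hypothetical defaults not given in this section, the function names are invented, and the Kaiser window is built from the power series of the modified Bessel function I0 so that the example needs only the standard library.

```python
import cmath
import math

EIIP = {"A": 0.1260, "G": 0.0806, "C": 0.1340, "T": 0.1335}

def bessel_i0(x, terms=30):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser_window(length, beta):
    """Kaiser window of the given length and shape parameter beta."""
    if length == 1:
        return [1.0]
    denom = bessel_i0(beta)
    window = []
    for n in range(length):
        r = 2.0 * n / (length - 1) - 1.0  # position mapped to [-1, 1]
        window.append(bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / denom)
    return window

def sliding_period3_power(dna, window_len=351, beta=8.0):
    """Power at normalized frequency 1/3 for every window position (step 1)."""
    x = [EIIP[b] for b in dna.upper()]
    w = kaiser_window(window_len, beta)
    # Complex exponentials at frequency 1/3 repeat every three samples.
    tw = [cmath.exp(-2j * cmath.pi * i / 3) for i in range(3)]
    powers = []
    for start in range(len(x) - window_len + 1):
        acc = sum(x[start + i] * w[i] * tw[i % 3] for i in range(window_len))
        powers.append(abs(acc) ** 2)
    return powers
```

Replacing `kaiser_window(window_len, beta)` with a list of ones gives the rectangular-window variant that the abstract uses as the baseline. The direct sum costs O(N x window_len) and is meant for clarity; an FFT-based or recursive update would be the usual choice for long sequences.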
Examples of genes whose power spectra show better discrimination between coding and non-coding regions with EIIP indicator sequence mapping than with binary indicator sequence mapping
| No. | Gene name, accession no., description | Region (nucleotide positions) | Highest peak (binary slide) | Highest peak (EIIP slide) | Discrimination measure D (binary slide) | Discrimination measure D (EIIP slide) |
|---|---|---|---|---|---|---|
| 1 | F56F11.4a, NC001135, a gene from | E1(929-1135) | 2.1 | 1.02 | 1.19 | 2.0 |
| | | E2(2528-2857) | 7.01 | 2.75 | | |
| | | E3(4114-4377) | 6.0 | 2.4 | | |
| | | E4(5465-5644) | 5.3 | 1.1 | | |
| | | E5(7255-7605) | 3.4 | 1.25 | | |
| | | Intron regions | 1.77 | 0.51 | | |
| 2 | HUMBETGLOA, L26462, human beta-globin A chain | E1(866-957) | 1.86 | 0.84 | 1.05 | 2.4 |
| | | E2(1088-1310) | 4.34 | 0.9 | | |
| | | E3(2161-2289) | 3.0 | 1.13 | | |
| | | Intron regions | 1.77 | 0.35 | | |
| 3 | HUMCBRG, M62420, Homo sapiens carbonyl reductase gene | E1(276-566) | 9.74 | 1.74 | 0.55 | 1.16 |
| | | E2(1112-1219) | 1.3 | 0.43 | | |
| | | E3(2608-3044) | 6.67 | 1.0 | | |
| | | Intron regions | 2.36 | 0.37 | | |
| 4 | HUMELAFIN, D13156, Homo sapiens gene for elafin | E1(247-325) | 2.05 | 0.65 | 0.95 | 1.55 |
| | | E2(1185-1459) | 2.72 | 1.575 | | |
| | | Intron regions | 2.15 | 0.42 | | |
| 5 | GalR2, AF042784, Mus musculus galanin receptor type 2 gene | E1(24-388) | 9.6 | 1.27 | 3.19 | 14.11 |
| | | E2(1449-2199) | 3.19 | 0.917 | | |
| | | Intron regions | 1.0 | 0.09 | | |
| 6 | PP32R1, AF00A216, Homo sapiens candidate tumor suppressor gene | E(4453-5157) | 30.6 | 11.92 | 19.13 | 21.67 |
| | | Intergenic regions | 1.6 | 0.55 | | |
| 7 | HMX1, AF009614, Mus musculus homeobox containing nuclear transcriptional factor gene | E1(1267-1639) | 4.76 | 1.71 | 2.14 | 2.80 |
| | | E2(3888-4513) | 7.09 | 3.94 | | |
| | | Intron regions | 2.22 | 0.61 | | |
| 8 | PSMB5, AB003306, Mus musculus DNA for PSMB5 | E1(1020-1217) | 3.82 | 0.95 | 1.43 | 3.17 |
| | | E2(2207-2513) | 2.10 | 1.05 | | |
| | | E3(4543-4832) | 7.65 | 2.12 | | |
| | | Intron regions | 1.47 | 0.38 | | |
| 9 | HSODF2, X74614, Homo sapiens ODF2 gene | E1(280-599) | 3.82 | 0.865 | 2.43 | 4.33 |
| | | E2(843-1275) | 6.725 | 1.75 | | |
| | | Intron regions | 1.57 | 0.2 | | |