Alcides Perez-Bello, Cristian Robert Munteanu, Florencio M Ubeira, Alexandre Lopes De Magalhães, Eugenio Uriarte, Humberto González-Díaz.
Abstract
The importance of promoter sequences in regulating the function of several important mycobacterial pathogens creates the need for simple and fast theoretical models that can predict them. This work proposes two DNA promoter QSAR models based on pseudo-folding lattice network (LN) and star-graph (SG) topological indices. In addition, a comparative study with the previous RNA electrostatic parameters of thermodynamically-driven secondary structure folding representations has been carried out. The best model of this work was obtained with only two LN stochastic electrostatic potentials and is characterized by accuracy, sensitivity and specificity of 90.87%, 82.96% and 92.95%, respectively. We also point out that the SG results depend on the DNA sequence codification, and we propose a QSAR model based on codons and only three SG spectral moments.
Year: 2008 PMID: 18992259 PMCID: PMC7126577 DOI: 10.1016/j.jtbi.2008.09.035
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691
LN construction rules for the Mps of the gene Alpha in Mycobacterium bovis (BCG)
Nucleotide sequence (each base is followed by its position in the sequence):
c1g2a3c4t5t6t7c8g9c10c11c12g13a14a15t16c17g18a19c20
a21t22t23t24g25g26c27c28t29c30c31a32c33a34c35a36c37g38g39t40
a41t42g43t44t45c46t47g48g49c50c51c52g53a54g55c56a57c58a59c60
g61a62c63g64a65

| Node | Nucleotide | x | y |
|---|---|---|---|
| 1 | c1a3t5g25 | 0 | 0 |
| 2 | g2c10g26 | 0 | −1 |
| 3 | c4t16 | −1 | 0 |
| 4 | t6c8 | 1 | 0 |
| 5 | t7 | 2 | 0 |
| 6 | g9 | 1 | −1 |
| 7 | c11c27t29 | −1 | −1 |
| 8 | c12a14g18c28c30g48 | −2 | −1 |
| 9 | g13g49 | −2 | −2 |
| 10 | a15c17a19t45t47 | −2 | 0 |
| 11 | c20a32t44c46 | −3 | 0 |
| 12 | a21 | −3 | 1 |
| 13 | t22 | −2 | 1 |
| 14 | t23 | −1 | 1 |
| 15 | t24 | 0 | 1 |
| 16 | c31 | −3 | −1 |
| 17 | c33g43 | −4 | 0 |
| 18 | a34t42 | −4 | 1 |
| 19 | c35a41 | −5 | 1 |
| 20 | a36 | −5 | 2 |
| 21 | c37 | −6 | 2 |
| 22 | g38 | −6 | 1 |
| 23 | g39 | −6 | 0 |
| 24 | t40 | −5 | 0 |
| 25 | c50 | −3 | −2 |
| 26 | c51 | −4 | −2 |
| 27 | c52a54 | −5 | −2 |
| 28 | g53g55 | −5 | −3 |
| 29 | c56 | −6 | −3 |
| 30 | a57 | −6 | −2 |
| 31 | c58 | −7 | −2 |
| 32 | a59 | −7 | −1 |
| 33 | c60a62 | −8 | −1 |
| 34 | g61 | −8 | −2 |
| 35 | c63a65 | −9 | −1 |
| 36 | g64 | −9 | −2 |
Fig. 1. LN for the Mps of the gene Alpha in Mycobacterium bovis (BCG).
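The coordinates in the last two columns of the table above are consistent with a simple walk on a square lattice: the first nucleotide sits at the origin, and each following nucleotide moves one step according to its own base. The sketch below reconstructs the node list under that reading. The step rules (a = up, g = down, c = left, t = right) are inferred from the tabulated coordinates rather than quoted from the source, and the names `STEP`, `MPS_ALPHA`, and `build_lattice` are illustrative.

```python
# Step rules inferred from the tabulated coordinates (an assumption,
# not a rule quoted from the source): a = up, g = down, c = left, t = right.
STEP = {"a": (0, 1), "g": (0, -1), "c": (-1, 0), "t": (1, 0)}

# The 65-nucleotide sequence listed with the table.
MPS_ALPHA = (
    "cgactttcgcccgaatcgac"
    "atttggcctccacacacggt"
    "atgttctggcccgagcacac"
    "gacga"
)


def build_lattice(seq):
    """Map each visited (x, y) site to the position-labelled bases landing on it.

    Sites are kept in order of first visit, which matches the node numbering
    used in the table.
    """
    nodes = {}
    x = y = 0
    for pos, base in enumerate(seq, start=1):
        if pos > 1:  # the first nucleotide stays at the origin
            dx, dy = STEP[base]
            x, y = x + dx, y + dy
        nodes.setdefault((x, y), []).append(f"{base}{pos}")
    return nodes


lattice = build_lattice(MPS_ALPHA)
for number, ((x, y), members) in enumerate(lattice.items(), start=1):
    print(number, "".join(members), x, y)
```

Under these assumed rules the walk visits 36 distinct sites, and the grouping of nucleotides per site agrees with the rows of the table.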
SG codifications for the virtually translated Mps of the gene Alpha in Mycobacterium bovis (BCG)
| DNA codon SG | |
|---|---|
| DNA nucleotide sequence | c1g2a3c4t5t6t7c8g9c10c11c12g13a14a15t16c17g18a19c20a21t22t23t24g25g26c27c28t29c30c31a32c33a34c35a36c37g38g39t40a41t42g43t44t45c46t47g48g49c50c51c52g53a54g55c56a57c58a59c60g61a62c63 |
| DNA codons sequence | cga1ctt2tcg3ccc4gaa5tcg6aca7ttt8ggc9ctc10cac11aca12cgg13tat14gtt15ctg16gcc17cga18gca19cac20gac21 |
| Virtually translated amino acid sequence | R1L2S3P4E5S6T7F8G9L10H11T12R13Y14V15L16A17R18A19H20D21 |
Fig. 2. SG for the Mps of the gene Alpha in Mycobacterium bovis (BCG).
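The three codifications in the table above (nucleotides, codons, virtually translated amino acids) can be derived from one another mechanically. The following is a minimal sketch, assuming the standard genetic code and a reading frame that starts at the first nucleotide; the helper names `codons` and `translate` are illustrative and do not come from the source. It does not build the star graphs themselves, only the sequences they are drawn from.

```python
# Standard genetic code, laid out in the usual t, c, a, g order.
BASES = "tcag"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# The 65-nucleotide Mps sequence of the gene Alpha.
MPS_ALPHA = (
    "cgactttcgcccgaatcgac"
    "atttggcctccacacacggt"
    "atgttctggcccgagcacac"
    "gacga"
)


def codons(seq):
    """Split seq into consecutive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate(seq):
    """Virtually translate seq codon by codon (no start/stop handling)."""
    return "".join(CODON_TABLE[codon] for codon in codons(seq))


print(" ".join(codons(MPS_ALPHA)))
print(translate(MPS_ALPHA))
```

The last two nucleotides (g64, a65) do not form a complete codon and are dropped, which matches the 63-nucleotide, 21-codon codification shown in the table.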
Summary of the LDA results for DNA LN and SG models vs. RNA 2S folding representations
| TI | Ac (%) | Se (%) | Sp (%) | Final TIs | Vars. | | | | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| Primary structure of DNA nucleotide & LN | |||||||||
| LN | 78.33 | 72.59 | 79.84 | LN | 1 | 0.74 | 230.5 | 0.0001 | a |
| LN | 81.73 | 78.52 | 82.58 | LN | 3 | 0.89 | 76.3 | 0.0001 | a |
| LN | 90.87 | 82.96 | 92.95 | LN | 2 | 0.82 | 142.1 | 0.0001 | a |
| Pool | 92.88 | 75.56 | 97.46 | LN | 4 | 0.83 | 130.8 | 0.0001 | a |
| Primary structure of DNA nucleotide sequences & SG | |||||||||
| SG | 66.25 | 81.48 | 62.23 | SG | 2 | 0.78 | 69.62 | 0.001 | a |
| SG | 71.21 | 85.19 | 67.51 | SG | 3 | 0.76 | 49.54 | 0.001 | a |
| TI | 75.39 | 68.15 | 77.30 | | 3 | 0.73 | 58.19 | 0.001 | a |
| Pool | 81.58 | 68.15 | 85.13 | SG | 3 | 0.67 | 79.94 | 0.001 | a |
| Primary structure of DNA | |||||||||
| SG | 70.43 | 76.30 | 68.88 | SG | 3 | 0.75 | 52.31 | 0.001 | a |
| SG | 74.77 | 82.96 | 72.60 | SG | 3 | 0.74 | 56.37 | 0.001 | a |
| TI | 76.16 | 59.26 | 80.63 | | 3 | 0.72 | 60.98 | 0.001 | a |
| Pool | 80.80 | 74.81 | 82.39 | SG | 5 | 0.67 | 47.04 | 0.001 | a |
| RNA electrostatic parameter of thermodynamically-driven 2S folding | |||||||||
| 2S | 97.60 | 93.30 | 100.00 | 2Sθ0 | 1 | 0.34 | 724.47 | 0.001 | b |
| 2S | 93.83 | 83.70 | 98.89 | 2S | 2 | 0.44 | 515.03 | 0.05 | c |
| 2S | 96.58 | 85.19 | 100.00 | 2S | 2 | 0.41 | 38.8 | 0.001 | d |
Note: Ac, Se, and Sp denote accuracy, sensitivity and specificity, that is, the percentage of all sequences, of Mps, and of Cgs, respectively, that the model classifies correctly with respect to the real classification; Vars. = number of variables in the QSAR equations; SG = star-graph; LN = lattice network; 2S = secondary structure; the superscript “e” denotes embedded calculations; references (Ref.): a, this work; b, González-Díaz et al. (2007c); c, González-Díaz et al. (2005a); d, González-Díaz et al. (2006a).
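As a worked illustration of the three measures defined in the note, the sketch below computes Ac, Se, and Sp from the four cells of a two-class confusion matrix. The counts passed in are hypothetical: the source does not state the group sizes, and these values were chosen only because they reproduce the percentages of the two-variable LN row. The function name `classification_metrics` is likewise illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp / fn: Mps sequences classified correctly / incorrectly.
    tn / fp: Cgs sequences classified correctly / incorrectly.
    """
    se = 100.0 * tp / (tp + fn)                    # correct Mps over all Mps
    sp = 100.0 * tn / (tn + fp)                    # correct Cgs over all Cgs
    ac = 100.0 * (tp + tn) / (tp + fn + tn + fp)   # correct over all sequences
    return ac, se, sp


# Hypothetical counts (not taken from the source).
ac, se, sp = classification_metrics(tp=112, fn=23, tn=475, fp=36)
print(f"Ac = {ac:.2f}%  Se = {se:.2f}%  Sp = {sp:.2f}%")
```

Because accuracy pools both groups, a model can score a high Ac while its Se stays low when the control group is much larger than the promoter group, which is why the table reports all three values.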
Fig. 3. RNA 2S for the Mps of the gene Alpha in Mycobacterium bovis (BCG).