Zhao-Hui Qi, Jian-Min Wang, Xiao-Qin Qi.
Abstract
We introduce a new approach for investigating the dual nucleotide compositions of 11 Gram-positive and 12 Gram-negative eubacteria recently studied by Sorimachi and Okayasu. The approach first obtains a set of 16-dimensional dual nucleotide vectors from the complete genome of an organism using the PN-curve. Each vector in the set corresponds to a single gene of the genome. We then reduce the 16-dimensional vector set to two dimensions by principal component analysis (PCA). This reduction avoids the possible loss of information that comes from averaging all the 16-dimensional vectors. Finally, we propose a 2D graphical representation based on the 2-dimensional vectors to investigate classification patterns among different organisms.
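The pipeline described in the abstract (one 16-dimensional vector per gene, then PCA down to two dimensions) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the PN-curve construction, so plain overlapping dinucleotide frequencies stand in for that step, and the function names `dinucleotide_vector` and `pca_2d` as well as the toy sequences are hypothetical.

```python
from itertools import product

import numpy as np

# The 16 dinucleotides in a fixed (arbitrary) order: AA, AC, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {d: i for i, d in enumerate(DINUCLEOTIDES)}


def dinucleotide_vector(gene):
    """Return the 16 overlapping-dinucleotide frequencies of one gene.

    Assumption: simple frequency counting replaces the PN-curve step,
    whose details are not given in the abstract.
    """
    counts = np.zeros(16)
    seq = gene.upper()
    for i in range(len(seq) - 1):
        j = INDEX.get(seq[i:i + 2])  # pairs with non-ACGT symbols are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def pca_2d(vectors):
    """Project row vectors onto their first two principal components."""
    X = np.asarray(vectors, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Toy sequences invented for illustration only (not real genes).
genes = [
    "ATGGCGTACGTTAGC",
    "ATGAAATTTGGGCCC",
    "ATGCGCGCGCGATAT",
    "ATGTTTTAAAACCGG",
]
vectors = np.array([dinucleotide_vector(g) for g in genes])
points = pca_2d(vectors)  # one (x, y) point per gene
```

Each row of `points` is one gene, so plotting all rows for a genome gives a per-gene scatter rather than a single averaged vector, which is the point the abstract makes about avoiding information loss.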
Year: 2009 PMID: 19481099 PMCID: PMC7126582 DOI: 10.1016/j.jtbi.2009.05.011
Source DB: PubMed Journal: J Theor Biol ISSN: 0022-5193 Impact factor: 2.691
Table 1. Genomes used for this study.
| Strain | Accession (GenBank) | RefSeq identifier | Total length (bp) | Genes |
|---|---|---|---|---|
| | | | 2,878,529 | 2775 |
| | | | 1,852,441 | 1811 |
| | | | 4,214,630 | 4225 |
| | | | 3,031,430 | 2786 |
| | | | 2,944,528 | 2940 |
| | | | 963,879 | 815 |
| | | | 580,076 | 525 |
| | | | 816,394 | 733 |
| | | | 874,478 | 692 |
| | | | 4,403,837 | 4293 |
| | | | 3,268,203 | 2770 |
| | | | 1,111,523 | 886 |
| | | | 910,724 | 875 |
| | | | 1,616,554 | 1707 |
| | | | 1,667,867 | 1630 |
| | | | 1,643,831 | 1535 |
| | | | 4,639,675 | 4467 |
| | | | 4,809,037 | 4711 |
| | | | 2,961,149 | 2889 |
| | | | 1,072,315 | 1119 |
| | | | 4,653,728 | 4103 |
| | | | 2,184,406 | 2065 |
| | | | 1,830,138 | 1789 |
| | | | 1,138,011 | 1095 |
Fig. 1. 2-dimension dot-matrix graphs of dual nucleotides determined from the complete genomes of various eubacteria of Table 1. As shown in Fig. 1 of Sorimachi and Okayasu (2004), blue represents Gram-positive bacteria; red represents Gram-negative bacteria; green represents mycoplasmas, which lack a cell wall. (For interpretation of the references to the color in this figure legend, the reader is referred to the web version of this article.)
Fig. 2. The grid coordinate system is divided into two regions: Region I (x ∈ [−0.05, 0.05], y ∈ [−0.05, 0.05]) and Region II (outside Region I).
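The region split in Fig. 2 amounts to a simple membership test on each 2D point. A minimal sketch, assuming the closed intervals given in the caption; the function name `region` and the `half_width` parameter are hypothetical and not taken from the paper:

```python
def region(x, y, half_width=0.05):
    """Return "I" if (x, y) lies in the central square, otherwise "II".

    The square is closed: points on the boundary count as Region I,
    matching the closed intervals [-0.05, 0.05] in the caption.
    """
    inside = (-half_width <= x <= half_width) and (-half_width <= y <= half_width)
    return "I" if inside else "II"


# Example points, invented for illustration.
labels = [region(0.0, 0.0), region(0.05, -0.05), region(0.06, 0.0)]
```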