Abstract
BACKGROUND: Elhaik, Graur and Josic recently commented on the genome order index (S) and the Z-curve (Elhaik et al. Biol Direct 2010, 5: 10). S is defined as S = a² + c² + g² + t², where a, c, g and t denote the frequencies of the corresponding bases. The Z-curve is a three-dimensional curve that represents a DNA sequence in such a way that each can be uniquely reconstructed from the other. Elhaik et al. made four major claims. 1) In the previous mapping system with the regular tetrahedron, the calculation of the radius of the inscribed sphere is "a mathematical error". 2) S follows an exponential distribution and is narrowly distributed, with a range of 0.25-0.33. 3) Based on Chargaff's second parity rule (PR2), "S is equivalent to H [Shannon entropy]" and the two are derivable from each other. 4) The Z-curve "suffers from over dimensionality" because, in an analysis of 235 bacterial genomes, the x and y components contributed less than 1% of the variance and therefore "would be of little use".
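The two quantities compared in claim 3 can be computed directly from base frequencies. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the choice to ignore non-ACGT characters and to measure H in bits (log base 2) is an assumption made for illustration.

```python
from collections import Counter
from math import log2


def base_frequencies(seq):
    """Return the frequencies a, c, g, t of a DNA string.

    Assumption: characters other than A, C, G, T (for example N) are ignored.
    """
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: counts[b] / total for b in "ACGT"}


def order_index(freqs):
    """Genome order index S = a^2 + c^2 + g^2 + t^2."""
    return sum(f * f for f in freqs.values())


def shannon_entropy(freqs):
    """Shannon entropy H in bits; terms with zero frequency contribute 0."""
    return -sum(f * log2(f) for f in freqs.values() if f > 0)


if __name__ == "__main__":
    freqs = base_frequencies("ATGCGCGATATTAGC")
    print(order_index(freqs), shannon_entropy(freqs))
```

For equal base frequencies S takes its minimum of 1/4 and H its maximum of 2 bits; for a sequence of a single base S is 1 and H is 0.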
Year: 2011 PMID: 21324187 PMCID: PMC3046898 DOI: 10.1186/1745-6150-6-10
Source DB: PubMed Journal: Biol Direct ISSN: 1745-6150 Impact factor: 4.540
Figure 1. The distribution of the mapping points for the 235 bacterial genomes used in [5]. A) The distribution of the mapping points in the reduced coordinate system (x, y, z) according to equation (3). The square (side length = 2) is the projection of the regular tetrahedron onto the x-z coordinate plane. The bold and dotted circles are the projections of the inscribed sphere with the correct radius and with the wrongly calculated radius of 1/4, respectively. With the correct radius, and contrary to their conclusion that 45% of genomes have mapping points outside the inscribed sphere, no genome has a mapping point outside the inscribed sphere and none has an S value larger than 1/3. B) The distribution of the mapping points in the original coordinate system (X, Y, Z) according to equation (1). The square (side length = √3/2) is the projection of the regular tetrahedron onto the X-Z coordinate plane. Note that all mapping points also lie within the inscribed sphere, whose radius is 1/4. The mistake of Elhaik et al. is to confuse the original and reduced coordinate systems and, consequently, to neglect the coordinate transform parameter that relates them. Refer to text for details.
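The link between the inscribed sphere and the bound S ≤ 1/3 can be checked numerically. The sketch below rests on assumptions, since equation (3) is not reproduced here: it takes the reduced coordinates to be the standard frequency-based Z-curve components x = (a + g) - (c + t), y = (a + c) - (g + t), z = (a + t) - (c + g), and it takes the inscribed-sphere radius in that system to be 1/√3, the value for a regular tetrahedron inscribed in a cube of side 2. Under those assumptions 4S = 1 + x² + y² + z², so a point lies within the sphere exactly when S ≤ 1/3. Function names are hypothetical.

```python
from math import isclose, sqrt

# Assumed inradius of the regular tetrahedron inscribed in the cube [-1, 1]^3.
INSCRIBED_RADIUS = 1 / sqrt(3)


def reduced_coordinates(a, c, g, t):
    """Map base frequencies to (x, y, z); assumed form of the reduced system."""
    if not isclose(a + c + g + t, 1.0, abs_tol=1e-9):
        raise ValueError("base frequencies must sum to 1")
    x = (a + g) - (c + t)  # purine vs pyrimidine
    y = (a + c) - (g + t)  # amino vs keto
    z = (a + t) - (c + g)  # weak vs strong hydrogen bonding
    return x, y, z


def order_index_from_coordinates(x, y, z):
    """S recovered from the coordinates via 4S = 1 + x^2 + y^2 + z^2."""
    return (1 + x * x + y * y + z * z) / 4


def inside_inscribed_sphere(a, c, g, t, radius=INSCRIBED_RADIUS):
    """True if the mapping point lies within a sphere of the given radius."""
    x, y, z = reduced_coordinates(a, c, g, t)
    return x * x + y * y + z * z <= radius * radius


if __name__ == "__main__":
    a, c, g, t = 0.2, 0.3, 0.3, 0.2  # illustrative frequencies, not real data
    x, y, z = reduced_coordinates(a, c, g, t)
    print(order_index_from_coordinates(x, y, z), inside_inscribed_sphere(a, c, g, t))
```

Passing `radius=0.25` applies the radius of the original coordinate system to reduced coordinates, which is the kind of mix-up the caption describes; the test then rejects every point with S above roughly 0.266.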