Richard A Morton, Brian R Morton.
Abstract
BACKGROUND: Many bacterial chromosomes display nucleotide asymmetry, or skew, between the leading and lagging strands of replication. Mutational differences between these strands result in an overall pattern of skew that is centered about the origin of replication. Such a pattern could also arise from selection coupled with a bias for genes coded on the leading strand. The relative contributions of selection and mutation in producing compositional skew are largely unknown.
Year: 2007 PMID: 17935620 PMCID: PMC2099444 DOI: 10.1186/1471-2164-8-369
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Chromosomes analyzed from eubacterial phyla¹
| Phylum | Class¹ | Number |
| Actinobacteria | | 23 |
| Bacteroidetes | | 5 |
| Chlamydiae | | 11 |
| Cyanobacteria | | 17 |
| Deinococcus-Thermus | | 4 |
| Firmicutes | | 79 |
| | Mollicutes | 16 |
| | Lactobacillales | 27 |
| | Clostridia | 7 |
| | Bacillales | 29 |
| Proteobacteria | | 197 |
| | Alphaproteobacteria | 55 |
| | Betaproteobacteria | 38 |
| | Deltaproteobacteria | 11 |
| | Epsilonproteobacteria | 8 |
| | Gammaproteobacteria | 85 |
| Spirochaetes | | 6 |
| Others | | 10 |
¹The two phyla with the largest sample sizes are shown subdivided by class.
Figure 1. Chromosome division. The distribution of Cd (chromosome division, Equation 10) based on [S1, S2]ML pairs for the 326 chromosomes indicated in the text. Cd is plotted along the X axis and represents the deviation from equal chromosome division.
Chromosomes with unequal replication arms¹
| Taxon² | Accession | Phylum; Class³ |
| | NC_008009 | Acidobacteria |
| | NC_000918 | Aquificae |
| Aster yellows witches'-broom phytoplasma AYWB | NC_007716 | Firmicutes; Mollicutes |
| | NC_002928 | Proteobacteria; Betaproteobacteria |
| | NC_002929 | Proteobacteria; Betaproteobacteria |
| | NC_008060 | Proteobacteria; Betaproteobacteria |
| | NC_007651 | Proteobacteria; Betaproteobacteria |
| Candidatus | NC_007292 | Proteobacteria; Gammaproteobacteria |
| Candidatus | NC_007205 | Proteobacteria; Alphaproteobacteria |
| | NC_007907 | Firmicutes; Clostridia |
| | NC_007880 | Proteobacteria; Gammaproteobacteria |
| | NC_002940 | Proteobacteria; Gammaproteobacteria |
| | NC_004917 | Proteobacteria; Epsilonproteobacteria |
| | NC_006512 | Proteobacteria; Gammaproteobacteria |
| | NC_007295 | Firmicutes; Mollicutes |
| | NC_006908 | Firmicutes; Mollicutes |
| | NC_004432 | Firmicutes; Mollicutes |
| | NC_004757 | Proteobacteria; Betaproteobacteria |
| | NC_005071 | Cyanobacteria |
| | NC_002516 | Proteobacteria; Gammaproteobacteria |
| | NC_002947 | Proteobacteria; Gammaproteobacteria |
| | NC_007606 | Proteobacteria; Gammaproteobacteria |
| | NC_006569 | Proteobacteria; Alphaproteobacteria |
| | NC_003037 | Proteobacteria; Alphaproteobacteria |
| | NC_007712 | Proteobacteria; Gammaproteobacteria |
| | NC_007604 | Cyanobacteria |
| | NC_006461 | Deinococcus-Thermus |
| | NC_002978 | Proteobacteria; Alphaproteobacteria |
| | NC_002488 | Proteobacteria; Gammaproteobacteria |
| | NC_008150 | Proteobacteria; Gammaproteobacteria |
| | NC_005810 | Proteobacteria; Gammaproteobacteria |
¹One replication arm is less than 40% of the total chromosome length (see text).
²Primary chromosome unless otherwise indicated.
³Class is given only for Firmicutes and Proteobacteria, as in Table 1.
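The 40% criterion in the first footnote can be illustrated with a minimal sketch. It assumes that the two replication arms of a circular chromosome are the two arcs between the strand-switch sites S1 and S2, both expressed as fractions of chromosome length. The function names and the example positions are hypothetical, and this is not the Cd statistic of Equation 10, which is not reproduced here.

```python
def replication_arm_fractions(s1, s2):
    """Return the two arm lengths, as fractions of a circular chromosome,
    delimited by sites s1 and s2 (each a fraction in [0, 1))."""
    arc = abs(s2 - s1)
    return arc, 1.0 - arc


def has_unequal_arms(s1, s2, cutoff=0.40):
    """True when the shorter arm is below the cutoff (40% by default)."""
    return min(replication_arm_fractions(s1, s2)) < cutoff


# Hypothetical positions: arms of 52% and 48% are not flagged,
# arms of 35% and 65% are flagged.
print(has_unequal_arms(0.10, 0.62))
print(has_unequal_arms(0.05, 0.40))
```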
Comparison of the origin of replication and the putative ML origin for each of the five linear chromosomes
| Bacterial Species | Accession | Origin Signal¹ | Origin Location¹ | [S1ML, S2ML]² | POML³ | Distance⁴ |
| Streptomyces | NC_003155 | oriC | 0.586 | 0.569, 0.994 | 0.569 | 0.017 |
| Streptomyces | NC_003888 | oriC | 0.493 | 0.463, 0.987 | 0.463 | 0.030 |
| | NC_001318 | | 0.503 | 0.000, 0.503 | 0.503 | 0.000 |
| | NC_006156 | | 0.509 | 0.000, 0.510 | 0.510 | 0.001 |
| | NC_003305 | | 0.494 | 0.000, 0.493 | 0.493 | 0.001 |
¹Location of the annotated origin as a fraction of chromosome length starting at the NCBI site 1.
²Location of the [S1, S2]ML pair as a fraction of chromosome length.
³POML as inferred by which of S1ML or S2ML is closest to the annotated origin.
⁴Distance between the annotated origin and POML.
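The rule in footnotes 3 and 4 can be sketched as follows. This assumes that, because these chromosomes are linear, distance is the plain absolute difference between fractional positions. The function name `infer_origin` is hypothetical.

```python
def infer_origin(annotated_origin, s1_ml, s2_ml):
    """Pick whichever ML site lies closest to the annotated origin and
    return (POML, distance), all as fractions of chromosome length."""
    po_ml = min((s1_ml, s2_ml), key=lambda site: abs(site - annotated_origin))
    return po_ml, abs(po_ml - annotated_origin)


# Values from the NC_003155 row of the table above.
po_ml, distance = infer_origin(0.586, 0.569, 0.994)
print(po_ml, round(distance, 3))
```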
Comparison of the location of partition genes and the putative ML origin for each of the 28 secondary circular chromosomes
| Bacterial Species | Accession | parA/repA Location¹ | POML² | Distance³ |
| | NC_006933 | 0.998 | 0.999 | 0.001 |
| | NC_003318 | 0.079 | 0.079 | 0.000 |
| | NC_007624 | 0.998 | 0.999 | 0.001 |
| | NC_004311 | 0.998 | 0.999 | 0.001 |
| | NC_007511 | 0.002 | 0.000 | 0.002 |
| | NC_007509 | 0.996 | 0.001 | 0.005 |
| | NC_008061 | 0.879 | 0.878 | 0.001 |
| | NC_008062 | 0.315 | 0.320 | 0.005 |
| | NC_006349 | 0.997 | 0.001 | 0.004 |
| | NC_007435 | 0.576 | 0.578 | 0.002 |
| | NC_006351 | 0.998 | 0.001 | 0.003 |
| | NC_007650 | 0.998 | 0.000 | 0.002 |
| | NC_007952 | 0.000 | 0.006 | 0.006 |
| | NC_007953 | 0.981 | 0.979 | 0.002 |
| | NC_006371 | 0.999 | 0.000 | 0.001 |
| | NC_007348 | 0.775 | 0.778 | 0.003 |
| | NC_007974 | 0.943 | 0.946 | 0.003 |
| | NC_003296 | 0.001 | 0.002 | 0.001 |
| | NC_007494 | 0.996 | 0.513 | 0.483 |
| | NC_008043 | 0.517 | 0.519 | 0.002 |
| | NC_006569 | 0.678 | 0.668 | 0.010 |
| | NC_003037 | 0.999 | 0.991 | 0.008 |
| | NC_003078 | 0.034 | 0.060 | 0.026 |
| | NC_002506 | 0.999 | 0.005 | 0.006 |
| | NC_006841 | 0.999 | 0.002 | 0.003 |
| | NC_004605 | 0.999 | 0.000 | 0.001 |
| | NC_004460 | 0.689 | 0.692 | 0.003 |
| | NC_005140 | 0.999 | 0.002 | 0.003 |
¹parA or repA location (see text), as a fraction of chromosome length starting at the NCBI site 1.
²POML as inferred from rRNA sites or σGC > 0 (see text).
³Distance between the parA/repA location and POML.
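Several distances in this table only make sense if positions wrap around the circular chromosome (for example, 0.996 and 0.001 are listed as 0.005 apart). The sketch below shows that calculation, assuming the distance is the shorter of the two arcs between the positions. The function name `circular_distance` is hypothetical.

```python
def circular_distance(a, b):
    """Shortest distance between two positions on a circular chromosome,
    with positions and result expressed as fractions of its length."""
    direct = abs(a - b)
    return min(direct, 1.0 - direct)


# NC_007509 row: parA/repA at 0.996, POML at 0.001.
print(round(circular_distance(0.996, 0.001), 3))
# NC_007494 row: the two positions lie on nearly opposite sides.
print(round(circular_distance(0.996, 0.513), 3))
```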
Number of chromosomes in which the putative origin is located near one or more origin gene pairs
| | Number of origin gene pairs near the putative origin² | | | | | | | |
| Gene Pairs¹ | 0 | 1 | 2 | 3 | 4 | 5 | > 0 | Total³ |
| 0 | 57 | - | - | - | - | - | - | 57 |
| 1 | 24 | 7 | - | - | - | - | 7 | 31 |
| 2 | 22 | 11 | 21 | - | - | - | 32 | 54 |
| 3 | 17 | 3 | 23 | 24 | - | - | 50 | 67 |
| 4 | 23 | 11 | 12 | 3 | 31 | - | 57 | 80 |
| 5 | 1 | 2 | 0 | 1 | 0 | 5 | 8 | 9 |
¹Number of "origin gene" pairs found in any given chromosome, as defined in the text.
²Number of chromosomes for which the specified number of origin gene pairs is found within 1% of the chromosome length from POML. When multiple origin gene pairs are found near POML, the gene pairs are close together on the chromosome.
³Total number of chromosomes with that number of annotated gene pairs, regardless of whether any pair is close to POML.
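The tally described in footnote 2 could be computed per chromosome along these lines. This is only a sketch: it assumes each origin gene pair is summarized by a single fractional position and that "within 1%" is measured around the circle. The function name and the example positions are hypothetical.

```python
def count_pairs_near_origin(pair_positions, po_ml, threshold=0.01):
    """Count gene pairs whose position lies within `threshold` (a fraction
    of chromosome length) of POML on a circular chromosome."""
    count = 0
    for position in pair_positions:
        direct = abs(position - po_ml)
        # Use the shorter arc so that positions near 0 and 1 count as close.
        if min(direct, 1.0 - direct) <= threshold:
            count += 1
    return count


# Hypothetical chromosome with three annotated gene pairs, two near POML.
print(count_pairs_near_origin([0.100, 0.105, 0.500], po_ml=0.102))
```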
Figure 2. Strand asymmetry across bacterial chromosomes. A scatter plot of RT and RG for D4 sites on the leading strands of the 352 bacterial chromosomes. The values represent the average effect for the two leading strands in each of the replication arms, with the bars indicating the 95% uncertainty. The straight line represents RT = RG.
Chromosomes with extreme skew
| Species | Group | RG | RT | Skew¹ |
| Chromosomes with strongest overall skew | | | | |
| Candidatus | Gammaproteobacteria | 0.433 | 0.107 | 0.446 |
| | Firmicutes | 0.426 | -0.087 | 0.435 |
| | Alphaproteobacteria | 0.346 | 0.226 | 0.413 |
| | Alphaproteobacteria | 0.360 | 0.191 | 0.408 |
| | Alphaproteobacteria | 0.360 | 0.191 | 0.407 |
| | Alphaproteobacteria | 0.353 | 0.190 | 0.401 |
| | Alphaproteobacteria | 0.349 | 0.188 | 0.397 |
| | Firmicutes | 0.362 | -0.121 | 0.382 |
| | Spirochaetes | 0.229 | 0.300 | 0.378 |
| | Firmicutes | 0.350 | -0.086 | 0.360 |
| | Spirochaetes | 0.207 | 0.287 | 0.354 |
| | Gammaproteobacteria | 0.254 | 0.228 | 0.341 |
| | Alphaproteobacteria | 0.300 | 0.080 | 0.311 |
| | Firmicutes | 0.273 | -0.137 | 0.305 |
| | Alphaproteobacteria | 0.284 | 0.080 | 0.295 |
| | Gammaproteobacteria | 0.197 | 0.218 | 0.294 |
| | Firmicutes | 0.286 | -0.024 | 0.287 |
| | Gammaproteobacteria | 0.254 | 0.124 | 0.282 |
| | Firmicutes | 0.260 | -0.042 | 0.263 |
| | Firmicutes | 0.260 | 0.000 | 0.260 |
| Chromosomes with weakest overall skew | | | | |
| | Actinobacteria | 0.023 | 0.019 | 0.030 |
| | Cyanobacteria | 0.028 | 0.005 | 0.028 |
| | Alphaproteobacteria | 0.022 | 0.014 | 0.026 |
| | Actinobacteria | 0.002 | 0.025 | 0.025 |
| | Deltaproteobacteria | 0.022 | -0.009 | 0.024 |
| | Firmicutes | 0.017 | 0.016 | 0.023 |
| | Firmicutes | -0.012 | 0.016 | 0.020 |
| | Gammaproteobacteria | 0.016 | 0.011 | 0.020 |
| | Firmicutes | 0.017 | 0.009 | 0.019 |
| | Firmicutes | -0.014 | 0.009 | 0.016 |
| | Aquificae | -0.006 | -0.014 | 0.015 |
| | Firmicutes | -0.012 | -0.002 | 0.013 |
| | Cyanobacteria | -0.005 | -0.010 | 0.012 |
| | Cyanobacteria | 0.010 | -0.002 | 0.010 |
| | Cyanobacteria | 0.009 | -0.001 | 0.009 |
| | Gammaproteobacteria | -0.003 | -0.005 | 0.006 |
| | Cyanobacteria | -0.005 | 0.003 | 0.006 |
| | Firmicutes | 0.003 | 0.003 | 0.004 |
| | Cyanobacteria | 0.001 | 0.004 | 0.004 |
| | Cyanobacteria | 0.003 | -0.002 | 0.004 |
¹Overall skew is given by $\sqrt{R_G^2 + R_T^2}$.
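As a quick check of the footnote formula, the overall skew is the Euclidean length of the (RG, RT) pair. The snippet below is a small illustration with a hypothetical function name. Because the tabulated RG and RT are rounded to three decimals, recomputed values may differ from the table in the last digit for some rows.

```python
import math


def overall_skew(r_g, r_t):
    """Overall skew as the square root of RG squared plus RT squared."""
    return math.hypot(r_g, r_t)


# First two rows of the strongest-skew group.
print(round(overall_skew(0.433, 0.107), 3))
print(round(overall_skew(0.426, -0.087), 3))
```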
Figure 3. Strand asymmetry in Firmicutes. The same plot as in Figure 2 with the Firmicute chromosomes indicated as open points. Open squares indicate those of the class Mollicutes (which includes the Mycoplasma genus) while the open circles indicate all other Firmicutes.
Likelihood comparisons for the different methods when applied to the E. coli K12 chromosome CDS and intergenic sites
| Method | Log(L) | -2 × Diff¹ | Probability (DF1, DF2)² |
| Mobs | -2696055.16 | NA | NA |
| M0 | -2696052.54 | reference | NA |
| M1 | -2696075.49 | 45.9 | 5.5 × 10⁻⁵ (35, 20) |
| M2 | -2696239.16 | 373.2 | << 10⁻⁶ (35, 14) |
| M3 | -2740719.34 | 89,334 | << 10⁻⁶ (35, 5) |
¹Relative to M0.
²Probability of the chi-square test with (DF1 - DF2) degrees of freedom.
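The comparison in this table has the form of a standard likelihood-ratio test, which can be sketched as below. The sketch assumes DF1 is the parameter count of M0 and DF2 that of the simpler model being compared. The function names are hypothetical, and the chi-square tail is computed with a closed-form expression for integer degrees of freedom rather than with whatever software the authors used.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        return math.exp(-z) * sum(z**i / math.factorial(i) for i in range(df // 2))
    # Odd df: complementary error function plus half-integer gamma terms.
    k = (df - 1) // 2
    tail = sum(z ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, k + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * tail


def likelihood_ratio_test(loglik_reference, loglik_reduced, df_reference, df_reduced):
    """Return (-2 x log-likelihood difference, chi-square probability)."""
    statistic = -2.0 * (loglik_reduced - loglik_reference)
    return statistic, chi2_sf(statistic, df_reference - df_reduced)


# M1 against M0, using the log-likelihoods and (DF1, DF2) from the table.
statistic, probability = likelihood_ratio_test(-2696052.54, -2696075.49, 35, 20)
print(round(statistic, 1), probability)
```

With the M1 row this reproduces the tabulated statistic of 45.9 and a probability of the same order as the tabulated 5.5 × 10⁻⁵.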
Model parameters for the E. coli K12 chromosome when the M0 method is implemented
| Site Class1 | CA+T2 | πAT | σAT | πGC | σGC |
| IG | |||||
| C1+ | |||||
| C1- | |||||
| C2+ | |||||
| C2- | |||||
| C3+ | |||||
| C3- | |||||
| D4+ | |||||
| D4- |
¹Sites are classified as intergenic (IG) or coding (C), with the latter further classified by codon position, as indicated by the subscript, and chromosome strand, as indicated by the superscript. D4 are fourfold degenerate C3 sites.
²The ML value is given with the 95% interval in brackets.
Figure 4. Likelihood surface for the E. coli K12 chromosome. This representation of the likelihood surface of the E. coli K12 chromosome (NC_000913) is based on the MOBS model (see Methods). The two axes represent relative chromosome length, so that every point represents the likelihood analysis on the pair of chromosome locations [S1, S2]. The grey scale inset shows the conversion of log likelihood to grey value, with the maximum log likelihood (LLmax) as black and the bars indicating -10 decrements. Since not every pair of chromosome locations was sampled, the points are dispersed. The likelihood analysis is symmetrical around the diagonal, so the two maxima are identical and represent just one pair of chromosome locations with the leading and lagging strands interchanged. The negative lines represent the location of the annotated ori and ter. The positive lines represent the maximum likelihood (ML) values and Monte Carlo estimated 95% ranges.