Aurian Garcia-Gonzalez, Ruben J Rivera-Rivera, Steven E Massey.
Abstract
DNA repair is expected to be a modulator of underlying mutation rates; however, the major factors affecting the distribution of DNA repair pathways have not been determined. The Proteomic Constraint theory proposes that mutation rates are inversely proportional to the amount of heritable information contained in a genome, which is effectively the proteome. Thus, organisms with larger proteomes are expected to possess more efficient DNA repair. We show that an important factor influencing the presence or absence of four DNA repair genes, mutM, mutY, mutL, and mutS, is indeed the size of the bacterial proteome. This is true of both intracellular and other bacteria. In addition, the relationship of DNA repair to genome GC content was examined. In principle, if a DNA repair pathway is biased in the types of mutations it corrects, this may alter the genome GC content. The presence of the mismatch repair genes mutL and mutS was not correlated with genome GC content, consistent with their involvement in an unbiased DNA repair pathway. In contrast, the presence of the base excision repair genes mutM and mutY, whose products both correct GC → AT mutations, was positively correlated with genome GC content, consistent with their biased repair mechanism. Phylogenetic analysis, however, indicates that the relationship between the presence of mutM and mutY genes and genome GC content is not a simple one.
Keywords: AT bias; DNA repair; bacterial genome; proteome size
Year: 2012 PMID: 22403581 PMCID: PMC3288817 DOI: 10.3389/fgene.2012.00003
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. The BER and MMR pathways. (A) The role of the products of the BER genes mutM and mutY. (a) Oxidation of guanine to 8-oxoguanine (circled G) occurs either before or after incorporation of dGTP into the DNA strand; (b) mutM removes 8-oxoguanine that is present in double-stranded DNA; (c) mutY removes a mismatched adenine that is incorporated opposite 8-oxoguanine. (B) The role of the products of the MMR genes. The mutS gene product recognizes a mismatched base pair in the DNA strand after replication. The mutL gene product is recruited to the complex, while the mutH gene product is used to distinguish the original DNA strand, which is methylated, from the newly replicated DNA strand.
Mean GC contents and proteome sizes of bacterial genomes that possess or lack the DNA repair genes mutM, mutY, mutL, and mutS.
| Gene | Number of genomes lacking the gene (699 total genomes or 604 genomes from extracellular bacteria) | Mean GC content if gene present (%) | Mean GC content if gene absent (%) | P-value (GC content) | Mean proteome size if gene present (codons) | Mean proteome size if gene absent (codons) | P-value (proteome size) |
|---|---|---|---|---|---|---|---|
| | 135 | 50.5 | 38.2 | 3.1E−21 | 1154894 | 717216 | 9.2E−18 |
| | 94 (extracellular) | 52.4 | 39.9 | 4.6E−18 | 1239048 | 885246 | 4.0E−11 |
| | 189 | 51.9 | 38.1 | 6.3E−35 | 1206817 | 706366 | 1.2E−27 |
| | 129 (extracellular) | 52.9 | 41.5 | 1.7E−19 | 1262652 | 894257 | 1.4E−13 |
| | 123 | 48.1 | 48.1 | ns | 1131088 | 877910 | 1.9E−7 |
| | 93 (extracellular) | 49.8 | 53.9 | 3.9E−3 | 1226575 | 1068544 | 5.1E−4 |
| | 83 | 47.7 | 51.2 | 0.03 | 1080884 | 992286 | 0.02 |
| | 50 (extracellular) | 49.1 | 64.5 | 8.4E−17 | 1158575 | 1454162 | 8.0E−4 |
| | 59 | 52.4 | 40.0 | 9.1E−10 | 1208453 | 760976 | 1.3E−9 |
| | 41 (extracellular) | 54.2 | 43.3 | 3.4E−7 | 1298041 | 960711.9 | 1.4E−6 |
| | 79 | 53.2 | 41.1 | 2.0E−11 | 1258561 | 751846 | 1.7E−11 |
| | 56 (extracellular) | 54.5 | 45.2 | 9.8E−7 | 1327864 | 956273.1 | 8.5E−7 |
| | 47 | 50.0 | 47.8 | ns | 1167556 | 842510 | 2.6E−5 |
| | 32 (extracellular) | 51.7 | 56.2 | 0.03 | 1258482 | 1115556 | 0.02 |
| | 35 | 49.8 | 48.9 | ns | 1145003 | 883780 | 0.002 |
| | 20 (extracellular) | 51.2 | 63.7 | 7.9E−6 | 1226206 | 1375153 | ns (0.56) |
A Mann–Whitney test was conducted to test the significance of the difference in the means of GC content and proteome size, between the genomes that possess a gene and those that do not. “ns” denotes “not significant.” Table (A) shows the results of the analysis on the entire dataset, (B) shows results of the analysis on the dataset where only one species was selected from each genus.
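The comparison described in this note can be illustrated with a minimal sketch of a two-sided Mann–Whitney U test. The record does not say which software or which variant of the test the authors used, so the choices here are assumptions: a normal approximation with tie correction and no continuity correction. The function name `mann_whitney_u` and the GC values in the usage lines are hypothetical and are not taken from the study.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, approximate two-sided p-value).
    Assumption: no continuity correction; suitable for large samples only.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the observations, remembering which sample each came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j (1-based).
        midrank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value

# Hypothetical GC contents (%) for genomes with and without a gene.
gc_present = [52.1, 61.3, 48.7, 66.0, 57.4]
gc_absent = [33.5, 38.2, 41.0, 29.9]
u_stat, p = mann_whitney_u(gc_present, gc_absent)
```

With samples as small as the illustrative ones above, an exact test would be more appropriate than the normal approximation; the approximation is reasonable for group sizes like those in the table.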
Figure A1. The relationship between the presence or absence of genes involved in DNA repair and proteome size. Plots show presence or absence of (A) mutM, (B) mutY, (C) mutL, (D) mutS. Data were generated from 699 complete eubacterial genomes. Intracellular and extracellular bacteria are indicated.
Φ Coefficient for pairwise distributions of DNA repair genes.
| Gene pairs | ||||||
|---|---|---|---|---|---|---|
| Total dataset (699 genomes) | 0.66 | 0.25 | 0 | 0.08 | 0 | 0.10 |
| Extracellular bacteria (604 genomes) | 0.70 | 0.32 | 0.11 | −0.03 | −0.08 | −0.12 |
The Φ coefficient was calculated as described in Section .
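As an illustration of how a Φ coefficient is obtained for a pair of genes, the sketch below uses the standard definition based on a 2×2 contingency table of presence/absence counts. Whether the authors computed it in exactly this way is an assumption, since the methods section is not part of this record. The function name `phi_coefficient` and the example vectors are hypothetical.

```python
import math

def phi_coefficient(gene_a, gene_b):
    """Phi coefficient for two equal-length presence (1) / absence (0) vectors.

    phi = (n11*n00 - n10*n01) / sqrt(n1. * n0. * n.1 * n.0)
    Returns NaN when a gene is present in all genomes or in none,
    because the coefficient is undefined in that case.
    """
    if len(gene_a) != len(gene_b):
        raise ValueError("vectors must cover the same genomes")
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(gene_a, gene_b):
        if a == 1 and b == 1:
            n11 += 1
        elif a == 1 and b == 0:
            n10 += 1
        elif a == 0 and b == 1:
            n01 += 1
        else:
            n00 += 1
    # Product of the four marginal totals of the contingency table.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom

# Hypothetical presence/absence of two genes across six genomes.
gene_x = [1, 1, 1, 0, 0, 1]
gene_y = [1, 1, 0, 0, 0, 1]
phi = phi_coefficient(gene_x, gene_y)
```

A value near 1 means the two genes tend to be present or absent together, a value near 0 means their distributions are unrelated, and a negative value means one tends to be present where the other is absent.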
Figure A2. Heatmap clustering analysis showing presence/absence of the DNA repair genes. Rows represent genomes and columns represent genes. Black denotes absence of a gene in a particular genome, while gray indicates presence. The clustering of columns follows the complete linkage method, as implemented in the R statistical package. The right-hand side of the plot displays whether a genome belongs to a bacterium that is intracellular (black square).
Gene contents of pathogenic and non-pathogenic extracellular bacteria. 604 extracellular bacterial genomes were examined for their average proteome sizes, average GC contents, and gene contents.
| | Extracellular pathogens (237 genomes) | Extracellular non-pathogens (367 genomes) | P-value |
|---|---|---|---|
| Mean proteome size (codons) | 1139797 | 1202806 | ns |
| Mean GC content | 49% | 51% | 0.045 |
| Percentage that lack | 24 | 24 | – |
| Percentage that lack | 42 | 72 | – |
| Percentage that lack | 31 | 31 | – |
| Percentage that lack | 25 | 20 | – |
| Mean proteome size (codons) | 1094197 | 1198376 | ns |
| Mean GC content | 50% | 50% | No difference in the means |
| Percentage that lack | 22 | 25 | – |
| Percentage that lack | 46 | 64 | – |
| Percentage that lack | 30 | 31 | – |
| Percentage that lack | 20 | 21 | – |
| Mean proteome size (number of codons) | 342246 | 1183493 | 2.0E–50 |
| Mean GC content | 34% | 50% | 2.6E–29 |
| Percentage that lack | 46 | 25 | – |
| Percentage that lack | 72 | 61 | – |
| Percentage that lack | 40 | 31 | – |
| Percentage that lack | 44 | 23 | – |
(A) Pathogenic (opportunistic and host associated) extracellular bacteria were compared with non-pathogenic extracellular bacteria; (B) host associated pathogenic extracellular bacteria were compared with all other extracellular bacteria; (C) Intracellular bacteria were compared with extracellular bacteria. “ns” denotes not significant.
Figure A3. The relationship between the presence/absence of genes involved in DNA repair and bacterial genome GC content. Plots show presence or absence of (A) mutM, (B) mutY, (C) mutL, (D) mutS. Data were generated from 699 complete eubacterial genomes. Intracellular and extracellular bacteria are indicated.
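The two genome-level quantities used throughout, GC content (%) and proteome size (codons), can be sketched as follows. The exact conventions the authors used are not given in this record, so two assumptions are made here: GC content is computed over unambiguous A/C/G/T bases only, and proteome size is the total number of codons across all coding sequences, stop codons included. The function names and sequences are hypothetical.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases of a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        return float("nan")  # no unambiguous bases to count
    return 100.0 * gc / total

def proteome_size_codons(coding_sequences):
    """Total codon count over a list of coding sequences (assumed in frame)."""
    return sum(len(cds) // 3 for cds in coding_sequences)

# Hypothetical toy genome with two short coding sequences.
genome = "ATGAAATAAGGCCATGTAA"
cds_list = ["ATGAAATAA", "ATGTAA"]
genome_gc = gc_content(genome)
proteome_size = proteome_size_codons(cds_list)
```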
Figure 2. Phylogenetic trees of the cyanobacteria and alphaproteobacteria, with genome GC contents and presence of DNA repair genes. (A) Cyanobacteria; (B) Alphaproteobacteria. Trees were constructed as described in Section “Materials and Methods.” Numerals refer to posterior probability.
| Gene | |||
|---|---|---|---|
| 1 (Gene present) | 0 (Gene absent) | ||
| 1 (Gene present) | |||
| 0 (Gene absent) | |||