Abstract
Prokaryotic genomes are diverse in their nucleotide and oligonucleotide composition as well as in the presence of various sequence features that can affect the physical properties of the DNA molecule. We present a survey of local sequence patterns that have the potential to promote non-canonical DNA conformations (i.e. conformations different from the standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches.
Keywords: DNA curvature; Z-DNA; palindromes; sequence patterns; sequence repeats
Year: 2014 PMID: 24408877 PMCID: PMC4060949 DOI: 10.1093/dnares/dst057
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
List of sequence patterns investigated in this work
| Pattern | Code | Meaning | Example^a |
|---|---|---|---|
| Simple sequence repeats | 1n8 | A single nucleotide repeated 8+ times in a row | G |
| | 2n5 | A dinucleotide repeated 5+ times in a row | AC |
| | 3n4 | Analogous to the two examples above | |
| | 4n4 | | |
| | 5n4 | | |
| | 6n3 | | |
| | 7n3 | | |
| | 8n3 | | |
| | 9n2 | | |
| | 10n2 | | |
| | 11n2 | | |
| Close direct repeats | 4n6g12 | A tetranucleotide repeated 6+ times with gaps ≤12 nt | TA |
| | 6n6g24 | A 6-mer repeated 6+ times with gaps ≤24 nt | |
| | 8n4g24 | An 8-mer repeated 4+ times with gaps ≤24 nt | |
| | cd8g6 | An 8-mer repeated within 6 nt | |
| | cd10g50 | A 10-mer repeated within 50 nt | |
| Palindromes and inverted repeats | cp8g6 | Inverted repeat of an 8-mer separated by no more than 6 nt | |
| | cp10g50 | Inverted repeat of a 10-mer separated by ≤50 nt | |
| | pals9 | 9 nt inverted repeat (no separation) OR 12 nt inverted repeat allowing 1 mismatch OR 15 nt inverted repeat allowing 2 mismatches OR … (one mismatch added for every 3 nt length) | CTGGATCAGGCTAAA⋮TTCAGCCTCATCCAG |
| | pals9g12 | Like pals9 but allowing separation up to 12 bp | |
| | pals12g20 | Analogous to the example above | |
| H-DNA-related patterns | cm8g6 | Mirror repeat of an 8-mer separated by ≤6 nt | |
| | cm10g50 | Mirror repeat of a 10-mer separated by ≤50 nt | |
| | mirs9 | 9 nt mirror repeat (no separation) OR 12 nt mirror repeat allowing 1 mismatch OR 15 nt mirror repeat allowing 2 mismatches OR … (one mismatch added for every 3 nt length) | CTGGATCAGGCTAAA⋮AACTCGGACTGGGTC |
| | mirs9g12 | Like mirs9 but allowing separation up to 12 bp | |
| | mirs12g20 | Analogous to mirs9g12 | |
| | R15 | Run of ≥15 purines or pyrimidines | AAGGGAGGGAGGAGA |
| | R30 | Run of ≥30 purines or pyrimidines | |
| | R30e3 | Run of ≥30 purines or pyrimidines allowing ≤3 errors | |
| | R45e6 | Analogous to the example above | |
| | R60e9 | | |
| G-DNA-related patterns | GG8g4 | 8 or more GG dimers separated by ≤4 nt from each other | |
| | GGG4g6 | Analogous to the example above | |
| | GGGG4g6 | | |
| Z-DNA-related patterns | GC6 | Alternating G–C, ≥6 nt length | GCGCGC |
| | GC8 | Alternating G–C, ≥8 nt length | |
| | RY12 | Alternating R-Y, ≥12 nt length | TGTACGTGTGCA |
| | RY12e1 | Like RY12 but allowing 1 error | TGTACGAGTGCA |
| | RY18e2 | Alternating R-Y, ≥18 nt length, ≤2 errors | |
| | RY24e3 | Alternating R-Y, ≥24 nt length, ≤3 errors | |
| DNA bending | bend45w60 | Predicted bend of ≥45° within a ≤60 bp segment | |
| | bend60w100 | Predicted bend of ≥60° within a ≤100 bp segment | |
| | bend90w120 | Predicted bend of ≥90° within a ≤120 bp segment | |

^a Segments matching the sequence pattern are underscored, mismatches are shaded, and symmetrical segments are separated by a vertical dashed line.
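Several of the pattern definitions above can be written down directly as short matchers. The following Python sketch is one possible reading of the definitions, not the procedure used in the survey. All names (`SSR_1N8`, `SSR_2N5`, `GC6`, `longest_alternating_ry`, `arm_mismatches`, `allowed_mismatches`) are hypothetical. The mismatch allowance `(arm_length - 9) // 3` is an assumption that generalizes the listed cases (9 nt with 0, 12 nt with 1, 15 nt with 2 mismatches), and the lengths are taken to refer to one arm, as the 15-nt arms of the table examples suggest. Sequences are assumed to contain only upper-case A, C, G and T.

```python
import re

# Illustrative readings of a few pattern codes; not the authors' implementation.
SSR_1N8 = re.compile(r"([ACGT])\1{7,}")  # 1n8: one nucleotide, 8+ copies in a row
# 2n5: one dinucleotide, 5+ copies in a row. As written, this also matches
# mononucleotide runs such as AAAAAAAAAA; the table does not say whether the
# survey counted those separately.
SSR_2N5 = re.compile(r"([ACGT]{2})\1{4,}")
GC6 = re.compile(r"(?:GC){3,}G?|(?:CG){3,}C?")  # GC6: alternating G-C, >= 6 nt

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_alternating_ry(seq):
    """Length of the longest strictly alternating purine/pyrimidine run."""
    classes = ["R" if base in "AG" else "Y" for base in seq]
    best = run = 0
    for i, cls in enumerate(classes):
        run = run + 1 if i > 0 and cls != classes[i - 1] else 1
        best = max(best, run)
    return best


def arm_mismatches(left, right, kind="inverted"):
    """Count mismatches between the two arms of a symmetric repeat.

    kind="inverted": right arm is compared with the reverse complement of left.
    kind="mirror":   right arm is compared with the plain reverse of left.
    """
    if len(left) != len(right):
        raise ValueError("arms must have equal length")
    if kind not in ("inverted", "mirror"):
        raise ValueError("kind must be 'inverted' or 'mirror'")
    target = left[::-1]
    if kind == "inverted":
        target = target.translate(COMPLEMENT)
    return sum(a != b for a, b in zip(target, right))


def allowed_mismatches(arm_length):
    """Assumed pals9/mirs9 allowance: 9 nt -> 0, 12 nt -> 1, 15 nt -> 2."""
    if arm_length < 9:
        raise ValueError("arms shorter than 9 nt do not qualify")
    return (arm_length - 9) // 3


if __name__ == "__main__":
    # Arms of the pals9 and mirs9 examples from the table.
    print(arm_mismatches("CTGGATCAGGCTAAA", "TTCAGCCTCATCCAG", "inverted"))  # 2
    print(arm_mismatches("CTGGATCAGGCTAAA", "AACTCGGACTGGGTC", "mirror"))    # 2
    print(longest_alternating_ry("TGTACGTGTGCA"))                            # 12
```

Both table examples have 15-nt arms with two mismatches, which is exactly the allowance this reading gives for that length. Gapped variants (for example pals9g12) and error-tolerant variants (for example RY12e1) would need a scan over arm positions and error budgets, which this sketch does not attempt.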
Representation of sequence patterns in different phyla
| Pattern name | Pattern code | AlPr | BePr | GaPr | DePr | EpPr | Firm | Acti | Cyan | Bact | Chlb | Chlf | Dein | Fuso | Chla | Spir | Acid | Verr | Defe | Plan | Aqui | Ther | Eury | Cren |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | 58 | 39 | 76 | 23 | 11 | 61 | 63 | 12 | 34 | 5 | 8 | 6 | 5 | 6 | 4 | 4 | 4 | 4 | 4 | 8 | 5 | 44 | 16 |
| Simple sequence repeats | 1n8 | −4.00 | −3.00 | −4.00 | −4.00 | −4.00 | −4.00 | −4.00 | −4.00 | −4.00 | −1.67 | −4.00 | −4.00 | −4.00 | −3.79 | −3.75 | −3.00 | −3.00 | −4.00 | −3.50 | −4.00 | −4.00 | −4.00 | −4.00 |
| | 2n5 | −2.21 | −2.00 | −3.00 | −3.00 | −4.00 | −4.00 | −3.00 | −3.63 | −4.00 | −3.00 | −3.00 | −2.60 | −4.00 | −4.00 | −3.50 | −2.00 | −2.50 | −4.00 | −2.00 | −4.00 | −4.00 | −3.00 | −3.00 |
| | 3n4 | −0.78 | −2.00 | −2.09 | −1.00 | −3.00 | −3.00 | −2.00 | −1.50 | −2.67 | −2.00 | −1.50 | −1.50 | −4.00 | −3.00 | −3.17 | −0.25 | −1.00 | −3.50 | −1.00 | −3.50 | −4.00 | −3.00 | −3.00 |
| | 4n4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 5n4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 6n3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 7n3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.75 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.50 | 1.00 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 8n3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 9n2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −0.36 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 10n2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.17 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 |
| | 11n2 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.59 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Close direct repeats | 4n6g12 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.00 | 0.00 | 0.00 | 1.25 | 0.00 | 0.00 | 0.00 | 0.00 |
| | 6n6g24 | 0.42 | 0.25 | 0.88 | 1.00 | 0.00 | 0.30 | 2.00 | 1.50 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.98 | 1.50 | 0.00 | 1.50 | 1.25 | 0.00 | 0.00 | 0.10 | 0.00 |
| | 8n4g24 | 0.50 | 1.00 | 1.14 | 2.00 | 0.00 | 1.00 | 3.00 | 3.00 | 1.50 | 2.00 | 1.00 | 0.00 | 0.00 | 0.00 | 1.25 | 2.25 | 0.50 | 1.50 | 2.75 | 0.00 | 0.00 | 1.00 | 0.00 |
| | cd8g6 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | 0.00 | 2.00 | 1.92 | 0.00 | 0.00 | 0.75 | 1.20 | 0.00 | 0.00 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.60 | 1.00 |
| | cd10g50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.60 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −0.19 | 0.15 | 0.00 | 0.00 | −1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Palindromes and inverted repeats | cp8g6 | 3.00 | 3.00 | 4.00 | 3.50 | 3.00 | 4.00 | 3.00 | 1.81 | 3.00 | 2.00 | 4.00 | 2.63 | 3.00 | 4.00 | 2.86 | 3.50 | 4.00 | 3.00 | 1.50 | 2.00 | 4.00 | 2.00 | 2.00 |
| | cp10g50 | 3.00 | 4.00 | 4.00 | 4.00 | 3.83 | 4.00 | 4.00 | 3.54 | 4.00 | 4.00 | 4.00 | 3.75 | 4.00 | 4.00 | 3.69 | 3.50 | 2.00 | 3.00 | 3.50 | 1.00 | 4.00 | 1.00 | 1.00 |
| | pals9 | 1.09 | 2.00 | 3.35 | 3.00 | 2.00 | 4.00 | 4.00 | 2.00 | 4.00 | 3.00 | 3.50 | 3.00 | 3.00 | 3.96 | 2.17 | 1.50 | 2.00 | 1.50 | 1.00 | 0.00 | 3.50 | 0.10 | 0.00 |
| | pals9g12 | 3.66 | 4.00 | 4.00 | 4.00 | 3.72 | 4.00 | 4.00 | 3.71 | 4.00 | 4.00 | 4.00 | 3.75 | 4.00 | 4.00 | 3.77 | 4.00 | 4.00 | 3.00 | 4.00 | 1.00 | 4.00 | 2.00 | 1.00 |
| | pals12g20 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 4.00 | 3.77 | 4.00 | 4.00 | 3.00 | 4.00 | 0.00 | 4.00 | 1.00 | 0.22 |
| H-DNA-related patterns | cm8g6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.34 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | cm10g50 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.34 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | mirs9 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | mirs9g12 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.25 | 0.17 | 0.00 | 0.00 | 0.34 | 0.00 | 0.00 | 0.00 | 0.25 | 0.50 | 0.00 | 0.00 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 |
| | mirs12g20 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.09 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.00 | 0.50 | 0.00 | 0.25 | 0.00 | 0.00 | 0.00 | 0.00 |
| | R15 | −0.20 | 0.00 | 0.00 | 0.00 | −1.00 | −0.20 | 0.00 | −1.07 | 0.00 | 0.00 | 0.25 | −1.00 | −4.00 | 0.00 | −0.90 | −0.50 | 1.00 | −3.00 | 0.00 | −2.00 | −3.00 | −1.00 | −1.00 |
| | R30 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.38 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | R30e3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −2.00 | −0.25 | −0.13 | 0.00 | 0.50 | −1.00 | 0.00 | −0.50 | −1.50 | 0.00 | 0.00 |
| | R45e6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −2.00 | 0.00 | 0.15 | 0.00 | 0.50 | 0.00 | 0.50 | −0.50 | 0.00 | 0.00 | 0.00 |
| | R60e9 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.07 | 0.00 | 0.00 | 0.00 | 0.50 | 0.00 | 0.00 | 0.00 | 0.00 |
| G-DNA-related patterns | GG8g4 | 1.00 | 1.33 | 0.00 | 0.00 | 0.00 | 0.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 2.70 | 0.00 | 0.00 | 0.54 | 1.00 | 0.00 | 0.00 | 1.50 | 0.00 | 0.00 | 0.00 | 0.00 |
| | GGG4g6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | −0.63 | 0.00 | 0.00 | 0.00 | 0.00 | −0.31 | 1.00 | 0.00 | 0.00 | −0.25 | 0.00 | 0.00 | 0.00 | 0.00 | −0.50 | 0.00 | −0.42 | 0.00 |
| | GGGG4g6 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.17 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Z-DNA-related patterns | GC6 | −1.00 | −1.00 | −2.00 | −2.00 | −1.00 | −1.00 | −2.00 | −2.00 | −1.00 | −4.00 | −2.00 | −1.20 | 0.00 | 0.00 | −0.77 | −1.50 | −1.00 | −0.50 | −2.75 | −0.25 | 0.00 | −2.00 | −1.00 |
| | GC8 | −0.50 | −1.00 | −1.00 | −1.00 | 0.00 | 0.00 | −3.00 | −0.25 | 0.00 | −2.00 | −1.34 | −1.25 | 0.00 | 0.00 | 0.00 | −0.50 | 0.00 | 0.00 | −3.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| | RY12 | −2.50 | −2.00 | −3.00 | −1.00 | −0.89 | −0.33 | −2.00 | −2.00 | −0.35 | −2.00 | −3.25 | −0.25 | 0.00 | 0.42 | −0.09 | −0.75 | 0.00 | −1.00 | −3.00 | 0.00 | 0.00 | −0.72 | 0.00 |
| | RY12e1 | −2.00 | −1.40 | −3.00 | −1.00 | −1.00 | −0.25 | −2.00 | −3.00 | 0.00 | −3.00 | −2.25 | 0.00 | 0.00 | 0.84 | 0.05 | −0.75 | 0.00 | −1.00 | −3.00 | 0.50 | 0.00 | −0.64 | −0.98 |
| | RY18e2 | −2.65 | −2.00 | −2.53 | −1.00 | 0.00 | 0.00 | −2.00 | −2.00 | 0.00 | −2.00 | −3.00 | −1.00 | 0.00 | 0.75 | 0.00 | −0.75 | 0.00 | 0.00 | −3.50 | 0.00 | 0.00 | 0.00 | 0.00 |
| | RY24e3 | −1.03 | −1.50 | −1.00 | 0.00 | 0.00 | 0.00 | −1.00 | −1.00 | 0.00 | −1.00 | −1.25 | 0.00 | 0.00 | 0.00 | −0.17 | −1.00 | 0.00 | 0.00 | −1.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| DNA bending | bend45w60 | 0.28 | 0.00 | 0.07 | 1.00 | 3.00 | 2.00 | 0.00 | 1.50 | 2.00 | 1.00 | 0.17 | 0.45 | 2.00 | 1.50 | 0.50 | 0.00 | 0.00 | 2.50 | 0.00 | 2.00 | 2.00 | 2.00 | 0.00 |
| | bend60w100 | 0.00 | 0.00 | 0.00 | 0.50 | 2.97 | 2.00 | 0.00 | 1.00 | 2.00 | 0.00 | 0.00 | 0.00 | 2.00 | 0.75 | 0.25 | 0.00 | 0.00 | 3.00 | 0.00 | 2.25 | 3.00 | 2.00 | 0.00 |
| | bend90w120 | 0.00 | 0.00 | 0.00 | 0.00 | 3.00 | 1.00 | 0.00 | 1.00 | 1.00 | 0.00 | 0.00 | 0.00 | 3.00 | 0.00 | 0.20 | 0.00 | 0.00 | 3.50 | 0.00 | 2.00 | 3.00 | 1.00 | 0.00 |
Numbers in the table refer to the medians of pattern representation among all genera of the corresponding phylum. The pattern representations were categorized into nine categories from −4 (extremely under-represented), through 0 (normally represented), to +4 (extremely over-represented). See Materials and methods for details. A colour version of this table is presented in Supplementary Materials as Table S8. Codes in the second column refer to specific sequence patterns (see Table 1). Columns represent different phyla abbreviated as follows: AlPr, α-proteobacteria; BePr, β-proteobacteria; GaPr, γ-proteobacteria; DePr, δ-proteobacteria; EpPr, ε-proteobacteria; Firm, Firmicutes; Acti, Actinobacteria; Cyan, Cyanobacteria; Bact, Bacteroidetes; Chlb, Chlorobi; Chlf, Chloroflexi; Dein, Deinococcus-Thermus; Fuso, Fusobacteria; Chla, Chlamydiae; Spir, Spirochaetes; Acid, Acidobacteria; Verr, Verrucomicrobia; Defe, Deferribacteres; Plan, Planctomycetes; Aqui, Aquificales; Ther, Thermotogae; Eury, Euryarchaeota; Cren, Crenarchaeota. Numbers in the second row indicate the number of genera available for each phylum. Only phyla represented by three or more genera are shown.
Representation of sequence patterns in different OGT and oxygen requirement classes
| Pattern name | Pattern code | Psychrophile | Mesophile | Thermophile | Hyperthermophile | Anaerobe | Aerobe | Facultative | Microaerophile |
|---|---|---|---|---|---|---|---|---|---|
| | | 18 | 382 | 72 | 26 | 159 | 202 | 95 | 9 |
| Simple sequence repeats | 1n8 | −3.83 | −3.54 | −3.89 | −4.00 | −3.73 | −3.61 | −3.65 | −3.67 |
| | 2n5 | −3.15 | −2.89 | −3.40 | −3.45 | −3.30 | −2.74 | −3.02 | −3.11 |
| | 3n4 | −1.93 | −1.92 | −2.83 | −3.09 | −2.56 | −1.75 | −2.15 | −2.33 |
| | 4n4 | −0.06 | 0.01 | −0.06 | −0.08 | −0.04 | 0.01 | 0.00 | −0.10 |
| | 5n4 | 0.00 | 0.07 | 0.00 | 0.00 | 0.02 | 0.10 | 0.01 | 0.11 |
| | 6n3 | 0.20 | 0.19 | 0.06 | −0.03 | 0.05 | 0.25 | 0.08 | 0.31 |
| | 7n3 | 1.27 | 0.48 | 0.45 | 0.00 | 0.42 | 0.59 | 0.35 | 0.89 |
| | 8n3 | 0.56 | 0.30 | 0.12 | 0.01 | 0.25 | 0.27 | 0.20 | 0.33 |
| | 9n2 | −0.08 | 0.09 | 0.22 | 0.04 | 0.09 | 0.19 | −0.11 | −0.11 |
| | 10n2 | 0.09 | 0.25 | 0.25 | 0.28 | 0.27 | 0.24 | 0.16 | 0.00 |
| | 11n2 | 0.21 | 0.32 | 0.30 | 0.40 | 0.42 | 0.26 | 0.24 | 0.03 |
| Close direct repeats | 4n6g12 | 0.45 | 0.58 | 0.48 | 0.82 | 0.55 | 0.73 | 0.31 | 0.50 |
| | 6n6g24 | 1.35 | 1.16 | 0.71 | 0.34 | 1.00 | 1.21 | 0.81 | 1.39 |
| | 8n4g24 | 2.02 | 1.61 | 0.80 | 0.28 | 1.27 | 1.69 | 1.18 | 1.13 |
| | cd8g6 | 0.39 | 0.73 | 0.80 | 1.20 | 0.87 | 0.82 | 0.41 | 0.70 |
| | cd10g50 | 0.46 | 0.41 | 0.05 | −0.07 | 0.25 | 0.60 | 0.01 | 0.07 |
| Palindromes and inverted repeats | cp8g6 | 3.57 | 2.78 | 2.81 | 1.99 | 2.98 | 2.39 | 3.21 | 3.22 |
| | cp10g50 | 3.56 | 3.26 | 2.98 | 1.64 | 3.15 | 3.03 | 3.45 | 3.20 |
| | pals9 | 3.39 | 2.58 | 2.13 | 0.71 | 2.47 | 2.27 | 2.89 | 2.47 |
| | pals9g12 | 3.83 | 3.46 | 3.05 | 1.79 | 3.33 | 3.22 | 3.55 | 3.64 |
| | pals12g20 | 3.72 | 3.58 | 2.97 | 1.17 | 3.16 | 3.47 | 3.62 | 3.00 |
| H-DNA-related patterns | cm8g6 | 0.02 | 0.20 | 0.21 | 0.33 | 0.24 | 0.24 | 0.15 | 0.04 |
| | cm10g50 | 0.02 | 0.16 | 0.10 | 0.15 | 0.16 | 0.18 | 0.10 | 0.00 |
| | mirs9 | 0.00 | 0.13 | 0.19 | 0.17 | 0.14 | 0.18 | 0.08 | −0.02 |
| | mirs9g12 | 0.15 | 0.30 | 0.35 | 0.44 | 0.33 | 0.38 | 0.20 | 0.00 |
| | mirs12g20 | 0.15 | 0.27 | 0.28 | 0.20 | 0.26 | 0.33 | 0.16 | 0.22 |
| | R15 | −0.47 | −0.52 | −0.96 | −1.77 | −0.91 | −0.55 | −0.34 | −1.78 |
| | R30 | 0.03 | 0.07 | 0.08 | 0.11 | 0.10 | 0.04 | 0.07 | 0.00 |
| | R30e3 | 0.08 | −0.01 | −0.18 | −0.66 | −0.23 | 0.00 | 0.14 | −0.66 |
| | R45e6 | 0.17 | 0.08 | −0.03 | −0.32 | −0.06 | 0.10 | 0.15 | −0.02 |
| | R60e9 | 0.15 | 0.13 | 0.11 | −0.03 | 0.14 | 0.10 | 0.12 | 0.00 |
| G-DNA-related patterns | GG8g4 | 0.22 | 0.72 | 0.24 | 0.14 | 0.20 | 1.03 | 0.48 | 0.44 |
| | GGG4g6 | −0.08 | −0.26 | −0.78 | −0.65 | −0.67 | −0.18 | −0.26 | −0.22 |
| | GGGG4g6 | 0.00 | 0.10 | −0.06 | −0.12 | 0.01 | 0.11 | 0.02 | 0.11 |
| Z-DNA-related patterns | GC6 | −1.37 | −1.50 | −1.35 | −0.95 | −1.41 | −1.38 | −1.71 | −1.42 |
| | GC8 | −0.75 | −1.10 | −0.84 | −0.18 | −0.47 | −1.42 | −1.13 | −0.37 |
| | RY12 | −1.79 | −1.58 | −0.88 | −0.16 | −0.78 | −1.73 | −2.03 | −0.99 |
| | RY12e1 | −1.81 | −1.40 | −0.71 | −0.31 | −0.85 | −1.43 | −1.93 | −0.44 |
| | RY18e2 | −1.66 | −1.45 | −0.72 | −0.04 | −0.58 | −1.60 | −2.07 | −0.37 |
| | RY24e3 | −0.60 | −0.78 | −0.25 | 0.05 | −0.30 | −0.79 | −1.12 | −0.44 |
| DNA bending | bend45w60 | 0.60 | 0.93 | 1.51 | 1.11 | 1.57 | 0.59 | 0.85 | 1.70 |
| | bend60w100 | 0.32 | 0.79 | 1.40 | 1.17 | 1.42 | 0.50 | 0.74 | 1.48 |
| | bend90w120 | 0.28 | 0.58 | 1.23 | 0.94 | 1.11 | 0.34 | 0.66 | 1.44 |
Numbers in the table refer to the average significance category for all genera within each class of organisms. Numbers below the class description indicate numbers of available genera of each class. Anaerobe includes both obligate anaerobes and anaerobes; Aerobe includes both obligate aerobes and aerobes. See Supplementary Table S5 for coloured version of this table.
Figure 1. Comparison of representations of selected patterns in different OGT classes. Bars show the percentage of species in each OGT class which have the given pattern under-represented, normally represented, or over-represented. The pattern is considered over-represented if the P-value is <10⁻⁴ and the observed-to-expected ratio is >1.10 (representation level 2 or higher) for the majority of the complete genomes available for that genus; it is deemed under-represented if the P-value is <10⁻⁴ and the observed-to-expected ratio is <0.91, and normally represented otherwise. See Materials and methods and Supplementary Table S8. The four patterns for which the data are shown are representative of close repeat structures (8n4g24, top left), palindromes and close inverted repeats (pals9g12, top right), potential Z-DNA-promoting patterns (RY12, bottom left), and DNA bending patterns (bend60w100, bottom right). See Table 1 for description of the pattern codes.
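The three-way call described in the caption can be sketched in Python as follows. The thresholds (P < 10⁻⁴, ratio > 1.10, ratio < 0.91) come from the caption. The function names are hypothetical, and reading "majority" as a strict majority of genomes (more than half) is an assumption. The computation of the P-value and of the expected count under the null model is described in Materials and methods and is not reproduced here.

```python
from collections import Counter

P_CUTOFF = 1e-4     # significance threshold from the caption
OVER_RATIO = 1.10   # observed/expected above this counts as over-represented
UNDER_RATIO = 0.91  # observed/expected below this counts as under-represented


def classify_genome(p_value, obs_exp_ratio):
    """Call one genome 'over', 'under', or 'normal' for a given pattern."""
    if p_value < P_CUTOFF and obs_exp_ratio > OVER_RATIO:
        return "over"
    if p_value < P_CUTOFF and obs_exp_ratio < UNDER_RATIO:
        return "under"
    return "normal"


def classify_genus(genome_results):
    """Combine per-genome calls for one genus.

    genome_results is a list of (p_value, obs_exp_ratio) pairs, one per
    complete genome. A strict majority is assumed here; the caption does
    not say how ties are handled.
    """
    calls = Counter(classify_genome(p, r) for p, r in genome_results)
    n_genomes = len(genome_results)
    for label in ("over", "under"):
        if calls[label] > n_genomes / 2:
            return label
    return "normal"


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise each branch.
    print(classify_genome(1e-6, 1.5))
    print(classify_genome(1e-6, 0.5))
    print(classify_genome(0.01, 2.0))
    print(classify_genus([(1e-6, 1.5), (1e-6, 1.3), (0.5, 1.0)]))
```

A genome with a large observed-to-expected ratio but a P-value above the cutoff is still called normal, so both conditions must hold for an over- or under-representation call.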