Koji Ishiya, Nobutaka Nakashima.
Abstract
Homopolymeric tracts (HPTs) can lead to phase variation and DNA replication slippage, driving adaptation to environmental changes and evolution of genes and genomes. However, there is limited information on HPTs in Escherichia; therefore, we conducted a comprehensive cross-strain search for HPTs in Escherichia genomes. We determined the HPT genomic distribution and identified a pattern of high-frequency HPT localization in pathogenic Escherichia lineages. Notably, HPTs localized near transcriptional regulatory genes. Additionally, excessive repeats accumulated in toxin-coding genes. Moreover, the genomic localization of some HPTs might be derived from exogenous DNA, such as that of bacteriophages. Altogether, our findings may prove useful for understanding the role of HPTs in Escherichia genomes.
Keywords: Escherichia; comparative genomics; homopolymeric tracts; pathogenic lineages; single-nucleotide repeats; transcriptional regulation
Year: 2022 PMID: 35723320 PMCID: PMC8928963 DOI: 10.3390/cimb44020034
Source DB: PubMed Journal: Curr Issues Mol Biol ISSN: 1467-3037 Impact factor: 2.976
Repeat length of the observed homopolymeric tracts.
| Nucleotide | Count a | Mean b | SD c | Min d | Max e |
|---|---|---|---|---|---|
| A | 6,606,099 | 6.27 | 0.55 | 6.00 | 56 |
| C | 569,850 | 6.22 | 0.57 | 6.00 | 70 |
| G | 567,499 | 6.22 | 0.58 | 6.00 | 68 |
| T | 6,592,065 | 6.27 | 0.56 | 6.00 | 108 |
| All | 14,335,513 | 6.27 | 0.55 | 6.00 | 108 |
a Summarized by the type of nucleotide (A/T/G/C), regardless of the length of the homopolymeric tract repeats observed. b The average homopolymeric tract repeat length. c The standard deviation in the length of the repeat. d The minimum length of repeat. e The maximum length of repeat.
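The per-nucleotide summary in the table above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `find_hpts` and `summarize`, the minimum repeat length of 6 (chosen to match the Min column), the single-strand scan, and the use of the population standard deviation are all assumptions made for illustration.

```python
import re
from statistics import mean, pstdev

# Assumption: an HPT is a run of one nucleotide at least this long.
# The value 6 is taken from the "Min" column of the table, not from a stated method.
MIN_REPEAT = 6


def find_hpts(sequence, min_repeat=MIN_REPEAT):
    """Return (start, nucleotide, length) for each homopolymeric run.

    Only the given strand is scanned (assumption); start is 0-based.
    """
    # One base followed by at least (min_repeat - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeat - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(sequence.upper())
    ]


def summarize(hpts):
    """Compute count, mean, SD, min and max of repeat length per nucleotide."""
    lengths = {}
    for _, base, length in hpts:
        lengths.setdefault(base, []).append(length)
    lengths["All"] = [length for _, _, length in hpts]
    return {
        base: {
            "count": len(values),
            "mean": mean(values),
            "sd": pstdev(values),  # population SD; the source does not say which SD it used
            "min": min(values),
            "max": max(values),
        }
        for base, values in lengths.items()
        if values  # skip empty groups, e.g. "All" when no HPT was found
    }


if __name__ == "__main__":
    # Toy sequence, not real Escherichia data.
    toy = "ACGTAAAAAAGGCCCCCCCTTTTTTTTAC"
    for base, stats in summarize(find_hpts(toy)).items():
        print(base, stats)
```

Applied to whole genomes, the same per-nucleotide grouping would yield one row per base plus an "All" row, mirroring the layout of the table.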
Figure 1. Number of homopolymeric tract (HPT) repeats in various genomic features. The bar plot shows the percentage of the genomic features to which the detected HPTs belong. “ALL” denotes all HPTs with > 6 nucleotide repeats; the remaining row labels correspond to the number of nucleotide repeats in the HPT. The colors correspond to different genomic features as indicated.
Figure 2. Variation in repeat length in intragenic homopolymeric tracts (HPTs) across Escherichia strains. The cluster map shows genes with significantly variable intragenic HPTs in the genus Escherichia, clustered by their repeat lengths. The horizontal axis shows the gene (protein) name, and the vertical axis shows the corresponding NCBI Taxonomy ID. The cluster map of all observed intragenic HPTs is shown in Figure S4. The color of the heatmap corresponds to the length of the repeats, with yellow and dark blue indicating the shortest and longest repeat lengths, respectively.
Figure 3. Maximum likelihood phylogenetic tree of the genus Escherichia. The phylogenetic tree was constructed based on the core genes in all 140 strains of the genus Escherichia; the NCBI Taxonomy ID was replaced by the name of each strain. The distribution of the number of HPT repeats in the toxin B gene is shown in a heatmap. Blue and yellow denote the smallest and largest number of repeats, respectively. Gray indicates the loss of toxin B. The light green bar plot shows the frequency of occurrence of transcriptional regulatory genes in intergenic HPTs (%). The bar plot axis runs from 0 to 0.6% in 0.1% intervals.