Andrew J Page, Emma V Ainsworth, Gemma C Langridge.
Abstract
Rearrangements of large genome fragments occur in bacteria between repeat sequences and can affect growth and gene expression. Homologous recombination enables these structural changes, resulting in inversion between indirect repeats and excision/translocation between direct repeats. One form of rearrangement occurs around ribosomal operons, which are found in multiple copies across many bacteria, but identifying these rearrangements by sequencing requires reads of several thousand bases to span the ribosomal operons. With long-read sequencing aiding the routine generation of complete bacterial assemblies, we have developed socru, a typing method for the order and orientation of genome fragments between ribosomal operons. It allows a single identifier to convey the order and orientation of genome-level structure, and we have successfully applied this typing to 433 of the most common bacterial species. In a focused analysis, we observed the presence of multiple structural genotypes in nine bacterial pathogens, underscoring the importance of routinely assessing this form of variation alongside traditional single-nucleotide polymorphism (SNP) typing.
Keywords: bacteria; genome structure; rearrangement; sequencing
Year: 2020 PMID: 32584752 PMCID: PMC7478630 DOI: 10.1099/mgen.0.000396
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Fig. 1. Structural genotype assignment. Coloured segments denote genome fragments, located between ribosomal operons marked as chevrons. Origin of replication (location of dnaA) is denoted with a dashed line and terminus (dif site) is denoted with a solid line. (a) Baseline references for , and , indicating genome fragments running in clockwise numerical order from 1. Chevron directions indicate the orientation of ribosomal operons. The fragments harbouring the origin and terminus of replication are bordered by indirect repeats and all other fragments are bordered by direct repeats. (b) The pattern 1′43′2′ is a mirror of pattern 1234′ (flipped across the vertical dashed line). However, since dnaA is present on fragment 4, this fragment will always be aligned with the baseline in the forward orientation. (c) There are three valid orders in a four-fragment genome (accounting for mirroring). (d) Impact of independent inversions of fragments on orientation. GS1.0, no inversions; GS1.1, inversion of terminal fragment; GS1.7, inversion of origin fragment [represented as per mirror rule in (b)]; GS1.6, inversion of both terminal and origin fragments (as per mirror rule). (e) The assigned structural genotype is invalid – the orientation of the ribosomal operons flanking fragment 2 violates the rule that operons must be oriented from the origin to the terminus of replication. This would be flagged by socru as a ‘red’ assignment denoting structure invalidity, which is indicative of potential misassembly.
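The mirror rule in panel (b) can be illustrated with a short sketch. This is not socru's code: the function names, the single-digit fragment labels, and the choice to print every pattern starting at fragment 1 are assumptions made for illustration only. Mirroring is modelled here as reading the circular pattern in the opposite direction, which reverses the fragment order and flips every orientation.

```python
from typing import List, Tuple

PRIME = "\u2032"  # the prime mark used in the caption for an inverted fragment
Fragment = Tuple[int, bool]  # (fragment number, is_inverted)


def parse(pattern: str) -> List[Fragment]:
    """Parse a pattern such as 1234' into (number, inverted) pairs.

    Assumes single-digit fragment labels; accepts ' or the prime character.
    """
    frags: List[Fragment] = []
    for ch in pattern:
        if ch.isdigit():
            frags.append((int(ch), False))
        elif ch in ("'", PRIME) and frags:
            frags[-1] = (frags[-1][0], True)
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return frags


def fmt(frags: List[Fragment]) -> str:
    """Render fragments back into the caption's notation."""
    return "".join(str(n) + (PRIME if inv else "") for n, inv in frags)


def rotate_to(frags: List[Fragment], first: int) -> List[Fragment]:
    """Rotate the circular pattern so that it starts at fragment `first`."""
    idx = [n for n, _ in frags].index(first)
    return frags[idx:] + frags[:idx]


def mirror(frags: List[Fragment]) -> List[Fragment]:
    """Read the circle the other way: reverse order, flip every orientation."""
    flipped = [(n, not inv) for n, inv in reversed(frags)]
    return rotate_to(flipped, 1)


def canonical(frags: List[Fragment], origin_fragment: int) -> List[Fragment]:
    """Pick the representation in which the dnaA fragment is forward."""
    origin_inverted = dict(frags)[origin_fragment]
    return mirror(frags) if origin_inverted else rotate_to(frags, 1)


if __name__ == "__main__":
    # Fragment 4 carries dnaA in the caption's example.
    print(fmt(canonical(parse("1234'"), origin_fragment=4)))
```

Under these assumptions, the pattern 1234′ with dnaA on fragment 4 is rewritten as its mirror, 1′43′2′, which matches the pair of patterns given in panel (b).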
Structural variation in bacterial pathogens
| Pathogen (baseline) | Baseline no. of fragments (total possible combinations) | No. of complete RefSeq genomes | No. of observed arrangements | Main GS type | % with main GS type | No. of likely misassemblies in database |
|---|---|---|---|---|---|---|
| E. faecium (DO) | 6 (240) | 116 | 5 | GS1.0 | 69 % | 12 |
| S. aureus (RF122) | 5 (48) | 408 | 10 | GS2.0* | 59 % | 29 |
| K. pneumoniae (NTUH-K2044) | 8 (10 080) | 350 | 7 | GS1.0 | 98 % | 21 |
| A. baumannii (ACICU) | 6 (240) | 148 | 6 | GS1.0 | 72 % | 6 |
| P. aeruginosa (PAO1) | 4 (12) | 185 | 6 | GS1.1 | 88 % | 13 |
| Enterobacter spp. | 7 or 8 (up to 10 080) | 88 | 5 | GS1.0 | 63 % | 16 |
| E. coli (K12 MG1655) | 7 (1440) | 838 | 13 | GS1.0 | 90 % | 62 |
| L. monocytogenes (4b_F2365) | 6 (240) | 176 | 5 | GS1.0 | 91 % | 56 |
| S. enterica (LT2) | 7 (1440) | 726 | 29 | GS1.0 | 79 % | 25 |
| Non-S. Typhi | | | | GS1.0 | 94 % | 18 |
| S. Typhi | | | | GS2.67 | 66 % | 7 |
Baseline genome accessions: E. faecium DO GCF_000174395.2, S. aureus RF122 GCF_000009005.1, K. pneumoniae NTUH-K2044 GCF_000009885.1, A. baumannii ACICU GCF_000018445.1, P. aeruginosa PAO1 GCF_000006765.1, E. coli K12 MG1655 GCF_000005845.2, L. monocytogenes 4b_F2365 GCF_000008285.1, S. enterica LT2 GCF_000006945.2. Enterobacter spp. comprised the following species and baselines: seven fragments, Enterobacter sp. 638 GCF_000016325.1; eight fragments, E. asburiae L1 GCF_000632395.1, E. cloacae ATCC13047 GCF_000025565.1, E. hormaechei ECNIH3 GCF_000750225.1 and E. roggenkampii 35734 GCF_000807415.2.
*S. aureus GS2.0 harbours six fragments, whereas GS1.0 (36 %) harbours five. S. enterica is subdivided to show the structural genotypes found in S. enterica subspecies enterica serovar Typhi (S. Typhi) versus the remainder of S. enterica.
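The "total possible combinations" in the table are consistent with a simple count that can be read off Fig. 1, sketched below. The formula is an inference from the caption and the tabulated values, not one stated in the text. It assumes that the number of valid orders is (n − 1)!/2 for n fragments (circular orders with mirror images merged, giving three for a four-fragment genome as in panel c) and that only the origin and terminus fragments invert independently, giving four orientation states (panel d). The function name `total_arrangements` is hypothetical.

```python
from math import factorial


def total_arrangements(n_fragments: int) -> int:
    """Count order/orientation combinations for a genome with n fragments.

    Assumed model (inferred, not taken from socru itself):
      - orders: circular permutations with mirrors merged, (n - 1)! / 2
      - orientations: origin and terminus fragments each forward or
        inverted, so 2 * 2 = 4 states
    """
    if n_fragments < 3:
        raise ValueError("the count is only defined here for 3 or more fragments")
    orders = factorial(n_fragments - 1) // 2  # exact: (n - 1)! is even for n >= 3
    orientations = 4
    return orders * orientations


if __name__ == "__main__":
    for n in (4, 5, 6, 7, 8):
        print(n, total_arrangements(n))
```

For 4, 5, 6, 7 and 8 fragments this gives 12, 48, 240, 1440 and 10 080, the same totals that appear in the second column of the table.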