Wolfgang Schüler1, Ignas Bunikis2, Jacqueline Weber-Lehman3, Pär Comstedt1, Sabrina Kutschan-Bunikis2, Gerold Stanek4, Jutta Huber3, Andreas Meinke1, Sven Bergström2, Urban Lundberg1.
Abstract
The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains are available, of which only two include the numerous plasmids. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe and responsible for a large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular), together presenting 1,309 open reading frames, of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons have been sequenced in their full length, including their telomeres. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as with that of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid-encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other, unrelated plasmids. In addition, we have identified a species-specific region in the circular plasmid cp26, which could be used for species determination. Several non-coding RNAs that have not previously been annotated in any of the published Borrelia genomes have been located on the B. afzelii K78 genome.
Year: 2015 PMID: 25798594 PMCID: PMC4370689 DOI: 10.1371/journal.pone.0120548
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of the sequence types of B. afzelii strains according to multi locus sequence typing (MLST), and ospA and ospC typing.
| Organism | MLST | ospA | ospC |
|---|---|---|---|
| Baf_K78 | ST335 | 3 (2) | A5 |
| Baf_ACA-1 | - | 3 (2) | A1 |
| Baf_PKo | ST71 | 1 (2) | A2 |
| Baf_HLJ01 | ST106 | - | - |
| Bbu_B31 | ST1 | 9 (1) | B4 |
MLST typing, according to the system described by Margos et al. [54] comprising 592 defined profiles, assigned K78 to sequence type ST335, which is identical to that of the Italian strains 0600839I and 05001891I in the Borrelia MLST database [55]. No match for the sequence profiles of ACA-1 and Tom3107 was found in the MLST database. Column ospA lists the ospA sequence type from the MLSA database [56] and, in parentheses, the OspA serotype. The nearest hit for Tom3107 is ospA sequence type 3 with a 1 bp mismatch. The ospC classification follows the scheme of Seinost et al. [57] and Lagal et al. [58]. Tom3107 does not fall into any invasive group. For strain HLJ01 only the chromosome sequence is available.
Comparison of the replicons found in Borrelia afzelii K78 to the published sequences of B. afzelii strains ACA-1, PKo and B. burgdorferi strain B31.
| Plasmid type | K78 | ACA-1 | PKo | B31 |
|---|---|---|---|---|
| circular | 5 | 5 | 8 | 9 |
| linear | 8 | 9 | 9 | 12 |
a Accession numbers (GenBank, RefSeq) are listed in S3 Table.
b Another cp9 plasmid has been described for B31 which is named cp9–2 (renaming the listed to cp9–1) [65]
c The attribution to code “Q”, the designation for cp32–10, was made via the presence of the respective plasmid partitioning protein type of paralogous family 32 (PFam32). The linear plasmid lp56 in B31 is longer and contains parts analogous to the cp32–10 type plasmids; this plasmid has therefore been proposed to be attributed to code “Q” [46]. Linear plasmids lp32–10, as seen in PKo and ACA-1, carry a PFam32 gene similar to that of cp32–10 and therefore also get the code “Q” despite carrying different gene content.
d Data from an earlier PKo genome project are available (chromosome length 905.4 kbp, GenBank CP000395), with an apparent insert of two genes (BAPKO_0393, BAPKO_0395) and a full definition of the 3’-terminal arcB gene (truncated in the listed chromosome).
e Two more plasmids, cp32–2, which has identical PFam32 and PFam49 genes as cp32–7, and cp32–5 have been described in [66] but have not yet been sequenced in full length.
Comparison of the K78 chromosome to representative chromosomes within Borrelia.
| Organism | Length (bp) | GC (%) | Identity (%) | Indel content (%) |
|---|---|---|---|---|
| B. afzelii K78 | 905,949 | 28.3 | 100 (ref) | 0 (ref) |
| B. afzelii ACA-1 | >903,516 | 28.3 | 99.4 | 0.3 |
| | 903,609 | 28.3 | 99.5 | 0.3 |
| | 905,471 | 28.3 | 99.4 | 0.1 |
| | 905,861 | 28.3 | 99.4 | 0.1 |
| | 910,724 | 28.6 | 91.1 | 1.7 |
| | 904,246 | 28.3 | 92.7 | 0.8 |
| | >902,096 | 28.3 | 92.4 | 1.1 |
| | 905,534 | 28.4 | 92.9 | 0.7 |
| | 902,789 | 28.4 | 92.6 | 1.0 |
a Sequence identities and indel contents were calculated with stretcher (EMBOSS package [44]).
b Sum of two unconnected non-overlapping contigs (436,767 + 466,749 bp).
c Unfinished assembly (5 contigs, of which the shortest, with 1,774 bp length, has been left out of the comparative analysis).
*Approximate values due to incompleteness of the chromosome assemblies.
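The identity and indel columns are summary statistics of a pairwise global alignment. The following is a minimal sketch of how such values could be derived from two already aligned, gapped sequences. The exact definitions used by the authors are not given here, so the choice of denominator (all alignment columns) is an assumption, and the function name `alignment_stats` and the toy sequences are purely illustrative. It also reproduces the contig sum from footnote b.

```python
def alignment_stats(aligned_a: str, aligned_b: str, gap: str = "-") -> tuple[float, float]:
    """Return (identity %, indel %) for two equal-length gapped sequences.

    Assumption: both percentages are taken over the total number of
    alignment columns; the source does not state the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")

    columns = len(aligned_a)
    matches = 0
    indels = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            indels += 1      # column with a gap in either sequence
        elif x == y:
            matches += 1     # identical residues
    return 100 * matches / columns, 100 * indels / columns


# Toy alignment (hypothetical sequences, not Borrelia data).
toy_identity, toy_indel = alignment_stats("ATGC-ATTAG", "ATGCTATAAG")

# Footnote b: the lower bound reported for the two-contig assembly.
two_contig_min_length = 436_767 + 466_749
```

For real chromosomes the aligned strings would come from the alignment program's output rather than being typed by hand.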
Fig 1. Chromosomal region of 5S-23S rRNA and 16S rRNA for the B. afzelii strains K78, ACA-1, PKo, Tom3107, HLJ01 and for B. burgdorferi B31.
The rRNAs (marked red) are presented with transcription from right to left as located on the chromosome, and are composed of two copies of 16S rRNA, separated by tRNA-Ala. A tRNA-Ile (transcribed left to right) precedes the tandem repeats of the 23S-5S cassette. In many cases one of the 16S copies has undergone degeneration. In the case of ACA-1, the two contigs constituting the chromosome are separated at the position where the second 23S rRNA copy is expected (vertical red line), so the presence or absence of the second copy of 23S rRNA could not be determined because of the lower sequencing quality in this region. There is a high sequence homology among the four B. afzelii strains (except for the second copy of 23S rRNA of ACA-1), in contrast to the sequences of the B. burgdorferi B31 rRNAs. The similarity score plots of the Mauve alignments use the backbone color scheme [43], which shows overall similarity in a mauve color or clustering blocks among cluster members in the same color.
Functional classification of the B. afzelii K78 annotated genome, describing a total of 1,309 proteins.
| Chromosome (n = 813) | Plasmids (n = 496) | Functional category (COG) |
|---|---|---|
| 27 | 4 | Amino acid transport and metabolism |
| 49 | 4 | Carbohydrate transport and metabolism |
| 14 | 14 | Cell division and chromosome partitioning |
| 54 | 1 | Cell envelope biogenesis, outer membrane |
| 52 | 0 | Cell motility and secretion |
| 12 | 1 | Coenzyme metabolism |
| 9 | 4 | Defense mechanisms |
| 51 | 7 | DNA replication, recombination, and repair |
| 22 | 1 | Energy production and conversion |
| 66 | 8 | General function prediction only |
| 22 | 1 | Inorganic ion transport and metabolism |
| 32 | 0 | Intracellular trafficking and secretion |
| 15 | 0 | Lipid metabolism |
| 20 | 7 | Nucleotide transport and metabolism |
| 32 | 0 | Posttranslational modification, protein turnover, chaperones |
| 1 | 1 | Secondary metabolites biosynthesis, transport, and catabolism |
| 30 | 0 | Signal transduction mechanisms |
| 23 | 2 | Transcription |
| 118 | 0 | Translation, ribosomal structure and biogenesis |
| 42 | 12 | Function unknown |
| 173 | 429 | Unclassified in COG |
The best hits per category from rpsblast against COG with an E-value cutoff of 0.01 are counted. Proteins whose best hit falls into more than one category are counted once in each category, which adds 51 hits and gives a total of 758 hits to defined COGs.
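The totals in the footnote can be checked against the table itself. The sketch below simply re-adds the counts listed above; the variable names are illustrative, and nothing beyond the tabulated numbers is assumed.

```python
# (chromosome, plasmid) hit counts per COG category, copied from the table.
cog_counts = {
    "Amino acid transport and metabolism": (27, 4),
    "Carbohydrate transport and metabolism": (49, 4),
    "Cell division and chromosome partitioning": (14, 14),
    "Cell envelope biogenesis, outer membrane": (54, 1),
    "Cell motility and secretion": (52, 0),
    "Coenzyme metabolism": (12, 1),
    "Defense mechanisms": (9, 4),
    "DNA replication, recombination, and repair": (51, 7),
    "Energy production and conversion": (22, 1),
    "General function prediction only": (66, 8),
    "Inorganic ion transport and metabolism": (22, 1),
    "Intracellular trafficking and secretion": (32, 0),
    "Lipid metabolism": (15, 0),
    "Nucleotide transport and metabolism": (20, 7),
    "Posttranslational modification, protein turnover, chaperones": (32, 0),
    "Secondary metabolites biosynthesis, transport, and catabolism": (1, 1),
    "Signal transduction mechanisms": (30, 0),
    "Transcription": (23, 2),
    "Translation, ribosomal structure and biogenesis": (118, 0),
    "Function unknown": (42, 12),
}
unclassified = (173, 429)      # "Unclassified in COG" row
total_proteins = (813, 496)    # n given in the column headers

hits_chromosome = sum(chrom for chrom, _ in cog_counts.values())
hits_plasmids = sum(plas for _, plas in cog_counts.values())
total_hits = hits_chromosome + hits_plasmids

# Proteins with at least one COG hit = all proteins minus unclassified ones.
proteins_with_hit = sum(total_proteins) - sum(unclassified)

# Hits beyond one per protein come from multi-category best hits.
extra_hits = total_hits - proteins_with_hit

print(total_hits, proteins_with_hit, extra_hits)
```

The printed values agree with the 758 hits and the 51 additional hits stated in the footnote.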
Number of predicted membrane proteins in four B. afzelii strains and B. burgdorferi B31.
| Genome | Lipoproteins (chromosome) | Lipoproteins (plasmids) | Signal peptides (chromosome) | Signal peptides (plasmids) | Transmembrane helices (chromosome) | Transmembrane helices (plasmids) |
|---|---|---|---|---|---|---|
| Baf_K78 | 31 | 74 | 98 | 30 | 191 | 48 |
| Baf_ACA-1 | 28 | 72 | 91 | 40 | 190 | 48 |
| Baf_PKo | 31 | 85 | 89 | 38 | 192 | 53 |
| Baf_HLJ01 | 27 | - | 89 | - | 197 | - |
| Bbu_B31 | 36 | 74 | 87 | 42 | 179 | 58 |
a Lipoprotein predictions (SpLip). Given counts are “probable” and “possible” hits combined.
b Signal peptide predictions (SignalP) were not counted when SpLip predicted a lipidation signal for the protein.
c A prediction of a single transmembrane helix (TMHMM) was not counted as such when it was located within the N-terminal 60 amino acids and SignalP predicted a signal peptide or SpLip a lipidation site.
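The three footnotes describe a precedence between the prediction tools. A minimal sketch of these counting rules follows. The `Prediction` record, the category names and the reading of “within the N-terminal 60 amino acids” as “helix ends at or before residue 60” are assumptions made for illustration, as is the choice to let a protein count in both the signal peptide and the transmembrane column when no rule excludes it.

```python
from dataclasses import dataclass, field

N_TERMINAL_WINDOW = 60  # residues, from footnote c


@dataclass
class Prediction:
    """Hypothetical per-protein summary of the three tool outputs."""
    lipoprotein: bool        # SpLip "probable" or "possible" hit
    signal_peptide: bool     # SignalP hit
    # TMHMM helices as 1-based (start, end) residue positions
    tm_helices: list[tuple[int, int]] = field(default_factory=list)


def counted_categories(p: Prediction) -> set[str]:
    """Apply footnotes a-c and return the columns a protein counts toward."""
    categories = set()

    # Footnotes a and b: a lipidation signal takes precedence over SignalP.
    if p.lipoprotein:
        categories.add("lipoprotein")
    elif p.signal_peptide:
        categories.add("signal_peptide")

    # Footnote c: drop a lone N-terminal helix that is better explained
    # by a signal peptide or lipidation signal.
    helices = p.tm_helices
    if len(helices) == 1:
        _, end = helices[0]
        if end <= N_TERMINAL_WINDOW and (p.signal_peptide or p.lipoprotein):
            helices = []
    if helices:
        categories.add("transmembrane")

    return categories


# A lone N-terminal helix on a SignalP-positive protein counts once only.
example = counted_categories(Prediction(False, True, [(5, 27)]))
```

Under these assumptions `example` contains only the signal peptide category, because the single helix at residues 5 to 27 is discarded by the third rule.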
Fig 2. Alignment of B. afzelii K78 ospC sequence against the sequences of B. afzelii strains from public databases.
A non-redundant set of partial ospC sequences corresponding to BAFK78_B0019 bp 97–583, comprising 59 B. afzelii strains and the sequence of B. burgdorferi B31 as external root reference, was included in the analyses. A: Maximum likelihood tree representation, re-rooted with B. burgdorferi B31 as outgroup. Clusters containing strains attributed to human infectivity are boxed, of which the previously identified groups are labelled A1–A8. The strains compared in this study are highlighted in blue. B: A recombination network representation is shown for the sequences in an unrooted distance phylogram. The pairwise homoplasy index test for the B. afzelii sequences (p = 7.8 × 10⁻¹⁵) indicates significance for the presence of recombination events. The strains compared in this study are highlighted by a yellow background.
Fig 3. Telomere types of the linear replicons.
Alignment of the telomeres of the linear replicons in B. afzelii K78 is shown. The sequences are oriented such that their hairpin bend would be positioned on their left side. The typing corresponds to the classification of the telomere types 1–3 according to the spacing between Box 1 (yellow) and Box 3 (blue) or the absence of Box 1 [68, 99, 101]. For five of the sequences the leftmost residue could not be determined and is represented by a “-” as placeholder. Box 1 and Box 3 correspond to previously annotated regions of conservation which are assumed to be directly (Box 3) or indirectly (Box 1) involved in interaction with the telomere resolvase ResT [100, 102]. No telomere data could be obtained for K78 lp28, and the telomeres of lp54L and lp38R are identical. In Box 1 two different sequences, TAT(A/T)AT, are present, as in B31. Unlike in B31, where the TATTAT sequence is exclusively found in type 2 telomeres, this sequence is also found in type 1 telomeres of K78. lp28–1L of K78 is an exception because it is compatible with the definitions of both type 1 and type 2 telomeres, as also seen for lp28–3R of B31 and lp28–2R of PKo. Of the 16 telomeres, 6 have substitutions in Box 3 (5 with one change and 1 with two changes, marked green).
Fig 4. Species-specific variation of the intergenic variable region in the circular plasmid cp26.
The variable sequence segment in the cp26 plasmids of K78 is compared with 25 Borrelia strains. A maximum likelihood tree rooted with a B. bissettii sequence as outgroup, shows the relationship for the intergenic part, which in B. afzelii K78 is situated between BAFK78_A0014 and BAFK78_A0015 (bacterial extracellular solute-binding protein), and which shows a species-specific length. Insertions and deletions within this region have been analyzed with the program Mauve, and the compositional analysis for 16 of the 26 sequences (underlined in the tree view) is shown with related segments marked by color and/or boxes together with a similarity score diagram for each sequence. Blue bars denote segments which, in some strains, have been annotated as short hypothetical proteins (the number of assigned proteins in this region is indicated in parentheses in the tree view).