YaDong Wang, Christopher Chandler.
Abstract
The bacterial genus Rickettsiella belongs to the order Legionellales in the Gammaproteobacteria, and consists of several described species and pathotypes, most of which are considered to be intracellular pathogens infecting arthropods. Two members of this genus, R. grylli and R. isopodorum, are known to infect terrestrial isopod crustaceans. In this study, we assembled a draft genomic sequence for R. isopodorum, and performed a comparative genomic analysis with R. grylli. We found evidence for several candidate genomic island regions in R. isopodorum, none of which appear in the previously available R. grylli genome sequence. Furthermore, one of these genomic island candidates in R. isopodorum contained a gene that encodes a cytotoxin partially homologous to those found in Photorhabdus luminescens and Xenorhabdus nematophilus (Enterobacteriaceae), suggesting that horizontal gene transfer may have played a role in the evolution of pathogenicity in Rickettsiella. These results lay the groundwork for future studies on the mechanisms underlying pathogenesis in R. isopodorum, and this system may provide a good model for studying the evolution of host-microbe interactions in nature.
Keywords: Genomic islands; Rickettsiella; Trachelipus rathkei; mcf2
Year: 2016 PMID: 28028472 PMCID: PMC5181103 DOI: 10.7717/peerj.2806
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Rickettsiella contigs in initial assembly.
Initial assembly of all Trachelipus rathkei sequencing data seems to contain contigs coming from two distinct Rickettsiella lineages. (A) One set of contigs displays high similarity to R. grylli NZ_AAQJ00000000 (red dots), while the contigs in the other set are more divergent (blue dots), but contigs in both sets seem to span the entire length of the R. grylli genome. (B) The two sets of contigs also form two distinct clusters based on sequencing depth in the two T. rathkei samples; the high similarity contigs are present at moderate depth in the male sample and very low depth in the female sample, while the low similarity contigs are present at very high depth in the female sample and low depth in the male sample; only a small number of contigs seem to be mis-classified.
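The depth-based separation described in panel (B) can be illustrated with a minimal sketch. This is not the authors' procedure: the log-ratio rule, the pseudocount, the contig names, and the depth values below are all hypothetical, chosen only to show how contigs with opposite depth patterns in the two samples fall into two groups.

```python
from math import log2

def classify_by_depth(depth_male, depth_female, pseudocount=0.5):
    """Label a contig by the sample in which it is sequenced more deeply.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for contigs with no reads in one sample.
    """
    ratio = log2((depth_male + pseudocount) / (depth_female + pseudocount))
    return "male-enriched" if ratio > 0 else "female-enriched"

# Hypothetical (male depth, female depth) pairs, for illustration only.
contigs = {
    "contig_a": (35.0, 2.0),    # moderate in male, very low in female
    "contig_b": (8.0, 410.0),   # low in male, very high in female
}

labels = {name: classify_by_depth(m, f) for name, (m, f) in contigs.items()}
for name, label in labels.items():
    print(name, label)
```

A contig whose label disagrees with its sequence similarity to the reference would correspond to the small number of apparently mis-classified contigs mentioned in the caption.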
Assembly statistics for Rickettsiella grylli and Rickettsiella isopodorum genomes assembled from Trachelipus rathkei sequencing data.
The pseudogene count includes incomplete or partial gene sequences, some of which may be functional genes falsely predicted to be pseudogenes because they are fragmented or only partially assembled; the number in parentheses is the number of predicted pseudogenes with frameshift mutations or internal stop codons, excluding partially assembled genes.
| Statistic | R. grylli | R. isopodorum |
|---|---|---|
| Number of contigs | 851 | 198 |
| Total length (bp) | 1,369,903 | 1,509,158 |
| N50 (bp) | 5,907 | 384,641 |
| Longest contig (bp) | 33,090 | 583,532 |
| GC content (%) | 38.32 | 37.06 |
| Total genes | 1,454 | 1,330 |
| Pseudogenes | 337 (9) | 39 (3) |
| rRNAs | 1 complete 5S, 2 partial 16S, 2 partial 23S | 1 complete 16S, 1 complete 23S |
| tRNAs | 34 | 40 |
| ncRNAs | 4 | 4 |
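For readers unfamiliar with the N50 and GC content rows, the following sketch shows how these two statistics are conventionally computed. It is a generic illustration and not the pipeline used to produce the table; the contig lengths and the sequence are toy values.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together account for at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")

def gc_percent(seq):
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0

# Toy values for illustration only (total length 30, so half is 15).
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))
print(gc_percent("ATGC"))
```

Because N50 depends on how fragmented an assembly is, two assemblies of similar total length can differ greatly in N50 when one has many more contigs than the other.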
Figure 2. Phylogenies.
(A) Phylogeny based on ftsY, gidA, rpsA, and sucB sequences from the two Rickettsiella genomes assembled from Trachelipus rathkei, R. grylli NZ_AAQJ00000000, D. massiliensis, and other Rickettsiella sequences from other phylogenetic studies. Phylogenies were generated in MEGA7 using maximum likelihood with the Tamura-Nei model, and node support was estimated using bootstrapping with 100 replicates. (B) Phylogeny based on gidA sequences from the same samples as (A), with the addition of several other isopod samples from upstate New York in which gidA sequences were obtained by PCR and Sanger sequencing.
Figure 3. Synteny and genomic islands.
Dot plots showing synteny between Rickettsiella isopodorum and (A) R. grylli and (B) Diplorickettsia massiliensis. Light gray lines indicate borders between contigs in each assembly. Vertical pink bars indicate candidate genomic island regions, i.e., sequences in R. isopodorum that have no matches in blastn or tblastx searches against each reference genome.
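The idea of a candidate island as a stretch of query sequence with no BLAST match can be sketched as an interval computation. The code below is a simplified illustration under stated assumptions: hit coordinates are 1-based and inclusive, hits from blastn and tblastx have already been pooled into one list, and the `min_size` threshold is hypothetical. The record does not state the authors' exact windowing or size cutoff.

```python
def uncovered_regions(contig_length, hits, min_size=1):
    """Return 1-based inclusive (start, end) intervals of a contig that are
    not covered by any hit, keeping only those of at least min_size bp."""
    gaps = []
    next_free = 1  # first position not yet known to be covered
    for start, end in sorted(hits):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= contig_length:
        gaps.append((next_free, contig_length))
    return [(s, e) for s, e in gaps if e - s + 1 >= min_size]

# Hypothetical hit coordinates on a 1,000 bp contig, for illustration only.
hits = [(1, 100), (90, 200), (501, 800)]
print(uncovered_regions(1000, hits, min_size=150))
```

Running the same computation against each reference genome and intersecting the results would give regions with no match in either reference.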
Candidate genomic island regions in R. isopodorum.
% Cov.: percentage of query sequence that aligns to the best BLAST hit; % Id.: percentage of aligned amino acids that are identical to best BLAST hit; % Pos.: percentage of positive-scoring amino acids in aligned region of best BLAST hit.
| Predicted location | Size (bp) | GC (%) | Predicted genes & functional notes | Size (aa) | % Cov. | % Id. | % Pos. |
|---|---|---|---|---|---|---|---|
| contig_191: 18,201–21,800 | 3,600 | 27.3 | A1D18_00540: no apparent homology | 209 | n/a | n/a | n/a |
| | | | A1D18_00545: no apparent homology | 105 | n/a | n/a | n/a |
| | | | A1D18_00550: no apparent homology | 61 | n/a | n/a | n/a |
| | | | A1D18_00555: no apparent homology | 103 | n/a | n/a | n/a |
| | | | A1D18_00560: portion matches hypothetical protein in | 274 | 50 | 35 | 53 |
| contig_193: 31,601–36,400 | 4,800 | 28.0 | A1D18_00930: no apparent homology | 329 | n/a | n/a | n/a |
| | | | A1D18_00935: no apparent homology | 428 | n/a | n/a | n/a |
| | | | A1D18_00940: no apparent homology | 448 | n/a | n/a | n/a |
| | | | A1D18_00945: no apparent homology | 149 | n/a | n/a | n/a |
| contig_196: 42,801–48,200 | 5,400 | 31.0 | A1D18_01965: matches hypothetical protein from a nucleopolyhedrovirus from Lepidoptera (e = 7e−27); also matches hypothetical protein and surface-related protein entries from | 1,213 | 40 | 25 | 41 |
| contig_196: 85,801–89,600 | 3,800 | 37.9 | A1D18_02135: has matches in | 1,862 | 98 | 40 | 58 |
| contig_197: 50,601–54,600 | 4,000 | 33.5 | A1D18_02420: matches permease from Yersinia (e = 0) | 452 | 99 | 75 | 85 |
| | | | A1D18_02425: matches a hypothetical protein in | 412 | 85 | 26 | 44 |
| | | | A1D18_02430: no apparent homology | 255 | n/a | n/a | n/a |
| contig_197: 72,601–83,600 | 11,000 | 35.5 | A1D18_02500: part of gene is outside of predicted island; has partial match to a hypothetical protein in | 545 | 62 | 38 | 51 |
| | | | A1D18_02505: contains a predicted RING domain, a type of zinc finger domain implicated in many functions, and a U-box domain, implicated in ubiquitination; matches are in eukaryotes, not prokaryotes | 313 | 18 | 34 | 50 |
| | | | A1D18_02510: Matches | 2,928 | 33 | 27 | 47 |
| contig_197: 278,601–285,600 | 7,000 | 33.6 | A1D18_03425: has three domains common to thiamine pyrophosphate enzymes; top hit is an uncultured bacterium, but secondary matches in | 628 | 95 | 69 | 80 |
| | | | A1D18_03430: contains predicted phosphate binding domain, aldolase domain; matches uncultured bacterium, | 335 | 99 | 73 | 88 |
| | | | A1D18_03435: matches predicted acetaldehyde dehydrogenase enzymes from same taxa as A1D18_03430 | 294 | 96 | 67 | 80 |
| | | | A1D18_03440: matches predicted dolichol phosphate mannose synthase enzymes from | 311 | 99 | 61 | 78 |
| | | | A1D18_03445: matches UDP-glucuronate decarboxylase enzymes from | 355 | 97 | 60 | 76 |
| | | | A1D18_03450: matches polysaccharide biosynthetase (synthesizes cell surface polysaccharides) from | 147 | 76 | 46 | 67 |
| | | | A1D18_03455: matches pyridoxal phosphate (PLP)-dependent aspartate aminotransferase superfamily proteins from | 400 | 100 | 71 | 84 |
| contig_198: 245,601–250,400 | 4,800 | 37.3 | None predicted by annotation software; however, tblastn searches of this sequence show two portions matching glycosyltransferase enzymes from | n/a | n/a | n/a | n/a |
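The three percentage columns defined in the table legend can be derived from standard fields of a BLAST+ tabular report. The sketch below assumes the fields `qlen`, `qstart`, `qend`, `length`, `nident`, and `positive` have been requested through `-outfmt`; the record does not say whether the authors computed the values this way, and the numbers in the example are hypothetical.

```python
def blast_percentages(qlen, qstart, qend, length, nident, positive):
    """Return (% Cov., % Id., % Pos.) for a single best hit.

    % Cov.: share of the query that falls inside the aligned region.
    % Id.:  identical residues as a share of the alignment length.
    % Pos.: positive-scoring residues as a share of the alignment length.
    """
    cov = 100.0 * (qend - qstart + 1) / qlen
    ident = 100.0 * nident / length
    pos = 100.0 * positive / length
    return round(cov), round(ident), round(pos)

# Hypothetical hit: a 400 aa query aligned over residues 21-320.
print(blast_percentages(qlen=400, qstart=21, qend=320,
                        length=300, nident=120, positive=180))
```

Note that a hit can have high identity but low coverage, as with partial matches in the table, so the three columns are best read together.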
Figure 4. Mcf phylogeny.
Phylogenetic tree showing inferred relationships among Mcf-like genes, obtained via maximum likelihood using the JTT matrix-based model. Numbers indicate bootstrap support using 100 replicates. No outgroup was specified in this analysis; instead, the tree was rooted at the longest branch.
Candidate horizontally transferred genes identified by HGTector.
| Gene | Closest match | Functional notes | Size (aa) | % Cov. | % Id. | % Pos. |
|---|---|---|---|---|---|---|
| A1D18_00810 | | Outer membrane autotransporter; contains an autotransporter and a pertactin-like passenger domain; proteins in this family are usually virulence factors | 932 | 64 | 43 | 60 |
| A1D18_02160 | | NAD/FAD binding protein | 236 | 92 | 53 | 70 |
| A1D18_03435 | | acetaldehyde dehydrogenase; also identified in genomic island analysis | 294 | 96 | 67 | 80 |
| A1D18_03465 | | glucose-1-phosphate cytidylyltransferase | 268 | 95 | 71 | 85 |
| A1D18_03485 | | rhamnosyltransferase | 313 | 95 | 30 | 49 |
| A1D18_03490 | | rhamnosyltransferase | 289 | 93 | 30 | 52 |
| A1D18_03505 | | glycosyltransferase | 268 | 98 | 36 | 58 |
| A1D18_05025 | | glycosyltransferase | 409 | 96 | 38 | 63 |
| A1D18_05065 | | N-acetyltransferase | 146 | 93 | 36 | 57 |
| A1D18_06515 | | aquaporin | 230 | 99 | 69 | 79 |
| A1D18_06535 | | matches hypothetical proteins; contains Permuted papain-like amidase enzyme | 256 | 93 | 53 | 76 |