
Environmental Metagenomic Assemblies Reveal Seven New Highly Divergent Chlamydial Lineages and Hallmarks of a Conserved Intracellular Lifestyle.

Trestan Pillonel, Claire Bertelli, Gilbert Greub.

Abstract

The Chlamydiae phylum exclusively encompasses bacteria sharing a similar obligate intracellular life cycle. Existing 16S rDNA data support a high diversity within the phylum; however, genomic data remain scarce owing to the difficulty in isolating strains using culture systems with eukaryotic cells. Yet, Chlamydiae genome data extracted from large-scale metagenomic studies might help fill this gap. This work compares 33 cultured and 27 environmental, uncultured chlamydial genomes, in order to clarify the phylogenetic relatedness of the new chlamydial clades and to investigate the genetic diversity of the Chlamydiae phylum. The analysis of published chlamydial genomes from metagenomic bins and single-cell sequencing allowed the identification of seven new deeply branching chlamydial clades sharing genetic hallmarks of parasitic Chlamydiae. Comparative genomics suggests important biological differences between those clades, including loss of many proteins involved in cell division in the genus Similichlamydia, and loss of respiratory chain and tricarboxylic acid cycle in several species. Comparative analyses of chlamydial genomes with two proteobacterial orders, the Rhizobiales and the Rickettsiales, showed that genomes of different Rhizobiales families are much more similar than genomes of different Rickettsiales families. On the other hand, the chlamydial 16S rRNAs exhibit a higher sequence conservation than their Rickettsiales counterparts, while chlamydial proteins exhibit increased sequence divergence. Studying the diversity and genome plasticity of the entire Chlamydiae phylum is of major interest to better understand the emergence and evolution of this ubiquitous and ancient clade of obligate intracellular bacteria.

Keywords:  Chlamydiae; bacterial taxonomy; chlamydial metabolism; comparative genomics; endosymbiont evolution; obligate intracellular

Year:  2018        PMID: 29515524      PMCID: PMC5826181          DOI: 10.3389/fmicb.2018.00079

Source DB:  PubMed          Journal:  Front Microbiol        ISSN: 1664-302X            Impact factor:   5.640


Introduction

Bacteria of the phylum Chlamydiae are all obligate intracellular bacteria that multiply within eukaryotic host cells (Horn, 2015). Various chlamydial species were identified in mammals, birds, reptiles, fishes, arthropods, and unicellular eukaryotes (Taylor-Brown et al., 2015). Several Chlamydiae, such as Chlamydia trachomatis, Chlamydia abortus, and Chlamydia pneumoniae, are well-known important human and animal pathogens (Elwell et al., 2016). Other Chlamydia-related organisms such as Waddlia chondrophila, associated with abortion in cattle, are increasingly recognized as emerging pathogens posing a zoonotic or vector-borne risk (Taylor-Brown and Polkinghorne, 2017). Multiple species of the Simkaniaceae, “Candidatus Piscichlamydiaceae” and “Ca. Clavichlamydiaceae” are associated with epitheliocystis, a disease affecting the gill of fish. Rhabdochlamydiaceae spp. are also highly prevalent in ticks, the most common arthropod vector of human and animal disease (Pilloux et al., 2015). Analysis of environmental 16S rRNA sequences indicates that those organisms are highly diverse and are present in a wide variety of freshwater, marine and terrestrial environments (Lagkouvardos et al., 2014). However, there are currently only six validated families (Chlamydiaceae, Parachlamydiaceae, Simkaniaceae, Waddliaceae, Criblamydiaceae, and Rhabdochlamydiaceae) and three Candidatus families (Clavichlamydiaceae, Parilichlamydiaceae, and Piscichlamydiaceae) (Stride et al., 2013; Horn, 2015). The last comparative analysis of the phylum Chlamydiae (Collingro et al., 2011) included genomic data from four families; a single species of Waddliaceae and Simkaniaceae, and multiple species of Chlamydiaceae and Parachlamydiaceae. This work reported a relatively large set of 560 genes conserved in all four chlamydial clades, and a large number of accessory genes restricted to families exhibiting much larger genomes than vertebrate parasites of the Chlamydia genus. 
Some virulence factors such as the Type III secretion system were shown to be conserved in all four families investigated. On the other hand, the F-like conjugative DNA transfer system identified by Greub et al. (2004) in Protochlamydia amoebophila was restricted to Simkaniaceae and Parachlamydiaceae (Bertelli et al., 2016). Since then, multiple genomes belonging to the family Parachlamydiaceae were sequenced from strains in pure culture, including one representative of a new genus named Rubidus (Domman et al., 2014; Ishida et al., 2014; Yamaguchi et al., 2015; Yamane et al., 2015; Bertelli et al., 2016; Bou Khalil et al., 2016, 2017; Fukumoto et al., 2016). In addition, genome sequences representing two distinct genera of the family Criblamydiaceae were recently released (Bertelli et al., 2014, 2015). Previously, technically challenging and time-consuming in vitro cultures were required to obtain sufficient sample material for genomic analysis of Chlamydiae (Jacquier et al., 2013). This difficulty was recently overcome by culture-independent genomic characterization of uncultivable Chlamydiae. In 2013, the genome of C. trachomatis could be recovered directly from clinical samples using a metagenomic approach (Andersson et al., 2013; Seth-Smith et al., 2013). More recently, the complete genome of “Candidatus Chlamydia sanzinia”, an uncultivated snake pathogen, was sequenced without culture (Taylor-Brown et al., 2016). The fish pathogen “Candidatus Similichlamydia epinepheli,” a representative of the Candidatus family Parilichlamydiaceae, was partially recovered using a similar approach (Taylor-Brown et al., 2017). Moreover, several “Candidatus Rhabdochlamydia spp.” genomes could also be directly sequenced from tick samples (Pillonel et al., unpublished data). Single-cell genomics is another promising approach for the study of uncultivable organisms (Gawad et al., 2016). 
This technique allowed the recovery of partial genomes of three new chlamydial lineages from marine environments (Collingro et al., 2017). Thus, from 2011 to 2017, a large number of additional chlamydial genomes have been made available, but their diversity and their relative position in the phylogenetic tree of the phylum Chlamydiae have so far remained unknown. Updated comparative genomic analyses will help to clarify the phylogenetic relatedness of the new chlamydial clades and to improve our understanding of the evolution of shared and distinct genetic features in this diverse group of obligate intracellular bacteria. We therefore performed a comprehensive comparative analysis of the phylum Chlamydiae including all genomic data publicly available as of June 2017. We identified seven putative novel family-level lineages retrieved from metagenomic bins. Extremely diverse genomic characteristics were observed, with genome size ranging from 1 to 3.4 Mbp and GC content ranging from 26.23 to 55.76%. Despite those variations, essential mechanisms involved in host-symbiont interactions such as the type III secretion apparatus and the ADP-ATP translocase (a transporter involved in energy parasitism) are conserved in all newly discovered lineages. On the other hand, fundamental biological differences, such as distinct division mechanisms in novel and phylogenetically distantly related Chlamydiae, could be highlighted.

Methods

Data retrieval

Genome assemblies were downloaded from the RefSeq database, and from Genbank for the assemblies absent from RefSeq (Table 1; O'Leary et al., 2016). All genome assemblies from the phylum Chlamydiae (NCBI taxonomy ID 204428) were included in this study with the exception of the Chlamydia genus. Since several Chlamydia species are well-studied pathogens with many sequenced genomes, a single representative genome assembly was considered for each of the 13 Chlamydia species out of a total of 326 genome assemblies (June 2017).
Table 1

List of genomes included in the analysis, with GC content (%), size (bp) and completeness as evaluated using checkM.

Accession | Description | Size (bp) | Number of CDS | GC (%) | Number of contigs | Completeness (%)
NC_010655 | Akkermansia muciniphila ATCC BAA-835 | 2664102 | 2,246 | 55.76 | 1 | 100
NC_004552 | Chlamydia abortus S26/3 | 1144377 | 935 | 39.87 | 1 | 98.28
NZ_CP006571 | Chlamydia avium 10DC88 | 1041170 | 842 | 36.92 | 1 | 94.83
NC_003361 | Chlamydia caviae GPIC | 1173390 | 968 | 39.22 | 1 | 98.28
NC_007899 | Chlamydia felis Fe/C-56 | 1166239 | 966 | 39.38 | 1 | 98.28
NZ_CP015840 | Chlamydia gallinacea 08-1274/3 | 1059583 | 905 | 37.94 | 1 | 96.55
NZ_APJW00000000 | Chlamydia ibidis 10-1398/6 | 1146066 | 939 | 38.32 | 4 | 96.55
NC_002620 | Chlamydia muridarum Nigg | 1072950 | 900 | 40.34 | 1 | 98.28
NC_015408 | Chlamydia pecorum E58 | 1106197 | 934 | 41.08 | 1 | 100
NC_000922 | Chlamydia pneumoniae CWL029 | 1230230 | 1,029 | 40.58 | 1 | 98.28
NC_015470 | Chlamydia psittaci 6BC | 1171660 | 983 | 39.06 | 1 | 98.28
NZ_CP014639 | Chlamydia sanzinia 2742-308 | 1113233 | 933 | 38.54 | 1 | 98.28
MKSG01000000 | Chlamydia sp. 32-24 | 2529957 | 2,075 | 32.42 | 98 | 96.55
NZ_AYKJ01000000 | Chlamydia suis MD56 | 1073507 | 886 | 42.03 | 13 | 98.28
NC_000117 | Chlamydia trachomatis D/UW-3/CX | 1042519 | 887 | 41.31 | 1 | 98.28
LNES01000000 | Chlamydiae bacterium Ga0074140 | 1724203 | 1,639 | 47.82 | 6 | 96.55
MGLO01000000 | Chlamydiae bacterium GWA2_50_15 | 1177368 | 993 | 49.34 | 32 | 93.1
MGLP01000000 | Chlamydiae bacterium GWC2_50_10 | 1172283 | 966 | 48.94 | 52 | 86.21
MGLQ01000000 | Chlamydiae bacterium GWF2_49_8 | 1019733 | 760 | 49.23 | 93 | 77.9
MGLR01000000 | Chlamydiae bacterium RIFCSPHIGHO2_01_FULL_44_39 | 1569649 | 1,466 | 44.72 | 30 | 95.69
MGLS01000000 | Chlamydiae bacterium RIFCSPHIGHO2_02_FULL_45_9 | 1342635 | 1,156 | 44.66 | 104 | 80.72
MGLT01000000 | Chlamydiae bacterium RIFCSPHIGHO2_02_FULL_49_29 | 1387015 | 1,187 | 49.08 | 69 | 93.1
MGLU01000000 | Chlamydiae bacterium RIFCSPHIGHO2_12_FULL_27_8 | 974360 | 817 | 27.43 | 98 | 72.1
MGLV01000000 | Chlamydiae bacterium RIFCSPHIGHO2_12_FULL_44_59 | 1568548 | 1,470 | 44.72 | 30 | 95.69
MGLW01000000 | Chlamydiae bacterium RIFCSPHIGHO2_12_FULL_49_11 | 1258484 | 1,065 | 48.45 | 71 | 81.27
MGLX01000000 | Chlamydiae bacterium RIFCSPHIGHO2_12_FULL_49_32 | 1397302 | 1,190 | 48.91 | 72 | 89.66
MGLY01000000 | Chlamydiae bacterium RIFCSPHIGHO2_12_FULL_49_9 | 1320387 | 1,166 | 48.59 | 94 | 71.24
MGLZ01000000 | Chlamydiae bacterium RIFCSPLOWO2_01_FULL_28_7 | 708526 | 572 | 28.07 | 75 | 58.21
MGMA01000000 | Chlamydiae bacterium RIFCSPLOWO2_01_FULL_44_52 | 1537590 | 1,438 | 44.74 | 29 | 95.69
MGMB01000000 | Chlamydiae bacterium RIFCSPLOWO2_02_FULL_45_22 | 1575869 | 1,475 | 44.7 | 30 | 95.69
MGMC01000000 | Chlamydiae bacterium RIFCSPLOWO2_02_FULL_49_12 | 1413329 | 1,175 | 48.99 | 58 | 93.1
MGMD01000000 | Chlamydiae bacterium RIFCSPLOWO2_12_FULL_45_20 | 1542182 | 1,443 | 44.75 | 28 | 95.69
MGME01000000 | Chlamydiae bacterium RIFCSPLOWO2_12_FULL_49_12 | 1418742 | 1,224 | 49.16 | 46 | 91.38
LJUH01000000 | Chlamydiae bacterium SM23_39 | 1126604 | 986 | 26.23 | 67 | 96.55
MKSK01000000 | Chlamydiales bacterium 38-26 | 2834110 | 2,327 | 38.12 | 8 | 98.28
NZ_FLYF00000000 | Chlamydiales bacterium SCGC AB-751-O23 | 986924 | 715 | 35.45 | 89 | 46.24
NZ_FLYO00000000 | Chlamydiales bacterium SCGC AG-110-M15 | 929815 | 711 | 41.8 | 54 | 40.52
NZ_FLYP00000000 | Chlamydiales bacterium SCGC AG-110-P3 | 1299661 | 954 | 46.83 | 96 | 47.49
NZ_CCEJ000000000 | Criblamydia sequanensis CRIB-18 | 2969604 | 2,422 | 38.24 | 23 | 97.41
CWGJ01000001 | Estrella lausannensis CRIB-30 | 2820195 | 2,202 | 48.23 | 29 | 97.41
NZ_JSDQ00000000 | Neochlamydia sp. EPS4 | 2530677 | 1,882 | 38.09 | 112 | 97.41
NZ_BASK00000000 | Neochlamydia sp. S13 | 3187074 | 2,232 | 38.03 | 1,342 | 97.41
NZ_JRXI00000000 | Neochlamydia sp. TUME1 | 2546323 | 1,879 | 38.02 | 254 | 97.41
NZ_BAWW00000000 | Parachlamydia acanthamoebae Bn9 | 2999361 | 2,409 | 38.94 | 72 | 99.66
NZ_ACZE00000000 | Parachlamydia acanthamoebae Hall's coccus | 2971261 | 2,477 | 38.97 | 95 | 99.66
NZ_JSAM00000000 | Parachlamydia acanthamoebae OEW1 | 3008885 | 2,309 | 39.04 | 162 | 97.07
NC_015702 | Parachlamydia acanthamoebae UV-7 | 3072383 | 2,532 | 39.03 | 1 | 99.66
NZ_BBPT00000000 | Parachlamydiaceae bacterium HS-T3 | 2307885 | 2,025 | 38.71 | 34 | 100
NZ_JSAN00000000 | Protochlamydia amoebophila EI2 | 2397675 | 1,775 | 34.82 | 178 | 100
NC_005861 | Protochlamydia amoebophila UWE25 | 2414465 | 1,841 | 34.72 | 1 | 98.28
NZ_CCJF00000000 | Protochlamydia naegleriophila Diamant | 2864073 | 2,359 | 42.81 | 4 | 98.28
NZ_LN879502 | Protochlamydia naegleriophila KNic | 2885090 | 2,372 | 42.7 | 1 | 100
NZ_FCNU00000000 | Protochlamydia phocaeensis | 3423982 | 2,766 | 42.05 | 33 | 100
NZ_BASL00000000 | Protochlamydia sp. R18 S13 | 2722699 | 2,006 | 34.78 | 795 | 100
NZ_BCPZ00000000 | Protochlamydia sp. W-9 | 2484573 | 1,815 | 34.48 | 402 | 100
PRJEB24578 | Rhabdochlamydia helvetica T3358 | 1830543 | 1,692 | 36.16 | 38 | 100
CCSC01000000 | Rubidus massiliensis | 2701449 | 2,299 | 32.45 | 3 | 98.28
PRJNA343727 | Similichlamydia epinepheli GCCT14 | 981540 | 913 | 39.52 | 169 | 70
NC_015713 | Simkania negevensis Z | 2496337 | 2,229 | 41.78 | 1 | 100
NC_014225 | Waddlia chondrophila WSU 86-1044 | 2116312 | 1,832 | 43.78 | 1 | 98.28

Evaluation of genome completeness and quality

Many genomes included in this work are metagenomic bins that may be of unequal quality. Indeed, metagenomic bins can be an incomplete representation of the organism's genome, a mixture of multiple genomes, or can include chimeric DNA sequences. To screen for contaminants, the completeness and redundancy of each genome were thus evaluated based on the identification of 104 nearly universal (and generally single copy) bacterial markers with checkM (Parks et al., 2015). In addition, translated predicted coding sequences (CDS) of each genome were compared to the RefSeq database using PLAST, a fast sequence similarity search tool, with the following parameters: -M BLOSUM62 -s 45 -seeds-use-ratio 60 -G 11 -E 1 -F F -max-hit-per-query 100 -max-hsp-per-hit 1 (Van Nguyen and Lavenier, 2009). PLAST was favored over BLAST for its lower memory usage. The taxonomy of the first PLAST hit (excluding hits against RefSeq sequences classified as belonging to the same species) was investigated with the help of the NCBI taxonomy database (Federhen, 2012).
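
The marker-based screening described above can be illustrated with a deliberately simplified sketch. It is not checkM's algorithm (checkM relies on lineage-specific, collocated marker sets); it only shows how completeness and redundancy follow from marker copy counts. The function name `marker_summary` and the toy marker identifiers are hypothetical.

```python
from collections import Counter

def marker_summary(marker_hits, n_markers):
    """Summarise marker copy counts for one genome bin.

    marker_hits: iterable of marker IDs, one entry per copy detected.
    n_markers:   size of the reference marker set (104 in the text).
    Simplified assumption: every marker carries the same weight.
    """
    counts = Counter(marker_hits)
    present = len(counts)                                   # markers seen at least once
    duplicated = sum(1 for c in counts.values() if c > 1)   # markers seen more than once
    return {
        "completeness": 100.0 * present / n_markers,
        "redundancy": 100.0 * duplicated / n_markers,
        "duplicated": duplicated,
    }

# Toy bin: 10 reference markers, 8 detected, 2 of them in two copies.
toy_hits = ["m1", "m2", "m2", "m3", "m4", "m5", "m5", "m6", "m7", "m8"]
summary = marker_summary(toy_hits, n_markers=10)
print(summary)
```

A high number of duplicated markers, as reported later for S. epinepheli, is the kind of signal this summary would flag as a possible mixture of genomes.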

Comparative genome analysis and reconstruction of the species phylogeny

Orthologs were identified using OrthoFinder version 0.4 (Emms and Kelly, 2015). A reference phylogeny was built based on single copy orthologs conserved in at least 55 of the 60 studied genomes (Table 1). Core single copy orthologs were aligned using mafft version 7.058b (Katoh and Standley, 2013). The concatenated alignment was used for the reconstruction of the species tree using FastTree 2.1.9 with double precision (Price et al., 2010). Circular plots were drawn using Circos version 0.69 (Krzywinski et al., 2009), whereas other plots were made with R (R Core Team, 2016).
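
As a minimal sketch of the marker selection step, the snippet below filters orthologous groups that are single copy and present in a minimum number of genomes (55 of 60 in the text). The data layout and the function name `select_core_single_copy` are hypothetical, and the rule that no genome may carry more than one copy is an assumption; the exact handling of duplicated groups in the original pipeline is not stated.

```python
def select_core_single_copy(orthogroups, min_genomes):
    """Return IDs of orthogroups usable as phylogenetic markers.

    orthogroups: dict mapping group ID -> dict mapping genome -> list of protein IDs.
    min_genomes: minimal number of genomes in which the group must be present.
    Assumption: a group is rejected if any genome holds more than one copy.
    """
    core = []
    for group_id, members in orthogroups.items():
        copies = [len(proteins) for proteins in members.values() if proteins]
        if copies and max(copies) == 1 and len(copies) >= min_genomes:
            core.append(group_id)
    return sorted(core)

# Toy example with four genomes and a threshold of three.
toy_groups = {
    "OG1": {"g1": ["a1"], "g2": ["b1"], "g3": ["c1"], "g4": ["d1"]},
    "OG2": {"g1": ["a2", "a3"], "g2": ["b2"], "g3": ["c2"], "g4": ["d2"]},  # paralogs in g1
    "OG3": {"g1": ["a4"], "g2": ["b4"]},                                    # too few genomes
    "OG4": {"g1": ["a5"], "g2": ["b5"], "g3": ["c5"]},
}
core_groups = select_core_single_copy(toy_groups, min_genomes=3)
print(core_groups)
```

With the study's values one would call the function with `min_genomes=55` on the 60-genome orthogroup table.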

Genome annotation and identification of secretion systems, flagellar subunits, and selected metabolic traits

All genomes were annotated using GhostKOALA (Kanehisa et al., 2016) and InterProScan version 5.23–62.0 (Mitchell et al., 2015). Orthologs of the type III and type IV secretion systems, known effectors, division proteins, membrane proteins, the Euo master regulator, respiratory chain complexes and enzymes involved in menaquinone biosynthesis and glycogen metabolism were identified based on the orthologous groups inferred by OrthoFinder (see Table S1 for the detailed reference locus list). Homologs of the ATP:ADP antiporter were identified by InterProScan annotation (InterPro accession: IPR004667). The flagellar apparatus subunits, proteins involved in peptidoglycan, purine, pyrimidine and ubiquinone biosynthesis as well as the glycolysis pathway, the citrate cycle (TCA), and the pentose phosphate pathway (PPP) were identified based on KEGG ortholog (KO) annotation. Reported counts for each pathway/module are the non-redundant number of identified KO. Predicted coding sequences were compared to the COG database (version 2014) (Galperin et al., 2015) using BLASTP version 2.3.0+ (Camacho et al., 2009) with an e-value cutoff of 1e−5, a minimal query coverage of 50% and a minimal identity of 20%.
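
The non-redundant KO counting mentioned above can be sketched as a set intersection: several proteins annotated with the same KO contribute only once to a pathway count. The function name `count_pathway_kos`, the module names, and the KO labels below are placeholders for illustration, not real KEGG identifiers.

```python
def count_pathway_kos(protein_to_ko, pathways):
    """Count distinct KOs per pathway for one genome.

    protein_to_ko: dict mapping protein ID -> KO label assigned by annotation.
    pathways:      dict mapping pathway/module name -> set of KO labels.
    """
    observed = set(protein_to_ko.values())   # duplicates collapse here
    return {name: len(observed & kos) for name, kos in pathways.items()}

# Toy annotation: two proteins share the same KO label.
toy_annotation = {
    "prot1": "KO_toy1",
    "prot2": "KO_toy1",
    "prot3": "KO_toy2",
    "prot4": "KO_toy9",
}
toy_pathways = {
    "toy_module_A": {"KO_toy1", "KO_toy2", "KO_toy3"},
    "toy_module_B": {"KO_toy9"},
}
ko_counts = count_pathway_kos(toy_annotation, toy_pathways)
print(ko_counts)
```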

Identification of phylogenetic markers and evaluation of genetic relatedness

Proteins conserved between pairs of genomes were identified by pairwise protein sequence comparisons with BLASTP version 2.3.0+ (Camacho et al., 2009). The identities of all reciprocal best BLAST hits (RBBH) with a maximal e-value of 1e−5 and at least 50% coverage of both query and hit were retained for the comparative analyses. 16S and 23S rRNA sequences were identified using barrnap (https://github.com/tseemann/barrnap). HMM profiles were built from 21 reference sequences for nine previously identified phylogenetically informative markers (Pillonel et al., 2015). These profiles were used to identify markers in all chlamydial genomes included in this study using hmmsearch v3.1 (Eddy, 2011). For each marker, a phylogenetic tree was reconstructed based on the best hmmsearch hit in each genome. Bit score cut-offs were defined individually for each marker based on the density distribution of hmmsearch bit scores and gene tree topologies: DnaA (260), SucA (750), the hypothetical protein 325 (350), FabI (400), RpoN (300), FtsK (750), PepF (470), Adk (175), and HemL (360). Pairwise amino acid sequence identities were calculated based on pairwise Needleman-Wunsch global alignments made with Needle from the EMBOSS package version 6.6 (Rice et al., 2000). Gaps were not considered in identity calculations. The probability distributions of pairwise amino acid identity values were estimated using the statistical package R with kernel density estimations (R Core Team, 2016). Density plots were made using the ggplot2 package (Wickham, 2016).
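
The reciprocal best hit filtering can be sketched as follows, assuming the BLASTP results have already been parsed into records. The record fields, the helper names (`hit`, `best_hits`, `rbbh`), the use of bit score to rank hits, and the choice of reporting the identity of the forward hit are all assumptions made for illustration.

```python
import statistics

def hit(query, subject, identity, qcov, scov, evalue, bitscore):
    """Build one parsed BLAST record (hypothetical layout)."""
    return {"query": query, "subject": subject, "identity": identity,
            "qcov": qcov, "scov": scov, "evalue": evalue, "bitscore": bitscore}

def best_hits(hits, max_evalue=1e-5, min_cov=50.0):
    """Keep, for each query, the highest-scoring hit passing the thresholds."""
    best = {}
    for h in hits:
        if h["evalue"] > max_evalue or h["qcov"] < min_cov or h["scov"] < min_cov:
            continue
        current = best.get(h["query"])
        if current is None or h["bitscore"] > current["bitscore"]:
            best[h["query"]] = h
    return best

def rbbh(hits_ab, hits_ba):
    """Return {(protein_a, protein_b): identity} for reciprocal best hits."""
    forward = best_hits(hits_ab)
    reverse = best_hits(hits_ba)
    pairs = {}
    for query, h in forward.items():
        back = reverse.get(h["subject"])
        if back is not None and back["subject"] == query:
            pairs[(query, h["subject"])] = h["identity"]
    return pairs

# Toy searches: genome A against genome B and the reverse.
hits_ab = [
    hit("a1", "b1", 60.0, 90.0, 95.0, 1e-50, 200.0),
    hit("a1", "b2", 30.0, 80.0, 80.0, 1e-8, 50.0),
    hit("a2", "b2", 40.0, 85.0, 90.0, 1e-20, 100.0),
    hit("a3", "b1", 35.0, 30.0, 70.0, 1e-10, 80.0),   # query coverage too low
]
hits_ba = [
    hit("b1", "a1", 60.0, 95.0, 90.0, 1e-50, 200.0),
    hit("b2", "a2", 40.0, 90.0, 85.0, 1e-20, 100.0),
]
pairs = rbbh(hits_ab, hits_ba)
median_identity = statistics.median(pairs.values())
print(pairs, median_identity)
```

The median of the retained identities is the per-genome-pair summary used in the comparisons reported in the Results.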

Results

Twelve nearly complete genome assemblies from multiple new deeply branching Chlamydiae clades

In total, 60 genomes (including one outgroup) were compared in this study: 21 poorly characterized metagenomics bins classified as Chlamydiae were retrieved from Genbank. Most (n = 17) originate from a single study of groundwater microbial ecosystems (Anantharaman et al., 2016). Others were sequenced as part of the microbial community of a biologically active filter of a water treatment plant (n = 1) (Pinto et al., 2016), as part of an investigation of bacterial communities in estuary sediments (n = 1) (Baker et al., 2015), and from an experimental bioreactor used to treat contaminated goldmine water (n = 2) (Kantor et al., 2015). In addition, three single-cell amplified partial genomes from marine Chlamydiae (Collingro et al., 2017) and 25 previously published chlamydial genomes (Table 1) were studied. The latter contained a majority (n = 20) of draft assemblies. The verrucomicrobial genome of Akkermansia muciniphila was included as an outgroup. While most Chlamydiae share a similar life cycle, their genetic diversity is considerable. Standard procedures for the classification of Chlamydiae rely on a small set of reference genes or protein sequences, including the 16S rRNA sequence (Pillonel et al., 2015; Greub, 2017). Given that metagenomic datasets frequently lack rRNA operons, pairwise comparisons of all protein sequences were undertaken to better characterize the genetic diversity of newly sequenced genomes. The protein sequences of the 59 chlamydial genomes included in this comparative analysis were clustered into 10,162 orthologous groups (Table S2, Figure S1A). Among those groups, 4,553 are restricted to a single genome (44.8%, Figure S1B). 
Only three single copy orthologs were conserved in all chlamydial genomes (the two ribosomal proteins S8 and L6, and the tRNA threonylcarbamoyladenosine biosynthesis protein TsaE), but 108 orthologous groups were conserved in more than 90% of the genomes (55 out of 60, including the outgroup genome; the list of 108 orthologous groups is reported in Table S3). The concatenated alignment of those 108 protein sequences was used to build a reference phylogeny of all representative genomes of the phylum Chlamydiae (Figure 1A).
Figure 1

Diversity of the phylum Chlamydiae. (A) Phylogenetic tree of cultured and uncultured representatives of the phylum Chlamydiae reconstructed based on the concatenated alignment of 108 single copy orthologs conserved in more than 90% of the genomes. Nodes with support values lower than 1 are indicated with black dots. The width of the black dots does not reflect a genetic distance (lines are shifted to accommodate the presence of the dots). *Genome sequenced using a shotgun metagenomics approach. +Single-cell amplified genomes. (B) GC content, genome size and estimated genome completeness. (C) Quality measures: (i) number of missing core proteins (out of 108 proteins), (ii) number of duplicated core genes, (iii) number of contigs in the assembly, (iv) number of contigs without any CDS, (v) number of contigs without any best PLAST hits against chlamydial sequences (RefSeq database version 81). (D) Described families of the Chlamydiae phylum.

The GC content of uncultured strains varies widely from 26.2 to 49.3% (Table 1, Figure 1B). Twelve of the metagenomic bins are predicted to be more than 90% complete, based on the identification of 104 nearly universal bacterial proteins (Table 1, Table S4, Figure 1B). Those nearly complete assemblies range from 1.42 to 2.83 Mb in size. The largest metagenomic bin (Chlamydia sp. 38–26) is phylogenetically related to Neochlamydia spp. The smallest one, Chlamydiae bacterium RIFCSPLOWO2_01_FULL_28_7, only comprises about 700 kb that were predicted to represent approximately 58% of the genome. The most incomplete genomes are the three genomes from marine Chlamydiae (Figure 1, groups 2, 6, and 7). Those genomes were amplified from single cells, an approach that frequently yields incomplete genomes (Collingro et al., 2017). Chlamydiales bacterium SCGC AG-110-M15 lacked 55 of the 108 core proteins used to build the species phylogeny (50.9% of the dataset, Figure 1C). 
Although these Chlamydiae were sequenced from complex samples using metagenomic approaches, only a few genomes exhibited signs of contamination. The strongest evidence was present in the S. epinepheli genome, where 32 of the 104 nearly universal proteins were present in more than one copy, indicating that it might be an admixture of two genomes (Figure 1C). Nevertheless, the 32 duplicated S. epinepheli markers were all monophyletic in phylogenetic reconstructions including all chlamydial homologs and their closest homologs in the RefSeq database, indicating that the assembly might be an admixture of two closely related strains. In order to further evaluate if the assemblies contained only fragments of Chlamydiae genomes, all CDS were compared to the RefSeq database. A majority of CDSs of each analyzed genome exhibited a best non-identity PLAST hit (excluding hits against RefSeq sequences classified as belonging to the same species) against chlamydial sequences present in the RefSeq database (Figure 1C). Assemblies reconstructed from metagenomics and single cell data exhibit between 55.77 and 75.93% of best PLAST hits against chlamydial sequences, which is higher than for S. negevensis (45.01%). S. negevensis is the only representative of the family Simkaniaceae currently available in RefSeq. Its higher proportion of PLAST hits against non-chlamydial sequences (32.48%) and of CDS without any hits (22.51%), as compared to representatives of other new families absent from RefSeq, might be related to the much larger size of the S. negevensis genome (Figure 1B). Several of the most fragmented assemblies harbored contigs without any best hits against chlamydial sequences (Figure 1C). Nevertheless, chlamydial hits were systematically distributed along the whole assembly, with only a few small contigs exhibiting no best hits against chlamydial sequences, as can be seen in Figure 2 for the assembly RIFCSPHIGHO2_01_FULL_44_39. 
Overall, these results indicate that all assemblies contain mostly chlamydial DNA sequences, supporting the reliability of the analysis.
Figure 2

Circular representation of the uncultured genome RIFCSPHIGHO2_01_FULL_44_39 (accession: MGLR01000000). The outer circle indicates contig boundaries. The predicted open reading frames (ORFs) of the leading and lagging strands are reported in gray. Pink ORFs are proteins without a best hit against chlamydial sequences in the RefSeq database. Violet ORFs are proteins without any significant hit in the RefSeq database. The inner blue/red circles show the conservation of the closest identified orthologous protein (red scale) in the 37 other chlamydial species and one other genome of group 3 (Figure 1). Identity values were calculated based on the alignment of orthologous groups inferred using OrthoFinder. The absence of any ortholog is indicated in blue. Most contigs exhibit a majority of best PLAST hits against chlamydial sequences.


Classification and diversity of new chlamydial genomes

We recently proposed a scheme for the classification of chlamydial genomes at various taxonomic levels (species, genus and family). This scheme was used here to update the classification of all unclassified chlamydial genomes (Figure 3). Briefly, the strain HS-T3 likely belongs to a new genus in the family Parachlamydiaceae (Figure S2). Chlamydia sp. 38–26 is a new Neochlamydia species (Figure S3) and Chlamydia sp. 32–24 is a Rubidus massiliensis strain (Figure S4). There are four Pr. amoebophila strains (Figure S5) and three strains of the same Neochlamydia species (S13, TUME1 and EPS4, Figure S6).
Figure 3

Updated classification of the order Chlamydiales based on genomic data. Proposed new Candidatus genera, families and orders are indicated in bold. Only three genera (Chlamydia, Neochlamydia and Protochlamydia) have more than one species. Vertical blue lines indicate genomes of the same species. Numbers from 1 to 12 indicate new candidate species. Candidate species were not named. Eleven new Candidatus genera are proposed: Novochlamydia (from Latin novus, new), Pelagichlamydia (from the Latin word pelagus, sea), Abzuchlamydia (from Abzu, the underground waters of the aquifer in Sumerian mythology), Enkichlamydia (from the Sumerian god Enki, lord of the Abzu), Limichlamydia (from the Latin word limus, silt, alluvium), Alluviochlamydia (from the Latin word alluvio, alluvium), Arenachlamydia (from the Latin word arena, sand), Hydrochlamydia (from the Greek prefix hydro-, relating to water), Kinetochlamydia (from the Greek prefix kineto-, motion; movement), Motilichlamydia (from the Latin word motus, past participle of movere, to move), Thermochlamydia (from the Greek word thermos, warm, hot).

Regarding the new deeply branching assemblies, Group 3 and Group 4 each encompass two candidate species from two different genera (Figures S7, S8). The seven assemblies of group 5 are part of the same candidate species (Figure S9). Altogether, the new genomes can be classified into 12 new candidate species (Figure 3). Half of them are only represented by genome assemblies that are <90% complete (Figure 1B, Table 1). Most genome assemblies sequenced using a metagenomics approach lack rRNA operons: 15 lack 16S rRNA, whereas 17 lack 23S rRNA (Figure S2). This happens frequently as rRNA operons are too highly conserved to be assigned reliably to specific genome assemblies. In addition, most genomes lack one or several of the nine proposed taxonomic markers (Figures S2–S9; Pillonel et al., 2015). This is a major limitation for any scheme relying on a limited set of genes. 
Therefore, in order to evaluate the genus and family level diversity of the Chlamydiae phylum, the identity of reciprocal best blast hits (RBBH) was used to measure the divergence of chlamydial lineages at various taxonomic depths. Figure 4 shows the distribution of RBBH identity values between pairs of genomes. As expected, the distribution of sequence identity values shifts continuously toward lower values with increased phylogenetic distance (Figure 4A). W. chondrophila and the different genera classified in the Parachlamydiaceae family show similar levels of divergence (Figure 4B). This is not the case for S. negevensis, which exhibits a median identity lower than 50% with P. amoebophila (Figure 4B). The median of the median pairwise identities among members of the Chlamydiales and Parachlamydiales (Figure 3) is 44.44%, whereas Chlamydiaceae and the two deep branching clades 1 and 2 exhibit median identities of 39.81 and 41.14%, respectively (using one representative genome per species, see detailed values in Figure S10). The high divergence of clades 1 and 2 is even more obvious in Figure 4A, with a clear shift of the distribution of the RBBH identities, which peaks at about 35% identity between C. trachomatis and clades 1 and 2. Altogether, the phylum exhibits at least three highly divergent clades, and may now include seven new candidate family-level lineages and 11 new genera (Figure 3, Figure S11).
Figure 4

Distribution of the identity of reciprocal best blast hits between pairs of genomes. (A) Protein sequences exhibit similar levels of divergence between different Parachlamydiaceae genera and Waddlia chondrophila (different family), with a median amino acid identity around 55%. The distribution is clearly shifted for Simkania negevensis, with a median identity below 50%. (B) Distribution of protein identities between C. trachomatis and the closely related C. muridarum genome, C. pneumoniae, W. chondrophila, and two deep branching taxa (S. epinepheli and one groundwater metagenomic bin). The continuous decline in sequence identity with divergence time is clearly visible here. The distributions for the most distant chlamydial lineages are strongly skewed, with many poorly conserved proteins and some highly conserved ones (mode of 35% sequence identity). (C) Distribution of sequence identities between C. sequanensis and representatives of four recognized families and the outgroup, A. muciniphila. (D) Identical analyses comparing the obligate intracellular bacterium Rickettsia prowazekii (alpha-proteobacteria) with Orientia tsutsugamushi (Rickettsieae), Anaplasma marginale (Anaplasmataceae), Ehrlichia chaffeensis (Anaplasmataceae), Midichloria mitochondrii (“Candidatus Midichloriaceae”), and Agrobacterium fabrum (free-living member of the Rhizobiales order).


Correlation between 16S rRNA sequence conservation and whole genome relatedness

Gupta and colleagues recently disputed the 16S rRNA identity cutoffs proposed to delineate families of the phylum Chlamydiae, arguing that the number of chlamydial families was inflated by the high cutoff in use (Gupta et al., 2015). Nevertheless, different bacterial clades may exhibit variable rates of sequence evolution (Kuo and Ochman, 2009). In addition, the 16S rRNA gene may not evolve at a constant rate across the entire tree of life and may not necessarily be a reliable indicator of whole genome relatedness (Konstantinidis and Tiedje, 2005; Kuo and Ochman, 2009). In order to put the Chlamydiae phylum into a broader perspective, we compared the 16S rRNA pairwise sequence identity and the conservation of protein sequences with data from another diverse group of intracellular bacteria, the order Rickettsiales. Representatives of the Rhizobiales order were also included for comparison (Table S5). Figure 5A shows the relationships of 16S rRNA and median RBBH identities for the three considered orders. For both measures, the Chlamydiales and Rickettsiales orders exhibit higher sequence divergence than the Rhizobiales order. The sequence divergence of different families belonging to the same order (see Tables S5, S6 for the detailed classification used here) reveals that the 16S rRNA gene of Rickettsiales families is more divergent than that of the Chlamydiales families (Figure 5B). On the other hand, the median pairwise protein identity is also relatively low for most Chlamydiales (Figures S12A–D). Given the skewed shape of RBBH identity distributions (Figure 4), the median identity may not be the most appropriate summary metric to estimate genome divergence. An alternative would be the maximum of the density distribution of RBBH identity rather than the median (Figures 5C,D).
Using the maximum of the estimated distribution of amino acid identities, the Chlamydiae cloud shifted toward lower values, indicating that the distribution of RBBH identities in pairs of chlamydial genomes is more skewed than for Rickettsiales and Rhizobiales. This still holds true when comparing Parachlamydiales only (excluding Chlamydiaceae and Similichlamydiaceae).
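To make the difference between the two summary metrics concrete, the following sketch estimates the maximum of a density distribution with a simple Gaussian kernel density estimate and compares it with the median. The bandwidth rule (1.06 · sd · n^(-1/5), a common normal-reference rule of thumb), the grid step, and the identity values are all assumptions of this example; the density estimator actually used for Figures 5C,D is not specified in this section.

```python
import math
from statistics import median, stdev

def gaussian_kde(values, bandwidth):
    """Return a function that estimates the density of `values` at x."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2)
                          for v in values)
    return density

def density_mode(values, grid_step=0.5):
    """Locate the maximum of a Gaussian KDE evaluated on a regular grid."""
    # Normal-reference rule of thumb for the bandwidth (assumed here).
    bandwidth = 1.06 * stdev(values) * len(values) ** (-1 / 5)
    density = gaussian_kde(values, bandwidth)
    lo, hi = min(values), max(values)
    steps = int((hi - lo) / grid_step) + 1
    grid = [lo + i * grid_step for i in range(steps)]
    return max(grid, key=density)

# Hypothetical right-skewed RBBH identities (percent): many poorly
# conserved proteins and a tail of highly conserved ones.
identities = [30, 32, 33, 34, 35, 35, 36, 36, 37, 38,
              40, 45, 55, 65, 75, 85, 90]

mode = density_mode(identities)
print("median:", median(identities), "density maximum:", mode)
```

For a distribution with a long tail of highly conserved proteins, the density maximum sits below the median, which is why switching metrics moves a skewed dataset toward lower identity values.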
Figure 5

Relationships between 16S rRNA and protein sequences conservation within three distinct bacterial orders: Rhizobiales, Chlamydiales, and Rickettsiales (indicated with colors). (A) Relationships between 16S rRNA identity and the median protein identity within the three bacterial orders and (B) for different families of the same order. (C) Relationships between 16S rRNA and the modal identity within the three bacterial orders and (D) between different families of the same order.

The difference between Rickettsiales and Chlamydiales could be due to the fact that the Chlamydiales dataset includes incomplete genomes. Missing data might lead to spurious reciprocal best blast hits, leading to an underestimation of whole genome relatedness. Nevertheless, the trend is still visible when only complete chlamydial genomes are considered (Figures S12E,F). In conclusion, chlamydial families exhibit higher conservation of 16S rRNA but a distribution of whole genome RBBH identity that is skewed toward values as low as, or even lower than, those of genomes exhibiting lower 16S rRNA identities.
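The concern that missing data can create spurious reciprocal best hits can be shown with a toy case. In this sketch, two genomes each carry a pair of paralogs; when the true orthologs are absent from both incomplete assemblies, the remaining paralogs become each other's best hit at a much lower identity. Gene names and identity values are hypothetical, and percent identity is used directly as the ranking score for simplicity.

```python
def best_partner(gene, candidates, score):
    """Return the candidate with the highest positive score, or None."""
    hits = [(score(gene, c), c) for c in candidates if score(gene, c) > 0]
    return max(hits)[1] if hits else None

def rbbh(genes_a, genes_b, identity):
    """Reciprocal best hits, ranking hits by percent identity."""
    a_to_b = lambda a, b: identity.get((a, b), 0.0)
    b_to_a = lambda b, a: identity.get((a, b), 0.0)
    pairs = []
    for a in genes_a:
        b = best_partner(a, genes_b, a_to_b)
        if b is not None and best_partner(b, genes_a, b_to_a) == a:
            pairs.append((a, b, identity[(a, b)]))
    return pairs

# Hypothetical identities keyed by (gene in A, gene in B).
# a1/b1 and a2/b2 are true orthologs; a1/b2 and a2/b1 are paralog hits.
identity = {("a1", "b1"): 60.0, ("a2", "b2"): 58.0,
            ("a1", "b2"): 32.0, ("a2", "b1"): 31.0}

complete = rbbh(["a1", "a2"], ["b1", "b2"], identity)
# Incomplete assemblies: a2 is missing from A and b1 is missing from B.
partial = rbbh(["a1"], ["b2"], identity)
print(complete)
print(partial)
```

With complete gene sets only the true ortholog pairs are recovered, whereas the incomplete sets yield a single low-identity paralog pair, which would pull the identity distribution downward.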

Conservation of mechanisms involved in the interaction with eukaryotic hosts and other microbial cells

Several genomes included in this analysis are only a partial representation of the true chlamydial genome, which precludes any in-depth genome content comparisons. Nevertheless, the identification of even partial molecular machineries and metabolic pathways can provide first insights into the ancestrality and conservation of genomic features that may be essential to their unique lifestyle. The recent sequencing of S. epinepheli, the most diverging Chlamydiae described thus far, revealed the presence of a type III secretion system (T3SS) and the conservation of several key virulence factors (Taylor-Brown et al., 2017). T3SS components were identified in all new chlamydial clades (Figure 6, column T3SS), including the most incomplete ones. Several described T3SS effectors such as Mip and Nue are also widely distributed within the whole phylum (Figure S13). The entire clade that includes the Simkaniaceae, the Rhabdochlamydiaceae and several new lineages lacks any homolog of the Type II secretion system effector protease CPAF (Figure S13). In C. trachomatis, CPAF mutants display impaired generation of infectious elementary bodies (Snavely et al., 2014) and cpaf is therefore an essential gene. Our observations suggest that CPAF is dispensable in some chlamydial lineages, and its role in other Chlamydia-related organisms where it is conserved remains to be confirmed.
Figure 6

Comparative genomics of the phylum Chlamydiae. Identified homologs of proteins involved in nucleotide uptake (ntt), part of molecular machines (i.e., T3SS, flagellum), or part of selected metabolic pathways. For KEGG pathways/modules, the number indicates the non-redundant count of KEGG orthologs belonging to the corresponding pathway/module. Reference locus tags, KEGG and Pfam accessions are indicated in Table S1.

As obligate intracellular bacteria, Chlamydiae acquire essential nutrients from the cytosol of their host, including nucleotides and sugars such as ATP and glucose-6-phosphate. All but one partial genome encode at least two ADP/ATP antiporter homologs. These transporters are involved in nucleotide, ATP or nicotinamide adenine dinucleotide uptake from the host (Tjaden et al., 1999; Greub and Raoult, 2003; Haferkamp et al., 2004; Fisher et al., 2013). Furthermore, most genomes also encode a homolog of the glucose-6-phosphate transporter UhpC. The genomes of Simkania negevensis and several Parachlamydiaceae strains encode an apparently complete type IV secretion system (T4SS) located either on the chromosome or on a plasmid (Bertelli et al., 2016) that may be involved in conjugative DNA transfer (Greub et al., 2004). Several strains of Parachlamydia acanthamoebae encode remnants of the T4SS in their genome (Greub et al., 2009; Bertelli et al., 2016). No traces of T4SS genes could be found in any other chlamydial lineages (Figure 6). The T4SS of S. negevensis, Protochlamydia naegleriophila KNic and R. massiliensis is located on a plasmid, which may have facilitated its loss or acquisition in the various chlamydial lineages. Assignment of plasmid sequences to specific organisms within the context of a metagenomic dataset is extremely difficult. Chlamydial plasmids may thus be missing in some assemblies, including for the genome of Chlamydia sp. 32–34.
It was recently suggested that some marine Chlamydiae may possess a flagellum regulated by a chemotaxis system (Collingro et al., 2017). The identification of several flagellar and chemotaxis genes in highly divergent chlamydial genomes retrieved by single cell-genomics was interpreted as evidence that motility was an ancient feature of the Chlamydiae phylum (Collingro et al., 2017). Although most new uncultivated chlamydial clades were also sequenced from aquatic samples, none exhibited homologs of the flagellar genes identified in the genome assemblies SCGC AG-110-P3, AB-751-O23 and AG-110-M15 (Figure 6, groups 2, 6, and 7), questioning whether this flagellar system is as old as Collingro et al. hypothesized. Two proteins of the type III secretion system that are homologous to flagellar proteins, fliQ/sctS (K02420) and fliN/sctQ (K02417), were sometimes annotated as part of the flagellum (Figure 6) but did not branch with SCGC Chlamydiae homologs in phylogenetic reconstructions (see for example Figure S14). A chemotaxis system is also encoded in several chlamydial genomes lacking flagellar genes, including several Parachlamydiaceae (P. phocaeensis, P. naegleriophila, and P. acanthamoebae) and the two Criblamydiaceae genomes currently available. This system was thus proposed to be involved in signal transduction (Collingro et al., 2011). In the present analysis, chemotaxis genes were identified in the genome of Chlamydia sp. 38–26, which is closely related to Neochlamydia sp. strains that do not harbor a chemotaxis system. Overall, our data suggest that all Chlamydiae exhibit a similar obligate intracellular lifestyle, and that most are non-motile.

Conservation of division mechanisms across the phylum Chlamydiae

All described Chlamydiae exhibit a similar biphasic developmental cycle, including the most deeply branching clades (Abdelrahman and Belland, 2005; Steigen et al., 2013; Seth-Smith et al., 2017). The transcription factor Euo targets more than 100 promoters in W. chondrophila, and may be a key player in cell cycle regulation (Domman and Horn, 2015; de Barsy et al., 2016). It is conserved throughout the whole phylum, including S. epinepheli and “Ca. Novochlamydiaceae” (Figure 6), with the exception of three of the most incomplete genomes, probably owing to the partial data available. Interestingly, the genome assembly of S. epinepheli lacks all proteins involved in peptidoglycan biosynthesis (Figure 6) and all but three proteins involved in chlamydial division (Figure S13). Remnants of those proteins could not be found in the raw assembly using tBLASTn. Although the genome of this strain is only partially covered, peptidoglycan synthesis genes are located in various regions along the chromosome, which reinforces the likely absence of these genes from S. epinepheli. Indeed, this bacterium divides through a budding process (Seth-Smith et al., 2017), which suggests that the family Parilichlamydiaceae may present a division mechanism significantly different from that of other chlamydial lineages.

High metabolic diversity of the Chlamydiae phylum

Menaquinones are electron shuttles that allow the transfer of electrons between membrane-bound protein complexes in the electron transport chain (Meganathan, 2001). The Chlamydiaceae synthesize menaquinone through the alternative futalosine pathway (Hiratsuka et al., 2008; Barta et al., 2014). Other clades, including the deep branching “Ca. Novochlamydiaceae” (Figure 6), encode the traditional pathway, supporting the hypothesis that the futalosine pathway was horizontally acquired by the Chlamydiaceae lineage (Barta et al., 2014; Subtil et al., 2014). No menaquinone biosynthesis enzymes could be identified in the S. epinepheli genome (Figure 6). A homolog of MenB was identified (Figure 6) but it exhibits more similarity with enoyl-CoA hydratase and did not cluster with other chlamydial sequences in phylogenetic reconstructions, thus likely having another enzymatic role than MenB (Figure S15). Most genomes encode the necessary components of a minimal respiratory chain except the seven genomes of “Ca. Arenachlamydiaceae” (Figure 6). Those 1.4 Mb genomes are predicted to be nearly complete but do not encode the menaquinone biosynthesis pathway, nor any component of the respiratory chain, suggesting that they use alternative respiration systems. Overall, enzymes of the respiratory chain exhibit variable patterns of parallel losses across the entire phylum. The NADH-quinone oxidoreductase (complex I) is for instance restricted to the Parachlamydiaceae-Waddliaceae-Criblamydiaceae clade, and was lost at least three times independently in subclades (Figure 6). Few proteins homologous to complex I were identified in the low GC genomes recovered from estuary sediments and groundwater (“Ca. Limichlamydiaceae”). They encode a putative NADP-reducing hydrogenase exhibiting 44.5–56% amino acid identity to the hnd operon of Desulfovibrio fructosivorans (Malki et al., 1995; Figure S16). 
The F-type ATPase is also restricted to the Parachlamydiaceae-Waddliaceae-Criblamydiaceae clade, as opposed to the V-type ATPase that is conserved in all sequenced chlamydial genomes. The citrate cycle (TCA) is incomplete in all Chlamydiaceae due to the absence of three enzymes: citrate synthase (GltA), aconitase (Acn) and isocitrate dehydrogenase (Icd) (Omsland et al., 2014). Several Chlamydiae such as the three Neochlamydia sp. and the seven “Ca. Arenachlamydiaceae” genomes (repeatedly missing the same set of enzymes) exhibit an even less complete TCA (Figure 6). Indeed, no homologs of TCA enzymes could be identified in the S. epinepheli genome assembly. On the other hand, several homologs of the pentose phosphate pathway (PPP), glycolysis and glycogen metabolism were identified in all assemblies that are more than 90% complete (Figure 6). The various Parachlamydiaceae species exhibit significant differences in their ability to synthesize amino acids, with reduced biosynthetic abilities in Protochlamydia amoebophila strains and Neochlamydia strains and extended metabolic abilities for Parachlamydia strains (Figure S13). The three closely related Neochlamydia strains have genomes larger than 2.5 Mbp, but overall highly reduced metabolic capacities (Figures S17, S18). On the other hand, they encode large numbers of mobile genetic elements and repeat proteins (Figure S17). S. epinepheli has the most reduced metabolic capacities, but its genome is only about 70% complete. While the metabolism of amino acids, nucleotides, cofactors and vitamins was not investigated in detail, we generally observed that all genomes had limited predicted biosynthetic capabilities for all those compounds and that marked differences exist within the phylum (Figure S13), as already observed in previous comparative analyses (Bertelli et al., 2010, 2015; Collingro et al., 2011; Omsland et al., 2014).

Discussion

Shotgun metagenomics allows the reconstruction of genomes from complex microbial communities. It can yield draft and nearly complete genomes without the need for cultivation (Alneberg et al., 2014), making it possible to investigate the biology of uncultivable or difficult-to-culture organisms, including members of the phylum Chlamydiae. The comparative analysis of 21 metagenomic bins from public databases with previously described genomes allowed the identification of seven new candidate family level lineages, 11 new candidate genera and 12 new candidate species (Figure 3). Six of those new species exhibit nearly complete genomes (more than 90% complete). A reference phylogeny was reconstructed based on 108 single copy orthologs conserved in more than 90% of the genomes. Given the incomplete nature of the dataset and the high sequence composition bias of several genomes, the reconstructed phylogeny might not precisely reflect the true evolutionary relationships between the most deeply branching lineages. Nevertheless, this phylogeny is congruent with recently published phylogenies based on 16S rRNA and whole genome datasets (Figure 1; Gupta et al., 2015; Pillonel et al., 2015). While no information regarding host or lifestyle is available for the newly identified clades, genomic analyses support the hypothesis that they all share the same obligate intracellular lifestyle. They all encode a type III secretion system, homologs of known effector proteins and transporters involved in nutrient uptake. In addition, members of the Parilichlamydiaceae, the most deeply branching lineage in the current phylogeny, are fish pathogens that share the typical biphasic life cycle of other Chlamydiae (Steigen et al., 2013; Stride et al., 2013; Seth-Smith et al., 2017). Despite these similarities, some central mechanisms such as cell division might significantly differ in distantly-related clades such as S. epinepheli.
No homologs of most proteins involved in peptidoglycan biosynthesis and cell division could be identified in the S. epinepheli assembly. This partial genome also shows particularly limited capacities for the synthesis of amino acids, effectors, vitamins and nucleotides. Pairwise protein sequence comparisons revealed that S. epinepheli is highly divergent as compared to previously sequenced members of the phylum. This high sequence divergence might impair the correct identification of homologs based on amino acid sequence comparisons. The discovery and sequencing of novel intermediate species should facilitate the identification of groups of orthologs among such distantly-related organisms. It should clarify the origin of these highly different biosynthetic abilities and further differences in core chlamydial mechanisms. Several genomes predicted to be nearly complete showed little evidence of autonomous energy production capabilities. Those Chlamydiae might completely depend on their host for energy generation. Protein complexes of the respiratory chain are expressed at the end of the replication cycle (König et al., 2017). They might be essential at the extracellular stage. Chlamydial lineages which have lost the complete respiratory chain might not present the classical biphasic life-cycle of most Chlamydiae. Elementary bodies (EB) and cell lysis were never described for any of the three Acanthamoeba endosymbionts of the Neochlamydia genus (Ishida et al., 2014). The transition from a biphasic life-style to vertical transmission is expected to lead to major changes at the genomic level. Those three strains exhibit genomes more than twice as large as those of Chlamydiaceae spp., but highly reduced functional repertoires as compared to other Parachlamydiaceae (Figures S17, S18). They also harbor a high number of mobile genetic elements and repetitive elements (Figure S17; Domman et al., 2014).
Similar genome features were observed in recently established insect endosymbionts such as Serratia symbiotica and Sodalis glossinidius (Toh et al., 2006; Lamelas et al., 2011). Gene losses strongly affect carbohydrate biosynthetic pathways and the production of energy, as illustrated by the loss of multiple components of the respiratory chain (Figure 6). The reduction of the electron transport chain is associated with the near complete loss of the tricarboxylic acid (TCA) cycle in at least two distinct lineages (“Ca. Arenachlamydiaceae” and the three Neochlamydia spp.). The loss of glycogen metabolism is generally associated with parasitic bacteria (Henrissat et al., 2002). This is not the case for Chlamydiae (Omsland et al., 2014). Indeed, homologs of the enzymes involved in the biosynthesis of glycogen were identified in all new assemblies. One hypothesis to explain the conservation of glycogen biosynthesis in these obligate intracellular organisms is that these enzymes are effectors manipulating the host metabolism (Ball et al., 2013; Ball and Greub, 2015). Two-fold variations in genome GC content can be observed across the phylum. The genomes of two related species of “Ca. Limichlamydiaceae” exhibit a GC content lower than 30%, similarly to some primary and secondary insect symbionts such as Buchnera and Spiroplasma (Moran et al., 2008; Lo et al., 2016). The increasing genomic coverage of the phylum, including the small low GC genome clades and five different genera of the Parachlamydiaceae, indicates that gene losses occur in parallel in different chlamydial lineages, and that highly reduced genomes are not restricted to the Chlamydiaceae family. Nevertheless, additional complete genome sequences will be necessary to gain further knowledge on the variable patterns of genome evolution across the phylum. Family-level lineages of the phylum Chlamydiae exhibit a higher conservation of the 16S rRNA gene as compared to members of the Rickettsiales order.
Yet, the comparison of complete and nearly complete genomes suggests that chlamydial protein sequences show similar or increased levels of divergence for the most distantly related families. The distribution of RBBH identities seems to be more skewed toward lower values in the phylum Chlamydiae. Endosymbiosis is characterized by an acceleration of the rate of evolution (Itoh et al., 2002; Woolfit and Bromham, 2003; Kuo and Ochman, 2009). The Buchnera synonymous rate of evolution was for instance evaluated to be about twice that of low-codon-bias genes of Escherichia coli and Salmonella typhimurium (Clark et al., 1999). This is particularly due to an accelerated accumulation of mutations by genetic drift due to the small effective population size of those bacteria (Moran, 1996), but also to the loss of DNA repair mechanisms (Moran and Bennett, 2014). Members of the Chlamydia genus lack homologs of several repair proteins such as MutT, MutM, and MutH (Stephens et al., 1998). Several attempts were made to evaluate the mutation rate and the impact of genetic drift on members of the Chlamydiaceae family. They report a rather high dN/dS ratio for C. trachomatis and C. pneumoniae, which is indicative of an increased level of genetic drift (Rocha et al., 2006; Kuo et al., 2009; Joseph et al., 2012). Estimates of mutation rates differ severalfold between studies, and were recently estimated to be similar to those of free-living bacteria for C. trachomatis (Joseph et al., 2012; Hadfield et al., 2017). As several genomes are now available for several Chlamydia-related species, it would be interesting to estimate and compare the dN/dS ratio of other clades. The 16S rRNA, a gene under high purifying selection, is more conserved than protein sequences in the phylum Chlamydiae.
This difference is larger in the phylum Chlamydiae than in other intracellular bacteria such as the Rickettsiales and might indicate that the accelerated rate of evolution of Chlamydiae is at least partly due to an increase in mutation rate rather than genetic drift alone. Drift might also be counterbalanced by the strong selection pressure on rRNA (Woolfit and Bromham, 2003). Those parameters might nevertheless vary within the phylum itself. Members of the Parachlamydiales, which infect free-living amoebae, exhibit larger genomes and are expected to be less sensitive to genetic drift owing to their larger effective population size and reduced transmission bottlenecks as compared to vertebrate parasites of the Chlamydia and Similichlamydia genera. The skewed identity distribution is indeed more accentuated when members of the Chlamydiaceae and S. epinepheli are included in the comparison, two clades exhibiting genomes of highly reduced size. Considering the relatively high divergence of Chlamydiae genomes as compared to the 16S rRNA sequence, dedicated cutoffs are needed to properly interpret 16S rRNA based surveys of chlamydial diversity. Such cutoffs specific for members of the Chlamydiae phylum have been previously proposed (Pillonel et al., 2015). Gupta and colleagues recently proposed to split the phylum Chlamydiae into the orders Chlamydiales and Parachlamydiales (Figure 3; Gupta et al., 2015). They also recognized the existence of the deep branching “Ca. Parilichlamydiaceae” clade, but included it in the Parachlamydiales order. The current work confirms the high level of divergence of the “Ca. Parilichlamydiaceae,” in addition to a potential new deeply branching lineage (“Ca. Novochlamydiaceae”). In addition, they appear to exhibit fundamental differences in division mechanisms that underline the need to create additional orders in the taxonomical classification for those deep branching lineages.
The data presented in this study support the division of the Chlamydiae phylum into four orders: the Parachlamydiales, Chlamydiales, “Candidatus Parilichlamydiales” and “Candidatus Novochlamydiales.” While much work is still needed to fully comprehend the biology and diversity of members of the phylum Chlamydiae, this work provides a first insight into the metabolic and genetic diversity of what could be the most ancient and diverse clade of intracellular parasites of eukaryotes.

Author contributions

TP designed the study, performed the analyses and wrote the manuscript. TP and CB contributed to the interpretation of the results and the writing of the manuscript. GG contributed to the design of the study and the writing of the manuscript.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
References

1.  Sequence evolution in bacterial endosymbionts having extreme base compositions.

Authors:  M A Clark; N A Moran; P Baumann
Journal:  Mol Biol Evol       Date:  1999-11       Impact factor: 16.240

2.  Massive genome erosion and functional adaptations provide insights into the symbiotic lifestyle of Sodalis glossinidius in the tsetse host.

Authors:  Hidehiro Toh; Brian L Weiss; Sarah A H Perkin; Atsushi Yamashita; Kenshiro Oshima; Masahira Hattori; Serap Aksoy
Journal:  Genome Res       Date:  2005-12-19       Impact factor: 9.043

3.  FastTree 2--approximately maximum-likelihood trees for large alignments.

Authors:  Morgan N Price; Paramvir S Dehal; Adam P Arkin
Journal:  PLoS One       Date:  2010-03-10       Impact factor: 3.240

Review 4.  Twenty years of research into Chlamydia-like organisms: a revolution in our understanding of the biology and pathogenicity of members of the phylum Chlamydiae.

Authors:  Alyce Taylor-Brown; Lloyd Vaughan; Gilbert Greub; Peter Timms; Adam Polkinghorne
Journal:  Pathog Dis       Date:  2014-12-04       Impact factor: 3.166

5.  Taxogenomics of the order Chlamydiales.

Authors:  Trestan Pillonel; Claire Bertelli; Nicolas Salamin; Gilbert Greub
Journal:  Int J Syst Evol Microbiol       Date:  2015-01-29       Impact factor: 2.747

6.  International Committee on Systematics of Prokaryotes Subcommittee on the taxonomy of Chlamydiae. Minutes of the closed meeting, 31 March 2015, New Orleans, USA.

Authors:  Gilbert Greub
Journal:  Int J Syst Evol Microbiol       Date:  2017-02-24       Impact factor: 2.747

7.  Two nucleotide transport proteins in Chlamydia trachomatis, one for net nucleoside triphosphate uptake and the other for transport of energy.

Authors:  J Tjaden; H H Winkler; C Schwöppe; M Van Der Laan; T Möhlmann; H E Neuhaus
Journal:  J Bacteriol       Date:  1999-02       Impact factor: 3.490

8.  Reassessing the role of the secreted protease CPAF in Chlamydia trachomatis infection through genetic approaches.

Authors:  Emily A Snavely; Marcela Kokes; Joe Dan Dunn; Hector A Saka; Bidong D Nguyen; Robert J Bastidas; Dewey G McCafferty; Raphael H Valdivia
Journal:  Pathog Dis       Date:  2014-05-16       Impact factor: 3.166

Review 9.  Tracing the primordial Chlamydiae: extinct parasites of plants?

Authors:  Agathe Subtil; Astrid Collingro; Matthias Horn
Journal:  Trends Plant Sci       Date:  2013-11-08       Impact factor: 18.313

10.  Biphasic Metabolism and Host Interaction of a Chlamydial Symbiont.

Authors:  Lena König; Alexander Siegl; Thomas Penz; Susanne Haider; Cecilia Wentrup; Julia Polzin; Evelyne Mann; Stephan Schmitz-Esser; Daryl Domman; Matthias Horn
Journal:  mSystems       Date:  2017-05-30       Impact factor: 6.496
