Jean-Baptiste Leducq, David Sneddon, Malia Santos, Domitille Condrain-Morel, Geneviève Bourret, N Cecilia Martinez-Gomez, Jessica A Lee, James A Foster, Sergey Stolyar, B Jesse Shapiro, Steven W Kembel, Jack M Sullivan, Christopher J Marx.
Abstract
Methylobacterium is a group of methylotrophic microbes associated with soil, fresh water, and particularly the phyllosphere (the aerial part of plants). The group has been well studied in terms of physiology, but its evolutionary history and taxonomy remain unclear. Recent work has suggested that Methylobacterium is much more diverse than previously thought, calling into question its status as an ecologically and phylogenetically coherent taxonomic genus. However, taxonomic and evolutionary studies of Methylobacterium have mostly been restricted to model species, often isolated from habitats other than the phyllosphere, and have yet to use comprehensive phylogenomic methods to examine gene trees, gene content, or synteny. By analyzing 189 Methylobacterium genomes from a wide range of habitats, including the phyllosphere, we inferred a robust phylogenetic tree while explicitly accounting for the impact of horizontal gene transfer (HGT). We showed that Methylobacterium contains four evolutionarily distinct groups of bacteria (namely A, B, C, and D), characterized by different genome sizes, GC content, gene content, and genome architecture, revealing the dynamic nature of Methylobacterium genomes. In addition to recovering 59 described species, we identified 45 candidate species, mostly phyllosphere-associated, stressing the significance of plants as a reservoir of Methylobacterium diversity. We inferred an ancient transition from a free-living lifestyle to association with plant roots in the Methylobacteriaceae ancestor, followed by phyllosphere association of three of the major groups (A, B, D), whose early branching in Methylobacterium history has been heavily obscured by HGT. Together, our work lays the foundations for a thorough redefinition of Methylobacterium taxonomy, beginning with the abandonment of Methylorubrum.
Keywords: Methylobacterium; Methylorubrum; core genome; genome architecture; horizontal gene transfers; lineage tree; phyllosphere; species concept in bacteria; species tree
Year: 2022 PMID: 35906926 PMCID: PMC9364378 DOI: 10.1093/gbe/evac123
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 4.065
Fig. 1. Methylobacteriaceae lineage trees inferred from 213 genomes. (a) Best tree from RAxML ML search on the concatenated alignments of 384 core gene nucleotide sequences (GTRCAT model, 512 replicated trees), rooted on Microvirga and Enterovirga (gray). Correspondence with clades described by previous studies is indicated. (b) ASTRAL tree inferred from 384 core gene ML trees. Each gene ML tree was inferred assuming a GTRgamma model (1,000 replicated trees; nodes with less than 10% support collapsed) and combined in ASTRAL-III. Branch lengths are in coalescent units. Nodal support values represent local posterior probability. (c) SVD quartet tree inferred from the concatenated alignments of 384 core gene nucleotide sequences. Nodes supported by less than 75% of quartets were collapsed. (d) Main isolation sources of species from Methylobacterium groups and Microvirga (see table 1). For each group, ordered according to a consensus tree (see panels b and c), the number of species is indicated in parentheses.
Description and Isolation Source of 104 Methylobacterium Species Distributed in the Four Main Phylogenetic Groups
| Group | Description | Species | Genomes | Plant (phyllosphere) | Plant (rhizosphere) | Plant (other) | Water, sediments | Soil | Other | Anthropogenic environments* |
|---|---|---|---|---|---|---|---|---|---|---|
| | Described species | 14 | 33 | 30% | 1% | 1% | 13% | 19% | 35% | 29% |
| | Candidate species | 7 | 8 | 50% | — | 7% | 29% | — | 14% | 29% |
| | All species | 21 | 41 | 37% | 1% | 3% | 18% | 13% | 28% | 29% |
| | Described species | 24 | 45 | 30% | 1% | 4% | 28% | 17% | 20% | 40% |
| | Candidate species | 17 | 36 | 80% | 10% | 10% | — | — | — | — |
| | All species | 41 | 81 | 51% | 4% | 7% | 16% | 10% | 12% | 23% |
| | Described species | 6 | 10 | 47% | — | 17% | 20% | — | 17% | 17% |
| | Candidate species | 17 | 32 | 74% | 6% | — | 21% | — | — | — |
| | All species | 23 | 42 | 67% | 4% | 4% | 20% | — | 4% | 4% |
| | Described species | 15 | 21 | 8% | 10% | 7% | 27% | 33% | 15% | 55% |
| | Candidate species | 4 | 4 | — | 25% | 25% | — | 25% | 25% | 25% |
| | All species | 19 | 25 | 7% | 13% | 11% | 21% | 32% | 17% | 49% |
| | Described species | 59 | 109 | 26% | 3% | 5% | 23% | 20% | 22% | 39% |
| | Candidate species | 45 | 80 | 66% | 8% | 7% | 12% | 2% | 4% | 7% |
| | All species | 104 | 189 | 43% | 5% | 6% | 18% | 12% | 14% | 25% |
| | | 18 | 24 | — | 33% | 6% | 17% | 33% | 11% | 6% |
| | | 2 | 2 | — | — | — | — | 50% | 50% | — |
For each group, the number of species, the number of genomes, and the proportion of genomes isolated from each main category of environment are given separately for described and candidate species (numbered from Methylobacterium sp. 001 to 045). Proportions were corrected by the number of genomes per species. *Anthropogenic environments include several other isolation sources. —, no observation.
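The correction of proportions by the number of genomes per species can be illustrated with a minimal sketch. It assumes, as one plausible reading of the table note, that each genome is weighted by the inverse of the number of genomes available for its species, so that heavily sampled species do not dominate. The function name `corrected_proportions` and the toy species and source labels are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict

def corrected_proportions(genomes):
    """Proportion of each isolation source, corrected for sampling depth.

    genomes: iterable of (species, source) pairs, one pair per genome.
    Assumption: each genome is weighted by 1 / (genomes in its species),
    so every species contributes a total weight of 1.
    """
    genomes = list(genomes)
    per_species = Counter(species for species, _ in genomes)
    weights = defaultdict(float)
    for species, source in genomes:
        weights[source] += 1.0 / per_species[species]
    total = sum(weights.values())  # equals the number of species
    return {source: w / total for source, w in weights.items()}

# Toy data (hypothetical): one species with three genomes, one with a single genome.
example = [
    ("sp_x", "phyllosphere"),
    ("sp_x", "phyllosphere"),
    ("sp_x", "soil"),
    ("sp_y", "water"),
]
props = corrected_proportions(example)
```

Without the correction, the phyllosphere would account for half of the four toy genomes; with it, the three genomes of `sp_x` share a single unit of weight.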
Methylobacteriaceae Genome Characteristics (Average and Standard Deviation Per Group)
| Group | Genomes | Species | Size (Mb) | Annotations | Unique Annotations | Estimated Copy Number | Mobile Elements | % GC |
|---|---|---|---|---|---|---|---|---|
| | 81 | 41 | 6.21 ± 0.59 | 6907 ± 821 | 2696 ± 134 | 1.457 ± 0.067 | 60 ± 34 | 70.1 ± 0.8 |
| | 41 | 21 | 5.58 ± 0.49 | 5766 ± 509 | 2706 ± 173 | 1.365 ± 0.048 | 57 ± 31 | 69.1 ± 0.8 |
| | 25 | 19 | 7.15 ± 0.66 | 7670 ± 956 | 2899 ± 122 | 1.542 ± 0.066 | 71 ± 49 | 71.1 ± 0.7 |
| | 42 | 23 | 4.99 ± 0.35 | 5224 ± 476 | 2421 ± 82 | 1.312 ± 0.042 | 42 ± 21 | 68.8 ± 1.1 |
| | 2 | 2 | 4.91 ± 0.36 | 5128 ± 182 | 2321 ± 14 | 1.414 ± 0.019 | 14 ± 10 | 68.8 ± 0.1 |
| | 22 | 18 | 5.92 ± 1.74 | 6834 ± 2929 | 2495 ± 251 | 1.471 ± 0.173 | 128 ± 147 | 63.9 ± 1.6 |
GC content was estimated from coding sequences. Hypothetical proteins, mobile elements, and repeat elements were excluded from unique annotation counts and estimated copy numbers.
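As an illustration of estimating GC content from coding sequences, the following minimal sketch pools all coding sequences of a genome and reports the percentage of G and C among unambiguous bases. The handling of ambiguous bases and the function name `gc_content` are assumptions made for this example. The source does not describe the exact procedure.

```python
def gc_content(coding_sequences):
    """Percent G+C over all coding sequences, ignoring ambiguous bases (assumption)."""
    gc = at = 0
    for seq in coding_sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        at += seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / (gc + at)

# Two toy coding sequences (hypothetical), 12 bases each.
cds = ["ATGGCGCCGTAA", "ATGCCCGGGTGA"]
percent_gc = gc_content(cds)
```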
Fig. 2. Gene content comparison among the four main Methylobacterium groups. (a) Occurrence of 10,187 gene annotations (rows) in 124 Methylobacteriaceae species (average occurrence per species; columns, ordered according to the ASTRAL species tree, left) and in four Methylobacterium groups and two outgroups (mean occurrence among species within groups; legend in bottom right). (b and c) Venn diagrams showing the overlap of pan genomes (b) and core genomes (c) among the four groups. Pan and core genome sizes were estimated assuming 15 species per group (mean and standard deviation over 100 random resamplings of 15 species per group). (d) RAxML ML best tree based on annotation occurrence per genome (BINCAT model, 1,001 replicate trees). Main groups are shown and are monophyletic in the gene content tree, except group A: clade A2 (Alessa et al. 2021) and M. jeotgali branched outside group A.
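The resampling used to compare pan and core genome sizes at equal sampling depth can be sketched as follows. This is a simplified illustration and not the authors' pipeline: it assumes each species is summarized as a set of annotations (presence or absence), whereas the caption refers to average occurrence per species. The function name `pan_core_sizes`, its parameters, and the toy data are hypothetical.

```python
import random
import statistics

def pan_core_sizes(species_genes, n_species=15, n_resamples=100, seed=0):
    """Mean and SD of pan and core genome sizes over random species subsets.

    species_genes: dict mapping species name -> set of gene annotations.
    Returns ((pan_mean, pan_sd), (core_mean, core_sd)).
    """
    rng = random.Random(seed)
    names = sorted(species_genes)
    if len(names) < n_species:
        raise ValueError("group has fewer species than n_species")
    pan_sizes, core_sizes = [], []
    for _ in range(n_resamples):
        subset = rng.sample(names, n_species)  # sampling without replacement
        gene_sets = [set(species_genes[name]) for name in subset]
        pan_sizes.append(len(set().union(*gene_sets)))               # found in any species
        core_sizes.append(len(gene_sets[0].intersection(*gene_sets[1:])))  # found in all species
    return ((statistics.mean(pan_sizes), statistics.pstdev(pan_sizes)),
            (statistics.mean(core_sizes), statistics.pstdev(core_sizes)))

# Toy group (hypothetical) with three species, resampled two at a time.
toy = {"s1": {"a", "b", "c"}, "s2": {"a", "b", "d"}, "s3": {"a", "e"}}
(pan_mean, pan_sd), (core_mean, core_sd) = pan_core_sizes(toy, n_species=2)
```

Fixing the number of species per subset keeps groups with many sampled species from showing inflated pan genomes or deflated core genomes simply because more genomes were available.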
Average and Standard Deviation in Gene Content Dissimilarity (BC Index, Hellinger Transformation on Gene Occurrence Per Genome) Per and Among Methylobacterium Group and Outgroups
| Group | | | | | | | |
|---|---|---|---|---|---|---|---|
| | 0.06 ± 0.02 | 0.19 ± 0.05 | | | | | |
| | 0.08 ± 0.02 | 0.17 ± 0.05 | 0.28 ± 0.02 | | | | |
| | 0.07 ± 0.03 | 0.20 ± 0.04 | 0.29 ± 0.02 | 0.34 ± 0.02 | | | |
| | 0.04 ± 0.02 | 0.16 ± 0.03 | 0.24 ± 0.03 | 0.24 ± 0.03 | 0.32 ± 0.02 | | |
| | — | 0.28 | 0.39 ± 0.01 | 0.41 ± 0.02 | 0.36 ± 0.02 | 0.37 ± 0.01 | |
| | 0.03 ± 0.04 | 0.26 ± 0.04 | 0.41 ± 0.02 | 0.44 ± 0.02 | 0.38 ± 0.02 | 0.40 ± 0.02 | 0.36 ± 0.03 |
—, no observation.
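The dissimilarity measure named in the table title can be sketched in a few lines, assuming that "BC index" denotes the Bray-Curtis dissimilarity and that the Hellinger transformation takes the square root of each gene occurrence divided by the genome's total occurrence. Function names and toy vectors are hypothetical. The actual analysis may have used a dedicated ecology package.

```python
import math

def hellinger(occurrences):
    """Hellinger transformation: sqrt of each value divided by the row total."""
    total = sum(occurrences)
    if total == 0:
        raise ValueError("genome has no gene occurrences")
    return [math.sqrt(value / total) for value in occurrences]

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two non-negative vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator

# Toy gene occurrence vectors (hypothetical), one entry per annotation.
genome_1 = [1, 1, 0]
genome_2 = [2, 2, 0]   # same relative content as genome_1
genome_3 = [0, 0, 2]   # no annotation shared with genome_1

d_same = bray_curtis(hellinger(genome_1), hellinger(genome_2))
d_diff = bray_curtis(hellinger(genome_1), hellinger(genome_3))
```

Under these assumptions, values near 0 indicate genomes with nearly identical relative gene content and values near 1 indicate genomes sharing almost no annotations.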
Average and Standard Deviation in Core Genome Synteny (SI) Per and Among Methylobacterium Group and Outgroups
| Group | | | | | | | |
|---|---|---|---|---|---|---|---|
| | 0.98 ± 0.03 | 0.70 ± 0.17 | | | | | |
| | 0.99 ± 0.01 | 0.77 ± 0.21 | 0.46 ± 0.03 | | | | |
| | 0.91 ± 0.06 | 0.61 ± 0.12 | 0.43 ± 0.03 | 0.44 ± 0.01 | | | |
| | 0.98 ± 0.02 | 0.73 ± 0.11 | 0.53 ± 0.05 | 0.51 ± 0.02 | 0.48 ± 0.01 | | |
| | — | 0.47 | 0.39 ± 0.03 | 0.39 ± 0.02 | 0.38 ± 0.02 | 0.42 ± 0.03 | |
| | 1.00 ± 0.00 | 0.91 ± 0.05 | 0.49 ± 0.03 | 0.48 ± 0.01 | 0.46 ± 0.01 | 0.52 ± 0.02 | 0.52 ± 0.05 |
—, no observation.
Fig. 3. Core genome architecture comparison (synteny) among Methylobacteriaceae genomes. (a) Consensus map of the Methylobacterium core genome architecture, and major rearrangements within and among Methylobacterium groups, using the M. planium YIM132548 core genome as a reference. The map was drawn as a network using 384 core genes as nodes, and links among neighbor core genes as edges. Only the 389 links that were observed in a majority (>50%) of species from a given Methylobacterium group are shown (Venn diagram on top right; 5,720 links discarded). Bold lines indicate links mostly conserved in group A, colored according to their dominance in other groups (legend on bottom right). Thick lines indicate links mostly absent in group A but dominant in other groups. A syntenic island conserved in most Methylobacterium genomes and containing ribosomal genes and the gene rpoB is indicated (dotted frame). (b) RAxML ML best tree based on link occurrence per genome (6,109 links; BINCAT model, 1,001 replicate trees). Main groups are shown and are monophyletic in the synteny tree. (c) Detailed synteny plot for the comparison of core genome architecture between seven species from group A and six species from group D (best assembled genome per species). For each pairwise comparison, core genes (black points) are ordered according to their relative position in the species 1 genome (x-axis) and are compared with their relative positions in the species 2 genome (y-axis). Each plot is colored according to the SI value between species 1 and 2 (scale on top right).
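The network representation in panel (a), with core genes as nodes and links between neighboring core genes as edges, can be sketched as follows. The example assumes a single circular chromosome per genome and treats links as unordered pairs. It does not reproduce the SI statistic, whose definition is not given in this section. Function names and the toy gene orders are hypothetical.

```python
from collections import Counter

def neighbor_links(gene_order, circular=True):
    """Set of unordered links between adjacent core genes in one genome.

    Assumption: genes lie on a single replicon; with circular=True the last
    gene is also linked to the first.
    """
    n = len(gene_order)
    last = n if circular else n - 1
    links = set()
    for i in range(last):
        a, b = gene_order[i], gene_order[(i + 1) % n]
        if a != b:
            links.add(frozenset((a, b)))
    return links

def majority_links(orders, threshold=0.5):
    """Links observed in more than `threshold` of the genomes of a group."""
    counts = Counter()
    for order in orders:
        counts.update(neighbor_links(order))
    return {link for link, c in counts.items() if c / len(orders) > threshold}

# Toy group (hypothetical): three genomes sharing five core genes.
group = [
    ["a", "b", "c", "d", "e"],
    ["a", "b", "c", "e", "d"],   # local swap of d and e
    ["a", "c", "b", "d", "e"],   # local swap of b and c
]
conserved = majority_links(group)
```

In this toy group, nine distinct links are observed but only four occur in more than half of the genomes, mirroring how rare rearrangement-specific links are filtered out of a consensus map.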