Guillaume Borrel, Nicolas Parisot, Hugh M B Harris, Eric Peyretaillade, Nadia Gaci, William Tottey, Olivier Bardot, Kasie Raymann, Simonetta Gribaldo, Pierre Peyret, Paul W O'Toole, Jean-François Brugère.
Abstract
BACKGROUND: A seventh order of methanogens, the Methanomassiliicoccales, has been identified in diverse anaerobic environments including the gastrointestinal tracts (GIT) of humans and other animals and may contribute significantly to methane emission and global warming. Methanomassiliicoccales are phylogenetically distant from all other orders of methanogens and belong to a large evolutionary branch composed of lineages of non-methanogenic archaea such as Thermoplasmatales, the Deep Hydrothermal Vent Euryarchaeota-2 (DHVE-2, Aciduliprofundum boonei) and the Marine Group-II (MG-II). To better understand this new order and its relationship to other archaea, we manually curated and extensively compared the genome sequences of three Methanomassiliicoccales representatives derived from human GIT microbiota, "Candidatus Methanomethylophilus alvus", "Candidatus Methanomassiliicoccus intestinalis" and Methanomassiliicoccus luminyensis.
Year: 2014 PMID: 25124552 PMCID: PMC4153887 DOI: 10.1186/1471-2164-15-679
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Genome statistics
| Feature | "Ca. M. alvus" | "Ca. M. intestinalis" | M. luminyensis |
|---|---|---|---|
| Genome size^a | 1,666,795 | 1,931,651 | 2,637,810^d (2,620,233) |
| DNA G + C content | 55.6% | 41.3% | 60.5% |
| % DNA coding region | 89.5% | 88.4% | 87.6% |
| Intergenic regions mean size (SD)^a | 102 (175) | 119 (264) | 121 (238) |
| Genes mean G + C content | 56.3% | 42.4% | 61.0% |
| Putative replicons | 1 (+1)^b | 1 (+1)^b | 1 (+1)^b |
| Extrachromosomal elements | NA^c | NA^c | NA^c |
| Total genes | 1,705 | 1,882 | 2,713 |
| RNA genes | 52 | 50 | 52 |
| rRNA genes (5S-16S-23S) | 4 (2 - 1 - 1) | 4 (2 - 1 - 1) | 4 (2 - 1 - 1) |
| tRNA genes | 48 | 46 | 48 |
| Protein coding genes | 1,653 | 1,832 | 2,661 |
| Mean size of protein coding genes (SD)^a | 901 (667) | 930 (890) | 859 (676) |
| Median size of protein coding genes^a | 771 | 780 | 732 |
| Gene products with function prediction | 1,335 | 1,476 | 2,002 |
| Gene products assigned to arCOGs | 1,271 | 1,438 | 2,065 |
| Gene products assigned Pfam domains | 123 | 125 | 204 |
| Gene products with signal peptides | 247 | 336 | 512 |
| Gene products with transmembrane helices | 281 | 389 | 585 |
| CRISPR repeats | 1^e | 1 | 1 |

^a Sizes are given in bp.
^b Presence of two different cdc6 genes per genome. See the text for more information.
^c Not available.
^d Data from [8]; the value in brackets is the total bp (26 contigs) available from the database [GenBank: CAJE01000001 to CAJE01000026] and analyzed in this study.
^e Presence of CRISPR repeats split into two neighboring loci (see Additional file 1: Table S3) surrounding a DNA sequence containing one gene encoding a putative transposase.
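The gene counts in the table above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming that RNA genes are the sum of rRNA and tRNA genes and that total genes are the sum of RNA genes and protein coding genes; the dictionary layout and the function name `is_consistent` are illustrative and not part of the original study.

```python
# Gene counts copied from the genome statistics table (per genome).
table1 = {
    "Ca. M. alvus": {"total": 1705, "rna": 52, "rrna": 4, "trna": 48, "cds": 1653},
    "Ca. M. intestinalis": {"total": 1882, "rna": 50, "rrna": 4, "trna": 46, "cds": 1832},
    "M. luminyensis": {"total": 2713, "rna": 52, "rrna": 4, "trna": 48, "cds": 2661},
}


def is_consistent(row):
    """Assumed relations: RNA = rRNA + tRNA and total = RNA + protein coding."""
    rna_ok = row["rna"] == row["rrna"] + row["trna"]
    total_ok = row["total"] == row["rna"] + row["cds"]
    return rna_ok and total_ok


checks = {name: is_consistent(row) for name, row in table1.items()}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

Under these assumptions the three columns are internally consistent.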
Figure 1. Genomic features of ribosomal genes in Euryarchaeota. (A) Phylogeny of Euryarchaeota highlighting the position of the Methanomassiliicoccales (according to [16]). The seven orders of methanogens are in red. (B) Genomic organization of ribosomal genes in Euryarchaeota: 5S, 16S and 23S rRNA genes are symbolized by blue, green and orange arrows, respectively. They are shown irrespective of the (+) or (-) DNA strand carrying them. A plain line defines an operon organization; tRNAs (when present) are not shown, nor is the number of genes encoding rRNA, with the exception of the Methanomassiliicoccales. The 5S rRNA gene in brackets refers to a second 5S rRNA copy isolated from the 16S-23S-5S rRNA gene operon in Methanococcus maripaludis C5.
ORBs motifs found in the Methanomassiliicoccales genomes
| ORB | Sequence | Position | Spacing | Orientation | Comment |
|---|---|---|---|---|---|
| “ | | 78 - 99 | 39 | inverted | downstream |
| “ | | 138 - 159 | | inverted | downstream |
| “ | T | 1977 - 1998 | 47 | | upstream |
| “ | | 2045 - 2066 | | | upstream |
| “ | A | 15 - 36 | 256 | inverted | downstream |
| “ | T | 292 - 313 | | | downstream |
| “ | | 795626 - 795647 | | | downstream |
| “ | TC | 1576211 - 1576232 | | inverted | downstream fused |
| | | 73488 - 73475^b | 113 | | downstream |
| | | 73341 - 73362^b | | inverted | downstream |
| Methanomassiliicoccales consensus ORB | | | | | |
| Archaea consensus ORB | C | Pelve |
Bases in bold indicate consensual bases of the ORB sequence in the Methanomassiliicoccales. The “Ca. M. alvus” ORBs, and the ORB2 of M. luminyensis and “Ca. M. intestinalis”, might be extended by a “GGGGGT” sequence that is otherwise not conserved in the 4 other Methanomassiliicoccales ORBs or in the Archaea consensus ORB.
^a Not found in close association to another ORB.
^b Contig [GenBank: CAJE01000021.1].
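The Spacing column appears to equal the gap between the end of one ORB and the start of its paired ORB. The sketch below checks that reading against the four ORB pairs listed in the table; the interpretation of "spacing" and the helper name `spacing` are assumptions made for illustration, not definitions taken from the source.

```python
def spacing(orb1, orb2):
    """Gap between two ORBs given as (start, end) coordinates.

    Coordinates are sorted first, so an ORB reported in reverse
    order (end before start) is handled the same way.
    """
    first, second = sorted((tuple(sorted(orb1)), tuple(sorted(orb2))))
    return second[0] - first[1]


# (ORB coordinates, ORB coordinates, spacing reported in the table)
pairs = [
    ((78, 99), (138, 159), 39),
    ((1977, 1998), (2045, 2066), 47),
    ((15, 36), (292, 313), 256),
    ((73488, 73475), (73341, 73362), 113),
]

for orb1, orb2, reported in pairs:
    print(orb1, orb2, "computed:", spacing(orb1, orb2), "reported:", reported)
```

Under this reading, the computed gap matches the reported spacing for each of the four pairs.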
DNA replication proteins compared to the corresponding components in Thermoplasmatales, MG-II and DHVE-2
| Protein | "Ca. M. alvus" | "Ca. M. intestinalis" | M. luminyensis | MG-II | DHVE-2 | Thermoplasmatales |
|---|---|---|---|---|---|---|
| ATP-dependent DNA ligase | AGI85913 | AGN25909 | WP_019176428 | X | ■ | ■ |
| Orc1/Cdc6 | AGI84758 (1) | AGN25419 (1) | WP_019178385 (1) | ■■ | ■ | ■■ |
| | AGI85775 (2) | AGN27158 (2) | WP_019178317 (2) | | | |
| DNA Pol D large subunit (DPL) | AGI85099 | AGN26720 | WP_019177373 | ■ | ■ | ■ |
| DNA Pol D small subunit (DPS) | AGI84772 | AGN27082 | WP_019178373 | ■ | ■ | ■ |
| FEN-1 | AGI85207 | AGN26626 | WP_019176843 | ■ | ■ | ■ |
| GINS 51 | AGI84890 | AGN27100 | X | X | ■ | ■ |
| GINS 23 | X | X | X | X | X | X |
| DNA Gyrase subunit B | [AGI86382] | [AGY50228] | [WP_019178436] | [■] | [■] | [■] |
| DNA Gyrase subunit A | [AGI86381] | [AGN27159] | [WP_019178437] | [■] | [■] | [■] |
| MCM | AGI86392 | AGN26346 | WP_019178416 | ■ | ■ | ■ |
| PCNA | AGI84935 | AGN27068 | WP_019176118 | ■ | ■ | ■ |
| DNA Pol B | AGI86264 | AGN26701 | WP_019177962 | ■ | ■ | ■ |
| Primase large subunit (PriL) | AGI84820 | AGN27177 | WP_019178297 | ■ | ■ | ■ |
| Primase small subunit (PriS) | AGI86400 | AGY50234 | WP_019178400 | ■ | ■ | ■ |
| RFC large subunit | AGI85559 | AGN26596 | WP_019176873 | ■ | ■ | ■ |
| RFC small subunit | AGI85778 | AGN26166 | WP_019177244 | ■ | ■ | ■ |
| RNaseH II | AGI86158 | AGN25790 | WP_019177553 | ■ | ■ | ■ |
| TopoVI subunit A | AGI85998 | AGN26743 | WP_019177592 | ■ | ■ | X |
| TopoVI subunit B | AGI85997 | AGN26742 | WP_019177591 | ■ | ■ | X |
| Topo IB | X | X | X | X | X | X |
| SSB | X | X | X | X | ■ | ■ |
| RPA2 | AGI84916 | AGN25568 | WP_019178149 | ■ | ■ | ■ |
| rpa2A (rp associated protein) | AGI84915 | AGN25567 | WP_019178150 | ■ | ■ | ■ |
| NAD-dependent DNA ligase | [AGI85455] | X | X | [■] | X | X |
Proteins in brackets indicate horizontal transfers from bacteria; Proteins in italics indicate fast evolving additional copies likely representing decaying paralogs, genes horizontally transferred among archaea, or homologs arising from integration of foreign elements. Absent proteins (or unavailable due to genome incompleteness) are indicated by an X. (1) and (2) in front of the Orc1/Cdc6 protein accession numbers indicate the Orc1/Cdc6.1 and Orc1/Cdc6.2, respectively.
Figure 2. Genomic regions surrounding the (A) and (B) genes in the three genomes of Methanomassiliicoccales. Each homologous gene (i.e. showing more than 30% amino acid identity and an e-value < 10⁻⁵ when analyzed by blast against each other) from the 2 regions of the 3 genomes is colored differently and connected with shading. The black arrows represent genes involved in the replication process. The grey arrows represent other genes of various function with no homologue detected on the corresponding region of the 2 other genomes. “Hypoth.” refers to genes encoding hypothetical proteins.
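The homology criterion in the Figure 2 caption (more than 30% amino acid identity and an e-value below 10⁻⁵) can be applied as a simple filter on BLAST hits. The sketch below assumes the hits are available in the standard 12-column BLAST tabular format (`-outfmt 6`); the source does not state which output format was used, and the gene identifiers and example rows are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity (column 3 of tabular output)
    evalue: float    # e-value (column 11 of tabular output)


def parse_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse tab-separated BLAST rows, skipping blank and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[2]), float(fields[10]))


def is_homolog(hit: Hit, min_identity: float = 30.0, max_evalue: float = 1e-5) -> bool:
    """Criterion from the caption: identity > 30% and e-value < 1e-5."""
    return hit.identity > min_identity and hit.evalue < max_evalue


# Hypothetical rows, for illustration only.
example = [
    "geneA\tgeneB\t45.2\t300\t150\t4\t1\t300\t1\t298\t2e-60\t210",
    "geneA\tgeneC\t28.0\t120\t80\t3\t5\t124\t9\t130\t1e-08\t55.1",
    "geneD\tgeneE\t62.5\t90\t30\t1\t1\t90\t1\t90\t3e-04\t40.0",
]
homologs = [h for h in parse_hits(example) if is_homolog(h)]
print([(h.query, h.subject) for h in homologs])
```

Only the first hypothetical hit passes both thresholds; the second fails on identity and the third on e-value.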
Figure 3. Shared and unique CDS among the three genomes. Venn diagram indicating the core genome at its center, deduced from a BLAST analysis of the CDS from the 3 genomes of the Methanomassiliicoccales. Unique and shared CDS among genome pairs are also given.
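The regions of such a three-genome Venn diagram can be derived with set operations once each CDS has been assigned to a homolog family. The following is a minimal sketch, assuming family assignment has already been done (for example from BLAST results); the function name `venn_counts`, the genome labels and the toy family identifiers are hypothetical, and the pairwise logic is only valid for exactly three genomes.

```python
from itertools import combinations


def venn_counts(families):
    """Count core, pairwise-only and unique families for three genomes.

    `families` maps a genome name to the set of family identifiers
    found in that genome.
    """
    names = sorted(families)
    core = set.intersection(*(families[n] for n in names))
    counts = {"core": len(core)}
    # With three genomes, a pair's intersection minus the core is
    # exactly what those two genomes share and the third lacks.
    for a, b in combinations(names, 2):
        counts[f"{a}&{b} only"] = len((families[a] & families[b]) - core)
    for n in names:
        others = set().union(*(families[m] for m in names if m != n))
        counts[f"{n} only"] = len(families[n] - others)
    return counts


# Toy data, for illustration only.
toy = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 5},
    "C": {1, 3, 5, 6},
}
counts = venn_counts(toy)
print(counts)
```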
Proteome of the three Methanomassiliicoccales representatives compared to their phylogenetic neighbors, human gut methanogens and NCBI nr proteins
| Core genome of Methanomassiliicoccales: 658 protein sequences | Specific^a | Shared^b |
|---|---|---|
| Phylogenetic neighbors | 173 | 485 |
| Human gut methanogens | 227 | 431 |
| Phylogenetic neighbors and human gut methanogens | 102 | 556 shared with at least one; 360 shared with the two groups |
| NCBI non-redundant protein sequences database | 20 (21)^c | 637 |

^a Number of deduced proteins of the core genome of Methanomassiliicoccales that are not found in the corresponding organisms.
^b Number of deduced proteins of the core genome of Methanomassiliicoccales that are also found in the corresponding organisms.
^c The value of 21 encompasses CDS that are specific to the proteome of the Methanomassiliicoccales together with either those of the phylogenetic neighbors or those of the human gut methanogens, without any other blast hits in the NCBI nr protein sequences database.
Figure 4. Proposed pathways for methanogenesis and energy conservation in the Methanomassiliicoccales representatives. The protein names are in bold. The predicted pathways and enzymes present in the three Methanomassiliicoccales species are in blue, those absent from "Ca. M. intestinalis" are in green and those absent from "Ca. M. alvus" are in red. MtaA and MtbA are marked with an asterisk to signify that the homologs present in the Methanomassiliicoccales are not yet assigned to one or the other enzyme category. "X" refers to the uncharacterized lipid soluble electron transporter. The question mark points out that the enzymes involved in the reoxidation of the lipid soluble electron transporter remain to be uncovered. See Table 6 and Additional file 1: Table S10 for a description of the set of genes involved.
Genes involved in energy conservation in "Ca. M. alvus", "Ca. M. intestinalis" and M. luminyensis, and accession numbers of the proteins they encode
| Gene | "Ca. M. alvus" | "Ca. M. intestinalis" | M. luminyensis | Transmembrane helices |
|---|---|---|---|---|
| ATP synthase | | | | |
| | AGI84762.1 | AGN25422.1 | WP_019178382.1 | no |
| | AGI84763.1 | AGN25423.1 | WP_019178381.1 | yes |
| | AGI84764.1 | AGN25424.1 | WP_019178380.1 | yes |
| | AGI84765.1 | AGN25425.1 | WP_019178379.1 | no |
| | AGI84766.1 | AGN25426.1 | WP_019178378.1 | no |
| | AGI84767.1 | AGN25427.1 | WP_019178377.1 | no |
| | AGI84768.1 | AGN25428.1 | WP_019178376.1 | no |
| | AGI84769.1 | AGN25429.1 | WP_019178375.1 | no |
| | AGI84770.1 | AGN25430.1 | WP_019178374.1 | no |
| Membrane-bound proton-translocating pyrophosphatase | | | | |
| | / | AGN26077.1 | WP_019176822.1 | yes |
| Heterodisulfide reductase | | | | |
| | AGI85054.1 | AGN25863.1 | WP_019177460.1 | no |
| | AGI86093.1 | AGN25718.1 | WP_019177711.1 | no |
| | AGI85474.1 | AGN25916.1 | WP_019176125.1 | no |
| | AGI86094.1 | AGN25719.1 | WP_019177712.1 | no |
| | / | / | WP_019176126.1 | no |
| | AGI86375.1 | AGN25510.1 | WP_019178460.1 | no |
| | AGI86212.1 | AGN25649.1 | WP_019177852.1 | no |
| | / | / | WP_019177557.1 | no |
| | / | / | / | / |
| Methyl-viologen-reducing hydrogenase | | | | |
| | AGI85055.1 | AGN25864.1 | WP_019177459.1 | no |
| | / | AGN25453.1 | WP_019176201.1 | no |
| | / | / | WP_019176130.1 | no |
| | AGI85056.1 | AGN25865.1 | WP_019177458.1 | no |
| | AGI85057.1 | AGN25866.1 | WP_019177457.1 | no |
| F420H2 dehydrogenase-like/11-subunit respiratory complex 1 | | | | |
| | AHA34030.1 | AGN25601.1 | WP_019176183.1 | yes |
| | AGI84952.1 | AGN25602.1 | WP_019176182.1 | no |
| | AGI84953.1 | AGN25603.1 | WP_019176181.1 | no |
| | AGI84954.1 | AGN25604.1 | WP_019176180.1 | no |
| | / | / | / | |
| | AGI84955.1 | AGN25605.1 | WP_019176179.1 | yes |
| | AGI84956.1 | AGN25606.1 | WP_019176178.1 | no |
| | AGI84957.1 | AGN25607.1 | WP_019176177.1 | yes |
| | AGI84958.1 | AGN25608.1 | WP_019176176.1 | yes |
| | AGI84959.1 | AGN25609.1 | WP_019176175.1 | yes |
| | AGI84960.1 | AGN25610.1 | WP_019176174.1 | yes |
| | AGI84961.1 | AGN25611.1 | WP_019176173.1 | yes |
| | AGI84962.1 | AGN25612.1 | WP_019176172.1 | yes |
| | / | / | / | |
| Energy-converting hydrogenase | | | | |
| | / | AGN25511.1 | WP_019178471.1 | yes |
| | / | AGN26997.1 | WP_019176386.1 | yes |
| | / | AGN25512.1 | WP_019178472.1 | yes |
| | / | AGN26998.1 | WP_019176385.1 | yes |
| | / | AGN25513.1 | WP_019178473.1 | no |
| | / | AGN26999.1 | WP_019176384.1 | no |
| | / | AGN25514.1 | WP_019178474.1 | no |
| | / | AGN27000.1 | WP_019176383.1 | no |
| | / | AGN25515.1 | WP_019178475.1 | no |
| | / | AGN27001.1 | WP_019176382.1 | no |
| | / | AGN25516.1 | WP_019178476.1 | no |
| | / | AGN27002.1 | WP_019176381.1 | no |
| Liposoluble electron transporter synthesis | | | | |
| | AGI84964.1 | AGN25614.1 | WP_019176170.1 | / |
| | AGI85875.1 | AGN26416.1, AGN26109.1 | WP_019178349.1 | / |
| | AGI85874.1 | AGN26417.1, AGN25541.1 | WP_019178072.1, WP_019178198.1, WP_019176998.1 | / |

^a encoding a geranylgeranyl pyrophosphate synthase (GGPPS).
^b encoding a 1,4-dihydroxy-2-naphthoate octaprenyltransferase (DHNOPT).
^c encoding a 2-heptaprenyl-1,4-naphthoquinone methyltransferase (HPNQMT).
Core proteins of methanogenesis
| Annotation | "Ca. M. alvus" | "Ca. M. intestinalis" | M. luminyensis | Distribution | arCOG |
|---|---|---|---|---|---|
| Nitrogenase molybdenum-iron like protein (NifD-like/NflD) | | | | 1 | arCOG04888 |
| UDP-N-acetylmuramyl pentapeptide synthase like protein (MurF-like) | | | | 1 | arCOG02822 |
| Methyl-coenzyme M reductase operon associated like protein (McrC-like) | | | | 1 | arCOG03226 |
| Conserved hypothetical protein | | | | 1 | arCOG04904 |
| CoA-substrate-specific enzyme activase | | | | 1 | arCOG02679 |
| Conserved hypothetical protein | | | | 1 | arCOG04903 |
| Conserved hypothetical protein | | | | 1 | arCOG04901 |
| Peptidyl-prolyl cis-trans isomerase related protein | | | | 1 | arCOG04900 |
| Methyl coenzyme M reductase operon associated protein (McrC) | | | | 1 | arCOG03225 |
| Methyl-coenzyme M reductase, component A2 (AtwA) | | | | 1*† | arCOG00185 |
| Methyl coenzyme M reductase, beta subunit (McrB/MrtB) | | | | 1* | arCOG04860 |
| Methyl coenzyme M reductase, protein D (McrD/MrtD) | | | | 1* | arCOG04859 |
| Methyl coenzyme M reductase, gamma subunit (McrG/MrtG) | | | | 1* | arCOG04858 |
| Methyl coenzyme M reductase, alpha subunit (McrA/MrtA) | | | | 1* | arCOG04857 |
| SH3 fold protein | | | | 1 | arCOG04846 |
| Conserved hypothetical protein | | | | 2*† | arCOG02882 |
| AIR synthase-like protein | AGI85549 | AGN26462 | WP_019176932.1 | 2 | arCOG00640 |
| Predicted DNA-binding protein containing a Zn-ribbon domain | AGI84948 | AGN25597 | WP_019176187.1 | 2* | arCOG01116 |
| Methyltransferase related protein (MtxX) | AGI85117 | AGN26654 | WP_019177314.1 | 3 | arCOG00854 |
| Conserved hypothetical protein | AGI84870 | AGN25885 | WP_019178690.1 | 3* | arCOG04893 |
| Fe-S oxidoreductase, related to NifB/MoaA family | - | - | - | 4* | arCOG00950 |
| Conserved hypothetical protein | - | - | - | 4 | arCOG04866 |
| N5-methyltetrahydromethanopterin: coenzyme M methyltransferase, subunit A (MtrA) | - | - | - | 4* | arCOG03221 |
| N5-methyltetrahydromethanopterin: coenzyme M methyltransferase, subunit B (MtrB) | - | - | - | 4 | arCOG04867 |
| N5-methyltetrahydromethanopterin: coenzyme M methyltransferase, subunit C (MtrC) | - | - | - | 4 | arCOG04868 |
| N5-methyltetrahydromethanopterin: coenzyme M methyltransferase, subunit D (MtrD) | - | - | - | 4 | arCOG04869 |
| N5-methyltetrahydromethanopterin: coenzyme M methyltransferase, subunit E (MtrE) | - | - | - | 4 | arCOG04870 |
| Soluble P-type ATPase | - | - | - | 5 | arCOG01579 |
| Uncharacterized conserved protein | - | - | - | 5* | arCOG04844 |
| Conserved hypothetical protein (putative kinase) | - | - | - | 6 | arCOG04885 |
Protein accession numbers with the same font (bold, italics or bold-italics) are encoded by genes situated close to each other in their respective genomes.
*Paralogues.
†Related to a bacterial cluster with same conserved domain.
1, Methanogenesis marker, present in and unique to all sequenced methanogens and not in other archaea.
2, Present in all sequenced methanogens and less than 5% of other sequenced archaea.
3, Present in more than 90% of sequenced methanogens including Methanomassiliicoccales and less than 5% of other sequenced archaea.
4, Absent from the Methanomassiliicoccales but present and unique to all other methanogens.
5, Absent from the Methanomassiliicoccales but present in more than 90% other methanogens and not in other archaea.
6, Absent from the Methanomassiliicoccales but present in more than 90% of sequenced methanogens and less than 5% of other sequenced archaea.
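The six distribution categories defined above amount to a small decision rule. The sketch below is one possible reading of that rule, not the authors' procedure: it assumes presence is summarized as a boolean for the Methanomassiliicoccales plus two fractions, and, as a simplification, it evaluates the "more than 90% of methanogens" thresholds on methanogens other than the Methanomassiliicoccales in every category. The function name and parameters are hypothetical.

```python
from typing import Optional


def distribution_category(
    in_mmc: bool,
    frac_other_methanogens: float,
    frac_other_archaea: float,
) -> Optional[int]:
    """Return the distribution category (1-6), or None if none applies.

    in_mmc: present in the Methanomassiliicoccales.
    frac_other_methanogens: fraction of other sequenced methanogens with the protein.
    frac_other_archaea: fraction of sequenced non-methanogenic archaea with the protein.
    """
    in_all = frac_other_methanogens == 1.0
    in_most = frac_other_methanogens > 0.9
    unique = frac_other_archaea == 0.0
    rare_elsewhere = frac_other_archaea < 0.05

    if in_mmc:
        if in_all and unique:
            return 1  # methanogenesis marker
        if in_all and rare_elsewhere:
            return 2
        if in_most and rare_elsewhere:
            return 3
    else:
        if in_all and unique:
            return 4
        if in_most and unique:
            return 5
        if in_most and rare_elsewhere:
            return 6
    return None


print(distribution_category(True, 1.0, 0.0))
print(distribution_category(False, 0.95, 0.03))
```

The order of the tests matters: the stricter categories are checked first so that a protein meeting several definitions receives the most specific one.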
Putative Pyl-containing proteins in "Ca. M. alvus"
| Accession number | Annotation | Size^a | Comments |
|---|---|---|---|
| | hypothetical protein | 253 | DPM synthase like/GT2 superfamily |
| | hypothetical protein | 270 | digeranylgeranylglyceryl phosphate synthase |
| | phosphotransacetylase-like protein | 242 | putative methyltransferase MtxX |
| | filamentation induced by cAMP protein Fic | 425 | |
| AGI85186.1 | hypothetical protein | 149 | Rv0623-like transcription factor |
| AGI85280.1 | hypothetical protein | 917 | glycosyltransferase family 29 |
| AGI85290.1 | hypothetical protein | 148 | |
| AGI85300.1 | hypothetical protein | 444 | ATPase domain |
| AGI85437.1 | hypothetical protein | 536 | prophage Lp3 protein 8 (helicase) of |
| AGI85443.1 | hypothetical protein | 717 | |
| | hypothetical protein | 262 | putative methyltransferase |
| AGI85596.1 | hypothetical protein | 162 | putative acetyltransferase |
| AGI85630.1 | hypothetical protein | 322 | CRISPR-associated endonuclease cas1 |
| AGI86303.1 | hypothetical protein | 389 | Sel-1 domain containing protein |
| AGI86346.1 | transporter family protein | 289 | bacterial/archaeal transporter family protein |
| | uncharacterized protein | 187 | conserved in archaea (DUF531) |
Proteins in bold indicate homologs in the two other members of the Methanomassiliicoccales that are devoid of Pyl. Proteins in italics indicate homologs in the two other members of the Methanomassiliicoccales that also contain Pyl.
^a Number of amino acids.
Figure 5. Comparison of the number of putative Pyl-containing proteins (other than MtmB/MtbB/MttB methyltransferases) and of the percentage of codons used as translational stop signals deduced from the three Methanomassiliicoccales genomes.