Filipa L Sousa, Shijulal Nelson-Sathi, William F Martin.
Abstract
Life arose in a world without oxygen and the first organisms were anaerobes. Here we investigate the gene repertoire of the prokaryote common ancestor, estimating which genes it contained and to which lineages of modern prokaryotes it was most similar in terms of gene content. Using a phylogenetic approach, we found that among trees for all 8779 protein families shared between 134 archaeal and 1847 bacterial genomes, only 1045 have sequences from at least two bacterial and two archaeal groups and retain the ancestral archaeal-bacterial split. Among those, the genes shared by anaerobes were identified as candidate genes for the prokaryote common ancestor, which lived in anaerobic environments. We find that these anaerobic prokaryote common ancestor genes are today most frequently distributed among methanogens and clostridia, strict anaerobes that live from low free energy changes near the thermodynamic limit of life. The anaerobic families encompass genes for bifunctional acetyl-CoA-synthase/CO-dehydrogenase, heterodisulfide reductase subunits C and A, ferredoxins, and several subunits of the Mrp-antiporter/hydrogenase family, in addition to numerous S-adenosyl methionine (SAM)-dependent methyltransferases. The data indicate a major role for methyl groups in the metabolism of the prokaryote common ancestor. The data furthermore indicate that the prokaryote ancestor possessed a rotor-stator ATP synthase, but lacked cytochromes and quinones as well as identifiable redox-dependent ion pumping complexes. The prokaryote ancestor did possess, however, an Mrp-type H(+)/Na(+) antiporter complex, capable of transducing geochemical pH gradients into biologically more stable Na(+)-gradients. The findings implicate a hydrothermal, autotrophic, and methyl-dependent origin of life. This article is part of a Special Issue entitled 'EBEC 2016: 19th European Bioenergetics Conference, Riva del Garda, Italy, July 2-6, 2016', edited by Prof. Paolo Bernardi.
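The two selection criteria named in the abstract (sequences from at least two bacterial and two archaeal groups, and a tree that retains the archaeal-bacterial split) can be illustrated with a minimal sketch. This is not the authors' pipeline: the nested-tuple tree encoding, the function names, and the toy leaf labels are all assumptions made for illustration.

```python
# Hypothetical sketch: trees are nested tuples of leaf names (an assumption,
# not the format used in the study).

def leaf_sets(tree, out):
    """Return the leaves under `tree`, appending every node's leaf set to `out`."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves

def retains_domain_split(tree, domain_of):
    """True if one branch of the tree separates all archaea from all bacteria."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    archaea = frozenset(l for l in all_leaves if domain_of[l] == "archaea")
    bacteria = all_leaves - archaea
    if not archaea or not bacteria:
        return False
    # A branch separating the domains exists exactly when one domain forms a clade.
    return archaea in clades or bacteria in clades

def enough_groups(group_of, domain_of, minimum=2):
    """True if each domain is represented by at least `minimum` taxonomic groups."""
    groups = {"archaea": set(), "bacteria": set()}
    for leaf, domain in domain_of.items():
        groups[domain].add(group_of[leaf])
    return all(len(g) >= minimum for g in groups.values())

# Toy data (invented labels, for illustration only).
domain_of = {"a1": "archaea", "a2": "archaea", "a3": "archaea",
             "b1": "bacteria", "b2": "bacteria", "b3": "bacteria"}
group_of = {"a1": "Methanococcales", "a2": "Methanococcales", "a3": "Thermococcales",
            "b1": "Clostridia", "b2": "Clostridia", "b3": "Bacilli"}
split_tree = ((("a1", "a2"), "a3"), (("b1", "b2"), "b3"))
mixed_tree = ((("a1", "b1"), "a2"), (("a3", "b2"), "b3"))

print(retains_domain_split(split_tree, domain_of))  # domains separated by one branch
print(retains_domain_split(mixed_tree, domain_of))  # domains interleaved
print(enough_groups(group_of, domain_of))
```

Checking for a clade that equals one domain's leaf set works for any rooting of an unrooted tree, because each branch corresponds to the leaf set below exactly one node.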
Keywords: Acetogens; Anaerobes; Autotrophy; Early evolution; Geochemistry; Methanogens
Year: 2016 PMID: 27150504 PMCID: PMC4906156 DOI: 10.1016/j.bbabio.2016.04.284
Source DB: PubMed Journal: Biochim Biophys Acta ISSN: 0006-3002
Fig. 1. Wordle representation of the most frequent functional descriptions within a) 27 interdomain nearly universal monophyletic protein families and b) 109 nearly universal protein families. Word size scales with the number of times the word appears within the annotations.
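The word sizes in such a representation come from simple word counts over the annotation strings. A minimal sketch of that counting step follows, assuming annotations are plain strings with underscores as word separators; the function name and the three example strings are illustrative and do not reproduce the data behind the figure.

```python
import re
from collections import Counter

def word_frequencies(annotations):
    """Count how often each word occurs across a list of annotation strings."""
    counts = Counter()
    for text in annotations:
        # Treat underscores as spaces, lowercase, and keep alphanumeric runs.
        counts.update(re.findall(r"[a-z0-9]+", text.lower().replace("_", " ")))
    return counts

# Illustrative input only (assumed annotation strings).
example = ["radical_SAM_protein", "Radical_SAM_domain_protein", "ferredoxin"]
counts = word_frequencies(example)
for word, n in counts.most_common():
    print(word, n)
```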
Functional categories of the nearly universal protein families and of the 1045 families that retain the archaeal-bacterial division.
| Functional category | Nearly universal mono. | Nearly universal | 1045 mono. families | Anaerobic | Aerobic | Mixed |
|---|---|---|---|---|---|---|
| Cellular processes and signaling | | | | | | |
| Cell cycle control, division, chromosome partitioning | 0 | 1 | 9 | 0 | 0 | 9 |
| Cell motility | 0 | 0 | 9 | 0 | 0 | 9 |
| Cell wall/membrane/envelope biogenesis | 0 | 5 | 50 | 0 | 1 | 49 |
| Defense mechanisms | 0 | 1 | 49 | 4 | 1 | 44 |
| Extracellular structures | 0 | 0 | 1 | 0 | 0 | 1 |
| Intracellular trafficking, secretion, and vesicular transport | 2 | 2 | 14 | 0 | 2 | 12 |
| Mobilome: prophages, transposons | 0 | 0 | 21 | 1 | 0 | 20 |
| Nuclear structure | 0 | 0 | 0 | 0 | 0 | 0 |
| Post-translational modification, protein turnover, chaperones | 1 | 3 | 33 | 1 | 4 | 28 |
| Signal transduction mechanisms | 0 | 0 | 25 | 0 | 2 | 23 |
| Information storage and processing | | | | | | |
| Chromatin structure and dynamics | 0 | 0 | 0 | 0 | 0 | 0 |
| Replication, recombination and repair | 0 | 3 | 30 | 1 | 2 | 27 |
| RNA processing and modification | 0 | 0 | 0 | 0 | 0 | 0 |
| Transcription | 0 | 0 | 46 | 2 | 5 | 39 |
| Translation, ribosomal structure and biogenesis | 23 | 34 | 91 | 6 | 0 | 85 |
| Metabolism | | | | | | |
| Amino acid transport and metabolism | 0 | 22 | 68 | 3 | 5 | 60 |
| Carbohydrate transport and metabolism | 0 | 5 | 75 | 4 | 3 | 68 |
| Coenzyme transport and metabolism | 1 | 7 | 53 | 3 | 1 | 49 |
| Energy production and conversion | 0 | 5 | 91 | 10 | 11 | 70 |
| Inorganic ion transport and metabolism | 0 | 4 | 56 | 2 | 6 | 48 |
| Lipid transport and metabolism | 0 | 1 | 41 | 0 | 7 | 34 |
| Nucleotide transport and metabolism | 0 | 13 | 15 | 1 | 2 | 12 |
| Secondary metabolites biosynthesis, transport and catabolism | 0 | 1 | 23 | 0 | 7 | 16 |
| Poorly characterized | | | | | | |
| Function unknown | 0 | 0 | 68 | 5 | 4 | 59 |
| General function prediction only | 0 | 2 | 117 | 7 | 12 | 98 |
| Not found | 0 | 0 | 60 | 12 | 4 | 44 |
| Total | 27 | 109 | 1045 | 62 | 79 | 904 |
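In the table above, the Anaerobic, Aerobic and Mixed columns partition the 1045 monophyletic families, so each row should satisfy total = anaerobic + aerobic + mixed. The short check below confirms this for a few rows copied from the table; the variable and function names are illustrative assumptions.

```python
# Rows copied from the table: (1045 mono. families, Anaerobic, Aerobic, Mixed).
rows = {
    "Defense mechanisms": (49, 4, 1, 44),
    "Transcription": (46, 2, 5, 39),
    "Energy production and conversion": (91, 10, 11, 70),
    "Total": (1045, 62, 79, 904),
}

def is_partitioned(total, anaerobic, aerobic, mixed):
    """True if the three oxygen-profile classes add up to the family count."""
    return total == anaerobic + aerobic + mixed

for name, values in rows.items():
    print(name, is_partitioned(*values))
```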
Fig. 2. Inter-domain gene sharing and aerobic profile network of families that retain the archaeal-bacterial division. Number of genes shared between archaeal and bacterial organisms A) within the 4397 monophyletic protein families and B) within the remaining 1045 monophyletic protein families after removal of obvious interdomain LGTs. Each cell in the matrix indicates the number of genes (E-value ≤ 10⁻¹⁰ and ≥ 25% global identity) shared between the protein families of 134 archaeal and 1847 bacterial genomes whose trees retain the archaea–bacteria division (scale bar at right). C) Inter-domain gene sharing network in terms of the aerobic classification of the organism pairs. Panel A is adapted from [77].
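The two thresholds stated in the Fig. 2 caption (E-value ≤ 10⁻¹⁰ and ≥ 25% global identity) amount to a simple filter over pairwise hits. A minimal sketch follows, assuming each hit is available as a record with an E-value and a global identity in percent. The `Hit` type, its field names, and the example values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Hit:
    query: str       # hypothetical archaeal gene identifier
    subject: str     # hypothetical bacterial gene identifier
    evalue: float    # E-value of the pairwise match
    identity: float  # global identity in percent

def passes_thresholds(hit, max_evalue=1e-10, min_identity=25.0):
    """Keep hits with E-value <= 1e-10 and global identity >= 25%."""
    return hit.evalue <= max_evalue and hit.identity >= min_identity

# Invented example hits, for illustration only.
hits = [
    Hit("arc_gene_1", "bac_gene_1", 1e-30, 41.0),  # passes both thresholds
    Hit("arc_gene_2", "bac_gene_2", 1e-5, 60.0),   # E-value too high
    Hit("arc_gene_3", "bac_gene_3", 1e-20, 18.5),  # identity too low
]
kept = [h for h in hits if passes_thresholds(h)]
print(len(kept))
```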
Taxonomic distribution of HCOs within 1981 genomes.
| Groups | Aerobic | Anaerobic | No. of genomes |
|---|---|---|---|
| Archaea | | | |
| Others | | | |
| Korarchaeota | 0 | 1 | 1 |
| Thaumarchaeota | 2 | 0 | 2 |
| Nanoarchaeota | 0 | 1 | 1 |
| Subtotal | 2 | 2 | 4 |
| Crenarchaeota | | | |
| Thermoproteales | 6 | 7 | 13 |
| Desulfurococcales | 1 | 13 | 14 |
| Sulfolobales | 16 | 0 | 16 |
| Subtotal | 23 | 20 | 43 |
| Euryarchaeota | | | |
| Thermococcales | 0 | 14 | 14 |
| Thermoplasmatales | 1 | 3 | 4 |
| Archaeoglobales | 0 | 4 | 4 |
| Methanobacteriales | 0 | 8 | 8 |
| Methanococcales | 0 | 15 | 15 |
| Methanomicrobiales | 0 | 6 | 6 |
| Methanocellales | 0 | 3 | 3 |
| Methanosarcinales | 0 | 10 | 10 |
| Halobacteriales | 23 | 0 | 23 |
| Subtotal | 24 | 63 | 87 |
| Bacteria | | | |
| Clostridia | 4 | 105 | 109 |
| Bacilli | 137 | 151 | 288 |
| Negativicutes | 0 | 6 | 6 |
| Tenericutes | 0 | 47 | 47 |
| Planctomycetes | 6 | 0 | 6 |
| Chlamydiae | 4 | 34 | 38 |
| Spirochaetes | 7 | 39 | 46 |
| Bacteroidetes | 55 | 22 | 77 |
| Actinobacteria | 166 | 41 | 207 |
| Chlorobi | 5 | 6 | 11 |
| Fusobacteria | 0 | 5 | 5 |
| Thermotogae | 0 | 15 | 15 |
| Aquificae | 8 | 2 | 10 |
| Chloroflexi | 10 | 6 | 16 |
| Deinococcus–Thermus | 17 | 0 | 17 |
| Cyanobacteria | 44 | 0 | 44 |
| Acidobacteria | 8 | 0 | 8 |
| Deltaproteobacteria | 31 | 17 | 48 |
| Epsilonproteobacteria | 72 | 2 | 74 |
| Alphaproteobacteria | 204 | 4 | 208 |
| Betaproteobacteria | 120 | 3 | 123 |
| Gammaproteobacteria | 374 | 41 | 415 |
| Other bacteria | 11 | 18 | 29 |
| Subtotal | 1283 | 564 | 1847 |
| Total | 1332 | 649 | 1981 |
Fervidicoccus fontis Kam984 (order Fervidicoccales) is grouped within the order Desulfurococcales.
Fig. 3. Wordle representation of the most frequent taxonomic and functional descriptions within the 79 aerobic and the 62 anaerobic protein families. Word size scales with the number of times the word appears within the annotations.
Fig. 4. Distribution of oxygen-dependent reactions within prokaryotic genomes, grouped by the 11 KEGG metabolic pathway maps.
Gene annotations of the 62 anaerobic core families.
| Cluster | Annotation | Comments |
|---|---|---|
| 6079 | radical_SAM_protein | SAM methyltransferase |
| 9248 | radical_SAM_protein | Pyruvate formate lyase activating enzyme EC:1.97.1.4; SAM methyltransferase |
| 21,920 | Radical_SAM_domain_protein | rlmN; 23S rRNA methyltransferase [EC:2.1.1.192]; SAM |
| 10,466 | type_11_methyltransferase | SAM methyltransferase |
| 21,520 | type_11_methyltransferase | SAM methyltransferase |
| 14,283 | Methylase_involved_in… | SAM methyltransferase |
| 1662 | 50S_ribosomal_protein_L29 | RP-L29, rpmC; large subunit ribosomal protein L29 |
| 1296 | acetyl-CoA_decarbonylase/synthase… | Acetyl-CoA decarbonylase/synthase complex subunit gamma [EC:2.1.1.245] |
| 1321 | acetyl-CoA_decarbonylase/synthase… | Acetyl-CoA decarbonylase/synthase complex subunit delta [EC:2.1.1.245] |
| 7851 | ATP_synthase_subunit_c | ATPVK, ntpK, atpK; V/A-type H+-transporting ATPase subunit K |
| 4404 | ferredoxin | fer; ferredoxin |
| 18,705 | flavodoxin | Flavodoxin |
| 13,779 | heterodisulfide_reductase_subunit_A… | Heterodisulfide-reductase CoB CoM |
| 14,009 | heterodisulfide_reductase_subunit_C… | Heterodisulfide reductase; CoM CoS |
| 4789 | membrane_bound_hydrogenase_subunit… | Multicomponent Na+:H+ antiporter |
| 11,625 | Membrane-bound_hydrogenase_MBH… | Membrane-bound-hydrogenase antiporter subunit |
| 11,548 | L-glutamine_synthetase | glnA, GLUL; glutamine synthetase [EC:6.3.1.2]; glutamine-synthetase |
| 10,709 | nitrogenase_iron-iron_accessory_protein… | Iron–iron nitrogenase accessory protein AnfO |
| 5364 | acetyltransferase | CoA acyl-CoA acetyltransferase |
| 10,044 | phenylacetate–CoA_ligase | paaK; phenylacetate-CoA ligase [EC:6.2.1.30]; CoA phenylacetate-CoA-ligase |
| 23,640 | NADPH-dependent_FMN_reductase | FMN FAD NADH NADH-dependent-FMN-reductase |
| 10,494 | NADPH-dependent_FMN_reductase | NADH-dependent-FMN-reductase |
| 4774 | sugar_kinase | Sugar-kinase |
| 11,273 | FeoA_family_protein | feoA ferrous iron transport protein A |
| 11,762 | aldo/keto_reductase | K07079; NADH aldo/keto-reductase |
| 12,136 | putative_ABC_transporter | Putative ABC transport system permease protein |
| 6304 | citrate_transporter | Citrate_transporter |
| 16,010 | cobalamin_biosynthesis_protein | cbiN; cobalt/nickel transport protein; cobalamin |
| 14,243 | cytochrome_c_biogenesis… | Cytochrome c-biogenesis transmembrane protein |
| 1071 | ApbE_family_lipoprotein | K09740; hypothetical protein; ApbE_family_lipoprotein |
| 6375 | beta-lactamase_domain-containing_protein | Beta-lactamase metal dependent |
| 4311 | deblocking_aminopeptidase | E3.2.1.4; endoglucanase metal metallopeptidase |
| 22,518 | YcfA-like_protein | hicA; mRNA interferase HicA [EC:3.1.-.-]; YcfA-like-protein |
| 15,732 | nucleotidyltransferase | K07076; nucleotidyltransferase |
| 4155 | PP-loop_domain-containing… | ttcA; tRNA 2-thiocytidine biosynthesis protein TtcA; PP-loop |
| 9988 | regulatory_protein_MarR | Regulatory-protein-MarR transcriptional-regulator |
| 23,018 | type_II_site-specific_deoxyribonuclease | Type-II-restriction-enzyme type-II-site-specific-deoxyribonuclease |
| 16,528 | transposase,_IS4 | Transposase |
| 10,489 | ATP-dependent_RNA_helicase | CRISPR-associated-endonuclease/helicase Cas3; ATP-dependent |
| 15,539 | CRISPR-associated_protein… | CRISPR-associated protein Cmr3 |
| 14,799 | CRISPR-associated_protein… | CRISPR-associated protein Csm1 |
| 18,976 | helicase_domain_protein | Helicase |
| 20,220 | helicase-like_protein | Helicase |
| 2237 | xylose_isomerase_domain… | Metal-dependent; xylose-isomerase-domain-containing-protein |
| 20,694 | xylose_isomerase_domain… | Metal xylose-isomerase-domain-containing-protein |
| 7805 | putative_YgiT-type_zinc_finger | Zinc zinc-finger domain |
16 other protein families annotated as hypothetical proteins.