Descent of Bacteria and Eukarya From an Archaeal Root of Life.

Xi Long, Hong Xue, J Tze-Fei Wong.

Abstract

The 3 biological domains delineated based on small subunit ribosomal RNAs (SSU rRNAs) are confronted by uncertainties regarding the relationship between Archaea and Bacteria, and the origin of Eukarya. The similarities between the paralogous valyl-tRNA and isoleucyl-tRNA synthetases in 5398 species estimated by BLASTP, which decreased from Archaea to Bacteria and further to Eukarya, were consistent with vertical gene transmission from an archaeal root of life close to Methanopyrus kandleri through a Primitive Archaea Cluster to an Ancestral Bacteria Cluster, and to Eukarya. The predominant similarities of the ribosomal proteins (rProts) of eukaryotes toward archaeal rProts relative to bacterial rProts established that an archaeal parent rather than a bacterial parent underwent genome merger with bacteria to generate eukaryotes with mitochondria. Eukaryogenesis benefited from the predominantly archaeal accelerated gene adoption (AGA) phenotype pertaining to horizontally transferred genes from other prokaryotes and expedited genome evolution via both gene-content mutations and nucleotidyl mutations. Archaeons endowed with substantial AGA activity were accordingly favored as candidate archaeal parents. Based on the top similarity bitscores displayed by their proteomes toward the eukaryotic proteomes of Giardia and Trichomonas, and high AGA activity, the Aciduliprofundum archaea were identified as leading candidates of the archaeal parent. The Asgard archaeons and a number of bacterial species were among the foremost potential contributors of eukaryotic-like proteins to Eukarya.
© The Author(s) 2020.

Keywords:  Accelerated gene adoption; archaeal parent; eukaryogenesis; isoleucyl-tRNA synthetase; valyl-tRNA synthetase

Year: 2020  PMID: 32636606  PMCID: PMC7313328  DOI: 10.1177/1176934320908267

Source DB: PubMed  Journal: Evol Bioinform Online  ISSN: 1176-9343  Impact factor: 1.625


Introduction

Molecular evolution analysis of small subunit ribosomal RNAs (SSU rRNAs) yielded a universal but unrooted tree of life (ToL) that comprises the 3 biological domains of Archaea, Bacteria, and Eukarya.[1] A ToL of transfer RNAs (tRNAs) based on the genetic distances between the 20 classes of tRNA acceptors for different amino acids located the Last Universal Common Ancestor (LUCA) near the hyperthermophilic archaeal methanogen Methanopyrus kandleri (Mka).[2] The rooting is supported by a wide range of evidence,[3-14] and the finding of the Methanopyrus lineage as the oldest lineage among living organisms.[15] However, the phylogenies of the 3 biological domains are beset by 2 fundamental problems regarding the evolutionary relationship between Archaea and Bacteria, and the nature of the Archaea-Bacteria collaboration that gave rise to Eukarya. As long as these 2 problems remain unresolved, the root of life and the origin of Eukarya would both be open to diverse formulations.[16-20] Accordingly, the objective of this study was to examine the pathways of descent of Bacteria and Eukarya from an archaeal LUCA and the identity of the plausible archaeal parent of Eukarya.

Materials and Methods

Source of data and materials

Protein and SSU rRNA sequences were retrieved from NCBI GenBank release 231 (ftp://ftp.ncbi.nlm.nih.gov/genomes/).[21,22] For species without available SSU rRNA information in NCBI, quality-checked SSU rRNA sequences were downloaded from the SILVA database release 132 (https://www.arb-silva.de/).[23] For species with multiple SSU rRNA sequences, the one yielding the highest total bitscore (using BLASTN[24] with the “-word_size” flag set to 4) with SSU rRNAs of other species from the same domain was employed for analysis. The accession numbers of the SSU rRNAs analyzed are available in File S1 in Supplementary Materials. Eukaryotic mitochondrial DNA-encoded protein sequences were retrieved from the RefSeq mitochondrial reference genomes in the NCBI Protein database (https://www.ncbi.nlm.nih.gov/protein).
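
The selection rule for species with multiple SSU rRNA copies can be sketched as follows. This is only an illustration of the stated criterion (highest total bitscore), not the authors' script; the function name, sequence IDs, and scores are hypothetical, and the BLASTN bitscores are assumed to have been computed beforehand.

```python
def pick_representative(bitscores_by_copy):
    """Return the ID of the SSU rRNA copy with the highest summed bitscore.

    bitscores_by_copy maps a candidate sequence ID to the list of BLASTN
    bitscores it obtained against SSU rRNAs of other species in the domain.
    """
    return max(bitscores_by_copy, key=lambda seq_id: sum(bitscores_by_copy[seq_id]))


# Hypothetical candidates for one species (IDs and scores are made up).
candidates = {
    "rrn_copy_A": [1500.0, 1420.0, 1390.0],
    "rrn_copy_B": [1510.0, 1300.0, 1350.0],
}
chosen = pick_representative(candidates)
```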

Estimation of nuclear or mitochondrial proteome similarity bitscores

When comparing proteome similarities, the proteomes of all subject species were used to construct a local BLAST database using makeblastdb,[24] and every query proteome was searched against the local database using BLASTP with a BLOSUM62 matrix and thresholds set to evalue <1 × 10⁻⁵, percent identity >25%, and query coverage >50%. Only the query and subject sequences that were the best match of each other, viz when query sequence n from species 1 exhibited the highest bitscore toward subject sequence m among all proteins of species 2 and vice versa, were included in the estimation of inter-proteome similarity, which was given by the sum of the BLASTP bitscores of all such best-matched protein pairs between the 2 proteomes.
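
The reciprocal-best-match rule described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: hits are assumed to be already parsed into tuples, the function names are hypothetical, the thresholds are assumed to be applied before best matches are chosen, and the bitscore of the species 1 to species 2 direction is assumed to be the one summed (the text does not say which direction is used).

```python
def passes_thresholds(evalue, pident, qcov):
    """Thresholds quoted in the text: evalue < 1e-5, identity > 25%, coverage > 50%."""
    return evalue < 1e-5 and pident > 25.0 and qcov > 50.0


def best_hits(hits):
    """Map each query ID to (subject ID, bitscore) of its top-scoring passing hit.

    Each hit is a tuple: (query, subject, bitscore, evalue, pident, qcov).
    """
    best = {}
    for query, subject, bitscore, evalue, pident, qcov in hits:
        if not passes_thresholds(evalue, pident, qcov):
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def inter_proteome_similarity(hits_1_vs_2, hits_2_vs_1):
    """Sum forward bitscores over pairs that are each other's best match."""
    forward = best_hits(hits_1_vs_2)
    reverse = best_hits(hits_2_vs_1)
    total = 0.0
    for query, (subject, bitscore) in forward.items():
        if subject in reverse and reverse[subject][0] == query:
            total += bitscore
    return total
```

In practice the tuples would be read from tabular BLASTP output; the parsing step is omitted here because the exact output columns used in the study are not stated.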

Estimation of rProt similarity bitscores

To identify rProt sequences in Gla, Trv, Sce, and Hsa (see species name abbreviations in Table 1), eukaryotic proteomes were cleared of mitochondrial or mitochondrial DNA-encoded proteins, and then searched against the Pfam database[25] using RPSBLAST[24] with the “-evalue” flag set to 0.01. For each of the 88 rProt families analyzed (Table S1), only the protein sequence from each species that yielded the highest bitscore toward the rProt family was analyzed further. On this basis, 79, 81, 84, and 86 out of the 88 rProt families were found in the Gla, Trv, Sce, and Hsa proteomes, respectively. These eukaryotic rProts were searched against all the prokaryotic proteomes using BLASTP. Prokaryotic proteins passing the threshold of evalue <0.05 were searched against the Pfam database using RPSBLAST, and false-positive sequences that failed to map to the targeted rProt family were removed. The similarities between the rProt sequences identified from eukaryotes and prokaryotes were estimated based on the maximum BLASTP bitscores.
Table 1.

Partial list of species analyzed.

Abbr. Species name
Archaea
Abo Aciduliprofundum boonei
Acf Aciduliprofundum sp. MAR08-339
Aen C.Aenigmarchaeota archaeon
Afu Archaeoglobus fulgidus
Aia Acidilobus sp. 7A
Alt C.Altiarchaeales archaeon
Ape Aeropyrum pernix
Bat C.Bathyarchaeota archaeon
Csu C.Caldiarchaeum subterraneum
Csy Cenarchaeum symbiosum
Dia C.Diapherotrites archaeon
Fac Ferroplasma acidiphilum
Ffo Fervidicoccus fontis
Hal Halobacterium salinarum
Hei C.Heimdallarchaeota archaeon
Hgi Haloferax gibbonsii
Hla Halobiforma lacisalsi
Kcr C.Korarchaeum cryptofilum
Lok C.Lokiarchaeota archaeon
Mac Methanosarcina acetivorans
Man C.Mancarchaeum acidiphilum
Mar C.Marsarchaeota G2 archaeon
Mbo Methanoregula boonei
Mco Methanocella conradii
Mes C.Methanosuratus sp.
Mfe Methanothermus fervidus
Mic C.Micrarchaeota archaeon
Min C.Methanomassiliicoccus intestinalis
Mja Methanocaldococcus jannaschii
Mka Methanopyrus kandleri
Mlt C.Methanoliparum thermophilum
Mnt Methanonatronarchaeum thermophilum
Mph Methanophagales archaeon
Mte C.Methanoplasma termitum
Nca C.Nitrosocaldus cavascurensis
Nga C.Nitrososphaera gargensis
Nko C.Nitrosopumilus koreensis
Nst C.Nanobsidianus stetteri
Odi C.Odinarchaeota archaeon
Pae Pyrobaculum aerophilum
Psy C.Prometheoarchaeum syntrophicum
Pfu Pyrococcus furiosus
Sso Saccharolobus solfataricus
Tac Thermoplasma acidophilum
Tho C.Thorarchaeota archaeon
Tvo Thermoplasma volcanium
Woa C.Woesearchaeota archaeon
Bacteria
Aae Aquifex aeolicus
Atu Agrobacterium tumefaciens
Bap Buchnera aphidicola
Bja Bradyrhizobium japonicum
Blo Bifidobacterium longum
Bsu Bacillus subtilis
Cex Caldisericum exile
Cje Campylobacter jejuni
Cpo Cloacibacillus porcorum
Ctr Chlamydia trachomatis
Cvi Caulobacter vibrioides
Cvo Chelativorans sp. BNC1
Det Desulfurobacterium thermolithotrophum
Dra Deinococcus radiodurans
Dth Dictyoglomus thermophilum
Eco Escherichia coli
Hth Hungateiclostridium thermocellum
Kol Kosmotoga olearia
Mau Mahella australiensis
Mhy Megamonas hypermegale
Mpn Mycoplasma pneumoniae
Mtu Mycobacterium tuberculosis
Pel Pelobacter sp. SFB93
Pmo Petrotoga mobilis
Rpr Rickettsia prowazekii
Rru Rhodospirillum rubrum
Rso Ralstonia solanacearum
Spn Streptococcus pneumoniae
Ssp Sporanaerobacter sp. NJN-17
Syn Synechocystis sp. PCC 6803
Tht Thermobaculum terrenum
Tis Tistrella mobilis
Tma Thermotoga maritima
Tpa Treponema pallidum
Tte Thermoanaerobacter tengcongensis
Xca Xanthomonas campestris
Eukarya
Aca Acanthamoeba castellanii
Bbo Babesia bovis
Bho Blastocystis hominis
Bpr Bathycoccus prasinos
Cel Caenorhabditis elegans
Cme Cyanidioschyzon merolae
Dme Drosophila melanogaster
Dre Danio rerio
Esi Ectocarpus siliculosus
Gla Giardia lamblia
Hsa Homo sapiens
Lma Leishmania major
Pfa Plasmodium falciparum
Pma Perkinsus marinus
Sce Saccharomyces cerevisiae
Spa Saprolegnia parasitica
Spo Schizosaccharomyces pombe
Sra Strongyloides ratti
Tps Thalassiosira pseudonana
Ttr Thecamonas trahens
Trv Trichomonas vaginalis

Note: C. in front of species name stands for Candidatus. Detailed species information is given in Table S2.


Estimation of non-rProt similarity bitscores

To identify Gla-like protein families in various prokaryotes, every sequence in the Gla proteome was blasted against the 82 prokaryotic proteomes in Table 1 (except for Psy from preprint form), and the best matches passing the threshold of evalue <0.05 were mapped to the Pfam database using the NCBI Batch CD-search Tool.[26] To remove false-positive pairs, only cases where both query and subject sequences belonged to the same targeted protein family were analyzed, and the Gla sequences that were relatively rare in prokaryotes, displaying similarity bitscores toward ⩽10 out of the 82 prokaryotic proteomes tested, were classified as Gla-like proteins.
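
The rarity criterion in this paragraph (similarity bitscores toward ⩽10 of the 82 prokaryotic proteomes) can be sketched as a simple filter. The function name and input layout are hypothetical, and the sketch assumes that a protein must have at least one family-consistent prokaryotic hit to qualify, which the text implies but does not state outright.

```python
def rare_in_prokaryotes(proteomes_hit_by_protein, max_proteomes=10):
    """Return sorted IDs of query proteins hitting 1..max_proteomes proteomes.

    proteomes_hit_by_protein maps a query protein ID to the abbreviations of
    the prokaryotic proteomes in which a same-family best match was found.
    """
    return sorted(
        protein
        for protein, proteomes in proteomes_hit_by_protein.items()
        if 0 < len(set(proteomes)) <= max_proteomes
    )
```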

Results and Discussion

Similarity between VARS-IARS paralogues

The relative antiquity of proteins could be approximated, except for proteins that have undergone extraordinarily extensive evolution, based on the increasing divergence of paralogous proteins over time.[27] Accordingly, BLASTP was performed between the intraspecies valyl-tRNA synthetase (VARS) and isoleucyl-tRNA synthetase (IARS) in the genomes of 5398 species in NCBI GenBank. When the bitscores obtained were arranged in descending order (Table S2), or in part on a distribution curve (Figure 1), Mka yielded the top bitscore of 473. BLASTP, which provided an indication of similarity but not necessarily of phylogenetic relationship,[28] was a fitting tool for evaluating the intracellular divergence of VARS-IARS, which carried no phylogenetic implication: 2 neighboring species on the distribution curve could belong to 2 different biological domains. As the 119 highest scoring species were all archaeons, while the top-scoring bacterium Mau gave a bitscore of only 378 and the top-scoring eukaryote Esi gave a bitscore of only 240, the smallest VARS-IARS divergences were clearly confined to Archaea, in keeping with the descent of Bacteria from Archaea, and descent of Eukarya from either Archaea or an Archaea-Bacteria collaboration. The foremost antiquity of Mka indicated by its bitscore was in accordance with the Mka-proximal LUCA identified by the genetic distances between alloacceptor tRNAs,[2] and the unchanging environment throughout the ages at the hydrothermal vents inhabited by Mka. It was also consistent with the datings of the sn1,2 chemistries of archaeal lipids, and the core of archaeal formylmethanofuran dehydrogenase, prior to the rise of LUCA.[29]
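
The ranking step can be illustrated with the five intraspecies VARS-IARS bitscores quoted in this section (Mka 473, Mac 382, Mau 378, Pfu 369, Esi 240). This is only a sketch of the sort-and-inspect logic; the full analysis covers 5398 species, so ranks in this toy list do not correspond to ranks in Table S2.

```python
# Intraspecies VARS-IARS bitscores quoted in the text: (domain, bitscore).
scores = {
    "Mka": ("Archaea", 473),
    "Mac": ("Archaea", 382),
    "Mau": ("Bacteria", 378),
    "Pfu": ("Archaea", 369),
    "Esi": ("Eukarya", 240),
}

# Arrange species in descending order of bitscore.
ranked = sorted(scores, key=lambda species: scores[species][1], reverse=True)

# Record the first (highest scoring) species encountered for each domain.
top_by_domain = {}
for species in ranked:
    domain = scores[species][0]
    top_by_domain.setdefault(domain, species)
```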
Figure 1.

Ranking of similarity bitscores of intraspecies VARS-IARS for various species in descending order (from left to right). The bitscores for 1185 archaeal, 3621 bacterial, and 592 eukaryotic species from NCBI are given in Table S2. IARS indicates isoleucyl-tRNA synthetase; NCBI, National Center for Biotechnology Information; VARS, valyl-tRNA synthetase.

The positions of some of the species analyzed in Figure 1 were indicated on the SSU rRNA tree, with their intraspecies VARS-IARS bitscores expressed in circles colored according to the thermal scale (Figure 2A).
Figure 2.

Distribution of similarity bitscores relating to VARS and IARS on SSU rRNA tree. (A) Bitscores for VARS-IARS pairs. (B) Bitscores for VARS (squares), or IARS (triangles), between Gla and other organisms. For building the consensus maximum parsimony tree of SSU rRNAs for 29 archaeal, 31 bacterial, and 19 eukaryotic species using PHYLIP version 3.698,[30] the sequences were aligned in Clustal Omega.[31] One thousand sets of bootstrap-resampled sequence alignments were generated using SEQBOOT and inputted into DNAPARS to construct maximum parsimony trees. The consensus tree was produced based on the 1000 sets of maximum parsimony trees using CONSENSE. The nodes indicate more than 85% bootstrap support (black), more than 50% (gray), or less than or equal to 50% (white). IARS indicates isoleucyl-tRNA synthetase; SSU rRNA, small subunit ribosomal RNA; VARS, valyl-tRNA synthetase.
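
The two ideas in this caption, bootstrap resampling of alignment columns and the color coding of node support, can be sketched in plain Python. This is not PHYLIP: the functions below are hypothetical stand-ins that only mimic resampling columns with replacement (the operation SEQBOOT performs) and the support bins given in the caption.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the same width.

    alignment maps a sequence name to its aligned string; all strings are
    assumed to have equal length.
    """
    names = list(alignment)
    n_cols = len(alignment[names[0]])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(alignment[name][i] for i in picks) for name in names}


def support_color(percent):
    """Node color bins from the caption: >85% black, >50% gray, else white."""
    if percent > 85:
        return "black"
    if percent > 50:
        return "gray"
    return "white"
```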

There was a concentration of euryarchaeons with high VARS-IARS similarity in a “Primitive Archaea Cluster” centered between Pfu and Mac. In the Bacteria domain, there was likewise a concentration of species with high VARS-IARS similarity in an “Ancestral Bacteria Cluster” centered between Det and Hth. The deepest branching species in the Bacteria domain were 2 members of the Aquificae phylum, viz the anaerobic Det with high VARS-IARS similarity, and the microaerobic Aae with low similarity. As mutations could cause loss of similarity more easily than gain, this suggests that Aae has evolved far from the ancestral Aquificae species possibly as part of the wave of radical changes undergone by some former anaerobes in response to the appearance of atmospheric oxygen,[32,33] thereby sustaining extensive evolutionary erosion of its VARS-IARS similarity.
The enhanced resistance of paralogue similarity to perturbation by horizontal gene transfer (HGT), due to the difficulty of transfer of a pair of genes compared to the transfer of a single gene, was illustrated by the preservation of low VARS-IARS bitscores in the proteobacterial region of the tree against large shifts caused by HGT events. Given the relative paucity of HGT effects on VARS-IARS similarity, the parallel prominences of high VARS-IARS similarity-bitscore species in the Primitive Archaea Cluster and the Ancestral Bacteria Cluster were explicable by vertical genetic transmission of the VARS and IARS genes from an Mka-proximal root of life to the archaeal cluster, and in turn to the bacterial cluster. As the top-ranked bacterial bitscore of Mau at 378 was between those of archaeons Mac at 382 and Pfu at 369, the results indicated that the Ancestral Bacteria Cluster branched off from the Primitive Archaea Cluster near the Mka-proximal root of life. The medium VARS-IARS bitscores of Esi, Tps, Bpr, and Cme among the Eukarya (Figure 2A) also pointed to the conservation of intraspecies VARS-IARS similarity in this domain. The much higher VARS (colored squares) and IARS (colored triangles) bitscores between Gla and various bacterial species compared to archaeal species, except for the high similarity exhibited by Gla IARS toward that of Abo, suggests that Eukarya received VARS from Bacteria and IARS from Abo or a bacterium (Figure 2B).

Sequence alignments

The aligned segments of VARS and IARS (Figure 3) from Mka, Mau, and Esi, viz the archaeon, bacterium, and eukaryote displaying the highest VARS-IARS similarity within their respective domains, included 42 of 207 columns where all 6 sequences carried the same amino acid, in support of sequence conservation of this pair of paralogous genes among all 3 living domains. Together with the higher rankings of VARS-IARS similarity attained by archaeons relative to both bacteria and eukaryotes (Figure 1), the sequence conservation observed represented strong evidence for the vertical transmission of the VARS and IARS genes from Archaea to both Bacteria and Eukarya.
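
Counting alignment columns in which every sequence carries the same amino acid (42 of 207 in the segments discussed above) amounts to a column-wise scan. A minimal sketch, assuming equal-length aligned strings with "-" as the gap character; the function name and the toy sequences in the usage note are hypothetical.

```python
def count_identical_columns(aligned_seqs, gap="-"):
    """Count columns where all sequences share one residue and none has a gap."""
    count = 0
    for column in zip(*aligned_seqs):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            count += 1
    return count
```

For example, the toy alignment `["MKVLA", "MRVLA", "MKV-A"]` has 3 such columns (positions 1, 3, and 5).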
Figure 3.

Segments of the aligned VARS and IARS sequences of Mka, Mau, and Esi. Sequences were aligned using Clustal Omega, and the numbers indicate the positions of amino acid residues on the complete sequence alignment (Figure S1). Similar amino acids in the same column are colored in orange, and ⩾50% conserved ones in blue. Asterisks mark the 6 positions where a V or L residue is found in all 6 sequences. IARS indicates isoleucyl-tRNA synthetase; VARS, valyl-tRNA synthetase.


Process of eukaryogenesis

Extensive evidence indicates that an endosymbiotic event between an archaeal parent and an alphaproteobacterium played a key role in the development of Eukarya.[34,35] Proposals regarding the identity of the archaeal parent have focused on a range of archaeons including Thermoplasmata, where the lack of a rigid cell wall could facilitate engulfment of the alphaproteobacterium[36,37]; and the Asgard archaeons[38,39] that were enriched in eukaryotic signature proteins (ESPs).[40] There is a phylogenomic impasse regarding these, as well as other, choices.[41,42] Upon BLASTP comparisons of the 79 Gla, 81 Trv, 84 Sce, and 86 Hsa rProt families with prokaryotic rProts, 69/69 Gla, 71/72 Trv, 71/72 Sce, and 71/71 Hsa rProts with prokaryotic resemblance showed higher similarity toward archaeons than toward bacteria; thus, only 1 of the 72 Trv rProts (L29) and 1 of the 72 Sce rProts (S4) showed higher similarity toward bacteria than toward archaeons (Figures 4A and S2), clearly indicating that eukaryogenesis was hosted by an archaeal parent instead of a bacterial parent.[36,37] Those rProts in Table S1 without any prokaryotic resemblance might be derived from a prokaryote not analyzed in this study, invented by the eukaryogenic lineage, or diminished in their resemblances by evolutionary changes beyond recognition by BLASTP.
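
The archaeal-versus-bacterial tally reported above (for example, 71 of 72 Trv rProts scoring higher toward archaeons) can be sketched as a comparison of two maxima per rProt. The function name and input layout are hypothetical, and ties are assumed to be counted in neither group, since the text does not say how ties were handled.

```python
def tally_affinity(max_scores):
    """Count rProts scoring higher toward archaeons vs. toward bacteria.

    max_scores maps an rProt family to a pair:
    (maximum archaeal bitscore, maximum bacterial bitscore).
    Returns (n_archaeal_higher, n_bacterial_higher).
    """
    archaeal_higher = 0
    bacterial_higher = 0
    for archaeal, bacterial in max_scores.values():
        if archaeal > bacterial:
            archaeal_higher += 1
        elif bacterial > archaeal:
            bacterial_higher += 1
    return archaeal_higher, bacterial_higher
```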
Figure 4.

Protein sequence similarities between Gla and prokaryotic species. (A) Maximum BLASTP bitscores between Gla rProts and prokaryotic rProts. (B) Bitscores of PEP-utilizing enzyme mobile domain (PF00391) between Gla and prokaryotes. (C) Bitscores between some of the Gla-like proteins from Table S3 and potentially homologous proteins in various prokaryotes. (D) Numbers of the 162 Gla-like proteins found in various prokaryotes. The color coding and order of different prokaryotic species on the x-axis in (B), (C), and (D) are the same as those in (A). PEP indicates phosphoenolpyruvate.

Among the 6502 proteins in the Gla proteome, 3203 showed finite similarity bitscores toward the sequences of one or more of the 82 prokaryotes tested, and the phosphoenolpyruvate (PEP)-utilizing enzyme mobile domain of Gla yielded the highest combined BLASTP bitscore of any Gla protein toward prokaryotic protein families, with Acf, Abo, and Mac (2nd, 1st, and 14th red columns from the right in Figure 4B) showing the top 3 archaeal bitscores. The bitscores were high for Tho and Hei but low for Odi and nil for Lok (3rd, 4th, 2nd, and 1st purple columns from the right) among the Asgard archaea, and high for Tvo and Tac but low for Mte, Min, and Fac among the Thermoplasmata (5th, 6th, 3rd, 4th, and 7th red columns from the right). Figure 4C shows the distribution of potential archaeal and bacterial homologues of some of the 162 Gla-like proteins that were either ESPs or relatively rare proteins found in fewer than 10 of the 82 prokaryotes analyzed (Table S3). The Asgard archaeons (purple columns) and a number of bacterial species (green columns) were prominently endowed with the ESPs or rare proteins required for eukaryogenesis (Figure 4D and Table S4).
However, the highest scoring Tho, Odi, Xca, and Lok in this regard harbored only 26, 19, 17, and 16 of the 162 Gla-like proteins, respectively, which underlined the difficulty for any archaeon or bacterium to accumulate a sufficient number of eukaryote-type proteins to launch the Eukarya domain by itself. On the other hand, it was impressive that one or more potential prokaryotic sources could be located for each of the 162 Gla-like proteins targeted despite the modest spectrum of prokaryotes analyzed in Figure 4C, demonstrating that the obstacle to eukaryogenesis posed by an ESP deficit could be overcome readily if some efficient mechanism was available for collecting the requisite protein genes from a broad spectrum of prokaryotes. With respect to the problem of inadequacy of ESPs occurring in any single archaeon,[43,44] it was suggested that HGTs might provide a solution,[39] but the actual adoption of HGT-transferred genes by recipients might be a limiting factor,[45] as illustrated by the fact that few members of the alphaproteobacterial and Asgard groups had spread a large fraction of their Gla-like proteins to all other members of the same group through HGTs (Figure 4C).

Nature of archaeal parent

Eukaryogenesis could follow a mitochondria-early scenario or a mitochondria-late scenario,[46] and there is no consensus on these 2 scenarios.[47-49] Previously, the proteome of the eukaryote Sce was found to contain a rich variety of bacterial proteins, and also some archaeal ones, and it was suggested that the influx of bacterial genes into Sce was not explicable by a merger between the archaeal parent and another bacterium besides an alphaproteobacterium, or by uptake of bacterial genes through ingestion of bacteria as food.[35] When the eukaryotic Gla and Trv proteomes were employed as probes for BLASTP queries against various prokaryotic proteomes, they gave rise to so many hits with a range of archaea and bacteria (Table S5) that the influx of bacterial and archaeal genes into the eukaryogenic lineage would need to be mediated by some especially efficient form of HGT. Comparable yet nonidentical spectra of inter-proteome similarities were exhibited by Gla and Trv toward the prokaryotes, with archaeal bitscores surpassing bacterial ones in the case of Gla but vice versa in the case of Trv (Figure 5A). It was suggested that actin-associated proteins and regulators were introduced into archaea from diverse bacteria[50]; and the influx of a large number of bacterial genes into a methanogen was found to precede its evolution into the haloarchaeans.[51] Accordingly, an influx of prokaryotic genes into the eukaryogenic lineage, likely beginning prior to the emergence of the archaeal parent and continuing through to the Last Eukaryotic Common Ancestor (LECA) and the early eukaryotes, could play a crucial role in eukaryogenesis.
Figure 5.

Inter-proteome similarity bitscores. (A) Total similarity bitscores of Gla and Trv proteomes toward individual prokaryotic proteomes. Relationships of average bitscore per best-match hit (y-axis) with the number of best-match hits (x-axis): (B) between prokaryotic and Gla proteomes and (C) between prokaryotic and Trv proteomes.

Based on the premise that the free-living archaeal parent might still retain recognizable similarity toward eukaryotes, 46 archaeal proteomes were compared regarding their relationships with the proteomes of Gla and Trv. Figure 5B and C showed that the proteome of the Aciduliprofundum archaeon Abo displayed the highest average similarity bitscores among archaeons toward the proteomes of both Gla and Trv, which identified Abo and its companion species Acf as candidate archaeal parents. The Asgard archaeons Hei, Odi, Tho, Lok, and the cultivatable Psy[52,53] constituted an unusually inventive group with both some high average similarity bitscores and a rich store of ESPs, even though their average similarity bitscores were lower than those of Abo. Among all the prokaryotic species, Psy also yielded the highest number of similarity hits toward both Gla and Trv, indicating that the archaeal parent contained more genes derived from Psy than any other archaeon. For the bacterial species, once any bacterial protein entered into the eukaryogenic lineage, its eukaryotic version and free-living bacterial version became segregated irreversibly and evolved independently; the divergence between the 2 versions would increase with time as in the case of paralogues such as VARS and IARS. Accordingly, the higher inter-proteome bitscores of Tpa toward Gla and Trv compared to Mpn could be at least in part the result of later entry of Tpa genes than Mpn genes into the eukaryotes.
These findings thus suggest that the entries of various bacterial proteins at different times into the eukaryogenic lineage would furnish useful landmarks for deciphering the chronicle of eukaryogenesis. The determinants of the bitscores of archaeons outside of the archaeal parent were more complex, for they would depend not only on the time of entry of their proteins into the eukaryotes but also on the extent of their kinship with the archaeal parent. When the bacterial-gene contents of different archaeons were compared regarding their abilities to acquire bacterial genes, Hla, Hgi, and Mac with their large proteomes (3704 to 4469 protein-coding genes) displayed high similarity bitscores toward a wide range of bacteria (Figure 6, left panel). However, when the bitscore of each archaeon was normalized with respect to the number of protein-coding genes in its genome, the normalized bitscores of the smaller Abo, Acf, Mte, Tvo, and Tac (each with <1600 protein-coding genes), Mfe (1283 protein-coding genes), and Mlt (1291 protein-coding genes) became more prominent (Figure 6, right panel). The medium-sized Pfu (2065 protein-coding genes) gave much the same result with or without normalization. Notably, the high similarity bitscores exhibited by these archaeal proteomes toward multiple bacterial proteomes suggest that they had efficiently adopted exogenous genes received by them from HGT into their own genomes. In contrast, the bacterial proteomes of Bja, Tht, Pel, Dth, Tte, and the DNA transformation-active Bsu exhibited only modest bitscores toward a smaller number of archaeons. This enhanced ability of some archaeons to adopt exogenous genes may be referred to as an accelerated gene adoption (AGA) phenotype.
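
The effect of normalizing by proteome size can be illustrated with a toy calculation. The gene counts below come from the text (4469 is the upper end of the range given for the large proteomes, 1283 is Mfe), but the total bitscores are hypothetical placeholders chosen only to show how a ranking can flip after normalization; the actual values are in Table S6.

```python
def normalize_by_gene_count(total_bitscore, n_protein_coding_genes):
    """Bitscore per protein-coding gene of the archaeon."""
    return total_bitscore / n_protein_coding_genes


# (hypothetical total bitscore toward one bacterium, protein-coding genes)
large_archaeon = (40000.0, 4469)
small_archaeon = (20000.0, 1283)

raw_large, raw_small = large_archaeon[0], small_archaeon[0]
norm_large = normalize_by_gene_count(*large_archaeon)
norm_small = normalize_by_gene_count(*small_archaeon)
```

With these placeholder numbers the large proteome has the higher raw score, while the small proteome has the higher score per gene.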
The prominence of AGA in some archaeons was consistent with the finding that 44% of Mja gene products were derived from bacteria.[54] A possible determinant of the AGA phenotype could be the “Darwinian Threshold,” viz organisms below a given threshold level of organizational connectedness adopt genes received from HGTs more readily than organisms above the threshold.[55] Other determinants might include a full-fledged or partial scavenger lifestyle,[56] tetraethers in their membranes,[56,57] or the presence of rudimentary phagocytosis.[58,59] Previously, it was suggested that eukaryotes could ingest bacteria as proto-organelles, and upon lysis transfer their genes to the eukaryotic nuclear genome through a recycling ratchet.[60,61] The plausible deployment of the dissimilar AGA and recycling ratchet mechanisms for gene transfer in eukaryogenesis underlines the significance of prokaryotic genes in eukaryogenesis. Importantly, the bacterial species Rpr, Bap, Ctr, Mpn, and Tpa furnished few genes to the AGA-active archaeons (Figure 6), and their proteins were also depleted in the proteomes of both Gla and Trv (Figure 5A), clearly indicating that AGA played a major role in governing the entry of bacterial genes into Eukarya.
Figure 6.

Similarity bitscores between archaeal proteomes (y-axis) and bacterial proteomes (x-axis) without (left) or with (right) normalization based on the number of protein-coding genes in each archaeon. Data for the heat maps are given in Table S6.

On account of the large variety and numbers of prokaryotic genes to be included in eukaryotic genomes (Figure 5A), it would be essential for the archaeal parent to be highly active in AGA, so that it could assemble beneficial genes from wide-ranging prokaryotic sources and incorporate them into its own genome in the course of eukaryogenesis. Besides AGA activity, Abo, the first cultivatable archaeon from the “Deep-sea hydrothermal vent euryarchaeotic 2” (DHVE2) group, and its facultatively anaerobic companion species Acf[57,62,63] possess an exceptionally flexible cell surface that can form small blebbing vesicles that bud off and anneal with other cells. While all prokaryotic cells evolve on the basis of nucleotidyl mutations through the replacement, addition, and subtraction of nucleotides, AGA would enable the archaeal parent to evolve on the basis of gene-content mutations as well, through the replacement, addition, and subtraction of genes or gene clusters, expediting eukaryogenesis by orders of magnitude. The AGA-active Tac, for example, succeeded in acquiring gene clusters from other organisms for rProts, NADH dehydrogenase, precorrin biosynthesis, flagellar proteins, and a protein degradation pathway, amounting to 32% of its total open reading frames, via its AGA, which was considerably less active than that of Abo and Acf (Figure 6, right panel).[56] The blebbing vesicles of Abo and Acf could further mediate gene exchanges between individual cells engaged in eukaryogenesis to advance the process.
Overall, therefore, based on their highest archaeal BLASTP bitscores toward the PEP-utilizing enzyme mobile domain of Gla (Figure 4B), highest average archaeal bitscores toward the Gla and Trv proteomes (Figure 5B and C), front-rank AGA activity, blebbing membrane vesicles, and an almost complete Embden-Meyerhof-Parnas pathway[62] that could evolve readily into a glycolytic pathway to link up with mitochondrial respiration, Abo and Acf were endowed with a range of advantageous attributes as candidates for the archaeal-parent role.[64] Acf and Abo are highly similar, although the facultatively anaerobic nature of Acf could enable it to explore more ecological niches than the anaerobic Abo, and so to collect and adopt useful genes from a wider range of HGT donors.
Similarity bitscores displayed by 225 prokaryotic proteomes (46 archaeons, 150 alphaproteobacterial genera, and 29 other bacteria) toward the total mitochondrial DNA-encoded proteins of different eukaryotes indicated that the prokaryotic proteomes displaying top similarity toward each of 19 mitochondrial proteomes were all alphaproteobacterial ones (Table S7). The distributions of the bitscores of the prokaryotic proteomes toward the mitochondrial DNA-encoded proteins of R americana, M paleacea, and P falciparum are illustrated in Figure 7; these represent, respectively, the mitochondria with the highest total score, the mitochondria with the second highest total score, and mitochondria with a small number of mitochondrial DNA-encoded proteins. The 3 top-scoring alphaproteobacteria in each instance are indicated with their bitscores in parentheses. These findings demonstrated the dominance of alphaproteobacterial precursors in mitochondrial evolution among extant eukaryotes.
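The ranking of prokaryotic proteomes by total bitscore toward a set of mitochondrial DNA-encoded proteins can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes that the total bitscore of a proteome is the sum, over all query proteins, of the best BLASTP bitscore each query obtains against that proteome. The function names, the input layout, and the toy scores are hypothetical.

```python
from collections import defaultdict


def total_bitscores(hits):
    """Sum, per proteome, the best bitscore obtained by each query protein.

    hits: iterable of (query_protein, proteome, bitscore) tuples, e.g. parsed
    from BLASTP tabular output. Keeping only the best hit per query is an
    assumption made for this sketch.
    """
    best = {}
    for query, proteome, bitscore in hits:
        key = (proteome, query)
        if bitscore > best.get(key, 0.0):
            best[key] = bitscore

    totals = defaultdict(float)
    for (proteome, _query), bitscore in best.items():
        totals[proteome] += bitscore
    return dict(totals)


def top_scoring(totals, k=3):
    """Return the k proteomes with the highest total bitscore, best first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy, made-up hits for two mitochondrial query proteins (not data from the paper).
hits = [
    ("cox1", "alphaproteobacterium_A", 620.0),
    ("cox1", "alphaproteobacterium_A", 410.0),  # weaker second hit, ignored
    ("cob", "alphaproteobacterium_A", 380.0),
    ("cox1", "archaeon_B", 150.0),
    ("cob", "archaeon_B", 90.0),
    ("cox1", "bacterium_C", 500.0),
    ("cob", "bacterium_C", 300.0),
]

totals = total_bitscores(hits)
ranking = top_scoring(totals, k=3)
for name, score in ranking:
    print(f"{name} ({score:.0f})")
```

Applied to real BLASTP results, a ranking of this kind would yield the top-scoring prokaryotes and parenthesized totals of the sort reported for each mitochondrial proteome, subject to how the totals were actually computed in the study.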
Figure 7.

Similarity bitscores between mitochondrial DNA-encoded proteins and prokaryotic proteins. Total bitscores displayed by 46 archaeons, 150 alphaproteobacterial genera, and 29 other kinds of bacteria toward 3 species of mitochondrial DNA-encoded proteins are shown in the 3 panels. In each case, the 3 top-scoring prokaryotes are indicated with their individual total bitscores inside parentheses.


Conclusions

In this study, Methanopyrus kandleri was found to be the top-ranked organism with respect to intraspecies VARS-IARS similarity among 5398 species from the 3 biological domains, and therefore the closest to LUCA. Moreover, the parallel clusters of archaeal and bacterial species with high VARS-IARS similarity delineated a pathway of descent of these genes from the Primitive Archaea Cluster to the Ancestral Bacteria Cluster, branching early from the Archaea domain. The asterisked columns in Figure 3, where all 6 aligned protein sequences uniformly showed a Val or Leu residue despite the ease with which Val, Leu, and Ile can be interchanged in evolution, conveyed a surprising level of protein sequence conservation across 2 different proteins, 3 biological domains, and a time span of more than 2 billion years, in support of the descent of Bacteria and Eukarya from an archaeal root of life. With respect to eukaryogenesis, the predominance of eukaryotic-archaeal similarities over eukaryotic-bacterial similarities among rProts showed that the prokaryotic parent which hosted the process of eukaryogenesis was an archaeal parent rather than a bacterial parent. The evidence suggests that the archaeal parent was an archaeon enriched with eukaryote-homologous proteins and adept at acquiring exogenous genes through AGA, as exemplified by the Aciduliprofundum archaeons.
Supplemental material for this article: Figures S1-S2 and Tables S1-S7.