Alejandro Álvarez-Lugo1,2, Arturo Becerra2.
Abstract
Gene duplication is a crucial process involved in the appearance of new genes and functions. It is thought to have played a major role in the growth of enzyme families and the expansion of metabolism, both at the biosphere's dawn and in recent times. Here, we used a comparative approach to analyze the paralogous enzyme content within each of the seven enzymatic classes for a representative sample of prokaryotes. We found a high ratio of paralogs for three enzymatic classes: oxidoreductases, isomerases, and translocases. Within each of them, most of the paralogs belong to only a few subclasses. Our results suggest an intricate scenario for the evolution of prokaryotic enzymes, involving different fates for duplicated enzymes fixed in the genome, where around 20-40% of prokaryotic enzymes have paralogs. Intracellular organisms have a lower ratio of duplicated enzymes, whereas free-living organisms show the highest ratios. We also found that phylogenetically close phyla, and some unrelated phyla with the same lifestyle, share similar genomic and biochemical traits, which ultimately supports the idea that gene duplication is associated with environmental adaptation.
Keywords: enzymatic classes; enzyme evolution; function class; gene duplication; paralogous enzymes
Year: 2021 PMID: 34335678 PMCID: PMC8318041 DOI: 10.3389/fgene.2021.641817
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Relation between the enzyme and protein content, and the genome size. For each pair of variables, a power-law equation is the one that best explains the distribution of the data. The equations and R-squared values are as follows: (A) $y = 2.51x^{0.72}$, $R^2 = 0.79$; (B) $y = 0.04x^{0.66}$, $R^2 = 0.76$; (C) $y = 0.004x^{0.9}$, $R^2 = 0.95$.
FIGURE 2. Ratio of paralogous enzymes as a function of the number of enzymes (A) and the number of proteins (B), for each individual genome. The power-law equation for each adjustment and the R-squared value are as follows: (A) $y = 0.003x^{0.67}$, $R^2 = 0.68$; (B) $y = 0.003x^{0.58}$, $R^2 = 0.75$. Note that the R-squared value is higher when the number of proteins is considered ($R^2 = 0.75$) instead of the number of enzymes ($R^2 = 0.68$).
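The power-law fits reported in these captions can be illustrated with a minimal sketch. It assumes the common approach of ordinary least squares on log-transformed data. The source does not state how its fits were obtained, so the function name `fit_power_law`, the synthetic x values, and the choice to compute R-squared in log-log space are all assumptions for illustration. The check data are generated from the coefficients reported for Figure 1, panel (A), not taken from the study's genomes.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Hypothetical helper: returns (a, b, r2), where r2 is the
    coefficient of determination measured in log-log space.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    syy = sum((v - my) ** 2 for v in ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx                    # exponent = slope of the log-log line
    a = math.exp(my - b * mx)        # prefactor = exp(intercept)
    r2 = (sxy * sxy) / (sxx * syy)   # squared correlation in log-log space
    return a, b, r2

# Synthetic, noise-free points generated from y = 2.51 * x**0.72
# (coefficients from Figure 1A; the x values are arbitrary).
xs = [0.5, 1.0, 2.0, 4.0, 8.0]
ys = [2.51 * x ** 0.72 for x in xs]
a, b, r2 = fit_power_law(xs, ys)
```

On noise-free input the fit recovers the generating coefficients. On real data, an R-squared computed in log-log space can differ from one computed on the original scale, so values from this sketch are not directly comparable to those in the captions.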
FIGURE 3. Ratio of paralogous enzymes in prokaryotic organisms. In panel (A), the ratio of paralogous enzymes is plotted for the whole sample and is separated by enzymatic class. For panels (B–E), we also plot the ratio of paralogous enzymes for each enzymatic class but separated by the four different lifestyles in which we sorted our initial sample: (B) free-living organisms; (C) extremophiles; (D) pathogenic (non-intracellular) organisms; (E) intracellular organisms (both endosymbiont and intracellular pathogens).
FIGURE 4. Boxplot showing the comparison of the average paralogous-enzymes ratio across the different lifestyles. Black dots represent the outliers found within each lifestyle. Interestingly, the intracellular organisms were the lifestyle with the highest number of outliers.
Number of subcategories and entries for each enzymatic class.
| Enzymatic class | EC code | No. of subclasses | No. of sub-subclasses | No. of enzymes |
| Oxidoreductases | EC 1 | 26 | 148 | 1798 |
| Transferases | EC 2 | 10 | 38 | 1900 |
| Hydrolases | EC 3 | 13 | 66 | 1360 |
| Lyases | EC 4 | 8 | 17 | 677 |
| Isomerases | EC 5 | 7 | 19 | 310 |
| Ligases | EC 6 | 6 | 12 | 203 |
| Translocases | EC 7 | 6 | 10 | 90 |
FIGURE 5. Number of paralogous enzymes found within the subclasses of prokaryotic oxidoreductases (A), isomerases (B), and translocases (C). Each cell in the heatmaps represents the mean value of the phylum for that specific subclass.
FIGURE 6. Principal component analysis, considering the mean values for the following 11 variables: genome size, number of proteins, number of enzymes, and number of paralogous proteins, plus the average ratio of paralogous enzymes for each of the seven enzymatic classes. Each circle represents a single phylum, according to the KEGG Organisms Database, and those grouped into the same superphylum are depicted with the same color. The only exceptions, which are not included in a supergroup but are treated as individual phyla, are the Aquificae, Thermotogae, and Spirochaetes. Single phyla are indicated by the following numbers: (1) Gammaproteobacteria-Enterobacteria, (2) Gammaproteobacteria-Others, (3) Betaproteobacteria, (4) Epsilonproteobacteria, (5) Deltaproteobacteria, (6) Alphaproteobacteria, (7) Other proteobacteria, (8) Firmicutes-Bacilli, (9) Firmicutes-Clostridia, (10) Firmicutes-Others, (11) Tenericutes, (12) Actinobacteria, (13) Cyanobacteria, (14) Chloroflexi, (15) Deinococcus-Thermus, (16) Unclassified Terrabacteria Group, (17) Verrucomicrobia, (18) Spirochaetes, (19) Synergistetes, (20) Acidobacteria, (21) Fibrobacteres, (22) Fusobacteria, (23) Gemmatimonadetes, (24) Planctomyces, (25) Chlamydia, (26) Elusimicrobia, (27) Bacteroidetes, (28) Chlorobi, (29) Aquificae, (30) Thermotogae, (31) Deferribacteres, (32) Dictyoglomi, (33) Nitrospirae, (34) Euryarchaeota, (35) Crenarchaeota, (36) Thaumarchaeota, (37) Korarchaeota, (38) Nanoarchaeota, (39) Bathyarchaeota.
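As a rough illustration of the kind of analysis described in this caption, the sketch below standardizes per-phylum mean values and extracts the first principal component by power iteration on the correlation matrix. The source does not say which software or scaling was used, so the z-score standardization, the helper names `standardize` and `first_component`, and the three-variable toy table are all assumptions. The numbers are invented for illustration and are not data from the study.

```python
import math

def standardize(rows):
    """Z-score each column so variables on different scales are comparable."""
    n, m = len(rows), len(rows[0])
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
           for c, mu in zip(cols, means)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]

def first_component(z, iters=500):
    """Leading eigenvector of the correlation matrix via power iteration.

    Returns (loadings, fraction of total variance explained).
    """
    n, m = len(z), len(z[0])
    corr = [[sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1)
             for j in range(m)] for i in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(m))
                     for i in range(m))
    return v, eigenvalue / m  # total variance of standardized data equals m

# Invented toy table: one row per hypothetical phylum, with columns
# (mean genome size, mean number of proteins, mean paralog ratio).
rows = [
    [1.0, 900.0, 0.10],
    [2.0, 1800.0, 0.18],
    [4.0, 3500.0, 0.27],
    [6.0, 5200.0, 0.33],
    [8.0, 7100.0, 0.38],
]
z = standardize(rows)
loadings, explained = first_component(z)
# Score of each phylum on the first component (the x-axis of a PCA plot).
scores = [sum(zi * li for zi, li in zip(row, loadings)) for row in z]
```

Standardizing first matters here because the 11 variables in the caption mix counts in the thousands with ratios below one; without it, the largest-valued variable would dominate the components. A second component could be obtained by deflating the correlation matrix and repeating the iteration.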