Wen-Shyong Tzou1, Ying Chu1, Tzung-Yi Lin2, Chin-Hwa Hu2, Tun-Wen Pai3, Hsin-Fu Liu4, Han-Jia Lin2, Ildeofonso Cases5, Ana Rojas5, Mayka Sanchez6, Zong-Ye You2, Ming-Wei Hsu2.
Abstract
Adaptation of enzymes in a metabolic pathway can occur not only through changes in amino acid sequences but also through variations in transcriptional activation, mRNA splicing and mRNA translation. The heme biosynthesis pathway, a linear pathway composed of eight consecutive enzymes in animals, provides researchers with ample information for multiple types of evolutionary analyses performed with respect to the position of each enzyme in the pathway. Through bioinformatics analysis, we found that the protein-coding sequences of all enzymes in this pathway are under strong purifying selection, from cnidarians to mammals. However, loose evolutionary constraints are observed for enzymes in which self-catalysis occurs. Through comparative genomics, we found that in animals, the first intron of the enzyme-encoding genes has been co-opted for transcriptional activation of the genes in this pathway. Organisms sense the cellular content of iron, and through iron-responsive elements in the 5' untranslated regions of mRNAs and the intron-exon boundary regions of pathway genes, translational inhibition and exon choice in enzymes may be enabled, respectively. Pathway product (heme)-mediated negative feedback control can affect the transport of pathway enzymes into the mitochondria as well as the ubiquitin-mediated stability of enzymes. Remarkably, the positions of these controls on pathway activity are not ubiquitous but are biased towards the enzymes in the upstream portion of the pathway. We revealed that multiple-level controls on the activity of the heme biosynthesis pathway depend on the linear depth of the enzymes in the pathway, indicating a new strategy for discovering the molecular constraints that shape the evolution of a metabolic pathway.
Year: 2014 PMID: 24489775 PMCID: PMC3904948 DOI: 10.1371/journal.pone.0086718
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Heme biosynthesis pathway in animals.
The substrate and product are indicated for each enzyme and the subcellular localization of each enzyme is also shown (cytosol or mitochondria). Each enzyme is coded from one to eight according to the linear order of the pathway. Also shown are the processes by which hydroxymethylbilane, the substrate of UROS, can be non-enzymatically cyclized to form uroporphyrinogen I, leading to uroporphyrin I or coproporphyrin I, and the process by which uroporphyrinogen III, the product of UROS, can be auto-oxidized to form uroporphyrin III. Protoporphyrinogen, the substrate of PPO, can be auto-oxidized to form protoporphyrin.
Figure 2. Selection pressure on heme biosynthesis genes in animals.
ω values (dN/dS) were estimated with the M0 model for the eight heme biosynthesis genes in animals (A). The distributions of ω values (B), the nonsynonymous substitution rate, dN (C), and the synonymous substitution rate, dS (D), are also shown. The order of genes follows the linear order of their pathway positions (Figure 1).
Comparison of ω values among eight genes to test the significance of the ω variations among genes.
| Group 1 | Group 2 | Group 3 | |||||||
| FECH | PBGD | CPO | ALAS | UROD | PBGS | PPO | UROS | ||
| FECH | ** | * | ** | ** | ** | ||||
| Group 1 | PBGD | * | ** | ** | ** | ||||
| CPO | * | * | ** | ** | |||||
| ALAS | ** | ** | |||||||
| Group 2 | UROD | ** | * | ** | ** | ||||
| PBGS | ** | ** | * | * | ** | ** | |||
| Group 3 | PPO | ** | ** | ** | ** | ** | ** | ||
| UROS | ** | ** | ** | ** | ** | ** | |||
For each comparison, a pair of genes was chosen. The ω value of the gene in the row was constrained to the average of the ω values of the genes in the row and column. The difference in likelihood between the null model M0 and the constrained model was then obtained. Significance is labeled * for p<0.05 and ** for p<0.01.
Selection pressure on heme biosynthesis pathway genes estimated with the branch model.
| Gene / branch | Model | Estimates of parameters | 2ΔL | p(χ2) |
| ALAS+ALAS1+ALAS2 | M0: one ratio | ω = 0.05356 | | |
| Teleost ALAS2 branch | two ratios | ω0 = 0.05323, ω = 8.04979 | 10.087074 | 1.49E-03 |
| PBGS | M0: one ratio | ω = 0.06398 | | |
| Arthropod branch | two ratios | ω0 = 0.06398, ω = 999.000 | 4.478516 | 3.43E-02 |
| UROD | M0: one ratio | ω = 0.06194 | | |
| Teleost branch | two ratios | ω0 = 0.06135, ω = 13.69790 | 6.01671 | 1.42E-02 |
2ΔL: twice the difference between the log likelihoods of the M0 (one-ratio) model and the two-ratio model.
p(χ2): p-value of the likelihood ratio test.
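The p-values in the table above can be reproduced from the 2ΔL statistics with a short sketch. It assumes the likelihood ratio test compares 2ΔL against a χ2 distribution with one degree of freedom (the two-ratio model adds a single ω parameter); the record does not state the degrees of freedom, so treat that as an assumption. The function name `lrt_pvalue_df1` is hypothetical.

```python
import math


def lrt_pvalue_df1(two_delta_l: float) -> float:
    """Upper-tail chi-square probability for 1 degree of freedom.

    For df = 1 the chi-square survival function reduces to
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    Assumption: the test has exactly one degree of freedom.
    """
    if two_delta_l < 0:
        raise ValueError("2ΔL must be non-negative")
    return math.erfc(math.sqrt(two_delta_l / 2.0))


# 2ΔL values copied from the branch-model table.
branch_tests = [
    ("Teleost ALAS2 branch", 10.087074),
    ("Arthropod PBGS branch", 4.478516),
    ("Teleost UROD branch", 6.01671),
]

for label, stat in branch_tests:
    print(f"{label}: 2ΔL = {stat}, p = {lrt_pvalue_df1(stat):.2e}")
```

A closed form exists only for this one-degree-of-freedom case; a test with more free parameters would need a general χ2 survival function instead.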
Positive selection in different lineages of heme biosynthesis pathway genes.
| Foreground branch | 2ΔL | p(χ2) | Estimates of the parameters in the modified model A | Positively selected sites |
| Teleost ALAS2 | 40.20213 | 2.29E-10 | p0 = 0.86464, p1 = 0.07816, p2a = 0.05246, p2b = 0.00474, ω0 = 0.04595, ω2 = 999.0 | 204E, 243K, 352P, 353K, 431G |
| Arthropod PBGS | 23.97615 | 9.75E-07 | p0 = 0.86020, p1 = 0.03447, p2a = 0.10127, p2b = 0.00406, ω0 = 0.06383, ω2 = 999.0 | 11Y, 106H, 162C, 195S, 267K, 274A, 314I |
| Teleost UROD | 25.60371 | 4.19E-07 | p0 = 0.87046, p1 = 0.07295, p2a = 0.05222, p2b = 0.00438, ω0 = 0.06761, ω2 = 999.0 | 136Q, 174M, 290K, 297K, 300T, 316E, 349H |
2ΔL: twice the difference between the log likelihoods of the M0 (one-ratio) model and the two-ratio model.
p(χ2): p-value of the likelihood ratio test.
For the branch-site model A, each codon site is assigned to one of four classes: class 0 with 0< ω0<1 in all branches; class 1 with ω1 = 1 in all branches; class 2a with foreground ω2>1 but background 0< ω0<1; and class 2b with foreground ω2>1 but background ω1 = 1. p0 is the proportion of codons in class 0; p1 is the proportion of codons in class 1; p2a is the proportion of codons in class 2a; p2b is the proportion of codons in class 2b.
Bayes empirical Bayes (BEB) analysis was used to calculate posterior probabilities and identify the sites (amino acid residues) under positive selection (posterior probability higher than 95%). The sites are indexed by the amino acid at that position in the human sequence (ALAS1, PBGS, UROD).
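The four class proportions are not independent. In the standard formulation of branch-site model A, the share of sites not in class 0 or class 1 is split between classes 2a and 2b in the ratio p0 : p1. The sketch below recomputes p2a and p2b from the tabulated p0 and p1 under that assumption; the function name `branch_site_proportions` is hypothetical.

```python
from typing import Tuple


def branch_site_proportions(p0: float, p1: float) -> Tuple[float, float]:
    """Return (p2a, p2b) implied by p0 and p1 under branch-site model A.

    Assumption: p2a = (1 - p0 - p1) * p0 / (p0 + p1) and
                p2b = (1 - p0 - p1) * p1 / (p0 + p1),
    so that the four proportions sum to 1.
    """
    if p0 < 0 or p1 < 0 or p0 + p1 <= 0 or p0 + p1 > 1:
        raise ValueError("p0 and p1 must be non-negative with 0 < p0 + p1 <= 1")
    remainder = 1.0 - p0 - p1
    p2a = remainder * p0 / (p0 + p1)
    p2b = remainder * p1 / (p0 + p1)
    return p2a, p2b


# (p0, p1) pairs copied from the branch-site table.
rows = {
    "Teleost ALAS2": (0.86464, 0.07816),
    "Arthropod PBGS": (0.86020, 0.03447),
    "Teleost UROD": (0.87046, 0.07295),
}

for branch, (p0, p1) in rows.items():
    p2a, p2b = branch_site_proportions(p0, p1)
    print(f"{branch}: p2a = {p2a:.5f}, p2b = {p2b:.5f}")
```

Comparing the printed values with the tabulated p2a and p2b is a quick consistency check on a transcribed table row.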
Figure 3. Three-dimensional view of the evolutionarily conserved DNase-hypersensitive sites in intron sequences.
For each gene in the biosynthesis pathway (ALAS1 and ALAS2 are treated separately because they are different genes located on different chromosomes), the length of the intersection of the DNA sequence that is evolutionarily conserved across vertebrates and DNase-hypersensitive sites is indicated on the z-axis. The intron ID is provided on the x-axis. Genes from ALAS1 to FECH are shown on the y-axis and are coded from one to eight according to the linear order of the pathway (Figure 1).
IRE in 5′UTR.
| Gene | Mammal | Bird, Reptile, Amphibian | Teleost | Chordate | Echinoderm | Arthropod | Cnidaria |
| ALAS | NF | NF | NF | 1(2) | 1(1) | 1(2) | 1(1) |
| ALAS1 | (5) | (5) | 3(4) | NF | NF | NF | NF |
| ALAS2 | 5(5) | 1(1) | 3(3) | NF | NF | NF | NF |
| PBGS | (6) | (3) | (4) | (1) | NF | (0) | (1) |
| PBGD | (5) | (3) | (3) | (1) | 1(1) | 2(2) | (1) |
| UROS | (5) | (4) | (3) | (0) | (0) | (1) | (1) |
| UROD | (5) | (1) | (3) | (1) | (1) | (1) | (2) |
| CPO | (4) | (3) | (4) | (1) | (1) | (1) | (0) |
| PPO | (5) | (1) | (4) | (0) | (0) | (1) | (2) |
| FECH | (5) | (0) | (3) | (0) | (0) | (1) | (1) |
Number of species containing a potential IRE in the 5′UTR of genes involved in the heme biosynthesis pathway.
The number in parentheses is the total number of 5′UTR sequences investigated.
NF: Not found.
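Cells in the table above combine a count and a total in one string. The following sketch turns a cell into a pair of numbers and a proportion. It assumes that a cell with no number before the parentheses, such as `(5)`, means that none of the five sequences contained the motif; the record does not say so explicitly. The helper names `parse_cell` and `proportion` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Optional count, then the total in parentheses, e.g. "3(4)" or "(5)".
_CELL = re.compile(r"^(\d*)\s*\((\d+)\)$")


def parse_cell(cell: str) -> Optional[Tuple[int, int]]:
    """Parse a cell into (found, total); return None for 'NF'.

    Assumption: a missing count before the parentheses means zero hits.
    """
    text = cell.strip()
    if text == "NF":
        return None
    match = _CELL.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    found = int(match.group(1)) if match.group(1) else 0
    return found, int(match.group(2))


def proportion(cell: str) -> Optional[float]:
    """Fraction of sequences with the motif, or None if undefined."""
    parsed = parse_cell(cell)
    if parsed is None:
        return None
    found, total = parsed
    if total == 0:  # e.g. "(0)": no sequences were available
        return None
    return found / total


# Example cells taken from the ALAS1 and ALAS2 rows.
for cell in ["(5)", "3(4)", "5(5)", "NF"]:
    print(cell, "->", parse_cell(cell), proportion(cell))
```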
IRE in intron.
| Gene | Mammal | Bird, Reptile, Amphibian | Teleost | Chordate | Echinoderm | Arthropod | Cnidaria |
| ALAS | NF | NF | NF | 2(2) | 1(1) | (4) | (2) |
| ALAS1 | 1(7) | 1(5) | (6) | NF | NF | NF | NF |
| ALAS2 | (5) | 1(2) | (5) | NF | NF | NF | NF |
| PBGS | 4(6) | 2(4) | 1(5) | (1) | NF | (1) | 1(3) |
| PBGD | 3(7) | (5) | (6) | (1) | (1) | (4) | 1(2) |
| UROS | 5(7) | 3(5) | (4) | (1) | (1) | (2) | (2) |
| UROD | (7) | (5) | (5) | (2) | (1) | (4) | (3) |
| CPO | (7) | (5) | (5) | (1) | (1) | (1) | (1) |
| PPO | (7) | 1(2) | (5) | (2) | (1) | (1) | 1(3) |
| FECH | 5(7) | (4) | (4) | (1) | (1) | (5) | (1) |
Number of species containing a potential IRE in the introns of genes involved in the heme biosynthesis pathway.
The number in parentheses is the total number of intron sequences investigated.
NF: Not found.
Figure 4. Potential iron-responsive elements (IREs) in the introns and intron-exon boundaries of UROS genes.
IREs depicted as stem-loop structures are shown in the corresponding intron regions. UROS exon and intron IDs from four species are indicated. The conserved splicing acceptor site AG and the unpaired nucleotide of the IRE structure are also shown.
Figure 5. Sequence alignment of the IREs at the intron-exon boundaries of UROS from four species.
“>” and “<” represent the base pairing of the RNA secondary structure. The potential IRE consensus loop sequence, CAGUGN, and the unpaired nucleotide G are also shown with respect to the location of the IRE hairpin. The intron-exon boundary is indicated as |.
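As a rough illustration of the consensus loop named in this caption, the sketch below scans a sequence for CAGUGN. It is not the authors' IRE prediction method: it checks only the six-nucleotide loop and ignores the base-paired stem and the unpaired nucleotide, which a real IRE search would also need to verify. The function name `find_ire_loops` is hypothetical.

```python
import re
from typing import List

# Lookahead so that overlapping or adjacent loop motifs are all reported.
_LOOP = re.compile(r"(?=CAGUG[ACGU])")


def find_ire_loops(sequence: str) -> List[int]:
    """Return 0-based start positions of the CAGUGN loop motif.

    DNA input is accepted: the sequence is upper-cased and T is
    converted to U before matching.
    """
    rna = sequence.upper().replace("T", "U")
    return [match.start() for match in _LOOP.finditer(rna)]


# Toy sequences, not taken from any UROS gene.
print(find_ire_loops("GGCAGUGACC"))
print(find_ire_loops("ttcagtgcaa"))
```

Candidate positions from such a scan would still need a secondary-structure check before being called potential IREs.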
HRM_t in protein sequence.
| Gene | Mammal | Bird, Reptile, Amphibian | Teleost | Chordate | Echinoderm | Arthropod | Cnidaria |
| ALAS | NF | NF | NF | 3(3) | 2(2) | 1(6) | 1(2) |
| ALAS1 | 7(7) | 6(6) | 6(6) | NF | NF | NF | NF |
| ALAS2 | 4(5) | 3(4) | (5) | NF | NF | NF | NF |
| PBGS | (6) | (5) | (5) | (1) | NF | (6) | (3) |
| PBGD | (7) | (6) | (5) | (1) | (1) | 3(6) | (2) |
| UROS | (7) | (6) | (4) | (2) | (1) | (6) | (3) |
| UROD | (7) | (6) | (5) | (3) | (1) | (7) | (3) |
| CPO | (7) | (5) | (5) | (2) | (1) | (6) | (1) |
| PPO | (7) | (5) | (5) | (2) | (1) | (6) | (3) |
| FECH | (7) | (5) | (4) | (3) | (1) | (7) | (2) |
Number of protein sequences containing a potential HRM_t in genes involved in the heme biosynthesis pathway.
The number in parentheses is the total number of protein sequences investigated.
NF: Not found.
HRM_r in protein sequence.
| Gene | Mammal | Bird, Reptile, Amphibian | Teleost | Chordate | Echinoderm | Arthropod | Cnidaria |
| ALAS | NF | NF | NF | (3) | (2) | (6) | (2) |
| ALAS1 | (7) | 2(6) | (6) | NF | NF | NF | NF |
| ALAS2 | 5(5) | (4) | 1(5) | NF | NF | NF | NF |
| PBGS | 6(6) | 5(5) | 5(5) | (1) | NF | 2(6) | 1(3) |
| PBGD | (7) | (6) | (5) | (1) | (1) | 3(6) | (2) |
| UROS | (7) | (6) | (4) | (2) | (1) | (6) | (3) |
| UROD | (7) | (6) | (5) | (3) | (1) | (7) | (3) |
| CPO | (7) | (5) | (5) | (2) | (1) | (6) | (1) |
| PPO | (7) | (5) | (5) | (2) | (1) | (6) | (3) |
| FECH | (7) | (5) | (4) | (3) | (1) | (7) | (2) |
Number of protein sequences containing a potential HRM_r in genes involved in the heme biosynthesis pathway.
The number in parentheses is the total number of protein sequences investigated.
NF: Not found.
Figure 6. Heme-regulatory motifs (HRMs) in PBGS and PBGD.
Multiple sequence alignments of PBGS (A) and PBGD (B) are shown, with HRM_t and HRM_r colored orange and green, respectively. Amino acid numbers for HRM_t and HRM_r are also shown according to the first protein sequence in the alignment.
Multiple regulatory potentials in heme biosynthesis pathway.
| Gene | Pathway Position | First Intron-mediated Transcription Control | Splicing Control | Translational Control | Protein Localization | Protein Stability | Selection Pressure |
| ALAS | 1 | ** | ** | ||||
| ALAS1 | 1 | * | *** | * | |||
| ALAS2 | 1 | ** | *** | ** | ** | ||
| PBGS | 2 | ** | * | *** | |||
| PBGD | 3 | *** | * | * | * | ||
| UROS | 4 | * | * | *** | |||
| UROD | 5 | * | |||||
| CPO | 6 | ||||||
| PPO | 7 | *** | |||||
| FECH | 8 | ** |
a (First Intron-mediated Transcription Control): Evolutionarily conserved DNase-hypersensitive sites in intron sequences.
b (Splicing Control): IRE at the intron-exon boundary that could potentially affect splicing.
c (Translational Control): IRE in the 5′UTR that, when bound by IRP, could potentially inhibit protein translation.
d (Protein Localization): HRM_t that, when bound to heme, could potentially block import of the enzyme into mitochondria.
e (Protein Stability): HRM_r that could potentially affect protein stability.
f (Selection Pressure): Genes with the two highest ω values.
In a, *: <50 bp, **: >50 bp and <100 bp, ***: >100 bp. In b, c, d and e, *: ≤0.33, **: >0.33 and ≤0.67, ***: >0.67, for the proportion of species collected.
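The star thresholds in this footnote can be written as two small functions. This is a sketch with labeled assumptions: the footnote does not say how a value of exactly 50 bp or 100 bp is scored, nor how a zero is shown, so here both boundary lengths map to `**` and a zero maps to an empty cell. The function names are hypothetical.

```python
def stars_for_length(bp: float) -> str:
    """Score the conserved DNase-hypersensitive length in base pairs."""
    if bp < 0:
        raise ValueError("length must be non-negative")
    if bp == 0:
        return ""      # assumption: nothing found leaves the cell empty
    if bp < 50:
        return "*"
    if bp <= 100:
        return "**"    # assumption: exactly 50 or 100 bp scores two stars
    return "***"


def stars_for_proportion(p: float) -> str:
    """Score the proportion of species in which a motif was found."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    if p == 0:
        return ""      # assumption: no hits leaves the cell empty
    if p <= 0.33:
        return "*"
    if p <= 0.67:
        return "**"
    return "***"


# Illustrative inputs, not values reported in the record.
print(stars_for_length(120), stars_for_proportion(3 / 4))
```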