Pavithra M Rallapalli, Christine A Orengo, Romain A Studer, Stephen J Perkins.
Abstract
Blood coagulation occurs through a cascade of enzymes and cofactors that produces a fibrin clot, while otherwise maintaining hemostasis. The 11 human coagulation factors (FG, FII-FXIII) have been identified across all vertebrates, suggesting that they emerged with the first vertebrates around 500 Ma. Human FVIII, FIX, and FXI are associated with thousands of disease-causing mutations. Here, we evaluated the strength of selective pressures on the 14 genes coding for the 11 factors during vertebrate evolution, and compared these with human mutations in FVIII, FIX, and FXI. Positive selection was identified for fibrinogen (FG), FIII, FVIII, FIX, and FX in the mammalian Primates and Laurasiatheria and the Sauropsida (reptiles and birds). This showed that the coagulation system in vertebrates was under strong selective pressures, perhaps to adapt against blood-invading pathogens. The comparison of these results with disease-causing mutations reported in FVIII, FIX, and FXI showed that the number of disease-causing mutations, and the probability of positive selection were inversely related to each other. It was concluded that when a site was under positive selection, it was less likely to be associated with disease-causing mutations. In contrast, sites under negative selection were more likely to be associated with disease-causing mutations and be destabilizing. A residue-by-residue comparison of the FVIII, FIX, and FXI sequence alignments confirmed this. This improved understanding of evolutionary changes in FVIII, FIX, and FXI provided greater insight into disease-causing mutations, and better assessments of the codon sites that may be mutated in applications of gene therapy.
Keywords: coagulation; evolution; hemostasis; positive selection
Year: 2014 PMID: 25158795 PMCID: PMC4209140 DOI: 10.1093/molbev/msu248
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
The 14 Coagulation Genes and Their Protein Products.
| Gene | ENSEMBL ID | Protein Product | Other Name | Function | Genetic Disorder | Amino Acid Length | No. of Domains | Domain Organization |
|---|---|---|---|---|---|---|---|---|
| FGA | ENSG00000171560 | Coagulation factor I | Fibrinogen alpha chain | Forms fibrin clot; cofactor in platelet aggregation | Congenital afibrinogenemia, familial renal amyloidosis | 866 | 1 | Fibrinogen C-terminal domain |
| FGB | ENSG00000171564 | | Fibrinogen beta chain | | | 491 | 1 | Fibrinogen C-terminal domain |
| FGG | ENSG00000171557 | | Fibrinogen gamma chain | | | 453 | 1 | Fibrinogen C-terminal domain |
| F2 | ENSG00000180210 | Coagulation factor II | Prothrombin | Activates FG to fibrin | Thrombophilia | 622 | 4 | Gla–Kringle 1–Kringle 2–SP |
| F3 | ENSG00000117525 | Coagulation factor III | Tissue factor | Activates FVIIa | — | 295 | 3 | Transmembrane–transmembrane helix–transmembrane |
| F5 | ENSG00000198734 | Coagulation factor V | Proaccelerin | Combines with FX to activate prothrombin | Activated protein C resistance | 2224 | 6 | F5/8 type A1–F5/8 type A2–B–F5/8 type A3–F5/8 type C1–F5/8 type C2 |
| F7 | ENSG00000057593 | Coagulation factor FVII | Proconvertin | Activates FIX, FX | Congenital proconvertin/factor VII deficiency | 466 | 4 | Gla–EGF1–EGF2–SP |
| F8 | ENSG00000185010 | Coagulation factor FVIII | Antihemophilic factor A (AHF-A) | Combines with FIX and FIV to activate FX | Hemophilia A | 2351 | 6 | F5/8 type A1–F5/8 type A2–B–F5/8 type A3–F5/8 type C1–F5/8 type C2 |
| F9 | ENSG00000101981 | Coagulation factor FIX | Christmas factor | Combines with FVIII and FIV to activate FX | Hemophilia B | 461 | 4 | Gla–EGF1–EGF2–SP |
| F10 | ENSG00000126218 | Coagulation factor FX | Stuart-Prower factor | Converts prothrombin to thrombin | Congenital factor X deficiency | 488 | 4 | Gla–EGF1–EGF2–SP |
| F11 | ENSG00000088926 | Coagulation factor FXI | Plasma thromboplastin antecedent | Combines with FIV to activate FIX | Factor XI deficiency | 625 | 4 | Apple 1–apple 2–apple 3–SP |
| F12 | ENSG00000131187 | Coagulation factor FXII | Hageman factor | Activates FXI; activates plasmin | Hereditary angioedema type III | 615 | 6 | Fibronectin type-II–EGF1–fibronectin type-I–EGF2–kringle–SP |
| F13A1 | ENSG00000124491 | Coagulation factor FXIII | Fibrin-stabilizing factor A chain | Stabilizes fibrin | Congenital factor XIIIa deficiency | 732 | — | — |
| F13B | ENSG00000143278 | | Fibrin-stabilizing factor B chain | Stabilizes FXIIIA; regulates thrombin | Congenital factor XIIIb deficiency | 661 | 10 | Short complement regulator domains 1–10 |
Fig. 1. Schema of the blood coagulation pathway leading to fibrin. The relationships between the 11 coagulation factors that are coded by 14 genes are shown in the modern revised coagulation pathway (blue, enzymes; red, cofactors). Tissue factor (TF; also known as FIII) initiates coagulation when it binds and activates FVII. The activated TF-FVIIa complex then activates FX which then generates activated thrombin (FIIa). FVIII, FIX, and FXI enable the amplification of the coagulation pathway to maximize FIIa production, and possess the largest number of pathogenic human mutations. Activated thrombin cleaves FG (FI; green) to form fibrin polymers (blood clots) that are cross-linked by FXIIIa.
Fig. 2. Phylogenetic tree of 47 vertebrate genomes. The tree shows the 47 genomes studied here alongside their clade and taxonomic groups. The 47 vertebrate genomes showed good sequence quality and coverage in the Ensembl database and were grouped into five major clades (Primates, Glires, Laurasiatheria, Sauropsida, and Fishes) and three minor clades. The Mammals comprised the Primates, Glires, and Laurasiatheria clades together with Atlantogenata (African origin mammals and Xenarthra) and four Australian species, totaling 30 out of 47 genomes.
LRT Statistics for the Site Model Comparisons of Model 8 with Model 8a for the 14 Coagulation Factors among the Five Clades.
| Gene | Clade | No. of Sequences | No. of Codons | LRT (M8a-M8) | P value | q value |
|---|---|---|---|---|---|---|
| FGA | Vertebrates | 44 | 236 | 1.39 | 0.24 | 0.35 |
| FGA | **Mammals** | 29 | 288 | 7.75 | 0.01** | 0.04 |
| FGA | Primates | 7 | 865 | 3.35 | 0.07 | 0.16 |
| FGA | Glires | 7 | 670 | 2.89 | 0.09 | 0.19 |
| FGA | **Laurasiatheria** | 9 | 267 | 7.28 | 0.01** | 0.04 |
| FGA | Sauropsida | 6 | 464 | 0.33 | 0.57 | 0.62 |
| FGA | Fishes | 7 | 592 | −0.16 | 1.00 | 0.72 |
| FGB | Vertebrates | 40 | 429 | 1.45 | 0.23 | 0.35 |
| FGB | Mammals | 26 | 456 | −2.78 | 1.00 | 0.72 |
| FGB | **Primates** | 6 | 488 | 7.44 | 0.01** | 0.04 |
| FGB | Glires | 6 | 471 | 0.69 | 0.41 | 0.51 |
| FGB | Laurasiatheria | 8 | 479 | 1.44 | 0.23 | 0.35 |
| FGB | Sauropsida | 5 | 473 | −0.02 | 1.00 | 0.72 |
| FGB | Fishes | 7 | 476 | −0.99 | 1.00 | 0.72 |
| FGG | Vertebrates | 44 | 386 | −1.74 | 1.00 | 0.72 |
| FGG | Mammals | 29 | 427 | 3.19 | 0.07 | 0.17 |
| FGG | Primates | 7 | 433 | 3.98 | 0.05 | 0.13 |
| FGG | Glires | 7 | 433 | 3.44 | 0.06 | 0.16 |
| FGG | Laurasiatheria | 8 | 432 | 0.09 | 0.76 | 0.72 |
| FGG | Sauropsida | 5 | 427 | 0.03 | 0.86 | 0.72 |
| FGG | Fishes | 8 | 395 | −5.42 | 1.00 | 0.72 |
| F2 | **Vertebrates** | 38 | 475 | 16.91 | 0.00** | 0.00** |
| F2 | Mammals | 23 | 571 | 4.64 | 0.03 | 0.11 |
| F2 | Primates | 7 | 621 | 4.25 | 0.04 | 0.12 |
| F2 | Glires | 5 | 611 | −0.07 | 1.00 | 0.72 |
| F2 | Laurasiatheria | 7 | 620 | 1.58 | 0.21 | 0.34 |
| F2 | Sauropsida | 5 | 593 | 3.68 | 0.05 | 0.14 |
| F2 | Fishes | 8 | 562 | −0.05 | 1.00 | 0.72 |
| F3 | Vertebrates | 47 | 166 | 2.09 | 0.15 | 0.27 |
| F3 | **Mammals** | 26 | 240 | 13.37 | 0.00** | 0.00** |
| F3 | Primates | 7 | 295 | 0.01 | 0.90 | 0.72 |
| F3 | Glires | 6 | 287 | 3.07 | 0.08 | 0.18 |
| F3 | Laurasiatheria | 7 | 292 | 0.97 | 0.32 | 0.44 |
| F3 | **Sauropsida** | 5 | 254 | 8.65 | | |
| F3 | Fishes | 14 | 192 | 0.82 | 0.36 | 0.47 |
| F5 | **Vertebrates** | 37 | 1,182 | 14.95 | | |
| F5 | **Mammals** | 22 | 1,741 | 31.92 | | |
| F5 | **Primates** | 6 | 2,116 | 5.36 | | |
| F5 | Glires | 6 | 1,847 | 0.48 | 0.49 | 0.56 |
| F5 | Laurasiatheria | 6 | 2,043 | 4.83 | | 0.10 |
| F5 | Sauropsida | 5 | 1,550 | 1.45 | 0.23 | 0.35 |
| F5 | Fishes | 8 | 1,534 | 2.12 | 0.15 | 0.27 |
| F7 | **Vertebrates** | 34 | 330 | 5.93 | | |
| F7 | Mammals | 29 | 344 | 3.06 | 0.08 | 0.18 |
| F7 | Primates | 7 | 443 | 0.16 | 0.69 | 0.72 |
| F7 | Glires | 7 | 417 | −0.30 | 1.00 | 0.72 |
| F7 | Laurasiatheria | 7 | 443 | 0.57 | 0.45 | 0.54 |
| F7 | Sauropsida | 3 | 425 | 0.55 | 0.46 | 0.54 |
| F7 | Fishes | | | | | |
| F8 | Vertebrates | 38 | 1,028 | 4.85 | | 0.10 |
| F8 | **Mammals** | 24 | 1,985 | 11.19 | | |
| F8 | **Primates** | 7 | 2,351 | 6.83 | | |
| F8 | Glires | 5 | 2,257 | 3.95 | | 0.13 |
| F8 | Laurasiatheria | 7 | 2,344 | 0.25 | 0.62 | 0.67 |
| F8 | Sauropsida | 5 | 1,480 | 0.03 | 0.86 | 0.72 |
| F8 | Fishes | 7 | 1,413 | 0.90 | 0.34 | 0.45 |
| F9 | Vertebrates | 44 | 329 | 0.63 | 0.43 | 0.52 |
| F9 | **Mammals** | 24 | 443 | 18.57 | | |
| F9 | Primates | 7 | 461 | 1.28 | 0.26 | 0.36 |
| F9 | Glires | 5 | 468 | 1.85 | 0.17 | 0.30 |
| F9 | Laurasiatheria | 7 | 459 | 4.34 | | 0.12 |
| F9 | Sauropsida | 5 | 435 | −0.03 | 1.00 | 0.72 |
| F9 | Fishes | 13 | 368 | −0.36 | 1.00 | 0.72 |
| F10 | **Vertebrates** | 41 | 353 | 14.55 | | |
| F10 | Mammals | 26 | 424 | −0.23 | 0.63 | 0.67 |
| F10 | Primates | 7 | 485 | 0.01 | 0.91 | 0.72 |
| F10 | Glires | 6 | 464 | 2.06 | 0.15 | 0.27 |
| F10 | Laurasiatheria | 7 | 463 | 2.30 | 0.13 | 0.25 |
| F10 | Sauropsida | 5 | 385 | 0.03 | 0.87 | 0.72 |
| F10 | Fishes | 8 | 391 | 0.71 | 0.40 | 0.51 |
| F11 | Vertebrates | 23 | 614 | 0.34 | 0.56 | 0.62 |
| F11 | Mammals | 23 | 614 | 0.34 | 0.56 | 0.62 |
| F11 | Primates | 7 | 625 | 0.05 | 0.82 | 0.72 |
| F11 | Glires | 5 | 619 | 0.07 | 0.79 | 0.72 |
| F11 | Laurasiatheria | 6 | 625 | 0.02 | 0.89 | 0.72 |
| F11 | Sauropsida | | | | | |
| F11 | Fishes | | | | | |
| F12 | Vertebrates | 27 | 512 | −3.98 | 1.00 | 0.72 |
| F12 | Mammals | 25 | 533 | −0.18 | 1.00 | 0.72 |
| F12 | Primates | 7 | 615 | −0.02 | 1.00 | 0.72 |
| F12 | Glires | 6 | 557 | −0.07 | 1.00 | 0.72 |
| F12 | Laurasiatheria | 8 | 513 | 4.60 | | 0.11 |
| F12 | Sauropsida | | | | | |
| F12 | Fishes | | | | | |
| F13A1 | **Vertebrates** | 47 | 631 | 9.07 | | |
| F13A1 | Mammals | 26 | 684 | 2.58 | 0.11 | 0.22 |
| F13A1 | Primates | 7 | 732 | 1.62 | 0.20 | 0.34 |
| F13A1 | Glires | 6 | 730 | −1.47 | 1.00 | 0.72 |
| F13A1 | Laurasiatheria | 9 | 730 | −0.27 | 1.00 | 0.72 |
| F13A1 | Sauropsida | 5 | 732 | 0.01 | 0.92 | 0.72 |
| F13A1 | Fishes | 14 | 643 | 2.59 | 0.11 | 0.22 |
| F13B | Vertebrates | 31 | 609 | 1.30 | 0.25 | 0.36 |
| F13B | Mammals | 25 | 640 | 1.57 | 0.21 | 0.34 |
| F13B | Primates | 7 | 660 | 1.22 | 0.27 | 0.37 |
| F13B | Glires | 5 | 634 | −0.21 | 1.00 | 0.72 |
| F13B | **Laurasiatheria** | 7 | 656 | 7.08 | | |
| F13B | Sauropsida | 5 | 573 | −0.93 | 1.00 | 0.72 |
| F13B | Fishes | | | | | |
NOTE.—Positive selection is denoted in bold in column 2.
*P < 0.05; **P < 0.01
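The way an LRT value maps to a P value in a table like this can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes the statistic is twice the log-likelihood difference between M8 and M8a and that P is taken from a chi-square distribution with 1 degree of freedom, with non-positive statistics reported as P = 1.00. That assumption is consistent with the tabulated pairs (for example, LRT 3.35 with P 0.07), but the source does not state it. The function names are hypothetical, and the q value correction is not reproduced here.

```python
import math


def lrt_statistic(lnl_m8a: float, lnl_m8: float) -> float:
    """Twice the log-likelihood gain of M8 (alternative) over M8a (null)."""
    return 2.0 * (lnl_m8 - lnl_m8a)


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)). Non-positive statistics are
    treated as no evidence against the null (assumption), giving P = 1.
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


# LRT values taken from the first gene block of the table above.
for lrt in (1.39, 3.35, 7.75, -0.16):
    print(f"LRT = {lrt:5.2f}  ->  P = {chi2_sf_1df(lrt):.2f}")
```

Rounded to two decimals, these four values give 0.24, 0.07, 0.01, and 1.00, matching the corresponding rows of the table.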
Fig. 3. Positively selected genes of the coagulation cascade in vertebrates, mammals, and three major clades. The layout of this figure is a simplified representation of figure 1. Genes showing significant positive selection in the LRT calculations are highlighted in red. (a) Positive selection was observed in five genes when all 47 genome sequences (vertebrates) were analyzed as a single group. (b) Positive selection was seen for five genes in the Mammals group (fig. 2). Within the mammals, only the (c) Primates and (d) Laurasiatheria clades are shown, with positive selection for two or three genes only, because no positive selection was seen in the Glires clade and insufficient sequences were available in the Atlantogenata clade for conclusions to be drawn. (e) Positive selection was seen in the Sauropsida clade, but no positive selection was observed in Fishes (fig. 2).
Positively Selected Sites in the M8a-M8 Comparison.
| Gene | Clade | No. of Codons Analyzed | No. of Codons under Positive Selection | Percentage Estimated by CodeML (%) | Sites under Positive Selection (BEB > 50%) |
|---|---|---|---|---|---|
| FGA | Mammals | 288 | 13 | 7.17 | 7,18,32,35,39,51,101,108,178 |
| FGA | Laurasiatheria | 267 | 64 | 13.29 | 4, |
| FGB | Primates | 488 | 17 | 8.22 | |
| F2 | Vertebrates | 475 | 5 | 0.99 | 52,57,151,162,381 |
| F3 | Mammals | 240 | 11 | 6.63 | 55,72,76,91,115,177,179, |
| F3 | Sauropsida | 254 | 13 | 10.60 | 3,40,41,44,74,79,80,81,110,123,170,208, |
| F5 | Vertebrates | 1,182 | 10 | 0.87 | 103,275, |
| F5 | Mammals | 1,741 | 54 | 2.72 | 2,3,7,24,36,60, |
| F5 | Primates | 2,116 | 54 | 13.21 | 7,40,52,129,180,211,336,341,408,409,434,592,660,703,707,720,754,784,877,879,888,892,907,941,959,978,1033,1039,1059,1131,1168,1205,1209,1218,1220,1222,1230,1231,1232,1236,1244,1254,1262,1265,1279,1285,1498,1504,1519,1526,1667,1682,1849,1918 |
| F7 | Vertebrates | 330 | 12 | 5.22 | 5, |
| F8 | Mammals | 1,985 | 27 | 3.98 | 7,245,281,339,342,521,667,718,765,784,785,792,807,817,889,912,1130,1193,1197,1220,1229,1288,1308,1333,1686,1967,1984 |
| F8 | Primates | 2,351 | 33 | 0.63 | 422,755,791,814,850,860,886,927,964,966,979,988,996,1019,1071,1180,1282,1312,1361,1414,1421,1436,1438,1459,1527,1582,1613,1622,1650,1689,1730,2340,2349 |
| F9 | Mammals | 443 | 15 | 4.40 | 4,5, |
| F10 | Vertebrates | 353 | 3 | 0.01 | 1,44,70 |
| F13A1 | Vertebrates | 631 | 6 | 1.26 | 398, |
| F13B | Laurasiatheria | 656 | 18 | 1.52 | 5,56,73,78,107,341,349,350,374,443,459,461,497,522,525,539,540,624 |
*P > 95%; **P > 99%.
Fig. 4. Probability of evolutionary selection across the Primates clade for the three proteins FVIII, FIX, and FXI. The probabilities of negative, neutral, and positive selection were plotted against the amino acid position in the three proteins. The probability values were obtained from CodeML analyses of the seven primate sequences (gray, negative selection; yellow, neutral selection; green, positive selection).
Fig. 5. Relationship between disease-causing mutations and evolutionary pressures. The correlation between evolutionary selection and the number of times a disease-causing mutation occurs at each amino acid position is shown. In the columns, (a) FVIII, (b) FIX, and (c) FXI correspond to the three coagulation factors for which pathogenic mutational information is available in sufficient quantity from our three mutational databases. The three rows correspond to the posterior probability (based on the BEB model of CodeML) of positive selection, neutral evolution, and negative selection for primates.
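One simple way to tabulate the kind of relationship this caption describes is to assign each residue to the selection class with the highest posterior probability and then average the disease-causing mutation counts within each class. The sketch below illustrates that idea only. It is not the procedure used in the study, the function names are hypothetical, and the input numbers are invented toy values rather than data from the mutation databases.

```python
from collections import defaultdict

CLASSES = ("negative", "neutral", "positive")


def dominant_class(posteriors):
    """Return the selection class with the highest posterior probability."""
    return max(CLASSES, key=lambda c: posteriors[c])


def mean_mutations_by_class(site_posteriors, mutation_counts):
    """Mean disease-causing mutation count per site, grouped by dominant class.

    Sites absent from mutation_counts are treated as having zero mutations.
    """
    totals = defaultdict(int)
    n_sites = defaultdict(int)
    for position, posteriors in site_posteriors.items():
        cls = dominant_class(posteriors)
        totals[cls] += mutation_counts.get(position, 0)
        n_sites[cls] += 1
    return {cls: totals[cls] / n_sites[cls] for cls in n_sites}


# Hypothetical toy inputs (not real FVIII, FIX, or FXI data).
site_posteriors = {
    1: {"negative": 0.90, "neutral": 0.08, "positive": 0.02},
    2: {"negative": 0.70, "neutral": 0.25, "positive": 0.05},
    3: {"negative": 0.10, "neutral": 0.30, "positive": 0.60},
    4: {"negative": 0.20, "neutral": 0.70, "positive": 0.10},
}
mutation_counts = {1: 6, 2: 4, 4: 1}

result = mean_mutations_by_class(site_posteriors, mutation_counts)
print(result)
```

With these toy inputs the negatively selected sites average 5.0 mutations, the neutral site 1.0, and the positively selected site 0.0, which mirrors the inverse relationship reported in the abstract.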
Fig. 6. The effect of amino acid replacements on the overall protein stability change ΔΔG. Data are shown for (a and b) FVIII A and B chains, (c) FIX, and (d) FXI. Three calculations were performed, each being based on the sequences for which crystal structures are known (PDB codes: 2R7E [FVIII], 2WPH [FIX], and 2F83 [FXI]). The FVIII A and B chains correspond to the N-terminal residues 19–760 and the C-terminal residues 1582–2351 that are observed in its crystal structure. The dashed line indicates the distribution calculated for all 19 possible amino acid replacements. The dotted line indicates the distribution calculated for all the evolutionarily occurring replacements that were observed in our data set of 47 genomes. The continuous line indicates the distribution calculated for the disease-causing mutations from the FVIII, FIX, and FXI mutation databases.
Fig. 7. The effect of amino acid replacements at the sequence level on the protein stability change ΔΔG. The residue stability changes are depicted on a seven-point scale from highly destabilizing (red) to highly stabilizing (blue) for each of (a) FVIII chain A, (b) FVIII chain B, (c) FIX, and (d) FXI. Their PDB codes are indicated in figure 6. In each panel, the upper half shows the ΔΔG values for all the possible 19 amino acid replacements at each residue position that is part of the crystal structure, and the lower half shows the ΔΔG values for the disease-causing mutations taken from the mutation databases.