David E Northover, Stephen D Shank, David A Liberles.
Abstract
BACKGROUND: Understanding the origins of genome content has long been a goal of molecular evolution and comparative genomics. By examining genome evolution through the lens of lineage-specific evolution, it is possible to make inferences about the evolutionary events that have given rise to species-specific diversification. Here we characterize the evolutionary trends found in chordate species using The Adaptive Evolution Database (TAED). TAED is a database of phylogenetically indexed gene families designed to detect episodes of directional or diversifying selection across chordates. Gene families within the database have been assessed for lineage-specific estimates of dN/dS and have been reconciled to the chordate species tree to identify retained duplicates. Gene families have also been mapped to functional pathways, and amino acid changes that occurred on high dN/dS lineages have been mapped to protein structures.
Keywords: Comparative genomics; Gene duplication; Molecular evolution; Pathway evolution; Protein structure
Year: 2020 PMID: 32046633 PMCID: PMC7011509 DOI: 10.1186/s12862-020-1585-y
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
TAED gene family lineages with the largest dN/dS values where dS > 0.001
| TAED Gene Families with high dN/dS | ||||
|---|---|---|---|---|
| Family | dN/dS Value | Mapped Location on Chordate Species Tree (Start to End) | Family description | KEGG Pathways |
| 151,766 | 88.7785 | Boreoeutheria to Laurasiatheria | splicing factor arginine/serine-rich 4/5/6 | Herpes simplex infection; Spliceosome |
| 30 | 76.4909 | Cercopithecidae to Cercopithecidae | transmembrane protein 91 isoform X1 | |
| 60,787 | 63.0029 | Euarchontoglires to Simiiformes | fucose-1-phosphate guanylyltransferase | Metabolic pathways; Fructose and mannose metabolism; Amino sugar and nucleotide sugar metabolism |
| 23,133 | 61.9262 | Aves to Aves | galanin receptor 1 | Neuroactive ligand-receptor interaction |
| 296,346 | 55.184 | Eutheria to Delphinidae | LOW QUALITY PROTEIN: probable N-acetyltransferase 16 | |
| 21,900 | 45.0176 | Boreoeutheria to Boreoeutheria | X-linked interleukin-1 receptor accessory protein-like 2 | |
| 52,181 | 44.575 | Boreoeutheria to Laurasiatheria | unnamed protein product | |
| 8186 | 44.3077 | Sauria to | protein Simiate | |
| 9378 | 40.3796 | Neognathae to Neognathae | palmitoyltransferase ZDHHC17 isoform X4 | |
| 22,600 | 39.9557 | Boreoeutheria to Laurasiatheria | potassium voltage-gated channel subfamily G member 3 isoform X1 | |
| 4144 | 38.9415 | Hystricognathi to Hystricognathi | pygopus homolog 1 | |
| 14,875 | 38.1267 | Laurasiatheria to Laurasiatheria | kinesin-like protein KIF16B | |
| 12,213 | 38.0258 | Camelidae to Camelus | LOW QUALITY PROTEIN: dnaJ homolog subfamily B member 3 | |
| 12,593 | 37.1258 | Boreoeutheria to Boreoeutheria | tRNA (guanine(10)-N2)-methyltransferase homolog | |
| 20,708 | 36.782 | Amniota to Amniota | heparan sulfate 2-O-sulfotransferase HS2ST1 | Glycosaminoglycan biosynthesis - heparan sulfate / heparin |
| 32,532 | 36.0215 | Boreoeutheria to | interleukin 8 | AGE-RAGE signaling pathway in diabetic complications; NOD-like receptor signaling pathway; Influenza A; Phospholipase D signaling pathway; Chemokine signaling pathway; Hepatitis B; Toll-like receptor signaling pathway; Legionellosis; RIG-I-like receptor signaling pathway; Rheumatoid arthritis; Malaria; NF-kappa B signaling pathway; Shigellosis; Hepatitis C; Non-alcoholic fatty liver disease (NAFLD); Pathways in cancer; Epithelial cell signaling in |
| 4273 | 35.4369 | Laurasiatheria to Laurasiatheria | myelin protein zero | Cell adhesion molecules (CAMs) |
| 21,944 | 35.0114 | Boreoeutheria to Boreoeutheria | adrenergic receptor alpha-2C | cGMP-PKG signaling pathway; Neuroactive ligand-receptor interaction |
| 14,303 | 34.6434 | Neognathae to Neognathae | ATP-dependent DNA helicase PIF1 partial | |
| 14,588 | 34.0198 | Neognathae to Neognathae | organic solute transporter subunit alpha-like partial | |
| 12,299 | 34.0037 | Neognathae to Neognathae | phosphatidylinositol glycan class H | Glycosylphosphatidylinositol (GPI)-anchor biosynthesis; Metabolic pathways |
| 8762 | 33.6089 | Myotis to Myotis | doublesex- and mab-3-related transcription factor 2 isoform X1 | |
| 55,196 | 31.2663 | Murinae to | Isx protein | |
| 14,119 | 31.0551 | Passeriformes to Passeriformes | cardiolipin synthase CMP-forming | Glycerophospholipid metabolism; Metabolic pathways |
| 39 | 30.43 | Boreoeutheria to Boreoeutheria | large subunit ribosomal protein L4e | Ribosome |
| 23,596 | 30.0168 | Neognathae to Neognathae | diacylglycerol cholinephosphotransferase | Phosphonate and phosphinate metabolism; Glycerophospholipid metabolism; Metabolic pathways; Choline metabolism in cancer; Ether lipid metabolism |
| 10,501 | 29.8433 | | smoothelin isoform X5 | |
| 157,103 | 27.6427 | Neognathae to | nucleolar protein 7 partial | |
| 3717 | 27.4352 | Boreoeutheria to Boreoeutheria | glypican 4 | Wnt signaling pathway |
| 9725 | 26.5684 | Eutheria to Boreoeutheria | cilia- and flagella-associated protein 221-like partial | |
Pathways present in lineages under positive selection
| Over-Represented KEGG Pathways TAED | |||||
|---|---|---|---|---|---|
| KEGG Pathway | Mapped Lineages Under Positive Selection | Lineages Under Positive Selection Mapped | Uncorrected | FDR | Bonferroni |
| Metabolic pathways | 7.63% | 5.73% | < 0.0001 | < 0.0001 | < 0.0001 |
| Olfactory transduction | 12.25% | 2.67% | < 0.0001 | < 0.0001 | < 0.0001 |
| Biosynthesis of secondary metabolites | 7.85% | 1.90% | < 0.0001 | < 0.0001 | < 0.0001 |
| Biosynthesis of antibiotics | 7.96% | 1.11% | < 0.0001 | < 0.0001 | < 0.0001 |
| Neuroactive ligand-receptor interaction | 6.45% | 1.05% | < 0.0001 | < 0.0001 | < 0.0001 |
| Microbial metabolism in diverse environments | 7.86% | 0.85% | < 0.0001 | < 0.0001 | < 0.0001 |
| Protein processing in endoplasmic reticulum | 6.97% | 0.72% | < 0.0001 | < 0.0001 | < 0.0001 |
| Purine metabolism | 6.63% | 0.69% | < 0.0001 | < 0.0001 | < 0.0001 |
| Herpes simplex infection | 6.18% | 0.68% | 0.0047 | 0.0128 | |
| Carbon metabolism | 8.63% | 0.62% | < 0.0001 | < 0.0001 | < 0.0001 |
| RNA transport | 6.45% | 0.59% | < 0.0001 | < 0.0001 | 0.0056 |
| Influenza A | 6.74% | 0.59% | < 0.0001 | < 0.0001 | < 0.0001 |
| Fluid shear stress and atherosclerosis | 6.21% | 0.54% | 0.0061 | 0.016 | |
| Lysosome | 6.72% | 0.52% | < 0.0001 | < 0.0001 | < 0.0001 |
| Glycerophospholipid metabolism | 7.61% | 0.49% | < 0.0001 | < 0.0001 | < 0.0001 |
| Non-alcoholic fatty liver disease (NAFLD) | 6.19% | 0.47% | 0.0128 | 0.0325 | |
| Pancreatic secretion | 8.86% | 0.43% | < 0.0001 | < 0.0001 | < 0.0001 |
| Peroxisome | 7.37% | 0.42% | < 0.0001 | < 0.0001 | < 0.0001 |
| Toxoplasmosis | 6.50% | 0.41% | 0.0001 | 0.0003 | 0.0341 |
| Phosphatidylinositol signaling system | 6.70% | 0.41% | < 0.0001 | < 0.0001 | 0.0004 |
| Glycerolipid metabolism | 10.21% | 0.39% | < 0.0001 | < 0.0001 | < 0.0001 |
| Drug metabolism - cytochrome P450 | 12.03% | 0.39% | < 0.0001 | < 0.0001 | < 0.0001 |
| Th17 cell differentiation | 6.38% | 0.38% | 0.0013 | 0.0039 | |
| Valine, leucine and isoleucine degradation | 11.00% | 0.38% | < 0.0001 | < 0.0001 | < 0.0001 |
| Chemical carcinogenesis | 9.32% | 0.38% | < 0.001 | < 0.0001 | < 0.0001 |
The top 25 over-represented KEGG pathways with the highest % of lineages under positive selection mapping to a pathway. All pathways were significant at the 0.05 level after correction with the false discovery rate (FDR). Bold numbers indicate not significant at the 0.05 level. Lineages with dN/dS > 1 considered under positive selection
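The two corrections reported in the table can be illustrated with a minimal sketch. It assumes the FDR column uses the Benjamini-Hochberg procedure, which the source does not state, and the raw p-values below are hypothetical placeholders rather than values from TAED.

```python
import numpy as np

def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    p = np.asarray(pvals, dtype=float)
    return np.minimum(p * p.size, 1.0)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure, not confirmed by the source)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(m)
    out[order] = np.minimum(adjusted, 1.0)
    return out

# Hypothetical uncorrected p-values for five pathways.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
```

Because Bonferroni is the more conservative of the two, a pathway can pass the FDR threshold while failing Bonferroni, which is consistent with rows that report an FDR value but no Bonferroni value.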
Pathways absent in lineages under positive selection
| Under-Represented KEGG pathways TAED | |||||
|---|---|---|---|---|---|
| KEGG Pathway | Mapped Lineages Under Positive Selection | Lineages Under Positive Selection Mapped | Uncorrected | FDR | Bonferroni |
| Zeatin biosynthesis | 0.53% | < 0.01% | 0.0001 | 0.0006 | |
| D-Arginine and D-ornithine metabolism | 1.12% | < 0.01% | 0.0016 | 0.0056 | |
| Penicillin and cephalosporin biosynthesis | 1.12% | < 0.01% | 0.0016 | 0.0056 | |
| Indole alkaloid biosynthesis | 0.89% | < 0.01% | < 0.0001 | < 0.0001 | 0.0011 |
| Bacterial secretion system | 1.40% | < 0.01% | 0.0013 | 0.0047 | |
| Toluene degradation | 2.53% | 0.01% | 0.0138 | 0.0422 | |
| Fluorobenzoate degradation | 2.53% | 0.01% | 0.0138 | 0.0422 | |
| Chlorocyclohexane and chlorobenzene degradation | 2.53% | 0.01% | 0.0138 | 0.0422 | |
| Styrene degradation | 1.37% | 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Tropane, piperidine and pyridine alkaloid biosynthesis | 3.25% | 0.02% | 0.0014 | 0.0052 | |
| Cyanoamino acid metabolism | 4.07% | 0.06% | 0.0023 | 0.0080 | |
| Maturity onset diabetes of the young | 4.11% | 0.08% | 0.0004 | 0.0014 | |
| Phototransduction - fly | 1.64% | 0.09% | < 0.0001 | < 0.0001 | < 0.0001 |
| MAPK signaling pathway - plant | 4.42% | 0.11% | 0.0014 | 0.0052 | |
| Glycosaminoglycan biosynthesis - keratan sulfate | 3.51% | 0.11% | < 0.0001 | < 0.0001 | < 0.0001 |
| Glycosaminoglycan biosynthesis - heparan sulfate / heparin | 4.06% | 0.11% | < 0.0001 | 0.0001 | 0.0061 |
| Thyroid cancer | 2.78% | 0.12% | < 0.0001 | < 0.0001 | < 0.0001 |
| Mannose type O-glycan biosynthesis | 4.56% | 0.15% | 0.0010 | 0.0037 | |
| RNA polymerase | 2.96% | 0.15% | < 0.0001 | < 0.0001 | < 0.0001 |
| Glycosaminoglycan biosynthesis - chondroitin sulfate / dermatan sulfate | 4.95% | 0.16% | 0.0171 | 0.0506 | |
| Phototransduction | 3.79% | 0.17% | < 0.0001 | < 0.0001 | < 0.0001 |
| Nicotine addiction | 4.07% | 0.21% | < 0.0001 | < 0.0001 | < 0.0001 |
| Collecting duct acid secretion | 5.06% | 0.22% | 0.0158 | 0.0472 | |
| Pathogenic | 3.61% | 0.23% | < 0.0001 | < 0.0001 | < 0.0001 |
| Hedgehog signaling pathway - fly | 3.88% | 0.23% | < 0.0001 | < 0.0001 | < 0.0001 |
The KEGG pathways with the lowest % of lineages under positive selection mapping to a pathway. All pathways were significant at the 0.05 level after correction with the false discovery rate (FDR). Bold numbers indicate not significant at the 0.05 level. Lineages with dN/dS > 1 considered under positive selection
Domains present in lineages under positive selection
| Over-Represented CATH Domain Topologies in TAED | |||||
|---|---|---|---|---|---|
| CATH domain topology | Mapped Lineages Under Positive Selection | Lineages Under Positive Selection Mapped | Uncorrected | FDR | Bonferroni |
| Rossmann fold | 7.35% | 25.55% | < 0.0001 | < 0.0001 | < 0.0001 |
| Jelly Rolls | 7.30% | 4.43% | < 0.0001 | < 0.0001 | 0.0013 |
| Phosphorylase Kinase domain 1 | 9.51% | 3.87% | < 0.0001 | < 0.0001 | < 0.0001 |
| TIM Barrel | 7.52% | 3.59% | < 0.0001 | < 0.0001 | < 0.0001 |
| Thrombin subunit H | 11.13% | 2.90% | < 0.0001 | < 0.0001 | < 0.0001 |
| Ubiquitin-like (UB roll) | 7.06% | 2.50% | 0.0143 | 0.0355 | |
| Glutaredoxin | 9.53% | 2.39% | < 0.0001 | < 0.0001 | < 0.0001 |
| Collagenase (Catalytic Domain) | 9.70% | 2.22% | < 0.0001 | < 0.0001 | < 0.0001 |
| DNA polymerase domain 1 | 8.26% | 2.12% | < 0.0001 | < 0.0001 | < 0.0001 |
| OB fold (Dihydrolipoamide Acetyltransferase E2P) | 7.32% | 1.79% | 0.0013 | 0.0039 | |
| Methane Monooxygenase Hydroxylase Chain G domain 1 | 8.24% | 1.55% | < 0.0001 | < 0.0001 | < 0.0001 |
| Cytochrome p450 | 9.11% | 1.51% | < 0.0001 | < 0.0001 | < 0.0001 |
| Helicase Ruva Protein domain 3 | 9.86% | 1.21% | < 0.0001 | < 0.0001 | < 0.0001 |
| Laminin | 8.56% | 1.03% | < 0.0001 | < 0.0001 | < 0.0001 |
| Glutathione S-transferase Yfyf (Class Pi) Chain A domain 2 | 10.79% | 1.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Kinesin | 8.47% | 1.00% | < 0.0001 | < 0.0001 | < 0.0001 |
| Glycosyltransferase | 12.75% | 0.92% | < 0.0001 | < 0.0001 | < 0.0001 |
| FAD/NAD(P)-binding domain | 10.68% | 0.87% | < 0.0001 | < 0.0001 | < 0.0001 |
| 2-enoyl-CoA Hydratase Chain A domain 1 | 14.07% | 0.80% | < 0.0001 | < 0.0001 | < 0.0001 |
| Erythroid Transcription Factor GATA-1 Chain A | 9.34% | 0.72% | < 0.0001 | < 0.0001 | < 0.0001 |
| Cyclin A domain 1 | 10.32% | 0.71% | < 0.0001 | < 0.0001 | < 0.0001 |
| Alkaline Phosphatase subunit A | 9.96% | 0.69% | < 0.0001 | < 0.0001 | < 0.0001 |
| Butyryl-CoA Dehydrogenase subunit A domain 3 | 9.13% | 0.68% | < 0.0001 | < 0.0001 | < 0.0001 |
| Carbonic Anhydrase II | 14.52% | 0.66% | < 0.0001 | < 0.0001 | < 0.0001 |
Enrichment analysis of CATH domain topologies in TAED showing the CATH domain topologies present in the highest % of lineages under positive selection. All topologies were significant at the 0.05 level after correction with the false discovery rate (FDR). Bold numbers indicate not significant at the 0.05 level. Lineages with dN/dS > 1 considered under positive selection
Domains absent in lineages under positive selection
| Under-Represented CATH Domain Topologies in TAED | |||||
|---|---|---|---|---|---|
| CATH domain topology | Mapped Lineages Under Positive Selection | Lineages Under Positive Selection Mapped | Uncorrected | FDR | Bonferroni |
| Smad3 Chain A | 0.08% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| A middle domain of Talin 1 | 0.24% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Endonuclease - Pi-scei_ Chain A domain 1 | 0.27% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Undecaprenyl pyrophosphate synthetase | 0.33% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Neurophysin II Chain A | 0.38% | < 0.01% | < 0.0001 | < 0.0001 | 0.0002 |
| Major Prion Protein | 0.43% | < 0.01% | < 0.0001 | < 0.0001 | 0.0010 |
| Archaeosine Trna-guanine Transglycosylase Chain: A domain 4 | 0.45% | < 0.01% | < 0.0001 | < 0.0001 | 0.0023 |
| Translation Eukaryotic Peptide Chain Release Factor Subunit 1 Chain A | 0.52% | < 0.01% | < 0.0001 | 0.0002 | 0.0140 |
| PWI domain | 0.53% | < 0.01% | < 0.0001 | 0.0003 | 0.0206 |
| copper amine oxidase-like fold | 0.57% | < 0.01% | 0.0001 | 0.0005 | 0.0472 |
| ERH-like fold | 0.58% | < 0.01% | 0.0001 | 0.0007 | 0.0608 |
| Smad Anchor For Receptor Activation Chain B | 0.61% | < 0.01% | 0.0002 | 0.0010 | |
| protein kinase ck2 holoenzyme chain C domain 1 | 0.61% | < 0.01% | 0.0002 | 0.0010 | |
| Elongin C Chain C domain 1 | 0.65% | < 0.01% | 0.0003 | 0.0017 | |
| Conserved hypothetical protein from pyrococcus furiosus pfu-392566-001 ParB domain | 0.90% | < 0.01% | 0.0042 | 0.0195 | |
| titin filament fold | 0.92% | < 0.01% | 0.0048 | 0.0214 | |
| Transcription Regulator spoIIAA | 0.98% | < 0.01% | 0.0073 | 0.0311 | |
| cAMP-dependent Protein Kinase Chain A | 0.36% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Inorganic Pyrophosphatase | 0.45% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| subunit c (vma5p) of the yeast v-atpase domain 2 | 0.51% | < 0.01% | < 0.0001 | < 0.0001 | < 0.0001 |
| Deoxyuridine 5′-Triphosphate Nucleotidohydrolase_ Chain A | 0.88% | < 0.01% | < 0.0001 | 0.0002 | 0.0135 |
| 50s Ribosomal Protein L19e Chain O domain 1 | 0.96% | < 0.01% | 0.0001 | 0.0005 | 0.0451 |
| Glutathione Synthetase Chain A domain 3 | 1.15% | < 0.01% | 0.0006 | 0.0031 | |
| Deoxyhypusine Synthase | 1.18% | < 0.01% | 0.0007 | 0.0037 | |
| DNA Excision Repair Uvrb Chain A | 1.29% | < 0.01% | 0.0017 | 0.0084 | |
Enrichment analysis of CATH domain topologies in TAED showing the CATH domain topologies present in the lowest % of lineages under positive selection. All topologies were significant at the 0.05 level after correction with the false discovery rate (FDR). Bold numbers indicate not significant at the 0.05 level. Lineages with dN/dS > 1 considered under positive selection
Fig. 1 Duplication analysis regression plot using family node ages as a proxy for time – The x-axis is measured in MYA based on the root node for each TAED gene family. The best Pearson’s r coefficient was found when neither axis was log transformed. The upper left half (shaded orange) of the scatterplot was used to determine TAED gene families that were statistically different from the regression line using Cook’s distance
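As a rough illustration of the Cook's distance step named in the caption, the sketch below fits an ordinary least-squares line of duplication count against family age and scores each family by its influence on the fit. The function name and the six data points are hypothetical, and the source does not specify the software or cutoff actually used.

```python
import numpy as np

def cooks_distance(x, y):
    """Cook's distance of each point in a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares coefficients
    resid = y - X @ beta
    n, p = X.shape
    s2 = resid @ resid / (n - p)                   # residual variance estimate
    # Leverage: diagonal of the hat matrix X (X'X)^-1 X'.
    leverage = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)
    return resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2

# Hypothetical family ages (MYA) and duplication counts; the last family
# has far more duplications than the trend predicts.
ages = [100, 200, 300, 400, 500, 600]
duplications = [10, 22, 29, 41, 52, 300]
distances = cooks_distance(ages, duplications)
most_influential = int(np.argmax(distances))
```

Families with large distances and positive residuals would correspond to the upper-left region described in the caption, that is, more duplications than expected for their age.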
TAED gene families with many duplications based on family node age from summed branch lengths
| Highly duplicable gene families - Family Age | |||
|---|---|---|---|
| Cook’s Distance | Family Description | Family Age | Number of Duplications |
| 0.0599 | serine/threonine-protein phosphatase 2A 56 kDa regulatory subunit epsilon isoform | 684 | 729 |
| 0.0429 | receptor-type tyrosine-protein phosphatase F isoform X1 | 684 | 632 |
| 0.0364 | guanine nucleotide-binding protein G(i) subunit alpha-1, partial | 684 | 590 |
| 0.0358 | peptidyl-prolyl cis-trans isomerase A-like | 684 | 586 |
| 0.0338 | casein kinase I isoform gamma-2 | 684 | 572 |
| 0.0328 | transcription factor AP-2-alpha isoform X1 | 684 | 565 |
| 0.0310 | protein argonaute-3 | 684 | 552 |
| 0.0307 | serine/threonine-protein phosphatase 2A 55 kDa regulatory subunit B alpha isoform | 684 | 550 |
| 0.0303 | casein kinase I isoform epsilon | 684 | 547 |
| 0.0300 | cytoplasmic polyadenylation element-binding protein 4 isoform X1 | 684 | 545 |
| 0.0283 | late histone H2B.L4-like | 684 | 532 |
| 0.0266 | L-lactate dehydrogenase B chain-like | 615 | 566 |
| 0.0264 | mitogen-activated protein kinase 11 | 684 | 517 |
| 0.0262 | septin-14 | 684 | 516 |
| 0.0257 | serine/threonine-protein phosphatase 2B catalytic subunit beta isoform isoform X1 | 684 | 512 |
| 0.0247 | alpha-enolase | 684 | 504 |
| 0.0247 | mitogen-activated protein kinase 10 | 684 | 504 |
| 0.0240 | pre-B-cell leukemia transcription factor 1 | 684 | 498 |
| 0.0229 | heat shock protein HSP 90-beta-like | 684 | 489 |
| 0.0229 | potassium voltage-gated channel beta subunit | 684 | 489 |
| 0.0227 | polyadenylate-binding protein 1 | 615 | 529 |
| 0.0209 | protein yippee-like 2 | 684 | 471 |
| 0.0207 | sodium/potassium-transporting ATPase subunit alpha-4 isoform X1 | 684 | 469 |
| 0.0202 | myosin regulatory light polypeptide 9 | 684 | 465 |
| 0.0200 | aldolase B | 684 | 463 |
TAED KEGG pathways based on duplication analysis using family node age from summed branch lengths
| KEGG Pathway mappings from highly duplicable TAED gene families – Family Node Age | |
|---|---|
| KEGG Pathway | Number of mapping instances from highly duplicable families |
| Metabolic pathways | 558 |
| Olfactory transduction | 198 |
| Pathways in cancer | 171 |
| PI3K-Akt signaling pathway | 130 |
| Endocytosis | 130 |
| HTLV-I infection | 122 |
| MAPK signaling pathway | 121 |
| Proteoglycans in cancer | 106 |
| Rap1 signaling pathway | 98 |
| Neuroactive ligand-receptor interaction | 96 |
| Ras signaling pathway | 93 |
| Regulation of actin cytoskeleton | 90 |
| Epstein-Barr virus infection | 87 |
| Purine metabolism | 86 |
| RNA transport | 85 |
| Transcriptional misregulation in cancer | 83 |
| Protein processing in endoplasmic reticulum | 81 |
| Axon guidance | 80 |
| mTOR signaling pathway | 79 |
| Focal adhesion | 79 |
| Viral carcinogenesis | 78 |
| Herpes simplex infection | 77 |
| cAMP signaling pathway | 77 |
| Ribosome | 75 |
| Cytokine-cytokine receptor interaction | 72 |
Sitewise substitution rates in TAED lineages sorted by selective pressure and structural features
| | Positively Selected Lineages (dN/dS > 1) | | | Negatively Selected Lineages (dN/dS < 0.5) | |
|---|---|---|---|---|---|
| Structure | Substituted Sites | p-value | All Sites | Substituted Sites | All Sites |
| Helix | 30.2826% | | 34.0597% | 35.2377% | 36.6327% |
| Exposed | 17.1142% | | 16.4580% | 20.2511% | 17.3397% |
| Buried | 13.1684% | | 17.6017% | 14.9866% | 19.2929% |
| α-Helix | 26.1346% | | 30.1108% | 31.1229% | 32.7617% |
| Exposed | 14.3764% | 0.1659 | 14.1740% | 17.5250% | 15.1730% |
| Buried | 11.7582% | | 15.9368% | 13.5979% | 17.5887% |
| 3₁₀ Helix | 3.4956% | 0.0098 | 3.3039% | 3.5115% | 3.2213% |
| Exposed | 2.3597% | | 2.0045% | 2.4163% | 1.8907% |
| Buried | 1.1359% | | 1.2994% | 1.0952% | 1.3306% |
| π-Helix | 0.6524% | 0.8047 | 0.6449% | 0.6033% | 0.6497% |
| Exposed | 0.3780% | | 0.2794% | 0.3098% | 0.2761% |
| Buried | 0.2743% | | 0.3655% | 0.2935% | 0.3736% |
| β-Sheet | 23.2104% | | 21.7820% | 18.2981% | 19.8385% |
| Exposed | 8.9360% | | 7.1998% | 7.3255% | 6.2661% |
| Buried | 14.2744% | 0.0361 | 14.5822% | 10.9726% | 13.5724% |
| β-Bridge | 1.1095% | 0.7913 | 1.0984% | 0.9888% | 1.0382% |
| Exposed | 0.5644% | | 0.4641% | 0.4876% | 0.4194% |
| Buried | 0.5451% | 0.0081 | 0.6343% | 0.5012% | 0.6188% |
| Turn | 12.0729% | | 11.0540% | 12.3561% | 11.0859% |
| Exposed | 9.7554% | | 8.3283% | 9.9588% | 8.1517% |
| Buried | 2.3175% | | 2.7257% | 2.3973% | 2.9342% |
| Bend | 10.4763% | | 9.7416% | 10.2628% | 9.6989% |
| Exposed | 7.9179% | | 6.8552% | 7.8004% | 6.6547% |
| Buried | 2.5584% | | 2.8864% | 2.4624% | 3.0443% |
| Coil | 22.8482% | | 22.2643% | 22.8565% | 21.7058% |
| Exposed | 15.2151% | | 13.2976% | 15.3522% | 12.7858% |
| Buried | 7.6331% | | 8.9667% | 7.5044% | 8.9201% |
| Buried (All Sites) | 40.4969% | | 47.3970% | 38.8245% | 48.3826% |
| Exposed (All Sites) | 59.5031% | | 52.6030% | 61.1755% | 51.6174% |
The distribution of substituted sites by secondary structure and solvent accessibility binned by the nature of selection are shown. Bolded items are significant (p < 0.00167 after multiple comparisons correction) based on parametric bootstrapping, n = 20000
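One way such a parametric bootstrap could work is sketched below. It assumes a binomial null in which substituted sites fall into a structural category at the same rate as all sites; the source does not describe the model actually simulated. The function name and the count of substituted sites (`N_SITES`) are hypothetical, and only the percentages are taken from the table.

```python
import numpy as np

def bootstrap_p_value(observed_frac, null_frac, n_sites, n_boot=20000, seed=0):
    """Two-sided parametric-bootstrap p-value under a binomial null (assumed model)."""
    rng = np.random.default_rng(seed)
    # Simulate the fraction of substituted sites landing in the category
    # if substitutions followed the all-sites fraction.
    simulated = rng.binomial(n_sites, null_frac, size=n_boot) / n_sites
    observed_dev = abs(observed_frac - null_frac)
    # Share of replicates deviating at least as much as the observation.
    return float(np.mean(np.abs(simulated - null_frac) >= observed_dev))

N_SITES = 10_000  # hypothetical number of substituted sites, not from the source

# Positively selected lineages: helix (30.2826% vs 34.0597%)
# and beta-bridge (1.1095% vs 1.0984%).
p_helix = bootstrap_p_value(0.302826, 0.340597, N_SITES)
p_bridge = bootstrap_p_value(0.011095, 0.010984, N_SITES)
```

With 20000 replicates the smallest nonzero p-value that can be resolved is 1/20000 = 0.00005, comfortably below the 0.00167 threshold quoted in the caption.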
Lineages with dN/dS > 1 in Ornithine decarboxylase family
| Lineages with dN/dS > 1 in TAED family for Ornithine decarboxylase | |||
|---|---|---|---|
| dN/dS Value | Branch Start | Branch End | Mapped Branches |
| 2.0096 | Eutheria | Afrotheria | Afrotheria |
| 1.9244 | |||
| 1.9712 | Cetacea | ||
| 1.7717 | Cetacea | ||
| 1.1272 | Cetacea | Cetacea | Cetacea |
| 1.5451 | Macaca | ||
Fig. 2 Gene tree for cetacean lineages of ornithine decarboxylase – Presented here is the gene tree taken from the TAED Tree Viewer for the TAED gene family 557. Lineages not associated with Cetaceans are collapsed. Internal nodes labeled with a white box are duplication events found within the tree. Nodes with solid grey dots represent speciation events. Nodes labeled in black indicate a leaf node. Lineages labeled in red have a dN/dS > 1 and the numbers along each branch are the associated dN/dS value for the given branch. Image was generated from the TAED Tree Viewer
Fig. 3 Pyridoxal phosphate binding site for ornithine decarboxylase along the lineage of Cetacea – A protein homology model of the ancestral protein leading to Cetacea was created. Template for the model was from human ornithine decarboxylase (PDB:2OO0; chain A). Ancestral changes occurring on the lineage for Cetacea have been mapped to the model: sites colored in red indicate nonsynonymous changes in the ancestral protein, and sites colored in dark grey are synonymous site changes. The site indicated in green is the pyridoxal phosphate binding site 238. The site adjacent to the binding site is the substitution N238D found on the ancestral lineage. Image was generated from Swiss-PdbViewer
Fig. 4 Active site remodeling for ornithine decarboxylase along the lineage of Cetacea – A protein homology model of the ancestral protein leading to Cetacea was created. Template for the model was from human ornithine decarboxylase (PDB:2OO0; chain A). Ancestral changes occurring on the lineage for Cetacea have been mapped to the model: sites colored in red indicate nonsynonymous changes in the ancestral protein, and sites colored in dark grey are synonymous site changes. The site indicated in gold is the active site cysteine-357. Remodeling of the active site can be seen in the changes P368Q, R375C, I376M, and R379H, which are positioned around the loop containing the active site