Yue Hao, Hyuk Jin Lee, Michael Baraboo, Katherine Burch, Taylor Maurer, Jason A Somarelli, Gavin C Conant.
Abstract
It has long been challenging to uncover the molecular mechanisms behind striking morphological innovations such as mammalian pregnancy. We studied the power of a robust comparative orthology pipeline based on gene synteny to address such problems. We inferred orthology relations between human genes and genes from each of 43 other vertebrate genomes, resulting in ∼18,000 orthologous pairs for each genome comparison. We hypothesized that identifying genes that first appear coincident with the origin of the placental mammals would define a subset of the genome enriched for genes that played a role in placental evolution. We thus pinpointed orthologs that appeared before and after the divergence of eutherian mammals from marsupials. Reinforcing previous work, we found instead that much of the genetic toolkit of mammalian pregnancy evolved through the repurposing of preexisting genes to new roles. These genes acquired regulatory controls for their novel roles from a group of regulatory genes, many of which did in fact originate at the appearance of the eutherians. Thus, orthologs appearing at the origin of the eutherians are enriched in functions such as transcriptional regulation by Krüppel-associated box-zinc-finger proteins, innate immune responses, keratinization, and the melanoma-associated antigen protein class. Because the cellular mechanisms of invasive placentae are similar to those of metastatic cancers, we then used our orthology inferences to explore the association between placenta invasion and cancer metastasis. Again echoing previous work, we find that genes that are phylogenetically older are more likely to be implicated in cancer development.
Keywords: comparative genomics; mammalian pregnancy; orthology inference; placental mammals
Year: 2020 PMID: 32053193 PMCID: PMC7144826 DOI: 10.1093/gbe/evaa026
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. The appearance of mammalian orthologous genes in a phylogenetic context. (a) Shown is a vertebrate orthology tree with the number of synonymous substitutions per synonymous site (dS) as branch lengths. Black slices in the pie charts show the proportions of genes in the nonhuman genomes that have an orthologous human gene. We inferred the first appearance of each ortholog to give monophyletic groups that possess that gene (see Materials and Methods); the inferred ancestral ortholog percentages are thus shown on the internal nodes. dS values were estimated using codeml with an unrooted guide tree and alignments of 948 orthologs (see Materials and Methods). The topology used was adapted from Meredith et al. (2011). The letters A–N label the internal branches leading to humans. The boxed region is expanded to show the primate lineage. The species images were downloaded from PhyloPic.org. (b) Bar chart of the number of orthologs on the internal branches leading to human (A–N in a). The number of such new orthologs with identified homologs (but not orthologs) in birds is shown in red (see Materials and Methods).
Fig. 2. The number of orthologs on internal branches leading to human and branch-specific selection estimates from the alignments of 948 orthologs. (a) Scatterplot of the number of orthologs versus the estimated branch length in millions of years (see Materials and Methods). The dashed line represents the average rate of new ortholog occurrence across the 14 branches. (b) Comparison of two estimates of the relative rate of new gene appearance on different phylogenetic branches in the mammalian tree. The blue dashed line shows the average rate of occurrence on each branch, given by dividing the number of orthologs by the branch length. The yellow line shows the rates estimated using new gene counts from Zhang et al. (2010) divided by the same branch lengths. (c) Branch-specific dN/dS over time: the x axis gives time (in millions of years ago) estimated from cumulative dS, and the y axis gives the estimated average dN/dS for the corresponding branch (see Materials and Methods).
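The per-branch rate described in this caption is a simple ratio, and a minimal sketch may make the arithmetic concrete. The ortholog counts below are copied from the branch-count table in this record; the branch durations are placeholder values chosen only for illustration and are not taken from the paper. The caption does not say how the "average rate" is computed, so both a pooled rate and a mean of per-branch rates are shown as possible readings. All function and variable names are hypothetical.

```python
# Ortholog counts for three branches, copied from the branch-count table in this record.
ORTHOLOG_COUNTS = {"mammal": 1403, "therian": 1996, "eutherian": 1320}

# PLACEHOLDER branch durations in millions of years (My).
# These are illustrative values only, NOT estimates from the paper.
PLACEHOLDER_LENGTHS_MY = {"mammal": 100.0, "therian": 50.0, "eutherian": 60.0}


def per_branch_rates(counts, lengths_my):
    """Return new orthologs per million years for each branch."""
    return {branch: counts[branch] / lengths_my[branch] for branch in counts}


def pooled_rate(counts, lengths_my):
    """One reading of 'average rate': total orthologs over total branch time."""
    return sum(counts.values()) / sum(lengths_my[b] for b in counts)


def mean_of_rates(counts, lengths_my):
    """Another reading: the unweighted mean of the per-branch rates."""
    rates = per_branch_rates(counts, lengths_my)
    return sum(rates.values()) / len(rates)


rates = per_branch_rates(ORTHOLOG_COUNTS, PLACEHOLDER_LENGTHS_MY)
for branch, rate in rates.items():
    print(f"{branch}: {rate:.2f} orthologs per My (placeholder branch length)")
print(f"pooled rate: {pooled_rate(ORTHOLOG_COUNTS, PLACEHOLDER_LENGTHS_MY):.2f}")
print(f"mean of rates: {mean_of_rates(ORTHOLOG_COUNTS, PLACEHOLDER_LENGTHS_MY):.2f}")
```

The two averages differ whenever branch lengths are unequal, so a reanalysis would need to confirm in Materials and Methods which one the dashed line represents.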
Summary of Top Ten Functional Annotation Clusters for Each Group of Orthologs
| Ortholog Group | Cluster Rank | Functional Description | DAVID Enrichment Score |
|---|---|---|---|
| Orthologs on the mammal branch | 1 | SPRY domain, B-box ZNF, RING-type ZNF | 5.45 |
| | 2 | Innate immune response | 3.6 |
| | 3 | Cell adhesion molecule binding, cell recognition | 3.01 |
| | 4 | Adaptive immune response, cytokine, interferon | 2.95 |
| | 5 | Interleukin, interleukin receptor binding | 2.67 |
| | 6 | EF-hand domain | 2.52 |
| | 7 | Chemical carcinogenesis, drug metabolism - cytochrome P450 | 2.41 |
| | 8 | Steroid metabolic process, sulfotransferase activity | 2.07 |
| | 9 | Phototransduction guanylate cyclase activity | 2.02 |
| | 10 | Biomineralization, biomineral tissue development | 1.93 |
| Orthologs on the therian branch | 1 | Mammalian taste receptor activity | 10.15 |
| | 2 | Glycoprotein, disulfide bond | 9.41 |
| | 3 | Olfactory and sensory transduction, GPCR activity | 8.19 |
| | 4 | Glycoprotein, transmembrane helix | 7.1 |
| | 5 | Keratinization, peptide crosslinking | 4.01 |
| | 6 | KRAB, C2H2-ZNF, transcription regulation | 3.4 |
| | 7 | Peptidase S1 activity | 3.25 |
| | 8 | C-type lectin, carbohydrate binding | 2.56 |
| | 9 | DNA binding HTH domain, endonuclease | 2.11 |
| | 10 | Herpes, measles, influenza A related pathway | 2.09 |
| Orthologs on the eutherian branch | 1 | KRAB, C2H2-ZNF, transcription regulation | 28.51 |
| | 2 | MAGE protein, tumor antigen | 20.78 |
| | 3 | Innate immune response, defense response to bacterium, β-defensin | 17.46 |
| | 4 | Olfactory receptor, sensory transduction, GPCR activity | 17.05 |
| | 5 | Keratin, intermediate filament protein | 11.36 |
| | 6 | C-type lectin, carbohydrate binding | 7.06 |
| | 7 | Immune response, cytokine, cytokine receptor | 4.6 |
| | 8 | Immunoglobulin domain | 4.2 |
| | 9 | MHC I/II like antigen recognition protein, natural killer cell activity | 4.16 |
| | 10 | Keratinization, peptide crosslinking | 3.66 |
| Orthologs on the postplacental branches | 1 | Olfactory transduction, GPCR activity | 32.51 |
| | 2 | Protein deubiquitination, peptidase C19 | 13.21 |
| | 3 | Histone, epigenetic regulation of gene expression, transcriptional misregulation in cancer, viral carcinogenesis | 10.93 |
| | 4 | KRAB, C2H2-ZNF, transcription regulation | 8.79 |
| | 5 | Innate immune response, antibacterial humoral response | 7.87 |
| | 6 | Defense response to bacterium, β-defensin | 7.14 |
| | 7 | Cadherin, cell–cell adhesion | 3.92 |
| | 8 | Fungicide, defense response to fungus | 3.04 |
| | 9 | GRIP, protein targeting Golgi | 3.03 |
| | 10 | Serotonin pathway, neurotransmitter receptor activity | 2.19 |
note.—SPRY, domain in SPla and the RYanodine receptor; RING, really interesting new gene; EF-hand, helix-loop-helix domain; GPCR, G-protein coupled receptor; KRAB, Krüppel-associated box; C2H2-ZNF, C2H2 zinc finger; HTH, helix-turn-helix; MAGE, melanoma-associated antigen; GRIP, glutamate receptor-interacting protein.
Fig. 3. Cancer hallmark genes for orthologs appearing on the mammal, therian, eutherian, and postplacental branches of the phylogeny in figure 1. Four groups of genes are listed based on their evolutionary time of appearance. Branch A is the mammal branch, B is the therian branch, C is the eutherian branch, and D–N are the postplacental branches. The association of these genes with cancer hallmarks is shown. Green circles stand for “promotes,” dark blue circles for “suppresses,” and aqua circles for both. The cancer hallmark annotations for cancer census genes were obtained from COSMIC release v88 (Hanahan and Weinberg 2011; Thompson et al. 2017). The ten cancer hallmarks are 1, proliferative signaling; 2, suppression of growth; 3, escaping the immune response to cancer; 4, cell replicative immortality; 5, tumor-promoting inflammation; 6, invasion and metastasis; 7, angiogenesis; 8, genome instability and mutations; 9, escaping programmed cell death; 10, change of cellular energetics.
Number of Orthologs That Are Expressed in Placenta (from DAVID) and Are Annotated as Census Cancer Genes in COSMIC
| Branch | Number of Genes | Number of Genes Expressed in Placenta | Number of Cancer Census Genes |
|---|---|---|---|
| Root branch | 13,472 | 7,215 | 501 |
| Mammal branch | 1,403 | 616 | 24 |
| Therian branch | 1,996 | 785 | 34 |
| Eutherian branch | 1,320 | 388 | 9 |
| Postplacental branches | 1,174 | 277 | 4 |
The gene count on each branch was compared with the sum of gene counts on later branches, for example, number of orthologs on eutherian branch was compared with the number of orthologs on all branches after the eutherian branch. Fisher’s exact tests were performed to examine whether the true odds ratios are >1.
P < 0.05.
P < 0.0001.
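The branch-versus-later-branches comparison described in the table note can be sketched as a one-sided Fisher's exact test on a 2×2 table. The sketch below uses the placenta-expression counts from the table above and assumes the contingency table is laid out as (annotated, not annotated) × (this branch, all later branches); that layout is an assumption, as the record does not spell it out. The test is implemented directly from the hypergeometric distribution in pure Python, so it is not the authors' code, and all function names are hypothetical. Where SciPy is available, `scipy.stats.fisher_exact(table, alternative="greater")` offers an equivalent one-sided test.

```python
from math import exp, lgamma

# Counts copied from the table above: (branch, total genes, placenta-expressed genes).
BRANCHES = [
    ("Root", 13472, 7215),
    ("Mammal", 1403, 616),
    ("Therian", 1996, 785),
    ("Eutherian", 1320, 388),
    ("Postplacental", 1174, 277),
]


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the test of
    whether the true odds ratio is greater than 1.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denominator = log_comb(total, row1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += exp(log_comb(col1, x) + log_comb(total - col1, row1 - x) - log_denominator)
    return min(p, 1.0)


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); infinite if the denominator is zero."""
    return (a * d) / (b * c) if b * c else float("inf")


def branch_vs_later(branches, index):
    """Compare one branch with the pooled counts of all later branches."""
    name, total, hits = branches[index]
    later_total = sum(t for _, t, _ in branches[index + 1:])
    later_hits = sum(h for _, _, h in branches[index + 1:])
    a, b = hits, total - hits                    # this branch: annotated, not annotated
    c, d = later_hits, later_total - later_hits  # later branches: annotated, not annotated
    return name, odds_ratio(a, b, c, d), fisher_greater(a, b, c, d)


for i in range(len(BRANCHES) - 1):
    name, ratio, p = branch_vs_later(BRANCHES, i)
    print(f"{name}: odds ratio = {ratio:.2f}, one-sided p = {p:.3g}")
```

The same function applies to the cancer census column by swapping in those counts. Working in log space with `lgamma` keeps the hypergeometric terms numerically stable for tables with thousands of genes.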