
How new genes are born.

Urminder Singh, Eve Syrkin Wurtele

Abstract

Analysis of yeast, fly and human genomes suggests that sequence divergence is not the main source of orphan genes.
© 2020, Singh and Syrkin Wurtele.


Keywords:  D. melanogaster; S. cerevisiae; computational biology; conserved synteny; de novo gene emergence; evolution; evolutionary biology; genetic novelty; human; orphan genes; sequence divergence; systems biology

Year:  2020        PMID: 32072921      PMCID: PMC7030788          DOI: 10.7554/eLife.55136

Source DB:  PubMed          Journal:  Elife        ISSN: 2050-084X            Impact factor:   8.140


Related research article: Vakirlis N, Carvunis A-R, McLysaght A. 2020. Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes. eLife 9:e53500. doi: 10.7554/eLife.53500
For half a century, most scientists believed that new protein-coding genes arise as a result of mutations in existing protein-coding genes. It was considered impossible for anything as complex as a functional new protein to arise from scratch. However, every species has certain genes, known as 'orphan genes', which code for proteins that are not homologous to proteins found in any other species. What do these orphan genes do, and how are they formed?
To date the roles of hundreds of orphan genes have been characterized. Although this is just a tiny fraction of the total, it is known that most of them code for proteins that bind to conserved proteins such as transcription factors or receptors. Some of these proteins are toxins, some are involved in reproduction, some integrate into existing metabolic and regulatory networks, and some confer resistance to stress (Carvunis et al., 2012; Li et al., 2009; Xiao et al., 2009; Arendsee et al., 2014; Belcaid et al., 2019). However, none of them are enzymes (Arendsee et al., 2014).
Orphan genes arise quickly, so they may provide a disruptive mechanism that allows a given species to survive changes to its environment. Thus, the study of how orphan genes arise (and fall) is central to understanding the forces that drive evolution (Figure 1).
Figure 1.

Life cycle of orphan genes.

Every species has orphan genes that have no homologs in other species. This schematic shows the genome of the fruit fly (bottom) and the genome of an ancestor of the fruit fly (top). Each panel also shows (from left to right): genes that are highly conserved and can be traced back to prokaryotic organisms (yellow background); genes that are found in just a few related species (taxonomically restricted genes), orphan genes and potential orphan genes that are not currently expressed and are thus free from selection pressure (proto-orphan genes); and regions of the genome that do not code for proteins (blue background) (Van Oss and Carvunis, 2019; Palmieri et al., 2014). An orphan gene can form through the rapid divergence of the coding sequence (CDS) of an existing gene (1), or arise de novo from regions of the genome that do not code for proteins (including the non-coding parts of genes that evolve to code for proteins; 2). Some orphan genes will be important for survival, and will thus be selected for and gradually optimized (3). This means that the genes in a single organism will have a gradient of ages (Tautz and Domazet-Lošo, 2011). Many proto-orphan genes will undergo pseudogenation (that is, they will not be retained; 4). Coding sequences (shown as thick colored bars) with detectable homology are shown in similar colors. Vakirlis et al. estimate that a minority of orphan genes have arisen by divergence of the coding sequence of existing genes.

One possible mechanism is the 'de novo' appearance of a gene from an intergenic region or a completely new reading frame within an existing gene (Tautz and Domazet-Lošo, 2011). An alternative mechanism is that the coding sequence of the orphan gene arises by rapid divergence from the coding sequence of a preexisting gene: this would mean that an entire set of regulatory and structural elements would be available to the gene as it evolves.
Now, in eLife, Nikolaos Vakirlis and Aoife McLysaght (both from Trinity College Dublin) and Anne-Ruxandra Carvunis (University of Pittsburgh) report how they have studied yeast, fly and human genes to compare the contributions of these two mechanisms to the emergence of orphan genes (Vakirlis et al., 2020). Previous studies have used simulations to estimate the number of orphan genes that appear by divergence; until now, no one had relied on actual genomics data to study this phenomenon.
Vakirlis et al. use a new approach to analyze orphan genes that have originated through divergence. They examine regions of the genome that correspond to each other (so-called syntenic regions) in related species to determine whether a gene exists in both regions and, if so, whether the proteins are non-homologous. If the genes have no homology, they may have originated by rapid divergence from the coding sequence of a preexisting gene. Using this method, Vakirlis et al. infer that at most 45% of S. cerevisiae (yeast) orphan genes, 25% of D. melanogaster (fruit fly) orphan genes, and 18% of human orphan genes arose by rapid divergence. These figures are upper estimates: for example, a new coding sequence might have arisen de novo within an existing gene, rather than the existing coding sequence having been modified beyond recognition.
But how can a protein sequence continue to be selected for as it rapidly diverges? Vakirlis et al. suggest that divergence might occur by a process of partial pseudogenation: the existing gene becomes non-functional, and then, with no selection pressure to retain the old protein, it diverges to form an orphan gene.
Many orphan genes may not have been identified yet, because they do not have homologs in other species and have few recognizable sequence features. This means that up to 80% of orphan genes can be missed when a new genome is annotated (Seetharam et al., 2019).
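
The logic of the synteny-based check described above can be illustrated with a minimal sketch. This is not the pipeline used by Vakirlis et al.: the class `SyntenicPair`, the functions `classify` and `max_divergence_fraction`, and the label strings are hypothetical names introduced here, and the homology call is assumed to come from a separate sequence-similarity search that is not shown.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SyntenicPair:
    """One focal gene and what sits in the matching syntenic region
    of a related species (hypothetical structure, for illustration)."""
    gene_id: str
    counterpart_id: Optional[str]  # gene found in the syntenic region, if any
    proteins_homologous: bool      # assumed result of a similarity search


def classify(pair: SyntenicPair) -> str:
    """Apply the two questions from the text: is there a gene in the
    syntenic region, and if so, are the two proteins homologous?"""
    if pair.counterpart_id is None:
        return "no_syntenic_gene"
    if pair.proteins_homologous:
        return "homolog_detected"
    # A gene is present in the conserved position but shows no detectable
    # similarity: consistent with divergence beyond recognition.
    return "candidate_divergence"


def max_divergence_fraction(pairs: Sequence[SyntenicPair]) -> float:
    """Fraction of the supplied genes that are divergence candidates.
    This is an upper bound, because a candidate may instead carry a
    coding sequence that arose de novo within the existing gene."""
    if not pairs:
        return 0.0
    hits = sum(1 for p in pairs if classify(p) == "candidate_divergence")
    return hits / len(pairs)


# Toy data: four orphan genes, one with a non-homologous syntenic counterpart.
orphans = [
    SyntenicPair("orphanA", "geneX", False),
    SyntenicPair("orphanB", None, False),
    SyntenicPair("orphanC", None, False),
    SyntenicPair("orphanD", None, False),
]
print(max_divergence_fraction(orphans))  # 0.25
```

The fraction is reported as a ceiling rather than a point estimate because, as noted above, a non-homologous gene in a syntenic position does not prove that the old coding sequence diverged.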
The approach detailed by Vakirlis, Carvunis and McLysaght evaluates specifically those annotated orphan genes for which a similar gene exists in a related species (which is ~50% of them; Arendsee et al., 2019). As high-quality genomes from more species become available, and as more orphan genes are annotated, the approach will provide yet deeper insights into the origin of these genes.
One of the many open questions in this field deals with genes of ‘mixed age’. Some such genes have incorporated ‘chunks’ of orphans into their coding sequences. A gene that has done this is (somewhat arbitrarily) considered to be the age of its most ancient segment, but we know little about the mechanism of this process or its significance. Another question involves the unique strategies and rates of evolution of each gene (Revell et al., 2018). How might the abundance and mechanisms of orphan gene origin vary among species? And how do different environments affect the emergence of orphan genes?
References (10 in total)

1.  Identification of the novel protein QQS as a component of the starch metabolic network in Arabidopsis leaves.

Authors:  Ling Li; Carol M Foster; Qinglei Gan; Dan Nettleton; Martha G James; Alan M Myers; Eve Syrkin Wurtele
Journal:  Plant J       Date:  2008-01-18       Impact factor: 6.417

2.  The evolutionary origin of orphan genes. Review.

Authors:  Diethard Tautz; Tomislav Domazet-Lošo
Journal:  Nat Rev Genet       Date:  2011-08-31       Impact factor: 53.242

3.  phylostratr: a framework for phylostratigraphy.

Authors:  Zebulun Arendsee; Jing Li; Urminder Singh; Arun Seetharam; Karin Dorman; Eve Syrkin Wurtele
Journal:  Bioinformatics       Date:  2019-10-01       Impact factor: 6.937

4.  Coming of age: orphan genes in plants. Review.

Authors:  Zebulun W Arendsee; Ling Li; Eve Syrkin Wurtele
Journal:  Trends Plant Sci       Date:  2014-11       Impact factor: 18.313

5.  Proto-genes and de novo gene birth.

Authors:  Anne-Ruxandra Carvunis; Thomas Rolland; Ilan Wapinski; Michael A Calderwood; Muhammed A Yildirim; Nicolas Simonis; Benoit Charloteaux; César A Hidalgo; Justin Barbette; Balaji Santhanam; Gloria A Brar; Jonathan S Weissman; Aviv Regev; Nicolas Thierry-Mieg; Michael E Cusick; Marc Vidal
Journal:  Nature       Date:  2012-07-19       Impact factor: 49.962

6.  Symbiotic organs shaped by distinct modes of genome evolution in cephalopods.

Authors:  Mahdi Belcaid; Giorgio Casaburi; Sarah J McAnulty; Hannah Schmidbaur; Andrea M Suria; Silvia Moriano-Gutierrez; M Sabrina Pankey; Todd H Oakley; Natacha Kremer; Eric J Koch; Andrew J Collins; Hoan Nguyen; Sai Lek; Irina Goncharenko-Foster; Patrick Minx; Erica Sodergren; George Weinstock; Daniel S Rokhsar; Margaret McFall-Ngai; Oleg Simakov; Jamie S Foster; Spencer V Nyholm
Journal:  Proc Natl Acad Sci U S A       Date:  2019-01-11       Impact factor: 12.779

7.  De novo gene birth.

Authors:  Stephen Branden Van Oss; Anne-Ruxandra Carvunis
Journal:  PLoS Genet       Date:  2019-05-23       Impact factor: 5.917

8.  A rice gene of de novo origin negatively regulates pathogen-induced defense response.

Authors:  Wenfei Xiao; Hongbo Liu; Yu Li; Xianghua Li; Caiguo Xu; Manyuan Long; Shiping Wang
Journal:  PLoS One       Date:  2009-02-25       Impact factor: 3.240

9.  The life cycle of Drosophila orphan genes.

Authors:  Nicola Palmieri; Carolin Kosiol; Christian Schlötterer
Journal:  Elife       Date:  2014-02-19       Impact factor: 8.140

10.  Synteny-based analyses indicate that sequence divergence is not the main source of orphan genes.

Authors:  Nikolaos Vakirlis; Anne-Ruxandra Carvunis; Aoife McLysaght
Journal:  Elife       Date:  2020-02-18       Impact factor: 8.140

