
De novo origins of human genes.

Daniele Guerzoni, Aoife McLysaght

Year:  2011        PMID: 22102832      PMCID: PMC3213182          DOI: 10.1371/journal.pgen.1002381

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Where do new genes come from? For a long time the answer to that question has simply been “from other genes”. The most prolific source of new loci in eukaryotic genomes is gene duplication in all its guises: exon shuffling, tandem duplication, retrocopying, segmental duplication, and genome duplication. However, in recent years there has been a growing appreciation of the oft-dismissed possibility of evolution of new genes from scratch (i.e., de novo) as a rare but consistent feature of eukaryotic genomes [1], [2]. Pioneering work identified several de novo genes in Drosophila [3]–[5], and since then, additional Drosophila cases have been identified [6], as well as cases in yeast [7], [8], Plasmodium [9], rice [10], mouse [11], primates [12], and human [13], [14]. It would appear that whenever anyone makes the effort to search, candidate novel genes are found.

In this issue of PLoS Genetics, Wu et al. [15] report 60 putative de novo human-specific genes. This is a lot higher than a previous, admittedly conservative, estimate of 18 such genes [13], [16]. The genes identified share broad characteristics with other reported de novo genes [13]: they are short, and all but one consist of a single exon. In other words, the genes are simple, and their evolution de novo seems plausible. The potential evolution of complex features such as intron splicing and protein domains within de novo genes remains somewhat puzzling. However, features such as proto-splice sites may pre-date novel genes [9], [17], and the appearance of protein domains by convergent evolution may be more likely than previously thought [2].

The operational definition of a de novo gene used by Wu et al. [15] means that there may be an ORF (and thus potentially a protein-coding gene) in the chimpanzee genome that is up to 80% of the length of the human gene (for about a third of the genes the chimpanzee ORF is at least 50% of the length of the human gene). This is a more lenient criterion than employed by other studies, and this may partly explain the comparatively high number of de novo genes identified. Some of these cases may be human-specific extensions of pre-existing genes, rather than entirely de novo genes (an interesting, but distinct, phenomenon).
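To make the length criterion concrete, the following is a minimal sketch of how a candidate could be sorted by the ratio of chimpanzee ORF length to human ORF length. It is not the pipeline of Wu et al. [15]: the function names, the category labels, and the treatment of the exact boundary values are assumptions made for illustration. Only the 80% ceiling and the 50% mark come from the text above.

```python
def chimp_orf_fraction(human_orf_len: int, chimp_orf_len: int) -> float:
    """Length of the chimpanzee ORF as a fraction of the human ORF length."""
    if human_orf_len <= 0:
        raise ValueError("human ORF length must be positive")
    return chimp_orf_len / human_orf_len


def classify_by_length(
    human_orf_len: int,
    chimp_orf_len: int,
    max_fraction: float = 0.8,   # ceiling described in the text
    flag_fraction: float = 0.5,  # mark reached by about a third of the genes
) -> str:
    """Hypothetical labels; boundary handling (> vs >=) is an assumption."""
    fraction = chimp_orf_fraction(human_orf_len, chimp_orf_len)
    if fraction > max_fraction:
        return "excluded"
    if fraction >= flag_fraction:
        # Long chimpanzee ORF: might be an extension of a pre-existing gene.
        return "candidate, possible extension"
    return "candidate"


if __name__ == "__main__":
    for chimp_len in (60, 180, 270):
        print(chimp_len, classify_by_length(300, chimp_len))
```

Under these assumptions, a stricter study would correspond to lowering `max_fraction`, which shrinks the candidate list and removes the cases most likely to be extensions rather than entirely new genes.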

Limitations in Defining and Identifying De Novo Genes

A major consideration in these studies is the reliable definition and identification of de novo genes. If a sequence similarity search fails to return a plausible homolog, then it may be that you are dealing with a novel gene. However, it is necessary to exclude the alternative hypothesis of recent loss in sister lineages as well as the possibility that this is a rapidly evolving gene with highly divergent, but extant, homologs. Wu et al. [15] have employed a strategy similar to that of Knowles and McLysaght [13] to search within the human genome for candidate novel loci. The search protocol requires positive evidence of the absence of the gene from other primate lineages in order to show that it is not a gene that has diverged beyond recognition from its homologs (orthologous DNA is identifiable), nor is it a gene that has been recently lost in sister lineages (the ancestral sequence is inferred to carry a disablement) (Figure 1).
Figure 1

Evidence in the detection of novel genes.

A hypothetical example where a novel human ORF is created by a human-specific deletion. The 1 bp deletion shifts a downstream stop codon out of frame. Because the deletion is not shared by other primates, the ancestral sequence is inferred to carry the in-frame stop. The authenticity of the novel human gene can be confirmed with transcription and translation evidence.
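The scenario in Figure 1 can be reproduced with a toy sequence. The sketch below is illustrative only: the two sequences are invented, the helper `orf_codons` is hypothetical, and the ORF is read naively from the first ATG on the forward strand. It shows how deleting a single base moves a stop codon out of frame and lengthens the ORF.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(seq: str) -> list[str]:
    """Codons from the first ATG up to, but excluding, the first in-frame stop.

    Returns an empty list if there is no ATG or no in-frame stop codon.
    """
    start = seq.find("ATG")
    if start == -1:
        return []
    codons = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return codons
        codons.append(codon)
    return []  # ran off the end without a stop: not a complete ORF


# Invented ancestral sequence: ATG GCT ACC TGA ... (stop at the fourth codon).
ancestral = "ATGGCTACCTGACGGAAAGTTCTAA"

# Human-specific 1 bp deletion (the 'A' at index 6) shifts the TGA out of frame.
human = ancestral[:6] + ancestral[7:]

if __name__ == "__main__":
    print("ancestral ORF:", orf_codons(ancestral))
    print("human ORF:    ", orf_codons(human))
```

In this toy case the ancestral reading frame stops after three codons, while the derived sequence reads through seven codons before reaching a later stop. Real analyses would also need to consider both strands, alternative start codons, and alignment uncertainty around the indel.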

A serious limitation in this approach is that it relies on existing gene lists that have been annotated using criteria that usually include the presence of a homolog in other genomes. Novel genes fail to meet this criterion by definition, thus they are usually not reliably annotated. Wu et al.'s study [15] highlights the volatility of the annotation of putatively novel genes: over half of the candidate de novo genes they identified are not included in the more recent Ensembl gene lists they used (version 56), and by version 60 only six of these genes were still listed. It would be preferable to have a method of identifying novel genes that used more direct evidence of gene expression. Sequenced peptides and ESTs can be used to confirm that a putative gene is operational, but these data are not currently suitable for identifying protein-coding genes from first principles: the peptide databases usually only list peptides belonging to already-annotated genes [18]; and the high rate of promiscuous transcription of the genome, particularly in testis, where several of Wu et al.'s genes [15] were expressed at their highest, means that transcription alone is not sufficient to recognize a gene [1], [19].

However, care must also be taken to ensure that the ancestral sequence can reliably be inferred to be non-coding. Wu et al. [15] restricted their search to chimpanzee and orangutan genomes, but in at least one case (ENSG00000221972) gorilla and gibbon share the “human-specific” mutation, making this case equivocal. Ideally, the putative non-coding sequences should be investigated for evidence of transcription and translation to support the inference of absence of coding capacity.
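The evidence requirements discussed in this section can be summarized as a simple decision rule. The sketch below is a deliberately simplified illustration, not the procedure used in [13] or [15]: the `Locus` states, the `assess` function, and the verdict strings are all hypothetical, and real inference of the ancestral state would work from aligned sequences rather than per-species labels.

```python
from enum import Enum


class Locus(Enum):
    INTACT_ORF = "intact ORF"
    DISABLED = "disabled"        # orthologous DNA found, ORF interrupted
    NO_ORTHOLOG = "no ortholog"  # orthologous DNA could not be identified


def assess(human: Locus, outgroups: dict[str, Locus]) -> str:
    """Classify a human candidate given the state of each outgroup species."""
    if human is not Locus.INTACT_ORF:
        return "not a candidate"
    if not outgroups:
        return "unresolved"  # no comparative evidence at all
    states = list(outgroups.values())
    if any(s is Locus.NO_ORTHOLOG for s in states):
        # Cannot exclude a homolog that has diverged beyond recognition.
        return "unresolved"
    if all(s is Locus.DISABLED for s in states):
        # Every outgroup carries the disablement, so the ancestor likely did too.
        return "supported"
    # At least one outgroup shares the enabling state with human.
    return "equivocal"


if __name__ == "__main__":
    two_species = {"chimpanzee": Locus.DISABLED, "orangutan": Locus.DISABLED}
    print(assess(Locus.INTACT_ORF, two_species))
    more_species = {**two_species, "gorilla": Locus.INTACT_ORF}
    print(assess(Locus.INTACT_ORF, more_species))
```

The second call mirrors the kind of situation described for ENSG00000221972: a case that looks supported with two outgroups can become equivocal once additional primates are examined.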

Future Challenges

Though Wu et al. [15] have contributed to our growing knowledge of de novo gene evolution, we still lack a definitive list of de novo–originated genes in the human genome—mainly due to issues concerning genome annotation and the stringent criteria required to reliably identify cases. A comprehensive list of de novo genes in human as well as in other primates would open up the opportunity to examine the survivorship of these genes and investigate their specific contribution to phenotype. The observation by Wu et al. [15] that some of the candidate de novo genes are expressed at their highest in brain tissues and testis is interesting, but by no means proves they are functional. A major challenge remains to demonstrate functionality of the de novo genes. This is particularly difficult for human-specific genes, where there is perhaps the greatest interest, but there are also the greatest limitations in terms of possible experiments.

What Does This Tell Us about Human–Chimpanzee Divergence?

Though it remains to be seen if any of the genes is functional, a clear picture is developing of de novo evolution as a process that can create genetic novelty, upon which there is at least the opportunity for natural selection to act. It has been argued that the capacity for innovation generated by novel genes is particularly important for the evolution of lineage-specific traits [20]. It is now common knowledge that human and chimpanzee DNA differ by only 1% (more accurately, they differ in 1% of alignable regions of genome, with a further 3% divergence due to lineage-specific indels [21]). This fact lies in stark contrast to the large phenotypic differences between the two species [22]. The study by Wu et al. [15], along with the previous reports of de novo genes in human, shows that even within highly similar regions of DNA, we can pinpoint small changes at the nucleotide level—base substitutions and indels—that have the potential to generate large phenotypic effects.
References (22 in total)

Review 1.  Origins, evolution, and phenotypic impact of new genes.

Authors:  Henrik Kaessmann
Journal:  Genome Res       Date:  2010-07-22       Impact factor: 9.043

2.  The pros and cons of peptide-centric proteomics.

Authors:  Mark W Duncan; Ruedi Aebersold; Richard M Caprioli
Journal:  Nat Biotechnol       Date:  2010-07       Impact factor: 54.908

3.  Recently evolved genes identified from Drosophila yakuba and D. erecta accessory gland expressed sequence tags.

Authors:  David J Begun; Heather A Lindfors; Melissa E Thompson; Alisha K Holloway
Journal:  Genetics       Date:  2005-12-15       Impact factor: 4.562

4.  De novo origin of new genes with introns in Plasmodium vivax.

Authors:  Zefeng Yang; Jinling Huang
Journal:  FEBS Lett       Date:  2011-01-18       Impact factor: 4.124

Review 5.  The evolutionary origin of orphan genes.

Authors:  Diethard Tautz; Tomislav Domazet-Lošo
Journal:  Nat Rev Genet       Date:  2011-08-31       Impact factor: 53.242

6.  Novel genes derived from noncoding DNA in Drosophila melanogaster are frequently X-linked and exhibit testis-biased expression.

Authors:  Mia T Levine; Corbin D Jones; Andrew D Kern; Heather A Lindfors; David J Begun
Journal:  Proc Natl Acad Sci U S A       Date:  2006-06-15       Impact factor: 11.205

7.  Initial sequence of the chimpanzee genome and comparison with the human genome.

Authors:  Chimpanzee Sequencing and Analysis Consortium
Journal:  Nature       Date:  2005-09-01       Impact factor: 49.962

8.  A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome.

Authors:  Mark Yandell; Adina M Bailey; Sima Misra; ShengQiang Shu; Colin Wiel; Martha Evans-Holm; Susan E Celniker; Gerald M Rubin
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-24       Impact factor: 11.205

Review 9.  Comparing the human and chimpanzee genomes: searching for needles in a haystack.

Authors:  Ajit Varki; Tasha K Altheide
Journal:  Genome Res       Date:  2005-12       Impact factor: 9.043

10.  De novo origin of human protein-coding genes.

Authors:  Dong-Dong Wu; David M Irwin; Ya-Ping Zhang
Journal:  PLoS Genet       Date:  2011-11-10       Impact factor: 5.917

