We simply cannot go on being so vague about 'function'.

W Ford Doolittle

Abstract

Function is an onerous concept, as the recent study by Steven Salzberg and colleagues demonstrates. We should be careful and always specific in using the 'F-word'.

Year:  2018        PMID: 30563541      PMCID: PMC6299606          DOI: 10.1186/s13059-018-1600-4

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Introduction

A recent study in Genome Biology by the Salzberg laboratory reported on the assembly of a new human gene catalog, based on an exhaustive transcriptomic survey of 31 tissues from hundreds of human subjects [1]. After the removal of transcripts that overlapped with those found in RefSeq or GENCODE databases and additional filtering, they found what appeared to be 224 new protein-coding genes and 116,156 new non-coding transcripts that they deemed to be functional. More surprising is their claim to have also detected over 30 million additional non-functional transcripts, revealing an overwhelming amount of “transcriptional noise” in human cells.

Functional and non-functional noncoding RNAs

The findings from Salzberg and colleagues could be seen as a decisive blow in the dispute over noncoding RNAs (ncRNAs): are they functional or are they not? Of course, discretely functional structural RNAs such as ribosomal and transfer RNAs were known even before coding was understood, and there is now an abundance of well-studied small regulatory RNA species. Moreover, at least a few long ncRNAs (lncRNAs) play important developmental and cellular roles, and are as thoroughly documented as many proteins. But most of the DNA in many genomes is transcribed, albeit infrequently. It is about this process and its products that there is disagreement, and there are two schools of thought.

The first school, which may be called ‘functionalist’, imagines that these RNAs comprise a vast interconnected network of subtle regulatory and evolutionary capabilities (evolvability), realized and potential. John Mattick and collaborators [2], for instance, consider that we are in the midst of a “conceptual upheaval”, grounded in “the unfolding discovery of previously hidden layers of regulatory RNAs (including many derived from retrotransposon sequences and pseudogenes) and the emerging realization that the genome might not be constructed as a discrete set of protein-coding genes with associated regulatory sequences, but as an interleaved continuum of both coding and cis- and trans-acting regulatory information.” The second school, whose members could be called ‘skeptics’, regards ncRNAs (especially lncRNAs) as mostly transcriptional noise. In a recent review, Palazzo and Lee [3] discuss how to determine whether any given ncRNA has a function and advocate that “in the absence of any such data, the appropriate null hypothesis is that the RNA in question is junk.”

The two schools came into conflict in 2012, after investigators associated with the ENCODE project claimed that 80.4% of our genome is functional, and thus we might at last “write the eulogy for junk DNA” [4]. After all, that claim was largely based on evidence that most of our DNA is transcribed, in one tissue or another. Indeed, tissue-specific transcription is considered proof of function in many studies. However, there are several reasons why tissue-specific transcription could happen without providing evidence for tissue-specific ‘function’. Indeed, Graur et al. [5] criticized the ENCODE consortium for often falling into the logical error of “affirming the consequent” (i.e., taking a true statement and invalidly concluding its converse), in particular assuming that because functional genes are transcribed, transcribed regions must be functional genes. ENCODE investigators responded to critics by admitting that assessments of ‘function’ were not easy to make, and that in the case of low-abundance transcripts it was possible that simple presence is not enough for such ascription. They admitted the need to use multiple biochemical criteria in order to elucidate “genome function in human biology and disease”.

Still, the functionalist viewpoint seems at odds with the conclusions of Pertea et al. [1] which, compared to those of Lloyd et al. [6] using machine learning models, are based on very straightforward methods. For instance, unlike Mattick, Salzberg and colleagues dismiss pseudogene transcripts by fiat and declare all protein-noncoding RNAs to be non-functional if they (1) were assembled in fewer than ten samples (of almost 10,000) unless present at high levels in these, (2) contained only a single exon, or (3) overlapped known genes (on either strand). By these, and a few additional tests that functionalists might consider arbitrary and biased, they declared that over 30 million transcripts at over 650,000 genomic loci were likely non-functional, that is to say, transcriptional noise.

This last concept is of course well-grounded. Struhl [7] calculated from first principles that more than 90% of the Pol II initiation events in yeast are noise in the sense of not having a ‘biological function’, by which he presumably meant not honed by natural selection in order to contribute to organismal fitness. Accuracy in any informational transfer process such as transcription comes at a cost and perfect accuracy is unattainable. In any case, the number of ‘mistakes’ surely increases with the number of opportunities to make them, particularly with genome size.
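To make concrete how much of the ‘functional’ versus ‘noise’ call rests on explicit cut-offs, the three exclusion rules attributed above to Salzberg and colleagues can be written as a small filter. This is only an illustrative sketch, not the CHESS pipeline: the record type, the field names, and especially the numeric value standing in for “high levels” are assumptions made here, since the text gives no figure for that threshold.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical record; all field names are invented for this sketch."""
    n_samples: int             # number of samples in which it was assembled
    max_level: float           # highest expression level seen in any sample
    n_exons: int               # exon count
    overlaps_known_gene: bool  # overlap with a known gene on either strand


MIN_SAMPLES = 10    # "fewer than ten samples", as stated in the text
HIGH_LEVEL = 100.0  # placeholder only: the text does not quantify "high levels"


def noise_reasons(t, min_samples=MIN_SAMPLES, high_level=HIGH_LEVEL):
    """Return the rules (if any) under which a noncoding transcript is excluded."""
    reasons = []
    # Rule 1: rarely assembled, unless present at high levels where it occurs.
    if t.n_samples < min_samples and t.max_level < high_level:
        reasons.append("rare and low-level")
    # Rule 2: single-exon transcripts are excluded.
    if t.n_exons == 1:
        reasons.append("single exon")
    # Rule 3: overlap with a known gene, on either strand.
    if t.overlaps_known_gene:
        reasons.append("overlaps known gene")
    return reasons


def is_called_noise(t, **thresholds):
    """A transcript is labelled 'noise' if any single rule applies."""
    return bool(noise_reasons(t, **thresholds))
```

Under this sketch, a two-exon transcript assembled in three samples at low levels is labelled noise by rule 1 alone, while the same transcript above the assumed `HIGH_LEVEL` is not. Shifting either threshold changes the tally, which is one reason functionalists might regard such tests as arbitrary.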

Addressing the functions of ncRNAs

Statistics and detection methods do matter, but the problems are deeper than that and not merely technical, as we might see from the following list of questions we could reasonably ask about any ncRNA and its ‘function’. If a short region localized at the 3′ end of a long ncRNA is under selection to interact with a specific site on another molecule, does that make the whole molecule ‘functional’? What if experiments showed that most of the upstream part could be harmlessly deleted? Would, by similar logic, the presence of one functional gene on a chromosome render the whole chromosome ‘functional’?

There are good arguments for something like trypanosomal pan-editing having arisen by ‘constructive neutral evolution’, never being under positive selection and always mildly deleterious to organism fitness, but now ineradicable [8]. Does that make the guide RNAs involved ‘functional’? Some would consider a trait’s function to be that effect for which it originally increased in frequency in an ancestral population, an explanation that would not apply here.

Mammalian genomes are mostly made up of transposable elements and their decay products. For some elements, transcription is vital for transposition, serving a selfish ‘function’. Is that also a function for us mammals, the element’s ‘hosts’? Are functions ascribable to different levels of selection all to be lumped together? By such logic, we might also declare that genes of viruses making us sick are part of our functional gene repertoire.

Sometimes it might be that a stretch of DNA ‘functions’ in spacing and chromosomal structuring: is any RNA accidentally transcribed from it also ‘functional’? Presence of the RNA is evidence of an essential function for the DNA, and its sequence might even be conserved (because that of the DNA is), but what might this say about the RNA? Kaikkonen and Adelman [9] very recently presented “evidence that the act of transcription and the presence of nascent RNA at a locus is often central to function, rather than specific ncRNA sequences or structures.” If the very act of making RNA contributes to fitness but the RNA made is, again, irrelevant, is the RNA ‘functional’?

Presumably RNAs that are lethal (for instance, by serving as a microRNA against an essential gene) have been weeded out by natural selection, so that all ncRNAs are at least not lethally dysfunctional. Is that minimum requirement enough? Mattick and other functionalists would argue that ncRNAs and especially lncRNAs represent evolutionary potential, being co-optable into a host of new regulatory roles. And unquestionably the evolutionary trajectory of complex cells is influenced (constrained?) by their content of ncRNAs. Is ‘looking ahead’ like this a function?

Humans are all different, phenotypically, and surely some of that is due to differences in the expression levels of different genes during development, some of which is influenced by ncRNAs. But if we do not in consequence have more or fewer children, is this a ‘function’? Any analysis of functionality in human genomes is incomplete if it fails to address the fact that many vertebrate genomes are very much larger, and (as far as we know) also extensively transcribed. How are the facts of comparative genomics to be accommodated?

Concluding remarks

Deeper than all these questions, but underlying the last in particular, is one about what we mean when we use the ‘F-word’ generally. When we talk of a trait’s ‘function’ do we mean ‘what it does’ or ‘why it is there’? Philosophers have written a lot on this, and the evolutionary biologist John Maynard Smith [10] expressed the difference very well when he wrote “…If we say that the function of the heart is to pump blood round the body, we do not mean merely that the heart does, as a matter of fact, pump blood. We mean that the heart evolved because it pumped blood; that is, those animals whose hearts were better pumps survived and left more descendants…” (emphasis mine). Philosophers have also pointed out that ecologists, developmental biologists, physiologists, and (I claim here) molecular biologists and genomicists tend to be satisfied with ‘what it does’ or causal role explanations, whereas evolutionary biologists such as Maynard Smith also require ‘why it’s there’ or selected effect rationales. There may be no absolute right or wrong here, and a good argument could be made for eliminating ‘function’ altogether and replacing it with one of those two concepts, whichever seems appropriate. But it is clearly wrong to use conclusions based on one to ‘refute’ hypotheses based on the other. This is what the publicity around ENCODE did, to the detriment of the credibility of genomic science. So we must be careful to say what we mean if we use the ‘F-word’. We cannot simply complain that such philosophical quibbling muddies the waters. They have never been clear!

References

1.  CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise.

Authors:  Mihaela Pertea; Alaina Shumate; Geo Pertea; Ales Varabyou; Florian P Breitwieser; Yu-Chi Chang; Anil K Madugundu; Akhilesh Pandey; Steven L Salzberg
Journal:  Genome Biol       Date:  2018-11-28       Impact factor: 13.583

Review 2.  A global view of genomic information--moving beyond the gene and the master regulator.

Authors:  John S Mattick; Ryan J Taft; Geoffrey J Faulkner
Journal:  Trends Genet       Date:  2009-11-26       Impact factor: 11.639

3.  Non-coding RNA: what is functional and what is junk?

Authors:  Alexander F Palazzo; Eliza S Lee
Journal:  Front Genet       Date:  2015-01-26       Impact factor: 4.599

4.  Genomics. ENCODE project writes eulogy for junk DNA.

Authors:  Elizabeth Pennisi
Journal:  Science       Date:  2012-09-07       Impact factor: 47.728

5.  On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE.

Authors:  Dan Graur; Yichen Zheng; Nicholas Price; Ricardo B R Azevedo; Rebecca A Zufall; Eran Elhaik
Journal:  Genome Biol Evol       Date:  2013       Impact factor: 3.416

6.  A Model-Based Approach for Identifying Functional Intergenic Transcribed Regions and Noncoding RNAs.

Authors:  John P Lloyd; Zing Tsung-Yeh Tsai; Rosalie P Sowers; Nicholas L Panchy; Shin-Han Shiu
Journal:  Mol Biol Evol       Date:  2018-06-01       Impact factor: 16.240

7.  Transcriptional noise and the fidelity of initiation by RNA polymerase II.

Authors:  Kevin Struhl
Journal:  Nat Struct Mol Biol       Date:  2007-02       Impact factor: 15.369

8.  Cell biology. Irremediable complexity?

Authors:  Michael W Gray; Julius Lukes; John M Archibald; Patrick J Keeling; W Ford Doolittle
Journal:  Science       Date:  2010-11-12       Impact factor: 47.728

Review 9.  Emerging Roles of Non-Coding RNA Transcription.

Authors:  Minna U Kaikkonen; Karen Adelman
Journal:  Trends Biochem Sci       Date:  2018-06-28       Impact factor: 13.807
