BACKGROUND: RNA sequencing has opened new avenues for the study of transcriptome composition. Significant evidence has accumulated showing that the human transcriptome contains in excess of a hundred thousand different transcripts. However, it is still not clear to what extent this diversity prevails when considering the relative abundances of different transcripts from the same gene. RESULTS: Here we show that, in a given condition, most protein coding genes have one major transcript expressed at a significantly higher level than the others, that in human tissues the major transcripts contribute almost 85 percent of the total mRNA from protein coding loci, and that often the same major transcript is expressed in many tissues. We detect a high degree of overlap between the set of major transcripts and a recently published set of alternatively spliced transcripts that are predicted, on the basis of proteomic data, to be translated. Thus, we hypothesize that although some minor transcripts may play a functional role, the major ones are likely to be the main contributors to the proteome. However, we still detect a non-negligible fraction of protein coding genes for which the major transcript does not code for a protein. CONCLUSIONS: Overall, our findings suggest that the transcriptome from protein coding loci is dominated by one transcript per gene and that not all the transcripts that contribute to transcriptome diversity are equally likely to contribute to protein diversity. This observation can help to prioritize candidate targets in proteomics research and to predict the functional impact of the detected changes in variation studies.
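The two quantities reported in the RESULTS (a major transcript per gene, and the share of total mRNA contributed by major transcripts) can be illustrated with a minimal sketch. The abstract does not state the exact criteria used, so this sketch assumes that the major transcript is simply the most abundant transcript of a gene in one condition; the function names, the input layout, and the toy abundance values are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def major_transcripts(abundances):
    """Pick the most abundant transcript per gene (an assumed definition).

    abundances: iterable of (gene_id, transcript_id, abundance) tuples
    for a single condition, e.g. one tissue.
    """
    by_gene = defaultdict(list)
    for gene, transcript, abundance in abundances:
        by_gene[gene].append((transcript, abundance))

    result = {}
    for gene, entries in by_gene.items():
        total = sum(a for _, a in entries)
        if total == 0:
            continue  # gene not expressed in this condition
        ranked = sorted(entries, key=lambda e: e[1], reverse=True)
        top_id, top = ranked[0]
        second = ranked[1][1] if len(ranked) > 1 else 0.0
        result[gene] = {
            "major": top_id,
            # share of the gene's own expression carried by the major transcript
            "share": top / total,
            # how strongly the major transcript dominates the runner-up
            "fold_over_second": top / second if second > 0 else float("inf"),
        }
    return result


def major_fraction_of_total(abundances):
    """Fraction of all transcript abundance contributed by major transcripts."""
    abundances = list(abundances)
    total = sum(a for _, _, a in abundances)
    majors = major_transcripts(abundances)
    major_sum = sum(
        a for gene, transcript, a in abundances
        if gene in majors and majors[gene]["major"] == transcript
    )
    return major_sum / total if total > 0 else 0.0


# Hypothetical toy data: (gene, transcript, abundance in arbitrary units)
toy = [
    ("geneA", "t1", 90.0),
    ("geneA", "t2", 10.0),
    ("geneB", "t3", 50.0),
    ("geneB", "t4", 30.0),
    ("geneB", "t5", 20.0),
]

print(major_transcripts(toy))
print(major_fraction_of_total(toy))
```

Running the same calculation per tissue and comparing the `major` entries across tissues would be one way to check how often the same transcript dominates in several tissues, again under the assumed definition above.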