Annotating conserved and novel features of primate transcriptomes using sequencing.

Philipp Khaitovich

Abstract

Recent high-throughput sequencing of chimpanzee brain and liver transcriptomes published in Genome Biology reveals multiple transcripts lost in the human genome and highlights the incompleteness of primate genome annotations.

Year:  2010        PMID: 20624272      PMCID: PMC2926776          DOI: 10.1186/gb-2010-11-7-125

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Research highlight

The completion of the human genome sequence was followed by sequencing of the genomes of closely related primate species, such as the chimpanzee and the rhesus macaque. The motivation was simple: if the genome provides the blueprint of an organism, comparisons between the human genome and the genomes of non-human primates should reveal the genomic features underlying the human phenotype. One problem with this approach, however, is that a genome is not really a blueprint of a phenotype, but rather a well-scrambled message in which functionally relevant sequences are lost in a sea of phenotypically neutral information.

A seemingly straightforward way to identify functional sequences is to determine which regions are transcribed. This is not a simple task, however, as the transcriptome varies greatly across cell types and changes dramatically across an organism's lifespan. Thus, in the past decade, a large effort was put into annotating the human transcriptome, mainly by converting transcripts into cDNA libraries and sequencing them by conventional Sanger sequencing. As a result, it became clear that, given enough sequencing coverage, almost any genomic sequence can be detected at the transcriptome level [1]. This is not entirely surprising, as human genes frequently contain long introns; moreover, RNA polymerase can generate spontaneous transcripts of no functional relevance. Still, this result indicated that dividing the genome into transcribed and non-transcribed parts to determine functionality was largely futile.

These cDNA sequencing projects also showed that the boundaries of most human genes, including transcription start and termination sites and the splicing patterns of internal exons, are rather fuzzy [2-6]. In addition, many of the identified transcripts and gene isoforms turned out to be rare. This does not, however, mean that they are functionally irrelevant, as such transcripts may have important roles in a limited number of cells in a tissue or at a specific stage of development.
Further, many important regulators, such as transcription factors, are expressed at low levels. As a result, the current human transcriptome annotation represents a trade-off between confidence and comprehensiveness and contains transcripts identified with different degrees of confidence. The difficulty in compiling such an annotation is best illustrated by the differences that exist between RefSeq, Ensembl, the University of California Santa Cruz (UCSC) Genome Browser, the Vega Genome Browser and an integrated database of human genes and transcripts (H-Invitational Database): comparing any two of these annotation databases, one finds an average overlap of only 60 to 70%.

Another way to determine functionally relevant transcripts is to require that the expression of a given transcript is conserved across species. Alternatively, if one is interested in loci important for the human phenotype, one could identify regions with human-specific transcription profiles. However, the transcriptome annotation of non-human primates is virtually non-existent, and what is present is based entirely on mapping the human annotation to the respective primate genomes. As the human transcriptome annotation itself is far from comprehensive and the quality of the non-human primate genomes is far worse than that of the human genome, such mapping-based annotation is not problem-free. Most importantly, even though this method might allow identification of transcripts present in humans and absent in the other primates, it does not allow identification of transcripts lost from the human lineage.

In this issue, Lucia Cavelier and colleagues [7] present a study that takes advantage of high-throughput sequencing technology to annotate genomic regions transcribed in the chimpanzee brain cortex and liver. High-throughput sequencing technology, introduced just a few years ago and increasing rapidly in its capacity, allows the sequencing of millions of short reads in a single run.
In their study, Cavelier and colleagues [7] used the ABI/SOLiD sequencing platform to generate over 500 million reads of 35 and 50 nucleotides in length from poly(A)+ RNA expressed in brain and liver tissue from two chimpanzees. Mapping the obtained sequences to the chimpanzee genome enabled them to identify transcribed regions independently of the existing annotation. They found that only about a third of the obtained reads mapped to known chimpanzee exons. This proportion is much lower than that found in human RNA sequencing studies, reflecting the poor quality of the chimpanzee genome annotation. Importantly, they were able to identify on the order of 350 genomic regions that are highly transcribed in the chimpanzee but completely absent from the current human genome assembly. Using the rhesus macaque genome as an outgroup, they found that approximately half of these regions were lost from the human lineage. In addition to these transcribed regions of as-yet unknown function, Cavelier and colleagues [7] identified several novel gene isoforms not annotated in humans and a putative novel gene from the ATP-binding cassette transporter family that is conserved between chimpanzee and mouse but lost from the human lineage.

These findings [7] add weight to the 'less is more' hypothesis of human evolution, which postulates that some human-specific features have evolved not through acquisition of novel genetic elements, but through functional loss of previously existing ones. The study from Cavelier and colleagues [7] clearly shows that human-specific loss of transcribed regions is not limited to annotated protein-coding genes, but is common among intergenic transcripts and non-coding RNA.
This finding is in good agreement with previous studies of the human and chimpanzee brain transcriptomes carried out using tiling arrays [8], high-throughput sequencing of expressed tags [9] and high-throughput sequencing of complete transcriptomes [10], which all indicate that a large proportion of human-specific transcription gain and loss originates in as-yet unannotated genomic regions. Thus, the current task is to reveal the functional properties of these novel transcripts, if such properties exist.

Importantly, the study [7] draws attention to the poor state of genome annotation in non-human primates. In humans, the use of high-throughput sequencing technology in transcriptome studies has already revealed much greater variability of gene transcript isoforms than previously appreciated [2,3]. In non-human primates, such as chimpanzees and rhesus macaques, both the known genome sequence information and, particularly, the genome annotations are in a far worse state than those of humans. The study of Cavelier and colleagues [7] clearly illustrates that careful characterization of human and non-human primate transcriptomes can uncover large numbers of genetic and transcriptional changes specific to humans.

Some of these changes will be responsible for the evolution of human-specific features, such as adaptation to a cooked, highly nutritious diet and unique social and cognitive abilities. Finding the genetic elements underlying these uniquely human features is important not only for our understanding of human evolution, but also for preventing dysfunctions of these features that may result in metabolic and cognitive disorders. The recent advances in high-throughput sequencing methodology provide us with powerful tools with which to characterize complete transcriptomes in multiple tissues and cell types across primate species, resulting in the comprehensive identification of the transcriptome features specific to humans.
The work of Cavelier and colleagues [7] is the first brave step in this direction.
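The outgroup reasoning mentioned above, in which the rhesus macaque genome is used to decide whether a region present in chimpanzee but absent in human was lost on the human lineage, can be sketched as a simple parsimony rule. This is a minimal illustration and not the pipeline used in [7]: the function name `classify_region`, the category labels and the toy regions are all hypothetical, and real analyses would also have to account for assembly gaps and alignment quality.

```python
from collections import Counter
from typing import Optional


def classify_region(in_chimp: bool, in_human: bool, in_macaque: Optional[bool]) -> str:
    """Assign a human-chimpanzee presence/absence difference to a lineage by parsimony.

    in_macaque is the outgroup state; None means the outgroup gives no usable call
    (for example, because of a gap in the assembly).
    """
    if in_chimp == in_human:
        return "no human-chimpanzee difference"
    if in_macaque is None:
        return "unresolved"
    if in_chimp:
        # Present in chimpanzee, absent in human: presence in the outgroup
        # implies the region is ancestral and was lost on the human lineage.
        return "lost in human" if in_macaque else "gained in chimpanzee"
    # Present in human, absent in chimpanzee: the mirror-image argument.
    return "lost in chimpanzee" if in_macaque else "gained in human"


# Hypothetical toy regions: (in chimpanzee, in human, in macaque outgroup).
toy_regions = {
    "region_A": (True, False, True),
    "region_B": (True, False, False),
    "region_C": (True, False, None),
    "region_D": (False, True, False),
}

calls = {name: classify_region(*state) for name, state in toy_regions.items()}
summary = Counter(calls.values())

for name, call in calls.items():
    print(f"{name}: {call}")
```

The rule assumes that a single gain or loss event is more likely than two independent events, which is why a region absent from both human and macaque is attributed to a gain in chimpanzee rather than to parallel losses.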
References

1.  Widespread mRNA polyadenylation events in introns indicate dynamic interplay between polyadenylation and splicing.

Authors:  Bin Tian; Zhenhua Pan; Ju Youn Lee
Journal:  Genome Res       Date:  2007-01-08       Impact factor: 9.043

2.  A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome.

Authors:  Marc Sultan; Marcel H Schulz; Hugues Richard; Alon Magen; Andreas Klingenhoff; Matthias Scherf; Martin Seifert; Tatjana Borodina; Aleksey Soldatov; Dmitri Parkhomchuk; Dominic Schmidt; Sean O'Keeffe; Stefan Haas; Martin Vingron; Hans Lehrach; Marie-Laure Yaspo
Journal:  Science       Date:  2008-07-03       Impact factor: 47.728

3.  Genome-wide analysis of mammalian promoter architecture and evolution.

Authors:  Piero Carninci; Albin Sandelin; Boris Lenhard; Shintaro Katayama; Kazuro Shimokawa; Jasmina Ponjavic; Colin A M Semple; Martin S Taylor; Pär G Engström; Martin C Frith; Alistair R R Forrest; Wynand B Alkema; Sin Lam Tan; Charles Plessy; Rimantas Kodzius; Timothy Ravasi; Takeya Kasukawa; Shiro Fukuda; Mutsumi Kanamori-Katayama; Yayoi Kitazume; Hideya Kawaji; Chikatoshi Kai; Mari Nakamura; Hideaki Konno; Kenji Nakano; Salim Mottagui-Tabar; Peter Arner; Alessandra Chesi; Stefano Gustincich; Francesca Persichetti; Harukazu Suzuki; Sean M Grimmond; Christine A Wells; Valerio Orlando; Claes Wahlestedt; Edison T Liu; Matthias Harbers; Jun Kawai; Vladimir B Bajic; David A Hume; Yoshihide Hayashizaki
Journal:  Nat Genet       Date:  2006-04-28       Impact factor: 38.330

4.  Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project.

Authors:  Ewan Birney; John A Stamatoyannopoulos; Anindya Dutta; Roderic Guigó; Thomas R Gingeras; Elliott H Margulies; Zhiping Weng; Michael Snyder; Emmanouil T Dermitzakis; Robert E Thurman; Michael S Kuehn; Christopher M Taylor; Shane Neph; Christoph M Koch; Saurabh Asthana; Ankit Malhotra; Ivan Adzhubei; Jason A Greenbaum; Robert M Andrews; Paul Flicek; Patrick J Boyle; Hua Cao; Nigel P Carter; Gayle K Clelland; Sean Davis; Nathan Day; Pawandeep Dhami; Shane C Dillon; Michael O Dorschner; Heike Fiegler; Paul G Giresi; Jeff Goldy; Michael Hawrylycz; Andrew Haydock; Richard Humbert; Keith D James; Brett E Johnson; Ericka M Johnson; Tristan T Frum; Elizabeth R Rosenzweig; Neerja Karnani; Kirsten Lee; Gregory C Lefebvre; Patrick A Navas; Fidencio Neri; Stephen C J Parker; Peter J Sabo; Richard Sandstrom; Anthony Shafer; David Vetrie; Molly Weaver; Sarah Wilcox; Man Yu; Francis S Collins; Job Dekker; Jason D Lieb; Thomas D Tullius; Gregory E Crawford; Shamil Sunyaev; William S Noble; Ian Dunham; France Denoeud; Alexandre Reymond; Philipp Kapranov; Joel Rozowsky; Deyou Zheng; Robert Castelo; Adam Frankish; Jennifer Harrow; Srinka Ghosh; Albin Sandelin; Ivo L Hofacker; Robert Baertsch; Damian Keefe; Sujit Dike; Jill Cheng; Heather A Hirsch; Edward A Sekinger; Julien Lagarde; Josep F Abril; Atif Shahab; Christoph Flamm; Claudia Fried; Jörg Hackermüller; Jana Hertel; Manja Lindemeyer; Kristin Missal; Andrea Tanzer; Stefan Washietl; Jan Korbel; Olof Emanuelsson; Jakob S Pedersen; Nancy Holroyd; Ruth Taylor; David Swarbreck; Nicholas Matthews; Mark C Dickson; Daryl J Thomas; Matthew T Weirauch; James Gilbert; Jorg Drenkow; Ian Bell; XiaoDong Zhao; K G Srinivasan; Wing-Kin Sung; Hong Sain Ooi; Kuo Ping Chiu; Sylvain Foissac; Tyler Alioto; Michael Brent; Lior Pachter; Michael L Tress; Alfonso Valencia; Siew Woh Choo; Chiou Yu Choo; Catherine Ucla; Caroline Manzano; Carine Wyss; Evelyn Cheung; Taane G Clark; James B Brown; Madhavan Ganesh; Sandeep Patel; Hari Tammana; 
Jacqueline Chrast; Charlotte N Henrichsen; Chikatoshi Kai; Jun Kawai; Ugrappa Nagalakshmi; Jiaqian Wu; Zheng Lian; Jin Lian; Peter Newburger; Xueqing Zhang; Peter Bickel; John S Mattick; Piero Carninci; Yoshihide Hayashizaki; Sherman Weissman; Tim Hubbard; Richard M Myers; Jane Rogers; Peter F Stadler; Todd M Lowe; Chia-Lin Wei; Yijun Ruan; Kevin Struhl; Mark Gerstein; Stylianos E Antonarakis; Yutao Fu; Eric D Green; Ulaş Karaöz; Adam Siepel; James Taylor; Laura A Liefer; Kris A Wetterstrand; Peter J Good; Elise A Feingold; Mark S Guyer; Gregory M Cooper; George Asimenos; Colin N Dewey; Minmei Hou; Sergey Nikolaev; Juan I Montoya-Burgos; Ari Löytynoja; Simon Whelan; Fabio Pardi; Tim Massingham; Haiyan Huang; Nancy R Zhang; Ian Holmes; James C Mullikin; Abel Ureta-Vidal; Benedict Paten; Michael Seringhaus; Deanna Church; Kate Rosenbloom; W James Kent; Eric A Stone; Serafim Batzoglou; Nick Goldman; Ross C Hardison; David Haussler; Webb Miller; Arend Sidow; Nathan D Trinklein; Zhengdong D Zhang; Leah Barrera; Rhona Stuart; David C King; Adam Ameur; Stefan Enroth; Mark C Bieda; Jonghwan Kim; Akshay A Bhinge; Nan Jiang; Jun Liu; Fei Yao; Vinsensius B Vega; Charlie W H Lee; Patrick Ng; Atif Shahab; Annie Yang; Zarmik Moqtaderi; Zhou Zhu; Xiaoqin Xu; Sharon Squazzo; Matthew J Oberley; David Inman; Michael A Singer; Todd A Richmond; Kyle J Munn; Alvaro Rada-Iglesias; Ola Wallerman; Jan Komorowski; Joanna C Fowler; Phillippe Couttet; Alexander W Bruce; Oliver M Dovey; Peter D Ellis; Cordelia F Langford; David A Nix; Ghia Euskirchen; Stephen Hartman; Alexander E Urban; Peter Kraus; Sara Van Calcar; Nate Heintzman; Tae Hoon Kim; Kun Wang; Chunxu Qu; Gary Hon; Rosa Luna; Christopher K Glass; M Geoff Rosenfeld; Shelley Force Aldred; Sara J Cooper; Anason Halees; Jane M Lin; Hennady P Shulha; Xiaoling Zhang; Mousheng Xu; Jaafar N S Haidar; Yong Yu; Yijun Ruan; Vishwanath R Iyer; Roland D Green; Claes Wadelius; Peggy J Farnham; Bing Ren; Rachel A Harte; Angie S Hinrichs; Heather 
Trumbower; Hiram Clawson; Jennifer Hillman-Jackson; Ann S Zweig; Kayla Smith; Archana Thakkapallayil; Galt Barber; Robert M Kuhn; Donna Karolchik; Lluis Armengol; Christine P Bird; Paul I W de Bakker; Andrew D Kern; Nuria Lopez-Bigas; Joel D Martin; Barbara E Stranger; Abigail Woodroffe; Eugene Davydov; Antigone Dimas; Eduardo Eyras; Ingileif B Hallgrímsdóttir; Julian Huppert; Michael C Zody; Gonçalo R Abecasis; Xavier Estivill; Gerard G Bouffard; Xiaobin Guan; Nancy F Hansen; Jacquelyn R Idol; Valerie V B Maduro; Baishali Maskeri; Jennifer C McDowell; Morgan Park; Pamela J Thomas; Alice C Young; Robert W Blakesley; Donna M Muzny; Erica Sodergren; David A Wheeler; Kim C Worley; Huaiyang Jiang; George M Weinstock; Richard A Gibbs; Tina Graves; Robert Fulton; Elaine R Mardis; Richard K Wilson; Michele Clamp; James Cuff; Sante Gnerre; David B Jaffe; Jean L Chang; Kerstin Lindblad-Toh; Eric S Lander; Maxim Koriabine; Mikhail Nefedov; Kazutoyo Osoegawa; Yuko Yoshinaga; Baoli Zhu; Pieter J de Jong
Journal:  Nature       Date:  2007-06-14       Impact factor: 49.962

5.  Both noncoding and protein-coding RNAs contribute to gene expression evolution in the primate brain.

Authors:  Courtney C Babbitt; Olivier Fedrigo; Adam D Pfefferle; Alan P Boyle; Julie E Horvath; Terrence S Furey; Gregory A Wray
Journal:  Genome Biol Evol       Date:  2010-01-18       Impact factor: 3.416

6.  Intergenic and repeat transcription in human, chimpanzee and macaque brains measured by RNA-Seq.

Authors:  Augix Guohua Xu; Liu He; Zhongshan Li; Ying Xu; Mingfeng Li; Xing Fu; Zheng Yan; Yuan Yuan; Corinna Menzel; Na Li; Mehmet Somel; Hao Hu; Wei Chen; Svante Pääbo; Philipp Khaitovich
Journal:  PLoS Comput Biol       Date:  2010-07-01       Impact factor: 4.475

7.  Identification of novel exons and transcribed regions by chimpanzee transcriptome sequencing.

Authors:  Anna Wetterbom; Adam Ameur; Lars Feuk; Ulf Gyllensten; Lucia Cavelier
Journal:  Genome Biol       Date:  2010-07-23       Impact factor: 13.583

8.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.

Authors:  Cole Trapnell; Brian A Williams; Geo Pertea; Ali Mortazavi; Gordon Kwan; Marijke J van Baren; Steven L Salzberg; Barbara J Wold; Lior Pachter
Journal:  Nat Biotechnol       Date:  2010-05-02       Impact factor: 54.908

9.  Functionality of intergenic transcription: an evolutionary comparison.

Authors:  Philipp Khaitovich; Janet Kelso; Henriette Franz; Johann Visagie; Thomas Giger; Sabrina Joerchel; Ekkehard Petzold; Richard E Green; Michael Lachmann; Svante Pääbo
Journal:  PLoS Genet       Date:  2006-08-28       Impact factor: 5.917

10.  Alternative isoform regulation in human tissue transcriptomes.

Authors:  Eric T Wang; Rickard Sandberg; Shujun Luo; Irina Khrebtukova; Lu Zhang; Christine Mayr; Stephen F Kingsmore; Gary P Schroth; Christopher B Burge
Journal:  Nature       Date:  2008-11-27       Impact factor: 49.962
