
Recent advances in RNA sequence analysis.

Steven L Salzberg

Abstract

The latest high-throughput DNA sequencing technology can now be applied on a large scale to capture the complete set of mRNA transcripts in a cell, using a technique called RNA-seq. Although RNA-seq is only 2 years old, it has rapidly swept through the field of genomics, and it is now being used to analyze the transcriptomes of organisms ranging from bacteria to primates. The depth of sequencing allows researchers to quantify the level of expression of genes, to discover alternative isoforms in eukaryotic species, and even to characterize the operon structure of bacterial genomes.


Year:  2010        PMID: 21173855      PMCID: PMC2990453          DOI: 10.3410/B2-64

Source DB:  PubMed          Journal:  F1000 Biol Rep        ISSN: 1757-594X


Introduction and context

Sequencing the mRNA in a cell has been used as a high-throughput method for finding genes since the early days of the human genome project. Beginning in the early 1990s, the expressed sequence tag (EST) method was used to capture fragments of thousands of human genes [1] prior to the sequencing of the genome. EST sequencing relies on the fact that eukaryotic mRNAs are polyadenylated after transcription, so the long poly-A tract can be used to capture the transcripts via reverse transcription PCR (RT-PCR). The EST method was subsequently applied to many other species, and EST databases (notably dbEST) became a vital resource for genome annotation. Recently, a ‘next-gen’ version of EST sequencing has emerged, allowing researchers to capture and sequence mRNA at dramatically lower cost, and in far higher volume, than was ever possible with the EST method. The new RNA-seq methods [2-5] are being applied to a rapidly growing variety of species, cell types, and scientific questions, revealing far more about the transcriptomes of these species than was known just a few years ago. The field is advancing so rapidly that a brief review cannot cover all the work of the past 2 years; this review is just a sampling of a few highlights.

Major recent advances

Sultan et al. [6] analyzed approximately 8 million short reads and found that RNA-seq could detect 25% more genes than microarrays. About one-third of transcripts in their experiments mapped to genomic regions not annotated as genes. Of the 94,241 splice junctions they identified, 4096 were novel, and many of these indicated exon skipping events. This result has been amplified by subsequent studies that generated even more sequences and showed even larger numbers of novel splicing events. Trapnell et al. [7] generated approximately 430 million paired-end reads to recover 13,692 known isoforms from mouse myoblast cells, but also detected 12,712 novel isoforms, of which 7395 contained novel splice junctions while the rest represented novel combinations of known exons. This latter study also demonstrated the power of a new algorithm capable of detecting and quantifying alternative isoforms when aligning RNA-seq reads to a genome. In an RNA-seq study using liver RNA samples from humans, chimpanzees, and rhesus macaques, Blekhman et al. [8] found that alternative splicing events vary between closely related primates and also between the sexes within species. Wang et al. [9] generated approximately 600 million short reads from 15 cell types and found that 92-94% of human genes are alternatively spliced, and that many alternative splicing events are tissue-specific.

RNA-seq is also being used to study how genetic variation among individuals affects gene expression (expression quantitative trait loci, or eQTLs). Pickrell et al. [10] and Montgomery et al. [11] combined RNA-seq data and HapMap data from 69 Nigerian individuals and 60 Caucasian individuals, respectively, and both groups identified variants responsible for alternative splicing as well as variation in expression levels among individuals.

In single-celled organisms, RNA-seq can reveal novel insights about polycistronic transcripts. In the first transcriptome analysis of Trypanosoma brucei, thousands of splicing and polyadenylation sites were identified and many genes were found to be differentially expressed between the parasite's two life-cycle stages [12]. In prokaryotes, RNA-seq can provide an extremely detailed transcription map, at the single-base level, as has been shown recently in an archaeal species, Sulfolobus solfataricus, and in a pathogenic bacterium, Helicobacter pylori. In S. solfataricus, over 1000 transcriptional start sites were detected and 80 novel protein-coding genes were discovered [13]. In H. pylori, hundreds of transcriptional start sites within operons were found, as well as approximately 60 novel small RNA genes [14].
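
To put the junction and isoform counts quoted above in proportion, the short Python sketch below converts them to percentages. It uses only the numbers given in the text for Sultan et al. [6] and Trapnell et al. [7]; the helper name `share` is purely illustrative and does not come from either study.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Sultan et al. [6]: 4096 novel junctions out of 94,241 in total.
novel_junction_pct = share(4096, 94241)

# Trapnell et al. [7]: 7395 of the 12,712 novel isoforms contain novel junctions.
novel_isoform_junction_pct = share(7395, 12712)

# Trapnell et al. [7]: novel isoforms as a share of all isoforms reported
# (12,712 novel plus 13,692 known, as quoted in the text).
novel_isoform_pct = share(12712, 12712 + 13692)

print(f"Novel splice junctions [6]:            {novel_junction_pct:.1f}%")
print(f"Novel isoforms with new junctions [7]: {novel_isoform_junction_pct:.1f}%")
print(f"Novel isoforms among all isoforms [7]: {novel_isoform_pct:.1f}%")
```

On these figures, a little over 4% of junctions in the earlier study were novel, whereas nearly half of the isoforms in the deeper study were novel, and more than half of those involved previously unseen junctions.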

Future directions

The power of RNA-seq stems from its ability to generate deep coverage of the entire transcriptome of a cell with just a single run of a high-throughput sequencer, such as the Illumina HiSeq, which can produce up to 200 billion bases in a single run. The potential to characterize all genes, to capture alternative isoforms, and to measure differential expression has already been demonstrated in dozens of studies, but hundreds of species, and countless experimental conditions, are yet to be explored. Several groups have developed methods besides poly-A selection to capture all RNAs in a cell, such as random hexamer priming [13,15], which makes it possible to analyze prokaryotic transcriptomes or to look at noncoding RNA in eukaryotes. RNA-seq now appears likely to replace microarray technology in the coming years, as it seems to be not only more comprehensive but also much more accurate, particularly for transcripts with low expression levels [16]. As this new method becomes even more widely adopted, it should greatly expand our understanding of the complex interplay of genes in all phases of cell development.
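
As a rough illustration of what "deep coverage" means for a 200-billion-base run, the Python sketch below divides the run yield by a target size to get a mean fold coverage. The transcriptome size used here (100 million bases) is a hypothetical placeholder chosen only for the arithmetic, not a figure from the text, and the function name `mean_coverage` is likewise illustrative.

```python
def mean_coverage(sequenced_bases: float, target_length_bases: float) -> float:
    """Mean fold coverage: total sequenced bases divided by target length."""
    if target_length_bases <= 0:
        raise ValueError("target length must be positive")
    return sequenced_bases / target_length_bases


# Run yield quoted in the text for a single Illumina HiSeq run.
RUN_YIELD_BASES = 200e9

# ASSUMPTION: a hypothetical total transcriptome length of 100 Mb,
# used only to make the arithmetic concrete.
ASSUMED_TRANSCRIPTOME_BASES = 100e6

fold = mean_coverage(RUN_YIELD_BASES, ASSUMED_TRANSCRIPTOME_BASES)
print(f"Mean coverage under these assumptions: ~{fold:,.0f}x")
```

A mean figure like this overstates the coverage of rare transcripts, since reads are distributed according to expression level: highly expressed genes absorb most of the reads, and transcripts with low expression receive far fewer.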

References (16 in total)

1.  A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome.

Authors:  Marc Sultan; Marcel H Schulz; Hugues Richard; Alon Magen; Andreas Klingenhoff; Matthias Scherf; Martin Seifert; Tatjana Borodina; Aleksey Soldatov; Dmitri Parkhomchuk; Dominic Schmidt; Sean O'Keeffe; Stefan Haas; Martin Vingron; Hans Lehrach; Marie-Laure Yaspo
Journal:  Science       Date:  2008-07-03       Impact factor: 47.728

2.  The transcriptional landscape of the yeast genome defined by RNA sequencing.

Authors:  Ugrappa Nagalakshmi; Zhong Wang; Karl Waern; Chong Shou; Debasish Raha; Mark Gerstein; Michael Snyder
Journal:  Science       Date:  2008-05-01       Impact factor: 47.728

3.  Mapping and quantifying mammalian transcriptomes by RNA-Seq.

Authors:  Ali Mortazavi; Brian A Williams; Kenneth McCue; Lorian Schaeffer; Barbara Wold
Journal:  Nat Methods       Date:  2008-05-30       Impact factor: 28.547

4.  Stem cell transcriptome profiling via massive-scale mRNA sequencing.

Authors:  Nicole Cloonan; Alistair R R Forrest; Gabriel Kolle; Brooke B A Gardiner; Geoffrey J Faulkner; Mellissa K Brown; Darrin F Taylor; Anita L Steptoe; Shivangi Wani; Graeme Bethel; Alan J Robertson; Andrew C Perkins; Stephen J Bruce; Clarence C Lee; Swati S Ranade; Heather E Peckham; Jonathan M Manning; Kevin J McKernan; Sean M Grimmond
Journal:  Nat Methods       Date:  2008-05-30       Impact factor: 28.547

5.  The primary transcriptome of the major human pathogen Helicobacter pylori.

Authors:  Cynthia M Sharma; Steve Hoffmann; Fabien Darfeuille; Jérémy Reignier; Sven Findeiss; Alexandra Sittka; Sandrine Chabas; Kristin Reiche; Jörg Hackermüller; Richard Reinhardt; Peter F Stadler; Jörg Vogel
Journal:  Nature       Date:  2010-02-17       Impact factor: 49.962

6.  Mapping the Burkholderia cenocepacia niche response via high-throughput sequencing.

Authors:  D R Yoder-Himes; P S G Chain; Y Zhu; O Wurtzel; E M Rubin; James M Tiedje; R Sorek
Journal:  Proc Natl Acad Sci U S A       Date:  2009-02-20       Impact factor: 11.205

7.  Sex-specific and lineage-specific alternative splicing in primates.

Authors:  Ran Blekhman; John C Marioni; Paul Zumbo; Matthew Stephens; Yoav Gilad
Journal:  Genome Res       Date:  2009-12-15       Impact factor: 9.043

8.  Genome-wide analysis of mRNA abundance in two life-cycle stages of Trypanosoma brucei and identification of splicing and polyadenylation sites.

Authors:  Tim Nicolai Siegel; Doeke R Hekstra; Xuning Wang; Scott Dewell; George A M Cross
Journal:  Nucleic Acids Res       Date:  2010-04-12       Impact factor: 16.971

9.  Highly integrated single-base resolution maps of the epigenome in Arabidopsis.

Authors:  Ryan Lister; Ronan C O'Malley; Julian Tonti-Filippini; Brian D Gregory; Charles C Berry; A Harvey Millar; Joseph R Ecker
Journal:  Cell       Date:  2008-05-02       Impact factor: 41.582

10.  Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.

Authors:  Cole Trapnell; Brian A Williams; Geo Pertea; Ali Mortazavi; Gordon Kwan; Marijke J van Baren; Steven L Salzberg; Barbara J Wold; Lior Pachter
Journal:  Nat Biotechnol       Date:  2010-05-02       Impact factor: 54.908
