Kie Kyon Huang1, Jiawen Huang1, Jeanie Kar Leng Wu1, Minghui Lee1, Su Ting Tay1, Vikrant Kumar1, Kalpana Ramnarayanan1, Nisha Padmanabhan1, Chang Xu1, Angie Lay Keng Tan1, Charlene Chan2, Dennis Kappei2,3, Jonathan Göke4, Patrick Tan5,6,7,8.
1. Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore.
2. Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore.
3. Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117596, Singapore.
4. Genome Institute of Singapore, Singapore, 138672, Singapore.
5. Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore. gmstanp@duke-nus.edu.sg.
6. Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore. gmstanp@duke-nus.edu.sg.
7. Genome Institute of Singapore, Singapore, 138672, Singapore. gmstanp@duke-nus.edu.sg.
8. SingHealth/Duke-NUS Institute of Precision Medicine, National Heart Centre Singapore, Singapore, 169609, Singapore. gmstanp@duke-nus.edu.sg.
Abstract
BACKGROUND: Deregulated gene expression is a hallmark of cancer; however, most studies to date have analyzed short-read RNA sequencing data, which has inherent limitations. Here, we combine PacBio long-read isoform sequencing (Iso-Seq) and Illumina paired-end short-read RNA sequencing to comprehensively survey the transcriptome of gastric cancer (GC), a leading cause of global cancer mortality.
RESULTS: We performed full-length transcriptome analysis across 10 GC cell lines covering four major GC molecular subtypes (chromosomal unstable, Epstein-Barr positive, genome stable and microsatellite unstable). We identify 60,239 non-redundant full-length transcripts, of which more than 66% are novel compared to current transcriptome databases. Novel isoforms are more likely to be cell line and subtype specific, are expressed at lower levels, and have more exons and longer isoform and coding sequence lengths. Most novel isoforms use an alternate first exon and, compared to isoforms in other alternative splicing categories, are expressed at higher levels and exhibit higher variability. Collectively, we observe alternate promoter usage in 25% of detected genes, with the majority (84.2%) of known/novel promoter pairs exhibiting potential changes in their coding sequences. Mapping these alternate promoters to TCGA GC samples, we identify several cancer-associated isoforms, including novel variants of oncogenes. Tumor-specific transcript isoforms tend to alter protein coding sequences to a larger extent than other isoforms. Analysis of outcome data suggests that novel isoforms may impart additional prognostic information.
CONCLUSIONS: Our results provide a rich resource of full-length transcriptome data for deeper studies of GC and other gastrointestinal malignancies.
Keywords:
Alternative promoter; Alternative splicing; Gastric cancer; Iso-seq