Jinfeng Liu, William Lee, Zhaoshi Jiang, Zhongqiang Chen, Suchit Jhunjhunwala, Peter M Haverty, Florian Gnad, Yinghui Guan, Houston N Gilbert, Jeremy Stinson, Christiaan Klijn, Joseph Guillory, Deepali Bhatt, Steffan Vartanian, Kimberly Walter, Jocelyn Chan, Thomas Holcomb, Peter Dijkgraaf, Stephanie Johnson, Julie Koeman, John D Minna, Adi F Gazdar, Howard M Stern, Klaus P Hoeflich, Thomas D Wu, Jeff Settleman, Frederic J de Sauvage, Robert C Gentleman, Richard M Neve, David Stokoe, Zora Modrusan, Somasekar Seshagiri, David S Shames, Zemin Zhang.
Abstract
Lung cancer is a highly heterogeneous disease in terms of both underlying genetic lesions and response to therapeutic treatments. We performed deep whole-genome sequencing and transcriptome sequencing on 19 lung cancer cell lines and three lung tumor/normal pairs. Overall, our data show that cell line models exhibit similar mutation spectra to human tumor samples. Smoker and never-smoker cancer samples exhibit distinguishable patterns of mutations. A number of epigenetic regulators, including KDM6A, ASH1L, SMARCA4, and ATAD2, are frequently altered by mutations or copy number changes. A systematic survey of splice-site mutations identified 106 splice site mutations associated with cancer specific aberrant splicing, including mutations in several known cancer-related genes. RAC1b, an isoform of the RAC1 GTPase that includes one additional exon, was found to be preferentially up-regulated in lung cancer. We further show that its expression is significantly associated with sensitivity to a MAP2K (MEK) inhibitor PD-0325901. Taken together, these data present a comprehensive genomic landscape of a large number of lung cancer samples and further demonstrate that cancer-specific alternative splicing is a widespread phenomenon that has potential utility as therapeutic biomarkers. The detailed characterizations of the lung cancer cell lines also provide genomic context to the vast amount of experimental data gathered for these lines over the decades, and represent highly valuable resources for cancer biology.
Year: 2012 PMID: 23033341 PMCID: PMC3514662 DOI: 10.1101/gr.140988.112
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. The mutation spectra and mutation rate of lung cancer genomes. (A) Lung cancer genomes have a few distinct patterns of mutation composition. Genomes from smokers tend to have a large fraction of C:G > A:T transversions, a known tobacco-related DNA-damage pattern (top panel), and have a larger number of filtered variations (bottom panel). In addition, the four cell lines with the most protein-altering SNVs all have mutations in at least one of the mismatch repair (MMR) genes. (B) CDS and UTRs have the lowest mutation rates among genomic features. Mutation rates (i.e., number of mutations per megabase) for different genomic features were calculated, and then normalized by the genome-wide mutation rates to obtain the relative mutation rates. Each dot in the plot represents one genome (blue, cell lines; red, tumors). (C) Mutation rates are negatively correlated with expression level. In each genome, genes were divided into three groups according to their expression level based on the RNA-seq data from the same sample: zero (RPKM = 0), low (0 < RPKM ≤ 1), and high (RPKM > 1). The mutation rate for each group was calculated, and then normalized by the average mutation rate for all genes in the genome to obtain relative mutation rates. Genes with high expression levels tend to have the lowest mutation rates in either the entire intragenic region (i.e., exons and introns, left panel) or introns only (right panel). This suggests that transcription-coupled DNA-repair mechanisms may play a role.
The boxes in the box-and-whisker plots represent the interquartile range between the first and third quartiles; the dashed lines (whiskers) extend to the most extreme data points which are no more than 1.5 times the interquartile range from the box.
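The normalizations described for panels B and C of Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and toy numbers are hypothetical, the genome-wide rate is assumed to be the pooled rate over the listed features, and the "average mutation rate for all genes" is assumed to be pooled mutations over pooled gene length.

```python
def relative_feature_rates(mutation_counts, feature_lengths_bp):
    """Mutations per Mb for each feature, divided by the genome-wide rate.

    Assumption: the listed features partition the genome, so the
    genome-wide rate is total mutations over total length.
    """
    total_mb = sum(feature_lengths_bp.values()) / 1e6
    genome_rate = sum(mutation_counts.values()) / total_mb
    return {
        feature: (mutation_counts[feature] / (feature_lengths_bp[feature] / 1e6)) / genome_rate
        for feature in mutation_counts
    }


def expression_group(rpkm):
    """Bin a gene as in panel C: zero (RPKM = 0), low (0 < RPKM <= 1), high (RPKM > 1)."""
    if rpkm == 0:
        return "zero"
    return "low" if rpkm <= 1 else "high"


def relative_rates_by_expression(genes):
    """genes: iterable of (rpkm, mutation_count, length_bp) tuples.

    Assumption: the per-group rate and the all-gene average are both
    pooled (summed mutations over summed length).
    """
    counts = {"zero": 0, "low": 0, "high": 0}
    lengths = {"zero": 0, "low": 0, "high": 0}
    for rpkm, n_mut, length_bp in genes:
        group = expression_group(rpkm)
        counts[group] += n_mut
        lengths[group] += length_bp
    overall = sum(counts.values()) / sum(lengths.values())
    return {g: (counts[g] / lengths[g]) / overall for g in counts if lengths[g] > 0}


# Toy data, invented purely for illustration.
feature_rates = relative_feature_rates(
    {"CDS": 5, "intron": 45, "intergenic": 50},
    {"CDS": 1_000_000, "intron": 4_000_000, "intergenic": 5_000_000},
)
group_rates = relative_rates_by_expression(
    [(0.0, 30, 1_000_000), (0.5, 20, 1_000_000), (5.0, 10, 1_000_000)]
)
print(feature_rates)
print(group_rates)
```

A relative rate below 1 means the feature or expression group is mutated less often than the genome (or gene set) as a whole, which is how the captions describe CDS, UTRs, and highly expressed genes.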
Figure 2. Frequently altered genes in lung cancer genomes. (A) The top 20 genes with the highest mutation rates in lung cancer cell lines include three genes (KRAS, TP53, and STK11) reported as frequently mutated in lung tumors in previous studies (Ding et al. 2008). (B) Profiles of single-nucleotide variation and copy number alterations in lung cancer cell lines and tumors for a subset of known cancer-related genes. (C) Profiles of single-nucleotide variation and copy number alterations in lung cancer cell lines and tumors for a subset of genes in epigenetic pathways.
Figure 3. Splice mutation in RB1 and the associated aberrant splicing event. (A) A splice site mutation is associated with an exon skipping event in RB1 in the tumor genome GS000000552. A novel mutation in the tumor suppressor gene RB1 alters the AG essential splice acceptor sequence just before exon 22. There are three RNA-seq reads spanning a novel exon–exon junction between exons 21 and 25, skipping the three exons in between. The resulting protein product has a 103-amino-acid in-frame deletion close to the C terminus of RB1, which is essential for the binding of RB1 to the E2F–DP transcription factor complexes. (B) Most cell cycle-related E2F target genes (Bracken et al. 2004) are up-regulated in sample GS000000552. For each gene, we obtained the expression values in matched normal and tumor samples from the patient, and calculated the log2 fold-change between tumor and normal. (C) Among the three tumor samples, E2F target genes are up-regulated the most in sample GS000000552. Each dot represents one of the known cell cycle-related E2F target genes (Bracken et al. 2004). P-values shown on the plot are derived from paired t-tests between GS000000552 and one of the other tumors. The result is consistent with the hypothesis that the truncated RB1 in this sample resulting from aberrant splicing is unable to bind to E2F and suppress the expression of its target genes. The boxes in the box-and-whisker plots represent the interquartile range between the first and third quartiles; the dashed lines (whiskers) extend to the most extreme data points which are no more than 1.5 times the interquartile range from the box.
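The per-gene log2 fold-change in panel B and the paired comparison in panel C of Figure 3 can be sketched as below. This is an illustration under stated assumptions: the function names and numbers are hypothetical, the pseudocount is an assumption added to avoid taking the log of zero (the caption does not say how zero expression was handled), and only the t statistic and degrees of freedom are computed. A P-value would additionally require the t distribution, for example from a statistics library.

```python
import math
import statistics


def log2_fold_change(tumor_expr, normal_expr, pseudocount=1.0):
    """log2(tumor / normal) for one gene.

    The pseudocount is an assumption for illustration, not a detail
    taken from the caption.
    """
    return math.log2((tumor_expr + pseudocount) / (normal_expr + pseudocount))


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for two equal-length lists.

    Pairs are matched by position, here one pair per E2F target gene.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Toy per-gene log2 fold-changes for two tumors (invented values).
tumor_a_lfc = [2.0, 3.0, 4.0]
tumor_b_lfc = [1.0, 1.0, 3.0]
t_stat, dof = paired_t_statistic(tumor_a_lfc, tumor_b_lfc)
example_lfc = log2_fold_change(7.0, 1.0)
print(example_lfc, t_stat, dof)
```

Pairing by gene means each E2F target serves as its own control, so the test asks whether the same genes are consistently more up-regulated in one tumor than in the other.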
Figure 4. Alternative splicing of RAC1 in lung cancer. (A) The RAC1b isoform, which includes exon 3b, is preferentially up-regulated in all three lung tumors in our study. (B) Up-regulation of the RAC1b isoform is confirmed by exon array data from an independent data set (GSE16534). Each dot represents a tissue sample. To account for differences in the expression of total RAC1, we calculated the difference between the exon 3b expression and total RAC1 expression, and compared the differences between normal and tumor samples. The analysis showed that RAC1b is significantly up-regulated in tumors (P = 0.00026, Student's t-test). (C) Cell lines sensitive to PD-0325901 have significantly higher expression of the RAC1b isoform (P = 0.019, Student's t-test), while no difference in RAC1 total expression was observed between resistant and sensitive cell lines. Each dot represents one cell line. The boxes in the box-and-whisker plots represent the interquartile range between the first and third quartiles; the dashed lines (whiskers) extend to the most extreme data points which are no more than 1.5 times the interquartile range from the box.
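The comparison in panel B of Figure 4 (exon 3b expression minus total RAC1 expression, compared between tumor and normal samples) can be sketched as follows. The function names and values are hypothetical, expression values are assumed to be on a log scale so that a difference behaves like a log ratio, and the equal-variance (pooled) form of Student's t-test is assumed. Only the t statistic and degrees of freedom are computed here.

```python
import math
import statistics


def exon_minus_total(samples):
    """samples: list of (exon_3b_expr, total_rac1_expr) pairs, one per sample.

    Returns exon 3b expression relative to total RAC1 for each sample.
    """
    return [exon - total for exon, total in samples]


def student_t_statistic(x, y):
    """Two-sample Student's t statistic with pooled variance, plus degrees of freedom."""
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("each group needs at least 2 observations")
    dof = nx + ny - 2
    pooled_var = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / dof
    std_err = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / std_err, dof


# Toy (exon 3b, total RAC1) values, invented for illustration only.
tumor_samples = [(7.0, 8.0), (6.0, 8.0), (5.0, 8.0)]
normal_samples = [(4.0, 8.0), (3.0, 8.0), (2.0, 8.0)]

tumor_diff = exon_minus_total(tumor_samples)
normal_diff = exon_minus_total(normal_samples)
rac1b_t, rac1b_dof = student_t_statistic(tumor_diff, normal_diff)
print(tumor_diff, normal_diff, rac1b_t, rac1b_dof)
```

Subtracting total RAC1 first keeps a sample with generally high RAC1 expression from being mistaken for one with a specific increase in the exon 3b isoform.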