A global view of cancer-specific transcript variants by subtractive transcriptome-wide analysis.

Chunjiang He, Fang Zhou, Zhixiang Zuo, Hanhua Cheng, Rongjia Zhou.

Abstract

BACKGROUND: Alternative pre-mRNA splicing (AS) plays a central role in generating complex proteomes and influences development and disease. However, the regulation and etiology of AS in human tumorigenesis are not well understood.
METHODOLOGY/PRINCIPAL FINDINGS: A Basic Local Alignment Search Tool (BLAST) database was constructed for the expressed sequence tags (ESTs) from all available databases of human cancer and normal tissues. An insertion or deletion in an EST/EST alignment was used to identify alternatively spliced transcripts. Alignment of the ESTs with the genomic sequence was further used to confirm AS. Alternatively spliced transcripts in each tissue were then subtractively cross-screened to obtain tissue-specific variants. We systematically identified and characterized cancer/tissue-specific and alternatively spliced variants in the human genome based on a global view. We identified 15,093 cancer-specific variants of 9,989 genes from 27 types of human cancers and 14,376 normal tissue-specific variants of 7,240 genes from 35 normal tissues, which cover the main types of human tumors and normal tissues. Approximately 70% of these transcripts are novel. These data were integrated into a database, HCSAS (http://202.114.72.39/database/human.html, pass:68756253). Moreover, we observed that the cancer-specific AS of both oncogenes and tumor suppressor genes is associated with specific cancer types. Cancer shows a preference in the selection of alternative splice sites and in the utilization of alternative splicing types.
CONCLUSIONS/SIGNIFICANCE: These features of human cancer, together with the discovery of large numbers of novel splice forms for cancer-associated genes, suggest an important and global role of cancer-specific AS during human tumorigenesis. We advise the use of cancer-specific alternative splicing as a potential source of new diagnostic, prognostic, predictive, and therapeutic tools for human cancer. The global view of cancer-specific AS is not only useful for exploring the complexity of the cancer transcriptome but also broadens the scope of clinical research.

Year:  2009        PMID: 19266097      PMCID: PMC2648985          DOI: 10.1371/journal.pone.0004732

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

It remains unknown how both intron removal and exon rearrangement are precisely regulated to produce correct proteomes in a cell type- or developmental stage-specific manner. Alternative splicing, the process by which the exons of primary transcripts can be spliced into different arrangements to produce structurally and functionally distinct mRNA and protein variants, is the most widely used mechanism to enhance the protein diversity of higher eukaryotic organisms. It has been estimated that 35%–94% of all human genes appear to undergo alternative splicing [1]–[7], suggesting that this mechanism has a major role in generating protein diversity. As sequence data continue to be generated from projects at an ever-increasing rate, the need for mining the data and constructing a repository for transcriptome information continues to grow as well. In many pathological conditions, aberrantly spliced pre-mRNAs are generated because they escape the quality control mechanisms within cells (e.g. the nonsense mediated mRNA decay pathway) and are, therefore, translated into aberrant proteins involved in human diseases, including cancer [8]–[11]. It is estimated that approximately 60% of disease mutations in the human genome are splicing mutations [12], [13]. Currently, the analysis of cancer-specific alternative splicing is a promising step forward and potential source of new clinical diagnostic, prognostic, and therapeutic strategies. Evidence is accumulating that supports a connection between tumorigenesis and alternative splicing [14]–[18]. Using bioinformatic approaches, Xu and Lee discovered cancer-specific splice variants in 316 genes [19]. We previously identified testis-/testis cancer-specific splice variants using bioinformatic and experimental approaches [20]. 
Despite growing interest in the impact of alternative splicing on various biological processes, our understanding of it remains fragmentary, and its general regulatory mechanisms, especially in tumorigenesis, are not well known [21], [22]. Nevertheless, cancer-specific splice variants are believed to be involved in the etiopathogenesis of many diseases, and some might serve as diagnostic or prognostic markers. Moreover, directly targeting the variant proteins is probably an advantageous way of correcting cancer-associated splicing alterations. For example, a cancer-restricted splice variant protein could be used as the target for specific antibodies conjugated to tumor cell toxins for cancer treatment. The role of cancer-specific AS in etiopathogenesis, and all related applications, need to be explored further. To advance our understanding of the biological significance of alternative splicing in human cancers, it is essential to systematically identify cancer-specific splicing events at the transcriptome level. In the present study, we performed a genome-wide analysis of alternative splicing in human cancer and normal tissues using an intersection/subtractive model consisting of the following steps: 1) identifying insertions or deletions in EST/EST alignments of expressed sequence tags (ESTs) to detect alternatively spliced transcripts, based on a previously described method [2]; 2) aligning the ESTs to the genome to confirm the transcripts; and 3) obtaining the tissue-specific, alternatively spliced variants by subtractively cross-screening the alternatively spliced transcripts in each tissue.
Our results reveal distinctive patterns of cancer-specific alternative splicing and identify a large number of cancer- and tissue-specific splicing isoforms, providing a large-scale, global view of human cancer-specific alternative splicing and a potential source of new clinical diagnostic, prognostic, and therapeutic strategies for human cancer.

Materials and Methods

Data sources and filtration

Human EST data for both cancerous and normal tissues were drawn from the Cancer Genome Anatomy Project (CGAP) (http://cgap.nci.nih.gov/Tissues/LibraryFinder). The CGAP collects EST libraries from all over the world and provides good tissue information. All available EST libraries for both human cancer and normal tissues were downloaded from the CGAP libraries, Mammalian Gene Collection libraries, and Open Reading Frame EST Sequencing libraries. To avoid mixing multiple tissues, libraries labeled 'pooled' were excluded, because pooling obscures tissue classification. For normal tissue, ESTs were classified according to developmental stage, and libraries without this information were not used. All EST and library data used for the different tissues are listed in Tables 1 and 2.
Table 1

Numbers of libraries and ESTs in cancers.

Cancer types (27 types) | Libraries | ESTs
adrenal_cancer | 3 | 10431
bone_marrow_leukemia | 21 | 43943
brain_glioma | 43 | 75546
brain_meningioma | 7 | 860
brain_cancer | 213 | 39845
breast_cancer | 792 | 143423
cervical_cancer | 32 | 48606
chondrosarcoma | 15 | 49638
colorectal_cancer | 803 | 185632
esophageal_cancer | 17 | 15039
germ_cell_cancer | 30 | 161682
head_and_neck_cancer | 291 | 93978
kidney_cancer | 87 | 74290
liver_cancer | 62 | 117276
lung_cancer | 220 | 126775
lymphoma | 6 | 12458
muscle_tissue_cancer | 26 | 83411
ovarian_cancer | 162 | 98087
pancreas_insulinoma | 1 | 33046
pancreatic_cancer | 22 | 82994
primitive_neuroectodermal_cancer_of_CNS | 23 | 72677
prostate_cancer | 169 | 168151
prostatic_intraepithelial_neoplasia | 5 | 9569
retinoblastoma | 3 | 51568
skin_cancer | 36 | 137803
stomach_cancer | 244 | 94950
uterus_cancer | 110 | 46624
Total | 3,443 | 2,078,302
Table 2

Numbers of libraries and ESTs in normal tissues.

Tissue type (35 tissues) | Libraries | ESTs
bone-adult | 2 | 1965
brain-adult | 368 | 130619
brain-fetus | 73 | 174889
brain-infant | 11 | 73726
colon-adult | 133 | 25757
colon-fetus | 1 | 5
eye-adult | 33 | 100173
eye-fetus | 7 | 19659
heart-adult | 10 | 3979
heart-fetus | 13 | 62681
kidney-fetus | 16 | 13947
liver-adult | 11 | 29818
liver-fetus | 22 | 142750
lung-adult | 92 | 44457
lung-fetus | 18 | 35607
mammary-gland-adult | 331 | 68603
muscle-adult | 10 | 70211
muscle-fetus | 4 | 2235
ovary-adult | 6 | 8126
pancreas-adult | 10 | 24759
pancreas-fetus | 2 | 5545
peripheral-nerve-adult | 1 | 6571
peripheral-nerve-juvenile | 1 | 9482
pituitary-gland-adult | 6 | 8520
placenta-adult | 358 | 268277
prostate-adult | 127 | 53220
skin-infant | 3 | 10319
spleen-adult | 3 | 1841
spleen-fetus | 1 | 1332
stomach-adult | 67 | 10417
stomach-fetus | 2 | 9
testis-adult | 157 | 30549
thyroid-adult | 78 | 13314
uterus-adult | 10 | 35026
vascular-adult | 5 | 8451
Total | 1,992 | 1,496,839
All collected data were then processed in three steps: repeat sequence masking to remove simple repeats in the dataset (program, RepeatMasker; repeat database, Repbase; girinst server: www.girinst.org), vector and contamination masking to clean the vector sequences (program, crossmatch; vector database, UniVec_Core; National Center for Biotechnology Information ftp server: ftp://ftp.ncbi.nih.gov/), and a final cleaning of short and low-quality sequences (program, seqclean from the egassembler server: http://egassembler.hgc.jp). Alu repeats were included, and the filtered ESTs were used for the subsequent analysis.
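The final cleaning step can be pictured with a minimal Python sketch. It is not the seqclean implementation: the function names, the minimum length of 100 bp, and the 50% masked-base cutoff are hypothetical values chosen for illustration, and masked bases are assumed to be written as N or X.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_est(seq, min_len=100, max_masked_frac=0.5):
    """Return True if a masked EST is long and informative enough to keep.

    min_len and max_masked_frac are hypothetical thresholds, not values
    reported in the paper. Masked bases are assumed to be N or X.
    """
    if len(seq) < min_len:
        return False
    masked = sum(1 for base in seq.upper() if base in "NX")
    return masked / len(seq) <= max_masked_frac


def filter_ests(fasta_text, **thresholds):
    """Return {header: sequence} for the ESTs that pass keep_est."""
    return {h: s for h, s in parse_fasta(fasta_text) if keep_est(s, **thresholds)}
```

Calling `filter_ests` on the masked FASTA text of one library would return only the ESTs that survive both checks; in practice the thresholds would need to follow whatever settings the cleaning program was actually run with.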

Computational procedures to identify cancer/tissue-specific alternative splicing

A basic local alignment search tool (BLAST) database was constructed for the ESTs of each tissue. Alternative splicing was analyzed based on a previous method [2]. Transcripts specific to tissue T were identified based on an intersection/subtractive model:

$$S_T = A_T - (A_T \cap A_O)$$

where $S_T$ is the set of alternatively spliced transcripts specific to tissue T, $A_T$ is all transcripts in tissue T, and $A_O$ is all transcripts in the other tissues ($\cap$, intersection). Briefly, the three steps were as follows:
1) Tissue T's EST dataset was BLASTed against itself. The e-value was set to 1e-30. Gaps (insertions or deletions) in the ESTs were identified after alignment. Parameters for identifying alternative splicing: gap length, 10 bp; nucleotide identity, 95%.
2) Tissue T's ESTs were BLASTed against the ESTs of the other tissues. Parameters were the same as in step 1.
3) Subtractive ESTs were identified as tissue T-specific ESTs by insertion/deletion comparisons after BLAST.
Computer programs were written in Perl.
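The intersection/subtractive model can be sketched with Python sets. This is an illustration, not the authors' Perl code: the function names are hypothetical, transcripts are represented by arbitrary hashable labels, and the 10 bp gap length and 95% identity are assumed here to act as minimum thresholds.

```python
def has_splice_gap(gap_lengths, identity, min_gap=10, min_identity=95.0):
    """True if an EST/EST alignment suggests alternative splicing.

    gap_lengths: lengths (bp) of insertions/deletions in the alignment.
    identity: percent nucleotide identity of the aligned regions.
    Treating 10 bp and 95% as minimum cutoffs is an assumption.
    """
    return identity >= min_identity and any(g >= min_gap for g in gap_lengths)


def tissue_specific(transcripts_by_tissue, tissue):
    """Transcripts of `tissue` that are absent from every other tissue.

    Computes: all_in_tissue minus (all_in_tissue intersect all_in_others).
    """
    in_tissue = set(transcripts_by_tissue[tissue])
    in_others = set().union(
        *(v for k, v in transcripts_by_tissue.items() if k != tissue)
    )
    return in_tissue - (in_tissue & in_others)
```

With a toy input such as `{"liver_cancer": {"g1:a", "g1:b"}, "liver-adult": {"g1:a"}}`, `tissue_specific(..., "liver_cancer")` keeps only the label that does not occur in the other tissue. In the real analysis, deciding whether two ESTs represent the same transcript is itself done by insertion/deletion comparison after BLAST, which this sketch takes as given.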

EST/genomic sequence alignments, chromosome mapping, and splice site analysis

To decrease errors in EST alignments and determine the chromosomal loci of each gene, we localized ESTs to genomic sequences using the BLAST-like alignment tool (BLAT; http://genome.ucsc.edu). We used the default parameters and selected the best-scoring results. The exon position on the chromosome was recorded for each transcript and used to determine splice sites and gene structure. Splice sites for both 5′ and 3′ exon/intron boundaries were aligned online via http://weblogo.berkeley.edu/logo.cgi. We allowed an error of 10 bp in the exon/intron boundary. Comparison of the EST/genomic alignments allowed two possible errors to be checked: (i) whether a candidate EST assigned to a gene was not on the same chromosome as that gene, and (ii) whether a candidate EST assigned to a gene was not at the same locus on the chromosome. These errors arose mainly from EST sequencing errors, pseudogenes, and multiple-copy genes. Both cases were excluded as false positives from the final database.
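How exon coordinates from an EST/genome alignment lead to splice sites can be sketched as follows. The coordinate convention (0-based, end-exclusive, plus strand) and the function names are assumptions made for this illustration; only the 10 bp boundary tolerance comes from the text.

```python
def intron_dinucleotides(genome, exons):
    """Return donor-acceptor pairs such as 'GT-AG', one per intron.

    genome: chromosome sequence (plus strand) as a string.
    exons: list of (start, end) tuples, 0-based, end-exclusive,
           sorted by start, as recorded from the EST/genome alignment.
    """
    pairs = []
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        intron = genome[left_end:right_start].upper()
        if len(intron) >= 4:  # need two bases at each end of the intron
            pairs.append(f"{intron[:2]}-{intron[-2:]}")
    return pairs


def boundaries_match(pos_a, pos_b, tolerance=10):
    """True if two exon/intron boundaries agree within the allowed error."""
    return abs(pos_a - pos_b) <= tolerance
```

For a transcript aligned to the minus strand, the intron sequence would need to be reverse-complemented before reading the dinucleotides; the sketch leaves that out.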

Function classification of alternative splicing

Each alternatively spliced EST was BLASTed against the RefSeq mRNA database (expectation value, 1e-30) to identify the corresponding gene. Using PANTHER (http://www.pantherdb.org/tools/genexAnalysis.jsp), these genes were clustered by gene ontology (GO) process. We also searched the Entrez Gene Database to verify and correct our results.
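Assigning each EST to a gene amounts to keeping its best RefSeq hit under the expectation cutoff. The sketch below assumes the hits are available in the standard 12-column tab-separated BLAST layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, e-value, bit score); whether the authors used this output format is not stated, and the function name is hypothetical.

```python
import csv
import io


def best_hits(tabular_text, max_evalue=1e-30):
    """Map each query EST to its best-scoring hit with e-value <= max_evalue.

    Returns {query: (subject, evalue, bitscore)}. Queries with no hit
    under the cutoff are omitted.
    """
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

The subject accessions returned here would then be mapped to gene IDs and passed to the GO classification step.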

Alternative splicing database construction

We entered all prediction results into a local alternative splicing database. This database was built with MySQL and programmed in Perl with CGI. Information such as gene ID, gene structure, EST accession, mRNA accession, gene information, and exon location on the chromosome was collected in the database.
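A possible layout for such a database is sketched below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table and column name is hypothetical; the actual HCSAS schema is not described in the text.

```python
import sqlite3

# Hypothetical schema: one row per tissue-specific transcript,
# plus one row per exon with its chromosomal location.
SCHEMA = """
CREATE TABLE transcript (
    est_accession  TEXT PRIMARY KEY,
    gene_id        INTEGER,
    mrna_accession TEXT,
    tissue         TEXT NOT NULL,
    is_cancer      INTEGER NOT NULL,
    chromosome     TEXT
);
CREATE TABLE exon (
    est_accession TEXT NOT NULL REFERENCES transcript(est_accession),
    exon_start    INTEGER NOT NULL,
    exon_end      INTEGER NOT NULL
);
"""


def build_db():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def transcripts_for_gene(conn, gene_id):
    """Return (est_accession, tissue) rows for one gene, sorted by accession."""
    rows = conn.execute(
        "SELECT est_accession, tissue FROM transcript "
        "WHERE gene_id = ? ORDER BY est_accession",
        (gene_id,),
    )
    return rows.fetchall()
```

Keeping exons in their own table lets gene structure be rebuilt per transcript with a single join, which is one reasonable way to serve the gene-structure pages the database provides.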

Results

HCSAS: A database for cancer-specific alternative splicing

To analyze cancer-specific alternative splicing, we carefully classified all available EST libraries into 35 distinct normal tissue classes and 27 types of cancer to avoid mixing multiple tissues. Our final classification consisted of 1,992 libraries with 1,496,839 ESTs for normal human samples and 3,443 libraries with 2,078,302 ESTs for cancer samples (Tables 1 and 2). Through computationally subtractive analysis, we detected 15,093 cancer-specific transcripts in 9,989 genes from the 27 types of cancer, and 14,376 normal tissue-specific transcripts in 7,240 genes from the 35 tissues (Tables 3 and 4), which cover the main types of human tumors and tissues. The number of cancer-specific transcripts per gene ranged from 1.00 to 1.69 across cancer types, with an average of 1.51, whereas the number of normal tissue-specific transcripts per gene ranged from 1.00 to 6.00 across tissues, with an average of 1.99 (Tables 3 and 4), indicating fewer cancer-specific alternative splicing events per gene in cancer than tissue-specific events in normal tissues.
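The averages quoted above follow directly from the totals, as this short check shows (the helper name is ours; the counts are the totals reported in the text):

```python
def transcripts_per_gene(transcripts, genes):
    """Average number of specific transcripts per gene, to two decimals."""
    return round(transcripts / genes, 2)


cancer_avg = transcripts_per_gene(15093, 9989)   # cancer-specific totals
normal_avg = transcripts_per_gene(14376, 7240)   # normal tissue-specific totals
print(cancer_avg, normal_avg)  # prints 1.51 1.99
```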
Table 3

Numbers of cancer-specific AS transcripts and their genes.

Cancer types (27 types) | Genes | Transcripts | Transcripts/Gene
adrenal_cancer | 61 | 80 | 1.31
bone_marrow_leukemia | 237 | 356 | 1.50
brain_glioma | 485 | 720 | 1.48
brain_meningioma | 1 | 1 | 1.00
brain_cancer | 22 | 30 | 1.36
breast_cancer | 550 | 757 | 1.38
cervical_cancer | 316 | 428 | 1.35
chondrosarcoma | 239 | 352 | 1.47
colorectal_cancer | 397 | 578 | 1.46
esophageal_cancer | 175 | 226 | 1.29
germ_cell_cancer | 1307 | 2167 | 1.66
head_and_neck_cancer | 135 | 182 | 1.35
kidney_cancer | 467 | 716 | 1.53
liver_cancer | 950 | 1410 | 1.48
lung_cancer | 605 | 880 | 1.45
lymphoma | 16 | 21 | 1.31
muscle_tissue_cancer | 692 | 1044 | 1.51
ovarian_cancer | 351 | 512 | 1.46
pancreas_insulinoma | 45 | 76 | 1.69
pancreatic_cancer | 426 | 643 | 1.51
primitive_neuroectodermal_cancer_of_CNS | 659 | 1018 | 1.54
prostate_cancer | 92 | 111 | 1.21
prostatic_intraepithelial_neoplasia | 11 | 18 | 1.64
retinoblastoma | 434 | 705 | 1.62
skin_cancer | 939 | 1557 | 1.66
stomach_cancer | 249 | 326 | 1.31
uterus_cancer | 128 | 179 | 1.40
Total | 9,989 | 15,093 | 1.51
Table 4

Numbers of normal tissue-specific AS transcripts and their genes.

Tissue types (35 tissues) | Genes | Transcripts | Transcripts/Gene
bone-adult | 1 | 2 | 2
brain-adult | 924 | 1613 | 1.75
brain-fetus | 2231 | 5545 | 2.49
brain-infant | 53 | 72 | 1.36
colon-adult | 17 | 22 | 1.29
colon-fetus | 1 | 2 | 2.00
eye-adult | 582 | 907 | 1.56
eye-fetus | 58 | 76 | 1.31
heart-adult | 6 | 9 | 1.50
heart-fetus | 100 | 146 | 1.46
kidney-fetus | 110 | 142 | 1.29
liver-adult | 335 | 961 | 2.87
liver-fetus | 292 | 506 | 1.73
lung-adult | 78 | 160 | 2.05
lung-fetus | 110 | 152 | 1.38
mammary-adult | 41 | 57 | 1.39
muscle-adult | 250 | 384 | 1.54
muscle-fetus | 18 | 27 | 1.50
ovary-adult | 19 | 26 | 1.37
pancreas-adult | 28 | 39 | 1.39
pancreas-fetus | 23 | 33 | 1.43
peripheral-nerve-adult | 18 | 23 | 1.28
peripheral-nerve-juvenile | 18 | 23 | 1.28
pituitary-gland-adult | 28 | 84 | 3.00
placenta-adult | 1567 | 2902 | 1.85
prostate-adult | 102 | 162 | 1.59
skin-infant | 99 | 121 | 1.22
spleen-adult | 2 | 2 | 1.00
spleen-fetus | 1 | 1 | 1.00
stomach-adult | 3 | 5 | 1.67
stomach-fetus | 1 | 6 | 6.00
testis-adult | 47 | 73 | 1.55
thyroid-adult | 16 | 20 | 1.25
uterus-adult | 51 | 63 | 1.2
vascular-adult | 10 | 10 | 1.00
Total | 7,240 | 14,376 | 1.99
To facilitate future studies and referencing of alternatively spliced genes in both human cancer and normal tissues, we constructed a human cancer- and normal tissue-specific alternative splicing database (HCSAS) based on our analysis. It is divided into two parts: cancer-specific (15,093 transcripts) and normal tissue-specific (14,376 transcripts) alternative splicing. Of these cancer- or tissue-specific AS transcripts, approximately 70% are novel isoforms. For example, in brain cancer, alternative splicing of the aminoacylase-1 gene (ACY1), a peptidase M20 family member, deletes a domain and produces a brain cancer-specific transcript (Figure 1a), and alternative deletion of exon 3 of the SRP19 gene produces a breast cancer-specific transcript (Figure 1b). Similarly, cancer-specific isoforms were detected in liver cancer, lung cancer, and prostate cancer by our subtractive screening (Figure 1c–e).
Figure 1

A schematic representation of cancer-specific alternative gene splicing.

(a) Brain cancer (gene ACY1), (b) breast cancer (SRP19), (c) liver cancer (CDK5), (d) lung cancer (CDKN1A), and (e) prostate cancer (SMS). Cancer-specific isoforms are shown at the bottom of each panel. The biological processes of these transcripts (GO process) are indicated on the right. Deleted domains are shown with blue arrows. Arrows with a right angle indicate the start codon, ATG.

Furthermore, we systematically identified cancer-specific transcripts in both oncogenes and tumor suppressors. Thirty-nine oncogene isoforms and 38 tumor suppressor gene isoforms with cancer-specific AS events were detected (Table 5). For example, we identified a lung cancer-specific transcript in the oncogene RAF1 with a deletion of the Raf-like Ras-binding domain, a uterus cancer-specific transcript in the oncogene FOS (Figure 2a), a retinoblastoma-specific transcript in the tumor suppressor GLTSCR2, and a skin cancer-specific transcript in the tumor suppressor EMP3 (Figure 2b).
Table 5

Oncogenes and tumor suppressors with cancer-specific AS events.

Gene ID | Symbol | Gene Description | AS
Oncogenes
25 | ABL1 | v-abl Abelson murine leukemia viral oncogene homolog 1 | 1
3726 | JUNB | jun B proto-oncogene | 2
7409 | VAV1 | vav 1 oncogene | 1
6757 | SSX2 | synovial sarcoma, X breakpoint 2 | 3
2130 | EWSR1 | Ewing sarcoma breakpoint region 1 | 3
2241 | FER | fer (fps/fes related) tyrosine kinase (phosphoprotein NCP94) | 2
369 | ARAF | v-raf murine sarcoma 3611 viral oncogene homolog | 1
4613 | MYCN | v-myc myelocytomatosis viral related oncogene | 2
2534 | FYN | FYN oncogene related to SRC, FGR, YES | 2
727735 | unassigned | similar to TBC1 domain family member 3 | 1
51513 | ETV7 | ets variant gene 7 (TEL2 oncogene) | 2
5894 | RAF1 | v-raf-1 murine leukemia viral oncogene homolog 1 | 2
4193 | MDM2 | Mdm2, transformed 3T3 cell double minute 2, p53 binding protein | 1
4609 | MYC | v-myc myelocytomatosis viral oncogene homolog | 1
2353 | FOS | v-fos FBJ murine osteosarcoma viral oncogene homolog | 3
7410 | VAV2 | vav 2 oncogene | 1
4194 | MDM4 | Mdm4, transformed 3T3 cell double minute 4, p53 binding protein | 1
2118 | ETV4 | ets variant gene 4 (E1A enhancer binding protein, E1AF) | 3
598 | BCL2L1 | BCL2-like 1 | 3
55885 | LMO3 | LIM domain only 3 (rhombotin-like 2) | 1
3265 | HRAS | v-Ha-ras Harvey rat sarcoma viral oncogene homolog | 2
4893 | NRAS | neuroblastoma RAS viral (v-ras) oncogene homolog | 1
Total | | | 39
Tumor suppressors
5934 | RBL2 | retinoblastoma-like 2 (p130) | 1
3482 | IGF2R | insulin-like growth factor 2 receptor | 1
5925 | RB1 | retinoblastoma 1 (including osteosarcoma) | 1
54984 | unassigned | PIN2-interacting protein 1 | 1
4017 | LOXL2 | lysyl oxidase-like 2 | 3
29997 | GLTSCR2 | glioma tumor suppressor candidate region gene 2 | 3
2014 | EMP3 | epithelial membrane protein 3 | 1
672 | BRCA1 | breast cancer 1, early onset | 2
54879 | ST7L | suppression of tumorigenicity 7 like | 2
51147 | ING4 | inhibitor of growth family, member 4 | 1
7982 | ST7 | suppression of tumorigenicity 7 | 1
51566 | ARMCX3 | armadillo repeat containing, X-linked 3 | 4
84695 | LOXL3 | lysyl oxidase-like 3 | 2
79961 | DENND2D | DENN/MADD domain containing 2D | 2
7157 | TP53 | tumor protein p53 (Li-Fraumeni syndrome) | 1
7248 | TSC1 | tuberous sclerosis 1 | 1
54768 | HYDIN | hydrocephalus inducing homolog | 3
581 | BAX | BCL2-associated X protein | 1
1026 | CDKN1A | cyclin-dependent kinase inhibitor 1A (p21, Cip1) | 5
1029 | CDKN2A | cyclin-dependent kinase inhibitor 2A | 1
10263 | CDK2AP2 | CDK2-associated protein 2 | 1
Total | | | 38
Figure 2

A schematic representation of cancer-specific alternative gene splicing.

(a) Oncogene, (b) tumor suppressor gene. The alternative splicing of RAF1 generates a lung cancer-specific transcript, whereas the alternative splicing of FOS produces a uterus cancer-specific transcript. Tumor suppressor GLTSCR2 is alternatively spliced to produce two retinoblastoma-specific transcripts and EMP3 to generate a skin cancer-specific transcript. Deleted domains are shown with blue arrows. Arrows with a right angle indicate the start codon, ATG.

The HCSAS database presents a global overview of cancer-specific alternative splicing in humans and is essential for understanding tumorigenesis at a systematic level. The main information in this database includes the specific alternative splicing in both cancer and normal tissues, gene ID, gene structure, splicing sites, chromosome localization, DNA and protein sequences linked with the NCBI website, and GO process, function, and subcellular localization. An example page set shows the details of an adrenal cancer gene, FDPS (Figure 3). The HCSAS database can be accessed at http://202.114.72.39/database/human.html.
Figure 3

A database of cancer- and normal tissue-specific alternative splicing.

An example page set from the database shows the details of an adrenal cancer gene, FDPS. The information includes the specific alternative splicing of both cancer and normal tissues, gene ID, gene structure, splicing sites, chromosome localization, DNA and protein sequences linked with the NCBI website, and GO process, function, and subcellular localization.

Biased utilization of alternative splicing types in cancer

An examination of cancer-specific alternative splicing revealed a biased distribution of alternative splicing types in cancer. Both alternative 3′ splice sites and alternative 5′ splice sites were used more often in cancer, whereas intron retention and cassette alternative exons made up a lower proportion of events in cancer than in normal tissues (Figure 4b). Moreover, the distribution of alternative splicing types differs among cancer types (Figure 4a). For example, in liver cancer, breast cancer, and prostate cancer, intron retention decreased and cassette alternative exons increased significantly, whereas in uterus cancer and skin cancer, cassette alternative exons markedly decreased.
Figure 4

The frequencies (percentages) of the five types of cancer- and normal tissue-specific alternative splicing.

(a) 16 types of human cancer and 17 normal tissues, (b) the average values for tumors and normal tissues. The five colors indicate the five types of tissue-specific alternative splicing: cassette alternative exon, alternative 5′ splice site, alternative 3′ splice site, intron retention, and mutually exclusive alternative exons. Yellowish regions indicate frequencies over 30%.
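The five event types can be distinguished from exon coordinates alone. The following sketch compares two isoforms of one gene and names the type of a single difference between them. It is a simplified illustration, not the classification procedure used in the study: it assumes plus-strand, 0-based, end-exclusive (start, end) exon tuples, internal exons only, and exactly one event per isoform pair; the function name is hypothetical.

```python
def classify_event(ref, alt):
    """Name the alternative splicing type that distinguishes two isoforms."""
    only_ref = sorted(set(ref) - set(alt))
    only_alt = sorted(set(alt) - set(ref))

    # Intron retention: one exon in an isoform spans two exons of the other.
    for longer, shorter in ((only_alt, only_ref), (only_ref, only_alt)):
        if len(longer) == 1 and len(shorter) == 2:
            span = longer[0]
            first, second = shorter
            if span[0] == first[0] and span[1] == second[1]:
                return "intron retention"

    if len(only_ref) == 1 and len(only_alt) == 1:
        (ref_start, ref_end), (alt_start, alt_end) = only_ref[0], only_alt[0]
        if ref_start == alt_start:
            # Same exon start, different exon end: the donor site moved.
            return "alternative 5' splice site"
        if ref_end == alt_end:
            # Same exon end, different exon start: the acceptor site moved.
            return "alternative 3' splice site"
        if ref_end <= alt_start or alt_end <= ref_start:
            # Two non-overlapping exons, each used by only one isoform.
            return "mutually exclusive alternative exons"
        return "other"

    # Cassette exon: a whole exon is present in one isoform only.
    if len(only_ref) + len(only_alt) == 1:
        return "cassette alternative exon"
    return "other"
```

Counting the returned labels per tissue and dividing by the total would give frequencies of the kind plotted in Figure 4.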

Preference in the selection of alternative splice sites in cancer

To explore the preference/diversification of alternative splice sites in cancer, we analyzed all splice sites in the 27 types of cancer and 35 normal tissues by comparing each EST with its genomic sequence and mapping it onto the chromosome. We detected five basic donor-acceptor splice sites: GT-AG, CT-AC, GC-AG, GG-AG, and GT-GG, of which GT-AG is the dominant site. All other combinations were classified as rare splice sites. We found that cancer uses rare splice sites and GT-AG more frequently, but CT-AC less frequently, than normal tissues (Figure 5a, b). Moreover, the selection of splice sites differs among cancer types (Figure 5c). For example, CT-AC sites are seldom used in breast cancer, liver cancer, lung cancer, and prostate cancer; in liver cancer, the 5′ sites of rare splice sites are almost all AA.
Figure 5

Percentages of the types of alternative splice sites.

The splice sites include GT-AG, GC-AG, GG-AG, GT-GG, and the others, shown (a) in human cancer and (b) in normal tissues. (c) Percentage distribution of the splice sites in five types of cancer and normal tissues (brain, breast, lung, liver, and prostate).
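Percentages like those in Figure 5 can be obtained by tallying donor-acceptor pairs. The sketch below groups every pair outside the five basic sites named in the text into a single "rare" class; the function name and the toy input are ours, not data from the study.

```python
from collections import Counter

# The five basic donor-acceptor sites named in the text.
BASIC_SITES = ("GT-AG", "CT-AC", "GC-AG", "GG-AG", "GT-GG")


def site_percentages(pairs):
    """Return {site: percent} with all non-basic pairs pooled as 'rare'."""
    counts = Counter(p if p in BASIC_SITES else "rare" for p in pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: 100.0 * n / total for site, n in counts.items()}


# Toy example: 6 GT-AG, 2 CT-AC, and 2 pairs outside the basic set.
toy = ["GT-AG"] * 6 + ["CT-AC"] * 2 + ["AA-TG", "AT-AC"]
print(site_percentages(toy))
```

Running the same tally separately on cancer and normal tissue pairs gives the two distributions that the comparison in the text relies on.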

Association of cancer-specific alternative splicing of both oncogenes and tumor suppressor genes with cancer

Because both oncogenes and tumor suppressors are thought to be vital factors in tumorigenesis, we sought to identify their cancer-specific variants and their possible involvement in cancer. We observed that oncogenes with cancer-specific AS are more often present in ovarian cancer (6 oncogenes) and muscle cancer (5 oncogenes), whereas tumor suppressor genes with cancer-specific AS are more frequent in germ cell cancer (6), skin cancer (5), and primitive neuroectodermal cancer (5) (Figure 6). Some oncogenes and tumor suppressors with cancer-specific alternative splicing, such as EWSR1, CDKN1A, and GLTSCR2, are present in several types of cancer. Moreover, neither oncogenes nor tumor suppressors with cancer-specific AS were detected in brain cancer, prostate cancer, adrenal cancer, or lymphoma. This distribution bias implies that the cancer-specific alternative splicing of both oncogenes and tumor suppressor genes is associated with specific cancer types.
Figure 6

Distribution of oncogenes and tumor suppressors with cancer-specific alternative splicing in cancer.

Blue squares indicate oncogenes, red squares indicate tumor suppressors, and yellow squares show both oncogenes and tumor suppressors.

Biological relevance of the cancer-specific transcripts in the diversification of protein functions

The cancer-specific transcripts were classified based on gene function by searching the RefSeq database and GO. We classified 15,093 cancer-specific transcripts from 9,989 genes into 15 function groups. Protein metabolism and modification, and nucleic acid metabolism are the most prevalent functional processes in cancer. However, the function groups of these cancer-specific transcripts differ in different cancers. For example, the least common process in breast cancer is pre-mRNA processing, whereas the function groups of cell communication and lipid, fatty acid, and steroid metabolism are seldom found in prostate cancer (Figure 7).
Figure 7

Biological processes of alternatively spliced transcripts specific to cancer.

The five cancer types are brain, breast, liver, lung, and prostate cancer. The numbers indicate the percentage of each process in the given cancer. The GO process classification is based on PANTHER (http://www.pantherdb.org/tools/genexAnalysis.jsp).

Discussion

The complexity of the transcriptome has been underestimated. In this paper, we described the transcriptome-wide identification and characterization of cancer-specific and alternatively spliced variants in human cancer based on a global view of cancer-specific alternative splicing developed by subtractive transcriptome-wide analysis. Based on an intersection/subtractive model, we have developed an analysis method for precisely screening cancer-specific alternative splicing. The EST sequences were aligned first, compared with their genomic sequences, and then mapped onto chromosomes. These procedures eliminated many EST errors, pseudogene, and multiple-copy/repeat gene problems when data were from diverse EST databases. Finally, the alternatively spliced transcripts were subject to the subtractive screening of a tissue versus all other tissues, and these analyses finally yielded cancer-specific transcripts. We identified a large number of cancer- / normal tissue-specific transcripts. Beyond all doubt, this is an abundant resource for research and the development of new diagnostic, prognostic, predictive, and therapeutic tools against human cancer. Furthermore, these resources are integrated into an available database. The HCSAS database presents a global overview of cancer-specific alternative splicing in humans and is essential for understanding tumorigenesis at a systematic level. There are two main approaches for the global analysis of alternative splicing. First, based on the availability of sequenced genomes and large databases of sequenced transcripts (ESTs and cDNAs), alternative splicing events may be searched through reciprocal transcript alignments and alignments to genomic sequences. Several analyses in this manner have been reported [6], [23]–[29]. Because of its major limitation of EST coverage bias, a microarray-based technology has been developed to search for the alternative splicing events [3], [30]–[36]. 
Large sets of oligonucleotide probes may be designed specifically for individual exons and/or splice junction sequences, which allow the identification of new AS events. Here we have further developed a systematic method to search for cancer- or tissue-specific AS events in transcriptomes based on the intersection/subtractive screening analyses of transcriptomes, which is especially useful for identifying cancer/tissue-specific variants. Using this method, large numbers of cancer-specific isoforms were identified for the main human cancers. Nevertheless, these transcripts need to be further confirmed for their cancer/tissue specialization. RT-PCR technology and/or microarrays may be useful screening tools for this analysis. Based on the transcriptome-wide analysis, we did observe special patterns of cancer-specific alternative splicing. 1) Less cancer-specific AS events occur in cancer compared to normal tissues. 2) Cancer possesses distribution bias for alternative splicing types. 3) Cancer uses rare splice sites and GT-AG more frequently, but less CT-AC compared to normal tissues. 4) The selection of splice sites differs between different kinds of cancer. 5) The cancer-specific alternative splicing of both oncogenes and tumor suppressor genes is associated with the specific cancer type. And finally, the functional groups of these cancer-specific transcripts differ in different cancers, indicating that individual cancers prefer combination controls of pathways in preference of using AS in tumorigenesis. These special features of human cancers indicate that 1) the cellular splicing machinery is changed during the transformation from normal to cancerous, 2) alternative splicing plays an important role during tumorigenesis, and 3) individual cancers have unique regulatory combinations at the alternative splicing level, which further support the prediction that approximately 60% of disease mutations in the human genome are splicing mutations [12], [13]. 
Our data include many novel splice forms of cancer-associated genes and alternative splicing patterns in cancer, and they suggest a significant new direction for human cancer research. We strongly advise the use of cancer-specific alternative splicing as a potential source of new diagnostic, prognostic, predictive, and therapeutic tools against human cancer. The global view of cancer-specific AS is not only useful for exploring the complexity of the cancer transcriptome but also broadens the scope of clinical research.
  36 in total

1.  A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes.

Authors:  H X Liu; L Cartegni; M Q Zhang; A R Krainer
Journal:  Nat Genet       Date:  2001-01       Impact factor: 38.330

2.  EST comparison indicates 38% of human mRNAs contain possible alternative splice forms.

Authors:  D Brett; J Hanke; G Lehmann; S Haase; S Delbrück; S Krueger; J Reich; P Bork
Journal:  FEBS Lett       Date:  2000-05-26       Impact factor: 4.124

3.  A genomic view of alternative splicing.

Authors:  Barmak Modrek; Christopher Lee
Journal:  Nat Genet       Date:  2002-01       Impact factor: 38.330

4.  Profiling alternative splicing on fiber-optic arrays.

Authors:  Joanne M Yeakley; Jian-Bing Fan; Dennis Doucet; Lin Luo; Eliza Wickham; Zhen Ye; Mark S Chee; Xiang-Dong Fu
Journal:  Nat Biotechnol       Date:  2002-04       Impact factor: 54.908

5.  Selecting for functional alternative splices in ESTs.

Authors:  Zhengyan Kan; David States; Warren Gish
Journal:  Genome Res       Date:  2002-12       Impact factor: 9.043

6.  Frequent alternative splicing of human genes.

Authors:  A A Mironov; J W Fickett; M S Gelfand
Journal:  Genome Res       Date:  1999-12       Impact factor: 9.043

7.  Genome-wide detection of testis- and testicular cancer-specific alternative splicing.

Authors:  Chunjiang He; Zhixiang Zuo; Hengling Chen; Liao Zhang; Fang Zhou; Hanhua Cheng; Rongjia Zhou
Journal:  Carcinogenesis       Date:  2007-08-27       Impact factor: 4.944

8.  Insights into the connection between cancer and alternative splicing.

Authors:  Eddo Kim; Amir Goren; Gil Ast
Journal:  Trends Genet       Date:  2007-12-03       Impact factor: 11.639

Review 9.  Alternative splicing: multiple control mechanisms and involvement in human disease.

Authors:  Javier F Cáceres; Alberto R Kornblihtt
Journal:  Trends Genet       Date:  2002-04       Impact factor: 11.639

Review 10.  Finding signals that regulate alternative splicing in the post-genomic era.

Authors:  Andrea N Ladd; Thomas A Cooper
Journal:  Genome Biol       Date:  2002-10-23       Impact factor: 13.583

