Jianbo Tian1, Zhihua Wang2, Shufang Mei1, Nan Yang1, Yang Yang1, Juntao Ke1, Ying Zhu1, Yajie Gong1, Danyi Zou1, Xiating Peng1, Xiaoyang Wang1, Hao Wan1, Rong Zhong1, Jiang Chang1, Jing Gong1,3, Leng Han4, Xiaoping Miao1.
Abstract
Alternative splicing (AS) is a widespread process that increases structural transcript variation and proteome diversity. Aberrant splicing patterns are frequently observed in cancer initiation, progression, prognosis and therapy. Increasing evidence has demonstrated that AS events can be modulated by genetic variants. The identification of splicing quantitative trait loci (sQTLs), genetic variants that affect AS events, might represent an important step toward fully understanding the contribution of genetic variants to disease development. However, no database has yet been developed to systematically analyze sQTLs across multiple cancer types. Using genotype data from The Cancer Genome Atlas and corresponding AS values calculated by TCGASpliceSeq, we developed a computational pipeline to identify sQTLs from 9 026 tumor samples in 33 cancer types. In total, we identified 4 599 598 sQTLs across all cancer types. We further performed survival analyses and identified 17 072 sQTLs associated with patient overall survival times. Furthermore, using genome-wide association study (GWAS) catalog data, we identified 1 180 132 sQTLs overlapping with known GWAS linkage disequilibrium regions. Finally, we constructed a user-friendly database, CancerSplicingQTL (http://www.cancersplicingqtl-hust.com/), for users to conveniently browse, search and download data of interest. This database provides an informative sQTL resource for further characterizing the potential functional roles of SNPs that control transcript isoforms in human cancer.
Year: 2019 PMID: 30329095 PMCID: PMC6324030 DOI: 10.1093/nar/gky954
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Identification of sQTLs in the CancerSplicingQTL database. (A) The definition of Percent Spliced In (PSI) values (20). PSI is the ratio of reads indicating the presence of a transcript element to the total reads covering the event. In this example, the PSI value is 0.6, indicating that exon 2 is included in approximately 60% of the transcripts in the sample. (B) The types of splice events analyzed in SplicingQTL. (C) Genotype data collection and processing. (D) Covariates included in sQTL mapping. (E) Collection and processing of splice event values. (F) Identification of sQTLs, survival-associated sQTLs and GWAS-related sQTLs.
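The PSI definition in panel A can be illustrated with a minimal sketch. It assumes that the "total reads covering the event" is the sum of reads supporting inclusion and reads supporting exclusion of the element; the function name and the read counts below are hypothetical and chosen only to reproduce the 0.6 value from the caption.

```python
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Return PSI = inclusion reads / total reads covering the event.

    Assumption: total coverage is inclusion + exclusion reads.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        # PSI is undefined when no read covers the splice event.
        raise ValueError("no reads cover the splice event")
    return inclusion_reads / total


# Hypothetical counts: 60 reads support inclusion of exon 2, 40 support skipping it.
psi = percent_spliced_in(60, 40)
print(f"PSI = {psi:.2f}")
```

With these illustrative counts the exon is included in roughly 60% of the transcripts, matching the example in the caption.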
Figure 2. sQTL statistics. (A) The cancer types included in the study. (B) The positive correlation between the number of sQTLs and the sample size. (C) The distribution of sQTLs. Each cyan dot indicates an sQTL plotted according to its distance from the corresponding AS event and the statistical significance of its association with AS (–log10 P-value). The red line indicates the density of sQTLs according to their distance from the corresponding AS event. (D) Bar plot indicating the proportions of sQTLs affecting different AS types (AA: alternate acceptor site, AD: alternate donor site, AP: alternate promoter, AT: alternate terminator, ES: exon skip and RI: retained intron).
Overview of sQTLs in each cancer type included in SplicingQTL
| Cancer type | Disease full name | No. of samples | No. of genotypes | No. of splicing events | sQTLs | Affected splicing events | sQTL pairs | Survival sQTLs | GWAS sQTLs |
|---|---|---|---|---|---|---|---|---|---|
| ACC | Adrenocortical carcinoma | 77 | 3567953 | 26620 | 17752 | 913 | 24950 | 7 | 4930 |
| BLCA | Bladder urothelial carcinoma | 406 | 4183896 | 32125 | 168597 | 6180 | 289420 | 157 | 44333 |
| BRCA | Breast invasive carcinoma | 1090 | 2746175 | 38428 | 253767 | 11961 | 506672 | 64 | 64008 |
| CESC | Cervical squamous cell carcinoma and endocervical adenocarcinoma | 250 | 4272427 | 33443 | 118989 | 4847 | 190429 | 412 | 31143 |
| CHOL | Cholangiocarcinoma | 36 | 4012151 | 31208 | 64 | 9 | 64 | 0 | 5 |
| COAD | Colon adenocarcinoma | 285 | 4491421 | 27466 | 152518 | 6048 | 255470 | 294 | 39233 |
| DLBC | Lymphoid neoplasm diffuse large B-cell lymphoma | 48 | 4845460 | 26277 | 4445 | 206 | 5641 | 0 | 1254 |
| ESCA | Esophageal carcinoma | 180 | 4463210 | 43937 | 138960 | 5324 | 214082 | 764 | 36443 |
| GBM | Glioblastoma multiforme | 150 | 4556997 | 38904 | 126023 | 4724 | 197274 | 817 | 33604 |
| HNSC | Head and neck squamous cell carcinoma | 499 | 4247759 | 35648 | 236904 | 8109 | 418356 | 698 | 60692 |
| KICH | Kidney chromophobe | 66 | 3771773 | 39171 | 25251 | 1329 | 34571 | 388 | 6542 |
| KIRC | Kidney renal clear cell carcinoma | 527 | 4579516 | 39696 | 325766 | 10887 | 600508 | 493 | 80279 |
| KIRP | Kidney renal papillary cell carcinoma | 290 | 4894174 | 33438 | 162228 | 6001 | 264080 | 1115 | 41681 |
| LAML | Acute myeloid leukemia | 122 | 5120270 | 29804 | 35478 | 1348 | 51024 | 152 | 11042 |
| LGG | Lower grade glioma | 514 | 4632416 | 41896 | 354837 | 11254 | 675128 | 1062 | 85201 |
| LIHC | Liver hepatocellular carcinoma | 369 | 4156507 | 26210 | 119209 | 4407 | 194309 | 229 | 30134 |
| LUAD | Lung adenocarcinoma | 512 | 4383840 | 37236 | 255517 | 8777 | 455348 | 147 | 67226 |
| LUSC | Lung squamous cell carcinoma | 500 | 3742393 | 39640 | 242335 | 9123 | 437645 | 65 | 62268 |
| MESO | Mesothelioma | 87 | 4784881 | 36010 | 49305 | 1734 | 68126 | 809 | 13856 |
| OV | Ovarian serous cystadenocarcinoma | 293 | 2975439 | 41415 | 149571 | 6769 | 254127 | 133 | 39361 |
| PAAD | Pancreatic adenocarcinoma | 176 | 4985375 | 39104 | 140937 | 4946 | 224001 | 771 | 37996 |
| PCPG | Pheochromocytoma and Paraganglioma | 168 | 4707250 | 34321 | 112116 | 4400 | 180122 | 1156 | 29132 |
| PRAD | Prostate adenocarcinoma | 485 | 4823458 | 37654 | 313993 | 10268 | 581617 | 1643 | 75506 |
| READ | Rectum adenocarcinoma | 93 | 4516897 | 29274 | 52896 | 2064 | 76387 | 204 | 14965 |
| SARC | Sarcoma | 248 | 4081096 | 33922 | 124542 | 4944 | 202118 | 737 | 33246 |
| SKCM | Skin cutaneous melanoma | 101 | 4865378 | 34942 | 53912 | 2014 | 74913 | 280 | 15180 |
| STAD | Stomach adenocarcinoma | 408 | 4306085 | 41433 | 207947 | 7311 | 338590 | 280 | 53307 |
| TGCT | Testicular germ cell tumors | 144 | 4791125 | 35758 | 107451 | 3815 | 166457 | 305 | 28328 |
| THCA | Thyroid carcinoma | 493 | 4870332 | 39754 | 359916 | 11265 | 683697 | 1842 | 86793 |
| THYM | Thymoma | 107 | 4892278 | 33234 | 85317 | 3203 | 132081 | 935 | 23473 |
| UCEC | Uterine corpus endometrial carcinoma | 166 | 4941208 | 24707 | 61884 | 2773 | 92641 | 372 | 16929 |
| UCS | Uterine carcinosarcoma | 56 | 3888384 | 32022 | 6586 | 393 | 8485 | 25 | 1729 |
| UVM | Uveal melanoma | 80 | 4737551 | 32067 | 34585 | 1348 | 47524 | 716 | 10313 |
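
The totals quoted in the abstract and the trend described in Figure 2B can be cross-checked against the table. The sketch below copies the sample and sQTL columns from the table, sums them, and computes a Pearson correlation coefficient; the choice of Pearson correlation and the helper names are assumptions made for illustration, not necessarily the statistic used by the authors.

```python
from math import sqrt

# Cancer type -> (No. of samples, sQTLs), copied from the table above.
TABLE = {
    "ACC": (77, 17752), "BLCA": (406, 168597), "BRCA": (1090, 253767),
    "CESC": (250, 118989), "CHOL": (36, 64), "COAD": (285, 152518),
    "DLBC": (48, 4445), "ESCA": (180, 138960), "GBM": (150, 126023),
    "HNSC": (499, 236904), "KICH": (66, 25251), "KIRC": (527, 325766),
    "KIRP": (290, 162228), "LAML": (122, 35478), "LGG": (514, 354837),
    "LIHC": (369, 119209), "LUAD": (512, 255517), "LUSC": (500, 242335),
    "MESO": (87, 49305), "OV": (293, 149571), "PAAD": (176, 140937),
    "PCPG": (168, 112116), "PRAD": (485, 313993), "READ": (93, 52896),
    "SARC": (248, 124542), "SKCM": (101, 53912), "STAD": (408, 207947),
    "TGCT": (144, 107451), "THCA": (493, 359916), "THYM": (107, 85317),
    "UCEC": (166, 61884), "UCS": (56, 6586), "UVM": (80, 34585),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


samples = [n_samples for n_samples, _ in TABLE.values()]
sqtls = [n_sqtls for _, n_sqtls in TABLE.values()]

total_samples = sum(samples)  # compare with the 9 026 tumor samples in the abstract
total_sqtls = sum(sqtls)      # compare with the 4 599 598 sQTLs in the abstract
r = pearson(samples, sqtls)

print(f"cancer types: {len(TABLE)}")
print(f"total samples: {total_samples}")
print(f"total sQTLs: {total_sqtls}")
print(f"Pearson r (samples vs sQTLs): {r:.3f}")
```

The column sums reproduce the sample and sQTL totals given in the abstract, and the correlation coefficient is positive, in line with panel B of Figure 2.
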
Figure 3. Overview of the CancerSplicingQTL database. (A) Browser bar in SplicingQTL. (B) The single and batch search boxes in SplicingQTL. (C) Three modules in SplicingQTL, including sQTLs, survival-associated sQTLs, and GWAS-related sQTLs. (D) An example of sQTL results on the ‘sQTL’ page. (E) An example of survival-sQTL results on the ‘survival-sQTL’ page. (F) An example of an sQTL boxplot on the ‘sQTL’ page. (G) An example of a Kaplan–Meier plot on the ‘survival-sQTL’ page.