
Discovery analysis of TCGA data reveals association between germline genotype and survival in ovarian cancer patients.

Rosemary Braun, Richard Finney, Chunhua Yan, Qing-Rong Chen, Ying Hu, Michael Edmonson, Daoud Meerzaman, Kenneth Buetow.

Abstract

BACKGROUND: Ovarian cancer remains a significant public health burden, with the highest mortality rate of all the gynecological cancers. This is attributable to the late stage at which the majority of ovarian cancers are diagnosed, coupled with the low and variable response of advanced tumors to standard chemotherapies. To date, clinically useful predictors of treatment response remain lacking. Identifying the genetic determinants of ovarian cancer survival and treatment response is crucial to the development of prognostic biomarkers and personalized therapies that may improve outcomes for the late-stage patients who comprise the majority of cases.
METHODS: To identify constitutional genetic variations contributing to ovarian cancer mortality, we systematically investigated associations between germline polymorphisms and ovarian cancer survival using data from The Cancer Genome Atlas Project (TCGA). Using stage-stratified Cox proportional hazards regression, we examined >650,000 SNP loci for association with survival. We additionally examined whether the association of significant SNPs with survival was modified by somatic alterations.
RESULTS: Germline polymorphisms at rs4934282 (AGAP11/C10orf116) and rs1857623 (DNAH14) were associated with stage-adjusted survival (p = 1.12e-07 and 1.80e-07, FDR q = 1.2e-04 and 2.4e-04, respectively). A third SNP, rs4869 (C10orf116), was additionally identified as significant in the exome sequencing data; it is in near-perfect LD with rs4934282. The associations with survival remained significant when somatic alterations were taken into account.
CONCLUSIONS: Discovery analysis of TCGA data reveals germline genetic variations that may play a role in ovarian cancer survival even among late-stage cases. The significant loci are located near genes previously reported as having a possible relationship to platinum and taxol response. Because the variant alleles at the significant loci are common (frequencies for rs4934282 A/C alleles = 0.54/0.46, respectively; rs1857623 A/G alleles = 0.55/0.45, respectively) and germline variants can be assayed noninvasively, our findings provide potential targets for further exploration as prognostic biomarkers and individualized therapies.

Year:  2013        PMID: 23555554      PMCID: PMC3605427          DOI: 10.1371/journal.pone.0055037

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Ovarian cancer accounts for about three percent of all cancers in women and is the fifth leading cause of cancer-related death among women in the United States, with an age-adjusted incidence rate of 12.8 per 100,000 women per year and a death rate of 8.6 per 100,000 women per year (2003–2007) [1]. Of the gynecological cancers, ovarian cancer has the highest mortality, with an overall five-year survival rate of 43.7% for white women and 34.9% for black women [1]. The poor survival statistics are attributable to the late stage at which ovarian cancers are diagnosed due to their asymptomatic nature: while stage I tumors have a 92.4% relative survival rate, they account for only 15% of ovarian cancer diagnoses; by contrast, stage III and IV cancers have survival rates of 34% and 18%, respectively, and together account for 65.4% of diagnoses [1]. Response to standard chemotherapy (platinum plus taxane) is highly variable [2], [3], and tends to be poor for advanced cases [2]. Understanding the genetic determinants of ovarian cancer survival and response to treatment may improve these statistics, particularly for stage III and IV patients who comprise the majority of cases. In particular, identifying variations that predict response to chemotherapy allows for the possibility of administering alternate therapies that may improve outcomes.
Previous studies have examined the role of genetic variation in ovarian cancer susceptibility, progression, treatment response, and survival. It has been shown that BRCA1/2 germline mutations contribute to 10–15% of cases [4], and analysis of data from The Cancer Genome Atlas Project (TCGA [5]) has also shown that BRCA1/2 germline mutations, somatic mutations, and promoter methylation affect ovarian cancer survival [5]. Additionally, candidate gene studies have shown that polymorphisms in MDM2, along with TP53 status and SULF1, are associated with ovarian cancer survival [6]–[8].
Recently, Huang and coworkers reported that a genetic variation is associated with carboplatin cytotoxicity in vitro and in vivo [3], a finding which may explain differential responsiveness to the standard platinum-based ovarian cancer therapy. The same authors later showed that the identified locus regulates miRNAs that contribute to platinum sensitivity, suggesting a mechanism of action [9]. To date, however, a clinically useful genomic marker of ovarian cancer survival remains elusive. The platinum-associated SNP investigated by Huang was not found to be significantly associated with survival in a validation cohort [3]. Likewise, Bolton and co-workers successfully identified several loci associated with ovarian cancer susceptibility, but those they initially found to be associated with survival failed to reach significance in the validation set [10], although it is hoped that future studies of this cohort will result in established associations with clinical outcome [10]. While tumor gene expression signatures predictive of treatment response and relapse have been reported (e.g., [11], [12]), their clinical utility is limited by the cost, invasiveness, and variability inherent in evaluating tumor gene expression. Likewise, somatic copy number changes in certain genes have recently been reported to influence survival [13], but the utility of measuring CNV as a prognostic test is similarly limited.
The Cancer Genome Atlas Project (TCGA [5]) provides a collection of genomic and clinical data in which associations between genetics and survival can be thoroughly explored. Here, we carry out a genome-wide analysis to systematically investigate associations between germline genetic variation and overall survival in TCGA patients diagnosed with ovarian cancer (serous cystadenocarcinoma) [14]. The patients had an age and stage distribution typical of ovarian cancer, as shown in Table 1.
Using the clinical and Affymetrix SNP6.0 (“SNP6”) genotype data, we identified two single nucleotide polymorphism (SNP) loci at which the germline genotype is predictive of overall survival in ovarian cancer patients. The associations remain significant after adjusting for stage, and are associated with survival even amongst stage III patients. This suggests that constitutional genetic variation may play a role in treatment response and provides a potential avenue for a non-invasive prognostic biomarker test.
Table 1

Stage and age at diagnosis, organized by 5-year survival.

         Censored <5 yrs     Survival <5 yrs     Survival ≥5 yrs     All                 p
N        187 (38%)           228 (47%)           74 (15%)            489
Stage                                                                                    1.3e-02
  I      10                  2                   2                   14
  II     11                  6                   8                   25
  III    140                 181                 53                  374
  IV     26                  39                  11                  76
Age      58.2 (49.6, 65.5)   59.9 (51.0, 68.1)   62.3 (54.7, 71.4)   59.1 (51.4, 69.1)   6.1e-02

Stage and age at diagnosis of samples, organized by 5-year survival. Median age is reported, with the first and third quartiles given in parentheses. P values for the univariate association between stage and survival and between age and survival (logrank test) are also given.

Results

Here, we report the association between germline SNPs and patient survival using TCGA ovarian cancer data. The filtered data comprised a total of 662,521 SNPs assayed in 489 clinically annotated ovarian cancer samples, with stage and age distributions as given in Table 1. Each of the 662,521 SNPs meeting the filtration criteria was tested for association with survival using Cox proportional hazards regression adjusted for stage using a non-additive model. Two SNPs, rs4934282 (A/C) in the gene AGAP11 (previously associated with C10orf116) and rs1857623 (A/G) upstream of DNAH14, showed a statistically significant univariate association with overall ovarian cancer survival, as summarized in Table 2. A plot of the p values obtained is given in Figure 1. We additionally computed the per-allele hazard ratios for these SNPs using an additive model, obtaining HR = 0.599 (p = 1.28e-08) for the C allele at rs4934282 and HR = 1.425 (p = 1.70e-05) for the G allele at rs1857623. It should be noted that, because of the small sample size, the power to detect a SNP with MAF = 0.45 (as these are) at a significance threshold of 1e-06 is 32% for HR = 0.6 and 3.5% for HR = 1.4; it is therefore likely that other SNPs with similar effect sizes were missed by chance in this analysis.
Table 2

Stage-adjusted survival.

rs4934282 (AGAP11/C10orf116, chr10:88732476)
Genotype   n (482)   HR      p         FDR q     permutation p
AA         146       (ref)   (ref)     (ref)     (ref)
AC         231       0.686   8.0e-03   7.5e-01   8.4e-03
CC         105       0.355   3.6e-08   3.7e-03   2e-06
Logrank                      1.1e-07   1.2e-04   2e-06

Significant survival associations after stratification by stage; rs4934282 and rs1857623 are from SNP6 data, rs4869 is from exome/capture data (29 SNPs tested). All tests of Schoenfeld residuals were non-significant, meeting the proportional hazards assumption.

Figure 1

QQ plots.

Quantile-quantile plot of observed p values for the likelihood ratio tests of the stage-adjusted Cox models versus the expected distribution of p values under independent null hypotheses. Points above the line indicate p values that are more significant than expected; a large systematic deviation from this line would be indicative of population substructure driving the results. The two SNPs identified as significant, rs4934282 and rs1857623, lie well above the line and outside the small systematic deviation.

To illustrate the effect of rs4934282 (AGAP11/C10orf116) and rs1857623 (DNAH14) germline genotype on survival among patients with similar tumor stage, Kaplan-Meier plots for the 372 Stage III patients are given in Figures 2 and 3. Notably, the CC genotype at rs4934282 in AGAP11/C10orf116 confers a protective effect, nearly doubling the median survival time over the AA genotype group. Additionally, patients with homozygous CC at rs4934282 have a five-year survival rate of 45%, vs. 34% overall for Stage III patients [1].
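The Kaplan-Meier curves and median survival times referred to above come from the product-limit estimator. The following is a minimal Python sketch of that estimator on made-up follow-up data (not TCGA data); the function names `kaplan_meier` and `median_survival` are illustrative and do not come from the original analysis, which was carried out in R.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient (e.g. days)
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy data: five patients, two of them censored.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```

In a genotype-stratified plot such as Figures 2 and 3, this estimator would be applied separately to the patients in each genotype group.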
Figure 2

Survival of stage-III ovarian cancer patients by rs4934282 genotype.

Kaplan-Meier survival plots for Stage III patients, stratified by germline genotype at rs4934282 (AGAP11): AA, black; AC, blue; CC, red. Confidence intervals are shown as a shaded region around each Kaplan-Meier curve. Censored observations are denoted with vertical ticks. The dashed horizontal and vertical lines mark 50% survival and five years (1825 days) respectively.

Figure 3

Survival of stage-III ovarian cancer patients by rs1857623 genotype.

Kaplan-Meier survival plots for Stage III patients, stratified by germline genotype at rs1857623 (DNAH14): AA, black; AG, blue; GG, red. Confidence intervals are shown as a shaded region around each Kaplan-Meier curve. Censored observations are denoted with vertical ticks. The dashed horizontal and vertical lines mark 50% survival and five years (1825 days) respectively.

To further investigate variation in the genomic regions surrounding these SNPs, we examined exome/capture sequencing data (for 375 patients with available germline data) in 100 Kbp windows centered about the two SNPs identified as significant in the SNP6 data, specifically chr10:88672456–88772455 and chr1:223081228–223181227. For ten samples with available whole-genome data, we were able to compare the intronic rs4934282 and rs1857623 Affymetrix SNP6.0 calls to those from the whole-genome sequencing, confirming the validity of the SNP6 calls. Of the 29 exome/capture SNPs tested (see Table 3) in the 375 samples, only rs4869 in C10orf116 remained significant after adjusting for multiple hypotheses (FDR q = 9.89e-03). rs4869 is located upstream of rs4934282 and is in near-perfect linkage disequilibrium with it (A/C at rs4934282 correlating with C/T at rs4869, respectively). rs4869 encodes a synonymous mutation in C10orf116 (Ile68Ile). We also investigated whether the variant alleles at any of these 29 loci led to deleterious nonsynonymous protein alterations; only five SNPs had missense allelic variations, none of which were predicted to be deleterious (Table 4).
Table 3

Cox model results for exome/capture loci.

Alleles are given in columns A and B; sample counts in columns AA, AB, BB, and tot; hazard ratios (HR) and their p values for the AB and BB genotypes; "-" marks an empty cell.

Source   rsID         Chr  Position   Gene       A  B  AA   AB   BB   tot  HR(AB)  p(AB)     HR(BB)  p(BB)     logRank p
NextGen  rs7074064    10   88673102   BMPR1A     T  C  220  115  29   364  0.94    6.97e-01  1.09    7.35e-01  8.46e-01
NextGen  rs4447076    10   88686361   MMRN2      A  G  82   81   31   194  1.32    2.12e-01  0.92    8.02e-01  3.59e-01
NextGen  rs34587013   10   88686602   MMRN2      C  G  287  40   2    329  1.09    7.15e-01  -       -         9.33e-01
NextGen  rs4934281    10   88692330   MMRN2      G  C  1    25   269  295  1.61    6.51e-01  2.15    4.49e-01  4.77e-01
NextGen  rs10887673   10   88692370   MMRN2      G  A  82   54   10   146  0.70    1.81e-01  0.88    7.96e-01  4.06e-01
NextGen  -            10   88694221   MMRN2      G  T  289  20   0    309  1.40    2.86e-01  -       -         2.84e-01
NextGen  rs3750822    10   88695286   MMRN2      T  G  63   21   2    86   0.53    8.97e-02  -       -         2.28e-01
NextGen  rs4244973    10   88707120   MMRN2      T  A  5    31   291  327  1.27    7.08e-01  1.63    4.03e-01  4.45e-01
NextGen  rs3750823    10   88707134   MMRN2      C  T  139  125  63   327  1.16    3.64e-01  1.07    7.52e-01  6.62e-01
NextGen  rs1800373    10   88708416   SNCG       A  C  111  115  90   316  1.07    7.02e-01  0.95    8.17e-01  8.34e-01
NextGen  rs760113     10   88709769   SNCG       C  G  159  72   9    240  0.97    8.73e-01  0.59    2.53e-01  5.13e-01
NextGen  rs9864       10   88712378   SNCG       A  T  203  105  12   320  0.95    7.47e-01  0.50    1.28e-01  2.98e-01
NextGen  rs62621086   10   88712453   SNCG       T  G  166  20   2    188  0.76    4.43e-01  -       -         4.53e-01
NextGen  rs2279601    10   88720157   C10orf116  A  G  61   57   42   160  1.45    1.79e-01  1.76    5.66e-02  1.40e-01
NextGen  rs4869       10   88720292   C10orf116  T  C  100  148  106  354  1.72    4.05e-03  2.06    2.75e-04  7.44e-04 *
NextGen  rs7960       10   88720354   C10orf116  C  T  75   99   38   212  1.23    3.26e-01  1.77    3.65e-02  1.08e-01
SNP6.0   rs4934282    10   88732476   AGAP11     A  C  138  228  100  466  0.72    2.26e-02  0.37    3.12e-07  1.04e-06 *
NextGen  -            10   88748032   AGAP11     G  T  252  27   0    279  0.89    7.00e-01  -       -         6.99e-01
NextGen  rs1240370    10   88748297   AGAP11     T  C  97   118  38   253  1.08    6.85e-01  1.10    7.21e-01  8.99e-01
NextGen  rs1240371    10   88748466   AGAP11     C  G  12   26   12   50   2.12    3.22e-01  1.09    9.17e-01  4.62e-01
NextGen  rs1240407    10   88753935   AGAP11     T  C  16   70   105  191  1.70    2.07e-01  2.17    5.95e-02  1.20e-01
NextGen  rs72644240   10   88754336   AGAP11     T  G  3    26   41   70   0.81    7.90e-01  1.01    9.88e-01  8.51e-01
NextGen  -            10   88757859   AGAP11     C  T  20   45   0    65   2.09    9.55e-02  -       -         8.85e-02
NextGen  -            10   88757910   AGAP11     G  T  33   34   8    75   1.24    5.75e-01  1.38    5.72e-01  7.87e-01
NextGen  rs36104328   10   88758019   AGAP11     A  G  175  58   4    237  0.83    3.92e-01  -       -         6.88e-01
NextGen  rs2641563    10   88758233   AGAP11     A  G  16   107  161  284  1.70    1.95e-01  2.17    5.26e-02  7.78e-02
NextGen  rs2641562    10   88758403   AGAP11     A  G  86   119  52   257  1.23    3.08e-01  2.03    2.77e-03  8.99e-03
NextGen  rs1745901    10   88758637   AGAP11     C  T  21   124  181  326  1.78    9.58e-02  2.19    2.12e-02  4.38e-02
NextGen  -            1    223127807  -          C  T  29   25   0    54   1.40    4.69e-01  -       -         4.67e-01
SNP6.0   rs1857623    1    223131228  DNAH14     A  G  145  220  104  469  0.88    3.91e-01  2.04    2.39e-05  1.06e-07 *
NextGen  -            1    223142168  -          A  G  19   34   0    53   1.16    7.34e-01  -       -         7.34e-01

Cox model results for exome/capture loci, with significant SNP6.0 loci provided in context. Given are the sample numbers, hazard ratios and associated p values for each genotype, as well as the logrank p values for the Cox model.

* denotes significant SNPs;

denotes non-specific regions in the exome/capture data that may reflect variation from another genomic region.

Table 4

SIFT and logRE predictions for missense SNPs.

rsID        Chr  Position  Gene    RefSeq     NT subst.  AA subst.  SIFT score  SIFT prediction  logRE score  logRE prediction
rs3750823   10   88707134  MMRN2   NM_024756  c.G145A    p.G49S     0.15        borderline       0.27         neutral
rs4934281   10   88692330  MMRN2   NM_024756  c.C2191G   p.H731D    0.62        neutral          NA           NA
rs34587013  10   88686602  MMRN2   NM_024756  c.G2728C   p.V910L    0.91        neutral          NA           NA
rs9864      10   88712378  SNCG    NM_003087  c.A329T    p.E110V    0.15        borderline       0.36         neutral
rs2641563   10   88758233  AGAP11  NM_133447  c.A244G    p.I82V     1.00        neutral          NA           NA

SIFT and logRE predictions for missense SNPs. Shown are the location, gene, and RefSeq IDs for the SNPs, the nucleotide (NT) and amino acid (AA) substitutions, and the SIFT and logRE scores and predictions. SIFT scores are classified into predictions as follows: 0.00—0.05, probably damaging; 0.051—0.10, possibly damaging; 0.101—0.20, borderline; 0.201—1.00, neutral. logRE scores are classified into predictions as follows: 1—up, probably damaging; 0.7—0.99, possibly damaging; 0.5—0.69, borderline; 0.0—0.49, neutral.

Finally, we used data derived from normal-paired tumor samples to test the hypothesis that the effect of germline genotype on ovarian cancer survival might be influenced by somatic events, assessing whether the strong effect of germline genotype on survival was significantly mediated or moderated by tumor gene expression, by gain or loss of copy number in the tumor, or by loss of heterozygosity (see File S1). We found no significant association of tumor gene expression, copy number variation, or loss of heterozygosity in these regions with survival (see File S1). Rather, the large effect of germline genotype at the loci on patient survival is independent of these somatic changes, and appears to suggest that constitutional genetic variation in these regions plays a role in treatment response.

Discussion

Recent studies have demonstrated that common genetic variants are associated with ovarian cancer risk [15], [16]. However, it remains difficult to predict ovarian cancer survival independent of stage; current clinical findings show that tumor response and extreme drug resistance in vitro are not good predictors of ovarian cancer survival [17], [18]. In our study, we comprehensively tested the SNPs assayed in the TCGA SNP6.0 data for association with survival, and additionally analyzed whole-genome and exome/capture SNPs in the genomic regions surrounding the significant SNP6.0 SNPs. We identified three SNPs in two genomic regions that had a statistically significant association with survival. As shown in Table 2, the hazard ratios for homozygous minor alleles approached or exceeded two-fold in stage-stratified Cox proportional hazard models, and the per-allele effect sizes for these SNPs using a stage-stratified additive genotype model were HR = 0.599 and HR = 1.425 for rs4934282 and rs1857623, respectively. Interestingly, none of the somatic variations we examined (tumor gene expression, copy number variation, and loss of heterozygosity) were associated either with the germline genotype at these loci or with survival, despite a plausible hypothesis that somatic changes in the tumor might have an effect on the genotype–survival association. Rather, these SNPs are strongly predictive of survival independent of somatic changes that had already occurred in the tumor (see File S1).
Two of the survival-associated SNPs are located within a 2200 bp region on chromosome 10 (rs4934282 at chr10:88732476 and rs4869 at chr10:88730312) and are in near-perfect LD in these data. This genomic region is associated with C10orf116 (chr10:88727949–88730672) and AGAP11 (chr10:88730498–88769960), which overlap; the biological significance of the variation probed by rs4934282 and rs4869 may be associated with either.
AGAP11 is a member of the ankyrin repeat and GTPase domain Arf GTPase activating protein gene family [19]. C10orf116 (also referred to as APM2) is a protein of unknown function that is homologous to the medium chain of mammalian clathrin-associated protein complex and is involved in vesicular transport in yeast. The genomic region containing rs4934282 and rs4869 is shown in Figure 4.
Figure 4

Genomic region containing rs4934282 and rs4869.

Detailed description of the genomic region of chromosome 10 containing rs4934282 (second SNP from the right) and rs4869 (shown in green). Note the overlap between AGAP11 and C10orf116.

While little prior evidence exists linking AGAP11 to cancer susceptibility, survival, or treatment response, some evidence exists for the role of C10orf116. C10orf116/APM2 expression has been implicated in other gynecological cancers; for instance, it has been shown to strongly differentiate between the BRCA1-associated breast tumor subclasses ESR1-positive and ESR1-negative [20], and it has been found to be downregulated in uterine cancer in a number of studies [21]. More recently, C10orf116 has been shown to exhibit differential expression in different pathological grades of ovarian carcinoma [22] and in the response of breast cancer to chemotherapy [23], [24]. More importantly, there exists evidence from cell lines pointing to C10orf116 as a mediator of cisplatin resistance. Ovarian cancer has been treated with platinum compounds for many years [25], [26], with cisplatin and carboplatin (which has a more acceptable toxicity profile) as a standard therapy for newly diagnosed stage III ovarian cancers [26], [27]. However, while many patients respond to initial treatment, the five-year survival rates remain poor (34% overall for stage III [1]). APM2 (C10orf116) has been shown to promote cisplatin resistance when overexpressed in HCT116 cell lines that were sensitive to chemotherapy and radiation [28], suggesting a possible mechanism by which rs4869 and rs4934282 influence survival. Silencing of APM2 by shRNA was shown to enhance the cytotoxic effects of cisplatin on tumor xenografts grown in CD-1 nude mice. Additionally, APM2 was found to be overexpressed in cisplatin-resistant gastric cancer cells, but not in gastric cancer cells resistant to 5-FU or doxorubicin [29].
More recently, it was found that rs1649942, a SNP located 5 Mb upstream of rs4934282/rs4869, had a modest association with carboplatin-induced cytotoxicity and the survival of ovarian cancer patients following carboplatin-based chemotherapy [3]. Although this SNP failed to reach significance in their phase 2 validation analysis (and was likewise not significant in our study), it adds to the body of evidence implicating this genomic region in platinum sensitivity.
The third significant SNP, rs1857623, is found in an intergenic region on chromosome 1, 53 Kb upstream of DNAH14 and 136 Kb downstream from CNIH3. DNAH14 belongs to the dynein heavy chain family of motor proteins, which attach to and walk along cytoskeletal microtubules [30]. The mechanism by which variation in DNAH14 may impact survival is less clear. One possible avenue for future studies is its potential role in the context of taxol therapy: DNAH14 contains the microtubule-binding stalk of dynein motor (pfam12777 at Location:2910–3244 of reference protein NP_001364.1), and it has been demonstrated that taxol binds microtubules [28]. DNAH14 has also been found to be differentially regulated in response to taxane therapy in gastric cancers [31] and doxorubicin therapy in endometrial cells [32].
These findings suggest that constitutional genetic variations in these regions may play a role in ovarian cancer survival even among late-stage cases. However, it should be noted that the results presented here constitute a discovery-based analysis that did not include a validation cohort. As such, the findings may be spurious false positives, and require confirmation in follow-up studies.
If validated, these SNPs may have important clinical potential as prognostic biomarkers, since germline genotype can be assayed noninvasively and because the variant alleles at the significant loci are common (frequencies for rs4934282 A/C alleles = 0.54/0.46, respectively; rs1857623 A/G alleles = 0.55/0.45, respectively; both comparable to allele frequencies for the Caucasian CEPH population in HapMap [33]). The significant loci are located in or near genes previously identified as having a possible relationship to chemotherapeutic response, suggesting that their association with survival may be due to their influence on treatment response. Our study suggests potential targets for prognostic tests and individualized therapies, and provides a basis for follow-up research.

Materials and Methods

Data

Data were collected by the TCGA project as described elsewhere [14]. Follow-up times, vital status, tumor stage, and germline genotype data were obtained from the TCGA project [14] via the data portal on 06/03/2011.

SNP6 genotypes

Genotype calls for the 906,600 SNP probes assayed using the Affymetrix GenomeWide SNP6.0 platform and processed using Birdseed were obtained from TCGA. Samples that did not pass the TCGA quality control (per the TCGA copy number Sample Data Relationship Format file) were removed. A total of 496 ovarian serous cystadenocarcinoma patients had survival time and germline (either blood or tumor-adjacent normal) genotype data. Genotype calls were coded as 0, 1 or 2 according to the number of variant alleles and filtered according to a Birdseed confidence threshold of 0.05. The genotype data were subject to additional quality control filtration criteria as follows. SNPs with call rates or minor allele frequencies below the filtering thresholds were excluded, as were SNPs that departed significantly from Hardy-Weinberg equilibrium. All samples with a call rate below 80% were excluded. Identity by state (IBS) was computed using the R GenABEL package, and closely related samples, as indicated by high IBS, were removed. The SNP and sample filtration criteria were applied iteratively until all samples and SNPs met the stated thresholds. In total, 489 samples and 662,521 SNPs passed the filters and were kept in the analysis.
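The per-SNP quality metrics named above (call rate, minor allele frequency, and departure from Hardy-Weinberg equilibrium) can be computed directly from genotypes coded 0, 1 or 2. The Python sketch below is illustrative only: the function names are hypothetical, the Hardy-Weinberg check is assumed to be a 1-df chi-square test, and the cutoffs in `passes_qc` are placeholder values, since the thresholds actually used are not legible in this text. The example genotype counts are the rs4934282 counts from Table 2.

```python
import math


def called(genotypes):
    """Drop missing calls (None); remaining values count variant alleles (0, 1, 2)."""
    return [g for g in genotypes if g is not None]


def call_rate(genotypes):
    """Fraction of samples with a genotype call at this SNP."""
    return len(called(genotypes)) / len(genotypes)


def minor_allele_frequency(genotypes):
    """Frequency of the less common allele among called genotypes."""
    calls = called(genotypes)
    variant_freq = sum(calls) / (2 * len(calls))
    return min(variant_freq, 1 - variant_freq)


def hwe_p_value(genotypes):
    """p value of a 1-df chi-square test for Hardy-Weinberg equilibrium (assumed test)."""
    calls = called(genotypes)
    n = len(calls)
    observed = [calls.count(k) for k in (0, 1, 2)]
    q = sum(calls) / (2 * n)  # variant allele frequency
    p = 1 - q
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(genotypes, min_call_rate=0.95, min_maf=0.05, min_hwe_p=1e-4):
    """Placeholder cutoffs, NOT the thresholds used in the study."""
    if call_rate(genotypes) < min_call_rate:
        return False
    return (minor_allele_frequency(genotypes) >= min_maf
            and hwe_p_value(genotypes) >= min_hwe_p)


# rs4934282 genotype counts from Table 2: AA 146, AC 231, CC 105.
rs4934282 = [0] * 146 + [1] * 231 + [2] * 105
print(round(minor_allele_frequency(rs4934282), 2))  # 0.46
```

The printed value agrees with the C allele frequency of 0.46 reported for rs4934282.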

Tumor stage

Stage subcategories were coalesced for the purposes of this analysis into summary stage categories yielding four stage classifications (i.e., Stage IA, IB, IC were treated as Stage I, etc.). The number of samples in each stage category is given in Table 1.
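A minimal Python sketch of this coalescing step follows. The label format (an optional "Stage " prefix, a Roman numeral from I to IV, and an optional A to C suffix) and the function name `summary_stage` are assumptions for illustration; the actual TCGA clinical field format may differ.

```python
import re
from collections import Counter

# Assumed label format: optional "Stage " prefix, Roman numeral I-IV, optional A-C suffix.
_STAGE_PATTERN = re.compile(r"(?:Stage\s+)?(IV|III|II|I)[A-C]?", re.IGNORECASE)


def summary_stage(label):
    """Map a stage subcategory such as 'Stage IIIC' to its summary stage 'III'."""
    match = _STAGE_PATTERN.fullmatch(label.strip())
    if match is None:
        return None  # unrecognized or missing stage
    return match.group(1).upper()


# Hypothetical labels, tallied by summary stage.
labels = ["Stage IA", "Stage IC", "Stage IIB", "Stage IIIC", "Stage IIIC", "Stage IV"]
stage_counts = Counter(summary_stage(label) for label in labels)
```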

Exome/capture data

Next generation exome/capture sequencing data were also retrieved for 375 patients with available germline data. The analysis was restricted to 100 Kbp windows centered about the two SNPs identified as significant in the SNP6 data, specifically chr10:88672456–88772455 and chr1:223081228–223181227. Graphical descriptions of these genomic regions are provided in Figures 5 and 6.
Figure 5

Genomic region surrounding rs4934282.

Image from cgwb.nci.nih.gov of selected tracks for genome build NCBI36 (hg18) for the region surrounding two germline variations associated with survival in ovarian cancer in C10orf116/AGAP11 region on chromosome 10. The tracks are a custom track showing the SNPs rs4869 and rs4934282, RefSeq gene, mRNA, spliced ESTs and mapability.

Figure 6

Genomic region surrounding rs1857623.

Image from cgwb.nci.nih.gov of selected tracks for genome build NCBI36 (hg18) for the region surrounding a germline variation associated with survival in ovarian cancer upstream of DNAH14 on chromosome 1. The tracks are a custom track showing the SNP rs1857623, RefSeq gene, mRNA, spliced ESTs and mapability.

Binary Sequence Alignment/Map (BAM) files were downloaded from dbGaP, using for each sample the largest available normal BAM file. The “mpileup” and “bcftools” features of SAMtools [34] were used to generate the variant call information, with calling criteria as follows: if the coverage in a given sample for a given locus was less than the coverage threshold (see following paragraph), no call was made; otherwise, if the non-reference allele frequency was less than 10%, the call was “homozygous reference;” if the non-reference frequency was greater than 90%, the call was “homozygous nonreference;” if it was between 10% and 90%, the call was “heterozygous.”
To set the coverage threshold for the exome/capture data, we compared the exome/capture calls to the SNP6 germline genotype calls for 41 tag SNPs located in those regions. Treating the SNP6 calls as the gold standard for accuracy, we define the “mismatch rate” to be the number of calls on which the exome/capture and SNP6 data differ, divided by the total number of exome/capture calls made at that coverage depth. As the coverage threshold is increased and the exome/capture data become more reliable, the mismatch rate decreases, but fewer exome/capture calls can be made. We varied the coverage threshold from 5 to 30, selecting the lowest coverage that yielded a mismatch rate smaller than 0.05. The optimum coverage was 9 (with a mismatch rate of 0.045).
We considered a locus to be informative (i.e., having sufficient variation) if at least 20 germline samples had a heterozygous call at that coverage threshold; these criteria yielded 29 total informative SNPs in the 100 Kbp regions surrounding rs4934282 and rs1857623, shown in Table 3, which we considered in the analysis.
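The calling rule, the mismatch rate, the coverage-threshold search, and the informativeness criterion described in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than the pipeline actually used: coverage is taken to be the sum of reference and non-reference read counts, pairs lacking a SNP6 call are skipped when computing the mismatch rate, and all function names and the toy read counts are hypothetical.

```python
def call_genotype(ref_reads, alt_reads, min_coverage=9):
    """Return 0, 1 or 2 variant alleles, or None when coverage is too low."""
    coverage = ref_reads + alt_reads  # assumption: coverage = ref + non-ref reads
    if coverage < min_coverage:
        return None
    alt_fraction = alt_reads / coverage
    if alt_fraction < 0.10:
        return 0  # homozygous reference
    if alt_fraction > 0.90:
        return 2  # homozygous nonreference
    return 1      # heterozygous


def mismatch_rate(exome_calls, snp6_calls):
    """Fraction of exome/capture calls that disagree with the SNP6 call."""
    pairs = [(e, s) for e, s in zip(exome_calls, snp6_calls)
             if e is not None and s is not None]
    if not pairs:
        return None
    return sum(e != s for e, s in pairs) / len(pairs)


def choose_coverage_threshold(read_counts, snp6_calls,
                              thresholds=range(5, 31), max_mismatch=0.05):
    """Lowest threshold whose mismatch rate is below max_mismatch, with that rate."""
    for threshold in thresholds:
        calls = [call_genotype(ref, alt, threshold) for ref, alt in read_counts]
        rate = mismatch_rate(calls, snp6_calls)
        if rate is not None and rate < max_mismatch:
            return threshold, rate
    return None


def is_informative(calls, min_heterozygous=20):
    """A locus is informative if enough samples have a heterozygous call."""
    return sum(call == 1 for call in calls) >= min_heterozygous


# Toy (ref, alt) read counts with matching SNP6 calls; the low-coverage first
# sample is miscalled until the threshold rises above its coverage of 6.
toy_reads = [(3, 3), (10, 0), (0, 12), (6, 6)]
toy_snp6 = [0, 0, 2, 1]
toy_choice = choose_coverage_threshold(toy_reads, toy_snp6)
```

On the toy data the search stops at a threshold of 7, the first value at which the discordant low-coverage call is no longer made.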

Survival analysis

Survival analysis was carried out in R [35] using the “survival” package [36]. For each SNP represented in the data, Cox proportional hazards regression was used to model survival as a function of genotype. Because of the significant association of stage with survival, all models were stratified by stage. Genotype calls were treated as categorical variables, with 0 as the referent group, to avoid imposing linearity in the number of variant alleles. Each model therefore yielded two hazard ratios per SNP (one for genotype = 1 relative to genotype = 0 and another for genotype = 2 relative to genotype = 0). The significance of the association was assessed using the logrank (score) test [37]. A test of Schoenfeld residuals was used to check whether the proportional hazards assumption was met; only models passing this check were considered valid. Of the SNPs tested, 639,510 met the proportional hazards assumption.
Because the large number of SNPs implies a vast number of hypotheses being tested, multiple testing adjustments were made to the p values in two ways. We report the false discovery rate [38] (q) for the p values obtained from the parametric tests described above. In addition, we report permutation p values obtained using 600,000 independent resamplings of the data. Permutation tests, while computationally intensive, are considered the strongest and most appropriate control of type-I error rates in genome-wide studies [39]–[41].
To investigate the existence and effect of any population stratification, the R package GenABEL [42] was used to examine population substructure. The estimated genomic inflation factor indicated that population substructure, if present, should have no appreciable effect on the results. Population substructure was further examined by principal component analysis of a randomly selected set of 12,000 independent SNPs (chosen using pairwise LD and MAF thresholds). Pairwise plots of the first four components are provided in File S2. We adjusted the models in two ways: using the first four PCs, and using cluster assignments identified from the PCA using the R package mclust [43]. As expected based on the genomic inflation factor, we observed no appreciable changes in the Cox model results (data not shown). The results presented here are therefore not adjusted for population substructure.
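The two multiple-testing adjustments can be illustrated with a short Python sketch. It is not the authors' R code and rests on several assumptions: the false discovery rate is computed with the Benjamini-Hochberg step-up procedure (the text does not name the procedure), the permutation scheme shuffles genotype labels against outcomes, and the test statistic is a simple difference in mean survival time standing in for the Cox score statistic. All function names are hypothetical.

```python
import random
from typing import Callable, List, Sequence


def bh_qvalues(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values (q values), returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def mean_difference(times: Sequence[float], groups: Sequence[int]) -> float:
    """Stand-in statistic: absolute difference in mean survival time between two groups."""
    g0 = [t for t, g in zip(times, groups) if g == 0]
    g1 = [t for t, g in zip(times, groups) if g != 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def permutation_pvalue(times: Sequence[float],
                       groups: Sequence[int],
                       statistic: Callable[[Sequence[float], Sequence[int]], float],
                       n_perm: int = 2000,
                       seed: int = 0) -> float:
    """Share of label permutations whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(times, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-outcome association
        if statistic(times, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy example: two genotype groups with clearly different survival times.
toy_times = [2, 3, 4, 5, 6, 20, 21, 22, 23, 24]
toy_groups = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(bh_qvalues([0.01, 0.04, 0.03, 0.005]))
print(permutation_pvalue(toy_times, toy_groups, mean_difference))
```

The add-one correction in `permutation_pvalue` is a common convention that keeps the smallest reportable p value at 1/(n_perm + 1), which is one reason genome-wide permutation tests need a very large number of resamplings.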

Sequencing data analysis

We compared the SNP6 genotypes at the significant loci (chr10:88722456 and chr1:223131228) to those from whole-genome sequencing data for 10 available samples; all 10 matched the SNP6 calls for the significant SNPs, supporting the SNP6 genotype calls. For the two SNPs showing significant association with survival in the SNP6 data, we further investigated the surrounding genomic regions using combined whole-genome and exome/capture sequencing data. We investigated 29 SNPs in the genomic regions surrounding rs4934282 and rs1857623, shown in Table 3 and chosen as described above. Stage-stratified Cox proportional hazards models were then constructed for the germline genotypes as described above. Note that neither rs4934282 nor rs1857623 was included, owing to insufficient exome/capture data (rs4934282 is in an intronic region and hence not assayed in the exome/capture data; rs1857623 had no calls in the majority of samples).
Not all the genomic regions contributing to these data have unique sequences. To assess this, we used the “mapability” criteria as implemented in CGWB [44]: for each locus under consideration, we consider a sliding 75 base-pair window containing that locus and attempt to match it to other regions in the genome; the locus is flagged as unique if, for every position of the sliding window, the sequence maps only to the location of the window and to no other genomic region. Loci for which some (or all) positions of the sliding window contain sequences that map to multiple genomic regions are flagged with a dagger in Table 3, denoting that the reads contributing to the calls at that locus may be nonspecific.
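The sliding-window uniqueness criterion can be sketched in Python as follows. This is an illustrative toy, not the CGWB implementation: it assumes exact string matching on the forward strand only, a 0-based locus index, and a genome small enough to hold in memory as a single string. The function names and the toy sequence are hypothetical.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, including overlapping ones."""
    count = 0
    pos = text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def is_unique_locus(genome: str, locus: int, window: int = 75) -> bool:
    """True if every window of the given length that contains the locus occurs exactly once."""
    if len(genome) < window:
        raise ValueError("genome is shorter than the window")
    if not 0 <= locus < len(genome):
        raise ValueError("locus is outside the genome")
    # Window start positions for which the window still covers the locus.
    first_start = max(0, locus - window + 1)
    last_start = min(locus, len(genome) - window)
    for start in range(first_start, last_start + 1):
        kmer = genome[start:start + window]
        if count_occurrences(genome, kmer) != 1:
            return False  # this window also maps elsewhere: flag as nonunique
    return True


# Toy genome in which "ACGT" appears twice; a 4-base window is used for brevity.
toy_genome = "AAACGTGGGACGTCCC"
print(is_unique_locus(toy_genome, 7, window=4))
print(is_unique_locus(toy_genome, 4, window=4))
```

A locus that fails this check corresponds to one flagged with a dagger in Table 3: at least one window covering it matches another region, so reads supporting a call there may have originated elsewhere.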

Prediction of amino-acid substitutions

We examined the SNPs in Table 3 for mis-sense substitutions using the program ANNOVAR [45] and predicted their functional impact on protein sequences with logRE and SIFT. LogRE is the logarithm of the ratio of HMMER E-values for the fit to a PFAM domain motif of two amino acid sequences that differ by an amino acid substitution. A logRE score whose absolute value is greater than or equal to 1 indicates that the amino acid alteration is likely to affect the protein [46]. SIFT is a sequence homology-based tool that Sorts Intolerant From Tolerant amino acid substitutions and predicts deleterious amino acid substitutions. Substitutions with SIFT values below the tool's cutoff are predicted to be deleterious [47]. Of the SNPs considered above, five mis-sense SNPs were identified: three in MMRN2 (rs3750823, rs4934281, rs34587013), one in SNCG (rs9864), and one in AGAP11 (rs2641563). However, there is no evidence that these amino acid changes have a functional impact on the proteins (Table 4).
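As a small worked illustration of the logRE criterion, the Python sketch below computes the score from two E-values and applies the cutoff of 1 on its absolute value. The use of a base-10 logarithm, the direction of the ratio (variant over reference), and the function names are assumptions; because the cutoff is applied to the absolute value, the direction of the ratio does not change the classification.

```python
import math


def log_re(evalue_variant: float, evalue_reference: float) -> float:
    """Log ratio of the E-values of the variant and reference sequences (base 10 assumed)."""
    return math.log10(evalue_variant / evalue_reference)


def likely_affects_protein(score: float, cutoff: float = 1.0) -> bool:
    """Apply the |logRE| >= 1 criterion described in the text."""
    return abs(score) >= cutoff


# Hypothetical E-values: a 100-fold change in fit versus a 2-fold change.
large_change = log_re(1e-20, 1e-22)
small_change = log_re(2e-10, 1e-10)
print(likely_affects_protein(large_change), likely_affects_protein(small_change))
```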

Analysis of somatic variations

To test the hypothesis that somatic changes might have an additive or moderating effect on the association between germline genotype and ovarian cancer survival, we used TCGA data derived from paired tumor samples to assess whether tumor gene expression, gain or loss of copy number in the tumor, or loss of heterozygosity was significantly associated with survival. A full description of the methods and results for this analysis is given in File S1. None of these additional covariates was significant.
File S1. Methods and results of analysis of somatic variations. (PDF)
File S2. Methods and results of population substructure analysis. (PDF)