Literature DB >> 27729437

Comprehensive Cross-Population Analysis of High-Grade Serous Ovarian Cancer Supports No More Than Three Subtypes.

Gregory P Way^1,2, James Rudd³, Chen Wang⁴, Habib Hamidi⁵, Brooke L Fridley⁶, Gottfried E Konecny⁵, Ellen L Goode⁴, Casey S Greene^7,8, Jennifer A Doherty⁹.

Abstract

Four gene expression subtypes of high-grade serous ovarian cancer (HGSC) have been previously described. In these early studies, a fraction of samples that did not fit well into the four subtype classifications were excluded. Therefore, we sought to systematically determine the concordance of transcriptomic HGSC subtypes across populations without removing any samples. We created a bioinformatics pipeline to independently cluster the five largest mRNA expression datasets using k-means and nonnegative matrix factorization (NMF). We summarized differential expression patterns to compare clusters across studies. While previous studies reported four subtypes, our cross-population comparison does not support four. Because these results contrast with previous reports, we attempted to reproduce analyses performed in those studies. Our results suggest that early results favoring four subtypes may have been driven by the inclusion of serous borderline tumors. In summary, our analysis suggests that either two or three, but not four, gene expression subtypes are most consistent across datasets.

Entities: Chemical Disease Species

Keywords: molecular subtypes; ovarian cancer; reproducibility; unsupervised clustering

Mesh：

Year: 2016 PMID： 27729437 PMCID： PMC5144978 DOI： 10.1534/g3.116.033514

Source DB: PubMed Journal: G3 (Bethesda) ISSN： 2160-1836 Impact factor: 3.154

Invasive ovarian cancer is a heterogeneous disease typically diagnosed at a late stage, with high mortality (Kurman and Shih 2010). The most aggressive and common histologic type is HGSC (Vang ), characterized by extensive copy number variation and TP53 mutation (Cancer Genome Atlas Research Network 2011). Given the genomic complexity of these tumors, mRNA expression can be thought of as a summary measurement of these genomic and epigenetic alterations, to the extent that the alterations influence gene expression in either the cancer or stroma. Four gene expression subtypes with varying components of mesenchymal, proliferative, immunoreactive, and differentiated gene expression signatures have been reported in all studies of HGSC to date (Bonome ; Tothill ; Cancer Genome Atlas Research Network 2011; Tan ; Konecny ). Two of these studies also observed survival differences across subtypes (Tothill ; Konecny ). Tothill ) first identified four HGSC subtypes (as well as two other subtypes that largely included low-grade serous and serous borderline tumors) in an Australian population using k-means clustering. Later, The Cancer Genome Atlas (TCGA) used NMF and also reported four subtypes that were labeled as: “mesenchymal,” “differentiated,” “proliferative,” and “immunoreactive” (Cancer Genome Atlas Research Network 2011). The TCGA group also applied NMF clustering to the Tothill data and observed similar subtypes (Cancer Genome Atlas Research Network 2011). Konecny ) applied NMF to cluster an independent set of HGSC samples and reported four subtypes, which they labeled as C1–C4 (Konecny ). These subtypes were similar to those in the TCGA, but a subtype classifier trained on these subtypes better differentiated survival in their own data, data from TCGA, and Bonome ). Despite the extensive research in the area, work to date has several limitations. In both the TCGA and Tothill studies, ∼8–15% of samples were excluded from analyses. A reanalysis of the TCGA data showed that over 80% of the samples could be assigned to more than one subtype (Verhaak ). In more recent TCGA analyses by the Broad Institute Genome Data Analysis Center (GDAC) Firehose initiative, with the largest number of HGSC cases evaluated to date (n = 569), three subtypes fit the data better than four (Broad Institute TCGA Genome Data Analysis Center 2016a,b). This uncertainty in HGSC subtyping led us to determine if four homogeneous subtypes exist across study populations. Our goal is to rigorously assess the number of HGSC subtypes. We reanalyze data from the five largest independent studies to date (and add an analysis of our own collection of samples) using a standardized bioinformatics pipeline. We apply k-means clustering as well as NMF to each population and do not remove “hard-to-classify” samples, as was done in previous studies (Tothill ; Cancer Genome Atlas Research Network 2011). We perform independent analyses within each dataset and compare subtyping results across studies. We summarize each subtype’s expression patterns using moderated t-score vectors and comprehensively characterize correlations between subtypes across populations. This method contrasts with earlier reanalyses that pooled HGSC datasets together to identify subtypes (Tan ). We sidestep gene expression platform or dataset biases, which could affect clustering if under or overcorrected, by comparing dataset- and subtype-specific summary statistics instead of pooling raw gene expression data. Our cross-population comparative analysis does not support the conclusion that four HGSC subtypes exist; rather, the data more strongly support an interpretation that there are either two or three subtypes. We show that the support for four subtypes observed in TCGA’s reanalysis of the Tothill data (Cancer Genome Atlas Research Network 2011) is lost when serous borderline tumors, which have very different genomic profiles and survival compared to HGSC (Bonome ; Ouellet ), are excluded before clustering. Our work also highlights the impact that a single study can have on the trajectory of subtyping research and suggests the importance of periodic histopathologic review and rigorous reanalysis of existing data for cross-study commonalities.

Materials and Methods

Data inclusion

We applied inclusion criteria as described in detail in the supplemental materials using data from the R package, curatedOvarianData (Ganzfried ), and our own dataset (“Mayo”). A subset of these data has been published previously (GSE53963; Konecny ), but the present dataset (GSE74357) contains 343 more samples (Supplemental Material, Table S1). Briefly, these criteria selected HGSC samples from studies including at least 130 cases assayed on standard microarrays. We included only HGSC and high-grade endometrioid samples [which are molecularly similar to HGSC (Köbel )] as identified by study-specific pathological review. Data from the new Mayo HGSC samples, as well as other samples with mixed histologies and grades, for a total of 528 additional ovarian tumor samples, were deposited in NCBI’s Gene Expression Omnibus (GEO) (Edgar ); these data can be accessed with the accession number GSE74357 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE74357). All study participants provided written informed consent, and this work was approved by the Mayo Clinic and Dartmouth College Institutional Review Boards. After applying the unified inclusion criteria, our final analytic datasets included: TCGA (n = 499) (Cancer Genome Atlas Research Network 2011; Broad Institute TCGA Genome Data Analysis Center 2016a); Mayo (n = 379; GSE74357) (Konecny ); Yoshihara (n = 256; GSE32062.GPL6480) (Yoshihara ); Tothill (n = 242; GSE9891) (Tothill ); and Bonome (n = 185; GSE26712) (Bonome ) (Table 1). We restricted analyses to the 10,930 genes measured successfully in all five populations (Figure S1).

Table 1

Characteristics of the populations included in the five analytic datasets

	TCGA	Mayo	Yoshihara	Tothill	Bonome
GEO		GSE74357	GSE32062	GSE9891	GSE26712
Platform	Affymetrix HGU1133	Agilent 4x44K	Agilent 4x44K	Affymetrix HGU1133	Affymetrix HGU1133
Population	United States	United States	Japan	Australia	United States
Original sample size	578	528	260	285	195
Analytic sample size^a	499	379	256	242	185
Age [Mean (SD)]	60.0 (11.6)	62.9 (11.3)	NR	60.3 (10.3)	61.5 (11.9)
Stage
I	10 (2%)	7 (3%)	0 (0%)	11 (5%)	0 (0%)
II	17 (4%)	11 (3%)	0 (0%)	8 (4%)	0 (0%)
III	351 (80%)	275 (73%)	202 (79%)	178 (83%)	146 (80%)
IV	63 (14%)	86 (23%)	54 (21%)	17 (8%)	36 (20%)
Grade
2	55 (12%)	3 (1%)	130 (51%)	80 (37%)	NR
3	386 (88%)^b	376 (99%)	126 (49%)	134 (63%)	NR
Debulking
Optimal	325 (74%)	287 (76%)	101 (39%)	132 (62%)	89 (49%)
Suboptimal	116 (26%)	87 (23%)	155 (61%)	82 (38%)	93 (51%)

TCGA, The Cancer Genome Atlas; NR, data not reported.

Samples without survival data were excluded in survival analyses.

One sample was labeled as “Grade 4” in TCGA.

TCGA, The Cancer Genome Atlas; NR, data not reported. Samples without survival data were excluded in survival analyses. One sample was labeled as “Grade 4” in TCGA.

Clustering

We performed independent clustering within each dataset to avoid potential biases from different platforms or studies. As detailed in File S1, we identified the 1500 genes with the highest variance from each dataset and used the union of these genes (n = 3698) for clustering. We performed clustering within each dataset using each potential k from 2 to 4 clusters. We performed k-means clustering in each population using the R package “cluster” (version 2.0.1) (Maechler ) with 20 initializations. We repeated these analyses using NMF in the R package “NMF” (version 0.20.5) (Brunet ) with 100 different random initializations for each k. As done in prior studies, we calculated cophenetic correlation coefficients to select appropriate k for each dataset after NMF clustering with 10 consensus runs. The cophenetic correlation identifies appropriate solutions and tends to decrease with increasing k unless a more accurate solution is observed at a larger k.

Identification of analogous clusters within and across studies

We performed significance analysis of microarray (SAM) (Tusher ; Schwender ) analysis on all clusters from each study using all 10,930 genes. This resulted in a cluster-specific moderated t statistic for each of the input genes (Schwender 2012). To summarize the expression patterns of all 10,930 genes for a specific cluster in a specific population, we combined gene-wise moderated t statistics into a vector of length 10,930. We repeated the SAM analysis using only the MAD subset genes and the results were similar. The TCGA subtype labels have become widely used in the field. To generate comparable labels across k and across studies, we mapped our TCGA subtype assignments back to the original TCGA labels to define reference clusters at k = 4 (that is, mesenchymal-like, proliferative-like, etc.). Clusters in other populations that were most strongly correlated with the TCGA clusters were assigned the same label.

Clustering analysis of randomized data

Any clustering procedure is expected to induce strong correlational structure across clusters within a dataset, even if there is no true underlying structure. However, if there is no true underlying structure, clusters across datasets are not expected to be correlated. To assess this, we used the same datasets but shuffled each gene’s expression vector to disrupt the correlative structure. We performed within- and cross-study analyses of cluster identification using this set of data that were parallel to those performed using the nonrandomized data.

Assessing the reproducibility of single-population studies

We compared our sample assignments at k = 2–4 to the four subtypes reported in the Tothill, TCGA, and Konecny publications (Tothill ; Cancer Genome Atlas Research Network 2011; Konecny ). Because the labels that were assigned in TCGA’s reanalysis of the Tothill data were not available, we performed NMF consensus clustering of Tothill’s data without removing low malignant potential (LMP) samples in order to generate labels for comparison.

Data availability

We provide software under a permissive open source license to download the required data and reproduce our analyses (Way ). Analyses were run in a Docker container, allowing the computing environment to be recreated (Boettiger 2015). Our Docker image can be pulled from: https://hub.docker.com/r/gregway/hgsc_subtypes/. This allows interested users to freely download the software, reproduce the analyses, and then build on this work. All data used in this analysis is publicly available including data we generated (accessible under GEO accession GSE74357).

Results

To visually inspect the consistency and distinctness of clusters, we compared sample-by-sample correlation heatmaps. For k = 2–4 within each study, we observed high sample-by-sample correlations within clusters and relatively low sample-by-sample correlations across clusters (Figure S2). Clustering results using NMF were similar to k means results (Figure S3).

Correlation of cluster-specific expression patterns

Across datasets, we observed strong positive correlations of moderated t score vectors between analogous clusters in TCGA, Tothill, Mayo, and Yoshihara (Figure 1 and Table 2). However, clustering of the Bonome data did not correlate strongly with clusters identified in the other datasets (Table 2). We believe that we were unable to assign parallel subtypes in Bonome because of either RNA contamination or inappropriate grading assignments. However, more work is required in order to identify exactly why we were unable to classify. In contrast to our analyses, which independently cluster data from each study, Konecny ) assigned subtypes to the Bonome data by applying a Predictive Analysis of Microarray (PAM) (Tibshirani ) to their own subtypes to define reduced, subtype-specific predictive gene lists. They then assigned Bonome samples based on the highest Spearman correlation against subtype centroids (Konecny ).

Figure 1

Table 2

SAM moderated t score vector Pearson correlations between analogous clusters across populations

	Cluster 1^a	Cluster 2	Cluster 3	Cluster 4
k = 2^a	0.62–0.81	0.62–0.81	NR	NR
k = 3^a	0.77–0.85	0.80–0.90	0.65–0.77	NR
k = 4^a	0.77–0.85	0.83–0.89	0.51–0.76	0.61–0.75
Bonome k = 2^b	−0.08–0.24	−0.08–0.24	NR	NR
Bonome k = 3^b	0.45–0.46	−0.02–0.12	0.22–0.42	NR
Bonome k = 4^b	0.50–0.57	−0.04–0.04	0.13–0.29	0.26–0.43

TCGA, The Cancer Genome Atlas; NR, data not reported.

Correlation ranges for TCGA, Mayo, Yoshihara, and Tothill.

Bonome is removed from gene set analyses because of low correlating clusters.

Significance analysis of microarray (SAM) moderated t score Pearson correlation heatmaps reveal consistency across datasets. (A) Correlations across datasets for k means k = 2. (B) Correlations across datasets for k means k = 3. (C) Correlations across datasets for k means k = 4. TCGA, The Cancer Genome Atlas. TCGA, The Cancer Genome Atlas; NR, data not reported. Correlation ranges for TCGA, Mayo, Yoshihara, and Tothill. Bonome is removed from gene set analyses because of low correlating clusters. To assess our analytical approach, we performed an analysis using randomized data. This showed that within-population correlation structure was induced by clustering, but structure between populations was not (Figure S4). The off-diagonals in this figure are close to, but not exactly, zero. Permutation induces more independent features than in real gene expression data and therefore may produce much lower correlations if structure is present in real data. Comparing Figure 1 with Figure S4, we observed much higher correlation across datasets (Figure 1), which was lost after randomization (Figure S4). For example, for k = 2, the TCGA and Mayo cluster correlations for analogous clusters was high (top left panel in Figure 1). Conversely, the same relationship in randomized data (second row, first column panel in Figure S4) showed correlations near zero. This indicates that the high correlations observed across datasets in Figure 1 are induced by similar underlying structure in the data. Across studies, positive correlations between analogous clusters and negative correlations between nonanalogous clusters were stronger for clusters identified when k = 2 and k = 3 than when k = 4 (Figure 1), with comparable statistical precision (Table S2). These cross-population comparisons suggested that two and three subtypes fit HGSC gene expression data more consistently than the four widely accepted subtypes. Within each population, clusters identified by NMF were similar to those identified using k-means clustering (Figure 2), suggesting that these results were independent of clustering algorithm. With NMF, both positive and negative correlations were stronger for k = 2 and k = 3 than for k = 4. Across k = 3 and k = 4, correlations were strongest for clusters 1 and 2. Sample cluster assignments for both k-means and NMF clusters are provided in Table S3.

Figure 2

Significance analysis of microarray (SAM) moderated t score Pearson correlation heatmaps of clusters formed by k means clustering and NMF clustering reveals consistency across clustering methods. Within dataset results are shown for both methods when setting each algorithm to find 2, 3, and 4 clusters. NMF, nonnegative matrix factorization; TCGA, The Cancer Genome Atlas.

Comparison with previously-identified HGSC clusters

Our clustering results for the Tothill, TCGA, and Mayo datasets were highly concordant with the clustering described in the original publications (Tothill ; Cancer Genome Atlas Research Network 2011; Konecny ), as evidenced by the high degree of consistent overlap in sample assignments to the previously-defined clusters (Table 3). Our cross-study cluster 1 was mostly mapped to the “Mesenchymal” label from TCGA, “C1” from Tothill, and “C4” from Mayo. This cluster was the most stable in our analysis within all datasets, across k = 2, 3, and 4, and across clustering algorithms. Cross-study cluster 2, which was also observed consistently, was most similar to the “Proliferative” label from TCGA, “C5” from Tothill, and “C3” from Mayo. Cross-study cluster 3 for k = 3 was associated with both the “Immunoreactive” and “Differentiated” TCGA labels, “C2” and “C4” in Tothill, and “C1” and “C2” in Mayo. For analyses where k = 4, the third cluster was associated with “Immunoreactive,” “C2,” and “C1,” while the fourth cluster was associated with “Differentiated,” “C4,” and “C2” for TCGA, Tothill, and Mayo, respectively. For additional comparisons see the Supplementary Materials (File S1), which includes survival analyses (Table S4, Figure S9), cluster specific genes (Table S5), and pathway analyses (Table S6).

Table 3

Distributions of sample membership in the clusters identified in our study by the original cluster assignments in the TCGA, Tothill, and Konecny studies

	TCGA					Tothill							Konecny
	Mes	Pro	Imm	Dif	NC	C1	C2	C3	C4	C5	C6	NC	C1	C2	C3	C4	NA
k = 2
Cluster 1	98	7	93	68	21	78	39	1	0	0	0	11	36	21	2	26	114
Cluster 2	1	127	2	60	22	0	5	5	44	35	2	22	6	39	41	0	94
k = 3
Cluster 1	98	2	20	11	6	77	22	0	0	0	0	6	16	13	2	26	82
Cluster 2	1	111	0	11	16	1	0	0	3	35	2	5	0	16	36	0	56
Cluster 3	0	21	75	106	21	0	22	6	41	0	0	22	26	31	5	0	70
k = 4
Cluster 1	97	4	12	12	5	74	0	0	0	0	0	0	7	12	3	25	62
Cluster 2	1	85	0	0	13	1	0	0	1	34	2	5	0	9	31	0	41
Cluster 3	0	5	80	3	12	3	42	0	1	1	0	14	29	6	0	1	57
Cluster 4	1	40	3	113	13	0	2	6	42	0	0	14	6	33	9	0	48

Clusters identified in our study using k-means clustering with k = 2, k = 3, and k = 4. The corresponding labels for the generally similar HGSC gene expression subtypes observed in the TCGA, Tothill, and Konecny studies are, respectively: mesenchymal/C1/C4, proliferative/C5/C3, immunoreactive/C2/C1, and differentiated/C4/C2). TCGA, The Cancer Genome Atlas; Mes, mesenchymal; Pro, proliferative; Imm, immunoreactive; Dif, differentiated; NC = samples not clustered in original publication; NA = samples not assessed at the time of the original publication.

Meta-research into previous HGSC subtyping studies

Each of the publications that only considered high-grade samples (Cancer Genome Atlas Research Network 2011; Konecny ) found clustering coefficients consistent with k = 2, k = 3, and k = 4. Nevertheless, each publication concludes the existence of four subtypes, while our cross-population analysis suggested that two or three clusters fit HGSC data better than four clusters. To compare with previous results, we evaluated the number of subtypes that fit the data best within each study by calculating cophenetic correlation coefficients at k = 2 through k = 8 clusters inclusively. While cophenetic correlations typically decrease with increasing k, if substructure is present in the data, we would expect there to be higher values for the most appropriate number of subtypes. We observed a similar pattern in each population (Figure 3A, Figure S5, Figure S6, and Figure S7) in which the highest cophenetic correlation was reached for two clusters and, based on the heatmaps, appeared to have the highest consensus (also see Figure S8). In every dataset, four clusters were not observed to represent the data better than two or three. The only results in previous studies that contradicted this work were from TCGA’s reanalysis of the Tothill data. According to Figure S6.2 in the TCGA paper, the reanalysis included serous borderline tumors (i.e., tumors with low malignant potential) (n = 18). The inclusion of these tumors in the TCGA HGSC reanalyses was done even though, in the original Tothill paper, the serous borderline tumors had a unique gene expression pattern and clustered entirely in a group labeled “C3.”

Figure 3

Comparing NMF consensus clustering in the Tothill dataset. Data displays consensus clustering for k = 2 to k = 6 for 10 NMF initializations alongside the cophenetic correlation results for k = 2 to k = 8. (A) Tothill dataset (n = 260) with borderline samples (n = 18) not removed prior to clustering. (B) Tothill dataset with borderline samples removed (n = 242). To assess the extent to which serous borderline tumors inclusion drove the TCGA reanalysis results, we reproduced TCGA’s reanalysis of the Tothill dataset, including the serous borderline tumors (n = 18); we indeed observed that the cophenetic correlation is higher for k = 4 than k = 3 (Figure 3A). However, when we appropriately removed these serous borderline tumors, we observed an increase in the k = 3 cophenetic correlation (Figure 3B). The results that support four subtypes were generated during clustering of HGSC and serous borderline tumors combined. Subtyping analyses of HGSC alone reveal less than four subtypes.

Discussion

Although prior studies have reported the existence of four molecular subtypes of HGSC ovarian cancer (Cancer Genome Atlas Research Network 2011; Tothill ; Konecny ; Broad Institute TCGA Genome Data Analysis Center 2016a), our analysis suggests the existence of only two or three subtypes. This conclusion is based on our observation that concordance of analogous subtypes across study populations was stronger for two or three clusters as opposed to four. Previous studies used either k-means or NMF clustering, and because our results contradicted prior work, we performed analyses using both of these methods. Results for each population were similar for the k means and NMF clustering algorithms, suggesting that the clustering algorithm did not drive the observed differences. Because cross-population comparisons suggest that two and three clusters show more consistency than four, we explored within-study heuristics (cophenetic correlation coefficients) that suggested four subtypes in previous research. The cophenetic coefficient measures how precisely a dendrogram retains sample-by-sample pairwise distances and can be used to compare clustering accuracy (Sokal and Rohlf 1962). While both the Konecny and TCGA studies reported four subtypes, in both analyses, k = 2 and k = 3 resulted in higher cophenetic coefficients than k = 4 [Figure 2A in Konecny ) and Figure S6.1 in TCGA (Cancer Genome Atlas Research Network 2011)]. We observed the same patterns in our own reanalysis of TCGA and analysis of the expanded Mayo cohort (Figure S5 and Figure S6). Yoshihara and Tothill did not report cophenetic coefficients, but our analysis of each revealed similar patterns to TCGA and Konecny (Figure 3A, Figure S7, Figure S8, and Figure S9). In the previous literature, the only report that suggested four subtypes represented the data better than three was TCGA’s reanalysis of the Tothill data (Figure S6.2 in their publication); the cophenetic coefficient dropped dramatically at k = 3 before recovering at k = 4 (Cancer Genome Atlas Research Network 2011). Notably, TCGA’s figure legend for this supplemental result indicates that they did not remove serous borderline tumors from the Tothill data. Our analysis of the Tothill data differed from TCGA’s in that we excluded serous borderline tumors, and instead supports the existence of two or three subtypes. To evaluate the influence of these serous borderline tumors in the Tothill data, we repeated our analyses including serous borderline tumors, and observed a drop in the cophenetic coefficient for k = 3 relative to k = 4 (Figure 3). This suggests that the four subtypes observed in TCGA’s analysis of the Tothill data may be due, in part, to the inclusion of serous borderline tumors. There are several limitations to note in the HGSC data we analyzed. Given the intratumor heterogeneity that is likely to exist (Blagden 2015), our approach would be strengthened by having data on multiple areas of the tumors. Additionally, since histology and grade classification have changed over time (Silverberg 2000; Soslow 2008), it is unclear whether the populations we studied used comparable guidelines to determine histology and grade. We attempted to exclude all low-grade serous and low-grade endometrioid samples because they often have very different gene expression patterns and more favorable survival compared to their higher-grade counterparts (Vang ). It is unclear why the Bonome clusters did not correspond to the clusters observed in other populations. Lack of consistency could result from unreported biological differences. In summary, our study demonstrates that two clusters of HGSC, “mesenchymal-like” and “proliferative-like,” are clearly and consistently identified within and between populations. This suggests that there are two reproducible HGSC subtypes that are either etiologically distinct, or acquire phenotypically determinant alterations through their development. Our study also suggests that the previously described “immunoreactive-like” and “differentiated-like” subtypes appear to be more variable across populations, and tend to be collapsed into a single category when three subtypes are specified. These may represent, for example, steps along an immunoreactive continuum or could represent the basis of a third, but more variable, subtype. Understanding the underlying biology of the robust, well-defined “mesenchymal-like” and “proliferative-like” subtypes universally observed across populations could lead to targeted treatments that might influence survival. More work needs to be done to determine whether the heterogeneous samples that do not fall into one of these clear groups can be classified into homogeneous subtypes using other characteristics such as methylation markers or a combination of genomic measures. Our analysis reveals the importance of critically reassessing molecular subtypes across multiple large study populations using parallel analyses and consistent inclusion criteria. New systematic approaches hold promise for the implementation of such analyses (Celik ; Planey and Gevaert 2016). Our results underscore the importance of ovarian cancer histopathology, contradict the four HGSC subtype hypothesis, and suggest that there may be fewer HGSC molecular subtypes with variable immunoreactivity and stromal infiltration.

22 in total

1. Metagenes and molecular pattern discovery using matrix factorization.

Authors: Jean-Philippe Brunet; Pablo Tamayo; Todd R Golub; Jill P Mesirov
Journal: Proc Natl Acad Sci U S A Date: 2004-03-11 Impact factor: 11.205

2. Biomarker-based ovarian carcinoma typing: a histologic investigation in the ovarian tumor tissue analysis consortium.

Authors: Martin Köbel; Steve E Kalloger; Sandra Lee; Máire A Duggan; Linda E Kelemen; Leah Prentice; Kimberly R Kalli; Brooke L Fridley; Daniel W Visscher; Gary L Keeney; Robert A Vierkant; Julie M Cunningham; Christine Chow; Roberta B Ness; Kirsten Moysich; Robert Edwards; Francesmary Modugno; Clareann Bunker; Eva L Wozniak; Elizabeth Benjamin; Simon A Gayther; Aleksandra Gentry-Maharaj; Usha Menon; C Blake Gilks; David G Huntsman; Susan J Ramus; Ellen L Goode
Journal: Cancer Epidemiol Biomarkers Prev Date: 2013-07-23 Impact factor: 4.254

3. The origin and pathogenesis of epithelial ovarian cancer: a proposed unifying theory.

Authors: Robert J Kurman; Ie-Ming Shih
Journal: Am J Surg Pathol Date: 2010-03 Impact factor: 6.394

4. Prognostic and therapeutic relevance of molecular subtypes in high-grade serous ovarian cancer.

Authors: Gottfried E Konecny; Chen Wang; Habib Hamidi; Boris Winterhoff; Kimberly R Kalli; Judy Dering; Charles Ginther; Hsiao-Wang Chen; Sean Dowdy; William Cliby; Bobbie Gostout; Karl C Podratz; Gary Keeney; He-Jing Wang; Lynn C Hartmann; Dennis J Slamon; Ellen L Goode
Journal: J Natl Cancer Inst Date: 2014-09-30 Impact factor: 13.506

5. Diagnosis of multiple cancer types by shrunken centroids of gene expression.

Authors: Robert Tibshirani; Trevor Hastie; Balasubramanian Narasimhan; Gilbert Chu
Journal: Proc Natl Acad Sci U S A Date: 2002-05-14 Impact factor: 11.205

6. Prognostically relevant gene signatures of high-grade serous ovarian carcinoma.

Authors: Roel G W Verhaak; Pablo Tamayo; Ji-Yeon Yang; Diana Hubbard; Hailei Zhang; Chad J Creighton; Sian Fereday; Michael Lawrence; Scott L Carter; Craig H Mermel; Aleksandar D Kostic; Dariush Etemadmoghadam; Gordon Saksena; Kristian Cibulskis; Sekhar Duraisamy; Keren Levanon; Carrie Sougnez; Aviad Tsherniak; Sebastian Gomez; Robert Onofrio; Stacey Gabriel; Lynda Chin; Nianxiang Zhang; Paul T Spellman; Yiqun Zhang; Rehan Akbani; Katherine A Hoadley; Ari Kahn; Martin Köbel; David Huntsman; Robert A Soslow; Anna Defazio; Michael J Birrer; Joe W Gray; John N Weinstein; David D Bowtell; Ronny Drapkin; Jill P Mesirov; Gad Getz; Douglas A Levine; Matthew Meyerson
Journal: J Clin Invest Date: 2012-12-21 Impact factor: 14.808

7. A gene signature predicting for survival in suboptimally debulked patients with ovarian cancer.

Authors: Tomas Bonome; Douglas A Levine; Joanna Shih; Mike Randonovich; Cindy A Pise-Masison; Faina Bogomolniy; Laurent Ozbun; John Brady; J Carl Barrett; Jeff Boyd; Michael J Birrer
Journal: Cancer Res Date: 2008-07-01 Impact factor: 12.701

8. CoINcIDE: A framework for discovery of patient subtypes across multiple datasets.

Authors: Catherine R Planey; Olivier Gevaert
Journal: Genome Med Date: 2016-03-09 Impact factor: 11.117

9. curatedOvarianData: clinically annotated data for the ovarian cancer transcriptome.

Authors: Benjamin Frederick Ganzfried; Markus Riester; Benjamin Haibe-Kains; Thomas Risch; Svitlana Tyekucheva; Ina Jazic; Xin Victoria Wang; Mahnaz Ahmadifar; Michael J Birrer; Giovanni Parmigiani; Curtis Huttenhower; Levi Waldron
Journal: Database (Oxford) Date: 2013-04-02 Impact factor: 3.451

10. Functional genomics identifies five distinct molecular subtypes with clinical relevance and pathways for growth control in epithelial ovarian cancer.

Authors: Tuan Zea Tan; Qing Hao Miow; Ruby Yun-Ju Huang; Meng Kang Wong; Jieru Ye; Jieying Amelia Lau; Meng Chu Wu; Luqman Hakim Bin Abdul Hadi; Richie Soong; Mahesh Choolani; Ben Davidson; Jahn M Nesland; Ling-Zhi Wang; Noriomi Matsumura; Masaki Mandai; Ikuo Konishi; Boon-Cher Goh; Jeffrey T Chang; Jean Paul Thiery; Seiichi Mori
Journal: EMBO Mol Med Date: 2013-05-13 Impact factor: 12.137

12 in total

1. Pooled Clustering of High-Grade Serous Ovarian Cancer Gene Expression Leads to Novel Consensus Subtypes Associated with Survival and Surgical Outcomes.

Authors: Chen Wang; Sebastian M Armasu; Kimberly R Kalli; Matthew J Maurer; Ethan P Heinzen; Gary L Keeney; William A Cliby; Ann L Oberg; Scott H Kaufmann; Ellen L Goode
Journal: Clin Cancer Res Date: 2017-03-09 Impact factor: 12.531

2. Molecular Classification of Epithelial Ovarian Cancer Based on Methylation Profiling: Evidence for Survival Heterogeneity.

Authors: Clara Bodelon; J Keith Killian; Joshua N Sampson; William F Anderson; Rayna Matsuno; Louise A Brinton; Jolanta Lissowska; Michael S Anglesio; David D L Bowtell; Jennifer A Doherty; Susan J Ramus; Aline Talhouk; Mark E Sherman; Nicolas Wentzensen
Journal: Clin Cancer Res Date: 2019-05-29 Impact factor: 12.531

3. Extracting a biologically relevant latent space from cancer transcriptomes with variational autoencoders.

Authors: Gregory P Way; Casey S Greene
Journal: Pac Symp Biocomput Date: 2018

4. Functional network community detection can disaggregate and filter multiple underlying pathways in enrichment analyses.

Authors: Lia X Harrington; Gregory P Way; Jennifer A Doherty; Casey S Greene
Journal: Pac Symp Biocomput Date: 2018

5. Consensus on Molecular Subtypes of High-Grade Serous Ovarian Carcinoma.

Authors: Gregory M Chen; Lavanya Kannan; Ludwig Geistlinger; Victor Kofia; Zhaleh Safikhani; Deena M A Gendoo; Giovanni Parmigiani; Michael Birrer; Benjamin Haibe-Kains; Levi Waldron
Journal: Clin Cancer Res Date: 2018-07-03 Impact factor: 12.531

6. Challenges and Opportunities in Studying the Epidemiology of Ovarian Cancer Subtypes.

Authors: Jennifer Anne Doherty; Lauren Cole Peres; Chen Wang; Gregory P Way; Casey S Greene; Joellen M Schildkraut
Journal: Curr Epidemiol Rep Date: 2017-07-10

7. The need for rapid therapeutic efficacy testing for cancer therapy.

Authors: David G Kaufman
Journal: Exp Mol Pathol Date: 2020-01-23 Impact factor: 4.401

8. Development and Validation of the Gene Expression Predictor of High-grade Serous Ovarian Carcinoma Molecular SubTYPE (PrOTYPE).

Authors: Aline Talhouk; Joshy George; Chen Wang; Ellen L Goode; Susan J Ramus; Jennifer A Doherty; David D Bowtell; Michael S Anglesio; Timothy Budden; Tuan Zea Tan; Derek S Chiu; Stefan Kommoss; Huei San Leong; Stephanie Chen; Maria P Intermaggio; Blake Gilks; Tayyebeh M Nazeran; Mila Volchek; Wafaa Elatre; Rex C Bentley; Janine Senz; Amy Lum; Veronica Chow; Hanwei Sudderuddin; Robertson Mackenzie; Samuel C Y Leong; Geyi Liu; Dustin Johnson; Billy Chen; Aocs Group; Jennifer Alsop; Susana N Banerjee; Sabine Behrens; Clara Bodelon; Alison H Brand; Louise Brinton; Michael E Carney; Yoke-Eng Chiew; Kara L Cushing-Haugen; Cezary Cybulski; Darren Ennis; Sian Fereday; Renée T Fortner; Jesús García-Donas; Aleksandra Gentry-Maharaj; Rosalind Glasspool; Teodora Goranova; Casey S Greene; Paul Haluska; Holly R Harris; Joy Hendley; Brenda Y Hernandez; Esther Herpel; Mercedes Jimenez-Linan; Chloe Karpinskyj; Scott H Kaufmann; Gary L Keeney; Catherine J Kennedy; Martin Köbel; Jennifer M Koziak; Melissa C Larson; Jenny Lester; Liz-Anne Lewsley; Jolanta Lissowska; Jan Lubiński; Hugh Luk; Geoff Macintyre; Sven Mahner; Iain A McNeish; Janusz Menkiszak; Nikilyn Nevins; Ana Osorio; Oleg Oszurek; José Palacios; Samantha Hinsley; Celeste L Pearce; Malcolm C Pike; Anna M Piskorz; Isabelle Ray-Coquard; Valerie Rhenius; Cristina Rodriguez-Antona; Raghwa Sharma; Mark E Sherman; Dilrini De Silva; Naveena Singh; Peter Sinn; Dennis Slamon; Honglin Song; Helen Steed; Euan A Stronach; Pamela J Thompson; Aleksandra Tołoczko; Britton Trabert; Nadia Traficante; Chiu-Chen Tseng; Martin Widschwendter; Lynne R Wilkens; Stacey J Winham; Boris Winterhoff; Alicia Beeghly-Fadiel; Javier Benitez; Andrew Berchuck; James D Brenton; Robert Brown; Jenny Chang-Claude; Georgia Chenevix-Trench; Anna deFazio; Peter A Fasching; María J García; Simon A Gayther; Marc T Goodman; Jacek Gronwald; Michelle J Henderson; Beth Y Karlan; Linda E Kelemen; Usha Menon; Sandra Orsulic; Paul D P Pharoah; Nicolas Wentzensen; Anna H Wu; Joellen M Schildkraut; Mary Anne Rossing; Gottfried E Konecny; David G Huntsman; Ruby Yun-Ju Huang
Journal: Clin Cancer Res Date: 2020-06-17 Impact factor: 13.801

9. Assessment of data transformations for model-based clustering of RNA-Seq data.

Authors: Janelle R Noel-MacDonnell; Joseph Usset; Ellen L Goode; Brooke L Fridley
Journal: PLoS One Date: 2018-02-27 Impact factor: 3.240

10. High-grade serous tubo-ovarian cancer refined with single-cell RNA sequencing: specific cell subtypes influence survival and determine molecular subtype classification.

Authors: Siel Olbrecht; Pieter Busschaert; Junbin Qian; Adriaan Vanderstichele; Liselore Loverix; Toon Van Gorp; Els Van Nieuwenhuysen; Sileny Han; Annick Van den Broeck; An Coosemans; Anne-Sophie Van Rompuy; Diether Lambrechts; Ignace Vergote
Journal: Genome Med Date: 2021-07-09 Impact factor: 11.117