Comparative transcriptome analysis between patient and endometrial cancer cell lines to determine common signaling pathways and markers linked to cancer progression.

Madelaine J Cho-Clark1, Gauthaman Sukumar2, Newton Medeiros Vidal3, Sorana Raiciulescu4, Mario G Oyola1, Cara Olsen4, Leonardo Mariño-Ramírez5, Clifton L Dalgard2,6, T John Wu1.   

Abstract

The rising incidence and mortality of endometrial cancer (EC) in the United States call for an improved understanding of the disease's progression. Current methodologies for diagnosis and treatment rely on the use of cell lines as models for tumor biology. However, due to inherent heterogeneity and differing growth environments between cell lines and tumors, comparative studies have found few parallels in molecular signatures. As a consequence, the development and discovery of preclinical models and reliable drug targets are delayed. In this study, we established transcriptome parallels between cell lines and tumors from The Cancer Genome Atlas (TCGA) with the use of optimized normalization methods. We identified genes and signaling pathways associated with regulating the transformation and progression of EC. Specifically, the LXR/RXR activation, neuroprotective role for THOP1 in Alzheimer's disease, and glutamate receptor signaling pathways were observed to be mostly downregulated in advanced cancer stage. While some of these highlighted markers and signaling pathways are commonly found in the central nervous system (CNS), our results suggest a novel function of these genes in the periphery. Finally, our study underscores the value of implementing appropriate normalization methods in comparative studies to improve the identification of accurate and reliable markers.

Copyright: © 2021 Cho-Clark et al.

Keywords:  cancer stage; comparative transcriptome analysis; endometrial cancer; normalization; signaling pathways

Year:  2021        PMID: 34966482      PMCID: PMC8711572          DOI: 10.18632/oncotarget.28161

Source DB:  PubMed          Journal:  Oncotarget        ISSN: 1949-2553


INTRODUCTION

Endometrial cancer (EC) is a common gynecologic malignancy in the United States with an estimated 66,570 new cases and 12,940 deaths in 2021 [1]. Although EC historically presents more commonly among older women, it is the only gynecologic cancer with increasing incidence at an earlier age of onset and a concomitant rise in mortality rate [2-5]. Current staging approaches for tumors are important in assessing size, spread, prognosis, and treatment of the disease. Reports that analyzed the Surveillance, Epidemiology, and End Results (SEER) database suggest that early tumor staging (stage I and stage II) is correlated with better prognosis and a higher 5-year overall survival rate (OS) in comparison to advanced stages (stage III and stage IV) [1]. The survival rates drop dramatically from 96% to 18–70%, respectively [1, 6]. Although the OS is relatively high with early stage detection, the vast majority of late stage EC exhibits a dramatic decline in survival due to lowered responses to radiation, hormone, and non-hormone based treatments [7]. However, staging can be inaccurate and limited in predicting responses to therapy, as some of the early stage lesions display aggressive metastatic behavior, tumor heterogeneity, ambiguous histology, and overlapping molecular characteristics [8-16]. Over the past few decades, cell lines have been frequently used as models to understand cancer biology in tumors [17-20]. With the advent of various platforms and bank centers for next-generation sequencing, large sets of molecular profiles are available for comparison between tumor samples and cell lines [21-23]. Cell lines with maximal molecular similarity to tumors can be useful in identifying targets and signaling mechanisms necessary for drug development [17].
However, a majority of these comparative molecular profiling studies between cell lines and tumors in EC have reported their findings based on integrated genomic characterization (e.g., copy number alterations, polymerase epsilon (POLE) ultramutations, microsatellite instability), whereas transcriptomic analyses related to stage advancement remain scarce [24-33]. This is a limitation of targeted genomic approaches due to lack of substantiation at the level of expression and function. Therefore, bridging the gap between gene mutations or alterations and transcriptional activity between cancer stages can provide a more comprehensive insight into the processes involved in EC progression. Here we present a comparative transcriptome analysis between early and advanced stage endometrial carcinomas in cell lines and patient tumor samples from the TCGA database. Initially, we ascertained whether there are overall transcriptome parallels between cell lines and tumors. Once similarities were established, we identified signaling pathways and potential molecular markers that define changes in progression between early and advanced stage EC. These molecular insights into tumor classification and progression may have a direct effect on providing a more accurate stage classification for EC.

RESULTS

Removal of unwanted variation in RNA-seq data

Preliminary findings using scatter plots with linear regression analysis of the overall transcriptome for TCGA patients vs. cell lines indicate a global shift toward higher overall expression for TCGA patients at each stage, with R² < 0.46 (Supplementary Table 1; Supplementary Figure 1, left panel). This bias between data sets suggests the existence of unwanted technical effects and the need for a more effective normalization procedure. For this purpose, the library size and removal of unwanted variation by control genes (RUVg) normalization methods were compared using relative log expression (RLE) and principal component analysis (PCA). As seen previously in our findings, normalization of read counts using library size proved unsatisfactory. The RLE boxplots display distributional differences and excessive variability between samples (Figure 1A, left panel). In contrast, the RUVg normalization method shifted the distributions of read counts across all samples towards 0. Furthermore, an attenuation of expression magnitude suggests improved resilience against outliers (Figure 1A, middle and right panel). This in turn led to a more robust differential expression result downstream (Supplementary Table 2). Due to the increase in statistical sensitivity with increasing k value (k = 10), a slight reduction in the number of differentially expressed genes (DEGs), approximately 11–13%, was also observed.
Figure 1

Relative log expression (RLE) and principal component analysis (PCA) plots of overall transcriptome profiles of TCGA tumor samples and endometrial cancer cell lines.

(A) The RLE boxplot distributions of datasets normalized using RUVg k = 3 or k = 10 show log counts centered more closely around zero, demonstrating lowered magnitude in variability and higher resilience toward outliers between tumor samples and cell lines. (B) PCA plot axes represent major sources of variation based on gene profiles in the first two dimensions, PC1 and PC2 (centered, log scale). Scatter plots indicate that normalization by RUVg methods leads to better clustering between TCGA tumor samples and cell lines. Normalization with library size, as seen by a distinct separation in scatter plots between TCGA patients and cell lines, suggests similarities in expression are more dependent on sample type.

The PCA modeling of overall transcriptome expression offers a global assessment of similarity between samples. The values of the first two principal components, PC1 (44.26, 14.43, and 8.76%) and PC2 (19.82, 8.40, and 7.54%), for the library size, RUVg k = 3, and RUVg k = 10 normalization methods, respectively, demonstrated a reduction in variation in expression between samples when using the RUVg method (Figure 1B). Furthermore, with normalization by library size, a biological divergence inferred from differential clustering between TCGA patient tumors and cell line samples was also observed (Figure 1B, left panel). In contrast, the RUVg normalization method eliminated this separation and produced scatter plots clustering towards the center, indicating greater similarity in overall transcriptome profile between samples (Figure 1B, middle and right panel).

Differential gene expression analysis

With the exception of stage I and stage II comparison (Figure 2A, left panel), the hierarchical heat map of the tumor-derived endometrial cancer cell lines and their respective TCGA patient samples clustered together in a stage dependent manner (Figure 2A, middle and right panel). This demonstrates that changes in expression between early and late stage EC in both tumors and cell lines are dependent on staging and not dependent on sample type. The number of DEGs (FDR < 0.05) between each stage comparisons are summarized in Supplementary Table 2.
Figure 2

Differential expression analysis of transcriptomes in TCGA tumor samples and endometrial cancer cell lines.

(A) Hierarchical clustering analysis of all DEGs between TCGA tumors and cell lines indicates that early stage comparisons show a higher degree of clustering between sample types. However, comparisons between an early and a later cancer stage demonstrate clustering between stages, suggesting clear distinctive expression differences that are stage dependent. All DEGs shown are significant (p value < 0.05) (top panel). (B) Respective volcano plots and bar charts highlighting up- or downregulated DEGs (red dots; p < 0.05 and |log2 fold change (FC)| ≥ 1) suggest downregulation of DEGs with advanced cancer stage. Other categories: non-significant (NS, grey); |log2 FC| ≥ 1 (logFC, green); p < 0.05 (p-value, blue) (bottom panel).

In order to better understand genes that may be driving stage progression in EC, we identified a set of genes for each stage comparison by merging cell lines and TCGA patients according to stage and using volcano plots that highlight DEGs that are up- or downregulated (p < 0.05 and |log2 fold change (FC)| ≥ 1). For the stage I vs. stage II comparison, we observed a differential expression signature of 3,414 genes (2,182 up-regulated and 1,232 down-regulated genes; Figure 2B, left panel). In the stage II vs. stage III comparison, we observed 3,369 DEGs (887 up-regulated and 2,482 down-regulated genes; Figure 2B, middle panel). In the stage I vs. stage III comparison, we observed 2,070 DEGs (733 up-regulated and 1,337 down-regulated genes; Figure 2B, right panel). The shift from up- to downregulation in global expression from early to late cancer stage suggests a possible divergent mode of action. Gene sets correlated to each stage comparison and expression directionality are described in Supplementary Table 3.

Ingenuity pathway analysis (IPA) and gene set enrichment analysis

In order to determine signaling pathways that are involved in cancer progression, DEGs from each stage comparison in Figure 2B were used as input for the IPA Core analysis. We identified the top five signaling pathways (p-value < 0.05) for each comparison with their respective numbers of genes that were up- or downregulated (Figure 3A). Out of the fifteen pathways identified, a third appear to be conserved: (1) liver X receptor/retinoid X receptor (LXR/RXR) activation, (2) neuroprotective role of THOP1 gene in Alzheimer’s disease, (3) glutamate receptor signaling, (4) nNOS signaling in skeletal muscle cells, and (5) calcium signaling pathways. The majority of these signaling pathways, specifically in the later stage comparisons, appear to be downregulated, with a negative z-score indicating an expression direction that diverges from the relationship expected from the Ingenuity Pathway Knowledge Base (IPKB) (Figure 3B). All other significant pathways are listed in Supplementary Table 4. All genes identified under each signaling pathway are described in Supplementary Table 5.
Figure 3

Top signaling pathways and enriched gene sets associated to stage comparisons.

(A) Top five signaling pathways that are altered between stages. (B) Respective z-scores; color intensity reflects the match between the expected relationship direction (gene expression from knowledge base) and observed gene expression. (C) Venn diagram identifying DEGs that are unique or commonly regulated across or between all stages. (D) GO analysis of all dysregulated gene sets (FDR < 0.05).

Further insights into the biological relevance and mechanism of these expression profiles were obtained using a Venn diagram with subsequent gene set enrichment analysis (GSEA) (FDR < 0.05) to identify gene sets within each zone of the Venn diagram. Across all stage comparisons, 241 genes are identified as conserved (intersect), suggesting dynamic expression changes in that set (Figure 3C; Supplementary Table 6). The identification of the top three significantly enriched gene ontology (GO) gene sets with each stage comparison demonstrates enrichment for regulation of transport, extracellular space, intrinsic component of plasma membrane, cell projection, ion transport, neuron part, neurogenesis, and regulation of membrane potential (Figure 3D). All other gene sets in each of the seven zones are identified in Supplementary Table 6.
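The seven zones referred to above follow from three pairwise stage comparisons: each zone holds the DEGs found in one particular combination of comparisons and in no other. The following is a minimal Python sketch of that partition, not the authors' code (the study used the VennDiagram package in R). The comparison labels and gene identifiers are hypothetical placeholders.

```python
from itertools import combinations


def venn_zones(named_sets):
    """Partition the union of several named sets into exclusive Venn zones.

    Each zone is keyed by the tuple of set names that contain its members;
    members of a zone belong to exactly those sets and to no others.
    """
    names = list(named_sets)
    zones = {}
    for size in range(1, len(names) + 1):
        for members in combinations(names, size):
            # Elements shared by every set in this combination ...
            inside = set.intersection(*(named_sets[n] for n in members))
            # ... minus anything that also occurs in a set outside it.
            outside = set().union(
                *(named_sets[n] for n in names if n not in members)
            )
            zones[members] = inside - outside
    return zones


# Hypothetical DEG lists for the three stage comparisons (placeholder IDs).
deg_sets = {
    "I_vs_II": {"g1", "g2", "g3", "g4"},
    "II_vs_III": {"g1", "g3", "g5"},
    "I_vs_III": {"g1", "g4", "g6"},
}

zones = venn_zones(deg_sets)
for members, genes in zones.items():
    print(" & ".join(members), sorted(genes))
```

With three input sets the function returns 2³ − 1 = 7 zones; the zone keyed by all three names corresponds to the conserved (intersect) set.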

DISCUSSION

In this study, we established transcriptome similarities between the cell lines used in this study and patient tumor samples from TCGA database. This comparison allowed us to identify signaling pathways and gene sets that are dysregulated in both datasets. Most notably, an altered expression pattern in neuronal related signaling pathways and markers was observed between early and advanced histological stages. This expression pattern indicates a novel function of these genes in the periphery with a potential role in regulating the transformation and progression of tumors in EC. Identifying mutually dysregulated biomarkers and signaling pathways in cell lines and tumors can advantageously provide a more expedient method for studying mechanisms in cancer biology. However, these comparative type studies have proven to be relatively unsuccessful due to the high degree of variability between datasets [18, 20, 23]. Furthermore, inherent variation in tumor collection, processing, and storage between specimens can add an additional layer of variability within the tissue data set [34]. The current standard practice of RNA-seq normalization by library size may be inadequate for complex data sets involving varied samples, platforms, library kits, sequencing depth, and users [35]. Previous efforts have demonstrated that unaddressed ‘latent-hidden variables’ during the normalization process can introduce unwanted expression heterogeneity and inadvertent biases [36-38]. These artifacts can subsequently deviate differential expression (DE) analysis downstream and generate higher false positive rates with reduced detection power for true differences [38]. In our study, we determined that normalization by library size was insufficient in removing the high degree of variability to justify a comparative analysis between the two data sets. 
A preliminary linear regression analysis between our cell lines and TCGA patient tumors, normalized solely by library size, displayed a global shift toward higher raw count values for TCGA tumor data in all intra-stage comparisons (Supplementary Figure 1; left panel). These augmented raw count values clearly demonstrated the need for a better normalization method to ensure unbiased expression levels for all subsequent analyses. We therefore utilized another normalization method to mitigate possible innate biases in our study. The RUVg method is fundamentally a modified version of RUV-2 that adjusts for technical effects as described previously [39-41]. It performs a factor analysis on counts for a set of negative control genes that are not affected by the biological covariates of interest but are affected by the factors of unwanted ‘technical’ variation (technical effects are independent of biological conditions of interest). Other researchers who have employed the RUV normalization methods have successfully removed latent variables in bulk RNA-seq experiments [42-47]. In this study, expression variances and overall expression similarities between the two data sets, as demonstrated in the RLE and PCA plots, were markedly improved using RUVg normalization (Figure 1). Furthermore, subsequent DE analysis between stages yielded a much higher number of DEGs (Supplementary Table 2). This increase in sensitivity may be more reflective of the genes regulating changes that may be lost during standard library normalization procedures. Establishing similarity in overall transcriptome between cell lines and tumors in the TCGA database using the RUVg normalization method allowed us to further investigate signaling pathways and enriched gene sets involved in cancer progression.
Expression clustering analysis of DEGs in both instances, stage I and stage II, demonstrated better grouping by stage when compared to advanced stage III, suggesting that the expression profile in advanced stage EC is clearly distinct from that of early stages (Figure 2A). Furthermore, in contrast to an elevated number of upregulated genes between the early EC stages, a vast majority of genes exhibited downregulation by advanced stage III (Figure 2B). Likewise, prior studies have reported gene downregulation in advanced cancer stages as a consequence of heightened negative feedback in order to maintain network stability despite environmental and genetic stress [48, 49]. Another possibility, as demonstrated in a four-stage model study, is that the malignant transformation of tumors may be driven by the downregulation of genes to promote dedifferentiation [50]. In addition to distinguishing the expression pattern of genes between stages in our study, we also identified the top five signaling pathways that may be involved in cancer progression. Three of those pathways, specifically the LXR/RXR activation, neuroprotective role for THOP1 in Alzheimer’s disease, and glutamate receptor signaling pathways, shift from being up- to mostly downregulated on approaching advanced cancer stage III. Biological systems rely heavily on negative feedback mechanisms to maintain homeostasis. In signal transduction pathways, feedback inhibition is integral to dampening overactivation of signaling output in response to external stimuli or growth factors; however, in tumor cells this feedback inhibition is dysregulated by constitutively activated oncoproteins. Moreover, mutational occurrences resulting in the attenuation of negative feedback loops during cancer development may be a critical step in the transformation of tumors to a more aggressive metastatic phenotype [48, 49].
The LXR/RXR pathway is known for regulating cholesterol, glucose, and fatty acid metabolism in a tissue-dependent manner. In addition to its role in metabolism, this pathway is correlated to carcinogenesis [51-58]. Currently, researchers have correlated obesity and elevated cholesterol levels as a major risk factor for malignancy in EC. A study using a luciferase reporter gene system demonstrated that the cholesterol metabolite, 27-hydroxycholesterol (27HC), functions as an agonist for LXR. Stimulation of LXR resulted in the increase of LXR response element (LXRE) transcriptional activity to augment cell proliferation in the Ishikawa cell line [59]. Another study, in the endometrium of ovariectomized C57BL/6 mice given subcutaneous 17-β estradiol (E2) treatment and a high-fat diet (HFD), displayed a divergent action. Their IPA analysis demonstrated a decrease in the LXR/RXR signaling pathway whereas the NF-κB pathway was elevated [60]. Taken together, our findings suggest that an altered LXR/RXR signaling pathway may contribute to the progression of EC and that the markers identified in this paper should be further investigated.
THOP1 encodes zinc metalloendopeptidase EC 3.4.24.15 (EP24.15, thimet oligopeptidase), a neuropeptide processing enzyme that is central to the formation and degradation of many bioactive peptides. EP24.15 is also expressed in the periphery and hydrolyzes the neuropeptide gonadotropin-releasing hormone (GnRH) to yield a biologically active metabolite, GnRH-(1–5) [61-66]. The decapeptide GnRH and GnRH analogs have been demonstrated to exert an anti-tumorigenic effect in EC [67-70]. However, its metabolite, GnRH-(1–5), displays a divergent mechanism of action and biological behavior from its parental peptide [71-73]. In earlier studies, GnRH-(1–5) mediated changes in GnRH-II expression and increased cell proliferation in the Ishikawa cell line [72, 73].
The mechanism for driving cell proliferation and enhanced migration by GnRH-(1–5) is through its ability to stimulate the release of epidermal growth factor (EGF) through G protein-coupled receptor 101 (GPR101) to activate the EGF receptor (EGFR) signaling pathway [74]. A subsequent study identified EGF release and increased cellular invasion to be dependent on matrix metallopeptidase (MMP)-9 activity, suggesting the possibility of its role in increasing cellular metastatic potential [75]. Future studies should address the relationship between increased THOP1 expression and enzymatic activity and all its related markers identified in this paper to ascertain its role in driving cancer progression.
Glutamate is a primary excitatory neurotransmitter in the CNS. It has also been implicated in exerting proliferative effects on peripheral tumors through its behavior as a growth factor and subsequent activation of known oncogenic signaling pathways [76-79]. Recent studies have suggested that the altered expression of specific glutamate receptor subunits in cancer cells may regulate DNA repair and intracellular signaling. As a consequence, the stimulation of angiogenesis and cell proliferation leads to the promotion of malignant phenotype and metastatic potential [76, 77, 79]. While these observations are intriguing, the biological function or purpose of the shift in expression of genes identified between stages remains elusive and warrants further investigation.
In addition to signaling pathways, we also identified enriched gene sets to define stage-dependent changes correlated to biological relevance. As seen previously with the signal transduction analyses, enriched gene sets involving neurogenesis and neuron part emerge several times between stage comparisons and may be a potential driver for metastatic transition across all cancer stages (Figure 3C and 3D).
Abnormal neuronal growth and innervation within the endometrium have been correlated to infertility, uterine dysfunction, and endometriosis [80-86]. Previous studies have noted associations between the nervous system and cancer by implicating nerves as having an important role in tumor growth, invasion, and metastasis [87, 88]. Autonomic nerves, specifically the sympathetic nerves, demonstrated a significant role in progression of prostate, gastric, and breast cancers by regulating the cancer microenvironment and immune checkpoints [89-91]. Therefore, targeting cancer neurogenesis with corresponding neuronal markers with possible autocrine function may be a promising development in new cancer treatment.
In conclusion, the conventional method of staging classification to define patient groups and standardize management has been limited due to inconsistencies in tumor behavior, heterogeneity, ambiguous histology, and overlapping molecular characteristics. Here we demonstrate that, with the appropriate normalization, we were able to correlate progression in histological staging with transcriptomics that is conserved in both cell lines and TCGA patient tumor sets. The signaling pathways and markers identified in this paper may possibly be used to define and distinguish molecular changes between stages. We demonstrate a substantial down-regulation of genes between early and advanced stage tumors with an altered expression pattern of neuronal signaling pathways and markers. These findings may serve as a novel and promising development in the cancer field, as these neuronal markers may have a different role and function in the periphery.

MATERIALS AND METHODS

Cell culture

The human endometrial adenocarcinoma cell line, the Ishikawa cell line [92], was obtained from American Type Culture Collection (ATCC) (Manassas, VA) [93]. The primary tumor-derived endometrial adenocarcinoma cell lines originated from patients with Stage IC Grade 3 (ACI-181), Stage IIB Grade 2 (ACI-52), and Stage IIIC Grade 2 (ACI-80) International Federation of Gynecology and Obstetrics (FIGO) staging (gift from Dr. Risinger, Department of Obstetrics, Gynecology and Reproductive Biology, Michigan State University, Grand Rapids, MI 49503, USA) (Supplementary Table 1). All cell lines used in this study are identified as having endometrioid histologic characteristics and were grown and maintained as previously described [74, 75]. In brief, cells were grown in phenol red-free DMEM (Cellgro-Mediatech, Inc., Manassas, VA, USA) supplemented with 10% FBS (Atlanta Biologicals, Lawrenceville, GA, USA) and 2 mM L-Glutamine (Quality Biological Inc., Gaithersburg, MD, USA). These cells were maintained at 37°C in a humidified atmosphere of 5% CO2 until 90–100% confluence was reached. Cells were subsequently passaged in a 1:5 ratio into 10-cm dishes (Costar, Corning, NY, USA).

RNA extraction and data acquisition

Total RNA was extracted using Trizol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s recommendations, then purified with DNase I using the RNeasy Mini Kit (Qiagen, Germantown, MD). Sequencing libraries were generated from purified RNA from cell lines as described previously [94, 95]. Illumina reads in FASTQ format were trimmed and cropped using Trimmomatic before aligning and mapping to Genome Reference Consortium Human Build 38 patch release 7 (GRCh38.p7) using HISAT Alignment v2.0. Subsequent processing with Samtools v1.3.1 and HTSeq 0.6.0 generated counts based on the number of reads that matched each gene in an annotation file in gene transfer format (GTF). All raw RNA-Seq data for the primary tumor-derived EC cell lines discussed in this publication have been submitted to the SRA database under the accession number SRP074707 and BioProject accession number PRJNA321028. RNA-Seq data from patients with similar histology, staging, and grading to the primary tumor-derived cell lines were obtained from the TCGA cBio Cancer Genomics Portal (http://www.cbioportal.org) in HTSeq file format. Count tables generated by HTSeq-count were imported into R version 3.5.1.

Normalization methods and assessment of data variation

Previous studies have demonstrated that normalization of RNA-seq data is a crucial step to consider due to its impact on DEGs downstream [35–38, 96, 97]. Numerous factors in our study may introduce nuisance technical effects (e.g., multiple sequencing centers, low input, differences in sequencing depth, gene length biases, varying library kits, flow cells, batches, different experimenters), leading to unwanted bias in our expression sets [36, 40, 96]. Here we employed two normalization methods and determined which was the most suitable approach under these experimental conditions. Normalization of raw counts using library size or the RUVg method was performed as previously described [39-42]. The RUVg method, in brief, utilizes factor analysis to adjust counts for unwanted technical effects based on negative control genes that are determined a priori, in silico, and are not affected by the biological covariates of interest. The observed read counts are regressed on both the known covariates of interest and unknown nuisance variables (factors of unwanted variation, k). Although there is no clear-cut way of determining k, the number of factors of unwanted variation, k = 3 was selected for this study by considering sample size (number of DEGs obtained) and the degree of technical effects (represented by error bar magnitude) demonstrated by varying k values [39-42] (Figure 1A). For a preliminary determination of whether the global transcriptomes of TCGA patients and cell lines are comparable, the counts per million (CPM) of each gene were log transformed to log2CPM. Each gene was plotted for TCGA patients vs. cell lines for each normalization method. The R² values were assessed to determine how similar TCGA tumor samples were to cell lines. Scatter plots and R² values were generated using SigmaPlot 10.0. The effectiveness of normalization in removing variability and improving clustering between samples was assessed using RLE and PCA.
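As an illustration of the preliminary comparison described above, the sketch below converts raw counts to log2CPM and computes the R² of a simple linear regression between two samples. It is a minimal Python example under stated assumptions, not the procedure used in the study (the R² values there were generated in SigmaPlot 10.0). The `prior` pseudocount and the count values are hypothetical; the source does not state whether an offset was added before the log transform.

```python
import math


def log2_cpm(counts, prior=1.0):
    """Convert one sample's raw gene counts to log2 counts per million.

    `prior` is an assumed pseudocount added to each CPM value so that
    genes with zero counts do not produce log2(0).
    """
    library_size = sum(counts)
    return [math.log2(c / library_size * 1e6 + prior) for c in counts]


def r_squared(x, y):
    """R² of the least-squares line y ~ x (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical counts for four genes in one tumor and one cell line sample.
tumor_counts = [100, 200, 300, 400]
cell_line_counts = [10, 20, 30, 40]

fit = r_squared(log2_cpm(tumor_counts), log2_cpm(cell_line_counts))
print(f"R^2 = {fit:.3f}")
```

In this toy example the two samples differ only in sequencing depth, so CPM scaling alone makes them coincide. The bias reported in the Results is the kind that persists after such library size scaling.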
RLE is a diagnostic box plot that is useful in visually presenting the overall quality and distributions of transformed read counts of each gene across samples. For samples from which unwanted variation has been removed, the distribution of the log-ratio of each gene's read count to its median count across samples should be centered at the zero line. Furthermore, comparable samples should display similar RLE distributions. The PCA plot displays clusters of samples by assessing similarities in overall gene expression [97-99]. It also describes variation and accounts for varied influences of the original characteristics. The principal components are orthogonal linear combinations of gene expression profiles for each sample. Similarly expressed groups will cluster by class in the first few PCs. Clustering will also highlight possible batch effects and outlying samples. The RLE and PCA analyses were performed using EDASeq packages in R [40].
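The RLE calculation described above can be sketched in a few lines. The following Python example is an illustration only, not the EDASeq implementation used in the study. For each gene it subtracts the median log count across samples, then summarizes each sample's distribution by its median. The `pseudocount` argument and the count matrix are hypothetical.

```python
import math
from statistics import median


def relative_log_expression(count_matrix, pseudocount=1.0):
    """Compute RLE values from a genes x samples count matrix.

    For every gene, the median log2 count across samples is subtracted
    from each sample's log2 count. The result is returned per sample
    (one list of per-gene RLE values for each sample), which is the
    distribution an RLE box plot would display.
    `pseudocount` is an assumed offset to avoid log2(0).
    """
    rle_rows = []
    for row in count_matrix:
        log_row = [math.log2(c + pseudocount) for c in row]
        gene_median = median(log_row)
        rle_rows.append([value - gene_median for value in log_row])
    # Transpose from genes x samples to samples x genes.
    return [list(column) for column in zip(*rle_rows)]


def sample_medians(rle_by_sample):
    """Median RLE per sample; values near zero indicate comparable samples."""
    return [median(values) for values in rle_by_sample]


# Hypothetical counts: three genes (rows) in three samples (columns).
# The third sample has twice the counts of the others for every gene.
counts = [
    [4, 4, 8],
    [16, 16, 32],
    [2, 2, 4],
]

rle = relative_log_expression(counts, pseudocount=0.0)
print(sample_medians(rle))
```

Here the third sample's RLE distribution sits one log2 unit above zero, the sort of systematic offset that a well-normalized data set should not show.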

Differential gene expression analysis and clustering

DEGs for each stage comparison in cell lines and tumors were determined with negative binomial generalized linear models (GLMs), with dispersions estimated by weighted likelihood empirical Bayes within edgeR [35]. Genes with a false discovery rate (FDR) < 0.05 were considered differentially expressed. To ascertain whether the normalization methods affect downstream differential expression results, we considered changes in the number of DEGs obtained. Once a normalization method was selected, hierarchical heat maps of DEGs and the respective volcano plots were generated to determine clustering by stage comparison and sample type and to identify sets of up- and downregulated DEGs for each stage comparison (p < 0.05 and |log2FC| > 1). Unless otherwise indicated, p-values rather than FDR values were used for all downstream bioinformatics analyses for statistical uniformity. Differential gene expression analysis, heat maps, and volcano plots were generated using the gplot function and the packages edgeR [35], RUVSeq [40], EDASeq [40], ggplot2 [100], and Rcpp [101] in the R environment.
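The two thresholding rules in this section can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names are hypothetical; the sketch does not reproduce the edgeR model fitting itself.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    Assumption: BH is the FDR procedure; the source only says "FDR".
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_gene(log2fc, pvalue, p_cut=0.05, lfc_cut=1.0):
    """Label a gene for a volcano plot using p < 0.05 and |log2FC| > 1."""
    if pvalue < p_cut and log2fc > lfc_cut:
        return "up"
    if pvalue < p_cut and log2fc < -lfc_cut:
        return "down"
    return "ns"
```

In this sketch, genes with an adjusted value below 0.05 would count toward the number of DEGs compared across normalization methods, while `classify_gene` mirrors the p-value and fold-change cutoffs applied to the heat maps and volcano plots.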

Ingenuity pathway analysis and gene ontology analysis

The Core analysis feature of the IPA software (Ingenuity Systems, https://www.ingenuity.com [Qiagen]) was used to discover signaling pathways that may regulate cancer progression during stage comparison analysis. The list of DEGs for each stage comparison was uploaded and categorized into related canonical pathways based on the IPKB. This analysis was set to include direct and indirect relationships and was filtered to consider only molecules and/or relationships of the human species. Cutoffs for gene inputs were set to p < 0.05 and |log2FC| > 1 for down- and upregulated gene expression. Pathways with an overlap p-value < 0.05, calculated by a right-tailed Fisher's exact test, were considered significant. The z-score, indicated by color intensity, reflects the match between the expected relationship direction (gene expression from the IPKB) and the observed gene expression. Only z-scores < -2 or > 2 were considered significant. Gene sets that define stage-dependent changes or are conserved across all stage comparisons were depicted using Venn diagrams. Only DEGs whose IDs were successfully mapped in the IPA analyses, with p < 0.05 and |log2FC| > 1, were considered. GSEA using the Molecular Signatures Database (MSigDB v6.2) was performed on each Venn zone with tools from the GSEA Broad Institute (http://software.broadinstitute.org/gsea/msigdb/annotate.jsp) [102–104]. Gene sets corresponding to GO terms with an FDR q-value < 0.05 were considered significantly enriched. Venn diagrams were generated using the VennDiagram package in R.
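As an illustration of the pathway overlap test, a right-tailed Fisher's exact test on a 2x2 table is equivalent to the upper tail of a hypergeometric distribution. The sketch below is not IPA's implementation; the function name and its four parameters, including the size of the background reference set, are assumptions.

```python
from math import comb


def fisher_right_tail(overlap, pathway_size, n_degs, background):
    """Right-tailed Fisher's exact p-value for pathway enrichment.

    Returns P(X >= overlap), where X is the number of DEGs falling in
    a pathway of `pathway_size` genes when `n_degs` genes are drawn
    without replacement from `background` genes. All four arguments
    are hypothetical names for this sketch.
    """
    upper = min(pathway_size, n_degs)
    total = comb(background, n_degs)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, n_degs - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A pathway would pass the overlap criterion in this sketch when the returned value is below 0.05. The z-score filter (< -2 or > 2) is a separate step that depends on the expected directions stored in the IPKB and is not reproduced here.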