Xiaoyu Lin1, Tiantongfei Jiang1, Jing Bai1, Junyi Li1, Tianshi Wang2, Jun Xiao1, Yi Tian1, Xiyun Jin1, Tingting Shao1, Juan Xu1, Lingchao Chen3, Lihua Wang4, Yongsheng Li5. 1. College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China. 2. Department of Biochemistry and Molecular Cell Biology, Shanghai Key Laboratory for Tumor Microenvironment and Inflammation, Shanghai Jiao Tong University School of Medicine, Shanghai, China. 3. Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai 200040, China. Electronic address: chenlingchao12@sina.com. 4. Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang 150081, China. Electronic address: wanglh211@163.com. 5. College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150081, China. Electronic address: liyongsheng@ems.hrbmu.edu.cn.
Abstract
Long noncoding RNAs (lncRNAs) have been implicated in cancer biogenesis and prognosis. However, we still lack knowledge on their function during glioma progression. In this study, we analyzed the lncRNA expression profile across 907 glioma patients in grades II, III, and IV. Widespread dynamic expression of lncRNAs during glioma progression was revealed, and we identified 33 onco-lncRNAs and 61 tumor suppressor lncRNAs. We found that the expression of these oncogenic lncRNAs is regulated by transcription factors with grade-specific expression. Based on the "guilt by association" rule, we predicted the potential functions of oncogenic lncRNAs, and the majority of these lncRNAs are involved in cancer hallmarks. In particular, we found that CARD8-AS1 regulates the metastatic potential of glioma cell lines in vitro. Integrating clinical information, we identified 12 protective and 8 risk lncRNAs (such as PWAR6 and CARD8-AS1) in glioma. Finally, an lncRNA-gene functional module was identified to be associated with the survival of patients. The predictive ability of this module signature was further validated in an independent dataset. Our results revealed the dynamic transcriptome transition during glioma progression, indicating that the lncRNA signature could be a useful biomarker that may improve our understanding of the molecular mechanisms underlying glioma progression.
Glioma is one of the most common brain tumors, accounting for approximately 50%–60% of all primary brain tumors. Glioma is histologically classified as a malignant brain tumor and is divided into grades I–IV by the World Health Organization (WHO). In brief, grade I and II astrocytoma and oligodendroglioma are low-grade gliomas, whereas grade III and IV astrocytoma and oligodendroglioma are high-grade gliomas. Despite advances in treatment modalities, it is still an urgent challenge to identify sensitive early biomarkers for the diagnosis and prognosis of glioma.
With the development of molecular profiles during glioma progression, increasing numbers of molecular biomarkers have been identified for the diagnosis and prognosis of gliomas.3, 4 For example, ARK5 has been shown to be upregulated in glioma, and the upregulation was correlated with the grade of glioma. In addition, patients with a high expression of ARK5 exhibit shorter survival time. The best-known gene, IDH1, is well studied in glioma, and the prognostic impact of the IDH1 mutation has been reported by several studies. Patients whose tumors harbor a mutation of the IDH1 gene have a better outcome than those with nonmutated tumors, regardless of the grade considered.6, 7 In particular, transcription factors (TFs) play important roles in the transcriptional networks that regulate gene expression and modify and control cancer phenotypes.8, 9 Differentially expressed TFs in glioma and their downstream gene targets may be potential therapeutic biomarkers of glioma. Several TFs have also been identified as important regulators of glioma progression, including TP53, SP1, JUN, and STAT3.10, 11 These candidate genes provide novel insights into the progression of glioma.
In addition to the protein-coding genes, increasing numbers of noncoding RNAs have been identified to play critical roles during glioma progression.
Specifically, microRNAs are small noncoding RNAs that regulate gene expression posttranscriptionally and play important roles in regulating diverse biological processes. During the initiation and progression of human gliomas, microRNAs (miRNAs) have been shown to modulate cell proliferation, survival, tumor angiogenesis, invasion, and metastasis.14, 15 In addition, we also revealed the miRNA regulatory network that is associated with glioma progression and identified the progression-related miRNAs. These studies demonstrated that noncoding RNAs are important molecular regulators during glioma progression.
In addition to miRNAs, another type of noncoding RNA has recently emerged. Long noncoding RNAs (lncRNAs) have been shown to be associated with cancer development and progression, demonstrating potential applications as novel diagnostic or prognostic molecular markers.16, 17 Tumor-suppressive lncRNA MALAT1 was shown to play critical roles in glioma by downregulating matrix metallopeptidase 2 (MMP2) and inactivating extracellular signal-regulated kinase (ERK)/mitogen-activated protein kinase (MAPK) signaling. High expression of lncRNA CASC2c was positively correlated with astrocytoma progression and is an unfavorable prognostic factor for patients. lncRNA NEAT1 was shown to be regulated by the epidermal growth factor receptor (EGFR) pathway and contributes to glioma progression through the Wnt/beta-catenin pathway by scaffolding EZH2. However, the majority of these studies have focused on specific grades of glioma or the most malignant type (glioblastoma multiforme [GBM]). The functional significance of lncRNAs in the malignant progression of gliomas is still unclear and needs to be further explored.
To address these questions, in this study we investigated the dynamic transcriptome transition of lncRNAs during glioma progression. Glioma progression-related lncRNAs were first identified, including oncogenic and tumor suppressor lncRNAs.
The expression of these candidate lncRNAs was strictly regulated by TFs that showed grade-specific expression. Functional analysis revealed that these lncRNAs were involved in cancer hallmarks, such as cell death. Integrating the clinical information, we also revealed a candidate survival-related lncRNA functional module, which provided deep insights into the molecular mechanisms of glioma progression.
Results
Overview of Identifying Glioma Progression-Related lncRNAs
We systematically analyzed the lncRNA and protein-coding gene transcriptome transition during glioma progression by analyzing the expression profiles of two cohorts (Figure 1A). Both sets of expression profiles of lncRNAs and coding genes were obtained from brain tumors of grades II, III, and IV. We first identified the lncRNAs and coding genes that showed dynamic expression during glioma progression. These lncRNAs and coding genes were further classified into nine groups. By further integration of transcription regulation data, we revealed that the expression of lncRNAs and coding genes was regulated by grade-specific TFs. Moreover, we predicted the potential function of lncRNAs, and a network module comprising lncRNA and coding genes was found to be associated with the survival of glioma patients.
Figure 1
Dynamic Transcriptome Landscape of lncRNAs during Glioma Progression
(A) A flowchart shows the dynamic transcriptome analysis during glioma progression. Nearly one thousand glioma patients (n = 907) were integrated for identification of oncogenic lncRNAs. The transcription regulation, functions, and clinical features of these lncRNAs were analyzed in this study. (B) The cumulative distribution of lncRNA and coding gene expression of glioma patients in the TCGA and CGGA datasets. (C and D) The expression of lncRNAs during glioma progression for (C) TCGA and (D) CGGA project data. The heatmaps show the normalized expression of lncRNAs in grade II, III, and IV patients. The color bars on the left represent different classes of lncRNAs. The boxplots on the right show the normalized expression of different classes of lncRNAs.
Dynamic Transcriptome Transition during Glioma Progression
To investigate the dynamic transcriptome profiles during glioma progression, we collected RNA-sequencing (RNA-seq)-based datasets for glioma patients of three grades from The Cancer Genome Atlas (TCGA) and Chinese Glioma Genome Atlas (CGGA) projects (Table 1). After read mapping and transcriptome assembly, we obtained the expression of 5,376 lncRNAs and 15,934 protein-coding genes in TCGA data and 6,710 lncRNAs and 18,976 genes in CGGA data. Evidence has indicated that the expression of lncRNAs is lower than that of protein-coding genes in various types of tissues. Thus, we analyzed the global average expression of lncRNAs and coding genes in glioma patients. The average expression of lncRNAs was significantly lower than that of coding genes in both TCGA and CGGA datasets (Figure 1B; p < 2.2e−16, Kolmogorov-Smirnov test).
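The comparison of expression distributions in Figure 1B relies on a two-sample Kolmogorov-Smirnov test. As a minimal sketch of the underlying statistic only (not the pipeline used in the study), the following computes the largest gap between two empirical cumulative distributions. The function name and the toy values are illustrative assumptions.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The supremum of the ECDF difference is reached at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Toy mean-expression values (made up; not taken from TCGA or CGGA).
lncrna_expr = [0.2, 0.5, 0.8, 1.1, 1.5]
coding_expr = [2.0, 3.5, 4.1, 5.2, 6.8]
d_stat = ks_statistic(lncrna_expr, coding_expr)
print(f"KS statistic D = {d_stat:.2f}")
```

A statistic near 1 means the two distributions barely overlap; converting it to a p value additionally requires the sample sizes.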
Table 1
Clinical Characteristics of the Glioma Patients of Different Grades
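| Characteristic | TCGA II | TCGA III | TCGA IV | CGGA II | CGGA III | CGGA IV |
|---|---|---|---|---|---|---|
| n | 238 | 238 | 159 | 100 | 72 | 100 |
| Age (years) | 15–75 | 23–76 | 22–90 | 21–65 | 18–75 | 18–81 |
| Sex: Male | 124 | 132 | 151 | 59 | 45 | 64 |
| Sex: Female | 114 | 106 | 58 | 41 | 27 | 36 |
| IDH1: Mutant | 52 | 60 | 7 | 78 | 35 | 26 |
| IDH1: Wild-type | 143 | 128 | 151 | 18 | 33 | 70 |
| Survival: Death | 33 | 65 | 127 | 29 | 43 | 81 |
| Survival: Survival | 162 | 123 | 31 | 68 | 26 | 13 |
| Time (days) | 1–5,546 | 0–6,423 | 0–2,681 | 21–3,361 | 29–3,063 | 34–2,961 |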
Next, we performed a t test to identify the lncRNAs and coding genes that were differentially expressed during glioma progression (see details in Materials and Methods). We found that the expression of lncRNAs showed a greater difference in the TCGA data, and approximately 78% of the lncRNAs showed a dynamic expression transition during glioma progression (Figure 1C; Table S1). In addition, approximately 27% of the lncRNAs showed variable expression in the CGGA data (Figure 1D; Table S2). The expression of lncRNAs showed greater changes during the transition from grade III to IV. We then focused on the lncRNAs that showed a consistent dynamic expression during glioma progression. Hundreds of lncRNAs exhibited consistent upregulation or downregulation during glioma progression (Figures 1C and 1D). Among these lncRNAs, several have been demonstrated to be associated with glioma, including CRNDE, CARD8-AS1, and PWAR6. In addition to lncRNAs, we also identified the protein-coding genes that showed variable expression during glioma progression (Figure S1; Tables S3 and S4). These dynamically expressed lncRNAs and coding genes provide a valuable resource for investigating the transcriptome changes and identifying the key gene regulators during glioma progression.
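One way to picture the nine expression groups is as the combination of two grade transitions (II to III, then III to IV), each labeled as up, down, or unchanged. The sketch below is illustrative and makes several assumptions that are not taken from the Materials and Methods: a Welch-type t statistic, a normal approximation for the p value, a 0.05 cutoff, and the label "N" for no significant change. Only the "UU" and "DD" labels appear in the figure legends.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def transition_label(lower_grade, higher_grade, alpha=0.05):
    """Label one grade transition as 'U' (up), 'D' (down), or 'N' (no change)."""
    se = sqrt(variance(lower_grade) / len(lower_grade)
              + variance(higher_grade) / len(higher_grade))
    t = (mean(higher_grade) - mean(lower_grade)) / se
    # Assumption: normal approximation to the t distribution,
    # which is only reasonable for large cohorts.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    if p >= alpha:
        return "N"
    return "U" if t > 0 else "D"

def progression_class(grade2, grade3, grade4):
    """Combine the II->III and III->IV labels into one of nine classes."""
    return transition_label(grade2, grade3) + transition_label(grade3, grade4)

# Toy expression values for one hypothetical lncRNA.
g2 = [1.0, 1.2, 0.9, 1.1, 1.0]
g3 = [2.0, 2.1, 1.9, 2.2, 2.0]
g4 = [3.1, 3.0, 2.9, 3.2, 3.0]
print(progression_class(g2, g3, g4))
```

With three possible labels per transition, two transitions give 3 × 3 = 9 classes, which matches the number of groups reported.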
Glioma Progression-Related lncRNAs Are Regulated by Specific TFs
The analysis above indicated that lncRNAs and coding genes are dynamically expressed during glioma progression. Next, we combined the results from two independent datasets and identified the consistent glioma progression-related lncRNAs (Figure 2A). Specifically, we identified 93 and 142 lncRNAs showing consistent upregulation and downregulation during glioma progression, respectively. These lncRNAs account for approximately 57% of all consistently expressed lncRNAs (Figure 2B), suggesting their critical role in glioma progression. Moreover, we found that the majority of lncRNAs were intergenic lncRNAs (Figure 2C). Although previous studies have identified several lncRNAs associated with glioma based on exon array, this observation suggests that RNA-seq data can provide more candidates for further functional investigations.
Figure 2
Dynamic Regulation of Oncogenic lncRNAs during Glioma Progression
(A) The overlap of different classes of lncRNAs between TCGA and CGGA data. The heatmap shows the number of overlapping lncRNAs, and the bar plots show the number of lncRNAs in each class. (B) The pie chart shows the proportion of lncRNAs in each class. lncRNAs in the UU class are defined as onco-lncRNAs, and lncRNAs in the DD class are defined as tumor suppressor lncRNAs. (C) The enrichment of different types of lncRNAs in each group. The grids colored with black lines are significantly enriched or depleted. (D) The TF motif enrichment for different classes of lncRNAs. The color indicates the −log10 (p value). (E) The normalized expression distribution of two representative TFs.
Next, we investigated the upstream regulators of lncRNAs. Given that TFs act as master regulators of gene expression, we integrated gene or lncRNA expression and sequence-based binding information to identify TF-lncRNA regulation in glioma. We first screened the lncRNA promoters for TF binding sites with the Match tool in TRANSFAC. Next, the correlation coefficient between the expression of each TF and lncRNA was calculated. TF-lncRNA pairs with a correlation coefficient greater than 0.4 were used for further analysis. We found that different classes of lncRNAs are regulated by different TFs (Figure 2D), and that downregulated lncRNAs are regulated by several TFs that are involved in glioma. We also identified additional TFs that show grade-specific expression (Figure 2E). SATB1 has been reported to be expressed in several human cancers, and it plays an important role in glioma development and progression. Consistent with a previous study, we found that SATB1 expression was lower in high-grade glioma than in low-grade glioma. Moreover, we also found that the downregulated lncRNAs were strictly regulated by TEF, which shows decreased expression during glioma progression (Figure S2A).
For the lncRNAs with increased expression, we identified several TF regulators, including SPI1 and PLAU (Figure 2E; Figure S2B). SPI1 was demonstrated to play critical roles in glioma, and PLAU was associated with tissue remodeling and wound repair. Taken together, all of these results indicate that these TFs play important roles during glioma progression by regulating the dynamically expressed lncRNAs.
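The two-step filter described in this section (a promoter motif hit followed by an expression correlation above 0.4) can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: motif hits are assumed to be precomputed, Pearson correlation is assumed, only positive correlations pass the cutoff, and all names and values are placeholders.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def tf_lncrna_pairs(motif_hits, tf_expr, lnc_expr, cutoff=0.4):
    """Keep (TF, lncRNA) pairs with a motif hit and correlation above cutoff."""
    kept = []
    for tf, lnc in sorted(motif_hits):
        r = pearson(tf_expr[tf], lnc_expr[lnc])
        if r > cutoff:
            kept.append((tf, lnc, round(r, 3)))
    return kept

# Hypothetical inputs: names and values are placeholders, not study results.
motif_hits = {("TF_A", "lnc_1"), ("TF_A", "lnc_2")}
tf_expr = {"TF_A": [1.0, 2.0, 3.0, 4.0]}
lnc_expr = {"lnc_1": [2.0, 4.1, 5.9, 8.0], "lnc_2": [4.0, 1.0, 3.0, 2.0]}
pairs = tf_lncrna_pairs(motif_hits, tf_expr, lnc_expr)
print(pairs)
```

Requiring both binding evidence and coexpression reduces the number of pairs supported by only one type of evidence, at the cost of missing regulation that is not reflected in steady-state expression.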
Oncogenic lncRNAs Are Involved in Cancer Hallmarks
Next, we integrated the expression of lncRNAs in normal samples and identified the differentially expressed lncRNAs between cancer and normal samples. In addition, we overlapped these differentially expressed lncRNAs with those showing dynamic expression during glioma progression. In total, 33 upregulated lncRNAs also showed dynamic upregulation and 61 downregulated lncRNAs showed dynamic downregulation during glioma progression (Figure 3A; Table S5). We defined these upregulated lncRNAs as onco-lncRNAs and downregulated lncRNAs as tumor suppressor lncRNAs. Moreover, we found that these lncRNAs showed greater expression changes during the transition from grade III to IV, suggesting that they play critical roles during the progression of low-grade glioma to high-grade glioma. Evidence has shown that the expression of lncRNAs might be affected by mutations or copy number variation (CNV).26, 27 We thus integrated the mutation and CNV data in glioma and found that several lncRNAs were located within CNV alteration regions or adjacent to genes with mutations in cancer (Figure 3B). For example, STXBP5-AS1 is a tumor suppressor lncRNA, and the genomic region of this lncRNA showed a copy number loss, which may be responsible for its decreasing expression during glioma progression.
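The definition of onco-lncRNAs and tumor suppressor lncRNAs given here amounts to intersecting two independent calls: the tumor-versus-normal difference and the consistent trend across grades. A minimal sketch follows, assuming progression classes have already been assigned. The lncRNA names and the "UN" label are hypothetical placeholders.

```python
def define_candidates(up_in_tumor, down_in_tumor, progression_class):
    """Intersect tumor-vs-normal calls with consistent progression classes.

    progression_class maps an lncRNA name to a class label such as 'UU' or 'DD'.
    """
    onco = {name for name in up_in_tumor
            if progression_class.get(name) == "UU"}
    suppressors = {name for name in down_in_tumor
                   if progression_class.get(name) == "DD"}
    return onco, suppressors

# Hypothetical lncRNA names and labels (placeholders, not study results).
progression_class = {"lnc_1": "UU", "lnc_2": "DD", "lnc_3": "UN", "lnc_4": "DD"}
onco, suppressors = define_candidates(
    up_in_tumor={"lnc_1", "lnc_3"},
    down_in_tumor={"lnc_2"},
    progression_class=progression_class,
)
print(sorted(onco), sorted(suppressors))
```

In this toy input, lnc_4 falls consistently across grades but is not called as downregulated in tumors, so it is excluded.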
Figure 3
The Potential Functions of Oncogenic lncRNAs
(A) The heatmap shows the dynamic expression of onco-lncRNAs and tumor suppressor lncRNAs. (B) The Circos plot shows the genomic and transcriptional alterations of the glioma progression-related lncRNA and genes. The inner network shows the lncRNA-gene-hallmark links. (C) The normalized expression distribution of CARD8-AS1. Dark boxes are for TCGA data, and light boxes are for CGGA data. (D) The normalized expression distribution of PWAR6. (E) The enrichment plot shows the distribution of genes in the regulation of immune response process that are correlated with the expression of CARD8-AS1. (F) The enrichment plot shows the distribution of the genes in the regulation of the DNA repair process that are correlated with the expression of PWAR6.
Our analysis above identified the oncogenic lncRNAs during glioma progression; we next investigated the functions of these lncRNAs. We employed “guilt by association” to identify the protein-coding genes that are coexpressed (false discovery rate [FDR] < 0.01 and R > 0.75) with each onco-lncRNA or tumor suppressor lncRNA and performed functional enrichment analysis. Here, we focused on differentially expressed genes in cancer hallmark-related functions. This analysis revealed that one or more hallmarks were enriched among the coexpressed genes of these oncogenic lncRNAs (Figure 3B; Figure S3). Interestingly, the majority of these genes were enriched in insensitivity to antigrowth signals (Figure S3). Cancer cells with defects in the antigrowth signaling pathway are missing a critical gatekeeper of cell-cycle progression; thus, cancer cells keep growing and dividing. In addition, these genes are significantly enriched in tissue invasion and metastasis, further suggesting the important roles of these oncogenic lncRNAs during glioma progression.
Specifically, we found that several lncRNAs are hubs that coexpress with a larger number of genes in the coexpression network (Figure S3A).
For instance, the onco-lncRNA CARD8-AS1 coexpresses with several genes involved in insensitivity to antigrowth signals and tissue invasion and metastasis. CD164, a sialomucin, has been demonstrated to be involved in the regulation of proliferation, apoptosis, adhesion, and differentiation in multiple cancers. In addition, RAC2 is important for glioblastoma tumorigenesis and can serve as a potential therapeutic target against glioblastoma and its stem-like cells. CARD8-AS1 also shows dynamic upregulation during glioma progression (Figure 3C). In addition, we performed a gene set enrichment analysis (GSEA) based on the coexpression of each lncRNA.31, 32 We found that CARD8-AS1 was involved in the regulation of immune response (Figure 3E; FDR < 0.001) and epithelial mesenchymal transition (Figures S3B and S3D; FDR = 0.021). Moreover, another hub, the tumor suppressor lncRNA PWAR6, coexpresses with genes involved in insensitivity to antigrowth signals (such as BTRC and PRKCE) and evading apoptosis (such as FAIM2). PWAR6 shows dynamic downregulation during glioma progression (Figure 3D). The GSEA indicated that this lncRNA was involved in the regulation of immune response and DNA repair (Figure 3F; Figure S3). Together, these results indicate that lncRNAs exhibit grade-specific dynamic expression and regulate hallmark-related genes, and could serve as important regulators during glioma progression.
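The "guilt by association" step can be illustrated with a simple over-representation test: given the genes coexpressed with one lncRNA, ask whether a hallmark gene set is hit more often than expected by chance. The sketch below assumes a hypergeometric test, which the text does not specify, and assumes the coexpressed gene list has already been filtered at FDR < 0.01 and R > 0.75. All gene identifiers are placeholders.

```python
from math import comb

def hypergeom_pvalue(k, n_drawn, n_hallmark, n_background):
    """P(X >= k) when n_drawn genes are sampled from a background
    that contains n_hallmark hallmark genes."""
    upper = min(n_drawn, n_hallmark)
    total = comb(n_background, n_drawn)
    tail = sum(comb(n_hallmark, i) * comb(n_background - n_hallmark, n_drawn - i)
               for i in range(k, upper + 1))
    return tail / total

def hallmark_enrichment(coexpressed, hallmark_genes, background):
    """Return (overlap size, enrichment p value) for one lncRNA and one hallmark."""
    coexpressed = set(coexpressed) & set(background)
    hallmark = set(hallmark_genes) & set(background)
    k = len(coexpressed & hallmark)
    return k, hypergeom_pvalue(k, len(coexpressed), len(hallmark), len(background))

# Placeholder gene identifiers (not real annotations).
background = [f"g{i}" for i in range(20)]
hallmark_genes = ["g0", "g1", "g2", "g3", "g4"]
coexpressed = ["g0", "g1", "g2", "g10"]
overlap, p_value = hallmark_enrichment(coexpressed, hallmark_genes, background)
print(overlap, round(p_value, 4))
```

In practice the p values for all lncRNA-hallmark combinations would also need multiple-testing correction.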
Validation of the Functions of Glioma-Related lncRNAs
Next, we sought to study the functional roles of top candidate glioma-associated lncRNAs. We focused our analysis on CARD8-AS1. To explore the role of CARD8-AS1 in proliferation, we used a loss-of-function approach. A CARD8-AS1 short hairpin RNA (shRNA) lentivirus was used to knock down CARD8-AS1 expression in U251 and A172 cells. Transfection of the CARD8-AS1 shRNA lentivirus led to downregulation of CARD8-AS1 as determined by qRT-PCR (Figure 4A). Lower CARD8-AS1 expression led to marked morphological changes in both cell lines. Specifically, there was a pronounced decrease in the fraction of elongated, spindle-shaped cells that was paralleled by an increase in rounded, apoptotic cells. Under the light microscope, a significant decrease in cell viability was observed over time in glioma cells with low CARD8-AS1 expression compared with control cells (Figure 4B). In addition, an Annexin V assay revealed that knockdown of CARD8-AS1 for 48 hr increased the rate of apoptosis in both U251 and A172 cells compared with control groups (Figure 4C). Furthermore, in vitro cell scratch tests revealed that CARD8-AS1 knockdown reduced the migration of U251 and A172 cells compared with controls (Figure 4D). Overall, these data suggest that CARD8-AS1 regulates the metastatic potential of glioma cell lines in vitro.
Figure 4
CARD8-AS1 shRNA Lentivirus Suppresses Glioma Cell Proliferation and Migration and Induces Apoptosis In Vitro
(A) CARD8-AS1 expression was quantified by qRT-PCR analysis. CARD8-AS1 shRNA lentivirus significantly reduced CARD8-AS1 expression, compared with the control groups. (B) Morphological alterations and cell counts in the U251 and A172 cells upon CARD8-AS1 suppression, as assessed by phase contrast microscopy. (C and D) Representative images of in vitro Annexin V (C) and cell scratch assays (D) of U251 and A172 after transfection with CARD8-AS1 shRNA lentivirus or control lentivirus.
Survival-Related lncRNA Network Module in Glioma
Recent studies have demonstrated the utility and superiority of lncRNAs as novel biomarkers for cancer diagnosis, prognosis, and therapy. We next analyzed the associations between the expression of oncogenic lncRNAs and the clinical outcome of glioma patients. Using survival analysis and a Cox regression model, we identified a set of lncRNAs (including 12 protective and 8 risk lncRNAs) that can stratify patients into high- and low-risk groups with significantly different survival in glioma (Figure S4A; Table S6). These lncRNAs show consistent power in the TCGA and CGGA datasets. The multivariate Cox and stratification analysis indicated that these oncogenic lncRNA signatures were independent prognostic factors after adjusting for other clinical covariates, such as age and sex. Specifically, we found that high expression of PWAR6 is associated with better survival of glioma patients in the TCGA (Figure S4B; hazard ratio [HR] = 0.77, log rank p < 2.2e−16) and CGGA datasets (HR = 0.76, log rank p = 1.35e−10). In addition, we also identified a risk lncRNA, CARD8-AS1, in the TCGA (HR = 1.18, log rank p < 2.2e−16) and CGGA (HR = 1.65, log rank p = 8.11e−8) data during glioma progression (Figure S4C).
To further investigate the functions of these survival-related lncRNAs, we next identified the survival-related protein-coding genes based on the same procedure. In total, we identified 598 risk genes and 141 protective genes. Evidence has demonstrated that lncRNAs and genes synergistically regulate important biological processes in cancer. We next identified the coexpressed lncRNA-gene pairs with a correlation coefficient greater than 0.75 and defined these pairs as a survival-related module. Here, we identified a module formed by 22 lncRNA-gene pairs (Figures 5A and 5B).
Based on the expression of these lncRNAs and genes, we trained a model in the TCGA dataset and found that the expression of this module can distinguish the patients with different survival times (Figure 5C; log rank p < 2.2e−16). In addition, this model was validated in the CGGA dataset (Figure 5D; log rank p < 2.2e−16). Our analysis above indicated that the expression of lncRNAs is strictly regulated by grade-specific TFs; we next identified the TFs that regulate the components of the module. As a result, we identified several TFs that are associated with cancer development and progression, such as SATB1, SPI1, HOXA3, HOXD11, ELF4, and SP100.
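A module-based prognostic model of this kind is commonly expressed as a linear risk score: each patient's expression values are weighted by regression coefficients learned in the training cohort and then summed. The sketch below illustrates that idea only. The coefficients, feature names, and the median cutoff used to split patients are assumptions for illustration, not values from the study.

```python
from statistics import median

def risk_scores(expression, coefficients):
    """Linear risk score per patient: sum of coefficient * expression
    over all module members."""
    return {pid: sum(coefficients[f] * values[f] for f in coefficients)
            for pid, values in expression.items()}

def split_by_median(scores):
    """Assign 'high' to scores above the cohort median, otherwise 'low'."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}

# Hypothetical coefficients and expression values (placeholders).
coefficients = {"lnc_risk": 0.5, "lnc_protective": -0.3}
expression = {
    "P1": {"lnc_risk": 4.0, "lnc_protective": 1.0},
    "P2": {"lnc_risk": 1.0, "lnc_protective": 3.0},
    "P3": {"lnc_risk": 2.0, "lnc_protective": 2.0},
    "P4": {"lnc_risk": 3.0, "lnc_protective": 0.5},
}
scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(groups)
```

For validation in an independent cohort, the coefficients stay fixed at their training values and only the expression values change.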
Figure 5
Survival-Related Oncogenic lncRNA Module in Glioma
(A) The survival portrait of oncogenic lncRNAs and genes in patients of TCGA data. (B) lncRNA-gene coexpression module in glioma. Solid lines represent the coexpression relationships, and dashed lines represent the transcription regulation between TFs and lncRNAs or genes. (C) The Kaplan-Meier analysis of glioma patients using the expression of lncRNA module in the TCGA dataset. (D) The Kaplan-Meier analysis of glioma patients using expression of the lncRNA module in the CGGA dataset.
Next, we further explored whether the lncRNA module can be effectively used as a prognostic signature for high-grade gliomas (III and IV). Likewise, we trained the Cox regression coefficient for each lncRNA or gene in the module based on the expression of high-grade glioma patients in the TCGA dataset and then calculated the risk score for each patient in the TCGA and CGGA datasets. The patients were divided into four groups based on the risk score and grade information; we found that these groups had distinct survival times in both the TCGA and CGGA datasets (Figure S5). These results suggest that the lncRNA module signature integrated with the grade information can be a good candidate prognostic biomarker for glioma progression.
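The survival curves compared in these analyses are Kaplan-Meier estimates. As a minimal sketch of how such a curve is computed for one patient group (with made-up follow-up times, and a function name chosen here for illustration):

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times: follow-up in days; events: 1 if death was observed, 0 if censored.
    """
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve

# Toy follow-up data for five hypothetical patients.
times = [100, 200, 200, 300, 400]
events = [1, 1, 0, 0, 1]
curve = kaplan_meier(times, events)
for day, prob in curve:
    print(day, round(prob, 3))
```

Censored patients do not lower the curve directly, but they shrink the risk set, so later deaths produce larger drops.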
Prognostic Effect of lncRNA Network Module Is Independent from IDH1 Mutation
Mutation of the IDH1 gene has been demonstrated to be a very strong prognostic factor in gliomas regardless of the grade. Patients whose tumor harbored an IDH1 mutation had a significantly longer survival time than patients with a tumor of the same grade but wild-type IDH1. Next, we investigated whether the lncRNA module identified in our study is an independent prognostic biomarker. We divided patients into three groups: IDH1-mutated patients, IDH1 wild-type patients with a low risk score, and IDH1 wild-type patients with a high risk score. We found that IDH1-mutated patients showed survival rates similar to those of IDH1 wild-type patients with a low risk score in the TCGA dataset (Figure 6A). These two groups showed better survival than the patients with a high risk score (p < 2.2e−16). In addition, we found that the lncRNA module signature can also classify patients with different survival rates in the CGGA data (Figure 6B; p < 2.2e−16). Specifically, the patients in the IDH1 wild-type and low-risk group showed better survival than those of the IDH1-mutated group with high risk. Taken together, these results indicate that the identified oncogenic lncRNA module signatures, with further prospective validation, have important clinical implications for improving clinical outcome predictions and guiding therapy for glioma patients.
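The group comparisons in this section are summarized by log-rank p values. The following is a minimal sketch of a log-rank test, simplified to two groups (the analysis described above compares three) and using toy data. The function name and values are illustrative assumptions.

```python
from math import erfc, sqrt

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    Returns (chi-square statistic, p value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    observed_minus_expected, var = 0.0, 0.0
    for t in sorted({t for t, e, _ in data if e == 1}):
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        # Expected deaths in group A if both groups shared one hazard.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Toy data: a hypothetical high-risk group (a) and low-risk group (b).
chi2, p = logrank_test([50, 80, 120, 150], [1, 1, 1, 1],
                       [300, 400, 500, 600], [1, 0, 1, 0])
print(round(chi2, 2), p < 0.05)
```

Extending the test to three or more groups requires a covariance matrix across groups rather than the single variance term used here.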
Figure 6
Prognosis of the lncRNA Module Independent of IDH1 Mutation
(A) The Kaplan-Meier analysis of glioma patients based on the lncRNA module signature and IDH1 mutation in the TCGA dataset. (B) The Kaplan-Meier analysis of glioma patients based on the lncRNA module signature and IDH1 mutation in the CGGA dataset.
Web-Based, User-Friendly Platform for Investigating lncRNA Expression during Glioma Progression
An increasing number of studies have demonstrated that lncRNAs play critical roles during glioma progression. Our current study has identified lncRNAs that are associated with glioma progression. To facilitate the investigation of lncRNA functions in glioma, we constructed a web-based platform (http://bio-bigdata.hrbmu.edu.cn/AGP-lnc/) for viewing the dynamic expression changes of lncRNAs during glioma progression (Figure 7). On this platform, users can select the glioma datasets from TCGA or CGGA. By entering the names of the lncRNAs of interest, users can obtain the expression of these lncRNAs across grades II, III, and IV. The resulting boxplots make the expression pattern of each lncRNA easy to read. If the expression of an lncRNA increases with the grade of glioma, it may be a candidate for testing oncogenic functions. In contrast, lncRNAs whose expression decreases during glioma progression might be candidate tumor suppressors. In addition, all of the datasets can be downloaded for further analyses. Taken together, this platform provides a better view of lncRNA functions during glioma progression.
Figure 7
Web-Based Platform for Glioma-Related lncRNAs/Genes
(A) Users can enter the lncRNAs/genes of interest and select the data source for viewing the dynamic expression changes during glioma progression. (B) The boxplots show the expression distribution of the lncRNAs/genes of interest in grade II, III, and IV glioma patients. (C) The files available for download from the platform. (D) The statistics of the platform.
Discussion
lncRNAs are involved in various biological processes in glioma cells, including apoptosis, cell proliferation, and invasion.39, 40 The dysregulation of lncRNA expression has been observed in various types of cancer, including glioma. However, we still lack knowledge on the functions of lncRNAs during glioma progression. Here, we integrated the genome-wide lncRNA expression profiles across 907 glioma patients of different grades with transcription regulation and functional genomics datasets. The integration analysis revealed critical lncRNAs that show dynamic expression and are regulated by grade-specific TFs. Moreover, we identified several oncogenic lncRNAs that are associated with the survival of patients, such as PWAR6 and CARD8-AS1. All of these results provide a valuable resource for further investigating the roles of lncRNAs during glioma progression.
Although a number of lncRNAs have been identified as dysregulated in cancer, we lack knowledge on their upstream regulators. In this study, we identified not only the dynamically expressed lncRNAs but also the candidate TF regulators based on a motif enrichment analysis. We revealed that these grade-specific expressed lncRNAs are regulated by grade-specific TFs. The promoters of these lncRNAs are significantly bound by these TFs. In addition, we found that these TFs also show higher expression at the corresponding grades. These results indicate that the dynamically expressed lncRNAs are likely to be regulated by these TFs and, in turn, to play a critical role in glioma progression. We also investigated the TF-lncRNA regulation based on public chromatin immunoprecipitation sequencing (ChIP-seq) data. We found that approximately 41% of the TF-lncRNA regulation identified based on TF binding analysis was also supported by the ChIP-seq data from ChIPBase v2.0. With the increasing amount of high-throughput sequencing data, such as ChIP-seq, we can obtain more details on the regulation of these oncogenic lncRNAs in cancer. In addition, evidence has indicated that miRNAs also play a critical role in regulating the expression of lncRNAs. We therefore predicted the miRNA regulators for two important lncRNAs identified here, PWAR6 and CARD8-AS1. We found that these two lncRNAs were regulated by several miRNAs that have been demonstrated to be involved in glioma, such as hsa-miR-184, hsa-miR-21-3p, and hsa-miR-20a-5p.
A large number of putative lncRNAs have been identified or predicted in humans; however, the functions of the majority of lncRNAs remain poorly characterized. To infer the possible functional roles of the dynamically expressed lncRNAs during glioma progression, we used a computational method that integrates lncRNA and mRNA expression profiles. Based on the coexpression network, we found that the majority of the coexpressed coding genes are involved in cancer hallmark-related functions, such as insensitivity to antigrowth signals, tissue invasion, and metastasis. Specifically, the GSEA indicated that CARD8-AS1 and PWAR6 are involved in the regulation of immune response, epithelial-mesenchymal transition (EMT), and DNA repair. Moreover, we also revealed that several lncRNAs are associated with patient survival and may serve as candidate prognostic biomarkers in glioma. Although these results provide evidence for the functions of these oncogenic lncRNAs, more experimental validation is needed to elucidate their detailed functions in glioma.
The current study utilized comprehensive bioinformatics analyses to determine the dynamic transcriptome landscapes of lncRNAs during glioma progression. Thus, it will be important to validate the expression dynamics of the key lncRNA regulators and their target genes by low-throughput experimental methods in the future.
Collectively, our study provides a foundation for understanding lncRNA expression and regulation during glioma progression.
Materials and Methods
Transcriptome during Glioma Progression
The genome-wide lncRNA and protein-coding gene expression profiles of glioma samples were downloaded from the TCGA project, including 476 lower-grade glioma (LGG) and 159 GBM samples.47, 48 In addition, we also downloaded the clinical information, including age, sex, grade, IDH1 mutation status, and survival time of these samples. There were 238 grade II, 238 grade III, and 159 grade IV samples for further analysis (Table 1). The expression of the lncRNAs and genes was measured as fragments per kilobase of transcript per million mapped reads (FPKM). To ensure detection reliability and reduce noise, we applied two filters used in one previous study in each cancer type to identify the expressed lncRNAs. First, the lncRNAs for which the 50th-percentile FPKM value was equal to zero were eliminated; second, we selected only the lncRNAs for which the 90th-percentile FPKM value was greater than 0.1 for further analysis. The expression value of each lncRNA was log-transformed.
In addition, we also obtained another RNA-seq dataset covering glioma progression from the GEO (GEO: GSE48865). The raw fastq files were downloaded and processed using Tophat for alignment and Cufflinks for assembly. Default options were used for both tools. The human reference genome GRCh37 and the corresponding gtf annotation were downloaded from GENCODE. We then obtained the lncRNA and gene expression profiles for 272 glioma samples, including 100 grade II, 72 grade III, and 100 grade IV samples (Table 1). The expression profiles were processed in the same way as the TCGA data. In addition, we also downloaded the clinical information of these samples from the CGGA project.
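The two expression filters and the log transformation described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function names, the linear-interpolation percentile, the toy FPKM values, and the choice of log2(FPKM + 1) are all assumptions, since the text does not state the log base or any pseudocount.

```python
import math

def percentile(values, q):
    """Percentile (q in [0, 100]) with linear interpolation; an assumed definition."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def filter_and_transform(fpkm_by_lncrna):
    """Drop lncRNAs whose median FPKM is 0 or whose 90th-percentile FPKM is
    not above 0.1, then log-transform the rest (assumed: log2(FPKM + 1))."""
    kept = {}
    for name, values in fpkm_by_lncrna.items():
        if percentile(values, 50) == 0:
            continue  # filter 1: 50th-percentile FPKM equal to zero
        if percentile(values, 90) <= 0.1:
            continue  # filter 2: 90th-percentile FPKM must exceed 0.1
        kept[name] = [math.log2(v + 1.0) for v in values]
    return kept

# Toy FPKM values across five samples (hypothetical identifiers).
toy = {
    "lnc_A": [0.0, 0.0, 0.0, 0.5, 2.0],       # median is 0, removed
    "lnc_B": [0.01, 0.02, 0.03, 0.04, 0.05],  # 90th percentile below 0.1, removed
    "lnc_C": [0.0, 1.0, 3.0, 7.0, 15.0],      # passes both filters
}
filtered = filter_and_transform(toy)
print(filtered)
```

Only `lnc_C` survives in this toy input; the other two entries fail the first and second filter, respectively.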
Identification of Glioma Progression-Associated lncRNAs and Genes
To identify glioma progression-associated lncRNAs and genes, we used a t test model with Benjamini-Hochberg (BH)-corrected p < 0.1 to select RNAs (including lncRNAs and coding genes) that were differentially expressed between adjacent grades. Specifically, for a specific gene or lncRNA i, we first calculated the standard test statistic for comparing two groups:
$$t_i = \frac{\bar{x}_{i,\mathrm{I}} - \bar{x}_{i,\mathrm{II}}}{s_i}$$
where $\bar{x}_{i,\mathrm{I}}$ is the mean expression value of gene or lncRNA i in group I, $\bar{x}_{i,\mathrm{II}}$ is the mean in group II, and $s_i$ is the within-groups SE for gene or lncRNA i. The t statistic follows the t distribution with n1 + n2 − 2 degrees of freedom, where n1 and n2 are the numbers of tumor samples in the two grades being compared. The p values were obtained from this t distribution.
Next, these lncRNAs or genes were filtered based on the fold change (FC) between two adjacent grades. The previous grade was set as the denominator (III versus II and IV versus III). The lncRNAs and genes with FC > 1 were considered to be upregulated during progression and grouped into the “Up” pattern. The lncRNAs and genes with FC < 1 were grouped into the “Down” pattern, and the remaining genes were considered nondifferentially expressed and grouped into “Maintain.” Thus, with three possible states for each of the two grade transitions, all lncRNAs and genes were grouped into one of nine possible patterns.
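As an illustration of the multiple-testing correction and the nine-pattern grouping described above, the sketch below applies the Benjamini-Hochberg procedure to a list of raw p values and labels each transition as Up, Down, or Maintain. It takes the t test p values as given inputs; the function names, the combined labels such as "Up-Maintain", and the toy numbers are assumptions made for the example.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def direction(fold_change, adj_p, alpha=0.1):
    """Label one grade transition; FC uses the lower grade as denominator."""
    if adj_p >= alpha or fold_change == 1:
        return "Maintain"
    return "Up" if fold_change > 1 else "Down"

def pattern(fc_3v2, p_3v2, fc_4v3, p_4v3):
    """Combine the II-to-III and III-to-IV labels (hypothetical label format)."""
    return direction(fc_3v2, p_3v2) + "-" + direction(fc_4v3, p_4v3)

# Toy raw p values for three lncRNAs in the grade III versus II comparison.
adj = bh_adjust([0.001, 0.04, 0.5])   # approximately [0.003, 0.06, 0.5]
print(adj)
print(pattern(2.0, adj[0], 1.5, 0.5))  # significant increase, then no change
```

Three states per transition and two transitions give the nine patterns mentioned in the text.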
Transcription Regulation Analysis of the lncRNAs and Genes
Promoter sequences (defined as the 2-kb regions around the transcription start sites) of the lncRNAs or genes were first downloaded from UCSC (University of California, Santa Cruz). The promoter sequences of glioma progression-related lncRNAs with different expression patterns were scanned with Match in TRANSFAC to identify the TFs that specifically bind to the promoters of the lncRNAs and genes. To identify the active TF-lncRNA or TF-gene regulation in glioma, we also calculated the expression correlation between TFs and lncRNAs or TFs and genes. The pairs with a correlation coefficient greater than 0.4 and p values less than 0.01 were retained for further analysis. For the lncRNA modules, we used a correlation coefficient of 0.75 as the cutoff.
In addition, we downloaded the TF-lncRNA regulation from ChIPBase v2.0 and identified the TF-lncRNA pairs that overlapped with the regulation predicted above. These shared pairs were taken as the common TF-lncRNA regulation.
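The correlation filter and the overlap with ChIP-seq-supported pairs can be illustrated with a short sketch. It assumes the motif-based candidates already carry a correlation coefficient and p value; the tuple layout, the function names, and the identifiers such as `TF_A` and `lnc_1` are hypothetical.

```python
def active_pairs(candidates, r_cutoff=0.4, p_cutoff=0.01):
    """Keep motif-predicted (tf, lncrna) pairs with r > 0.4 and p < 0.01.

    candidates: iterable of (tf, lncrna, r, p) tuples (assumed layout).
    """
    return {(tf, lnc) for tf, lnc, r, p in candidates
            if r > r_cutoff and p < p_cutoff}

def chipseq_support(predicted, chipseq_pairs):
    """Return the pairs shared with ChIP-seq data and the supported fraction."""
    shared = predicted & set(chipseq_pairs)
    fraction = len(shared) / len(predicted) if predicted else 0.0
    return shared, fraction

# Toy candidates with hypothetical identifiers and values.
candidates = [
    ("TF_A", "lnc_1", 0.62, 0.001),  # kept
    ("TF_A", "lnc_2", 0.35, 0.001),  # correlation too low
    ("TF_B", "lnc_1", 0.55, 0.2),    # p value too high
    ("TF_B", "lnc_3", 0.48, 0.004),  # kept
]
chipseq = [("TF_A", "lnc_1"), ("TF_C", "lnc_3")]

predicted = active_pairs(candidates)
shared, fraction = chipseq_support(predicted, chipseq)
print(shared, fraction)
```

In this toy input, one of the two retained pairs is also present in the ChIP-seq set, so the supported fraction is 0.5.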
Genomic Alterations of lncRNAs/Genes in Glioma
We also downloaded the genomic alterations data from the TCGA project. The copy number alterations for glioma were obtained from Broad GDAC Firehose (https://gdac.broadinstitute.org/). We used the 95% confidence level datasets. If the lncRNAs overlapped with the copy number alteration regions, we considered them as CNV altered. In addition, we also obtained the somatic mutation data from the MC3 file of the TCGA project. Genes with mutations were identified. We next collected the cancer-related genes from the Cancer Gene Census (CGC).
Functional Analysis of Glioma Progression-Related lncRNAs
To identify the functions of lncRNAs, the guilt-by-association approach was used in our analysis. We calculated the expression correlation coefficients between lncRNAs and protein-coding genes, and for each lncRNA we identified the genes with correlation coefficients (R) greater than 0.75 and p values < 0.01. In addition, we also identified the cancer hallmark gene-related lncRNAs and constructed the lncRNA-hallmark gene network.
Moreover, to predict the potential functions of glioma progression-related lncRNAs, we performed GSEA.31, 32 First, we calculated the correlation between the expression of each coding gene and that of a specific lncRNA. Next, the genes were ranked by their correlation coefficients and subjected to GSEA. Here, we focused on cancer hallmark-related functions. The Gene Ontology (GO) terms with FDR < 0.05 were regarded as potential functions of the specific lncRNAs.
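The ranking step that feeds GSEA can be sketched as below: each coding gene is scored by its Pearson correlation with one lncRNA, and the genes are sorted from most positively to most negatively correlated. This is only the preparation of the ranked list, not GSEA itself, and it omits the p value filter; the function names and toy expression vectors are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_genes(lncrna_expr, gene_expr):
    """Return (gene, r) pairs sorted by decreasing correlation with the lncRNA."""
    scores = {gene: pearson(lncrna_expr, values)
              for gene, values in gene_expr.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy expression across four samples (hypothetical gene identifiers).
lncrna = [1.0, 2.0, 3.0, 4.0]
genes = {
    "G1": [2.0, 4.0, 6.0, 8.0],
    "G2": [4.0, 3.0, 2.0, 1.0],
    "G3": [1.0, 3.0, 2.0, 4.0],
}
ranked = rank_genes(lncrna, genes)
coexpressed = [g for g, r in ranked if r > 0.75]  # R cutoff from the text
print(ranked, coexpressed)
```

The ranked list would then be passed to a pre-ranked GSEA implementation, while `coexpressed` corresponds to the R > 0.75 partner set used for the guilt-by-association network.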
Cell Lines and Transfection
U251 and A172 cells were passaged and maintained using standard techniques in 5% CO2 and 95% air, following the supplier’s instructions (American Type Culture Collection [ATCC]). The cells were transfected with a lentivirus vector carrying CARD8-AS1 shRNA following the manufacturer’s instructions (Genecard Technologies). Cell lines were purchased from and verified by ATCC and were maintained at low passage.
RNA Extraction and qRT-PCR Analysis
Total RNA was isolated from cultured cell lines using the RNeasy Mini Kit (Qiagen). RLT buffer was supplemented with 2-mercaptoethanol (Sigma-Aldrich), and DNase treatment was performed for 20 min using the RNase-Free DNase set (Qiagen). About 1 μg of total RNA was reverse transcribed into cDNA using random hexamers with the SuperScript III First-Strand Synthesis kit (Life Technologies). About 20 ng of cDNA was used in each qRT-PCR reaction with iQ SYBR Green supermix (Bio-Rad) and custom-designed primers. Gene expression was calculated relative to a control gene, either TATA-box binding protein (TBP) or GAPDH. qPCR data were expressed as mean FC (2^−ΔΔCT).
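For readers less familiar with the 2^−ΔΔCT calculation, the sketch below works through it for a single target gene normalized to a reference gene such as TBP or GAPDH. The function name and the CT values are illustrative assumptions, not measurements from this study.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCT method.

    dCT = CT(target) - CT(reference) within each condition;
    ddCT = dCT(treated) - dCT(control); fold change = 2 ** -ddCT.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical CT values: the target needs two more cycles after knockdown.
fc = fold_change(ct_target_treated=26.0, ct_ref_treated=18.0,
                 ct_target_control=24.0, ct_ref_control=18.0)
print(fc)  # ddCT = 2, so expression is one quarter of the control
```

A fold change below 1 indicates reduced expression relative to the control condition, as would be expected after an effective shRNA knockdown.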
Proliferation Assay and In Vitro Migration Assays
Following transfection, an accounted assay was used to quantitate the cell viability of human glioma cells. Each experiment was performed in triplicate. Cell scratch tests were used to quantify in vitro glioma cell migration. Fold migration was calculated relative to the blank control.
Apoptosis Assays
Apoptosis was quantified 48 hr after transfection, using Annexin V labeling. For the Annexin V assay, an Annexin V-FITC-labeled Apoptosis Detection Kit (Abcam) was used according to the manufacturer’s protocol.
Identification of Survival-Related lncRNAs and Genes
A Cox regression analysis was used to identify the lncRNAs or genes whose expression is associated with patient survival. A multivariate Cox regression model was used, in which age, sex, grade, and IDH1 mutation status were taken into account. The lncRNAs or genes with a p value for expression less than 0.05 were identified as glioma survival-related lncRNAs or genes. lncRNAs or genes with a hazard ratio (HR) greater than 1 were identified as risk lncRNAs or genes, and those with an HR less than 1 were identified as protective lncRNAs or genes.
The survival-related lncRNAs and genes were subjected to a coexpression analysis. We identified the lncRNA-gene pairs with a correlation coefficient greater than 0.75, from which the survival-related lncRNA-gene module was identified. Then, for each glioma patient, we calculated a risk score based on the Cox regression coefficients and the expression of the lncRNAs and genes in the functional module:
$$\mathrm{Risk\ score}_i = \sum_{k=1}^{n} \beta_k \, e_{ki}$$
where $\beta_k$ is the Cox regression coefficient of lncRNA or gene k, n is the number of lncRNAs and genes in the module, and $e_{ki}$ is the expression level of lncRNA or gene k in patient i. The regression coefficients were trained in the TCGA dataset and then applied unchanged to the CGGA dataset. Patients were divided into two groups based on the risk score, and a log rank test was used to evaluate the survival difference between the two groups. In addition, we also considered the IDH1 mutation status and classified the patients into four groups.
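The risk score described above is a coefficient-weighted sum of expression values, which the following sketch computes per patient before splitting patients into two groups. The text does not state the cutoff used for the split, so the median is an assumption here, as are the function names, the feature identifiers, and all numeric values.

```python
import statistics

def risk_scores(coefficients, expression):
    """Sum of beta_k * e_ki over module members k for each patient i.

    coefficients: {feature: Cox coefficient}
    expression:   {patient: {feature: expression level}}
    """
    return {patient: sum(beta * values[feature]
                         for feature, beta in coefficients.items())
            for patient, values in expression.items()}

def split_by_median(scores):
    """Label patients high or low risk using the median score (assumed cutoff)."""
    cutoff = statistics.median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}

# Hypothetical coefficients (as if trained on one cohort) and toy expression.
coefficients = {"lnc_1": 0.5, "gene_1": -0.25}
expression = {
    "P1": {"lnc_1": 2.0, "gene_1": 4.0},
    "P2": {"lnc_1": 4.0, "gene_1": 0.0},
    "P3": {"lnc_1": 0.0, "gene_1": 4.0},
    "P4": {"lnc_1": 6.0, "gene_1": 2.0},
}
scores = risk_scores(coefficients, expression)
groups = split_by_median(scores)
print(scores, groups)
```

To mirror the validation design in the text, the same `coefficients` dictionary would be reused without refitting on the independent cohort, and the resulting groups compared with a log rank test.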
Statistics and Visualization of Networks
All of the statistical analyses were performed using R Statistical Software, and the biological networks were visualized by Cytoscape. The genome-wide alteration plot for lncRNAs and genes was plotted by the Circos tool.
Implementation of the Web-Based Glioma-Related lncRNA Platform
The web site was developed in JSP using a Servlet framework and is deployed on a Tomcat 6.0.33 web server running under CentOS 5.5. The data are stored and administered in MySQL 5.5.1. CSS (Cascading Style Sheets) controls the layout and appearance of the atlas of glioma progression-related lncRNAs. ECharts is used to display the results as boxplots. The platform was fully tested in Google Chrome (version 17 and later).
Author Contributions
Y.L., L.W., and L.C. supervised the whole project; Y.L. and J.X. conceived of and designed the study; X.L., T.J., J.B., J.L., J.X., Y.T., T.S., X.J., and J.X. contributed to the data analysis; and Y.L., L.W., and J.X. wrote the manuscript with input from the other authors.
Authors: Daniel J Brat; Roel G W Verhaak; Kenneth D Aldape; W K Alfred Yung; Sofie R Salama; Lee A D Cooper; Esther Rheinbay; C Ryan Miller; Mark Vitucci; Olena Morozova; A Gordon Robertson; Houtan Noushmehr; Peter W Laird; Andrew D Cherniack; Rehan Akbani; Jason T Huse; Giovanni Ciriello; Laila M Poisson; Jill S Barnholtz-Sloan; Mitchel S Berger; Cameron Brennan; Rivka R Colen; Howard Colman; Adam E Flanders; Caterina Giannini; Mia Grifford; Antonio Iavarone; Rajan Jain; Isaac Joseph; Jaegil Kim; Katayoon Kasaian; Tom Mikkelsen; Bradley A Murray; Brian Patrick O'Neill; Lior Pachter; Donald W Parsons; Carrie Sougnez; Erik P Sulman; Scott R Vandenberg; Erwin G Van Meir; Andreas von Deimling; Hailei Zhang; Daniel Crain; Kevin Lau; David Mallery; Scott Morris; Joseph Paulauskis; Robert Penny; Troy Shelton; Mark Sherman; Peggy Yena; Aaron Black; Jay Bowen; Katie Dicostanzo; Julie Gastier-Foster; Kristen M Leraas; Tara M Lichtenberg; Christopher R Pierson; Nilsa C Ramirez; Cynthia Taylor; Stephanie Weaver; Lisa Wise; Erik Zmuda; Tanja Davidsen; John A Demchok; Greg Eley; Martin L Ferguson; Carolyn M Hutter; Kenna R Mills Shaw; Bradley A Ozenberger; Margi Sheth; Heidi J Sofia; Roy Tarnuzzer; Zhining Wang; Liming Yang; Jean Claude Zenklusen; Brenda Ayala; Julien Baboud; Sudha Chudamani; Mark A Jensen; Jia Liu; Todd Pihl; Rohini Raman; Yunhu Wan; Ye Wu; Adrian Ally; J Todd Auman; Miruna Balasundaram; Saianand Balu; Stephen B Baylin; Rameen Beroukhim; Moiz S Bootwalla; Reanne Bowlby; Christopher A Bristow; Denise Brooks; Yaron Butterfield; Rebecca Carlsen; Scott Carter; Lynda Chin; Andy Chu; Eric Chuah; Kristian Cibulskis; Amanda Clarke; Simon G Coetzee; Noreen Dhalla; Tim Fennell; Sheila Fisher; Stacey Gabriel; Gad Getz; Richard Gibbs; Ranabir Guin; Angela Hadjipanayis; D Neil Hayes; Toshinori Hinoue; Katherine Hoadley; Robert A Holt; Alan P Hoyle; Stuart R Jefferys; Steven Jones; Corbin D Jones; Raju Kucherlapati; Phillip H Lai; Eric Lander; Semin Lee; Lee Lichtenstein; 
Yussanne Ma; Dennis T Maglinte; Harshad S Mahadeshwar; Marco A Marra; Michael Mayo; Shaowu Meng; Matthew L Meyerson; Piotr A Mieczkowski; Richard A Moore; Lisle E Mose; Andrew J Mungall; Angeliki Pantazi; Michael Parfenov; Peter J Park; Joel S Parker; Charles M Perou; Alexei Protopopov; Xiaojia Ren; Jeffrey Roach; Thaís S Sabedot; Jacqueline Schein; Steven E Schumacher; Jonathan G Seidman; Sahil Seth; Hui Shen; Janae V Simons; Payal Sipahimalani; Matthew G Soloway; Xingzhi Song; Huandong Sun; Barbara Tabak; Angela Tam; Donghui Tan; Jiabin Tang; Nina Thiessen; Timothy Triche; David J Van Den Berg; Umadevi Veluvolu; Scot Waring; Daniel J Weisenberger; Matthew D Wilkerson; Tina Wong; Junyuan Wu; Liu Xi; Andrew W Xu; Lixing Yang; Travis I Zack; Jianhua Zhang; B Arman Aksoy; Harindra Arachchi; Chris Benz; Brady Bernard; Daniel Carlin; Juok Cho; Daniel DiCara; Scott Frazer; Gregory N Fuller; JianJiong Gao; Nils Gehlenborg; David Haussler; David I Heiman; Lisa Iype; Anders Jacobsen; Zhenlin Ju; Sol Katzman; Hoon Kim; Theo Knijnenburg; Richard Bailey Kreisberg; Michael S Lawrence; William Lee; Kalle Leinonen; Pei Lin; Shiyun Ling; Wenbin Liu; Yingchun Liu; Yuexin Liu; Yiling Lu; Gordon Mills; Sam Ng; Michael S Noble; Evan Paull; Arvind Rao; Sheila Reynolds; Gordon Saksena; Zack Sanborn; Chris Sander; Nikolaus Schultz; Yasin Senbabaoglu; Ronglai Shen; Ilya Shmulevich; Rileen Sinha; Josh Stuart; S Onur Sumer; Yichao Sun; Natalie Tasman; Barry S Taylor; Doug Voet; Nils Weinhold; John N Weinstein; Da Yang; Kosuke Yoshihara; Siyuan Zheng; Wei Zhang; Lihua Zou; Ty Abel; Sara Sadeghi; Mark L Cohen; Jenny Eschbacher; Eyas M Hattab; Aditya Raghunathan; Matthew J Schniederjan; Dina Aziz; Gene Barnett; Wendi Barrett; Darell D Bigner; Lori Boice; Cathy Brewer; Chiara Calatozzolo; Benito Campos; Carlos Gilberto Carlotti; Timothy A Chan; Lucia Cuppini; Erin Curley; Stefania Cuzzubbo; Karen Devine; Francesco DiMeco; Rebecca Duell; J Bradley Elder; Ashley Fehrenbach; Gaetano Finocchiaro; 
William Friedman; Jordonna Fulop; Johanna Gardner; Beth Hermes; Christel Herold-Mende; Christine Jungk; Ady Kendler; Norman L Lehman; Eric Lipp; Ouida Liu; Randy Mandt; Mary McGraw; Roger Mclendon; Christopher McPherson; Luciano Neder; Phuong Nguyen; Ardene Noss; Raffaele Nunziata; Quinn T Ostrom; Cheryl Palmer; Alessandro Perin; Bianca Pollo; Alexander Potapov; Olga Potapova; W Kimryn Rathmell; Daniil Rotin; Lisa Scarpace; Cathy Schilero; Kelly Senecal; Kristen Shimmel; Vsevolod Shurkhay; Suzanne Sifri; Rosy Singh; Andrew E Sloan; Kathy Smolenski; Susan M Staugaitis; Ruth Steele; Leigh Thorne; Daniela P C Tirapelli; Andreas Unterberg; Mahitha Vallurupalli; Yun Wang; Ronald Warnick; Felicia Williams; Yingli Wolinsky; Sue Bell; Mara Rosenberg; Chip Stewart; Franklin Huang; Jonna L Grimsby; Amie J Radenbaugh; Jianan Zhang Journal: N Engl J Med Date: 2015-06-10 Impact factor: 91.245
Authors: Goro Sashida; Narae Bae; Silvana Di Giandomenico; Takashi Asai; Nadia Gurvich; Elena Bazzoli; Yan Liu; Gang Huang; Xinyang Zhao; Silvia Menendez; Stephen D Nimer Journal: Cancer Res Date: 2011-05-26 Impact factor: 12.701
Authors: Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov Journal: Proc Natl Acad Sci U S A Date: 2005-09-30 Impact factor: 11.205