Wan Li, Gui Deng, Ji Zhang, Erqiang Hu, Yuehan He, Junjie Lv, Xilin Sun, Kai Wang, Lina Chen.
Abstract
Breast cancer is one of the most common malignant cancers among females worldwide. This complex disease is not caused by a single gene but results from multi-gene interactions, which can be represented by biological networks. Network modules are composed of genes with significant similarities in expression, function and disease association. Identifying disease risk modules could therefore contribute to understanding the molecular mechanisms underlying breast cancer. In this paper, an integrated disease risk module identification strategy is proposed. It combines a multi-objective programming model over two similarity criteria with the significance of permutation tests on three scores: the Markov random field module score, the function consistency score and the Pearson correlation coefficient difference score. Three breast cancer risk modules were identified from a breast cancer-related interaction network. Genes in these risk modules were confirmed by literature review to play critical roles in breast cancer. The risk modules were enriched in breast cancer-related pathways or functions and distinguished breast tumor from normal samples with high accuracy, not only in the microarray dataset used for risk module identification but also in two independent datasets. The integrated strategy could be extended to other complex diseases to identify their risk modules and reveal their pathogenesis.
Keywords: breast cancer; disease risk module; integrated strategy; multi-objective programming model; permutation test
Year: 2019; PMID: 31860871; PMCID: PMC6949069; DOI: 10.18632/aging.102546
Source DB: PubMed; Journal: Aging (Albany NY); ISSN: 1945-4589; Impact factor: 5.682
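The abstract names permutation tests on a Pearson correlation coefficient difference score but does not give the formula. The following is a minimal sketch of how such a test could look, not the authors' implementation. It assumes the score is the mean absolute difference between tumor and normal Pearson correlations over all gene pairs in a module, and that significance is estimated by shuffling the tumor/normal labels. The function names `pearson`, `pcc_difference_score` and `permutation_p_value` are hypothetical.

```python
import random
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant profile as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def pcc_difference_score(expr, labels):
    """Mean |r_tumor - r_normal| over all gene pairs (assumed definition).

    expr   -- dict mapping gene name to a list of expression values, one per sample
    labels -- list of 1 (tumor) / 0 (normal), one per sample
    """
    tumor = [i for i, lab in enumerate(labels) if lab == 1]
    normal = [i for i, lab in enumerate(labels) if lab == 0]
    diffs = []
    for g1, g2 in combinations(sorted(expr), 2):
        r_t = pearson([expr[g1][i] for i in tumor], [expr[g2][i] for i in tumor])
        r_n = pearson([expr[g1][i] for i in normal], [expr[g2][i] for i in normal])
        diffs.append(abs(r_t - r_n))
    return mean(diffs)


def permutation_p_value(expr, labels, n_perm=1000, seed=0):
    """Empirical p-value obtained by shuffling the tumor/normal labels."""
    rng = random.Random(seed)
    observed = pcc_difference_score(expr, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pcc_difference_score(expr, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Under this reading, a module would count as significant when the returned p-value falls below a chosen threshold. The other two scores named in the abstract could be tested in the same way by swapping in a different score function.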
Figure 1. A schematic diagram of the integrated breast cancer risk module identification strategy.
Primary modules.
| Primary module | Genes | Seed genes | Non-seed genes |
| Primary module 1 | 91 | 15 | 76 |
| Primary module 2 | 61 | 14 | 47 |
| Primary module 3 | 59 | 13 | 46 |
| Primary module 4 | 6 | 5 | 1 |
| Primary module 5 | 7 | 5 | 2 |
| Primary module 6 | 4 | 3 | 1 |
Figure 2. Candidate modules.
Figure 3. P-values of permutation tests for candidate modules.
Figure 4. Breast cancer risk modules. Red nodes are seed genes, yellow are confirmed non-seed genes and blue are unconfirmed non-seed genes.
The confirmation rate of non-seed genes in breast cancer risk modules.
| Risk module | Non-seed genes | Confirmed non-seed genes | Confirmation rate |
| Module 1 | 28 | 20 | 71.43% |
| Module 2 | 15 | 11 | 73.33% |
| Module 3 | 22 | 15 | 68.18% |
| Total | 44 | 33 | 75.00% |
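The percentages in this table can be checked directly, reading the two numeric columns as the number of non-seed genes and the number of those confirmed (an interpretation that matches every reported rate). The short sketch below recomputes them; the names `counts` and `confirmation_rate` are illustrative only.

```python
# (non-seed genes, confirmed non-seed genes), copied from the table above
counts = {
    "Module 1": (28, 20),
    "Module 2": (15, 11),
    "Module 3": (22, 15),
    "Total": (44, 33),
}


def confirmation_rate(n_non_seed, n_confirmed):
    """Share of non-seed genes that were confirmed, as a percentage."""
    return round(100.0 * n_confirmed / n_non_seed, 2)


rates = {name: confirmation_rate(*pair) for name, pair in counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
```

The Total row is not the sum of the three module rows (28 + 15 + 22 = 65, not 44), which suggests that it counts distinct genes and that some non-seed genes appear in more than one module.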
Figure 5. Pathways and functions enriched by breast cancer risk modules.
Figure 6. The breast cancer pathway. Red nodes are seed genes and yellow are non-seed genes. Modules these genes belong to are marked beside them.
Figure 7. The ROC curves and AUC values of breast cancer risk modules and seed genes in these modules for GSE15852.
The classification accuracy with breast cancer risk modules and seed genes in risk modules as features for two additional datasets.
| AUC values with breast cancer risk modules | 0.899 | 0.893 | 0.889 | 0.985 | 0.989 | 0.992 |
| AUC values with seed genes in risk modules | 0.804 | 0.778 | 0.783 | 0.974 | 0.976 | 0.977 |
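For readers unfamiliar with the metric, an AUC value such as those above equals the probability that a randomly chosen tumor sample receives a higher classifier score than a randomly chosen normal sample. The sketch below computes it from that definition. It assumes per-sample scores from some classifier are already available; the page does not say which classifier the authors used, and the function name `auc` is hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores -- classifier outputs, higher meaning "more tumor-like"
    labels -- 1 for tumor, 0 for normal, aligned with scores
    Ties between a tumor and a normal score count as half a win.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise form is quadratic in the number of samples, which is acceptable for microarray cohorts of a few hundred samples. A rank-based computation gives the same value more efficiently for larger sets.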
Figure 8. The number of common genes in risk modules from all samples and from random samples. Blue dots represent the number of genes in breast cancer risk modules from all samples. Boxplots represent the distribution of the number of genes shared between breast cancer risk modules from all samples and risk modules from random samples.
Figure 9. The number of genes and classification accuracy for cliques/modules detected by MClique, MCODE and GraphWeb.
Figure 10. The number of genes and AUC values for candidate modules discovered using different criteria.
The classification accuracy of genes removed from primary modules and of non-seed genes remaining in breast cancer risk modules.
| Module 1 | 0.856 | 0.889 |
| Module 2 | 0.795 | 0.878 |
| Module 3 | 0.847 | 0.908 |
The classification accuracy of breast cancer risk modules for breast cancer subtypes.
| Basal vs Her2 | 0.978 | 0.964 | 0.985 |
| Basal vs LumA | 0.999 | 0.999 | 0.996 |
| Basal vs LumB | 0.995 | 0.990 | 0.991 |
| Her2 vs LumA | 0.978 | 0.983 | 0.972 |
| Her2 vs LumB | 0.938 | 0.913 | 0.902 |
| LumA vs LumB | 0.845 | 0.863 | 0.842 |
Breast cancer-associated genes and their source databases.
| BARD1 | √ | √ | ||
| BRCA1 | √ | √ | √ | |
| BRCA2 | √ | √ | √ | |
| RB1 | √ | √ | √ | |
| TP53 | √ | √ | √ | |
| AKT1 | √ | √ | √ | |
| ARID1A | √ | √ | ||
| ARID1B | √ | √ | ||
| BAP1 | √ | √ | ||
| CASP8 | √ | √ | ||
| CCND1 | √ | √ | √ | √ |
| CDH1 | √ | √ | √ | |
| CDKN1B | √ | √ | √ | |
| CTCF | √ | √ | ||
| EP300 | √ | √ | ||
| ERBB2 | √ | √ | √ | √ |
| ESR1 | √ | √ | √ | |
| FOXA1 | √ | √ | ||
| GATA3 | √ | √ | ||
| IRS4 | √ | √ | ||
| MAP2K4 | √ | √ | ||
| MAP3K1 | √ | √ | ||
| MAP3K13 | √ | √ | ||
| NCOR1 | √ | √ | ||
| NOTCH1 | √ | √ | ||
| NTRK3 | √ | √ | ||
| PBRM1 | √ | √ | ||
| PIK3CA | √ | √ | √ | |
| PPM1D | √ | √ | ||
| SMARCD1 | √ | √ | ||
| TBX3 | √ | √ | ||
| ZMYM3 | √ | √ |