Andrés López-Cortés, César Paz-Y-Miño, Alejandro Cabrera-Andrade, Stephen J Barigye, Cristian R Munteanu, Humberto González-Díaz, Alejandro Pazos, Yunierkis Pérez-Castillo, Eduardo Tejera.
Abstract
Consensus strategies have proved highly efficient in recognizing gene-disease associations. The main objective of this study was therefore to apply theoretical approaches to explore the genes and communities directly involved in breast cancer (BC) pathogenesis. We evaluated the consensus among 8 prioritization strategies for the early recognition of pathogenic genes. A communality analysis of the protein-protein interaction (PPi) network of the previously selected genes was enriched with gene ontology and metabolic pathway annotations, and validated against oncogenomics data from the OncoPPi and DRIVE projects. The consensus genes were rationally filtered to 1842 genes. The communality analysis showed an enrichment of 14 communities especially connected with the ERBB, PI3K-AKT, mTOR, FOXO, p53, HIF-1, VEGF, MAPK and prolactin signaling pathways. The highest-ranked genes were TP53, ESR1, BRCA2, BRCA1 and ERBB2. The genes with the highest connectivity degree were TP53, AKT1, SRC, CREBBP and EP300. The connectivity degree allowed us to establish a significant correlation between the OncoPPi network and our BC integrated network composed of 51 genes and 62 PPi. In addition, CCND1, RAD51, CDC42, YAP1 and RPA1 were functional genes with significant sensitivity scores in BC cell lines. In conclusion, the consensus strategy identifies both well-known pathogenic genes and prioritized genes that need to be further explored.
Year: 2018 PMID: 30420728 PMCID: PMC6232116 DOI: 10.1038/s41598-018-35149-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Identification (in %) of pathogenic genes in each approach.
| Methods | 1% G1 | 1% G2 | 1% G1 + G2 | 5% G1 | 5% G2 | 5% G1 + G2 | 10% G1 | 10% G2 | 10% G1 + G2 | 20% G1 | 20% G2 | 20% G1 + G2 | 50% G1 | 50% G2 | 50% G1 + G2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLAD4U | 6.8 | 4.5 | 3.2 | 15.3 | 15.3 | 12.3 | 20.3 | 22.5 | 19.4 | 32.2 | 34.2 | 30.3 | 45.8 | 47.7 | 43.9 |
| Disgenet | 0.0 | 0.0 | 0.0 | 1.7 | 1.8 | 1.3 | 8.5 | 4.5 | 3.2 | 10.2 | 9.0 | 6.5 | 15.3 | 12.6 | 9.7 |
| Genie | 3.4 | 1.8 | 1.3 | 5.1 | 2.7 | 2.6 | 6.8 | 4.5 | 4.5 | 47.5 | 27.9 | 31.0 | 67.8 | 55.0 | 56.1 |
| SNP3D | 11.9 | 8.1 | 5.8 | 22.0 | 26.1 | 20.6 | 35.6 | 37.8 | 32.9 | 44.1 | 54.1 | 47.7 | 59.3 | 65.8 | 60.6 |
| Guildify | 18.6 | 16.2 | 14.8 | 18.6 | 23.4 | 20.0 | 23.7 | 28.8 | 25.2 | 44.1 | 36.9 | 38.1 | 76.3 | 69.4 | 70.3 |
| Cipher | 3.4 | 2.7 | 1.9 | 5.1 | 7.2 | 5.8 | 13.6 | 14.4 | 12.3 | 20.3 | 16.2 | 15.5 | 25.4 | 21.6 | 20.0 |
| Phenolyzer | 47.5 | 29.7 | 31.6 | 79.7 | 55.0 | 60.6 | 86.4 | 71.2 | 74.2 | 88.1 | 85.6 | 85.2 | 94.9 | 98.2 | 96.8 |
| Polysearch | 0.0 | 0.0 | 0.0 | 1.7 | 0.9 | 0.6 | 1.7 | 0.9 | 0.6 | 3.4 | 1.8 | 1.3 | 5.1 | 4.5 | 3.2 |
| Consensus | 49.2 | 42.3 | 40.6 | 76.3 | 84.7 | 80.0 | 83.1 | 98.2 | 92.3 | 93.2 | 100.0 | 97.4 | 96.6 | 100.0 | 98.7 |
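
The percentages in this table can be read as the share of a reference set of pathogenic genes that falls within the top 1%, 5%, 10%, 20% or 50% of each method's ranked list. The following is a minimal sketch under that assumption; the function name `percent_recovered` and the gene labels are hypothetical and are not taken from the study.

```python
def percent_recovered(ranked_genes, pathogenic, top_percent):
    """Share (in %) of the reference pathogenic set found in the top slice.

    ranked_genes: list of gene identifiers, best-ranked first.
    pathogenic:   set of reference pathogenic genes (for example G1, G2 or G1 + G2).
    top_percent:  integer cutoff such as 1, 5, 10, 20 or 50.
    """
    n = len(ranked_genes)
    # Integer ceiling of n * top_percent / 100, keeping at least one gene.
    cutoff = max(1, (n * top_percent + 99) // 100)
    top_slice = set(ranked_genes[:cutoff])
    return 100.0 * len(top_slice & set(pathogenic)) / len(pathogenic)


# Hypothetical toy ranking of ten genes, three of them "pathogenic".
ranking = [f"g{i}" for i in range(1, 11)]
reference = {"g1", "g4", "g9"}
for pct in (20, 50):
    print(pct, round(percent_recovered(ranking, reference, pct), 1))
```

Whether the original analysis rounds the cutoff up or down is not stated here, so the ceiling used above is only one reasonable choice.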
Average ranking of identified pathogenic genes in each method.
| Methods | 1% G1 | 1% G2 | 1% G1 + G2 | 5% G1 | 5% G2 | 5% G1 + G2 | 10% G1 | 10% G2 | 10% G1 + G2 | 20% G1 | 20% G2 | 20% G1 + G2 | 50% G1 | 50% G2 | 50% G1 + G2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GLAD4U | 4.2 | 2.7 | 1.9 | 20.3 | 10.3 | 8.1 | 30.6 | 18.6 | 14.4 | 64.5 | 26.9 | 27.4 | 123.6 | 71.0 | 53.0 |
| Disgenet | 0.0 | 0.0 | 0.0 | 2.5 | 1.4 | 0.6 | 5.1 | 2.7 | 1.9 | 6.3 | 5.0 | 3.5 | 12.4 | 7.2 | 5.4 |
| Genie | 11.9 | 6.3 | 4.5 | 27.6 | 8.7 | 10.3 | 50.8 | 51.5 | 36.1 | 273.6 | 146.2 | 107.4 | 389.5 | 247.9 | 174.0 |
| SNP3D | 6.9 | 4.6 | 3.3 | 24.1 | 17.9 | 13.6 | 60.7 | 34.4 | 26.7 | 104.4 | 63.2 | 48.5 | 214.9 | 108.8 | 84.6 |
| Guildify | 97.8 | 39.5 | 31.4 | 97.8 | 120.1 | 78.6 | 424.7 | 226.9 | 169.0 | 1576.3 | 551.4 | 508.5 | 3531.5 | 1863.9 | 1370.9 |
| Cipher | 2.5 | 2.7 | 1.9 | 20.8 | 18.8 | 14.4 | 89.4 | 45.7 | 33.2 | 133.2 | 51.6 | 43.4 | 204.7 | 116.7 | 81.2 |
| Phenolyzer | 95.3 | 45.7 | 36.0 | 355.8 | 191.4 | 147.2 | 441.9 | 323.7 | 221.4 | 461.2 | 399.3 | 264.1 | 532.0 | 444.7 | 298.7 |
| Polysearch | 0.0 | 0.0 | 0.0 | 1.7 | 0.9 | 0.6 | 1.7 | 0.9 | 0.6 | 4.2 | 2.3 | 1.6 | 6.3 | 5.2 | 3.7 |
| Consensus | 91.7 | 66.9 | 46.2 | 372.5 | 271.3 | 189.5 | 510.5 | 400.2 | 277.0 | 989.5 | 430.4 | 356.2 | 1392.2 | 430.4 | 413.5 |
Figure 1. (a) Variation of I with respect to gene ranking. The maximal value of I is 0.787148315 and corresponds to a ranking value of 1842 genes. (b) Communality network analysis by the clique percolation method. Values of S with respect to each k-clique cutoff value. (c) Clustering result (3 clusters) integrating different communities. Green circles represent cluster 1, blue circles represent cluster 2, and purple circles represent cluster 3. The X-axis represents the average ranking of communities and the Y-axis represents the weight of pathogenic genes.
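
For readers unfamiliar with the clique percolation method named in panel (b), the sketch below shows the general technique in plain Python: enumerate maximal cliques, keep those with at least k nodes, and merge cliques that share at least k − 1 nodes into one community. It illustrates the standard algorithm on a hypothetical toy graph; it is not the software or the parameters used in the study, and the node labels are invented.

```python
from collections import defaultdict


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def k_clique_communities(edges, k):
    """Return communities as frozensets of nodes for a given clique size k."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]

    communities = []
    unseen = set(range(len(cliques)))
    while unseen:
        start = unseen.pop()
        stack = [start]
        members = set(cliques[start])
        while stack:
            i = stack.pop()
            for j in list(unseen):
                # Two cliques percolate into each other if they share k - 1 nodes.
                if len(cliques[i] & cliques[j]) >= k - 1:
                    unseen.remove(j)
                    stack.append(j)
                    members |= cliques[j]
        communities.append(frozenset(members))
    return communities


# Hypothetical toy graph: two triangles sharing an edge, plus a separate triangle.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G")]
print(sorted(sorted(c) for c in k_clique_communities(toy_edges, 3)))
```

Because communities are unions of overlapping cliques, a node can belong to more than one community, which is how a gene can appear in several communities at once.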
Pathway enrichment analysis (k-clique 9) and their associated weights.
| Pathways | PathRank | N Community | PathGene | PathScore | Community |
|---|---|---|---|---|---|
| ERBB signaling pathway | 0.815143 | 14 | 0.715853953 | 0.763886926 | 4 25 26 33 34 36 38 40 42 43 44 46 47 48 |
| Prolactin signaling pathway | 0.795867 | 15 | 0.72857406 | 0.761477386 | 4 6 11 33 34 36 38 39 40 42 43 44 46 47 48 |
| mTOR signaling pathway | 0.815500 | 4 | 0.687676019 | 0.748865671 | 4 36 42 44 |
| p53 signaling pathway | 0.735875 | 8 | 0.735254081 | 0.735564475 | 4 9 10 12 16 30 32 42 |
| FOXO signaling pathway | 0.787647 | 17 | 0.683991499 | 0.733991752 | 4 5 6 11 12 22 34 36 38 39 42 43 44 45 46 47 48 |
| HIF-1 signaling pathway | 0.796182 | 11 | 0.673983105 | 0.7325388 | 2 4 5 22 34 36 38 41 42 45 46 |
| VEGF signaling pathway | 0.799750 | 16 | 0.663653015 | 0.728530369 | 4 6 11 25 26 33 34 36 38 42 43 44 45 46 47 48 |
| Homologous recombination | 0.689800 | 5 | 0.744804648 | 0.716774892 | 9 24 27 30 32 |
| Thyroid hormone signaling pathway | 0.801071 | 14 | 0.626992865 | 0.708707323 | 4 5 10 20 28 33 34 35 36 37 43 44 46 47 |
| Adherens junction | 0.794533 | 15 | 0.630206366 | 0.70761569 | 4 5 11 25 26 28 33 36 38 40 43 44 46 47 48 |
| Adipocytokine signaling pathway | 0.831000 | 6 | 0.596127825 | 0.703833945 | 4 5 10 42 46 48 |
| TNF signaling pathway | 0.790667 | 12 | 0.621398946 | 0.700941819 | 4 6 11 16 36 39 41 42 45 46 47 48 |
| Neurotrophin signaling pathway | 0.794800 | 15 | 0.61762929 | 0.700636681 | 4 6 11 25 34 36 38 39 40 43 44 45 46 47 48 |
| B cell receptor signaling pathway | 0.839583 | 12 | 0.583361014 | 0.699842972 | 4 33 34 36 38 39 42 44 45 46 47 48 |
| Fc epsilon RI signaling pathway | 0.785500 | 14 | 0.623089264 | 0.699597468 | 4 6 11 25 33 34 36 38 40 43 44 46 47 48 |
| Cell cycle | 0.705455 | 11 | 0.681447933 | 0.693347346 | 4 5 9 10 12 13 22 29 30 32 47 |
| Insulin resistance | 0.854000 | 4 | 0.560416943 | 0.691806381 | 4 5 42 46 |
| PI3K-AKT signaling pathway | 0.802462 | 13 | 0.584009347 | 0.68457654 | 4 22 26 33 34 35 36 38 42 44 45 46 47 |
| Focal adhesion | 0.800353 | 17 | 0.576200699 | 0.679090513 | 4 11 22 25 26 33 34 36 38 40 42 43 44 45 46 47 48 |
| AMPK signaling pathway | 0.817000 | 4 | 0.562233667 | 0.677749885 | 4 10 42 44 |
| NOD-like receptor signaling pathway | 0.786500 | 10 | 0.580649858 | 0.675781853 | 4 6 11 36 39 41 43 46 47 48 |
| Sphingolipid signaling pathway | 0.782615 | 13 | 0.576929156 | 0.671947642 | 4 6 11 33 34 35 36 43 44 45 46 47 48 |
| T cell receptor signaling pathway | 0.776857 | 14 | 0.577623933 | 0.669874076 | 4 6 11 25 26 34 36 38 39 40 44 46 47 48 |
| JAK-STAT signaling pathway | 0.830000 | 6 | 0.523496172 | 0.659167523 | 4 10 34 42 44 46 |
| RAS signaling pathway | 0.780833 | 18 | 0.548420257 | 0.654388889 | 4 8 11 22 25 26 33 34 36 38 40 42 43 44 45 46 47 48 |
| Mismatch repair | 0.720200 | 5 | 0.582186126 | 0.647526407 | 9 15 24 30 32 |
| Estrogen signaling pathway | 0.731111 | 18 | 0.559789644 | 0.639740908 | 1 3 4 6 14 20 31 34 35 36 38 39 40 41 44 45 46 47 |
| MAPK signaling pathway | 0.777053 | 19 | 0.514896219 | 0.63253574 | 4 6 8 11 20 22 25 26 34 36 38 39 42 43 44 45 46 47 48 |
| RAP1 signaling pathway | 0.736048 | 21 | 0.539811636 | 0.630338853 | 1 4 6 11 14 22 25 26 31 33 34 35 36 38 42 43 44 45 46 47 48 |
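
The tabulated PathScore values appear to equal the geometric mean of PathRank and PathGene. This relationship is inferred from the numbers in the table rather than stated in the caption, so the check below should be read as a consistency test on three rows copied from the table, not as the authors' documented formula.

```python
from math import sqrt

# (PathRank, PathGene, PathScore) copied from three rows of the table above.
rows = {
    "ERBB signaling pathway": (0.815143, 0.715853953, 0.763886926),
    "mTOR signaling pathway": (0.815500, 0.687676019, 0.748865671),
    "p53 signaling pathway": (0.735875, 0.735254081, 0.735564475),
}


def geometric_mean(a, b):
    """Geometric mean of two non-negative numbers."""
    return sqrt(a * b)


for name, (path_rank, path_gene, path_score) in rows.items():
    estimate = geometric_mean(path_rank, path_gene)
    # Differences stay within the rounding of the tabulated PathRank values.
    print(f"{name}: estimate={estimate:.6f} tabulated={path_score:.6f}")
```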
Genes present in the most relevant communities in k-clique 9.
| Communities | Genes | Average | Average Rank | Average Degree | N pathogenic | Pathogenic genes/genes | HPT* (p-value) |
|---|---|---|---|---|---|---|---|
| 46 | CREBBP MAPK14 AKT1 SRC ESR1 JUN RAC3 CCND1 NFKB1 RELA | 0.939 | 147.4 | 138 | 4 | 0.400 | 0.007783988 |
| 45 | AKT1 MMP9 BCL2 VEGFA JUN TP53 TGFB1 IL6 FGF2 MMP2 | 0.924 | 181.8 | 181.8 | 7 | 0.700 | 3.25867E-06 |
| 47 | MAPK14 CTNNB1 MAPK8 RAC1 SRC ABL1 MAPK1 JUN RAC3 STAT3 TP53 CCND1 FOS | 0.899 | 240.62 | 45.62 | 3 | 0.231 | 0.098109212 |
| 42 | AKT1 VEGFA JUN LEP TGFB1 IGF1 IL6 INS SERPINE1 | 0.887 | 269.89 | 101.3 | 6 | 0.667 | 2.72754E-05 |
| 44 | CDH2 CTNNB1 AKT1 RAC1 SRC CDC42 CDH1 PIK3CA CCND1 | 0.885 | 275 | 141.11 | 4 | 0.444 | 0.00500697 |
| 30 | RPA1 RPA3 CDK4 RAD51C ATM ATR DMC1 NBN MRE11 RBBP8 H2AFX RAD51 | 0.862 | 328.83 | 42.67 | 5 | 0.417 | 0.002288344 |
| 37 | CREBBP PPARA MED1 NCOA1 CARM1 NCOA6 YAP1 CTGF WWTR1 NCOA2 | 0.862 | 330.1 | 60.6 | 0 | 0.000 | N/A |
| 41 | MMP9 VEGFA JUN STAT3 CXCL8 IL6 TIMP1 MMP2 IL1B | 0.853 | 352 | 80.2 | 5 | 0.556 | 0.000452371 |
| 43 | CDH2 MAPK14 CTNNB1 MAPK8 RAC1 SRC CDC42 ABL1 CCND1 | 0.849 | 365.56 | 124.67 | 2 | 0.222 | 0.182829173 |
| 38 | PIK3CA EGF EGFR GRB2 ERBB2 ERBB3 ERBB4 CBL PLCG1 | 0.848 | 362.33 | 89.3 | 3 | 0.333 | 0.037259742 |
| 48 | MAPK14 MAPK8 RAC1 SRC ABL1 MAPK1 LCK STAT3 FYN | 0.841 | 379.33 | 127.11 | 1 | 0.111 | 0.562833095 |
| 32 | CDK2 RPA1 RPA3 CDK4 ATM DMC1 MLH1 MRE11 BLM TOP3A H2AFX RAD51 | 0.824 | 421.25 | 48.75 | 2 | 0.250 | 0.080438401 |
| 5 | CREBBP SRA1 CITED2 PPARGC1A EP300 PPARA MED1 NRIP1 NCOA1 | 0.8 | 423.2 | 76.8 | 0 | 0.000 | N/A |
| 20 | CREBBP JUN TP53 ATF2 KAT2B SMARCB1 IRF1 NR3C1 SMARCE1 HMGB1 ARID1A | 0.8 | 398.7 | 85.4 | 1 | 0.091 | 0.636520998 |
*HPT: Hypergeometric probability test.
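
A hypergeometric test of this kind asks how likely it is to see at least the observed number of pathogenic genes in a community if its genes were drawn at random from the background. The sketch below computes that upper-tail probability with the standard library. The background size, the size of the pathogenic set and the tail convention used for the table are not given in this section, so the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement.

    population: total number of genes in the background.
    successes:  number of pathogenic genes in the background.
    draws:      number of genes in the community.
    k:          number of pathogenic genes observed in the community.
    """
    total = comb(population, draws)
    upper = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# Hypothetical example: 10 genes in the background, 4 pathogenic,
# a community of 3 genes containing 2 pathogenic ones.
print(hypergeom_upper_tail(2, 10, 4, 3))
```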
Figure 2. Communality network analysis for k-clique 9. Red nodes represent genes that are part of several communities. The other colors correspond to the most relevant communities obtained.
Figure 3. Circular chord diagram of the BC integrated network. PPi among the most relevant communities (k-clique 9), pathogenic genes (G1 + G2), PAM50 genes and genes of the most relevant KEGG signaling pathways in BC.
Figure 4. Significant correlation of degree centrality between the OncoPPi BC network and our BC integrated network (p < 0.05, r² = 0.23688). This sub-network is composed of genes from the most relevant communities (k-clique 9), pathogenic genes (G1 + G2), PAM50 genes, and genes of the ERBB, PI3K-AKT, FOXO, and HIF-1 signaling pathways in BC.
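
A degree-centrality comparison like the one in Figure 4 can be sketched as follows: count each gene's interactions in two networks, keep the genes present in both, and compute the squared Pearson correlation of the two degree vectors. This is a minimal illustration on hypothetical edge lists; the function names are invented, and it does not reproduce the study's networks or its significance test.

```python
from collections import Counter
from math import sqrt


def degrees(edges):
    """Number of interactions per node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def pearson_r2(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    r = cov / sqrt(var_x * var_y)
    return r * r


def degree_r2(edges_a, edges_b):
    """r^2 between node degrees of two networks, over their shared nodes."""
    deg_a, deg_b = degrees(edges_a), degrees(edges_b)
    shared = sorted(set(deg_a) & set(deg_b))
    return pearson_r2([deg_a[g] for g in shared], [deg_b[g] for g in shared])


# Hypothetical toy networks over the same four nodes.
net_a = [("a", "b"), ("a", "c"), ("a", "d")]
net_b = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]
print(round(degree_r2(net_a, net_b), 4))
```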
Figure 5. Oncogenomics validation with the DRIVE project. (a) Percentage of essential, active and inert genes in all cancer cell lines. (b) Percentage of genes with sensitivity score ≤−3 in >50%, 1–40%, and 0% of BC cell lines. (c) Venn diagram of genes with significant sensitivity score in >50% of BC cell lines.