Ángela Santonja, Aurelio A Moya-García, Nuria Ribelles, Begoña Jiménez-Rodríguez, Bella Pajares, Cristina E Fernández-De Sousa, Elísabeth Pérez-Ruiz, María Del Monte-Millán, Manuel Ruiz-Borrego, Juan de la Haba, Pedro Sánchez-Rovira, Atocha Romero, Anna González-Neira, Ana Lluch, Emilio Alba.
Abstract
Most cancer-related deaths in breast cancer patients are associated with metastasis, a multistep, intricate process that requires the cooperation of tumour cells, the tumour microenvironment and metastasis target tissues. It is accepted that metastasis depends not on the tumour characteristics but on the host's genetic makeup. However, there has been limited success in determining the germline genetic variants that influence metastasis development, mainly because traditional genome-wide association studies are limited in their ability to detect the genetic polymorphisms underlying complex phenotypes. In this work, we leveraged the extreme discordant phenotypes approach and epistasis networks to analyse the genotypes of 97 breast cancer patients. We found that the host's genetic makeup facilitates metastasis through the dysregulation of gene expression, which can promote the dispersion of metastatic seeds and help establish the metastatic niche, providing a congenial soil for the metastatic seeds.
Keywords: breast cancer; epistasis; germline variants; network analysis; seed and soil
Year: 2022 PMID: 35782051 PMCID: PMC9245581 DOI: 10.18632/oncotarget.28250
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Top SNPs ranked by SNPrank score
| SNP ID | Gene | Chromosome | SNPrank score | SNP location | MAF |
|---|---|---|---|---|---|
| rs11139965 | RN7SKP242 | 9 | 0.5898 | intergenic | 0.170 |
| rs67242866 | LINC01362 | 1 | 0.5784 | intergenic | 0.028 |
| rs199830092 | PIK3C2B | 1 | 0.5749 | intron variant | NA |
| rs72951131 | ZAP70 | 2 | 0.5587 | intron variant | 0.183 |
| rs34000182 | MIR5689HG | 6 | 0.5485 | intron variant | 0.053 |
| rs61776380 | RNU6-830P | 1 | 0.5444 | intergenic | 0.065 |
| rs778902 | ISCA1P2 | 1 | 0.5378 | intergenic | 0.098 |
| rs2501357 | C1orf204 | 1 | 0.5360 | intron variant | 0.306 |
| rs742635 | ABTB2 | 11 | 0.5349 | intron variant | 0.085 |
| rs16927008 | CLVS1 | 8 | 0.5324 | intron variant | 0.027 |
| rs4849127 | IL1B | 2 | 0.5257 | downstream gene variant | 0.125 |
| rs41263676 | C1orf21 | 1 | 0.5251 | 3’UTR variant | 0.131 |
| rs11692741 | MYT1L | 2 | 0.5211 | intron variant | 0.180 |
| rs77142354 | MIR4300HG | 11 | 0.5165 | intron variant | 0.039 |
| rs77162747 | ZNF767P | 7 | 0.5144 | intergenic | 0.072 |
| rs77169575 | MAML3 | 4 | 0.5142 | intron variant | 0.198 |
| rs9994379 | ELOVL6 | 4 | 0.5136 | intron variant | 0.034 |
| rs13421497 | PRKCE | 2 | 0.5134 | intergenic | 0.073 |
| rs1197934 | LINGO2 | 9 | 0.5128 | intron variant | 0.112 |
| rs75394800 | EML6 | 2 | 0.5115 | intron variant | 0.033 |
| rs3799142 | VIP | 6 | 0.5039 | downstream gene variant | 0.172 |
Abbreviation: MAF: minor allele frequency.
Figure 1. Epistasis network encoding the susceptibility to metastasis in our cohort.
The genes with high community centrality are represented in blue. The right panel highlights the participation of two community-central genes in several communities by the colour of their links.
Figure 2. Gene regulatory network of breast cancer metastasis.
Network communities are depicted in different colours and annotated according to the enriched functions of their genes.
Role of the metastasis influence genes on the metastasis regulatory network, their expression in models of metastatic breast tumours and their association with distant metastasis-free survival
| Gene | Regulon size | Central in the metastasis regulatory network | Bow-tie core | Expression in BC | Differentially expressed | Germline-somatic interaction | Associated with DMFS | Regulon associated | Stemness phenotype | Implicated in metastasis |
|---|---|---|---|---|---|---|---|---|---|---|
| AR | 5 | yes | BL | cell line 2 | – | NKI; METABRIC | [ | |||
| BACH2 | 3 | BL, HER2, LUM, NL | mouse | – | ||||||
| CALN1 | NA | BL, HER2, LUM, NL | – | NA | ||||||
| CDCA8 | NA | BL, HER2, LUM, NL | cell line 1 | NKI; METABRIC; | NA | yes | [ | |||
| CLEC14A | NA | yes | BL, HER2, LUM, NL | – | – | NA | [ | |||
| COL10A1 | NA | BL, HER2, LUM, NL | cell line 2 | VDX | NA | yes | [ | |||
| COMP | NA | BL, HER2, LUM, NL | – | yes | TRANSBIG | NA | yes | [ | ||
| EBF1 | 13 | yes | BL, HER2, LUM, NL | cell line 1; mouse | yes | – | MAINZ | yes | ||
| EN1 | 3 | yes | BL | cell line 1 | yes | – | MAINZ | [ | ||
| EN2 | 2 | BL, HER2, LUM, NL | cell line 1; cell line 2 | yes | – | NKI; METABRIC | [ | |||
| EXO1 | NA | BL, HER2, LUM, NL | – | MAINZ; UNT | NA | yes | [ | |||
| FLI1 | 4 | BL, HER2, LUM, NL | – | MAINZ; UNT | MAINZ; METABRIC; | [ | ||||
| GNA14 | NA | BL, HER2, LUM, NL | cell line 2 | yes | – | NA | [ | |||
| GPIHBP1 | NA | yes | BL, HER2, LUM, NL | mouse | – | NA | yes | |||
| GRM7 | NA | BL, HER2, LUM, NL | – | yes | – | NA | ||||
| L3MBTL4 | 6 | yes | BL, HER2, LUM, NL | – | UNT | – | yes | |||
| LHX2 | 3 | BL, HER2, LUM, NL | cell line 1 | TRANSBIG; METABRIC | METABRIC | [ | ||||
| LRP1B | NA | BL, HER2, LUM, NL | – | yes | – | NA | [ | |||
| LRRC4B | NA | BL, HER2, LUM, NL | – | yes | – | NA | ||||
| MEF2A | 4 | BL, HER2, LUM, NL | – | yes | – | TRANSBIG | [ | |||
| METTL11B | NA | yes | BL, HER2, LUM, NL | – | – | NA | yes | |||
| NEIL3 | NA | yes | BL, HER2, LUM, NL | cell line 1 | TRANSBIG; METABRIC | NA | yes | [ | ||
| NEK2 | NA | yes | BL, HER2, LUM, NL | cell line 1; mouse | NKI | NA | yes | [ | ||
| NFE2L3 | 3 | yes | BL, HER2, LUM, NL | cell line 1; cell line 2 | – | – | [ | |||
| NMNAT3 | NA | BL, HER2, LUM, NL | – | – | NA | |||||
| NR3C1 | 5 | yes | BL, HER2, LUM, NL | cell line 2; mouse | yes | – | NKI; TRANSBIG; | yes | [ | |
| RP9P | NA | BL, HER2, LUM, NL | – | – | NA | |||||
| RPS6KA2 | NA | BL, HER2, LUM, NL | – | – | NA | [ | ||||
| SALL4 | 6 | BL, HER2, LUM, NL | cell line 1; cell line 2 | yes | – | MAINZ | yes | [ | ||
| SMAD3 | 3 | BL, HER2, LUM, NL | cell line 1 | – | – | [ | ||||
| SMARCD3 | NA | BL, HER2, LUMB | – | – | NA | [ | ||||
| SMYD3 | 6 | yes | BL, HER2, LUM, NL | cell line 1 | – | – | [ | |||
| SPARCL1 | NA | yes | BL, HER2, LUM, NL | cell line 2 | NKI | NA | yes | [ | ||
| STARD8 | NA | yes | BL, HER2, LUM, NL | – | – | NA | [ | |||
| TMEM132C | NA | BL, HER2, LUM, NL | – | – | NA | yes | ||||
| TNS1 | NA | yes | BL, HER2, LUM, NL | cell line 1; mouse | – | NA | yes | [ | ||
| TSHZ2 | 28 | yes | BL, HER2, LUM, NL | – | yes | – | MAINZ; NKI; | yes | [ | |
| TUBA1C | NA | BL, HER2, LUM, NL | mouse | METABRIC | NA | yes | [ | |||
| ZNF385D | 3 | BL, HER2, LUM, NL | – | yes | – | METABRIC | yes |
A gene is central in the metastasis regulatory network if its community centrality (a measure of its importance based on the number of network communities to which the gene belongs) is in the top 20%. A transcription factor is in the bow-tie core if its bow-tie score is in the top 10%. Abbreviations: BL, basal-like; HER2, HER2-enriched; LUM, luminal; LUMB, luminal-B; NL, normal-like; DMFS, distant metastasis-free survival. Cell line 1: the gene is upregulated in the metastatic breast cancer cell line MDA-MB-468GFP vs. the poorly metastatic breast cancer cell line MDA-MB-468LN. Cell line 2: the gene is upregulated in the metastatic breast cancer cell line MCF7 vs. the mammary epithelium cell line MCF10. Germline-somatic interaction: the gene harbours germline variations associated with somatic events. Stemness phenotype: the gene has an expression profile across the TCGA breast tumour samples that is significantly correlated with the mRNA stemness index (see Methods). Associated with DMFS: the gene is significantly more associated with DMFS than random genes in the indicated expression datasets. Implicated in metastasis: references to the studies that show how the gene participates in the molecular mechanisms of metastasis.
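The top-percentile rules in the table note (top 20% for community centrality, top 10% for the bow-tie score) can be illustrated with a minimal sketch. The function name `top_fraction`, the placeholder gene labels and the score values are hypothetical, and keeping ties at the cutoff is an assumption; the source does not say how ties were handled.

```python
import math


def top_fraction(scores, fraction):
    """Return the keys whose score lies in the top `fraction` of all scores.

    Ties at the cutoff value are kept (an assumption, not from the source).
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not scores:
        return set()
    # Rank keys from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * fraction))
    cutoff = scores[ranked[n_keep - 1]]
    return {key for key, value in scores.items() if value >= cutoff}


# Hypothetical centrality values for ten placeholder genes.
centrality = {f"gene_{i}": float(i) for i in range(10)}
central_genes = top_fraction(centrality, 0.20)
print(sorted(central_genes))
```

The same helper applies to the bow-tie criterion by passing `0.10` and a dictionary of bow-tie scores.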
Figure 3. A pipeline of the epistasis network modelling with Encore.
We used as input the .bed/.bim/.fam files from PLINK. 1) The linkage disequilibrium pruning step removes highly correlated (i.e. poorly informative) SNPs. 2) Evaporative cooling is a machine learning method that integrates multiple importance scores while removing irrelevant genetic variants. In this step, we kept the 10,000 most relevant SNPs, a substantial reduction from the initial ~4.3 million. 3) After filtering, Encore calculates the pairwise interactions among the 10,000 SNPs with a generalised linear model. It computes a matrix of epistatic interactions among SNPs with Benjamini-Hochberg false discovery rate corrected p-values (reGAIN matrix). From that matrix, SNPs are ranked and filtered with SNPrank; we kept 2016 SNPs. 4) We obtained the names of the genes in or near (within 1 Mb of) the most relevant SNPs with the R library PostGWAS [50]. 5) Finally, we ranked the most relevant genes by their community centrality (using link communities [51]); genes are important if they participate in many communities.
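Step 3 relies on Benjamini-Hochberg false discovery rate correction of the interaction p-values. The sketch below shows the standard procedure in plain Python. It is a generic illustration, not Encore's own code, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four SNP-SNP interaction terms.
raw = [0.01, 0.04, 0.03, 0.005]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

An interaction would then be kept if its adjusted p-value falls below the chosen false discovery rate threshold; the threshold used in the study is not stated in this section.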