Manu Shivakumar, Jason E Miller, Venkata Ramesh Dasari, Radhika Gogoi, Dokyoon Kim.
Abstract
Endometrial cancer is the fourth most commonly diagnosed cancer in women. Family history is a known risk factor: endometrial cancer in a first-degree relative raises the relative risk to between 1.3 and 2.8. It is unclear to what extent novel germline variants contribute to endometrial cancer, or which variants are involved. We address this question by using whole exome sequencing to identify novel, rare variant associations between exonic regions and endometrial cancer. The MyCode community health initiative is an excellent resource for this study, with germline whole exome data available for 60,000 patients in the first phase and a further 30,000 patients independently sequenced in the second phase as part of the DiscovEHR study. We conducted an exome-wide rare variant association analysis using 472 cases and 4,110 controls from the 60,000 patients (discovery cohort), and 261 cases and 1,531 controls from the 30,000 patients (replication cohort). After binning rare germline variants into genes, case-control association tests were performed using the Optimal Unified Approach for Rare-Variant Association (SKAT-O). Seven genes (RBM12, NDUFB6, ATP6V1A, RECK, SLC35E1, and RFX3 at Bonferroni-corrected P < 0.05, and ATP8A1 at suggestive P < 10⁻⁵) and one long non-coding RNA, DLGAP4-AS1 (Bonferroni-corrected P < 0.05), were associated with endometrial cancer. Notably, RECK and ATP8A1 were replicated in the replication cohort (suggestive threshold P < 0.05). Additionally, a pathway-based rare variant analysis, using pathogenic and likely pathogenic variants, identified two significant pathways, pyrimidine metabolism and protein processing in the endoplasmic reticulum (Bonferroni-corrected P < 0.05).
In conclusion, our results, based on single-source electronic health records (EHR) linked to genomic data, highlight candidate genes and pathways associated with endometrial cancer and indicate that rare variants are involved in endometrial cancer predisposition, which could help in personalized prognosis and further our understanding of its genetic etiology.
Keywords: DiscovEHR cohort; EHR; EHR-linked Biobank; cancer predisposition gene; endometrial cancer; rare variant
Year: 2019 PMID: 31338326 PMCID: PMC6626914 DOI: 10.3389/fonc.2019.00574
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. Schematic overview of the association study. Blood samples were collected and sequenced as part of the MyCode and DiscovEHR projects. Phenotype information was pulled from the cancer registry and EHR.
Table 1. Characteristics of the study population.
| Characteristic | Cases (discovery) | Cases (replication) | Controls (discovery) | Controls (replication) |
| Number of patients | 472 | 261 | 4,110 | 1,531 |
| Age, average | 60.12 | 60.97 | 58.36 | 56.45 |
| Age, median | 60.00 | 61.00 | 59.00 | 57.00 |
| Age, standard deviation | 11.69 | 10.51 | 14.45 | 14.05 |
| BMI, average | 37.83 | 36.56 | 31.73 | 31.72 |
| BMI, median | 36.72 | 35.18 | 30.00 | 30.00 |
| BMI, standard deviation | 9.55 | 10.10 | 8.12 | 8.07 |
| Alive | 408 (86.4%) | 248 (95.0%) | 3,888 (94.6%) | 1,509 (98.6%) |
| Dead | 64 (13.6%) | 13 (5.0%) | 222 (5.4%) | 22 (1.4%) |
| Stage 1/2 | 252 (53.4%) | 134 (51.3%) | Not applicable | Not applicable |
| Stage 3/4 | 36 (7.6%) | 21 (8.0%) | Not applicable | Not applicable |
| Stage unknown | 184 (40.0%) | 106 (40.6%) | Not applicable | Not applicable |
| Family history of cancer | 355 (75.2%) | 205 (78.5%) | Not available | Not available |
Figure 2. Manhattan plot of SKAT-O association results. Each point represents one of the 20,385 genes, plotted by chromosome position (x-axis) against the log-transformed SKAT-O p-value (y-axis). All genes above the red line are Bonferroni significant, and genes above the blue line have a p-value < 1 × 10⁻⁵.
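The two horizontal lines in the plot can be illustrated with a minimal sketch. It assumes that the y-axis is −log10(p), that the family-wise error rate is 0.05, and that the Bonferroni correction divides by exactly the 20,385 plotted genes. The source does not state the number of tests used, so the red-line value here is only indicative, and the function names are hypothetical.

```python
import math

N_GENES = 20_385      # number of genes plotted, taken from the caption
ALPHA = 0.05          # assumed family-wise error rate
SUGGESTIVE_P = 1e-5   # blue-line threshold, taken from the caption


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def classify(p_value: float, n_tests: int = N_GENES) -> str:
    """Label a gene-level p-value relative to the two lines in the plot."""
    if p_value < bonferroni_threshold(ALPHA, n_tests):
        return "bonferroni"
    if p_value < SUGGESTIVE_P:
        return "suggestive"
    return "not significant"


# Heights of the two lines on a -log10(p) axis (assumed axis scaling).
red_line = -math.log10(bonferroni_threshold(ALPHA, N_GENES))
blue_line = -math.log10(SUGGESTIVE_P)
print(f"red line at {red_line:.2f}, blue line at {blue_line:.2f}")
```

Under these assumptions, a discovery p-value of 2.23E-06 falls above the red line, while 7.17E-06 falls between the blue and red lines.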
Table 2. Genes associated with endometrial cancer after Bonferroni correction in discovery analysis.
| Gene | Location | Discovery N locus | Discovery MAC Case | Discovery MAC Control | Discovery P-value | Discovery Bonferroni P | Replication N locus | Replication MAC Case | Replication MAC Control | Replication P-value |
| RBM12 | 20:35648925–35664956 | 74 | 34 (6.78%) | 164 (3.89%) | 6.82E-08 | 0.0014 | 64 | 77 (6.13%) | 520 (4.77%) | 0.5394 |
| NDUFB6 | 9:32553526–32573184 | 111 | 86 | 630 (14.08%) | 1.31E-07 | 0.0027 | 23 | 12 (4.6%) | 60 (3.92%) | 0.9169 |
| DLGAP4-AS1 | 20:36507702–36573275 | 98 | 37 (6.14%) | 225 (4.87%) | 3.86E-07 | 0.0081 | 49 | 33 (10.73%) | 205 (11.1%) | 0.8315 |
| ATP6V1A | 3:113747019–113812058 | 114 | 57 | 362 | 5.00E-07 | 0.0104 | 74 | 21 (7.66%) | 183 (10.12%) | 0.2754 |
| RECK | 9:36036905–36124455 | 188 | 158 (20.34%) | 1127 (21.56%) | 1.86E-06 | 0.0388 | 130 | 140 (27.59%) | 1142 (34.16%) | |
| SLC35E1 | 19:16549837–16572382 | 87 | 79 (14.83%) | 524 (12%) | 2.11E-06 | 0.0441 | 54 | 37 (14.18%) | 266 (16.2%) | 0.4076 |
| RFX3 | 9:3218297–3526529 | 176 | 178 (23.73%) | 1279 (21.44%) | 2.23E-06 | 0.0465 | 148 | 119 (31.42%) | 855 (37.17%) | 0.5394 |
| ATP8A1 | 4:42408373–42657105 | 268 | 302 (41.74%) | 2553 (39.1%) | 7.17E-06 | 0.1494 | 174 | 160 (37.55%) | 908 (39.32%) | |
The associations were adjusted for age, BMI, and four principal components. The table shows the number of loci, allele counts in cases and controls and association results for discovery and replication datasets.
N locus, total number of genomic loci binned in the gene; MAC Case, total minor allele count in the gene in the case population, with the percentage of samples that have rare variants; MAC Control, total minor allele count in the gene in the control population, with the percentage of samples that have rare variants.
Suggestive association: not Bonferroni significant but p-value < 10⁻⁵.
RECK and ATP8A1 (marked in bold in the original table) were replicated at the suggestive threshold P < 0.05.
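The MAC and percentage columns defined in the table notes can be illustrated with a short sketch. It assumes genotypes are coded as minor-allele counts (0, 1, or 2) per sample and per rare variant binned into a gene. The function name and the toy genotype matrix are hypothetical and are not data from the study.

```python
from typing import Sequence, Tuple


def mac_and_carrier_pct(genotypes: Sequence[Sequence[int]]) -> Tuple[int, float]:
    """Summarise rare variants binned into one gene.

    genotypes[s][v] is the minor-allele count (0, 1 or 2) of sample s at
    variant v. Returns the total minor allele count and the percentage of
    samples carrying at least one minor allele.
    """
    mac = sum(sum(row) for row in genotypes)
    carriers = sum(1 for row in genotypes if any(g > 0 for g in row))
    pct = 100.0 * carriers / len(genotypes) if genotypes else 0.0
    return mac, pct


# Toy case group: 5 samples x 3 rare variants (hypothetical values).
cases = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 2],
    [0, 0, 0],
    [0, 1, 1],
]
mac, pct = mac_and_carrier_pct(cases)
print(f"MAC = {mac}, carriers = {pct:.1f}%")
```

Because one sample can carry several minor alleles, the MAC can exceed the number of carriers, which is why the table reports both quantities.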
The variants with lower Prm in RECK.
| Location | Variant ID | Alleles | Amino acids | MAF | Consequence | Impact | Prm |
| 9:36083487 | rs754745207&COSM1177811 | A/G | A/T | 0.00010912 | missense_variant | MODERATE | 3.15E-06 |
| 9:36037075 | – | G/GGGGCCTGGCTC | GGLAP/GX | 0.00010912 | frameshift_variant | HIGH | 2.98E-06 |
| 9:36117172 | rs140337764 | C/T | C/R | 0.00021825 | missense_variant | MODERATE | 2.89E-06 |
| 9:36112408 | – | A/T | H/Q | 0.00021825 | missense_variant | MODERATE | 2.67E-06 |
| 9:36118904 | rs763992953 | A/G | E/K | 0.00021825 | missense_variant | MODERATE | 2.50E-06 |
| 9:36110029 | rs772507584&COSM1462342 | C/G | R/P | 0.00010912 | missense_variant | MODERATE | 2.30E-06 |
| 9:36037021 | rs557893747 | C/T | L/P | 0.00054633 | missense_variant | MODERATE | 2.13E-06 |
| 9:36037047 | rs139893051 | A/G | A/T | 0.00065517 | missense_variant | MODERATE | 1.94E-06 |
Figure 3. Plot of all variants with lower Prm in RECK that were classified as moderate or high impact by VEP. The y-axis represents negative log scaled Prm-Pval, where Pval is the original SKAT-O p-value listed in Table 2, and the x-axis is the relative genomic coordinate in the gene.
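The y-axis quantity can be sketched as follows, under one possible reading of the caption: that "Prm-Pval" is the difference Prm − Pval and that "negative log scaled" means −log10. Both are assumptions, as is the use of 1.86E-06 (the discovery p-value of the 9:36036905–36124455 row) as Pval. The Prm values are those listed for the RECK variants; the function name is hypothetical.

```python
import math
from typing import Optional

PVAL_RECK = 1.86e-06  # assumed Pval: discovery SKAT-O p-value for the RECK region

# Prm values as listed for the RECK variants, keyed by location.
prm_by_variant = {
    "9:36083487": 3.15e-06,
    "9:36037075": 2.98e-06,
    "9:36117172": 2.89e-06,
    "9:36112408": 2.67e-06,
    "9:36118904": 2.50e-06,
    "9:36110029": 2.30e-06,
    "9:36037021": 2.13e-06,
    "9:36037047": 1.94e-06,
}


def neg_log_delta(prm: float, pval: float) -> Optional[float]:
    """-log10(Prm - Pval); None when the difference is not positive."""
    delta = prm - pval
    if delta <= 0:
        return None
    return -math.log10(delta)


y = {loc: neg_log_delta(prm, PVAL_RECK) for loc, prm in prm_by_variant.items()}
for loc, value in y.items():
    print(f"{loc}\t{value:.3f}")
```

Under this reading, variants whose Prm is closest to the original p-value receive the largest y-values.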
Figure 4. Distribution of variant consequences as determined by VEP for all rare variants in significant genes.
The variants with lower Prm found in COSMIC.
| Location | rsID | COSMIC ID | Primary tissue |
| 20:36525992 | rs971669684 | COSM5601362, COSM5601361 | Skin |
| 20:35652667 | rs747020729 | COSM5039294 | Liver |
| 20:36548162 | rs7273824 | COSM3693464 | Large intestine, prostate |
| 9:3247951 | rs2229356 | COSM3763880 | Large intestine |
| 20:36526879 | rs114982034 | COSM4098029, COSM4098028 | Stomach |
| 3:113795187 | rs771311957 | COSM4583806, COSM1036559 | Bone, endometrium |
| 4:42485617 | rs370223580 | COSM1184146, COSM1184145 | Large intestine |
| 9:36083487 | rs754745207 | COSM1177811 | Endometrium |
| 19:16553901 | rs773244448 | COSM1391284, COSM1391283 | Large intestine |
| 4:42586383 | – | COSM3603994, COSM3603993 | Skin |
| 9:36110029 | rs772507584 | COSM1462342 | Large intestine |
| 4:42443590 | rs140420171 | COSM1184148, COSM1184147 | Large intestine, skin |
Somatic mutations at variant rs771311957 in ATP6V1A and variant rs754745207 in RECK were found in endometrium.
Primary tissue where the somatic mutations were found as cataloged by COSMIC database.
Entries in which the somatic mutation was found in endometrium according to the COSMIC database are marked in bold in the original table.
Pathways associated with endometrial cancer in discovery analysis.
| Pathway | N locus | MAC Case | MAC Control | P-value | FDR-adjusted P | Bonferroni P |
| Pyrimidine metabolism | 271 | 173 (30.29%) | 1384 (28.73%) | 5.86E-06 | 0.0019 | 0.0018 |
| Protein processing in endoplasmic reticulum | 302 | 96 (18%) | 654 (14.28%) | 9.60E-05 | 0.0152 | 0.0304 |
| Pentose and glucuronate interconversions | 120 | 85 (16.95%) | 670 (14.65%) | 8.31E-04 | 0.0704 | 0.2635 |
| Pancreatic secretion | 293 | 85 (15.68%) | 543 (12.04%) | 1.18E-03 | 0.0704 | 0.3728 |
| RNA polymerase | 69 | 20 (4.023%) | 152 (3.5%) | 1.28E-03 | 0.0704 | 0.4044 |
| Pantothenate and CoA biosynthesis | 64 | 26 (5.51%) | 164 (3.92%) | 1.33E-03 | 0.0704 | 0.4223 |
The associations were adjusted for age, BMI, and four principal components. The table shows the number of loci, allele counts in cases and controls and association results for discovery and replication datasets.
N locus, total number of genomic loci binned in the gene; MAC Case, total minor allele count in the gene in the case population, with the percentage of samples that have rare variants; MAC Control, total minor allele count in the gene in the control population, with the percentage of samples that have rare variants.
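The last two columns of the pathway table are numerically consistent with Benjamini-Hochberg (FDR) and Bonferroni adjustment of the raw p-values, which the following sketch illustrates. This is an inference, not something the source states. The number of pathways tested is also not given, so `M_ASSUMED = 317` is a hypothetical value chosen only because it roughly reproduces the reported figures, and the function names are illustrative.

```python
from typing import List

M_ASSUMED = 317  # hypothetical number of pathways tested (not stated in the source)

# Raw discovery p-values from the pathway table, in ascending order.
pvals = [5.86e-06, 9.60e-05, 8.31e-04, 1.18e-03, 1.28e-03, 1.33e-03]


def bonferroni(sorted_pvals: List[float], m: int) -> List[float]:
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    return [min(1.0, p * m) for p in sorted_pvals]


def benjamini_hochberg(sorted_pvals: List[float], m: int) -> List[float]:
    """BH-adjusted p-values for the smallest len(sorted_pvals) of m tests.

    Input must be sorted ascending. Only the listed p-values enter the
    running minimum, so unlisted tests could lower the results further.
    """
    adjusted = [min(1.0, p * m / rank)
                for rank, p in enumerate(sorted_pvals, start=1)]
    # Enforce monotonicity, working from the largest rank downwards.
    for i in range(len(adjusted) - 2, -1, -1):
        adjusted[i] = min(adjusted[i], adjusted[i + 1])
    return adjusted


bh = benjamini_hochberg(pvals, M_ASSUMED)
bonf = bonferroni(pvals, M_ASSUMED)
for p, q, b in zip(pvals, bh, bonf):
    print(f"{p:.2e}\tBH={q:.4f}\tBonferroni={b:.4f}")
```

Small differences from the tabulated values are expected, because the raw p-values are rounded and the true number of tests may differ from the assumed one.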
Cox regression results for gene NDUFB6.
| Cohort | Cases with rare variants | Cases without rare variants | P-value |
| Discovery | 74 (15.7%) | 398 (84.3%) | 0.039 |
| Replication | 12 (4.6%) | 249 (95.4%) | 0.037 |
Figure 5. Kaplan–Meier plot for the survival of patients with rare variants in NDUFB6.
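For readers unfamiliar with the curve type, a minimal Kaplan-Meier estimator is sketched below. The source gives no individual survival times, so the follow-up times and event indicators are hypothetical toy values, and the function name is illustrative. The Cox regression itself is not reproduced here.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct death time.

    events[i] is 1 if the patient died at times[i] and 0 if the patient
    was censored (still alive at last follow-up).
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Deaths are counted before censoring at tied times.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (months) and event flags for five patients.
carrier_times = [5, 8, 12, 12, 20]
carrier_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(carrier_times, carrier_events)
for t, s in curve:
    print(f"t={t}\tS(t)={s:.2f}")
```

Plotting one such step curve per group (patients with and without rare variants) gives the kind of comparison shown in the figure.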