Natalie C Fonville, Zalman Vaksman, Lauren J McIver, Harold R Garner.
Abstract
Ovarian cancer (OV) ranks fifth in cancer deaths among women, yet there remain few informative biomarkers for this disease. Microsatellites are repetitive genomic regions which we hypothesize could be a source of novel biomarkers for OV and have traditionally been under-appreciated relative to Single Nucleotide Polymorphisms (SNPs). In this study, we explore microsatellite variation as a potential novel source of genomic variation associated with OV. Exomes from 305 OV patient germline samples and 54 tumors, sequenced as part of The Cancer Genome Atlas, were analyzed for microsatellite variation and compared to healthy females sequenced as part of the 1,000 Genomes Project. We identified a subset of 60 microsatellite loci with genotypes that varied significantly between the OV and healthy female populations. Using these loci as a signature set, we classified germline genomes as 'at risk' for OV with a sensitivity of 90.1% and a specificity of 87.6%. Cross-analysis with a similar set of breast cancer associated loci identified individuals 'at risk' for both diseases. This study revealed a genotype-based microsatellite signature present in the germlines of individuals diagnosed with OV, and provides the basis for a potential novel risk assessment diagnostic for OV and new personal genomics targets in tumors.
Keywords: 1000 genomes project; The Cancer Genome Atlas; biomarkers; breast cancer; ovarian cancer
Year: 2015 PMID: 25779658 PMCID: PMC4484465 DOI: 10.18632/oncotarget.2933
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Statistically significant loci that differentiate healthy from OV cancer germlines
Loci marked in bold were also informative for breast cancer using a similar approach. ENCODE and other element designations (from the UCSC browser) are as follows: 1 – transcription factor binding site, 2 – DNaseI hypersensitivity locus, 3 – spliced EST, 4 – H3K27Ac mark (found near active regulatory elements), 5 – human mRNA.
| Microsatellite Locus | Motif | Region | Gene | Encode element / other | 1kGP samples genotyped | Percent 1kGP Non-modal | OV Samples Genotyped | Percent OV Non-modal | Relative Risk |
|---|---|---|---|---|---|---|---|---|---|
| chr5:122714135–122714152 | A | intron | CEP120 | | 50 | 10% | 70 | 64% | 6.4 |
| chr2:91886031–91886042 | A | intergenic | – | | 186 | 14% | 240 | 48% | 3.4 |
| chr5:158511580–158511594 | A | intron | EBF1 | 1 | 33 | 24% | 38 | 82% | 3.4 |
| chr10:69699479–69699497 | AT | intron | HERC4 | | 79 | 14% | 120 | 39% | 2.8 |
| chr2:223339530–223339550 | T | intron | SGPP2 | | 22 | 36% | 41 | 85% | 2.3 |
| chr7:81695843–81695858 | A | intron | CACNA2D1 | | 42 | 38% | 83 | 87% | 2.3 |
| chr18:21120382–21120397 | A | intron | NPC1 | 1, 2 | 42 | 43% | 60 | 93% | 2.2 |
| chr13:49951024–49951057 | ATAG | intron | CAB39L | | 140 | 26% | 217 | 52% | 2.0 |
| chr2:234368716–234368729 | A | intron | DGKD | | 20 | 50% | 114 | 93% | 1.9 |
| chr11:30438959–30438973 | T | intron | MPPED2 | | 40 | 53% | 87 | 14% | 1.8 |
| chr12:75901962–75901976 | A | intron | KRR1 | | 41 | 51% | 94 | 12% | 1.8 |
| chr14:91928846–91928860 | T | intron | SMEK1 | 2 | 28 | 54% | 42 | 95% | 1.8 |
| chr1:149900986–149901001 | A | exon | MTMR11 | 2, 3, 5 | 37 | 51% | 54 | 89% | 1.7 |
| chr9:52626–52640 | A | intergenic | - | | 44 | 55% | 90 | 92% | 1.7 |
| chr20:44333327–44333340 | T | intron | WFDC10B | | 80 | 45% | 90 | 76% | 1.7 |
| chr4:186188374–186188387 | A | intron | SNX25 | | 75 | 47% | 127 | 12% | 1.7 |
| chr11:116691512–116691528 | ACAG | exon | APOA4 | 5 | 156 | 53% | 222 | 84% | 1.6 |
| chr9:133498230–133498244 | A | intron | FUBP3 | | 37 | 41% | 83 | 8% | 1.5 |
| chr8:121518869–121518882 | T | intron | MTBP | | 39 | 36% | 72 | 6% | 1.5 |
| chr12:22676634–22676648 | A | intron | C2CD5 | | 52 | 40% | 97 | 12% | 1.5 |
| chr10:93579112–93579132 | T | intron | TNKS2 | | 43 | 37% | 103 | 8% | 1.5 |
| chr17:47899281–47899294 | A | intron | KAT7 | 3 | 30 | 33% | 81 | 2% | 1.5 |
| chr3:50095097–50095118 | T | intron | RBM6 | | 61 | 36% | 76 | 8% | 1.4 |
| chr7:36465607–36465621 | T | intron | ANLN | | 90 | 44% | 154 | 21% | 1.4 |
| chr19:21558016–21558032 | TG | intron | ZNF738 | | 159 | 48% | 186 | 27% | 1.4 |
| chr12:106500161–106500174 | A | intron | NUAK1 | 1, 2 | 53 | 32% | 121 | 5% | 1.4 |
| chr17:57078816–57078830 | A | intron | TRIM37 | 1, 4 | 33 | 27% | 100 | 1% | 1.4 |
| chr1:169555368–169555380 | A | intron | F5 | 1 | 82 | 28% | 161 | 4% | 1.3 |
| chr4:22444252–22444266 | A | intron | GPR125 | | 77 | 26% | 111 | 4% | 1.3 |
| chr17:66041872–66041885 | T | intron | KPNA2 | 3 | 69 | 30% | 159 | 10% | 1.3 |
| chr6:76728584–76728597 | A | intron | IMPG1 | | 68 | 24% | 111 | 3% | 1.3 |
| chr10:22515002–22515024 | A | intergenic | - | | 54 | 22% | 111 | 2% | 1.3 |
| chr5:86679677–86679690 | T | intron | RASA1 | 4 | 67 | 21% | 116 | 1% | 1.3 |
| chr10:94266331–94266345 | T | intron | IDE | 2 | 82 | 18% | 75 | 0% | 1.2 |
| chr18:2960513–2960525 | A | intron | LPIN2 | 1,2,4 | 67 | 18% | 90 | 0% | 1.2 |
| chr15:64972761–64972788 | TG | intron | ZNF609 | | 121 | 23% | 208 | 8% | 1.2 |
| chr16:10783089–10783101 | A | intron | TEKT5 | | 66 | 17% | 130 | 0% | 1.2 |
| chr4:71888333–71888347 | T | intron | DCK | 2 | 49 | 16% | 111 | 0% | 1.2 |
| chr1:236721453–236721465 | A | intron | HEATR1 | | 101 | 17% | 150 | 1% | 1.2 |
| chrX:11187894–11187905 | T | intron | ARHGAP6 | | 61 | 16% | 187 | 2% | 1.2 |
| chr11:89534160–89534172 | A | intron | TRIM49 | | 80 | 15% | 131 | 1% | 1.2 |
| chr6:89638989–89639003 | A | intron | RNGTT | | 94 | 15% | 130 | 1% | 1.2 |
| chr4:141448596–141448609 | T | intron | ELMOD2 | | 100 | 14% | 157 | 1% | 1.1 |
| chr7:31132236–31132248 | T | intron | ADCYAP1R1 | 2 | 114 | 12% | 192 | 2% | 1.1 |
| chr19:20829219–20829233 | AC | intron | ZNF626 | | 203 | 5% | 281 | 0% | 1.1 |
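The Relative Risk column is not defined in this record. The tabulated values appear consistent with a simple ratio of the enriched genotype class between OV and 1kGP samples: the ratio of non-modal fractions when non-modal genotypes are more frequent in OV, and the ratio of modal fractions otherwise. The sketch below works under that assumption only; the function name `relative_risk` is hypothetical and this is not necessarily the authors' exact procedure.

```python
def relative_risk(control_nonmodal: float, case_nonmodal: float) -> float:
    """Ratio of the enriched genotype class in cases versus controls.

    Assumption (not stated in the source): when non-modal genotypes are
    rarer in cases than in controls, the ratio is taken on the modal
    fractions (1 - non-modal) instead, so the result is always >= 1.
    Inputs are fractions in [0, 1].
    """
    if case_nonmodal >= control_nonmodal:
        numerator, denominator = case_nonmodal, control_nonmodal
    else:
        numerator, denominator = 1.0 - case_nonmodal, 1.0 - control_nonmodal
    if denominator == 0:
        raise ValueError("control fraction is zero; ratio is undefined")
    return numerator / denominator


# Percentages taken from two rows of the table above.
cep120 = relative_risk(0.10, 0.64)  # CEP120: non-modal enriched in OV
mpped2 = relative_risk(0.53, 0.14)  # MPPED2: modal enriched in OV
print(round(cep120, 1), round(mpped2, 1))  # 6.4 1.8
```

Because the tabulated percentages are rounded, recomputed ratios for some rows can differ from the listed value in the last digit.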
Figure 1. ROC curve using OV germline genotypes at the 60 microsatellite loci that had significantly different genotype distributions between OV and normal genomes
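As a reminder of how such a curve is built, the sketch below sweeps a cut-off over per-sample signature scores and records sensitivity and specificity at each step. The scores are invented toy values, not data from the study, and the function names are hypothetical; it is assumed here that a higher score means "more OV-like".

```python
from typing import List, Sequence, Tuple


def sensitivity_specificity(
    case_scores: Sequence[float],
    control_scores: Sequence[float],
    cutoff: float,
) -> Tuple[float, float]:
    """Call a sample positive when its score is >= cutoff (assumed convention)."""
    true_pos = sum(1 for s in case_scores if s >= cutoff)
    true_neg = sum(1 for s in control_scores if s < cutoff)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def roc_points(
    case_scores: Sequence[float], control_scores: Sequence[float]
) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs, from (0, 0) to (1, 1)."""
    cutoffs = sorted(set(case_scores) | set(control_scores), reverse=True)
    points = [(0.0, 0.0)]  # a cut-off above every score calls nothing positive
    for cutoff in cutoffs:
        sens, spec = sensitivity_specificity(case_scores, control_scores, cutoff)
        points.append((1.0 - spec, sens))
    return points


# Toy scores for illustration only.
toy_cases = [0.95, 0.90, 0.85, 0.70]
toy_controls = [0.88, 0.65, 0.60, 0.50]
print(sensitivity_specificity(toy_cases, toy_controls, 0.83))  # (0.75, 0.75)
```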
Figure 2. Microsatellite variation signature evaluated as a composite of the 60 statistically significant loci
The non-overlapping distributions (healthy and cancer germline) illustrate the power to distinguish those populations. The dashed line marks the 83% cut-off for calling a sample “OV-like”.
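One way to read the 83% cut-off is as a threshold on the fraction of a sample's genotyped signature loci that carry a "cancer-like" genotype. The sketch below follows that reading; the data layout (genotypes as tuples of allele lengths), the locus names, and the use of `>=` at the boundary are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Genotype = Tuple[int, int]  # hypothetical encoding: sorted pair of allele lengths

OV_LIKE_CUTOFF = 0.83  # the 83% cut-off described for Figure 2


def cancer_like_fraction(
    genotypes: Dict[str, Genotype],
    cancer_like: Dict[str, Set[Genotype]],
) -> float:
    """Fraction of genotyped signature loci whose call is in the cancer-like set."""
    scored = [locus for locus in genotypes if locus in cancer_like]
    if not scored:
        raise ValueError("no signature loci were genotyped in this sample")
    hits = sum(1 for locus in scored if genotypes[locus] in cancer_like[locus])
    return hits / len(scored)


def is_ov_like(
    genotypes: Dict[str, Genotype],
    cancer_like: Dict[str, Set[Genotype]],
    cutoff: float = OV_LIKE_CUTOFF,
) -> bool:
    return cancer_like_fraction(genotypes, cancer_like) >= cutoff


# Toy signature of six loci; names and genotypes are invented.
signature = {f"locus{i}": {(15, 16)} for i in range(1, 7)}
sample_a = {f"locus{i}": (15, 16) for i in range(1, 6)}
sample_a["locus6"] = (15, 15)  # 5 of 6 loci cancer-like
sample_b = {f"locus{i}": ((15, 16) if i <= 3 else (15, 15)) for i in range(1, 7)}

print(is_ov_like(sample_a, signature), is_ov_like(sample_b, signature))  # True False
```

Scoring only the loci that were actually genotyped matters here, since not every sample is callable at all 60 loci.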
The mean numbers of OV and BC signature loci genotyped are within standard deviation for each population
| Population | OV Loci Genotyped, Mean (SD) | OV “Cancer-like” Loci, Mean (SD) / % | BC Loci Genotyped, Mean (SD) | BC “Cancer-like” Loci, Mean (SD) / % |
|---|---|---|---|---|
| | 20.1 (8.8) | 13.1 (7.4) / 65% | 15.5 (6.4) | 8.9 (3.9) / 57% |
| | 25.0 (9.9) | 22.7 (9.0) / 91% | 16.5 (6.5) | 13.4 (5.5) / 81% |
| | 30.2 (6.7) | 26.7 (6.9) / 88% | NA | NA |
| | 20.5 (7.7) | 6.2 (7.2) / 79% | 17.1 (4.9) | 14.7 (4.3) / 86% |
Concordance between genotype calls for those loci that were genotyped in matched tumor and germline samples
| Participant ID (from CGHub) | Microsatellite loci genotyped in germline samples | Microsatellite loci genotyped in tumor samples | Microsatellite loci genotyped in both samples | Percent of loci whose genotype did not change | Total loci with a genotype change |
|---|---|---|---|---|---|
| | 34520 | 35000 | 31644 | 99.56% | 140 |
| | 36203 | 38368 | 32978 | 99.59% | 134 |
| | 37073 | 38394 | 32796 | 99.58% | 138 |
| | 34085 | 33010 | 30611 | 99.54% | 142 |
| | 33219 | 34426 | 31315 | 99.72% | 88 |
| | 34699 | 32049 | 28937 | 99.55% | 130 |
| | 26236 | 30068 | 24182 | 99.57% | 103 |
| | 32870 | 32752 | 29467 | 99.59% | 120 |
| | 33095 | 34636 | 30829 | 99.55% | 139 |
| | 28086 | 30409 | 26132 | 99.53% | 123 |
| | 38903 | 35810 | 33186 | 99.44% | 185 |
| | 30982 | 30940 | 28150 | 99.59% | 116 |
| | 36652 | 35583 | 30508 | 99.50% | 153 |
| | 37568 | 39147 | 33420 | 99.49% | 172 |
| | 28946 | 38352 | 27525 | 99.30% | 192 |
| | 31361 | 32433 | 28154 | 99.53% | 132 |
| | 32455 | 31604 | 29193 | 99.62% | 111 |
| | 33192 | 33002 | 29848 | 99.63% | 109 |
| | 32512 | 33874 | 29947 | 99.45% | 166 |
| | 35536 | 33135 | 30532 | 99.50% | 153 |
| | 34736 | 34209 | 31701 | 99.56% | 138 |
| | 38597 | 37074 | 33637 | 99.57% | 144 |
| | 33755 | 35583 | 31398 | 99.61% | 122 |
| | 33402 | 30381 | 28370 | 99.51% | 138 |
| | 38030 | 35882 | 33150 | 99.52% | 158 |
| | 36368 | 20670 | 19339 | 99.81% | 36 |
| | 33007 | 33671 | 30530 | 99.61% | 119 |
| | 33456 | 34118 | 30483 | 99.48% | 160 |
| | 35475 | 37981 | 32405 | 99.53% | 151 |
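The percentages in this table follow from the other columns: the share of loci genotyped in both samples whose genotype did not change. A minimal sketch of that arithmetic, plus a toy comparison of two genotype maps, is given below. The dictionary layout and function names are hypothetical, and genotypes are assumed to be stored as sorted tuples so that equality is order-independent.

```python
from typing import Dict, Tuple

Genotype = Tuple[int, ...]  # hypothetical encoding: sorted allele lengths


def percent_unchanged(n_both: int, n_changed: int) -> float:
    """Percent of loci genotyped in both samples with an identical genotype."""
    return 100.0 * (n_both - n_changed) / n_both


def concordance(
    germline: Dict[str, Genotype], tumor: Dict[str, Genotype]
) -> Tuple[int, int, float]:
    """Return (loci genotyped in both, loci with a change, percent unchanged)."""
    shared = germline.keys() & tumor.keys()
    if not shared:
        raise ValueError("no loci were genotyped in both samples")
    changed = sum(1 for locus in shared if germline[locus] != tumor[locus])
    return len(shared), changed, percent_unchanged(len(shared), changed)


# Check against the first row of the table: 31644 shared loci, 140 changes.
print(round(percent_unchanged(31644, 140), 2))  # 99.56

# Toy genotype maps (invented locus names and alleles).
toy_germline = {"a": (12, 14), "b": (10, 10), "c": (8, 9)}
toy_tumor = {"a": (12, 12), "b": (10, 10), "d": (7, 7)}
print(concordance(toy_germline, toy_tumor))  # (2, 1, 50.0)
```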
Microsatellite loci whose genotypes between matched tumor and germline samples were discordant predominantly showed loss of an allele
| Participant ID (from CGHub) | Total Number of discordant loci | Percent of discordant loci with LOH | Percent of discordant loci with an allele gain | Percent of discordant loci with no concordant allele |
|---|---|---|---|---|
| | 140 | 78% | 21% | 1% |
| | 134 | 72% | 25% | 2% |
| | 138 | 65% | 28% | 7% |
| | 142 | 80% | 15% | 4% |
| | 88 | 39% | 60% | 1% |
| | 130 | 75% | 19% | 5% |
| | 103 | 79% | 17% | 5% |
| | 120 | 69% | 28% | 3% |
| | 139 | 64% | 30% | 6% |
| | 123 | 66% | 32% | 2% |
| | 185 | 71% | 28% | 1% |
| | 116 | 72% | 27% | 1% |
| | 153 | 69% | 29% | 3% |
| | 172 | 71% | 27% | 2% |
| | 192 | 82% | 13% | 6% |
| | 132 | 70% | 27% | 3% |
| | 111 | 66% | 32% | 2% |
| | 109 | 73% | 22% | 5% |
| | 166 | 73% | 23% | 4% |
| | 153 | 73% | 25% | 3% |
| | 138 | 67% | 29% | 4% |
| | 144 | 69% | 28% | 3% |
| | 122 | 67% | 30% | 2% |
| | 138 | 81% | 16% | 3% |
| | 158 | 72% | 25% | 3% |
| | 36 | 83% | 14% | 3% |
| | 119 | 80% | 19% | 1% |
| | 160 | 83% | 15% | 3% |
| | 151 | 64% | 32% | 4% |
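The three categories in this table can be illustrated with a small classifier over allele sets. The exact definitions used in the study are not given in this record, so the rules below are assumptions: LOH means the tumor retains only a strict subset of the germline alleles, an allele gain means the tumor shares at least one germline allele and also carries a new one, and "no concordant allele" means the two genotypes share nothing. The function name and genotype encoding are hypothetical.

```python
from typing import Iterable


def classify_change(germline: Iterable[int], tumor: Iterable[int]) -> str:
    """Classify a matched germline/tumor genotype pair at one locus.

    Genotypes are given as iterables of allele lengths; only the set of
    distinct alleles is compared (an assumption of this sketch).
    """
    g, t = set(germline), set(tumor)
    if g == t:
        return "concordant"
    if not (g & t):
        return "no_concordant_allele"
    if t < g:  # tumor alleles are a strict subset of germline alleles
        return "LOH"
    return "allele_gain"


print(classify_change((12, 14), (12, 12)))  # LOH
print(classify_change((12, 12), (12, 13)))  # allele_gain
print(classify_change((12, 12), (13, 13)))  # no_concordant_allele
```

Under these rules a pair such as (12, 14) versus (12, 15), which both loses and gains an allele, is counted as an allele gain; a different convention would shift the percentages between categories.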
Figure 3. Cross analysis of the OV and BC samples and significant loci sets
(A) Evaluation of the 1kGP-EUF healthy control, OV and BC germline exomes using the OV-signature set of microsatellites. (B) Evaluation of the 1kGP-EUF healthy control, OV and BC germline exomes at the BC-signature set of microsatellites.