Shawn E Yost, Erin N Smith, Richard B Schwab, Lei Bao, HyunChul Jung, Xiaoyun Wang, Emile Voest, John P Pierce, Karen Messer, Barbara A Parker, Olivier Harismendy, Kelly A Frazer.
Abstract
The utilization of archived, formalin-fixed paraffin-embedded (FFPE) tumor samples for massively parallel sequencing has been challenging due to DNA damage and contamination with normal stroma. Here, we perform whole genome sequencing of DNA isolated from two triple-negative breast cancer tumors archived for >11 years as 5 µm FFPE sections and matched germline DNA. The tumor samples show differing amounts of FFPE-damaged DNA sequencing reads, revealed as relatively high alignment mismatch rates enriched for C·G > T·A substitutions compared to germline samples. This increase in mismatch rate is observable with as few as one million reads, allowing for an upfront evaluation of the sample integrity before whole genome sequencing. By applying innovative quality filters incorporating global nucleotide mismatch rates and local mismatch rates, we present a method to identify high-confidence somatic mutations even in the presence of FFPE-induced DNA damage. This results in a breast cancer mutational profile consistent with previous studies and revealing potentially important functional mutations. Our study demonstrates the feasibility of performing genome-wide deep sequencing analysis of FFPE archived tumors of limited sample size such as residual cancer after treatment or metastatic biopsies.
Year: 2012 PMID: 22492626 PMCID: PMC3413110 DOI: 10.1093/nar/gks299
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. (A) Frequency of mismatches within sequencing reads for germline and FFPE tumor samples. The distribution of reads with 0, 1, 2 or ≥3 mismatches to the reference genome is shown for all sequencing data (All) and a random subset of 50 M, 5 M and 1 M sequencing reads. (B) Read-based global nucleotide mismatch rate for all base substitutions. (C) Read-based global nucleotide mismatch rate for each substitution type.
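
The quantities in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `substitution_class` and `mismatch_profile`, the input format (gap-free, base-space read/reference string pairs) and the pyrimidine-first class labels are all assumptions made for illustration. The study itself sequenced color-space reads, which would need decoding first.

```python
from collections import Counter

# Complementary substitutions are collapsed into six strand-independent
# classes, written in the paired style of the captions (e.g. C·G > T·A).
# The pyrimidine-first labelling is an assumption of this sketch.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return the strand-collapsed class for a ref>alt base change."""
    if ref not in "CT":  # purine reference: describe the change on the other strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}·{COMPLEMENT[ref]} > {alt}·{COMPLEMENT[alt]}"


def mismatch_profile(aligned_pairs):
    """Summarise mismatches for (read_bases, reference_bases) string pairs.

    Returns two dicts:
      - fraction of reads with 0, 1, 2 or >=3 mismatches (key 3 means ">=3")
      - mismatch rate per substitution class (mismatches / compared bases)
    """
    reads_by_mismatches = Counter()
    mismatches_by_class = Counter()
    compared_bases = 0
    for read, ref in aligned_pairs:
        n_mismatch = 0
        for read_base, ref_base in zip(read.upper(), ref.upper()):
            if read_base not in COMPLEMENT or ref_base not in COMPLEMENT:
                continue  # skip N and other ambiguity codes
            compared_bases += 1
            if read_base != ref_base:
                n_mismatch += 1
                mismatches_by_class[substitution_class(ref_base, read_base)] += 1
        reads_by_mismatches[min(n_mismatch, 3)] += 1
    n_reads = sum(reads_by_mismatches.values())
    read_fractions = (
        {k: reads_by_mismatches[k] / n_reads for k in range(4)} if n_reads else {}
    )
    class_rates = (
        {k: v / compared_bases for k, v in mismatches_by_class.items()}
        if compared_bases
        else {}
    )
    return read_fractions, class_rates
```

Applied to a random subset of reads (the caption mentions subsets as small as 1 M), a C·G > T·A rate that is clearly higher in the tumor than in the matched germline sample would be the kind of signal the caption describes.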
Figure 2. Distribution of substitution types for variants passing Filter 2.1 in germline (G) and FFPE tumor (T) samples and called homozygous alternate (Alt) or heterozygous (Het). Variants are distinguished as identified in a public SNP repository (Known) or novel for both patients in this study (Novel), and as passing in both germline and FFPE tumor samples (Paired) or in only one sample (Unique). The fraction of novel heterozygous C·G > T·A variants differs substantially between the tumor and germline samples of patient 02542.
Sequencing statistics
| Patient sample | 06408 germline | 06408 FFPE tumor | 02542 germline | 02542 FFPE tumor |
|---|---|---|---|---|
| Raw color-space reads | 1 352 676 084 | 2 823 592 370 | 1 251 754 629 | 3 174 447 825 |
| Fraction of reads aligned to hg18 (%) | 67.2 | 59.3 | 65.8 | 54.5 |
| Fraction of uniquely | 70 | 63 | 70 | 60 |
| Average haploid coverage (×) | 12.6 | 23.4 | 11.7 | 22.2 |
| Fraction of genome covered (%) | 88 | 89 | 87 | 89 |
| Fraction of genome with ≥3× coverage (%) | 85 | 86 | 81 | 87 |
a. Reads with only one possible mapping location.
b. Reads after mapping, duplicate removal, local realignment and merging technical replicates; excluding chrY.
Figure 3. Flow diagram describing the number of variants passing each filtering step for both patients 06408 (blue) and 02542 (red).
Figure 4. Filters 2.5 and 2.6 remove false-positive somatic variants due to formalin fixation and other systematic and random errors in the process. Shown is the fraction of substitution types for somatic variants after Filter 2.4, after Filter 2.5 and after Filter 2.6 for the 06408 and 02542 FFPE tumors. After Filter 2.6, the novel somatic variants of substitution type C·G > T·A called in the 02542 tumor have a similar profile to that observed for novel germline variants in the matched sample (Figure 2).
High-confidence FFPE tumor coding somatic variants within cancer-associated genes and/or DNA damage repair genes
| Patient | Gene | NCBI ID | Chr | Position (hg18) | Germline | Tumor | Mutation type | Amino acid change |
|---|---|---|---|---|---|---|---|---|
| 06408 | ATRX | NM_000489 | chrX | 76735852 | A/A | A/C | Missense | L2027R |
| 06408 | ELN | NM_000501 | chr7 | 73109920 | G/G | G/C | Missense | A458P |
| 06408 | KIAA1549 | NM_020910 | chr7 | 138253476 | T/T | T/C | Missense | Q429R |
| 06408 | MYH9 | NM_002473 | chr22 | 35040266 | T/T | T/A | Missense | K475M |
| 06408 | NOTCH1 | NM_017617 | chr9 | 138520141 | G/G | G/A | Missense | A1343V |
| 06408 | NUMA1 | NM_006185 | chr11 | 71417948 | C/C | C/G | Missense | V27L |
| 06408 | NUP214 | NM_005085 | chr9 | 132998395 | A/A | A/G | Missense | D270G |
| 06408 | TP53 | NM_000546 | chr17 | 7517747 | G/G | G/A | Nonsense | R306STOP |
| 02542 | AKT1 | NM_001014431 | chr14 | 104312544 | A/A | A/G | Missense | F161L |
| 02542 | BLM | NM_000057 | chr15 | 89105082 | T/T | T/A | Missense | F492Y |
| 02542 | CREBBP | NM_001079846 | chr16 | 3772787 | G/G | G/A | Missense | P453L |
| 02542 | EXT1 | NM_000127 | chr8 | 118886256 | G/G | G/T | Missense | D647E |
| 02542 | GNA11 | NM_002067 | chr19 | 3070205 | A/A | A/G | Missense | N246S |
| 02542 | JARID1A | NM_001042603 | chr12 | 297581 | G/G | G/C | Missense | T950R |
| 02542 | LPP | NM_005578 | chr3 | 190066803 | G/G | G/T | Missense | G511V |
| 02542 | MLL2 | NM_003482 | chr12 | 47722022 | T/T | T/C | Missense | K2043R |
| 02542 | MLL3 | NM_170606 | chr7 | 151504320 | G/G | G/C | Missense | Q3051E |
| 02542 | PDGFRA | NM_006206 | chr4 | 54824777 | G/G | G/A | Missense | G185E |
| 02542 | RET | NM_020630 | chr10 | 42921884 | G/G | G/T | Missense | G308W |
| 02542 | RPN1 | NM_002950 | chr3 | 129823703 | G/G | G/T | Nonsense | C545STOP |
| 02542 | RUNX1 | NM_001001890 | chr21 | 35181094 | C/C | C/T | Coding-synonymous | NA |
| 02542 | STK11 | NM_000455 | chr19 | 1171708 | G/G | G/A | Coding-synonymous | NA |
| 02542 | TP53 | NM_000546 | chr17 | 7519259 | C/C | C/A | Missense | K132N |
| 02542 | ZNF521 | NM_015461 | chr18 | 21060818 | C/C | C/G | Coding-synonymous | NA |
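
The Germline and Tumor genotype columns are enough to derive each variant's substitution type, the quantity plotted in Figures 2 and 4. The Python sketch below does this for the eight patient 06408 rows. The helper names `somatic_change` and `substitution_class` and the pyrimidine-first class labels are hypothetical, and the sketch assumes a homozygous germline genotype with exactly one new tumor allele, as in every row of this table.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def somatic_change(germline, tumor):
    """Return (ref, alt) from genotypes such as 'G/G' and 'G/A'."""
    germline_alleles = set(germline.split("/"))
    if len(germline_alleles) != 1:
        raise ValueError("sketch assumes a homozygous germline genotype")
    ref = next(iter(germline_alleles))
    new_alleles = {base for base in tumor.split("/") if base != ref}
    if len(new_alleles) != 1:
        raise ValueError("sketch assumes exactly one new tumor allele")
    return ref, new_alleles.pop()


def substitution_class(ref, alt):
    """Collapse complementary changes into one class, e.g. G>A -> C·G > T·A."""
    if ref not in "CT":  # purine reference: describe the change on the other strand
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}·{COMPLEMENT[ref]} > {alt}·{COMPLEMENT[alt]}"


# (gene, germline genotype, tumor genotype) copied from the 06408 rows above.
ROWS_06408 = [
    ("ATRX", "A/A", "A/C"),
    ("ELN", "G/G", "G/C"),
    ("KIAA1549", "T/T", "T/C"),
    ("MYH9", "T/T", "T/A"),
    ("NOTCH1", "G/G", "G/A"),
    ("NUMA1", "C/C", "C/G"),
    ("NUP214", "A/A", "A/G"),
    ("TP53", "G/G", "G/A"),
]

class_counts = Counter(
    substitution_class(*somatic_change(germline, tumor))
    for _gene, germline, tumor in ROWS_06408
)

for name, count in sorted(class_counts.items()):
    print(f"{name}: {count}")
```

Because the table lists only coding variants in selected genes, such a tally illustrates the classification step and is not a genome-wide mutational profile.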