Stephen Q Wong, Jason Li, Angela Y-C Tan, Ravikiran Vedururu, Jia-Min B Pang, Hongdo Do, Jason Ellul, Ken Doig, Anthony Bell, Grant A MacArthur, Stephen B Fox, David M Thomas, Andrew Fellowes, John P Parisot, Alexander Dobrovic.
Abstract
BACKGROUND: Clinical specimens undergoing diagnostic molecular pathology testing are fixed in formalin because detailed morphological assessment is necessary. However, formalin fixation can cause major issues with molecular testing, because it damages DNA, producing fragmentation and non-reproducible sequencing artefacts after PCR amplification. In the context of massively parallel sequencing (MPS), distinguishing true low frequency variants from sequencing artefacts remains challenging. The prevalence of formalin-induced DNA damage and its impact on molecular testing and clinical genomics remain poorly understood.
Year: 2014 PMID: 24885028 PMCID: PMC4032349 DOI: 10.1186/1755-8794-7-23
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Summary of formalin-fixed samples
| Cancer type | Samples (n) | Proportion of samples |
| --- | --- | --- |
| Breast | 81 | 16.6% |
| Head and neck | 80 | 16.4% |
| Prostate | 79 | 16.2% |
| Colorectal | 52 | 10.7% |
| Lung | 47 | 9.6% |
| Other* | 42 | 8.6% |
| Cervical | 25 | 5.1% |
| Bone and soft tissue | 22 | 4.5% |
| Oesophagogastric | 15 | 3.1% |
| Renal | 14 | 2.9% |
| Central nervous system | 12 | 2.5% |
| Melanoma | 11 | 2.3% |
| Cancer of unknown primary | 8 | 1.6% |
*Represents other cancer types with smaller numbers including pancreatic, ovarian, thyroid, testicular, bladder, hepatic, endometrial, biliary and anal cancers.
Figure 1. Association of fragmentation of DNA in FFPE samples with low sequencing coverage. Coverage of each sample (number of reads) versus the copies of the FTH1 gene as assessed by a Taqman PCR assay (copies per microlitre). n = 253. There was a positive correlation between the FTH1 result and coverage (Spearman correlation, r = −0.29, p < 0.0001). The 50 samples with the highest C>T/G>A levels in the 1-10% allele frequency range are shown in red.
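The Spearman statistic quoted in this caption is a Pearson correlation computed on ranks rather than raw values. The following is a minimal sketch in pure Python; the function names are hypothetical and the FTH1 and coverage values are invented toy numbers, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values, for illustration only (not from the paper).
fth1_copies_per_ul = [12, 150, 40, 900, 300, 75]
coverage_reads = [8_000, 60_000, 90_000, 400_000, 120_000, 30_000]
r = spearman_r(fth1_copies_per_ul, coverage_reads)
print(f"Spearman r = {r:.3f}")
```

Because only ranks enter the calculation, the statistic is insensitive to the heavy skew that read counts and copy numbers typically show.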
Figure 2. Significant levels of C>T/G>A sequencing artefacts in FFPE samples. (A) Assessment of sequence artefacts in cell line DNA and FFPE samples. The prevalence of each type of nucleotide change in the 1-10% allele frequency range was computed. Likely true variants identified by the VarScan2 variant caller were operationally removed to enrich for sequencing artefact changes. The graph shows all FFPE samples sorted according to the counts of C>T/G>A changes. Zoomed view: HL-60 and H1975 cell lines were used as good quality DNA controls. (B) The prevalence of each type of nucleotide change in the 10-25% allele frequency range. The graph shows all FFPE samples sorted according to the counts of C>T/G>A changes.
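The tally described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the input format, the function names and the half-open bin boundaries are all hypothetical, the read counts are invented, and the calls are assumed to have already had likely true variants removed.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Merge a substitution with its reverse-strand equivalent, e.g. G>A -> 'C>T/G>A'."""
    forward = f"{ref}>{alt}"
    reverse = f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    return "/".join(sorted((forward, reverse)))


def count_changes(calls, min_af=0.01, max_af=0.10):
    """Count substitution classes for calls with min_af <= allele frequency < max_af.

    Each call is a tuple (ref, alt, variant_reads, total_reads). The half-open
    interval is an assumption made here so that adjacent bins do not overlap.
    """
    counts = Counter()
    for ref, alt, variant_reads, total_reads in calls:
        if total_reads == 0:
            continue  # no coverage, so no allele frequency can be computed
        allele_frequency = variant_reads / total_reads
        if min_af <= allele_frequency < max_af:
            counts[substitution_class(ref, alt)] += 1
    return counts


# Hypothetical calls for one sample (invented read counts).
calls = [
    ("C", "T", 5, 200),   # 2.5%
    ("G", "A", 12, 400),  # 3.0%
    ("A", "G", 3, 150),   # 2.0%
    ("C", "T", 60, 300),  # 20%
    ("T", "C", 1, 500),   # 0.2%, below both bins
]
low_bin = count_changes(calls)              # 1-10% allele frequency
mid_bin = count_changes(calls, 0.10, 0.25)  # 10-25% allele frequency
print(dict(low_bin), dict(mid_bin))
```

Sorting samples by `low_bin["C>T/G>A"]` would then give the ordering used on the x-axis of the graphs.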
Figure 3. Low coverage samples have higher rates of C>T/G>A sequencing artefacts. For all FFPE samples (x-axis), values for coverage (blue) and the counts of C>T/G>A sequencing artefacts (red) are plotted on the same y-axis. There was an inverse correlation between coverage and C>T/G>A sequence artefacts (Spearman correlation, r = −0.24, p < 0.0001).
Figure 4. Uracil-DNA glycosylase treatment of FFPE DNA samples distinguishes true and false positive clinically relevant mutations. Integrative Genomics Viewer (IGV) screenshots of two breast cancer samples and one melanoma sample before and after uracil-DNA glycosylase (UDG) treatment. The two breast cancer samples have confirmed PIK3CA mutations (E545K for Ca309 and H1047Y for Ca285), as these mutations were still detected after UDG treatment. The NRAS G12D mutation identified in the pre-UDG sample (Ca97) was a false positive, as it was not present after UDG treatment. The variant reads over the total reads and the overall allele frequency (a.f.) are shown for each case.
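The comparison in this caption reduces to asking whether a variant called before UDG treatment is still present afterwards. A minimal sketch of that logic follows; the function names, the 1% calling threshold and the read counts are hypothetical and are not taken from the figure.

```python
def allele_frequency(variant_reads, total_reads):
    """Fraction of reads supporting the variant (0.0 when there is no coverage)."""
    return variant_reads / total_reads if total_reads else 0.0


def classify_after_udg(pre, post, min_af=0.01):
    """Compare one variant before and after UDG treatment.

    pre and post are (variant_reads, total_reads) tuples. min_af is an
    assumed calling threshold, not a value reported in the source.
    """
    if allele_frequency(*pre) < min_af:
        return "not_called"      # never called in the untreated sample
    if post[1] == 0:
        return "not_assessable"  # no coverage after treatment
    if allele_frequency(*post) >= min_af:
        return "retained"        # consistent with a true variant
    return "lost"                # consistent with a uracil-derived artefact


# Invented read counts, for illustration only.
print(classify_after_udg(pre=(120, 1000), post=(110, 950)))  # retained
print(classify_after_udg(pre=(40, 800), post=(0, 900)))      # lost
```

A "retained" call corresponds to the PIK3CA pattern described above, and a "lost" call to the NRAS pattern.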
Figure 5. Low template copies are associated with a higher probability of sequencing artefacts post-PCR amplification. In good quality DNA from sources such as blood and fresh frozen tissue, fragmentation and uracil lesions are present at very low levels. In this circumstance, the high amount of amplifiable template increases the likelihood of accurately identifying mutations, because sequencing coverage is high and there is little or no stochastic enrichment of sequencing artefacts. In FFPE DNA with moderate fragmentation, the number of amplifiable templates is reduced, and some formalin-induced uracil lesions are present in the template DNA. PCR amplification then yields lower coverage because there are fewer amplifiable templates. Uracil lesions are also amplified and, because of the lower copy numbers, can appear as non-reproducible sequencing artefacts (C>T/G>A changes). These artefacts will be low in frequency. In FFPE DNA with high amounts of fragmentation, the number of amplifiable templates is severely limited. An artefact in one of these templates can then appear as a moderate to high frequency sequencing variant, which can subsequently be interpreted as a real mutation.
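The argument in this caption can be illustrated with back-of-envelope arithmetic: if one of N amplifiable templates carries a lesion at a given site and all templates amplify equally, the artefact appears at roughly 1/N of reads. The sketch below makes those simplifying assumptions explicit; the function name and the template numbers are hypothetical and are not measurements from the study.

```python
def artefact_allele_fraction(n_templates, n_damaged=1):
    """Expected artefact frequency when n_damaged of n_templates carry the lesion.

    Simplifying assumptions: every template amplifies equally, and all reads
    derived from a damaged template show the change (strand effects, which
    could halve this figure, are ignored).
    """
    if n_templates <= 0:
        raise ValueError("n_templates must be positive")
    if not 0 <= n_damaged <= n_templates:
        raise ValueError("n_damaged must be between 0 and n_templates")
    return n_damaged / n_templates


# Hypothetical numbers of amplifiable templates for the three scenarios.
scenarios = [
    ("good quality DNA", 10_000),
    ("moderately fragmented FFPE DNA", 100),
    ("highly fragmented FFPE DNA", 5),
]
for label, n_templates in scenarios:
    fraction = artefact_allele_fraction(n_templates)
    print(f"{label}: {n_templates} templates -> {fraction:.2%}")
```

Under these assumptions a single damaged template is negligible among 10,000 copies, falls in the low-frequency range among 100, and reaches 20% among 5, which is the progression the caption describes.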