Louise de Schaetzen van Brienen, Maarten Larmuseau, Kim Van der Eecken, Frederic De Ryck, Pauline Robbe, Anna Schuh, Jan Fostier, Piet Ost, Kathleen Marchal.
Abstract
BACKGROUND: Research-grade Fresh Frozen (FF) DNA material is not yet routinely collected in clinical practice. Many hospitals, however, collect and store Formalin Fixed Paraffin Embedded (FFPE) tumor samples. Consequently, the sample size of whole genome cancer cohort studies could be increased tremendously by including FFPE samples, although the presence of artefacts might obfuscate the variant calling. To assess whether FFPE material can be used for cohort studies, we performed an in-depth comparison of somatic SNVs called on matching FF and FFPE Whole Genome Sequence (WGS) samples extracted from the same tumor.
Keywords: Cohort studies; FFPE; Precision oncology; Somatic variants; WGS
Year: 2020 PMID: 32631411 PMCID: PMC7336445 DOI: 10.1186/s12920-020-00746-5
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Input material (in ng) per sample analyzed of patient UZ001
| Sample type | FFPE | Blood | FF |
|---|---|---|---|
| Input material (ng) | 300 | 300 | 500 |
Description of samples available from a pilot study of the 100,000 Genomes Project England
| Patient ID | FF sample ID | FFPE sample ID | Blood sample ID | Cancer Type | FFPE Prep Kit |
|---|---|---|---|---|---|
| | LP2000456-DNA_A01 | LP2000467-DNA_A01 | LP2000446-DNA_A01 | Prostate | Covaris |
| | LP2000457-DNA_A01 | LP2000468-DNA_A01 | LP2000447-DNA_A01 | Prostate | Covaris |
| | LP2000558-DNA_A01 | LP2000596-DNA_A01 | LP2000577-DNA_A01 | Prostate | Covaris |
| | LP2000559-DNA_A01 | LP2000597-DNA_A01 | LP2000578-DNA_A01 | Prostate | Covaris |
| | LP2000462-DNA_A01 | LP2000691-DNA_A01 | LP2000452-DNA_A01 | Renal | Qiagen |
| | LP2000460-DNA_A01 | LP2000621-DNA_A01 | LP2000450-DNA_A01 | Renal | Covaris |
| | LP2000498-DNA_A01 | LP2000645-DNA_A01 | LP2000484-DNA_A01 | Renal | Covaris |
| | LP2000696-DNA_A01 | LP2000683-DNA_A01 | LP2000695-DNA_A01 | Renal | Qiagen |
| | LP2100046-DNA_A01 | LP2000888-DNA_A01 | LP2000888-DNA_C03 | Renal | Covaris |
Summary of the main parameters used for Strelka2, Mutect2, VarScan2 and Shimmer
| Variant caller | Parameters |
|---|---|
| Strelka2 | Min Somatic EVS = 7 |
| Mutect2 | Min base quality score = 10; Min Phred-scaled confidence threshold = 10; Min TLOD = 5.3; Min NLOD = 2.3; Sample ploidy = 2; Min MedianBaseQuality = 20; Min MedianMappingQuality = 30 |
| VarScan2 | Min coverage in normal, in tumor = 8, 6; Min variant allele frequency = 0.01; Max somatic Min variant allele frequency = 0; Min read depth = 10; Min average quality = 20; Max somatic p-value = 0.01 |
| Shimmer | Max q-value acceptable FDR = 0.05 |
Screening space for the threshold optimization for each variant caller
| Variant caller | Screening space |
|---|---|
| Strelka2 | Somatic EVS from 5 to 20 (steps of 0.25) |
| Mutect2 | TLOD from 0 to 200 (steps of 1) |
| VarScan2 | Somatic p-value from 0.00005 to 0.01 (steps of 0.00005) |
| Shimmer | Q-value from 0.0005 to 0.05 (steps of 0.0005) |
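The screening described above amounts to a one-dimensional grid search per caller. The following is a minimal sketch, not the authors' implementation: it assumes that only the FFPE calls are thresholded, that the FF calls serve as the reference set, and that the threshold giving the highest F1-score is retained. The function names, variant identifiers and scores are hypothetical toy values.

```python
def f1_score(called, truth):
    """F1-score between a set of calls and a reference set."""
    overlap = len(called & truth)
    total = len(called) + len(truth)
    return 2 * overlap / total if total else 0.0


def best_threshold(scored_calls, truth, thresholds, higher_is_better=True):
    """Scan candidate thresholds and return (threshold, F1) with the highest F1.

    higher_is_better=True suits scores such as Somatic EVS or TLOD;
    set it to False for p-values or q-values, where lower means more confident.
    """
    best = (None, -1.0)
    for t in thresholds:
        if higher_is_better:
            kept = {v for v, s in scored_calls.items() if s >= t}
        else:
            kept = {v for v, s in scored_calls.items() if s <= t}
        score = f1_score(kept, truth)
        if score > best[1]:  # ties keep the first (least strict) threshold
            best = (t, score)
    return best


# Somatic EVS grid from 5 to 20 in steps of 0.25, as in the table above.
evs_grid = [5 + 0.25 * i for i in range(61)]

# Hypothetical toy data: FFPE calls with a score, and FF calls as reference.
ffpe_calls = {"chr1:100A>T": 18.0, "chr1:200C>T": 6.5,
              "chr2:50G>A": 12.0, "chr3:70C>T": 5.5}
ff_truth = {"chr1:100A>T", "chr2:50G>A", "chr4:10T>C"}

threshold, f1 = best_threshold(ffpe_calls, ff_truth, evs_grid)
print(threshold, round(f1, 2))
```

For VarScan2 and Shimmer the same scan would be run with `higher_is_better=False` over the p-value and q-value grids.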
Performance measures of calls considering the FF sample as gold standard for each variant caller (UZ001)
| Variant caller | FF | FFPE | Overlap | Sensitivity | Precision | F1-score |
|---|---|---|---|---|---|---|
| Strelka2 | 6292 | 6761 | 4225 | 0.6715 | 0.6249 | 0.6474 |
| Mutect2 | 10,460 | 11,815 | 5755 | 0.5502 | 0.4871 | 0.5167 |
| VarScan2 | 4067 | 1760 | 883 | 0.2171 | 0.5017 | 0.3031 |
| Shimmer | 8109 | 10,080 | 4865 | 0.6000 | 0.4826 | 0.5349 |
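The performance measures in the table above follow from the three counts in each row. The sketch below uses the standard definitions with the FF calls treated as the gold standard (sensitivity = overlap / FF, precision = overlap / FFPE, F1 = harmonic mean of the two); the function name is illustrative and not taken from the paper.

```python
def overlap_metrics(n_ff, n_ffpe, n_overlap):
    """Sensitivity, precision and F1 of FFPE calls, with FF calls as reference."""
    sensitivity = n_overlap / n_ff      # fraction of FF calls recovered in FFPE
    precision = n_overlap / n_ffpe      # fraction of FFPE calls confirmed in FF
    f1 = 2 * n_overlap / (n_ff + n_ffpe)  # harmonic mean of the two
    return sensitivity, precision, f1


# Counts from the first row of the table above (UZ001).
sens, prec, f1 = overlap_metrics(n_ff=6292, n_ffpe=6761, n_overlap=4225)
print(round(sens, 4), round(prec, 4), round(f1, 4))  # 0.6715 0.6249 0.6474
```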
Fig. 1 (Sub)clonal population detection in the FF (left) and the FFPE (right) sample using Strelka2 (UZ001). The upper panel shows coverage as a function of the VAF [26], where a higher variance in the coverage can be observed for the FFPE sample. The lower panel shows the distribution of the VAFs. The blue distribution denotes all calls made in a given sample, while the orange distribution shows only the calls common to the FF and the FFPE sample
Average significance scores for somatic variants reported in both samples for each variant caller (UZ001). For Strelka2 and Mutect2, a higher Somatic EVS and TLOD means a higher confidence in the calls, while for VarScan2 and Shimmer a lower value implies a higher confidence
| Sample | Strelka2 | Mutect2 | VarScan2 | Shimmer |
|---|---|---|---|---|
| | 17.33 | 51.88 | 0.0024 | 0.0160 |
| | 14.34 | 26.00 | 0.0038 | 0.0166 |
Fig. 2 Boxplots comparing the significance level of FFPE calls that are reported in FF with those that are not (UZ001). For Strelka2 and Mutect2, a higher Somatic EVS and TLOD means a higher confidence in the calls, while for VarScan2 and Shimmer a lower value implies a higher confidence
Fig. 3 Coverage versus VAF for variants reported by Strelka2, comparing FF (left) against FFPE (right) (UZ001). This plot is identical to the upper panel of Fig. 1, but with color used to indicate the most significant calls. The upper panel shows the 25% highest confidence calls in orange and the lower confidence calls in blue. The lower panel shows which calls are also found by other callers, where blue = unique calls, orange = calls reported by two callers, green = calls reported by three callers, red = calls reported by four callers
Optimized F1-scores of calls for each variant caller considering FF sample as gold standard (UZ001)
| Caller - threshold | FF (gold std.) | FFPE | Overlap | Sensitivity | Precision | F1-score |
|---|---|---|---|---|---|---|
| | 4656 | 4559 | 3316 | 0.7122 | 0.7274 | 0.7197 |
| | 4656 | 5658 | 3418 | 0.7341 | 0.6041 | 0.6628 |
| | 4656 | 1755 | 425 | 0.1045 | 0.2422 | 0.1460 |
| | 4656 | 262 | 16 | 0.0197 | 0.0611 | 0.0298 |
Strategies to retrieve the ground truth calls from the FF in the FFPE sample (UZ001)
| Reported by … (in FFPE) | FF (gold std.) | FFPE | Overlap | Sensitivity | Precision | F1-score |
|---|---|---|---|---|---|---|
| | 4656 | 16,020 | 4155 | 0.8924 | 0.2594 | 0.4019 |
| | 4656 | 4232 | 3684 | 0.7912 | 0.8705 | 0.8290 |
| | 4656 | 340 | 325 | 0.0698 | 0.9559 | 0.1301 |
| | 4656 | 8 | 8 | 0.0017 | 1 | 0.0034 |
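The consensus strategies compared here keep a variant only when enough callers agree on it. As a minimal sketch under stated assumptions (each caller's output is reduced to a set of variant identifiers, and the caller names and variants below are hypothetical toy values), a filter of the "at least k callers" kind can be written as:

```python
from collections import Counter


def consensus_calls(calls_by_caller, min_callers=2):
    """Return the variants reported by at least `min_callers` callers."""
    counts = Counter(
        variant
        for calls in calls_by_caller.values()
        for variant in set(calls)  # each caller counts once per variant
    )
    return {variant for variant, n in counts.items() if n >= min_callers}


# Hypothetical toy call sets for one FFPE sample.
ffpe_calls = {
    "Strelka2": {"chr1:100A>T", "chr2:50G>A", "chr3:70C>T"},
    "Mutect2": {"chr1:100A>T", "chr2:50G>A"},
    "VarScan2": {"chr1:100A>T"},
    "Shimmer": {"chr5:5C>T"},
}

at_least_two = consensus_calls(ffpe_calls, min_callers=2)
print(sorted(at_least_two))  # ['chr1:100A>T', 'chr2:50G>A']
```

Raising `min_callers` trades sensitivity for precision, which is the pattern the table shows.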
Summary table of the consistency (F1-scores) between the FF and the FFPE sample
| Patient ID | Strelka2 | Mutect2 | VarScan2 | Shimmer | Maximum | At least 2 | Improvement |
|---|---|---|---|---|---|---|---|
| UZ001 | 0.6474 | 0.5167 | 0.3031 | 0.5349 | 0.6474 | 0.8290 | 0.1816 |
| | 0.1533 | 0.1563 | 0.0297 | 0.0061 | 0.1563 | 0.2317 | 0.0754 |
| | 0.2344 | 0.0823 | 0.0168 | 0.0023 | 0.2344 | 0.3895 | 0.1551 |
| | 0.2298 | 0.1756 | 0.0210 | 0.0107 | 0.2298 | 0.4203 | 0.1904 |
| | 0.3586 | 0.2485 | 0.0449 | 0.0335 | 0.3586 | 0.5326 | 0.1740 |
| | 0.5656 | 0.5226 | 0.1502 | 0.2200 | 0.5656 | 0.7149 | 0.1493 |
| | 0.0646 | 0.1012 | 0.0034 | 0.0004 | 0.1012 | 0.0647 | −0.0365 |
| | 0.6308 | 0.5309 | 0.1604 | 0.0304 | 0.6308 | 0.6345 | 0.0037 |
| | 0.7724 | 0.7730 | 0.2358 | 0.5837 | 0.7730 | 0.8716 | 0.0986 |
| | 0.6532 | 0.6155 | 0.2473 | 0.1530 | 0.6532 | 0.8051 | 0.1519 |
| Average | 0.4310 | 0.3753 | 0.1212 | 0.1575 | 0.4350 | 0.5494 | 0.1144 |
Artefact contribution to the mutational profiles of the FF and the FFPE sample per caller. Values with an asterisk are artefact estimates based on reconstructed mutation profiles with a cosine similarity below 0.9 with the original mutation profiles. These may be unreliable (see Methods)
| Patient ID | Strelka2 | | Mutect2 | | VarScan2 | | Shimmer | | At least 2 | |
|---|---|---|---|---|---|---|---|---|---|---|
| | 0.1419 | 0.0847 | 0.1024 | 0.0757 | 0.2010 | 0.0981 | 0.3252 | 0.6759 | 0.0822 | 0.0936 |
| | 0.1626 | 0.5381 | 0.1285 | 0.1824 | 0.3843 | 0.1981 | 0.1555 | 0.1863 | 0.0714 | 0.1132 |
| | 0.2133 | 0.1377 | 0.1302 | 0.2501 | 0.4345 | 0.1925 | 0.2466 | 0.1631 | 0.0796 | 0.1704 |
| | 0.4953 | 0.3111 | 0.1244 | 0.1261 | 0.5331 | 0.2191 | 0.1882 | 0.0556 | 0.1050 | 0.0899 |
| | 0.1344 | 0.1670 | 0.1041 | 0.0938 | 0.4947 | 0.1503 | 0.1152 | 0.0509 | 0.0666 | 0.0859 |
| | 0.2067 | 0.2896 | 0.1099 | 0.1611 | 0.3565 | 0.0932 | 0.1940 | 0.3090 | 0.0590 | 0.0775 |
| | 0.1164 | 0.1168 | 0.0833 | 0.2664 | 0.4955 | 0.1528 | 0.1474 | 0.3310 | 0.0747 | 0.0506 |
| | 0.1047 | 0.1686 | 0.0580 | 0.1127 | 0.1180 | 0.1425 | 0.1662 | 0.1458 | 0.1574 | 0.0816 |
| | 0.0822 | 0.2095 | 0.0781 | 0.0827 | 0.0583 | 0.0758 | 0.1022 | 0.1120 | 0.0544 | 0.0558 |
| | 0.0886 | 0.1701 | 0.0662 | 0.1192 | 0.0862 | 0.1051 | 0.0894 | 0.1362 | 0.0331 | 0.0436 |
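The caption flags artefact estimates whose reconstructed mutation profile has a cosine similarity below 0.9 with the original profile. A minimal sketch of that check is given below. The short vectors are hypothetical toy values chosen for readability (real mutational profiles typically have many more channels), and the function name is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally long count or frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical toy profiles: observed counts and their reconstruction.
original = [10, 5, 0, 3]
reconstructed = [9, 6, 1, 3]

similarity = cosine_similarity(original, reconstructed)
# Estimates from reconstructions below the 0.9 cut-off are flagged.
unreliable = similarity < 0.9
print(round(similarity, 3), unreliable)
```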
Fig. 4 Evaluating the overlap between clonal variants from FF and FFPE samples of UZ001. By first defining the clonal and subclonal variants (see main text), it is possible to calculate the overlap between clonal variants only. Compared to Fig. 1, only variants obtained with the at least two approach are shown, restricted to those that are known to lie in diploid regions in both FF and FFPE and can be considered clonal. Overlap: red (green) refers to variants in the FFPE (FF) sample also found in the FF (FFPE) sample. These additional filtering steps lead to an appreciable improvement in overlap between the FF and the FFPE sample
Number of clonal and subclonal variants detected in diploid regions common to FF and FFPE. For both clonal and subclonal variants the F1-score between the FF and the FFPE variants was calculated. Variants were called using the at least two variant calling strategy
| Patient ID | Clonal | | | | Subclonal | | | | At least 2 |
|---|---|---|---|---|---|---|---|---|---|
| | 3229 | 3246 | 2795 | 0.86 | 372 | 0 | 0 | 0.00 | 0.83 |
| | 662 | 25 | 1 | 0.04 | 1862 | 1183 | 223 | 0.19 | 0.23 |
| | 1109 | 2417 | 996 | 0.41 | 425 | 0 | 0 | 0.00 | 0.39 |
| | 1016 | 296 | 244 | 0.82 | 275 | 1413 | 4 | 0.00 | 0.42 |
| | 1037 | 1585 | 908 | 0.57 | 721 | 0 | 0 | 0.00 | 0.53 |
| | 1074 | 1074 | 965 | 0.90 | 710 | 630 | 297 | 0.47 | 0.71 |
| | 3249 | 202 | 126 | 0.62 | 506 | 2103 | 4 | 0.00 | 0.06 |
| | 1434 | 1362 | 1287 | 0.94 | 587 | 552 | 321 | 0.58 | 0.63 |
| | 9660 | 9821 | 8890 | 0.91 | 354 | 678 | 16 | 0.02 | 0.87 |
| | 1099 | 1013 | 962 | 0.95 | 1331 | 1309 | 937 | 0.72 | 0.80 |