M C Liu1, G R Oxnard2, E A Klein3, C Swanton4, M V Seiden5. 1. Division of Medical Oncology, Department of Oncology, Mayo Clinic, Rochester, USA. 2. Lowe Center for Thoracic Oncology, Dana Farber Cancer Institute, Boston, USA. 3. Glickman Urological and Kidney Institute, Cleveland Clinic, Cleveland, USA. 4. Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute; Cancer Evolution and Genome Instability Laboratory, University College London Cancer Institute, London, UK. 5. US Oncology Research, US Oncology, The Woodlands, USA. Electronic address: Michael.seiden@mckesson.com.
Abstract
BACKGROUND: Early cancer detection could identify tumors at a time when outcomes are superior and treatment is less morbid. This prospective case-control sub-study (from NCT02889978 and NCT03085888) assessed the performance of targeted methylation analysis of circulating cell-free DNA (cfDNA) to detect and localize multiple cancer types across all stages at high specificity. PARTICIPANTS AND METHODS: The 6689 participants [2482 cancer (>50 cancer types), 4207 non-cancer] were divided into training and validation sets. Plasma cfDNA underwent bisulfite sequencing targeting a panel of >100 000 informative methylation regions. A classifier was developed and validated for cancer detection and tissue of origin (TOO) localization. RESULTS: Performance was consistent in training and validation sets. In validation, specificity was 99.3% [95% confidence interval (CI): 98.3% to 99.8%; 0.7% false-positive rate (FPR)]. Stage I-III sensitivity was 67.3% (CI: 60.7% to 73.3%) in a pre-specified set of 12 cancer types (anus, bladder, colon/rectum, esophagus, head and neck, liver/bile-duct, lung, lymphoma, ovary, pancreas, plasma cell neoplasm, stomach), which account for ∼63% of US cancer deaths annually, and was 43.9% (CI: 39.4% to 48.5%) in all cancer types. Detection increased with increasing stage: in the pre-specified cancer types sensitivity was 39% (CI: 27% to 52%) in stage I, 69% (CI: 56% to 80%) in stage II, 83% (CI: 75% to 90%) in stage III, and 92% (CI: 86% to 96%) in stage IV. In all cancer types sensitivity was 18% (CI: 13% to 25%) in stage I, 43% (CI: 35% to 51%) in stage II, 81% (CI: 73% to 87%) in stage III, and 93% (CI: 87% to 96%) in stage IV. TOO was predicted in 96% of samples with cancer-like signal; of those, the TOO localization was accurate in 93%. CONCLUSIONS: cfDNA sequencing leveraging informative methylation patterns detected more than 50 cancer types across stages. 
Considering the potential value of early detection in deadly malignancies, further evaluation of this test is justified in prospective population-level studies.
Earlier cancer detection offers the opportunity to identify tumors when cures are more achievable, outcomes are superior, and treatment can be less morbid.[1,2] Effective screening paradigms exist only for a small subset of cancers, are focused on single cancer types, and have variable adoption and compliance.[3-6] Thus, diagnoses are often prompted by symptoms and are made at later stages. Utilization of blood-based circulating tumor cell-free DNA (cfDNA) to simultaneously detect and localize multiple cancer types[7,8] may address this large unmet need. In large-scale population screening, such a multi-cancer detection approach would require high specificity, clinically useful sensitivity, and highly accurate tissue of origin (TOO) identification to limit the scope, cost, and complexity of evaluating asymptomatic patients.
There are few studies interrogating simultaneous detection and localization of multiple cancer types using cfDNA or other analytes.[9-11] These studies generally analyzed a handful of cancer types in geographically-restricted cohorts[9-11] or interrogated a single cfDNA-based molecular approach.[11] Current commercially available cfDNA-based approaches interrogating single-nucleotide variants (SNVs/indels) that focus on key alterations associated with specific tumor types or treatment options[12] may be hampered by confounding signals from white blood cells (WBCs) or other tissue.[13,14] Similarly, approaches based on detecting somatic copy number alterations may be limited by smaller relative differences between cases and controls resulting in a need for increased sequencing depth as well as technical variation restricting the signal-to-noise ratio.[15,16] These approaches as well as others such as protein biomarkers have not yet demonstrated robust TOO assignment across a broad range of tumor types to direct a diagnostic evaluation. To date there have been no population-scale studies of cancer cfDNA signatures nor have there been studies designed to ensure consistent performance in a representative screening population.
The Circulating Cell-free Genome Atlas (CCGA; NCT02889978) study was designed to determine whether genome-wide cfDNA sequencing in combination with machine learning could detect and localize a large number of cancer types at sufficiently high specificity to be considered for a general population-based cancer screening program. During discovery work in the first CCGA sub-study, whole-genome bisulfite sequencing (WGBS) interrogating genome-wide methylation patterns outperformed whole-genome sequencing (WGS) and targeted sequencing approaches interrogating copy-number variants (CNVs) and single-nucleotide variants (SNVs)/small insertions and deletions, respectively.[7,17] Additionally, targeted sequencing with SNV-based classification was significantly confounded by clonal hematopoiesis of indeterminate potential (CHIP)[18]; such a test would thus require concurrent sequencing of WBCs to return accurate results. Herein, we report results from the second case-control sub-study designed to develop, train, and validate a methylation-based assay for simultaneous multi-cancer detection across stages as well as TOO localization in preparation for clinical validation and utility studies (NCT03085888, NCT03934866) and for a study in which results will be returned to health care providers and patients (NCT04241796).
METHODS
Study design and participants
CCGA (NCT02889978) is a prospective, multicenter, case-control, observational study with longitudinal follow-up. De-identified biospecimens were collected from 15 254 participants with (n = 8584; 56%) and without (n = 6670; 44%) cancer from 142 sites in North America. Up to 80 ml whole blood was collected from all participants as part of the research study; only two tubes of plasma were processed separately per participant. Pre-treatment tumor tissue when available was submitted from those with cancer (supplementary information, available at Annals of Oncology online). The CCGA study was divided into three pre-specified sub-studies: (1) discovery,[17] (2) further analysis (training/validation) with the selected assay (reported herein), and (3) further validation (forthcoming) (Figure 1A). The first sub-study aimed to identify the highest performing assay(s) for further development and included three independent, comprehensive sequencing approaches.[17,19] A methylation-based assay was selected for further development in this second sub-study based on the previous finding that WGBS outperformed targeted sequencing and WGS approaches targeting SNVs/small insertions and deletions and CNVs, respectively.[7,17] The primary objective was to train and validate a classifier for cancer versus non-cancer and TOO identification utilizing an updated targeted methylation assay (Figure 1B and C, Figure 2). 
Pre-specified analysis groups included all cancer types (more than 50 cancer types[20]; cancers grouped for reporting purposes, see supplementary information, available at Annals of Oncology online) and a subset of 12 high-signal cancers (supplementary information, available at Annals of Oncology online) based on results from the first sub-study (>50% sensitivity on at least one of three prototype sequencing assays) and Surveillance, Epidemiology, and End Results (SEER) mortality data (anus, bladder, colon/rectum, esophagus, head and neck, liver/bile-duct, lung, lymphoma, ovary, pancreas, plasma cell neoplasm, stomach).[17,21] The third sub-study was designed to further validate the classifier in a large population and is ongoing.
Figure 1.
The CCGA study for development and validation of a cfDNA-based assay for multi-cancer detection.
(A) CCGA study design. The CCGA study included three pre-specified sub-studies designed to discover, train, and validate an assay for multi-cancer detection and localization. The burgundy, shaded boxes highlight the second sub-study, which is the focus of this report. (B) Methylation biology discriminates cancer from non-cancer. One circulating cfDNA fragment is represented on the top left; individual CpGs are indicated as burgundy (methylated) or teal (unmethylated) circles. This assay interrogated fragment-level methylation patterns as indicated on the bottom left (‘Fragment-Level CpG Sites’). In non-cancer participants (top right), cfDNA is shed from cells across the body including WBCs[13] and is present in plasma. These DNA fragments retain methylation marks from the originating cells as indicated in this example from a region on chromosome 10. Individual cfDNA fragment sequencing reads are indicated as horizontal lines of differing sizes and are aligned vertically. In non-cancer participants, these fragments are largely unmethylated as indicated by the almost uniformly teal fragments. In a participant with lung cancer (bottom right), the plasma contains a mix of methylated (burgundy) and unmethylated (teal) fragments as the circulating cfDNA is a mixture of tumor cfDNA and cfDNA from other cells in the body. Sequencing of the tumor tissue sample confirms that this region is almost entirely methylated as indicated. Note that tumor tissue is not a requirement for this assay but is illustrative. (C) Target selection. A large database of methylation patterns (‘data input types’) was constructed from WGBS analysis of cfDNA and tissue samples from the CCGA study as well as WGBS analysis of a set of commercially sourced tissue samples. 
Systematic examination of the fragment-level methylation signature (‘methylation information type’) from these samples allowed the identification of a large number of genomic regions (‘target selection’) containing informative biological signatures of cancer and TOO.
Figure 2.
Pre-classifier sample preparation and preprocessing overview.
Illustration of how cfDNA fragments from the blood are processed: cfDNA was extracted from plasma, subjected to bisulfite treatment, and regions of interest were pulled down, followed by sequencing and alignment. In this way the methylation state of fragments was obtained.
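The readout described in Figure 2 rests on the standard chemistry of bisulfite treatment: unmethylated cytosines are converted and sequence as T, whereas methylated cytosines remain C. The following is a minimal sketch of how a fragment-level CpG methylation pattern can be read from an aligned read under that rule. It is illustrative only and is not the study pipeline; the toy sequence, the function names, and the simplifications (forward strand only, gap-free alignment of equal length) are assumptions.

```python
def bisulfite_convert(fragment: str, methylated: set[int]) -> str:
    """Simulate bisulfite treatment on the forward strand.

    Unmethylated C is read as T; C at a position listed in `methylated` stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(fragment)
    )


def cpg_states(reference: str, read: str) -> str:
    """Return one symbol per reference CpG site.

    M = methylated (read shows C), U = unmethylated (read shows T),
    ? = any other base. Assumes the read is aligned to the reference
    without gaps and has the same length (a simplification).
    """
    states = []
    for i in range(len(reference) - 1):
        if reference[i : i + 2] == "CG":
            states.append({"C": "M", "T": "U"}.get(read[i], "?"))
    return "".join(states)


# Hypothetical toy fragment with CpG sites at positions 1, 5, and 9.
reference = "ACGTTCGACCGA"
read = bisulfite_convert(reference, methylated={1, 9})
pattern = cpg_states(reference, read)
print(read, pattern)
```

In this toy case the non-CpG cytosine at position 8 is also converted, which is why only cytosines in a CpG context are scored.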
This second sub-study reported herein included 4841 participants from CCGA divided into a training set (n = 3133: 1742 cancer and 1391 non-cancer; Figure 3) and an independent validation set (n = 1354: 740 cancer and 614 non-cancer); 354 samples were reserved for a tumor biopsy reference set. Samples for each sub-study were selected to ensure a pre-specified distribution of cancer types and non-cancers across sites in each cohort (supplementary information, available at Annals of Oncology online).
Figure 3.
Participant disposition.
A total of 4841 participants (2836 cancer, 2005 non-cancer) from the CCGA study and 2202 non-cancer participants from the STRIVE study were included in this pre-specified analysis. Of these, 3133 samples from CCGA were allocated to training (1742 cancer, 1391 non-cancer) and 1354 were allocated to validation (740 cancer, 614 non-cancer); 1587 samples from STRIVE were allocated to training and 615 to validation. STRIVE non-cancer samples were used to train the classifier and to ensure >99% specificity was achieved with >90% confidence (see Methods, supplementary information, available at Annals of Oncology online). Participant disposition is indicated. Overall, 3052 samples in training (1531 cancer, 1521 non-cancer) and 1264 samples in validation (654 cancer, 610 non-cancer) were analyzable and in the pre-specified primary analysis population. Samples reserved for pre-specified future analyses (as indicated) included, for example, samples lacking 1-year follow-up and samples from participants with carcinoma in situ (CIS) (see Methods).
To operate at high specificity, large numbers of well-characterized controls were required both to train classification and to measure specificity accurately due to sampling variability (i.e. to ensure >99% specificity was achieved with >90% confidence). These additional non-cancer blood samples were obtained from the independent STRIVE study (NCT03085888), a prospective, multicenter, case-cohort, observational study with longitudinal follow-up. STRIVE was designed to independently validate the ability of this multi-cancer early detection test to detect and localize multiple invasive cancers including breast in a population of women undergoing screening mammography (supplementary information, available at Annals of Oncology online). De-identified biospecimens were collected from 99 286 participants from 35 sites. Samples from women without a known history of cancer (non-cancer) were selected from a single STRIVE site and incorporated into the training (n = 1587) and independent validation sets (n = 615). As noted above, these STRIVE non-cancer samples were used to train the classifier and to ensure >99% specificity was achieved with >90% confidence. Thus, 6689 total samples were analyzed in this second sub-study: 4720 in the training set (1742 cancer; 2978 non-cancer) and 1969 in the independent validation set (740 cancer; 1229 non-cancer; Figure 3).
All participants were required to provide informed consent; eligibility and exclusion criteria for each study are described in the supplementary information, available at Annals of Oncology online. Institutional Review Board or independent ethics committee approval was obtained at each participating site and the study was conducted in accordance with the Good Clinical Practice Guidelines of the International Conference on Harmonization.
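The requirement that >99% specificity be shown with >90% confidence explains why a large non-cancer cohort was needed. The sketch below illustrates the underlying sampling argument with an exact binomial calculation: it finds the smallest number of controls for which, if true specificity were only 99%, observing a given number of false positives or fewer would have probability of at most 10%. The method by which the study sized its control cohort is not stated in this report, so the approach, function names, and example inputs here are assumptions.

```python
from math import comb


def prob_at_most(k: int, n: int, p: float) -> float:
    """Binomial probability of at most k events in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def min_controls(false_positives: int, specificity: float = 0.99,
                 confidence: float = 0.90) -> int:
    """Smallest control count n such that seeing `false_positives` or fewer
    would be improbable (<= 1 - confidence) if true specificity were only
    `specificity`."""
    fpr = 1 - specificity
    n = false_positives + 1
    while prob_at_most(false_positives, n, fpr) > 1 - confidence:
        n += 1
    return n


# With zero observed false positives, 230 controls suffice under these assumptions.
print(min_controls(0))
# Allowing a few false positives requires considerably more controls.
print(min_controls(4))
```

The required cohort grows quickly once any false positives are tolerated, which is consistent with the use of several hundred to several thousand controls.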
Sample collection, accessioning, storage, and processing
Details of plasma and tumor tissue sample collection, accessioning, storage, and processing are described in the supplementary information, available at Annals of Oncology online.
WGBS
WGBS resulted in 3508 analyzable samples: cfDNA [n = 2628 (1493 cancer; 1135 non-cancer)], formalin fixed paraffin embedded (FFPE) tumor biopsies (n = 242), and WBCs (n = 70) from the first CCGA sub-study; commercial tissue or cells [n = 227; Discovery Life Sciences (formerly Conversant Biologics, Inc.); Huntsville, AL]; non-cancer cells (n = 1; from Yuval Dor, Hebrew University, Jerusalem, Israel); and FFPE tumor biopsies from the second CCGA sub-study (n = 340; these participants were not used in subsequent evaluation of the classifier). WGBS is described in the supplementary information, available at Annals of Oncology online.
Targeted methylation panel design, sample processing, and sequencing
Based on the WGBS results as noted above[17,19] and methylation array data from The Cancer Genome Atlas,[21,22] regions of the hg19 genome[22] predicted to contain cancer and/or tissue-specific methylation patterns in cfDNA relative to non-cancer controls were identified and the most informative targets were combined into a custom hybridization capture panel (Twist Bioscience, San Francisco, CA) using a custom algorithm (supplementary information, available at Annals of Oncology online). The final targeted methylation panel covered 103 456 distinct regions (17.2 Mb) and 1 116 720 cytosine-guanine dinucleotides (CpGs).
Plasma cfDNA (up to 75 ng) was subjected to bisulfite conversion (EZ-96 DNA Methylation Kit; Zymo Research, Irvine, CA), prepared as a dual indexed sequencing library, and enriched using standard hybridization capture conditions, followed by 150-bp paired-end sequencing on Illumina NovaSeq (supplementary information, available at Annals of Oncology online). Individual libraries were sequenced to a median depth of 113 million fragments (median unique on-target depth: 139X).
Classification of cancer versus non-cancer and TOO
Custom software was built to classify samples using source models that recognized methylation patterns per region as similar to those derived from a particular cancer type, followed by a pair of ensemble logistic regressions: one to determine cancer/non-cancer status and the other to resolve the TOO to one of the listed sites (supplementary Figure S1C, available at Annals of Oncology online). Both levels of model (source models, logistic regression) as well as any thresholding parameters were trained on 11 154 samples from 5854 participants (supplementary information, available at Annals of Oncology online) using cross-validation so that no data in held-out folds could be accessed during training.
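Two ideas in the paragraph above can be sketched generically: keeping held-out folds inaccessible during training, and fixing a decision threshold from non-cancer scores to reach a target specificity. The sketch below is not the study's custom software and does not reproduce the source models or ensemble logistic regressions. Assigning folds per participant (so that multiple samples from one participant never straddle training and held-out data), the number of folds, and all function names are assumptions made for illustration.

```python
import hashlib
import math


def fold_of(participant_id: str, n_folds: int = 10) -> int:
    """Deterministically map a participant to a fold, so that every sample
    from that participant lands in the same fold."""
    digest = hashlib.sha256(participant_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_folds


def split(samples, held_out_fold: int, n_folds: int = 10):
    """Split (participant_id, sample_id) pairs into training and held-out lists."""
    train, held_out = [], []
    for participant_id, sample_id in samples:
        bucket = held_out if fold_of(participant_id, n_folds) == held_out_fold else train
        bucket.append((participant_id, sample_id))
    return train, held_out


def threshold_for_specificity(noncancer_scores, target: float = 0.99) -> float:
    """Return an observed score t such that at least `target` of the
    non-cancer scores are <= t; samples scoring above t are called positive."""
    ordered = sorted(noncancer_scores)
    index = math.ceil(target * len(ordered)) - 1
    return ordered[index]


# Hypothetical data: 50 participants with four samples each.
samples = [(f"P{i % 50}", f"S{i}") for i in range(200)]
train, held_out = split(samples, held_out_fold=0, n_folds=5)

# Hypothetical non-cancer classifier scores.
scores = [i / 1000 for i in range(1000)]
cutoff = threshold_for_specificity(scores, target=0.99)
achieved = sum(s <= cutoff for s in scores) / len(scores)
print(len(train), len(held_out), cutoff, achieved)
```

In practice the threshold would be chosen within training data only and then locked before validation, as described under Blinding.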
Tumor fraction
Tumor fraction analyses are described in the supplementary information, available at Annals of Oncology online.
Blinding
All analyses in training and validation were double-blinded. Classifiers developed on the training set were locked and the final classifier was selected before the validation dataset was released. Before that release, a data integrity team blinded to classifier development, analysis, and clinical/assay evaluability reviewed merged data to ensure completeness (supplementary information, available at Annals of Oncology online). Researchers developing classifiers were also blinded to cancer status in the validation set.
RESULTS
This second pre-specified CCGA sub-study included 6689 participants with previously untreated cancer (n = 2482) or without cancer (n = 4207) (Figure 3). More than 50 primary cancer types[20] across all clinical stages were represented. Samples were divided into training (n = 4720) and independent validation sets (n = 1969). A total of 4316 participants [training: 3052 (1531 cancer: stage I: 28%; stage II: 25%; stage III: 20%; stage IV: 24%; missing/not expected: 3%; 1521 non-cancer); validation: 1264 (654 cancer: stage I: 28%; stage II: 25%; stage III: 21%; stage IV: 23%; missing/not expected: 3%; 610 non-cancer)] were analyzable and included in the primary analysis population (supplementary Table S1 available at Annals of Oncology online). Training and validation sets were generally comparable with respect to age, sex, race/ethnicity, and body mass index in the cancer and non-cancer groups; as expected, fewer never-smokers were in the cancer group in the training and validation sets (supplementary Table S1 available at Annals of Oncology online).
The classifier achieved consistently high specificity between the cross-validated training and independent validation sets [99.8% (95% CI: 99.4% to 99.9%) versus 99.3% (95% CI: 98.3% to 99.8%), respectively; P = 0.095] (Figure 4A); this reflected a single, consistent, false-positive rate (FPR) of less than 1% across the more than 50 cancer types. The FPR in the validation set was similar for the CCGA and STRIVE non-cancer samples [0.7% (95% CI: 0.1% to 2.6%) versus 0.6% (95% CI: 0.1% to 2.1%), respectively; P = 0.830], supporting that performance was not biased by sites or selected samples.
Figure 4.
Targeted methylation cfDNA test performance.
(A) Specificity. Specificity was >99% in the training and validation sets. Importantly, this represents a consistent, single false-positive rate (FPR) across the >50 cancer types in this study. (B) Sensitivity. Sensitivity (y-axis) is reported by clinical stage (x-axis) in the pre-specified cancer types (left panel) and in all cancer types (right panel) for training and validation. Numbers indicate samples in training|validation sets. It excludes 45 samples in training and 21 samples in validation without stage information (e.g. leukemias). (C) Tissue of origin. Tissue of origin (TOO) accuracy (y-axis) is reported by clinical stage (x-axis) in the pre-specified cancer types (left panel) and in all cancer types (right panel) for training and validation. Numbers indicate samples in training|validation sets.
Sensitivity was consistent in the training and validation sets. In all cancers, stage I–III sensitivity was 44.2% (95% CI: 41.3% to 47.2%) versus 43.9% (95% CI: 39.4% to 48.5%) (P = 1.000), respectively. For the pre-specified set of 12 high-signal cancers, stage I–III sensitivity was 69.8% (95% CI: 65.6% to 73.7%) versus 67.3% (95% CI: 60.7% to 73.3%), respectively (P = 0.988). Similarly, stage I–IV sensitivity across all cancer types was 55.2% (95% CI: 52.7% to 57.7%) versus 54.9% (95% CI: 51.0% to 58.8%), respectively (P = 0.897), and in the pre-specified cancers was 77.9% (95% CI: 75.0% to 80.7%) versus 76.4% (95% CI: 71.6% to 80.7%), respectively (P = 0.573). Clinical models incorporating baseline demographic information and blood sample quality metrics alone resulted in <10% sensitivity in the training set and as such were not evaluated in the validation set (supplementary information, available at Annals of Oncology online).
Sensitivity increased with increasing stage of disease (Figure 4B). In validation, sensitivity in pre-specified cancer types was 39% (95% CI: 27% to 52%) in stage I (n = 62), 69% (95% CI: 56% to 80%) in stage II (n = 62), 83% (95% CI: 75% to 90%) in stage III (n = 102), and 92% (95% CI: 86% to 96%) in stage IV (n = 130). Among all cancer types, sensitivity was 18% (95% CI: 13% to 25%) in stage I (n = 185), 43% (95% CI: 35% to 51%) in stage II (n = 166), 81% (95% CI: 73% to 87%) in stage III (n = 134), and 93% (95% CI: 87% to 96%) in stage IV (n = 148). Performance in individual tumor types is depicted in Figure 5. These included numerous deadly cancer types without screening paradigms; for example, in pancreatic cancer sensitivity was 63% (95% CI: 24% to 91%) in stage I, 83% (95% CI: 36% to 100%) in stage II, 75% (95% CI: 35% to 97%) in stage III, and 100% (95% CI: 80% to 100%) in stage IV. Tumor fraction (supplementary Figure S2, available at Annals of Oncology online) as measured by the frequency of abnormal tumor methylation patterns in plasma correlated with tumor fraction based on tumor mutation variant allele frequencies in plasma, confirming the tumor-derived nature of the methylation signal.
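The aggregate sensitivities reported above are, in effect, stage-weighted averages of the stage-specific values, which is why they depend on the stage mix of the cohort. As a rough check, the sketch below pools the validation-set stage-specific sensitivities using the stated sample sizes. Because the inputs are the rounded percentages from the text, and because the reported stage I–IV figures also include samples without stage information, small differences from the reported aggregates are expected; the variable and function names are illustrative.

```python
# (sensitivity, n) by clinical stage in the validation set, as reported in the text.
all_cancers = {"I": (0.18, 185), "II": (0.43, 166), "III": (0.81, 134), "IV": (0.93, 148)}
prespecified = {"I": (0.39, 62), "II": (0.69, 62), "III": (0.83, 102), "IV": (0.92, 130)}


def pooled_sensitivity(by_stage, stages):
    """Sample-size-weighted mean of stage-specific sensitivities."""
    detected = sum(by_stage[s][0] * by_stage[s][1] for s in stages)
    total = sum(by_stage[s][1] for s in stages)
    return detected / total


early = ("I", "II", "III")
every = ("I", "II", "III", "IV")

all_early = pooled_sensitivity(all_cancers, early)      # reported: 43.9%
pre_early = pooled_sensitivity(prespecified, early)     # reported: 67.3%
all_every = pooled_sensitivity(all_cancers, every)      # reported: 54.9%
pre_every = pooled_sensitivity(prespecified, every)     # reported: 76.4%

for label, value in [("all I-III", all_early), ("pre-specified I-III", pre_early),
                     ("all I-IV", all_every), ("pre-specified I-IV", pre_every)]:
    print(f"{label}: {value:.1%}")
```

A cohort with proportionally more stage I cases would therefore show a lower aggregate sensitivity even if stage-specific performance were unchanged.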
Figure 5.
Sensitivity in individual tumors by stage.
Sensitivity at 99.8% specificity (training) or 99.3% specificity (validation) with 95% confidence intervals is reported for individual cancer types with at least 50 samples. Clinical stage is indicated below the plots as is the number of samples in training and validation (separated by a vertical line).
To ensure consistent performance in centers that did not contribute to training and to ensure that single sites did not over-contribute, a post hoc site balancing analysis interrogated performance in a subset of centers dropped from the analysis (supplementary information, available at Annals of Oncology online). A limited shift in sensitivity consistent with variability in the training set was observed when omitting those sites from training [sensitivity for included versus excluded sites: 53.6% (95% CI: 51.8% to 55.4%) versus 50.0% (95% CI: 48.2% to 51.8%), respectively]; specificity was also within the expected range of variation [FPR of 0.5% (95% CI: 0.2% to 1%) versus 0.4% (95% CI: 0.2% to 0.8%), respectively].
A critical attribute of a blood-based multi-cancer detection test is the ability to localize the TOO to direct the diagnostic workup. A pre-specified analysis of TOO accuracy (the fraction of all TOO predictions that were correct) found that TOO was predicted in 96% (344/359) of samples with a cancer-like signal in the validation set; among these, accuracy was 93% (321/344). Accuracy was consistent between the training and validation sets and across stages (Figure 4C). The classifier distinguished from among the numerous cancer types included in the study with consistent performance in individual cancer types (Figure 6).
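The TOO proportions above follow directly from the stated counts, and an approximate confidence interval can be attached to each. The sketch below uses the Wilson score interval, one common choice for binomial proportions; the interval method used for the results in this report is not stated here, so the choice of Wilson and the function name are assumptions.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Validation-set counts reported in the text.
assigned = 344 / 359          # TOO call made among samples with cancer-like signal
accurate = 321 / 344          # correct TOO among samples with a TOO call
low, high = wilson_interval(321, 344)

print(f"TOO assigned: {assigned:.0%}")
print(f"TOO accuracy: {accurate:.0%} (approx. 95% CI {low:.1%} to {high:.1%})")
```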
Figure 6.
Tissue of origin accuracy by individual cancer type in the training and validation sets.
Confusion matrices representing the accuracy of tissue of origin (TOO) localization in the (A) training and (B) validation sets. Agreement between the actual (x-axis) and predicted (y-axis) TOO per sample using the targeted methylation classifier is depicted. Color corresponds to the proportion of predicted TOO calls. Included participants (training: n = 844, validation: n = 359) are those with cancer predicted as having cancer at 99.8% specificity (training) or 99.3% specificity (validation). The TOO calls were assigned in 95% (806/844) of cases in training and in 96% (344/359) of cases in validation; calls were correct in 92% (744/806) of cases in training and in 93% (321/344) of cases in validation.
DISCUSSION
Herein we report on the largest clinical genomics programs, to our knowledge, on participants with and without cancer, to develop and validate a blood-based test for multi-cancer early detection. The CCGA study was designed such that results may be generalizable as well as to minimize bias, a problem that has plagued the early detection field. This was accomplished by pre-specifying analyses, controlling for pre-analytic factors (e.g., age, sex, site location) and ensuring that demographics were comparable between the cancer and non-cancer groups, ensuring that stage distribution and method of diagnosis were consistent in independent training and validation sets (supplementary Table S1 available at Annals of Oncology online), ensuring that multiple cancer types at all stages (including early stages) were represented such that resultant cancer classifiers would not be confounded by inappropriate comparison cohorts, and ensuring that there were no site-specific effects on classifier performance. Inclusion of an independent validation set confirmed that the classifier was not over-fitted. Lastly, the inclusion of a large non-cancer cohort enriched in potentially confounding conditions demonstrated with confidence a high specificity (i.e. safety) that may be appropriate for population-level screening, minimizing potential harm from false positives. Together, these data provide compelling evidence that targeted methylation analysis of cfDNA can detect and localize a broad range of non-metastatic and metastatic cancer types including many common and deadly cancers that lack effective screening strategies.Methylation outperformed WGS and targeted mutation panels in cancer detection and TOO localization[7,8] for a number of reasons. Methylation is more pervasive compared with canonical mutation sites[23] typically interrogated in traditional liquid biopsy approaches. 
Indeed, this targeted methylation approach interrogated approximately 1 million informative CpG sites out of the roughly 30 million CpGs across the genome that can be methylated or unmethylated.[24] This allowed deeper sequencing of those informative regions compared with WGBS and may overcome expected cost and efficiency limitations of WGS or WGBS approaches. Although WGS detected cancer at high tumor fractions, it had a worse limit of detection than a methylation-based approach.[7] Targeted mutation detection also suffered a worse limit of detection[7] and was subject to highly prevalent mutations present in individuals due to biological processes such as CHIP.[18] As such, unlike methylation, targeted sequencing required concurrent WBC sequencing to achieve strong performance. Finally, epigenetic signals inherently reflect tissue differentiation and malignant cancer states; this likely contributed to the strong cancer detection and TOO classification.Other recently published studies also reported the feasibility of blood-based multi-cancer detection. These studies combined mutation detection with serum protein biomarkers,[25] leveraged differences in cfDNA fragment lengths between cancer and non-cancer participants,[10] or utilized a methylation-based immunoprecipitation approach to avoid perceived issues with bisulfite sequencing degradation.[11] To overcome potential concerns about applicability to broad populations and less prevalent cancers, here we reported on >50 cancer types (supplementary information, available at Annals of Oncology online[20]) covering sites with >95% of the SEER program cancer incidence and included more than 4300 cancer and non-cancer participants in the primary analysis population. 
Additionally, this approach was developed after exhaustive preclinical analyses into three complementary and comprehensive sequencing approaches,[17,19] ensuring that the highest performing assay was further developed.A blood-based multi-cancer detection test should demonstrate certain fundamental performance characteristics to be useful in a general screening population. These include a sufficiently high specificity to ensure a low rate of false positives as well as accuracy in determining TOO. We reported a <1% FPR, a single, fixed FPR across all cancer types, such that inclusion of additional cancer types to the test would increase the number of tumors detected but not the number of false positives. By contrast, single-cancer early detection tests used in combination would generally have a cumulative FPR higher than the individual tests,[26] potentially increasing unnecessary diagnostic work-ups. Accurate TOO localization is critical to direct the diagnostic workup; in its absence, patients with a positive test may be subjected to a diagnostic odyssey. This also applies to blood-based single-cancer detection, which would still require accurate TOO to avoid a diagnostic odyssey from other potential cancers not being tested for. The ability to discriminate from among so many cancer types may also be useful in cases of diagnostic uncertainty such as cancer of unknown primary origin. Finally, there would be little-to-no benefit from artificially increasing sensitivity by excluding cancer types in a multi-cancer test; the higher overall prevalence of cancer versus the prevalence of any single-cancer type means that a multi-cancer test with moderate sensitivity may result in a higher yield of detected cancers than a single-cancer test with very high sensitivity. 
As such, sensitivity must be considered in light of the number of interrogated cancer types.

Screening tests are subject to false positives and the subsequent morbidity and psychological, physical, and financial costs associated with secondary screening or diagnostic tests. Minimizing these risks and costs requires high positive predictive value (PPV) in the target population, especially in asymptomatic populations. PPV is more significantly impacted by specificity and disease prevalence than by sensitivity. As noted above, multi-cancer detection would thus benefit from aggregate cancer incidence compared with single-cancer screening given that most cancer types have low prevalence in a screening population. While precise PPV calculations require measurement in a prospective trial of asymptomatic individuals, preliminary calculations can be carried out based on available cancer statistics. Specifically, assuming test performance replicates in an asymptomatic population, a multi-cancer test with a stage I–IV sensitivity of 55% and a specificity of 99.3%, both from representative populations as reported here, applied to a similar population with a 1.3% incidence rate per year of cancer[1,27] would detect 715 cancers per 100 000 screened persons in a long-term screening program and would necessitate diagnostic work-ups for 691 false positives, yielding a PPV of 51%. By contrast, the PPVs for United States Preventive Services Task Force (USPSTF) recommended screening for breast, colorectal (stool-based), and lung cancer (in the USPSTF-recommended high-risk population) range from 3.7% to 4.4%;[28-30] that is, for every one person with cancer correctly detected, there would be between 22 and 27 people incorrectly identified as having cancer.

Despite the scale of and care in developing and validating this targeted methylation approach, the study has limitations.
Participants with cancer were not all asymptomatic; understanding performance in an asymptomatic screening population will require additional studies, which are ongoing. Establishing a mortality benefit will also require additional studies, as the CCGA study was not designed to examine all-cause mortality outcomes. Until such longer-term studies are completed, a shift in detection to earlier stages by a multi-cancer test may function as a proxy for mortality benefit, given that cancer-specific mortality is improved when cancer is diagnosed at earlier stages. Whether cancers detected at later stages can be intercepted at earlier stages using this cfDNA-based multi-cancer early detection test will require additional studies in intended use populations. At the time of analysis, complete 1-year follow-up was not available on all non-cancer participants to ensure their ascribed non-cancer status was accurate, thus potentially overestimating the FPR and underestimating PPV. Indeed, prior results from the first sub-study identified cancer signal up to 15 months before clinical diagnosis in participants enrolled without a cancer diagnosis.[31] Follow-up of non-cancer participants in this sub-study is ongoing. Aggregate sensitivities were likely affected by stage and cancer distribution; reporting by individual cancer type and by stage as in this report is thus critical to contextualizing aggregate performance metrics. Confusion in TOO identification often occurred among HPV-driven cancers (e.g. cervix, anus, head and neck cancers); analyses are ongoing to further improve accuracy by leveraging this information.
Finally, despite the broad range of cancer types captured in this study, for some cancer types the sample size was small, precluding a full representation of heterogeneity within some cancer types.

In summary, cfDNA sequencing of informative methylation patterns detected a broad range of cancer types at metastatic and non-metastatic stages with specificity and sensitivity performance approaching the goal for population-level screening. The pre-specified cancer types identified here account for ~63% of all estimated cancer deaths.[21,27] Clinical validation in intended use populations is ongoing (NCT03085888, NCT03934866) and a study has been initiated that is returning results to health care providers and patients (NCT04241796). These results support the feasibility of employing this targeted methylation analysis of cfDNA in ongoing clinical trials in the intended use population for early cancer detection.
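
The preliminary PPV estimate given in the discussion (55% stage I–IV sensitivity, 99.3% specificity, 1.3% annual cancer incidence, per 100 000 screened persons) can be reproduced with simple arithmetic. The sketch below is illustrative only: the function name `screening_outcomes` and its parameters are hypothetical and not part of the study's analysis, and it assumes, as the text does, that test performance replicates in an asymptomatic population.

```python
def screening_outcomes(sensitivity, specificity, incidence, screened=100_000):
    """Return (true positives, false positives, PPV) for a screened cohort.

    Hypothetical helper for illustration; inputs are proportions (0 to 1).
    """
    with_cancer = screened * incidence           # persons who have cancer
    without_cancer = screened - with_cancer      # persons who do not
    true_pos = with_cancer * sensitivity         # cancers correctly detected
    false_pos = without_cancer * (1 - specificity)  # non-cancer persons flagged
    ppv = true_pos / (true_pos + false_pos)      # positive predictive value
    return true_pos, false_pos, ppv


# Values quoted in the discussion: sensitivity 55%, specificity 99.3%,
# incidence 1.3% per year, 100 000 screened persons.
tp, fp, ppv = screening_outcomes(0.55, 0.993, 0.013)
print(round(tp), round(fp), round(ppv * 100))  # prints: 715 691 51
```

Holding sensitivity and incidence fixed while varying `specificity` in this sketch shows why PPV in a low-prevalence screening population is driven largely by specificity: false positives scale with the much larger non-cancer group.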