Paulina Siejka-Zielińska, Jingfei Cheng, Felix Jackson, Yibin Liu, Zahir Soonawalla, Srikanth Reddy, Michael Silva, Luminita Puta, Misti Vanette McCain, Emma L Culver, Noor Bekkali, Benjamin Schuster-Böckler, Pier Francesco Palamara, Derek Mann, Helen Reeves, Eleanor Barnes, Shivan Sivakumar, Chun-Xiao Song.
Abstract
Multimodal, genome-wide characterization of epigenetic and genetic information in circulating cell-free DNA (cfDNA) could enable more sensitive early cancer detection, but it is technologically challenging. Recently, we developed TET-assisted pyridine borane sequencing (TAPS), which is a mild, bisulfite-free method for base-resolution direct DNA methylation sequencing. Here, we optimized TAPS for cfDNA (cfTAPS) to provide high-quality and high-depth whole-genome cell-free methylomes. We applied cfTAPS to 85 cfDNA samples from patients with hepatocellular carcinoma (HCC) or pancreatic ductal adenocarcinoma (PDAC) and noncancer controls. From only 10 ng of cfDNA (1 to 3 ml of plasma), we generated the most comprehensive cfDNA methylome to date. We demonstrated that cfTAPS provides multimodal information about cfDNA characteristics, including DNA methylation, tissue of origin, and DNA fragmentation. Integrated analysis of these epigenetic and genetic features enables accurate identification of early HCC and PDAC.
Year: 2021 PMID: 34516908 PMCID: PMC8442905 DOI: 10.1126/sciadv.abh0534
Source DB: PubMed Journal: Sci Adv ISSN: 2375-2548 Impact factor: 14.136
Fig. 1. cfDNA analysis by TAPS.
(A) Schematic representation of the TAPS approach for cfDNA analysis. cfDNA is isolated from 1 to 3 ml of plasma. cfDNA (10 ng) is ligated to Illumina sequencing adapters and topped up with 100 ng of carrier DNA. Subsequently, 5mC and 5hmC in DNA are oxidized by the mTet1CD enzyme to 5caC, reduced by PyBr to DHU, and then amplified and detected as T in the final sequencing. Computational analysis of TAPS data allows simultaneous characterization of multiple cfDNA features, including DNA methylation, tissue of origin, fragmentation patterns, and CNVs. (B) Number of total reads, uniquely mapped reads, and uniquely mapped, PCR-deduplicated reads in 87 cfDNA TAPS libraries. The total number of reads and the mean percentages of uniquely mapped reads and of deduplicated reads relative to total reads are shown above the bars. Error bars represent SE. (C) 5mC conversion rate and false-positive rate in 85 cfDNA TAPS libraries, based on spike-in controls with modified or unmodified cytosines at known positions. Each dot represents an individual sample.
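The two quality metrics in panel (C) can be illustrated with a minimal sketch. It assumes that, because TAPS reads modified cytosines as T, the conversion rate is the fraction of T calls at spike-in positions known to be modified, and the false-positive rate is the fraction of T calls at positions known to be unmodified. The function name `taps_rates`, the input layout, and the read counts are hypothetical and are not taken from the paper.

```python
def taps_rates(modified_counts, unmodified_counts):
    """Return (conversion_rate, false_positive_rate) from spike-in read counts.

    Each argument is a list of (c_reads, t_reads) tuples, one tuple per
    spike-in cytosine position. This input layout is an assumption made
    for illustration only.
    """

    def t_fraction(counts):
        # Pool reads over all positions, then take T / (C + T).
        c_total = sum(c for c, _ in counts)
        t_total = sum(t for _, t in counts)
        covered = c_total + t_total
        if covered == 0:
            raise ValueError("no reads cover the spike-in positions")
        return t_total / covered

    return t_fraction(modified_counts), t_fraction(unmodified_counts)


# Made-up counts: two modified and two unmodified spike-in positions.
modified = [(4, 196), (6, 194)]
unmodified = [(499, 1), (500, 0)]
conversion, false_positive = taps_rates(modified, unmodified)
print(f"conversion rate: {conversion:.3f}")
print(f"false-positive rate: {false_positive:.4f}")
```

Pooling reads across positions weights each position by its coverage; averaging per-position rates instead would be an equally plausible choice.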
Fig. 2. cfDNA methylation in clinical samples.
(A) Cancer stage distribution of the 21 HCC and 23 PDAC patients included in the study. (B) Mean per-CpG genome modification level in noncancer controls, HCC, and PDAC cfDNA. Each dot represents an individual sample. (C) PCA plot of cfDNA methylation in 1-kb genomic windows in noncancer controls and HCC. (D) PCA plot of cfDNA methylation in 1-kb genomic windows in noncancer controls and PDAC. (E) Overrepresentation analysis, in regulatory regions, of the regions most strongly correlated with PC2 for HCC and PC1 for PDAC. (F) ROC curve of model classification performance based on differentially methylated enhancers in HCC and noncancer controls (n = 51, HCC = 21, noncancer controls = 30). FPF, false positive fraction; TPF, true positive fraction. (G) Leave-one-out (LOO) cancer prediction scores for HCC and noncancer controls. The dashed line represents the probability score threshold. Samples with a probability score above this threshold were predicted as HCC. (H) ROC curve of model classification performance based on differentially methylated promoters between PDAC and noncancer controls (n = 53, PDAC = 23, noncancer controls = 30). (I) LOO cancer prediction scores for PDAC and noncancer controls. The dashed line represents the probability score threshold. Samples with a probability score above this threshold were predicted as PDAC.
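Panels (F) to (I) combine leave-one-out scoring with ROC analysis. The sketch below shows only the general pattern: each sample is scored by a model fitted on all other samples, and the scores are then summarized as an area under the ROC curve. The classifier used in the paper is not described in this record, so a nearest-centroid score stands in for it. The feature values, the 0.5 threshold, and all function names are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_scores(features, labels):
    """Score every sample with centroids fitted on all the other samples."""
    scores = []
    for i, held_out in enumerate(features):
        train = [(x, y) for j, (x, y) in enumerate(zip(features, labels))
                 if j != i]
        cancer = centroid([x for x, y in train if y == 1])
        control = centroid([x for x, y in train if y == 0])
        d_cancer = sq_dist(held_out, cancer)
        d_control = sq_dist(held_out, control)
        total = d_cancer + d_control
        # Score lies in [0, 1]; being closer to the cancer centroid raises it.
        scores.append(0.5 if total == 0 else d_control / total)
    return scores


# Made-up mean methylation levels in two regions; 1 = cancer, 0 = control.
features = [[0.80, 0.78], [0.82, 0.80], [0.79, 0.81],
            [0.70, 0.69], [0.72, 0.71], [0.68, 0.70]]
labels = [0, 0, 0, 1, 1, 1]

scores = loo_scores(features, labels)
predicted = [1 if s > 0.5 else 0 for s in scores]  # assumed threshold
print("AUC:", auc(scores, labels))
print("predicted:", predicted)
```

In a real analysis, any feature selection (for example, choosing differentially methylated regions) would normally be repeated inside each fold so that the held-out sample does not influence the features it is scored on.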
Fig. 3. cfTAPS enables analysis of tissue of origin and fragmentation patterns in cfDNA.
(A) Mean tissue contribution in noncancer individuals estimated by non-negative least squares (NNLS). Tissue contributions of less than 1.5% are aggregated as “Other.” (B) Boxplot showing the estimated liver cancer contribution within the noncancer, HCC, and PDAC groups. Statistical significance was assessed with a paired t test. n.s., not significant. (C) Length distribution of cfDNA fragments in the three groups. For each sample, the proportion of fragments in each 10-bp interval of the long cfDNA fragment range (300 to 500 bp) was used as the fragmentation features for PCA and machine learning. (D) Boxplot showing the proportion of short (70 to 150 bp) and long (300 to 500 bp) fragments in noncancer controls, PDAC, and HCC. The Kruskal-Wallis test was used to test for differences in fragment size distribution between groups. Statistically significant differences are marked with asterisks (**P < 0.01, ****P < 0.0001). (E) PCA plot of the cfDNA 10-bp fragment fraction in noncancer controls and HCC (left) and in noncancer controls and PDAC (right).
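The fragmentation features in panels (C) and (D) can be sketched as simple binned proportions. The caption does not state the denominator or how interval boundaries are treated, so this example assumes that proportions are taken relative to all fragments in the sample and that intervals are half-open, `[start, start + 10)`. The fragment lengths and function names are invented for illustration.

```python
def fragment_features(lengths, lo=300, hi=500, step=10):
    """Fraction of all fragments falling in each [start, start + step) bin.

    Assumptions: the denominator is every fragment in the sample, and
    the range is half-open, so a fragment of exactly `hi` bp is excluded.
    """
    total = len(lengths)
    if total == 0:
        raise ValueError("no fragments supplied")
    counts = [0] * ((hi - lo) // step)
    for length in lengths:
        if lo <= length < hi:
            counts[(length - lo) // step] += 1
    return [c / total for c in counts]


def size_fraction(lengths, lo, hi):
    """Fraction of fragments with lo <= length < hi."""
    if not lengths:
        raise ValueError("no fragments supplied")
    return sum(1 for n in lengths if lo <= n < hi) / len(lengths)


# Made-up fragment lengths (bp) for a single sample.
lengths = [120, 145, 167, 167, 170, 166, 305, 309, 310, 499]

bins = fragment_features(lengths)           # 20 features, 300 to 500 bp
short_frac = size_fraction(lengths, 70, 150)
long_frac = size_fraction(lengths, 300, 500)
print(len(bins), short_frac, long_frac)
```

With these boundaries the 20 bin values sum to the long-fragment fraction, so the per-bin features are a finer-grained view of the same quantity plotted in panel (D).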
Fig. 4. Integrating multimodal features from cfTAPS enhances multicancer detection.
(A) Heatmap showing individual model performance on multicancer prediction and the predicted probabilities for each patient. Each column is a patient. Detection “yes” or “no” indicates whether a patient was correctly classified or misclassified on the basis of a particular feature. Predicted score is the probability of assigning the patient to a specific group on the basis of a particular feature. (B) Schematic detailing the method of integrating multiple features (DNA methylation, tissue contribution, and fragmentation fraction) extracted from cfTAPS data for multicancer prediction. (C) Actual and predicted patient status calculated in LOO cross-validation.
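One simple way to combine per-feature predictions such as those in panel (A) is to average the class probabilities from each feature-specific model and pick the most probable class. The integration scheme actually used in panel (B) is not described in this record, so the sketch below is an illustration of the idea under that assumption, not the authors' method. The probabilities, the equal weights, and the function name `integrate` are hypothetical.

```python
CLASSES = ("noncancer", "HCC", "PDAC")


def integrate(per_feature_probs, weights=None):
    """Combine class probabilities from several feature-specific models.

    per_feature_probs maps a feature name to a dict of class -> probability.
    weights optionally maps a feature name to a non-negative weight;
    equal weights are assumed when none are given.
    """
    names = list(per_feature_probs)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    combined = {
        cls: sum(weights[n] * per_feature_probs[n][cls] for n in names)
        / total_weight
        for cls in CLASSES
    }
    # The predicted class is the one with the highest combined probability.
    predicted = max(combined, key=combined.get)
    return predicted, combined


# Made-up per-feature probabilities for one patient.
patient = {
    "methylation": {"noncancer": 0.2, "HCC": 0.7, "PDAC": 0.1},
    "tissue_contribution": {"noncancer": 0.5, "HCC": 0.4, "PDAC": 0.1},
    "fragmentation": {"noncancer": 0.3, "HCC": 0.5, "PDAC": 0.2},
}

label, probs = integrate(patient)
print(label, {cls: round(p, 3) for cls, p in probs.items()})
```

In this toy case the tissue-contribution model alone would favor "noncancer", while the averaged probabilities favor "HCC", which is the kind of disagreement between features that an integrated model is meant to resolve.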