Literature DB >> 32859160

A machine learning approach to optimizing cell-free DNA sequencing panels: with an application to prostate cancer.

Clinton L Cario^1,2, Emmalyn Chen², Lancelote Leong², Nima C Emami^1,2, Karen Lopez³, Imelda Tenggara³, Jeffry P Simko^3,4, Terence W Friedlander⁵, Patricia S Li⁵, Pamela L Paris^3,5, Peter R Carroll³, John S Witte^6,7.

Abstract

BACKGROUND: Cell-free DNA's (cfDNA) use as a biomarker in cancer is challenging due to genetic heterogeneity of malignancies and rarity of tumor-derived molecules. Here we describe and demonstrate a novel machine-learning guided panel design strategy for improving the detection of tumor variants in cfDNA. Using this approach, we first generated a model to classify and score candidate variants for inclusion on a prostate cancer targeted sequencing panel. We then used this panel to screen tumor variants from prostate cancer patients with localized disease in both in silico and hybrid capture settings.
METHODS: Whole Genome Sequence (WGS) data from 550 prostate tumors was analyzed to build a targeted sequencing panel of single point and small (< 200 bp) indel mutations, which was subsequently screened in silico against prostate tumor sequences from 5 patients to assess performance against commonly used alternative panel designs. The panel's ability to detect tumor-derived cfDNA variants was then assessed using prospectively collected cfDNA and tumor foci from a test set 18 prostate cancer patients with localized disease undergoing radical proctectomy.
RESULTS: The panel generated from this approach identified as top candidates mutations in known driver genes (e.g. HRAS) and prostate cancer related transcription factor binding sites (e.g. MYC, AR). It outperformed two commonly used designs in detecting somatic mutations found in the cfDNA of 5 prostate cancer patients when analyzed in an in silico setting. Additionally, hybrid capture and 2500X sequencing of cfDNA molecules using the panel resulted in detection of tumor variants in all 18 patients of a test set, where 15 of the 18 patients had detected variants found in multiple foci.
CONCLUSION: Machine learning-prioritized targeted sequencing panels may prove useful for broad and sensitive variant detection in the cfDNA of heterogeneous diseases. This strategy has implications for disease detection and monitoring when applied to the cfDNA isolated from prostate cancer patients.

Entities: Chemical Disease Gene Mutation Species

Keywords: Cell-free DNA; Machine learning; Panel design; Prostate cancer; Tumor variant detection

Mesh：

Substances：

Year: 2020 PMID： 32859160 PMCID： PMC7456018 DOI： 10.1186/s12885-020-07318-x

Source DB: PubMed Journal: BMC Cancer ISSN： 1471-2407 Impact factor: 4.430

Background

Substantial research has explored potential oncological applications of cell-free DNA (cfDNA), including in early detection, monitoring of residual disease, recurrence following treatment, and as a discovery tool for determining actionable therapeutic targets [1-3]. However, success using cfDNA in cancer has been limited by heterogeneity and signal intensity. In the context of heterogeneous cancers like those of the prostate, cfDNA also provides an opportunity to comprehensively measure tumor clonality (i.e. via liquid biopsy) through detection of genetic signatures of foci that would otherwise be missed with traditional tissue biopsy. Despite promising initial results, widespread clinical adoption of cfDNA as a biomarker has been impeded by several challenges [4]. One of the most important limitations, especially in the context of variant detection, is the scarcity of circulating tumor DNA (ctDNA) molecules derived from a tumor from typical blood draw volumes, an issue compounded by the weak signal-to-noise ratio of ctDNA with respect to the cfDNA derived from healthy tissue (ctDNA often representing much less than 1% of the total cfDNA fraction) [5, 6]. Several strategies have been developed to circumvent this issue, including techniques to enrich tumor derived molecules [7], highly sensitive qPCR- or ddPCR-based assays to detect well-characterized (or personalized) mutations [8-10], and deep sequencing of broad regions of the genome. Each approach has limitations; for example, enrichment techniques are limited to only modest (~ 2–4 fold) enrichment [7, 11] while qPCR-based methods require a priori or patient-specific variant knowledge and cannot readily be used for de novo discovery or across broad patient cohorts. In some cancers, like prostate, this is especially problematic as even the most common driver mutations exist at frequencies too low to be of broad clinical utility [12]. Targeted deep sequencing, on the other hand, can be used for de novo discovery and broader patient coverage, but faces issues concerning sensitivity and specificity introduced by weak tumor signal, clonal hematopoiesis (CH) [13], and technical artifacts introduced during library preparation and sequencing. Additionally, efforts to mitigate these issues are diametrically opposed— at fixed cost, one must choose to either sequence broadly at low depth with reduced sensitivity or more narrowly and deeply but with reduced specificity. To improve upon detection, we propose a solution that leverages the strengths of targeted deep sequencing and minimizes the weaknesses of traditional panel design by generating a targeted panel guided by machine learning. This solution consists of three strategies: 1) generating a sub exome-sized (2.5 Mb) targeted sequencing panel, but instead of only including the coding regions of known cancer genes, focusing on small (~ 350 bp, corresponding to dinucleosomal cfDNA) regions of the genome that are either coding or regulatory non-coding and potentially harbor tumor mutations; 2) computationally selecting candidates for inclusion on this panel with a machine learning model built from actual tumor data and optimized to detect functional or regulatory mutations (“orchid”); and 3) using unique molecular identifiers (UMIs) to suppress technical errors induced by library preparation and sequencing. In this article, we present our targeted sequencing panel design, demonstrate its in silico performance through comparison with two other design approaches, and then validate its ability to detect somatically validated multi-foci tumor variants in the cfDNA of prostate cancer patients at the time of prostatectomy.

Methods

Patients cohorts

This study uses data from two main patient cohorts, including public prostate tumor variant data from 550 patients cataloged in the International Cancer Genome Consortium (ICGC) and 23 (5 for our in-silico analysis and 18 for our variant capture test) patients from the University of California, San Francisco (UCSF). In the ICGC dataset, patient ages ranged between 32 and 81 (mean of 58.7) and had the following stage distributions: T1 (30%), T2 (42%), T3 (17%), T4 (1%), and Unknown (11%). In the UCSF cohort, patient ages ranged between 50 and 73 (mean of 62.9) and had the following stage distributions: T1 (34.8%), T2 (60.9%), and T3 (4.3%). Additional information about patient cohorts is given in Supplemental File “Donor Information.xlsx”.

Training data

Whole Genome Sequence (WGS) tumor variant data from the 550 ICGC prostate cancer patients (274 with copy number information) was used to populate a mutation database. In total, the database consisted of 1,588,558 single base substitutions, 66,202 insertions ≤200 bp, and 90,255 deletions ≤200 bp. Of the 1,717,507 mutations, 90.5% had sequencing coverage between 30-80X. These mutations were annotated with 339 features using the orchid software (http://wittelab.ucsf.edu/orchid) (‘orchid’ panel; Supplemental) [14]. Among features, annotations included those related to functional impact, non-coding regulatory status, cancer driver-ness scores, and base-level evolutionary conservation among primates.

Panel generation

To build our targeted sequencing panel, we first trained a classification and ranking model, a linear support classifier (SVC), using the orchid software as well as data from our mutation database. We also generated two panels from methods widely used in the field in order to benchmark performance: 1) a gene-centric panel consisting of coding regions from the aggregated set of ~ 530 genes found in four clinically available cancer-specific targeted sequencing gene panels (referred to as “union-existing”; Supplemental Table 1), and 2) a “Frequency” panel, consisting of the most frequent mutations in the ICGC prostate cancer dataset. Code used to generate the panels and the panel variant composition can be found in the repository at https://github.com/wittelab/cfdna-panel-publication/.

In silico analysis

We first benchmarked the orchid panel’s variant capture performance against the two other designs using an in silico analysis of Whole Exome Sequenced (WES) tumor foci DNA and matched cfDNA from 5 patients undergoing radical prostatectomy at UCSF. Somatic variant calling was performed for at least 2 different tumor foci with a normal tissue control for each patient. Next, we generated in silico capture probes for the orchid panel by expanding the genomic coordinates of panel mutations by ±175 bp to match the mode size of cfDNA molecules. Tumor and cfDNA variants were intersected with the orchid panel and the two comparison panels described above.

Patient ctDNA variant detection

The cfDNA from a cohort of 18 prostate cancer patients was isolated and prepared as UMI-tagged libraries for sequencing. After the in silico validation of the orchid panel, hybrid capture probes were ordered and used to sequenced panel regions at 2500X. Tumor variants were subsequently called using the Curio Genomics platform (https://curiogenomics.com), which was designed specifically for processing UMI-barcoded data generated though ThruPlex Tag-seq library preparation by grouping reads into amplification families prior to constructing consensus reads for variant calling. See Supplemental for more details.

Results

Defining and evaluating mutation classes

It is widely accepted that a relatively small number of genetic variants are responsible for the cellular transformations leading to cancer and that these “drivers” often occur early in a tumor’s evolution leading to high clonality among tumor subclones [15-18]. Additionally, the proportion of drivers decreases relative to the number of passengers as a tumor accumulates mutations [17, 19–21]. Following this, we hypothesize that if tumors accumulate mutations as they evolve, those with the lowest mutational burden are both enriched for drivers and more likely to harbor variants at high allele frequency among subclones, making these variants the best candidates for detection in cfDNA. In order to prioritize which of these low burden mutations should be included on our cfDNA screening panel, we built a mutational scoring model using the sequencing data from the ICGC prostate cancer patients, first defining training labels by dividing ICGC prostate cancer patients into equal-sized groups (n = 275 each) based on their number of mutations: 1) Low Burden (LB), consisting of mutations from men with a lower mutational burden, and 2) High Burden (HB), consisting of mutations from men with a greater burden (Fig. 1a). We next tested the hypothesis that LB labeled mutations were enriched for drivers, evaluating for their presence in 88 known driver genes (as identified by The Cancer Genome Consortium Prostate Cancer Adenocarcinoma project (TCGA-PRAD) and the IntOGen database; accessed 6/18/2019 [22]) and found significant enrichment in the LB class (hypergeometric test; p = 4.11 e-129), but not the HB class (p > 0.99). This was also the case with 97 prostate driver genes defined by Fraser et. al. (LB p = 1.13 e-119; HB p > 0.99) [23]. With these two classes defined, computational complexity was reduced through random down sampling of the data to a total of approximately 50,000 unique mutations while preserving the original LB:HB mutation label ratio in the dataset (approximately 1:40).

Fig. 1

Modeling simple somatic mutations. a We divided ICGC prostate cancer donors into two classes, Low Burden (LB) or High Burden (HB), based on the number of somatic mutations in their tumors and labeled their mutations accordingly. b After modeling with a linear Support Vector Classifier (SVC), we generated a ROC curve of LB classification. Accuracy was 76% +/− 12%. c We visualized classification probabilities for test mutations. The model predicts fewer LB mutations and classifies both LB and HB with high confidence. d We show model feature weights for both classes when features were used as lone predictors. Repressed regions of the genome were more predictive of HB mutations whereas regulatory, transcribed regions of the genome or ‘deleterious’ mutations were more predictive of LB mutations

Initial modeling and performance

In an effort to guard against overfitting, we used orchid’s feature selection method—which removes features that each account for an average drop in accuracy < 0.1% when excluded from the full-featured model—to reduce the number of features from > 300 to 20, and performed 10-fold cross validation with a linear SVC, generating a “LB” predictive model. A ROC curve of model performance in the test sets is shown in Fig. 1b, indicating a 0.76 (± 0.12) classification accuracy. When classification probabilities across all test cases were plotted, we observed a higher likelihood of HB mutation classification, which was expected from the intentional unbalanced LB:HB class ratio used for training (Fig. 1c). To better understand the importance of classification features, we next used each feature singularly in a series new LB/HB classification models, visualizing feature weights and directionality. From this we observed that repressed regions of the genome were predictive of HB mutations, and conversely, regulatory/transcribed regions of the genome—and features indicating strong evolutionary conservation at the base level—were predictive of LB mutations (Fig. 1d; Supplemental Fig. 1). This was also expected under our assumption that LB mutations were more likely to be drivers. We elaborate on feature importance in Supplemental Results.

Mutation ranking

After selecting features, we then built a final classification model fully trained on down-sampled data (i.e. none withheld for testing) and used it to score LB probability for all prostate cancer mutations in the database. Mutation distances from the fit model’s classification hyperplane were then used to rank them. Those with the greatest magnitudes in the LB direction (i.e. the most “driver-like” or “clonal” under our hypothesis) were further considered for inclusion on the targeted sequencing panel.

Standardizing mutation scores

We annotated candidate LB mutations with associated gene information, if available, using SnpEff [24], as well as functional impact information and transcript length from the UCSC genome database. After binning genes according to their length, we visualized the number of mutations per gene (Fig. 2a). As expected, longer genes had more mutations (Pearson’s correlation = 0.20, p = 6.03 e-39), creating a scenario where marginally scored mutations could be selected for panel inclusion by virtue of strong gene-level feature annotations preferred by the model. To address this issue and to increase gene mutational diversity on the panel, we implemented a corrective standardization (Supplemental Fig. 2) and applied it to the distance scores of mutations (Fig. 2b). This standardization reduced Pearson’s correlation between gene length and candidate mutations number to 0.05 (p = 1.5 e-3). Mutations that were non-coding or without gene annotation were unaffected by this standardization. After applying this correction, the top 7034 mutations were then selected for the “orchid” panel. In all, this panel represented 0.41% of the total number of original candidate LB mutations.

Fig. 2

Generating a targeted sequencing library for hybrid capture of LB mutations. We generated a candidate panel consisting of probes targeting the ~ 7000 highest ranked LB mutation loci. a We binned genes represented by candidate mutations into 10 groups based on length and show the distribution in number of mutations. Gene length correlated with the number of mutations on the panel (Pearson’s correlation = 0.20, p = 6.03e-39). b We employed a distance standardization to mutation hyperplane distances to increase gene diversity on the panel. After standardization the correlation between gene length and number of mutations decreased significantly (Pearson’s correlation = 0.05, p = 0.0015). c We plotted the hyperplane distances of retained mutations after standardization against the log mutation rank. Mutations are labeled as coding (green) or non-coding (grey). The top 5 coding mutations with their corresponding genes are labeled. d We show a table of panel mutation consequence types and counts, colored by impact severity (red = high, orange = moderate, yellow = low, blue = modifier)

Panel composition

Once our standardized orchid panel was established, we attempted to biologically characterize the mutational composition. Looking at the top 5 coding mutations, for example, we noted that corresponding genes (FES, TNFSF15, MSRB1, HRAS, and SLC1A7) have all been experimentally implicated in cancer as drivers [25, 26] (Fig. 2c). Additionally, we found panel mutations to be significantly enriched for the aforementioned 97 prostate driver mutations (p = 2.24 e-13); KEGG-annotated general cancer (p = 4.18 e-228) and prostate cancer genes (p = 1.19 e-61) (https://www.genome.jp/kegg); and regions associated with regulation of cellular response to growth factors (p = 2.68 e-4), MAP kinase activity (p = 8.66 e-8), and Integrin signaling (p = 1.74 e-13) among others [27-29] (Enrichr; http://amp.pharm.mssm.edu/Enrichr/). Finally, looking at functional impact, we noticed a majority of coding mutations were classified as high or moderate impact and included 50 induced stop gains and 3386 missense mutations. A table of consequence mutations is shown in Fig. 2d. While the most highly ranked mutations were coding, many functional non-coding mutations were also included (~ 18%) on the panel. For example, we discovered significant enrichment for several general and prostate cancer transcription factor binding sites (Supplemental Fig. 3), including BRD4 (e = 329), CTCF (e = 254), FOXA1 (e = 188), MYC (e = 181), and AR (e = 159), as well as a microRNA involved in angiogenesis (mir-126) [27, 30] (ReMap; http://tagc.univ-mrs.fr/remap/).

Panel performance: in silico analysis

After characterizing the orchid panel, we compared how well it detected somatic variants in relation to two other panels: 1) the union of four existing sequencing panels (Fluxion Biosciences, Foundation Medicine, Guardant Health, and UCSF 500, referred to as the “union-existing” panel; Supplemental Table 1); and 2) a frequency-based panel (consisting of the most common mutations; see Methods). We assessed this by measuring each panels’ ability to identify somatic tumor-normal variants in multiple tumor foci from 5 prostate cancer patients. Overall, the orchid panel detected more variants than both the frequency panel (p = 7.4 e-9) and the union-existing panels (p = 3.6 e-10; Fig. 3), and these differences were statistically significant for all patients except P0024 (only one focus; union-existing [p < 0.03], frequency [p < 0.02] using a T-test). We also note that, given a fair percentage of the orchid panel (~ 18%) consists of non-coding regions, tumor variants within these regions could not be assessed through WES data, potentially underestimating the panel’s performance.

Fig. 3

Panel performance using in silico capture of cfDNA. Five patients with multiple prostate cancer tumor foci and normal tissue DNA were whole exome sequenced at 200X-fold coverage. Discovered somatic variants were in silico “captured” with three panels: 1) our orchid generated panel, 2) a panel consisting of all mutations in the ICGC prostate cancer dataset with a frequency > 1 patient, and 3) a panel consisting of genes on any of 4 clinically used panels (union-existing). The mean number of total somatic mutations across foci are listed below each patient and the mean numbers of those present on each of the three panels are shown (blue bars). Orchid detected significantly more mutations in all patients except P0024 (with only one focus; union-existing [p < 0.03], frequency [p < 0.02] using a T-test)

Panel performance: ctDNA variant detection

After confirming that our orchid-generated machine learning panel improved upon the union-existing and frequency-based panels in an in silico setting for detection of mutations in tumor tissue, we ordered hybrid capture probes for regions encompassing the orchid panel mutations (a genomic footprint totaling ~ 2.5 Mb). We then sequenced 18 patients with multiple prostate tumor foci and normal tissues at 2500X with our panel. Matched cfDNA was also collected for these patients at time of radical prostatectomy and targeted-sequenced at a depth of 2500X with our panel. We next assessed the panel’s performance in detecting somatic tumor-normal ctDNA variants within the collected cfDNA of these patients. After removing variants not passing quality control filters (see Supplemental Methods), we found that variants were detected in all 18 patients, ranging between 15 (S038) and 448 (S076) in number with a median of 122.5. We additionally filtered variants by requiring they be detected in multiple foci of a tumor. In this case, variants were detected in 15 of the 18 patients, ranging between 4 (S025 and S078) and 289 (S076) in number with a median of 26 (Fig. 4a). The allele frequency of detected variants across patients ranged between 0.24% (S058, S067, and S078) and 19.82% (S027; a conservative lower threshold for germline variants), with a median of 3.76% (Fig. 4b). Allele frequency did not correlate with age, stage, or Gleason score (p > 0.05).

Fig. 4

Variant detection and frequency distribution in prostate cancer patients using the orchid generated targeted sequencing panel. Eighteen patients with multiple tumor foci and normal tissue DNA were sequenced at 2500X-fold coverage after targeted capture using the orchid generated panel. Matched cfDNA was likewise captured and sequenced. a The number of tumor variants detected in the cfDNA of 15 patients is shown. Tumor variants were both somatic and present in multiple tumor foci. Three of the eighteen patients did not have any mutations detected in more than one focus. b The allele frequency distribution of all cfDNA detected tumor variants in A (germline threshold shown at 20%; theoretical sensitivity at 0.8%)

Discussion

Despite continued progress and the marked successes of cfDNA’s application in late-stage disease [2, 31, 32], ongoing issues prevent wide-spread adoption for early-stage cancer. These issues largely center on tumor heterogeneity and scarcity of tumor derived molecules in circulation. The issues are further compounded by challenges with sample collection and processing, variant artifacts (including CH mutations for non-tumor-matched samples), and bioinformatic analysis. The most straight-forward solutions to mitigate these problems include increasing the volume of blood collected (e.g., 30–100 mL), analyzing ctDNA variants with paired whole blood normal samples, and sequencing at ultra-high depths (e.g. > 30,000X). Other solutions include improving molecular techniques, error suppression (e.g. UMIs), and optimizing the composition of gene sequencing panels [11, 13, 33–37]. Here we expand upon optimizing panels, leveraging machine learning to move past driver-, gene-, or frequency- based panels towards one informed directly by biological datasets. In particular, this is accomplished by modeling low burden mutational signatures developed from tumor/normal sequence and variant annotation data. While our machine learning approach improved the sequencing panel design, the accuracy of predicting LB versus HB mutations was only 0.76. This accuracy can be largely explained by label contamination introduced though incomplete partitioning of driver and clonal variants into the LB class, as presence of these mutations also occurs in the HB class albeit at lower frequency (our hypothesis only assumes enrichment in LB). This situation motivated our use of a linear support classifier, which has a higher tolerance of noise (e.g. mislabeled training data) and better feature interpretability relative than other machine learning model types. We found orchid’s feature classification weights to be sensible; for example, associating evolutionary conserved/transcribed regions of the genome with LB tumor mutations, and repressed regions of the genome with HB tumor mutations. Still, despite a fairly high accuracy for noisy data and sensible feature selection, the modeling approach could be improved upon with alternative labeling strategies and/or training data, drawing upon recently generated datasets of statistically determined drivers in noncoding regions of the genome [38], for example. There are a number of other qualifications to our machine learning panel design approach that merit consideration. First, establishing a panel’s clinical utility will require much larger sample sizes and greater sequencing depth to further validate variant detection and improve sensitivity in early stage prostate cancer patients. Second, to better elucidate and catalogue CH variants, cfDNA samples should be paired with DNA isolated from whole blood samples and sequenced at equal depth, especially when matched tumor samples are not available. Third, although we compared our panel to two alternative designs in silico, future work should compare panels directly using patient cfDNA samples—ideally with paired deep whole genome tumor/normal sequence data. Finally, to further assess panel detection as it relates to mutation clonality, follow-up comparison with more sensitive detection strategies (e.g. qPCR), and serial sampling of patient tumors during course of treatment would need to be performed. As liquid biopsy and cfDNA continues to find increasing clinical applications, the modeling approach described here can be adopted to generate panels for those purposes as well. For example, in discriminatory dichotomous scenarios (early vs. late stage, onset vs. recurrence variants), mutational spectra can be learned to rank variants for the formation of a blended panel consisting of highly ranked mutations from both classes. Variants from this panel that are ultimately detected within a patient could then be used in a maximum likelihood computation to determine the patient’s likeliest class. Likewise, multi-class models (e.g. tumor stage) could be developed in a similar fashion. Finally, panels designed to optimize variant detection (like the orchid panel) could potentially be used to estimate Tumor Mutational Burden (TMB) which has recently become an important biomarker in cancer, particularly within the cancer immunotherapy field.

Conclusions

The use of machine learning to optimize targeted sequencing panel composition presents a promising new approach to improve ctDNA variant detection in patients with cancer. In an in silico screen, our panel outperformed two alternatives in detecting tumor-derived ctDNA mutations—one generated from a combination of several existing panels, and one based on tumor mutation frequencies. We also demonstrated the targeted panel’s ability to detect tumor variants found in both the cfDNA captured from prostate cancer patients and multiple foci of their tumors. In summary, we have developed a novel method to rank coding and non-coding tumor mutations for inclusion on a targeted sequencing panel. To our knowledge, this is the first use of machine leaning to generate a capture panel for screening ctDNA of cancer patients. While further research is needed to address the issues of scarce starting material, modeling, and variant discovery, our results provide a useful strategy for broad—yet sensitive—future panel design. Strategies like these are increasingly important for mutation detection in cfDNA isolated from cancer patients with heterogeneous disease, especially at sequencing depths required to reach levels of sensitivity needed for utility in early detection at an affordable cost. Additional file 1.

37 in total

1. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

Authors: Pablo Cingolani; Adrian Platts; Le Lily Wang; Melissa Coon; Tung Nguyen; Luan Wang; Susan J Land; Xiangyi Lu; Douglas M Ruden
Journal: Fly (Austin) Date: 2012 Apr-Jun Impact factor: 2.160

2. Enhanced detection of circulating tumor DNA by fragment size analysis.

Authors: Florent Mouliere; Dineika Chandrananda; Anna M Piskorz; Elizabeth K Moore; James Morris; Lise Barlebo Ahlborn; Richard Mair; Teodora Goranova; Francesco Marass; Katrin Heider; Jonathan C M Wan; Anna Supernat; Irena Hudecova; Ioannis Gounaris; Susana Ros; Mercedes Jimenez-Linan; Javier Garcia-Corbacho; Keval Patel; Olga Østrup; Suzanne Murphy; Matthew D Eldridge; Davina Gale; Grant D Stewart; Johanna Burge; Wendy N Cooper; Michiel S van der Heijden; Charles E Massie; Colin Watts; Pippa Corrie; Simon Pacey; Kevin M Brindle; Richard D Baird; Morten Mau-Sørensen; Christine A Parkinson; Christopher G Smith; James D Brenton; Nitzan Rosenfeld
Journal: Sci Transl Med Date: 2018-11-07 Impact factor: 17.956

3. Genomic hallmarks of localized, non-indolent prostate cancer.

Authors: Michael Fraser; Veronica Y Sabelnykova; Takafumi N Yamaguchi; Lawrence E Heisler; Julie Livingstone; Vincent Huang; Yu-Jia Shiah; Fouad Yousif; Xihui Lin; Andre P Masella; Natalie S Fox; Michael Xie; Stephenie D Prokopec; Alejandro Berlin; Emilie Lalonde; Musaddeque Ahmed; Dominique Trudel; Xuemei Luo; Timothy A Beck; Alice Meng; Junyan Zhang; Alister D'Costa; Robert E Denroche; Haiying Kong; Shadrielle Melijah G Espiritu; Melvin L K Chua; Ada Wong; Taryne Chong; Michelle Sam; Jeremy Johns; Lee Timms; Nicholas B Buchner; Michèle Orain; Valérie Picard; Helène Hovington; Alexander Murison; Ken Kron; Nicholas J Harding; Christine P'ng; Kathleen E Houlahan; Kenneth C Chu; Bryan Lo; Francis Nguyen; Constance H Li; Ren X Sun; Richard de Borja; Christopher I Cooper; Julia F Hopkins; Shaylan K Govind; Clement Fung; Daryl Waggott; Jeffrey Green; Syed Haider; Michelle A Chan-Seng-Yue; Esther Jung; Zhiyuan Wang; Alain Bergeron; Alan Dal Pra; Louis Lacombe; Colin C Collins; Cenk Sahinalp; Mathieu Lupien; Neil E Fleshner; Housheng H He; Yves Fradet; Bernard Tetu; Theodorus van der Kwast; John D McPherson; Robert G Bristow; Paul C Boutros
Journal: Nature Date: 2017-01-09 Impact factor: 49.962

Review 4. Integrins in cancer: biological implications and therapeutic opportunities.

Authors: Jay S Desgrosellier; David A Cheresh
Journal: Nat Rev Cancer Date: 2010-01 Impact factor: 60.716

5. Quantitative detection of EGFR mutations in circulating tumor DNA derived from lung adenocarcinomas.

Authors: Kazuya Taniguchi; Junji Uchida; Kazumi Nishino; Toru Kumagai; Takako Okuyama; Jiro Okami; Masahiko Higashiyama; Ken Kodama; Fumio Imamura; Kikuya Kato
Journal: Clin Cancer Res Date: 2011-10-05 Impact factor: 12.531

Review 6. MAP kinase signalling pathways in cancer.

Authors: A S Dhillon; S Hagan; O Rath; W Kolch
Journal: Oncogene Date: 2007-05-14 Impact factor: 9.867

7. Unsupervised detection of cancer driver mutations with parsimony-guided learning.

Authors: Runjun D Kumar; S Joshua Swamidass; Ron Bose
Journal: Nat Genet Date: 2016-09-12 Impact factor: 38.330

Review 8. Cell-free DNA (cfDNA): Clinical Significance and Utility in Cancer Shaped By Emerging Technologies.

Authors: Stanislav Volik; Miguel Alcaide; Ryan D Morin; Colin Collins
Journal: Mol Cancer Res Date: 2016-07-15 Impact factor: 5.852

9. Patterns of somatic mutation in human cancer genomes.

Authors: Christopher Greenman; Philip Stephens; Raffaella Smith; Gillian L Dalgliesh; Christopher Hunter; Graham Bignell; Helen Davies; Jon Teague; Adam Butler; Claire Stevens; Sarah Edkins; Sarah O'Meara; Imre Vastrik; Esther E Schmidt; Tim Avis; Syd Barthorpe; Gurpreet Bhamra; Gemma Buck; Bhudipa Choudhury; Jody Clements; Jennifer Cole; Ed Dicks; Simon Forbes; Kris Gray; Kelly Halliday; Rachel Harrison; Katy Hills; Jon Hinton; Andy Jenkinson; David Jones; Andy Menzies; Tatiana Mironenko; Janet Perry; Keiran Raine; Dave Richardson; Rebecca Shepherd; Alexandra Small; Calli Tofts; Jennifer Varian; Tony Webb; Sofie West; Sara Widaa; Andy Yates; Daniel P Cahill; David N Louis; Peter Goldstraw; Andrew G Nicholson; Francis Brasseur; Leendert Looijenga; Barbara L Weber; Yoke-Eng Chiew; Anna DeFazio; Mel F Greaves; Anthony R Green; Peter Campbell; Ewan Birney; Douglas F Easton; Georgia Chenevix-Trench; Min-Han Tan; Sok Kean Khoo; Bin Tean Teh; Siu Tsan Yuen; Suet Yi Leung; Richard Wooster; P Andrew Futreal; Michael R Stratton
Journal: Nature Date: 2007-03-08 Impact factor: 49.962

10. Whole genome amplification of cell-free DNA enables detection of circulating tumor DNA mutations from fingerstick capillary blood.

Authors: Rekha Gyanchandani; Erik Kvam; Ryan Heller; Erin Finehout; Nicholas Smith; Karthik Kota; John R Nelson; Weston Griffin; Shannon Puhalla; Adam M Brufsky; Nancy E Davidson; Adrian V Lee
Journal: Sci Rep Date: 2018-11-23 Impact factor: 4.379

4 in total

Review 1. Liquid biopsies for colorectal cancer: a narrative review of ongoing clinical trials and the current use of this technology at a comprehensive cancer center.

Authors: Sacha P Broccard; Ali Abbaszadeh Kasbi; Sanjay P Bagaria; Jeremy Jones; Mira Shoudry; Emmanuel M Gabriel
Journal: J Gastrointest Oncol Date: 2022-02

2. Integrating chromatin accessibility states in the design of targeted sequencing panels for liquid biopsy.

Authors: Pegah Taklifi; Fahimeh Palizban; Mahya Mehrmohamadi
Journal: Sci Rep Date: 2022-06-21 Impact factor: 4.996

3. Cancer systems epidemiology: Overcoming misconceptions and integrating systems approaches into cancer research.

Authors: Patricia L Mabry; Nicolaas P Pronk; Christopher I Amos; John S Witte; Patrick T Wedlock; Sarah M Bartsch; Bruce Y Lee
Journal: PLoS Med Date: 2022-06-17 Impact factor: 11.613

4. Cell-Free DNA Detection of Tumor Mutations in Heterogeneous, Localized Prostate Cancer Via Targeted, Multiregion Sequencing.

Authors: Emmalyn Chen; Clinton L Cario; Lancelote Leong; Karen Lopez; César P Márquez; Patricia S Li; Erica Oropeza; Imelda Tenggara; Janet Cowan; Jeffry P Simko; Robin Kageyama; Daniel K Wells; June M Chan; Terence Friedlander; Rahul Aggarwal; Pamela L Paris; Felix Feng; Peter R Carroll; John S Witte
Journal: JCO Precis Oncol Date: 2021-04-27

4 in total