Literature DB >> 35477453

Discovery of genes positively modulating treatment effect using potential outcome framework and Bayesian update.

Young Keun Lee¹, Jisoo Kim², Sung Wook Seo^3,4.

Abstract

BACKGROUND: The recent explosion of cancer genomics provides extensive information about mutations and gene expression changes in cancer. However, most of the identified gene mutations are not clinically utilized. It remains uncertain whether the presence of a certain genetic alteration will affect treatment response. Conventional statistics have limitations for causal inferences and are hard to gain sufficient power in genomic datasets. Here, we developed and evaluated a C-search algorithm for searching the causal genes that maximize the effect of the treatment.
METHODS: The algorithm was developed based on the potential outcome framework and Bayesian posterior update. The precision of the algorithm was validated using a simulation dataset. The algorithm was implemented to a cBioPortal dataset. The genes discovered by the algorithm were externally validated within CancerSCAN screening data from Samsung Medical Center.
RESULTS: Simulation data analysis showed that the C-search algorithm was able to identify nine causal genes out of ten. The C-search algorithm shows the discovery rate rapidly increasing until the 1500 data instances. Meanwhile, the log-rank test shows a slower increase in performance. The C-search algorithm was able to suggest nine causal genes from the cBioPortal Metabric dataset. Treating the patients with the causal genes is associated with better survival outcome in both the cBioPortal dataset and the CancerSCAN dataset which is used for external validation.
CONCLUSIONS: Our C-search algorithm demonstrated better performance to identify causal effects of the genes than multiple log-rank test analysis especially within a limited number of data. The result suggests that the C-search can discover the causal genes from various genetic datasets, where the number of samples is limited compared to the number of variables.

Entities: Chemical

Keywords: Bayesian; Causal inference; Genomics; Potential outcome framework; Treatment modulators

Mesh：

Year: 2022 PMID： 35477453 PMCID： PMC9047392 DOI： 10.1186/s12911-022-01852-3

Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN： 1472-6947 Impact factor: 3.298

Introduction

Identifying genomic sequences and analyzing data is a major focus in cancer studies [1]. An understanding of the causal relationship between therapeutic effect and genomic variances among tumors will allow individualized treatment and reduce unnecessary treatment. There are open-access, open data sources, such as the cBioPortal for Cancer Genomics. Although a large amount of data is readily accessible to researchers, most of the identified gene mutations are not clinically utilized [2]. Conventional statistical analysis of genetic data consists of a series of single-statistic tests. The cumulative probability of false positives increases as the number of genes increases. To deal with multiple-testing problems, the false discovery rate (FDR) is used, which is “expected type I errors among the total number of rejected null hypotheses” [3]. Despite the approach, the dimension of the data significantly affects the statistical power of the test. In addition, conventional statistics only draw an association; therefore, distinguishing causal relationships from spurious associations is a challenge [4]. To draw causal effects from observational data, Rubin introduced a potential outcome framework [5, 6] Individual levels of treatment effect are derived from a comparison of two potential outcomes. however, observing the exposed and unexposed outcomes at the same time is impossible. One of the methods to overcome this fundamental problem of causal inference is to compute the potential outcome from samples matched with similar covariate profiles [4, 7, 8]. However, due to the curse of dimensionality in cancer genomics, enough sample size may not be available to match the exposed and unexposed within a genetic subset [9]. We developed an algorithm called C-search that can estimate potential outcomes using the similarity-weighted Monte Carlo method. We adopted Bayesian posterior update, which allows us to estimate the uncertainty of our decision boundary from small datasets without losing power [10]. This system was used to identify the causal genes that maximize the effect of the treatment. In this study, we compared the performance of causal gene discovery between the conventional statistical method and our C-search using a simulation dataset and an open-source gene dataset.

Materials and methods

Pseudo-counterfactual assumption and similarity weighted Monte Carlo

Assume that individual with variable is treated with the treatment variable is a binary variable (, if treated; , if not). There are two potential outcomes: , and . The causal effect of the treatment can be drawn from the comparison between both [6]. Here, we define as the outcome of an individual with variable when treated, and as the oucome of the individual when not treated. Then, individual treatment effect () of the individual , can be written as follows: However, we can observe only one potential outcome at most [4, 8]. Therefore, we should infer the counterfactuals from an untreated data pool. We call them pseudo-counterfactuals because they are not identical to the factual. Draw an individual with variable from the untreated data pool. Weight function is defined as the probability of similarity between the factual and pseudo-counterfactual [11]. Using the similarity weighted Monte Carlo method [12], we could estimate as follows:

Measurement of the difference in survival outcome using Win probability ()

Survival outcome includes survival time and survival events. Measuring individual differences in survival outcomes is difficult because they are right-censored. One of the most well-established outcome measures for survival difference is the bi-partite ranking system, such as the Wilcoxon − Mann − Whitney statistics [13]. Adopting this concept, we assumed the comparison in outcome between two individuals and as a Bernoulli trial. If lives longer than , will win a score. ITE for the survival outcome can be defined as the win probability ( the chance that the treated individual wins over its untreated counterpart), which follows a binomial distribution. The beta distribution is a conjugate prior for the binomial distribution. If we consider the comparison between two individuals and as a simple Bernoulli trial, the posterior distribution after observing the score (or observing times of winning of ) after trials can be defined as follows:

Update Win probability using similarity weighted Monte Carlo

In ideal settings where all individuals in the counterpart group that are identical to the individual , the outcome of is a good estimator for the counterfactual outcome of However, an identical condition is impossible in the observation setting. We use the similarity weighted Monte Carlo method to update . We matched individual with the individuals in the counterpart data pool and updated the score() with the similarity weight calculated from the similarity between and The can be the Euclidean distance in the original data space [14]. Other methods use a transformed one-dimensional score, that is, a regression function, such as a propensity function [15]. Here, we defined a basal function—a regression function that approximates the survival state. The covariates of individual and matched controls are projected onto the space through the basal functions and . To incorporate the difference onto the similarity weight, we used the Boltzmann probability distribution: is a constant and is the annealing temperature. These hyperparameters represent degrees of freedom. Let be the score from a single comparison between and . The posterior distribution after observing a single comparison between and can be written as follows: The is estimated from the posterior distribution after observing the number of counterpart individuals. We defined the observed clinical covariates as Genetic alteration is represented by simple binary values (e.g., for each genetic profile , existing alteration is given a value of 1; if not, it is given a value of 0). The individual has the genetic variable that consists of a set of genetic profile. To estimate , we may use the similarity weight calculated from the basal function using clinical covariates and/or genetic covariates. We denote the weight of the basal function of clinical covariates as and the weight using the basal function of the genetic covariates as . Using Eq. 3, and can be written as follows: The win probabilities of individual using both weights are as follows:

Causal gene suggestion

To find a single gene ( effect on the treatment effect, we assumed that each gene has an independent win probability . We used the similarity weighted Monte Carlo method to estimate or individual ’s win probability (Eq. 7). Observing individual ’s win probability updates the prior distribution of and the posterior is as follows: Sampling -number of individuals with , the posterior can be as follows: In reality is not stationary because depends on individual variables . To identify the marginal treatment effect, we need to balance all confounding variables using inverse propensity score weighting (IPW) as follows. For the treatment decision, we need to know whether treating a patient with genetic alteration is beneficial with sufficient evidence. We can estimate the posterior distribution of without in the same manner. If has a significant benefit for the treatment, the upper bound of the 95% confidence interval of ) should be lower than the lower bound of ).

Results

Simulation data analysis

The C-search algorithm was validated with the simulation data (Additional file 1), where the 10 causal genes that positively modulated the treatment outcome were hidden among 300 genes. We compared the precision of the C-search algorithm with that of the conventional log-rank survival analysis. Both the C-search algorithm and conventional statistics suggested 10 possible causal genes. The precision of the algorithm was determined by the number of true causal genes among the suggested genes. The performance of the algorithm was evaluated using five-fold cross-validation. To balance the covariate profile, we used propensity score matching. To perform the multiple comparison tests, we set the FDR to 0.05 and controlled it with the Benjamini − Hochberg procedure [16, 17]. The power of log-rank survival analysis depends on the number of data [18]. To demonstrate whether the algorithms are dependent on the number of data, 100, 200, 500, 1000, 1500, 2000, 2500, 3000, 3500, and 4000 simulation data were used for the analysis. As the number of data increases, both algorithms show better causal gene discovery. The C-search algorithm continuously improved its discovery performance until 1500 samples were obtained and then plateaued. Conventional statistics showed a linear improvement according to the number of samples. It required at least 4000 samples to show performance comparable to that of the C-search algorithm (Fig. 1).

Fig. 1

The number of causal genes discovered by C-search and conventional statistics. The X-axis is the number of samples consisting of the simulation data. The Y-axis is the number of true causal genes among the 10 suggested causal genes that the algorithm discovered. The C-search algorithm shows the discovery rate rapidly increasing until the 1500 data instances. The log-rank test shows a slower increase in performance

Finding positive modulators from open-source data

We also analyzed the cBioPortal Metabric breast cancer data using the C-search algorithm to identify causal genes that are associated with improved outcomes of chemotherapy. The dataset includes gene mutation profiles of 173 genes and clinical data from 2433 patients with primary breast cancer [19]. Clinical data consisted of age, chemotherapy, radiation therapy, sex, survival lifetime, and survival events. Among the 2433 records, 964 records that have missing on clinical data were omitted. A total of 1,469 patients were included in the analysis (Additional file 1). The C-search suggested nine positive modulators: PRKCZ, CLK3, CDKN2A, BRAF, KRAS, CASP8, JAK1, PRKACG, and SIK2. We allocated all patients who had any of them to the causal gene group and those who had not to the other gene group. If they are the true causal genes, treated patients should show a statistically significantly better prognosis compared to the untreated patients in the causal gene group. In addition, among the patients who were treated, the causal gene group should show a better prognosis than the other gene group. There were no overall survival differences between the causal gene group and the other gene group (Fig. 2a). Among patients with causal genes, the treatment group showed a better prognosis than the untreated group (Fig. 2b). Meanwhile, in the other gene group, there were no statistically significant differences between the treated and untreated groups (Fig. 2c). Among the treated patients, the causal gene group showed significantly better survival than the other gene group (Fig. 2d). In the untreated group, the other gene group showed better survival outcomes compared with the causal gene group (Fig. 2e). We define the optimal policy as treating patients with causal genes and not treating patients without causal genes. The other policy is to treat patients in the other gene group and not to treat patients in the causal gene group. The Kaplan − Meier survival curve of the optimal policy showed better survival outcomes than the other policies (Fig. 2f).

Fig. 2

Discovery of positive modulator genes by C-search in the cBioPortal breast cancer dataset. Nine causal genes are discovered, and patients with causal genes are assigned to the causal gene group. Patients without casual genes are assigned to the other gene group. All Kaplan − Meier survival curves are adjusted with propensity score matching [39]; 95% confidence intervals are depicted, and p-values are noted. a Kaplan − Meier survival curves of the causal gene group and the other gene group. b Treated and untreated patients are compared in the causal gene group. c Treated and untreated patients are compared for the other gene group. d The causal gene and other gene group are compared between treated patients. e The causal gene and other gene group are compared between the untreated patients. f Survival curve following the optimal policy and the other policy is shown

Conventional log-rank analysis of open-source data

To discover causal genes using conventional statistics from the cBioPortal dataset, we performed log-rank survival analysis. As the dataset includes gene mutation profiles of 173 genes, multiple log-rank survival analyses were applied for each mutation profile. For each gene mutation, the patients treated were divided into two groups: one group consisted of patients who harbored the mutation and one group of patients without the mutation. Survival outcomes were compared between the two groups. Propensity score matching was used to balance the covariate profile. We set the FDR to 0.01 and controlled it using the Benjamini − Hochberg procedure [16, 17]. A total of 10 genes were shown to be correlated with positive treatment outcomes: EGFR, CLK3, PTEN, CDH1, GATA3, KRAS, RB1, PRKACG, NEK1, and NRAS. Patients who have any of these are allocated to the causal gene group, and those who do not are assigned to other gene group. Kaplan − Meier survival curves comparing the overall survival differences between both groups are depicted in Fig. 3a. No survival differences were observed between patients with causal genes and without causal genes. In both the causal gene group and the other gene group, treatment was not associated with positive outcomes compared with no treatment (Fig. 3b, c). Meanwhile, in the treated group, the causal gene group showed better survival than the other gene group (Fig. 3d). Among the untreated patients, the causal gene group and the other gene group had no statistically significant differences in survival (Fig. 3e). Following optimal policy demonstrated better survival than following the other policy (Fig. 3f).

Fig. 3

Discovery of positive modulator genes using conventional log-rank analysis in the cBioPortal breast cancer dataset. Ten causal genes are discovered, and patients with causal genes are assigned to the causal gene group. Patients without casual genes are assigned to the other gene group. All Kaplan − Meier survival curves are adjusted with propensity score matching [39]; 95% confidence intervals are depicted, and p-values are noted. a Kaplan − Meier survival curves of the causal gene group and the other gene group. b Treated and untreated patients are compared in the causal gene group. c Treated and untreated patients are compared for the other gene group. d The causal gene and other gene group are compared between the treated patients. e The causal gene group and the other gene group are compared between the untreated patients. f Survival curve following the optimal policy and the other policy is shown

Comparison between C-search and conventional log-rank analysis

We compared the Kaplan − Meier survival curve of the following optimal policy, which was determined by the genes that each algorithm discovered. The survival outcome following the C-search policy showed statistically significant better survival outcomes (Fig. 4).

Fig. 4

Survival differences between the optimal policy determined by C-search and conventional log-rank analysis are shown. The C-search’s policy shows better outcomes than the others. The Kaplan − Meier survival curve is adjusted with propensity score matching; 95% confidence intervals are depicted, and p-values are noted

Validation with external data

To determine whether the genes found by the algorithm will act as causal genes in the other dataset, we used CancerSCAN screening data from Samsung Medical Center. CancerSCAN is a custom panel developed by the Samsung Genomic Institute [20]. The usage of the data was approved by the institutional review boards of the participating institutions (Samsung Medical Center 2019-11-127). CancerSCAN data consists of 559 breast cancer samples obtained at the Samsung Medical Center from January 2014 to September 2016. Mutation profiles of 81 genes and clinical data on age, chemotherapy, radiation therapy, sex, survival time, and survival events were included (Additional file 1). Among the causal genes suggested by C-search, mutation profiles of BRAF, KRAS, CDKN2A, and JAK1 were found in the CancerSCAN dataset. We assigned patients to the C-search causal gene group who acquired at least one of the mutations. The CDH1, EGFR, KRAS, PTEN, and RB1 are genes suggested by log-rank analysis whose mutation profiles exist in the CancerSCAN dataset. Patients with these mutations are assigned to the conventional statistics causal gene group. Figure 5a shows the significant survival difference between the treated and the untreated in the C-search causal gene group. The optimal policy suggested by the C-search showed a significantly better survival outcome (Fig. 5b). Meanwhile, among conventional statistical causal gene group, the treated did not demonstrate statistically better survival compared to the untreated (Fig. 5c). The survival outcomes following the optimal policy suggested by conventional statistics and those of the other policies are not statistically different (Fig. 5d).

Fig. 5

External validation of the causal genes suggested by C-search and conventional log-rank analysis. a Kaplan − Meier survival curve of the treated and the untreated among the C-search causal gene group. b Kaplan − Meier survival curve following and not following C-search optimal policy. c Kaplan − Meier survival curve of the treated and the untreated among the conventional statistics causal gene group. d Kaplan − Meier survival curve following and not following conventional statistics optimal policy

Discussion

Biotechnological breakthroughs in gene profiling have led to an increased focus on individualized precision therapy [21]. Clinicians want to treat patients who will benefit the most from the therapy while avoiding treatment that will not benefit or even get harm from therapy. There are studies on genetic assays to predict prognosis or response to treatment for breast cancer [22]. BluePrint molecular subtyping profile uses 80 genes to determine the sensitivity to adjuvant treatment [23]. Prosigna Breast Cancer Prognostic Gene Signature Assay utilizes the PAM50 test, which identifies gene signatures specific to breast cancer subtypes (luminal A/B, HRE2, basal-like) [24]. Both studies used genomic profiles to infer the molecular subtypes of cancer and drew a correlation between the subtypes and the prognosis or response to treatment. However, since the correlation does not infer causation, we cannot conclude that the specific genomic profile results in a positive response to treatment. In contrast, our C-search algorithm uses a potential outcome framework to study the causal effect of each gene on treatment. The estimation of causal effects from observational studies can be done in a number of ways. Methods based on propensity scores match, stratify, and/or inversely weight covariates that affect treatment allocation [25, 26]. G-computation implements regression models [27]. Mendelian randomization uses germline gene mutations as instruments to make causal inference [28, 29]. These methods are used to estimate average treatment effect of the target population. To estimate and compare the treatment effect of the patient who have a specific genetic mutation, one must calculate conditional causal effect, which is the average treatment effect of a subgroup of patients with that mutation [30]. However, due to the curse of dimensionality in cancer genomics, some subgroups may lack sufficient sample size to draw meaningful conclusions [31]. In addition, when the algorithm handles a smaller dataset, the result drawn from the inference has a considerable amount of uncertainty. It is important to incorporate uncertainty into the prediction of the algorithm [32, 33]. The C-search algorithm updates the gene’s win probability with a Bayesian update, thus, reflecting its uncertainty in the analysis. Pseudocode for the algorithm and the computational complexity are shown in Additional file 1. We demonstrated that the C-search algorithm can identify causal genes from a simulation dataset that includes hidden confounders. Compared to conventional log-rank analysis, C-search requires fewer data to gain sufficient power to find the causal genes. Therefore, the C-search may find more candidate causal genes than conventional association studies using genomic data. When there are not enough data subsets, it is important as we used Bayesian update to estimate the gene’s win probability. Both C-search and log-rank analysis successfully found positive modulators in the Metabric dataset (Figs. 2, 3), yet the gene set found by C-search showed better results than log-rank analysis (Fig. 4). Among the positive regulators that C-search and conventional log-rank analysis found, two genes are found in common in both algorithms: KRAS and PRKACG. The KRAS gene targets several miRNAs to enhance chemotherapy in acute myeloid leukemia, lung cancer, breast cancer, and gallbladder cancer [34-36]. PRKACG is a gene that encodes the protein kinase A subunit Cγ, whose role in cancer has not been elucidated. The causal genes suggested by both algorithms contained relatively few overlapping genes. This may be due to intercorrelated genes, at least to some extent; therefore, one of the co-expressed genes may be selected for the set as a predictor and yield comparable results [37]. Our study has several limitations. There were only a few clinical variables available in the open-source dataset; therefore, hidden confounders may affect the performance of the algorithm. IPW may result in bias in this setting [15, 38]. However, the algorithm performance in the simulation data, including 10 hidden confounders, showed better results than log-rank analysis. As there are no golden standard datasets to evaluate the performance of the algorithm to determine the treatment modulating effect of genetic variables, we can only evaluate the algorithm’s performance indirectly. Using the CancerSCAN dataset, we were able to externally validate that the causal genes suggested by C-search showed comparable results in the external dataset.

Conclusion

We proposed an algorithm that uses a potential outcome framework and Bayesian updating, inferring the causal effect of the genetic variable on treatment outcomes. The proposed algorithm was shown to find causal genes from the simulation data in a relatively small number of samples compared to the log-rank analysis. It also showed its performance in finding positive treatment modulators from the open-source breast cancer dataset, which is validated with external data. The C-search algorithm may be applied to various types of datasets where the number of samples is limited compared to the number of variables. Additional file 1. How to simulate data generation, demographics of cBioPortal dataset, demographics of CancerSCAN dataset, pseudocode of the C-search Algorithm, computational complexity

Individual	Clinical covariates	Genetic covariates	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T$$\end{document}T	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Y\left( {T = 0} \right)$$\end{document}YT=0	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Y\left( {T = 1} \right)$$\end{document}YT=1
\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$i$$\end{document}i	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$V_{i}$$\end{document}Vi	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$G_{i}$$\end{document}Gi	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$T_{i}$$\end{document}Ti	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Y_{i} \left( {T = 0} \right)$$\end{document}YiT=0	\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$Y_{i} \left( {T = 1} \right)$$\end{document}YiT=1

26 in total

1. Reasoning under uncertainty.

Authors: Colin Aitken; Dimitris Mavridis
Journal: Evid Based Ment Health Date: 2019-01-24

2. Applying propensity score methods in medical research: pitfalls and prospects.

Authors: Zhehui Luo; Joseph C Gardiner; Cathy J Bradley
Journal: Med Care Res Rev Date: 2010-05-04 Impact factor: 3.929

3. Pharmacogenomic predictor of sensitivity to preoperative chemotherapy with paclitaxel and fluorouracil, doxorubicin, and cyclophosphamide in breast cancer.

Authors: Kenneth R Hess; Keith Anderson; W Fraser Symmans; Vicente Valero; Nuhad Ibrahim; Jaime A Mejia; Daniel Booser; Richard L Theriault; Aman U Buzdar; Peter J Dempsey; Roman Rouzier; Nour Sneige; Jeffrey S Ross; Tatiana Vidaurre; Henry L Gómez; Gabriel N Hortobagyi; Lajos Pusztai
Journal: J Clin Oncol Date: 2006-08-08 Impact factor: 44.544

Review 4. Human genomics projects and precision medicine.

Authors: F Carrasco-Ramiro; R Peiró-Pastor; B Aguado
Journal: Gene Ther Date: 2017-08-14 Impact factor: 5.250

5. How does multiple testing correction work?

Authors: William S Noble
Journal: Nat Biotechnol Date: 2009-12 Impact factor: 54.908

6. The use of propensity score methods with survival or time-to-event outcomes: reporting measures of effect similar to those used in randomized experiments.

Authors: Peter C Austin
Journal: Stat Med Date: 2013-09-30 Impact factor: 2.373

7. The somatic mutation profiles of 2,433 breast cancers refines their genomic and transcriptomic landscapes.

Authors: Bernard Pereira; Suet-Feung Chin; Oscar M Rueda; Hans-Kristian Moen Vollan; Elena Provenzano; Helen A Bardwell; Michelle Pugh; Linda Jones; Roslin Russell; Stephen-John Sammut; Dana W Y Tsui; Bin Liu; Sarah-Jane Dawson; Jean Abraham; Helen Northen; John F Peden; Abhik Mukherjee; Gulisa Turashvili; Andrew R Green; Steve McKinney; Arusha Oloumi; Sohrab Shah; Nitzan Rosenfeld; Leigh Murphy; David R Bentley; Ian O Ellis; Arnie Purushotham; Sarah E Pinder; Anne-Lise Børresen-Dale; Helena M Earl; Paul D Pharoah; Mark T Ross; Samuel Aparicio; Carlos Caldas
Journal: Nat Commun Date: 2016-05-10 Impact factor: 14.919

8. Prevalence and detection of low-allele-fraction variants in clinical cancer samples.

Authors: Hyun-Tae Shin; Yoon-La Choi; Jae Won Yun; Nayoung K D Kim; Sook-Young Kim; Hyo Jeong Jeon; Jae-Yong Nam; Chung Lee; Daeun Ryu; Sang Cheol Kim; Kyunghee Park; Eunjin Lee; Joon Seol Bae; Dae Soon Son; Je-Gun Joung; Jeeyun Lee; Seung Tae Kim; Myung-Ju Ahn; Se-Hoon Lee; Jin Seok Ahn; Woo Yong Lee; Bo Young Oh; Yeon Hee Park; Jeong Eon Lee; Kwang Hyuk Lee; Hee Cheol Kim; Kyoung-Mee Kim; Young-Hyuck Im; Keunchil Park; Peter J Park; Woong-Yang Park
Journal: Nat Commun Date: 2017-11-09 Impact factor: 14.919

9. Evaluation of conditional treatment effects of adjuvant treatments on patients with synovial sarcoma using Bayesian subgroup analysis.

Authors: Sung Wook Seo; Jisoo Kim; Jihye Son; Sungbin Lim
Journal: BMC Med Inform Decis Mak Date: 2020-12-03 Impact factor: 2.796

10. Comparison of PAM50 risk of recurrence score with oncotype DX and IHC4 for predicting risk of distant recurrence after endocrine therapy.

Authors: Mitch Dowsett; Ivana Sestak; Elena Lopez-Knowles; Kalvinder Sidhu; Anita K Dunbier; J Wayne Cowens; Sean Ferree; James Storhoff; Carl Schaper; Jack Cuzick
Journal: J Clin Oncol Date: 2013-07-01 Impact factor: 44.544