
Bayesian disease classification using copy number data.

Subharup Guha¹, Yuan Ji², Veerabhadran Baladandayuthapani³.

Abstract

DNA copy number variations (CNVs) have been shown to be associated with cancer development and progression. The detection of these CNVs has the potential to impact the basic knowledge and treatment of many types of cancers, and can play a role in the discovery and development of molecular-based personalized cancer therapies. One of the most common types of high-resolution chromosomal microarray is array-based comparative genomic hybridization (aCGH), which assays DNA CNVs across the whole genomic landscape in a single experiment. In this article we propose methods to use aCGH profiles to predict disease states. We employ a Bayesian classification model and treat disease states as the outcome and aCGH profiles as covariates in order to identify significant regions of the genome associated with disease subclasses. We propose a principled two-stage method where we first make inferences on the underlying copy number states associated with the aCGH emissions based on hidden Markov model (HMM) formulations to account for serial dependencies in neighboring probes. Subsequently, we infer associations with disease outcomes, conditional on the copy number states, using Bayesian linear variable selection procedures. The selected probes and their effects are parameters that are useful for predicting the disease categories of any additional individuals on the basis of their aCGH profiles. Using simulated datasets, we investigate the method's accuracy in detecting disease category. Our methodology is motivated by and applied to a breast cancer dataset consisting of aCGH profiles assayed on patients from multiple disease subtypes.


Keywords: Bayesian network; breast cancer; classification; hidden Markov model

Year: 2014 PMID: 25336897 PMCID: PMC4196891 DOI: 10.4137/CIN.S13785

Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351

Introduction

DNA copy number variations (CNVs) have been shown to be associated with cancer development and progression.1 Somatic CNVs can lead to tumorigenesis. For example, loss of copy number in tumor suppressor genes and amplification of oncogenes can both lead to cancer. The detection of these CNVs has the potential to impact the basic knowledge and treatment of many types of cancers, and can play a role in the discovery and development of molecular-based personalized cancer therapies.2 In earlier years, cytogeneticists were limited to visually examining whole genomes with a microscope, a technique known as karyotyping or chromosome analysis. In the mid-1970s and the 1980s, the development and application of molecular diagnostic methods such as Southern blots, polymerase chain reaction (PCR), and fluorescence in situ hybridization (FISH) allowed clinical researchers to make many important advances in genetics, including clinical cytogenetics. However, these techniques have several limitations. First, they are very time consuming and labor intensive, and only a limited number of chromosome regions can be tested simultaneously. Further, because the probes are targeted to specific chromosome regions, the analysis requires prior knowledge of an abnormality and is of limited use for screening complex karyotypes. More recently, scientists have developed techniques, called chromosomal microarrays, that integrate aspects of both traditional and molecular cytogenetic techniques.3 These high-throughput, high-resolution microarrays have allowed researchers to diagnose numerous subtle genome-wide chromosomal abnormalities that were previously undetectable and to find many cytogenetic abnormalities in part or all of a single gene. Such information is useful for biologists to detect new genetic disorders and also provides a better understanding of the pathogenetic mechanisms of many chromosomal aberrations.
One of the most common types of high-resolution chromosomal microarray is array-based comparative genomic hybridization (aCGH), which assays DNA CNVs across the whole genomic landscape in a single experiment.4 With aCGH, the differentially labeled genomic DNAs of test and reference samples are cohybridized to normal chromosomes, and fluorescence intensities/ratios along the length of chromosomes provide a cytogenetic representation of the relative DNA CNV across the whole genome. Whereas early aCGH arrays were mainly used in research settings, recent improvements in algorithms for aCGH data analysis as well as rapidly falling costs now enable clinical applications of aCGH arrays, particularly as a diagnostic tool in the study of cancer genomics.2 In this article, we propose methods to use aCGH profiles to predict disease states. We employ a Bayesian classification model and treat disease states as the outcome and aCGH profiles as covariates in order to identify significant regions of the genome associated with disease subclasses. Statistical challenges for aCGH classification include not only high dimensionality (ie, a large number, in the tens of thousands, of probes) and a relatively small number of samples but also, more importantly, the presence of serial correlation among the features: nearby probes (by genomic location) tend to be highly correlated. Classical methods usually used for multivariate classification of high-dimensional genomic data, eg, penalized approaches (Zhu and Hastie5 and the references therein), do not account for the specific structure of aCGH data, as they ignore the serial dependence in the probes. To exploit the serial genomic information, typical approaches first segment the data6 and then conduct downstream classification.
Alternative methods are based on kernel-based techniques such as the support vector machine (SVM)7 and its variants that exploit genomic continuity.8 While offering excellent prediction capabilities, these methods do not explicitly utilize the inherent discrete nature of the latent copy number states (gain/loss/normal) in their variable selection procedures; doing so is one of the primary aims of this article. In the Bayesian framework, several innovative variable selection strategies have been developed in various contexts, with reasonable degrees of success. Some of these approaches can be regarded as linear variable selection methods. These include stepwise selection,9 penalized regression approaches such as lasso (and its variants),10 and non-concave penalized likelihood approaches.11 The technique applied in this paper is based on Bayesian linear variable selection approaches, including spike and slab mixture priors,12 stochastic search variable selection,13 Gibbs-based variable selection,14 Bayesian model averaging,15,16 and indicator priors.17 The stochastic search variable selection approach of George and McCulloch13 has been extended to multivariate settings by Brown et al.18 and to generalized linear mixed models by Cai and Dunson.19 Effective variable selection methods have also been developed for multinomial probit models by Sha et al.20 and for microarray data with censored outcomes by Lee and Mallick21 and Sha et al.22 However, none of these approaches accounts for the natural spatial/serial dependency in the covariates (as in our case), and ignoring it might lead to biased estimates. In this article we propose a principled two-stage method for disease classification using covariates exhibiting serial dependence. In general, the technique is applicable to datasets having the following structure.
For individuals $i = 1, \ldots, n$, we have (i) two disease categories coded as the binary response $y_i$ and (ii) aCGH emissions $e_{i1}, \ldots, e_{ip}$ corresponding to $p$ probes, with $p$ typically being much larger than $n$. The analysis broadly consists of two stages. In Stage 1, we make inferences on underlying copy number states associated with the aCGH emissions based on hidden Markov model (HMM) formulations23 to account for serial dependencies. Subsequently, in Stage 2, we analyze the model parameters associated with the binary responses, conditional on the parameters discovered in Stage 1, using Bayesian linear variable selection procedures. In particular, we select the aCGH probes having a linear regression relationship with the disease categories. The selected probes and their effects are parameters that are useful for predicting the disease categories of any additional individuals on the basis of their aCGH emissions. Our methodology is motivated by and applied to a dataset consisting of 111 breast cancer patients24 falling into two disease subgroups, ER+ and triple negative (TN). There are 56 TN patients and 55 ER+ patients. For each patient, DNA copy number data were generated using Agilent 4x44K CGH arrays (available at ArrayExpress accession number E-TABM-484). The remainder of the paper is organized as follows. Section 2 provides details of the model for the two-stage analysis. Section 3 develops the posterior inference and prediction technique based on Markov chain Monte Carlo (MCMC) methods. In Section 4, using simulated datasets, we investigate the method’s accuracy in detecting disease category. Finally, Section 5 analyzes the motivating breast cancer dataset and makes test case predictions.

Model

Our modeling framework consists of two stages. In Stage 1, we model the aCGH emissions, relying on HMMs to account for the serial correlations among the emissions. Then, in Stage 2, the relationship between the HMM parameters and the subject-specific binary responses is specified using a probit regression model with latent indicator variables, following the approaches proposed by George and McCulloch,13 Kuo and Mallick,17 and Brown et al.18 We expound on each of these below.

Stage 1: relationship between aCGH emissions and latent copy number states

For subjects $i = 1, \ldots, n$ and probes $j = 1, \ldots, p$, we have the binary responses $y_1, \ldots, y_n$ representing the two disease subcategories and the set of real-valued aCGH emissions $\{e_{ij}\}$. Let $s_{ij} \in \{-1, 0, +1\}$ be a latent variable called the copy number state, representing a loss, no change, or gain in copy number for individual $i$ at probe $j$. The copy number state is inferred using a Bayesian HMM that accounts for the serial correlations of the aCGH emissions. Similarly to Guha et al.,23 conditional on $s_{ij}$, the aCGH emissions are assumed to be normally distributed:

$$e_{ij} \mid s_{ij} \sim N\left(\mu_{s_{ij}}, \sigma^2_{s_{ij}}\right),$$

where, because of the specific biological interpretations associated with the HMM states, we assume that $\mu_{-1} < \mu_0 < \mu_{+1}$. This assumption also prevents label switching, a well-known problem with mixture models, thereby making inferences more efficient. For each subject $i$, the latent states $s_{i1}, \ldots, s_{ip}$ are assumed to follow a three-state HMM with stationary transition probability matrix $A = ((a_{uv}))_{3 \times 3}$ having row sums $\sum_{v = 1,2,3} a_{uv} = 1$ for $u = 1, 2, 3$. That is, $P[s_{i,j+1} = t \mid s_{ij} = u] = a_{ut}$ for $j = 1, \ldots, (p - 1)$. To further facilitate inferences of the state-specific parameters, informative conjugate priors are assigned to the parameters of the normal distribution, ie, $\mu_s$ and $\sigma^2_s$ for $s \in \{-1, 0, +1\}$. Refer to Guha et al.23 for further details about MCMC inference of the underlying copy number states of the probes for the individuals. The technique developed in that paper is applied to infer the latent copy number states (gain/loss/normal) $s_{i1}, \ldots, s_{ip}$ for subjects $i = 1, \ldots, n$, which are subsequently used in Stage 2 below.
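To make the Stage 1 structure concrete, the following minimal sketch decodes the most probable state path of a three-state Gaussian HMM for one subject. It uses the Viterbi algorithm with fixed parameters, which is a simpler stand-in and not the Bayesian MCMC inference of Guha et al. used in the paper. The means, standard deviations, transition matrix, initial distribution, and emissions below are illustrative assumptions only.

```python
import math

STATES = (-1, 0, 1)  # loss, no change, gain


def log_normal_pdf(x, mean, sd):
    """Log density of N(mean, sd^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def viterbi(emissions, means, sds, trans, init):
    """Most probable copy number state path for one subject's emissions.

    trans[u][t] plays the role of a_ut; init is an assumed initial distribution.
    """
    k = len(STATES)
    score = [math.log(init[u]) + log_normal_pdf(emissions[0], means[u], sds[u])
             for u in range(k)]
    back = []  # back[j][t]: best predecessor of state t at probe j + 1
    for e in emissions[1:]:
        prev, pointers, score = score, [], []
        for t in range(k):
            best_u = max(range(k), key=lambda u: prev[u] + math.log(trans[u][t]))
            pointers.append(best_u)
            score.append(prev[best_u] + math.log(trans[best_u][t])
                         + log_normal_pdf(e, means[t], sds[t]))
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(range(k), key=lambda u: score[u])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [STATES[u] for u in path]


# Illustrative (assumed) parameters respecting mu_{-1} < mu_0 < mu_{+1}.
means = [-1.0, 0.0, 1.0]
sds = [0.3, 0.3, 0.3]
trans = [[0.90, 0.05, 0.05],
         [0.05, 0.90, 0.05],
         [0.05, 0.05, 0.90]]
init = [1 / 3, 1 / 3, 1 / 3]
emissions = [0.05, -0.10, 0.95, 1.10, 0.90, 0.02, -1.05, -0.90]
decoded = viterbi(emissions, means, sds, trans, init)
```

The high self-transition probabilities encode the serial dependence between neighboring probes: an isolated noisy emission is less likely to flip the decoded state than it would be under an independent mixture model.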

Stage 2: relationship between disease classification and latent copy number states

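In the second stage of the analysis, we model the relationship between the disease category and the latent copy number states $s_{ij}$ of the genomic probes for each individual. These values are the copy number states inferred from the analysis in Section 2.1. Let $I(s_{ij} = -1)$ and $I(s_{ij} = +1)$ be indicator functions of loss and gain. To simplify the notation, for subjects $i = 1, \ldots, n$, we collectively represent the vector of $2p$ covariates as $w_i = (w_{i1}, \ldots, w_{i,2p})'$. For covariate $j = 1, \ldots, 2p$, averaging over the individuals, let $\bar{w}_j = \frac{1}{n}\sum_{i=1}^{n} w_{ij}$. Centering and scaling over the $n$ individuals, we transform the covariates $w_{ij}$ into $v_{ij}$ using the within-covariate mean $\bar{w}_j$ and variance. Let $Q$ be the set of covariates $j$ for which $w_{1j}, \ldots, w_{nj}$ assume at least two distinct values. That is, $Q = \{j : w_{ij} \neq w_{i'j} \text{ for some } i \neq i'\}$. Because the variables $v_{ij}$ are centered, $j \notin Q$ if and only if $v_{1j} = \cdots = v_{nj} = 0$. A key assumption of our model is that covariates that do not belong to $Q$, ie, those for which $w_{1j}, \ldots, w_{nj}$ do not assume at least two distinct values, are not predictive of disease subcategory, although the probes could possibly be predictive of the disease. For this reason, we identify $Q$ as the set of potential predictors of disease subcategory and write $q = |Q| \leq 2p$. We discard all covariates $j \notin Q$, relabeling the variables $\{v_{ij}: j \in Q\}$ as $\{x_{ij}: j = 1, \ldots, q\}$. For individuals $i = 1, \ldots, n$, we assume the probit regression model proposed by Albert and Chib25:

$$P[y_i = 1 \mid \beta] = \Phi\left(\beta_0 + \sum_{j=1}^{q} x_{ij}\beta_j\right). \qquad (1)$$

Equivalently, $y_i = I(z_i > 0)$ for independent latent variables $z_i \sim N\left(\beta_0 + \sum_{j=1}^{q} x_{ij}\beta_j, 1\right)$. For the intercept $\beta_0$, we assume the prior $\beta_0 \sim N(0, \tau_0^2)$. Let $\gamma = (\gamma_1, \ldots, \gamma_q)'$ be i.i.d. Bernoulli variables with $P[\gamma_j = 1] = \omega$, where $\omega$ is expected to be relatively small and is assigned the uniform prior on $(0, 0.1)$. The remaining coefficients in (1) are independently distributed as $\beta_j \mid \gamma_j \sim \gamma_j N(0, \tau^2) + (1 - \gamma_j)\delta_0$, where $\delta_0$ denotes the point mass at 0. In other words, each covariate is predictive of disease classification with probability $\omega$. We assume independent exponential priors with mean 1 for $\tau_0^{-2}$ and $\tau^{-2}$.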
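The covariate construction can be sketched as follows. This is a minimal illustration rather than the authors' code. Three choices are assumptions: the function name `build_covariates`, the column order (all loss indicators followed by all gain indicators), and the scaling of each retained column to unit sum of squares, which is picked so that $x_j'x_j = 1$ in agreement with the determinant $1 + \tau^2$ quoted in the Gibbs sampling section.

```python
import math


def build_covariates(states):
    """Turn copy number states into standardized covariates.

    states: n lists, each holding p copy number states in {-1, 0, 1}.
    Returns (x, kept): x[i] is the covariate vector of subject i, and kept
    lists the retained column indices (the set Q) among the 2p indicators.
    """
    n, p = len(states), len(states[0])
    # 2p indicators per subject: p loss indicators, then p gain indicators
    # (this ordering is an assumption made for the sketch).
    w = [[int(s == -1) for s in row] + [int(s == 1) for s in row]
         for row in states]
    kept, columns = [], []
    for j in range(2 * p):
        col = [w[i][j] for i in range(n)]
        if len(set(col)) < 2:
            continue  # constant column: not in Q, discarded
        mean = sum(col) / n
        norm = math.sqrt(sum((c - mean) ** 2 for c in col))
        kept.append(j)
        columns.append([(c - mean) / norm for c in col])
    x = [[column[i] for column in columns] for i in range(n)]
    return x, kept


# Tiny illustrative input: 3 subjects, 3 probes.
example_states = [[-1, 0, 1],
                  [0, 0, 1],
                  [-1, 0, 0]]
x, kept = build_covariates(example_states)
```

In this toy input only the loss indicator of probe 1 and the gain indicator of probe 3 vary across subjects, so only those two of the six indicator columns are retained.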

Gibbs Sampling Procedure

Let $\rho = 1 + \sum_{j=1}^{q} \gamma_j$ be the random number of variables (including the intercept $\beta_0$) that participate in the disease classification. Let $r_{ij} = z_i - \sum_{k \neq j} x_{ik}\beta_k$ for $i = 1, \ldots, n$, where the sum runs over $k = 0, 1, \ldots, q$ with $x_{i0} = 1$ for the intercept. For a set of numbers $\{\theta_{ij}: i = 1, \ldots, n, j = 1, \ldots, q\}$, let $\theta_j$ represent the vector $(\theta_{1j}, \ldots, \theta_{nj})'$ for probe $j = 1, \ldots, q$. Although the Gibbs sampler is conceptually straightforward, updating $\gamma$ can be computationally intensive for large $q$. The step is described as follows. For probe $j = 1, \ldots, q$, let $\beta_{-j}$ represent the set of regression coefficients excluding $\beta_j$. With $I_n$ denoting the identity matrix of order $n$ and $\Sigma_j = I_n + \tau^2 x_j x_j'$, the conditional posterior probability of $\gamma_j$ given $\beta_{-j}$ and the remaining parameters is proportional to $(1 - \omega) \cdot N(r_j \mid 0, I_n)$ when $\gamma_j = 0$ and is proportional to $\omega \cdot N(r_j \mid 0, \Sigma_j)$ when $\gamma_j = 1$. The density $N(r_j \mid 0, I_n)$ can be quickly computed even in large problems. However, the density $N(r_j \mid 0, \Sigma_j)$ involves the inversion and determinant calculation for the non-diagonal matrix $\Sigma_j$. Because this must be performed iteratively for every probe $j$, it can be computationally expensive, or can at least involve large amounts of memory, when $q$ is large. Theorem 7.1 of the Appendix exploits the rank-one structure of $\Sigma_j$ to drastically simplify the computation; the resulting probe-specific quantity $L_1$ is defined in (2). Applying Theorem 7.1, we have $\det(\Sigma_j) = 1 + \tau^2$, and $N(r_j \mid 0, \Sigma_j)$ can be evaluated without inverting $\Sigma_j$. The calculation is feasible even for large $q$.
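The kind of shortcut described above can be illustrated with two standard identities for a rank-one update of the identity matrix: the matrix determinant lemma, $\det(I_n + \tau^2 xx') = 1 + \tau^2 x'x$, and the Sherman-Morrison formula, $(I_n + \tau^2 xx')^{-1} = I_n - \tau^2 xx'/(1 + \tau^2 x'x)$. The sketch below assumes the covariance has this rank-one form. It is not a transcription of Theorem 7.1, whose statement is not reproduced here, and all function names are hypothetical.

```python
import math


def log_density_rank_one(r, x, tau2):
    """log N(r | 0, I + tau2 * x x') using only inner products (O(n) work)."""
    n = len(r)
    xx = sum(v * v for v in x)
    xr = sum(a * b for a, b in zip(x, r))
    rr = sum(v * v for v in r)
    det = 1.0 + tau2 * xx                  # matrix determinant lemma
    quad = rr - tau2 * xr * xr / det       # r' Sigma^{-1} r via Sherman-Morrison
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def log_density_direct_2d(r, x, tau2):
    """Same density for n = 2, computed from the explicit 2 x 2 inverse."""
    a = 1.0 + tau2 * x[0] * x[0]
    d = 1.0 + tau2 * x[1] * x[1]
    b = tau2 * x[0] * x[1]
    det = a * d - b * b
    quad = (d * r[0] ** 2 - 2 * b * r[0] * r[1] + a * r[1] ** 2) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def inclusion_probability(r, x, tau2, omega):
    """P[gamma_j = 1 | rest], normalizing the two weighted densities."""
    log1 = math.log(omega) + log_density_rank_one(r, x, tau2)
    # With x replaced by zeros the covariance is I_n, giving N(r | 0, I_n).
    log0 = math.log(1.0 - omega) + log_density_rank_one(r, [0.0] * len(r), tau2)
    diff = log1 - log0
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    return math.exp(diff) / (1.0 + math.exp(diff))


# Illustrative values; x has unit norm, so det = 1 + tau2.
r_example = [0.3, -1.2]
x_example = [0.6, 0.8]
prob = inclusion_probability(r_example, x_example, tau2=2.0, omega=0.05)
```

Because only the inner products $x'x$, $x'r$, and $r'r$ are needed, no $n \times n$ matrix is ever formed, which is what keeps the per-probe update cheap when $q$ is large.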

Outline of procedure

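Let $F \cdot I(c, d)$ denote the distribution $F$ restricted to the interval $(c, d)$. The Gibbs sampler consists of the following steps:

1. Applying Theorem 7.1, the binary indicators $\gamma_j$ for probes $j = 1, \ldots, q$ are updated from their conditional posterior probabilities, where $L_1$ is as defined in (2).
2. Writing $x_i = (1, x_{i1}, \ldots, x_{iq})'$ for individuals $i = 1, \ldots, n$, the subject-specific latent variables $z_i$ are independently distributed as $N(x_i'\beta, 1) \cdot I(0, \infty)$ if $y_i = 1$ and as $N(x_i'\beta, 1) \cdot I(-\infty, 0)$ if $y_i = 0$.
3. Let $\beta_\gamma$ be the elements of $\beta$ corresponding to the intercept and to the set of probes $j$ for which $\gamma_j = 1$, so that $\beta_\gamma$ has $\rho$ elements. Vector $\beta_\gamma$ is jointly updated from its $\rho$-variate normal full conditional, which depends on $z$ and on $U$, an $n \times \rho$ matrix with the first column equal to a vector of $n$ 1’s and the remaining columns equal to the vectors $x_j$ for which $\gamma_j = 1$.
4. $\tau_0^{-2}$ is updated from its gamma full conditional.
5. $\tau^{-2}$ is updated from its gamma full conditional.
6. $\omega \mid \gamma$ is distributed as beta$(\rho, q - \rho + 1) \cdot I(0, 0.1)$.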
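Two of these updates are simple enough to sketch with the Python standard library. The latent-variable draw uses inverse-CDF sampling from a unit-variance normal restricted to the positive or negative half-line, and the $\omega$ draw uses rejection sampling from the beta distribution restricted to $(0, 0.1)$. The function names and the choice of sampling schemes are assumptions made for this sketch and are not taken from the paper.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf and inverse cdf


def sample_latent_z(linear_predictor, y, rng):
    """Draw z ~ N(x'beta, 1) restricted to (0, inf) if y == 1, else (-inf, 0)."""
    p_neg = STD_NORMAL.cdf(-linear_predictor)  # P[z < 0]
    if y == 1:
        u = p_neg + (1.0 - p_neg) * rng.random()
    else:
        u = p_neg * rng.random()
    # Keep u strictly inside (0, 1) so that inv_cdf is defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    z = linear_predictor + STD_NORMAL.inv_cdf(u)
    # Guard against round-off in the extreme tails: enforce the sign constraint.
    return max(z, 1e-12) if y == 1 else min(z, -1e-12)


def sample_omega(rho, q, rng, upper=0.1):
    """Draw omega from beta(rho, q - rho + 1) restricted to (0, upper)."""
    while True:
        draw = rng.betavariate(rho, q - rho + 1)
        if draw < upper:
            return draw


rng = random.Random(0)
z_case = sample_latent_z(0.4, 1, rng)     # positive by construction
z_control = sample_latent_z(0.4, 0, rng)  # negative by construction
omega = sample_omega(rho=5, q=500, rng=rng)
```

Rejection sampling for $\omega$ is efficient when $\rho$ is small relative to $q$, since most of the beta mass then lies below 0.1. It can become slow otherwise, in which case an inverse-CDF scheme would be preferable.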

Test case predictions

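Suppose we have the aCGH profiles of $n^*$ additional test case individuals from the same hypothetical disease population. Using the within-variable means and variances of the training sample, we transform the aCGH profiles to obtain the covariates $x^*_{i^*} = (1, x^*_{i^*1}, \ldots, x^*_{i^*q})'$ for individuals $i^* = 1, \ldots, n^*$ belonging to the test sample. Let $D$ represent the training set data. The posterior probability that individual $i^*$ belongs to disease category 1 is

$$P[y^*_{i^*} = 1 \mid x^*_{i^*}, D] = \int \Phi\left(x^{*\prime}_{i^*}\beta\right) p(\beta \mid D)\, d\beta.$$

A consistent (in simulation size) estimate of this probability is then

$$\hat{P}[y^*_{i^*} = 1 \mid x^*_{i^*}, D] = \frac{1}{M}\sum_{m=1}^{M} \Phi\left(x^{*\prime}_{i^*}\beta^{(m)}\right),$$

where $\beta^{(m)}$ is the value generated at the $m$th MCMC iterate, $m = 1, \ldots, M$. We declare the disease category $\hat{y}_{i^*}$ of the test case individual labeled $i^*$ on the basis of this estimated probability; this classification rule is equation (3).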
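The Monte Carlo estimate above amounts to averaging probit probabilities over the stored draws. The sketch below assumes each stored draw is a full-length coefficient vector (intercept first, zeros for excluded probes). The classification threshold of 0.5 is likewise an assumption, since the exact rule is not recoverable from the text here.

```python
from statistics import NormalDist


def predict_probability(x_star, beta_draws):
    """Estimate P[y* = 1 | x*, D] by averaging Phi(x*' beta) over MCMC draws."""
    phi = NormalDist().cdf
    total = 0.0
    for beta in beta_draws:
        total += phi(sum(a * b for a, b in zip(x_star, beta)))
    return total / len(beta_draws)


def classify(probability, threshold=0.5):
    """Declare category 1 when the estimated probability exceeds the threshold."""
    return 1 if probability > threshold else 0


# Illustrative test case: intercept term plus two standardized covariates.
x_star = [1.0, 0.5, -1.0]
beta_draws = [[2.0, 0.0, 0.0],
              [1.5, 0.8, 0.0],
              [2.2, 0.0, -0.3]]
p_hat = predict_probability(x_star, beta_draws)
y_hat = classify(p_hat)
```

Averaging over draws, rather than plugging in a single point estimate of $\beta$, propagates the posterior uncertainty about both which probes are selected and the size of their effects.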

Simulation Study

We generated a training sample consisting of aCGH profiles on $p = 2000$ probes for $n = 100$ individuals. The individuals were regarded as random draws from a disease population where $100 \times (1 - p^*) = 25\%$ of the individuals had “disease 0” and the remaining $100 \times p^* = 75\%$ of the individuals had “disease 1,” so that $p^* = 0.75$ represented the prior probability of disease 1 in the population. Disease 0 was assumed to be characterized by losses ($s_{ij} = -1$) from probes 201 to 400 and gains ($s_{ij} = 1$) from probes 1401 to 1800. Disease 1 was characterized by losses from probes 301 to 500 and also from probes 1601 to 1800. The remaining probes were assigned a copy number state of 0. For each disease subcategory, we randomly selected 10% of the probes that were associated with the disease and randomly set their copy number states to be copy neutral, gains, or losses with equal probability. Additionally, random noise at the probe level was then added to the profiles by selecting 2% (ie, 4000) of the remaining probes and randomly changing their copy number states. These values constituted the variables $s_{ij}$ in Stage 2 of the Section 2 model, and were assumed to be known in the simulation. As described in Section 2, the variables were then transformed to obtain the covariates $w_{ij}$ and $v_{ij}$ for $i = 1, \ldots, n$ and $j = 1, \ldots, 2p$. The set $Q$ was evaluated to identify $q = 2571$ covariates for which the individuals had at least two distinct values. These variables were relabeled as $\{x_{ij}: j = 1, \ldots, q\}$, and the remaining variables were discarded. The model was fit using the Gibbs sampler of Section 3. An initial set of 10,000 samples was run to allow the MCMC chain to forget its starting values. A 1-in-10 subsample of M = 100,000 additional draws was stored for posterior inferences. Figure 1 presents histograms for the marginal posteriors of the intercept $\beta_0$, standard deviations $\tau_0$ and $\tau$, and Bernoulli probability $\omega$, which are used in the sequel to make predictions for the disease categories of the test case individuals.
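The data-generating scheme can be sketched as below. Several details are assumptions made for illustration because the text leaves them open: the 10% perturbation is applied separately to each individual, and the 2% noise is drawn from all $n \times p$ entries rather than only the "remaining" ones. The function and constant names are hypothetical.

```python
import random

P = 2000  # number of probes
# (first probe, last probe, state) blocks per disease, 1-based and inclusive.
SIGNATURES = {
    0: [(201, 400, -1), (1401, 1800, 1)],
    1: [(301, 500, -1), (1601, 1800, -1)],
}


def simulate_states(disease, rng, perturb_fraction=0.10):
    """Copy number states for one individual with the given disease."""
    states = [0] * P
    associated = []
    for first, last, value in SIGNATURES[disease]:
        for probe in range(first, last + 1):
            states[probe - 1] = value
            associated.append(probe - 1)
    # Reassign a fraction of the disease-associated probes uniformly at random.
    k = round(perturb_fraction * len(associated))
    for index in rng.sample(associated, k):
        states[index] = rng.choice((-1, 0, 1))
    return states


def simulate_cohort(n, prior_disease_1, rng):
    """Draw disease labels and the corresponding state profiles."""
    labels = [1 if rng.random() < prior_disease_1 else 0 for _ in range(n)]
    return labels, [simulate_states(y, rng) for y in labels]


def add_noise(cohort, rng, fraction=0.02):
    """Change the state of a random fraction of all probe-level entries."""
    n, p = len(cohort), len(cohort[0])
    for cell in rng.sample(range(n * p), round(fraction * n * p)):
        i, j = divmod(cell, p)
        cohort[i][j] = rng.choice([s for s in (-1, 0, 1) if s != cohort[i][j]])


rng = random.Random(1)
labels, cohort = simulate_cohort(100, 0.75, rng)
add_noise(cohort, rng)
```

With $n = 100$ and $p = 2000$, the noise step touches 2% of 200,000 entries, which matches the figure of 4000 quoted in the text.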

Figure 1

Histogram of selected model parameters for the simulation study.

We evaluated the predictive ability of our approach by drawing 50 independent test samples of $n^* = 200$ individuals from the same hypothetical disease population and generating their aCGH profiles based on their disease categories. Exactly 50 of these 200 test case individuals had disease 0, and the remaining 150 individuals had disease 1. Using the within-variable means and variances of each training sample, we transformed the aCGH profiles to obtain the covariates $x^*_{i^*} = (1, x^*_{i^*1}, \ldots, x^*_{i^*q})'$ for individuals $i^* = 1, \ldots, n^*$ belonging to the test sample of each of the 50 datasets. For each dataset, using the stored MCMC sample of size M = 100,000 and as described in Section 3, we computed the posterior probability of disease 1 for each of the $n^* = 200$ individuals. The estimated disease categories $\hat{y}_{i^*}$ for the $n^* = 200$ individuals were computed as in (3). These values versus the true disease categories $y^*_{i^*}$ are summarized in Table 1. The table reveals the remarkable accuracy of the proposed methodology in detecting disease category. Specifically, for all 50 datasets, the technique resulted in perfect disease prediction with no false classification.

Table 1

For the 200 individuals belonging to each of the 50 test samples of the simulation study, the estimated disease category versus the true category, averaged over the 50 test samples. Perfect classification was obtained for each dataset. As a result, the standard errors shown in parentheses are all zero.

	Estimated ŷ_i* = 0	Estimated ŷ_i* = 1
True y_i* = 0	50 (0)	0 (0)
True y_i* = 1	0 (0)	150 (0)

Breast Cancer Data Analysis

We analyzed the breast cancer dataset from Andre et al.,24 which consists of n = 111 individuals with either disease subcategory ER+ (label “1”) or TN (label “0”). There are 56 TN and 55 ER+ patients. aCGH emissions for these individuals were available on the same set of p = 42,416 probes along with the probes’ locations. Specifically, the chromosome and the distance in megabases (MB) from a telomere are available for every probe. As described in Section 2.1, we used this information to first infer the latent copy number states $s_{ij}$ of the probes using a Bayesian HMM, where $i = 1, \ldots, 111$ and $j = 1, \ldots, 42{,}416$. Then, as described in Section 2.2, we obtained the indicator functions of loss and gain. These indicator variables were transformed to obtain the covariates $w_{ij}$ and $v_{ij}$ for $i = 1, \ldots, n$ and $j = 1, \ldots, 84{,}832$. The set $Q$ was evaluated to identify q = 5,543 covariates having at least two distinct values for the 111 individuals. These variables were relabeled as $\{x_{ij}: j = 1, \ldots, 5{,}543\}$ and retained as potential regressors. The remaining variables were discarded because they were unlikely to be associated with the subcategory classification. To investigate the reliability of the proposed method on this actual dataset, we performed 50 independent replications of the following steps. (i) We randomly split the data into training and test sets in a 4:1 ratio, giving 89 training and 22 test individuals. (ii) We analyzed the disease subcategories and the q = 5,543 covariates of the 89 training set individuals using the Bayesian probit regression model with likelihood function (1). The model was fit using the Gibbs sampler of Section 3. An initial set of 10,000 samples was run to allow the MCMC chain to forget its initial values. A 1-in-10 subsample of M = 100,000 additional draws was stored for posterior inferences. (iii) As described in Section 3, we used the q = 5,543 covariates of the 22 test case individuals to predict their disease subcategories.
These predictions were compared with the actual disease subcategories of these 22 individuals to compute the classification error rate for the specific training/test random split. An average of the 50 independent estimates in Step (iii) yielded a simulation-based estimate of the classification error rate for the proposed method. This was estimated to be 22.55% with a standard error of 1.16%. The significant probes (covariates) that were found to be predictive of disease subtype are plotted in Figures 2–4. We assumed a posterior probability threshold of δ = 0.15, which yielded 500 markers along the entire genome predictive of the disease classification. Figure 2 plots a bar graph of the chromosomal breakdown of these markers. As can be seen, most of the significant markers are located on chromosomes 5, 12, 16, and 17. The corresponding karyograms in Figures 3 and 4 show the breakdown of the markers by chromosomal location for negative (red) and positive (green) associations with the disease states, respectively.
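The resampling arithmetic behind this estimate can be sketched as follows. The helper names are hypothetical, and the standard error is assumed to be the usual standard error of the mean over the replicate error rates; the text does not state how it was computed.

```python
import math
import random


def split_indices(n, rng, test_fraction=0.2):
    """Random 4:1 split of n individuals into training and test indices."""
    order = list(range(n))
    rng.shuffle(order)
    n_test = round(n * test_fraction)
    return order[n_test:], order[:n_test]


def mean_and_standard_error(values):
    """Mean of replicate error rates and the standard error of that mean."""
    m = len(values)
    mean = sum(values) / m
    variance = sum((v - mean) ** 2 for v in values) / (m - 1)
    return mean, math.sqrt(variance / m)


rng = random.Random(0)
train, test = split_indices(111, rng)

# Placeholder replicate error rates, for illustration only (not study results).
example_rates = [0.2, 0.3]
avg_error, std_error = mean_and_standard_error(example_rates)
```

For n = 111 this split yields 89 training and 22 test individuals, matching the sizes reported in the text.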

Figure 2

Number of significant markers broken down for each chromosome.

Figure 3

Human karyogram with significant locations. This figure is a karyogram that depicts the significant probes identified using our approach. The red color corresponds to negative regression coefficients.

Figure 4

Human karyogram with significant locations. This figure is a karyogram that depicts the significant probes identified using our approach. The green color corresponds to positive regression coefficients.

Our results are promising based on the locations of selected markers. As noted, most markers are on chromosomes 5, 12, 16, and 17. It has been shown that chromosome 5q deletions are the most frequent aberration in breast tumors from BRCA1 mutation carriers. The deletions in 5q occur at high frequencies on putative tumor suppressor genes such as XRCC4, RAD50, RASA1, APC, and PPP2R2B.26 Chromosome 16q has been a target region for the detection of biomarkers for breast cancer.24 We identified a high concentration of biomarkers in 16q as well. In addition, our flagged biomarkers on chromosome 17 are also convincing, since chromosome 17 is the host for the most famous breast cancer gene BRCA1 as well as ER. Interestingly, little is known about the association of CNVs on chromosome 12 with subgroups of breast cancer. Our findings on chromosome 12 could be potentially new discoveries that might warrant further functional validation.

Conclusions and Discussion

The detection of CNVs using aCGH methods is important for the treatment of many types of cancers, especially in the development of molecular-based personalized cancer therapies. We propose a framework for the prediction of disease types using aCGH profiles. We employ a Bayesian classification model and treat disease states as the outcome and aCGH profiles as covariates in order to identify significant regions of the genome associated with disease subclasses. Specifically, we propose a principled two-stage method using the covariates exhibiting serial dependence. Stage 1 makes inferences on the underlying copy number states associated with the aCGH emissions based on an HMM formulation. Using Bayesian linear variable selection procedures, Stage 2 detects the model parameters associated with the binary responses, conditional on the parameters of Stage 1. The selected probes and their effects are parameters that are useful for predicting the disease categories of any additional individuals on the basis of their copy number profiles. A simulation study demonstrates the method’s remarkable accuracy in detecting disease category. The methodology is applied to a breast cancer dataset, and we find several markers that are associated with disease subtype using the copy number profiles. Some of these discoveries confirm existing literature, and novel associations could be potential targets for future validation studies. Our methods are general and could potentially also be applied to SNP arrays, which likewise yield copy number profiles. A natural generalization of the method would be to incorporate genotype information (eg, allelic frequencies) in the models (especially Stage 1), which could lead to more refined estimation of the latent copy number states.
Furthermore, current technologies enable collection of multiplatform data on matched patient samples such as mRNA expression (eg, The Cancer Genome Atlas (TCGA)) that can be leveraged to provide a more detailed understanding of the biological mechanisms involved in cancer development and progression. We leave these tasks for future consideration.

References

1. Bayesian variable selection in multinomial probit models to identify molecular signatures of disease stage.

Authors: Naijun Sha; Marina Vannucci; Mahlet G Tadesse; Philip J Brown; Ilaria Dragoni; Nick Davies; Tracy C Roberts; Andrea Contestabile; Mike Salmon; Chris Buckley; Francesco Falciani
Journal: Biometrics Date: 2004-09 Impact factor: 2.571

2. A comparison study: applying segmentation to array CGH data for downstream analyses.

Authors: Hanni Willenbrock; Jane Fridlyand
Journal: Bioinformatics Date: 2005-09-13 Impact factor: 6.937

3. Array comparative genomic hybridization and its applications in cancer.

Authors: Daniel Pinkel; Donna G Albertson
Journal: Nat Genet Date: 2005-06 Impact factor: 38.330

4. Bayesian variable selection for the analysis of microarray data with censored outcomes.

Authors: Naijun Sha; Mahlet G Tadesse; Marina Vannucci
Journal: Bioinformatics Date: 2006-07-15 Impact factor: 6.937

5. Making sense of cancer genomic data.

Authors: Lynda Chin; William C Hahn; Gad Getz; Matthew Meyerson
Journal: Genes Dev Date: 2011-03-15 Impact factor: 11.361

6. The lasso method for variable selection in the Cox model.

Authors: R Tibshirani
Journal: Stat Med Date: 1997-02-28 Impact factor: 2.373

7. Chromosome 5 imbalance mapping in breast tumors from BRCA1 and BRCA2 mutation carriers and sporadic breast tumors.

Authors: Hrefna K Johannsdottir; Goran Jonsson; Gudrun Johannesdottir; Bjarni A Agnarsson; Hannaleena Eerola; Adalgeir Arason; Paivi Heikkila; Valgardur Egilsson; Hakan Olsson; Oskar Th Johannsson; Heli Nevanlinna; Ake Borg; Rosa B Barkardottir
Journal: Int J Cancer Date: 2006-09-01 Impact factor: 7.396

8. Molecular characterization of breast cancer with high-resolution oligonucleotide comparative genomic hybridization array.

Authors: Fabrice Andre; Bastien Job; Philippe Dessen; Attila Tordai; Stefan Michiels; Cornelia Liedtke; Catherine Richon; Kai Yan; Bailang Wang; Gilles Vassal; Suzette Delaloge; Gabriel N Hortobagyi; W Fraser Symmans; Vladimir Lazar; Lajos Pusztai
Journal: Clin Cancer Res Date: 2009-01-15 Impact factor: 12.531

9. Classification of arrayCGH data using fused SVM.

Authors: Franck Rapaport; Emmanuel Barillot; Jean-Philippe Vert
Journal: Bioinformatics Date: 2008-07-01 Impact factor: 6.937

10. Classification of gene microarrays by penalized logistic regression.

Authors: Ji Zhu; Trevor Hastie
Journal: Biostatistics Date: 2004-07 Impact factor: 5.899
