
Data-driven hypothesis weighting increases detection power in genome-scale multiple testing.

Nikolaos Ignatiadis¹, Bernd Klaus¹, Judith B Zaugg¹, Wolfgang Huber¹.

Abstract

Hypothesis weighting improves the power of large-scale multiple testing. We describe independent hypothesis weighting (IHW), a method that assigns weights using covariates independent of the P-values under the null hypothesis but informative of each test's power or prior probability of the null hypothesis (http://www.bioconductor.org/packages/IHW). IHW increases power while controlling the false discovery rate and is a practical approach to discovering associations in genomics, high-throughput biology and other large data sets.

Year: 2016 PMID: 27240256 PMCID: PMC4930141 DOI: 10.1038/nmeth.3885

Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547

Multiple testing is an important part of many high-throughput data analysis workflows. A common objective is to maximize the number of discoveries while controlling the false discovery rate (FDR), i.e., the expected fraction of false discoveries. Commonly used procedures, such as that of Benjamini and Hochberg, achieve this objective by working solely off the list of p-values [1-5]. However, such an approach has suboptimal power when the individual tests differ in their statistical properties, such as sample size, true effect size, signal-to-noise ratio, or prior probability of the null hypothesis being false. For example, in RNA-seq differential expression analysis, each test is associated with a different gene, and because the number of mapped reads differs between genes, the genes differ greatly in their signal-to-noise ratio. In genome-wide association studies (GWAS), associations are sought between genetic polymorphisms and phenotypic traits; however, the power to detect an association is lower for rarer polymorphisms (all else being equal). In GWAS of gene expression phenotypes (eQTL), cis-effects are a priori more likely than associations between a gene product and a distant polymorphism.

To take into account such differences in the statistical properties of the tests, one can associate each test with a weight, a non-negative number that measures its priority (Supplementary Note 1). The weights fulfill a budget criterion, commonly that they average to one. Hypotheses with higher weights get prioritized [6]. The procedure of Benjamini and Hochberg (BH) [1] can be modified to allow weighting simply by replacing the original p-values $p_i$ with their weighted versions $p_i / w_i$ (where $w_i$ is the weight of hypothesis $i$) [6]. This approach controls the FDR if the weights are pre-specified and thus independent of the data. However, the optimal choice of weights is rarely known in practice, and a generally applicable data-driven method would be desirable [7-11].

Independent hypothesis weighting (IHW) is a multiple testing procedure that applies the weighted BH method [6] using weights derived from the data. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative about the statistical properties of the hypothesis tests, while being independent of the p-value under the null hypothesis [9]. Such covariates exist in many applications and are often apparent to domain experts (Table 1). The conditional independence property can be verified either mathematically [9] or empirically [12]. Simple diagnostic plots of the data can help assess these assumptions (Fig. 1).
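
The weighted BH procedure described above, with pre-specified weights, can be sketched in a few lines. This is a minimal illustration in plain Python rather than the code of the IHW package; the function name `weighted_bh` and the toy p-values and weights are assumptions made up for this example.

```python
def weighted_bh(pvalues, weights, alpha):
    """Weighted Benjamini-Hochberg: return one True/False rejection flag per hypothesis."""
    m = len(pvalues)
    if len(weights) != m:
        raise ValueError("need exactly one weight per p-value")
    if m == 0:
        return []
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) / m - 1.0) > 1e-8:
        raise ValueError("weights must average to one (budget criterion)")

    # Weighted p-values p_i / w_i; a zero weight means the hypothesis is never rejected.
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights)]

    # BH step-up rule on the weighted p-values: find the largest rank k such that
    # the k-th smallest weighted p-value is at most k * alpha / m.
    order = sorted(range(m), key=lambda i: q[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= rank * alpha / m:
            k_max = rank

    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Toy data (made up for illustration): the first three tests receive more weight.
pvalues = [0.01, 0.04, 0.06, 0.2, 0.6, 0.9]
uniform = [1.0] * 6
informed = [1.5, 1.5, 1.5, 0.5, 0.5, 0.5]

n_bh = sum(weighted_bh(pvalues, uniform, alpha=0.1))
n_weighted = sum(weighted_bh(pvalues, informed, alpha=0.1))
print(n_bh, n_weighted)
```

With uniform weights the function reduces to the ordinary BH procedure. In this toy example the informed weights raise the number of rejections from one to three, which only preserves FDR control because the weights were fixed without looking at the p-values.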

Table 1

Examples of covariates.

Application	Covariate

Differential expression analysis	Sum of read counts per gene across all samples [12]
Genome-wide association study (GWAS)	Minor allele frequency
Expression-QTL analysis	Distance between the genetic variant and genomic location of the phenotype
ChIP-QTL analysis	Comembership in a topologically associated domain [16]
t-test	Overall variance [9]
Two-sided tests	Sign of the effect
Various applications	Signal quality, sample size

Figure 1

Histograms stratified by the covariate as a diagnostic plot.

a) The histogram of all p-values shows a mixture of a uniform distribution (corresponding to the true null hypotheses) and an enrichment of small p-values to the left (corresponding to the alternatives). Such a well-calibrated histogram is the starting point for most multiple testing methods. b-d) Histograms after splitting the hypotheses into three groups based on the values of the covariate. Shown is an example of a good covariate: each histogram still shows a uniform component, but the mixture proportion and/or the shape of the alternative distribution differ between the groups. If all histograms look the same, the covariate is uninformative, and its use would not lead to an increase in power. If the tails are no longer uniform, independence under the null is violated, and application of IHW is not valid.
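
The diagnostic in Fig. 1 amounts to binning the p-values separately within covariate groups. A possible sketch in plain Python follows; the function `stratified_histograms`, the equal-sized grouping by covariate rank, and the simulated data are assumptions for illustration and are not taken from the IHW package.

```python
import random


def stratified_histograms(pvalues, covariates, n_groups=3, n_bins=10):
    """Split hypotheses into equal-sized groups by covariate rank and
    return one list of histogram bin counts per group."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: covariates[i])
    histograms = []
    for g in range(n_groups):
        members = order[g * m // n_groups:(g + 1) * m // n_groups]
        counts = [0] * n_bins
        for i in members:
            # Bin index for p in [0, 1]; p == 1 goes into the last bin.
            b = min(int(pvalues[i] * n_bins), n_bins - 1)
            counts[b] += 1
        histograms.append(counts)
    return histograms


# Simulated data (an assumption for this sketch): null p-values are uniform
# whatever the covariate, and alternatives become more frequent as the covariate grows.
rng = random.Random(0)
sim_p, sim_x = [], []
for _ in range(3000):
    x = rng.random()
    is_alternative = rng.random() < 0.3 * x
    sim_p.append(rng.betavariate(0.3, 4.0) if is_alternative else rng.random())
    sim_x.append(x)

histograms = stratified_histograms(sim_p, sim_x)
for g, counts in enumerate(histograms):
    print("group", g, counts)
```

For a suitable covariate, the right-hand bins should look flat in every group while the height of the leftmost bin differs between groups. Right-hand bins that are clearly not flat would suggest that independence under the null is violated.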

IHW is motivated by considering multiple testing as a resource allocation problem [6]: given a budget of acceptable FDR, how can it be distributed among the hypotheses in such a way as to obtain the best possible power overall? The first idea is to use the covariate to assign hypothesis weights. We approximate the covariate-weight relationship by a step-wise constant function. No further assumptions (e.g., monotonicity) are needed. The second idea is that the number of discoveries of the weighted BH procedure with given weights is an empirical indicator of the method's power. Therefore, a good choice of the covariate-weight function should lead to a high number of discoveries.

An initial implementation ("naive IHW") is easy to explain. The algorithm divides the tests into groups based on the covariate. Then, we associate each group with a weight, so that all hypotheses within a group are assigned the same weight. For each possible choice of weights we apply the weighted BH procedure at level α and calculate the total number of discoveries. We choose the weights leading to the highest number of discoveries. In many applications, this approach is already satisfactory, but it has two shortcomings. First, the underlying optimization problem is difficult and does not easily scale to problems with millions of tests. Second, in certain situations, described below, this algorithm leads to loss of type I error control. The reason for the latter is analogous to overfitting in statistical learning, and we use methods from this field to overcome the shortcomings: convex relaxation, data splitting and regularization (Online Methods and Supplementary Note 2). The full IHW algorithm employs these three extensions.

IHW increases empirical detection power compared to the BH procedure. We illustrate this claim on three exemplary applications (Supplementary Note 3). The first, by Bottomly et al. [13, 14], is an RNA-seq dataset used to detect differential gene expression between mouse strains. p-values were calculated using DESeq2 [12]. Here we used the mean of normalized counts for each gene, across samples, as the informative covariate. We saw an increased number of discoveries compared to BH (Fig. 2a). In addition, we observed that the learned weight function prioritized genes with higher mean normalized counts (Supplementary Fig. 1a).
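
The naive IHW idea described above (try candidate group weights, keep those giving the most weighted BH discoveries) can be sketched as an exhaustive search over a coarse grid. The grid search is an assumption chosen for brevity and is not the optimizer used by the authors. The names `naive_ihw` and `weighted_bh_count` and the toy data are hypothetical. As the text notes, this naive scheme does not scale well and can lose type I error control.

```python
from itertools import product


def weighted_bh_count(pvalues, weights, alpha):
    """Number of rejections of the weighted BH procedure at level alpha."""
    m = len(pvalues)
    q = sorted(p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights))
    k_max = 0
    for rank, value in enumerate(q, start=1):
        if value <= rank * alpha / m:
            k_max = rank
    return k_max


def naive_ihw(pvalues, groups, n_groups, alpha, grid_steps=4):
    """Grid search for group weights that maximize weighted BH discoveries.

    groups[i] is the group index (0 .. n_groups - 1) of hypothesis i.
    Returns (group_weights, number_of_discoveries).
    """
    m = len(pvalues)
    sizes = [groups.count(g) for g in range(n_groups)]
    best_weights = [1.0] * n_groups
    best_count = weighted_bh_count(pvalues, [1.0] * m, alpha)  # plain BH baseline
    for raw in product(range(grid_steps + 1), repeat=n_groups):
        total = sum(n * c for n, c in zip(sizes, raw))
        if total == 0:
            continue
        # Rescale so that the weights average to one over all hypotheses.
        group_w = [c * m / total for c in raw]
        count = weighted_bh_count(pvalues, [group_w[g] for g in groups], alpha)
        if count > best_count:
            best_weights, best_count = group_w, count
    return best_weights, best_count


# Toy data (made up): group 0 holds the small p-values.
toy_p = [0.01, 0.04, 0.06, 0.2, 0.6, 0.9]
toy_groups = [0, 0, 0, 1, 1, 1]
best_w, best_n = naive_ihw(toy_p, toy_groups, n_groups=2, alpha=0.1)
print(best_w, best_n)
```

Because the weights are chosen by looking at the same p-values that are then tested, the result is optimistic. That is the overfitting problem that the convex relaxation, data splitting and regularization are meant to address.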

Figure 2

Performance evaluation.

Panels a-c show the number of discoveries with IHW and BH on real data as a function of the target FDR. a) RNA-Seq dataset [13] with mean of normalized counts for each gene as the covariate. b) SILAC dataset [15], with number of peptides quantified per protein as the covariate. c) hQTL dataset [16] for Chromosome 21, with genomic distance between SNPs and ChIP-seq signals as the covariate. Independent Filtering with different distance cutoffs was also applied. d) Weight function learned by IHW at α = 0.1 for the hQTL dataset. Shown are the curves for the five folds in the data splitting scheme. Panels e-h benchmark different methods based on simulations. Brief descriptions of each method are in Table 2. e–f) Type I error control if all null hypotheses are true. Shown is the true FDR against the nominal significance level α. e) All methods shown make too many false discoveries. f) BH, FDRreg, and IHW control the FDR. LSL-GBH and Clfdr are slightly anticonservative. g-h) Implications of different effect sizes. The two-sample t-test was applied to Normal samples (n = 2 × 5, σ = 1) with either the same mean (nulls) or means differing by the effect size indicated on the x-axis (alternatives). The fraction of alternatives was 0.05. The pooled sample variance was used as the covariate. The nominal level was α = 0.1 (dotted line). g) The y-axis shows the actual FDR. h) Power analysis. All methods show improvement over BH.

Second, we analyzed a quantitative mass-spectrometry (SILAC) experiment in which yeast cells treated with rapamycin were compared to yeast cells treated with DMSO (2 × 6 biological replicates) [15]. Differential protein abundance of 2,666 proteins was evaluated using Welch's t-test [15]. As a covariate we used the total number of peptides that were quantified across all samples for each protein. IHW again showed increased power compared to BH (Fig. 2b), and proteins with more quantified peptides were assigned higher weight, as expected (Supplementary Fig. 1b).

In a third example, we searched for associations between SNPs and histone modification marks (H3K27ac) [16] on human Chromosome 21. This yielded 180 million tests. As a covariate we used the genomic distance between the SNP and the ChIP-seq signal. The power increase compared to BH was dramatic (Fig. 2c). IHW automatically assigned most weight to small distances (Fig. 2d). Thus IHW acted similarly to the common practice in eQTL analysis of searching for associations only within a certain distance, a form of Independent Filtering. However, it had the advantage that no arbitrary choice of distance threshold was needed, and that the weights were more nuanced than a hard distance threshold. IHW does not exclude distant SNP-phenotype pairs, and these can still be detected as long as they have a sufficiently small p-value.

The extensions to naive IHW are needed to ensure type I error control. Naive IHW, as well as previous approaches to data-driven hypothesis weighting or filtering, does not maintain FDR control in situations where all null hypotheses are true (Fig. 2e) or where there is insufficient power to detect the false null hypotheses (Supplementary Fig. 2a). In addition, the local fdr methods (Clfdr and FDRreg) often show strong deviations from the target FDR in a direction (conservative or anti-conservative) that is not apparent a priori (Fig. 2f,g). Thus, among all methods benchmarked across these scenarios, only BH, IHW (but not naive IHW) and LSL-GBH generally control the FDR. The results of our method comparisons are summarized in Table 2 (Fig. 2 and Supplementary Fig. 2), and the simulations are described in Supplementary Note 4.
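
The remark that a hard distance cutoff is a special case of weighting can be checked numerically. Filtering followed by BH gives the same rejections as weighted BH with a binary step weight function: zero beyond the cutoff and a constant within it, chosen so that the weights average to one. The sketch below uses made-up p-values and distances, and the function names are hypothetical.

```python
def weighted_bh(pvalues, weights, alpha):
    """Weighted BH step-up rule; returns one rejection flag per hypothesis."""
    m = len(pvalues)
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


def filter_then_bh(pvalues, distances, cutoff, alpha):
    """Drop tests beyond the cutoff, then run plain BH on the remaining ones."""
    keep = [i for i, d in enumerate(distances) if d <= cutoff]
    kept_flags = weighted_bh([pvalues[i] for i in keep], [1.0] * len(keep), alpha)
    rejected = [False] * len(pvalues)
    for i, flag in zip(keep, kept_flags):
        rejected[i] = flag
    return rejected


def binary_weights(distances, cutoff):
    """Step weight function: m / m_kept within the cutoff, zero beyond it."""
    m = len(distances)
    n_keep = sum(1 for d in distances if d <= cutoff)
    if n_keep == 0:
        raise ValueError("cutoff removes every hypothesis")
    return [m / n_keep if d <= cutoff else 0.0 for d in distances]


# Made-up example: index 3 is a distant pair with a small p-value.
p = [0.004, 0.03, 0.2, 0.012, 0.6, 0.02, 0.8, 0.5]
dist = [1, 2, 3, 50, 60, 4, 70, 80]

by_filtering = filter_then_bh(p, dist, cutoff=10, alpha=0.1)
by_weighting = weighted_bh(p, binary_weights(dist, cutoff=10), alpha=0.1)
plain_bh = weighted_bh(p, [1.0] * len(p), alpha=0.1)
print(by_filtering == by_weighting, plain_bh[3], by_filtering[3])
```

In this example the distant pair at index 3 is rejected by plain BH but can never be rejected after hard filtering, since its weight is zero. Weights that are small but positive at large distances would leave such a pair detectable if its p-value were small enough.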

Table 2

Short description of the different methods benchmarked and summary of the results of Fig. 2e–h and Supplementary Fig. 2.

Method	Short description	Type I error control (π₀=1)	Type I error control (t-test)	Gain in power (t-test, vs BH)	Gain in power (size investing)	Comment
BH	Method of Benjamini and Hochberg [1] to control false-discovery rate (FDR) for multiple exchangeable hypotheses.	Yes	Yes	–	–
IHW	Independent hypothesis weighting, as proposed here.	Yes	Yes	Yes	Yes
NaiveIHW	Naive independent hypothesis weighting, as proposed here.	No	No	Yes	Yes
Greedy Independent Filtering	The Independent Filtering procedure [9] modified to use a data-driven filter threshold which maximizes the number of discoveries.	No	No	Yes	No	The covariate-weights function is a binary step, monotonic.
SBH	Stratified Benjamini-Hochberg [28]: Apply the BH procedure at level α within each stratum, then combine the discoveries across the strata.	No	No	Yes	No
TST-GBH	The Group BH procedure [10]: An adaptive weighted BH procedure applied with weights proportional to π₁/π₀ within each group. π₀ is estimated using the TST estimator [2].	No	No	Yes	No
LSL-GBH	The Group BH procedure [10], where π₀ is estimated using the LSL estimator	Yes	Yes	Yes	No
Clfdr	In the Clfdr procedure [20], the local fdr is estimated separately within each group and the estimates are pooled together. For the fdr estimation here we use the modified Grenander estimator [5].	Yes	No	Yes	Yes
FDRreg	The FDR regression method [23] estimates the local fdr by assuming all hypotheses have the same alternative density and π₀ varies smoothly as a function of the covariate.	Yes	No	Yes	No	Requires z-scores.

IHW can apply a size investing strategy. IHW assigns low weight to covariate groups with low signal (such as Fig. 1d). While this may be expected, a less intuitive effect can pertain to groups with very small p-values. IHW can shift weight away from these towards groups with more intermediate p-values, since the former will be rejected even with a lower weight. This is called size investing [17]. Several other methods (Table 2), including Independent Filtering, stratified BH, LSL-GBH and FDRreg, cannot apply size investing and might even lose power compared to the BH method (Supplementary Fig. 2d,f and Supplementary Note 5).

It is instructive to consider the relation between IHW and the concept of local true discovery rates. p-values are a reduction of data into one number, which typically does not contain all the important information (Table 1; [18, 19]). One might wonder whether there are other quantities that are better suited for selecting discoveries. The theoretically optimal candidate is the local true discovery rate (tdr) [4]. The tdr of the $i$th hypothesis is [4]

$$\mathrm{tdr}_i(p) = \frac{\pi_{1,i}\, f_{1,i}(p)}{f_i(p)}. \qquad (1)$$

A schematic explanation is given in Fig. 3a (see also Supplementary Figs. 3 and 4). Here $f_i$ is the density of the distribution of the p-value $p$. It is a mixture of two distributions, $f_i = \pi_{0,i} f_0 + \pi_{1,i} f_{1,i}$, where the densities $f_0$ and $f_{1,i}$ are conditional on the null or the alternative being true, respectively, and $\pi_{0,i}$ and $\pi_{1,i}$ (which sum to 1) are the corresponding prior probabilities. The null distribution of a properly calibrated test is uniform, therefore we can set $f_0(p) = 1$ irrespective of $p$ and $i$. In Fig. 3b-d three hypotheses are shown with different tdr curves corresponding to different power profiles.
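
To make the local true discovery rate concrete, the sketch below evaluates it for three hypothetical tests that mimic the qualitative situations of Fig. 3b-d. It relies on a strong assumption that is not from the paper: the alternative density is taken to be $f_1(p) = a\,p^{a-1}$ with $0 < a < 1$ (a Beta(a, 1) density), where a smaller `a` stands for higher power. The null density is uniform, as stated in the text.

```python
def local_tdr(p, pi0, a):
    """Local true discovery rate pi1 * f1(p) / (pi0 * f0(p) + pi1 * f1(p)).

    f0 is the uniform density (f0(p) = 1). The alternative density
    f1(p) = a * p**(a - 1) is an ASSUMED form used only for illustration.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 <= pi0 <= 1.0 or not 0.0 < a < 1.0:
        raise ValueError("need 0 <= pi0 <= 1 and 0 < a < 1")
    pi1 = 1.0 - pi0
    f1 = a * p ** (a - 1.0)
    return pi1 * f1 / (pi0 + pi1 * f1)


p_observed = 0.01
tdr_b = local_tdr(p_observed, pi0=0.6, a=0.2)  # high power, moderate pi0
tdr_c = local_tdr(p_observed, pi0=0.9, a=0.2)  # same power, higher pi0
tdr_d = local_tdr(p_observed, pi0=0.6, a=0.8)  # same pi0 as the first, low power
print(round(tdr_b, 3), round(tdr_c, 3), round(tdr_d, 3))
```

The same p-value of 0.01 therefore carries different evidence in the three cases: the tdr is highest for the well-powered test with moderate prior probability of the null and lower in the other two, which is why a single p-value threshold for all tests is generally not optimal.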

Figure 3

True discovery rate and informative covariates.

a) Schematic representation of the density $f_i$, which is composed of the alternative density $f_{1,i}$ weighted by its prior probability $\pi_{1,i}$ and the uniform null density weighted by $\pi_{0,i}$. b-d) The true discovery rate (tdr) of individual tests can vary. In b), the test has high power, and $\pi_{0,i}$ is well below 1. In c), the test has the same power as in b), but $\pi_{0,i}$ is higher, leading to a reduced tdr. In d), $\pi_{0,i}$ is the same as in b), but the test has little power, again leading to a reduced tdr. e) If an informative covariate is associated with each test, the distribution of the p-values from multiple tests is different for different values of the covariate. The contours represent the joint density of p-values and covariate. The BH procedure accounts only for the p-values and not the covariates (dashed red line). In contrast, the decision boundary of IHW is a step function; each step corresponds to one group, i.e., to one weight. f) By Equation (1), the density of the tdr also depends on the covariate. The decision boundary of the BH procedure (dashed red line) leads to a suboptimal set of discoveries, in this example with higher than optimal tdr for intermediate covariate values and too low otherwise. In contrast, IHW approximates a line of constant tdr, implying efficient use of the FDR budget. An important feature of IHW is that it works directly on p-values and covariates rather than explicitly estimating the tdr.

It can now be shown that to maximize power at a given FDR, one should reject the hypotheses with the highest tdr [20, 21]. In other words, if we knew the functions in Equation (1) and could use $\mathrm{tdr}_i(p_i)$ as our test statistics, then without any further effort we would have a method for FDR control with optimal power. Similarly to the central idea of IHW, one can now assume that the many different, unknown univariate functions $\mathrm{tdr}_i(p)$, one for each hypothesis $i$, can be approximated by a single bivariate function $\mathrm{tdr}(p, x)$, where $x$ is the covariate. The joint density of $p$ and $x$ (Fig. 3e) gives rise to the joint density of tdr and $x$ (Fig. 3f). We can see how in such a scenario the decision boundary of the BH method tends to be suboptimal. As it is defined solely in terms of p-values (Fig. 3e), it differs from the optimal region, whose boundary is a vertical line of constant tdr (Fig. 3f). However, in practice, we know neither the quantities in Equation (1) nor the bivariate function $\mathrm{tdr}(p, x)$ and have to estimate them [22]. Unfortunately, this estimation problem is difficult, and even with the use of additional approximations, such as splines [23] or piecewise constant functions [24], there does not seem to be a practical implementation.

With IHW we circumvent explicit estimation of the bivariate tdr function and instead derive a powerful testing procedure by assigning data-driven hypothesis weights. In addition, the IHW method readily extends to other weighted multiple testing procedures [6]. In Supplementary Note 6 (and Supplementary Fig. 5) we describe IHW-Bonferroni, a new powerful method for control of the familywise error rate (FWER). In contrast, local tdr methods are specific to the FDR.

We have introduced a weighted multiple testing method that learns the weights from the data. Its appeal lies in its generic applicability. It does not require assumptions about the relationship between the covariate and the power of the individual tests, such as monotonicity, which is necessary for Independent Filtering. It can apply size investing strategies, since it does not assume that the alternative distributions are the same across the different hypotheses. It is computationally robust and scales to millions of hypotheses.

The idea of using informative covariates for hypothesis weighting or for shaping optimal decision boundaries is not new (Table 2; [24-26]). In this work, we provide a general and practical approach. Most importantly, we show how to overcome two major limitations of previous approaches: type I error control and stability. We gave examples of suitable covariates for a variety of applications in Table 1. Further work could establish additional domain-specific choices of covariates, formalize and automate the assessment of diagnostic plots such as Fig. 1, and extend IHW to higher-dimensional covariates.

Various approaches to increasing power compared to the BH method have focused on estimating the fraction of true nulls among all hypotheses instead of conservatively approximating it by 1, as the BH method does [2]. In practice, this tends to have limited impact, since in the most interesting situations the number of true discoveries is small compared to the number of all tests and no substantial power increase is gained. On the other hand, such an extension could be beneficial for IHW, since the groups that get assigned a high weight often also have a reduced proportion of true nulls.

The issue of dependence between hypotheses deserves attention. For example, the proof of the BH method was initially provided under the assumption of independent hypothesis tests and later extended to positive regression dependence [27]. Beyond that, BH has turned out to be remarkably robust to correlations encountered in analyses of real data. In our experience, IHW inherits this property of BH whenever the covariate is not involved in the joint dependence of the null p-values.

In our method we have explicitly avoided estimating the densities in Equation (1). Nevertheless, the local true discovery rate is an interesting quantity in its own right, since it provides a posterior probability for each individual hypothesis. Our weighted p-values do not provide this information. Thus, development of stable estimation procedures for the local true discovery rate that incorporate informative covariates is needed and would be complementary to our work [19, 22–24].

Code availability

The IHW package is available from Bioconductor at http://www.bioconductor.org/packages/IHW. It comes with detailed documentation and a vignette that showcases the application of IHW to a real dataset. The vignette also provides guidance on the choice of informative covariates and suggests diagnostic plots, so that users can determine if their covariate satisfies the required conditions. Executable documents (Rmarkdown) reproducing all analyses shown here can be downloaded at http://bioconductor.org/packages/IHWpaper. Both packages are also available as Supplementary Software to this manuscript.

Online Methods

Description of the IHW algorithm

The hypothesis tests are divided into $G$ different groups based on the covariate, typically of about equal size. Each group $g$ is associated with a weight $w_g$. The following optimization problem is solved: find the weight vector $w = (w_1, \ldots, w_G)$ that maximizes the number of rejections of the weighted BH method at level α. This method, naive IHW, is modified by the following three extensions.

E1. Instead of the above optimization task, we solve a convex relaxation of it. In statistical terms this corresponds to replacing the empirical cumulative distribution functions (ECDF) of the p-values with the Grenander estimators (least concave majorant of the ECDF). The resulting problem is convex and can be efficiently solved even for large numbers of hypotheses.

E2. We randomly split the hypotheses into $k$ folds. For each fold, we apply convex IHW to the other $k - 1$ folds and assign the learned weights to the remaining fold. Thus the weight assigned to a given hypothesis does not directly depend on its p-value, but only on its covariate.

E3. The performance of the algorithm can be further improved by ensuring that the weights learned with $k - 1$ folds generalize to the held-out fold. Therefore, we introduce a regularization parameter λ ≥ 0, and the optimization is done over a constrained subset of the weights. For an ordered covariate, we require that $\sum_{g=2}^{G} |w_g - w_{g-1}| \le \lambda$, i.e., weights of successive groups should not be too different. For an unordered covariate, we use instead the constraint $\sum_{g=1}^{G} |w_g - 1| \le \lambda$, i.e., deviations from 1 are penalized. In the limit case λ = 0, all weights are the same, so IHW with λ = 0 is just the BH method. IHW with λ → ∞ is the unconstrained version. Choice of λ is a model selection problem, so within each split in E2 we apply a second nested layer of cross-validation. E3 is optional; whether or not to apply it will depend on the data. It will be most beneficial if the number of hypotheses per group is relatively small.

A complete description of the algorithm, including an efficient computational implementation of the optimization task, is provided in Supplementary Note 2. Supplementary Note 7 describes its theoretical justification.
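
The data-splitting step (E2) can be illustrated independently of how the weights are learned. In the sketch below, `toy_learner` is a hypothetical stand-in for the convex optimization of E1 and is not the authors' method. The function name `cross_fitted_weights`, the normalization of weights within each held-out fold, and the simulated data are likewise assumptions made for this example.

```python
import random


def weighted_bh(pvalues, weights, alpha):
    """Weighted BH step-up rule; returns one rejection flag per hypothesis."""
    m = len(pvalues)
    q = [p / w if w > 0 else float("inf") for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if q[i] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


def toy_learner(train_p, train_groups, n_groups, alpha):
    """Hypothetical stand-in for the weight optimization: score each group by
    its smoothed share of training p-values at or below alpha."""
    scores = []
    for g in range(n_groups):
        ps = [p for p, h in zip(train_p, train_groups) if h == g]
        small = sum(1 for p in ps if p <= alpha)
        scores.append((small + 1) / (len(ps) + 2))
    return scores


def cross_fitted_weights(pvalues, groups, n_groups, alpha, n_folds=5, seed=0):
    """Learn group weights on k - 1 folds and assign them to the held-out fold."""
    m = len(pvalues)
    fold_of = [i % n_folds for i in range(m)]
    random.Random(seed).shuffle(fold_of)
    weights = [1.0] * m
    for fold in range(n_folds):
        train = [i for i in range(m) if fold_of[i] != fold]
        held_out = [i for i in range(m) if fold_of[i] == fold]
        if not held_out:
            continue
        scores = toy_learner([pvalues[i] for i in train],
                             [groups[i] for i in train], n_groups, alpha)
        raw = [scores[groups[i]] for i in held_out]
        mean_raw = sum(raw) / len(raw)
        # Normalize within the held-out fold (an assumption of this sketch), so the
        # weights depend only on the other folds' p-values and on covariates.
        for i, r in zip(held_out, raw):
            weights[i] = r / mean_raw
    return weights


# Simulated data (made up): alternatives occur only in group 2.
rng = random.Random(1)
groups = [i % 3 for i in range(600)]
pvalues = [rng.betavariate(0.2, 5.0) if (g == 2 and rng.random() < 0.4) else rng.random()
           for g in groups]

weights = cross_fitted_weights(pvalues, groups, n_groups=3, alpha=0.1)
rejected = weighted_bh(pvalues, weights, alpha=0.1)
print(sum(rejected))
```

The point of the design is that the weight of hypothesis $i$ is computed without using $p_i$: changing a single p-value leaves that hypothesis's own weight unchanged, although it may alter the weights assigned to hypotheses in other folds.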

15 in total

1. False Discovery Rate Control With Groups.

Authors: James X Hu; Hongyu Zhao; Harrison H Zhou
Journal: J Am Stat Assoc Date: 2010-09-01 Impact factor: 5.033

2. Multidimensional local false discovery rate for microarray studies.

Authors: Alexander Ploner; Stefano Calza; Arief Gusnanto; Yudi Pawitan
Journal: Bioinformatics Date: 2005-12-20 Impact factor: 6.937

3. Were genome-wide linkage studies a waste of time? Exploiting candidate regions within genome-wide association studies.

Authors: Yun J Yoo; Shelley B Bull; Andrew D Paterson; Daryl Waggott; Lei Sun
Journal: Genet Epidemiol Date: 2010-02 Impact factor: 2.135

4. Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions.

Authors: Fabian Grubert; Judith B Zaugg; Maya Kasowski; Oana Ursu; Damek V Spacek; Alicia R Martin; Peyton Greenside; Rohith Srivas; Doug H Phanstiel; Aleksandra Pekowska; Nastaran Heidari; Ghia Euskirchen; Wolfgang Huber; Jonathan K Pritchard; Carlos D Bustamante; Lars M Steinmetz; Anshul Kundaje; Michael Snyder
Journal: Cell Date: 2015-08-20 Impact factor: 41.582

5. POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES.

Authors: Edsel A Peña; Joshua D Habiger; Wensong Wu
Journal: Ann Stat Date: 2011-02 Impact factor: 4.028

6. Optimal multiple testing under a Gaussian prior on the effect sizes.

Authors: Edgar Dobriban; Kristen Fortney; Stuart K Kim; Art B Owen
Journal: Biometrika Date: 2015-11-04 Impact factor: 2.445

7. Evaluating gene expression in C57BL/6J and DBA/2J mouse striatum using RNA-Seq and microarrays.

Authors: Daniel Bottomly; Nicole A R Walter; Jessica Ezzell Hunter; Priscila Darakjian; Sunita Kawane; Kari J Buck; Robert P Searles; Michael Mooney; Shannon K McWeeney; Robert Hitzemann
Journal: PLoS One Date: 2011-03-24 Impact factor: 3.240

8. ReCount: a multi-experiment resource of analysis-ready RNA-seq gene count datasets.

Authors: Alyssa C Frazee; Ben Langmead; Jeffrey T Leek
Journal: BMC Bioinformatics Date: 2011-11-16 Impact factor: 3.169

9. A unified approach to false discovery rate estimation.

Authors: Korbinian Strimmer
Journal: BMC Bioinformatics Date: 2008-07-09 Impact factor: 3.169

10. Beyond the E-Value: Stratified Statistics for Protein Domain Prediction.

Authors: Alejandro Ochoa; John D Storey; Manuel Llinás; Mona Singh
Journal: PLoS Comput Biol Date: 2015-11-17 Impact factor: 4.475
