Toshihiro Kishikawa1, Yoshihiko Tomofuji1, Hidenori Inohara2, Yukinori Okada1. 1. Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita 565-0871, Japan. 2. Department of Otorhinolaryngology-Head and Neck Surgery, Osaka University Graduate School of Medicine, Suita 565-0871, Japan.
Abstract
The microbiome is an essential omics layer for elucidating disease pathophysiology. However, microbiome studies face a challenge of low reproducibility, partly due to the lack of standard analytical pipelines. Here, we developed OMARU (Omnibus metagenome-wide association study with robustness), a new end-to-end analysis workflow that covers a wide range of microbiome analyses, from phylogenetic and functional profiling to case-control metagenome-wide association studies (MWAS). OMARU rigorously controls the statistical significance of the analysis results, including correction for hidden confounding factors and for multiple testing. Furthermore, OMARU can evaluate pathway-level links between the metagenome and the germline genome-wide association study (i.e. MWAS-GWAS pathway interaction), as well as links between taxa and genes in the metagenome. OMARU is publicly available (https://github.com/toshi-kishikawa/OMARU), with a flexible workflow that can be customized by users. We applied OMARU to publicly available type 2 diabetes (T2D) and schizophrenia (SCZ) metagenomic data (n = 171 and 344, respectively), identifying disease biomarkers through comprehensive, multilateral, and unbiased case-control comparisons of the metagenome (e.g. increased Streptococcus vestibularis in SCZ and disrupted diversity in T2D). OMARU improves accessibility and reproducibility in the microbiome research community. The robust and multifaceted results of OMARU reflect the dynamics of the microbiome authentically relevant to disease pathophysiology.
The microbiome is one of the major research areas in human diseases, directed towards the implementation of personalized medicine based on multi-layer omics data. Recent interest centers on multidimensional integration of metagenome data with other omics layers such as the host genome and metabolome, as well as deep analysis within the single metagenomic layer (1,2). Analytical approaches to the microbiome are shifting from amplicon sequencing of 16S ribosomal RNA genes to whole-genome shotgun sequencing. However, findings of microbiome studies suffer from low reproducibility. Differences in the physiological variables and lifestyles of the sampled individuals have been reported as one factor contributing to this problem (1,3). In addition, we still lack a gold standard analytical pipeline that can overcome the problem of low reproducibility (3,4).
Here, we introduce OMARU (Omnibus Metagenome-wide Association study [MWAS] with RobUstness), a new end-to-end metagenome analysis workflow (Figure 1). Through rigorous quality control (QC) of shotgun sequence reads, samples, clades, and genes, OMARU constructs phylogenetic and functional profiles of the metagenome, the two main analytical pipelines. The three major components of the case–control association tests of MWAS (i.e. phylogenetic, gene, and biological pathway analyses) are subsequently conducted with rigorous handling of false positives in statistical analysis (5–7). In addition to addressing the low reproducibility of metagenomic studies, OMARU provides integrative analyses. As an example, OMARU can evaluate pathway-level links between the metagenome and germline genome-wide association studies (GWAS) of the host genome. Furthermore, OMARU identifies links between taxa and genes in the metagenome using the results of the phylogenetic and gene analyses. OMARU also produces figures that give a comprehensive summary of the association test results.
The reference databases, which substantially affect the analytic results, are currently being expanded rapidly (8,9). OMARU is a flexible and extensible workflow that can be customized, for example by adding an up-to-date database.
Figure 1.
OMARU workflow and details as bioinformatics pipelines for the metagenome-wide association study. OMARU workflow. Using shotgun sequencing data, metagenome-wide association studies (phylogenetic, gene and pathway analyses) and additional analyses are performed, including comparing pathway analyses between genome-wide association studies (GWAS) and metagenome-wide association study (MWAS).
MATERIALS AND METHODS
Quality control
OMARU takes shotgun sequencing data in FASTQ format as input (16S rRNA data are not currently supported). QC of the sequencing reads is applied to maximize the quality of the datasets as follows: (i) trimming of low-quality bases using Trimmomatic (10), (ii) identification and masking of human reads using bowtie2 (11) and BMTagger (12) and (iii) removal of duplicated reads using PRINSEQ-lite (13). For sample QC, four factors are used to select samples for exclusion: (i) overall quality of sequencing reads, (ii) status of phylogenetic abundance and mapping rates, (iii) status of contigs and open reading frames (ORFs) in the assembly-based approach and mapping rates in the mapping-based approach, and (iv) principal component analyses (PCA) of the phylogenetic data and gene abundance data. OMARU sequentially outputs figures and tables of statistical matrices for each procedure, helping users select samples to be excluded at each step (Figures 1 and 2A). Clades and genes detected in fewer than a pre-defined fraction of the samples (e.g. 20%), or in no sample among either cases or controls, are removed. In addition, clades with an average relative abundance below a pre-defined threshold of total abundance are removed (default: 0.001%).
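The clade-level filters described above can be illustrated with a minimal sketch. This is not OMARU's own code: the function name `filter_clades`, its parameters, and the toy data are hypothetical, and relative abundances are assumed to be stored as fractions (so 0.001% corresponds to 1e-5).

```python
def filter_clades(abundance, is_case, min_prevalence=0.2, min_mean_abundance=1e-5):
    """Keep clades that pass prevalence, case/control presence and abundance filters.

    abundance: dict mapping clade name -> list of relative abundances (fractions),
               one value per sample (hypothetical layout).
    is_case:   list of booleans, True for cases, aligned with the samples.
    """
    n_samples = len(is_case)
    kept = {}
    for clade, values in abundance.items():
        detected = [v > 0 for v in values]
        prevalence = sum(detected) / n_samples
        in_cases = any(d for d, c in zip(detected, is_case) if c)
        in_controls = any(d for d, c in zip(detected, is_case) if not c)
        mean_abundance = sum(values) / n_samples
        if prevalence < min_prevalence:
            continue  # detected in too few samples
        if not (in_cases and in_controls):
            continue  # absent from all cases or from all controls
        if mean_abundance < min_mean_abundance:
            continue  # average relative abundance below the threshold
        kept[clade] = values
    return kept


# Toy data: three cases followed by three controls.
is_case = [True, True, True, False, False, False]
abundance = {
    "common": [0.5, 0.4, 0.6, 0.5, 0.3, 0.4],
    "rare": [0.0, 0.0, 0.0, 0.0, 0.0, 0.01],
    "cases_only": [0.1, 0.2, 0.0, 0.0, 0.0, 0.0],
    "trace": [1e-6] * 6,
}
kept = filter_clades(abundance, is_case)
```

In this toy example only the clade labeled `common` passes all three filters; the thresholds are exposed as parameters because the text describes them as pre-defined and adjustable.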
Figure 2.
MWAS results of QC and phylogenetic analysis. (A) Principal component analysis (PCA) in phylogenetic and gene abundance of the schizophrenia (SCZ) data. The green dots represent the excluded sample as a result of quality control. (B) Quantile-quantile plots of the phylogenetic MWAS P-values of the clades in the SCZ data. The x-axis indicates log-transformed empirically estimated median P. The y-axis indicates observed –log10(P). The diagonal dashed line represents y = x, which corresponds to the null hypothesis. The left and right figures show the results of including 0 principal component (PC) and 30 PCs as covariates, respectively, which indicates that PCs suppress the inflation of P-values. (C) A histogram of minimum P-values in the phenotype permutation procedure in the SCZ data. Vertical lines of red and purple indicate an empirical Bonferroni significance threshold at a significance level of 0.05 and a standard Bonferroni significance threshold in multiple comparison procedure (0.05/692 = 7.23 × 10⁻⁵), respectively. (D) A phylogenetic tree. Levels L2–L7 are from the inside layer to the outside layer in the SCZ data. The size and the color of the dots represent relative abundance and effect sizes, respectively. The three clades with significant case–control associations (false discovery rate < 0.05) are outlined in red.
Case-control association test for phylogenetic data
OMARU adopts a mapping-based approach to exploit the advantages of paired-end reads and reduce mapping errors. Users can flexibly replace the reference data, supplied in FASTA format, with an appropriate one; the default is a modified set of DNA sequences from the Unified Human Gastrointestinal Genome (8). After read mapping using bowtie2 (11), the relative abundance of each clade is quantified for each sample at six taxonomic levels (L2: phyla, L3: classes, L4: orders, L5: families, L6: genera and L7: species). Subsequently, the relative abundance profiles are normalized using log transformation. Case-control association tests are performed using the lm function implemented in the R statistical software. Users can incorporate covariates for adjustment, such as sex and age. OMARU generally requires a sufficient number of principal components as covariates to robustly adjust for the effect of hidden confounding factors and suppress P-value inflation (Figure 2B).
Empirical null distributions of the minimum P-values (= Pmin) are calculated based on a phenotype permutation procedure (× 10,000 iterations) to control the type I error rates (14). The empirical Bonferroni significance threshold is defined at a significance level of 0.05, as the 95th percentile of Pmin (= Psig). The 95% confidence interval for Pmin is calculated by a bootstrapping method of the Harrell-Davis distribution-free quantile estimator (Figure 2C). In addition to standard figures that visualize the distribution of statistics, such as quantile-quantile and volcano plots (Supplementary Figure S1), OMARU draws a phylogenetic tree indicating the case–control association results across the multilayered taxonomic levels (Figure 2D).
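The phenotype permutation procedure can be sketched as follows. This is an illustrative simplification rather than OMARU's implementation: it uses a simple regression without covariates, converts the t statistic to a P-value with a normal approximation (R's lm uses the t distribution), takes Psig as the value below which a fraction alpha of the permuted Pmin values fall, uses a plain order statistic instead of the Harrell-Davis estimator, and runs 500 permutations instead of 10,000 to keep the example fast. All function names and the simulated data are hypothetical.

```python
import math
import random


def center(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def min_p_value(centered_features, phenotype):
    """Smallest two-sided P over all features for one phenotype vector."""
    n = len(phenotype)
    y = center(phenotype)
    y_ss = sum(v * v for v in y)
    max_t = 0.0
    for x in centered_features:
        x_ss = sum(v * v for v in x)
        r = sum(a * b for a, b in zip(x, y)) / math.sqrt(x_ss * y_ss)
        t = abs(r) * math.sqrt((n - 2) / max(1e-12, 1.0 - r * r))
        max_t = max(max_t, t)
    # Assumption: normal approximation to the t distribution.
    return math.erfc(max_t / math.sqrt(2.0))


def empirical_threshold(features, phenotype, n_perm=500, alpha=0.05, seed=0):
    """Empirical significance threshold from permuted minimum P-values."""
    rng = random.Random(seed)
    centered = [center(f) for f in features]
    shuffled = list(phenotype)
    p_min = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any phenotype-feature association
        p_min.append(min_p_value(centered, shuffled))
    p_min.sort()
    return p_min[int(alpha * n_perm)]


# Simulated data: 20 independent features, 30 cases and 30 controls.
data_rng = random.Random(1)
features = [[data_rng.gauss(0.0, 1.0) for _ in range(60)] for _ in range(20)]
phenotype = [1] * 30 + [0] * 30
p_sig = empirical_threshold(features, phenotype)
```

Because the threshold is estimated from the data, it adapts to correlation among the tested features, which is the motivation for preferring it over a fixed 0.05 divided by the number of tests.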
Case–control association test for functional data (gene and pathways)
Gene abundance data of the metagenome are constructed by the assembly-based approach as follows: (i) de novo assembly of the sequencing reads into contigs using MEGAHIT (15), (ii) prediction of open reading frames (ORFs) on the contigs with Prokka (16), (iii) alignment of the ORFs against an appropriate database (default: UniRef90 (17)) with DIAMOND (18) and (iv) quantification of gene abundance by mapping the sequencing reads to the assembled contigs using bowtie2 (11). Gene abundance is normalized in two steps. First, the ORF abundance is defined as the depth of each ORF's region of the ORF catalog according to the mapping result, to avoid bias from gene length. Second, the gene abundance is adjusted by the sum of the ORF abundance for each sample, to correct for potential bias from heterogeneity in the total number of sequence reads among the samples. Next, a rank-based inverse normal transformation is applied to correct for heterogeneity in the abundance and distribution of each gene. Association tests are performed in the same way as in the phylogenetic analysis, including covariates and the empirical threshold (Figure 3A).
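The per-sample scaling and the rank-based inverse normal transformation can be sketched as below. The offset c = 3/8 (Blom) and the averaging of tied ranks are assumptions chosen for illustration; the text does not state which variant OMARU uses. The function names are hypothetical.

```python
from statistics import NormalDist


def normalize_by_sample_total(orf_depths):
    """Scale one sample's ORF depths so that they sum to 1.

    orf_depths: dict mapping gene/ORF name -> mapped depth in a single sample.
    """
    total = sum(orf_depths.values())
    return {gene: depth / total for gene, depth in orf_depths.items()}


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transformation of one gene across samples."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


sample = normalize_by_sample_total({"geneA": 30.0, "geneB": 10.0, "geneC": 60.0})
z_scores = rank_inverse_normal([3.0, 1.0, 2.0])
```

The transformation depends only on ranks, so each gene ends up on the same approximately normal scale regardless of how skewed its raw abundance distribution is.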
Figure 3.
MWAS results of functional analysis. (A) Results of functional association tests in schizophrenia (SCZ) and type 2 diabetes (T2D). Left figures are quantile-quantile plots of the P-values in the gene association tests. The x-axes indicate empirically estimated median -log10(P). The y-axes indicate observed -log10(P). The diagonal grey lines represent y = x, which correspond to the null hypothesis. The horizontal red lines indicate the empirical Bonferroni-corrected threshold (α = 0.05), and the brown line indicates the empirically estimated FDR threshold (FDR = 0.05). Center figures are volcano plots. The x-axes indicate effect sizes in linear regression. The y-axes, horizontal lines, and dot colors are the same as in the left quantile-quantile plots. Right figures are quantile-quantile plots of the P-values in the pathway association tests. Genes and pathways with P-values less than the Bonferroni threshold are plotted as red dots. Genes and pathways with FDR less than 0.05 are plotted as brown dots, and others are plotted as black dots. FDR; false discovery rate. (B) Links in the metagenome data between taxa and Vpar_1847, one of the schizophrenia-associated genes. Stacked bar graphs indicate the species of origin for each gene and their percentage, divided into cases and controls. The parentheses in each title represent the organism registered as the origin of the genes in the database. (C) Comparison of P-values of GO analyses between the SCZ MWAS and GWAS data. The x-axis indicates the P-value in the SCZ GWAS data. The y-axis indicates the P-value in the SCZ MWAS data. The horizontal and vertical black lines indicate P of 0.05. The overlap of the GO enrichment was evaluated by classifying the GO terms based on the significance threshold of P < 0.05 or P ≥ 0.05 and using Fisher's exact test.
As for the pathway analysis, OMARU adopts a gene set enrichment analysis using the ranking of the genes by z-values in the case–control gene association tests. The pathway database can be flexibly customized (the default is Gene Ontology (19)).
Links between the microbe MWAS and the germline GWAS of host
OMARU identifies disease-specific biological pathway links between the microbe MWAS and the germline GWAS of the host (5–7). The result of a pathway analysis using GWAS summary statistics for the target disease is required as input. OMARU evaluates the overlap between the MWAS and GWAS in pathway enrichment by Fisher's exact test, based on classification of the pathways with a P-value threshold of 0.05 (Figure 3C).
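The overlap test can be illustrated with a short sketch: shared pathways are classified by P < 0.05 in each analysis and the resulting 2 × 2 table is tested with Fisher's exact test. The one-sided (enrichment) alternative is an assumption, since the text does not specify the sidedness, and the function names and pathway identifiers are hypothetical placeholders.

```python
from math import comb


def overlap_table(mwas_p, gwas_p, threshold=0.05):
    """Count shared pathways by significance in the MWAS and GWAS analyses.

    Returns (both, mwas_only, gwas_only, neither).
    """
    both = mwas_only = gwas_only = neither = 0
    for term in mwas_p.keys() & gwas_p.keys():
        in_mwas = mwas_p[term] < threshold
        in_gwas = gwas_p[term] < threshold
        if in_mwas and in_gwas:
            both += 1
        elif in_mwas:
            mwas_only += 1
        elif in_gwas:
            gwas_only += 1
        else:
            neither += 1
    return both, mwas_only, gwas_only, neither


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact P for enrichment of cell a in [[a, b], [c, d]]."""
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical pathway P-values; only terms present in both analyses are compared.
mwas_p = {"term_A": 0.01, "term_B": 0.20, "term_C": 0.03, "term_D": 0.50, "term_E": 0.01}
gwas_p = {"term_A": 0.02, "term_B": 0.30, "term_C": 0.60, "term_D": 0.01}
table = overlap_table(mwas_p, gwas_p)
p_overlap = fisher_exact_greater(*table)
```

A small P-value from this test indicates that pathways enriched in the MWAS are also enriched in the GWAS more often than expected by chance.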
Links between taxa and genes in the metagenome
The organism of origin of each gene is an important factor in understanding microbiome biology. While gene databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) (20) and UniProt (21) record organisms of origin, such information is based on the specific link between the registered gene and organism, and may not reflect the real link in the target metagenomic sample. By tracing back to the level of sequencing reads, OMARU can directly estimate the organisms of origin for each gene in the target data (Figure 3B, Supplementary Figure S2).
Case-control difference between α-diversity and β-diversity of the metagenome
For calculating diversities, all samples should be down-sampled to the same appropriate number of reads. OMARU calculates α-diversity (within-sample diversity) as a Shannon index based on the gene abundance and the six levels of phylogenetic relative abundance (L2–L7) for each sample. Case–control comparisons are performed with pre-defined covariates, and the effect size of disease state is evaluated. To evaluate β-diversity, multidimensional scaling (MDS) on the Bray-Curtis dissimilarity is used. To evaluate case–control differences in the dissimilarity, OMARU performs permutational multivariate analysis of variance (PERMANOVA) (22) using the adonis() function in the R package vegan.
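The two diversity measures named above have simple closed forms, sketched here for a single sample (Shannon index) and a pair of samples (Bray-Curtis dissimilarity). The natural logarithm is an assumption, as the text does not state the base; the function names are hypothetical, and the PERMANOVA step is not reproduced.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p_i * ln p_i) over features with p_i > 0."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


even_community = [1, 1, 1, 1]
skewed_community = [97, 1, 1, 1]
h_even = shannon_index(even_community)
h_skewed = shannon_index(skewed_community)
distance = bray_curtis(even_community, skewed_community)
```

An evenly distributed community attains the maximum Shannon index for its number of features, while Bray-Curtis ranges from 0 for identical profiles to 1 for profiles with no shared features.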
RESULTS
We used two public fecal metagenomic datasets, of schizophrenia (SCZ; 90 SCZ patients and 81 healthy controls) and type 2 diabetes (T2D; 170 T2D patients and 174 healthy controls), as a practical example of operating OMARU (23,24).
In sample QC of the SCZ data, we excluded one SCZ sample that had singleton genes beyond four standard deviations and was an outlier in both the phylogenetic and gene abundance data (Figure 2A). We used a phylogenetic reference constructed by integrating the genomes registered by Nishijima et al. (25) and those newly identified from the human gut bacteria projects (9,26,27), as previously described (5,6). We had 692 clades for the SCZ case–control association test, including 10 phyla (L2), 23 classes (L3), 34 orders (L4), 69 families (L5), 156 genera (L6) and 400 species (L7). We adopted sex, age, body mass index (BMI) and the top 30 principal components as covariates. In multiple test correction, the empirically estimated Bonferroni threshold was lower than the standard Bonferroni threshold (Figure 2C). This could reflect that microbiome composition within an individual is not independent between clades. We identified three clades significantly increased in SCZ (FDR < 0.05; Figure 2D, Supplementary Figure S3, Supplementary Table S1). We had 789 clades for the T2D case–control association test and identified no clades with significant association. In both diseases, the numbers of disease-associated clades were considerably lower than those in the reference papers and other metagenome studies (23,24). This result was mainly due to the correction of hidden confounding factors. The quantile-quantile plots of P-values in the SCZ data showed that the analysis without any PCs as covariates produced severe inflation of P-values and a large number of false positives (Figure 2B). Streptococcus vestibularis, one of the three SCZ-associated clades identified by OMARU, was reported to induce deficits in social behavior and alter neurotransmitter levels in peripheral tissues in recipient mice (23). Thus, a feature of OMARU is its ability to identify robustly disease-associated clades by adjusting for confounding factors.
We selected the KEGG database (20) as the reference for genes and biological pathways. After gene-level QC, we retained 185 663 and 104 487 genes for the SCZ and T2D case–control comparisons, respectively. In the functional association tests, inflation of P-values was suppressed by adjusting for covariates in the same way as in the phylogenetic analyses. We identified four SCZ-associated genes, four SCZ-associated pathways, and four T2D-associated pathways (FDR < 0.05; Figure 3A, Supplementary Tables S2 and S3). In the analysis of links between the phylogenetic and gene data, Vpar_1847, one of the four SCZ-associated genes, was estimated to be derived from multiple Veillonella spp. (Figure 3B, Supplementary Figure S2). These clades were not significantly associated in our phylogenetic analyses, while their increase in SCZ was highlighted in the referenced paper (23). The cross-sectional assessment by OMARU suggests that this gene may be an essential factor in the effect of Veillonella spp. on SCZ pathogenesis.
As for the MWAS–GWAS interaction, we used PASCAL with summary statistics from the SCZ GWAS (22 778 cases and 35 362 controls) (28) and the T2D GWAS (77 418 cases and 433 5440 controls) (29) to determine GO term enrichment of the human genome. We compared the P-values of each GO term shared between the metagenome data and GWAS data. We found significant overlaps between the pathways enriched in the MWAS and GWAS (PFisher = 0.011 and 0.008 in SCZ and T2D, respectively; Figure 3C). Our results suggest that there are disease-specific links between the human genome and the metagenome, namely MWAS–GWAS interaction, in the pathology of SCZ and T2D.
We performed case–control comparisons of α-diversity and β-diversity in the phylogenetic data (L2–L7) and the gene abundance data based on the KEGG database. In SCZ, no significant differences in α-diversity were observed in the phylogenetic data (P > 0.05/6 = 0.0083) or the gene abundance data (P = 0.134), nor were there significant differences in β-diversity (Supplementary Table S4). In T2D, α-diversity at taxonomic levels L3 and L4 (P < 8.3 × 10⁻³) and in the gene abundance data (P = 5.1 × 10⁻³) was significantly increased, and significant differences in β-diversity were observed at taxonomic levels L5–L7 (P < 8.3 × 10⁻³) and in the gene abundance data (P < 1.0 × 10⁻⁴) (Supplementary Figure S4, Supplementary Table S4).
DISCUSSION
While several bioinformatic tools for the microbiome have been developed recently (30–35), OMARU is distinctive in its focus on case–control MWAS analysis using shotgun sequencing data. In contrast to several existing tools that are limited to a single part of the analysis, such as phylogenetic or functional analysis, OMARU provides end-to-end analysis from the processing of sequencing data, such as QC of reads and samples, to the three major analyses and the assessment of diversities. Performing these analyses in a single pipeline, with integrative assessment of the results of each part, allows deeper interpretation of case–control differences in the microbiome. Further, evaluation of links between the metagenome and the host genome is one of the novel features of OMARU.
We demonstrated that OMARU yields robust and multifaceted results by using public metagenome data. OMARU identified a sample in the SCZ data to be excluded. It is quite difficult to perform sample QC manually in metagenome analyses, and a comprehensive decision based on multiple assessments is required. OMARU provides users with multifaceted data to help them make that decision. Through the statistical processing in OMARU, including the reduction of false positives, the SCZ-associated clades were narrowed down to a clade with functional support, which demonstrates the robustness of OMARU in identifying crucial biomarkers. Although hidden confounding factors would be better adjusted by integrating the covariates into the case–control model, this is not currently implemented in OMARU and is thus one of its limitations.
In addition, integrative analyses with multifaceted evaluation, such as the MWAS–GWAS interaction and the links between disease-associated genes and clades, provided a comprehensive understanding of the microbiome-associated pathology. The SCZ metagenome showed little difference in diversity compared with healthy controls, whereas the T2D metagenome showed significant differences. Diversity analysis provides evidence of the microbiome's role in disease pathology from a different angle than the other analyses. We note that metagenome analysis is still highly dependent on reference databases, and database development remains a challenge for the future.
In conclusion, OMARU, as a well-organized and user-friendly workflow, can improve the accessibility and reproducibility of MWAS in the microbiome research community. The robust and multifaceted results of OMARU, including the association with the host genome, reflect the dynamics of the microbiome authentically relevant to disease pathophysiology, leading to the identification of potential biomarkers.
DATA AVAILABILITY
OMARU is publicly available at https://github.com/toshi-kishikawa/OMARU and can be downloaded in the format of a Conda package.