Dan Luo (1), Sara Ziebell (2), Lingling An (1,2,3). (1) Department of Epidemiology and Biostatistics, College of Public Health; (2) Interdisciplinary Program in Statistics; (3) Department of Agricultural & Biosystems Engineering, University of Arizona, Tucson, AZ 85721, USA.
Abstract
Motivation: The advent of high-throughput next-generation sequencing technology has greatly advanced the field of metagenomics, in which previously unattainable information about microbial communities can now be discovered. Detecting differentially abundant features (e.g. species or genes) plays a critical role in revealing the contributors (e.g. pathogens) to the biological or medical status of microbial samples. However, currently available statistical methods lack power to detect differentially abundant features between biological or medical conditions, particularly for time series metagenomic sequencing data. We propose a novel procedure, metaDprof, built upon a spline-based method that assumes heterogeneous error, to meet the challenge of detecting differentially abundant features in metagenomic samples by comparing biological or medical conditions across time. It contains two stages: (i) global detection of features and (ii) time interval detection for the significant features. The detection procedures in both stages have sound statistical support.

Results: In comprehensive simulation studies, metaDprof shows the best performance among the methods compared. It accurately detects not only the features related to the biological condition or disease status of the samples but also the starting and ending time points of the intervals in which the differences arise. The proposed method is also applied to a real metagenomic dataset, and the results offer an interesting perspective on the relationship between the mouse gut microbiota and diet type.

Availability and Implementation: R code and an example dataset are available at https://cals.arizona.edu/~anling/sbg/software.htm.

Contact: anling@email.arizona.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.
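The two-stage idea described above can be illustrated with a minimal Python sketch. This is not the metaDprof implementation (which is distributed as R code) and it does not fit splines or model heterogeneous error: per-time-point group means stand in for the fitted curves, a simple permutation test stands in for the global detection stage, and a fixed cutoff stands in for the interval detection rule. All function names, the `threshold` parameter and the toy abundance values are assumptions made for illustration only.

```python
import random
from statistics import mean


def mean_profile(samples):
    """Per-time-point mean across samples (each sample is a list over time)."""
    return [mean(col) for col in zip(*samples)]


def area_between(group_a, group_b):
    """Sum of absolute differences between the two group mean profiles."""
    pa, pb = mean_profile(group_a), mean_profile(group_b)
    return sum(abs(x - y) for x, y in zip(pa, pb))


def global_test(group_a, group_b, n_perm=999, seed=0):
    """Stage (i) stand-in: permutation p-value for an overall profile difference."""
    rng = random.Random(seed)
    observed = area_between(group_a, group_b)
    pooled = group_a + group_b
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)  # reassign whole samples to groups at random
        if area_between(shuffled[:n_a], shuffled[n_a:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def differing_intervals(group_a, group_b, times, threshold):
    """Stage (ii) stand-in: maximal runs of time points with |mean diff| > threshold."""
    pa, pb = mean_profile(group_a), mean_profile(group_b)
    intervals, start = [], None
    for i, (x, y) in enumerate(zip(pa, pb)):
        if abs(x - y) > threshold:
            if start is None:
                start = i  # a new interval opens here
        elif start is not None:
            intervals.append((times[start], times[i - 1]))
            start = None
    if start is not None:  # interval still open at the last time point
        intervals.append((times[start], times[-1]))
    return intervals


# Toy abundances for one feature (made-up values): 3 samples per condition, 6 time points.
times = [0, 1, 2, 3, 4, 5]
group_a = [[10, 11, 10, 9, 10, 10], [9, 10, 11, 10, 10, 11], [11, 10, 9, 10, 11, 10]]
group_b = [[10, 10, 20, 21, 19, 10], [11, 9, 19, 20, 21, 10], [9, 11, 21, 19, 20, 11]]

p_value = global_test(group_a, group_b)
intervals = differing_intervals(group_a, group_b, times, threshold=5)
print(p_value, intervals)
```

In this sketch a feature would first be screened with `global_test`, and `differing_intervals` would then be run only on features that pass the screen, mirroring the order of the two stages. With only three samples per condition the permutation p-value cannot become very small, so the toy data are meant to show the workflow rather than a realistic power analysis.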