Paul Saary, Kristoffer Forslund, Peer Bork, Falk Hildebrand.
Abstract
MOTIVATION: The rapidly expanding microbiomics field is generating increasingly large datasets that characterize the microbiota in diverse environments. Although classical numerical ecology methods provide a robust statistical framework for their analysis, currently available software is inadequate for large datasets and for some computationally intensive tasks, such as rarefaction and associated analyses.
Year: 2017 PMID: 28398468 PMCID: PMC5870771 DOI: 10.1093/bioinformatics/btx206
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. (A) Speed and memory requirements of different rarefaction programs. Four datasets were each rarefied 20 times to 95% of the lowest sample count. The time and memory consumption of our implementation are consistently below those of mothur, vegan and QIIME for the same task. vegan failed to process the Tara table (see Supplementary material). (B) The R package implements plotting of collector curves as well as rarefaction curves. (A color version of this figure is available at Bioinformatics online.)
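The rarefaction scheme named in the legend (repeated subsampling of every sample to 95% of the lowest sample count) can be illustrated with a minimal Python sketch. This is not RTK's implementation: the function names `rarefy`, `rarefy_table` and `richness`, and the toy count table, are assumptions made for illustration, and the naive read expansion used here would not scale to matrices such as the Tara table.

```python
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from one sample's counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the sample's total count")
    # One list entry per read, labelled with its feature index.
    reads = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = Counter(rng.sample(reads, depth))
    return [picked[i] for i in range(len(counts))]


def rarefy_table(table, percent=95, repeats=20, seed=0):
    """Rarefy every sample `repeats` times to `percent`% of the smallest total."""
    rng = random.Random(seed)
    depth = min(sum(c) for c in table.values()) * percent // 100
    rarefied = {
        name: [rarefy(c, depth, rng) for _ in range(repeats)]
        for name, c in table.items()
    }
    return depth, rarefied


def richness(counts):
    """Number of features observed at least once."""
    return sum(1 for c in counts if c > 0)


# Toy table (hypothetical data): sample name -> counts per feature.
toy = {
    "s1": [50, 30, 20, 0],
    "s2": [10, 5, 5, 180],
    "s3": [100, 100, 100, 100],
}
depth, rarefied = rarefy_table(toy)
# Average a diversity measure (here: richness) over the repeated draws.
mean_richness = {
    name: sum(richness(r) for r in reps) / len(reps)
    for name, reps in rarefied.items()
}
print(depth, mean_richness)
```

Averaging a diversity measure over the repeated draws, as in the last step, reduces the noise that a single random subsample would introduce.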
Time and memory consumption when rarefying the Tara gene abundance matrix five times to 2.3 M counts per sample (from an average of 139 M counts per sample)
| Software (mode) | Runtime | Max. memory | Success |
|---|---|---|---|
| RTK (memory) | 3:50 h | 140 Gb | successful |
| RTK (swap) | 3:30 h | 8.5 Gb | successful |
| R RTK (memory) | 3:30 h | 140 Gb | successful |
| R RTK (swap) | 3:05 h | 8.7 Gb | successful |
| QIIME | 21:50 h | 339 Gb | successful |
| vegan | – | 387 Gb | failed |
| mothur | 17:30 h | 262 Gb | successful |
Note: While RTK could return the rarefied data, mothur only reports diversity.
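To make the differences in the table easier to compare, the following sketch converts the reported runtimes to minutes and expresses each successful run relative to one baseline. The figures are copied from the table; the helper `to_minutes` and the choice of RTK (swap) as the baseline are assumptions of this sketch, not part of the original analysis.

```python
def to_minutes(hmm):
    """Convert an 'h:mm' runtime string to minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


# (runtime, max. memory in Gb) copied from the successful runs in the table.
runs = {
    "RTK (memory)": ("3:50", 140.0),
    "RTK (swap)": ("3:30", 8.5),
    "R RTK (memory)": ("3:30", 140.0),
    "R RTK (swap)": ("3:05", 8.7),
    "QIIME": ("21:50", 339.0),
    "mothur": ("17:30", 262.0),
}

baseline = "RTK (swap)"  # choice of baseline is an assumption of this sketch
base_minutes = to_minutes(runs[baseline][0])
base_memory = runs[baseline][1]

# Ratio of each run's runtime and peak memory to the baseline run.
relative = {
    name: (to_minutes(runtime) / base_minutes, memory / base_memory)
    for name, (runtime, memory) in runs.items()
}
for name, (time_ratio, mem_ratio) in relative.items():
    print(f"{name}: {time_ratio:.2f}x runtime, {mem_ratio:.1f}x memory")
```

vegan is left out because the table reports no runtime for its failed run.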