Paweł P Łabaj (1,2), David P Kreil (3)
1. APART Fellow, Austrian Academy of Sciences, Vienna, Austria. pawel.labaj@boku.ac.at.
2. Chair of Bioinformatics Research Group, Boku University, Vienna, Austria.
3. Chair of Bioinformatics Research Group, Boku University, Vienna, Austria.
Abstract
BACKGROUND: The MAQC/SEQC consortium has recently compiled a key benchmark that can serve for testing the latest developments in analysis tools for microarray and RNA-seq expression profiling. Such objective benchmarks are required for basic and applied research, and can be critical for clinical and regulatory outcomes. Going beyond the first comparisons presented in the original SEQC study, we here present extended benchmarks including effect strengths typical of common experiments.
RESULTS: With artefacts removed by factor analysis and additional filters, for genome-scale surveys, the reproducibility of differential expression calls typically exceeds 80% for all tool combinations examined. This directly reflects the robustness of results and reproducibility across different studies. Similar improvements are observed for the top-ranked candidates with the strongest relative expression change, although here some tools clearly perform better than others, with typical reproducibility ranging from 60 to 93%.
CONCLUSIONS: In our benchmark of alternative tools for RNA-seq data analysis we demonstrated the benefits that can be gained by analysing results in the context of other experiments employing a reference standard sample. This allowed the computational identification and removal of hidden confounders, for instance, by factor analysis. In itself, this already substantially improved the empirical False Discovery Rate (eFDR) without changing the overall landscape of sensitivity. Further filtering of false positives, however, is required to obtain acceptable eFDR levels. Appropriate filters noticeably improved agreement of differentially expressed genes both across sites and between alternative differential expression analysis pipelines.
REVIEWERS: An extended abstract of this research paper was selected for the CAMDA Satellite Meeting to ISMB 2015 by the CAMDA Programme Committee. The full research paper then underwent one round of Open Peer Review under a responsible CAMDA Programme Committee member, Lan Hu, PhD (Bio-Rad Laboratories, Digital Biology Center-Cambridge). Open Peer Review was provided by Charlotte Soneson, PhD (University of Zürich) and Michał Okoniewski, PhD (ETH Zürich). The Reviewer Comments section shows the full reviews and author responses.
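To make the two quantities named in the abstract concrete, the following is a minimal sketch of how agreement between two sets of differential expression calls and an empirical false discovery rate might be computed. The definitions used here (overlap divided by union, and calls from a same-sample comparison divided by calls from the real comparison) are illustrative assumptions, not necessarily the definitions used in the paper; the function names and the toy gene identifiers are hypothetical.

```python
def call_agreement(calls_a, calls_b):
    """Fraction of the union of two call sets that is shared by both.

    Assumed definition for illustration only.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def empirical_fdr(null_calls, test_calls):
    """Calls in a same-sample (null) comparison relative to calls in the
    real comparison. Assumed definition for illustration only.
    """
    n_test = len(set(test_calls))
    return len(set(null_calls)) / n_test if n_test else 0.0


# Toy data: hypothetical gene identifiers called at two sites.
site1_calls = {"g1", "g2", "g3", "g4"}
site2_calls = {"g2", "g3", "g4", "g5"}
# Hypothetical calls from comparing a reference sample against itself.
null_calls = {"g9"}

agreement = call_agreement(site1_calls, site2_calls)
efdr = empirical_fdr(null_calls, site1_calls)
print(f"agreement = {agreement:.2f}, eFDR = {efdr:.2f}")
```

Under these assumed definitions, filtering steps that remove false positives would lower the second quantity and, if the removed calls are site-specific, raise the first.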