Guilhem Sempéré (1,2,3), Adrien Pétel (4), Magsen Abbé (1,3), Pierre Lefeuvre (4), Philippe Roumagnac (5,6), Frédéric Mahé (5,6), Gaël Baurens (1,3), Denis Filloux (5,6).
1. CIRAD, UMR INTERTRYP, F-34398 Montpellier, France.
2. South Green Bioinformatics Platform, Bioversity, CIRAD, INRAE, IRD, Montpellier, France.
3. INTERTRYP, Université de Montpellier, CIRAD, IRD, 34398 Montpellier, France.
4. CIRAD, UMR PVBMT, F-97410 St Pierre, La Réunion, France.
5. CIRAD, BGPI, 34398 Montpellier, France.
6. BGPI, INRAE, CIRAD, Institut Agro, Université de Montpellier, 34398 Montpellier, France.
Abstract
BACKGROUND: Efficiently managing large, heterogeneous data in a structured yet flexible way is a challenge to research laboratories working with genomic data. Specifically regarding both shotgun- and metabarcoding-based metagenomics, while online reference databases and user-friendly tools exist for running various types of analyses (e.g., Qiime, Mothur, Megan, IMG/VR, Anvi'o, Qiita, MetaVir), scientists lack comprehensive software for easily building scalable, searchable, online data repositories on which they can rely during their ongoing research.
RESULTS: metaXplor is a scalable, distributable, fully web-interfaced application for managing, sharing, and exploring metagenomic data. Being based on a flexible NoSQL data model, it has few constraints regarding dataset contents and thus proves useful for handling outputs from both shotgun and metabarcoding techniques. By supporting incremental data feeding and providing means to combine filters on all imported fields, it allows for exhaustive content browsing, as well as rapid narrowing to find specific records. The application also features various interactive data visualization tools, ways to query contents by BLASTing external sequences, and an integrated pipeline to enrich assignments with phylogenetic placements. The project home page provides the URL of a live instance allowing users to test the system on public data.
CONCLUSION: metaXplor allows efficient management and exploration of metagenomic data. Its availability as a set of Docker containers, making it easy to deploy on academic servers, on the cloud, or even on personal computers, will facilitate its adoption.