BACKGROUND: Metagenomic next-generation sequencing (mNGS) has enabled the rapid, unbiased detection and identification of microbes without pathogen-specific reagents, culturing, or a priori knowledge of the microbial landscape. mNGS data analysis requires a series of computationally intensive processing steps to accurately determine the microbial composition of a sample. Existing mNGS data analysis tools typically require bioinformatics expertise and access to local server-class hardware resources. For many research laboratories, this presents an obstacle, especially in resource-limited environments.

FINDINGS: We present IDseq, an open-source, cloud-based metagenomics pipeline and service for global pathogen detection and monitoring (https://idseq.net). The IDseq Portal accepts raw mNGS data, performs host and quality filtration steps, and then executes an assembly-based alignment pipeline that assigns reads and contigs to taxonomic categories. The taxonomic relative abundances are reported and visualized in an easy-to-use web application to facilitate data interpretation and hypothesis generation. Furthermore, IDseq supports environmental background model generation and automatic internal spike-in control recognition, providing statistics that are critical for data interpretation. IDseq was designed with the specific intent of detecting novel pathogens. Here, we benchmark novel virus detection capability using both synthetically evolved viral sequences and real-world samples, including IDseq analysis of a nasopharyngeal swab sample acquired and processed locally in Cambodia from a tourist from Wuhan, China, infected with the recently emergent SARS-CoV-2.

CONCLUSION: The IDseq Portal reduces the barrier to entry for mNGS data analysis and enables bench scientists, clinicians, and bioinformaticians to gain insight from mNGS datasets for both known and novel pathogens.
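The abstract mentions taxonomic relative abundances and statistics derived from an environmental background model, but does not say how either is computed. The following is a minimal sketch of one common way to express both, assuming abundance is scaled to reads per million and that the background statistic is a per-taxon z-score against a set of control samples. The function names (`reads_per_million`, `background_z_scores`) and the choice of z-score are illustrative assumptions, not IDseq's documented implementation.

```python
from statistics import mean, stdev


def reads_per_million(taxon_counts, total_reads):
    """Scale raw per-taxon read counts to reads per million sequenced reads.

    Hypothetical helper: the abstract does not specify the normalization used.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {
        taxon: count * 1_000_000 / total_reads
        for taxon, count in taxon_counts.items()
    }


def background_z_scores(sample_rpm, background_rpms):
    """Compare each taxon in a sample with the same taxon in control samples.

    Assumption: the background model is a list of per-control abundance
    dictionaries, and a taxon missing from a control counts as 0 there.
    Returns None for a taxon whose background has zero variance, since a
    z-score is undefined in that case.
    """
    if len(background_rpms) < 2:
        raise ValueError("need at least two background samples")
    scores = {}
    for taxon, rpm in sample_rpm.items():
        control_values = [bg.get(taxon, 0.0) for bg in background_rpms]
        mu = mean(control_values)
        sigma = stdev(control_values)  # sample standard deviation
        scores[taxon] = None if sigma == 0 else (rpm - mu) / sigma
    return scores
```

In this sketch, a taxon that is abundant in the sample but rare and stable across controls receives a large positive score, while a taxon that appears at similar levels in the controls scores near zero and is more likely to be environmental or reagent background.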