Abigail L Lind (1), Katherine S Pollard (2,3,4,5,6)
1. Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA.
2. Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA. katherine.pollard@gladstone.ucsf.edu.
3. Institute for Human Genetics, University of California, San Francisco, CA, USA.
4. Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, USA.
5. Institute for Computational Health Sciences, University of California, San Francisco, CA, USA.
6. Chan Zuckerberg Biohub, San Francisco, CA, USA.
Abstract
BACKGROUND: Microbial eukaryotes are found alongside bacteria and archaea in natural microbial systems, including host-associated microbiomes. While microbial eukaryotes are critical to these communities, they are challenging to study with shotgun sequencing techniques and are therefore often excluded.
RESULTS: Here, we present EukDetect, a bioinformatics method to identify eukaryotes in shotgun metagenomic sequencing data. Our approach uses a database of 521,824 universal marker genes from 241 conserved gene families, which we curated from 3713 fungal, protist, non-vertebrate metazoan, and non-streptophyte archaeplastida genomes and transcriptomes. EukDetect has a broad taxonomic coverage of microbial eukaryotes, performs well on low-abundance and closely related species, and is resilient against bacterial contamination in eukaryotic genomes. Using EukDetect, we describe the spatial distribution of eukaryotes along the human gastrointestinal tract, showing that fungi and protists are present in the lumen and mucosa throughout the large intestine. We discover that there is a succession of eukaryotes that colonize the human gut during the first years of life, mirroring patterns of developmental succession observed in gut bacteria. By comparing DNA and RNA sequencing of paired samples from human stool, we find that many eukaryotes continue active transcription after passage through the gut, though some do not, suggesting they are dormant or nonviable. We analyze metagenomic data from the Baltic Sea and find that eukaryotes differ across locations and salinity gradients. Finally, we observe eukaryotes in Arabidopsis leaf samples, many of which are not identifiable from public protein databases.
CONCLUSIONS: EukDetect provides an automated and reliable way to characterize eukaryotes in shotgun sequencing datasets from diverse microbiomes. We demonstrate that it enables discoveries that would be missed or clouded by false positives with standard shotgun sequence analysis. EukDetect will greatly advance our understanding of how microbial eukaryotes contribute to microbiomes.