
The Earth Microbiome project: successes and aspirations.

Jack A Gilbert, Janet K Jansson, Rob Knight.   

Year: 2014; PMID: 25184604; PMCID: PMC4141107; DOI: 10.1186/s12915-014-0069-1

Source DB: PubMed; Journal: BMC Biol; ISSN: 1741-7007; Impact factor: 7.431


The Earth Microbiome Project (EMP) was launched in August 2010, with the ambitious aim of constructing a global catalogue of the uncultured microbial diversity of this planet. The primary vision of the EMP, to characterize the microbial diversity and functional potential of approximately 200,000 environmental samples, marks it as an undertaking so massive that it was at first considered to be pure folly (as late as 2012, Jonathan Eisen was quoted in Nature as saying ‘Knight and Gilbert literally talk about sampling the entire planet. It is ludicrous and not feasible - yet they are doing it’ [1]). The initial concept arose out of a Department of Energy (DOE) sponsored workshop in Snowbird, Utah, on the promise of terabase-scale sequencing, designed to inspire research ideas that would use new technology to revolutionize microbial ecology and our understanding of the microbial world [2]. Many other exciting projects also evolved from that meeting, including efforts to extend the sequencing of type strains of cultured bacterial taxa, which has itself become the Microbial Earth Project [3]. In October 2010, EMP pioneers held a small workshop at Argonne National Laboratory to determine the most effective way to jumpstart such an initiative. At this meeting, we agreed that the only feasible route to acquiring and processing 200,000 samples was crowdsourcing: soliciting donations of samples from researchers around the world. This reliance on donated samples was identified as a key flaw in the design, on the grounds that it would not be possible to convince researchers to part with samples that had been painstakingly collected so that they could be included in a single effort [4]. Fortunately, the participants’ generosity has greatly exceeded what we could have hoped for, and the crowdsourcing approach has been a success.
We initially floated this strategy as a potentially viable approach based on the precedent of existing programs that followed broadly similar designs, especially the International Census of Marine Microbes [5] and the Human Microbiome Project [6]. The basic design was founded on coordinated sample collection; standardized contextual metadata acquisition, DNA extraction, PCR, and amplicon and shotgun sequencing; and an open-source analytical platform with free, unrestricted access to both the amplicon data and the metadata immediately following completion of the analysis. Initially the effort was funded primarily by unrestricted funds available to the principal investigators through Argonne National Laboratory, Lawrence Berkeley National Laboratory, and the Howard Hughes Medical Institute, and by donations from corporate sponsors. Under this effort, the Earth Microbiome Project committee developed the standard protocols [7] and contacted and collaborated with researchers from numerous microbial ecology disciplines, spanning human, animal, plant, terrestrial, marine, freshwater, sediment, air, and built environment systems and every intersection of these ecosystems. By August 2012, less than 2 years after its initiation, the Earth Microbiome Project had processed approximately 7,000 environmental samples, generating 16S rRNA amplicon data and releasing these data through an open portal, the Quantitative Insights into Microbial Ecology (QIIME) database. In June 2013, the EMP received awards from the WM Keck Foundation and the John Templeton Foundation to support activities to bring the catalogue up to 50,000 processed samples, and as of July 2014 we have reached over 30,000 (compared with the 5,771 samples in the phase 1 Human Microbiome Project amplicon analysis [8]).
In its planning phase, the EMP proposed the co-analysis of samples using metagenomics and metabolic modeling of ecosystems. These aims are still viable, but such efforts have to date been targeted to specific environments and studies. As it stands, the EMP represents the largest effort to characterize the diversity, distribution, and structure of microbial ecosystems across the earth, achievable only through coordinated collaboration among the 166 independent research projects that make up the EMP. Although each hypothesis-driven study provided by our collaborators can tell its own story, the real power of the EMP lies in meta-analysis of these data, which allows researchers to use samples acquired from myriad ecosystems to test hypotheses in microbial ecology. Importantly, this pooled data resource also provides an unparalleled opportunity to contextualize individual studies by placing the patterns they observe in a global context. These large-scale meta-analyses can enable researchers to ask unique questions regarding the biogeography, dynamic dispersal, and ecology of the microbial planet.

Current studies, ecosystem coverage, and immediate observations

In the currently available EMP database (as of July 2014) [9] there are samples acquired from more than 200 collaborators, spanning more than 40 different biomes, which are defined as broad categories such as marine pelagic water, freshwater lake sediment, and human-associated. From a ‘30,000-foot’ perspective, the EMP is identifying the environmental characteristics that correlate with microbial community structure within and between these different biomes. However, as the EMP is a collection of individual projects, each with a core hypothesis, it is also possible to discuss the immediate observations associated with individual studies. For example, exploration of human saliva from obese versus normal-weight individuals showed that while saliva was able to alter the aromatic properties of wine, only a few microbial taxa were likely to be responsible for this [10]. This preliminary study suggests that oral microbes may influence the aromatic properties of food and drink, and thereby our satiation response. In soil systems, the EMP sequenced microbial communities from prairie soils across the Midwest of the United States of America. This ecosystem has been largely replaced by agricultural land use, and the study showed that the major shifts in community composition are driven almost exclusively by the changing relative abundance of Verrucomicrobia and its influence on carbon dynamics [11]. These analyses could help to improve prairie restoration efforts. In deep soil samples from the Russian permafrost, the EMP characterized microbial communities associated with buried organic matter, helping to identify the bacteria that were degrading the soil organic matter in these systems [12]. In deep-sea sediments from the Gulf of Mexico, the EMP data have shown how the microbial communities responded to the oil pollution from the Deepwater Horizon oil spill [13,14].
Another example of investigating human impact is the analysis of freshwater river sediments along a gradient of human influence, in which the EMP data on the microbial communities demonstrate impact-specific signals [15]. The diversity of study sites and research questions embedded in these first 30,000 samples is extraordinary, yet this is just the tip of the iceberg. Initial analysis of 10,000 of the samples identified approximately 6 million bacterial taxonomic units (genus- or species-level taxa), only a small fraction of which could be mapped to known phylogenies using 16S rRNA databases such as Greengenes [16]. The frequency and distribution of these taxa can enable us to address interesting questions, for example, regarding the distribution of taxa across different soil ecosystems; the EMP datasets suggest that there is considerable overlap in taxa between sites, with organisms that are abundant at one location being extremely rare at another, as previously demonstrated for marine sites [17]. A small number of concerns regarding the existing data have been raised by communities focusing on specific systems or taxa. For example, as with all studies using PCR, there are biases associated with the EMP PCR primers: they are not efficient at amplifying marine Pelagibacter ubique targets. As a result, new primers have been designed that should amplify Pelagibacter, an important taxon in marine systems, more efficiently; however, we need to determine how efficiently these new primers amplify all the other bacteria from other environments. A study is therefore underway to investigate whether rescuing Pelagibacter has deleterious consequences for other taxa or systems.
More broadly, because DNA extraction protocols themselves can have different biases depending on the environmental matrix from which the DNA is extracted [18], and PCR reagents can contain contaminants that may influence amplification [19], the number of potential biases that could influence analysis is large, and the key to cross-system analyses is consistent protocols. We are taking all sensible precautions to catalogue and assess potential biases: by recording all procedural and analytical variables, it will be possible to determine which specific protocol elements may influence interpretation and whether these technical sources of variation limit our ability to identify important factors structuring microbial diversity.
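
The primer-bias concern above can be illustrated with a minimal sketch that counts mismatches between a degenerate primer and candidate binding sites, using the standard IUPAC nucleotide codes. The primer and site sequences below are hypothetical toy strings invented for illustration, not the EMP primers or real 16S rRNA sequences, and each site is written in the same orientation as the primer (that is, as the sequence the primer is expected to match). Real amplification efficiency also depends on mismatch position, annealing conditions, and other factors that this sketch ignores.

```python
# IUPAC nucleotide codes: each primer symbol maps to the bases it accepts.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer: str, site: str) -> int:
    """Count positions where the site base is not accepted by the primer symbol."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have the same length")
    return sum(base not in IUPAC[symbol] for symbol, base in zip(primer, site))


# Hypothetical sequences (assumptions for illustration only).
original_primer = "ACGTHGGTA"   # H accepts A, C, or T
revised_primer = "ACGTNGGTA"    # N accepts any base
sites = {
    "taxon_A": "ACGTCGGTA",     # C at the degenerate position
    "taxon_B": "ACGTGGGTA",     # G at the degenerate position
}

for name, site in sites.items():
    print(name,
          "original:", count_mismatches(original_primer, site),
          "revised:", count_mismatches(revised_primer, site))
```

In this toy case, widening one degenerate position removes the mismatch for taxon_B without introducing one for taxon_A. Whether a real primer change behaves this well across all taxa is the empirical question the study mentioned above is meant to answer.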

Creating an EMP operational taxonomic unit table

One major challenge has been creating a master table delineating the abundance of each type of organism in each environment. With 7,000 samples for the Shenzhen meeting in 2011 [20], existing tools could barely handle the data load. In particular, the operational taxonomic unit (OTU) table, which summarizes the raw sequence data as a sample-by-OTU table of taxon abundances, strained the limits of what could be done in the traditional ‘dense’ format, in which there is a slot for the abundance of each possible taxon in each environment, even if that slot holds a zero count. Simply loading the table into memory and accessing specific taxa or samples became impossible as the dataset grew. Accordingly, we developed the Biological Observation Matrix (BIOM) file format [21], which reduced an early version of the EMP OTU table (6,164 samples by 7,082 OTUs) from 175 MB to 12 MB. Further improvement has been achieved by the recent move in BIOM 2.1 to HDF5, a file format used widely by physicists, climate scientists, and others who need random access to subsets of vast files. With these improvements, which are being developed fully open-source in the GitHub repository [22], we expect that interested parties will be able to manipulate the full EMP OTU table on their laptops rather than requiring large-scale compute resources. There are many different methods for analyzing the sequence data to obtain clusters of related sequences, each with advantages and drawbacks. For example, clustering sequences de novo produces a gold standard sequence cluster (a robust classification of a taxonomically similar group of sequences) but is very slow, while a reference-based protocol, in which sequences are matched against a reference phylogenetic tree, is very fast but throws out sequences that fail to hit a reference. Another important challenge is visualization.
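
Before turning to visualization, the dense-versus-sparse storage argument above can be made concrete with a small sketch. It first counts the slots a dense table of the quoted size would need, then shows a toy sparse table that stores only non-zero counts. The 4 bytes per count is an assumption made here for the arithmetic (the source does not say how the 175 MB figure was measured, so any agreement with it may be coincidental), and the `SparseOTUTable` class and sample and OTU names are hypothetical. This is not the BIOM format itself, only an illustration of the underlying idea.

```python
# Table dimensions quoted in the text.
N_SAMPLES, N_OTUS = 6164, 7082
BYTES_PER_COUNT = 4  # assumption for illustration, not taken from the source

# A dense table reserves one slot per (sample, OTU) pair, zero or not.
dense_slots = N_SAMPLES * N_OTUS
dense_bytes = dense_slots * BYTES_PER_COUNT


class SparseOTUTable:
    """Toy sparse table: stores only non-zero counts, keyed by sample then OTU."""

    def __init__(self):
        self._rows = {}  # sample id -> {otu id: count}

    def add(self, sample, otu, count):
        if count == 0:
            return  # zeros are implicit and take no space
        row = self._rows.setdefault(sample, {})
        row[otu] = row.get(otu, 0) + count

    def get(self, sample, otu):
        # Anything never stored reads back as zero.
        return self._rows.get(sample, {}).get(otu, 0)

    def stored_entries(self):
        return sum(len(row) for row in self._rows.values())


table = SparseOTUTable()
table.add("soil_1", "OTU_17", 42)
table.add("soil_1", "OTU_99", 3)
table.add("ocean_1", "OTU_17", 1)
table.add("ocean_1", "OTU_5", 0)   # ignored: zero count

print(f"dense slots: {dense_slots:,} "
      f"(about {dense_bytes / 1e6:.0f} MB at {BYTES_PER_COUNT} bytes each)")
print("stored sparse entries:", table.stored_entries())
print("absent entry reads as:", table.get("ocean_1", "OTU_99"))
```

The saving from a sparse layout grows with the fraction of zero entries, which is high when most taxa are absent from most samples.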
QIIME [23] is the analysis architecture primarily used by the EMP, and it has long relied on KiNG [24], a molecular graphics package, for producing three-dimensional principal coordinates plots, essentially treating the community locations as atoms in a very curious molecule. However, as the EMP dataset continued to grow and the environmental contextual data became richer, the strategy of creating a different view of the dataset colored by each field of contextual data (for example, pH, dissolved organic carbon, and each of the hundreds of other variables captured for samples in the EMP) became unwieldy. To overcome these challenges, and to provide a three-dimensional graphics component that is directly embeddable in current web technologies, we developed EMPeror [25], software that uses current web standards such as HTML5 and WebGL to display even vast datasets and to explore and recolor them dynamically.
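
The trade-off between de novo and reference-based clustering described earlier in this section can be sketched with a toy example. The code below is a simplified illustration under stated assumptions, not the procedure used by QIIME or the EMP: reads are equal-length strings, similarity is the fraction of matching positions, the reference-based step compares reads to a flat set of reference sequences rather than a tree, the de novo step is a simple greedy centroid heuristic, and the 0.9 threshold, reference set, and reads are all invented for illustration.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference(reads, references, threshold):
    """Assign each read to its best-matching reference; discard reads below threshold."""
    assigned, discarded = {}, []
    for read in reads:
        best_id, best_score = max(
            ((ref_id, identity(read, ref)) for ref_id, ref in references.items()),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            assigned.setdefault(best_id, []).append(read)
        else:
            discarded.append(read)  # no reference hit: the read is lost
    return assigned, discarded


def de_novo(reads, threshold):
    """Greedy clustering: join the first matching centroid, else found a new cluster."""
    clusters = []  # list of (centroid, members)
    for read in reads:
        for centroid, members in clusters:
            if identity(read, centroid) >= threshold:
                members.append(read)
                break
        else:
            clusters.append((read, [read]))  # read becomes a new centroid
    return clusters


# Hypothetical toy data.
references = {"ref1": "AAAAAAAAAA", "ref2": "CCCCCCCCCC"}
reads = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "GGGGGGGGGG", "GGGGGGGGGT"]

assigned, discarded = closed_reference(reads, references, threshold=0.9)
clusters = de_novo(reads, threshold=0.9)

print("reference-based clusters:", {k: len(v) for k, v in assigned.items()})
print("reads discarded (no reference hit):", len(discarded))
print("de novo clusters:", [len(members) for _, members in clusters])
```

In this toy data the two G-rich reads have no close reference, so the reference-based step drops them, while the de novo step keeps them as their own cluster. The cost is that each read in the de novo step may be compared against every existing centroid, so the work grows with the number of clusters, whereas reference-based assignment treats each read independently.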

The future

The EMP will continue to grow and adapt as new collaborators join and new technologies are adopted. Generating the taxon matrix in BIOM format for the existing 30,000 samples will help us to provide advice on the biomes and questions that should be targeted for the next 20,000 samples. We are also exploring metagenomic analyses for studies in which the data can be used to test hypotheses regarding the ecology of microbial metabolic function (for example, [11,13,15]). At present, metagenomic data associated with individual studies have been made available through traditional routes (EBI and NCBI submissions), but we are working towards explicit submission and analysis pipelines for these data, including downstream analyses such as genome assembly and metabolic pathway reconstruction. The success of the EMP lies in generating a coordinated exploration of the microbial world, and in providing data generation facilities to collaborators who previously did not have such capacity. This has been achieved primarily through open access data and analysis platforms that facilitate interpretation. As we move forward, we will continue to explore new avenues for collaboration, potentially including going beyond the Earth to explore extra-terrestrial locations.
References (18 in total)

1.  Nitrogenase genes in PCR and RT-PCR reagents: implications for studies of diversity of functional genes.

Authors:  Jonathan P Zehr; Lori L Crumbliss; Matthew J Church; Enoma O Omoregie; Bethany D Jenkins
Journal:  Biotechniques       Date:  2003-11       Impact factor: 1.993

2.  Microbes en masse: The sequencing machine.

Authors:  Virginia Gewin
Journal:  Nature       Date:  2012-07-11       Impact factor: 49.962

3.  Evidence for a persistent microbial seed bank throughout the global ocean.

Authors:  Sean M Gibbons; J Gregory Caporaso; Meg Pirrung; Dawn Field; Rob Knight; Jack A Gilbert
Journal:  Proc Natl Acad Sci U S A       Date:  2013-03-04       Impact factor: 11.205

4.  QIIME allows analysis of high-throughput community sequencing data.

Authors:  J Gregory Caporaso; Justin Kuczynski; Jesse Stombaugh; Kyle Bittinger; Frederic D Bushman; Elizabeth K Costello; Noah Fierer; Antonio Gonzalez Peña; Julia K Goodrich; Jeffrey I Gordon; Gavin A Huttley; Scott T Kelley; Dan Knights; Jeremy E Koenig; Ruth E Ley; Catherine A Lozupone; Daniel McDonald; Brian D Muegge; Meg Pirrung; Jens Reeder; Joel R Sevinsky; Peter J Turnbaugh; William A Walters; Jeremy Widmann; Tanya Yatsunenko; Jesse Zaneveld; Rob Knight
Journal:  Nat Methods       Date:  2010-04-11       Impact factor: 28.547

5.  Distinct microbial communities associated with buried soils in the Siberian tundra.

Authors:  Antje Gittel; Jiří Bárta; Iva Kohoutová; Robert Mikutta; Sarah Owens; Jack Gilbert; Jörg Schnecker; Birgit Wild; Bjarte Hannisdal; Joeran Maerz; Nikolay Lashchinskiy; Petr Capek; Hana Santrůčková; Norman Gentsch; Olga Shibistova; Georg Guggenberger; Andreas Richter; Vigdis L Torsvik; Christa Schleper; Tim Urich
Journal:  ISME J       Date:  2013-12-12       Impact factor: 10.302

6.  The Earth Microbiome Project: Meeting report of the "1 EMP meeting on sample selection and acquisition" at Argonne National Laboratory October 6 2010.

Authors:  Jack A Gilbert; Folker Meyer; Janet Jansson; Jeff Gordon; Norman Pace; James Tiedje; Ruth Ley; Noah Fierer; Dawn Field; Nikos Kyrpides; Frank-Oliver Glöckner; Hans-Peter Klenk; K Eric Wommack; Elizabeth Glass; Kathryn Docherty; Rachel Gallery; Rick Stevens; Rob Knight
Journal:  Stand Genomic Sci       Date:  2010-12-25

7.  Metagenomics reveals sediment microbial community response to Deepwater Horizon oil spill.

Authors:  Olivia U Mason; Nicole M Scott; Antonio Gonzalez; Adam Robbins-Pianka; Jacob Bælum; Jeffrey Kimbrel; Nicholas J Bouskill; Emmanuel Prestat; Sharon Borglin; Dominique C Joyner; Julian L Fortney; Diogo Jurelevicius; William T Stringfellow; Lisa Alvarez-Cohen; Terry C Hazen; Rob Knight; Jack A Gilbert; Janet K Jansson
Journal:  ISME J       Date:  2014-01-23       Impact factor: 10.302

8.  Human and environmental impacts on river sediment microbial communities.

Authors:  Sean M Gibbons; Edwin Jones; Angelita Bearquiver; Frederick Blackwolf; Wayne Roundstone; Nicole Scott; Jeff Hooker; Robert Madsen; Maureen L Coleman; Jack A Gilbert
Journal:  PLoS One       Date:  2014-05-19       Impact factor: 3.240

9.  Saliva from obese individuals suppresses the release of aroma compounds from wine.

Authors:  Paola Piombino; Alessandro Genovese; Silvia Esposito; Luigi Moio; Pier Paolo Cutolo; Angela Chambery; Valeria Severino; Elisabetta Moneta; Daniel P Smith; Sarah M Owens; Jack A Gilbert; Danilo Ercolini
Journal:  PLoS One       Date:  2014-01-22       Impact factor: 3.240

10.  EMPeror: a tool for visualizing high-throughput microbial community data.

Authors:  Yoshiki Vázquez-Baeza; Meg Pirrung; Antonio Gonzalez; Rob Knight
Journal:  Gigascience       Date:  2013-11-26       Impact factor: 6.524

