
The genomic standards consortium: bringing standards to life for microbial ecology.

Pelin Yilmaz, Jack A Gilbert, Rob Knight, Linda Amaral-Zettler, Ilene Karsch-Mizrachi, Guy Cochrane, Yasukazu Nakamura, Susanna-Assunta Sansone, Frank Oliver Glöckner, Dawn Field.

Year: 2011    PMID: 21472015    PMCID: PMC3176512    DOI: 10.1038/ismej.2011.39

Source DB: PubMed    Journal: ISME J    ISSN: 1751-7362    Impact factor: 10.302


Adoption of easy-to-follow standards will vastly improve our ability to interpret data from genomes, metagenomes and marker studies.
Interest in sampling of diverse environments, combined with advances in high-throughput sequencing, vastly accelerates the pace at which new genomes and metagenomes are generated. For example, as of January 2011, 12 500 user-generated metagenomes had been submitted to the public MG-RAST annotation server (http://metagenomics.nmpdr.org; Meyer et al.), >90% of which were produced using high-throughput sequencing methodologies. We have entered an era of ‘mega-sequencing projects’ that include the Genomic Encyclopaedia of Bacteria and Archaea project (http://www.jgi.doe.gov/programs/GEBA), the Microbial Earth Project (http://genome.jgi-psf.org/programs/bacteria-archaea/MEP/index.jsf), the Human Microbiome Project (http://nihroadmap.nih.gov/hmp), the Metagenomics of the Human Intestinal Tract consortium (http://www.metahit.eu), the Terragenome Initiative (http://www.terragenome.org), the Tara Oceans Expedition (http://oceans.taraexpeditions.org), the National Ecological Observatory Network (NEON; http://www.neoninc.org), the International Census of Marine Microbes (ICoMM; http://icomm.mbl.edu), Microbial Inventory Research Across Diverse Aquatic Long-Term Ecological Research Sites (http://amarallab.mbl.edu/mirada/mirada.html), the Earth Microbiome Project (http://www.earthmicrobiome.org) and other funded and unfunded projects, with many more visionary projects on the horizon. Additionally, studies of emerging metatranscriptomes (community transcript profiles), metaproteomes (community protein profiles) and metametabolomes (community metabolite profiles) now complement genomes and metagenomes. Comparative studies of multi-omic data sets from the same community hold the promise of unparalleled insights into fundamental questions across a range of fields including evolution, ecology, environmental science, physiology and medicine.
Advances stem from improvements in the annotation and quantification of genes, pathways, organisms and consortia within these communities. We are just starting to exploit these technologies to understand the microbial world, and have only scratched the surface in terms of sampling microbial diversity across temporal and spatial scales (Delmotte et al., 2009; Gilbert et al., 2010). To fully exploit the promise of these data, we need both scientific innovation and community agreement on how to provide appropriate stewardship of these resources for the benefit of all. Although we have collected billions of nucleic-acid sequences from thousands of ecosystems, illuminating uncharacterized microbial lifestyles remains far from trivial. For example, in each analysed genome or metagenome, about 40% of the putative protein-coding genes cannot be assigned to any known function or taxon. Only 42% of the 61 known bacterial phyla have even a single cultured representative (Hugenholtz and Kyrpides, 2009), with the remainder being known only from 16S rRNA gene environmental surveys. Surprisingly, only 14% of cultured bacterial taxa have even a single complete genome sequenced. Holistic approaches that centralize (meta)omics data are needed, so that investigators can analyze these data within the context of space, time, habitat and characteristics of the environment. Networks of information arising from these studies will allow us to describe and predict ecological patterns of organisms, genes, transcripts and proteins. One key insight into the function of a gene or organism is the environment where it occurs. Collection of contextual (meta)data, which delineate the source of a sequence in terms of the space, time, habitat and characteristics of the environment, is thus essential for interpreting these unknown genes and species, as well as for gaining new insights into the known fraction.
Although early comparative studies of metagenomes (Tringe et al., 2005) relied on a few, deeply sequenced samples, the experience from 16S rRNA gene surveys suggests that additional insight is gained from observing spatial and temporal variation across hundreds of samples, whether examining the distribution of bacteria in soils across a continent (Lauber et al., 2009) or various skin sites from many subjects (Grice et al., 2009). At present, the valuable contextual data halo is often missing for sequences deposited in the International Nucleotide Sequence Database Collaboration (INSDC; GenBank, European Nucleotide Archive (ENA, including EMBL-Bank) and the DNA Databank of Japan (DDBJ)). This leaves researchers searching electronic resources and the literature, or contacting the authors, for even the most basic contextual data, such as geographic location, date and time of sampling or the habitat where the sample was obtained. Molecular ecologists should immediately recognize the inherent value of these data to the community, because without them their own sequence data sets will have extremely limited comparability with the wealth of other data available. Sequences without contextual data are like unlabeled cans in a supermarket: you do not know what you are purchasing until you open a can and examine its contents. The present inability to automatically retrieve rich contextual data hampers comparative research, and constitutes a considerable misuse of the vast global resources currently being applied to microbial ecology. Just as food-safety laws emphasize clear and accurate labeling based on the product, process and producer, so should sequence data be properly annotated. Standardization of the required information will greatly facilitate the annotation of sequence data. To achieve this, we must first have community collaboration and participation.
Second, as a result of this collaboration, a contextual data set must be standardized in terms of content, syntax and terminology to which the community can adhere. In 2005, members of the community came together to form the Genomic Standards Consortium (GSC), an open-membership working body with the stated mission of working towards better descriptions of our genomes, metagenomes and related data (http://www.gensc.org). Supported by the expertise of members involved in many of the aforementioned mega-sequencing projects, the GSC has formalized contextual data requirements for genomes and metagenomes as the Minimum Information about a Genome/Metagenome Sequence checklist (MIGS/MIMS) (Field et al., 2008). Furthermore, to cover the description of phylogenetic and functional marker genes, an extended standard, the Minimum Information about a MARKer gene Sequence (MIMARKS) checklist (http://gensc.org/gc_wiki/index.php/MIMARKS), has been developed (Yilmaz et al., 2011). This family of minimum information checklists provides researchers with a condensed set of contextual data requirements, which range from description of the environment to sampling and sequencing procedures. The GSC is also driving the evolution of omics data sharing in a broader context through participation in the BioSharing (http://biosharing.org) portal. This forum aims to enable a broader dialog among funders, journals, standards and technology developers, and researchers on the critical issue of data sharing within the metagenomics community and beyond (Field et al., 2009). It provides an example of what an infrastructure to support standards-compliant reporting of contextual data might look like, and it encourages and enables curation at the community level (Rocca-Serra et al., 2010; http://isatab.sourceforge.net). The primary sequence databases' adoption of these standards is integral to their success.
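As a rough illustration of what checking a sample record against a minimum contextual data set could look like, the following Python sketch tests for three fields. It is a sketch under stated assumptions, not the MIGS/MIMS/MIMARKS checklist itself: the field names (`geographic_location`, `collection_date`, `habitat`) are hypothetical and simply mirror the most basic contextual data named in the text, and the two sample records are invented for the example.

```python
from datetime import date

# Hypothetical minimal field set; these names are illustrative only and are
# NOT taken from the official GSC checklists.
REQUIRED_FIELDS = ("geographic_location", "collection_date", "habitat")


def missing_contextual_fields(record):
    """Return the required fields that are absent or empty in a sample record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        # Treat None and blank strings as "not provided".
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


def is_compliant(record):
    """A record passes this toy check when no required field is missing."""
    return not missing_contextual_fields(record)


# Invented example records.
complete = {
    "geographic_location": "54.18 N, 7.90 E",
    "collection_date": date(2010, 6, 1),
    "habitat": "coastal sea water",
}
incomplete = {"geographic_location": "", "habitat": "soil"}

print(missing_contextual_fields(complete))    # []
print(missing_contextual_fields(incomplete))  # ['geographic_location', 'collection_date']
```

A real submission would be validated against the full checklist and its controlled terminology; the point here is only that a small, agreed field list makes such checks mechanical.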
The INSDC partners have recognized this by supporting the submission of compliant data sets and by adopting an official keyword for the family of minimum standards, reserved for compliant INSDC sequence records. Additionally, the development of a number of tools and formats to aid in data exchange (Kottmann et al.) and in compliance with these standards during sequence submission is ongoing within specialized genomics and metagenomics resources. The application of high-throughput sequencing technologies has transformed the way microbial ecologists approach questions in their field (Gilbert et al.). The shift of sequencing capacity to individual labs is creating a data bonanza. With appropriate contextual information, these data sets could herald a new era of discovery for microbial ecology. This will only be possible if each study, from each environment and from each lab, maintains, at the very least, a minimum contextual data standard to facilitate cross-comparison and meta-analysis of global microbial communities. Inadequate implementation of these standards threatens progress in our field of research, as we will lose the best opportunity to produce a complete mechanistic understanding of microbial life. Every investigator will benefit immensely by being able to obtain a rapid, comprehensive answer to the question ‘Have my microbes been seen before, and, if so, where, with whom, and what were they doing?’ Only by accepting the relatively small responsibility of entering their own contextual data into a global system will they realize this dream. Just as standardized deposition of sequence data contributed an immensely valuable resource, standardization of contextual data will allow us to reap vast dividends for decades to come and enable us to finally escape the burden of ‘my sequence matches 1500 uncultured environmental isolates, now what?’
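To make the ‘now what?’ point concrete, the sketch below shows how a list of sequence matches becomes interpretable once each hit carries a habitat annotation. Everything in it is an assumption made for illustration: the `habitat` key, the accession labels and the four example hits are invented, and no real database or search tool is queried.

```python
from collections import Counter


def summarize_hits_by_habitat(hits):
    """Count sequence matches per habitat; hits lacking the field count as 'unknown'."""
    counts = Counter(hit.get("habitat") or "unknown" for hit in hits)
    # most_common() orders habitats from most to least frequent.
    return counts.most_common()


# Invented matches; in practice these would come from a similarity search
# joined to the contextual data stored with each sequence record.
hits = [
    {"accession": "hit_1", "habitat": "soil"},
    {"accession": "hit_2", "habitat": "sea water"},
    {"accession": "hit_3", "habitat": "soil"},
    {"accession": "hit_4"},  # no contextual data deposited
]

print(summarize_hits_by_habitat(hits))
# [('soil', 2), ('sea water', 1), ('unknown', 1)]
```

The larger the ‘unknown’ bucket, the less such a summary says about where a microbe has been seen before, which is the practical cost of missing contextual data.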
To provide a better understanding of the requirements, we have included three examples of MIGS-, MIMS- and MIMARKS-compliant data sets in Supplementary Table 1. Supplementary File 2 provides links to detailed submission and compliance guidelines. With this open letter to the ISME community, we hope not only to advertise the existence of the GSC and invite more microbial ecologists investigating marker genes and doing ‘omics’ work to join us, but also to call for compliance with current and future GSC standards. To learn how to describe your data according to MIGS/MIMS/MIMARKS (MIxS) standards, please visit the GSC website for details and options for submitting compliant data sets into public domain databases (http://gensc.org/gc_wiki/index.php/MIGS/MIMS/MIMARKS).
References (10 of 13 shown)

1.  Comparative metagenomics of microbial communities.

Authors:  Susannah Green Tringe; Christian von Mering; Arthur Kobayashi; Asaf A Salamov; Kevin Chen; Hwai W Chang; Mircea Podar; Jay M Short; Eric J Mathur; John C Detter; Peer Bork; Philip Hugenholtz; Edward M Rubin
Journal:  Science       Date:  2005-04-22       Impact factor: 47.728

2.  Pyrosequencing-based assessment of soil pH as a predictor of soil bacterial community structure at the continental scale.

Authors:  Christian L Lauber; Micah Hamady; Rob Knight; Noah Fierer
Journal:  Appl Environ Microbiol       Date:  2009-06-05       Impact factor: 4.792

3.  A changing of the guard.

Authors:  Philip Hugenholtz; Nikos C Kyrpides
Journal:  Environ Microbiol       Date:  2009-03       Impact factor: 5.491

4.  Community proteogenomics reveals insights into the physiology of phyllosphere bacteria.

Authors:  Nathanaël Delmotte; Claudia Knief; Samuel Chaffron; Gerd Innerebner; Bernd Roschitzki; Ralph Schlapbach; Christian von Mering; Julia A Vorholt
Journal:  Proc Natl Acad Sci U S A       Date:  2009-09-04       Impact factor: 11.205

5.  Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications.

Authors:  Pelin Yilmaz; Renzo Kottmann; Dawn Field; Rob Knight; James R Cole; Linda Amaral-Zettler; Jack A Gilbert; Ilene Karsch-Mizrachi; Anjanette Johnston; Guy Cochrane; Robert Vaughan; Christopher Hunter; Joonhong Park; Norman Morrison; Philippe Rocca-Serra; Peter Sterk; Manimozhiyan Arumugam; Mark Bailey; Laura Baumgartner; Bruce W Birren; Martin J Blaser; Vivien Bonazzi; Tim Booth; Peer Bork; Frederic D Bushman; Pier Luigi Buttigieg; Patrick S G Chain; Emily Charlson; Elizabeth K Costello; Heather Huot-Creasy; Peter Dawyndt; Todd DeSantis; Noah Fierer; Jed A Fuhrman; Rachel E Gallery; Dirk Gevers; Richard A Gibbs; Inigo San Gil; Antonio Gonzalez; Jeffrey I Gordon; Robert Guralnick; Wolfgang Hankeln; Sarah Highlander; Philip Hugenholtz; Janet Jansson; Andrew L Kau; Scott T Kelley; Jerry Kennedy; Dan Knights; Omry Koren; Justin Kuczynski; Nikos Kyrpides; Robert Larsen; Christian L Lauber; Teresa Legg; Ruth E Ley; Catherine A Lozupone; Wolfgang Ludwig; Donna Lyons; Eamonn Maguire; Barbara A Methé; Folker Meyer; Brian Muegge; Sara Nakielny; Karen E Nelson; Diana Nemergut; Josh D Neufeld; Lindsay K Newbold; Anna E Oliver; Norman R Pace; Giriprakash Palanisamy; Jörg Peplies; Joseph Petrosino; Lita Proctor; Elmar Pruesse; Christian Quast; Jeroen Raes; Sujeevan Ratnasingham; Jacques Ravel; David A Relman; Susanna Assunta-Sansone; Patrick D Schloss; Lynn Schriml; Rohini Sinha; Michelle I Smith; Erica Sodergren; Aymé Spo; Jesse Stombaugh; James M Tiedje; Doyle V Ward; George M Weinstock; Doug Wendel; Owen White; Andrew Whiteley; Andreas Wilke; Jennifer R Wortman; Tanya Yatsunenko; Frank Oliver Glöckner
Journal:  Nat Biotechnol       Date:  2011-05       Impact factor: 54.908

6.  ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level.

Authors:  Philippe Rocca-Serra; Marco Brandizi; Eamonn Maguire; Nataliya Sklyar; Chris Taylor; Kimberly Begley; Dawn Field; Stephen Harris; Winston Hide; Oliver Hofmann; Steffen Neumann; Peter Sterk; Weida Tong; Susanna-Assunta Sansone
Journal:  Bioinformatics       Date:  2010-08-02       Impact factor: 6.937

7.  The taxonomic and functional diversity of microbes at a temperate coastal site: a 'multi-omic' study of seasonal and diel temporal variation.

Authors:  Jack A Gilbert; Dawn Field; Paul Swift; Simon Thomas; Denise Cummings; Ben Temperton; Karen Weynberg; Susan Huse; Margaret Hughes; Ian Joint; Paul J Somerfield; Martin Mühling
Journal:  PLoS One       Date:  2010-11-29       Impact factor: 3.240

8.  Topographical and temporal diversity of the human skin microbiome.

Authors:  Elizabeth A Grice; Heidi H Kong; Sean Conlan; Clayton B Deming; Joie Davis; Alice C Young; Gerard G Bouffard; Robert W Blakesley; Patrick R Murray; Eric D Green; Maria L Turner; Julia A Segre
Journal:  Science       Date:  2009-05-29       Impact factor: 47.728

9.  The minimum information about a genome sequence (MIGS) specification.

Authors:  Dawn Field; George Garrity; Tanya Gray; Norman Morrison; Jeremy Selengut; Peter Sterk; Tatiana Tatusova; Nicholas Thomson; Michael J Allen; Samuel V Angiuoli; Michael Ashburner; Nelson Axelrod; Sandra Baldauf; Stuart Ballard; Jeffrey Boore; Guy Cochrane; James Cole; Peter Dawyndt; Paul De Vos; Claude DePamphilis; Robert Edwards; Nadeem Faruque; Robert Feldman; Jack Gilbert; Paul Gilna; Frank Oliver Glöckner; Philip Goldstein; Robert Guralnick; Dan Haft; David Hancock; Henning Hermjakob; Christiane Hertz-Fowler; Phil Hugenholtz; Ian Joint; Leonid Kagan; Matthew Kane; Jessie Kennedy; George Kowalchuk; Renzo Kottmann; Eugene Kolker; Saul Kravitz; Nikos Kyrpides; Jim Leebens-Mack; Suzanna E Lewis; Kelvin Li; Allyson L Lister; Phillip Lord; Natalia Maltsev; Victor Markowitz; Jennifer Martiny; Barbara Methe; Ilene Mizrachi; Richard Moxon; Karen Nelson; Julian Parkhill; Lita Proctor; Owen White; Susanna-Assunta Sansone; Andrew Spiers; Robert Stevens; Paul Swift; Chris Taylor; Yoshio Tateno; Adrian Tett; Sarah Turner; David Ussery; Bob Vaughan; Naomi Ward; Trish Whetzel; Ingio San Gil; Gareth Wilson; Anil Wipat
Journal:  Nat Biotechnol       Date:  2008-05       Impact factor: 54.908

10.  Megascience. 'Omics data sharing.

Authors:  Dawn Field; Susanna-Assunta Sansone; Amanda Collis; Tim Booth; Peter Dukes; Susan K Gregurick; Karen Kennedy; Patrik Kolar; Eugene Kolker; Mary Maxon; Siân Millard; Alexis-Michel Mugabushaka; Nicola Perrin; Jacques E Remacle; Karin Remington; Philippe Rocca-Serra; Chris F Taylor; Mark Thorley; Bela Tiwari; John Wilbanks
Journal:  Science       Date:  2009-10-09       Impact factor: 47.728

