
Computational eco-systems biology in Tara Oceans: translating data into knowledge.

Shinichi Sunagawa, Eric Karsenti, Chris Bowler, Peer Bork.

Year:  2015        PMID: 25999085      PMCID: PMC4461402          DOI: 10.15252/msb.20156272

Source DB:  PubMed          Journal:  Mol Syst Biol        ISSN: 1744-4292            Impact factor:   11.429


In molecular systems biology, data are flooding us at an ever-increasing pace, from genomic to transcriptomic and proteomic information, complemented by spatially and time-resolved data obtained at multiple scales. Computational biology usually integrates these layers of information at the cellular and biochemical levels. But how does the interplay between experimental and computational biology work if this information comes not only from a cellular system but from an entire ecosystem, and if this ecosystem spans the entire Earth? Here, we illustrate some of the computational challenges and promises of large-scale eco-systems biology studies (Raes & Bork, 2008) in the context of the Tara Oceans project (Fig 1), which has arguably been one of the wettest wet laboratory experiments ever.
Figure 1

Translating Tara Oceans' data deluge into knowledge.

From 2009 to 2013, the schooner Tara sampled ocean plankton spanning several orders of magnitude in size at 210 stations over a range of depths (down to 2,000 m) around the world's oceans, together with various oceanographic measures such as temperature, salinity and nutrient concentrations, as well as visual monitoring of plankton far beyond the resolution of the naked eye (Bork et al, 2015). This adventurous wet part of the project obviously needed a dry counterpart on land, not only mastering study design, standards (e.g. standard operational protocols), archiving and logistics (Pesant et al, 2015), but also the arduous yet exciting part of translating heterogeneous data into knowledge at a truly planetary scale. To tackle this grand challenge, an interdisciplinary team was formed early on to take an integrative approach and to maximize the interactions between fields and people, with the objective of promoting research that was beyond what individual laboratories could accomplish alone (Karsenti, 2015). In total, > 35,000 samples, each one with an individual barcode and with contextual metadata, were collected for morphological, environmental and genomic analysis. Of the latter, a subset of ca. 600 samples had been prioritized early on to balance biogeographical coverage and analysis costs (Table 1). At regular meetings and telephone conferences, consortium scientists were able to build a network of the different methodological approaches, biomolecular data types, diverse organism groups and oceanographic parameters, and then to superimpose a range of global and discipline-specific questions that could be addressed with these data.
Table 1

Taxonomic and genetic diversity analysed to date by the Tara Oceans project.

                        Eukaryotes              Prokaryotes           Viruses
Taxonomic diversity
  Method                18S rRNA (PCR tags)     16S rRNA (mitags)     Contigs (assembly)
  Detected diversity    110 k OTUs              35 k OTUs             5.5 k populations
  Novel taxa            23 k OTUs               ND                    ND
  Samples               334                     139                   43
  Stations              47                      67                    26
Genetic diversity
  Method                Metatranscriptomics     Metagenomics          Metagenomics
  Detected diversity    7.6 M genes (a)         40 M genes (a)        1 M proteins (b)
  Novel genes           > 30% (c)               > 80% (a)             < 20% (b)
  Samples               29                      243                   43
  Stations              3                       68                    26

Cells highlighted in yellow: only those data are, in principle, comparable but even here station numbers and filters differ.

(a) Based on clustering at 95% nucleotide sequence similarity.

(b) Based on clustering at 60% protein sequence identity.

(c) Based on taxonomic assignments.

To enable synchronization of the different laboratories, the first steps of the analysis involved data standardization, normalization, quality control and public deposition of the data. For example, signal profiles that had been recorded in situ by numerous instruments had to be calibrated and validated, data from satellites and autonomous floats were integrated, and on land, analyses of samples added further data such as nutrients, pigments and carbonate chemistry to yield comprehensive environmental data. Digital images were analysed to extract features describing the shape and diversity of the captured organisms. Trillions of sequenced DNA base pairs were translated into organismal abundance and diversity using 16S and 18S rRNA gene data on the one hand and assembled into genes and genomes based on metagenomics and metatranscriptomics on the other hand (Jaillon et al, in preparation; Sunagawa et al, 2015). For the analysis of metagenomics data alone, millions of CPU hours distributed over high-performance clusters with terabyte memory nodes were required to solve this gigantic puzzle. The public deposition of the raw and derived data was a challenge on its own, not only due to their sheer volume (to date, 11.5 terabytes), but also due to the need to contextualize and cross-link data from heterogeneous sources. But with the much-appreciated support from the European Bioinformatics Institute (EBI) and the environmental data publisher PANGAEA (www.pangaea.de), it was finally accomplished.
As a first integration milestone, a number of general resources were created: an 18S rRNA gene-based census of eukaryotic biodiversity and an ocean microbial reference gene catalogue; the latter derived from the analysis of organisms filtered by size to enrich viral and prokaryotic content. Both biomolecule-based data types give insights into the biodiversity of the world's oceans at unprecedented scale (Sunagawa et al, 2015; de Vargas et al, 2015). The resources, together with other data types (e.g. microscopy images, environmental parameters and oceanographic measurements), were then utilized to establish an overview of DNA virus distribution in the oceans (Brum et al, 2015), to derive global species interaction networks across all domains of life and viruses (Lima-Mendez et al, 2015) and to integrate oceanographic and biological data to study plankton dispersal at a major chokepoint of global ocean circulation (Villar et al, 2015). These studies exemplify how an ecosystems biology approach can be used to interpret molecular data in the context of planetary-scale processes such as ocean currents, temperature gradients and nutrient cycles. Analysis of the Tara Oceans' data is likely to continue for years, perhaps decades. Together with other data sources and types, the Tara Oceans' data sets should contribute to a comprehensive parts list of organisms, genes and genomes in our oceans, although challenges in data comparability still need to be addressed (Box 1). The current data should also be amended, for example, with a dissection of temporal and seasonal variation at global scale, which could be achieved by simultaneously and repeatedly collecting samples of the global ocean. To this end, initiatives of crowd-sourced research are already on the horizon (www.oceansamplingday.org). 
The increasing quantity, quality and resolution of such data will make it possible to address global-scale phenomena, and by integrating molecular data, to test constraints on biodiversity, dispersal and evolution at various spatial and temporal scales, for example. We anticipate that with advances in ‘omics’ technologies, deciphering the features that are consistent within and across Earth's ecosystems as well as the mechanisms that drive them over seasonal and evolutionary time scales has now become a little more science than just fiction.

Box 1

Tara Oceans released a massive amount of primary and derived data along with the publication of their initial results. For example, a data volume of ca. 13 terabytes has already been archived at the EBI (PRJEB402); however, many data types still cannot be easily compared as methodological details and context differ. Due to differing biological features in the different organism classes and due to funding constraints, different methods were applied to capture biodiversity. For example, metagenomics could not be afforded for eukaryotes, since only a very small fraction of the large genomes are protein-coding. Also, because of missing methodological standards, direct comparison of these data is challenging. For example, due to difficulties in delineating species based on molecular data alone, the term operational taxonomic unit (OTU) is commonly used to define a taxonomic group based on sequence similarity of select taxonomic marker genes. However, the 18S and 16S rRNA genes are used for eukaryotes and prokaryotes, respectively, which differ in diversification rates and operational taxonomic definitions. Moreover, as viruses lack any universal genes that could be used for consistent taxonomic classification, long contiguous sequences of assembled viral genomes were used as an alternative approach to quantify viral populations.
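To make the OTU idea concrete, the following is a minimal sketch of greedy, seed-based clustering by sequence identity. It is not the Tara Oceans pipeline: the function names (`identity`, `greedy_otus`), the toy reads and the thresholds are assumptions chosen for illustration, and identity is computed naively on pre-aligned, equal-length sequences rather than by a real aligner.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs: list[str], threshold: float) -> list[list[str]]:
    """Assign each sequence to the first OTU whose seed it matches at >= threshold."""
    otus: list[list[str]] = []  # each OTU is a list; element 0 is its seed
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            otus.append([s])  # no seed was similar enough: start a new OTU
    return otus


# Hypothetical toy marker-gene fragments (not real data).
reads = [
    "ACGTACGTAC",
    "ACGTACGTAT",  # one mismatch to the first read
    "TTTTGGGGCC",
]
loose = greedy_otus(reads, threshold=0.90)
strict = greedy_otus(reads, threshold=0.97)
print(len(loose), len(strict))
```

The same reads yield a different number of OTUs depending on the threshold, which is one reason why OTU counts obtained with different marker genes and operational definitions are hard to compare directly.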
On the other hand, for studying genetic diversity, similar gene definitions were used for metagenomically characterized prokaryotic genes and metatranscriptomically derived eukaryotic genes. However, sequencing depths, sample numbers, gene lengths, genome sizes and many other parameters differ and need normalization before sensible comparisons can be made (Table 1). Thus, despite a 1,000-fold increase of data over earlier ocean surveys (Rusch et al, 2007), the established Tara Oceans' resources are only the tip of an iceberg when attempting to collect planetary biodiversity. While representing a promising start to collect the molecular and taxonomic parts lists of the contemporary ocean, Tara Oceans has a lot of work ahead to connect these into species interactions and their functional meaning in the context of the environment.
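As an illustration of why such normalization matters, here is a minimal sketch that scales raw read counts by gene length and by sequencing depth (a reads-per-kilobase-per-million style measure). This is a commonly used generic scheme, not necessarily the one applied by the consortium; the function name `per_kb_per_million`, the gene names and all numbers are hypothetical.

```python
def per_kb_per_million(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts by gene length (per kb) and sample depth (per million reads)."""
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("sample contains no mapped reads")
    return {
        gene: n * 1e9 / (lengths_bp[gene] * total_reads)  # 1e9 = 1e3 (kb) * 1e6 (million)
        for gene, n in counts.items()
    }


# Hypothetical gene lengths in base pairs.
lengths = {"geneX": 1_000, "geneY": 3_000}

# Two hypothetical samples with identical composition, sequenced at 10-fold different depth.
shallow = {"geneX": 100, "geneY": 300}
deep = {"geneX": 1_000, "geneY": 3_000}

print(per_kb_per_million(shallow, lengths))
print(per_kb_per_million(deep, lengths))
```

The raw counts differ 10-fold between the two samples and 3-fold between the two genes, yet the normalized values agree, because the differences were due only to sequencing depth and gene length. Parameters such as genome size or sample number would need their own treatment and are not covered by this sketch.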
  9 in total

1.  Molecular eco-systems biology: towards an understanding of community function.

Authors:  Jeroen Raes; Peer Bork
Journal:  Nat Rev Microbiol       Date:  2008-09       Impact factor: 60.633

2.  Ocean plankton. Determinants of community structure in the global plankton interactome.

Authors:  Gipsi Lima-Mendez; Karoline Faust; Nicolas Henry; Johan Decelle; Sébastien Colin; Fabrizio Carcillo; Samuel Chaffron; J Cesar Ignacio-Espinosa; Simon Roux; Flora Vincent; Lucie Bittner; Youssef Darzi; Jun Wang; Stéphane Audic; Léo Berline; Gianluca Bontempi; Ana M Cabello; Laurent Coppola; Francisco M Cornejo-Castillo; Francesco d'Ovidio; Luc De Meester; Isabel Ferrera; Marie-José Garet-Delmas; Lionel Guidi; Elena Lara; Stéphane Pesant; Marta Royo-Llonch; Guillem Salazar; Pablo Sánchez; Marta Sebastian; Caroline Souffreau; Céline Dimier; Marc Picheral; Sarah Searson; Stefanie Kandels-Lewis; Gabriel Gorsky; Fabrice Not; Hiroyuki Ogata; Sabrina Speich; Lars Stemmann; Jean Weissenbach; Patrick Wincker; Silvia G Acinas; Shinichi Sunagawa; Peer Bork; Matthew B Sullivan; Eric Karsenti; Chris Bowler; Colomban de Vargas; Jeroen Raes
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

3.  Ocean plankton. Environmental characteristics of Agulhas rings affect interocean plankton transport.

Authors:  Emilie Villar; Gregory K Farrant; Michael Follows; Laurence Garczarek; Sabrina Speich; Stéphane Audic; Lucie Bittner; Bruno Blanke; Jennifer R Brum; Christophe Brunet; Raffaella Casotti; Alison Chase; John R Dolan; Fabrizio d'Ortenzio; Jean-Pierre Gattuso; Nicolas Grima; Lionel Guidi; Christopher N Hill; Oliver Jahn; Jean-Louis Jamet; Hervé Le Goff; Cyrille Lepoivre; Shruti Malviya; Eric Pelletier; Jean-Baptiste Romagnan; Simon Roux; Sébastien Santini; Eleonora Scalco; Sarah M Schwenck; Atsuko Tanaka; Pierre Testor; Thomas Vannier; Flora Vincent; Adriana Zingone; Céline Dimier; Marc Picheral; Sarah Searson; Stefanie Kandels-Lewis; Silvia G Acinas; Peer Bork; Emmanuel Boss; Colomban de Vargas; Gabriel Gorsky; Hiroyuki Ogata; Stéphane Pesant; Matthew B Sullivan; Shinichi Sunagawa; Patrick Wincker; Eric Karsenti; Chris Bowler; Fabrice Not; Pascal Hingamp; Daniele Iudicone
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

4.  Ocean plankton. Eukaryotic plankton diversity in the sunlit ocean.

Authors:  Colomban de Vargas; Stéphane Audic; Nicolas Henry; Johan Decelle; Frédéric Mahé; Ramiro Logares; Enrique Lara; Cédric Berney; Noan Le Bescot; Ian Probert; Margaux Carmichael; Julie Poulain; Sarah Romac; Sébastien Colin; Jean-Marc Aury; Lucie Bittner; Samuel Chaffron; Micah Dunthorn; Stefan Engelen; Olga Flegontova; Lionel Guidi; Aleš Horák; Olivier Jaillon; Gipsi Lima-Mendez; Julius Lukeš; Shruti Malviya; Raphael Morard; Matthieu Mulot; Eleonora Scalco; Raffaele Siano; Flora Vincent; Adriana Zingone; Céline Dimier; Marc Picheral; Sarah Searson; Stefanie Kandels-Lewis; Silvia G Acinas; Peer Bork; Chris Bowler; Gabriel Gorsky; Nigel Grimsley; Pascal Hingamp; Daniele Iudicone; Fabrice Not; Hiroyuki Ogata; Stephane Pesant; Jeroen Raes; Michael E Sieracki; Sabrina Speich; Lars Stemmann; Shinichi Sunagawa; Jean Weissenbach; Patrick Wincker; Eric Karsenti
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

5.  Ocean plankton. Structure and function of the global ocean microbiome.

Authors:  Shinichi Sunagawa; Luis Pedro Coelho; Samuel Chaffron; Jens Roat Kultima; Karine Labadie; Guillem Salazar; Bardya Djahanschiri; Georg Zeller; Daniel R Mende; Adriana Alberti; Francisco M Cornejo-Castillo; Paul I Costea; Corinne Cruaud; Francesco d'Ovidio; Stefan Engelen; Isabel Ferrera; Josep M Gasol; Lionel Guidi; Falk Hildebrand; Florian Kokoszka; Cyrille Lepoivre; Gipsi Lima-Mendez; Julie Poulain; Bonnie T Poulos; Marta Royo-Llonch; Hugo Sarmento; Sara Vieira-Silva; Céline Dimier; Marc Picheral; Sarah Searson; Stefanie Kandels-Lewis; Chris Bowler; Colomban de Vargas; Gabriel Gorsky; Nigel Grimsley; Pascal Hingamp; Daniele Iudicone; Olivier Jaillon; Fabrice Not; Hiroyuki Ogata; Stephane Pesant; Sabrina Speich; Lars Stemmann; Matthew B Sullivan; Jean Weissenbach; Patrick Wincker; Eric Karsenti; Jeroen Raes; Silvia G Acinas; Peer Bork
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

6.  Ocean plankton. Patterns and ecological drivers of ocean viral communities.

Authors:  Jennifer R Brum; J Cesar Ignacio-Espinoza; Simon Roux; Guilhem Doulcier; Silvia G Acinas; Adriana Alberti; Samuel Chaffron; Corinne Cruaud; Colomban de Vargas; Josep M Gasol; Gabriel Gorsky; Ann C Gregory; Lionel Guidi; Pascal Hingamp; Daniele Iudicone; Fabrice Not; Hiroyuki Ogata; Stéphane Pesant; Bonnie T Poulos; Sarah M Schwenck; Sabrina Speich; Celine Dimier; Stefanie Kandels-Lewis; Marc Picheral; Sarah Searson; Peer Bork; Chris Bowler; Shinichi Sunagawa; Patrick Wincker; Eric Karsenti; Matthew B Sullivan
Journal:  Science       Date:  2015-05-22       Impact factor: 47.728

7.  Tara Oceans. Tara Oceans studies plankton at planetary scale. Introduction.

Authors:  P Bork; C Bowler; C de Vargas; G Gorsky; E Karsenti; P Wincker
Journal:  Science       Date:  2015-05-21       Impact factor: 47.728

8.  The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific.

Authors:  Douglas B Rusch; Aaron L Halpern; Granger Sutton; Karla B Heidelberg; Shannon Williamson; Shibu Yooseph; Dongying Wu; Jonathan A Eisen; Jeff M Hoffman; Karin Remington; Karen Beeson; Bao Tran; Hamilton Smith; Holly Baden-Tillson; Clare Stewart; Joyce Thorpe; Jason Freeman; Cynthia Andrews-Pfannkoch; Joseph E Venter; Kelvin Li; Saul Kravitz; John F Heidelberg; Terry Utterback; Yu-Hui Rogers; Luisa I Falcón; Valeria Souza; Germán Bonilla-Rosso; Luis E Eguiarte; David M Karl; Shubha Sathyendranath; Trevor Platt; Eldredge Bermingham; Victor Gallardo; Giselle Tamayo-Castillo; Michael R Ferrari; Robert L Strausberg; Kenneth Nealson; Robert Friedman; Marvin Frazier; J Craig Venter
Journal:  PLoS Biol       Date:  2007-03       Impact factor: 8.029

9.  The making of Tara Oceans: funding blue skies research for our Blue Planet.

Authors:  Eric Karsenti
Journal:  Mol Syst Biol       Date:  2015-05-21       Impact factor: 11.429
