Emilie Villar1,2, Thomas Vannier2, Caroline Vernette2, Magali Lescot2, Miguelangel Cuenca3, Aurélien Alexandre2, Paul Bachelerie2, Thomas Rosnet2, Eric Pelletier4, Shinichi Sunagawa3, Pascal Hingamp2.
Abstract
The Ocean Gene Atlas is a web service to explore the biogeography of genes from marine planktonic organisms. It allows users to query protein or nucleotide sequences against global ocean reference gene catalogs. With just one click, the abundance and location of target sequences are visualized on world maps, along with their taxonomic distribution. Interactive results panels allow for adjusting cutoffs for alignment quality and displaying the abundances of genes in the context of environmental features (temperature, nutrients, etc.) measured at the time of sampling. The ease of use enables non-bioinformaticians to explore quantitative and contextualized information on genes of interest in the global ocean ecosystem. Currently, the Ocean Gene Atlas is deployed with (i) the Ocean Microbial Reference Gene Catalog (OM-RGC) comprising 40 million non-redundant, mostly prokaryotic gene sequences associated with both Tara Oceans and Global Ocean Sampling (GOS) gene abundances and (ii) the Marine Atlas of Tara Oceans Unigenes (MATOU) composed of >116 million eukaryote unigenes. Additional datasets will be added upon availability of further marine environmental datasets that provide the required complement of sequence assemblies, raw reads and contextual environmental parameters. Ocean Gene Atlas is a freely available web service at: http://tara-oceans.mio.osupytheas.fr/ocean-gene-atlas/.
Year: 2018 PMID: 29788376 PMCID: PMC6030836 DOI: 10.1093/nar/gky376
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. The Ocean Gene Atlas query submission interface. (A) The query can be (i) a FASTA-format sequence, (ii) an uploaded HMM profile or (iii) an uploaded results file from a previous search. (B) Two gene catalogs are currently available: OM-RGC, a catalog of mostly prokaryotic genes from plankton metagenomes (with associated abundances from Tara Oceans and GOS biosamples), and MATOU, a catalog of mostly eukaryotic transcripts from plankton metatranscriptomes. (C) The sequence similarity search algorithm is one of BLAST, DIAMOND or HMMER. (D) E-value threshold to filter the results. (E) Selection of the number of interactive panels in the results page. (F) Optionally, notification of results availability can be sent by email.
Figure 2. The Ocean Gene Atlas interactive results panels. (A) Hit abundances are represented by the diameter of a filled circle for each sample at user-selected sampling depths (e.g. subsurface or mesopelagic). Circle colors represent the filter size fractions (e.g. [0.2–3 μm]). (B) Co-variation of hit abundances with specific environmental variables is shown on bubble plots for each sampling depth: subsurface (SRF), deep chlorophyll maximum (DCM) and mesopelagic (MES). (C) The taxonomic distribution of the hit genes' predicted origins is represented on interactive Krona plots. (D) Result files can be downloaded as tab-delimited flat files.
Figure 3. Tara Oceans data sources for the Ocean Gene Atlas workflow. Field campaigns (blue) have collected plankton biosamples and measured in situ environmental parameters. The OGA web server (yellow) combines heterogeneous data published by distinct archives (pink): EBI ENA for sequencing reads, companion websites of published articles for gene catalogs and taxonomic annotations, and PANGAEA for contextual environmental data. For GOS, metadata was manually extracted from Table 1 of Rusch et al. (6).
Figure 4. Phospholipase C (PlcP) biogeography produced by the Ocean Gene Atlas web service. (A) Abundance of PlcP hits in the OM-RGC subsurface samples. (B) Bubble plot of the PlcP abundance in relation to PO4 concentrations; DCM: Deep Chlorophyll Maximum layer, SRF: subsurface and MES: mesopelagic zone. (C) Krona plot of the taxonomic distribution of the PlcP hits.
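The E-value threshold of Figure 1D and the tab-delimited result files of Figure 2D suggest a simple post-processing step that a user might run on a downloaded file. The Python sketch below is only an illustration of that idea: the column names (`gene_id`, `e_value`, `abundance`), the sample rows and the function `filter_hits` are hypothetical, and the actual layout of Ocean Gene Atlas downloads may differ.

```python
import csv
import io


def filter_hits(tsv_text, max_e_value):
    """Keep rows of a tab-delimited hits table whose E-value is <= max_e_value.

    Assumes a header row with an 'e_value' column (a hypothetical name,
    not taken from the Ocean Gene Atlas documentation).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if float(row["e_value"]) <= max_e_value]


# Made-up example data standing in for a downloaded results file.
SAMPLE = (
    "gene_id\te_value\tabundance\n"
    "hit_A\t1e-30\t0.002\n"
    "hit_B\t1e-5\t0.010\n"
    "hit_C\t0.5\t0.001\n"
)

kept = filter_hits(SAMPLE, 1e-10)
print([row["gene_id"] for row in kept])
```

Tightening or loosening `max_e_value` mirrors, in spirit, the cutoff adjustment offered by the interactive results panels; for a real file, the text passed to `filter_hits` would be read from disk and the column name adapted to the file's actual header.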