I-Min A Chen1, Victor M Markowitz2, Ken Chu2, Krishna Palaniappan2, Ernest Szeto2, Manoj Pillay2, Anna Ratner2, Jinghua Huang2, Evan Andersen2, Marcel Huntemann3, Neha Varghese3, Michalis Hadjithomas3, Kristin Tennessen3, Torben Nielsen3, Natalia N Ivanova3, Nikos C Kyrpides4.
Abstract
The Integrated Microbial Genomes with Microbiome Samples (IMG/M: https://img.jgi.doe.gov/m/) system contains annotated DNA and RNA sequence data of (i) archaeal, bacterial, eukaryotic and viral genomes from cultured organisms, (ii) single cell genomes (SCG) and genomes from metagenomes (GFM) from uncultured archaea, bacteria and viruses and (iii) metagenomes from environmental, host-associated and engineered microbiome samples. Sequence data are generated by DOE's Joint Genome Institute (JGI), submitted by individual scientists, or collected from public sequence data archives. Structural and functional annotation is carried out by JGI's genome and metagenome annotation pipelines. A variety of analytical and visualization tools provide support for examining and comparing IMG/M's datasets. IMG/M allows open access interactive analysis of publicly available datasets, while manual curation, submission and access to private datasets and computationally intensive workspace-based analysis require login/password access to its expert review (ER) companion system (IMG/M ER: https://img.jgi.doe.gov/mer/). Since the last report published in the 2014 NAR Database Issue, IMG/M's dataset content has tripled in terms of number of datasets and overall protein coding genes, while its analysis tools have been extended to cope with the rapid growth in the number and size of datasets handled by the system.
Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Year: 2016 PMID: 27738135 PMCID: PMC5210632 DOI: 10.1093/nar/gkw929
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. New Metadata from GOLD v.5. (i) IMG now provides a selection of additional metadata fields obtained from GOLD v.5. (ii) The genome publication list on the Genome Detail page shows researchers the publication reference.
Figure 2. Find Functions with KO List. (i) Newly added KEGG functions are shown: KO List, KO List w/ Stats, KEGG Module List and KEGG Module List w/ Stats. (ii) KO List shows all KO terms in the IMG database. (iii) KO Term detail page for K00003 homoserine dehydrogenase [EC:1.1.1.3] shows associated KO modules (M00017, M00018) and pathways (169 172), as well as all genomes and metagenomes with genes annotated with this KO term.
Figure 3. KEGG Module Viewer. (i) ‘KEGG Module List’ in the Find Functions menu shows a list of all KEGG modules in IMG. (ii) KEGG Module detail for M00608 2-Oxocarboxylic acid chain extension, 2-oxoglutarate => 2-oxoadipate => 2-oxopimelate => 2-oxosuberate shows the module definition and all KO terms in the module. (iii) The ‘View KO Module Map’ feature shows that selected genome Methanobrevibacter smithii DSM 2375 has genes annotated with all KO terms in this module.
Figure 4. Pairwise ANI tool. (i) ANI's landing page in IMG is found under menu item Compare Genomes -> Avg Nucleotide Ident. (ii) Pairwise ANI genome selection page; only isolate genomes can be selected. (iii) The results of the pairwise comparisons of the selected genomes. (iv) The ANI details on the genome's detail page. (v) Details of a clique. (vi) All the genomes that belong to the given clique. (vii) A list of cliques similar to the given clique. (viii) A graphical representation of the clique group.
Figure 5. ANI Same Species Plot. (i) Tool found under menu Compare Genomes -> Avg Nucleotide Ident. -> Same Species Plot. (ii) Example of comparing some Enterobacter and Pantoea.
Figure 6. ANI Cliques. (i) List of all clique types in IMG. (ii) All cliques grouped by species. (iii) All cliques grouped by taxonomy. (iv) Clique groups.
Figure 7. Metagenome binning. (i) From a metagenome detail page, a user can scroll down to find the Phylogenetic Distribution of Genes function. (ii) The phylogenetic distribution result shows that the metagenome has 19 genes with hits of more than 90% identity to C2likevirus (genus). (iii) The 19 genes all came from two scaffolds, C1828819 and C1830504, which can be selected and saved to individual workspace scaffold datasets for further analysis.
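The selection described for Figure 7 can be illustrated with a minimal sketch: keep the genes whose best hit matches a target genus above a cutoff, then group them by scaffold. This is not IMG code or an IMG API. The `GeneHit` record, the `scaffolds_for_genus` function, the gene IDs, the identity values and the scaffold `C0000001` are all hypothetical. The sketch also assumes that the 90% cutoff refers to percent identity. Only the genus name and the two scaffold IDs come from the caption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class GeneHit(NamedTuple):
    """One gene and its best taxonomic hit (hypothetical record layout)."""
    gene_id: str
    scaffold_id: str
    genus: str
    percent_identity: float


def scaffolds_for_genus(
    hits: Iterable[GeneHit], genus: str, min_identity: float = 90.0
) -> Dict[str, List[str]]:
    """Group gene IDs by scaffold for hits to `genus` above `min_identity`."""
    by_scaffold: Dict[str, List[str]] = defaultdict(list)
    for hit in hits:
        # Keep only hits to the requested genus that exceed the cutoff.
        if hit.genus == genus and hit.percent_identity > min_identity:
            by_scaffold[hit.scaffold_id].append(hit.gene_id)
    return dict(by_scaffold)


# Toy data: gene IDs and identity values are made up for illustration.
hits = [
    GeneHit("gene_a", "C1828819", "C2likevirus", 95.2),
    GeneHit("gene_b", "C1828819", "C2likevirus", 91.0),
    GeneHit("gene_c", "C1830504", "C2likevirus", 97.8),
    GeneHit("gene_d", "C1830504", "C2likevirus", 72.4),  # below the cutoff
    GeneHit("gene_e", "C0000001", "Escherichia", 99.0),  # different genus
]

result = scaffolds_for_genus(hits, "C2likevirus")
for scaffold_id, gene_ids in sorted(result.items()):
    print(scaffold_id, gene_ids)
```

The scaffolds returned this way correspond to the candidates a user would save to a workspace scaffold dataset for further analysis.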