Alon Shaiber, A Murat Eren.
Year: 2019 PMID: 31164461 PMCID: PMC6550520 DOI: 10.1128/mBio.00725-19
Source DB: PubMed Journal: mBio Impact factor: 7.867
FIG 1 Refinement of three composite genome bins. (A to C) The top left corners of these panels display the original name of a given Espinoza et al. MAG (see Table 1 in the original study) and its estimated completion and redundancy (C/R) based on a bacterial single-copy core gene collection (10). Each concentric circle represents one of the 88 metagenomes in the original study, dendrograms show hierarchical clustering of contigs based on sequence composition and differential mean coverage across metagenomes (using Euclidean distance and Ward’s method), and each data point represents the read recruitment statistic of a given contig in a given metagenome. Arcs at the outermost layers mark contigs that belong to a refined bin along with their new completion and redundancy estimates (C/R). (D) The phylogenomic tree organizes genomes based on 37 concatenated ribosomal proteins. Coloring of genome names matches their taxonomy in NCBI, and branch colors match the consensus taxonomy of genomes they represent. Espinoza et al. reported MAG IV.A as Gracilibacteria (hence the red color); however, this phylogenomic analysis places refined MAGs under Absconditabacteria. (E) Pangenomic analysis of Espinoza et al. Saccharibacteria MAG III.A before (left) and after (right) refinement together with the Saccharibacteria genomes from panel D. Pangenomes describe 575 and 497 gene clusters, respectively, where each concentric circle represents a genome and bars correspond to the number of genes that a given genome is contributing to a given gene cluster (the maximum value is set to 2 for readability). Outermost layers mark single-copy core gene clusters to which every genome contributes precisely a single gene. We used Bowtie2 (11) to recruit reads from metagenomes, and anvi’o (12) to visualize and refine Espinoza et al. MAGs.
FAMSA (13) aligned anvi’o-reported ribosomal protein amino acid sequences, trimAl (14) curated them, and IQ-TREE (15) computed the tree for the phylogenomic analysis. Anvi’o used DIAMOND (16) and MCL (17) algorithms to determine pangenomes. A reproducible bioinformatics workflow and FASTA files for refined MAGs are available at http://merenlab.org/data/refining-espinoza-mags.
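The two pangenome conventions described for panel E (single-copy core gene clusters, and bar heights capped at 2) can be illustrated with a minimal Python sketch. It is not the anvi’o implementation: the function names, genome names, gene cluster names, and counts are hypothetical and serve only to make the definitions concrete.

```python
from typing import Dict, List

# Hypothetical toy input: gene cluster -> genome -> number of genes contributed.
GeneClusters = Dict[str, Dict[str, int]]


def single_copy_core(clusters: GeneClusters, genomes: List[str]) -> List[str]:
    """Return clusters to which every genome contributes exactly one gene."""
    return [
        name
        for name, counts in clusters.items()
        if all(counts.get(genome, 0) == 1 for genome in genomes)
    ]


def capped_counts(
    clusters: GeneClusters, genomes: List[str], cap: int = 2
) -> GeneClusters:
    """Clip per-genome gene counts at `cap`, as done for display in panel E."""
    return {
        name: {genome: min(counts.get(genome, 0), cap) for genome in genomes}
        for name, counts in clusters.items()
    }


# Made-up example data, not taken from the study.
genomes = ["genome_A", "genome_B", "genome_C"]
clusters = {
    "GC_001": {"genome_A": 1, "genome_B": 1, "genome_C": 1},  # single-copy core
    "GC_002": {"genome_A": 1, "genome_B": 3, "genome_C": 1},  # core, multi-copy
    "GC_003": {"genome_A": 1, "genome_B": 1},                 # missing in genome_C
}

scg = single_copy_core(clusters, genomes)
display = capped_counts(clusters, genomes)
print(scg)
print(display["GC_002"]["genome_B"])
```

In this toy input only `GC_001` qualifies as a single-copy core gene cluster: `GC_002` is excluded because one genome contributes more than one gene, and `GC_003` because one genome contributes none.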