Cedric C Laczny, Emilie E L Muller, Anna Heintz-Buschart, Malte Herold, Laura A Lebrun, Angela Hogan, Patrick May, Carine de Beaufort, Paul Wilmes.
Abstract
Linking taxonomic identity and functional potential at the population level is important for the study of mixed microbial communities and is greatly facilitated by the availability of microbial reference genomes. While the culture-independent recovery of population-level genomes from environmental samples using the binning of metagenomic data has expanded available reference genome catalogs, several microbial lineages remain underrepresented. Here, we present two reference-independent approaches for the identification, recovery, and refinement of hitherto undescribed population-level genomes. The first approach is aimed at genome recovery of varied taxa and involves multi-sample automated binning using CANOPY CLUSTERING complemented by visualization and human-augmented binning using VIZBIN post hoc. The second approach is particularly well-suited for the study of specific taxa and employs VIZBIN de novo. Using these approaches, we reconstructed a total of six population-level genomes of distinct and divergent representatives of the Alphaproteobacteria class, the Mollicutes class, the Clostridiales order, and the Melainabacteria class from human gastrointestinal tract-derived metagenomic data. Our results demonstrate that, while automated binning approaches provide great potential for large-scale studies of mixed microbial communities, these approaches should be complemented with informative visualizations because expert-driven inspection and refinements are critical for the recovery of high-quality population-level genomes.
Keywords: binning; genome recovery; metagenome; reference genomes; refinement
Year: 2016 PMID: 27445992 PMCID: PMC4914512 DOI: 10.3389/fmicb.2016.00884
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Genomic and functional features of refined and re-assembled population-level genomes.
| Population-level genome | MGS00153 | MGS00248 | MGS00113-CG02 | CLSG01 | CLSG02 | CLSG03 |
|---|---|---|---|---|---|---|
| Originating sample | M2-3V2 | M2-1V2 | M2-1V1 | M1-4V3 | M2-1V2 | M2-2V2 |
| Size (bp) | 1,954,779 | 1,555,611 | 2,970,300 | 1,871,540 | 2,180,307 | 1,916,257 |
| # Contigs | 157 | 113 | 408 | 47 | 83 | 551 |
| %GC | 50.81 | 30.48 | 44.49 | 32.25 | 35.21 | 35.98 |
| # CDS | 2,049 | 1,410 | 2,605 | 1,848 | 2,139 | 1,876 |
| # Protein-coding CDS | 2,006 | 1,371 | 2,560 | 1,809 | 2,095 | 1,852 |
| # rRNAs (complete or partial) | 2x 16S | 0 | 4x 16S/23S | 0 | 5S/23S | 0 |
| # tRNAs | 41 | 39 | 40 | 39 | 42 | 24 |
| tRNAs missing for † | I/F | F/N | I/F/Y | none | none | C/D/F/H/N/T/Y |
| # Essential genes (out of 107) | 105 | 76 | 102 | 106 | 106 | 81 |
| # Multi-copy essential genes | 1 | 1 | 17 | 3 | 3 | 3 |
| EMP pathway complete ‡ | Yes | No | Yes | Yes | Yes | Yes |
| PP pathway complete ∗ | No | No | No | No | No | No |
| TCA cycle complete | No | No | No | No | No | No |
| Entner–Doudoroff pathway complete | No | No | No | No | No | No |
| Predicted fermentation products § | ET/AC | ET/FO/AC | ET/LA/FO/AC | ET | ET/LA/FO | ET |
| Classical electron transport chain | No | No | No | No | Partial | No |
| Rnf electron transport complex | Yes | Partial | Partial | No | No | No |
| ATP synthase | Yes | Yes | Yes | Yes | Yes | Yes |
| # Flagellar genes | 4 | 0 | 42 | 15 | 54 | 12 |
| Vitamins: B1/B2/B3/B6/B9/B12/H ∇ | -/-/-/-/+/-/- | -/-/-/-/-/-/- | -/+/+/-(?)/+/-(?)/- | -/+/-/-/(?)/-/+ | -/+/-/-/-(?)/-/+ | -/+/-/-/-/-/+ |
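
The essential-gene rows of the table can be read as rough quality indicators. The fraction of the 107 essential genes recovered is often treated as a proxy for genome completeness, and the share present in multiple copies as a hint of possible contamination or mis-binning. The following is a minimal sketch of that arithmetic using only the counts from the table. Interpreting these ratios as completeness and contamination proxies is an assumption made here for illustration, not a method stated in the text above, and the names `genomes` and `essential_gene_summary` are hypothetical.

```python
# Counts copied from the table: (essential genes found, multi-copy essential genes).
ESSENTIAL_TOTAL = 107  # size of the essential gene set given in the table header

genomes = {
    "MGS00153": (105, 1),
    "MGS00248": (76, 1),
    "MGS00113-CG02": (102, 17),
    "CLSG01": (106, 3),
    "CLSG02": (106, 3),
    "CLSG03": (81, 3),
}


def essential_gene_summary(found, multi_copy, total=ESSENTIAL_TOTAL):
    """Return (percent of essential genes found,
    percent of the found genes that occur in multiple copies)."""
    pct_found = 100.0 * found / total
    pct_multi = 100.0 * multi_copy / found if found else 0.0
    return pct_found, pct_multi


# Illustrative assumption: treat these ratios as rough quality proxies only.
summary = {name: essential_gene_summary(*counts) for name, counts in genomes.items()}

for name, (pct_found, pct_multi) in summary.items():
    print(f"{name}: {pct_found:.1f}% essential genes found, "
          f"{pct_multi:.1f}% of those in multiple copies")
```

Read this way, MGS00113-CG02 stands out for its comparatively high number of multi-copy essential genes (17 of 102), while MGS00248 and CLSG03 recover the fewest essential genes (76 and 81 of 107).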