Aline Adler, Simon Poirier, Marco Pagni, Julien Maillard, Christof Holliger.
Abstract
Complete genomes can be recovered from metagenomes by assembling and binning DNA sequences into metagenome-assembled genomes (MAGs). Yet, the presence of microdiversity can hamper the assembly and binning processes, possibly yielding chimeric, highly fragmented and incomplete genomes. Here, the metagenomes of four samples of aerobic granular sludge bioreactors containing Candidatus (Ca.) Accumulibacter, a phosphate-accumulating organism of interest for wastewater treatment, were sequenced with both PacBio and Illumina. Different strategies of genome assembly and binning were investigated, including published protocols and a binning procedure adapted to the binning of long contigs (MuLoBiSC). Multiple criteria were considered to select the best strategy for Ca. Accumulibacter, whose multiple strains in every sample represent a challenging microdiversity. In this case, the best strategy relies on long-read-only assembly and a custom binning procedure including MuLoBiSC in metaWRAP. Several high-quality Ca. Accumulibacter MAGs, including a novel species, were obtained independently from different samples. Comparative genomic analysis showed that MAGs retrieved in different samples harbour genomic rearrangements in addition to accumulation of point mutations. The microdiversity of Ca. Accumulibacter, likely driven by mobile genetic elements, causes major difficulties in recovering MAGs, but it is also a hallmark of the panmictic lifestyle of these bacteria.
Year: 2022 PMID: 35315560 PMCID: PMC9311429 DOI: 10.1111/1462-2920.15947
Source DB: PubMed Journal: Environ Microbiol ISSN: 1462-2912 Impact factor: 5.476
Fig. 1. Phylogenetic tree of 16S rRNA gene sequences related to Ca. Accumulibacter, Propionivibrio and Dechloromonas in chosen reference genomes and PacBio long‐read contigs of aerobic granular sludge samples collected on day 71 (d71), 322 (d322), 427 (d427) and 740 (d740) in a lab‐scale sequencing batch reactor fed with volatile fatty acids (VFA) for d71; VFA, glucose and amino acids for d322 and d427; and VFA, glucose, amino acids, starch and peptone for d740. The number of sequences per sample is indicated in the coloured dots (sequences with <3 bp of differences were considered as the same). The 16S rRNA gene sequences ACC005 and POV001 are identical to the 16S rRNA gene sequences in Ca. Accumulibacter sp. BA‐93 and Ca. Propionivibrio aalborgensis, respectively.
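The grouping rule mentioned in the caption (sequences with fewer than 3 bp of differences count as the same) can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a simple greedy grouping against the first member of each group; the function names and the toy sequences are hypothetical, and this is not necessarily the procedure the authors used.

```python
def mismatches(a: str, b: str) -> int:
    """Count differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def group_sequences(seqs: dict[str, str], max_diff: int = 2) -> list[list[str]]:
    """Greedy grouping: a sequence joins the first group whose
    representative differs from it by at most max_diff positions
    (max_diff=2 corresponds to '<3 bp of differences')."""
    groups: list[tuple[str, list[str]]] = []  # (representative, member names)
    for name, seq in seqs.items():
        for rep, members in groups:
            if mismatches(rep, seq) <= max_diff:
                members.append(name)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append((seq, [name]))
    return [members for _, members in groups]


# Hypothetical toy sequences, not real 16S rRNA data.
toy = {
    "seq_a": "ACGTACGTAC",
    "seq_b": "ACGTACGTAT",  # 1 difference from seq_a
    "seq_c": "ACGAACCTAT",  # 3 differences from seq_a
}
print(group_sequences(toy))
```

With a greedy scheme like this, the result can depend on input order when sequences sit near the threshold; a real analysis would more likely rely on a dedicated clustering tool.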
Characteristics of Ca. Accumulibacter related MAGs obtained with metaWRAP and the default combination of binning tools, MetaBAT2, MaxBin2 and CONCOCT (BXC) and the combination MetaBAT2, MaxBin2, MuLoBiSC (BXM).
| Sample | Binning tool | Contigs containing at least one Ca. Accumulibacter 16S rRNA gene | Number of contigs | Bin length (Mbp) | Completeness (%) | Contamination (%) | WSC |
|---|---|---|---|---|---|---|---|
| d71 | BXC | ACC003a, ACC003b, ACC010 | 57 | 4.8 | 87.7 | 1.8 | 2.5 |
| d71 | BXM | ACC003a, ACC003b | 51 | 4.6 | 86.4 | 1.3 | 2.6 |
| d71 | BXM | ACC005a, ACC005b | 14 | 4.3 | 84.1 | 1.4 | 3.1 |
| d322 | BXC | ACC003a, ACC003b | 13 | 5.4 | 98.6 | 0.4 | 4.0 |
| d322 | BXM | ACC003b | 13 | 5.4 | 98.6 | 0.4 | 2.8 |
| d427 | BXC | ACC003a, ACC003b, ACC003c | 45 | 6.7 | 98.2 | 2.7 | 2.0 |
| d427 | BXC | ACC004a | 4 | 5.2 | 98.0 | 1.4 | 3.8 |
| d427 | BXC | ACC005a, ACC005b, ACC005c | 35 | 5.2 | 98.5 | 3.9 | 2.6 |
| d427 | BXM | ACC003a, ACC003b, ACC003c | 17 | 5.4 | 98.1 | 2.7 | 4.4 |
| d427 | BXM | ACC004a | 5 | 5.2 | 98.6 | 1.5 | 3.6 |
| d427 | BXM | ACC005a, ACC005b, ACC005c | 19 | 4.7 | 97.9 | 2.2 | 4.2 |
| d740 | BXC | ACC007a | 15 | 4.9 | 95.7 | 5.4 | 2.7 |
| d740 | BXC | ACC007b, ACC012 | 27 | 5.6 | 98.3 | 5.6 | 2.2 |
| d740 | BXM | ACC007a | 13 | 4.8 | 94.8 | 4.4 | 3.0 |
| d740 | BXM | ACC007b, ACC012 | 27 | 5.6 | 98.3 | 5.6 | 2.2 |
Only the MAGs with a completeness >80% and a contamination <10% are shown here. More details, including the lower-quality bins, are in the Supporting Information Table S8.
Metagenomic samples taken on four different days of reactor operation (d71, d322, d427, d740).
The Ca. Accumulibacter 16S rRNA genes with different numbers have a sequence difference of at least three nucleotides. The letters indicate different contigs containing the same Ca. Accumulibacter 16S rRNA gene.
Completeness and contamination percentages were determined with CheckM.
WSC = weighted silhouette coefficient (expressed in millions).
The completeness and contamination after the correction of the chimeric contig are 98.1% and 4.4%, respectively.
The completeness and contamination after the correction of the chimeric contig are 95.9% and 4.6%, respectively.
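The table notes describe two conventions that are easy to express in code: the quality cut-off (completeness >80%, contamination <10%) and the 16S rRNA gene labels, where the number identifies a gene variant and the trailing letter identifies a contig carrying it. The sketch below is illustrative only: the `Bin` class, the helper names and the `low_quality` entry are hypothetical, while the other values are copied from the table.

```python
import re
from dataclasses import dataclass


@dataclass
class Bin:
    sample: str
    tool: str
    contigs_16s: list[str]
    completeness: float   # percent, as reported by CheckM
    contamination: float  # percent, as reported by CheckM


def passes(b: Bin, min_completeness: float = 80.0,
           max_contamination: float = 10.0) -> bool:
    """Apply the table's cut-off: completeness >80% and contamination <10%."""
    return b.completeness > min_completeness and b.contamination < max_contamination


def split_label(label: str) -> tuple[str, str]:
    """Split e.g. 'ACC003a' into the gene variant 'ACC003' and contig letter 'a'.
    Labels without a letter (e.g. 'ACC010') return an empty contig letter."""
    m = re.fullmatch(r"([A-Z]+\d+)([a-z]?)", label)
    if m is None:
        raise ValueError(f"unexpected label format: {label!r}")
    return m.group(1), m.group(2)


bins = [
    Bin("d71", "BXC", ["ACC003a", "ACC003b", "ACC010"], 87.7, 1.8),
    Bin("d322", "BXC", ["ACC003a", "ACC003b"], 98.6, 0.4),
    Bin("d740", "BXC", ["ACC007b", "ACC012"], 98.3, 5.6),
    # Hypothetical low-quality bin, added only to show the filter rejecting one.
    Bin("low_quality", "BXC", [], 62.0, 12.5),
]

kept = [b for b in bins if passes(b)]
variants_d71 = {split_label(c)[0] for c in bins[0].contigs_16s}
print([b.sample for b in kept], sorted(variants_d71))
```

A bin carrying more than one distinct gene variant, such as the d71 BXC bin with ACC003 and ACC010, is the kind of case where a mixture of strains in one bin would be worth checking.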
Fig. 2. Phylogenetic tree built from a selection of polyphosphate kinase 1 (ppk1) gene sequences from different Ca. Accumulibacter clades (He et al., 2007; Peterson et al., 2008; Mao et al., 2015) listed in the Supporting Information Table S3 and ppk1 gene sequences related to Ca. Accumulibacter, Propionivibrio and Dechloromonas in chosen references and PacBio long‐read contigs of aerobic granular sludge samples collected on day 71 (d71), 322 (d322), 427 (d427) and 740 (d740) in a lab‐scale sequencing batch reactor fed with volatile fatty acids (VFA) for d71; VFA, glucose and amino acids for d322 and d427; and VFA, glucose, amino acids, starch and peptone for d740. The presence of sequences in the samples is indicated in the coloured dots. The ppk1 sequence indicated as ACC001 is located in a contig belonging to the ACC001 bin in MetaBAT2 and to bin.4 in metaWRAP_BXM.
Fig. 3. Validation of a selected genomic rearrangement between contigs UNC4029 and UNC4079. A PCR amplification strategy was designed from the alignment of both contigs across a large genomic rearrangement (Supporting Information Fig. S15).
A. Matching fragments of the two contigs indicating their coordinates and the designed primers.
B. PCR strategy and primer positions on the conserved fragments from both contigs across the selected putative rearrangement. Note that the primer arrows are not drawn to the same scale as the contig fragments.
C. Gel electrophoresis after PCR amplification of targeted fragments across the predicted genomic rearrangement between contigs UNC4029 and UNC4079. Unique PCR products amplified with the above‐mentioned primer combinations were obtained with the expected sizes, thus confirming the predicted genomic rearrangement.