Seyed Yahya Anvar, Jeroen Frank, Arjan Pol, Arnoud Schmitz, Ken Kraaijeveld, Johan T den Dunnen, Huub JM Op den Camp.
Abstract
BACKGROUND: Aerobic methanotrophs can grow in hostile volcanic environments and use methane as their sole source of energy. The discovery of three verrucomicrobial Methylacidiphilum strains has revealed diverse metabolic pathways used by these methanotrophs, including mechanisms through which methane is oxidized. The basis of a complete understanding of these processes and of how these bacteria evolved and are able to thrive in such extreme environments partially resides in the complete characterization of their genome and its architecture.
Year: 2014 PMID: 25331649 PMCID: PMC4210602 DOI: 10.1186/1471-2164-15-914
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Read statistics of 8 SMRT sequencing runs pre and post correction
| | PacBio RS (raw) | PacBio RS (corrected)* |
|---|---|---|
| Number of reads | 234,459 | 48,452 |
| Total nucleotides | 352,940,647 | 90,484,833 |
| Median read length | 1,263 bp | 1,742 bp |
| 5th percentile | 396 bp | 699 bp |
| 95th percentile | 3,374 bp | 3,311 bp |
| Maximum length | 22,910 bp | 15,852 bp |
| GC content | 43.54% | 41.70% |
| Coverage depth | 141.74× | 36.34× |
*Error-corrected PacBio reads generated by HGAP with seed length of 1,500 bp.
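The quantities in the table above can be derived from a plain list of read lengths. The following is a minimal Python sketch, not the authors' pipeline: it assumes nearest-rank percentiles and a caller-supplied genome length for the depth estimate, neither of which is stated in the source. The function name `read_length_summary` and its parameters are illustrative.

```python
from statistics import median


def read_length_summary(lengths, genome_size):
    """Summarise read lengths (bp).

    genome_size is an assumed reference length in bp, used only to
    estimate coverage depth as total nucleotides / genome size.
    """
    ordered = sorted(lengths)
    total = sum(ordered)

    def percentile(p):
        # Nearest-rank percentile (an assumption): the smallest length
        # with at least p% of reads at or below it.
        rank = max(1, -(-p * len(ordered) // 100))  # ceiling division
        return ordered[rank - 1]

    return {
        "reads": len(ordered),
        "total_nt": total,
        "median": median(ordered),
        "p5": percentile(5),
        "p95": percentile(95),
        "max": ordered[-1],
        "depth": total / genome_size,
    }
```

Because the depth depends on the reference length chosen as the denominator, values computed this way may differ slightly from the published figures.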
SMRT genome assembly statistics
| | Draft genome¹ | SMRT de novo² |
|---|---|---|
| Number of reads | 16,099,262 | 48,452 |
| Sequencing depth | 401.23× | 36.34× |
| Number of contigs | 109 | 1 |
| Bases in scaffolds | 2,362,416 bp | 2,476,673 bp* |
| N50 | 50,138 bp | 2,476,673 bp |
| Maximum length | 166,468 bp | 2,476,673 bp |
| GC content | 40.91% | 41.48% |
| Genome coverage | 95.54%** | 100%*** |
| Accuracy | 99.9958% | 99.9998% |
¹ Draft genome assembled from Illumina GAII and Roche 454 reads with CLCBio (CLCBio, Aarhus, Denmark) and curated manually [6].
² SMRT de novo assembly was carried out on corrected PacBio reads using Celera Assembler 7.0.
*The total bases in the scaffolds were determined after circularization of the final assembly.
**The overall genome coverage is determined by calculating the total number of gaps in the draft genome as compared to the final assembly.
***The genome coverage of SMRT de novo is determined by aligning PacBio reads, generated by two independent SMRT sequencing runs, to the final assembly.
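For readers unfamiliar with the N50 row, it is the contig length at which the contigs of that length or longer together cover at least half of the assembled bases. A short Python sketch of this standard definition follows; the function name `n50` is illustrative and is not taken from any assembler.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembled bases.
    """
    ordered = sorted(contig_lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")
```

For a single-contig assembly such as the SMRT de novo column, the N50 necessarily equals the maximum contig length, which is why both rows report 2,476,673 bp.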
Figure 1. The SolV genome sequence. A) Circos plot depicts the level of concordance between the draft genome and the final assembly. Coloured links highlight misassemblies in the draft genome. B) Circos plot illustrates the overall genetic makeup of Methylacidiphilum fumariolicum SolV. The outer ring marks the positions of tandem repeats across the genome. The next rings (outside to inside) show: gene annotation, highlighting key biological pathways in colours; placement of the draft genome with respect to the final assembly; and the overall GC content and the coverage profile of SMRT, Roche 454, and Illumina GAII sequencing reads. Repetitive sequences and structural variations are linked across the genome. Repeats that are longer than 2 Kb are shown in red whilst shorter repeats are linked in grey.
Figure 2. Metabolic regulation of SolV cell cultures grown under different conditions. A) Circos plot depicts the genome-wide expression profile for cell cultures under maximum growth conditions (blue) and the relative gene expression (fold change) of cell cultures grown under nitrogen fixation or oxygen limitation conditions. Counts per million (CPM) were used to determine the level of gene expression. Key biological pathways are highlighted in different colours. B) MA plots for cell cultures under nitrogen fixation and under oxygen limitation conditions, each compared to cell cultures in the maximum growth environment. Deregulated genes are depicted in red. C) Venn diagram shows the number of genes that are differentially expressed in both N2fix and O2lim conditions compared to μmax. Pie charts illustrate the fraction of genes that have a higher (black) or lower (light grey) expression in N2fix and O2lim cell cultures relative to μmax. D) Bar charts present the fraction of up- or down-regulated genes (black and light grey, respectively) in each of the nine key pathways. The red line depicts the 50% mark. The proportion of non-significant genes is depicted in white.
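The CPM values and MA-plot coordinates mentioned in this caption follow simple formulas. The Python sketch below shows one common way to compute them. It is an illustration under stated assumptions, not the authors' analysis: the pseudocount of 1, the function names, and the dictionary-based input are all hypothetical, and the caption does not say which statistical package or normalisation was used to call deregulated genes.

```python
import math


def cpm(counts):
    """Convert raw read counts per gene to counts per million (CPM)."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def ma_values(cpm_ref, cpm_test, pseudo=1.0):
    """Return {gene: (M, A)} for a test condition versus a reference.

    M is the log2 fold change (test relative to reference) and A is the
    mean log2 expression. The pseudocount (an assumption) avoids log(0).
    """
    out = {}
    for gene in cpm_ref:
        ref = math.log2(cpm_ref[gene] + pseudo)
        test = math.log2(cpm_test[gene] + pseudo)
        out[gene] = (test - ref, (ref + test) / 2)
    return out
```

In this convention a positive M means higher expression in the test condition (for example N2fix or O2lim) than in the reference (μmax).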
Figure 3. The SolV global methylation state. The first inner circle shows the annotated genes and highlights those that are involved in key metabolic pathways. The second ring depicts methylated adenines that are associated with specific motifs. The placement of the 5′-ACN4GT-3′, 5′-CCAN5CTC-3′, and 5′-GAGN5TGG-3′ motifs is highlighted with red, purple, and blue ticks, respectively. Methylated bases that are not associated with any motifs are presented in the three innermost circles. The position of additional methylated adenines is shown in black. The positions of m4C and m5C bases are marked in green and orange, respectively.
Adenine motif statistics
| Motif¹ | # motifs in genome | # motifs detected | % motifs detected | % intergenic | Mean coverage |
|---|---|---|---|---|---|
| G**A**GN5TGG | 1,182 | 1,159 | 98.1 | 10.1 | 142.4 |
| CC**A**N5CTC | 1,182 | 1,153 | 97.6 | 9.7 | 143.1 |
| **A**CN4GT | 6,202 | 6,151 | 99.2 | 15.2 | 140.4 |
Motifs with a modification quality value >50 are considered.
¹ Methylated adenines are typed in bold.
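The "# motifs in genome" column can be approximated by scanning a genome sequence for each motif on both strands. The Python sketch below is a simplified illustration, not the method used in the paper: it assumes motifs are written with explicit `N` wildcards (for example `GAGNNNNNTGG` for GAGN5TGG), a sequence containing only A, C, G, and T, and a linear sequence, so sites spanning the origin of a circular genome are missed. The function names are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq, motif):
    """Count motif sites on both strands; 'N' matches any base.

    A lookahead pattern is used so that overlapping sites are counted.
    """
    pattern = re.compile("(?=" + motif.replace("N", "[ACGT]") + ")")
    forward = sum(1 for _ in pattern.finditer(seq))
    reverse = sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return forward + reverse
```

Since GAGN5TGG and CCAN5CTC are reverse complements of each other, a both-strand scan yields the same count for the two, which is consistent with the identical genome totals in the table. ACN4GT is its own reverse complement, so each site is counted once per strand here; whether the published totals count such sites per strand or per locus is not stated in the source.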