Jeroen Frank1, Celia Dingemanse2, Arnoud M Schmitz1, Rolf H A M Vossen1, Gert-Jan B van Ommen2, Johan T den Dunnen3, Els C Robanus-Maandag2, Seyed Yahya Anvar4.
Abstract
BACKGROUND: Immuno-compromised mice infected with Helicobacter typhlonius are used to model microbially induced inflammatory bowel disease (IBD). The specific mechanism through which H. typhlonius induces and promotes IBD is not fully understood. Access to the genome sequence is essential to examine emergent properties of this organism, such as its pathogenicity. To this end, we present the complete genome sequence of H. typhlonius MIT 97-6810, obtained through single-molecule real-time sequencing.
Keywords: Helicobacter typhlonius; Pacific Biosciences; genome assembly; methylation; pathogenicity; single-molecule real-time sequencing
Year: 2016 PMID: 26779178 PMCID: PMC4705304 DOI: 10.3389/fmicb.2015.01549
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Read statistics of 3 SMRT sequencing runs pre- and post-correction.
| | PacBio RSII (Raw) | PacBio RSII (Corrected)1 |
|---|---|---|
| Number of reads | 164,030 | 4,157 |
| Total nucleotides | 649,035,578 | 37,634,528 |
| Median read length | 2,795 bp | 9,053 bp |
| 5th percentile | 805 bp | 686 bp |
| 95th percentile | 10,881 bp | 16,281 bp |
| Maximum length | 29,940 bp | 20,234 bp |
| GC content | 40.38% | 38.86% |
| Coverage depth | 337.89× | 19.59× |
Single-molecule real-time (SMRT) de novo genome assembly statistics.
| | SMRT |
|---|---|
| Number of reads | 4,157 |
| Sequencing depth | 19.59× |
| Number of contigs | 1 |
| Bases in scaffolds | 1,920,832 bp∗ |
| GC content | 38.8% |
| Accuracy | 99.9890% |
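The coverage depths reported in the read-statistics table can be reproduced from the total nucleotide counts and the assembled genome length. The following is a minimal sketch, assuming depth is defined as total sequenced nucleotides divided by the 1,920,832 bp assembly; the function and variable names are illustrative and not taken from the study.

```python
# Values copied from the read-statistics and assembly tables.
GENOME_LENGTH_BP = 1_920_832
TOTAL_NT = {"raw": 649_035_578, "corrected": 37_634_528}


def coverage_depth(total_nucleotides: int, genome_length: int) -> float:
    """Mean coverage depth: sequenced bases divided by genome length (assumed definition)."""
    return total_nucleotides / genome_length


depths = {label: coverage_depth(nt, GENOME_LENGTH_BP) for label, nt in TOTAL_NT.items()}
for label, depth in depths.items():
    print(f"{label}: {depth:.2f}x")
```

Under this assumption the computed values agree with the tabulated 337.89× (raw) and 19.59× (corrected).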
Annotation statistics.
| Statistic | Value |
|---|---|
| Number of PEGs | 2,117 |
| Average PEG length | 836 bp |
| Coding density | 92.2% |
| PEGs assigned to subsystem | 890 (42.0%) |
| Hypothetical proteins | 747 (35.3%) |
| Number of rRNAs | 4 |
| Number of tRNAs | 39 |
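The percentages in the annotation table follow from the PEG counts. The sketch below recomputes them and also gives a rough estimate of coding density, assuming it can be approximated as number of PEGs times average PEG length divided by genome length. That approximation ignores overlapping genes, RNA genes and rounding of the average length, so it is only expected to land near the tabulated 92.2%. All names are illustrative.

```python
# Values copied from the annotation and assembly tables.
N_PEGS = 2_117
AVG_PEG_LENGTH_BP = 836
GENOME_LENGTH_BP = 1_920_832
PEGS_IN_SUBSYSTEM = 890
HYPOTHETICAL_PROTEINS = 747


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


subsystem_pct = pct(PEGS_IN_SUBSYSTEM, N_PEGS)
hypothetical_pct = pct(HYPOTHETICAL_PROTEINS, N_PEGS)
# Rough estimate only: assumes non-overlapping PEGs and uses the rounded average length.
approx_coding_density = pct(N_PEGS * AVG_PEG_LENGTH_BP, GENOME_LENGTH_BP)

print(f"PEGs in subsystem: {subsystem_pct:.1f}%")
print(f"Hypothetical proteins: {hypothetical_pct:.1f}%")
print(f"Approximate coding density: {approx_coding_density:.1f}%")
```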
Base modifications and motifs: adenine and cytosine motif statistics.
| Motif1 | # Motifs in Genome | # Motifs Detected | % Motifs Detected | % Intergenic | Mean Coverage | Presence in |
|---|---|---|---|---|---|---|
| G | 20,546 | 20,492 | 99.7% | 9.3% | 237.1 | J99-R3, 26695 |
| GDCCN | 2,110 | 2,073 | 98.2% | 3.6% | 236.1 | |
| GRA | 2,682 | 1,977 | 73.7% | 5.6% | 233.4 | |
| | 1,980 | 1,965 | 99.2% | 3.6% | 234.7 | |
| G | 1,166 | 1,152 | 98.8% | 5.7% | 237.1 | J99-R3, 26695 |
| GT | 1,068 | 1,025 | 96.0% | 6.9% | 241.1 | J99-R3, 26695 |
| GTNN | 512 | 476 | 93.0% | 3.8% | 233.1 | |
| GTS | 222 | 216 | 93.7% | 7.9% | 236.4 | J99-R3 |
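The "% Motifs Detected" column can be checked against the two count columns. The sketch below assumes the percentage is simply detected motifs divided by motifs in the genome; motif labels are kept exactly as printed in the table, including the blank one. Names and the 0.1-point tolerance are illustrative choices.

```python
# (motif label as printed, motifs in genome, motifs detected, printed % detected)
ROWS = [
    ("G", 20_546, 20_492, 99.7),
    ("GDCCN", 2_110, 2_073, 98.2),
    ("GRA", 2_682, 1_977, 73.7),
    ("", 1_980, 1_965, 99.2),
    ("G", 1_166, 1_152, 98.8),
    ("GT", 1_068, 1_025, 96.0),
    ("GTNN", 512, 476, 93.0),
    ("GTS", 222, 216, 93.7),
]


def percent_detected(in_genome: int, detected: int) -> float:
    """Detected motifs as a percentage of motif occurrences in the genome."""
    return 100.0 * detected / in_genome


# Collect rows whose printed percentage differs from the recomputed ratio
# by more than 0.1 percentage points (tolerance for one-decimal rounding).
mismatches = []
for motif, in_genome, detected, printed in ROWS:
    recomputed = percent_detected(in_genome, detected)
    if abs(recomputed - printed) > 0.1:
        mismatches.append((motif, round(recomputed, 1), printed))

print(mismatches)
```

Seven of the eight rows agree with the recomputed ratio to one decimal place. For the GTS row the counts give 97.3% while the table prints 93.7%; the table alone does not show which of the figures is off.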