Luke J Mappley, Michael L Black, Manal AbuOun, Alistair C Darby, Martin J Woodward, Julian Parkhill, A Keith Turner, Matthew I Bellgard, Tom La, Nyree D Phillips, Roberto M La Ragione, David J Hampson.
Abstract
BACKGROUND: The anaerobic spirochaete Brachyspira pilosicoli causes enteric disease in avian, porcine and human hosts, amongst others. To date, the only available genome sequence of B. pilosicoli is that of strain 95/1000, a porcine isolate. In the first intra-species genome comparison within the Brachyspira genus, we report the whole genome sequence of B. pilosicoli B2904, an avian isolate, the incomplete genome sequence of B. pilosicoli WesB, a human isolate, and the comparisons with B. pilosicoli 95/1000. We also draw on incomplete genome sequences from three other Brachyspira species. Finally we report the first application of the high-throughput Biolog phenotype screening tool on the B. pilosicoli strains for detailed comparisons between genotype and phenotype.
Year: 2012 PMID: 22947175 PMCID: PMC3532143 DOI: 10.1186/1471-2164-13-454
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Dendrogram showing relationships among nine strains, representing six of the seven known species. Analysis was based on concatenated DNA sequences of seven MLST loci [37]. The genome sequences of the strains used in the analysis have either been completed or are currently within a genome sequencing project (*). The tree was constructed using the maximum likelihood method. Bootstrap values (%) are shown for stable nodes. The length of the scale bar is equivalent.
General genome feature comparison for strains of different host origin
| Feature | 95/1000 (porcine) | B2904 (avian) | WesB (human) |
| Genome size (bp) | 2586443 | 2765477 | 2889522 |
| G+C content | 27.90% | 27.79% | 27.45% |
| Total predicted ORFs | 2339 | 2696 | 2690 |
| Non-significant PID and coverage ORFs | 3 | 23 | 101 |
| Significant PID and/or coverage ORFs | 2336 | 2673 | 2589 |
| rRNA genes | 3 | 3 | 3 |
| tRNA genes | 34 | 34 | 34 |
| tmRNA genes | 1 | 1 | 1 |
| hypothetical/conserved hypothetical proteins | 657 | 590 | 545 |
| genes with function prediction | 1641 | 2045 | 2006 |
| Genes assigned to COG | 1201 | 1196 | 1276 |
| Genes assigned a KO number | 1048 | 1082 | 1128 |
| Genes assigned E.C. numbers | 523 | 567 | 563 |
| Genes with signal peptide | 244 | 322 | 316 |
| Genes with transmembrane helices | 48 | 61 | 68 |
| Mobile genetic elements (MGE) | 4 | 61 | 31 |
| insertion sequence elements (ISE) | 0 | 15 | 17 |
| integrases | 0 | 43 | 10 |
| transposases | 2 | 1 | 2 |
| recombinases | 2 | 2 | 2 |
| Suspected truncated proteins | 55 | 130 | 64 |
| Suspected protein frameshift/deletions | 4 | 223 | 50 |
The comparison includes strains 95/1000 (porcine), B2904 (avian) and WesB (human).
The incomplete WesB strain genome was within one scaffold.
Those genes with significant PID and/or query/target coverage hits; significance equals blastx/blastp PID of at least 25% and/or 75% query or target coverage.
Assigned to KO via KEGG Automatic Annotation Server (KAAS)
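The significance criterion in the footnote above can be read as a simple filter on blastx/blastp hits. The following is a minimal sketch, assuming that "and/or" means an inclusive OR and that identity and coverage are expressed as percentages; the `Hit` container and the function name are hypothetical and are not taken from the authors' pipeline.

```python
from dataclasses import dataclass

# Thresholds taken from the table footnote.
MIN_PID = 25.0       # minimum percent identity (PID)
MIN_COVERAGE = 75.0  # minimum percent coverage of query or target


@dataclass
class Hit:
    """Hypothetical container for one blastx/blastp hit."""
    pid: float              # percent identity of the alignment
    query_coverage: float   # percent of the query sequence covered
    target_coverage: float  # percent of the target sequence covered


def is_significant(hit: Hit) -> bool:
    """Return True if the hit meets the identity or the coverage criterion.

    Assumption: "and/or" in the footnote is treated as an inclusive OR.
    """
    identity_ok = hit.pid >= MIN_PID
    coverage_ok = max(hit.query_coverage, hit.target_coverage) >= MIN_COVERAGE
    return identity_ok or coverage_ok
```

Under this reading, an ORF counts towards the "Significant PID and/or coverage ORFs" row when at least one of its hits passes the filter.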
Figure 2. Circos circular representation of the complete B2904 genome with annotated genes. The genome is orientated from the oriC and also displays the location of dnaA. Circles range from 1 (outer circle) to 7 (inner circle). Circle 1, COG-coded forward strand genes; circle 2, COG-coded reverse strand genes; circle 3, forward strand tRNA; circle 4, reverse strand tRNA; circle 5, forward strand rRNA; circle 6, reverse strand rRNA; circle 7, GC skew ((G-C)/(G + C); red indicates values >0; green indicates values <0). All genes are colour-coded according to the Clusters of Orthologous Groups (COG) functions shown in the key table: A, RNA processing and modification; B, chromatin structure and dynamics; C, energy production and conversion; D, cell cycle control, cell division and chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; M, cell wall, membrane and envelope biogenesis; N, cell motility and secretion; O, posttranslational modification, protein turnover and chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolite biosynthesis, transport and catabolism; T, signal transduction mechanisms; U, intracellular trafficking, secretion and vesicular transport; V, defence mechanisms; W, extracellular structures; Y, nuclear structure; Z, cytoskeleton; R, general function prediction only; S, function unknown.
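The GC skew plotted in circle 7 follows the formula (G-C)/(G + C) given in the caption. A minimal sketch of a windowed calculation is shown here; the window and step sizes are assumptions, since the caption does not state the values used for the figure.

```python
def gc_skew(sequence: str) -> float:
    """Return (G - C) / (G + C) for a DNA sequence; 0.0 if it has no G or C."""
    seq = sequence.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)


def windowed_gc_skew(sequence: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (window start, GC skew) pairs along the sequence.

    The window and step sizes are hypothetical parameters chosen by the caller.
    """
    return [
        (start, gc_skew(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

Windows with a skew above 0 would be drawn in red and those below 0 in green, following the colour scheme described in the caption.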
Figure 3. Pairwise genome alignments and dot matrix plots comparing the genomes of strains 95/1000, B2904 and WesB. The Artemis Comparison Tool (ACT) was used to compare the three genome sequences against each other (A). Genome sequences were aligned from the predicted oriC and visualised in ACT with a cut-off set to blast scores >500. Red and blue bars indicate regions of similarity in the same orientation (red) and inverted (blue). Dot matrix plots of the genome sequences linearised at the oriC were generated using Freckle (B). The incomplete WesB strain genome was within one scaffold. The output displays a two-dimensional plot, whereby the dots represent matched regions between the genomes. The minimum size of matched sequences was set to 20 bp.
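The idea behind a dot matrix plot with a minimum match size can be illustrated with exact k-mer matching. This is a simplified sketch and not the algorithm used by Freckle: it assumes exact forward-strand matches only, and the function name is hypothetical. The default of k = 20 mirrors the 20 bp minimum stated in the caption.

```python
from collections import defaultdict


def kmer_matches(seq_a: str, seq_b: str, k: int = 20) -> list[tuple[int, int]]:
    """Return (position in seq_a, position in seq_b) for every shared exact k-mer.

    Only forward-strand matches are reported; inverted matches would require
    comparing against the reverse complement as well.
    """
    # Index every k-mer of seq_b by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)

    # Look up each k-mer of seq_a and record one dot per shared occurrence.
    dots = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], ()):
            dots.append((i, j))
    return dots
```

Plotting the returned pairs as x and y coordinates gives diagonal runs where the two genomes are collinear, which is the pattern a dot matrix plot is read for.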
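Distribution of Clusters of Orthologous Groups (COG) categories in strains 95/1000, B2904 and WesB
| COG functional category | 95/1000 (n) | 95/1000 (%) | B2904 (n) | B2904 (%) | WesB (n) | WesB (%) |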
| Translation, ribosomal structure and biogenesis (J) | 122 | 5.22 | 119 | 4.45 | 125 | 4.83 |
| Transcription (K) | 51 | 2.18 | 49 | 1.83 | 61 | 2.36 |
| Replication, recombination and repair (L) | 51 | 2.18 | 56 | 2.10 | 61 | 2.36 |
| Cell cycle control, cell division and chromosome partitioning (D) | 10 | 0.43 | 8 | 0.30 | 9 | 0.35 |
| Defence mechanisms (V) | 35 | 1.50 | 33 | 1.23 | 35 | 1.35 |
| Signal transduction mechanisms (T) | 16 | 0.68 | 15 | 0.56 | 15 | 0.58 |
| Cell wall, membrane and envelope biogenesis (M) | 74 | 3.17 | 72 | 2.69 | 79 | 3.05 |
| Cell motility (N) | 40 | 1.71 | 39 | 1.46 | 40 | 1.54 |
| Intracellular trafficking, secretion and vesicular transport (U) | 11 | 0.47 | 7 | 0.26 | 9 | 0.35 |
| Posttranslational modification, protein turnover and chaperones (O) | 40 | 1.71 | 36 | 1.35 | 39 | 1.51 |
| Energy production and conversion (C) | 89 | 3.81 | 84 | 3.14 | 84 | 3.24 |
| Carbohydrate transport and metabolism (G) | 101 | 4.32 | 123 | 4.60 | 139 | 5.37 |
| Amino acid transport and metabolism (E) | 141 | 6.03 | 138 | 5.16 | 149 | 5.76 |
| Nucleotide transport and metabolism (F) | 49 | 2.10 | 54 | 2.02 | 57 | 2.20 |
| Coenzyme transport and metabolism (H) | 47 | 2.01 | 44 | 1.65 | 48 | 1.85 |
| Lipid transport and metabolism (I) | 41 | 1.75 | 33 | 1.23 | 34 | 1.31 |
| Inorganic ion transport and metabolism (P) | 53 | 2.27 | 53 | 1.98 | 49 | 1.89 |
| Secondary metabolites biosynthesis, transport and catabolism (Q) | 9 | 0.38 | 10 | 0.37 | 9 | 0.35 |
| General function prediction only (R) | 149 | 6.37 | 147 | 5.50 | 157 | 6.06 |
| Function unknown (S) | 72 | 3.08 | 76 | 2.84 | 77 | 2.97 |
| Not in COG (X) | 1137 | 48.63 | 1477 | 55.26 | 1313 | 50.71 |
| Total | 2338 | 100 | 2673 | 100 | 2589 | 100 |
The number and percentage of the total genes within each of the genomes, assigned to each functional group are shown.
Those genes with significant PID and/or query/target coverage hits; significance equals blastx/blastp PID of at least 25% and/or 75% query or target coverage.
The incomplete WesB strain genome was within one scaffold.
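The percentages in the COG table are each category count divided by the column total for that strain. The sketch below recomputes them from the counts listed in the table; the variable and function names are illustrative only.

```python
STRAINS = ("95/1000", "B2904", "WesB")

# Gene counts per COG category, copied from the table (one value per strain).
COG_COUNTS = {
    "J": (122, 119, 125), "K": (51, 49, 61), "L": (51, 56, 61),
    "D": (10, 8, 9), "V": (35, 33, 35), "T": (16, 15, 15),
    "M": (74, 72, 79), "N": (40, 39, 40), "U": (11, 7, 9),
    "O": (40, 36, 39), "C": (89, 84, 84), "G": (101, 123, 139),
    "E": (141, 138, 149), "F": (49, 54, 57), "H": (47, 44, 48),
    "I": (41, 33, 34), "P": (53, 53, 49), "Q": (9, 10, 9),
    "R": (149, 147, 157), "S": (72, 76, 77), "X": (1137, 1477, 1313),
}


def column_totals(counts: dict) -> tuple:
    """Sum the gene counts for each strain across all categories."""
    return tuple(sum(row[i] for row in counts.values()) for i in range(len(STRAINS)))


def percentages(counts: dict) -> dict:
    """Return each category count as a percentage of its strain total (2 d.p.)."""
    totals = column_totals(counts)
    return {
        category: tuple(round(100 * n / total, 2) for n, total in zip(row, totals))
        for category, row in counts.items()
    }
```

For example, `percentages(COG_COUNTS)["G"]` gives the share of carbohydrate transport and metabolism genes in each strain, the category in which the three strains differ most among those assigned to a COG.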
Figure 4. Venn diagram of genes unique to and shared between strains 95/1000, B2904 and WesB. The Venn diagram was resolved via BLASTlineMCL protein clustering. Each circle represents the total number of protein-coding genes in the genome, whereby overlapping regions indicate the number of genes shared between the respective genomes.
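Once proteins have been clustered, the regions of a Venn diagram can be tallied from the strain membership of each cluster. The sketch below assumes clusters are supplied as lists of (strain, gene identifier) pairs and counts clusters rather than individual genes; both the input format and the counting unit are assumptions, not a description of the BLASTlineMCL output.

```python
from collections import Counter


def venn_counts(clusters: list) -> Counter:
    """Count clusters by the set of strains that contribute a member.

    Each cluster is a list of (strain, gene_id) pairs (hypothetical format).
    Keys of the returned Counter are frozensets of strain names.
    """
    counts = Counter()
    for cluster in clusters:
        strains = frozenset(strain for strain, _gene in cluster)
        counts[strains] += 1
    return counts


# Toy input with made-up gene identifiers, for illustration only.
example_clusters = [
    [("95/1000", "g1"), ("B2904", "g1"), ("WesB", "g1")],  # shared by all three
    [("B2904", "g2"), ("WesB", "g2")],                      # shared by two
    [("WesB", "g3")],                                       # unique to WesB
    [("WesB", "g4"), ("WesB", "g5")],                       # paralogues in WesB only
]
example_counts = venn_counts(example_clusters)
```

A key containing all three strain names corresponds to the central region of the diagram, and single-strain keys correspond to the strain-specific regions.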
Protein blastmatrix analysis of nine genomes
| 21.36% | 21.65% | 22.73% | 24.95% | 19.84% | 25.95% | 68.43% | 65.32% | 1.73% | |
| 17.94% | 18.56% | 20.78% | 20.90% | 16.77% | 22.11% | 54.93% | 0.74% | | |
| 20.47% | 21.25% | 21.74% | 23.60% | 19.25% | 25.35% | 2.71% | | | |
| 22.37% | 31.51% | 33.68% | 36.40% | 29.33% | 5.30% | | | | |
| 20.65% | 31.23% | 46.77% | 57.65% | 1.77% | | | | | |
| 21.05% | 32.03% | 50.33% | 1.56% | | | | | | |
| 17.72% | 27.48% | 1.11% | | | | | | | |
| 23.48% | 2.54% | | | | | | | | |
| 1.11% |
The percentage of the total CDS that were identified in other genomes (green) and the proportion of protein repeats within the genome (red) are shown. A cut-off e-value of 1e-05 was used.
Incomplete genome currently within a genome sequencing project.
The incomplete WesB strain genome was within one scaffold.
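Each off-diagonal cell of the blastmatrix is a percentage of CDS with a match in the other genome at the stated e-value cut-off. The following is a simplified sketch of that calculation for one genome pair, assuming hits are given as (query identifier, e-value) pairs; the exact matching rules of the blastmatrix tool are not described in this record, so this should be read as an approximation.

```python
E_VALUE_CUTOFF = 1e-5  # cut-off stated in the table footnote


def percent_shared(total_cds: int, hits: list) -> float:
    """Percentage of CDS with at least one hit at or below the e-value cut-off.

    `hits` is a list of (query_id, e_value) pairs against one other genome
    (hypothetical input format). Each query is counted once.
    """
    if total_cds <= 0:
        raise ValueError("total_cds must be positive")
    matched = {query for query, e_value in hits if e_value <= E_VALUE_CUTOFF}
    return 100.0 * len(matched) / total_cds
```

The diagonal cells (protein repeats within a genome) could be computed in the same way from a genome's hits against itself, excluding each CDS's hit to its own sequence.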
The number of genes with potential roles in pathogenesis and virulence in the three genomes
| Putative function | 95/1000 | B2904 | WesB |
| Core genes involved in lipopolysaccharide biosynthesis | 27 | 30 | 32 |
| Chemotaxis | | | |
| putative methyl-accepting chemotaxis protein | 7 | 7 | 10 |
| methyl-accepting chemotaxis protein A ( | 2 | 0 | 2 |
| methyl-accepting chemotaxis protein B ( | 8 | 11 | 11 |
| chemotaxis protein | 15 | 15 | 15 |
| Flagella | 42 | 42 | 42 |
| Adhesion and membrane protein | | | |
| lipoprotein | 21 | 31 | 29 |
| variable surface protein | 3 | 4 | 4 |
| integral membrane protein | 1 | 1 | 1 |
| outer membrane protein | 25 | 25 | 23 |
| periplasmic protein | 25 | 25 | 28 |
| inner membrane protein | 75 | 83 | 83 |
| Host tissue degradation | | | |
| haemolysis | 12 | 12 | 12 |
| phospholipase | 2 | 3 | 2 |
| peptidase | 44 | 48 | 48 |
| protease | 19 | 19 | 17 |
| Oxidative stress | 7 | 7 | 7 |
| Ankyrin-like protein | 31 | 34 | 35 |
| Phage and other mobile genetic elements | 46 | 109 | 100 |
| Total | 412 | 506 | 501 |
The analysis categorised the genes from the genomes of B. pilosicoli 95/1000, B2904 and WesB.
The incomplete WesB strain genome was within one scaffold.
Core lipooligosaccharide (LOS) biosynthesis genes.
Figure 5. Comparison of the organisation of the bacteriophages in the three genomes. A comparison of bacteriophages pP1 in 95/1000, pP2 in B2904 and pP3 in WesB and also the three bacteriophages found in B. murdochii 56-150T; pM1, pM2 and pM3. Genes encoding hypothetical proteins (grey) and genes with predicted protein function (yellow) are indicated.
Correlation between differences in carbon source utilisation and genotype of 95/1000, B2904 and WesB
| Carbon source | 95/1000 | B2904 | WesB | Genotypic explanation |
| D-Mannose | - | - | + | WesB is the only strain with the mannose/sorbose-specific PTS system IIABCD components (wesB_1269, wesB_1270, wesB_1271 and wesB_1272) for uptake and phosphorylation of D-mannose. |
| D-Glucuronic acid | - | + | + | 95/1000 lacks the pfkB carbohydrate kinase, 2-dehydro-3-deoxygluconate kinase, which links D-glucuronic acid metabolism to glycolysis. This enzyme is found in both B2904 (B2904_orf899 and B2904_orf900) and WesB (wesB_1781). |
| D-Mannitol | - | + | - | B2904 is the only strain with the D-mannitol PTS system IIABC components (B2904_orf2447) and also a mannitol-1-phosphate 5-dehydrogenase (B2904_orf2446) for D-mannitol uptake, phosphorylation and catabolism. |
| Glucuronamide | - | + | + | 95/1000 lacks the pfkB carbohydrate kinase, 2-dehydro-3-deoxygluconate kinase, which links glucuronate and related compound metabolism to glycolysis. This enzyme is found in both B2904 (B2904_orf899 and B2904_orf900) and WesB (wesB_1781). |
| β-D-Allose | - | - | + | WesB is the only strain with D-allose ABC transporter components (wesB_1171, wesB_1172 and wesB_1175) and D-allose kinase (wesB_0259 and wesB_1174) for uptake and metabolism of D-allose. |
| β-Methyl-D-glucuronic acid | - | + | + | 95/1000 lacks the pfkB carbohydrate kinase, 2-dehydro-3-deoxygluconate kinase, which links glucuronate and related compound metabolism to glycolysis. This enzyme is found in both B2904 (B2904_orf899 and B2904_orf900) and WesB (wesB_1781). |
| L-Sorbose | - | - | + | WesB is the only strain with the mannose/sorbose-specific PTS system IIABCD components (wesB_1269, wesB_1270, wesB_1271 and wesB_1272) for uptake and phosphorylation of L-sorbose. |
Possible explanations for the differences in phenotype relate to differences in genomic features.
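The genotype-phenotype reasoning in the table can be expressed as a consistency check: a strain is predicted to use a carbon source if it carries the feature the table names for that source. The sketch below encodes the table's entries in that form; the feature labels are shortened paraphrases and the data structures are illustrative, not part of the authors' analysis.

```python
# Strains in which the table reports each genomic feature.
FEATURE_PRESENT = {
    "mannose/sorbose PTS IIABCD": {"WesB"},
    "2-dehydro-3-deoxygluconate kinase": {"B2904", "WesB"},
    "mannitol PTS IIABC and mannitol-1-phosphate 5-dehydrogenase": {"B2904"},
    "allose ABC transporter and allose kinase": {"WesB"},
}

# Feature that the table links to utilisation of each carbon source.
REQUIRED_FEATURE = {
    "D-Mannose": "mannose/sorbose PTS IIABCD",
    "D-Glucuronic acid": "2-dehydro-3-deoxygluconate kinase",
    "D-Mannitol": "mannitol PTS IIABC and mannitol-1-phosphate 5-dehydrogenase",
    "Glucuronamide": "2-dehydro-3-deoxygluconate kinase",
    "β-D-Allose": "allose ABC transporter and allose kinase",
    "β-Methyl-D-glucuronic acid": "2-dehydro-3-deoxygluconate kinase",
    "L-Sorbose": "mannose/sorbose PTS IIABCD",
}

# Strains scored "+" for each carbon source in the table.
OBSERVED_USERS = {
    "D-Mannose": {"WesB"},
    "D-Glucuronic acid": {"B2904", "WesB"},
    "D-Mannitol": {"B2904"},
    "Glucuronamide": {"B2904", "WesB"},
    "β-D-Allose": {"WesB"},
    "β-Methyl-D-glucuronic acid": {"B2904", "WesB"},
    "L-Sorbose": {"WesB"},
}


def predicted_users(source: str) -> set:
    """Strains predicted to use a carbon source from feature presence alone."""
    return FEATURE_PRESENT[REQUIRED_FEATURE[source]]


def mismatches() -> list:
    """Carbon sources where the prediction and the observed phenotype differ."""
    return [s for s in OBSERVED_USERS if predicted_users(s) != OBSERVED_USERS[s]]
```

For the seven carbon sources listed, the prediction from feature presence agrees with the observed phenotype in every case, which is the correlation the table sets out.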