Max Farnbacher, Thomas Jahns, Dirk Willrodt, Rolf Daniel, Rainer Haas, Alexander Goesmann, Stefan Kurtz, Gabriele Rieder.
Abstract
BACKGROUND: Mongolian gerbils are a good model for mimicking the Helicobacter pylori-associated pathogenesis of the human stomach. In the current study, the gerbil-adapted strain B8 was completely sequenced, annotated, and compared to previously sequenced genomes, including the 73 supercontigs of the parental strain B128.
Year: 2010 PMID: 20507619 PMCID: PMC3091624 DOI: 10.1186/1471-2164-11-335
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
General features of different H. pylori genomes.
| | B8 | 26695 | J99 | HPAG1 | P12 |
|---|---|---|---|---|---|
| accession | FN598874 | NC_000915 | NC_000921 | NC_008086 | NC_011498 |
| chromosome | 1,673,997 bp | 1,667,867 bp | 1,643,831 bp | 1,596,366 bp | 1,673,813 bp |
| plasmid | pHPB8 (6,032 bp, | | | pHPAG1 (9,370 bp, | pHPP12 (10,225 bp, |
| GC content | 38.8% | 38.9% | 39.2% | 39.1% | 38.8% |
| CDS density | 91.8% | 90.2% | 90.2% | 91.9% | 89.7% |
| number of CDS | 1,711 | 1,576 | 1,489 | 1,536 | 1,568 |
| average length | 897 | 955 | 990 | 954 | 957 |
| annotated | 929 (54.3%) | 918 (58.2%) | 712 (47.8%) | 1,013 (66.0%) | 1,111 (70.9%) |
| strain-specific | 293 | 204 | 112 | 152 | 202 |
| thereof | 3 | 9 | 1 | 8 | 2 |
| 670,637 - 720,370 | + | + | + | 552,705 - 589,225 | |
| I: 1,575,414 - 1,578,323 | + | I: 13,587 - 16,496 | |||
| II: 1,551,606 - 1,554,461 | II: 37,867 - 40,719 | II: 38,692 - 41,645 | II: 36,339 | ||
| T4SS-3 ( | 510,833 - 526,789 | fragmented | - | - | 1,394,833 - 1,411,026 |
| T4SS-4 ( | fragmented, surrounding T4SS-3 | ? | ? | 452,423 - 492,710 | |
| PZ1:452,011 | left: 449,150 - 479,531 | I: 1,012,090 - 1,057,038 | PZ1:452,423 | ||
| right: 1,044,552 - 1,071,068 | PZ2: 1,043,356 - 1,053,784 | ||||
| PZ3: 1,394,833 - 1,423,818 | |||||
| rRNA (23S / 16S / 5S) | 2 / 2 / 2 | 2 / 2 / 2+1‡ | 2 / 2 / 2 | 2 / 2 / 2 | 2 / 2 / 2 |
| tRNA | 36 | 36 | 36 | 36 | 36 |
# notation for nucleotide range: first position - last position; notation for gene range: first_gene-last_gene
* genes occur on reverse strand
§ number of gene copies
‡ strain 26695 contains two sets of 23S and 5S rRNA and one additional 5S rRNA without associated 23S rRNA
Figure 1. Circular plot of the chromosome of strain B8. From outside to inside, the circles show (a) coding sequences on the forward strand, colored according to COG category, (b) coding sequences on the reverse strand, colored according to COG category, (c) average GC content, calculated for a window of 3,000 bp shifted by 1,000 bp over the genome sequence in each step, and (d) GC-skew, (G - C)/(G + C), calculated for a window of 3,000 bp shifted by 1,000 bp over the genome sequence in each step. For better readability, (c) and (d) are scaled up. For each COG category, the left legend shows the number of genes belonging to it in brackets, and the bottom legend (the table below the plot) shows the letter code, the meaning, and the color used in the circular plot. The cag-PAI and the plasticity zone are marked by an ellipse on the second inner circle. The plot was created with the software GenDB.
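The sliding-window statistics of circles (c) and (d) can be sketched as follows. This is a minimal illustration, not the GenDB implementation: it assumes the conventional skew definition (G - C)/(G + C), treats the sequence as linear (no wrap-around at the chromosome origin), and the function name `gc_windows` is hypothetical.

```python
from typing import Iterator, Tuple


def gc_windows(seq: str, window: int = 3000, step: int = 1000) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, gc_content, gc_skew) for every full window.

    Assumption: the sequence is handled as linear, so windows that would
    wrap around the origin of a circular chromosome are not produced.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / window
        # Conventional GC-skew; defined as 0 when the window has no G or C.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


if __name__ == "__main__":
    toy = "G" * 3000 + "C" * 3000  # toy sequence, not real genome data
    for start, content, skew in gc_windows(toy):
        print(start, round(content, 3), round(skew, 3))
```

With the default arguments, consecutive windows overlap by 2,000 bp, which smooths the curves drawn in the plot.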
List of the 43 most significant repeats in the genome of strain B8, ordered by increasing E-value.
| length 1 (bp) | position 1 | strand | length 2 (bp) | position 2 | number of differences | E-value | sequence identity (%) |
|---|---|---|---|---|---|---|---|
| 3241 | 352423 | + | 3242 | 646346 | 2 | 0.00e+00 | 99.94 |
| 818 | 1614435 | + | 819 | 1615640 | 94 | 0.00e+00 | 88.52 |
| 801 | 1490715 | + | 802 | 1543145 | 6 | 0.00e+00 | 99.25 |
| 648 | 984653 | + | 647 | 1039045 | 1 | 0.00e+00 | 99.85 |
| 4381 | 447896 | - | 4384 | 1528769 | 30 | 0.00e+00 | 99.32 |
| 2059 | 287098 | + | 2060 | 912424 | 4 | 0.00e+00 | 99.81 |
| 1164 | 396381 | - | 1165 | 832923 | 11 | 0.00e+00 | 99.06 |
| • 2207 | 96959 | + | 2201 | 323043 | 39 | 0.00e+00 | 98.23 |
| 2439 | 210771 | + | 2438 | 628615 | 4 | 0.00e+00 | 99.84 |
| 563 | 690091 | + | 563 | 690481 | 11 | 0.00e+00 | 98.05 |
| 588 | 273255 | + | 588 | 1414805 | 11 | 0.00e+00 | 98.13 |
| • 1135 | 590750 | + | 1134 | 1083761 | 79 | 0.00e+00 | 93.04 |
| 2213 | 580672 | - | 2213 | 901965 | 7 | 0.00e+00 | 99.68 |
| 508 | 715518 | + | 514 | 720788 | 14 | 1.88e-259 | 97.28 |
| 799 | 692230 | + | 796 | 692782 | 109 | 4.26e-247 | 86.36 |
| 663 | 130801 | - | 671 | 1311865 | 104 | 1.76e-186 | 84.50 |
| 602 | 167233 | + | 602 | 869141 | 87 | 8.22e-176 | 85.55 |
| 425 | 272770 | + | 425 | 1414201 | 32 | 1.54e-170 | 92.47 |
| 396 | 351462 | - | 395 | 1070874 | 28 | 1.56e-161 | 92.93 |
| 298 | 445088 | - | 298 | 1538257 | 9 | 1.06e-143 | 96.98 |
| 398 | 131808 | - | 401 | 1311134 | 40 | 1.08e-142 | 90.02 |
| 282 | 447722 | - | 280 | 1533034 | 11 | 7.50e-130 | 96.10 |
| 421 | 496148 | - | 424 | 526803 | 61 | 6.03e-121 | 85.61 |
| 276 | 627392 | + | 277 | 627526 | 16 | 1.58e-116 | 94.22 |
| 406 | 130325 | - | 406 | 1312519 | 58 | 5.69e-116 | 85.71 |
| 205 | 213298 | + | 207 | 631421 | 2 | 4.58e-107 | 99.03 |
| 468 | 692461 | + | 474 | 693450 | 91 | 5.98e-103 | 80.80 |
| 308 | 444293 | - | 316 | 1539031 | 40 | 6.68e-96 | 87.34 |
| 233 | 692119 | + | 233 | 692236 | 15 | 3.32e-93 | 93.56 |
| 213 | 85440 | - | 216 | 468947 | 11 | 2.05e-91 | 94.91 |
| 232 | 692120 | + | 232 | 692789 | 16 | 1.05e-90 | 93.10 |
| 279 | 105830 | + | 279 | 1670378 | 32 | 9.03e-89 | 88.53 |
| 225 | 1437117 | - | 226 | 1456289 | 18 | 1.55e-83 | 92.04 |
| 243 | 693340 | + | 244 | 693664 | 24 | 5.80e-83 | 90.16 |
| 394 | 691841 | + | 396 | 692394 | 77 | 8.27e-83 | 80.56 |
| 191 | 397488 | - | 193 | 832659 | 9 | 3.26e-82 | 95.34 |
| 264 | 667779 | - | 263 | 1052938 | 31 | 3.39e-82 | 88.26 |
| 319 | 130360 | - | 318 | 630425 | 51 | 1.95e-80 | 84.01 |
| 319 | 130360 | - | 318 | 212583 | 51 | 1.95e-80 | 84.01 |
| 264 | 798608 | + | 267 | 800003 | 34 | 6.00e-79 | 87.27 |
| 323 | 691912 | + | 323 | 693019 | 54 | 4.38e-78 | 83.28 |
| 367 | 913847 | + | 367 | 1312534 | 70 | 7.92e-78 | 80.93 |
In particular, all repeats of minimum length 100 with at least 80% sequence identity are shown. The length and position of the first instance of each repeat are given in columns 1 and 2, and those of the second instance in columns 4 and 5. The strand of the repeat is given in column 3. Column 6 shows the number of differences (i.e. insertions, deletions, replacements) in an optimal alignment of the two repeat instances. Column 7 shows the E-value of the repeat and column 8 the sequence identity. The two bullets mark the repeats that also occur in all four H. pylori reference strains 26695, J99, HPAG1, and P12.
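The identity values in column 8 appear to follow from the other columns. The sketch below assumes that identity is computed as one minus the number of differences divided by the longer of the two instance lengths. This formula is not stated in the source; it is an inference that reproduces the rows checked here, and the helper name `identity_percent` is hypothetical.

```python
def identity_percent(len1: int, len2: int, differences: int) -> float:
    """Percent identity of two repeat instances.

    Assumption (not stated in the source): the denominator is the longer
    of the two instance lengths.
    """
    longer = max(len1, len2)
    return round(100.0 * (longer - differences) / longer, 2)


if __name__ == "__main__":
    # (length 1, length 2, differences) copied from three rows of the table.
    for row in [(3241, 3242, 2), (818, 819, 94), (2207, 2201, 39)]:
        print(row, identity_percent(*row))
```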
Distribution of nucleotide differences in the best matches of the coding sequences of strain B8 against the supercontigs of B128.
| number of nucleotide differences | number of CDS | cumulative number |
|---|---|---|
| 0 | 1281 | 1281 |
| 1 | 269 | 1550 |
| 2 | 64 | 1614 |
| 3 | 16 | 1630 |
| 4 | 10 | 1640 |
| 5 | 6 | 1646 |
| 7 | 2 | 1648 |
| 9 | 1 | 1649 |
| 12 | 1 | 1650 |
| 15 | 1 | 1651 |
| 40 | 1 | 1652 |
The second column shows the number of sequences matching the number of differences. The third column accumulates the numbers of the second column.
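The cumulative column can be recomputed from the second column as a quick consistency check. The snippet below is only an illustration using the values from the table; the variable names are hypothetical.

```python
from itertools import accumulate

# number of nucleotide differences -> number of CDS (values from the table)
cds_by_differences = {
    0: 1281, 1: 269, 2: 64, 3: 16, 4: 10, 5: 6,
    7: 2, 9: 1, 12: 1, 15: 1, 40: 1,
}

# Running total in table order (dicts preserve insertion order).
cumulative = list(accumulate(cds_by_differences.values()))

if __name__ == "__main__":
    for diffs, total in zip(cds_by_differences, cumulative):
        print(diffs, cds_by_differences[diffs], total)
```

The last running total equals the number of B8 coding sequences that have a best match in the B128 supercontigs according to this table.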
List of 12 singletons of strain B8, i.e. genes that lie completely outside the uncovered regions and have neither an 80/80 blastp hit in the B128 proteome nor a complete match with at most 2% differences on the DNA level.
| | gene | product | J99-ortho | 26695-ortho |
|---|---|---|---|---|
| 1 | HPB8_138 | periplasmic protein TonB | ||
| 2 | HPB8_277 | hypothetical protein predicted by Glimmer/Critica | ||
| 3 | HPB8_399 | conserved hypothetical protein | | HP1105 |
| 4 | HPB8_639 | hypothetical protein predicted by Glimmer/Critica | ||
| 5 | HPB8_655 | Hydrogenase expression/formation protein hypD2 | ||
| 6 | HPB8_692 | Plasminogen-binding protein pgbA | ||
| 7 | HPB8_888 | ferrous iron transport protein B | jhp0627 | HP0687 |
| 8 | HPB8_922 | conserved hypothetical protein | jhp0654 | HP0716 |
| 9 | HPB8_976 | hypothetical protein predicted by Glimmer/Critica | ||
| 10 | HPB8_1447 | conserved hypothetical protein | | HP1187 |
| 11 | HPB8_1483 | methyl-accepting chemotaxis protein | jhp0075 | HP0082 |
| 12 | HPB8_1618 | hypothetical protein predicted by Glimmer/Critica | ||
The last two columns show the locus tags of all genes in strain J99 and strain 26695 that are orthologous to the given gene of strain B8.
Figure 2. Venn diagram of the genomes of the . The diagram shows the numbers of orthologous coding sequences of these strains. The genes of the core genome (1,189) are present in all strains. Singletons: 202 coding sequences of strain P12 (red), 204 coding sequences of strain 26695 (green), 112 coding sequences of strain J99 (yellow), 152 coding sequences of strain HPAG1 (white), and 293 coding sequences of strain B8 (blue) have no orthologs in the other four genomes. The diagram was drawn with the software EDGAR.
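The two quantities named in the caption, core genome and singletons, can be expressed as set operations once every coding sequence has been assigned to an ortholog group. The sketch below is a toy illustration only: the group IDs are invented, the function name is hypothetical, and the criteria EDGAR uses to call orthologs are not described here.

```python
from typing import Dict, Set, Tuple


def core_and_singletons(groups_by_strain: Dict[str, Set[int]]) -> Tuple[Set[int], Dict[str, Set[int]]]:
    """Return (core groups shared by all strains, groups unique to each strain)."""
    core = set.intersection(*groups_by_strain.values())
    singletons = {}
    for strain, groups in groups_by_strain.items():
        # Union of the ortholog groups seen in every other strain.
        others = set().union(*(g for s, g in groups_by_strain.items() if s != strain))
        singletons[strain] = groups - others
    return core, singletons


if __name__ == "__main__":
    # Hypothetical ortholog-group IDs, not data from the paper.
    toy = {"B8": {1, 2, 3}, "J99": {1, 2, 4}, "P12": {1, 5}}
    print(core_and_singletons(toy))
```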
Figure 3. Synteny plot of . The plot is based on pairwise blastn hits (blast version 2.2.21) with the following parameters: E-value: 10, minimum sequence identity: 80%, minimum bit score: 80. Red indicates homologous regions. Blue indicates inverted homologous regions. White indicates regions without homology in the other genome. The yellow arrow marks the PZ of strain B8. The graph was created with the Artemis Comparison Tool.
Figure 4. Synteny plot of . The plot is based on pairwise blastn hits (blast version 2.2.21) with the following parameters: E-value: 10, minimum sequence identity: 80%, minimum bit score: 80. Red indicates homologous regions. Blue indicates inverted homologous regions. White indicates regions without homology in the other genome. The yellow arrow marks the PZ of strain B8. The graph was created with the Artemis Comparison Tool. The synteny plots of strain B8 versus strains J99 and P12 can be found in Additional file 1, Figures S3 and S4.
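The hit selection described in the two synteny captions can be sketched as a filter over tabular BLAST output. This is an assumption-laden illustration, not the authors' pipeline: it assumes the standard 12-column tabular format (legacy `-m 8`, BLAST+ `-outfmt 6`), and the names `Hit`, `parse_hits` and `colour` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    qstart: int
    qend: int
    sstart: int
    send: int
    identity: float
    evalue: float
    bitscore: float


def parse_hits(lines: Iterable[str], max_evalue: float = 10.0,
               min_identity: float = 80.0, min_bitscore: float = 80.0) -> List[Hit]:
    """Keep hits that satisfy the thresholds quoted in the captions."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.rstrip("\n").split("\t")
        # Assumed columns: qseqid sseqid pident length mismatch gapopen
        #                  qstart qend sstart send evalue bitscore
        hit = Hit(int(f[6]), int(f[7]), int(f[8]), int(f[9]),
                  float(f[2]), float(f[10]), float(f[11]))
        if (hit.evalue <= max_evalue and hit.identity >= min_identity
                and hit.bitscore >= min_bitscore):
            hits.append(hit)
    return hits


def colour(hit: Hit) -> str:
    """Blue for inverted matches (subject coordinates descending), else red."""
    return "blue" if hit.sstart > hit.send else "red"
```

Treating descending subject coordinates as an inversion follows the way tabular BLAST reports minus-strand matches; it mirrors the red and blue coloring described in the captions.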
Figure 5. Comparison of the PZ of strain B8 and the PZ3 of strain P12. For strain B8, the genomic region from position 490,731 to 527,504 (3'-PZ) is shown. For strain P12, the genomic region from position 1,394,833 to 1,423,818 (PZ3) is shown. Genes with significant similarity (according to blastp hits) are drawn in the same color. Genes with no similarity in the other genome are drawn in solid yellow (B8) and solid red (P12). The image was created with the software AnnotationSketch [67].
Figure 6. Similarity comparison of the . The image shows the annotation of genes inside the cag-pathogenicity island of the four H. pylori genomes. Orthologous groups of genes are drawn in the same color. Each gene is labeled with its 'gene' feature tag from the GenBank annotation file. If a gene has no 'gene' feature tag in the GenBank file, the last four characters of its 'locus-tag' feature tag are used as the label instead. Genes that show no significant sequence similarity to any gene in the other genomes (singletons) are drawn without fill color and with a red outline. The image was created with the software AnnotationSketch [67].
Figure 7. Similarity comparison of the . Clusters of genes with homology are shown in the same color. Clusters of genes with no similarity in the other genome are drawn in grey (B8) and white (P12). In strain B8, the region delimited by the cag1 gene and the cagB gene is inverted and located 13,730 bp upstream of the cagA gene. A cluster of six genes (four singletons, two variable genes) is located between the groEL gene and the cagA gene.
Figure 8. Functional alignment of the three plasmids of the . Homologous genes are shown in the same color. The cluster of mob genes (green) and the repA gene (yellow) are present in all three plasmids. Genes homologous to pHPB8_1 (plasmid stabilization system protein) and to pHPB8_2 (conserved hypothetical protein) are present in pHPP12 and pHel4 (red). The microcin genes mccC and mccB are not present in pHPB8. Genes that are not present in the other plasmids are colored white (pHPB8), orange (pHPP12) and light blue (pHel4).