
Inter-species horizontal transfer resulting in core-genome and niche-adaptive variation within Helicobacter pylori.

Nigel J Saunders, Prawit Boonmee, John F Peden, Stephen A Jarvis.

Abstract

BACKGROUND: Horizontal gene transfer is central to evolution in most bacterial species. The detection of exchanged regions is often based upon analysis of compositional characteristics and their comparison to the organism as a whole. In this study we describe a new methodology combining aspects of established signature analysis with textual analysis approaches. This approach has been used to analyze the two available genome sequences of H. pylori.
RESULTS: This gene-by-gene analysis reveals a wide range of genes related to both virulence behaviour and the strain differences that have been relatively recently acquired from other sequence backgrounds. These frequently involve single genes or small numbers of genes that are not associated with transposases or bacteriophage genes, nor with inverted repeats typically used as markers for horizontal transfer. In addition, clear examples of horizontal exchange in genes associated with 'core' metabolic functions were identified, supported by differences between the sequenced strains, including: ftsK, xerD and polA. In some cases it was possible to determine which strain represented the 'parent' and 'altered' states for insertion-deletion events. Different signature component lengths showed different sensitivities for the detection of some horizontally transferred genes, which may reflect different amelioration rates of sequence components.
CONCLUSION: New implementations of signature analysis that can be applied on a gene-by-gene basis for the identification of horizontally acquired sequences are described. These findings highlight the central role of the availability of homologous substrates in evolution mediated by horizontal exchange, and suggest that some components of the supposedly stable 'core genome' may actually be favoured targets for integration of foreign sequences because of their degree of conservation.

Year:  2005        PMID: 15676066      PMCID: PMC549213          DOI: 10.1186/1471-2164-6-9

Source DB:  PubMed          Journal:  BMC Genomics        ISSN: 1471-2164            Impact factor:   3.969


Background

Helicobacter pylori is a bacterial pathogen associated with gastritis, peptic ulcers, gastric adenocarcinoma, and rare lymphomas [1]. It has a highly panmictic and highly diverse population structure in which homologous recombination makes the predominant contribution to sequence differences [2]. The acquisition of genes from other strains and species is by far the most rapid evolutionary process. This occurs frequently without loss of existing functions, is central to the evolution of niche-adaptive and pathogenic characteristics of bacteria, and greatly influences inter-strain differences in gene complement [3-5]. In this context, it is notable that none of the traits typically used to differentiate E. coli from Salmonella can be attributed to point mutations; they are instead broadly attributable to horizontal exchange [6]. H. pylori is relatively unusual in that it is a naturally transformable Gram-negative species that does not appear to have a species-specific DNA uptake sequence and appears to rely upon its niche separation as a transformation barrier [7]. Disease-associated H. pylori strains have been divided into two types, type I being those that carry the cag pathogenicity island (cag PAI) [8], which has a foreign species origin, and are associated with more severe disease. Dinucleotide composition is highly stable within a genome and can distinguish between sequences from different species. Because of this constancy, the dinucleotide composition of a species is referred to as a 'genome signature' [9,10]. This characteristic has been applied to assessments of DNA metabolic processes such as methylation and base conversion, DNA structure, and evolutionary relationships. It has also become established as a method for the identification of sequences that have been acquired by inter-species horizontal transfer.
For example, these methods have recently been used to show lateral transfer of a tryptophan pathway operon [11] and the gain of additional metabolic functions in Pseudomonas putida [12], and to determine that many gain-of-function genes have been acquired by E. coli rather than lost from S. typhi [13]. More recently developed Bayesian methods based upon similar premises have been used to assess global signatures and determine the origins of some lateral transfer events [14,15]. However, there are problems associated with this and other methods that use progressive 'walking windows', and the larger the window the greater the problems. These result from the inclusion of intergenic sequence, the inability to distinguish divergence due to a single highly divergent gene from that due to a cluster of less divergent ones, and an inability to identify the limits of the abnormal regions. In practice, additional features are necessary to determine the ends of such regions, such as the location of repeats typical of pathogenicity islands in H. pylori [16], or comparisons with other sequences as in N. meningitidis strain MC58 [17]. In addition, divergence scores are influenced by the size of the sampling window, such that sampling effects limit analysis of sequences shorter than about 800 bp (data not presented), and the need to use fixed window sizes prevents gene-by-gene studies. We describe the use of a linear implementation of signature analysis that can efficiently address a range of walking window sizes using dinucleotide signatures (DNS) and longer signatures. In addition, we describe a new approach based upon classical text analysis that allows genomes to be analyzed gene-by-gene. Analysis of H. pylori sequences, combined with comparisons of the identified genes between genomes, reveals complex changes that influence both niche-adaptive and core functions, illustrating a previously unpredicted range of functions that are continuously undergoing variation and selection.
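To make the idea of a dinucleotide 'genome signature' concrete, the following is a minimal sketch and not the implementation used in this study. It assumes the commonly used relative-abundance form, rho(xy) = f(xy) / (f(x) f(y)), and scores a gene by the mean absolute difference between its 16 rho values and those of the whole genome. The function names, the single-strand counting, and the choice of distance are illustrative assumptions.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
DINUCS = ["".join(p) for p in product(BASES, repeat=2)]


def relative_abundance(seq):
    """Return rho[xy] = f(xy) / (f(x) * f(y)) for the 16 dinucleotides.

    Assumption: counts are taken on the given strand only, and positions
    containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for d in DINUCS:
        expected = (mono[d[0]] / n_mono) * (mono[d[1]] / n_mono) if n_mono else 0.0
        observed = di[d] / n_di if n_di else 0.0
        # Dinucleotides that cannot occur (a base is absent) are scored as 0.
        rho[d] = observed / expected if expected > 0 else 0.0
    return rho


def signature_divergence(gene_seq, genome_seq):
    """Mean absolute difference between gene and genome relative abundances."""
    rho_gene = relative_abundance(gene_seq)
    rho_genome = relative_abundance(genome_seq)
    return sum(abs(rho_gene[d] - rho_genome[d]) for d in DINUCS) / len(DINUCS)
```

Because the score is computed per sequence rather than per fixed window, it can be applied to one open reading frame at a time, although very short sequences give noisy estimates, as noted above for sequences shorter than about 800 bp.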

Results and discussion

Genes were ranked on the basis of their divergence from the mean genome composition. The degree of divergence that indicates acquisition from other species is not absolute: it is influenced by the frequency with which genes are acquired, the untypicality of the donated material, and the rate at which acquired genes are ameliorated to the host sequence composition. Strains J99 and 26695 had 53 (Table 1) and 60 (Table 4) genes, respectively, with DNS that were >2 SD from the mean. Those with annotated functions included genes from the cag pathogenicity island (6 and 5), vac and related toxins (3 and 4), and restriction-modification genes (2 and 4). On the basis of the similarities determined in the H. pylori strain J99 sequence annotation, 7 of the most divergent genes as determined by DNS are not present in strain 26695. Likewise, 2 of the 50 most divergent genes in strain 26695 are not present in strain J99. This is consistent with the identification of genes acquired from other species that have not extended to both sequenced strains. It also suggests that a significant proportion of the 6 to 7% of genes unique to one or other strain [18] are inherent to the Helicobacter gene pool, but are variably present in different strains rather than reflecting recent foreign origins. Comparisons of a selection of identified orthologous genes in the two strains are shown in Figure 1.
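The ranking and the >2 SD cut-off described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the study: the per-gene divergence scores are taken as given, the population standard deviation is used (the text does not say which estimator was applied), and the function name and gene identifiers are hypothetical.

```python
from statistics import mean, pstdev


def rank_divergent_genes(scores, n_sd=2.0):
    """Rank genes by divergence and flag those above mean + n_sd * SD.

    scores: dict mapping a gene identifier to its divergence score.
    Returns a list of (rank, gene, score, flagged) tuples, most divergent first.
    Assumption: population SD (pstdev); a sample SD would shift the cut-off slightly.
    """
    values = list(scores.values())
    cutoff = mean(values) + n_sd * pstdev(values)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [
        (rank, gene, scores[gene], scores[gene] > cutoff)
        for rank, gene in enumerate(ranked, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical scores for illustration only; these are not data from the paper.
    example = {"geneA": 0.08, "geneB": 0.09, "geneC": 0.10, "geneD": 0.11, "geneE": 0.45}
    for row in rank_divergent_genes(example):
        print(row)
```

Since the cut-off is relative to the distribution within one genome, the same raw score can fall above the threshold in one strain and below it in another, which is one reason the rank of an orthologue can differ between J99 and 26695.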
Table 1

The 53 most divergent (>2 SD) genes in H. pylori strain J99 by DNS showing their ranking in strain 26695 and in TNS and HNS analysis

DNS order | JHP # | Annotation | 26695 # | 26695 DNS order | TNS order | HNS order
1JHP0952hypothetical proteinHP04271431355
2JHP0476cag pathogenicity island protein (cag7)HP0527122
3JHP0556vacuolating cytotoxin (vacA) paralogHP0609/104/1354
4JHP0274vacuolating cytotoxin (vacA) paralogHP0289265
5JHP0305hypothetical proteinHP03223810
6JHP0942hypothetical proteinHP099651327
7JHP0856vacuolating cytotoxin (vacA) paralogHP0922696
8JHP0050hypothetical proteinHP005888784
9JHP1300hypothetical proteinHP14081511
10JHP1044hypothetical proteinHP11168148
11JHP0928hypothetical proteinNAH-129
12JHP0074hypothetical proteinHP0080932125
13JHP0440hypothetical proteinHP048871617
14JHP1042hypothetical proteinHP11152025694
15JHP1321histidine and glutamine-rich metal-binding proteinHP143246449
16JHP0934hypothetical proteinNAH-1595
17JHP0495cag island protein (cagA)HP0547312012
18JHP0931topoisomerase I (topA 3)NAH-1820
19JHP0693hypothetical proteinHP075624591490
20JHP0632N-methylhydantoinaseHP0696194436
21JHP0471cag pathogenicity island protein (cag3)HP0522113562
22JHP0438outer membrane proteinHP04862667145
23JHP0026hypothetical proteinHP0030453664
24JHP1084outer membrane protein (omp26)HP1157341724
25JHP0481cag island protein (cagT)HP05322370558
26JHP0052hypothetical proteinHP00594324120
27JHP0336hypothetical proteinHP1089125154
28JHP1426iron(III) dicitrate transport protein (fecA)HP14003278111
29JHP0174hypothetical proteinHP0187 / 8 / 647&1127&5968890
30JHP1297type III restriction enzyme (res)NAH-6328
31JHP0953hypothetical proteinNAH-261463
32JHP0067urease beta subunit (urea amidohydrolase) (ureB)HP0072213770
33JHP0941integrase/recombinase (xerD)HP099525100541
34JHP0548flagellin A (flaA)HP06013340154
35JHP0299hypothetical proteinHP061/2230&76511275
36JHP1033hypothetical proteinHP110659262342
37JHP1409type II restriction enzyme (methyltransferase)NAH-5515
38JHP0626iron(III) dicitrate transport protein (fecA)HP0686628947
39JHP0940hypothetical proteinNAH-53393
40JHP1253hypothetical proteinHP13334075384
41JHP0132cytochrome oxidase (cbb3 type) (fixN)HP014427206209
42JHP0842hypothetical proteinHP0906422921
43JHP0925hypothetical proteinNAH-130990
44JHP0613hypothetical proteinHP0669694233
45JHP0565DNA mismatch repair protein (mutS)HP06212222782
46JHP1363DNA polymerase I (polA)HP1470308146
47JHP0489cag island protein (cagH)HP054171137398
48JHP1260siderophore-mediated iron transport protein (tonB)HP1341851260402
49JHP0492DNA transfer protein (cagE)HP05441049550
50JHP1121DNA-directed RNA polymerase, beta subunit (rpoB)HP1198842316
51JHP1434DNA repair protein (recN)HP139335177160
52JHP0491cag island protein (cagF)HP054382170828
53JHP0191hypothetical proteinHP020557337

Genes with > 2 SD divergence indicated in bold

NAH indicates No Annotated Homologue in the other sequence

Table 4

Top 60 most divergent (>2 SD) genes by DNS in H. pylori strain 26695 plus those additional genes in the top 50 genes from TNS and HNS

DNS order | Annotation | HP # | J99 # | J99 DNS order | TNS order | HNS order
1cag pathogenicity island protein (cag7)HP0527JHP0476211
2vacuolating cytotoxin (vacA) paralogHP0289JHP0274424
3poly E-rich hypothetical proteinHP0322JHP0305585
4hypothetical proteinHP0609JHP0556*369
5hypothetical proteinHP0996JHP094261446
6vacuolating cytotoxin (vacA) paralogHP0922JHP0856753
7hypothetical proteinHP0488JHP0440131012
8hypothetical proteinHP1116JHP1044101113
9hypothetical proteinHP0080JHP00741218122
10hypothetical proteinHP0489JHP044111536582
11cag pathogenicity island protein (cag3)HP0522JHP04712148100
12hypothetical proteinHP1089JHP0336276759
13vacuolating cytotoxin (vacA) paralogHP0610JHP0556*31217
14hypothetical proteinHP0427JHP095213737
15hypothetical proteinHP1408JHP130094738
16type III restriction enzyme R protein (res)HP0592NAH-3035
17hypothetical proteinHP0119NAH-72
18vacuolating cytotoxin (vacA)HP0887JHP0819592534
19N-methylhydantoinaseHP0696JHP0632203543
20hypothetical proteinHP1115JHP10421433866
21urease beta subunit (urea amidohydrolase) (ureB)HP0072JHP0067323887
22DNA mismatch repair protein (MutS)HP0621JHP05654513764
23cag island protein (cagT)HP0532JHP04812587693
24hypothetical proteinHP0756JHP069319711548
25integrase/recombinase (xerD)HP0995JHP09413339448
26outer membrane proteinHP0486JHP043822147142
27cytochrome oxidase (cbb3 type) (fixN)HP0144JHP013241102168
28type IIS restriction enzyme R and M protein (ECO57IR)HP1517NAH-4214
29DNA transfer protein (cagE)HP0441JHP0492495122
30DNA polymerase I (polA)HP1470JHP1363467754
31cag island protein (cagA)HP0547JHP049517157
32iron(III) dicitrate transport protein (fecA)HP1400JHP14262899129
33flagellin A (flaA)HP0601JHP05483440180
34outer membrane protein (omp26)HP1157JHP1084241725
35DNA repair protein (recN)HP1393JHP143451154207
36type I restriction enzyme R protein (hsdR)HP0464NAH-9026
37cell division protein (ftsK)HP1090JHP03356718190
38hypothetical proteinHP1003NAH-61170
39histidine-rich, metal binding polypeptide (hpn)HP1427NAH-261449
40hypothetical proteinHP1333JHP12534053296
41hypothetical proteinHP0788JHP07256872256
42hypothetical proteinHP0906JHP0842422216
43hypothetical proteinHP0059JHP00522621320
44GMP reductase (guaC)HP0854JHP0790107169451
45hypothetical proteinHP0030JHP0026232439
46histidine and glutamine-rich metal-binding proteinHP1432JHP13211591432
47hypothetical proteinHP0186JHP017429130276
48fucosyltransferaseHP0651JHP05961054375
49translation elongation factor EF-Tu (tufB)HP1205JHP11288164166
50virulence associated protein homolog (vacB)HP1248JHP116979164160
51hypothetical proteinHP0449NAH-81449
52type III restriction enzyme R proteinHP1371JHP12855511923
53virB4 homolog (virB4)HP0459NAH-4928
542',3'-cyclic-nucleotide 2'-phosphodiesterase (cpdB)HP0104JHP0096567368
55hypothetical proteinHP1479JHP1372135153127
56RNA polymerase sigma-70 factor (rpoD)HP0088JHP0081625531
57hypothetical proteinHP0205JHP019153788
58hypothetical proteinHP1143JHP1071782941
59hypothetical proteinHP1106JHP103336272277
60cag pathogenicity island protein (cag13)HP0534JHP0482712251021
63DNA topoisomerase I (topA)HP0440NAH-14924
68outer membrane protein (omp3)HP0079JHP00737964599
69hypothetical proteinHP0669JHP0613446042
74cag pathogenicity island protein (cag8)HP0528JHP0477725027
75hypothetical proteinHP0453NAH-5810
84DNA-directed RNA polymerase, beta subunit (rpoB)HP1198JHP1121502319
91hypothetical proteinHP1142JHP107060196
97multidrug resistance protein (spaB)HP0600JHP0547754130
103type I restriction enzyme R protein (hsdR)HP1402JHP14241958621
109adenine/cytosine DNA methyltransferaseHP0054NAH-12020
119preprotein translocase subunit (secA)HP0786JHP072315917649
121hypothetical proteinHP0058JHP00513941653
122hypothetical proteinHP0513JHP04621042815
125type I restriction enzyme M protein (hsdM)HP1403JHP142329934044
132hypothetical proteinHP0731JHP06681108032
139hypothetical proteinHP0508JHP0458843277
142hypothetical proteinHP1187JHP11132743138
167hypothetical proteinHP1520NAH-2033
179hypothetical proteinHP0118JHP0110642736
195type III restriction enzyme R protein (res)HP1521JHP141016121018
209outer membrane protein (omp17)HP0725JHP066225747101
224hypothetical proteinHP0733JHP067076922248
230hypothetical proteinHP0611JHP029935371129
249hypothetical proteinHP0345NAH-461338
283hypothetical proteinHP0120NAH-4450
291translation initiation factor IF-2 (infB)HP1048JHP037733033245
297DNA polymerase III alpha-subunit (dnaE)HP1460JHP135350921947
342type I restriction enzyme R protein (hsdR)HP0846JHP078424410137
363adenine specific DNA methyltransferase (mod)HP1522JHP141185720711
410secreted protein involved in flagellar motilityHP1192JHP1117614131256
593hypothetical proteinHP1516NAH-341090
631hypothetical proteinHP0586JHP053457716329
1080type II restriction enzyme (methyltransferase)HP0478JHP043095322040

* probably frame shifted components of the same vacA related gene

Genes with > 2 SD divergence in each analysis are indicated in bold

NAH indicates No Annotated Homologue in the other sequence

Figure 1

Comparisons using LAlign between a representative selection of orthologous genes with divergent DNA present in both H. pylori strains J99 and 26695 (presented in descending order of divergence as determined in strain J99).

It cannot be assumed that all genes identified in this manner have been recently acquired. It is necessary to assess the nature of the sequence to determine whether its divergence might be accounted for by features of the encoded protein. For example, JHP0476/HP0527, JHP1300/HP1408 and JHP0074/HP0080 include repetitive sequences likely to account for their DNS divergence. This type of analysis cannot be used to determine the possible foreign origin of such genes. Notably, the most divergent cag PAI gene (JHP0476/HP0527, the 1st and 2nd most divergent gene in the whole genomes of strains 26695 and J99, respectively) has a highly complex repetitive structure, and the size of the large divergent peak associated with this island using previous methods is largely due to the presence of this gene. While a significant proportion of the genes identified in this analysis lie in regions that include several such genes and that share characteristics of islands of horizontal transfer or pathogenicity islands, this is far from universally true.
There are many instances of single genes or small numbers of genes that are not associated with any features that might otherwise have been used as indicators of horizontal acquisition, such as transposases and flanking repeats. Our initial goal was to identify recently acquired and exchanged genes as candidates likely to be important in niche-adaptation, host interactions, and alterations in bacterial fitness. It has been argued that essential genes are unlikely to be transferred successfully since recipient taxa would already bear functional orthologues, which would have experienced long-term co-evolution with the rest of the cellular machinery. In contrast, it is proposed that genes under weak or transient selection, such as those associated with nonessential catabolic processes, new operons, and new niche-adaptive changes, are likely to be successfully transferred and retained [19]. This leads to a model in which a 'core genome' composed of essential metabolic, regulatory, and cell division genes provides a stable context for the more labile non-essential and niche-adaptive genes. On this basis core genes are used for phylogenetic studies and are thought to provide a relatively constant background in which species evolution occurs. Many of the identified genes with known functions are virulence or niche-adaptive genes, including: the vacuolating cytotoxin and related toxins (2 and 3), urease and flagellar components, and genes involved in iron acquisition. However, we also find clear evidence, confirmed by differences between the two genome sequences, that recent, and therefore relatively frequent, horizontal transfer is not limited to genes associated with niche adaptation and virulence. Amongst the core function genes identified were mutS, ftsK, xerD, and polA. Comparisons of the latter three between the sequenced strains are shown in Figure 1f, g and j.
These comparisons support the results suggesting that these genes have been the substrates for horizontal exchange between species. Tetranucleotide composition has been used to consider the presence of palindromic sequences that might be substrates for restriction systems, of Chi sites, and of unstable repeats mediating phase variation [10], but longer component signatures have not been used to identify horizontally acquired regions in bacterial genomes. Following analysis of eukaryotic sequences it was concluded that DNS captures most of the departure from randomness in DNA sequences and that results for longer component lengths correlate highly with the DNS results [20]. Also, analysis of dinucleotides separated by no, one, or two other nucleotides showed that separated pairs are more nearly random than adjacent pairs, and these were concluded to be relatively uninformative [9]. However, in preliminary analyses, while the typically long walking windows gave results concordant with those previously reported, we found that smaller walking windows generated progressively more different patterns of divergence between component lengths. Using tetranucleotide (TNS) and hexanucleotide (HNS) signature analysis we find that, while in some instances there is significant overlap between the genes identified using the different component lengths, there are substantial differences that indicate additional horizontally transferred genes not identified by DNS alone (Tables 2 to 6).
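The comparison between component lengths can be illustrated with a short sketch. It is not the method used in this study: it assumes plain k-mer frequency profiles (k = 2, 4, 6 for DNS, TNS, HNS) and a Manhattan distance between a gene profile and the genome profile, and all function names are hypothetical. It then reports genes that enter the top-n list for a longer component but not for the dinucleotide one.

```python
from collections import Counter

VALID = set("ACGT")


def kmer_frequencies(seq, k):
    """Frequencies of overlapping k-mers, skipping any k-mer with a non-ACGT character."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= VALID
    )
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


def kmer_divergence(gene_seq, genome_seq, k):
    """Manhattan distance between gene and genome k-mer profiles (0 to 2)."""
    f_gene = kmer_frequencies(gene_seq, k)
    f_genome = kmer_frequencies(genome_seq, k)
    kmers = set(f_gene) | set(f_genome)
    return sum(abs(f_gene.get(m, 0.0) - f_genome.get(m, 0.0)) for m in kmers)


def top_n(genes, genome_seq, k, n):
    """Names of the n most divergent genes for component length k."""
    scores = {name: kmer_divergence(seq, genome_seq, k) for name, seq in genes.items()}
    return set(sorted(scores, key=scores.get, reverse=True)[:n])


def only_in_longer(genes, genome_seq, n, short_k=2, long_k=6):
    """Genes in the top n for long_k that are absent from the top n for short_k."""
    return top_n(genes, genome_seq, long_k, n) - top_n(genes, genome_seq, short_k, n)
```

With hexanucleotides there are 4096 possible components, so profiles for short genes are sparse; any real analysis would need to account for gene length, which this sketch does not do.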
Table 2

Top 50 most divergent genes by TNS in H. pylori strain J99 plus those additional genes > 2 SD greater than the mean by DNS and the 50 most divergent by HNS

TNS order | Annotation | JHP # | 26695 # | DNS order | HNS order
1hypothetical proteinJHP1300HP140891
2cag pathogenicity island protein (cag7)JHP0476HP052722
3hypothetical proteinJHP0952HP042711355
4histidine and glutamine-rich metal-binding proteinJHP1321HP14321549
5vacuolating cytotoxin (vacA) paralogJHP0556HP0609/1034
6vacuolating cytotoxin (vacA) paralogJHP0274HP028945
7hypothetical proteinJHP0050HP0058884
8hypothetical proteinJHP0305HP0322510
9vacuolating cytotoxin (vacA) paralogJHP0856HP092276
10type I restriction enzyme (hsdS)JHP1422NAH3193
11hypothetical proteinJHP0299HP061/235275
12hypothetical proteinJHP0928NAH119
13hypothetical proteinJHP0942HP0996627
14hypothetical proteinJHP1044HP1116108
15hypothetical proteinJHP0934NAH1695
16hypothetical proteinJHP0440HP04881317
17outer membrane protein (omp26)JHP1084HP11572424
18topoisomerase I (topA 3)JHP0931NAH1820
19hypothetical proteinJHP0318NAH286293
20cag island protein (cagA)JHP0495HP05471712
21hypothetical proteinJHP0110HP01186419
22hypothetical proteinJHP1208HP128891830
23DNA-directed RNA polymerase, beta subunit (rpoB)JHP1121HP11985016
24hypothetical proteinJHP0052HP005926120
25hypothetical proteinJHP1042HP111514694
26hypothetical proteinJHP0953NAH311463
27hypothetical proteinJHP1070HP11426014
28hypothetical proteinJHP1113HP118727439
29hypothetical proteinJHP0842HP09064221
30type II restriction enzymeJHP0630NAH173588
31histidine-rich, metal binding polypeptide (hpn)JHP1320HP1427701404
32hypothetical proteinJHP0074HP008012125
33hypothetical proteinJHP0191HP0205537
34hypothetical proteinJHP0376HP10492351128
35cag pathogenicity island protein (cag3)JHP0471HP05222162
36hypothetical proteinJHP0026HP00302364
37urease beta subunit (urea amidohydrolase) (ureB)JHP0067HP00723270
38hypothetical proteinJHP0939HP0991116156
39multidrug resistance protein (spaB)JHP0547HP06007518
40flagellin A (flaA)JHP0548HP060134154
41hypothetical proteinJHP1071HP11437861
42hypothetical proteinJHP0613HP06694433
43hypothetical proteinJHP0623HP06822311186
44N-methylhydantoinaseJHP0632HP06962036
45hypothetical proteinJHP1049NAH278470
46vacuolating cytotoxin (vacA)JHP0819HP08875938
47putative restriction enzymeJHP0164NAH8843
48type I restriction enzyme R protein (hsdR)JHP0784HP084624435
49hook assembly protein, flagella (flgD)JHP0843HP0907103175
50hypothetical proteinJHP0458HP05088444
51hypothetical proteinJHP0336HP10892754
53hypothetical proteinJHP0940NAH39393
54hypothetical proteinJHP0462HP051310411
55type II restriction enzyme (methyltransferase)JHP1409NAH3715
58hypothetical proteinJHP1285HP13715525
59hypothetical proteinJHP0693HP0756191490
62cag pathogenicity island protein (cag8)JHP0477HP05287231
63type III restriction enzyme (res)JHP1297NAH3028
64hypothetical proteinJHP0668HP073111032
67outer membrane proteinJHP0438HP048622145
70cag island protein (cagT)JHP0481HP053225558
71RNA polymerase sigma-70 factor (rpoD)JHP0081HP00886237
75hypothetical proteinJHP1253HP133340384
78iron(III) dicitrate transport protein (fecA)JHP1426HP140028111
81DNA polymerase I (polA)JHP1363HP14704646
85type I restriction enzyme (hsdS)JHP0414NAH27530
88hypothetical proteinJHP0174HP0187/8/62990
89iron(III) dicitrate transport protein (fecA)JHP0626HP06863847
95DNA transfer protein (cagE)JHP0492HP05444950
100integrase/recombinase (xerD)JHP0941HP099533541
104type III restriction enzyme (mod)JHP1411HP152285713
105type I restriction enzyme R protein (hsdR)JHP0416HP04646329
122adenine specific DNA methyltransferase (mod)JHP0244HP026023648
130hypothetical proteinJHP0925NAH43990
137cag island protein (cagH)JHP0489HP054147398
138type I restriction enzyme (hsdR)JHP1424HP140219522
158hypothetical proteinJHP0540NAH67426
170cag island protein (cagF)JHP0491HP054352828
177DNA repair protein (recN)JHP1434HP139351160
190type III restriction enzyme (mod)JHP1296NAH12134
196role in outer membrane permeability (imp)JHP1138HP1215/620845
206cytochrome oxidase (cbb3 type) (fixN)JHP0132HP014441209
227DNA mismatch repair protein (mutS)JHP0565HP06214582
230hypothetical proteinJHP0534HP058657740
258type III restriction enzyme (res)JHP1410HP152116123
262hypothetical proteinJHP1033HP110636342
281translation initiation factor IF-2 (infB)JHP0377HP104833042
290type II restriction enzyme (methyltransferase)JHP1284NAH75041
1260siderophore-mediated iron transport protein (tonB)JHP1260HP134148402

Genes with > 2 SD divergence in each analysis are indicated in bold

NAH indicates No Annotated Homologue in the other sequence

Table 6

Top 50 most divergent genes by HNS in H. pylori strain 26695 plus those additional genes > 2 SD greater than the mean by DNS and the 50 most divergent by TNS

HNS order | Annotation | HP # | J99 # | DNS order | TNS order
1cag pathogenicity island protein (cag7)HP0527JHP047611
2hypothetical proteinHP0119NAH177
3vacuolating cytotoxin (vacA) paralogHP0922JHP085665
4vacuolating cytotoxin (vacA) paralogHP0289JHP027422
5poly E-rich hypothetical proteinHP0322JHP030538
6hypothetical proteinHP1142JHP10709119
7cag island protein (cagA)HP0547JHP04953115
8hypothetical proteinHP0205JHP01915778
9hypothetical proteinHP0609JHP0556*46
10hypothetical proteinHP0453NAH7558
11adenine specific DNA methyltransferase (mod)HP1522JHP1411363207
12hypothetical proteinHP0488JHP0440710
13hypothetical proteinHP1116JHP1044811
14type IIS restriction enzyme R and M protein (ECO57IR)HP1517NAH2842
15hypothetical proteinHP0513JHP046212228
16hypothetical proteinHP0906JHP08424222
17vacuolating cytotoxin (vacA) paralogHP0610JHP0556*1312
18type III restriction enzyme R protein (res)HP1521JHP1410195210
19DNA-directed RNA polymerase, beta subunit (rpoB)HP1198JHP11218423
20adenine/cytosine DNA methyltransferaseHP0054NAH109120
21type I restriction enzyme R protein (hsdR)HP1402JHP142410386
22DNA transfer protein (cagE)HP0441JHP04922951
23type III restriction enzyme R proteinHP1371JHP128552119
24DNA topoisomerase I (topA)HP0440NAH63149
25outer membrane protein (omp26)HP1157JHP10843427
26type I restriction enzyme R protein (hsdR)HP0464NAH3690
27cag pathogenicity island protein (cag8)HP0528JHP04777450
28virB4 homolog (virB4)HP0459NAH5349
29hypothetical proteinHP0586JHP0534631163
30multidrug resistance protein (spaB)HP0600JHP05479741
31RNA polymerase sigma-70 factor (rpoD)HP0088JHP00815655
32hypothetical proteinHP0731JHP066813280
33hypothetical proteinHP1520NAH16720
34vacuolating cytotoxinHP0887JHP08191825
35type III restriction enzyme R protein (res)HP0592NAH1630
36hypothetical proteinHP0118JHP011017927
37type I restriction enzyme R protein (hsdR)HP0846JHP0784342101
38hypothetical proteinHP1187JHP111314231
39hypothetical proteinHP0030JHP00264524
40type II restriction enzyme (methyltransferase)HP0478JHP04301080220
41hypothetical proteinHP1143JHP10715829
42hypothetical proteinHP0669JHP06136960
43N-methylhydantoinaseHP0696JHP06321935
44type I restriction enzyme M protein (hsdM)HP1403JHP1423125340
45translation initiation factor IF-2 (infB)HP1048JHP0377291332
46hypothetical proteinHP0996JHP0942514
47DNA polymerase III alpha-subunit (dnaE)HP1460JHP1353297219
48hypothetical proteinHP0733JHP0670224222
49preprotein translocase subunit (secA)HP0786JHP0723119176
50hypothetical proteinHP0120NAH28344
53hypothetical proteinHP0058JHP005112116
54DNA polymerase I (polA)HP1470JHP13633077
59hypothetical proteinHP1089JHP03361267
64DNA mismatch repair protein (MutS)HP0621JHP056522137
682',3'-cyclic-nucleotide 2'-phosphodiesterase (cpdB)HP0104JHP00965473
75fucosyltransferaseHP0651JHP05964843
77hypothetical proteinHP0508JHP045813932
87urease beta subunit (urea amidohydrolase) (ureB)HP0072JHP00672138
90cell division protein (ftsK)HP1090JHP033537181
99outer membrane protein (omp3)HP0079JHP00736845
100cag pathogenicity island protein (cag3)HP0522JHP04711148
101outer membrane protein (omp17)HP0725JHP066220947
122hypothetical proteinHP0080JHP0074918
127hypothetical proteinHP1479JHP137255153
129iron(III) dicitrate transport protein (fecA)HP1400JHP14263299
142outer membrane proteinHP0486JHP043826147
160virulence associated protein homolog (vacB)HP1248JHP116950164
166translation elongation factor EF-Tu (tufB)HP1205JHP11284964
168cytochrome oxidase (cbb3 type) (fixN)HP0144JHP013227102
170hypothetical proteinHP1003NAH3861
180flagellin A (flaA)HP0601JHP05483340
207DNA repair protein (recN)HP1393JHP143435154
256hypothetical proteinHP0788JHP07254172
276hypothetical proteinHP0186JHP017447130
277hypothetical proteinHP1106JHP103359272
296hypothetical proteinHP1333JHP12534053
320hypothetical proteinHP0059JHP00524321
448integrase/recombinase (xerD)HP0995JHP09412539
449hypothetical proteinHP0449NAH5181
451GMP reductase (guaC)HP0854JHP079044169
582hypothetical proteinHP0489JHP04411036
693cag island protein (cagT)HP0532JHP04812387
737hypothetical proteinHP0427JHP0952143
738hypothetical proteinHP1408JHP1300154
866hypothetical proteinHP1115JHP10422033
1021cag pathogenicity island protein (cag13)HP0534JHP048260225
1090hypothetical proteinHP1516NAH59334
1129hypothetical proteinHP0611JHP029923037
1256secreted protein involved in flagellar motilityHP1192JHP111741013
1338hypothetical proteinHP0345NAH24946
1432histidine and glutamine-rich metal-binding proteinHP1432JHP1321469
1449histidine-rich, metal binding polypeptide (hpn)HP1427NAH3926
1548hypothetical proteinHP0756JHP06932471

* probably frame shifted components of the same vacA related gene

Genes with > 2 SD divergence in each analysis are indicated in bold

NAH indicates No Annotated Homologue in the other sequence

Top 50 most divergent genes by HNS in H. pylori strain J99 plus those additional genes >2 SD greater than the mean by DNS and top 50 by TNS

Top 50 most divergent genes by TNS in H. pylori strain 26695 plus those additional genes > 2 SD greater than the mean by DNS and the 50 most divergent by HNS

The 50 most divergent J99 ORFs by HNS included 26 (52%) that were not among the 53 (>2 SD) most divergent by DNS; these included 11 restriction-modification system genes and 6 others that were not annotated within the strain 26695 genome sequence. The identification of genes of a type known to be horizontally exchanged, and different between the gene complements of the strains, is strong corroboration for the foreign origin of the additional genes identified by HNS. In several instances (Tables 2 to 6) the DNS did not detect these genes at all. For example, restriction enzymes that were the 3rd, 13th and 41st most divergent genes by HNS were the 319th, 857th and 750th most divergent by DNS, respectively.
In some instances the TNS gave intermediate results and in others it identified other genes as more divergent than the other methods did. The TNS was most sensitive for the detection of rpoB (HP1198 / JHP1121), which is associated with a significantly different gene length in the two strains (Figure 1h). One explanation for this observation is that while the DNS may initially be the most sensitive indicator of horizontal exchange, it may become ameliorated to the new sequence characteristics more rapidly than the longer component features, which are probably detecting qualitatively different sequence characteristics.

The differences in the analyses using different length components, and a comparison of the results from the two sequenced strains, suggest a complex evolutionary history for the cag pathogenicity island. These suggest that it probably has a mosaic structure including sequences from more than one species background, in addition to sequence that is entirely typical of H. pylori.

It is normally impossible to determine the chronology of events to distinguish insertions and deletions when comparing strains. In strain 26695 there are two open reading frames that are both good candidate coding sequences. There is only one gene in this location in strain J99, composed of the 5' gene from strain 26695 and the 3' end of the subsequent gene. This could have arisen from either a deletion or an insertion event. However, the normal DNS of the J99 gene (JHP0073, 799th in divergence) and the 5' 26695 gene (HP0079, 751st in divergence), and the high divergence of the 3' 26695 gene (HP0078, 68th in divergence), indicate that the most likely event is an insertion into strain 26695 (Figure 1l). Likewise HP0119 is likely to contain an insertion and JHP1113 probably reflects the original sequences (Figure 1k).

The inclusion of two DNA metabolism genes associated with recombination and repair is notable.
Both mutS and recN were identified in both strains (22nd and 35th, and 45th and 51st most divergent genes by DNS in strains 26695 and J99 respectively). When the homologous genes were compared between the strains, extensive divergences were evident between more than one region of each protein. That these genes have divergent signatures in both strains suggests that neither has a wholly native composition. This observation is consistent with the models of rapid evolution which suggest that transient competitive advantages are enjoyed by organisms that are hypermutators under conditions of environmental stress and transitions, and that these states can be produced by mutations in DNA repair genes [21-26]. However, such states have to be reversed so that an unsustainable mutational burden is not attained, and it has been proposed that this reversal is mediated by repair following horizontal transfer and homologous recombination, and that such strains are hyper-recombinogenic [27-29]. The atypical signatures of mutS and recN suggest that H. pylori is another species that can make use of this strategy for diversification under stressful conditions.

The identification of RNA polymerase genes, with associated differences between the strains, is striking. The divergence of phylogenetic trees based upon different sequences has been highlighted, and particularly the differences between the trees associated with RNA polymerase genes and rRNA [30,31]. It has been argued that RNA polymerase is as essential to cell function as is rRNA and that there is no compelling reason to choose rRNA as the more reliable marker [32].
While the DNS analysis does not address the stability of rRNA (and specifically excludes the rRNA sequences because their differing coding requirements and evolutionary pressures generate a divergent signature for other reasons), it does indicate that RNA polymerase can be a substrate for horizontal transfer, and that trees based upon this gene, or other essential genes, need not necessarily be considered a challenge to rRNA-based phylogenies.

Conclusions

The spectrum of recently horizontally acquired sequences identified emphasizes the two driving forces of horizontal exchange: the transfer of a phenotype which alters or enhances bacterial fitness, resulting in increased competitive fitness or altered niche adaptation, and the presence of a substrate for homologous recombination. Because of the focus upon, and relative ease of identifying, large islands associated with readily identifiable features and phenotypes, the importance of the latter component has perhaps been underestimated. The genes that have been considered to code for 'core metabolic' or 'house-keeping' functions are amongst those most likely to be changed by horizontal transfer events because of the presence of homologous substrates, and changes are likely to persist even when the change is phenotypically neutral. Equally, changes in the genes involved in core functions such as gene expression and DNA metabolism may have pleiotropic effects, and there may be significant differences in strain behaviour that are not simply the consequence of differences in their respective gene complements. The selection of genes for phylogenetic analysis on the basis of their coding for conserved core functions is also problematic because these are also frequently the genes most likely to share the high homology that facilitates recombination and horizontal exchange.

Methods

A traditional nucleotide signature is generated by segmenting a sequence of DNA into k equal-sized subsequences (or 'windows'). The mathematical basis for the signature is an odds ratio, $p$, calculated by dividing the frequency of a length-L oligonucleotide by its expected frequency. The odds ratios for each of the $4^L$ oligonucleotides in each window (w) are compared with the odds ratios for the overall sequence (s) [9,10,33]. The normalized difference δ is plotted and thus a nucleotide signature consists of a k-length sequence of δ values:

$$\delta(w,s) = \frac{1}{4^L}\sum_{i \in x}\left|p_i(w) - p_i(s)\right|$$

where x is the set of all permutations of length L and i is one such permutation.

There are interesting parallels between signature-style genome analysis and stylometric techniques previously used to determine the authorship of controversial literary texts. This is analogous to the biological problem and it is from this that our method is derived. Rather than using a fixed-window signature, signature scores are calculated for each coding open reading frame (ORF) and weighted with variance estimates so that the scores for shorter ORFs are comparable with those of their longer counterparts. Bissell's weighted cusum (cumulative sum) [34] is modified so that n denotes the number of ORFs in the genome, $X_i$ the number of oligonucleotides in ORF i, and $w_i$ the number of nucleotides in ORF i. The results are scaled according to ORF size using the standard error σ = √(*#ORF). In this way false positives are abrogated by normalizing for over-representation of lower order peptides.

The method is implemented in Java and efficiency is maintained through an O(N) (N = sequence length) refinement: probabilities for the complete sequence are calculated in O(N) steps for any length-L oligonucleotide, and remain O(N) when $4^L > N$ through a hashing function; the second part of the program calculates σ for each ORF using a loop flattening technique, thereby avoiding the program having to recalculate overlapping sub-expressions.
The program is available from and . Sequence alignments, as shown in Figure 1, were performed using Lalign and viewed using Lalignview [35].
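
The per-ORF comparison described above can be illustrated with a minimal Python sketch. It is not the authors' Java program: the function names `odds_ratios` and `delta` are hypothetical, the expected frequency of an oligonucleotide is assumed here to be the product of its mononucleotide frequencies (the text does not specify the expectation model), sequences are assumed to contain only A, C, G and T, and the weighted cusum scaling is omitted. Oligonucleotides are counted in a single pass with a hash-based counter, in the spirit of the O(N) counting mentioned above.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

def odds_ratios(seq, L):
    """Return {oligo: observed frequency / expected frequency} for all 4**L oligos.

    Assumption: the expected frequency is the product of the mononucleotide
    frequencies of the sequence (a zero-order model chosen for illustration).
    """
    seq = seq.upper()
    n_windows = len(seq) - L + 1
    base_counts = Counter(seq)
    total = sum(base_counts[b] for b in BASES)
    if n_windows <= 0 or total == 0:
        raise ValueError("sequence too short for this oligonucleotide length")
    base_freq = {b: base_counts[b] / total for b in BASES}
    # One pass over the sequence; the Counter is a hash map keyed by oligo.
    oligo_counts = Counter(seq[i:i + L] for i in range(n_windows))
    ratios = {}
    for oligo in map("".join, product(BASES, repeat=L)):
        expected = 1.0
        for b in oligo:
            expected *= base_freq[b]
        observed = oligo_counts[oligo] / n_windows
        ratios[oligo] = observed / expected if expected > 0 else 0.0
    return ratios

def delta(window, whole, L):
    """Mean absolute difference between the odds ratios of a window (or ORF)
    and those of the whole sequence, averaged over all 4**L oligonucleotides."""
    pw = odds_ratios(window, L)
    ps = odds_ratios(whole, L)
    return sum(abs(pw[o] - ps[o]) for o in pw) / 4 ** L
```

With `L = 2`, `L = 4` and `L = 6` the same function would give dinucleotide, tetranucleotide and hexanucleotide comparisons respectively; a sequence compared with itself scores zero.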

Abbreviations

ORF, open reading frame; DNS, dinucleotide signature; TNS, tetranucleotide signature; HNS, hexanucleotide signature.

Authors' contributions

NJS initiated the project, performed the genome sequence analyses, compared the two strains, interpreted the results, and prepared the biological aspects of the manuscript. PB was a DPhil student who worked on the coding aspects of the new methodology. JFP contributed to the bioinformatics discussions and planning stage of this project. SAJ directed and primarily developed the analysis strategy and the implementation of the new computational basis of the methodology, and prepared the computational aspects of the manuscript.
Table 3

Top 50 most divergent genes by HNS in H. pylori strain J99 plus those additional genes >2 SD greater than the mean by DNS and top 50 by TNS

| HNS order | J99 annotation | JHP # | 26695 # | DNS order | TNS order |
|---|---|---|---|---|---|
| 1 | hypothetical protein | JHP1300 | HP1408 | 9 | 1 |
| 2 | cag pathogenicity island protein (cag7) | JHP0476 | HP0527 | 2 | 2 |
| 3 | type I restriction enzyme (hsdS) | JHP1422 | NAH | 319 | 10 |
| 4 | vacuolating cytotoxin (vacA) paralog | JHP0556 | HP0609/10 | 3 | 5 |
| 5 | vacuolating cytotoxin (vacA) paralog | JHP0274 | HP0289 | 4 | 6 |
| 6 | vacuolating cytotoxin (vacA) paralog | JHP0856 | HP0922 | 7 | 9 |
| 7 | hypothetical protein | JHP0191 | HP0205 | 53 | 33 |
| 8 | hypothetical protein | JHP1044 | HP1116 | 10 | 14 |
| 9 | hypothetical protein | JHP0928 | NAH | 11 | 12 |
| 10 | hypothetical protein | JHP0305 | HP0322 | 5 | 8 |
| 11 | hypothetical protein | JHP0462 | HP0513 | 104 | 54 |
| 12 | cag island protein (cagA) | JHP0495 | HP0547 | 17 | 20 |
| 13 | type III restriction enzyme (mod) | JHP1411 | HP1522 | 857 | 104 |
| 14 | hypothetical protein | JHP1070 | HP1142 | 60 | 27 |
| 15 | type II restriction enzyme (methyltransferase) | JHP1409 | NAH | 37 | 55 |
| 16 | DNA-directed RNA polymerase, beta subunit (rpoB) | JHP1121 | HP1198 | 50 | 23 |
| 17 | hypothetical protein | JHP0440 | HP0488 | 13 | 16 |
| 18 | multidrug resistance protein (spaB) | JHP0547 | HP0600 | 75 | 39 |
| 19 | hypothetical protein | JHP0110 | HP0118 | 64 | 21 |
| 20 | topoisomerase I (topA 3) | JHP0931 | NAH – check | 18 | 18 |
| 21 | hypothetical protein | JHP0842 | HP0906 | 42 | 29 |
| 22 | type I restriction enzyme (hsdR) | JHP1424 | HP1402 | 195 | 138 |
| 23 | type III restriction enzyme (res) | JHP1410 | HP1521 | 161 | 258 |
| 24 | outer membrane protein (omp26) | JHP1084 | HP1157 | 24 | 17 |
| 25 | hypothetical protein | JHP1285 | HP1371 | 55 | 58 |
| 26 | hypothetical protein | JHP0540 | NAH | 674 | 158 |
| 27 | hypothetical protein | JHP0942 | HP0996 | 6 | 13 |
| 28 | type III restriction enzyme (res) | JHP1297 | NAH | 30 | 63 |
| 29 | type I restriction enzyme R protein (hsdR) | JHP0416 | HP0464 | 63 | 105 |
| 30 | type I restriction enzyme (hsdS) | JHP0414 | NAH | 275 | 85 |
| 31 | cag pathogenicity island protein (cag8) | JHP0477 | HP0528 | 72 | 62 |
| 32 | hypothetical protein | JHP0668 | HP0731 | 110 | 64 |
| 33 | hypothetical protein | JHP0613 | HP0669 | 44 | 42 |
| 34 | type III restriction enzyme (mod) | JHP1296 | NAH | 121 | 190 |
| 35 | type I restriction enzyme R protein (hsdR) | JHP0784 | HP0846 | 244 | 48 |
| 36 | N-methylhydantoinase | JHP0632 | HP0696 | 20 | 44 |
| 37 | RNA polymerase sigma-70 factor (rpoD) | JHP0081 | HP0088 | 62 | 71 |
| 38 | vacuolating cytotoxin (vacA) | JHP0819 | HP0887 | 59 | 46 |
| 39 | hypothetical protein | JHP1113 | HP1187 | 274 | 28 |
| 40 | hypothetical protein | JHP0534 | HP0586 | 577 | 230 |
| 41 | type II restriction enzyme (methyltransferase) | JHP1284 | NAH | 750 | 290 |
| 42 | translation initiation factor IF-2 (infB) | JHP0377 | HP1048 | 330 | 281 |
| 43 | restriction enzyme | JHP0164 | NAH | 88 | 47 |
| 44 | hypothetical protein | JHP0458 | HP0508 | 84 | 50 |
| 45 | role in outermembrane permeability (imp) | JHP1138 | HP1215/6 | 208 | 196 |
| 46 | DNA polymerase I (polA) | JHP1363 | HP1470 | 46 | 81 |
| 47 | iron(III) dicitrate transport protein (fecA) | JHP0626 | HP0686 | 38 | 89 |
| 48 | adenine specific DNA methyltransferase (mod) | JHP0244 | HP0260 | 236 | 122 |
| 49 | histidine and glutamine-rich metal-binding protein | JHP1321 | HP1432 | 15 | 4 |
| 50 | DNA transfer protein (cagE) | JHP0492 | HP0544 | 49 | 95 |
| 54 | hypothetical protein | JHP0336 | HP1089 | 27 | 51 |
| 62 | cag pathogenicity island protein (cag3) | JHP0471 | HP0522 | 21 | 35 |
| 64 | hypothetical protein | JHP0026 | HP0030 | 23 | 36 |
| 70 | urease beta subunit (urea amidohydrolase) (ureB) | JHP0067 | HP0072 | 32 | 37 |
| 82 | DNA mismatch repair protein (mutS) | JHP0565 | HP0621 | 45 | 227 |
| 84 | hypothetical protein | JHP0050 | HP0058 | 8 | 7 |
| 90 | hypothetical protein | JHP0174 | HP0187/8/6 | 29 | 88 |
| 95 | hypothetical protein | JHP0934 | NAH | 16 | 15 |
| 111 | iron(III) dicitrate transport protein (fecA) | JHP1426 | HP1400 | 28 | 78 |
| 120 | hypothetical protein | JHP0052 | HP0059 | 26 | 24 |
| 125 | hypothetical protein | JHP0074 | HP0080 | 12 | 32 |
| 145 | Outer membrane protein | JHP0438 | HP0486 | 22 | 67 |
| 154 | flagellin A (flaA) | JHP0548 | HP0601 | 34 | 40 |
| 160 | DNA repair protein (recN) | JHP1434 | HP1393 | 51 | 177 |
| 209 | cytochrome oxidase (cbb3 type) (fixN) | JHP0132 | HP0144 | 41 | 206 |
| 275 | hypothetical protein | JHP0299 | HP061/2 | 35 | 11 |
| 342 | hypothetical protein | JHP1033 | HP1106 | 36 | 262 |
| 384 | hypothetical protein | JHP1253 | HP1333 | 40 | 75 |
| 393 | hypothetical protein | JHP0940 | NAH | 39 | 53 |
| 398 | cag island protein (cagH) | JHP0489 | HP0541 | 47 | 137 |
| 402 | siderophore-mediated iron transport protein (tonB) | JHP1260 | HP1341 | 48 | 1260 |
| 541 | integrase/recombinase (xerD) | JHP0941 | HP0995 | 33 | 100 |
| 558 | cag island protein (cagT) | JHP0481 | HP0532 | 25 | 70 |
| 694 | hypothetical protein | JHP1042 | HP1115 | 14 | 25 |
| 828 | cag island protein (cagF) | JHP0491 | HP0543 | 52 | 170 |
| 990 | hypothetical protein | JHP0925 | NAH | 43 | 130 |
| 1355 | hypothetical protein | JHP0952 | HP0427 | 1 | 3 |
| 1463 | hypothetical protein | JHP0953 | NAH | 31 | 26 |
| 1490 | hypothetical protein | JHP0693 | HP0756 | 19 | 59 |

Genes with > 2 SD divergence in each analysis are indicated in bold

NAH indicates No Annotated Homologue in the other sequence
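
The rank orders and the "> 2 SD" marking used in these tables can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the authors' procedure: the function name `rank_and_flag` is hypothetical, the gene labels and scores in the usage note are invented placeholders, and the sample standard deviation is used because the text does not say which estimator was applied.

```python
from statistics import mean, stdev

def rank_and_flag(scores, n_sd=2.0):
    """Rank genes from most to least divergent and flag those whose score
    exceeds the mean by more than n_sd standard deviations.

    scores: dict mapping a gene identifier to its divergence score.
    Returns a list of (rank, gene, score, flagged) tuples, rank 1 first.
    Assumption: the sample standard deviation is the intended estimator.
    """
    values = list(scores.values())
    threshold = mean(values) + n_sd * stdev(values)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, gene, score, score > threshold)
        for rank, (gene, score) in enumerate(ordered, start=1)
    ]
```

For example, `rank_and_flag({"orfA": 1.0, "orfB": 1.1, "orfC": 0.9, "orfD": 1.0, "orfE": 1.05, "orfF": 0.95, "orfG": 5.0})` places the placeholder `orfG` first and flags only that entry. Running the same ranking separately on dinucleotide, tetranucleotide and hexanucleotide scores would give the three order columns that the tables compare.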

Table 5

Top 50 most divergent genes by TNS in H. pylori strain 26695 plus those additional genes > 2 SD greater than the mean by DNS and the 50 most divergent by HNS

| TNS order | annotation | HP # | J99 # | DNS order | HNS order |
|---|---|---|---|---|---|
| 1 | cag pathogenicity island protein (cag7) | HP0527 | JHP0476 | 1 | 1 |
| 2 | vacuolating cytotoxin (vacA) paralog | HP0289 | JHP0274 | 2 | 4 |
| 3 | hypothetical protein | HP0427 | JHP0952 | 14 | 737 |
| 4 | hypothetical protein | HP1408 | JHP1300 | 15 | 738 |
| 5 | vacuolating cytotoxin (vacA) paralog | HP0922 | JHP0856 | 6 | 3 |
| 6 | hypothetical protein | HP0609 | JHP0556* | 4 | 9 |
| 7 | hypothetical protein | HP0119 | NAH | 17 | 2 |
| 8 | poly E-rich hypothetical protein | HP0322 | JHP0305 | 3 | 5 |
| 9 | histidine and glutamine-rich metal-binding protein | HP1432 | JHP1321 | 46 | 1432 |
| 10 | hypothetical protein | HP0488 | JHP0440 | 7 | 12 |
| 11 | hypothetical protein | HP1116 | JHP1044 | 8 | 13 |
| 12 | vacuolating cytotoxin (vacA) paralog | HP0610 | JHP0556* | 13 | 17 |
| 13 | secreted protein involved in flagellar motility | HP1192 | JHP1117 | 410 | 1256 |
| 14 | hypothetical protein | HP0996 | JHP0942 | 5 | 46 |
| 15 | cag island protein (cagA) | HP0547 | JHP0495 | 31 | 7 |
| 16 | hypothetical protein | HP0058 | JHP0051 | 121 | 53 |
| 17 | outer membrane protein (omp26) | HP1157 | JHP1084 | 34 | 25 |
| 18 | hypothetical protein | HP0080 | JHP0074 | 9 | 122 |
| 19 | hypothetical protein | HP1142 | JHP1070 | 91 | 6 |
| 20 | hypothetical protein | HP1520 | NAH | 167 | 33 |
| 21 | hypothetical protein | HP0059 | JHP0052 | 43 | 320 |
| 22 | hypothetical protein | HP0906 | JHP0842 | 42 | 16 |
| 23 | DNA-directed RNA polymerase, beta subunit (rpoB) | HP1198 | JHP1121 | 84 | 19 |
| 24 | hypothetical protein | HP0030 | JHP0026 | 45 | 39 |
| 25 | vacuolating cytotoxin (vacA) | HP0887 | JHP0819 | 18 | 34 |
| 26 | histidine-rich, metal binding polypeptide (hpn) | HP1427 | NAH | 39 | 1449 |
| 27 | hypothetical protein | HP0118 | JHP0110 | 179 | 36 |
| 28 | hypothetical protein | HP0513 | JHP0462 | 122 | 15 |
| 29 | hypothetical protein | HP1143 | JHP1071 | 58 | 41 |
| 30 | type III restriction enzyme R protein (res) | HP0592 | NAH | 16 | 35 |
| 31 | hypothetical protein | HP1187 | JHP1113 | 142 | 38 |
| 32 | hypothetical protein | HP0508 | JHP0458 | 139 | 77 |
| 33 | hypothetical protein | HP1115 | JHP1042 | 20 | 866 |
| 34 | hypothetical protein | HP1516 | NAH | 593 | 1090 |
| 35 | N-methylhydantoinase | HP0696 | JHP0632 | 19 | 43 |
| 36 | hypothetical protein | HP0489 | JHP0441 | 10 | 582 |
| 37 | hypothetical protein | HP0611 | JHP0299 | 230 | 1129 |
| 38 | urease beta subunit (urea amidohydrolase) (ureB) | HP0072 | JHP0067 | 21 | 87 |
| 39 | integrase/recombinase (xerD) | HP0995 | JHP0941 | 25 | 448 |
| 40 | flagellin A (flaA) | HP0601 | JHP0548 | 33 | 180 |
| 41 | multidrug resistance protein (spaB) | HP0600 | JHP0547 | 97 | 30 |
| 42 | type IIS restriction enzyme R and M protein (ECO57IR) | HP1517 | NAH | 28 | 14 |
| 43 | fucosyltransferase | HP0651 | JHP0596 | 48 | 75 |
| 44 | hypothetical protein | HP0120 | NAH | 283 | 50 |
| 45 | outer membrane protein (omp3) | HP0079 | JHP0073 | 68 | 99 |
| 46 | hypothetical protein | HP0345 | NAH | 249 | 1338 |
| 47 | outer membrane protein (omp17) | HP0725 | JHP0662 | 209 | 101 |
| 48 | cag pathogenicity island protein (cag3) | HP0522 | JHP0471 | 11 | 100 |
| 49 | virB4 homolog (virB4) | HP0459 | NAH | 53 | 28 |
| 50 | cag pathogenicity island protein (cag8) | HP0528 | JHP0477 | 74 | 27 |
| 51 | DNA transfer protein (cagE) | HP0441 | JHP0492 | 29 | 22 |
| 53 | hypothetical protein | HP1333 | JHP1253 | 40 | 296 |
| 55 | RNA polymerase sigma-70 factor (rpoD) | HP0088 | JHP0081 | 56 | 31 |
| 58 | hypothetical protein | HP0453 | NAH | 75 | 10 |
| 60 | hypothetical protein | HP0669 | JHP0613 | 69 | 42 |
| 61 | hypothetical protein | HP1003 | NAH | 38 | 170 |
| 64 | translation elongation factor EF-Tu (tufB) | HP1205 | JHP1128 | 49 | 166 |
| 67 | hypothetical protein | HP1089 | JHP0336 | 12 | 59 |
| 71 | hypothetical protein | HP0756 | JHP0693 | 24 | 1548 |
| 72 | hypothetical protein | HP0788 | JHP0725 | 41 | 256 |
| 73 | 2',3'-cyclic-nucleotide 2'-phosphodiesterase (cpdB) | HP0104 | JHP0096 | 54 | 68 |
| 77 | DNA polymerase I (polA) | HP1470 | JHP1363 | 30 | 54 |
| 78 | hypothetical protein | HP0205 | JHP0191 | 57 | 8 |
| 80 | hypothetical protein | HP0731 | JHP0668 | 132 | 32 |
| 81 | hypothetical protein | HP0449 | NAH | 51 | 449 |
| 86 | type I restriction enzyme R protein (hsdR) | HP1402 | JHP1424 | 103 | 21 |
| 87 | cag pathogenicity island protein (cag12) | HP0532 | JHP0481 | 23 | 693 |
| 90 | type I restriction enzyme R protein (hsdR) | HP0464 | NAH | 36 | 26 |
| 99 | iron(III) dicitrate transport protein (fecA) | HP1400 | JHP1426 | 32 | 129 |
| 101 | type I restriction enzyme R protein (hsdR) | HP0846 | JHP0784 | 342 | 37 |
| 102 | cytochrome oxidase (cbb3 type) (fixN) | HP0144 | JHP0132 | 27 | 168 |
| 119 | type III restriction enzyme R protein | HP1371 | JHP1285 | 52 | 23 |
| 120 | adenine/cytosine DNA methyltransferase | HP0054 | NAH | 109 | 20 |
| 130 | hypothetical protein | HP0186 | JHP0174 | 47 | 276 |
| 137 | DNA mismatch repair protein (MutS) | HP0621 | JHP0565 | 22 | 64 |
| 147 | outer membrane protein | HP0486 | JHP0438 | 26 | 142 |
| 149 | DNA topoisomerase I (topA) | HP0440 | NAH | 63 | 24 |
| 153 | hypothetical protein | HP1479 | JHP1372 | 55 | 127 |
| 154 | DNA repair protein (recN) | HP1393 | JHP1434 | 35 | 207 |
| 163 | hypothetical protein | HP0586 | JHP0534 | 631 | 29 |
| 164 | virulence associated protein homolog (vacB) | HP1248 | JHP1169 | 50 | 160 |
| 169 | GMP reductase (guaC) | HP0854 | JHP0790 | 44 | 451 |
| 176 | preprotein translocase subunit (secA) | HP0786 | JHP0723 | 119 | 49 |
| 181 | cell division protein (ftsK) | HP1090 | JHP0335 | 37 | 90 |
| 207 | adenine specific DNA methyltransferase (mod) | HP1522 | JHP1411 | 363 | 11 |
| 210 | type III restriction enzyme R protein (res) | HP1521 | JHP1410 | 195 | 18 |
| 219 | DNA polymerase III alpha-subunit (dnaE) | HP1460 | JHP1353 | 297 | 47 |
| 220 | type II restriction enzyme (methyltransferase) | HP0478 | JHP0430 | 1080 | 40 |
| 222 | hypothetical protein | HP0733 | JHP0670 | 224 | 48 |
| 225 | cag pathogenicity island protein (cag13) | HP0534 | JHP0482 | 60 | 1021 |
| 272 | hypothetical protein | HP1106 | JHP1033 | 59 | 277 |
| 332 | translation initiation factor IF-2 (infB) | HP1048 | JHP0377 | 291 | 45 |
| 340 | type I restriction enzyme M protein (hsdM) | HP1403 | JHP1423 | 125 | 44 |

* probably frame shifted components of the same vacA related gene

Genes with > 2 SD divergence in each analysis are indicated in bold

NAH indicates No Annotated Homologue in the other sequence

  33 in total

Review 1.  Phylogenetic classification and the universal tree.

Authors:  W F Doolittle
Journal:  Science       Date:  1999-06-25       Impact factor: 47.728

2.  Costs and benefits of high mutation rates: adaptive evolution of bacteria in the mouse gut.

Authors:  A Giraud; I Matic; O Tenaillon; A Clara; M Radman; M Fons; F Taddei
Journal:  Science       Date:  2001-03-30       Impact factor: 47.728

Review 3.  Gene transfer, speciation, and the evolution of bacterial genomes.

Authors:  J G Lawrence
Journal:  Curr Opin Microbiol       Date:  1999-10       Impact factor: 7.934

Review 4.  Lateral gene transfer and the nature of bacterial innovation.

Authors:  H Ochman; J G Lawrence; E A Groisman
Journal:  Nature       Date:  2000-05-18       Impact factor: 49.962

5.  Evolutionary implications of the frequent horizontal transfer of mismatch repair genes.

Authors:  E Denamur; G Lecointre; P Darlu; O Tenaillon; C Acquaviva; C Sayada; I Sunjevaric; R Rothstein; J Elion; F Taddei; M Radman; I Matic
Journal:  Cell       Date:  2000-11-22       Impact factor: 41.582

Review 6.  Helicobacter pylori-related diseases.

Authors:  F Cremonini; A Gasbarrini; A Armuzzi; G Gasbarrini
Journal:  Eur J Clin Invest       Date:  2001-05       Impact factor: 4.686

7.  Absence in Helicobacter pylori of an uptake sequence for enhancing uptake of homospecific DNA during transformation.

Authors:  N J Saunders; J F Peden; E R Moxon
Journal:  Microbiology       Date:  1999-12       Impact factor: 2.777

8.  Phylogenetic evidence for horizontal transfer of mutS alleles among naturally occurring Escherichia coli strains.

Authors:  E W Brown; J E LeClerc; B Li; W L Payne; T A Cebula
Journal:  J Bacteriol       Date:  2001-03       Impact factor: 3.490

9.  Complete genome sequence of Neisseria meningitidis serogroup B strain MC58.

Authors:  H Tettelin; N J Saunders; J Heidelberg; A C Jeffries; K E Nelson; J A Eisen; K A Ketchum; D W Hood; J F Peden; R J Dodson; W C Nelson; M L Gwinn; R DeBoy; J D Peterson; E K Hickey; D H Haft; S L Salzberg; O White; R D Fleischmann; B A Dougherty; T Mason; A Ciecko; D S Parksey; E Blair; H Cittone; E B Clark; M D Cotton; T R Utterback; H Khouri; H Qin; J Vamathevan; J Gill; V Scarlato; V Masignani; M Pizza; G Grandi; L Sun; H O Smith; C M Fraser; E R Moxon; R Rappuoli; J C Venter
Journal:  Science       Date:  2000-03-10       Impact factor: 47.728

10.  High frequency of hypermutable Pseudomonas aeruginosa in cystic fibrosis lung infection.

Authors:  A Oliver; R Cantón; P Campo; F Baquero; J Blázquez
Journal:  Science       Date:  2000-05-19       Impact factor: 47.728
