Jean-Michel Thiberge, Caroline Boursaux-Eude, Philippe Lehours, Marie-Agnès Dillies, Sophie Creno, Jean-Yves Coppée, Zoé Rouy, Aurélie Lajus, Laurence Ma, Christophe Burucoa, Anne Ruskoné-Foumestraux, Anne Courillon-Mallet, Hilde De Reuse, Ivo Gomperts Boneca, Dominique Lamarque, Francis Mégraud, Jean-Charles Delchier, Claudine Médigue, Christiane Bouchier, Agnès Labigne, Josette Raymond.
Abstract
BACKGROUND: Helicobacter pylori infection is associated with several gastro-duodenal inflammatory diseases of various levels of severity. To determine whether certain combinations of genetic markers can be used to predict the clinical source of the infection, we analyzed well-documented and geographically homogeneous clinical isolates using a comparative genomics approach.
Year: 2010 PMID: 20537153 PMCID: PMC3091627 DOI: 10.1186/1471-2164-11-368
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Hybridization reactions on a DNA macroarray membrane containing 254 PCR products that are representative of H. pylori. Bacterial DNAs from 120 isolates associated with various diseases, including chronic gastritis (yellow), intestinal metaplasia (pink), duodenal ulcer (blue) and gastric MZBL (green), were tested by hybridization. Isolates are listed on the horizontal axis and the genes tested on the vertical axis. Clustering (Genesis software) was carried out using the continuous values from 120 heterologous hybridization experiments, where each value corresponds to (log 26695 − log heterologous strain) for each tested gene (see Materials and Methods). Line colors range from blue, if the gene is present, to red, if absent. The intermediate colors reflect the degree of hybridization, and thus homology, but also the redundancy of the tested genes. This figure shows the clustering based on the complete set of 254 genes.
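The per-gene value used for clustering in Figure 1 can be illustrated with a minimal sketch. It assumes base-10 logarithms and raw spot intensities as input; the caption does not state the log base or any normalization, and the gene names and intensities below are made up for illustration.

```python
import math

def log_ratios(reference, test):
    """Per-gene log10(reference signal) - log10(test signal).

    Both arguments map gene name -> positive hybridization signal.
    Assumption: base-10 logs on raw signals (not stated in the source).
    """
    return {
        gene: math.log10(reference[gene]) - math.log10(test[gene])
        for gene in reference
        if gene in test
    }

# Hypothetical intensities for the 26695 reference and one clinical isolate.
ref_26695 = {"geneA": 1000.0, "geneB": 1000.0}
isolate = {"geneA": 1000.0, "geneB": 10.0}

ratios = log_ratios(ref_26695, isolate)
for gene, value in sorted(ratios.items()):
    print(gene, round(value, 2))
```

A value near zero means the isolate hybridizes about as strongly as the reference (gene likely present), while a large positive value means a much weaker signal in the isolate (gene likely absent or divergent).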
Figure 2. Hybridization reactions on a DNA macroarray membrane: clustering based on the 48 most discriminatory genes, identified as key combinations of variables (genes/axes) by Principal Component Analysis. These 48 genes are labeled in Additional file 1.
Figure 3. Genome map of . From outside to inside: GC skew (window 2500, step 500) in blue; total CDSs (green) with pseudogenes/partial genes (purple); CDSs coding for hypothetical restriction/modification systems (purple), phage proteins (orange), or insertion sequences (ISHp609) (green); total CDSs according to the matrix defined for gene identification (matrix no. 1 in red, matrix no. 2 in black, matrix no. 3 in green); RNA (rRNA in green, tRNA in purple and misc_RNA in red); rule; GC% (window 5000, step 2000) in yellow. The red arrow indicates the position of the origin of replication.
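The sliding-window GC skew track in Figure 3 can be sketched as follows. This assumes the usual definition, skew = (G − C)/(G + C), and linear (non-circular) windows; the caption gives only the window and step sizes, so both choices are assumptions, and `gc_skew` is a hypothetical helper name.

```python
def gc_skew(seq, window=2500, step=500):
    """Return (window start, GC skew) pairs along a sequence.

    Skew is (G - C) / (G + C) per window; windows with no G or C get 0.0.
    Assumption: windows do not wrap around the end of the sequence.
    """
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        result.append((start, skew))
    return result

# Toy sequence with tiny windows so the result is easy to check by hand.
toy = gc_skew("GGGGCCCC", window=4, step=2)
print(toy)
```

On a real chromosome the sign change of the skew is commonly used to locate the replication origin and terminus, which is consistent with the origin being marked on the map.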
Summary of comparative features of Helicobacter genomes
| Features of the strains | B38 | 26695 | J99 | Shi470 | H. h Strain ATCC 51449 | ||||
|---|---|---|---|---|---|---|---|---|---|
| cagPAI | POS | POS | POS | POS | POS | POS | HacGI | HHGI1 | |
| Size (bp) | 1,667,867 | 1,643,831 | 1,596,366 | 1,608,547 | 1,652,982 | 1,673,813 | 1,553,927 | 1,799,146 | |
| (G+C) content (%) | 38.9 | 39.2 | 39.1 | 38.9 | 38.9 | 38.8 | 38.2 | 35.9 | |
| Total CDSs (nb)b | 1,637 | 1,543 | 1,539 | 1,592 | 1,611 | 1,639 | 1,696 | 1,851 | |
| Complete CDSs (nb)b | 1,501 | 1,446 | 1,441 | 1,473 | 1,469 | 1,505 | 1,397 | 1,824 | |
| Average length (bp)b | 964 | 988 | 971 | 955 | 954 | 957 | 933 | 914 | |
| Coding density (%)b | 86.3 | 86.6 | 87.3 | 87.2 | 84.6 | 85.8 | 83.6 | 92.3 | |
| Partial CDSs (nb)b | 136 | 97 | 98 | 119 | 142 | 134 | 299 | 27 | |
| Truncated genes (nb)b | 9 | 10 | 7 | 4 | 7 | 7 | 11 | 11 | |
| Pseudogenes (nb)b | 127(7.8%) | 87(5.6%) | 91(5.9%) | 115(7.2%) | 135(8.4%) | 127(7.8%) | 288(17%) | 16(0.9%) | |
| Fragmented | 61(3.7%) | 38(2.8%) | 43(2.8%) | 52(3.2%) | 64(3.9%) | 56(3.4%) | 81(4.8%) | 8(0.4%) | |
| tRNA (nb) | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 37 | |
| Ribosomal RNA genes | |||||||||
| 23S (nb) | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | |
| 16S (nb) | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | |
| 5S (nb) | 3 | 2 | 2 | 2 | 3 | 2 | 2 | 1 | |
| IS-types | 17 | 6 | 7 | 5 | 9 | 1 | 13 | 2IS |
a These genomes each carry a plasmid: 9,369 bp (HPAG1), 10,031 bp (G27), 10,225 bp (P12) and 3,661 bp (Sheeba). Plasmids were not counted.
b Revised number with the MaGe system and manual curation
c Percentage of fragments of genes/total CDSs
d Percentage of fragmented genes/total CDSs
e Number of copies
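The bracketed percentages in the Pseudogenes row appear to be the pseudogene count divided by the Total CDSs value in the same column. A small check, using the first and last data columns of the table as printed (the helper name is hypothetical):

```python
def percent_of_total(count, total_cds):
    """Percentage of CDSs in a category, rounded to one decimal place."""
    return round(100.0 * count / total_cds, 1)

# (pseudogenes, total CDSs) copied from the first and last data columns.
first_column = percent_of_total(127, 1637)
last_column = percent_of_total(16, 1851)
print(first_column, last_column)
```

These reproduce the 7.8% and 0.9% shown in the table.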
Automatic distribution of protein functions, based on the COG classification, between Helicobacter strains
| COG functional category | Code | nb | % | nb | % | nb | % | nb | % | nb | % | nb | % | nb | % | nb | % | nb | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cell cycle control, cell division, chromosome partitioning | D | 39 | 2.55% | 39 | 2.38% | 38 | 2.46% | 34 | 2.21% | 38 | 2.39% | 37 | 2.30% | 39 | 2.38% | 37 | 2.10% | 33 | 1.78% |
| Cell wall/membrane/envelope biogenesis | M | 109 | 7.13% | 106 | 6.48% | 105 | 6.81% | 105 | 6.82% | 104 | 6.53% | 108 | 6.70% | 106 | 6.47% | 109 | 6.20% | 134 | 7.24% |
| Cell motility | N | 65 | 4.26% | 58 | 3.54% | 60 | 3.89% | 57 | 3.70% | 55 | 3.46% | 59 | 3.66% | 61 | 3.72% | 57 | 3.24% | 68 | 3.68% |
| Posttranslational modification, protein turnover, chaperones | O | 71 | 4.65% | 74 | 4.52% | 75 | 4.86% | 74 | 4.81% | 77 | 4.84% | 76 | 4.72% | 76 | 4.64% | 73 | 4.15% | 85 | 4.60% |
| Signal transduction mechanisms | T | 54 | 3.53% | 52 | 3.18% | 56 | 3.63% | 47 | 3.05% | 46 | 2.89% | 51 | 3.17% | 57 | 3.48% | 46 | 2.62% | 68 | 3.68% |
| Intracellular trafficking, secretion, vesicular transport | U | 59 | 3.86% | 76 | 4.64% | 73 | 4.73% | 66 | 4.29% | 71 | 4.46% | 74 | 4.59% | 83 | 5.06% | 56 | 3.18% | 60 | 3.24% |
| Defense mechanisms | |||||||||||||||||||
| Extracellular structures | W | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 1 | 0.07% | 1 | 0.06% | 1 | 0.06% | 0 | 0.00% | 0 | 0.00% | 1 | 0.05% |
| Cytoskeleton | Z | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% | 1 | 0.05% |
| Chromatin structure and dynamics | B | 1 | 0.07% | 1 | 0.06% | 0 | 0.00% | 1 | 0.07% | 1 | 0.06% | 1 | 0.06% | 1 | 0.06% | 0 | 0.00% | 1 | 0.05% |
| Translation, ribosomal structure, biogenesis | J | 134 | 8.77% | 138 | 8.43% | 141 | 9.14% | 137 | 8.90% | 138 | 8.67% | 136 | 8.44% | 139 | 8.48% | 134 | 7.62% | 148 | 8.00% |
| Transcription | K | 53 | 3.47% | 44 | 2.69% | 46 | 2.98% | 48 | 3.12% | 43 | 2.70% | 49 | 3.04% | 49 | 2.99% | 43 | 2.45% | 59 | 3.19% |
| Replication, recombination and repair | 10.62% | ||||||||||||||||||
| Energy production and conversion | C | 90 | 5.89% | 89 | 5.44% | 87 | 5.64% | 93 | 6.04% | 93 | 5.84% | 92 | 93 | 5.67% | 87 | 4.95% | 118 | 6.38% | |
| Amino acid transport | E | 151 | 9.88% | 145 | 8.86% | 152 | 9.85% | 146 | 9.49% | 153 | 9.61% | 150 | 9.31% | 150 | 9.15% | 160 | 9.10% | 200 | 10.81% |
| Nucleotide transport | F | 46 | 3.01% | 45 | 2.75% | 47 | 3.05% | 45 | 2.92% | 44 | 2.76% | 47 | 2.92% | 47 | 2.87% | 47 | 2.67% | 58 | 3.14% |
| Carbohydrate transport | G | 60 | 3.93% | 56 | 3.42% | 57 | 3.69% | 55 | 3.57% | 63 | 3.96% | 55 | 3.41% | 59 | 3.60% | 59 | 3.35% | 76 | 4.11% |
| Coenzyme transport | H | 75 | 4.91% | 73 | 4.46% | 75 | 4.86% | 75 | 4.87% | 75 | 4.71% | 75 | 4.66% | 76 | 4.64% | 67 | 3.81% | 87 | 4.70% |
| Lipid transport | I | 50 | 3.27% | 49 | 2.99% | 49 | 3.18% | 50 | 3.25% | 47 | 2.95% | 51 | 3.17% | 51 | 3.11% | 47 | 2.67% | 54 | 2.92% |
| Inorganic ion transport | P | 97 | 6.35% | 94 | 5.74% | 100 | 6.48% | 94 | 6.11% | 103 | 6.47% | 97 | 6.02% | 98 | 5.98% | 95 | 5.40% | 125 | 6.76% |
| Secondary metabolites biosynthesis, transport | Q | 26 | 1.70% | 25 | 1.53% | 25 | 1.62% | 23 | 1.50% | 23 | 1.45% | 22 | 1.37% | 26 | 1.59% | 24 | 1.36% | 37 | 2.00% |
| General function prediction only | R | 174 | 11.39% | 173 | 10.57% | 178 | 11.54% | 160 | 10.40% | 174 | 10.93% | 168 | 10.43% | 175 | 10.68% | 166 | 9.44% | 234 | 12.65% |
| Function unknown | S | 84 | 5.50% | 80 | 4.89% | 71 | 4.60% | 78 | 5.07% | 70 | 4.40% | 84 | 5.21% | 81 | 4.94% | 70 | 3.98% | 113 | 6.11% |
*The CDSs were manually curated in the MaGe system for the elimination of artifacts.
Figure 4. Synteny lineplot pairwise analyses between B38 and the .