P Le Flèche, Y Hauck, L Onteniente, A Prieur, F Denoeud, V Ramisse, P Sylvestre, G Benson, F Ramisse, G Vergnaud.
Abstract
BACKGROUND: Some pathogenic bacteria are genetically very homogeneous, making strain discrimination difficult. In the last few years, tandem repeats have been increasingly recognized as markers of choice for genotyping a number of pathogens. The rapid evolution of these structures appears to contribute to the phenotypic flexibility of pathogens. The availability of whole-genome sequences has opened the way to the systematic evaluation of tandem repeat diversity and its application to epidemiological studies.
Year: 2001 PMID: 11299044 PMCID: PMC31411 DOI: 10.1186/1471-2180-1-2
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Figure 1. Querying the tandem repeats database. 1A: Bacterial tandem repeats main page. Bacterial species are listed in alphabetical order. The name of the strain used for sequencing is indicated after the species name and before the genome size (expressed in megabases). The rightmost figure indicates the density (per Mb) of tandem repeat arrays longer than 100 bp. The search for tandem repeats can be restricted according to a combination of criteria, including total array length (L), repeat unit length (U), number of repeats (N), internal conservation of the repeats (V), position on the genome (Pos, expressed in kilobases), GC content of the array (%GC) and strand bias (B). Three different biases can be evaluated: GC bias, AT bias and purine-pyrimidine bias. The bias reflects strand asymmetry of the repeat sequence. The search output can either list the characteristics of the tandem repeats fulfilling the criteria, ordered by position on the genome, or classify the tandem repeats according to a selected structural parameter. 1B: Examples of queries in three genomes. All tandem repeat arrays spanning more than 100 base-pairs are classified according to repeat unit length. The query was run on Buchnera sp. (left panel), Yersinia pestis (middle panel) and Pseudomonas aeruginosa (right panel).
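The kind of query described in this caption can be sketched in a few lines. This is only an illustration: the record fields, the function names and the example values are hypothetical, the total length is approximated as U × N, and nothing here reflects how the actual database is implemented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TandemRepeat:
    """One tandem repeat array; field names are hypothetical."""
    pos_kb: float        # Pos: position on the genome, in kilobases
    unit_length: int     # U: repeat unit length, in bp
    copies: float        # N: number of repeats
    conservation: float  # V: internal conservation, in percent
    gc_percent: float    # %GC of the array

    @property
    def total_length(self) -> float:
        # L: total array length, approximated here as U * N (an assumption)
        return self.unit_length * self.copies


def query(repeats, length_over=0.0, unit_over=0, min_copies=0.0):
    """Return arrays passing all thresholds, ordered by genome position."""
    hits = [
        r for r in repeats
        if r.total_length > length_over
        and r.unit_length > unit_over
        and r.copies >= min_copies
    ]
    return sorted(hits, key=lambda r: r.pos_kb)


def classify_by_unit_length(repeats):
    """Count arrays per repeat unit length (the panel 1B style of output)."""
    return Counter(r.unit_length for r in repeats)


# Made-up records, for illustration only.
EXAMPLE = [
    TandemRepeat(pos_kb=1200.0, unit_length=18, copies=10.0,
                 conservation=85.0, gc_percent=48.0),
    TandemRepeat(pos_kb=10.0, unit_length=9, copies=4.0,
                 conservation=95.0, gc_percent=40.0),
    TandemRepeat(pos_kb=250.0, unit_length=10, copies=12.0,
                 conservation=90.0, gc_percent=52.0),
]

# Arrays spanning more than 100 bp, ordered by position.
long_arrays = query(EXAMPLE, length_over=100)
```

Calling `classify_by_unit_length(long_arrays)` then gives the per-unit-length counts that panel 1B displays for each genome.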
Figure 2. Relative frequency of tandem repeats within bacterial genomes. The ten non-pathogen species are listed on top. Within each category, species are ordered according to genome size (smallest genome on top). The density of tandem repeat arrays longer than 100 bp is plotted for each species (dark bars). The clear bars reflect the excess (χ² values) of tandem repeats with a repeat unit length that is a multiple of three.
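The caption does not say how the χ² values were computed. One plausible reading, shown here purely as an assumption, is a goodness-of-fit test against the null hypothesis that one repeat unit length in three is a multiple of three. The function name is hypothetical.

```python
def chi2_multiple_of_three(unit_lengths):
    """Chi-square statistic (1 degree of freedom) for an excess of
    unit lengths divisible by three.

    Assumed null hypothesis: a unit length is a multiple of three
    with probability 1/3. This null is an assumption, not taken
    from the paper.
    """
    n = len(unit_lengths)
    if n == 0:
        raise ValueError("at least one tandem repeat is required")
    observed = sum(1 for u in unit_lengths if u % 3 == 0)
    expected = n / 3           # expected count of multiples of three
    expected_other = n - expected
    return ((observed - expected) ** 2 / expected
            + ((n - observed) - expected_other) ** 2 / expected_other)
```

The statistic is unsigned, so a deficit of such repeats would also give a large value. A real analysis would check the direction of the deviation separately.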
Figure 3. Selection procedure of minisatellites for … 3A: Sixty-four tandem repeats have at least 7 units longer than 9 base-pairs. Panel A presents the distribution of these 64 loci according to repeat unit length. Each rectangle is a hyperlink to an alignment file. The rectangle indicated by the arrow is linked to the file illustrated in panel B. 3B: This is an annotated alignment file. The file corresponds to Yp3057ms09 (Table 1 and Figure 4; Yp: Yersinia pestis; 3057: position on the genome, expressed in kilobases; ms09: minisatellite index). The consensus pattern of 18 base-pairs is aligned to each motif. Annotations of the file are inserted within brackets. Although this minisatellite is very polymorphic, eleven different motifs (labeled a-k) are observed in the sequenced allele. The first four and last two copies are the most diverged and are rare. Four types of motifs (f, g, h, i) constitute most of the array. For convenience, 18 motifs have been removed from the alignment file and replaced by their letter code. The last two copies are 21 base-pairs long instead of 18. The end of the alignment file (panel B, bottom) provides sequence data flanking the tandem repeat array. The positions of the primers chosen for PCR amplification of this locus (Table 1) are shown underlined.
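The naming convention explained in the caption (species prefix, genome position in kilobases, minisatellite index) can be decoded mechanically. The sketch below is a minimal illustration; the function and type names are hypothetical, and the pattern only covers names shaped like Yp3057ms09.

```python
import re
from typing import NamedTuple


class MarkerName(NamedTuple):
    species: str      # species prefix, e.g. "Yp" for Yersinia pestis
    position_kb: int  # position on the genome, in kilobases
    index: int        # minisatellite index


# Letters, digits, the literal "ms" (any case), then digits.
_PATTERN = re.compile(r"([A-Za-z]+)(\d+)ms(\d+)", re.IGNORECASE)


def parse_marker_name(name: str) -> MarkerName:
    """Split a marker name such as 'Yp3057ms09' into its three parts."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized marker name: {name!r}")
    return MarkerName(match.group(1), int(match.group(2)), int(match.group(3)))
```

Names that follow a different scheme, such as CEB-Bams30, are deliberately rejected with a `ValueError`.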
Table 1. Description of Yersinia polymorphic markers
| 90 s |
| 90 s |
| 53°C |
| 1 min |
| 90 s |
Some structural characteristics of the tandem repeats are presented: U (unit length), N (number of repeats), %GC and V (% of conservation). PCR and electrophoresis conditions are as described in the Materials and Methods section: the annealing temperature is 60°C, the elongation time is 60 seconds and gels are 2% agarose, except where indicated otherwise. The total number of alleles is the number of alleles observed in 3 Y. pestis and 2 Y. pseudotuberculosis strains.
Figure 4. Images of PCR amplification of the twenty-five polymorphic minisatellites. DNA from three reference Y. pestis strains representing each of the main biovars, antiqua (lane 1), medievalis (lane 2) and orientalis (lane 3), and from two Y. pseudotuberculosis strains (lanes 4 and 5) was PCR amplified, and an aliquot of the products was run on 2% horizontal agarose gels as described. The length of the minisatellite motifs (U) and the size range are indicated on each panel. Yp2916ms07 has one of the shortest units (10 bp). Four alleles are clearly distinguished between the 150 and 200 bp marker fragments.
Table 2. Description of Bacillus anthracis polymorphic markers
| 1% |
| 53°C |
| 60 s |
| 65°C |
| 1% |
| 120 s |
| 1% |
| 120 s |
| 1% |
| 1% |
Some structural characteristics of the tandem repeats are presented: U (unit length), N (number of repeats), %GC and V (% of conservation). PCR and electrophoresis conditions are as described in the Materials and Methods section: the annealing temperature is 60°C, the elongation time is 60 seconds and gels are 2% agarose, except where indicated otherwise. The expected product length is deduced from the sequencing data corresponding to the Ames strain. When the typing of the Ames strain does not fit the expected value, the observed value is indicated in parentheses. Only one side of the Ceb-Bams30 minisatellite can be identified in the available Ames sequence. The other side was identified in the course of the independent, partial sequencing of B. anthracis strains (Vergnaud and colleagues, unpublished data). The total number of alleles includes alleles observed in the B. cereus strains. The Polymorphism Information Index (PIC), or Nei's diversity index, is calculated as 1 − Σ (allele frequency)².
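The diversity index defined in this footnote, 1 − Σ (allele frequency)², is straightforward to compute from a list of allele calls at one locus. The sketch below assumes one call per strain; the function name is hypothetical.

```python
from collections import Counter


def nei_diversity(alleles):
    """Return 1 - sum(p_i ** 2), where p_i is the frequency of allele i.

    `alleles` holds one allele call per strain for a single locus.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele call is required")
    counts = Counter(alleles)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())
```

A locus with a single allele scores 0, and the index approaches 1 as alleles become more numerous and more evenly distributed.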
Figure 5. PCR amplification of DNA from B. anthracis and B. cereus (six rightmost lanes) using primers for CEB-Bams30 (Table 2). The PCR products were run on a 40 cm long 2% ordinary agarose gel.
Figure 6. The genotype of each strain for the polymorphic minisatellites is given (size estimates for each allele are given in Table 3). "0" indicates a failure of the PCR amplification. This is most often associated with B. cereus strains, and probably reflects in these cases sequence divergence in the flanking sequence. The phylogenetic tree was produced using the Neighbor-Joining method as available on-line at
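Neighbor-Joining works from a matrix of pairwise distances between strains. The caption does not state which distance was used, so the sketch below assumes a simple one: the fraction of loci at which two genotypes carry different alleles, skipping any locus where either strain has a "0" (failed amplification). All names are hypothetical, and the tree-building step itself is not shown.

```python
def genotype_distance(a, b, missing=0):
    """Fraction of comparable loci at which two genotypes differ.

    Loci where either genotype equals `missing` are ignored.
    This distance measure is an assumption, not the paper's method.
    """
    if len(a) != len(b):
        raise ValueError("genotypes must cover the same loci")
    shared = [(x, y) for x, y in zip(a, b) if x != missing and y != missing]
    if not shared:
        raise ValueError("no locus was typed in both strains")
    return sum(1 for x, y in shared if x != y) / len(shared)


def distance_matrix(genotypes):
    """Return (names, matrix) for a mapping of strain name -> allele list."""
    names = list(genotypes)
    matrix = [
        [genotype_distance(genotypes[p], genotypes[q]) for q in names]
        for p in names
    ]
    return names, matrix
```

The resulting matrix is symmetric with zeros on the diagonal, which is the input that Neighbor-Joining implementations expect.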
Table 3. Correspondence between B. anthracis allele sizes and allele numbering
| ~410 | ~430 | ~450 | ~480 | ~520 |
| 484 | 514 | 544 | 559 | 574 | 589 | 704 | 734 | 857 |
| 307 | 346 | 385 |
| 603 | 1017 | 1305 | 1503 | 1557 | 1647 | 1809 | 1899 | 1953 |
| 328 | 382 | 454 | 481 | 490 | 652 | 742 | 787 | 814 | 850 |
| 409 | 535 | 571 | 589 | 607 |
| 541 | 631 | 676 |
| 591 | 627 | 699 | 735 | ~900 | ~950 |
| 569 | 611 | 653 | 821 |
| 336 | 420 | 462 | 504 | 630 | 672 |
| 376 | 391 |
| ~300 | ~375 | ~400 |
| 266 | 375 | 500 | 660 | 695 | 730 | 760 | 850 to 900 |
| 304 | 700 | 772 | 853 |
| 289 | 301 | 313 | 325 | 337 |
| 184 | 193 | 220 | 229 | 256 | ~280 | ~290 |
| ~135 | 153 | 162 | 171 | ~180 |
| 400 | 502 | 520 | 538 | 583 | 613 | 685 |
| 532 | 568 | 607 | 660 |
| 153 | 158 |
Alleles have been numbered in increasing size order. When the allele size (in base-pairs) observed in the Ames strain was in agreement with the size expected according to Ames sequence data, the values indicated in the table assume that alleles differ in size by a multiple of the motif length. These likely values will have to be confirmed by more accurate size estimation tools and allele sequencing. When the allele size in Ames is not as expected (Ceb-Bams1 and Ceb-Bams28), the estimated values are preceded by a ~. The Vrr and CG3 allele sizes were described in [2]; new alleles are indicated by a ~.
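The assumption stated in this footnote, that alleles differ from the sequenced Ames allele by a whole number of motifs, amounts to snapping a gel-based size estimate to the nearest permitted value. The sketch below illustrates that arithmetic; the function name is hypothetical and the numbers in the usage note are made up.

```python
def snap_allele_size(estimated_bp, reference_bp, unit_bp):
    """Snap a size estimate to reference_bp + k * unit_bp for integer k.

    reference_bp: allele size known from sequence data (e.g. the Ames allele)
    unit_bp:      motif (repeat unit) length
    """
    if unit_bp <= 0:
        raise ValueError("unit_bp must be positive")
    # Number of motifs gained (positive) or lost (negative) versus the reference.
    copies_delta = round((estimated_bp - reference_bp) / unit_bp)
    return reference_bp + copies_delta * unit_bp
```

With a reference allele of 301 bp and a 12 bp motif, a band estimated at about 310 bp would be reported as 313 bp. Estimates that fall exactly halfway between two permitted sizes are ambiguous, which is one reason such values need confirmation by sequencing.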
Figure 7. Significant correlation between number of alleles and minisatellite structural characteristics. The number of alleles is plotted as a function of total length and %GC for Bacillus anthracis, and of %matches for Yersinia pestis (the correlations are highly significant at the 0.01 level). The number of alleles for each locus is the total number detected (i.e. Bacillus anthracis and B. cereus; Yersinia pestis and Y. pseudotuberculosis).