Jing Cheng, Huping Xue, Xin Zhao.
Abstract
Tandem repeats (either as microsatellites or minisatellites) in eukaryotic and prokaryotic organisms are mutation-prone DNA. Although minisatellites are underrepresented in prokaryotic genomes, the cell surface adhesins of bacteria often contain the minisatellite SD repeats, which encode the amino acid pair serine-aspartate, especially in staphylococcal strains. However, their relationship to biological function remains elusive. In this study, copy number variation of SD repeats was examined by bioinformatic analysis, and changes in SD repeats were detected in a plasmid-based assay, as a first step toward understanding their biological functions. The SD repeats were found mainly in cell surface proteins. They were genetically unstable and polymorphic in both copy number and sequence composition. Unlike SNPs, changes in their copy number were reversible and did not cause frame shifts. More significantly, a rearrangement hot spot, the ATTC/AGRT site, was found to be mainly responsible for the instability and reversibility of SD repeats. These characteristics of SD repeats may help bacteria respond to environmental changes at low cost, with low risk and high efficiency.
Year: 2012 PMID: 22509353 PMCID: PMC3324548 DOI: 10.1371/journal.pone.0034756
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. DNA sequence characterization of clfA SD repeats of S. aureus Smith Cp and the locations of rearranged sites from 9 variants of pNZ3004-ClfA.rRWM and 1 variant of pNZ3004-ClfA.SAR.rRWM.C9.
The numbers at both sides indicate nucleotide positions. Perfect consensus repeats are marked in red and imperfect consensus repeats in black. The repeat immediately after the 15 central perfect consensus repeats has only 12 nucleotides, because this arrangement makes the repeat region as a whole conform maximally to the consensus. Each rearranged site in the 9 variants of pNZ3004-ClfA.rRWM is marked in green with the variant name in parentheses; the region between the two sites was looped out during the rearrangement. For the 1 variant of pNZ3004-ClfA.SAR.rRWM.C9, marked in blue with the variant name in parentheses, 5 SD repeats were added between the two rearrangement sites; these are not shown in the figure.
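The claim that copy number changes do not shift the reading frame can be illustrated with a toy calculation. The sketch below assumes an 18-nucleotide repeat unit (six codons, matching the six-residue consensus) and uses an invented unit sequence, an invented flanking sequence, and a deliberately tiny codon table; none of these are taken from the clfA sequence in the figure. It shows that deleting 18 or 12 nucleotides leaves the downstream translation intact, whereas deleting 17 does not.

```python
# Minimal codon table covering only the codons used in this toy example.
CODONS = {
    "TCA": "S", "AGC": "S",   # serine
    "GAT": "D", "GAC": "D",   # aspartate
    "GCA": "A", "AAA": "K",   # hypothetical flanking residues
}

def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODONS.get(dna[i:i + 3], "X") for i in range(0, usable, 3))

def delete(dna, start, length):
    """Remove `length` nucleotides beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]

UNIT = "GATTCAGATTCAGACAGC"   # hypothetical 18-nt repeat unit (encodes DSDSDS)
TAIL = "GCAAAAGCA"            # hypothetical downstream sequence (encodes AKA)
gene = UNIT * 4 + TAIL

one_unit_lost = delete(gene, 18, 18)   # whole 18-nt unit removed
short_loss = delete(gene, 18, 12)      # 12 nt removed, still a multiple of 3
frame_shift = delete(gene, 18, 17)     # 17 nt removed, not a multiple of 3

for name, seq in [("18-nt deletion", one_unit_lost),
                  ("12-nt deletion", short_loss),
                  ("17-nt deletion", frame_shift)]:
    print(name, translate(seq))
```

Only the 17-nucleotide deletion changes the residues read from the downstream sequence, which is the sense in which unit-sized gains and losses are low-risk for the protein.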
Plasmids and their relevant features in this study.
| Plasmid | Relevant features | Reference |
| pNZ3004 | | |
| pNZ3004-ClfA.rRWM | RWM domains cloned into pNZ3004, SD repeats had rearrangements as indicated by ‘r’ | this study |
| pNZ3004-ClfA.rRWM.C1 | R domain contained 8 repeats | this study |
| pNZ3004-ClfA.rRWM.C2 | R domain contained 18 repeats | this study |
| pNZ3004-ClfA.rRWM.C3 | R domain contained 8 repeats | this study |
| pNZ3004-ClfA.rRWM.C4 | R domain contained 29 repeats | this study |
| pNZ3004-ClfA.rRWM.C5 | R domain contained 23 repeats | this study |
| pNZ3004-ClfA.rRWM.C6 | R domain contained 11 repeats | this study |
| pNZ3004-ClfA.rRWM.C7 | R domain contained 11 repeats | this study |
| pNZ3004-ClfA.rRWM.C8 | R domain contained 13 repeats | this study |
| pNZ3004-ClfA.rRWM.C9 | R domain contained 24 repeats | this study |
| pNZ3004-ClfA.SA.rRWM.C4 | SA segment cloned into pNZ3004-ClfA.rRWM.C4 | this study |
| pNZ3004-ClfA.SA.rRWM.C6 | SA segment cloned into pNZ3004-ClfA.rRWM.C6 | this study |
| pNZ3004-ClfA.SA.rRWM.C8 | SA segment cloned into pNZ3004-ClfA.rRWM.C8 | this study |
| pNZ3004-ClfA.SA.rRWM.C9 | SA segment cloned into pNZ3004-ClfA.rRWM.C9 | this study |
| pNZ3004-ClfA.SAR.rRWM.C9 | SAR segment cloned into pNZ3004-ClfA.rRWM.C9 | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C1 | 87 repeats, instability of SD repeats discovered after 3 rounds of propagation | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C2 | 92 repeats, expansion of 5 SD repeats at rRWM segment | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7 | 87 repeats, instability of SD repeats discovered after a single transformation | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C1 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C2 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C3 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C4 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C5 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C6 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C7 | SAR segment increase, may have expansion in SD repeats of SAR | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C8 | deletion in SD repeats of SAR segment | this study |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C10 | deletion in SD repeats between SAR and rRWM, lost BamHI site | this study |
The copy numbers of the SD repeats were the same for the following plasmids: pNZ3004-ClfA.SAR.rRWM.C9.C1 and pNZ3004-ClfA.SAR.rRWM.C9.C3 to pNZ3004-ClfA.SAR.rRWM.C9.C10.
The 5′ recombination sites in SD repeat regions of different variants.
| Variants | Rearrangement site |
| pNZ3004-ClfA.rRWM.C1 | |
| pNZ3004-ClfA.rRWM.C2 | |
| pNZ3004-ClfA.rRWM.C4 | |
| pNZ3004-ClfA.rRWM.C5 | |
| pNZ3004-ClfA.rRWM.C6 | |
| pNZ3004-ClfA.rRWM.C7 | |
| pNZ3004-ClfA.rRWM.C8 | |
| pNZ3004-ClfA.rRWM.C9 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C2 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C1 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C2 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C3 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C4 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C5 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C6 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C8 | |
| pNZ3004-ClfA.SAR.rRWM.C9.C7.C10 | ATAG/CGAT |
| pNZ3004-ClfA.rRWM.C3 | |
The sequence before the slash is the homologous sequence located 5′ of the two recombination sites of the mutant.
The nucleotides 5′ of the two recombination sites are not completely homologous in this mutant.
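Candidate hot-spot sites such as ATTC/AGRT can be located in a sequence with a small motif search. The sketch below makes two assumptions that the source does not spell out: that R is the standard IUPAC code for A or G, and that the slash marks the breakpoint between the 5′ homologous sequence and the bases that follow it. The function names and the test sequence are hypothetical.

```python
import re

# Subset of IUPAC nucleotide codes, mapped to regex fragments.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def to_regex(motif):
    """Convert an IUPAC motif (no slash) into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_breakpoints(dna, site):
    """Return 0-based breakpoint indices for a site written as LEFT/RIGHT.

    A lookahead is used so that overlapping occurrences are all reported.
    The breakpoint is the index of the first base after LEFT.
    """
    left, right = site.upper().split("/")
    pattern = re.compile("(?=" + to_regex(left + right) + ")")
    return [m.start() + len(left) for m in pattern.finditer(dna.upper())]

# Hypothetical sequence containing two overlapping ATTC/AGRT occurrences.
toy = "GGATTCAGATTCAGGTCC"
print(find_breakpoints(toy, "ATTC/AGRT"))  # [6, 12]
```

Because SD repeat DNA is itself repetitive, occurrences of a short motif can overlap, which is why the search uses a zero-width lookahead instead of a plain match.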
Figure 2. Instability of SD repeats shown by restriction enzyme analyses.
(A) Map of plasmid pNZ3004-ClfA.SAR.rRWM.C9 showing the locations of the restriction sites and the sizes of the different segments.
(B) After 3 rounds of propagation, instability of the construct pNZ3004-ClfA.SAR.rRWM.C9.C1 was detected by restriction enzyme analysis. Lane 1: uncut plasmid; Lane 2: digested with BamHI; Lane 3: digested with BamHI/SalI; Lane 4: digested with PstI; Lane 5: digested with SalI; Lane 6: digested with EcoRI; Lane 7: BamHI activity control (plasmid PF01 digested with BamHI); Lane 8: ultraRanger 1 kb DNA ladder (kb). Arrows A and B indicate the incomplete digestion by BamHI and the failed double digestion by BamHI/SalI, respectively. Arrow C indicates the mixture of shorter plasmid DNA. Arrow D indicates the rearrangement in the region between the two PstI sites.
(C) The rearranged variants from the transformants of pNZ3004-ClfA.SAR.rRWM.C9.C7, digested with PstI. Lane 1: pNZ3004-ClfA.SAR.rRWM.C9.C7.C1 (deletion in SD repeats between SAR and rRWM). Lane 2: pNZ3004-ClfA.SAR.rRWM.C9.C7.C2 (deletion in SD repeats between SAR and rRWM). Lane 3: pNZ3004-ClfA.SAR.rRWM.C9.C7.C3 (deletion in SD repeats between SAR and rRWM). Lane 4: pNZ3004-ClfA.SAR.rRWM.C9.C7.C4 (deletion in SD repeats between SAR and rRWM). Lane 5: pNZ3004-ClfA.SAR.rRWM.C9.C7.C5 (deletion in SD repeats between SAR and rRWM). Lane 6: pNZ3004-ClfA.SAR.rRWM.C9.C7.C6 (deletion in SD repeats between SAR and rRWM). Lane 7: pNZ3004-ClfA.SAR.rRWM.C9.C7.C7 (expansion in SD repeats between SAR and rRWM). Lane 8: pNZ3004-ClfA.SAR.rRWM.C9.C7.C8 (deletion in SD repeats between SAR and rRWM). Lane 9: pNZ3004-ClfA.SAR.rRWM.C9.C7.C9 (no rearrangement). Lane 10: pNZ3004-ClfA.SAR.rRWM.C9.C7.C10 (deletion in SD repeats between SAR and rRWM).
(D) The rearranged variants from the transformants of pNZ3004-ClfA.SAR.rRWM.C9.C7, double-digested with BamHI and SalI. Lane 1: pNZ3004-ClfA.SAR.rRWM.C9.C7.C1 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 2: pNZ3004-ClfA.SAR.rRWM.C9.C7.C2 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 3: pNZ3004-ClfA.SAR.rRWM.C9.C7.C3 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 4: pNZ3004-ClfA.SAR.rRWM.C9.C7.C4 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 5: pNZ3004-ClfA.SAR.rRWM.C9.C7.C5 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 6: pNZ3004-ClfA.SAR.rRWM.C9.C7.C6 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 7: pNZ3004-ClfA.SAR.rRWM.C9.C7.C7 (SAR segment increase, may have expansion in SD repeats of SAR). Lane 8: pNZ3004-ClfA.SAR.rRWM.C9.C7.C8 (deletion in SD repeats of SAR). Lane 9: pNZ3004-ClfA.SAR.rRWM.C9.C7.C9 (no rearrangement). Lane 10: pNZ3004-ClfA.SAR.rRWM.C9.C7.C10 (rearrangement occurred between SAR and rRWM, lost BamHI site). Lane 11: ultraRanger 1 kb DNA ladder (kb).
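The logic behind reading rearrangements from these digests can be sketched as a fragment-size calculation on a circular plasmid. All plasmid lengths and cut positions below are invented for illustration and are not the coordinates of pNZ3004-ClfA.SAR.rRWM.C9; the function name is also hypothetical. The sketch shows how a deletion between two sites shortens one fragment, and how losing one site turns a two-band double digest into a single linear band.

```python
def circular_fragments(plasmid_len, cut_positions):
    """Fragment sizes (largest first) after cutting a circular plasmid.

    cut_positions are 0-based coordinates on the circle. With no cuts the
    plasmid stays circular, reported here as one molecule of full length.
    """
    cuts = sorted(set(p % plasmid_len for p in cut_positions))
    if not cuts:
        return [plasmid_len]
    # Distances between neighbouring cuts, plus the piece that wraps around.
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(plasmid_len - cuts[-1] + cuts[0])
    return sorted(sizes, reverse=True)

# Hypothetical 10,000 bp plasmid with two sites for one enzyme.
print(circular_fragments(10_000, [2_000, 5_000]))   # [7000, 3000]

# A 900 bp deletion between the two sites shortens only the smaller fragment.
print(circular_fragments(9_100, [2_000, 4_100]))    # [7000, 2100]

# Double digest with two enzymes, then the same digest after one site is lost.
print(circular_fragments(10_000, [1_000, 4_000]))   # [7000, 3000]
print(circular_fragments(10_000, [4_000]))          # [10000]
```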
Figure 3. A schematic diagram of S. aureus ClfA organization and the insert fragments in the constructs.
S: the signal sequence; A: the binding domain; R: the SD repeat region; W: the wall-spanning region; M: the membrane-spanning region. ClfA.RWM: the RWM segment of ClfA from S. aureus Smith Cp, which was cloned into pNZ3004. ClfA.SA.rRWM: the inserted portion of pNZ3004-ClfA.SA.rRWM after cloning the SA segment of ClfA from S. aureus Smith Cp into pNZ3004-ClfA.rRWM. ClfA.SAR.rRWM: the inserted portion of pNZ3004-ClfA.SAR.rRWM after cloning the SAR segment of ClfA from S. aureus Smith Cp into pNZ3004-ClfA.rRWM.
SD repeat variations in the surface proteins of different bacterial strains.
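| Strain | Protein name (No.) | Repeat range: Smallest | Repeat range: Largest | Repeat range: Average | Variation site: D | Variation site: S | Variation distribution: N | Variation distribution: M | Variation distribution: C | Variation distribution: W |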
|
| ClfA (36) | 45 | 65 | 52.18 | A,G,E,T,N | N,L | 10–20 | 10–15 | ||
|
| ClfB (30) | 9.33 | 59 | 37.75 | G,E,N | N | 5–10 | 1–5 | ||
|
| SdrC (28) | 5 | 58.67 | 27.80 | E | N,T | 1–5 | 1–4 | ||
|
| SdrD (27) | 22 | 38.33 | 30.53 | E | C | 0–5 | |||
|
| SdrE (28) | 13.33 | 27.67 | 23.73 | E | 0–3 | ||||
|
| Pls (6) | 26.33 | 51.33 | 38.05 | E | A | 2–30 | |||
|
| SdrF (4) | 32.67 | 93 | 69.75 | E | A,Y | 12–25 | |||
|
| SdrG (9) | 9.33 | 33.33 | 22.63 | A,G | N | 1–5 | 1–5 | ||
|
| SdrH (9) | 15 | 21 | 18.81 | G | A,N | 1–5 | 1–5 | ||
|
| Fbl (4) | 15.67 | 43.67 | 36.25 | G,N | A | 17–47 | |||
|
| Sdr (1) | 29 | 29 | 29 | A | 2 | 29 | |||
|
| SdrI (1) | 142.67 | 142.67 | 142.67 | A | 111 | ||||
|
| SdrX (1) | 34.67 | 34.67 | 34.67 | G,E,C | N,L | 5 | 8 | ||
|
| SdrZ (1) | 21.33 | 21.33 | 21.33 | G,N | A,Q | 21 | |||
|
| Sdr (4) | 81 | 267.67 | 181.92 | G,N | A,G | 2–3 | 2–3 | ||
|
| Adhesin 1 (1) | 235.67 | 235.67 | 235.67 | None | |||||
|
| Adhesin 2 (1) | 167.67 | 167.67 | 167.67 | A | 2 | ||||
|
| Sdr (1) | 424.33 | 424.33 | 424.33 | E | A | 2 | 14 | 1 | |
The detected proteins are those available in the Reference, Swiss-Prot and Non-redundant protein sequence databases (accessed on August 3rd, 2011). The number in parentheses is the number of strains that contain the same protein.
The variation sites in the SD repeats; letters are single-letter amino acid codes (SLC).
The distribution and number of variation sites in the proteins. N, C, M and W represent mutations in the N terminus, C terminus, middle, or whole sequence of the SD repeats, respectively.
Primers used in this study.
| Primer | Primer sequence (5′-3′) | |
| F | | forward |
| F-1 | | forward |
| F-505 | | forward |
| F-545 | | forward |
| F-pNZ3004 | | forward |
| R | | reverse |
| R-226 | | reverse |
| R-545 | | reverse |
| R-869 | | reverse |
| R-933 | | reverse |
Figure 4. The definition of perfect and imperfect SD repeats.
The tandem repeat (TR) sequence “SDSDSD” was defined as a perfect consensus repeat. An imperfect consensus repeat was defined as a repeat containing 1–3 residues that do not follow the consensus “SDSDSD” sequence. Both perfect and imperfect consensus repeats were counted as SD repeats.
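This definition translates directly into a small classifier. The sketch below is one possible reading, not the authors' pipeline: it assumes the repeat region has already been cut into non-overlapping six-residue units in register with the consensus, and it labels any unit with more than three mismatches as "non-repeat", a category name introduced here for illustration. The example protein segment is invented.

```python
from collections import Counter

CONSENSUS = "SDSDSD"

def classify_unit(unit):
    """Classify one six-residue unit against the SDSDSD consensus."""
    if len(unit) != len(CONSENSUS):
        raise ValueError("unit must have exactly six residues")
    mismatches = sum(a != b for a, b in zip(unit.upper(), CONSENSUS))
    if mismatches == 0:
        return "perfect"
    if mismatches <= 3:
        return "imperfect"
    return "non-repeat"   # label assumed here, not defined in the source

def count_sd_repeats(region):
    """Tally unit classes over consecutive six-residue windows.

    Trailing residues that do not fill a whole unit are ignored.
    """
    usable = len(region) - len(region) % len(CONSENSUS)
    units = [region[i:i + 6] for i in range(0, usable, 6)]
    return Counter(classify_unit(u) for u in units)

# Hypothetical segment: one perfect unit, two imperfect units, one non-repeat.
tally = count_sd_repeats("SDSDSDSDSDADSDADSNAKTGEL")
print(tally["perfect"], tally["imperfect"], tally["non-repeat"])  # 1 2 1
print(tally["perfect"] + tally["imperfect"])                      # 3 SD repeats
```

Fixed windows keep the example short; handling a repeat region whose register shifts partway through would need an alignment step, which is omitted here.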