Philip Jordan, Lori A S Snyder, Nigel J Saunders.
Abstract
BACKGROUND: Tandem repeats within coding regions can mediate phase variation when changes in the number of repeated units shift the reading frame of the coding sequence. Coding tandem repeats, by contrast, are those whose copy number does not alter the reading frame; changes in their copy number may instead alter the function or antigenicity of the encoded protein. Three complete neisserial genomes were analyzed and compared to identify coding tandem repeats whose copy number will have some structural consequence for the protein. This is the first study to address coding tandem repeats that may affect protein structure using comparative genomics, combined with a population survey to investigate which repeats show interstrain variability.
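The distinction drawn in the abstract comes down to whether the repeat unit length is a multiple of three. The following is a minimal Python sketch of that rule; the function names `repeat_effect` and `residues_gained` are hypothetical helpers introduced for illustration and do not come from the paper.

```python
def repeat_effect(unit_length_bp: int) -> str:
    """Classify a tandem repeat inside a coding region by its unit length.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if unit_length_bp <= 0:
        raise ValueError("unit length must be positive")
    if unit_length_bp % 3 == 0:
        # Gaining or losing a copy keeps the reading frame intact and
        # inserts or deletes unit_length_bp // 3 amino acids.
        return "coding"
    # Gaining or losing a copy shifts the reading frame, which is the
    # mechanism that can mediate phase variation.
    return "frameshifting"


def residues_gained(unit_length_bp: int, copy_change: int) -> int:
    """Amino acids gained (negative means lost) when an in-frame repeat changes copy number."""
    if unit_length_bp % 3 != 0:
        raise ValueError("only defined for in-frame (coding) repeats")
    return (unit_length_bp // 3) * copy_change


print(repeat_effect(108))          # a 108 bp unit encodes 36 aa and stays in frame
print(residues_gained(108, 3))     # three extra copies of a 108 bp unit
```

Every repeat length reported in the copy-number table of this record is a multiple of three, which is consistent with the definition of a coding tandem repeat.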
Year: 2003 PMID: 14611665 PMCID: PMC305346 DOI: 10.1186/1471-2180-3-23
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Genes containing coding tandem repeats with differing copy numbers.
| Tandem repeat | NMA#* | NMB# † | XNG# ‡ | Gene annotation § | Repeats noted previously? |
| TR1 | NMA0227 | NMB2141 | XNG1829 | hypothetical protein | no |
| TR2 | NMA0257 | NMB0010 | XNG1803 | phosphoglycerate kinase ( | no |
| TR3 | NMA0338 | NMB2092 | XNG1869 | hypothetical protein | no |
| TR4 | NMA0440 | NMB2001 | XNG1095 | conserved hypothetical protein | no |
| TR5 | NMA0650 | NMB1812 | XNG0088 | PilQ protein ( | yes |
| TR6 | NMA0702 | NMB0525 | XNG0120 | aluminium resistance protein, putative | no |
| TR7 | NMA0789 | NMB0586 | XNG0151 | adhesin, putative | no |
| TR8 | NMA1150 | NMB0956 | XNG0849 | SucB protein ( | no |
| TR9 | NMA1461 | NMB1027 | XNG0526 | DnaJ protein, truncation ( | no |
| TR10 | NMA1491 | NMB1281 | XNG0596 | transcription-repair coupling factor ( | no |
| TR11 | NMA1547 | NMB1333 | XNG0546 | conserved hypothetical protein | no |
| TR12 | NMA1612 | NMB1395 | XNG0677 | alcohol dehydrogenase, zinc-containing | no |
| TR13 | NMA1680 ¶ | NMB1468 | XNG0955 | hypothetical protein | no |
| TR14 | NMA1692 | NMB1483 | XNG0967 | lipoprotein NlpD, putative | no |
| TR15 | NMA1723 | NMB1523 | XNG0903 | Lip (H.8 antigen) protein ** | yes |
| TR16 | NMA1887 | NMB1623 | XNG1179 | major anaerobically induced OMP ( | yes |
| TR17 | NMA1897 | NMB1643 | XNG1189 | translation initiation factor IF-2 ( | no |
| TR18 | NMA1977 | NMB1723 | XNG1268 | cytochrome c oxidase, subunit III ( | no |
| TR19 | NMA1985 | NMB1730 | XNG1276 | TonB protein ( | no |
| TR20 | NMA2065 | NMB0419 | XNG1419 | conserved hypothetical protein ( | yes |
| TR21 | NMA2105 | NMB0382 | XNG1455 | outer membrane protein class 4 ( | no |
| TR22 | NMA2206 | NMB0281 | XNG1595 | peptidyl-prolyl cis-trans isomerase | no |
| TR23 | ¶¶ | ¶¶ | XNG0938 | hypothetical protein | no |
| TR24 | ¶¶ | NMB1848 | ¶¶ | hypothetical protein | no |
| TR25 | ¶¶ | ¶¶ | XNG0481 | hypothetical protein | no |
| TR26 | NMA0386 | NMB2050 | XNG1916 | conserved hypothetical protein | no |
| TR27 | NMA1515 | NMB1301 | XNG0578 | 30S ribosomal protein S1 ( | no |
| TR28 | NMA2213 | NMB0274 | XNG1602 | DNA helicase RecQ ( | no |
* [19]
† [20]
‡ Locus numbers from our own annotation of N. gonorrhoeae strain FA1090 as used in [43].
§ From [20] unless otherwise noted.
|| Gene annotation from [74]
¶ NMA1680 is annotated on the reverse complement strand compared to NMB1468 and XNG0955.
** Gene annotations from [75] and [30]
†† OMP: outer membrane protein. Gene annotation from [33]
‡‡ Gene annotation from [76]
§§ Gene annotation from [15]
|| || Gene annotation from [77]
¶¶ Corresponding gene is not present in this strain.
*** Gene annotation from [78]
Copy number of coding tandem repeats.
| Repeats | MC58 | 44/76 | NGE30 | BZ133 | 92001 | 94/155 | A22 | Z2491 | FAM18 | L18 | L12 | L22 | FA1090 | FA19 | 26034 | Repeat length | Nature * | |
| TR1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 30 bp / 10 aa | no new | |
| TR2 | 3 | 4 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | np | 2 | 2 | 2 | 33 bp / 11 aa | more | |
| TR3 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | np | np | 2 | 2 | 2 | 18 bp / 6 aa | no new | |
| TR4 | 3 | 2 | 3 | 3 | 3 | 3 | 3 | 4 | 3 | np | 1 | 1 | 2 | 2 | 1 | 87 bp / 29 aa | more | |
| TR5 † | 2 / 4 | 2 / 5 | 2 / 3 | 2 / 4 | 2 / 1 | 2 / 3 | 2 / 5 | 2 / 3 | 0 / 0‡ | np | np | np | 1 / 1 | 1 / 2 | 1 / 2 | † | more | |
| TR6 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 15 bp / 5 aa | no new | |
| TR7 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 3 | 2 | 2 | 3 | 2 | 3 | 3 | 2 | 12 bp / 4 aa | no new | |
| TR8 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 3 | 4 | 2 | 2 | 2 | 2 | 2 | 2 | 30 bp / 10 aa | more | |
| TR9 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 30 bp / 10 aa | species | |
| TR10 | 3 | 3 | 2 | 3 | 3 | 3 | 3 | 2 | 3 | 2 | 0 | 2 | 1 | 1 | 1 | 207 bp / 69 aa | more | |
| TR11 | 2 | 2 | 3 | 3 | 4 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 4 | 4 | 15 bp / 5 aa | more | |
| TR12 | 1 | 1 | 1 | 3 | 3 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 21 bp / 7 aa | no new | |
| TR13 | 2 | 2 | 7 | 2 | 2 | 3 | 2 | 2 | 4 | np | np | np | 3 | 3 | 3 | 21 bp / 7 aa | more | |
| TR14 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | np | np | np | 2 | 2 | 2 | 240 bp / 80 aa | species | |
| TR15 | 14 | 14 | 15 | 15 | 10 | 14 | 15 | 14 | 13 | np | np | np | 12 | 18 | 16 | 15 bp / 5 aa | more | |
| TR16 | 3 | 3 | 3 | 2 | 3 | 2 | np | 2 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 12 bp / 4 aa | no new | |
| TR17 | 2 | np | 2 | 2 | 2 | np | 2 | 2 | 2 | np | np | np | 1 | np | 1 | 57 bp / 19 aa | species | |
| TR18 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 12 bp / 4 aa | species§ | |
| TR19a | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | np | np | np | 2 | 2 | 2 | 18 bp / 6 aa | species | |
| TR19b | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 4 | 4 | 4 | 4 | 4 | 4 | 9 bp / 3 aa | species | |
| TR20 || | 4 | 4 | 8 | 1 | 2 | 2 | 1 | 1 | 1 | 5 | 8 | 11 | 2 | 2 | nd | 108 bp / 36 aa | more | |
| TR21 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | 6 | np | np | np | 2 | 2 | 2 | 6 bp / 2 aa | species | |
| TR22 | 7 | 11 | 19 | 5 | 6 | 9 | 16 | 9 | 26 | np | np | np | 2 | 2 | 2 | 9 bp / 3 aa | more | |
| TR24 | 15 | 14 | 3 | np | np | 5 | 14 | --- | --- | np | np | np | --- | np | np | 18 bp / 6 aa | more | |
| TR26 | 2 | 2 | 2 | 2 | 2 | 1¶ | 2 | 2 | 2 | np | np | np | 2 | 2 | 2 | 273 bp / 91 aa | none | |
| TR27 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | np | np | np | 2 | 2 | 2 | 261 bp / 87 aa | none | |
| TR28 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 258 bp / 86 aa | none | |
| Repeats | FA1090 | MS11 | FA19 | F62 | 25534 | 28539 | 26034 | 26241 | 27921 | 29528 | 28516 | 27806 | Repeat length | Nature * |
| TR23 | 3 | 3 | 3 | 3 | 5 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 18 bp / 6 aa | more |
| TR25 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 30 bp / 10 aa | none |
np – no product
nd – not done
* Nature of the length variations of the coding tandem repeat: species – observed lengths are associated with species; more – lengths observed in addition to those seen in the three sequenced strains used in the genome comparison analysis; no new – no lengths observed in addition to those seen in the sequenced strains used in the genome comparison analysis; none – no length differences observed.
† Compound tandem repeat; the first is 66 bp / 22 aa and the second is 24 bp / 8 aa.
‡ PilQ is frame-shifted in the genome sequence due to a large deletion starting 50 bp into the 66 bp component of the compound tandem repeat element and ending 361 bp after the region that contains the repeats in other strains.
§ Length differences were only seen in the non-pathogen, N. lactamica.
|| From [15].
¶ The CDS in this strain contains a 303 bp deletion, which begins 63 bp before the first copy of the tandem repeat in the other strains and ends 32 bp before the second copy of the tandem repeat in the other strains. Thus this strain has one full copy of the tandem repeat and 32 bp remaining of a second copy.
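The "Nature" categories defined in the footnote can be read as a simple set comparison between the copy numbers in the sequenced strains and those in the wider survey. The sketch below illustrates that reading under stated assumptions: the function name `classify_variation` is hypothetical, the sequenced strains are assumed to be MC58, Z2491 and FA1090, and the "species" category is deliberately not handled because it needs a strain-to-species map that is not encoded here.

```python
# Table entries that carry no copy number.
MISSING = {"np", "nd", "---"}

STRAINS = ["MC58", "44/76", "NGE30", "BZ133", "92001", "94/155", "A22", "Z2491",
           "FAM18", "L18", "L12", "L22", "FA1090", "FA19", "26034"]

# Assumption: the three genome-sequenced strains used as the reference set.
SEQUENCED = {"MC58", "Z2491", "FA1090"}


def classify_variation(copies: dict[str, str], sequenced: set[str] = SEQUENCED) -> str:
    """Return 'none', 'no new' or 'more' for one table row.

    Hypothetical helper. Entries with footnote marks or compound values
    (for example '1¶' or '2 / 4') must be cleaned before calling this.
    """
    observed = {s: int(v) for s, v in copies.items() if v not in MISSING}
    all_counts = set(observed.values())
    if len(all_counts) <= 1:
        return "none"  # no length differences observed
    reference = {n for s, n in observed.items() if s in sequenced}
    # 'no new' if every observed count already occurs in a sequenced strain.
    return "no new" if all_counts <= reference else "more"


tr1 = dict(zip(STRAINS, "2 2 2 2 2 2 2 2 2 2 2 2 3 2 2".split()))
tr2 = dict(zip(STRAINS, "3 4 2 2 2 2 2 2 2 2 2 np 2 2 2".split()))
print(classify_variation(tr1))  # no new
print(classify_variation(tr2))  # more
```

For rows labelled "species" in the table, this sketch returns "no new" or "more", since it has no species information to work with.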
Neisseria spp. strains used in this study.
| Strain set for tandem repeat PCRs | Strain identifier used in Table |
| | MC58 |
| | 44/76 |
| | NGE30 |
| | BZ133 |
| | 92001 |
| | 94/155 |
| | A22 |
| | L18 |
| | L12 |
| | L22 |
| | FA19 |
| | 26034 |
| Strain set for | |
| | FA1090 |
| | MS11 |
| | FA19 |
| | F62 |
| | 25534 |
| | 28539 |
| | 26593 |
| | 26241 |
| | 27921 |
| | 29528 |
| | 28516 |
| | 27806 |
Figure 1. Two consecutive tandem repeat elements exist in pilQ (TR5). The first repeated unit is 66 bp. The first 24 bp of this 66 bp repeat is homologous to the second repeated unit of 24 bp. Both repeats in this compound coding tandem repeat are present in different lengths in the strains.
Figure 2. Hydrophobicity profiles of DcaC. The number of coding tandem repeats present in DcaC influences the hydrophobicity profile. In N. meningitidis strain Z2491 there is one copy of the 36 amino acid repeat, while in N. meningitidis strain MC58 there are four copies. Profiles were generated using TopPredII, where the cutoff for certain transmembrane segments is 1; therefore, no transmembrane domains are predicted.
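To illustrate how repeat copy number can change a hydrophobicity profile, the sketch below computes a sliding-window average with the Kyte-Doolittle hydropathy scale. This is a swapped-in, generic technique and does not reproduce TopPredII, its scale, or its cutoffs. The function name `hydropathy_profile`, the window size, and the example sequences are all assumptions made for illustration; the sequences are invented and are not DcaC.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean Kyte-Doolittle hydropathy over each window of the sequence."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # raises KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


# Hypothetical flanks and 6-residue repeat unit, for illustration only.
flank_n, unit, flank_c = "MKTLLAAGV", "PAEKAP", "GSWQLIVNR"
for copies in (1, 4):
    protein = flank_n + unit * copies + flank_c
    profile = hydropathy_profile(protein, window=9)
    print(copies, len(protein), len(profile), round(max(profile), 2))
```

Each additional copy lengthens the profile by one repeat unit and repeats the local pattern of the unit, which is the kind of change the figure describes.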
Primer pairs used in this study.
| Tandem repeat | Forward primer (5'-3') | Reverse primer (5'-3') |
| TR1 | GGTCGCTGGATACGCTGC | CGGTAGCCCAAGCCTGCG |
| TR2 | TGCCGGCAGCAAAGTGTCC | GCCCGTTCCAAACGACCG |
| TR3 | AGCGGCAGCGGACTGCC | GTGTGCCTGCCGTGCCG |
| TR4 | CAAACCGGCAGTTTGGGCG | ATGGGAAATGCGGCTGCCG |
| TR4v2 | TGCTCAGCAGCCGCGAGC | GGATGAAGCGGTTGTTGCCG |
| TR5 | CAGCCGTGCGCGTCTGG | TTGGCTGATGTCGGGCTGC |
| TR5v2 | CGGGGCGCGATGTTGACG | |
| TR6 | AACGCCATGCCGTCGAGC | GCCCGCGATGATGTGCCG |
| TR7 | CGAAGCGACCAAAGGCATCC | GGGTCATATTCGCCGTGGTC |
| TR8 | CGGTGAAACCGTTGTTGCCG | TCACGGCCGGAACCTTGC |
| TR9 | CCGCCTACATCCTGTTCGG | AAAGCCGGCGATGATGCGG |
| TR10 | GCGCCGAAAGTTTGGGACG | GCTTCGCCTTGTTCCACGC |
| TR11 | CGCTGCCACCAATGATGTCG | TCGTAGCGGCGACTTCCG |
| TR12 | GTAGAAGAAGTCGGCGAGGC | ATCGCCGTATTGCACGCCG |
| TR13 | CAGCCGATTGATGGAACACG | GCCTGAAAATCTTCAGGCGG |
| TR14 | TTGCCGCTCTGTTGGGCG | TGCCCGTATGCGGTCAGG |
| TR14v2 | CTGTTGGGCGGTTGCGCC | CCGCGTTTGACCCTGCTG |
| TR15 | GCGCATTCTAACACAACCGC | AGGAAGGGAATCTGATGCCG |
| TR16 | GCATACCGCCAGAACGGC | ATACGGCCCGGAACGTCG |
| TR17 | AACACGCAAACGCGGACGC | TGCGGCACGGCAGGTGC |
| TR18 | CAGCAGCCAAATGCCCGC | TGCTGCGGCAGGTTTGGC |
| TR19a | GCGCCCGAACCGCAACC | GTTTTTCCGCCGGTTTCGGG |
| TR19b | AACGGGGCGCGGAGAAGG | GCCCGGAGAAACCAAAACGC |
| TR20 | Investigated previously [ | |
| TR21 | GTCGCGTAGAATGCGGCG | GGAGCCTGCTCCACAACG |
| TR22 | GCCGCCGCATTGCTGGC | GCCCTGCTGTTGTGCGGG |
| TR23 | ATCCTGCCGCCGCCTGC | GATGACCGCGGCATCAGC |
| TR24 | GTGCTTTTCGGGCAAGTGCC | CACCAATCCTACACCGTTCCC |
| TR25 | CGCCCGAAGGGTTTACCG | GACACGCCGTCAATGACGC |
| TR26 | GTTTCAGGGCGAGTTTGCCG | CTCGCTGTGCAGCTGCGC |
| TR27 | GGGTGAAGAACGCAAAGCCC | GAGTTCAGTGCTTCGCGGC |
| TR28 | GATTGGATACGCGGCAACCG | ATACGGCGGCAAGCTCCG |
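As a quick sanity check on primer pairs such as those listed above, GC content and a rough melting temperature can be estimated from the sequence alone. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, which is only a coarse approximation for short oligonucleotides. The record does not state how the primers were designed or what annealing temperatures were used, so this is an illustrative assumption and the function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# TR1 primer pair as listed in the table.
tr1_forward = "GGTCGCTGGATACGCTGC"
tr1_reverse = "CGGTAGCCCAAGCCTGCG"
for name, seq in (("TR1 forward", tr1_forward), ("TR1 reverse", tr1_reverse)):
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```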