Christopher R E McEvoy, Paul D van Helden, Robin M Warren, Nicolaas C Gey van Pittius.
Abstract
BACKGROUND: PPE38 (Rv2352c) is a member of the large PPE gene family of Mycobacterium tuberculosis and related mycobacteria. The function of PPE proteins is unknown, but evidence suggests that many are cell-surface associated and recognised by the host immune system. Previous studies of other PPE gene family members suggest that some display high levels of polymorphism, and it is thought that this might provide a means of antigenic variation. We analysed the genetic variability of the PPE38 genomic region in a cohort of M. tuberculosis clinical isolates representing all of the major phylogenetic lineages, along with the ancestral M. tuberculosis complex (MTBC) member M. canettii, and supplemented this with analysis of publicly available whole genome sequences representing additional M. tuberculosis clinical isolates, other MTBC members and non-tuberculous mycobacteria (NTM). Where possible, we extended this analysis to include the adjacent plcABC and PPE39/40 genomic regions.
Year: 2009 PMID: 19769792 PMCID: PMC2758852 DOI: 10.1186/1471-2148-9-237
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Phylogenetic reconstruction of the evolutionary relationships between members of the H37Rv PPE protein family. The phylogenetic tree was constructed from an analysis of the 180 aa N-terminal domains of the PPE proteins. Results show the division of PPE proteins into 5 sublineages, with PPE38 (Rv2352c, highlighted in green) located at the border of sublineages IV (SVP subfamily) and V (MPTR subfamily). Reproduced from ref [2] with permission from the authors.
Figure 2. Schematic representations of the PPE38 region. The PPE38 regions from the published H37Rv (2a) and H37Ra (2b) sequences are shown. Colour coding is as follows: PPE38 pale blue, PPE71 dark blue, MRA_2374 pale green, MRA_2375 dark green. Locations of the PPE38F/R and PPE38 IntF/R primers are shown.
2a. H37Rv ATCC reference strain (published whole genome sequence). The published H37Rv sequence [1] represents the RvD7 genotype. Recombination between PPE38 and PPE71 results in a single PPE38/71 gene (Rv2352c) and loss of the 2 esx-like genes MRA_2374 and MRA_2375. The PPE38F/R primers (black arrows) are predicted to produce an amplicon of 1335 bp from the RvD7 genotype. It is impossible to determine which PPE38/71 gene has been deleted, hence the mixture of colours used. The published H37Rv sequence is not representative of the H37Rv ATCC reference strain, most clinical isolates, or the H37Ra whole genome sequence [19]. This genotype is also seen in strains SAWC 2240 (CAS, F20), SAWC 1748 (Pre-Haarlem, F24), SAWC 1595 (Quebec/S), SAWC 1841 (Haarlem, F4), CPHL_A (WA-1, M. africanum), T17 (PGG1, EAI), EAS054 (PGG1, EAI), strain C (LCC, "3 bander") and Haarlem (PGG2, F4) [see additional file 1].
2b. H37Rv ATCC reference strain (actual) and H37Ra (published whole genome sequence). This represents the ancestral MTBC genotype that is also seen in M. canettii. It contains the 2 identical genes PPE38 (MRA_2373) and PPE71 (MRA_2376), separated by the 2 esx-like genes MRA_2374 and MRA_2375. Gene annotations are as reported for the published H37Ra sequence [19]. Locations of primers used for PCR and sequence analysis are indicated (black arrows). This is also the true genotype of the ATCC reference strain H37Rv.
Sequences of primers used for PCR amplification and sequencing.
| Primer | Sequence | Comments |
| --- | --- | --- |
| PPE38F | TTTTCGGTGTGGATTGTCT | 3398 bp amplicon for H37Ra-like genotype, 1331 bp amplicon for RvD7 genotype. |
| PPE38R | GCCAGGGATTTCCAACGAC | |
| PPE38IntF | ATGTCGGCGGAGTTGGGTAAG | 1351 bp amplicon for H37Ra-like genotype, no product for RvD7 genotype. |
| PPE38IntR | TAGCCTGACCAGCCGACAACT | |
| 21delF | GGGGATGATGCCGATGC | 111 bp amplicon for wild-type genotype, 90 bp amplicon for 21del genotype. |
| 21delR | ACACTGGGCCGAGCCTG | |
| IS5' | GGTACCTCCTCGATGAACCAC | IS |
| Xho1 | TTCAACCATCGCCGCCTCTAC | IS |
| plcA5' | CAAATGTCCGGGACAAGG | Primes from the 5' region of |
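The amplicon sizes listed for the PPE38F/R and PPE38IntF/R primer pairs imply a simple decision rule for distinguishing the two genotypes. The following is a minimal sketch of that rule, not the authors' analysis procedure: the function names, the "no product" and "unresolved" labels, and the 50 bp sizing tolerance are all hypothetical choices made for illustration.

```python
from typing import Optional

# Expected amplicon sizes (bp), taken from the primer table.
PPE38FR_H37RA_LIKE = 3398
PPE38FR_RVD7 = 1331
INTFR_H37RA_LIKE = 1351  # PPE38IntF/R gives no product for RvD7


def near(observed: Optional[int], expected: int, tol: int) -> bool:
    """True when a band was observed within `tol` bp of the expected size."""
    return observed is not None and abs(observed - expected) <= tol


def call_ppe38_genotype(ppe38fr_bp: Optional[int],
                        intfr_bp: Optional[int],
                        tol: int = 50) -> str:
    """Classify an isolate from its two PCR results.

    Pass None for a reaction that gave no product.
    `tol` is an assumed gel-sizing tolerance, not a value from the source.
    """
    if near(ppe38fr_bp, PPE38FR_H37RA_LIKE, tol) and near(intfr_bp, INTFR_H37RA_LIKE, tol):
        return "H37Ra-like"
    if near(ppe38fr_bp, PPE38FR_RVD7, tol) and intfr_bp is None:
        return "RvD7"
    if ppe38fr_bp is None and intfr_bp is None:
        return "no product"
    return "unresolved"


if __name__ == "__main__":
    print(call_ppe38_genotype(3400, 1350))
    print(call_ppe38_genotype(1335, None))
```

Any band pattern that matches neither expected combination is reported as "unresolved" rather than forced into a genotype, since such isolates would need sequencing to characterise.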
Figure 3. RD5 and RD5-like deletions seen in MTBC isolates. This region is susceptible to frequent large deletions. The genes surrounding PPE38 are shown together with the deleted regions characterised in 5 non-M. tuberculosis MTBC members [22-25,32] and the deletions detected in the M. tuberculosis whole genome sequences T92 and 94_M4241A. A red arrow indicates the presence and direction of IS6110 at a deletion point. Deletions caused by homologous recombination between PPE39 and PPE40 in M. bovis BCG and 94_M4241A are also shown. Numbering refers to gene nucleotide positions.
Figure 4. 21del PCR results for all 21 members of PGG2. Wild-type gene amplicon = 111 bp; 21del amplicon = 90 bp. Sample SAWC 3100 (F14) is negative for all PPE38-related PCRs, suggesting complete deletion of this region [see additional file 2, S8]. In isolates that possess both a normal and a 21del gene copy, an additional amplicon of approximately 100 bp is seen. This presumably represents a heteroduplex formed between the two amplicons.
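The band pattern described for the 21del PCR can be read as a small lookup: a 111 bp band indicates a wild-type copy, a 90 bp band (111 minus the 21 bp deletion) indicates a 21del copy, and the intermediate band of roughly 100 bp is treated as a heteroduplex and ignored. The sketch below illustrates that reading only; the function name, the genotype labels and the 4 bp tolerance are assumptions introduced here, not part of the published method.

```python
WT_BP = 111            # wild-type amplicon size from the caption
DEL_BP = WT_BP - 21    # 21del amplicon: 111 - 21 = 90 bp


def call_21del(bands, tol=4):
    """Interpret observed band sizes (bp) from the 21del PCR.

    `tol` is an assumed sizing tolerance. It is kept small so that the
    ~100 bp presumed heteroduplex band matches neither the wild-type
    nor the 21del amplicon.
    """
    has_wt = any(abs(b - WT_BP) <= tol for b in bands)
    has_del = any(abs(b - DEL_BP) <= tol for b in bands)
    if has_wt and has_del:
        return "WT/21del"
    if has_wt:
        return "WT"
    if has_del:
        return "21del"
    # No bands at all versus bands of unexpected size only.
    return "no product" if not bands else "unresolved"


if __name__ == "__main__":
    print(call_21del([111, 100, 90]))
    print(call_21del([90]))
```

An empty band list corresponds to a sample such as SAWC 3100, which the caption reports as negative for all PPE38-related PCRs.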
Figure 5. 21del genotype results in relation to PGG2 phylogeny. A simplified phylogenetic tree of PGG2 lineages shows that the 21del mutation is seen only within the Haarlem and LCC lineages, indicating that they share a recent common ancestor. Results also suggest frequent gene conversion and recombination events, particularly between the Haarlem groups. Each mutational type is shown in colour and indicates the number of PPE38/71 genes present and the genotype (WT = wild type, i.e. lacking the 21del mutation). Green (a): 21 bp deletion (21del) in PPE71, both genes retained; Blue (b): recombination between PPE38/71 with PPE71 deletion; Yellow (c): gene conversion leading to deletion of PPE71 and duplication of PPE38; Grey (d): recombination between PPE38/71 with PPE38 deletion. The "Haarlem-like" lineages (F6 and F7) could not be included in the analysis because the 3' end of PPE38, including the region homologous to PPE71 21del, has been removed by an IS6110-associated deletion event [see additional file 2, S11 and S12].
21del analysis of lineages representing the LCC and Haarlem groups
| LCC, 2 banders | 4 | N.A. | 4 | 2, WT/21del | 0 |
| LCC, 3 banders | 6 | N.A. | 6 | 2, WT/21del | 1 (1, 21del) |
| LCC, 4 banders | 6 | N.A. | 6 | 2, WT/21del | 0 |
| LCC, 5 banders | 6 | N.A. | 6 | 2, WT/21del | 1 (1, WT) |
| LCC, 6 banders | 4 | N.A. | 4 | 2, WT/21del | 0 |
| F6. Haarlem-like† | 3 | 3 | 6 | 1, 21del | 0 |
| F7. Haarlem-like† | 3 | 5 | 8 | 1, 21del | 0 |
| F1. Haarlem | - | 3 | 3 | 2, WT/21del | 0 |
| F2. Haarlem | 5 | 8 | 13 | 2, WT/21del | 1 (1, 21del) |
| F4. Haarlem | 4 | 6 | 10 | 1, WT | 0 |
| F10. Haarlem | 4 | 2 | 6 | 1, 21del | 0 |
| F24, Pre-Haarlem | 4 | 5 | 9 | 1, WT | 2 (2, WT/21del) |
| F19, Pre-Haarlem | 4 | 5 | 9 | 2, WT | 0 |
| Total | 53 | 36 | 90 | | 5 |
N.A.: Not applicable. Because the IS6110 RFLP patterns are invariant, LCC lineages cannot be subdivided into clusters.
†The "1, 21del" genotype observed in these lineages is due to deletion of the 3' end of PPE38 by an IS6110-mediated mechanism rather than homologous recombination with deletion of PPE38.
Mutational analysis of the plcABC genes from 15 publicly available whole genome M. tuberculosis isolates and 8 non-M. tuberculosis MTBC members.
| Isolate | plcA | plcB | plcC | Comments |
| --- | --- | --- | --- | --- |
| Deleted | Deleted | Deleted | ||
| Deleted | Deleted | Deleted | ||
| CPHL_A | sSNP A → C at position 435. | + | + | All genes predicted to be fully functional. |
| K85 | sSNP A → C at position 435. | + | nsSNP C → T (Thr → Ile) at aa position 302. | plcC function possibly impaired. |
| GM041182 | sSNP A → C at position 435. | + | + | All genes predicted to be fully functional. |
| Deleted. | Deleted | 5' 867 bp deleted. | ||
| Oryx bacillus | Deleted | 5' 260 bp deleted. | +‡ | |
| Dassie bacillus | Deleted | Deleted | Deleted | |
| T17 | sSNP A → C at position 435. | IS | + | plcB function predicted to be abolished. |
| EAS054 | sSNP A → C at position 435. | sSNP G → A at position 1404. | + | All genes predicted to be fully functional. |
| T92 | Deleted | Deleted | 5' 194 bp deleted. | Major deletion results in removal of |
| 94_M4241A | Deleted | 5' 793 bp deleted. | sSNP T → C at position 753. | IS |
| 02_1987 | Deletion of 3' end of | sSNP T → C at position 753. | Hybrid | |
| T85 | sSNP G → A at position 705. | + | sSNP T → C at position 753. | plcA and plcC functions possibly impaired. |
| KZN 4207 | + | + | + | Total homology to H37Rv. |
| KZN 1435 | + | + | + | Total homology to H37Rv. |
| KZN 605 | + | + | + | Total homology to H37Rv. |
| F11 | + | + | + | Total homology to H37Rv. |
| Strain C | T insertion at position 104. Altered reading frame and premature protein termination. | + | + | |
| CDC1551 | + | + | + | Total homology to H37Rv. |
| Haarlem | A insertion at position 968. Altered reading frame and premature protein termination. | + | + | |
| H37Rv | + | + | + | Defined as wild type sequence |
| H37Ra | + | + | + | Total homology to H37Rv. |
+ indicates complete homology to the H37Rv reference sequence.
‡Deletion mapping studies indicate that the plcC gene of the Oryx bacillus is present [23]. The exact sequence of the gene in this species, however, is unknown.
Mutational analysis of the PPE39 and PPE40 genes from 15 publicly available whole genome M. tuberculosis isolates and 8 non-M. tuberculosis MTBC members.
| Isolate | PPE39 | PPE40 | Comments |
| --- | --- | --- | --- |
| Deleted downstream from position 1358. | 3 bp in-frame deletion removes aa 164 (A). | ||
| Fused | |||
| CPHL_A | + | IS | PPE40 function predicted to be abolished. |
| K85 | sSNP G → T at position 1548. | + | |
| GM041182 | sSNP C → T at position 1563. | 33 bp in-frame deletion of nucleotides 190-222. Removes aa sequence AAAAAMVVAAA. | PPE40 function predicted to be altered. |
| Deleted downstream of position 325. | + | ||
| Oryx bacillus | Deleted | Deleted | |
| Dassie bacillus | Deleted | Gene present. | Deletion analysis suggests that |
| T17 | + | + | |
| EAS054 | 3 bp (GCG) in-frame deletion removes alanine at aa position 27. | + | |
| T92 | Deleted | Deleted from position 1592. | See Figure 3. |
| 94_M4241A | Fused | ||
| 02_1987 | Deleted | IS | Most of genes deleted as part of major genomic structural alterations. (Figure S23). |
| T85 | G insertion position 830, A deletion position 942. Stop codon aa position 278. | + | PPE39 function predicted to be abolished or highly modified. |
| KZN 4207 | + | + | |
| KZN 1435 | + | 6 nsSNPs between positions 1094 and 1105. 2 aa changes: 367 T → N and 368 G → N. | |
| KZN 605 | + | + | |
| F11 | IS | + | PPE39 function predicted to be abolished. |
| Strain C | N.D. | sSNP T → C at position 969. | Unable to characterise |
| CDC1551 | 3 bp (GCG) in-frame deletion removes alanine at aa position 27. | + | |
| Haarlem | IS | + | PPE39 function predicted to be abolished. |
| H37Rv | IS | + | PPE39 function predicted to be abolished. |
| H37Ra | As for H37Rv. | + | PPE39 function predicted to be abolished. |
+ indicates homology to consensus sequence.
Figure 6. Phylogeny of mycobacterial species. Phylogenetic tree of 80 members of the genus Mycobacterium based on the 16S rRNA gene sequence, with Gordonia aichiensis as the outgroup. Reproduced from ref [2] with permission from the authors. MTBC members analysed in this study are highlighted in yellow, while other mycobacteria analysed are highlighted in green.
Figure 7. Possible evolutionary scenario for the MTBC PPE38 region. Analysis of fast-growing mycobacterial species and the M. avium complex indicates that the homologues of Rv2345 and glyS have been in close proximity for a long evolutionary period and that the insertion of homologues of PPE38 and Rv2348c between these genes was also a relatively early event (a, b). The most recent common ancestor of the M. marinum/M. ulcerans and MTBC lineages is hypothesised to have contained a single esx/esx/PPE38 gene cluster (black box) located between the plcABC (yellow) and PPE39/40 (grey) gene regions (c). Duplication of esx/esx/PPE38 resulted in a genotype that is retained by M. marinum (d). The genotype of the ancestral MTBC species (e) shows an additional deletion of the esx/esx gene pair between plcA and PPE38 (red box).