Emanuel Schmid-Siegert1, Sophie Richard2, Amanda Luraschi2, Konrad Mühlethaler3, Marco Pagni1, Philippe M Hauser4.
Abstract
Microbial pathogens commonly escape the human immune system by varying surface proteins. We investigated the mechanisms used for that purpose by Pneumocystis jirovecii. This uncultivable fungus is an obligate pulmonary pathogen that in immunocompromised individuals causes pneumonia, a major life-threatening infection. Long-read PacBio sequencing was used to assemble a core of subtelomeres of a single P. jirovecii strain from a bronchoalveolar lavage fluid specimen from a single patient. A total of 113 genes encoding surface proteins were identified, including 28 pseudogenes. These genes formed a subtelomeric gene superfamily, which included five families encoding adhesive glycosylphosphatidylinositol (GPI)-anchored glycoproteins and one family encoding excreted glycoproteins. Numerical analyses suggested that diversification of the glycoproteins relies on mosaic genes created by ectopic recombination and occurs only within each family. DNA motifs suggested that all genes are expressed independently, except those of the family encoding the most abundant surface glycoproteins, which are subject to mutually exclusive expression. PCR analyses showed that exchange of the expressed gene of the latter family occurs frequently, possibly favored by the location of the genes proximal to the telomere because this allows concomitant telomere exchange. Our observations suggest that (i) the P. jirovecii cell surface is made of a complex mixture of different surface proteins, with a majority of a single isoform of the most abundant glycoprotein, (ii) genetic mosaicism within each family ensures variation of the glycoproteins, and (iii) the strategy of the fungus consists of the continuous production of new subpopulations composed of cells that are antigenically different.
IMPORTANCE
Pneumocystis jirovecii is a fungus causing severe pneumonia in immunocompromised individuals. It is the second most frequent life-threatening invasive fungal infection. We have studied the mechanisms of antigenic variation used by this pathogen to escape the human immune system, a strategy commonly used by pathogenic microorganisms. Using a new DNA sequencing technology generating long reads, we could characterize the highly repetitive gene families encoding the proteins that are present on the cellular surface of this pest. These gene families are localized in the regions close to the ends of all chromosomes, the subtelomeres. Such chromosomal localization was found to favor genetic recombinations between members of each gene family and to allow diversification of these proteins continuously over time. This pathogen seems to use a strategy of antigenic variation consisting of the continuous production of new subpopulations composed of cells that are antigenically different. Such a strategy is unique among human pathogens.
Keywords: PCP; PacBio sequencing; Pneumocystis carinii; Pneumocystis jirovecii; adhesin; gene exchange; major surface glycoprotein; mosaicism; subtelomere; telomere exchange
Year: 2017 PMID: 29114024 PMCID: PMC5676039 DOI: 10.1128/mBio.01470-17
Source DB: PubMed Journal: MBio Impact factor: 7.867
TABLE 1 Characteristics of the msg families identified in P. jirovecii
The promoter including the signal peptide for family I is within the UCS present at a single copy per genome.
The msg3 gene was not used to calculate this value because it is ca. 900 bp shorter than the other genes of the family, although it presents all features of the family (see alignment in Fig. S2).
NA, not applicable.
FIG 1 Classification trees of P. jirovecii msg genes and Msg proteins. The different families are represented by color, and their characteristics are summarized in Table 1. A few unclassified outliers are in gray. The scale represents the number of mean substitutions per site. (a) RAxML DNA and PEP are maximum likelihood trees of nucleotide and amino acid sequences of the 61 genes with an exon larger than 1.6 kb. Members of family V were defined as the out-group (1,000 bootstraps). JACOP PEP is a hierarchical classification based on local sequence similarity, a method that does not rely on a particular multiple sequence alignment. (b) Maximum likelihood tree of the 61 genes with an exon larger than 1.6 kb plus 18 genes with an exon smaller than 1.6 kb. The sequences were trimmed from position 1540 of the first alignment up to their end and realigned to construct the tree (1,000 bootstraps). Seven of the 18 genes with an exon smaller than 1.6 kb constitute the msg family VI shown in brown, whereas the remaining 11 shown in black belong to the other msg families.
FIG 2 Diagrams of the structure of P. jirovecii msg genes and Msg proteins belonging to families I to VI. (a) Features of the msg genes of each family derived from the analysis of the full-length genes. The UCS and recombination between CRJE sequences are shown for family I. The approximate positions of PCR primers used for identification of the msg-I expressed genes linked to the UCS are shown by arrows (see note 4 in Text S1). (b) Features of Msg proteins of each family derived from the analyses of the full-length proteins. The 13 domains identified by MEME analysis are shown. The logos of these domains are shown in Fig. S4.
FIG 3 Diagrams of 10 representative P. jirovecii assembled subtelomeres. The other 27 assembled subtelomeres are shown in Fig. S6. The msg genes are described in detail in Table S1. The attribution of the contigs to the chromosomes previously described using flanking non-msg genes is described in Table S3 at http://www.chuv.ch/microbiologie/en/imu_home/imu-recherche/imu-research-groups/imu-research-phauser/imu-supplementary_data.htm. The number of subclones obtained from the generic PCR amplifying msg-I genes linked to the UCS is indicated close to the asterisk of one msg-I gene of contig 55 (see note 4 in Text S1). The functions/products of 4 of the 20 non-msg genes are known (see Table S3): gene 6, thiamine pyrophosphokinase; gene 9, amidophosphoribosyltransferase; gene 15, 60S ribosomal protein L28; and gene 16, potassium-sodium efflux P-type ATPase.
FIG 4 Examples of detection of potential mosaic genes. (a) Mosaic gene msg32. (a1) The set of 11 full-length msg-I genes was analyzed using the Recombination Analysis Tool. This method measures genetic distances in windows sliding along the MSA. The genetic distance scores of the putative parent genes at the middle of each window are plotted against the position in the mosaic gene. The predicted recombination site is at position ca. 600, at the crossover of the curves. The second screening method, Bellerophon, which is based on a similar analysis, identified a recombination event at position 392. (a2) Analysis of the mosaic gene msg32 with its putative parent genes together with the randomly chosen gene msg84 of the same family using the more sensitive method TOPALi, based on the hidden Markov model. This method analyzes only four sequences at a time and calculates the probabilities of the three possible tree topologies at each residue of the MSA. A recombination event is also detected at positions ca. 400 to 600, but many other recombination events are predicted. (b) Mosaic gene msg79. This gene shares an almost identical fragment of 947 bp with its putative parent, msg7 (see alignment in Fig. S8c). (b1) The set of 11 full-length msg-II genes was analyzed using the Recombination Analysis Tool. The predicted recombination sites are at positions ca. 400, 1300, 2100, and 3100. The Bellerophon method did not identify this mosaic gene. (b2) Analysis of the mosaic gene msg79 with its putative parent genes together with the randomly chosen gene msg85 of the same family using TOPALi based on the hidden Markov model. Recombination events are also detected at positions ca. 400, 1500, and 3100, but not at 2100, and other recombination events are predicted.
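The sliding-window idea described in the FIG 4 caption can be illustrated with a minimal sketch. It is not the Recombination Analysis Tool itself: the distance measure (a simple proportion of differing sites), the window and step sizes, the function names, and the toy sequences are all assumptions chosen for illustration. The sketch computes the distance from a putative mosaic gene to each putative parent in windows along an alignment and reports the positions where the two distance curves cross.

```python
def p_distance(a, b):
    """Proportion of differing sites, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return float("nan")
    return sum(x != y for x, y in pairs) / len(pairs)


def sliding_distances(query, parent, window, step):
    """Return (window midpoint, distance) pairs along two aligned sequences."""
    out = []
    for start in range(0, len(query) - window + 1, step):
        mid = start + window // 2
        out.append((mid, p_distance(query[start:start + window],
                                    parent[start:start + window])))
    return out


def crossovers(query, parent_a, parent_b, window=100, step=10):
    """Positions where the closer parent switches, i.e. the curves cross."""
    da = sliding_distances(query, parent_a, window, step)
    db = sliding_distances(query, parent_b, window, step)
    sites, prev_sign, prev_mid = [], 0, None
    for (mid, x), (_, y) in zip(da, db):
        diff = x - y
        sign = (diff > 0) - (diff < 0)  # 0 for ties and for NaN windows
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                sites.append((prev_mid + mid) // 2)
            prev_sign, prev_mid = sign, mid
    return sites


# Toy alignment (hypothetical data): the two parents differ at every site,
# and the mosaic switches from parent A to parent B at position 200.
parent_a = "ACGT" * 100
parent_b = "CGTA" * 100
mosaic = parent_a[:200] + parent_b[200:]
print(crossovers(mosaic, parent_a, parent_b))
```

On real msg alignments the parents are far more similar than in this toy case, so the curves are noisier and the choice of window size strongly affects which crossovers are reported.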
Potential mosaic genes detected within each msg family
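| Family | No. of genes: full length | No. of genes: partial | No. of genes: pseudogenes | No. of genes: total | No. of nonmosaic genes | No. of potential mosaic genes: full length | No. of potential mosaic genes: partial | No. of potential mosaic genes: pseudogenes | No. of potential mosaic genes: total | % mosaic |
|---|---|---|---|---|---|---|---|---|---|---|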
| I | 11 | 16 | 16 | 43 | 25 | 8 | 1 | 9 | 18 | 42 |
| II | 11 | 3 | 4 | 18 | 13 | 4 | 1 | 0 | 5 | 28 |
| III | 7 | 2 | 1 | 10 | 6 | 3 | 0 | 1 | 4 | 40 |
| IV | 6 | 1 | 2 | 9 | 7 | 1 | 0 | 1 | 2 | 22 |
| V | 8 | 6 | 1 | 15 | 14 | 1 | 0 | 0 | 1 | 7 |
| VI | 6 | 1 | 0 | 7 | 7 | 0 | 0 | 0 | 0 | 0 |
Detected using the Recombination Analysis Tool and/or Bellerophon numerical screening methods among three different sets of genes of each msg family: full-length, full-length plus partial genes, or full-length plus pseudogenes.
The difference in the number of potential mosaic genes among the msg families approached but did not reach significance at the 0.05 level (P = 0.06, chi-square test).
Six full-length mosaic genes were detected twice but with different pairs of putative full-length parent genes according to the set of genes analyzed (four of family I, one of family II, and one of family III). One mosaic gene of family I was detected twice: once with one full-length gene and one pseudogene as parents and once with two partial genes as parents. All 10 remaining genes were detected only once with a pair of full-length parents.
Six mosaic pseudogenes of family I had two pseudogenes as parents. Two of family I had one full-length gene and one pseudogene as parents. The three remaining had a pair of full-length parents.
Several potential recombination events were detected for these two families using the more sensitive method TOPALi based on the hidden Markov model (see Fig. S9 at http://www.chuv.ch/microbiologie/en/imu_home/imu-recherche/imu-research-groups/imu-research-phauser/imu-supplementary_data.htm).
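The percentages and the chi-square footnote of the mosaic-gene table can be checked with a short calculation. The sketch below assumes that the test was applied to the 6 x 2 contingency table of mosaic versus nonmosaic gene counts per family; the source does not state the exact layout used. It relies only on the Python standard library, with the upper-tail probability computed from the closed-form series for the chi-square distribution.

```python
import math

# Counts per msg family (I to VI), copied from the table above.
totals = {"I": 43, "II": 18, "III": 10, "IV": 9, "V": 15, "VI": 7}
mosaic = {"I": 18, "II": 5, "III": 4, "IV": 2, "V": 1, "VI": 0}


def chi_square_stat(rows):
    """Pearson chi-square statistic for a table given as rows of observed counts."""
    row_sums = [sum(r) for r in rows]
    col_sums = [sum(c) for c in zip(*rows)]
    n = sum(row_sums)
    stat = 0.0
    for row, rs in zip(rows, row_sums):
        for obs, cs in zip(row, col_sums):
            expected = rs * cs / n
            stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_sf(x, df):
    """Upper-tail probability of the chi-square distribution (closed-form series)."""
    if df % 2 == 0:
        term = total = 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    total, term = 0.0, 1.0
    for j in range((df - 1) // 2):
        if j > 0:
            term *= x / (2 * j + 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)


# Assumed layout: one row per family, columns = (mosaic, nonmosaic).
rows = [[mosaic[f], totals[f] - mosaic[f]] for f in totals]
percent = [round(100 * mosaic[f] / totals[f]) for f in totals]
stat = chi_square_stat(rows)
df = (len(rows) - 1) * (2 - 1)
p_value = chi_square_sf(stat, df)
print(percent, round(stat, 2), df, round(p_value, 2))
```

Under this assumed layout the recomputed percentages agree with the "% mosaic" column and the P value rounds to the reported 0.06. Several expected counts are below 5, so the chi-square approximation is rough here.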
FIG 5 Telomere exchange model for swapping the msg-I expressed gene through a single recombination between CRJE sequences. One exchanged telomere is shown in red. Subpopulations of cells expressing a potentially new mosaic msg-I gene are generated over time and may then multiply. Polycistronic expression of two msg-I genes is shown in the second subpopulation generated (see the text).