What functional genomics has taught us about transcriptional regulation in malaria parasites.

Christa G Toenhake, Richárd Bártfai.

Abstract

Malaria parasites are characterized by a complex life cycle that is accompanied by dynamic gene expression patterns. The factors and mechanisms that regulate gene expression in these parasites have been searched for even before the advent of next generation sequencing technologies. Functional genomics approaches have substantially boosted this area of research and have yielded significant insights into the interplay between epigenetic, transcriptional and post-transcriptional mechanisms. Recently, considerable progress has been made in identifying sequence-specific transcription factors and DNA-encoded regulatory elements. Here, we review the insights obtained from these efforts including the characterization of core promoters, the involvement of sequence-specific transcription factors in life cycle progression and the mapping of gene regulatory elements. Furthermore, we discuss recent developments in the field of functional genomics and how they might contribute to further characterization of this complex gene regulatory network.
© The Author(s) 2019. Published by Oxford University Press.

Keywords: Plasmodium; Malaria; gene expression; regulatory sequences; transcription factors

Year:  2019        PMID: 31220867      PMCID: PMC6859821          DOI: 10.1093/bfgp/elz004

Source DB:  PubMed          Journal:  Brief Funct Genomics        ISSN: 2041-2649            Impact factor:   4.241


Introduction

Eukaryotic, unicellular parasites of the Plasmodium genus are the causative agents of malaria. During their life cycle, these parasites alternate between a vertebrate and an insect host with multiple, morphologically and functionally distinct stages of development within each host (Figure 2). In the vertebrate host, development and replication occur predominantly within host cells, either a hepatocyte or an erythrocyte. In the mosquito, parasites traverse host cells and reside in extracellular spaces (e.g. midgut lumen or at the basal side of the midgut wall). The larger part of this life cycle is deterministic, including few true cell fate decision events. The decision between continued asexual replication and the formation of gametocytes is typical of all Plasmodium species, while the decision to enter and exit the dormant hypnozoite stage in hepatocytes is made only in a few Plasmodium species (e.g. Plasmodium vivax).
Figure 2

Overview of the Plasmodium life cycle and the corresponding AP2 factors that are essential for the development of individual stages (boxed), as identified in P. falciparum (pink dot), P. berghei (light blue dot) and/or P. yoelii (dark blue dot). Note that gametocyte stages reflect the morphology of P. falciparum gametocytes only. Data were derived from references [19, 22, 23, 25–27, 29–34].

In eukaryotes, gene expression can be regulated at various steps before or after RNA synthesis. Regulatory mechanisms that act on the transcriptional process itself can be formally divided into (i) epigenetic mechanisms that, in simple terms, influence access or recruitment of the transcriptional machinery to the DNA [1-3] and (ii) the activity of sequence-specific transcriptional activators or repressors that interact with cis-regulatory DNA elements [4, 5]. The critical contribution of epigenetic mechanisms to Plasmodium gene expression regulation has been established (e.g. reviewed in this issue and [6-8]), for example in regulating antigenic variation genes and the expression of alternative solute transporters [8, 9]. Furthermore, expansion of heterochromatic domains during gametocytogenesis and mosquito-stage development restricts the expression of stage-specific sets of genes during developmental progression [10]. Finally, higher acetylation levels of histone tails in regulatory regions have been associated with increased gene transcription [11, 12]. On the other hand, the contribution of DNA regulatory elements to Plasmodium gene expression regulation, and thus the involvement of sequence-specific transcription factors (TFs), has been evident from initial promoter mapping studies (primarily performed in Plasmodium falciparum), which identified DNA regions with transcription enhancing or repressing potential (reviewed in [13, 14]). These regions were, however, limited to only a small set of genes and the identity of the protein or protein complex binding to the regulatory DNA sequence could in most cases not be clarified.
Recent functional genomic approaches have led to the genome-wide characterization of regulatory DNA elements, including transcription start sites (TSSs) [15-17], and have lent support to the essential role of sequence-specific TFs, in particular apicomplexan apetala 2 (ApiAP2) factors [18-34].
In this review, we summarize recent findings on how the delicate interplay between core promoters, regulatory DNA elements and sequence-specific TFs regulates specific sets of genes during the developmental progression of malaria parasites. Notably, the majority of studies on this topic have been performed in a limited number of Plasmodium species, i.e. human-infecting P. falciparum and rodent-infecting Plasmodium berghei and Plasmodium yoelii, and findings cannot necessarily be generalized to other species.

Core promoter recognition in the compact Plasmodium genome

At the basis of the transcriptional process lies recognition of the core promoter by general transcription factors that, together with RNA polymerase II, make up the pre-initiation complex (PIC). In most eukaryotes, core promoters have characteristic DNA sequences, nucleosome positioning and/or chromatin marking [1]. Plasmodium parasites contain a haploid set of 14 chromosomes during most of their life cycle. The 20–30 Mb genomes encode 5500–6500 protein-coding genes, which results in an approximately 50:50 ratio of coding versus non-coding DNA and rather short intergenic sequences (on average 1.4–2 kb [35]). Notably, the genomes of several Plasmodium species are among the most AT-rich nuclear genomes [36]. This holds in particular for P. falciparum, which has coding and subtelomeric sequences somewhat richer in G and C, but on average 87% AT bases in intergenic regions. Why and how these extremely AT-biased genomes evolved specifically in some Plasmodium species remains unclear. Nevertheless, it is likely that the transcriptional apparatus and other DNA-associated proteins have adapted to this nucleotide bias. One example is the apicomplexan-specific H2A.Z/H2B.Z double-variant nucleosomes which, at least in P. falciparum, associate with AT-rich intergenic regions [37, 38] that could not be occupied by nucleosomes containing the canonical histones H2A and H2B [39]. The search is still ongoing for Plasmodium DNA- and chromatin-related elements that guide the PIC to the core promoter and signal the site of transcription initiation. P. falciparum TSSs have been mapped to multiple small windows in most promoter regions [15-17], and divergent transcription initiation is highly prevalent [16, 17, 40]. Interestingly, while most genes have multiple TSS windows, some of these initiation sites are more prevalent than others [17].
Furthermore, while in most cases the windows initiate transcription simultaneously, in a small number of promoter regions (3.4%) these windows exhibit differential regulation during intra-erythrocytic development [16]. Several DNA- and chromatin-based features have been associated with TSSs in P. falciparum and may guide core promoter recognition, TSS selection and/or promoter strength. Sequence-based DNA features include a typical TA dinucleotide at position −1,0 [16, 40], which is also observed at mammalian TSSs and may partially reflect the initiator core promoter element [41]. However, whether this resemblance has any functional relevance in the AT-rich P. falciparum genome is questionable. Additional sequence-based features include GC-rich sequence elements at ~150 bp and ~210 bp downstream of the TSS [16], polymeric AAAAA- or TTTTT-stretches within 50 bp upstream of the TSS [42] and a local increase of CG content around weaker TSSs [16]. Several DNA structural features may also predict TSSs with reasonable accuracy [40], but this study made use of TSS mappings for a limited number of P. falciparum genes, and it is unclear whether the same features hold true for TSSs identified on a genome-wide scale. Lastly, while the classic core promoter element, the TATA-box (TATAA), is recognized by the P. falciparum TATA-binding protein in vitro [43], the relevance of this motif for in vivo TSS selection in an AT-rich genome requires experimental validation. Chromatin-related features associated with P. falciparum TSSs include a well-positioned nucleosome just downstream of the TSS (the so-called ‘+1 nucleosome’) [16, 17] and a nucleosome-depleted region (NDR) directly upstream of the TSS [17]. The NDR is more pronounced for highly expressed genes [16, 17] and shows dynamic nucleosome loss that correlates with transcriptional activity [17]. As nucleosome positioning can be determined by the underlying DNA sequence, these observations may reflect DNA sequence features.
In particular, homopolymeric poly(dA:dT) sequences have been reported to be stiff and resistant to nucleosome formation [44] and are enriched next to TSSs in P. falciparum alongside other well-positioned nucleosomes close to the ATG start codon, stop codon and splice sites [17]. However, these tracts only partially correlate with nucleosome positioning, pointing to the involvement of adenosine triphosphate-dependent chromatin remodeling complexes [17]. Finally, typical eukaryotic histone marks associated with the core promoter of model organisms like H3K4me3, H3K9ac and histone variant H2A.Z are also found broadly covering P. falciparum intergenic regions; however, they only moderately correlate with transcriptional output [11, 16]. In conclusion, while several associations have been unveiled between DNA/chromatin features and TSSs (summarized in Figure 1), we still do not understand how the Plasmodium general TFs recognize the core promoter, nor how the PIC identifies the TSS to be used. Hence, the functional relevance and causative role of the observed features requires further experimental investigation.
Figure 1

P. falciparum transcriptional unit summarizing reported associative DNA- and chromatin-encoded elements. P. falciparum intergenic regions are occupied by apicomplexan-specific H2A.Z/H2B.Z double-variant nucleosomes (H2A.Z in yellow; H2B.Z in pink) [37, 38]. Acetylation of histone 3 lysine 9 (dark blue circles with K9) in these regions correlates moderately with the transcriptional output of the downstream gene. Trimethylation of histone 3 lysine 4 (pink circles with K4) associates with developmental progression in the intra-erythrocytic development cycle [11]. Transcription initiation is mapped to multiple TSS windows within promoter regions; only the most dominant TSS peaks in two windows are depicted here (black arrows). Approximately 75% of the intergenic TSSs are detected within 600 bp of the ATG [16, 17]. A typical TA dinucleotide at position −1,0 can be detected at the TSS [16, 40], as well as polymeric AAAAA- or TTTTT-stretches within 50 bp upstream of the TSS [42]. The orange line depicts the typical GC-rich sequence elements detected at ~150 bp and ~210 bp downstream of the TSS and the local increase of CG content around weaker TSSs (dashed part of the orange line). A well-positioned nucleosome (indicated by more prominent coloring) is located directly downstream of the TSS (‘+1 nucleosome’) [16, 17]. Another well-positioned nucleosome marks the start of the coding region [17]. An NDR is located upstream of the +1 nucleosome [17]. Accessible regions of variable size are detected around the TSS, located up- (and down-)stream of it, and contain TFBSs. Three examples of DNA motifs that could occur at the TFBS are listed, with their corresponding TF and transcriptional response [26–28, 31, 65].
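
The sequence-based features associated with TSSs above (a TA dinucleotide at position −1,0 and AAAAA- or TTTTT-stretches within 50 bp upstream) can be checked for a single candidate TSS with a few lines of Python. The following is only an illustrative sketch and not a method taken from the cited studies: the function name `tss_sequence_features`, the 0-based indexing and the convention that the T sits at −1 and the A at the TSS itself are assumptions made for this example.

```python
import re


def tss_sequence_features(seq, tss, upstream=50):
    """Report simple sequence features around a candidate TSS.

    Hypothetical helper for illustration only.
    seq      -- plus-strand DNA sequence (string)
    tss      -- 0-based index of the TSS (position 0) within seq
    upstream -- size of the upstream window to inspect, in bp
    """
    seq = seq.upper()
    if not 1 <= tss < len(seq):
        raise ValueError("tss needs at least one base upstream and must lie inside seq")

    # Dinucleotide covering positions -1 and 0 relative to the TSS.
    dinucleotide = seq[tss - 1:tss + 1]

    # Up to `upstream` bases immediately 5' of the TSS (the TSS base is excluded).
    window = seq[max(0, tss - upstream):tss]

    # A run of at least five A or five T anywhere in the upstream window.
    homopolymer = re.search(r"A{5,}|T{5,}", window) is not None

    # Fraction of A/T bases in the upstream window.
    at_fraction = sum(base in "AT" for base in window) / len(window)

    return {
        "dinucleotide": dinucleotide,
        "ta_at_tss": dinucleotide == "TA",
        "homopolymer_upstream": homopolymer,
        "at_fraction_upstream": at_fraction,
    }


features = tss_sequence_features("GCAAAAAGCTAGGC", tss=10)
print(features)
```

In a genome with roughly 87% AT in intergenic regions, both features are expected to occur frequently by chance, so such a check is at most a descriptive summary and says nothing about functional relevance.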

Sequence-specific TFs

Sequence-specific TFs mediate the transcriptional regulation of specific sets of genes through their ability to recognize DNA motifs within gene regulatory elements and either directly or indirectly influence the recruitment and activity of the PIC. The most comprehensive list of potential sequence-specific TFs has been drawn up for P. falciparum [18, 45] and consists of 27 ApiAP2s, 12 C2H2-type zinc finger (ZnF-C2H2), eight helix-turn-helix [HTH including high mobility group box 3 protein (HMGB3)], one β-scaffold factor with minor groove contacts [45] as well as the K homology (KH) domain-containing PREBP (Prx regulatory element (PRE) binding protein, PF3D7_1011800/PF10_0115 [46]) and a homeodomain-like TF (PF3D7_1466200/PF14_0631, Björn Kafsack personal communication). The majority of these genes have syntenic orthologues in other Plasmodium species (Table 1). However, sequence conservation is generally low and gene products might have acquired other functions during evolution.
Table 1

Overview of candidate sequence-specific TFs in P. falciparum and their syntenic orthologues in P. knowlesi, P. vivax, P. berghei and P. yoelii, based on [45, 96]. For P. falciparum, P. berghei and P. yoelii, geneIDs are color-coded based on whether the coding sequence (CDS) is mutable (P. falciparum [55]), or the gene product is essential in the IDC (P. berghei and P. yoelii [30, 34, 95]). Green, CDS is mutable or KO could be generated; red, CDS is not mutable and KO could not be generated; blue, KO does not show a phenotype; orange, CDS is mutable but tentative because of small CDS size; # CDS is mutable but KO could not be generated using conventional strategies [23, 32]. * The HTH factor ADA2 (Alteration/Deficiency in Activation 2) is a transcriptional coactivator and part of the GCN5-containing histone acetyltransferase-complex (GCN5, general control of amino acid synthesis 5) [97, 98]. The species in which the TF has been described is indicated in brackets behind the TF name. ‡ Only C2H2 domain-containing ZnFs are included in this table. Studies that investigated the function of the individual TF are cited as well. Syntenic orthologues were retrieved from the PlasmoDB database, release 39 [35].

Currently, the ApiAP2 family is considered as the principal family of TFs in Plasmodium and we therefore discuss it in a separate section. The HTH factor PfMYB1 was among the first TFs studied in Plasmodium [47, 48]. PfMYB1 can bind a putative Myb regulatory element (MRE, wAACnGh) upstream of P. falciparum genes and has been associated with the promoter of genes that were downregulated upon Pfmyb1 knockdown [47, 48]. PfMYB2, on the other hand, shows high homology to the pre-messenger RNA (mRNA)-splicing factor CEF1 in Saccharomyces cerevisiae [49]. Of the ~12 zinc finger C2H2 domain-containing proteins, only one has been studied. In P. falciparum, it is called PfTRZ (telomere repeat-binding ZnF protein) because of its binding to telomeric TT(T/C)AGGG repeats in vitro and in vivo. Interestingly, this same factor also binds to and regulates expression of 5S rDNA loci pointing to an evolutionary relationship with the general transcription factor TFIIIA, a combination of functions that is unique among eukaryotes [50]. Of the four Plasmodium HMGB proteins, only one has a configuration linked to sequence-specific binding, while the other three are implicated in chromatin binding [45]. Lastly, PREBP is a KH domain-containing protein. Although this domain is normally linked to RNA-binding and -processing, the P. falciparum protein bound dsDNA in a sequence-specific manner and was implicated in the expression of peroxiredoxin genes [46].
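
Degenerate motifs such as the MRE (wAACnGh) and the telomeric TT(T/C)AGGG repeat are written with IUPAC ambiguity codes (w = A/T, n = any base, h = A/C/T, y = C/T). As a minimal sketch of how such a motif can be located on both strands of a sequence, the example below translates it into a regular expression; the helper names (`motif_to_regex`, `reverse_complement`, `find_motif`) and the toy sequences are hypothetical and are not taken from the cited studies.

```python
import re

# IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

# Complement of every IUPAC code (e.g. H = A/C/T pairs with D = A/G/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def motif_to_regex(motif):
    """Translate an IUPAC motif such as 'wAACnGh' into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())


def reverse_complement(motif):
    """Reverse complement of an IUPAC motif."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand, matched_sequence) hits on both strands.

    Coordinates are 0-based on the plus strand. A lookahead is used so that
    overlapping hits are reported; palindromic motifs are reported once per strand.
    """
    seq = seq.upper()
    hits = []
    for strand, oriented in (("+", motif), ("-", reverse_complement(motif))):
        pattern = re.compile("(?=(" + motif_to_regex(oriented) + "))")
        hits.extend((m.start(), strand, m.group(1)) for m in pattern.finditer(seq))
    return sorted(hits)


print(find_motif("GGTAACTGATCC", "wAACnGh"))       # MRE-like site on the plus strand
print(find_motif("CCCTAAACCCTGAA", "TTYAGGG"))     # telomere repeats seen on the minus strand
```

Short degenerate motifs match very often in an AT-rich genome, so a list of hits like this is only a set of candidate sites; it does not show that a factor binds there in vivo.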

ApiAP2 factors and Plasmodium life cycle progression

The Plasmodium AP2 family, the principal TF family in apicomplexan parasites, is homologous to the plant apetala 2/ethylene response factor TF family [18]. Due to its absence in humans, it has been proposed as a potential anti-malarial drug target [20, 21, 27] and several members may mediate drug resistance in Plasmodium [51]. The 60-aa globular AP2 domain has a conserved core consisting of three β-sheets, which make base-specific contacts with the DNA and a stabilizing C-terminal α-helix [21]. A cysteine residue between the first two β-sheets can facilitate dimerization [21], thereby providing a means for the proposed combinatorial mode of regulation [24, 52]. Plasmodium AP2 proteins have one to three AP2 domains per protein, and some encode the accessory C-terminal domain, the function of which is not yet understood (ACDC, AP2-coincident C-terminal domain; PFAM id: PF14733 [53]). In vitro, the different AP2 domains show diverse DNA-binding preferences [24], although it is important to note that not all AP2 domains have to be involved in DNA-binding in vivo. PfAP2-I, for example, requires only its third AP2 domain to interact with the DNA, at least in blood stage parasites [31]. The opposite is true for PfSIP2 (SPE2-interacting protein, PF3D7_0604100/PFF0200c) where both AP2 domains are required for binding bona fide SPE2 elements (subtelomeric var promoter element 2), at least in vitro [23]. Additionally, post-translational modifications can alter the DNA-binding ability of AP2 domains in vivo. For example, lysine acetylation, which is prevalent among P. falciparum AP2 proteins, diminishes the DNA-binding ability of PfAP2-I [54]. To decipher their contribution to Plasmodium life cycle progression, these DNA-binding proteins have been studied on a per gene basis [19, 22, 23, 25–29, 31–33], as well as, more recently, in systematic knock-out (KO) screens [30, 34]. 
Several AP2s have now been assigned names based on the stage of the developmental defect in the respective KO line (Figure 2, Table 1). Almost half of the 27–28 AP2 factors are essential to the intra-erythrocytic developmental cycle (IDC). The number of so-called ‘essential’ AP2s varies slightly between species (11 in P. falciparum, 12 in P. yoelii and 14 in P. berghei; Table 1) [30, 34, 55] and does not show a perfect overlap, especially between P. falciparum and the rodent species. Whether this is due to technical factors (e.g. culturing conditions, KO strategy used) or whether it points to a considerable evolutionary rewiring of the transcription regulation program in the IDC is unclear. As their essential nature in the IDC requires the use of techniques other than phenotyping the respective KO line, four of them have been studied in detail in P. falciparum: Pfsip2, Pfap2-i, Pfap2tel and Pfap2-exp. PfSIP2 recognizes SPE2 repeats within upsB-var promoters and in telomere-associated sequences and has been implicated in upsB-type var gene silencing and heterochromatin biology in P. falciparum [23]. PfAP2-I has a critical role in the expression of invasion-related genes in merozoites [31]. Besides invasion genes, PfAP2-I also targets promoters of nucleosome- and chromatin-related genes (including seven Pfap2 genes), cell-cycle-related genes, and genes associated with vesicle transport and host-cell remodeling [31, 56]. In addition, two P. falciparum studies have suggested roles for AP2 factors in the IDC that were not obvious from rodent studies. The orthologue of AP2-SP in P. falciparum is essential for intra-erythrocytic development and is called PfAP2-EXP because of its involvement in the proper expression of multi-gene family genes [32]. Similarly, the orthologue of AP2-SP3 in P. falciparum has been named PfAP2Tel as it recognizes telomere GGGTT(T/C)A repeats in vitro and in vivo, pointing to a role in telomere maintenance [33].
If this role were conserved in rodent-infecting species, an accompanying growth defect would be expected in the corresponding KO line, but this was never observed [30, 34]. Besides these ‘essential’ AP2s, the deletion of several Pbap2 genes (Pbap2-g2, Pbap2-o, Pbap2-sp, Pbap2-l), while lethal for the development of mosquito- or liver-stage parasites, causes transcriptional deregulation and a growth delay in asexual stages [30]. Thus, although it is clear that a considerable number of AP2s are involved in Plasmodium blood-stage development, due to the difficulty of manipulating genes essential for the IDC, as well as the use of different techniques in different species, it is difficult to determine to what extent functions are conserved. Furthermore, alternative strategies are needed to determine the exact targets of the essential AP2 factors and their contribution to Plasmodium blood-stage development. The differentiation to gametocytes represents the sole developmental switch in the predominantly deterministic life cycle of Plasmodium parasites. The epigenetic, transcriptional and post-transcriptional factors involved in this pathway have recently been reviewed elsewhere [57]. Despite the existence of two possible routes for gametocyte commitment [58, 59] and despite different trajectories of gametocytogenesis among Plasmodium species [57], extensive forward and reverse genetic approaches have identified a conserved AP2 as the principal regulator of this switch: AP2-G (PBANKA_1437500 [27] and PF3D7_1222600 [26]). Interestingly, in all Plasmodium species studied so far, ap2-g is the only single-locus heterochromatic gene that is under heterochromatin protein 1 (HP1)-mediated epigenetic silencing [10, 26, 60, 61], and this epigenetic control is key to the regulated expression of ap2-g and gametocyte conversion [62]. 
In the current model, PfGDV1 interacts with PfHP1 and this binding is associated with PfHP1 eviction from the Pfap2-g locus by a hitherto unknown mechanism [62]. However, as homologues of Pfgdv1 are so far only detected in primate malarias and the avian parasite Plasmodium gallinaceum [63], different control mechanisms are likely to operate in the other species. Besides AP2-G, there is considerable evidence for the involvement of two other AP2s in gametocyte development, at least in rodent-infecting malaria species. P(b/y)AP2-G2 is regarded as a general transcriptional repressor, and during gametocytogenesis should release the repression of gametocyte-specific genes, start the repression of asexual genes and maintain the repression of genes specific to other stages [29, 30, 34]. PyAP2-G3 has been implicated in gametocyte commitment in P. yoelii [34] and P. falciparum [64]. Studies of P. yoelii suggest that it acts upstream of PyAP2-G, since deletion of Pyap2-g3 reduced Pyap2-g expression but not vice versa. Additionally, PyAP2-G3 is highly abundant in the cytoplasm, indicating that it might relay commitment signals to the nucleus [34]. Besides these factors identified by KO studies, several other AP2s have been implicated in gametocytogenesis. Using single-cell transcriptomics, it was possible to detect transcripts that were significantly upregulated shortly after Pfap2-g expression in pfap2-g+ NF54 parasites, including those for the AP2 factors PF3D7_1222400 and PF3D7_1139300 [65]. In addition, Brancucci et al. [66] showed that the serum component lysophosphatidylcholine (lysoPC) inhibited P. falciparum gametocyte differentiation in vitro and detected significant upregulation of seven Pfap2 transcripts besides Pfap2-g in the absence of lysoPC (PF3D7_0516800 (orthologue of Pbap2-o), PF3D7_0613800, PF3D7_0802100, Pfap2-i/PF3D7_1007700, PF3D7_1222400, PF3D7_1239200 and PF3D7_1456000 (not essential in P. yoelii or P. berghei)).
PF3D7_1222400 is an interesting candidate as it was found in both studies and is unique to primate-infecting Plasmodium species. As the majority of the other genes were essential for the IDC of rodent-infecting Plasmodium species, alternative methods are required to decipher their contribution to gametocytogenesis. It should also be noted that, although the majority of apiap2 genes show differential mRNA expression in P. falciparum male and female gametocytes [67], thus far, no single AP2 factor could be associated with sex-specific gene expression in the KO screens [30, 34]. KO studies have demonstrated the essential role of five AP2 factors for successful mosquito infection, at least in rodent-infecting Plasmodium species (AP2-O through AP2-O5) [19, 28, 30, 34]. Of these, the function of PbAP2-O has been most extensively studied. This AP2 has a central role in ookinete gene expression, and without Pbap2-o, development of P. berghei parasites halts at the zygote-to-ookinete transition. In wild-type parasites, the Pbap2-o transcript is translationally repressed in DOZI (development of zygote inhibited)-containing ribonucleoprotein complexes in female gametocytes [68, 69] and, after being translated, it induces the expression of many ookinete-specific genes [19, 28]. Deletion of each of the other four ap2 genes in P. berghei or P. yoelii halts parasite development somewhere along the differentiation path towards mature ookinetes or settling oocysts at the basal side of the mosquito midgut epithelium [30, 34]. P(b/y)AP2-O3 and -O4 are required for proper ookinete maturation and early oocyst formation, respectively. The Pbap2-o3 KO showed an upregulation of male-specific transcripts in gametocytes in the absence of the expected shift in male:female gametocyte ratios in blood films and might therefore be required for the repression of male-specific genes.
In addition, an upregulation of translationally repressed transcripts was detected in the ookinete cultures of this KO, suggesting a role for PbAP2-O3 in translational repression [30] (although these hypotheses remain to be tested). While KOs for ap2-o, -o3 and -o4 showed the same morphological phenotype in P. berghei and P. yoelii, deletion of ap2-o2 did not [19, 30, 34]. In P. berghei, ap2-o2 deletion greatly reduced in vitro zygote-to-ookinete conversion efficiency and this line formed only a few oocysts in vivo that were unable to sporulate. In P. yoelii, on the other hand, ap2-o2 deletion did not affect ookinete maturation but did affect oocyst formation and greatly reduced sporozoite numbers in midgut oocysts and salivary glands. It is interesting that even in these closely related species, homologous AP2 factors appear to have different functions. Lastly, PyAP2-O5 was found to be essential for ookinete motility. Obviously, additional experiments are needed to elucidate the role, target genes and reciprocal interactions of these five factors during development of functional ookinetes. Finally, at least three further AP2 factors are required for the formation of mature sporozoites in P. berghei and P. yoelii (PbAP2-SP, -SP2, -SP3), each functioning at a different stage in development [22, 30, 34]. While AP2-SP was the first AP2 to be studied in detail [22], its genome-wide target genes and mode of action have not yet been elucidated. In addition, as mentioned above, besides its transcription-activating function in sporozoites, recent findings indicate that this factor might have transcriptional-repressive properties in the IDC of P. berghei [30] and might play a role in the expression of multi-gene family genes in P. falciparum [32]. Finally, at least one AP2 (PbAP2-L) is essential for the development of mature liver-stage schizonts in P. berghei [25]. Recent transcriptome analyses of liver-stage schizonts and hypnozoites of P. vivax [70] and Plasmodium cynomolgi [71, 72] attempted to identify additional AP2s involved in liver-stage development of these species, including quiescent hypnozoite forms. However, the resulting gene lists show very little overlap among the three studies, indicating that further experimental validation is needed to characterize the role of AP2s in liver-stage development in general and in hypnozoite formation in particular.
Taken together, the above studies support the essential role of the ApiAP2 family in Plasmodium life cycle progression. The various developmental stages seem to rely critically on one or several AP2 family members, but further studies are needed to investigate the functional contribution of the AP2 in question beyond the stage at which it is essential.

Genome-wide mapping of gene regulatory elements in the Plasmodium genome

In order to modulate the transcriptional process, TFs need to bind cis-regulatory DNA elements. TF binding sites (TFBSs) are the core of larger gene regulatory elements like enhancers or promoters and play a central role in molecular gene regulatory networks [4]. The preferred approach to identify TFBSs relies on chromatin immunoprecipitation sequencing (ChIP-seq) in combination with transcriptome analyses of KO lines. However, ChIP-seq of TFs, in particular, has proven to be difficult in Plasmodium and has only been performed for a few factors [23, 28, 29, 31, 33]. Therefore, alternative approaches to identify potential regulatory DNA elements and associate those with TFBS, TFs and potential target genes, have been employed. These can roughly be divided into two strategies that are complementary to each other. One approach starts with finding the TFBS for each TF using in vitro sequence preferences of individual DNA-binding domains. To this end, protein-binding microarray (PBM) experiments have been performed to predict DNA-binding preferences for the AP2 domains of 20 P. falciparum AP2 proteins [24]. These predictions have been validated and refined by additional biochemical, transcriptomic and/or ChIP experiments for PfSIP2 [23], PfAP2-G [26] and PfAP2-I [31] in P. falciparum as well as for PbAP2-G2 [29] and PbAP2-O [28] in P. berghei, demonstrating the quality of the PBM predictions. However, in vivo binding is unlikely to occur at every genomic occurrence of the in vitro predicted motif, and not all AP2 domains are necessarily required for DNA binding. In addition, it is well appreciated that DNA-binding preferences can be influenced by the flanking DNA [24] and the interaction with epigenetic reader proteins [31, 56]. Lastly, while AP2 proteins are largely conserved across Plasmodium species [18], this does not exclude the possibility of altered DNA-binding preferences or alternative use of orthologous TFs in related species [24, 73]. 
Hence, while in vitro DNA-sequence preferences provide a valuable starting point for target site prediction, additional experiments are needed to validate and refine these in vitro predictions.

The alternative strategy assumes that regulatory regions of co-expressed and/or functionally related genes share conserved DNA sequences that serve as TFBSs for a particular TF operational at that stage. The G-box element, upstream of Plasmodium heat shock genes, was identified using this reasoning [74]. Subsequent genome-wide transcriptome analyses, either stand-alone or in combination with chromatin landscape profiling data sets, allowed motif predictions to be made on a larger scale using various bioinformatics approaches [52, 56, 75–82]. However, in many cases, the predicted motifs were not functionally tested and the TFs recognizing the predicted motifs were not identified.

As mentioned above, an additional degree of refinement that is useful for transferring PBM-derived motifs to regulatory gene elements, as well as for de novo bioinformatics motif prediction strategies, can be gained by taking the chromatin landscape into account. TF binding disrupts the local nucleosome structure and/or prevents local occupancy by nucleosomes, thereby creating a relatively ‘open’ chromatin region. Genome-wide indexing of these open chromatin regions can be achieved using the endonuclease deoxyribonuclease I (DNase I-seq [83, 84]) or Tn5 transposition (assay for transposase-accessible chromatin using sequencing; ATAC-seq [85]) followed by sequencing of the purified fragments. Alternatively, these nucleosome-depleted regions (NDRs) can be separated from nucleosome-bound DNA after crosslinking and quantified by DNA sequencing (formaldehyde-assisted isolation of regulatory elements; FAIRE-seq [86]). Each of these methods is affected by some level of bias, either enzymatic or due to cross-linking, and appropriate controls should be used [87].

Both FAIRE-seq and ATAC-seq have been applied to the P. falciparum genome during blood-stage development. Of these two techniques, ATAC-seq appears to provide a higher resolution [56, 80, 82, 88]. In general, the relative level of enrichment or accessibility correlated positively with the transcript abundance of the closest downstream gene [56, 82, 88], and the differential expression of clonally variant genes across P. falciparum strains could be explained by the presence of distinct accessibility patterns [82]. These observations not only support the contribution of transcriptional control to global IDC gene expression but also suggest that, during the IDC, transcriptional activating events prevail over repressive ones. Furthermore, the majority of regulatory elements appear to lie close to their target gene, as is also observed in ChIP-seq studies [28, 29, 31, 56, 82]. Whether the negatively correlating events, although few in number, reflected the activity of candidate repressors like AP2-G2, AP2-O or AP2-SP [30] was not investigated.

As these accessible regions represent candidate in vivo TF binding events, they can be used, either alone or in combination with other data sets, to make more informed motif predictions, identify collaborative binding events and construct global gene regulatory networks. For example, in-depth motif analyses could substantiate and refine PBM-predicted motifs and suggest candidate novel motifs in P. falciparum [56]. To identify TFs binding to such novel DNA elements, DNA pull-down coupled with quantitative proteomics can be used [89]. Interestingly, mainly AP2 factors were identified using such an approach as the respective binding partners of several de novo motifs [56], confirming their status as the major P. falciparum TF family. Clearly, chromatin accessibility-based approaches combined with computational modeling have great potential in our quest to understand the regulatory network that dictates Plasmodium gene expression.
Until now, the discovery of regulatory elements has mainly been limited to P. falciparum IDC parasites; however, the low-input requirements of ATAC-seq provide the means to study gene regulation in other Plasmodium species and stages as well.
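The relationship described in this section, where accessibility of a region is compared with the transcript abundance of the closest downstream gene, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the cited studies: the data layout, the function names, the rule used to define "closest downstream gene" and all numbers are hypothetical. Spearman rank correlation is chosen here only because it makes no assumption about linearity.

```python
from statistics import mean


def closest_downstream_gene(region, genes):
    """Return the gene with the nearest TSS for which the region lies upstream.

    region: (chrom, start, end); genes: dicts with 'chrom', 'tss', 'strand'.
    For a '-' strand gene, upstream means higher genomic coordinates.
    """
    chrom, start, end = region
    best, best_dist = None, None
    for gene in genes:
        if gene["chrom"] != chrom:
            continue
        if gene["strand"] == "+" and end <= gene["tss"]:
            dist = gene["tss"] - end
        elif gene["strand"] == "-" and start >= gene["tss"]:
            dist = start - gene["tss"]
        else:
            continue  # the region is not upstream of this gene
        if best_dist is None or dist < best_dist:
            best, best_dist = gene, dist
    return best


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data: gene IDs, coordinates and values are invented.
genes = [
    {"id": "geneA", "chrom": "chr1", "tss": 1000, "strand": "+", "abundance": 50.0},
    {"id": "geneB", "chrom": "chr1", "tss": 3000, "strand": "-", "abundance": 5.0},
    {"id": "geneC", "chrom": "chr1", "tss": 6000, "strand": "+", "abundance": 200.0},
]
regions = [  # (chrom, start, end, accessibility signal)
    ("chr1", 700, 900, 8.0),
    ("chr1", 3200, 3400, 2.0),
    ("chr1", 5500, 5800, 15.0),
]

accessibility, abundance = [], []
for chrom, start, end, signal in regions:
    gene = closest_downstream_gene((chrom, start, end), genes)
    if gene is not None:
        accessibility.append(signal)
        abundance.append(gene["abundance"])
rho = spearman(accessibility, abundance)
```

In a real analysis the assignment rule matters: regions between divergently transcribed genes are upstream of both, and the nearest-TSS rule used here resolves that case arbitrarily by distance.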

Conclusion and outlook

The combined use of systematic KO screens, transcriptomics, chromatin accessibility mappings and computational approaches has established the involvement of transcriptional regulation in Plasmodium life cycle progression. Thus far, the apicomplexan AP2 family emerges as the main TF family in Plasmodium parasites, although other putative and yet unidentified DNA-binding proteins might also contribute. The exact modes of action of the different AP2 proteins should be subject to further investigation, including their proposed combinatorial mode of action [30, 52] and pleiotropic functionality [30]. Interaction proteomic studies and ChIP-re-ChIP-type approaches are needed to reveal the proposed interactions between different DNA-binding as well as epigenetic reader proteins. Similarly, stage-specific interactions between TFs and/or stage-specific post-translational modifications (PTMs) could help to explain how certain TFs can perform activating as well as suppressive functions at different stages. Such interaction and PTM-profiling studies might actually be the first step in drawing up the signaling cascades that control TF function. Additionally, they may shed light on the question of whether (and how) environmental changes provoke transcriptional responses that may contribute to the adaptation of parasites.

Similarly, there remains a lot to be learned about both core promoters and TFBSs. While chromatin accessibility-based approaches and ChIP-seq substantially improved the identification of putative TFBSs, we are not yet able to pinpoint target sequences confidently for the majority of TFs across parasite stages. This will likely require systematic mapping of the in vivo binding patterns of large numbers of individual TFs at different stages, followed by CRISPR-Cas9 mutagenesis of the endogenous regulatory elements, whether TSSs or TFBSs.
These, in combination with other genome-wide data sets, could also reveal the features involved in targeting transcription factors beyond the DNA sequence, be that DNA structure, DNA modifications or non-coding RNAs. Eventually, the gene regulatory network of Plasmodium species will be built by combining evidence from different methodologies across life cycle stages and Plasmodium species. Low-input methodologies [90] and the advent of single-cell (multi)-omics approaches [91–93], together with interaction proteomic analyses, provide exciting opportunities, not only to study the characteristics of the network in stages and species that cannot be cultured in vitro, but also to incorporate epigenetic and post-transcriptional regulatory elements in order to model its heterogeneity and stochastic nature [94].

Key Points

Recent functional genomics studies establish a central role for transcriptional mechanisms in Plasmodium gene expression regulation.
Transcription initiation in P. falciparum is rather promiscuous and core promoters are not clearly delineated.
Apetala 2 (AP2) domain-containing proteins are the main transcription factor family in Plasmodium parasites.
The majority of identified gene regulatory elements can be associated with an immediate downstream target gene.
Table 1

Continued.
