Jason A Young, Jeffery R Johnson, Chris Benner, S Frank Yan, Kaisheng Chen, Karine G Le Roch, Yingyao Zhou, Elizabeth A Winzeler.
Abstract
BACKGROUND: With the sequence of the Plasmodium falciparum genome and several global mRNA and protein life cycle expression profiling projects now completed, elucidating the underlying networks of transcriptional control important for the progression of the parasite life cycle is highly pertinent to the development of new anti-malarials. To date, relatively little is known regarding the specific mechanisms the parasite employs to regulate gene expression at the mRNA level, and studies of the P. falciparum genome sequence have revealed few cis-regulatory elements and associated transcription factors. Although the parasite may rely on mechanisms of transcriptional control drastically different from those used by other eukaryotic organisms, the extreme AT-rich nature of P. falciparum intergenic regions (approximately 90% AT) also presents significant challenges to in silico cis-regulatory element discovery.
Year: 2008 PMID: 18257930 PMCID: PMC2268928 DOI: 10.1186/1471-2164-9-70
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. An overview of some of the most biologically interesting putative regulatory elements identified using GEMS. Distribution represents the frequency of regulatory elements relative to gene start codons. Red bars represent the location of motifs upstream of genes contained within the cluster analyzed (positive set), whereas green bars represent the location of motifs upstream of genes in the remainder of the genome (negative set). "log10P" is the log10 of the probability of the observed motif enrichment occurring by chance in the positive set versus the negative set. "Best log10PRodent" is the log10 probability of the observed motif enrichment occurring by chance in positive and negative orthologous sets from rodent species (the lowest p-value from P. yoelii, P. berghei, or P. chabaudi is given).
Figure 2. GEMS identification of PfM2.1 from the Sexual Development cluster (GO:GNF0004). a) List of words derived from promoter regions of genes contained within the Sexual Development cluster. The words are ranked by log10P hypergeometric-derived scores that represent the degree of word enrichment in the promoters of genes contained within the Sexual Development cluster (positive set) versus the remainder of the genome (negative set). In this case, the seed word "GTACATAC" led to PfM2.1 (highlighted in red). b) A re-ordered list of the seed word "GTACATAC" (highlighted in red) and all other words that differ by one mismatch, ranked again by log10P score. A PWM is generated using this list, with the contribution of each word to the PWM being weighted by its |log10P| score. c) A re-ordered list of all words ranked by similarity scores to the generated PWM. The similarity score for any word is obtained by calculating the geometric mean of the corresponding PWM elements associated with that word. The similarity threshold that results in the inclusion of words that lead to the lowest p-value is identified as optimal (highlighted in blue, blue asterisk). d) Visual depiction of the optimization of parameters through minimization of the p-values for different mismatches and similarity thresholds. The local minima corresponding to mismatch 0, 1, 2 and 3 are highlighted by circles (red, blue, magenta, and green respectively). In this case, the optimal log10P score (-21.2) is found with one mismatch and a similarity score threshold of > 0.57.
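The scoring steps named in this caption (hypergeometric enrichment, a PWM weighted by |log10P|, and a geometric-mean similarity score) can be illustrated with a minimal Python sketch. It is not the GEMS implementation: the function names, the `pseudocount` parameter, and the choice of an upper-tail hypergeometric probability are assumptions made for illustration only.

```python
import math

BASES = "ACGT"

def log10_hypergeom_tail(k, K, n, N):
    """log10 of P(X >= k): chance that at least k of the n positive-set
    promoters contain a word found in K of all N promoters.
    (Assumption: an upper-tail test; the source only says
    'hypergeometric-derived'.)"""
    denom = math.comb(N, n)
    total = sum(
        math.comb(K, i) * math.comb(N - K, n - i) / denom
        for i in range(k, min(K, n) + 1)
    )
    return math.log10(total) if total > 0 else float("-inf")

def build_pwm(words, log10p_scores, pseudocount=0.01):
    """Position weight matrix in which each word contributes |log10P|.
    The pseudocount (hypothetical) keeps every entry above zero."""
    pwm = []
    for pos in range(len(words[0])):
        column = {base: pseudocount for base in BASES}
        for word, score in zip(words, log10p_scores):
            column[word[pos]] += abs(score)
        column_sum = sum(column.values())
        pwm.append({base: w / column_sum for base, w in column.items()})
    return pwm

def similarity(word, pwm):
    """Geometric mean of the PWM entries selected by each base of the word."""
    log_sum = sum(math.log(pwm[i][base]) for i, base in enumerate(word))
    return math.exp(log_sum / len(word))
```

With a seed word and its one-mismatch neighbours as `words`, `similarity` scores every candidate word against the resulting PWM; scanning a similarity threshold and re-computing the enrichment score for the words it admits would mirror the optimization described in panels c and d.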
Figure 3. GEMS and MDScan random permutation analyses. a) log10P enrichment score distribution of motif candidates derived from GEMS analysis of the upstream regions of genes contained within the Sexual Development cluster (GO:GNF0004) (red). For comparison, the log10P enrichment score distribution of motif candidates derived from GEMS analysis of 100 randomly selected sets of promoter sequences is also plotted (blue), representing the p-value range that is obtainable by chance, i.e. potential false positives. log10P values for the top 20 motifs from GEMS analysis of the Sexual Development cluster all fall below those obtainable by random simulations. b) MAP score distribution of motif candidates derived from MDScan analysis of the Sexual Development cluster (GO:GNF0004) (red). Again, for comparison, the MAP score distribution of motif candidates derived from MDScan analysis of 100 randomly selected sets of promoter sequences is also plotted (blue), representing the MAP score range that is obtainable by chance. MAP scores for motifs obtained using MDScan analysis of the Sexual Development cluster are not separated from those obtained in random simulations, suggesting the potential for many false positives in MDScan motif discovery.
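The random-set comparison described in this caption can be sketched as an empirical null distribution. The sketch below is an illustration under assumptions, not the authors' procedure: `score_fn` is a hypothetical callable that returns the best (most negative) log10P found for a given set of promoter identifiers, and the default of 100 random sets follows the caption.

```python
import random

def null_best_scores(all_ids, set_size, score_fn, n_random=100, seed=0):
    """Score n_random randomly drawn promoter sets of the same size as the
    real cluster. score_fn is a hypothetical motif-scoring callable."""
    rng = random.Random(seed)
    return [score_fn(rng.sample(all_ids, set_size)) for _ in range(n_random)]

def fraction_at_or_below(observed, null_scores):
    """Fraction of random sets whose score is at least as extreme
    (as negative) as the observed score."""
    return sum(s <= observed for s in null_scores) / len(null_scores)
```

A fraction of zero for an observed score would correspond to the situation reported for GEMS in panel a, where the top motifs fall below every score obtained from random sets.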
Figure 4. Promoter-derived putative regulatory elements as discovered by OPI cluster.
Figure 5. Alignment of sequences from known and putative rhoptry gene promoter regions containing PfM18.1 and PfM18.1-like motifs. Most genes have two copies separated by six bases (highlighted in grey and bold).
Figure 6. Results of transient transfection using the 1527 base pair promoter region of . The native promoter (black) results in a life cycle stage-specific expression pattern, while specific deletion from the construct of the two copies of PfM18.1 and the interspersed sequence at -588 base pairs (white) eliminates this effect.
Figure 7. Characterization of PfM18.1 binding proteins by EMSA. Incubation of a 32P-radiolabeled probe containing PfM18.1 generated a multi-complex shift (O197:198, Lane 2). ×4 and ×20 molar excess of cold competitor O197:198 diminished the shift in a concentration-dependent manner (Lanes 5 & 6), while random 80% AT (Lanes 7 & 8) and random 20% AT (Lanes 9 & 10) cold competitor probes did not compete, indicating sequence specificity for the binding event. A ×5 and ×10 increase in MgCl2 concentration resulted in intensification of the second-uppermost band (Lanes 3 & 4).
Figure 8. Gene-by-gene comparison of expression levels for mixed asexual versus sporozoite stages and for mixed asexual parasites versus heat shock treated mixed asexual parasites. While expression levels for many genes vary widely between mixed asexual and sporozoite stages (dark gray points, Pearson's r = 0.284), little difference is observed in expression levels in mixed asexual parasites before and after heat shock treatment (white points, Pearson's r = 0.989), demonstrating the lack of a robust transcriptional response to environmental perturbations.
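For readers who want to reproduce this kind of gene-by-gene comparison, the Pearson correlation between two expression vectors can be computed as in the sketch below. The function name and the input layout (two equal-length lists, one value per gene) are assumptions; the source does not state how the values were scaled before correlation.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences,
    e.g. per-gene expression levels in two conditions."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1, as reported for the heat shock comparison, indicates that per-gene expression levels are almost unchanged between the two conditions.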
Alignment of promoter regions from sporozoite-expressed genes containing PfM24.1.
| Gene ID | Annotation | Start | Sequence | Stop |
|---|---|---|---|---|
| MAL13P1.125 | - | -312 | tatatttttttttataagga | -358 |
| MAL13P1.212 | - | -767 | ttttaaaatttttcttaaag | -813 |
| MAL8P1.6 | - | -55 | ttttttttttcttagttata | -9 |
| PF08_0088 | S23 | -486 | caaattttttttttctcctg | -532 |
| PF08_0088 | S23 | -529 | aagttaatatataaatgctg | -483 |
| PF10_0218 | citrate synthase | -936 | tagctcaaccaaaacataag | -890 |
| PF10_0231 | - | -427 | tttaaatcttcataccaaca | -381 |
| PF11_0328 | - | -756 | tatattataaatactctctg | -710 |
| PF11_0480 | S22 | -131 | ttgtcctatccaaaaattga | -85 |
| PF11_0486 | MAEBL | -828 | attttcttcatataagaaca | -874 |
| PF13_0201 | SSP | -851 | gaatcagatttattcaaacg | -897 |
| PF14_0074 | - | -837 | atataaagctacaatacacc | -791 |
| PF14_0074 | - | -449 | tctgttttttttgtcattta | -495 |
| PF14_0427 | - | -414 | aaaaaaaaaaaatatacata | -460 |
| PF14_0427 | - | -325 | aaaattttgtatgtattata | -279 |
| PF14_0729 | - | -521 | tttgtccatttataaattag | -475 |
| PFA0200w | TSRP | -265 | ataattacatatttggtcta | -219 |
| PFA0205w | S24 | -900 | aatgaacatatattagctta | -854 |
| PFA0380w | Kinase | -986 | ttttttttttttttttgaaa | -940 |
| PFB0325c | SERA | -691 | atttatgagctgaattgtta | -737 |
| PFC0210c | CSP | -708 | tgggattattgtaaatataa | -662 |
| PFC0210c | CSP | -606 | cagaaattattcttatctta | -560 |
| PFD0215c | Pbs36 | -399 | gtgagttctacatgccactg | -353 |
| PFD0235c | S10 | -471 | tatctaggcacgtttcatca | -517 |
| PFD0425w | S17 | -897 | gattgttatatttatcgtta | -851 |
| PFD0425w | S17 | -850 | aaaaaaaaaaaattaggaaa | -896 |
| PFD0430c | Ppl1 | -578 | ttttttttaaataaatcatg | -532 |
| PFE0360c | S14 | -915 | aataacatcttgtattgtca | -869 |
| PFE0565w | - | -765 | tatatcaaaaacgagacatg | -719 |
| PFE0950c | - | -396 | tttccattttttttcctgaa | -350 |
| PFL0065w | - | -283 | aaaaaagaaaccataatatg | -237 |
| PFL0370w | - | -505 | aactctctttttttatataa | -459 |
| PFL0370w | - | -458 | ttttatatacagctctacaa | -504 |
| PFL0630w | - | -376 | atatattattaaaatgtgat | -330 |
| PFL0800c | S4 succinate dehydrogenase | -804 | tttttttcgctttatttatg | -758 |
| PFL1770c | - | -748 | ctgatatataataatggggt | -794 |
| chr1.rRNA-1-28s | - | -868 | aatagtatcggtgtaattta | -914 |
Motifs are highlighted in bold. Start and stop locations are relative to start codons.