Isabelle Florent, Betina M Porcel, Elodie Guillaume, Corinne Da Silva, François Artiguenave, Eric Maréchal, Laurent Bréhélin, Olivier Gascuel, Sébastien Charneau, Patrick Wincker, Philippe Grellier.
Abstract
BACKGROUND: The Plasmodium falciparum genome (3D7 strain), published in 2002, revealed ~5,400 genes, mostly based on in silico predictions. Experimental data are therefore required for structural and functional assessment of P. falciparum genes and their expression, and polymorphism data are needed to exploit genomic information and further qualify therapeutic target candidates. Here, we undertook a large-scale analysis of a P. falciparum FcB1-schizont-EST library, previously constructed by suppression subtractive hybridization (SSH) to study genes expressed during merozoite morphogenesis, with the aim of: 1) obtaining an exhaustive collection of schizont-specific ESTs, 2) experimentally validating or correcting P. falciparum gene models and 3) pinpointing genes displaying protein polymorphism between the FcB1 and 3D7 strains.
Year: 2009 PMID: 19454033 PMCID: PMC2695484 DOI: 10.1186/1471-2164-10-235
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Clustering strategy for the analysis of FcB1-schizont-ESTs. The first two clusterings performed using est2genome (BLAST score > 700) on the 3D7 genome (PlasmoDB versions 4.4 and 5.3) allowed clustering of 19,459 ESTs. By lowering the BLAST score to 500, 447 additional ESTs were clustered and mapped on the 3D7 genome (PlasmoDB version 5.3). The remaining unmatched FcB1-schizont-ESTs were analysed by comparison with the UniProt database, revealing 270 additional ESTs matching MSP1 (K1 type) and Ebl-1, 160 ESTs matching the mitochondrial genome and 1 EST matching the apicoplast genome.
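The two-threshold strategy in this legend can be illustrated with a minimal sketch. It assumes each EST already has a single best alignment score; the function name, the EST identifiers and the toy scores are hypothetical, and the strict ">" comparison at the relaxed threshold of 500 is an assumption, since the legend only states "> 700" for the first pass.

```python
from typing import Dict, List, Tuple


def two_pass_assignment(
    scores: Dict[str, float],
    strict: float = 700.0,
    relaxed: float = 500.0,
) -> Tuple[List[str], List[str], List[str]]:
    """Split EST ids into strict hits, relaxed-only hits and unmatched ESTs.

    `scores` maps a (hypothetical) EST id to its best alignment score.
    """
    strict_hits: List[str] = []
    relaxed_hits: List[str] = []
    unmatched: List[str] = []
    for est_id, score in scores.items():
        if score > strict:
            strict_hits.append(est_id)      # clustered in the first passes
        elif score > relaxed:
            relaxed_hits.append(est_id)     # rescued by the lower threshold
        else:
            unmatched.append(est_id)        # left for the UniProt comparison
    return strict_hits, relaxed_hits, unmatched


# Toy data, not taken from the study.
toy_scores = {"est_a": 912.0, "est_b": 640.0, "est_c": 120.0}
strict_hits, relaxed_hits, unmatched = two_pass_assignment(toy_scores)
print(strict_hits, relaxed_hits, unmatched)

# Counts stated in the legend: ESTs clustered on the 3D7 genome.
clustered_on_genome = 19_459 + 447
print(clustered_on_genome)
```

The final sum, 19,459 + 447, gives the number of ESTs mapped on the 3D7 nuclear genome after both thresholds.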
Summary of the FcB1-schizont-EST distribution on the 14 P. falciparum chromosomes.
| Chromosome (a) | Matched ESTs (b) | ESTs in protein coding genes (c) | ESTs in ribosomal loci (d) | ESTs in telomeric loci (e) | Protein coding genes identified (f) | Functionally annotated protein coding genes (g) | Putative protein coding gene (h) | Hypothetical protein coding gene (i) | Confirmation of gene model (j) | Modification of gene model (k) | Evidence for protein polymorphism (l) |
| 1 | 991 | 365 | 626 | 0 | 5 | 3 | 0 | 2 | 1 | 1 | 1 |
| 2 | 937 | 937 | 0 | 0 | 15 | 4 | 6 | 5 | 4 | 1 | 0 |
| 3 | 430 | 430 | 0 | 0 | 13 | 2 | 5 | 6 | 2 | 0 | 0 |
| 4 | 137 | 137 | 0 | 0 | 8 | 2 | 3 | 3 | 1 | 0 | 1 |
| 5 | 10,174 | 292 | 9,846 | 36 | 15 | 1 | 5 | 9 | 2 | 2 | 2 |
| 6 | 885 | 885 | 0 | 0 | 17 | 5 | 6 | 6 | 1 | 0 | 2 |
| 7 | 9,953 | 536 | 9,417 | 0 | 14 | 3 | 4 | 7 | 3 | 0 | 2 |
| 8 | 337 | 332 | 2 | 3 | 11 | 2 | 6 | 3 | 2 | 1 | 1 |
| 9 | 905 | 905 | 0 | 0 | 16 | 4 | 5 | 7 | 2 | 2 | 0 |
| 10 | 636 | 634 | 0 | 2 | 21 | 8 | 4 | 9 | 3 | 2 | 5 |
| 11 | 653 | 653 | 0 | 0 | 17 | 4 | 2 | 11 | 1 | 1 | 0 |
| 12 | 1,011 | 1,011 | 0 | 0 | 25 | 9 | 3 | 13 | 4 | 1 | 2 |
| 13 | 994 | 958 | 0 | 36 | 32 | 8 | 6 | 18 | 2 | 2 | 3 |
| 14 | 1,179 | 1,179 | 0 | 0 | 34 | 5 | 7 | 22 | 1 | 1 | 2 |
| Total | 29,222 | 9,254 | 19,891 | 77 | 243 | 60 | 62 | 121 | 29 | 14 | 21 |
| Unique | 19,906 | 9,254 | 10,611 | 41 | 243 | 30 | 62 | 121 | 29 | 14 | 21 |
| % | | 46.5 | 53.3 | 0.2 | | 25 | 25 | 50 | 12 | 6 | 9 |
This table lists the number of FcB1-schizont-ESTs (b) mapped on each of the 14 P. falciparum chromosomes (a), detailing those matching protein coding genes (c), ribosomal loci (d) or telomeric loci (e). The number of protein coding genes covered by the FcB1-schizont-ESTs is indicated in (f), with (g, h, i) detailing whether these genes are functionally annotated genes (g), putative genes (h) or hypothetical genes (i). Columns (j, k, l) indicate the number of gene models that were confirmed (j), modified (k) or identified as displaying some strain-dependent polymorphism between FcB1 and 3D7 (l) after analysis of 3D7-genomic versus FcB1-EST alignments. "Total" corresponds to the total number of cases for the 14 chromosomes; "Unique" accounts for the fact that many ribosomal-RNA-matching ESTs matched both chromosomes 5 and 7 and that the 36 ESTs matching the telomeric end of chromosome 5 also matched the telomeric end of chromosome 13.
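The relation between the "Total", "Unique" and "%" rows can be checked with a short calculation. The sketch below uses only counts copied from the table; the variable names are illustrative, and the assumption that the EST percentages are taken relative to the unique matched ESTs is inferred from the numbers rather than stated in the legend.

```python
# Telomeric ESTs per chromosome, copied from column (e) of the table.
telomeric_by_chromosome = {5: 36, 8: 3, 10: 2, 13: 36}

# "Total" counts an EST once for every chromosome it matches.
total_telomeric = sum(telomeric_by_chromosome.values())

# The 36 ESTs on chromosome 5 are the same ESTs as on chromosome 13,
# so they are counted only once in the "Unique" row.
shared_chr5_chr13 = 36
unique_telomeric = total_telomeric - shared_chr5_chr13

# "Unique" row for columns (c), (d) and (e).
unique_counts = {
    "protein_coding": 9_254,
    "ribosomal": 10_611,
    "telomeric": unique_telomeric,
}
unique_matched = sum(unique_counts.values())

# Share of each category among the unique matched ESTs, in percent.
percentages = {
    name: round(100 * count / unique_matched, 1)
    for name, count in unique_counts.items()
}

print(total_telomeric, unique_telomeric, unique_matched)
print(percentages)
```

Under that assumption the three categories add up to the unique matched ESTs of column (b), and the rounded shares agree with the "%" row for columns (c) to (e).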
GO term analysis of genes spanned by FcB1-schizont-ESTs.
| GO term | GO ID | Genes in collection | Genes in genome | p-value |
| nucleosome | GO:0000786 | 4 | 8 | 0.000334 |
| actin cytoskeleton | GO:0015629 | 5 | 18 | 0.00136 |
| myosin complex | GO:0016459 | 3 | 6 | 0.00208 |
| chromosome | GO:0005694 | 7 | 45 | 0.00561 |
| rhoptry | GO:0020008 | 2 | 3 | 0.00694 |
| entry into host cell | GO:0030260 | 4 | 7 | 0.000174 |
| nucleosome assembly | GO:0006334 | 4 | 12 | 0.00202 |
| cytokinesis | GO:0000910 | 2 | 2 | 0.00239 |
| cytoskeleton organisation and biogenesis | GO:0007010 | 8 | 56 | 0.00533 |
| DNA packaging process | GO:0006323 | 5 | 25 | 0.00637 |
| cell division | GO:0051301 | 2 | 3 | 0.00694 |
| cell motility | GO:0006928 | 2 | 3 | 0.00694 |
| defense response | GO:0006952 | 1 | 187 | 0.00745 |
| actin binding | GO:0003779 | 6 | 14 | 2.74e-05 |
| phospholipid binding | GO:0005543 | 3 | 7 | 0.0035 |
| calcium ion binding | GO:0005509 | 10 | 86 | 0.00853 |
A statistical analysis was performed to identify GO terms over- or under-represented in the annotated genes of this collection as compared to their distribution in the annotated genes of the whole genome. GOStat software [22] was used for this analysis, using a p-value threshold of 0.01. Only 159 genes have GO annotations among the 243 genes of the collection (3241 genes have GO annotations throughout the whole genome). This table reports all non-redundant over- or under-represented terms (i.e. all over- or under-represented GO terms that do not generalize another over- or under-represented term) in the three ontologies (cellular component, biological process, and molecular function). All terms were over-represented except for the term "defense response" (biological process) that was under-represented.
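The kind of over- and under-representation test described here can be sketched with a one-sided hypergeometric tail probability. This is an illustration only: the legend does not state which test statistic or multiple-testing correction GOStat applied, so the values computed below are not expected to reproduce the table exactly. The function names are hypothetical; the counts (159 annotated genes in the collection, 3241 in the genome, and the per-term counts) are taken from the table and legend.

```python
from math import comb


def hypergeom_pmf(k: int, n: int, K: int, N: int) -> float:
    """P(X = k) when drawing n genes from N, of which K carry the GO term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): probability of seeing at least k annotated genes (over-representation)."""
    return sum(hypergeom_pmf(i, n, K, N) for i in range(k, min(n, K) + 1))


def lower_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X <= k): probability of seeing at most k annotated genes (under-representation)."""
    return sum(hypergeom_pmf(i, n, K, N) for i in range(0, k + 1))


N_GENOME = 3241      # genes with GO annotations in the whole genome
N_COLLECTION = 159   # genes with GO annotations in the collection

# "actin binding": 6 genes in the collection, 14 in the genome.
p_actin = upper_tail(6, N_COLLECTION, 14, N_GENOME)

# "defense response": 1 gene in the collection, 187 in the genome.
p_defense = lower_tail(1, N_COLLECTION, 187, N_GENOME)

print(f"actin binding (over-representation): {p_actin:.3g}")
print(f"defense response (under-representation): {p_defense:.3g}")
```

Exact integer binomial coefficients from `math.comb` keep the calculation dependency-free; for genome-wide screening over many terms, a correction for multiple testing would normally be applied on top of these raw tail probabilities.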
Gene models confirmed by FcB1-schizont-ESTs.
| [Gene] (a) | [Product Description] (b) | [Pf-iRBC max expr time (GS array)] (c) | [Pf-iRBC+Spz+Gam max expr stage (Affy)] (d) | # of ESTs (e) | confirmed in gene model (f) |
| PFA0110w | ring-infected erythrocyte surface antigen | 46 | Merozoite | 229 | intron 1 |
| MAL13P1.103 | hypothetical protein, conserved | 34 | Gametocyte | 1 | introns 1 to 4 and exons 1 to 5 * |
| MAL7P1.108 | hypothetical protein, conserved | | Early Schizogony | 7 | intron 3 |
| MAL7P1.153 | hypothetical protein, conserved | 46 | Gametocyte | 2 | intron 1 |
| MAL7P1.229 | Cytoadherence linked asexual protein | 40 | | 133 | intron 1 |
| PF08_0075 | 60S ribosomal protein L13, putative | 14 | Early Trophozoite | 29 | intron 1 |
| PF10_0211 | hypothetical protein | 42 | Late Schizogony | 44 | intron 2 |
| PF10_0268 | merozoite capping protein 1 | 41 | Early Schizogony | 8 | intron 1 |
| PF10_0372 | Antigen UB05 | 37 | Early Schizogony | 7 | introns 1 and 3 |
| PF11_0348 | hypothetical protein | 37 | Gametocyte | 29 | intron 1 |
| PF13_0173 | hypothetical protein, conserved | 42 | Late Schizogony | 58 | intron 1 |
| PF14_0429 | RNA helicase, putative | 43 | Early Ring | 16 | exon 1 and intron 1 |
| PFB0310c | merozoite surface protein 4 | | Early Schizogony | 33 | intron 1 |
| PFB0340c | cysteine protease, putative | 37 | Early Schizogony | 582 | introns 2 and 3 |
| PFB0475c | hypothetical protein, conserved | 46 | Late Schizogony | 40 | introns 1 and 2 |
| PFB0815w | Pf Calcium-dependent protein kinase 1 | 42 | Late Schizogony | 60 | introns 1 to 4 * |
| PFC0120w | Cytoadherence linked asexual protein, 3.2 | 40 | Early Schizogony | 252 | introns 1 to 5 * |
| PFC0920w | histone H2A variant, putative | 39 | Late Schizogony | 42 | introns 1 and 2 |
| PFD0940w | hypothetical protein, conserved | 38 | Early Schizogony | 52 | intron 2 |
| PFE1415w | cell cycle regulator with zn-finger domain, putative | 40 | Gametocyte | 10 | introns 5 to 8 * |
| PFF0185c | hypothetical protein | 41 | | 21 | intron 14 |
| PFI0265c | RhopH3 | 41 | Early Schizogony | 183 | exons 4, 5, 6 |
| PFI1445w | High molecular weight rhoptry protein-2 | 39 | Early Schizogony | 95 | introns 1 and 6 |
| PFL0975w | hypothetical protein, conserved | 38 | Early Schizogony | 27 | introns 3 and 4 and end of gene * |
| PFL1160c | hypothetical protein, conserved | 39 | Early Schizogony | 31 | introns 2 and 3 |
| PFL2505c | hypothetical protein, conserved | | Late Schizogony | 107 | introns 1 and 2 |
This table, derived from Additional file 1, lists the 26 P. falciparum genes whose gene models were confirmed by FcB1-schizont-ESTs. The first 4 columns were downloaded from PlasmoDB version 5.3 and correspond respectively to: gene accession numbers in PlasmoDB (a), their current description in PlasmoDB (b), the maximum expression time during the erythrocytic cycle according to the transcriptomic data of [17,19] (c) and the maximum expression stage according to the transcriptomic data of [18] (d). Column (e) indicates the number of ESTs corresponding to each gene and isolated in this study. Column (f) details the genetic elements that were confirmed. Asterisks (*) in this last column refer to examples illustrated in Additional file 3. Pf-iRBC, P. falciparum-infected red blood cells.
Gene models modified by FcB1-schizont-ESTs.
| [Gene] (a) | [Product Description] (b) | [Pf-iRBC max expr time (GS array)] (c) | [Pf-iRBC+Spz+Gam max expr stage (Affy)] (d) | # of ESTs (e) | modified in gene model (f) |
| PFA0630c | hypothetical protein | 16 | Early Trophozoite | 17 | in agreement with chr1.genefinder_16r, chr1.glimmerm_366 and chr1.phat_146 * |
| MAL13P1.460 | conserved hypothetical protein | | | 77 | intron 3 modified |
| MAL8P1.73 | hypothetical protein, conserved | 40 | Early Schizogony | 28 | intron 16 modified but intron 18 confirmed |
| PF10_0072 | hypothetical protein | | Late Schizogony | 8 | exon 1 would be longer at 3' end |
| PF10_0361 | hypothetical protein | 23 | Early Ring | 37 | in agreement with chr11, glimmer_1141 |
| PF11_0194 | hypothetical protein | 41 | Gametocyte | 50 | in agreement with chr11, genefinder.157r * |
| PF13_0193 | MSP7-like protein | | Early Schizogony | 41 | exon 1 would be longer at 3' end |
| PF14_0280 | phosphotyrosyl phosphatase activator, putative | 20 | Gametocyte | 2 | gene would be longer downstream |
| PFB0305c | merozoite surface protein 5 | 46 | Late Schizogony | 5 | exon 2 would be longer at 5' end |
| PFE0240w | hypothetical protein, conserved | | Gametocyte | 12 | four additional exons, longer protein * |
| PFE1490c | hypothetical protein, conserved | | Early Ring | 16 | intron 1 modified but intron 2 confirmed |
| PFI0905w | hypothetical protein | | Gametocyte | 4 | exon 2 would be longer at 5' end ** |
| PFI1565w | conserved protein | | Late Schizogony | 11 | 3'-end of gene in agreement with chr9.glimmerm_973 and chr9.glimmerm_974 * |
| PFL0290w | hypothetical protein, conserved | 43 | Early Trophozoite | 11 | intron 1 modified but intron 2 confirmed * |
This table, derived from Additional file 1, lists the 14 P. falciparum genes whose gene models were corrected based on FcB1-schizont-ESTs. Columns (a) to (e) are as described in the Table 3 legend. The last column (f) details each modification. Note that three genes in this list were both modified/confirmed (in different parts): MAL8P1.73, PFE1490c and PFL0290w. Asterisks (*) in this last column refer to examples illustrated in Additional file 4. The model revision proposed for PFI0905w (**) will need to be confirmed by other experimental data since these 4 ESTs also matched P. falciparum telomerase RNA (see Additional file 10). Pf-iRBC, P. falciparum-infected red blood cells.
Evidence of protein polymorphism between FcB1 and 3D7 strains.
| [Gene] (a) | [Product Description] (b) | [Pf-iRBC max expr time (GS array)] (c) | [Pf-iRBC+Spz+Gam max expr stage (Affy)] (d) | # of ESTs (e) | type of polymorphism (f) |
| PFA0215w | hypothetical protein, conserved | 45 | Late Schizogony | 72 | in tandem repeats |
| PFD0185c | peptidase | 42 | Gametocyte | 10 | in tandem repeats |
| PFE0250w | hypothetical protein, conserved | 25 | Early Trophozoite | 39 | in Asn-rich region |
| PFE0655w | hypothetical protein, conserved | 16 | Early Trophozoite | 2 | in tandem repeats |
| PFF0670w | hypothetical protein, conserved | 38 | | 47 | in Asn-rich region |
| PFF0765c | hypothetical protein, conserved | 41 | | 3 | in tandem repeats |
| MAL7P1.208 | rhoptry-associated membrane antigen, RAMA | 41 | | 147 | in tandem repeats |
| PF07_0111 | hypothetical protein, conserved | 37 | Gametocyte | 1 | in tandem repeats |
| PF08_0109 | hypothetical protein, conserved | 39 | Early Schizogony | 5 | in Asn-rich region |
| PF10_0177 | erythrocyte membrane-associated antigen | 40 | Gametocyte | 31 | in tandem repeats |
| PF10_0184 | hypothetical protein | 41 | Gametocyte | 13 | local polymorphism |
| PF10_0213 | 10b antigen, putative | 33 | Early Schizogony | 20 | in Asn-rich region |
| PF10_0345 | merozoite surface protein 3 | 42 | Late Schizogony | 93 | mild polymorphism |
| PF10_0351 | hypothetical protein | 45 | Late Schizogony | 83 | in tandem repeats |
| PFL0465c | Zinc finger transcription factor (krox1) | | Late Schizogony | 1 | mild polymorphism |
| PFL1385c | Merozoite Surface Protein 9, MSP-9 | 41 | Early Schizogony | 310 | mild polymorphism |
| PF13_0053 | hypothetical protein, conserved | 12 | Early Trophozoite | 52 | in tandem repeats |
| MAL13P1.158 | hypothetical protein, conserved | 40 | Gametocyte | 10 | local polymorphism |
| PF13_0245 | hypothetical protein, conserved | 46 | Early Trophozoite | 16 | mild polymorphism |
| PF14_0175 | conserved protein unknown function | | | 107 | in tandem repeats |
| PF14_0486 | elongation factor 2 | 17 | Early Trophozoite | 5 | mild polymorphism |
This table, derived from Additional file 1, lists the 21 P. falciparum genes for which some protein polymorphism was identified between FcB1 and 3D7 strains. Columns (a) to (e) are as described in the Table 3 legend. The last column (f) details the various cases. All protein sequence alignments are illustrated in Additional file 5. Pf-iRBC, P. falciparum-infected red blood cells.
Summary of the FcB1-schizont EST distribution on the different rRNA loci of P. falciparum.
| rRNA type | Chromosomal location (a) | rRNA gene structure | Expression if known | Number of Matching ESTs | Corresponding clusters |
| A-type | Chromosome 5 | 18s – 5.8s – 28s | Human | 9846 (f) | 322, 328, 303, 302, 323, 324, 325, 326, 327 |
| A-type | Chromosome 7 | 18s – 5.8s – 28s (b) | Human | 9417 (f) | 54, 35, 37, 40, 44, 46, 49, 52, 55 |
| S-type | Chromosome 11 | 18s – 5.8s – 28s | Insect | None | |
| S-type | Chromosome 13 | 18s – 5.8s – 28s | Insect | None | |
| Not defined | Chromosome 1 | 18s – 5.8s – 28s (c) | Unknown | 626 | 146, 147, 148 |
| Not defined | Chromosome 8 | 28s – tmp2 (d) | Unknown | None | |
| Not defined | Chromosome 8 | 5.8s – tmp1 (d) | Unknown | 2 | 62 |
| Not defined | Chromosome 14 | 5s (e) | Unknown | None | |
(a) Approximate position indicated in kb; (b) also contains atypical 18s and 28s; (c) 18s and 5.8s are of S-type and 28s is divergent (65% A-type and 75% S-type); (d) 18s is missing from these units; (e) 3 tandem repeats of 5s. Data compiled from [5] and PlasmoDB. (f) Most of these ESTs clustered with homologous loci of both chromosomes 5 and 7.