Genevieve Syn, Jenefer M Blackwell, Sarra E Jamieson, Richard W Francis.
Abstract
Toxoplasma gondii uses epigenetic mechanisms to regulate both endogenous and host cell gene expression. To identify genes with putative epigenetic functions, we developed an in silico pipeline to interrogate the T. gondii proteome of 8313 proteins. Step 1 employs PredictNLS and NucPred to identify genes predicted to target eukaryotic nuclei. Step 2 uses GOLink to identify proteins of epigenetic function based on Gene Ontology terms. This resulted in 611 putative nuclear localised proteins with predicted epigenetic functions. Step 3 filtered for secretory proteins using SignalP, SecretomeP, and experimental data. This identified 57 of the 611 putative epigenetic proteins as likely to be secreted. The pipeline is freely available online, uses open access tools and software with user-friendly Perl scripts to automate and manage the results, and is readily adaptable to undertake any such in silico search for genes contributing to particular functions.
Year: 2018 PMID: 29846382 PMCID: PMC5963570 DOI: 10.1590/0074-02760170471
Source DB: PubMed Journal: Mem Inst Oswaldo Cruz ISSN: 0074-0276 Impact factor: 2.743

Bioinformatics pipeline outlining the in silico tools and manual methods used to predict parasite proteins with endogenous and exogenous epigenetic function. Step 1: predicts proteins localised to the nucleus; Step 2: predicts proteins with potential epigenetic function; Step 3: applies a filter to determine potential for epigenetic function in the host versus endogenous epigenetic function.
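The three steps can be read as successive filters over the proteome. The following is a minimal sketch of that logic only, not the authors' Perl implementation: the `Protein` record, its boolean flags, and the toy identifiers are hypothetical stand-ins for the outputs of the prediction tools named in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """Hypothetical record; each flag stands in for a tool's prediction."""
    toxodb_id: str
    nuclear: bool      # Step 1: predicted nuclear localisation
    epigenetic: bool   # Step 2: predicted epigenetic function
    secreted: bool     # Step 3: predicted or observed secretion


def run_pipeline(proteome):
    """Apply the three filters in order and return each retained subset."""
    step1 = [p for p in proteome if p.nuclear]
    step2 = [p for p in step1 if p.epigenetic]
    step3a = [p for p in step2 if p.secreted]      # candidate host-directed
    step3b = [p for p in step2 if not p.secreted]  # candidate endogenous
    return {"step1": step1, "step2": step2, "step3a": step3a, "step3b": step3b}


# Toy proteome with made-up identifiers (not real ToxoDB entries).
toy = [
    Protein("P1", nuclear=True, epigenetic=True, secreted=True),
    Protein("P2", nuclear=True, epigenetic=True, secreted=False),
    Protein("P3", nuclear=True, epigenetic=False, secreted=True),
    Protein("P4", nuclear=False, epigenetic=True, secreted=True),
]

result = run_pipeline(toy)
for name, subset in result.items():
    print(name, [p.toxodb_id for p in subset])
```

Because Step 3 splits the Step 2 set by a single yes/no criterion, the two Step 3 subsets are disjoint and together cover Step 2.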
Summary of proteins filtered through the in silico pipeline. The pipeline was designed to identify a set of parasite-encoded candidates in the putative secretome that could potentially target the host or parasite nucleus and have domains consistent with an epigenetic function.
| Pipeline stage | Proteins retained | Overall percentage of the proteome |
|---|---|---|
| Initial | 8318 | 100% |
| Step 1: prediction of nuclear localised proteins | 3408 | 41% |
| Step 2: proteins from Step 1 with a predicted epigenetic function | 611 | 7.4% |
| Step 3a: proteins from Step 2 with a predicted epigenetic function in host | 57 | 0.7% |
| Step 3b: proteins not included in Step 3a: predicted epigenetic function in parasite | 554 | 6.7% |
*: protein numbers retained and percent of the entire proteome at each step of the in silico pipeline.
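The percentage column is each retained count divided by the initial proteome size. As a small illustrative sketch (the variable and function names are my own, and the counts are copied from the table), the arithmetic and the Step 3 partition can be checked as follows. Direct division can differ in the last digit from the rounded values printed in the table.

```python
# Counts copied from the summary table above.
counts = {
    "Initial": 8318,
    "Step 1": 3408,
    "Step 2": 611,
    "Step 3a": 57,
    "Step 3b": 554,
}


def percent_of_proteome(n, total):
    """Return n as a percentage of the total proteome size."""
    return 100.0 * n / total


total = counts["Initial"]
percentages = {step: percent_of_proteome(n, total) for step, n in counts.items()}

# Step 3 splits the Step 2 set into two disjoint groups,
# so their counts should sum to the Step 2 count.
partition_ok = counts["Step 3a"] + counts["Step 3b"] == counts["Step 2"]

for step, pct in percentages.items():
    print(f"{step}: {counts[step]} proteins, {pct:.2f}% of proteome")
print("Step 3a + Step 3b == Step 2:", partition_ok)
```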
Proteins from our pipeline with a potential epigenetic function in modulating the host epigenome. The list is annotated for proteins identified from the in silico secretome, the experimental secretome data, or both.
| ToxoDB ID | Protein description |
|---|---|
| In silico secretome | |
| TGME49_207690 | PDCD5 |
| TGME49_213310 | hypothetical protein |
| TGME49_239440 | protein kinase (incomplete catalytic triad) |
| TGME49_241870 | tRNA ligase class I (E and Q), catalytic domain-containing protein |
| TGME49_243280 | Met-10+ like-protein |
| TGME49_245660 | hypothetical protein |
| TGME49_257010 | sporozoite developmental protein |
| TGME49_271625 | serine--tRNA ligase |
| TGME49_277030 | isoleucyl-tRNA synthetase, putative |
| TGME49_281675 | hypothetical protein |
| TGME49_284010 | 5’-3’ exonuclease, N-terminal resolvase family domain-containing protein |
| TGME49_295050 | tRNA ligase class II core domain (G, H, P, S and T) domain-containing protein |
| TGME49_299810 | cysteine-tRNA synthetase (CysRS) |
| TGME49_305920 | endonuclease III family 1 protein |
| TGME49_312370 | RNA pseudouridine synthase superfamily protein |
| TGME49_312520 | tRNA dimethylallyltransferase |
| TGME49_313120 | DNA-directed RNA polymerase, alpha subunit |
| Experimental secretome | |
| TGME49_202490 | AP2 domain transcription factor AP2VIIa-7 |
| TGME49_206510 | toxolysin TLN4 |
| TGME49_207080 | histone lysine acetyltransferase MYST-B |
| TGME49_210310 | hypothetical protein |
| TGME49_210360 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 41 family protein |
| TGME49_219600 | hypothetical protein |
| TGME49_223880 | zinc finger, C3HC4 type (RING finger) domain-containing protein |
| TGME49_224480 | cell-cycle-associated protein kinase CLK, putative |
| TGME49_226510 | Sec23/Sec24 trunk domain-containing protein |
| TGME49_228120 | hypothetical protein |
| TGME49_231170 | RecF/RecN/SMC N terminal domain-containing protein |
| TGME49_239420 | protein kinase |
| TGME49_240090 | rhoptry kinase family protein ROP34, putative |
| TGME49_246060 | polymerase (RNA) mitochondrial (DNA directed) POLRMT |
| TGME49_246760 | hypothetical protein |
| TGME49_252500 | polo kinase |
| TGME49_253750 | PLU-1 family protein |
| TGME49_253890 | peptidase M16 inactive domain-containing protein |
| TGME49_267030 | ribonuclease type III Dicer |
| TGME49_268900 | dense granular protein GRA10 |
| TGME49_269885 | rhoptry metalloprotease toxolysin TLN1 |
| TGME49_271290 | hypothetical protein |
| TGME49_271740 | hypothetical protein |
| TGME49_278440 | SWI2/SNF2 Brahma-like putative |
| TGME49_285895 | AP2 domain transcription factor AP2V-2 |
| TGME49_289330 | ubiquitin carboxyl-terminal hydrolase family 2 protein |
| TGME49_292055 | calcium dependent protein kinase CDPK8 |
| TGME49_292235 | hypothetical protein |
| TGME49_294840 | zinc finger (CCCH type) motif-containing protein |
| TGME49_305750 | nucleolar GTP-binding protein 2, putative |
| TGME49_306660 | RNA pseudouridine synthase superfamily protein |
| TGME49_312830 | hypothetical protein |
| TGME49_313330 | rhoptry kinase family protein ROP27 |
| Both | |
| TGME49_201130 | rhoptry kinase family protein ROP33 |
| TGME49_207610 | rhoptry kinase family protein ROP36 (incomplete catalytic triad) |
| TGME49_221330 | DNA gyrase/topoisomerase IV, A subunit domain-containing protein |
| TGME49_229630 | eIF2 kinase IF2K-A (incomplete catalytic triad) |
| TGME49_262730 | rhoptry protein ROP16 |
| TGME49_294560 | rhoptry kinase family protein ROP37 (incomplete catalytic triad) |
| TGME49_309110 | tRNA methyl transferase |
a: retrieved from ToxoDB (Version 8.0).
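The grouping in this table amounts to comparing two ID sets, one from the in silico secretome prediction and one from the experimental secretome data. The sketch below shows one way such an annotation could be derived with set operations. The function name is hypothetical, and the three IDs are taken from the table purely as examples of each group.

```python
def annotate_evidence(in_silico, experimental):
    """Label each ID by the secretome source(s) that support it."""
    labels = {}
    for pid in sorted(in_silico | experimental):
        if pid in in_silico and pid in experimental:
            labels[pid] = "Both"
        elif pid in in_silico:
            labels[pid] = "In silico secretome"
        else:
            labels[pid] = "Experimental secretome"
    return labels


# Three IDs from the table; set membership mirrors the table grouping.
in_silico_hits = {"TGME49_207690", "TGME49_201130"}
experimental_hits = {"TGME49_202490", "TGME49_201130"}

labels = annotate_evidence(in_silico_hits, experimental_hits)
for pid, label in labels.items():
    print(pid, label)
```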