Aylan Farid Arenas, Juan Felipe Osorio-Méndez, Andres Julian Gutierrez, Jorge E Gomez-Marin.
Abstract
Apicomplexa are an extremely diverse group of unicellular organisms that infect humans and other animals. Despite the great advances in combating infectious diseases over the past century, these parasites still impose a tremendous social and economic burden on human societies, particularly in tropical and subtropical regions of the world. Proteases from Apicomplexa have been characterized at the molecular and cellular levels, and central roles have been proposed for proteases in diverse processes. In this work, 16 new genes encoding trypsin proteases are identified in 8 apicomplexan genomes by a genome-wide survey. Phylogenetic analysis suggests that these genes were gained through both intracellular gene transfer and vertical gene transfer. Identification, characterization and understanding of the evolutionary origin of protease-mediated processes are crucial to increase knowledge and improve strategies for the development of novel chemotherapeutic agents and vaccines. Copyright 2010 Beijing Genomics Institute. Published by Elsevier Ltd. All rights reserved.
Year: 2010 PMID: 20691395 PMCID: PMC5054444 DOI: 10.1016/S1672-0229(10)60011-3
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Main characteristics of the 16 trypsin genes identified in 8 apicomplexan genomes
| Gene | Species | Accession number | ESTs | Upstream promoter regions | Exons | Gene size (bp) | ORF size (bp) | Genome localization |
|---|---|---|---|---|---|---|---|---|
| 1 | Tg | TGME49_062920 | 48 | YES | 11 | 9,503 | 2,871 | VIIb |
| 2 | Tg | TGME49_077850 | 4 | YES | 11 | 7,933 | 2,235 | XII |
| 3 | Tg | TGME49_090840 | 9 | YES | 9 | 6,837 | 2,883 | IX |
| 4 | Tg | TGME49_118290 | 3 | YES | 8 | 5,474 | 972 | IV |
| 1 | Tg | TGGT1_007070 | 48 | YES | 11 | 9,503 | 2,871 | VIIb |
| 2 | Tg | TGGT1_104800 | 4 | YES | 11 | 7,933 | 2,235 | XII |
| 3 | Tg | TGGT1_032310 | 9 | YES | 9 | 6,837 | 2,883 | IX |
| 4 | Tg | TGGT1_122960 | 3 | YES | 8 | 5,474 | 972 | IV |
| 1 | Tg | TGVEG_071800 | 48 | YES | 11 | 9,503 | 2,871 | VIIb |
| 2 | Tg | TGVEG_028600 | 4 | YES | 10 | 7,933 | 2,124 | XII |
| 3 | Tg | TGVEG_084140 | 9 | YES | 9 | 6,837 | 2,880 | IX |
| 4 | Tg | TGVEG_009710 | 3 | YES | 7 | 5,474 | 1,089 | IV |
| 5 | Pf | Pf_MAL8P1.126 | 1 | – | 5 | 3,139 | 2,613 | MAL 8 |
| 6 | Pf | Pf_MAL8P1.98 | 1 | – | 11 | 2,489 | 1,125 | MAL 8 |
| 7 | Pv | PVX_088155 | 0 | – | 5 | 2,913 | 2,430 | CM000442 |
| 8 | Pv | PVX_123160 | 5 | – | 14 | 5,097 | 2,112 | CM000455 |
| 9 | Pk | PKH_142640 | 0 | – | 15 | 5,470 | 2,112 | 14 |
| 10 | Py | PY01797 | 1 | – | 9 | 2,764 | 1,359 | MALPY00485 |
| 11 | Nc | NC_LIV_090910 | 0 | – | 11 | 8,011 | 3,138 | VIIb |
| 12 | Nc | NC_LIV_145670 | 2 | – | 10 | 6,077 | 2,193 | XII |
| 13 | Nc | NC_LIV_113280 | 5 | – | 17 | 13,649 | 2,460 | IX |
| 14 | Nc | NC_LIV_051240 | 1 | – | 15 | 9,202 | 1,590 | IV |
| 15 | Ta | XP_954412.1 | 1 | – | 9 | 2,108 | 1,731 | – |
| 16 | Tp | XP_765845.1 | 0 | – | – | – | – | – |
Note: The accession numbers, the number of EST alignments per gene, the prediction of upstream promoter regions (based on ChIP-chip data for T. gondii), the exon counts, the gene and ORF sizes and the genome localization of the 16 putative trypsin protein sequences are indicated. Information on T. gondii and N. caninum trypsin genes is based on ToxoDB 5.1, and information on Plasmodium genes is based on PlasmoDB 5.5. Analysis of Theileria genes was performed using sequences retrieved from the Wellcome Trust Sanger Institute (http://www.sanger.ac.uk). Tg: Toxoplasma gondii; Pf: Plasmodium falciparum; Pv: P. vivax; Pk: P. knowlesi; Py: P. yoelii; Nc: Neospora caninum; Ta: Theileria annulata; Tp: Theileria parva.
Figure 1. Multiple alignment of the 16 trypsin domain sequences identified in 8 apicomplexan genomes. Amino acids belonging to the trypsin catalytic triad are indicated by red boxes. The characteristic trypsin motif GNSGGPAL is indicated by the blue box. The alignment was edited with the GeneDoc program by deleting gap columns. Conserved positions are shaded in black and semi-conserved positions are shaded in grey.
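The two operations mentioned in the legend, removing gap columns and locating the GNSGGPAL motif, can be sketched in a few lines of Python. This is only an illustrative sketch, not the GeneDoc procedure: the function names are hypothetical, the toy sequences are invented for the example, and "gap column" is assumed here to mean any column in which at least one sequence has a gap.

```python
def drop_gap_columns(alignment, gap="-"):
    """Remove every alignment column in which at least one sequence has a gap.

    Assumption: a "gap column" is any column containing a gap character.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [i for i in range(length) if all(seq[i] != gap for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def find_motif(sequence, motif="GNSGGPAL", gap="-"):
    """Return the 0-based start of the motif in the ungapped sequence, or -1."""
    return sequence.replace(gap, "").find(motif)


# Toy aligned fragments (invented for illustration, not from the paper).
toy_alignment = [
    "MK-GNSGGPALV",
    "MKAGNSGGPALV",
    "M-AGNSGGPAL-",
]

trimmed = drop_gap_columns(toy_alignment)
for seq in trimmed:
    print(seq, find_motif(seq))
```

With these toy inputs, columns 1, 2 and 11 each contain a gap and are dropped, so every trimmed sequence keeps the motif intact.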
Prediction of secretion and subcellular localization of T. gondii trypsin homologous sequences
| Prokaryotic trypsins | Secretory | Non-secretory |
|---|---|---|
| Cyanobacteria (10) | 8 | 2 |
| Clostridia (8) | 7 | 1 |
| Alpha-proteobacteria (7) | 6 | 1 |
| Beta-proteobacteria (11) | 11 | 0 |
| Delta-proteobacteria (6) | 5 | 1 |
| Gamma-proteobacteria (4) | 4 | 0 |
| Proteobacteria (1) | 1 | 0 |
| Bacilli (10) | 10 | 0 |
| Spirochetes (3) | 1 | 2 |
| Chlamydiae (1) | 1 | 0 |
| Actinobacteria (3) | 1 | 2 |
| Deinococcus (4) | 4 | 0 |
| Thermotogae (2) | 1 | 1 |
| Dictyoglomi (2) | 0 | 2 |
| Chlorobi (4) | 4 | 0 |
| Deferribacteres (1) | 1 | 0 |

| Eukaryotic trypsins | Secretory | Chloroplast | Mitochondria | Other |
|---|---|---|---|---|
| Apicomplexa (16) | 6 | 3 | 4 | 3 |
| Plants (3) | 0 | 1 | 1 | 1 |
| Metazoa (38) | 12 | - | 13 | 8 |
Note: The number of retrieved sequences for each taxonomic group is indicated in parentheses. Prediction of secretion is based on either SignalP 3.0 or SecretomeP 2.0 for prokaryotic sequences, and on either SignalP 3.0 or TargetP 1.1 for eukaryotic sequences. Transit peptides for mitochondria and chloroplast were predicted by TargetP 1.1.
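As a quick arithmetic check on the prokaryotic rows, the per-group counts can be tallied to confirm that each row adds up to its stated total and to obtain the overall share of sequences predicted to be secretory. The sketch below simply re-enters the numbers from the table; the variable and function names are hypothetical and nothing here reproduces the SignalP or SecretomeP predictions themselves.

```python
# Counts copied from the prokaryotic rows of the table:
# group -> (sequences retrieved, predicted secretory, predicted non-secretory)
PROKARYOTIC = {
    "Cyanobacteria": (10, 8, 2),
    "Clostridia": (8, 7, 1),
    "Alpha-proteobacteria": (7, 6, 1),
    "Beta-proteobacteria": (11, 11, 0),
    "Delta-proteobacteria": (6, 5, 1),
    "Gamma-proteobacteria": (4, 4, 0),
    "Proteobacteria": (1, 1, 0),
    "Bacilli": (10, 10, 0),
    "Spirochetes": (3, 1, 2),
    "Chlamydiae": (1, 1, 0),
    "Actinobacteria": (3, 1, 2),
    "Deinococcus": (4, 4, 0),
    "Thermotogae": (2, 1, 1),
    "Dictyoglomi": (2, 0, 2),
    "Chlorobi": (4, 4, 0),
    "Deferribacteres": (1, 1, 0),
}


def inconsistent_rows(table):
    """Return the groups whose two categories do not sum to the stated total."""
    return [name for name, (n, sec, non) in table.items() if sec + non != n]


def secretory_summary(table):
    """Return (secretory count, total sequences, secretory fraction)."""
    secretory = sum(sec for _, sec, _ in table.values())
    total = sum(n for n, _, _ in table.values())
    return secretory, total, secretory / total


secretory, total, fraction = secretory_summary(PROKARYOTIC)
print(inconsistent_rows(PROKARYOTIC))
print(f"{secretory}/{total} predicted secretory ({fraction:.1%})")
```

All 16 prokaryotic rows are internally consistent, and 65 of the 77 prokaryotic sequences are predicted to be secretory.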
Domain architecture of trypsins
| Domain architecture | Archaea | Bacteria | Apicomplexa | Other eukaryotes |
|---|---|---|---|---|
| Tryp | 0 | 3 | 6 | 3 |
| Tryp - PDZ | 3 | 42 | 7 | 22 |
| Tryp - PDZ - PDZ | 0 | 32 | 0 | 0 |
| CS - Tryp - PDZ | 0 | 0 | 1 | 0 |
| KAZAL - Tryp - PDZ | 0 | 0 | 0 | 2 |
| IB - KAZAL - Tryp - PDZ | 0 | 0 | 0 | 13 |
| 2-Hacid_dh_C - Tryp | 0 | 0 | 0 | 1 |
| Tryp - PDZ - MMR_HSR1 | 0 | 0 | 2 | 0 |

| Domain architecture | Secretory | Mitochondria | Chloroplast | Other |
|---|---|---|---|---|
| Tryp | 2 | 3 | 2 | 5 |
| Tryp - PDZ | 39 | 10 | 1 | 24 |
| Tryp - PDZ - PDZ | 30 | 0 | 0 | 2 |
| CS - Tryp - PDZ | 0 | 0 | 0 | 1 |
| KAZAL - Tryp - PDZ | 0 | 2 | 0 | 0 |
| IB - KAZAL - Tryp - PDZ | 12 | 0 | 0 | 1 |
| 2-Hacid_dh_C - Tryp | 0 | 0 | 0 | 1 |
| Tryp - PDZ - MMR_HSR1 | 0 | 2 | 0 | 0 |
Note: Protein domains were defined as in the SMART database. The classification of the sequences is based on the NCBI taxonomy database. Prediction of secretion is based on either SignalP 3.0 or SecretomeP 2.0 for prokaryotic sequences, and on either SignalP 3.0 or TargetP 1.1 for eukaryotic sequences. Transit peptides for mitochondria and chloroplast were predicted by TargetP 1.1.
Figure 2. Bootstrap consensus tree inferred using the neighbor-joining method. Confidence values were assessed from 5,000 replicates. All positions containing gaps and missing data were eliminated from the dataset. Phylogenetic analyses were conducted in MEGA4. Branches supported by significant bootstrap values (>50) are highlighted using the following color code. Light blue: Cyanobacteria; Brown: Deinococcus; Grey: Chlorobium; Violet: Proteobacteria; Green: Metazoa; Light green: Bacilli; Black: Clostridia; Blue: apicomplexan trypsins acquired by intracellular gene transfer (IGT; Cluster 1 in the text); Red: other apicomplexan trypsins (Cluster 2 in the text). The black arrowhead indicates a possible segmental duplication event, and the red arrowhead indicates the IGT event from an endosymbiont to the nuclear genome of an apicomplexan ancestor. The accession numbers of the apicomplexan trypsins (Table 1) are used as operational taxonomic units.
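Two of the steps named in the legend, eliminating positions with gaps or missing data and drawing bootstrap replicates, can be illustrated with a short Python sketch. It is not the MEGA4 implementation and it does not build the neighbor-joining tree. It only shows complete deletion followed by resampling of alignment columns with replacement, plus a simple p-distance of the kind a distance-based method could take as input. The function names and toy sequences are hypothetical, and "-" and "?" are assumed to mark gaps and missing data.

```python
import random


def complete_deletion(alignment, missing="-?"):
    """Drop every column where any sequence has a gap or missing-data symbol."""
    length = len(alignment[0])
    keep = [i for i in range(length)
            if all(seq[i] not in missing for seq in alignment)]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Yield pseudo-alignments built by sampling columns with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    length = len(alignment[0])
    for _ in range(n_replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        yield ["".join(seq[i] for i in cols) for seq in alignment]


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy alignment (invented for illustration, not from the paper).
toy = ["ACDE-G", "ATDEFG", "AC?EFA"]
clean = complete_deletion(toy)
replicates = list(bootstrap_replicates(clean, n_replicates=5))
print(clean, p_distance(clean[0], clean[1]))
```

In a full analysis, a tree would be inferred from each replicate and the support for a branch would be the percentage of replicate trees that contain it; the legend reports 5,000 such replicates.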