Shunfu Xu, Chao Zhang, Yi Miao, Jianjiong Gao, Dong Xu.
Abstract
BACKGROUND: Effector secretion is a common strategy by which pathogens mediate host-pathogen interactions. Eight EPIYA-motif-containing effectors have recently been discovered in six pathogens. Once these effectors enter host cells through type III/IV secretion systems (T3SS/T4SS), the tyrosine in the EPIYA motif is phosphorylated, which triggers the effectors to bind other proteins and manipulate host-cell functions. The objectives of this study are to evaluate the distribution pattern of the EPIYA motif across a broad range of biological species, to predict potential effectors containing the EPIYA motif, and to suggest roles and biological functions of these potential effectors in host-pathogen interactions.
Year: 2010 PMID: 21143776 PMCID: PMC2999339 DOI: 10.1186/1471-2164-11-S3-S1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Experimentally determined tyrosine-phosphorylated effectors and their motifs
| Effector | Pathogen | Locus of protein | Motif | pY position | Motif | pY position | Motif | pY position |
|---|---|---|---|---|---|---|---|---|
| CagA | | NP_207343 | EPIYAKVNK | Y-899 | EPIYTQVAK | Y-918 | EPIYATIDD | Y-972 |
| Ankyrin | | ABB84853 | ESIYEEIKD | Y-940 | ESIYEEIKD | Y-967 | ESIYEEIKD | Y-994 |
| | | | EDLYATVGA | Y-1028 | ESIYADPFD | Y-1056 | ESIYADPFA | Y-1074 |
| | | | EPIYATVKK | Y-1098 | | | | |
| BepD | | YP_034066 | EPLYAQVNK | Y-32 | NPLYEGVGG | Y-114 | NPLYEGVGS | Y-176 |
| | | | EPLYAQVNK | Y-211 | NPLYEGVGG | Y-293 | NPLYEGVGP | Y-355 |
| BepE | | YP_034067 | EPLYATVNK | Y-37 | ETIYTTVSS | Y-91 | | |
| BepF | | YP_034068 | TPLYATPSP | Y-149 | EPLYATPLP | Y-213 | EPLYATPLP | Y-241 |
| | | | EPLYATAAP | Y-297 | EPLYATPLP | Y-269 | | |
| Tir | | AAC38390 | EHIYDEVAA | Y-474 | | | | |
| Tir | | AAL06376 | EPIYDEVAP | Y-468 | | | | |
| Tarp | | YP_001654788 | ENIYENIYE | Y-136 | ENIYENIYE | Y-238 | ENIYENIYE | Y-390 |
CagA: cytotoxin associated gene A [7,8,10,52-56]; BepD: Bartonella henselae protein D [14-16,57]; BepE: Bartonella henselae protein E [14-16,57]; BepF: Bartonella henselae protein F [14-16,57]; Tir: translocated intimin receptor [58-60]; Tarp: translocated actin-recruiting protein [61-63]. The first five amino acids of each sequence listed in the table correspond to the EPIYA motif.
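As a rough illustration of how the five-residue motif variants in the table could be located in a protein sequence, the sketch below collects the residues seen at each of the five positions and turns them into a regular expression. This is an assumption-laden example, not the authors' search procedure: the names `KNOWN_5MERS`, `build_pattern` and `find_motifs` are hypothetical, and a per-position pattern is more permissive than the table because it also accepts residue combinations that are not listed.

```python
import re

# Distinct first-five-residue variants read off the effector table above.
KNOWN_5MERS = [
    "EPIYA", "EPIYT", "ESIYE", "EDLYA", "ESIYA", "EPLYA",
    "NPLYE", "ETIYT", "TPLYA", "EHIYD", "EPIYD", "ENIYE",
]

def build_pattern(motifs):
    """Build a per-position regex from equal-length motif variants."""
    classes = []
    for column in zip(*motifs):
        residues = "".join(sorted(set(column)))
        classes.append(residues if len(residues) == 1 else f"[{residues}]")
    # A lookahead lets overlapping occurrences (e.g. ENIYENIYE) both match.
    return re.compile("(?=(" + "".join(classes) + "))")

def find_motifs(sequence, pattern):
    """Return (1-based tyrosine position, matched 5-mer) for every hit."""
    return [(m.start() + 4, m.group(1)) for m in pattern.finditer(sequence)]

pattern = build_pattern(KNOWN_5MERS)
for position, motif in find_motifs("ENIYENIYE", pattern):
    print(f"Y-{position}: {motif}")
```

The tyrosine is the fourth residue of each hit, so its 1-based position is the 0-based match start plus 4. Positions are relative to the sequence passed in, so reproducing the table's coordinates would require the full protein sequence.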
Figure 1. Sequence logo of the known EPIYA motif sequences
Distribution of protein sequences containing the EPIYA motif
| Number of motif repeats in one protein | Number of protein sequences | Observed Frequency | Expected Frequency |
|---|---|---|---|
| 29 | 1 | 1.09E-07 | 3.44E-57 |
| 14 | 2 | 2.17E-07 | 5.52E-28 |
| 13 | 2 | 2.17E-07 | 4.88E-26 |
| 12 | 1 | 1.09E-07 | 4.32E-24 |
| 10 | 1 | 1.09E-07 | 3.39E-20 |
| 9 | 3 | 3.26E-07 | 3.00E-18 |
| 8 | 6 | 6.51E-07 | 2.65E-16 |
| 7 | 10 | 1.09E-06 | 2.35E-14 |
| 6 | 32 | 3.47E-06 | 2.08E-12 |
| 5 | 55 | 5.97E-06 | 1.84E-10 |
| 4 | 173 | 1.88E-05 | 1.63E-08 |
| 3 | 916 | 9.94E-05 | 1.44E-06 |
| 2 | 1913 | 2.08E-04 | 1.28E-04 |
| 1 | 104116 | 1.13E-02 | 1.13E-02 |
Expected frequency is the probability expected if motif occurrences within a protein sequence were combined at random.
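The printed expected frequencies are numerically consistent with raising the single-copy frequency (1.13E-02) to the power of the number of repeats, which is what independent, random occurrences would give. The sketch below checks that reading against a few rows and computes an observed-to-expected ratio. The formula is inferred from the numbers in the table rather than stated in the source, and the names `expected_frequency` and `enrichment` are hypothetical.

```python
# Single-copy frequency taken from the "1 repeat" row of the table.
P_SINGLE = 1.13e-2

def expected_frequency(copies, p_single=P_SINGLE):
    """Probability of `copies` independent motif occurrences (assumed model)."""
    return p_single ** copies

def enrichment(observed, copies, p_single=P_SINGLE):
    """Ratio of the observed frequency to the assumed random expectation."""
    return observed / expected_frequency(copies, p_single)

# (repeats, observed frequency, expected frequency) as printed in the table.
ROWS = [(2, 2.08e-4, 1.28e-4), (3, 9.94e-5, 1.44e-6), (4, 1.88e-5, 1.63e-8)]

for copies, observed, printed in ROWS:
    model = expected_frequency(copies)
    ratio = enrichment(observed, copies)
    print(f"{copies} repeats: model {model:.2e}, table {printed:.2e}, "
          f"observed/expected {ratio:.1f}")
```

Small differences in the last digit for high repeat counts are to be expected, because the single-copy frequency is itself rounded to three significant figures.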
Distribution of EPIYA-motif containing proteins at genus and species levels (as of July 6th 2009)
| Groups | Number of genuses | | | | | Number of species | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Archaea | 109 | 49 | 44.95% | 19 | 17.43% | 330 | 90 | 27.27% | 28 | 8.48% |
| Viruses | 623 | 221 | 35.47% | 18 | 2.89% | 6443 | 433 | 6.72% | 30 | 0.47% |
| Bacteria | 1198 | 560 | 46.74% | 209 | 17.45% | 6291 | 1398 | 22.22% | 360 | 5.72% |
| Eukaryota | 35499 | 1828 | 5.15% | 122 | 0.34% | 108654 | 2725 | 2.51% | 169 | 0.16% |
| -Protista | 1263 | 109 | 8.63% | 18 | 1.43% | 3747 | 186 | 4.96% | 32 | 0.85% |
| -Fungi | 1509 | 121 | 8.02% | 40 | 2.65% | 5772 | 206 | 3.57% | 52 | 0.90% |
| -Metazoa | 22309 | 662 | 2.97% | 49 | 0.22% | 62097 | 826 | 1.33% | 68 | 0.11% |
| -Viridiplantae | 10418 | 936 | 8.98% | 15 | 0.14% | 37038 | 1507 | 4.07% | 17 | 0.05% |
| total | 37429 | 2658 | 7.10% | 368 | 0.98% | 121718 | 4646 | 3.82% | 587 | 0.48% |
Data in this table present the numbers of genuses/species with proteins containing the EPIYA motif versus the total number of genuses/species in NR. The Eukaryota group is divided into Protista, Fungi, Metazoa and Viridiplantae.
Known intracellular bacterial pathogens or bacteria containing a type III/IV secretion system, and intracellular parasitic protozoa
| Bacteria | Protista | ||||
|---|---|---|---|---|---|
| Type | Number of species | Type | Number of species | ||
| 245 | 187 | ||||
| T3SS | IPP | ||||
| T3SS | IPP | ||||
| T3SS | IPP | ||||
| T3SS | IPP | ||||
| 76 | IPP | ||||
| T4SS | IPP | ||||
| T4SS | |||||
| T4SS | |||||
| 346 | 120 | ||||
| Brucella | IPB | IPP | |||
| IPB | IPP | ||||
| T4SS | |||||
| 83 | |||||
| IPB | |||||
| IPB | |||||
| IPB | |||||
| IPB | |||||
| Chlamydiae | 23 | ||||
| IPB | |||||
| 62 | |||||
| IPB | |||||
| IPB | |||||
| IPB | |||||
| 168 | |||||
| IPB | |||||
| 9 | |||||
| IPB | |||||
IPB: intracellular parasitic bacteria; IPP: intracellular parasitic protozoan; T3SS: type III secretion system; T4SS: type IV secretion system.
Figure 2. Relationship between the number of EPIYA motif copies and the number of species in known pathogens. The p-values are calculated from the chi-square test statistic between each group and the control (the total, with all species as the background). The list of pathogens is shown in Table 4.
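A group-versus-background chi-square comparison of this kind can be sketched as follows. The example assumes a Pearson chi-square test on a 2x2 table with one degree of freedom and no continuity correction; the source does not say which variant was used, and the per-copy-number counts behind the figure are not given here. The counts in the usage lines are taken from the species columns of the distribution table above purely for illustration, and the function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(group_hits, group_total, background_hits, background_total):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    comparing the hit rate in a group with the hit rate in a background."""
    a, b = group_hits, group_total - group_hits
    c, d = background_hits, background_total - background_hits
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts only: Bacteria (1398 of 6291 species contain the motif)
# against the total over all groups (4646 of 121718 species).
chi2, p_value = chi_square_2x2(1398, 6291, 4646, 121718)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

Following the caption, the background in this example is the total over all species and therefore includes the group itself. A stricter design would compare the group against the remaining species only.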
Distribution of top 40 protein sequences containing at least two copies of EPIYA motif
| Protein Name | Number of proteins (number of genuses) | |||||||
|---|---|---|---|---|---|---|---|---|
| CagA | 1015(1) | 1015(1) | ||||||
| hypothetical protein | 689(186) | 15(10) | 10(6) | 242(88) | 162(19) | 78(28) | 127(25) | 55(11) |
| ATP* | 81(21) | 2(2) | 68(11) | 6(4) | 4(3) | 1(1) | ||
| Ankyrin | 55(7) | 51(3) | 4(4) | |||||
| DNA* | 52(34) | 3(3) | 40(25) | 7(4) | 1(1) | 1(1) | ||
| Kinase* | 43(28) | 5(2) | 23(15) | 4(3) | 11(8) | |||
| zinc finger protein | 43(11) | 43(11) | ||||||
| TPR repeat protein | 33(15) | 33(15) | ||||||
| Polyprotein | 24(2) | 24(2) | ||||||
| SecA | 23(14) | 23(14) | ||||||
| Peptidase | 19(12) | 1(1) | 16(9) | 1(1) | 1(1) | |||
| dynein heavy chain | 17(13) | 4(2) | 1(1) | 11(9) | 1(1) | |||
| elongation factor 2 | 15(7) | 10(2) | 1(1) | 4(4) | ||||
| Palmdelphin | 14(9) | 14(9) | ||||||
| tRNA* | 14(11) | 2(1) | 10(8) | 1(1) | 1(1) | |||
| glycogen synthase | 13(1) | 13(1) | ||||||
| GTP-binding | 13(3) | 12(2) | 1(1) | |||||
| transcriptional regulator | 13(8) | 1(1) | 9(4) | 3(3) | ||||
| unc-119 homolog | 13(6) | 13(6) | ||||||
| FAT tumor suppressor homolog 3 | 12(9) | 12(9) | ||||||
| nuclear ribonucleoprotein | 12(9) | 1(1) | 10(7) | 1(1) | ||||
| 4-alpha-glucanotransferase | 9(1) | 9(1) | ||||||
| paternally expressed 3 | 8(6) | 8(6) | ||||||
| Striatin | 8(7) | 8(7) | ||||||
| Tarp | 8(1) | 8(1) | ||||||
| nuclear autoantigen | 7(6) | 7(6) | ||||||
| putative mannosyltransferase | 7(1) | 7(1) | ||||||
| Ubiquitin | 7(6) | 5(4) | 2(2) | |||||
| 26S proteasome regulatory subunit | 6(3) | 6(3) | ||||||
| cell division protein | 6(4) | 3(3) | 1(1) | |||||
| centaurin, delta 3 | 6(5) | 6(5) | ||||||
| fat tumor suppressor homolog 2 | 6(5) | 6(5) | ||||||
| glycosyl transferase | 6(5) | 6(5) | ||||||
| guanine nucleotide exchange factor | 6(6) | 6(6) | ||||||
| cytochrome c oxidase subunit VI | 5(4) | 5(4) | ||||||
| PEG3 | 5(5) | 5(5) | ||||||
| polyketide synthase | 5(4) | 3(3) | 2(1) | |||||
| polysaccharide biosynthesis protein | 5(3) | 5(3) | ||||||
| TatD-related deoxyribonuclease | 5(1) | 5(1) | ||||||
| translation initiation factor | 5(4) | 2(2) | 2(1) | 1(1) | ||||
ATP* includes ATPase, ABC transporter, ATP-binding protein, and ATP-dependent helicase; DNA* includes DNA photolyase, DNA primase, DNA repair protein, DNA-binding protein, and DNA mismatch repair protein; Kinase* includes histidine kinase, protein kinase, hexokinase, serine kinase, and fyn-related kinase; tRNA* includes tRNA synthetase, tRNA formyltransferase, and tRNA ligase.
Figure 3. Sequence logos for the KK, R4, Tir and Tarp motifs. A: the logo was built with 1705 KK motif sequences extracted from 842 CagA protein sequences; B: the logo was built with 979 R4 motif sequences extracted from 842 CagA protein sequences; C: the logo was built with 20 Tir motif sequences extracted from 20 Tir protein sequences of Escherichia coli and Citrobacter rodentium; D: the logo was built with 16 Tarp motif sequences extracted from 7 Tarp protein sequences of Chlamydia trachomatis.
Sequences containing KK and R4 motifs in known effectors
| KK Motif | Species | Protein | pY position | Locus |
|---|---|---|---|---|
| | | cagA | Y-899 | NP_207343 |
| | | cagA | Y-918 | NP_207343 |
| | | Tir | Y-477 | BAF52548 |
| | | Ankyrin | Y-1094 | ABB84853 |
| | | BepD protein | Y-28 | YP_034066 |
| | | BepE protein | Y-33 | YP_034067 |
| | | Ankyrin | Y-1024 | ABB84853 |
| R4 | | | | |
| | | cagA | Y-972 | NP_207343 |
| | | Tarp | Y-189 | YP_001654788 |
| | | Ankyrin | Y-990 | ABB84853 |
Figure 4. Workflow of the whole project.