Petar Petrov, Riikka Syrjänen, Jacqueline Smith, Maria Weronika Gutowska, Tatsuya Uchida, Olli Vainio, David W Burt.
Abstract
"Trojan" is a leukocyte-specific cell surface protein originally identified in the chicken. Its molecular function has been hypothesized to be related to anti-apoptosis and the proliferation of immune cells. The Trojan gene has been localized to the Z sex chromosome. The two adjacent genes also show significant homology to Trojan, suggesting the existence of a novel gene/protein family. Here, we characterize this Trojan family, identify homologues in other species and predict evolutionary constraints on these genes. The two Trojan-related proteins in chicken were predicted to be a receptor-type tyrosine phosphatase and a transmembrane protein bearing a cytoplasmic immunoreceptor tyrosine-based activation motif (ITAM). We identified the Trojan gene family in ten other bird species and found related genes in three reptiles and a fish species. The phylogenetic analysis of the homologues revealed a gradual diversification among the family members. Evolutionary analyses of the avian genes predicted that the extracellular regions of the proteins have been subjected to positive selection. Such selection was possibly a response to evolving interacting partners or to pathogen challenges. We also observed an almost complete lack of intracellular positively selected sites, suggesting a conserved signaling mechanism of the molecules. Therefore, the contrasting patterns of selection likely correlate with the interaction and signaling potential of the molecules.
Year: 2015 PMID: 25803627 PMCID: PMC4372362 DOI: 10.1371/journal.pone.0121672
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The Trojan family in chicken.
A) Mystran, Trojan and Thracian on chicken chromosome Z. Genes are represented as hollow boxes showing their direction. Exons are shown as filled fragments within the gene boxes. B) The overall topological organization of the Trojan family proteins. Complement control protein (CCP) domains, fibronectin type III domains (FN3) and protein tyrosine phosphatase domains (PTP) are labeled. Signal peptides (SP), domains, transmembrane regions (TM), N-glycosylation (N-glyc) sites and intrinsically disordered (ID) binding sites are indicated. The Mystran CCP domain is shown in gray scale, as it was predicted slightly below threshold but had the expected position. C) The cytoplasmic tails of Mystran, Trojan and Thracian are shown in a “snake” amino acid view. Short functional motifs are indicated: MAPK docking motif (DOC_MAPK), Grb association motif (LIG_SH_GRB), 14-3-3 docking motif (LIG_14-3-3), PKA phosphorylation motif (MOD_PKA), ITAM (LIG_TYR_ITAM) and TRAF2 interacting motif (LIG_TRAF2). For Thracian, the positions of cytoplasmic sites identified to be under positive evolutionary selection with probability higher than 90% and 95% (*) are indicated.
Fig 2. Expression of chicken Mystran, Trojan and Thracian.
The relative expression levels of Mystran, Trojan and Thracian genes are presented as RNASeq reads from different organs, tissues, or cell types. Data from Ensembl 75.
Trojan family genes in avian and non-avian species.
| Gene name | Database ID (NCBI) | Reference Number (NCBI) | Predicted gene coordinates | Orientation |
|---|---|---|---|---|
| Avian species | | | | |
| | scaffold3198 | NW_004679491.1 | 2101–19824 | plus |
| | scaffold3198 | NW_004679491.1 | 25334–21537 | minus |
| | scaffold3514 | NW_004679800.1 | 7158–15026 | plus |
| | scaffold139 | KK719755.1 | 24708–6970 | minus |
| | scaffold140 | KK717827.1 | 1362390–1363028 → | plus |
| | scaffold139 | KK719755.1 | 1–4804 | plus |
| | scaffold140 | KK717827.1 | 1352341–1358982 | plus |
| | scaffold483 | KL447474.1 | 1797517–1822404 | plus |
| | scaffold483 | KL447474.1 | 1832626–1824723 | minus |
| | scaffold483 | KL447474.1 | 1847004–1837135 | minus |
| | scaffold483 | KL447474.1 | 1859639–1850416 | minus |
| | C10295565_1 | NW_004936052.1 | 2470–1 → | minus |
| | scaffold348_1 | NW_004930102.1 | 31933–21297 | minus |
| | scaffold348_1 | NW_004930102.1 | 11621–18923 | plus |
| | scaffold348_1 | NW_004930102.1 | 993–5823 | plus |
| | N00377 | NW_004775827.1 | 44655–14629 | minus |
| | N00377 | NW_004775827.1 | 4801–11236 | plus |
| | N00129 | NW_004775826.1 | 31992–3152 | minus |
| | C13346903 | JH749265.1 | 2–898 → | plus |
| | C13853812 | JH742003.1 | 1–3519 → | plus |
| | scaffold4670 | JH740780.1 | 1–6016 | plus |
| | scaffold1509 | JH740316.1 | 141695–144904 | plus |
| | scaffold1509 | JH740316.1 | 130683–139060 | plus |
| | Chromosome Z | NC_015041.1 | 9304150–9324725 | plus |
| | scf900160276923 | JH556470.1 | 57058–37500 | minus |
| | scf900160276923 | JH556470.1 | 29926–35443 | plus |
| | scf900160274638 | JH554185.1 | 950–1 → | minus |
| | scf900160259551 | JH539098.1 | 1243–326 | minus |
| | scaffold569 | KK736078.1 | 1320257–1344925 | plus |
| | scaffold569 | KK736078.1 | 1354221–1347799 | minus |
| | scaffold569 | KK736078.1 | 1365487–1356898 | minus |
| | Chromosome Z | NC_011493.1 | 39643887–39669696 | plus |
| | Chromosome Z | NC_011493.1 | 39676959–39672793 | minus |
| | Chromosome Z | NC_011493.1 | 39688288–39679555 | minus |
| Non-avian species | | | | |
| | chrUn0393 | GL343585.1 | 174373–223856 | plus |
| | chrUn0393 | GL343585.1 | 434289–412843 | minus |
| | scaffold1093 | KB480077.1 | 2501–66524 | plus |
| | Scfld2946 | JH586667.1 | 14548–23778 → | plus |
| | Scfld1664 | JH585500.1 | 1–41787 | plus |
| | Scfld6634 | JH589385.1 | 1221–4974 → | plus |
| | Scfld1664 | JH585500.1 | 105715–50952 | minus |
| | scaffold00761 | JH127322.1 | 548001–727560 | plus |
| | scaffold00761 | JH127322.1 | 474809–356096 | minus |
Gene names combine the respective homologue: Mystran (MYS), Trojan (TRO), Thracian (THR), Protein phosphatase (PP) or Transmembrane protein (TP) and the species abbreviation. Avian species: A. platyrhynchos (ANAPL), C. brachyrhynchos (CORBR), C. canorus (CUCCA), F. peregrinus (FALPE), F. albicollis (FICAL), G. fortis (GEOFO), M. gallopavo (MELGA), M. undulatus (MELUN), O. hoazin (OPPHO), T. guttata (TAEGU); Non-avian species: A. carolinensis (ANOCA), C. mydas (CHEMY), C. picta (CHRPI), L. chalumnae (LATCH). An arrow indicates a gene found on more than one scaffold and the direction the scaffolds were combined.
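In the table above, the order of the two coordinates appears to track the orientation column: plus-strand genes are written low to high and minus-strand genes high to low. The following is a minimal sketch of reading one coordinate cell under that assumption. The helper name `parse_coordinates` and the 1-based inclusive interpretation of the coordinates are illustrative assumptions, not part of the source.

```python
import re

# Hypothetical helper (not from the source): parses one cell of the
# "Predicted gene coordinates" column, e.g. "25334–21537" or "2470–1 →".
_COORD = re.compile(r"^\s*(\d+)\s*[–-]\s*(\d+)\s*(→)?\s*$")


def parse_coordinates(cell):
    """Return start, end, inferred orientation and span for a coordinate cell.

    Assumes 1-based inclusive coordinates, and that a first coordinate larger
    than the second denotes a minus-strand gene.
    """
    match = _COORD.match(cell)
    if match is None:
        raise ValueError(f"unrecognized coordinate cell: {cell!r}")
    first, second = int(match.group(1)), int(match.group(2))
    return {
        "start": min(first, second),
        "end": max(first, second),
        "orientation": "plus" if first <= second else "minus",
        # Span on the listed scaffold only; genes marked with an arrow
        # continue on another scaffold, so this is not the full gene length.
        "span": abs(second - first) + 1,
        "multi_scaffold": match.group(3) is not None,
    }


if __name__ == "__main__":
    for cell in ("2101–19824", "25334–21537", "2470–1 →"):
        print(cell, parse_coordinates(cell))
```

Comparing the inferred orientation with the table's own orientation column is a quick consistency check when the rows are re-entered by hand.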
Fig 3. Trojan family proteins in other avian and non-avian species.
Domain types, other topology properties and short functional motifs are shown in the legend. Domains presented in gray were predicted below threshold, but had the expected type, position and relative size. A) Avian species. B) Non-avian species.
Gene conversion analyses for the Trojan family in avian species.
| Sequence I | Sequence II | BC KA P-value | Fragment in Sequence I | Fragment in Sequence II |
|---|---|---|---|---|
| MYS_ANAPL | TRO_ANAPL | 3.93E-002 | 802–1176 (375) | 607–1008 (402) |
| TRO_ANAPL | THR_ANAPL | 9.20E-002 | 358–661 (304) | 604–909 (306) |
| MYS_CORBR | TRO_CORBR | 9.80E-002 | 919–3465 (2547) | 706–1447 (742) |
| TRO1_CUCCA | TRO2_CUCCA | 9.82E-002 | 376–1335 (960) | 385–1290 (906) |
| TRO2_CUCCA | THR_CUCCA | 1.17E-001 | 910–1435 (526) | 1174–1495 (322) |
| MYS_GALGA | TRO_GALGA | 5.86E-002 | 595–1641 (1047) | 352–1383 (1032) |
| MYS_GEOFO | TRO_GEOFO | 2.42E-001 | 61–531 (471) | 274–771 (498) |
| MYS_OPHHO | TRO_OPHHO | 1.98E-002 | 559–867 (309) | 316–618 (303) |
The gene converted fragments between sequence pairs (Sequence I and Sequence II) are given with respect to their unaligned offsets and lengths within each sequence. “BC KA P-values”: Bonferroni-corrected KA (BLAST-like P-values). Names combine Mystran (MYS), Trojan (TRO) or Thracian (THR) and the corresponding species abbreviation. Species: A. platyrhynchos (ANAPL), C. brachyrhynchos (CORBR), C. canorus (CUCCA), G. gallus (GALGA), G. fortis (GEOFO), O. hoazin (OPPHO).
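Each fragment in the gene conversion table is given as an offset range followed by its length in parentheses. As a small sanity check, the sketch below verifies that the stated length equals `end - start + 1`, i.e. that the offsets are inclusive. The helper name `fragment_is_consistent` is hypothetical, and the inclusive-offset reading is an assumption that the table's numbers happen to satisfy.

```python
import re

# Hypothetical helper (not from the source): checks a fragment cell such as
# "802–1176 (375)" for agreement between its offsets and its stated length.
_FRAGMENT = re.compile(r"^\s*(\d+)\s*[–-]\s*(\d+)\s*\((\d+)\)\s*$")


def fragment_is_consistent(cell):
    """True if the stated length equals end - start + 1 (inclusive offsets)."""
    match = _FRAGMENT.match(cell)
    if match is None:
        raise ValueError(f"unrecognized fragment cell: {cell!r}")
    start, end, length = (int(g) for g in match.groups())
    return end - start + 1 == length


# Fragment pairs copied from the table (Sequence I, Sequence II).
FRAGMENTS = [
    ("802–1176 (375)", "607–1008 (402)"),
    ("358–661 (304)", "604–909 (306)"),
    ("919–3465 (2547)", "706–1447 (742)"),
    ("376–1335 (960)", "385–1290 (906)"),
    ("910–1435 (526)", "1174–1495 (322)"),
    ("595–1641 (1047)", "352–1383 (1032)"),
    ("61–531 (471)", "274–771 (498)"),
    ("559–867 (309)", "316–618 (303)"),
]

if __name__ == "__main__":
    for first, second in FRAGMENTS:
        print(first, fragment_is_consistent(first),
              second, fragment_is_consistent(second))
```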
Fig 4. ML tree of the Trojan family members from all species.
Mystrans are shown in blue, Trojans are shown in green and Thracians are shown in red. Major groups of homologues are enclosed within gray boxes. The tree is rooted to L. chalumnae and bootstrap values are indicated at nodes. Gene names combine the respective orthologue: Mystran (MYS), Trojan (TRO), Thracian (THR), Protein phosphatase (PP) or Transmembrane protein (TP) and species abbreviations. Avian species: A. platyrhynchos (ANAPL), C. brachyrhynchos (CORBR), C. canorus (CUCCA), F. peregrinus (FALPE), F. albicollis (FICAL), G. fortis (GEOFO), M. gallopavo (MELGA), M. undulatus (MELUN), O. hoazin (OPPHO), T. guttata (TAEGU); Non-avian species: A. carolinensis (ANOCA), C. mydas (CHEMY), C. picta (CHRPI), L. chalumnae (LATCH).
Positively selected sites in the Trojan gene family.
| Gene | LL test (M8A vs M8) | Sites with probability >90% |
|---|---|---|
| Mystran | 2ΔL = 109.8; P-value = 1.1E−25; ω = 2.1 (16.5%) | 4Q*, 6A*, 23H**, 24D, 28G*, 30Y*, 32G**, 33Y**, 34S, 44D**, 49R*, 54T**, 56A*, 84G*, 86D**, 89K, 90P*, 92Y, 163A, 165E, 166K**, 168A*, 169L**, 170D*, 172D*, 173G, 175I*, 179T, 181Q**, 188N, 194Q, 195T*, 251S*, 288S*, 290R*, 296A*, 300K, 309R*, 322R*, 338Q, 344H*, 361T*, 366T*, 384S*, 397G*, 399P, 457S*, 462P*, 498G, 508A, 511S*, 531I* |
| Trojan | 2ΔL = 13.5; P-value = 0.00024; ω = 1.5 (28.1%) | 255A, 314T*, 316G**, 319H*, 321C, 324L, 326L, 327D*, 430S* |
| Thracian | 2ΔL = 120.2; P-value = 5.7E−28; ω = 9.2 (7.2%) | 26G*, 27A**, 28G*, 29A**, 30V*, 33K**, 34T**, 35E*, 36E**, 41E**, 48L, 87K**, 93G**, 94L*, 96A*, 190T**, 196A*, 465S*, 481A |
Amino acids from chicken Mystran, Trojan and Thracian are listed when their Bayesian posterior probability of belonging to the site class under positive selection exceeds 90%. Probability: >90%, >95% (*) or >99% (**), as inferred by Bayes-Empirical-Bayes (BEB).
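The likelihood-ratio statistic 2ΔL in the table is converted to a P-value by comparison with a reference distribution. The source does not state which one was used; the sketch below assumes a chi-square distribution with one degree of freedom, which reproduces the three reported P-values to the precision given. The function name `lrt_p_value` is illustrative.

```python
import math


def lrt_p_value(two_delta_l):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    For 1 df the survival function has a closed form:
    P(X > x) = erfc(sqrt(x / 2)).
    The choice of 1 df is an assumption; it is not stated in the source.
    """
    if two_delta_l < 0:
        raise ValueError("2*deltaL must be non-negative")
    return math.erfc(math.sqrt(two_delta_l / 2.0))


# 2*deltaL values copied from the table (M8A vs M8).
STATISTICS = {"Mystran": 109.8, "Trojan": 13.5, "Thracian": 120.2}

if __name__ == "__main__":
    for gene, statistic in STATISTICS.items():
        print(f"{gene}: 2*deltaL = {statistic}, P ~ {lrt_p_value(statistic):.2g}")
```

Some analyses of the M8A versus M8 comparison instead use a 50:50 mixture of a point mass at zero and a one-degree-of-freedom chi-square, which would halve these values; the reported numbers match the unhalved form.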
Fig 5. Evolutionary selection of the Trojan family members in chicken.
Amino acid posterior mean ω values are mapped onto the protein topologies. Non-selected sites are shown in blue, selected sites with probability below 90% are shown in light blue and selected sites with probability greater than 90% are shown in orange. Sites with probability greater than 95% and 99% are indicated by one or two red dots, respectively. Domain types and other topology properties are shown in the legend. The Mystran CCP domain is shown in gray scale, as it was predicted slightly below threshold but had the expected position.
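The probability classes marked in Fig 5 correspond to the asterisk notation used in the table of positively selected sites (no mark for >90%, `*` for >95%, `**` for >99%). A minimal sketch of parsing that notation and tallying sites per class follows; the helper name `parse_site` and the returned tuple layout are illustrative choices, not part of the source.

```python
import re
from collections import Counter

# Hypothetical helper (not from the source): parses tokens such as "316G**".
_SITE = re.compile(r"^(\d+)\s*([A-Z])(\*{0,2})$")
_PROBABILITY_CLASS = {"": ">90%", "*": ">95%", "**": ">99%"}


def parse_site(token):
    """Split a site token into (position, residue, probability class)."""
    match = _SITE.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized site token: {token!r}")
    position, residue, stars = match.groups()
    return int(position), residue, _PROBABILITY_CLASS[stars]


# Trojan sites copied from the table of positively selected sites.
TROJAN_SITES = "255A, 314T*, 316G**, 319H*, 321C, 324L, 326L, 327D*, 430S*"

sites = [parse_site(token) for token in TROJAN_SITES.split(",")]
counts = Counter(probability for _, _, probability in sites)

if __name__ == "__main__":
    for probability in (">90%", ">95%", ">99%"):
        print(probability, counts[probability])
```

Each class here is exclusive: a site listed under `>99%` is not also counted under `>95%` or `>90%`.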
Fig 6. Co-evolutionary analyses of Trojan family members.
A) Intermolecular co-evolution between Trojan, Mystran and Thracian. Positions of co-evolving amino acids are mapped onto the chicken protein topologies. Correlation coefficients are indicated between each pair of residues. Coordinates on the polypeptide chain are indicated for each domain and transmembrane region. Domain types and other topology properties are shown in the legend. B) Intramolecular co-evolution within chicken Mystran, Trojan and Thracian. Numerical values indicate the protein region to which the majority of network residues are confined.