Sergio Martin Espinola, Martin Pablo Cancela, Lauís Brisolara Corrêa, Arnaldo Zaha.
Abstract
BACKGROUND: Universal stress proteins (USPs) are present in all domains of life. Their expression is upregulated in response to a large variety of stress conditions. The functional diversity found in this protein family, paired with the sequence degeneration of the characteristic ATP-binding motif, suggests a complex evolutionary pattern for the paralogous USP-encoding genes. In this work, we investigated the origin, genomic organization, expression patterns and evolutionary history of the USP gene family in species of the phylum Platyhelminthes.
Keywords: Evolutionary patterns; Flatworms; Functional divergence; Pseudogenes; Stress responsive proteins
Year: 2018 PMID: 29390964 PMCID: PMC5793430 DOI: 10.1186/s12862-018-1129-x
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Fig. 1 Organization of the USP gene family in Platyhelminthes. (a) Platyhelminthes USP genes are distributed in clusters throughout the genome and show lineage-specific expansions and losses. For each species, the total number of USP genes is given in parentheses, followed by a scheme (boxes) showing the genomic organization of some genes. Boxes with a blue frame correspond to syntenic genes in Cestoda species. The sequence alignment of the entire cluster in Cestoda parasites (50% identity at baseline, using E. granulosus as the reference sequence) displays high levels of sequence identity in the coding region. This identity is lost in comparisons with the Trematoda and Turbellaria classes, suggesting a high divergence of the USP genes between groups. Some USP paralogs in the Cestoda were identified as pseudogenes (dotted, gripped, and striped boxes). Because of the synteny, pseudogenes in T. solium and Echinococcus spp. correspond to gene losses when compared to H. microstoma. The EgrG_ps1 pseudogene refers to the EgrG_2019 sequence (WormBase Parasite annotation). Asterisks indicate the same USP distribution for E. canadensis and E. multilocularis compared to E. granulosus, and for S. haematobium and O. viverrini compared to S. mansoni and C. sinensis, respectively. (b) The presence of indels (highlighted in red in the protein sequence) generates frameshift mutations and serves as evidence of a pseudogenization process. The sequences of some pseudogenes (EgrG_ps1, HmN_ps1) are very similar to those of their orthologs. Others (EgrG_ps2) are very different, and homologies are difficult to identify accurately by tBLASTn. In addition to the protein alignment, the indels are indicated in the coding sequence for three pseudogenes.
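The link between indels and frameshifts described for panel b can be illustrated with a minimal sketch. It assumes a pairwise alignment of two coding sequences in which gaps are written as `-`, and it flags only gap runs whose length is not a multiple of three, since those are the ones that shift the reading frame. The function name and the toy sequences are hypothetical and are not taken from the paper.

```python
import re


def frameshift_indels(aligned_ref: str, aligned_query: str):
    """Return sorted (start, length) pairs for gap runs that break the frame.

    Both inputs are rows of the same pairwise alignment of coding sequences.
    A gap run in either row is an indel; it causes a frameshift only when
    its length is not a multiple of three.
    """
    hits = []
    for row in (aligned_ref, aligned_query):
        for gap in re.finditer(r"-+", row):
            length = gap.end() - gap.start()
            if length % 3 != 0:
                hits.append((gap.start(), length))
    return sorted(hits)


# Hypothetical toy alignment: a 1-nt deletion in the query shifts the frame.
ref = "ATGGCTAAAGGTTGA"
one_nt_deletion = "ATGGC-AAAGGTTGA"
codon_deletion = "ATG---AAAGGTTGA"

print(frameshift_indels(ref, one_nt_deletion))
print(frameshift_indels(ref, codon_deletion))
```

An in-frame (three-nucleotide) deletion is not reported, which is why indel length matters more than indel count when judging whether a paralog is likely a pseudogene.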
Fig. 2 Phylogenetic relationships between USPs in the Platyhelminthes. The maximum-likelihood phylogenetic tree (method aLRT-SH for branch support, see text for more details) shows several USP sequences shared by platyhelminthes and annelids, referred to as ancestral USP genes (highlighted in yellow). On the other hand, the Trematoda and Cestoda classes (highlighted in green) show species-specific expansions and losses (one asterisk indicates losses in the Taeniidae family, while two asterisks represent losses in the genus Echinococcus). For simplification (and to facilitate associations with Fig. 1), we grouped the Cestoda sequences according to the IDs of E. granulosus based on the ortholog relationship (see Additional file 1: Table S1). Gene names are in italic. Species prefixes are as follows: CapteP for C. teleta, HelroG for H. robusta, Gsa for G. salaris, Mli for M. lignano, SMU for S. mediterranea, Smp for S. mansoni, Csin for C. sinensis, TsM for T. solium, HmN for H. microstoma, and EgrG for E. granulosus. The numbers in parentheses beside SMU and Mli correspond to the number of collapsed sequences in S. mediterranea and M. lignano, respectively. The USPs clustered in the chromosome (see Fig. 1) are also grouped together in the phylogenetic tree, suggesting an origin by subsequent tandem duplications. Identical sequences from M. lignano (~ 35) were excluded from the analysis. Three molluscs (L. gigantea, C. gigas, and O. bimaculoides) and two annelids (H. robusta and C. teleta) were used as outgroups. A minor ID for M. lignano and G. salaris was used (Additional file 1: Table S5). For an extended tree, see Additional file 3: Figures S1 and S2. Branch support values obtained by Bayesian Inference are in bold font. Only branch support values greater than 0.7 are shown.
Fig. 3 Relationship between USP sequence modifications and gene expression patterns. (a) The gene expression profile is highly variable between the different life cycle stages of Cestoda and Trematoda species: some USP genes show null or very low expression (EgrG_08734, EgrG_08735 and orthologs), others are expressed in a specific manner (EgrG_08111, EgrG_07797 and orthologs), and others are constitutively expressed (EgrG_08738 and orthologs). ID numbers in red refer to USP proteins that exhibit modifications in the ATP binding motif. The asterisk refers to the “ancestral” USPs (see Fig. 2). (b) USP sequence variations in the Platyhelminthes. On top, sequence logo generated with all USP sequences without changes to the ATP binding motif [Gx2Gx9G(S/T)]. Amino acids interacting with ligands are shown in boxes. Below, alignment of USP protein sequences showing modifications in the protein motif. ID numbers (in red) refer to USP proteins for which RNA-seq data were available, to facilitate the comparison between sequence modification and gene expression patterns. Modifications in the [Gx2Gx9G(S/T)] motif and at other sites known to be involved in ligand interaction are highlighted in red and yellow, respectively. The sequence of the Methanococcus jannaschii USP MJ0577 was used as a reference (starting from position 6), with the ATP binding motif highlighted in green. The residues under functional divergence are indicated by arrows (black arrows: residues shared by Cestoda and Trematoda; gray arrows: residues specific to the Cestoda or Trematoda) (see Table 1). (c) qPCR gene expression analysis of USP genes in the pre-adult form of E. ortleppi. Some genes (Eo_08736, Eo_08738, Eo_07797, and Eo_10769) showed higher levels of gene expression than others (Eo_08734 and Eo_08735), in line with previously published RNA-seq data for the genus Echinococcus (see above). Asterisks indicate a p-value < 0.01 for the comparison of Eo_08734 and Eo_08735 with the other genes.
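The ATP binding motif written as Gx2Gx9G(S/T) in the caption can be read as a simple pattern: a glycine, any two residues, a glycine, any nine residues, a glycine, then serine or threonine. The sketch below shows one way to screen protein sequences for an intact motif with a regular expression. It is only an illustration of the notation, not the procedure used by the authors, and the function name and toy sequences are hypothetical.

```python
import re

# Gx2Gx9G(S/T): G, any 2 residues, G, any 9 residues, G, then S or T.
ATP_MOTIF = re.compile(r"G.{2}G.{9}G[ST]")


def find_atp_motif(protein: str):
    """Return (start, matched_text) for the first motif hit, or None."""
    match = ATP_MOTIF.search(protein.upper())
    return (match.start(), match.group()) if match else None


# Hypothetical toy sequences, not real USP proteins.
intact = "MKV" + "GSDGSEASLRALEGS" + "LL"
degenerate = "MKV" + "GSDGSEASLRALEAS" + "LL"  # third glycine replaced by A

print(find_atp_motif(intact))
print(find_atp_motif(degenerate))
```

A sequence returning `None` would correspond to the proteins flagged in red in the figure, whose motif carries at least one modification. A real screen would normally work on aligned sequences so that the modified positions can be located, which this sketch does not attempt.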
Positive selection analysis of USP genes for species of class Cestoda
| Model | lnL | Estimates of parameters | Positive selected sites (PSS)a |
|---|---|---|---|
| M0 (one-ratio) | −17568.832757 | | None |
| M1a (neutral) | −16175.637618 | | Not allowed |
| M2a (selection) | −16175.637618 | | |
| M7 (beta) | −16106.206229 | | Not allowed |
| M8 (beta & ω) | −16028.140026 | | |
a Positively selected sites (Bayes Empirical Bayes, BEB) are inferred at a posterior probability cutoff of P ≥ 95%. Values for P ≥ 99% are shown in bold font. The underlined PSS indicate a value of 3 (range from 1, positive selection, to 7, purifying selection) obtained with SELECTON (see Additional file 1: Table S4). Amino acid sites correspond to the reference sequence MJ0577 from Methanococcus jannaschii.
b Despite the presence of positively selected sites (24 sites with P ≥ 50%, 1 site with P ≥ 95%), the LRT was not significant when comparing the log-likelihood scores from the M1a and M2a models.
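The likelihood ratio test mentioned in footnote b can be sketched from the log-likelihood values listed in the table. The statistic is twice the difference in log-likelihood between the nested models. The sketch assumes the conventional choice of 2 degrees of freedom for the M1a vs M2a and M7 vs M8 comparisons, which the source does not state, and uses the fact that the chi-square survival function with 2 degrees of freedom is exp(-x/2), so no statistics library is needed. The function name is hypothetical.

```python
import math

# Log-likelihoods copied from the positive selection table.
LNL = {
    "M1a": -16175.637618,
    "M2a": -16175.637618,
    "M7": -16106.206229,
    "M8": -16028.140026,
}


def lrt(lnl_null: float, lnl_alt: float, df: int = 2):
    """Return (2 * delta lnL, p-value) for a pair of nested models.

    Assumption: df = 2. The closed-form p-value exp(-x / 2) is the
    chi-square survival function for exactly 2 degrees of freedom.
    """
    if df != 2:
        raise ValueError("closed-form p-value implemented for df=2 only")
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


for null, alt in (("M1a", "M2a"), ("M7", "M8")):
    stat, p = lrt(LNL[null], LNL[alt])
    print(f"{null} vs {alt}: 2*dlnL = {stat:.3f}, p = {p:.3g}")
```

With identical log-likelihoods for M1a and M2a the statistic is zero, which agrees with the non-significant result reported in footnote b. The M7 vs M8 comparison gives a large statistic under the same assumptions.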
Functional divergence analysis (Type I) within Cestoda and Trematoda species
| Cluster 1 | Cluster 2 | θ ± SE | LRT | Sites (Qk > 0.9)c |
|---|---|---|---|---|
| Cestoda comparisonsa | | | | |
| Ce11_07258 | Ce12_09018 | 0.70 ± 0.14 | 23.04 | 28,31,37,44,94,98,124 |
| Ce10_10769 | Ce12_09018 | 0.78 ± 0.17 | 20.59 | 14, |
| Ce11_07258 | Ce10_10769 | 0.95 ± 0.16 | 31.63 | Almost all |
| Ce4_08735 | Ce3_09839 | 0.97 ± 0.14 | 45.46 | Almost all |
| Ce4_08735 | Ce5_08738 | 0.98 ± 0.17 | 32.99 | Almost all |
| Ce4_08735 | Ce6_08734 | 0.91 ± 0.14 | 40.82 | Almost all |
| Ce4_08735 | Ce8_08736 | 1.36 ± 0.15 | 80.72 | Almost all |
| Ce5_08738 | Ce6_08734 | 0.80 ± 0.18 | 18.96 | |
| Ce5_08738 | Ce8_08736 | 0.64 ± 0.14 | 18.81 | |
| Ce6_08734 | Ce8_08736 | 0.79 ± 0.14 | 29.00 | 19, |
| Trematoda comparisonsb | | | | |
| Tr4_CsinT265 | Tr3_CsinT265 | 0.64 ± 0.20 | 9.84 | None |
| Tr3_CsinT265 | Tr5_CsinT265 | 0.75 ± 0.21 | 12.58 | 134 |
| Tr4_CsinT265 | Tr5_CsinT265 | 0.99 ± 0.24 | 16.21 | Almost all |
| Tr2_SmpA | Tr3_CsinT265 | 0.94 ± 0.13 | 45.93 | Almost all |
| Tr2_SmpA | Tr5_CsinT265 | 0.68 ± 0.17 | 15.36 | |
| Tr2_SmpA | Tr1_SmpA | 0.69 ± 0.18 | 14.17 | 12,26,102 |
| Tr7_CsinT265SmpA | Tr1_SmpA | 0.68 ± 0.19 | 12.56 | 12,26,102 |
| Tr6_CsinT265SmpA | Tr9_CsinT265SmpA | 0.84 ± 0.19 | 17.70 | 12, |
a Cestoda clusters are defined based on the E. granulosus IDs, e.g. the cluster Ce3_09839 is composed of the EgrG_09839, EmuJ_09839, Ecan_08199, TsM_09222, and HmN_01226 sequences.
b Trematoda clusters are described as follows: Tr1 (Smp_136870, A_04288, Smp_136890, A_06342), Tr2 (Smp_043120, A_03767, Smp_202690, A_04393), Tr3 (Csin107893, T265_02180, CsinSc585new, T265_02179), Tr4 (Csin107892, Csin110039, Csin107891, T265_02178a, T265_02178b, T265_02177), Tr5 (Csin107894, T265_02181, Csin107895, T265_02182), Tr6 (Smp_076400, A_07834, Csin112002, T265_05585), Tr7 (Smp_031300, A_04567, Csin112617, T265_03499), Tr9 (Smp_001000, A_07787, Smp_200240, A_05680).
c Amino acid sites correspond to the reference sequence MJ0577 from Methanococcus jannaschii. Amino acids shared by Cestoda and Trematoda species are indicated in bold font and plotted in Fig. 3b. Underlined sites correspond to positively selected sites detected with PAML (Table 1).
Fig. 4 Possible evolutionary fates for the USP paralogs in Platyhelminthes parasites. First, a new USP copy can accumulate deleterious mutations, leading to alterations in the protein sequence and a loss of function (pseudogenization). Some ncRNAs can be transcribed from such pseudogenes and thus regulate the expression of the other USP paralogs via mRNA degradation (regulation by ncRNAs). Second, the USP paralog could undergo several non-synonymous mutations, thereby acquiring a new function (neofunctionalization). Finally, some USP copies could maintain the same function, but be expressed in a specific life cycle stage or in response to a specific stressor (subfunctionalization).