Diego Garzón-Ospina1,2, Johanna Forero-Rodríguez1, Manuel A Patarroyo3,4.
Abstract
BACKGROUND: The merozoite surface protein 7 (MSP7) is a Plasmodium protein involved in parasite invasion; the gene encoding it belongs to a multigene family. MSP7 paralogues have been proposed to be functionally redundant; however, recent experiments suggest that they could have different roles.
Keywords: Episodic positive selection; Functional divergence; Intensified selection; Multigene family; Plasmodium; Relaxed selection; msp7
Year: 2016 PMID: 27894257 PMCID: PMC5126858 DOI: 10.1186/s12862-016-0830-x
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Fig. 1 Schematic representation of the chromosomal msp7 loci in 13 Plasmodium genomes. The genes flanking the msp7 chromosome region in Plasmodium species are represented by orange boxes. The coloured boxes within flanking sequences represent msp7 genes in each species whilst black boxes symbolise pseudogenes. The genes are given in alphabetical order from left to right. The dashed lines connect orthologous genes. All genes are represented to scale, but the distance between them is not representative. Question marks indicate orthologous relationships that could not be clearly established. In contrast to hominid-parasites, species from monkey-parasite and rodent-parasite lineages seem to have similar evolutionary histories regarding msp7 expansion. The grey boxes are lineage-specific genes found only in hominid-parasite species; they do not belong to the msp7 family (these gene representations are not to scale)
Fig. 2 msp7 gene family phylogeny inferred by the DLTRS evolutionary model. a Species tree used for generating the MSP7 tree. b MSP7 tree created by evolving down the species tree. Numbers represent different clades whilst numbers on branches are posterior probability values. Nine major clades were identified on the tree. Proteins were clustered in agreement with parasite phylogenetic relationships, clades 1 (red), 2 (yellow) and 5 (purple) being the most ancestral ones. The clades clustering genes from the monkey-parasite lineage are depicted in green, proteins from the rodent-parasite lineage in blue and the hominid-parasite lineage in grey. The P. inui species-specific duplicate was not considered in this analysis. Due to the family's complex evolutionary history (which includes gene conversion, intragenic recombination, positive and/or balancing selection) the MCMC analysis did not converge and therefore the duplication/loss rates were not obtained, even though a tree reconciliation similar to other topologies (BY and ML) was inferred
In-silico characterisation of putative MSP7 proteins
| | A | B | C | D | E | F | G | H | I | J | K | L | M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | |
| SP | y | y | y | y | y | y | y | y | y | - | y | y | y |
| MSP7_C | y | y | y | - | y | y | y | y | y | y | y | y | - |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | - | y | y | y | y | - | y | y | y |
| MSP7_C | y | y | y | - | y | y | y | y | y | y | y | y | - |
| | | | | | | | | | | | | | |
| SP | y | - | y | y | y | - | y | y | | | | | |
| MSP7_C | y | y | y | y | y | - | y | y | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | y | | | | | | | | |
| MSP7_C | y | - | y | y | y | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | - | y | y | y | | | | | | | | |
| MSP7_C | y | - | y | y | y | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | | | | | | | | | |
| MSP7_C | y | y | y | - | | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | | | | | | | | | |
| MSP7_C | y | y | y | - | | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | | | | | | | | | |
| MSP7_C | y | y | y | - | | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | | | | | | | | | |
| MSP7_C | y | y | y | - | | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | y | y | | | | | | | | | |
| MSP7_C | y | y | y | - | | | | | | | | | |
| | | | | | | | | | | | | | |
| SP | y | y | - | y | y | y | y | | | | | | |
| MSP7_C | y | y | y | y | y | y | y | y | y | | | | |
| | | | | | | | | | | | | | |
| SP | y | - | y | y | y | y | y | | | | | | |
| MSP7_C | y | y | y | y | y | y | y | y | y | | | | |
| | | | | | | | | | | | | | |
| SP | y | | | | | | | | | | | | |
| MSP7_C | y | | | | | | | | | | | | |
Eighty-three sequences between flanking genes were screened to identify a signal peptide and the characteristic MSP_7C domain (Pfam accession number: PF12948). y: proteins having a signal peptide according to the Phobius algorithm or an MSP_7C domain in a Pfam search. -: proteins that appeared not to have a signal peptide or MSP_7C domain
Fig. 3 Schematic representations of gene conversion tracks identified by the GENECONV method. Each gene is represented by a colour bar (pvivmsp7C (blue), pvivmsp7H (red), pvivmsp7I (green), pvivmsp7B (purple), pvivmsp7E (fuchsia), and pvivmsp7G (orange)); a different coloured rectangle is a graphical representation of sequence fragments potentially originating from gene conversion. Conversion tracks were mainly located in the 3′-ends. The % value refers to the similarity value for the sequence region involved in gene conversion (or intragenic recombination)
Fig. 4 Phylogenies analysed for episodic selection. Each orthologous cluster was analysed by the Branch-site REL method. The shade of each colour on branches indicates strength of selection (red shows ω >13, blue ω ≤1 and grey ω = 1). The size of each colour represents the percentage of sites in the corresponding class found by Branch-site REL. Branches have been classified as undergoing episodic diversifying selection by the p-value corrected for multiple testing using the Holm-Bonferroni method at p < 0.05. a. clade 1; b. pviv/pcynmsp7B and 7E; c. pviv/pcynmsp7C and pinumsp7B; d. pviv/pcynmsp7F and pinumsp7C; e. pviv/pcynmsp7G, pkno/pcoamsp7C and pinumsp7D; f. pviv/pcynmsp7H/7I, pkno/pcoamsp7D and pinumsp7E; g. clade 2; h. pviv/pcynmsp7L and pinumsp7H and i. clade 4. At the bottom of each phylogeny there is a scale representation of msp7s. The blue boxes represent the encoded N-terminal region, the light brown ones symbolise the central region and the purple boxes the MSP_7C domain. Numbers within boxes represent the number of codons under positive selection inferred by MEME, SLAC, FEL, REL and FUBAR methods using the Datamonkey web server
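The Holm-Bonferroni correction named in the Fig. 4 legend can be sketched as follows. This is only an illustration of the general step-down procedure, not the implementation used by the authors or by Datamonkey; the function name `holm_bonferroni` and the per-branch p-values are hypothetical and are not taken from the study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one True/False flag per test: True if the null is rejected.

    Step-down procedure: sort p-values in ascending order, compare the
    k-th smallest (k = 0, 1, ...) with alpha / (m - k), and stop at the
    first p-value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # every larger p-value is also retained
    return rejected


# Hypothetical uncorrected p-values for four branches (illustrative only).
branch_p = [0.001, 0.04, 0.03, 0.015]
print(holm_bonferroni(branch_p))
```

With these made-up values, the two smallest p-values pass their thresholds (0.05/4 and 0.05/3) and the procedure stops at the third, so only two branches would be flagged.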
In silico assessment of functional divergence between paralogous and orthologous MSP7 proteins
| # | Cluster A | Compared to cluster B | θD | LRTθD | RELAX |
|---|---|---|---|---|---|
| 1 | Clade 1 Primate-parasites | Clade 1 Rodent-parasites | 0.54 | 4.74a | NP |
| 2 | Clade 1 Primate-parasites | Clade 3d Primate-parasites (B/E) | −0.21 | 0.00 | Intensification (0.0038) |
| 3 | Clade 1 Rodent-parasites | Clade 3d Primate-parasites (B/E) | 0.78 | 15.19a | NP |
| 4 | Clade 1 Primate-parasites | Clade 3d Primate-parasites (C/H/E/D/I) | 0.34 | 4.25a | Intensification (0.00006) |
| 5 | Clade 1 Primate-parasites | Clade 3c Primate-parasites (G/C/D) | 0.90 | 14.46a | Relaxation (1) |
| 6 | Clade 1 Rodent-parasites | Clade 3c Primate-parasites (G/C/D) | 0.54 | 8.89a | NP |
| 7 | Clade 1 Rodent-parasites | Clade 3d Primate-parasites (C/H/E/D/I) | 0.37 | 13.01a | NP |
| 8 | Clade 1 Primate-parasites | Clade 2 Primate-parasites | 0.45 | 3.79 | Relaxation (0.2) |
| 9 | Clade 1 Primate-parasites | Clade 2 Rodent-parasites | 0.20 | 0.76 | NP |
| 10 | Clade 1 Rodent-parasites | Clade 2 Primate-parasites | 0.74 | 13.21a | NP |
| 11 | Clade 1 Rodent-parasites | Clade 2 Rodent-parasites | 0.93 | 24.39a | Intensification (9.1e-7) |
| 12 | Clade 2 Primate-parasites | Clade 2 Rodent-parasites | 0.87 | 12.60a | NP |
| 13 | Clade 1 Primate-parasites | Clade 4 Rodent-parasites | 0.15 | 0.14 | NP |
| 14 | Clade 1 Rodent-parasites | Clade 4 Rodent-parasites | 0.69 | 22.20a | Relaxation (0.4) |
| 15 | Clade 3d Primate-parasites (B/E) | Clade 3d Primate-parasites (C/H/E/D/I) | 0.03 | 0.14 | Intensification (0.04) |
| 16 | Clade 3d Primate-parasites (B/E) | Clade 2 Primate-parasites | 0.72 | 18.91a | Intensification |
| 17 | Clade 3d Primate-parasites (B/E) | Clade 2 Rodent-parasites | 0.31 | 4.99a | NP |
| 18 | Clade 3d Primate-parasites (B/E) | Clade 4 Rodent-parasites | 0.35 | 7.76a | NP |
| 19 | Clade 3d Primate-parasites (B/E) | Clade 3c Primate-parasites (G/C/D) | 0.37 | 3.91a | Intensification (2.8e-7) |
| 20 | Clade 2 Primate-parasites | Clade 3d Primate-parasites (C/H/E/D/I) | 0.93 | 37.86a | Intensification (0.0002) |
| 21 | Clade 3d Primate-parasites (C/H/E/D/I) | Clade 2 Rodent-parasites | 0.81 | 20.90a | NP |
| 22 | Clade 3d Primate-parasites (C/H/E/D/I) | Clade 4 Rodent-parasites | 0.72 | 15.04a | NP |
| 23 | Clade 2 Primate-parasites | Clade 4 Rodent-parasites | 1.0 | 27.64a | NP |
| 24 | Clade 3c Primate-parasites (G/C/D) | Clade 3d Primate-parasites (C/H/E/D/I) | 0.45 | 9.55a | Relaxation (0.000008) |
| 25 | Clade 2 Primate-parasites | Clade 3c Primate-parasites (G/C/D) | 1.0 | 23.68a | Relaxation (0.5) |
| 26 | Clade 3c Primate-parasites (G/C/D) | Clade 2 Rodent-parasites | 0.46 | 3.04 | NP |
| 27 | Clade 3c Primate-parasites (G/C/D) | Clade 4 Rodent-parasites | 0.44 | 3.94a | NP |
| 28 | Clade 4 Rodent-parasites | Clade 2 Rodent-parasites | 0.57 | 8.56a | Intensification (0.03) |
The coefficients of divergence (θD) and their LRT values from pairwise cluster comparisons in the msp7 multigene family. LRTθD is the (log) score for the likelihood ratio test against the null hypothesis (θD = 0) [5]. It is output by DIVERGE and follows a chi-square distribution with one degree of freedom; thus, values greater than or equal to 3.84 (a) indicate functional divergence between the two clusters compared. Selection intensity (relaxation or intensification) found by the RELAX method is shown for paralogous pairwise comparisons (see Additional file 8). Comparisons 2, 4 and 15 revealed fewer positively selected sites on test branches than on reference branches, together with an intensification of selection on negatively selected sites and a non-significant θD. Comparisons 11, 19, 20 and 28 revealed an increased proportion of positively selected sites on the test branches, with an intensification of this kind of selection, while the proportion of negatively selected sites stayed the same or decreased; the θD values for these comparisons were statistically significant. Comparisons 5 and 24 gave a statistically significant θD and relaxed selection (on test branches). NP: analysis was not performed because proteins came from different species
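The 3.84 cut-off is the 5% critical value of a chi-square distribution with one degree of freedom. As a minimal sketch of how the LRTθD column can be read, the snippet below converts a few LRT values from the table into upper-tail p-values; the function names are illustrative and are not part of DIVERGE. For one degree of freedom the upper-tail probability equals erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math

# 5% critical value for chi-square with 1 degree of freedom,
# as quoted in the table footnote.
CHI2_CRIT_1DF = 3.84


def lrt_p_value(lrt):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    if lrt <= 0:
        return 1.0
    return math.erfc(math.sqrt(lrt / 2.0))


def is_divergent(lrt, crit=CHI2_CRIT_1DF):
    """True when the LRT value reaches the footnote's significance threshold."""
    return lrt >= crit


# LRT values copied from comparisons 1, 8, 3 and 13 of the table.
examples = {1: 4.74, 8: 3.79, 3: 15.19, 13: 0.14}
for comparison, lrt in examples.items():
    print(comparison, lrt, f"{lrt_p_value(lrt):.4g}", is_divergent(lrt))
```

Comparison 8 (LRT 3.79) falls just below the threshold, which is why it carries no (a) mark in the table although its θD of 0.45 is larger than that of some significant comparisons.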