MAPPINGS, a tool for network analysis of large phospho-signalling datasets: application to host erythrocyte response to Plasmodium infection.

Jack Adderley¹, Finn O'Donoghue², Christian Doerig¹, Stephen Davis².

Abstract

Large datasets of phosphorylation interactions are constantly being generated, but deciphering the complex network structure hidden in these datasets remains challenging. Many phosphorylation interactions occurring in human cells have been identified and constitute the basis for the known phosphorylation interaction network. We overlaid onto this network phosphorylation datasets obtained from an antibody microarray approach aimed at determining changes in phospho-signalling of host erythrocytes during infection with the malaria parasite Plasmodium falciparum. We designed a pathway analysis tool denoted MAPPINGS that uses random walks to identify chains of phosphorylation events occurring much more or much less frequently than expected. MAPPINGS highlights pathways of phosphorylation that work synergistically, providing a rapid interpretation of the most critical pathways in each dataset. MAPPINGS confirmed several signalling interactions previously shown to be modulated by infection, and revealed additional interactions which could form the basis of numerous future studies. The MAPPINGS analysis strategy described here is widely applicable to comparative phosphorylation datasets in any context, such as response of cells to infection, treatment, or comparison between differentiation stages of any cellular population.

Keywords: Computational network analysis; Host-pathogen interactions; Malaria; Signalling

Year: 2022 PMID: 35909628 PMCID: PMC9325900 DOI: 10.1016/j.crmicr.2022.100149

Source DB: PubMed Journal: Curr Res Microb Sci ISSN: 2666-5174

Introduction

Protein phosphorylation is one of the most prominent among several post-translational modifications which alter the functionality of affected proteins. Protein phosphorylation is carried out by protein kinases, a family of enzymes that transfer the gamma phosphate of adenosine triphosphate (ATP) onto hydroxyl groups of amino acids (Fabbro et al., 2015). In humans, protein kinases make up approximately 2% of encoded genes, yet over 50% of all proteins are phosphorylated across >200,000 known phosphorylation sites (Taylor and Kornev, 2011, Endicott et al., 2012); the inclusion of putative phosphorylation would put the number of potential phosphorylation sites in the human proteome closer to a million (see www.phosphonet.ca). Therefore, it is not surprising that kinases play essential roles in a plethora of intra- and extracellular processes (Cohen, 2002, Kannan et al., 2007, Walsh et al., 2005). The dynamic activation of kinases, along with the activity of protein phosphatases (enzymes which remove the phosphate groups from proteins), enables the fine control of numerous cellular functions ranging from metabolism, transcription and translation to protein transport and cell growth, division and differentiation (Endicott et al., 2012, Cargnello and Roux, 2011). The capacity of protein kinases to act as substrates of other protein kinases underlies a complex interconnected web of signal transduction.

In a recent study, Olow et al. mapped a large number of these phosphorylation interactions in human cells, cataloguing 1733 functionally interconnected proteins into a network denoted as PhosphoAtlas (Olow et al., 2016). This seminal work enabled large phosphorylation datasets to be mapped into sophisticated networks, and has allowed for the exploration of how each phosphorylation on a given protein can impact its immediate neighbours in the network. Additionally, datasets that report on phosphorylation changes under various conditions (e.g. infection of a cell by a pathogen or treatment with a drug) can be analysed holistically to provide a more detailed understanding of how particular intracellular conditions impact on the phosphorylation-based signalling environment.

Large amounts of signalling information can be obtained through microarrays of phospho-specific antibodies, which report on phosphorylation changes across a large number of proteins. Antibody microarrays are highly sensitive quantitative tools capable of analysing a sample without complicated or expensive enrichment protocols, which is one of the drawbacks of mass spectrometry-based techniques (reviewed in (Yue and Pelech, 2018)). This positions the antibody microarray as an ideal system to assess the signalling environment inside human cells in disease and infection settings. We have applied microarrays in our efforts to understand how intracellular pathogens alter their host cells during development and identified a number of key host proteins essential to the proliferation of various pathogens (Adderley et al., 2020, Haqshenas et al., 2019, Haqshenas et al., 2017). However, the datasets obtained from antibody microarray experiments are complex and difficult to interpret holistically. Currently, typical analysis consists of identifying the key phosphorylation/protein changes manually, and the focus of any follow-up analysis is often based on the phosphorylation events that cause the largest changes observed. This often leads to less obvious changes being disregarded, despite the potential that some of these changes may play important signalling roles.

To address this fundamental problem, we have developed a computational technique named MAPPINGS (Mapping and Analysis of Phosphorylation Pathways Identified through Network/Graph Signalling) that simulates and records random walks that 'traverse' through an observed phospho-signalling network (available at https://github.com/FinnOD/mappings).
The walks are not true random walks because additional rules are imposed that prevent repeated use of the same edge (such walks are called trails). The edges are weighted by the magnitude of the observed phosphorylation differences between the datasets (in our case, infected versus non-infected erythrocytes), so that the algorithm tends to generate paths associated with large negative (infection causes a decrease in the signal) or positive (infection causes an increase in the signal) differences (Fig. 1). This strategy was formulated using PhosphoAtlas (Olow et al., 2016) as the underlying network, which was further curated to include phosphorylation connections relevant to the antibodies present on the microarray. We applied our computational technique to our published datasets on the blood-stage development of the human malaria parasite, Plasmodium falciparum (Adderley et al., 2020). These consisted of a control dataset (uninfected red blood cells), and three datasets obtained from erythrocytes infected with the parasite at the three major distinguishable stages of parasite development (see our original study for more detail (Adderley et al., 2020)).

Fig. 1

Overview of data generation, processing, modelling and output using antibody microarrays and MAPPINGS. 1) Sample generation: the samples used for the development of the network analysis approach were human red blood cells following infection with Plasmodium falciparum. 2) Application of the samples onto the antibody microarrays. 3) Selection of reliable antibodies based on the signal intensity observed, the error observed between the replicates and removal of signals identified as cross-reactive to parasite material (see Methods for more detail). 4) The reliable signals are split into positive (green) and negative (red) networks and are mapped to the known global phosphorylation interactions. 5) The path analysis simulates and records large numbers of modified random walks (see main text) where paths are weighted by the observed intensity differences between infected and healthy red blood cells. Separate analyses were performed using the positive and negative values of the log2 fold change for the edge weights to tease apart signalling pathways enhanced (positive weights) or suppressed (negative weights) by the parasite. 6) Output is captured as an edge list consisting of comma-separated values (.csv) which can be visualised in programs such as Cytoscape (Shannon et al., 2003). Here we have represented the more frequent edges for the positive network (green) and the negative network (red). Figure adapted from (Adderley et al., 2021) and modified using BioRender.com.

The benefits of the MAPPINGS approach were threefold: (i) it enabled the objective identification of signalling pathways which went unnoticed in traditional analysis strategies, shedding light on some of the more elusive signalling dynamics; (ii) it provided a greatly accelerated starting point for future analysis of similar signalling datasets; (iii) it enabled the identification of pivotal kinases in the network that functioned as major nodes of downstream signalling events. The significance at a practical level is that such identifications generate numerous new and testable hypotheses on the host signalling dynamics during blood-stage development of the malaria parasite, and that the approach is applicable to any phosphorylation-based dataset.

Results and discussion

The datasets used to design this computational approach were sourced from (Adderley et al., 2020). This publication provided unique datasets that represent host erythrocyte signalling in the context of three stages of P. falciparum asexual development. These comprise ring-stage parasites (n=3), representing the early stages of development, trophozoite-stage parasites (n=3), representing the most metabolically active form, and schizont-stage parasites (n=2), in which the parasite's daughter cells are assembled. This asexual development cycle is completed in 48 hours for P. falciparum (Fig. 1). For each antibody on the array, each dataset reports the fold change from an uninfected red blood cell control which was used as the baseline in this comparison. Details about the datasets and further background information are available in the source article (Adderley et al., 2020); see (Venugopal et al., 2020) for a succinct review of the P. falciparum lifecycle. The analysis pipeline from sample generation through to the MAPPINGS analysis output is represented as a flow chart in Fig. 1.

The datasets were generated using the KAM-900 series antibody microarray produced by the kinomics company Kinexus, which contains 613 phosphorylation-specific antibodies. A further 265 pan-specific antibodies, which recognise both the phosphorylated and unphosphorylated forms of a target protein, are also present and provide information on protein abundance. While the approach is very sensitive, we cannot exclude the possibility that biologically relevant signals were below the detection threshold. Furthermore, as detailed in (Adderley et al., 2020), a number of signals from the array were deemed unsuitable for analysis and were therefore removed here (Fig. 1.3). The disregarded signals fell into at least one of the following three categories: (i) low signal intensity, (ii) relatively high error (compared to the change observed from uninfected control) and (iii) cross-reactivity to parasite proteins (see methods section for a detailed description of these categories). This reduced the overall number of phosphorylation-specific antibody signals of each dataset to 69 (ring), 184 (trophozoite) and 135 (schizont). To account for degradation or expression of signalling proteins, each phosphorylation-specific signal on the array was normalised for changes in the abundance of the corresponding signalling protein.

Network construction and mapping to biological datasets

Phosphorylation signalling is a highly interconnected network that contains numerous feedback loops which enable finely tuned responses to external/internal stimuli. Many of the globally identified phosphorylation events have unknown functions. Additionally, the specific kinases responsible for many of these phosphorylation events remain elusive. These factors make it challenging to understand how various phosphorylation events come together into a greater network. Despite this, in a study by Olow et al. (Olow et al., 2016) a network containing 1733 proteins interconnected through phosphorylation interactions was pieced together. Using this study as a framework, we made further annotations to this network map: we included the target protein's response following phosphorylation at a specific site (activation/inhibition), and added phosphorylation sites reported on the microarray which were missing from the network map (see Methods for further detail). The resulting phosphorylation network consists of kinases/substrates, which are represented as nodes, while the specific phosphorylation events are represented as the network's edges (arrows). The network's edges are directed and point towards the phosphorylated substrate; an arrowhead indicates an activation effect and a square indicates that phosphorylation causes inhibition of the substrate's activity (Fig. 2a).
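
The node/edge representation described above can be sketched in a few lines. This is a minimal illustration only: it uses plain Python rather than the NetworkX package mentioned later in the text, and the class `PhosphoEdge`, the function `build_network`, and all protein and site names are hypothetical placeholders, not taken from the MAPPINGS source code or from PhosphoAtlas.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhosphoEdge:
    kinase: str     # source node
    substrate: str  # target node (the edge points towards it)
    site: str       # phosphorylation site, e.g. "Y100"
    effect: str     # "activation" or "inhibition"


# Placeholder records for illustration only; not real curated interactions.
EDGES = [
    PhosphoEdge("KinaseA", "KinaseB", "Y100", "activation"),
    PhosphoEdge("KinaseA", "KinaseB", "S200", "inhibition"),
    PhosphoEdge("KinaseB", "SubstrateC", "T300", "activation"),
]


def build_network(edges):
    """Map each node to its outbound edges (one edge per phosphosite)."""
    outbound = defaultdict(list)
    for edge in edges:
        outbound[edge.kinase].append(edge)
        outbound.setdefault(edge.substrate, [])  # keep substrate-only nodes
    return dict(outbound)


network = build_network(EDGES)
# Nodes with at least one outbound edge act as kinases in this network.
kinases = sorted(node for node, out in network.items() if out)
```

Keeping one edge per phosphorylation site means two sites on the same substrate, phosphorylated by the same kinase, stay distinguishable and can carry different effects.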

Fig. 2

Example of the phosphorylation network structure and the networks associated with each dataset analysed. a) Definition of edge type and node type in the networks utilised throughout this study. Kinases and substrates are represented as nodes (dark nodes = substrates, light nodes = kinases). Phosphorylation events are represented as directed edges, with edges pointing towards the substrate of the interaction. The effect of the edge is represented in the arrowhead (arrow = activation, square = inhibition). b) Left - The basal human phospho-signalling network used here contains 1156 proteins (nodes, dark = substrate, light = kinases) and 6224 phosphorylation connections (edges, grey = not in trophozoite dataset, red = in trophozoite dataset). Right - Subnetwork of the connections in the human phosphorylation network which were assigned antibody microarray data from the trophozoite dataset. This subnetwork contains 167 proteins (nodes) and 237 phosphorylation connections (edges). The nodes and edges were identified as having reliable phosphorylation changes during the trophozoite stage of P. falciparum blood stage development following signal filtering. c-e) Subnetwork of the connections in the human phosphorylation network which were assigned antibody microarray data from the ring, trophozoite and schizont datasets respectively. Full size images are available as Supplementary Figs. 1-3.

Once the phosphorylation network had been optimised (see above), the next challenge was to overlay the reliable data from each of the malaria datasets onto the network. A number of antibody signals from the microarray reported on dual or triple phosphorylation sites for a given substrate. This is quite common among antibodies that target phosphorylation sites on proteins, and many phosphorylation sites are in relatively close proximity. These phosphorylations also often share common roles for the substrate (Li et al., 2017). The network used here was structured so that each individual phosphorylation site was given its own unique edge. Therefore, the dual and triple phosphorylation signals were split, and the associated signals reassigned to the now separated sites. Additionally, in the cases where more than one antibody on the microarray recognises the same phosphorylation site, the signals from all such antibodies were averaged. The three datasets were then overlaid on the framework to generate unique networks for the ring, trophozoite and schizont developmental stages.

Each of the networks underwent an edge reduction step, reducing the number of parallel edges between the nodes, thereby simplifying the overall networks and enabling a straightforward integration in Python (NetworkX package). Parallel edges were defined here as multiple unique phosphorylation sites on Protein A which were all phosphorylated by Protein B. We further separated each unique network into a positive and a negative network, to allow an independent assessment of the phosphorylation events that either increased in intensity (positive networks) or decreased in intensity (negative networks) during infection (depicted in Fig. 1.4, also see methods section for more detail). All three of the positive networks were exported as an edge list and rendered into a network map using Cytoscape v3.8.2 to provide a visual representation. This illustrated the highly interconnected nature of the datasets (Fig. 2b-e).
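
The overlay steps described above (splitting multi-site signals, averaging antibodies that share a site, and separating positive from negative changes) can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: the `+` separator for multi-site antibodies, the function names, and the example values are all hypothetical, and the sketch assumes each separated site inherits the full signal of its antibody. The parallel-edge reduction step is not shown, because the text does not specify how parallel edges were combined.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical antibody readouts:
# (substrate, "site" or "site+site", log2 fold change versus control).
SIGNALS = [
    ("ProteinA", "Y100+Y101", 1.0),  # dual-site antibody
    ("ProteinA", "Y100", 2.0),       # second antibody for the same site
    ("ProteinB", "S50", -1.5),
]


def assign_site_values(signals, sep="+"):
    """Split multi-site signals, then average values that share a site."""
    per_site = defaultdict(list)
    for substrate, sites, value in signals:
        for site in sites.split(sep):
            # Assumption: each separated site inherits the antibody's signal.
            per_site[(substrate, site)].append(value)
    return {key: mean(values) for key, values in per_site.items()}


def split_by_sign(site_values):
    """Separate increased (positive) from decreased (negative) changes."""
    positive = {k: v for k, v in site_values.items() if v > 0}
    negative = {k: v for k, v in site_values.items() if v < 0}
    return positive, negative


site_values = assign_site_values(SIGNALS)
positive_net, negative_net = split_by_sign(site_values)
```

In this toy input, site Y100 on ProteinA is covered by two antibodies, so its value is the average of both readouts, while Y101 keeps the value of the dual-site antibody alone.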

MAPPINGS strategy

To analyse the flow of phosphorylation through the networks (corresponding to the ring, trophozoite and schizont stages), a computational approach was designed to generate modified random walks. The walks were not completely random because the algorithm explicitly avoided (i) repetition of self-loops and (ii) cycling through the same edges in highly interconnected regions. Such paths are known as trails in network science terminology. No edge is revisited; however, nodes could be repeated to allow differential pathway development through unique edges. Each dataset was assessed for walks that were overrepresented, indicating increases in phosphorylation during infection in the case of the positive network, or decreased activity in the case of the negative network (suppression of the pathway during infection) (Fig. 1). The positive network analysis weighted edge selection towards larger positive phosphorylation changes between healthy and infected cells and considered all negative phosphorylation changes as unusable edges. The inverse of this was performed for the negative network analysis.
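
The weighted edge selection described above can be written out explicitly. The expression below is a reconstruction from the textual description (a choice among outgoing edges in proportion to their weights), not a verbatim copy of the published equation. Here $f_{ab}$ is assumed to denote the magnitude of the fold change assigned to the edge from node $N_a$ to node $N_b$, set to 0 when no edge exists, when the edge has already been used in the current walk, or when the edge is unusable in the network being analysed; the sum runs over all nodes $N_k$.

```latex
P(N_a \to N_b) = \frac{f_{ab}}{\sum_{k} f_{ak}}
```

Under this form the probabilities over the outgoing edges of $N_a$ sum to 1 whenever at least one usable edge remains; if none remains, the denominator is zero and the walk ends.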

Pathway Analysis

The computational technique begins with the selection of a random node in the network. The algorithm identifies all possible outbound edges (representing potential phosphorylation events) and selects one by weighting its decision on the edge weights (representing the magnitude of the associated biological signals, expressed as a fold change). Once an edge is selected, the algorithm moves down the edge and considers the outgoing edges originating from the new node; it repeats this process until the walk is terminated and a new walk is initiated (see Equation 1). A walk is terminated under three circumstances: (i) there are no available edges which have not previously been used in the current walk, (ii) the last edge used is an inhibitory phosphorylation event (depicted in Fig. 2a) or (iii) the edge fails the termination check. The termination check provides a small chance that a walk is terminated and is based on the strength of the biological signal. The probability of termination was set to 20% for the control networks (see below for more details). The strongest signals in the data are the least likely to result in termination, and the likelihood increases as signals become weaker. This was implemented to discriminate between edges in cases where only a single edge was available at a given edge selection stage. Without this parameter, the fold change value assigned to an edge in this circumstance would be irrelevant due to the lack of competing choices (for more detail see methods section).

Equation 1. Probability that the edge connecting node N_a to N_b will be chosen by the modified random walk algorithm. f_ab represents the fold change of phosphorylation of N_b by N_a, where entries without a connecting edge and edges that have already been visited are given a fold change value of 0.

The pathway analysis was initially repeated until 1,000,000 trails were recorded. The analysis yielded a large number of trails of length <3 (data not shown). This was not surprising as the node selection was random, and therefore nodes with no outbound edges (non-kinases) were often selected. The aim of this analysis was to distinguish pathways of phosphorylation interactions (i.e. trails), rather than simply highlight individual phosphorylation events. To this end, a minimum walk length of 3 edges was introduced as a threshold, thereby disregarding walks shorter than this. To ensure adequate trails were recorded for each analysis, we required the program to record a total of 1,000,000 trails of length ≥3. Following completion of the designated number of trails, the walks of length ≥3 were decomposed into single edges and tallied to determine the number of times each unique edge was utilised. These values were denoted as the total edge usage.

Analysis of the total edge usage values indicated that edge usage varied widely in every dataset analysed. Upon further analysis it was clear that certain edges were traversed far more frequently than others, regardless of the underlying fold change data (data not shown). This was a consequence of the base network structure, which, like most biological networks, displays nodes that act as hubs in the network because they have numerous inbound and outbound edges. To address this, control networks were included as a base for comparison of the effect that the fold change data caused (Fig. 1.5). The control networks were identical to the respective positive or negative networks; however, the edges were not assigned fold change data. Therefore, when the pathway analysis was applied to these networks, there was no inherent biological preference for any particular edge. They did, however, provide a baseline usage of each edge in the respective networks and therefore provided a means to account for the different interconnectivity of the nodes.
The total edge usage for each of the edges in the positive network, negative network and their respective control networks were compared to determine the percentage change from control (% CFC). The MAPPINGS tool as well as detailed instructions for its use are accessible at https://github.com/FinnOD/mappings.
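
The procedure described in this section can be illustrated with a short, self-contained sketch. It is not the MAPPINGS implementation (which is available at the repository above); the toy network, the function names and, in particular, the form of `termination_probability` are assumptions made for illustration. The text states only that stronger signals are less likely to terminate a walk and that control networks use a fixed 20% chance. The %CFC is computed here as the usual percentage change relative to the control count, which is also an assumption.

```python
import random
from collections import Counter

# Hypothetical toy network: node -> list of (target, site, weight, inhibitory).
# Weights stand in for fold-change magnitudes; all values are made up.
NETWORK = {
    "A": [("B", "Y1", 4.0, False), ("C", "S2", 1.0, False)],
    "B": [("C", "T3", 2.0, False), ("A", "Y4", 1.0, False)],
    "C": [("A", "S5", 3.0, False), ("D", "T6", 1.0, True)],
    "D": [],
}


def termination_probability(weight, max_weight, base=0.2):
    """Assumed rule: the weaker the signal, the likelier the walk ends."""
    return base * (1.0 - weight / max_weight)


def generate_trail(network, rng, use_weights=True, base=0.2):
    """Generate one trail: a weighted walk that never reuses an edge."""
    max_w = max(w for edges in network.values() for _, _, w, _ in edges)
    node, trail, used = rng.choice(sorted(network)), [], set()
    while True:
        # Already-used edges are excluded, i.e. given zero probability.
        options = [(node, *e) for e in network[node]
                   if (node, e[0], e[1]) not in used]
        if not options:
            break  # rule (i): no unused outbound edge
        weights = [o[3] if use_weights else 1.0 for o in options]
        src, dst, site, w, inhibitory = rng.choices(options, weights=weights)[0]
        used.add((src, dst, site))
        trail.append((src, dst, site))
        if inhibitory:
            break  # rule (ii): an inhibitory event ends the walk
        p_stop = termination_probability(w, max_w, base) if use_weights else base
        if rng.random() < p_stop:
            break  # rule (iii): termination check
        node = dst
    return trail


def edge_usage(network, n_trails, min_length=3, use_weights=True, seed=0):
    """Record n_trails trails of at least min_length edges; tally edge use."""
    rng = random.Random(seed)
    usage, recorded = Counter(), 0
    while recorded < n_trails:
        trail = generate_trail(network, rng, use_weights)
        if len(trail) >= min_length:
            usage.update(trail)
            recorded += 1
    return usage


def percent_change_from_control(data_usage, control_usage):
    """Percentage change of each edge's usage relative to the control run."""
    return {edge: 100.0 * (data_usage.get(edge, 0) - count) / count
            for edge, count in control_usage.items() if count}


data_usage = edge_usage(NETWORK, n_trails=2000)
control_usage = edge_usage(NETWORK, n_trails=2000, use_weights=False, seed=1)
cfc = percent_change_from_control(data_usage, control_usage)
```

The control run uses the same network with uniform edge weights, so any difference in edge usage between the two runs reflects the weights rather than the network topology, which is the purpose of the control networks described above.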

MAPPINGS analysis results

The pathways yielded by MAPPINGS for each of the developmental stages are presented in Fig. 3 (Rings), Fig. 4 (Trophozoites) and Fig. 5 (Schizonts); for each stage the data are also presented in a tabular format in Supplementary Table 1. In each case, the analysis yielded pathways whose components have previously been implicated in infection with Plasmodium, providing a positive control and validation with respect to the ability of the approach to detect kinases that are modulated by infection. MAPPINGS also revealed additional pathways, providing a well-grounded rationale for further experimental work and potential novel targets. Detailed below are the key findings from each of the developmental stages analysed.

Fig. 3

Subnetwork of the MAPPINGS analysis performed on P. falciparum ring stage microarray dataset. The network contains 20 nodes with 21 connecting edges. Kinases and substrates are represented as nodes (dark nodes = substrates, light nodes = kinases) and phosphorylation events are represented as edges, which are designated with the specific phosphorylation site. Edges are represented in a colour gradient from grey to green (positive edges) and grey to red (negative edges) and a size gradient which corresponds to the percentage change from the control network trails (%CFC). The effect of the edge is represented in the arrowhead (arrow = activation, square = inhibition). The data are presented in a tabular format in Supplementary Table 1.

Fig. 4

Subnetwork of the MAPPINGS analysis performed on P. falciparum trophozoite stage microarray dataset. The network contains 53 nodes with 66 connecting edges. Kinases and substrates are represented as nodes (dark nodes = substrates, light nodes = kinases) and phosphorylation events are represented as edges, which are designated with the specific phosphorylation site. Edges are represented in a colour gradient from grey to green (positive edges) and grey to red (negative edges) and a size gradient which corresponds to the percentage change from the control network trails (%CFC). The effect of the edge is represented in the arrowhead (arrow = activation, square = inhibition). The data are presented in a tabular format in Supplementary Table 1.

Fig. 5

Subnetwork of the MAPPINGS analysis performed on P. falciparum schizont stage infection. The network contains 42 nodes with 45 connecting edges. Kinases and substrates are represented as nodes (dark nodes = substrates, light nodes = kinases) and phosphorylation events are represented as edges, which are designated with the specific phosphorylation site. Edges are represented in a colour gradient from grey to green (positive edges) and grey to red (negative edges) and a size gradient which corresponds to the percentage change from the control network trails (%CFC). The effect of the edge is represented in the arrowhead (arrow = activation, square = inhibition). The data are presented in a tabular format in Supplementary Table 1.

Ring stage network

The ring stage, which takes approximately 24 hours to complete (Gruring et al., 2011), begins as a merozoite invades a red blood cell and establishes the infection. The antibody microarray dataset used in this analysis covered a time window of 8–16 hours after invasion and compared a population of 33% infected erythrocytes to an uninfected control. A pure population of ring stage infected cells was not used because it is problematic to purify infected cells from uninfected cells at this early stage of infection, in contrast to the trophozoite and schizont stages, for which we can obtain preparations of >95% infected cells (see below). Despite this limitation, a number of changes in ring-infected versus uninfected erythrocyte signalling components were identified. The output of the ring stage network analysis contained 20 nodes (proteins/kinases) and 21 edges (phosphorylations) (Fig. 3). The most prominent host signalling elements of the positive network were the non-receptor tyrosine kinases (TK) Src and FAK1/2 and the receptor tyrosine kinase (RTK) RET (connected by green edges). The negative network contained the TKs Lck, Lyn and Syk, the serine/threonine kinases PKCα/δ/µ/ε, and the protein PEA15 (connected by red edges), discussed below.

Syk/Lyn

Phosphorylation of the membrane protein Band 3 by the Syk TK is essential to destabilise the erythrocyte membrane during parasite egress, and Syk inhibitors have been shown to block this process (Pantaleo et al., 2017). Our network analysis detected a decrease in phosphorylation of Syk at the activation-associated residue Y323 during ring stage infection. The decrease in Syk phosphorylation in the early stages of infection is consistent with the parasite preventing the premature lysis of the RBC. This block may simply be released at the end of schizont stage development to allow parasite egress; this would explain why there is no detected increase (relative to uninfected red blood cells) in schizonts. The network analysis indicates that the TKs Lck and/or Lyn may be implicated in the reduction of Syk phosphorylation in rings. Both Lck and Lyn are non-receptor TKs that belong to the Src family, with wide functionality from proliferation through to apoptosis and metabolic signalling (Parsons and Parsons, 2004). Lck is primarily expressed in T-cells and the brain (Bommhardt et al., 2019), while Lyn has been identified to have roles across several cell types of hematopoietic origin (Ingley, 2012), and has been detected in the erythrocyte proteome (Bryk and Wisniewski, 2017). Consequently, it is more likely that Lyn is responsible for the Syk phosphorylation observed; however, this may reveal a novel function for Lck in the erythroid lineage. Lyn was previously suggested to be involved in Band 3 phosphorylation in uninfected erythrocytes, implicating a possible role in P. falciparum infection that warrants further exploration (Pantaleo et al., 2016, Brunati et al., 2000).

Protein kinase C (PKC)

The human PKC family comprises 10 isoforms of serine/threonine kinases that function in the phosphoinositide pathway to regulate a range of cellular processes. A decrease in overall host PKC activity during P. falciparum infection of red blood cells was first reported more than 20 years ago (Hall et al., 1997), which is consistent with the decrease in PKCα, PKCδ, PKCµ and PKCε phosphorylation detected by our analysis. PKCδ phosphorylation was also decreased in the trophozoite analysis (see Fig. 4). The biological function of this decrease is not understood.

Focal adhesion kinases (FAK1/2)

FAKs are non-receptor TKs which serve to promote signalling through recruitment to activated cell surface receptors, notably of the integrin family (Zhao and Guan, 2011). Our analysis indicated that FAK1/2 are phosphorylated on the activating residues Y397/Y402 during infection. Further, the activation of FAK1 was notable across all time points examined during parasite development, suggesting it may have a continuous role in the infected host cell. These activating phosphorylation events can be mediated by a number of kinases, including the aforementioned Src TK (Lu and Sun, 2020). No implication of FAKs or RET in infection has been reported, but our data suggest it may be of interest to explore this further (Figs. 3 and 4). Indeed, FAKs can be activated by membrane deformation; it is therefore tempting to speculate that the ontogeny of knobs (made of proteins exported by P. falciparum to the red blood cell membrane to provide cytoadherence, see (Subramani et al., 2015)) in the plasma membrane of infected red blood cells may trigger FAK signalling.

Trophozoite stage network

Trophozoites are the most metabolically active stage of development and are notable for extensive host cell modification and increased haemoglobin digestion (Elliott et al., 2008). The antibody microarray dataset used in this analysis covered a window of 24–28 hours post invasion and compared a population of >95% infected cells to an uninfected counterpart (Adderley et al., 2020). This was possible because the parasite's digestion of host red blood cell haemoglobin leads to the accumulation of a paramagnetic pigment known as hemozoin, enabling magnet-mediated enrichment of infected cells (Inyushin et al., 2016). The higher level of activity of the parasite at its trophozoite stage, combined with the greater level of enrichment, resulted in a greater number of signals of sufficient quality to be included in this analysis. The output of the trophozoite stage analysis contained a subnetwork of 53 nodes (proteins/kinases) and 66 edges (phosphorylations) (Fig. 4). The most prominent host signalling elements of the positive network were the aforementioned TKs FAK and RET, and several MAPKs (connected by green edges; see below). The negative network contained the Syk TK, the PAK1 serine/threonine kinase, and the substrate protein RPS6 (connected by red edges), discussed below.

Syk

The reduction in Syk phosphorylation observed at the ring stage is not maintained at the trophozoite stage. This suggests that the possible inhibition occurring during the ring stage begins to be alleviated at the trophozoite stage, which is consistent with the appearance of Band 3 phosphorylation in mature trophozoites (Billett, 2017).

FAK-Ret-MAPK

Phosphorylation of the FAK and RET TKs observed in the ring stage analysis (see above) appears to still be present at the trophozoite stage, though to a lesser extent (Fig. 4). In addition to Src (see above), the hepatocyte growth factor receptor (MET) is activated in trophozoites and represents a possible additional activator of FAK1, which suggests that multiple activation pathways may converge on FAK1 (Fig. 4). MET phosphorylation in trophozoite-infected erythrocytes has been previously confirmed by Western blot (Adderley et al., 2020). Interestingly, a strong candidate for an effector of RET in trophozoites, the mitogen-activated protein kinase 3 (MAPK3, or ERK1), was not observed in the ring stage analysis. ERK1 phosphorylation has not been investigated in the context of P. falciparum development in red blood cells. Two of the alternative activators of ERK1 from this analysis were MAP2K1 and MAP2K2 (otherwise known as MEK1/2). Although phosphorylation of MEK1/2 was not detected at the trophozoite stage, the presence of active ERK1 strongly suggests that MEK1/2 are indeed phosphorylated. MEK1/2 phosphorylation during the later stages of P. falciparum blood stage development has been reported previously (Sicard et al., 2011). Unfortunately, the signals for these phosphorylation sites were among those that were disregarded because of low reliability (see above and Adderley et al., 2020). Interestingly, abnormal MEK1 phosphorylation of ERK signalling in erythrocytes of patients with sickle cell disease (SCD) is critical for the adhesive interactions of these cells with the endothelium (Zennadi, 2019). Together with our observations on MAPK pathway activation, this may have profound implications with respect to the mechanisms of cytoadherence of P. falciparum-infected red blood cells.

P21 activated kinase 1 (PAK1)

PAK1 is a serine/threonine kinase with multiple roles in regulating the cytoskeleton and apoptosis (Dummler et al., 2009). The activation of PAK1 by Plasmodium infection has been reported (Sicard et al., 2011), but the mechanism of its activation remains unclear. Our analysis suggests it may be of interest to determine whether PDK1 plays a role upstream of this pathway.

Ribosomal protein S6 (RPS6)

While there are no published data on RPS6 during P. falciparum red blood cell infection, it has been shown that infected hepatocytes, during the liver stage of the P. falciparum life cycle, show elevated levels of RPS6 phosphorylation (Glennon et al., 2019). In our analysis we noted a decrease in phosphorylation of the S235 site during trophozoite development. The primary activator of RPS6 is the Ribosomal Protein S6 Kinase (S6K), but PKC (which is activated by infection, see above) has also been reported to be involved in RPS6 phosphorylation (Valovka et al., 2003). Interestingly, the sites T421/S424 on RPS6, which act to enhance activation, can be phosphorylated by ERK1/2 (Biever et al., 2015). The T421/S424 sites were not part of the microarray dataset; it would be of great interest to investigate these sites further in relation to MAPK pathways and RPS6 S235 phosphorylation.

Schizont stage network

The final notable stage of P. falciparum asexual development within human erythrocytes is the schizont stage, which accounts for the final hours of the asexual lifecycle. Schizont stage parasites are less metabolically active than the trophozoites, with a large proportion of the host cell's cytosol now digested (Krugliak et al., 2002). The antibody microarray dataset used in this analysis was obtained from infected red blood cells 44 – 48 hours post invasion, corresponding to the final few hours before daughter cell release. This dataset, like the trophozoite dataset, compared a population of >95% infected cells to an uninfected counterpart (Adderley et al., 2020). The output of the schizont stage network analysis contained a subnetwork of 42 nodes (proteins/kinases) and 45 edges (phosphorylations) (Fig. 5). The trophozoite and schizont analyses shared a number of similar signalling connections, suggesting an overall signalling trend at the later stages of blood stage development. These commonalities include Src, FAK, RET, MAPK3, MAPK14 (also known as p38α) and many of the PKC isoforms. The most striking pathway from the schizont stage analysis is the pathway from MAPK14 (p38α) to Mdm2, discussed below.

P38α - MAPKAPK2 - Mdm2 pathway

p38α is a mitogen-activated protein kinase (MAPK) with key responsibilities in erythroblast enucleation during stress erythropoiesis (Schultze et al., 2012). Interestingly, p38α activity has also been linked to stress responses in red blood cells, with a suspected role in eryptosis (the red blood cell equivalent of apoptosis) (Gatidis et al., 2011). The reduction in phosphorylation observed in our analysis could indicate that this cell death pathway is being circumvented by the parasite to facilitate prolonged survival of its host cell. The downstream effector of p38α, Mdm2, is an E3 ubiquitin-protein ligase and is responsible for the ubiquitination of TP53 (or p53), which flags p53 for proteasomal degradation (Karni-Schmidt et al., 2016). Its role in mature red blood cells is unknown, but it is essential for regulating erythropoiesis (Maetens et al., 2007). Our analysis points to a reduction in Mdm2 phosphorylation at S166, which is known to enhance the protein's capacity to inactivate p53 signalling (Chen, 2012). Interestingly, our analysis indicates that p53 is activated, which could be in part due to a reduction in Mdm2 activity. Together this illustrates a possible ‘pro survival’ pathway being activated, which could be the result of direct signalling manipulation by the parasite. In the absence of transcriptional activity in erythrocytes, this concept is intriguing and warrants further exploration.

Concluding remarks

The combination of (i) increasingly (very) large datasets originating from phospho-proteomic approaches (including those obtained through arrays comprising phospho-specific antibodies as exemplified here), and (ii) the tremendous underlying complexity of phospho-signalling networks in mammalian cells, raises serious issues with respect to deciphering the modulation of signalling pathways between biological samples. This calls for system-wide computational approaches to deconvolute raw data. MAPPINGS, the network-based analysis tool developed here, aims at contributing to this important task, and consists of a pathway-based analysis tool that uses random walks to identify chains of consistent phosphorylation events. MAPPINGS highlights pathways of phosphorylation that work synergistically to provide a rapid interpretation of the consistent pathways in a dataset. A limitation of this approach is in the treatment of the inhibition-associated edges: MAPPINGS currently terminates trails following the selection of an inhibitory edge; however, if a subsequent edge out of this node indicated the inverse effect (i.e. an increase in inhibition followed by a decrease on outbound edges), this would be congruent with the expected flow of phosphorylation. Consequently, a small number of trails are not explored and are therefore not represented in the output. This limitation is currently being addressed and may be amended in a later version of the program. The MAPPINGS tool developed here is applicable to any antibody microarray or phospho-proteomic dataset comprising signalling proteins. MAPPINGS can be utilised to provide detailed pathway analysis on already published datasets, and will be an effective tool to screen new datasets in any context where there is a need to compare phospho-signalling between two biological samples. We illustrate this concept through the identification of host erythrocyte pathway modulation during the development of malaria parasites.
A notable proportion of the pathways identified with MAPPINGS are consistent with published host signalling studies, validating the strategy. Additionally, new and exciting host signalling interactions were observed, for example signalling pathways that implicate FAK, p38α and RET. The tool does not discriminate between pathways that are mobilised by the parasite to enable infection, and those that are part of the host cell “innate immunity-like” response, which in turn would require a countermeasure from the parasite to ensure its own survival. Clearly, the function of each activated kinase/pathway needs to be investigated. MAPPINGS analysis is essentially a hypothesis-generating exercise, and these findings now need functional validation, which is under way for several of the newly identified kinases. The findings in the specific example used here are far-reaching. Currently all deployed antimalarials and those in development target parasite-encoded proteins, with many being derivatives of current or previously deployed compounds (Tse et al., 2019). Parasite resistance and ensuing treatment failure is becoming apparent for every deployed antimalarial (Thu et al., 2017). This calls for the development of next-generation drugs with (i) untapped modes of action to prevent cross-resistance and (ii) a low propensity for the emergence of de novo resistance. In recent years P. falciparum has been shown to require the activity of several of its host kinases (Sicard et al., 2011, Adderley et al., 2020, Kesely et al., 2020), which, when inhibited chemically, results in parasite death. This suggests that host-targeted drug discovery (HDT) may be a feasible avenue for malaria treatments, as it has been for other infectious diseases (reviewed in Zumla et al., 2016).

Methods

Microarray datasets

The datasets used to design this random walks-based network analysis were published previously (Adderley et al., 2020) and comprise a total of 8 datasets, including replicates. The replicate datasets were unified, with the replicate values averaged for the analysis. These datasets were ring-stage parasites (n=3), trophozoite-stage parasites (n=3), and schizont-stage parasites (n=2). Each dataset reports the fold change from an uninfected red blood cell control, which was used as the baseline of signalling. See the source article for further details on these samples and the manual data analysis performed (Adderley et al., 2020).

Signal filtering

A number of signals reported by the antibody microarray were deemed unreliable. These signals fell into three categories, and were removed from the dataset and the subsequent analysis in the present work. These categories were: (i) low signal intensity, (ii) relatively high error (compared to the change observed from control) and (iii) cross-reactivity to parasite proteins. Low intensity signals: these were defined as signals where both the control (uninfected sample) and infected sample were below 1000 relative units. This threshold is recommended by the manufacturer, as signals below this intensity are often difficult to validate. High error relative to signal change: in some instances, signals appeared to vary notably between the biological replicates of a dataset. To account for this, we combined the uninfected and infected signal errors for each unique antibody on the microarray (which are reported as percentage error) and disregarded any antibody whose total signal error was greater than the percentage change reported from the uninfected control. Cross-reactive signals: we removed the antibodies that were identified as cross-reactive (see Adderley et al., 2020 for more detail on cross-reactive signal determination).
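The three filters described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the `Signal` record and its field names are hypothetical, the two percentage errors are assumed to be combined by simple addition, and a non-positive control intensity is assumed to be unusable.

```python
from dataclasses import dataclass

INTENSITY_THRESHOLD = 1000  # relative units, the manufacturer-recommended cut-off


@dataclass
class Signal:
    """Hypothetical record for one antibody on the microarray."""
    antibody: str
    control: float           # uninfected signal intensity
    infected: float          # infected signal intensity
    control_err_pct: float   # percentage error of the uninfected signal
    infected_err_pct: float  # percentage error of the infected signal
    cross_reactive: bool = False


def is_reliable(s: Signal) -> bool:
    """Apply the three filters; return True if the signal is kept."""
    # (i) Low intensity: both control and infected below the threshold.
    if s.control < INTENSITY_THRESHOLD and s.infected < INTENSITY_THRESHOLD:
        return False
    # Assumption: a non-positive control cannot give a percentage change.
    if s.control <= 0:
        return False
    # (ii) High error: combined error (assumed additive) exceeds the
    # percentage change from the uninfected control.
    pct_change = abs(s.infected - s.control) / s.control * 100.0
    if s.control_err_pct + s.infected_err_pct > pct_change:
        return False
    # (iii) Cross-reactivity with parasite proteins.
    return not s.cross_reactive


signals = [
    Signal("ab_low", 500, 800, 5, 5),              # removed by (i)
    Signal("ab_ok", 2000, 4000, 10, 10),           # kept: 100% change, 20% error
    Signal("ab_noisy", 2000, 2200, 8, 8),          # removed by (ii): 10% change, 16% error
    Signal("ab_cross", 2000, 4000, 5, 5, True),    # removed by (iii)
]
kept = [s.antibody for s in signals if is_reliable(s)]
```

Here only `ab_ok` survives all three filters.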

The substrate effect for each phosphorylation in the network

As mentioned in the introduction, phosphorylation fundamentally results in either the activation or inhibition of the target substrate. The network on which we based our study (Olow et al., 2016) did not record the effect that each phosphorylation event had on the target substrate. As this information is crucial for biological interpretation of the output data, where possible we annotated this information into the network. Where available, this information was provided by Kinexus (the antibody microarray manufacturer); otherwise, literature searches were undertaken to classify as many phosphorylation effects as possible as activation or inhibition. Despite these efforts, a number of phosphorylation sites have unknown effects; consequently, these sites could not be annotated in our networks. To enable continuity of the MAPPINGS analysis these sites are treated as though they were activation sites; whenever future studies uncover the function of these sites, the base network used in this analysis can be updated accordingly.

Assignment of biological data to duplicated edges

One limitation of this approach is that the upstream kinase responsible for each of the substrate phosphorylations is unknown. Consequently, as multiple kinases can often phosphorylate the same target substrate at the same phosphorylation site, the fold change data were assigned to each of these possible interactions in the network. This means that in some instances a single antibody signal is mapped to multiple edges. As it is unlikely that all possible kinases contribute to the phosphorylation of a single substrate at once, the MAPPINGS analysis (see analysis strategy section) was designed to determine which of the possible kinases was most likely responsible for the phosphorylation event observed.
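The mapping of a single antibody signal onto several candidate kinase edges can be illustrated as follows. This is a sketch under stated assumptions: the tuple layout `(kinase, substrate, site)`, the function name `assign_fold_changes`, and the placeholder protein names are all hypothetical and are not taken from the MAPPINGS code base.

```python
def assign_fold_changes(edges, fold_change):
    """Copy each (substrate, site) fold change onto every edge targeting it.

    edges:       iterable of (kinase, substrate, site) tuples from the network
    fold_change: dict mapping (substrate, site) -> measured fold change
    Edges whose site was not measured are left out of the result.
    """
    annotated = {}
    for kinase, substrate, site in edges:
        value = fold_change.get((substrate, site))
        if value is not None:
            annotated[(kinase, substrate, site)] = value
    return annotated


# Placeholder network: two kinases can phosphorylate SubstrateX at site S1.
edges = [
    ("KinaseA", "SubstrateX", "S1"),
    ("KinaseB", "SubstrateX", "S1"),
    ("KinaseA", "SubstrateY", "T5"),  # site not measured on the array
]
fold_change = {("SubstrateX", "S1"): 1.8}

annotated = assign_fold_changes(edges, fold_change)
```

Both `KinaseA` and `KinaseB` edges receive the same value of 1.8, which is why a later step is needed to decide which kinase most plausibly explains the signal.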

Developing a positive and negative network for independent dataset analysis

Once the biological data and network were combined, there were a handful of nodes which were connected by multiple parallel directed edges. To enable a straightforward analysis strategy, we applied an edge reduction step that left a single directed edge between each pair of nodes. To account for the multiple directed edges that had varying phosphorylation-specific antibody signals, we developed two independent networks using the antibody microarray data. One of these networks was designated the positive network, which retained the phosphorylation-specific antibody signals which increased during infection, while the other network was designated the negative network, which retained the signals which decreased during infection. In both instances the values of greater magnitude were preferentially selected. This enabled the independent assessment of the phosphorylation interactions that increased during infection and those that decreased.
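The edge reduction and the split into positive and negative networks can be sketched as below. The representation is an assumption made for illustration: each parallel edge is a `(source, target, change)` tuple in which `change` is a signed value (positive for an increase during infection, negative for a decrease), and `split_networks` is a hypothetical name.

```python
def split_networks(parallel_edges):
    """Reduce parallel edges to one edge per node pair in each network.

    The positive network keeps the largest increase for each (source, target)
    pair; the negative network keeps the largest-magnitude decrease.
    """
    positive, negative = {}, {}
    for source, target, change in parallel_edges:
        pair = (source, target)
        if change > 0 and change > positive.get(pair, 0.0):
            positive[pair] = change
        elif change < 0 and change < negative.get(pair, 0.0):
            negative[pair] = change
    return positive, negative


# Placeholder data: three parallel A->B edges and two parallel B->C edges.
parallel_edges = [
    ("A", "B", 1.5),
    ("A", "B", 0.4),
    ("A", "B", -0.9),
    ("B", "C", -0.2),
    ("B", "C", -0.7),
]
positive, negative = split_networks(parallel_edges)
```

In this example the pair A to B appears in both networks (with 1.5 and -0.9 respectively), whereas B to C appears only in the negative network.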

Analysis strategy walk termination settings

There were three termination checks placed into the function used in this analysis: (i) No edges to choose - If there were no remaining usable edges available, the walk would terminate; this was to avoid self-loops and to stop cyclic connections being over-reported. (ii) Inhibitory signalling - If the last edge used during a walk was an inhibitory phosphorylation (depicted in Fig. 2a), the walk would terminate. Inhibitory phosphorylation results in the de-activation of the target protein or kinase. Therefore, walks were terminated following usage of such an edge to remain consistent with what would happen in a biological setting. (iii) Weighted termination chance - The weighted termination chance enabled the function to discriminate edge usage due to fold change data when a single edge was available during edge selection. In the circumstance where a single edge is available, the function will select it, as there are no other options. This was problematic, as in situations where only one edge option was available the output would report no change in edge usage between the control and microarray data networks, regardless of the strength of the microarray data. By including a weighted chance for termination based on the magnitude of the edge's fold change, we were able to account for this in our analysis. A weighted chance which scaled from 0 to 20% was applied after each step in a walk. This scaling was linearly assigned to the fold change data in the positive and negative networks, with the largest magnitude fold changes being assigned no termination chance (0%) and the smallest fold changes being assigned a chance of 20%. For the control networks, where no fold change values were assigned, the weighted termination chance was set at 20% globally. The MAPPINGS tool as well as detailed instructions for its use are accessible at https://github.com/FinnOD/mappings.
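The three termination checks can be illustrated with a small random-walk sketch. It is not the MAPPINGS implementation (see the repository for that); several details are assumptions made for the example: "usable" edges are taken to be those leading to nodes not yet visited, edge selection is assumed to be proportional to fold change magnitude, a network in which all magnitudes are equal is given a 0% termination chance, and the names `termination_chance` and `random_walk` are hypothetical.

```python
import random

MAX_TERMINATION = 0.20  # chance assigned to the smallest fold change


def termination_chance(magnitude, min_mag, max_mag):
    """Linear scaling: largest magnitude -> 0%, smallest magnitude -> 20%."""
    if max_mag == min_mag:
        return 0.0  # assumption: no spread in magnitudes, no weighted termination
    return MAX_TERMINATION * (max_mag - magnitude) / (max_mag - min_mag)


def random_walk(graph, start, min_mag, max_mag, rng):
    """Walk from `start`; return the list of (source, target) edges used.

    graph maps node -> list of (target, magnitude, is_inhibitory) tuples,
    with all magnitudes assumed to be strictly positive.
    """
    trail, visited, node = [], {start}, start
    while True:
        # (i) No edges to choose: only edges to unvisited nodes are usable.
        options = [e for e in graph.get(node, []) if e[0] not in visited]
        if not options:
            return trail
        # Assumed selection rule: probability proportional to magnitude.
        weights = [e[1] for e in options]
        target, magnitude, inhibitory = rng.choices(options, weights=weights, k=1)[0]
        trail.append((node, target))
        visited.add(target)
        node = target
        # (ii) Inhibitory signalling: stop after an inhibitory edge.
        if inhibitory:
            return trail
        # (iii) Weighted termination chance, applied after each step.
        if rng.random() < termination_chance(magnitude, min_mag, max_mag):
            return trail


# Placeholder network; every edge has the maximum magnitude, so check (iii)
# never fires and the walk below is deterministic.
graph = {
    "A": [("B", 5.0, False)],
    "B": [("C", 5.0, True)],   # inhibitory edge: the walk stops after using it
    "C": [("D", 5.0, False)],
}
trail = random_walk(graph, "A", min_mag=1.0, max_mag=5.0, rng=random.Random(0))
```

For a control network, where no fold change values are assigned, the same loop would use a constant 20% chance in place of `termination_chance`.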

Availability and implementation

The program is available at https://github.com/FinnOD/mappings

CRediT authorship contribution statement

Jack Adderley: Conceptualization, Formal analysis, Writing - Original Draft, Funding acquisition, Biological interpretation of signalling pathway output; Finn O'Donoghue: Software, Formal analysis, Writing - Review & Editing; Christian Doerig: Funding acquisition, Biological interpretation of signalling pathway output, Writing - Review & Editing; Stephen Davis: Conceptualization, Methodology, Writing - Review & Editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

46 in total

Review 1. Protein posttranslational modifications: the chemistry of proteome diversifications.

Authors: Christopher T Walsh; Sylvie Garneau-Tsodikova; Gregory J Gatto
Journal: Angew Chem Int Ed Engl Date: 2005-12-01 Impact factor: 15.336

Review 2. Protein kinases: evolution of dynamic regulatory proteins.

Authors: Susan S Taylor; Alexandr P Kornev
Journal: Trends Biochem Sci Date: 2010-10-23 Impact factor: 13.807

3. Syk TKIs "strengthen" RBCs against malaria.

Authors: Henny H Billett
Journal: Blood Date: 2017-08-24 Impact factor: 22.113

4. Modulation of protein kinase C activity in Plasmodium falciparum-infected erythrocytes.

Authors: B S Hall; O O Daramola; G Barden; G A Targett
Journal: Blood Date: 1997-03-01 Impact factor: 22.113

Review 5. Activation and function of the MAPKs and their substrates, the MAPK-activated protein kinases.

Authors: Marie Cargnello; Philippe P Roux
Journal: Microbiol Mol Biol Rev Date: 2011-03 Impact factor: 11.056

6. Progress in the Development of Small Molecular Inhibitors of Focal Adhesion Kinase (FAK).

Authors: Yang Lu; Haiying Sun
Journal: J Med Chem Date: 2020-10-15 Impact factor: 7.446

7. Protein kinase C phosphorylates ribosomal protein S6 kinase betaII and regulates its subcellular localization.

Authors: Taras Valovka; Frederique Verdier; Rainer Cramer; Alexander Zhyvoloup; Timothy Fenton; Heike Rebholz; Mong-Lien Wang; Miechyslav Gzhegotsky; Alexander Lutsyk; Genadiy Matsuka; Valeriy Filonenko; Lijun Wang; Christopher G Proud; Peter J Parker; Ivan T Gout
Journal: Mol Cell Biol Date: 2003-02 Impact factor: 4.272

8. Structural and functional diversity of the microbial kinome.

Authors: Natarajan Kannan; Susan S Taylor; Yufeng Zhai; J Craig Venter; Gerard Manning
Journal: PLoS Biol Date: 2007-03 Impact factor: 8.029

Review 9. The past, present and future of anti-malarial medicines.

Authors: Edwin G Tse; Marat Korsik; Matthew H Todd
Journal: Malar J Date: 2019-03-22 Impact factor: 2.979

Review 10. Host-directed therapies for infectious diseases: current status, recent progress, and future prospects.

Authors: Alimuddin Zumla; Martin Rao; Robert S Wallis; Stefan H E Kaufmann; Roxana Rustomjee; Peter Mwaba; Cris Vilaplana; Dorothy Yeboah-Manu; Jeremiah Chakaya; Giuseppe Ippolito; Esam Azhar; Michael Hoelscher; Markus Maeurer
Journal: Lancet Infect Dis Date: 2016-04 Impact factor: 25.071