Structural Determinants of Phosphopeptide Binding to the N-Terminal Src Homology 2 Domain of the SHP2 Phosphatase.

Massimiliano Anselmi¹, Paolo Calligari¹, Jochen S Hub², Marco Tartaglia³, Gianfranco Bocchinfuso¹, Lorenzo Stella¹.

Abstract

SH2 domain-containing tyrosine phosphatase 2 (SHP2), encoded by PTPN11, plays a fundamental role in the modulation of several signaling pathways. Germline and somatic mutations in PTPN11 are associated with different rare diseases and hematologic malignancies, and recent studies have identified SHP2 as a central node in oncogenesis and cancer drug resistance. The SHP2 structure includes two Src homology 2 domains (N-SH2 and C-SH2) followed by a catalytic protein tyrosine phosphatase (PTP) domain. Under basal conditions, the N-SH2 domain blocks the active site, inhibiting phosphatase activity. Association of the N-SH2 domain with binding partners containing short amino acid motifs comprising a phosphotyrosine residue (pY) leads to N-SH2/PTP dissociation and SHP2 activation. Considering the relevance of SHP2 in signaling and disease and the central role of the N-SH2 domain in its allosteric regulation mechanism, we performed microsecond-long molecular dynamics (MD) simulations of the N-SH2 domain complexed to 12 different peptides to define the structural and dynamical features determining the binding affinity and specificity of the domain. Phosphopeptide residues at position -2 to +5, with respect to pY, have significant interactions with the SH2 domain. In addition to the strong interaction of the pY residue with its conserved binding pocket, the complex is stabilized hydrophobically by insertion of residues +1, +3, and +5 in an apolar groove of the domain and interaction of residue -2 with both the pY and a protein surface residue. Additional interactions are provided by hydrogen bonds formed by the backbone of residues -1, +1, +2, and +4. Finally, negatively charged residues at positions +2 and +4 are involved in electrostatic interactions with two lysines (Lys89 and Lys91) specific for the SHP2 N-SH2 domain. Interestingly, the MD simulations illustrated a previously undescribed conformational flexibility of the domain, involving the core β sheet and the loop that closes the pY binding pocket.

Year: 2020 PMID: 32395997 PMCID: PMC8007070 DOI: 10.1021/acs.jcim.0c00307

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Introduction

SH2 Domains

The idea of protein modularity, with independently folding domains of conserved sequences, began with the discovery of Src homology 2 (SH2) domains.[1] Their name comes from the identification of sequences of ∼100 amino acids conserved in numerous cytosolic tyrosine kinases, including Src, and the appendix “2” indicates that this module is the second in the Src sequence.[2] Today, we know that the human genome codes for 121 SH2 domains, contained in 111 distinct proteins.[3,4] The primary biochemical function of SH2 domains is to selectively recognize polypeptides containing a phosphotyrosine (pY), along with specific contiguous residues.[5] Tyrosine phosphorylation contributes only ∼0.5% of the total phosphoproteome, yet it plays critical roles in eukaryotic cell regulation.[6] Substrate specificities of kinases and phosphatases are broad, and their effects in signaling are controlled also by their location. The presence in their structures of domains devoted to protein/protein interactions leads to proper positioning of these enzymes close to their substrates.[7] In pY signaling, kinases “write” the phosphorylation signal, which can be “erased” by phosphatases. SH2 domains “read” this information, using it to localize signaling proteins correctly.[8] As a general scheme, binding of an extracellular ligand to a receptor tyrosine kinase induces activation of the receptor, which phosphorylates itself and other nearby proteins. These phosphorylated tyrosine residues then function as docking sites for the SH2 domains of other proteins, which are thus recruited to the cell membrane or activated, causing propagation of the signal.[9] In addition, SH2 domains enhance tyrosine phosphorylation in vivo by protecting binding sites in their target proteins from dephosphorylation.[10]

Structure of the SH2 Domains

Three hundred 3D structures of approximately 70 different SH2 domains have been determined. They reveal a highly conserved topology.[6,11] These domains contain approximately 100 amino acids, with a central β sheet, flanked by two α helices. These secondary structure elements are labeled according to their position along the sequence: βA αA βB βC βD βE βF αB βG (Figure 1A). Each residue is then numbered consecutively within the secondary structures.[12] The central β sheet divides the domain into two functionally distinct sides. The N-terminal side, flanked by helix αA, comprises the conserved pY binding pocket (formed by the BC loop); the C-terminal side, flanked by helix αB and the EF and BG loops, provides a more variable binding surface (specificity determining region) that typically engages residues C-terminal to the pY (Figure 1B).[3,9,13] The structural arrangement of the domain complexes described above corresponds to the two requirements of SH2 domains: these structural modules (i) must bind only to phosphorylated proteins and (ii) must associate specifically only with certain sequences.

Figure 1

Structure of SHP2: N-SH2 domain and whole protein. (A) The structure of the N-SH2 domain of SHP2 has the βαβββββαβ topology typical of SH2 domains. Loop BC (purple) is part of the pY binding pocket, loop DE (blue) inserts in the PTP active site in the autoinhibited SHP2 conformation, and loops EF (orange) and BG (red) control access to the groove where the phosphopeptide binds. The crystallographic structures of the N-SH2 domain (A) in the autoinhibited conformation of SHP2 and (B) when bound to a phosphopeptide differ mainly by a rearrangement of the EF loop, which in the autoinhibited state blocks the peptide binding site of the N-SH2 domain. SHP2 comprises three domains: N-SH2 (light blue), C-SH2 (orange), and PTP (pink). (C) In the absence of external stimuli, the N-SH2 domain blocks the catalytic site of the PTP domain. (D) Binding of the SH2 domain to phosphorylated sequences or pathogenic mutations favor a conformational transition leading to a rearrangement of the domains and to activation. The SHP2 structures in panels (C) and (D) are reported with their PTP domain in a similar orientation. PDB codes: (A,C) 2SHP, (B) 1AYA, (D) 6CRF.

SH2-Domain-Containing Protein Tyrosine Phosphatase 2

SH2 domains not only serve to connect the various components of signaling pathways by protein/protein interactions but often also have a role in modulating enzymatic functions. The SH2 domain-containing protein tyrosine phosphatases (PTPs) SHP1 and SHP2 contain two SH2 domains that are N-terminal to the catalytic domain, termed N-SH2 and C-SH2 (Figure 1C). In the absence of external stimuli, the N-SH2 domain interacts with the PTP active site, blocking it.[15] Association of the SH2 domains with pY motifs favors N-SH2/PTP dissociation and thereby activation of the phosphatase (Figure 1D).[6] The loss of N-SH2/PTP interactions is triggered by a conformational transition of N-SH2 that leads to a loss of complementarity between the N-SH2 and PTP surfaces. The SHP2 protein was the first oncogenic PTP discovered. Mutations of PTPN11 (the gene coding for SHP2) cause more than 30% of cases of juvenile myelomonocytic leukemia (JMML) and are variably found in other childhood malignancies.[16−19] In addition, SHP2 is required for the survival of receptor tyrosine kinase (RTK)-driven cancer cells,[20] plays an important role in resistance to targeted cancer drugs,[21] is a mediator of immune checkpoint pathways,[22] and is involved in the induction of gastric carcinoma by Helicobacter pylori.[23] PTPN11 mutations also cause Noonan syndrome and Noonan syndrome with multiple lentigines, two disorders belonging to a family of rare diseases collectively known as RASopathies.[24,25] For all these reasons, SHP2 is an important molecular target for therapies against cancer and rare diseases.[26−28] At the molecular level, pathogenic mutations of PTPN11 often cause an increase in the binding affinity of the SH2 domains of SHP2, leading to hyperactivated signaling of the Ras/MAPK pathway.[29−32] Due to their role in many signaling pathways, SH2 domains have received much attention as potential targets of pharmaceuticals.[9] The fact that short pY-containing peptides (usually five to six amino acids) are sufficient to compete with larger protein ligands for SH2 domain binding has prompted researchers both in academia and industry to develop inhibitors of clinically relevant SH2 domains.[33] However, no molecules targeting the SH2 domains of SHP2 for therapeutic purposes have been reported. Considering its role in the allosteric regulation of SHP2, the N-SH2 domain is particularly interesting in this respect.

Phosphopeptide Sequence Selectivity of the N-SH2 Domain of SHP2

Several proteins interacting with SHP2 through its SH2 domains have been identified. Lists of more than 50 known or putative interacting proteins have been compiled in the past,[34−36] and several additional partners have been reported since then.[37−43] A database of the known interactions is available at phospho.elm.eu.org. However, in many of these cases, the sites of interaction, the pY residues that bind specifically to the SHP2 N-SH2 domain, and the binding affinities have not been determined. Table 1 summarizes the phosphorylated sequences for which a high binding affinity to the N-SH2 domain of SHP2 has been reported. Although exceptions do exist, a general consensus pattern can be clearly detected, with hydrophobic residues (A, L, I, V, M, F, and P) at positions −2, +1, +3, and +5 and acidic amino acids (D or E) at positions +2 and +4.
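The consensus pattern described above can be expressed as a simple position-by-position check. The following Python sketch is only an illustration: the function name `consensus_matches` and the list-based peptide representation are assumptions introduced here, while the residue classes and positions are those stated in the text.

```python
# Residue classes as stated in the text; keys are positions relative to pY (position 0).
HYDROPHOBIC = set("ALIVMFP")
ACIDIC = set("DE")
CONSENSUS = {-2: HYDROPHOBIC, 1: HYDROPHOBIC, 3: HYDROPHOBIC, 5: HYDROPHOBIC,
             2: ACIDIC, 4: ACIDIC}


def consensus_matches(residues, first_position):
    """Return the sorted consensus positions that a peptide satisfies.

    residues: list of one-letter codes, with "pY" marking the phosphotyrosine.
    first_position: position of residues[0] relative to pY (for example -3).
    """
    if first_position > 0 or residues[-first_position] != "pY":
        raise ValueError("pY must be at position 0")
    matched = []
    for position, allowed in sorted(CONSENSUS.items()):
        index = position - first_position
        if 0 <= index < len(residues) and residues[index] in allowed:
            matched.append(position)
    return matched


# Sequences from positions -3 to +6, as listed for Gab1 pY627 and PDGFR pY1009.
gab1 = ["Q", "V", "E", "pY", "L", "D", "L", "D", "L", "D"]
pdgfr = ["S", "V", "L", "pY", "T", "A", "V", "Q", "P", "N"]
print(consensus_matches(gab1, -3))
print(consensus_matches(pdgfr, -3))
```

In this sketch the Gab1 sequence satisfies all six consensus positions, whereas PDGFR pY1009 satisfies only the hydrophobic positions −2, +3, and +5, in line with the remark that exceptions to the pattern exist.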

Table 1

Natural Sequences with a High Affinity for the N-SH2 Domain of SHP2ᵃ

protein	pY	–3	–2	–1	0	+1	+2	+3	+4	+5	+6	relative K_d	ref
Gab1	627	Q	V	E̅	pY	L	D̅	L	D̅	L	D̅	0.1*	(44)
IRS-1	1179 (1172)	G	L	N	pY	I	D̅	L	D̅	L	V	1	(45)
Gab2	614	S	V	D̅	pY	L	A	L	D̅	F	Q	2	(46)
IRS-1	896 (895)	P	G	E̅	pY	V	N	I	E̅	F	G	4–8	(47, 48)
SHPS-1	470	T	L	T	pY	A	D̅	L	D̅	M	V	10	(49)
CagA		E̅	P	I	pY	A	T	I	D̅	F	D̅	10	(23)
IRS-1	551 (546)	I	E̅	E̅	pY	T	E̅	M	M	P	A	10	(47)
PDGFR	1009	S	V	L	pY	T	A	V	Q	P	N	10–20	(47, 48)
PDGFR	763	D̅	V	K	pY	A	D̅	I	E̅	S	S	n.a.	(50)
SHPS-1	429	D̅	I	T	pY	A	D̅	L	N	L	P	n.a.	(51)

CagA: H. pylori virulence factor CagA (cytotoxin-associated gene A); Gab1 and Gab2: GRB2-associated binding proteins 1 and 2; IRS-1: insulin receptor substrate 1; PDGFR: platelet-derived growth factor receptor; SHPS-1: Src homology 2 (SH2)-domain-containing protein tyrosine phosphatase substrate 1 or signal regulatory protein α (SIRPα). Hydrophobic and anionic residues are reported in underlined bold and in overlined italics, respectively. pY numbers refer to the human sequence, except for H. pylori CagA, where the sequence refers to the EPIYA-D segment.[23] For IRS-1 pYs, rat sequence numbers are indicated in parentheses, too, as dissociation constants (Kd) were reported for the rat peptides. Relative Kd values are normalized to that of IRS-1 pY1172 (rat sequence, corresponding to human pY1179), i.e., 14 ± 8 nM.[45] The asterisk indicates that the dissociation constant of Gab1 was measured on a construct containing both the N-SH2 and C-SH2 domains, and the exact phosphopeptide sequence used in the binding assay is unclear due to inconsistencies in the reference.[44]

The sequence selectivity of the N-SH2 domain of SHP2 has been analyzed also by utilizing phosphopeptide libraries. Oriented peptide library studies have examined positions from −1 to +6 with respect to pY. More recently, high-throughput studies with surface-immobilized peptide arrays[31,32,52,53] analyzed positions from −6 to +6, but distinct preferences were observed only in the −3 to +5 sequence stretch. The results of these investigations are summarized in Table 2. Collectively, a distinct preference for hydrophobic residues at positions −2, +1, +3, and +5 emerges (consistent with the natural sequences listed in Table 1), while other positions appear to be more variable. In particular, only peptide arrays indicated a possible preference for anionic residues in positions +2 and +4.
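The normalization of the relative dissociation constants in Table 1 can be made concrete with a short calculation. The Python sketch below converts a relative value back to nanomolar using the 14 nM reference stated in the table footnote, and estimates the corresponding difference in binding free energy as RT ln(relative Kd). The temperature of 298.15 K is an assumption made here for illustration, since the assay temperature is not given in the text, and the function names are hypothetical.

```python
import math

REFERENCE_KD_NM = 14.0           # IRS-1 pY1172 (rat numbering), from the Table 1 footnote
R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)
TEMPERATURE_K = 298.15           # assumption: assay temperature is not stated in the text


def absolute_kd_nm(relative_kd):
    """Convert a relative Kd back to nM using the reference value."""
    return relative_kd * REFERENCE_KD_NM


def ddg_kj_per_mol(relative_kd, temperature_k=TEMPERATURE_K):
    """Binding free-energy difference with respect to the reference peptide."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(relative_kd)


for label, relative in [("reference", 1.0), ("relative Kd = 10", 10.0)]:
    print(label, absolute_kd_nm(relative), round(ddg_kj_per_mol(relative), 2))
```

Under these assumptions, a tenfold increase in relative Kd corresponds to roughly 5.7 kJ/mol of lost binding free energy. The large uncertainty on the reference value (± 8 nM) propagates to every absolute value obtained this way.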

Table 2

Motifs Determined from Peptide Library Studiesᵃ

X = norleucine. The sequence positions investigated in each study have a thicker border. Hydrophobic and anionic residues are reported in underlined bold and in overlined italics, respectively. Roman numerals indicate different peptide classes identified in ref (35).

Distinct selectivity features emerge from these data. Defining the determinants of N-SH2 selectivity is essential to allow the design of new peptides, peptidomimetics, and small molecules targeted to this domain. To this end, we analyzed collectively the available X-ray structures and performed several molecular dynamics (MD) simulations of N-SH2/phosphopeptide complexes.

Structures of N-SH2/phosphopeptide Complexes and MD Simulations

Nine experimental structures of N-SH2/phosphopeptide complexes, obtained by X-ray crystallography, are available (PDB codes 3TKZ, 3TL0, 4QSY, 1AYA, 1AYB, 1AYC, 5DF6, 5X7B, and 5X94). In this work, 3TKZ and 1AYC were excluded from further analysis as, in 3TKZ, a non-canonical 1:2 protein/peptide complex is formed,[55] while in 1AYC the N-SH2 domain is complexed with a nonspecific peptide.[56] The phosphopeptides present in the remaining seven structures are listed in Table 3; they include the natural sequences of IRS-1 pY896 (pY895 in rat sequence numbering) (1AYB), PDGFR pY1009 (1AYA), CagA (5X94 and 5X7B), and Gab1 pY627 (4QSY).

Table 3

N-SH2/Peptide Complexes (Experimental and Simulated)ᵃ

method	ID.chain	–7	–6	–5	–4	–3	–2	–1	0	+1	+2	+3	+4	+5	+6	+7	+8	relative K_d	ref
PDB	4QSY.B (Gab1)		g	d̅	K	Q	V	E̅	pY	L	D̅	L	D̅	L	D̅			0.1	(44)
	1AYB.P (IRS-1 895)				s	p	G	E̅	pY	V	N	I	E̅	F	g	s		4–8	(47), (48)
	1AYA.P (PDGFR 1009)					S	V	L	pY	T	A	V	Q	P	n	e̅		10	(47)
	5X94.L (CagA EPIYA-D)		a	s	p	e̅	P	I	pY	A	T	I	D̅	F	D̅			10	(23)
	3TL0.B (artificial)					r	L	N	pY	A	Q	L	W	h	r			20	(55)
	5DF6.B (TXNIP)	k	f	m	p	p	p	T	pY	T	E̅	V	D̅					400	(57)
	5X7B.L (CagA EPIYA-C)		v	s	p	e̅	P	I	pY	A	T	I	D̅	d̅	l			1500	(23)
MD	GAB1_10					Q	V	E̅	pY	L	D̅	L	D̅	L	D̅			*
	GAB1_13		G	D̅	K	Q	V	E̅	pY	L	D̅	L	D̅	L	D̅			0.1	(44)
	IRS1-1172_8						L	N	pY	I	D̅	L	D̅	L				*
	IRS1-1172_9						L	N	pY	I	D̅	L	D̅	L	V			*
	IRS1-1172_11					S	L	N	pY	I	D̅	L	D̅	L	V	K		1.0	(45)
	IRS1-1172_12					S	L	N	pY	I	D̅	L	D̅	L	V	K	D̅	*
	IRS1-895				S	P	G	E̅	pY	V	N	I	E̅	F	G	S		4–8	(47, 48)
	IMHOF9 (artificial)				A	A	L	N	pY	A	Q	L	M	F	P			5	(36)
	SWEENEY12 (artificial)						V	L	pY	M	Q	P	L	N	G	R	K	8	(35)
	IRS1-546					I	E̅	E̅	pY	T	E̅	M	M	P	A	A		10	(47)
	PDGFR-1009					S	V	L	pY	T	A	V	Q	P	N	E̅		10−20	(47, 48)
	IMHOF5 (artificial)					R	L	N	pY	A	Q	L	W	H	R			20	(36)

Hydrophobic and anionic residues are reported in underlined bold and in overlined italics, respectively. Residues in lowercase were not resolved in the crystallographic structures. References indicated in the last column concern data on relative dissociation constant (Kd) values, which were normalized to that of IRS-1 pY1172. IDs of the different simulations will be used, for the sake of brevity, in the rest of the article. Artificial peptide sequences are indicated. Asterisks indicate that the Kd for the Gab1 peptide was measured with the tandem N-SH2 and C-SH2 domains, and the exact phosphopeptide sequence used in the binding assay is unclear due to inconsistencies in the reference,[44] and that the Kd for IRS-1 pY1172 refers to the sequence spanning from −3 to +7.[45]

While these structures provide insights into the determinants of N-SH2 selectivity, characterization of the dynamics of domain/peptide complexes is essential to evaluate (i) the stability of the interactions observed in the crystallographic data and (ii) possible conformational transitions of the peptide or of the domain. In addition, no structures are available for the IRS-1 pY1179 peptide (which has one of the highest affinities among known sequences) or for high-affinity artificial peptides that were isolated in library screening studies. To address these issues, we performed 12 microsecond-long MD simulations of complexes of the N-SH2 domain with Gab1, IRS-1 pY1172, pY895, pY546 (rat sequence numbering, corresponding to human pY1179, pY896, and pY551), PDGFR pY1009, and three artificial peptides isolated in refs (35) and (36). Moreover, for Gab1 and IRS-1 pY1172, we simulated several analogues of different lengths to check for possible interactions involving N-terminal or C-terminal residues, distant from the pY (Table 3).

Methods

Initial atomic coordinates were taken from crystallographic structures. As shown in Table S1, for five of the simulated sequences (GAB1_10, GAB1_13, IRS1-895, PDGFR-1009, and IMHOF5), X-ray structures were available, but some residues had to be removed or added. In the other cases (IRS1-1172, IMHOF9, SWEENEY12, and IRS1-546), the sequence to be simulated was obtained by substituting (and adding or removing) some residues, starting from the crystallographic structures listed in Table S1. The termini of the peptides were capped by acetyl and amide groups. These modifications in the peptide molecules were performed by means of the Sequence Editor and Protein Builder functionalities in Molecular Operating Environment (MOE) (Chemical Computing Group, Inc.). The backbone of the added residues (at the termini) was initially modeled in an extended conformation. The side chains of the substituted residues were modeled by means of a conformational search using a rotamer library as starting guess and allowing repacking. The structures were minimized, with the AMBER12:EHT force field[57] in generalized Born implicit water,[58] first on substituted side chains, constraining the backbone, and then on all substituted/added amino acids and on adjacent residues, without restraints, yielding a reasonable binding pose for all peptides. In all cases, the N-SH2 domain comprised residues 3 to 103. Each protein molecule was put at the center of a dodecahedron box, large enough to contain the domain and at least 0.9 nm of solvent on all sides. The protein was solvated with explicit TIP3P water molecules.[59] All MD simulations were performed with the GROMACS 4.6.5 software package[60] using the AMBER99SB force field[61] augmented with the parm99 data set for phosphotyrosine.[62] Long-range electrostatic interactions were calculated with the particle-mesh Ewald (PME) approach.[63] A cutoff of 1.2 nm was applied to the direct-space Coulomb and Lennard-Jones interactions.
Bond lengths and angles of water molecules were constrained with the SETTLE algorithm,[64] and all other bonds were constrained with LINCS.[65] The pressure was set to 1 bar using the weak-coupling barostat.[66] Temperature was fixed at 300 K using velocity rescaling with a stochastic term.[67] For all systems, the solvent was relaxed by energy minimization followed by 100 ps of MD at 300 K while restraining protein and peptide atomic positions with a harmonic potential. The systems were then minimized without restraints and slowly equilibrated to remove any possible strains in the starting structures. Their temperature was increased in steps of 50 K from 50 to 300 K. Each step from 50 to 200 K comprised a first stage of 0.5 ns at fixed temperature and a linear temperature ramp of 50 K, lasting 0.5 ns; for the steps from 200 K to 300 K, the duration of these two stages was increased to 1 ns, and then 3 ns were performed at 300 K, for equilibration. Finally, production runs of 1 μs were performed. Analysis of structural properties was performed using the GROMACS 2016 analysis tools, on the last 500 ns of the simulations, where convergence of the structural properties was confirmed by block averaging. For crystallographic structures, hydrogen bonds were detected following the usual geometric criteria.[68] The order parameter Θχ for the side-chain dihedral angle χ was calculated as

$$\Theta_\chi = \frac{1}{N} \left| \sum_{i=1}^{N} \mathbf{e}_i \right|$$

where the summation is over the N frames in the MD trajectory and $\mathbf{e}_i$ is a two-dimensional unit vector whose phase is equal to the dihedral angle χ in structure i.[69] Θχ = 1 and Θχ = 0 correspond to a fixed dihedral and free rotation, respectively. In the present work, we limited our analysis only to the order parameter for side-chain dihedral angle χ1. Molecular graphics were prepared with UCSF Chimera (www.cgl.ucsf.edu).
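The dihedral order parameter described above can be illustrated with a few lines of Python. This is a minimal sketch, assuming that Θχ is the magnitude of the average of the unit vectors over the trajectory frames, which is consistent with the limiting values quoted in the text (1 for a fixed dihedral, 0 for free rotation). Each unit vector is represented as a complex number of unit modulus; the function name is hypothetical and is not part of any analysis package.

```python
import cmath
import math


def dihedral_order_parameter(chi_degrees):
    """Magnitude of the mean unit vector built from a series of dihedral angles.

    chi_degrees: iterable of dihedral angles (in degrees), one per trajectory frame.
    """
    angles = list(chi_degrees)
    if not angles:
        raise ValueError("at least one frame is required")
    # Each angle becomes a unit vector in the plane, stored as exp(i * chi).
    vector_sum = sum(cmath.exp(1j * math.radians(chi)) for chi in angles)
    return abs(vector_sum) / len(angles)


fixed = [-60.0] * 1000                  # a dihedral locked in one rotamer
free = [float(a) for a in range(360)]   # angles spread uniformly over the circle
print(round(dihedral_order_parameter(fixed), 3))
print(round(dihedral_order_parameter(free), 3))
```

Working with unit vectors rather than with the angles themselves avoids artifacts at the ±180° periodic boundary, where a simple standard deviation of the angle would overestimate the mobility.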

Results and Discussion

The −2 to +5 Phosphopeptide Region Interacts Tightly with the Domain

During all simulations, peptides remained in the binding cleft for the whole length of the trajectory. Figure 2 reports the root-mean-square fluctuations (RMSF) of the position of phosphopeptide atoms and the order parameters of the side-chain Cα–Cβ bonds, calculated during the 12 MD simulations. In all cases, RMSF values were less than 1.8 Å for residues in the 0 to +4 interval, indicating a very low mobility for these peptide stretches. Consistently, order parameters were generally higher than 0.75 in this peptide region, although some exceptions were present at positions +1 and +4. In principle, order parameters could be influenced by the size of the side chain, but the fact that we consistently observed high values in the central region of the peptide, irrespective of the peptide sequence, confirms the low mobility of this stretch. In many cases, residues −2, −1, and +5 were also rather stable, although a larger variability was observed compared to the central stretch. The structures in Figure 2 show that the peptide termini (outside the −2 to +5 region) can detach from the protein. Overall, these findings explain why a distinct selectivity was observed in the peptide library studies only for amino acids falling in the interval from −2 to +5 (Tables 1 and 2). This conclusion is supported by the fact that residues preceding −2 or following +5 are often unresolved in X-ray structures (Table 3).
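For readers less familiar with the RMSF, the quantity can be sketched as follows. This Python example is a simplified illustration, not the procedure used in the article (where the GROMACS analysis tools were employed): it assumes that the trajectory frames have already been superimposed on a common reference, and the function name and data layout are hypothetical.

```python
import math


def rmsf(trajectory):
    """Root-mean-square fluctuation of each atom around its mean position.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per atom, assumed to be already fitted to a common reference.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for atom in range(n_atoms):
        mean = [sum(frame[atom][k] for frame in trajectory) / n_frames
                for k in range(3)]
        mean_square = sum(
            sum((frame[atom][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(mean_square))
    return values


# Two atoms over four frames: the first oscillates by 1 Å along x, the second is static.
frames = [
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(-1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
    [(1.0, 0.0, 0.0), (5.0, 5.0, 5.0)],
]
print(rmsf(frames))
```

The choice of the atoms used for the superposition affects the result: fitting on the domain highlights the motion of the peptide with respect to the protein, whereas fitting on the peptide itself would report only its internal flexibility.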

Figure 2

Dynamics of bound peptides. Left panel: RMSF of peptides bound to N-SH2. Residues whose RMSF is less than 1 Å larger than the minimal value are colored in cyan. Middle panel: side-chain order parameter Θ. Values close to unity indicate very narrow dihedral angle distributions and therefore bonds that are rigid with respect to rotation. Bars are colored according to the following scheme: Θ lower than 0.25 (red), between 0.25 and 0.75 (gray), and greater than 0.75 (blue). A bold “x” indicates residues for which the side-chain order parameter cannot be defined (glycines and alanines). Right panel: most representative structures of the IRS1-1172_12 and IMHOF9 simulations, with the peptide backbone size and color (from blue to red) assigned based on the mobility of each residue.

The Central Region of the Peptide Is in an Extended Conformation

Figure 3 shows the Ramachandran plots of the peptide backbone in the X-ray structures and in the MD simulations for residues −2 to +5. In all cases, the dihedral angles of the conformations populated by residues from 0 to +3 fall in the top-left region of the plot, indicating an extremely stable extended structure.[70] Residues −1 and +4 are extended, too, in all crystallographic structures, but they are more mobile in the simulations, populating regions of the Ramachandran plot corresponding to helical conformations in some cases. Beyond the −1 to +4 region, the backbone conformation is variable.
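The statement that a residue populates the top-left region of the Ramachandran plot can be quantified as the fraction of frames whose (ϕ, ψ) pair falls in that region. The Python sketch below is illustrative only: the rectangular boundaries (ϕ from −180° to −45°, ψ from 90° to 180°) are an assumption chosen here to approximate the extended region and are not taken from the article.

```python
def is_extended(phi, psi, phi_range=(-180.0, -45.0), psi_range=(90.0, 180.0)):
    """True if (phi, psi), in degrees, lies in the assumed extended region."""
    return (phi_range[0] <= phi <= phi_range[1]
            and psi_range[0] <= psi <= psi_range[1])


def extended_fraction(angles):
    """Fraction of (phi, psi) pairs that fall in the assumed extended region."""
    if not angles:
        raise ValueError("at least one (phi, psi) pair is required")
    return sum(is_extended(phi, psi) for phi, psi in angles) / len(angles)


# Three extended-like frames and one helical-like frame (phi = -60, psi = -45).
sample = [(-120.0, 130.0), (-110.0, 140.0), (-60.0, -45.0), (-130.0, 150.0)]
print(extended_fraction(sample))
```

A rectangular region is a coarse approximation, since extended conformations with ψ close to ±180° wrap around the plot edge; a stricter analysis would use the allowed-region contours shown in the background of the figure.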

Figure 3

Backbone conformation of the bound peptide residues in PDB X-ray structures and in the simulations. Ramachandran plots of residues from positions −2 to +5 with respect to pY are shown. Crystallographic structures are reported in the first line (“PDB”), with the following color code: 1AYA: green, 1AYB: red, 3TL0: purple, 4QSY: black, 5DF6: orange, 5X7B: brown, 5X94: blue. The allowed regions of the Ramachandran plot are reported in cyan in the background. Angles ϕ and ψ are reported on the x and y axes, respectively, with values from −180 to +180°. The background shows the allowed regions for a standard amino acid or for Pro or Gly where present (adapted from ref (71)).

The extended peptide backbone conformation is stabilized by several H-bonds between the peptide and protein backbones, involving peptide residues −1, +1, +2, and +4 and protein residues H53 (βD4), K91 (BG7), and K89 (BG5), as illustrated in Figure 4. These interactions are present in some of the X-ray structures, and they are stably conserved in most of the MD simulations (Table 4). In addition, the MD trajectories show some transient interactions also for the backbone of residue +3 with K91 (BG7) and of +6 with Q87 (BG3) or G86 (BG2), which were not observed in the crystallographic structures.

Figure 4

Main H-bonds between the peptide and protein backbones. Most representative structure of the IRS1-1172_12 simulation. H-bonds are highlighted by green lines.

Table 4

Hydrogen Bonds between the Peptide Backbone and the N-SH2 Domainᵃ

peptide residue		–2	–1	–1	+1	+2	+3	+4	+4	+6
method	ID	N (partner given in cell)	N / V51^O (βD2)	O / H53^N (βD4)	N / H53^O (βD4)	O / K91^N (BG7)	O / K91^Nζ (BG7)	N / K89^O (BG5)	O / K89^N (BG5)	N / Q87^O (BG3)
PDB (Å)	4QSY	-	-	-	-	3.0	-	3.0	-	-
	1AYB	V51^O: 2.8	-	2.9	3.0	-	-	3.0	-	n.a.
	1AYA	-	-	-	2.7	2.8	-	2.9	-	n.a.
	5X94	-	-	-	-	-	-	-	-	-
	3TL0	-	-	2.9	2.9	2.5	-	2.8	-	n.a.
	5DF6	-	-	-	3.0	3.2	-	2.9	2.9	n.a.
	5X7B	-	-	3.0	2.8	-	-	2.9	-	n.a.
MD (%)	GAB1_10	-	-	-	73	17	-	83	90	66
	GAB1_13	-	-	68	96	92	43	97	18	18
	IRS1-1172_8	-	-	-	80	74	36	90	-	n.a.
	IRS1-1172_9	-	-	62	88	84	20	99	30	26
	IRS1-1172_11	-	-	-	93	82	29	94	-	-
	IRS1-1172_12	-	-	81	91	85	56	99	-	-
	IRS1-895	E17^Oε: 91	63	93	61	98	-	98	91	62
	IMHOF9	-	-	24	96	84	-	93	31	-
	SWEENEY12	-	-	80	91	79	-	91	-	-
	IRS1-546	-	-	-	77	97	-	96	96	85
	PDGFR-1009	-	-	-	53	92	-	97	96	G86^O: 81
	IMHOF5	-	-	-	53	69	-	99	-	-

Stable H-bonds (distance ≤ 3.5 Å in X-ray structures or persistence ≥ 50% in MD simulations) are highlighted in bold. Peptide residues are numbered with respect to the pY position. Backbone atoms involved in hydrogen bonds are shown as apices. Interatomic distances (in Å) are reported for X-ray structures, while % persistence values along the trajectory are shown for MD simulations (see the Methods section). Dashes indicate that the H-bond is not formed in X-ray structures and that it is present for less than 5% in MD simulations. No data are reported for H-bonds that were not stable in at least one of the simulations or structures. Secondary-structure-based residue numbering follows ref (56).
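The persistence values reported for the MD simulations correspond to the percentage of trajectory frames in which a given H-bond is formed. A minimal Python sketch of this bookkeeping is given below. The geometric criteria used here (donor-acceptor distance ≤ 3.5 Å and hydrogen-donor-acceptor angle ≤ 30°) are an assumption, since the article refers only to the usual geometric criteria; the 50% and 5% thresholds are those stated in the table footnote, and the function names are hypothetical.

```python
def hbond_persistence(frames, max_distance=3.5, max_angle=30.0):
    """Percentage of frames in which the H-bond criteria are satisfied.

    frames: list of (donor_acceptor_distance_in_angstrom, angle_in_degrees) pairs.
    The default cutoffs are assumed values, not taken from the article.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    formed = sum(1 for distance, angle in frames
                 if distance <= max_distance and angle <= max_angle)
    return 100.0 * formed / len(frames)


def table_entry(persistence):
    """Classify a persistence value following the table footnote."""
    if persistence >= 50.0:
        return "stable"      # shown in bold in the table
    if persistence < 5.0:
        return "-"           # reported as a dash
    return "transient"


# Four frames: two satisfy both criteria, one is too long, one is too bent.
frames = [(2.9, 10.0), (3.0, 20.0), (4.0, 10.0), (3.2, 45.0)]
value = hbond_persistence(frames)
print(value, table_entry(value))
```

Because persistence depends on the chosen cutoffs, values obtained with different criteria are not directly comparable, which is worth keeping in mind when contrasting the MD percentages with the X-ray distances in the same table.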

Phosphotyrosine Interactions

The peptide position in the N-SH2 domain is strongly stabilized also by the interactions of the pY residue with its binding pocket. Several pY interactions are widely conserved in SH2 domains. The most conserved residue is R βB5 (present in 98% of SH2 domains),[4] which forms a salt bridge with the phosphate.[72] This is by far the most stabilizing interaction[73] and is responsible for the specificity for binding pY (as opposed to other phosphoamino acids): only the long tyrosine side chain allows the phosphate to interact productively with this arginine, whereas serine and threonine are too short.[5,13] R αA2 (present in 82% of SH2 domains)[4] interacts with the phosphate group and makes an amino-aromatic interaction with the phenol ring of the pY.[9] K βD6 is located on the other side of the pY phenol ring from R αA2 so that the two residues together form a clamp around the pY.[9] The pY recognition site also contains an extensive network of hydrogen bonds.[72] In particular, S βB7 (present in 88% of SH2 domains) and T/S BC2 form direct hydrogen bonds with the phosphate. The BC loop backbone also contributes to H-bonding.[9] With respect to these general features of SH2 domains, the N-SH2 domain of SHP2 presents several peculiarities:[72,73] it has a G in place of R αA2, and in the crystallographic structures, K βD6 contacts the phenol ring solely with its hydrocarbon chain and not with the amine. Table 5 reports the H-bonds and the salt bridges formed by the phosphate in the crystallographic structures and in the simulations. The general picture described above is confirmed by our analysis of X-ray data. The R βB5 (R32)-pY phosphate ion pair is formed essentially in all structures, while K βD6 (K55) is at a larger distance. H-bonds with S βB7 (S34), S BC2 (S36), and the K BC1 (K35) backbone are consistently formed. An additional H-bond, present in all N-SH2 structures but not conserved in other SH2 domains, is formed with the side chain of T βC3 (T42).

Table 5

Hydrogen Bonds and Salt Bridges between pY and N-SH2 Residues

		hydrogen bonds					salt bridges
		S34 (βB7)	K35 (BC1)	S36 (BC2)		T42 (βC3)	R32 (βB5)	K35 (BC1)	K55 (βD6)
method	ID	side-chain Oγ	backbone N	backbone N	side-chain Oγ	side-chain Oγ1
PDB (Å)	4QSY	2.7	2.6	3.0	2.7	2.8	4.5	n.a.	5.5
	1AYB	2.8	3.0	2.9	2.7	2.9	4.0	n.a.	4.5
	1AYA	2.5	2.8	2.9	2.4	2.5	4.0	7.9	4.8
	5X94	3.2	3.2	-	2.5	3.4	4.5	9.0	6.1
	3TL0	2.8	2.8	3.5	2.9	2.9	4.2	n.a.	4.7
	5DF6	3.0	3.2	2.7	2.9	2.7	4.3	n.a.	6.9
	5X7B	-	2.8	2.6	3.3	3.2	4.0	6.6	n.a.
MD (%)	GAB1_10	100	95	-	-	99	100	79	-
	GAB1_13	98	94	-	27	99	100	62	65
	IRS1-1172_8	-	-	-	-	91	99	-	77
	IRS1-1172_9	-	-	-	-	38	32	-	94
	IRS1-1172_11	100	91	85	87	98	80	-	50
	IRS1-1172_12	-	-	-	-	96	99	-	82
	IRS1-895	-	-	-	-	84	91	-	75
	IMHOF9	-	-	-	-	61	83	-	69
	SWEENEY12	20	-	-	-	84	100	-	79
	IRS1-546	98	94	51	55	92	99	36	11
	PDGFR-1009	-	-	-	-	-	83	-	90
	IMHOF5	-	-	-	-	-	66	-	91

Stable bonds (distance ≤ 3.5 Å for H-bonds and ≤ 4.0 Å for salt bridges in X-ray structures or persistence ≥ 50% in MD simulations) are highlighted in bold. Interatomic distances (in Å) are reported for X-ray structures, while % persistence values along the trajectory are shown for MD simulations. Values lower than 5% are omitted. n.a. indicates X-ray structures where lysines 35 or 55 were not resolved in the electron density. Dashes indicate that the bond is not formed in X-ray structures and that it is present for less than 5% in MD simulations. Secondary-structure-based residue numbering follows ref (56).

In the MD simulations, the R βB5 (R32)-pY ion pair is stably maintained, as well as the H-bond formed by T βC3 (T42) (peculiar to SHP2 N-SH2). The other H-bonds are less stable, indicating a significant mobility of the SH2 BC loop. The distances between the pY phosphate and the charged side chains of R32, K35, and K55 are reported in Figure 5. Interestingly, the possible ion pair with K βD6 (K55), which is conserved in other SH2 domains but is surprisingly not present in crystallographic structures of the N-SH2 domain,[56] does form often during the simulations. In addition, while the N-SH2 domain lacks the conserved R αA2, it has a K residue in position BC1 (K35), adjacent to the phosphate-binding site. In the crystallographic structures, its side chain points toward the solvent, but in some of the simulations, conformational fluctuations of the BC loop allow the formation of this additional ion pair.
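The salt-bridge analysis can be sketched in the same spirit. The Python example below assumes that the relevant distance is the minimum separation between any phosphate oxygen of the pY and any charged nitrogen of the basic side chain, and that a salt bridge is counted as formed when this distance does not exceed 4 Å. The 4 Å value is the one stated in the text; the minimum-distance definition and the function names are assumptions made for illustration.

```python
import math


def min_distance(group_a, group_b):
    """Shortest distance between any atom of group_a and any atom of group_b."""
    return min(math.dist(a, b) for a in group_a for b in group_b)


def salt_bridge_persistence(frames, cutoff=4.0):
    """Percentage of frames in which the ion pair is within the cutoff.

    frames: list of (phosphate_oxygens, charged_nitrogens) pairs, where each
    element is a list of (x, y, z) coordinates in angstrom.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    formed = sum(1 for oxygens, nitrogens in frames
                 if min_distance(oxygens, nitrogens) <= cutoff)
    return 100.0 * formed / len(frames)


# Toy trajectory: the lysine nitrogen is close in the first frame, far in the second.
phosphate = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = [
    (phosphate, [(3.0, 0.0, 0.0)]),
    (phosphate, [(10.0, 0.0, 0.0)]),
]
print(salt_bridge_persistence(frames))
```

Using the minimum over all oxygen-nitrogen pairs makes the measure insensitive to rotation of the phosphate group or of the guanidinium group of an arginine, which would otherwise appear as spurious breaking and re-forming of the ion pair.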

Figure 5

Most common ion-pair interactions between the pY phosphate and N-SH2 residues in MD trajectories. Top panel: distribution of distances between the phosphotyrosine phosphate and protein residues. Distances of less than 4 Å (vertical red dashed lines) are indicative of a stable salt bridge. Bottom panels: N-SH2 residues that interact with the phosphate group of pY (see Table ) are shown on the left in the most representative structure of the IRS1-1172_8 simulation; the structure on the right shows the alternative arrangement of K35, where it interacts with the pY and an anionic phosphopeptide residue at position −1 (most representative structure of the GAB1_10 simulation).

“Selectivity-Determining Region”: Residues +1, +3, and +5 Insert in Hydrophobic Pockets

Selectivity of SH2 domains is commonly considered to be determined mainly by residues C-terminal to the pY. Based on the interactions in this selectivity-determining region, the domains have been classified into three classes.[2,9,73,74] The N-SH2 domain of SHP2 belongs to type II, called “open groove” or “PLC-γ1-like”, in which residues C-terminal to the pY bind in a long hydrophobic groove delimited by the EF and BG loops. This is illustrated in Figure , which shows the most representative conformation of the IRS1-1172_8 MD simulation. With the pY inserted in its binding pocket, the extended conformation of the peptide backbone forces residues +1, +3, and +5 to point toward the protein core and to insert into the hydrophobic ridge. Residues +2 and +4 point toward the solvent but can interact with the BG and EF loops, which delimit the groove. In the N-terminal region of the peptide, residue −1 is solvent-exposed, while residue −2 points toward the protein surface in the region of helix αA.

Figure 6

Most representative conformation in the IRS1-1172_8 MD simulation, illustrating the main specificity-determining side-chain interactions. Top: hydrophobic regions of the domain surface are shown in green, while cationic K89 and K91 are shown in blue. Bottom: interactions of the L −2 residue (gray surface), which inserts between the pY ring (red) and V14 (green).

Residues +1, +3, and +5 of the peptide (I, L, and L, respectively) form several interactions with hydrophobic amino acids that line the groove, remaining in contact with them for the whole length of the MD trajectory. In particular, residue +1 interacts with I54 (βD5), I96 (BG12), and methyl groups in the side chains of T52 (βD3) and E90 (BG6); residue +3 makes stable interactions with I54 (βD5), L65 (βE4), L88 (BG4), and I96 (BG12); and residue +5 interacts with L65 (βE4), Y81 (αB9), and L88 (BG4). Interestingly, the +1 pocket is the only one where polar residues are present in addition to hydrophobic amino acids. This might explain why peptide library studies and the sequences of known binding partners indicate that T can be present at position +1 of the peptide. As shown in Table , in the crystallographic structures 1AYA and 5DF6 (where a T is present in +1), no direct H-bond is formed between this residue and the protein domain. By contrast, our simulations show that an H-bond can indeed be formed with either T52 (βD3) or E90 (BG6). In one case (1AYA and PDGFR-1009), the peptide present in the crystal and in the simulation is the same. Even so, the protein and peptide mobility normally present in solution can allow the formation of an H-bond that was not observed in the crystallographic structure.

Table 6

Hydrogen Bonds and Salt Bridges between Peptide Side Chains and the N-SH2 Domain^a

		H-bonds					salt bridges
method	ID	–1	+1	+2	+4	+6	–1 K35 (BC1)	+2 K91 (BG7)	+4 K89 (BG5)	+6 K91 (BG7)
PDB (Å)	4QSY	-	n.a.	-	-	D-Oδ G68^N (EF3): 3.0	n.a.	D:7.6	D:9.4	D:15
	1AYB	-	n.a.	-	-	n.a.	n.a.	n.a.	E:7.3	n.a.
	1AYA	n.a.	-	n.a.	-	n.a.	n.a.	n.a.	n.a.	n.a.
	5X94	n.a.	n.a.	-	-	-	n.a.	n.a.	n.a.	n.a.
	3TL0	-	n.a.	-	n.a.	n.a.	n.a.	n.a.	n.a.	n.a.
	5DF6	-	-	-	-	n.a.	n.a.	E:4.2	D:4.7	n.a.
	5X7B	n.a.	n.a.	-	-	n.a.	n.a.	n.a.	n.a.	n.a.
MD (%)	GAB1_10	*	n.a.	*	*	*	E:58	D:62	D:8	D:23
	GAB1_13	E-Oε H53^Nε (βD4): 42	n.a.	*	*	*	E:23	D:81	D:16	D:15
	IRS1-1172_8	N-Nδ V51^O (βD2): 24	n.a.	*	*	n.a.	n.a.	D:83	D:16	n.a.
	IRS1-1172_9	N-Nδ V51^O (βD2): 44	n.a.	*	*	n.a.	n.a.	D:65	D:22	n.a.
	IRS1-1172_11	-	n.a.	*	*	n.a.	n.a.	D:71	D:27	n.a.
	IRS1-1172_12	N-Nδ V51^O (βD2): 19	n.a.	*	*	n.a.	n.a.	D:83	D:14	n.a.
	IRS1-895	-	n.a.	-	E-Oε N92^N (BG8): 43	n.a.	-	n.a.	E:30 E:18 (K91)	n.a.
	IMHOF9	-	n.a.	Q-Oε K91^Nζ (BG7): 21	n.a.	n.a.	n.a.	n.a.	n.a.	n.a.
	SWEENEY12	n.a.	n.a.	Q-Oδ K91^Nζ (BG7): 29	n.a.	n.a.	n.a.	n.a.	n.a.	n.a.
	IRS1-546	*	T-Oγ T52^Oγ (βD3): 24	*	n.a.	n.a.	E:31	E:57	n.a.	n.a.
	PDGFR-1009	n.a.	T-Oγ E90^Oε (BG6): 49	n.a.	Q-Nε E90^O (BG6): 22	N-Nδ Q87^O (BG3): 62	n.a.	n.a.	n.a.	n.a.
	IMHOF5	-	n.a.	Q-Oε K91^Nζ (BG7): 27	W-Nε Q87^O (BG3): 91	-	n.a.	n.a.	n.a.	n.a.

The same protein residue numbering and definitions for stable interactions (reported in bold) of Table were applied here. Distances (Å) and % persistence are reported for X-ray structures and MD simulations, respectively. Peptide residues are numbered with respect to the pY. n.a. indicates that the peptide residue is missing, that the specific amino acid cannot form H-bonds/salt bridges, or that the protein residue (Lys 35, 89, or 91) was not resolved in the X-ray electron density. Dashes indicate that the H-bond is not formed in X-ray structures or that it is present for <5% of the trajectory in MD simulations; asterisks indicate that the H-bond is not reported because the same interaction was considered as an ion pair.

To quantify the stability of the hydrophobic interactions between each peptide residue and the N-SH2 domain during all simulations, Figure reports the solvent-accessible surface (SAS) for each side chain. For comparison, the same parameter was calculated in the available crystallographic structures. Quantitative values are reported in Table S2. For all the simulated sequences, residues +1 and +3 remained stably embedded in the domain groove. Residue +5 was also buried in all cases where a hydrophobic side chain was present at that position (GAB1, IRS1-1172, IRS1-895, and IMHOF9 simulations, where residue +5 is L or F), with the single exception of the IRS1-1172_11 trajectory.

Figure 7

Solvent exposure of phosphopeptide residues; except for pY, each residue is colored in green when its solvent accessibility is lower than 50% and in red when it is higher than 50%. For MD simulations, an average value is reported. Hydrophobic, anionic, and cationic residues are colored in green, red, and blue, respectively.

Overall, the hydrophobic interactions involving residues +1, +3, and +5 of the peptide, which characterize type II SH2 domains, remained stable in most of our simulations, corroborating their importance in determining the affinity and selectivity of the N-SH2 domain of SHP2.
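The 50% exposure criterion used to color residues can be sketched as follows. The sketch assumes that "solvent accessibility" is the side-chain SAS expressed as a percentage of a reference value for the fully exposed side chain; the reference values and the handling of exactly 50% are not specified in this section, so both are assumptions here, and the function names are hypothetical.

```python
from statistics import mean


def relative_exposure(sas_per_frame, reference_sas):
    """Trajectory-averaged side-chain SAS as a percentage of a reference value.

    `reference_sas` stands for the SAS of the fully exposed side chain
    (an assumption of this sketch; the source does not state how the
    percentage is normalized).
    """
    if reference_sas <= 0:
        raise ValueError("reference SAS must be positive")
    return 100.0 * mean(sas_per_frame) / reference_sas


def burial_color(sas_per_frame, reference_sas, threshold=50.0):
    """'green' if average exposure is below the threshold, otherwise 'red'."""
    exposure = relative_exposure(sas_per_frame, reference_sas)
    return "green" if exposure < threshold else "red"


# Toy SAS values (arbitrary area units), not taken from the simulations.
print(burial_color([10.0, 30.0], reference_sas=100.0))   # mostly buried
print(burial_color([80.0, 100.0], reference_sas=150.0))  # mostly exposed
```

For a crystallographic structure, the same functions apply with a single-element list in place of the per-frame series.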

Characteristic Features of the SHP2 N-SH2 Domain: Interactions of Residues −2, −1, +2, and +4

The N-SH2 domain of SHP2, while part of class II, presents peculiar features that could affect its binding selectivity. As discussed in the section focusing on the pY interactions, more than 80% of SH2 domains have a conserved arginine at position αA2. By contrast, the SH2 domains of SHP2, SHP1, and MATK have a glycine at that position.[3] In the N-SH2 domain, the lack of a side chain at position 13 (G αA2) favors the accessibility of an exposed V14 at position αA3. This peculiarity has been previously described[9,56,73] and explains why the N-SH2 of SHP2 is one of the few SH2 domains in which residues N-terminal to the pY contribute to the binding specificity. Indeed, in several simulations, we observed that hydrophobic residues in −2 inserted between the pY ring and V14, interacting hydrophobically with both (Figure ). A second peculiarity, which has received limited attention in the literature, is that the N-SH2 domain has two K residues one amino acid apart in loop BG (K89 and K91, BG5 and BG7). The alignment of the human SH2 domains[3] shows that positively charged residues in the BG loop are rather frequent and that 33 of the total 120 domains have a cationic amino acid in the position corresponding to K91. However, we noticed that a (K/R-X-K/R) pattern in the positions corresponding to K89 and K91, which face toward the peptide binding groove, is shared only by the SHP2 N-SH2 domain and by the C-terminal SH2 domains of PLC-γ1 and PLC-γ2. In principle, these side chains could form electrostatic interactions with acidic residues present in +2 and +4 of the peptide, which are shown to be favorable at those positions by peptide array studies and by the sequences of high-affinity natural partners (Tables and 2). In the available crystallographic structures, these interactions would be possible in 4QSY, 1AYB, and 5X94, where a D/E residue is present at positions +2, +4, or both. However, rather surprisingly, a bona fide ion pair is not formed in any of these structures (Table ).
Among the simulated sequences, a D/E residue is present at position +2 or +4 (or both) in eight of the twelve simulations. Unlike the X-ray conformations, our simulations show that a stable salt bridge forms between the +2 residue and K91 in all cases where this is possible. An ion pair between residue +4 and K89 forms, too, although only for a fraction of the simulation time (Table and Figure ). Interestingly, even polar, uncharged residues at positions +2 and +4 can interact with K89 and K91 by forming H-bonds (which, again, were not observed in the crystallographic structures). Therefore, the simulations indicate that electrostatic or H-bonding interactions between the BG loop and residues +2 and +4 can contribute significantly to the binding affinity and selectivity.
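The (K/R-X-K/R) pattern mentioned above lends itself to a simple sequence search. The following sketch scans a one-letter amino acid string for every occurrence of the pattern, including overlapping ones. It is only an illustration: the comparison described in the text was restricted to the alignment positions corresponding to K89 and K91, whereas this hypothetical helper reports matches anywhere in the string, and the example sequence is made up.

```python
import re

# Lookahead so that overlapping occurrences (e.g. K-X-K-X-R) are all found.
BASIC_PAIR = re.compile(r"(?=([KR][A-Z][KR]))")


def find_basic_pairs(sequence):
    """Return (1-based start position, triplet) for each K/R-X-K/R match."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in BASIC_PAIR.finditer(seq)]


# Made-up sequence fragment, used only to exercise the search.
print(find_basic_pairs("AAKEKLRGR"))
```

To reproduce the kind of comparison described in the text, one would apply the check only to the aligned columns of interest rather than to whole sequences.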

Figure 8

Most common intermolecular ion-pair interactions between the phosphopeptide side chains and the N-SH2 domain. Distribution of charged group distances populated in each MD trajectory. Distances of less than 4 Å (vertical red dashed lines) are indicative of a stable salt bridge. Dashed horizontal lines indicate that the corresponding phosphopeptide sequences lack an anionic residue at these positions and therefore cannot form the ion pair.

A third characteristic feature of the N-SH2 domain of SHP2 is the K residue at position BC1, which is otherwise present only in the C-terminal domain of ZAP70, while an R is present at that position in the N-SH2 domain of SHP1. As discussed above, in the crystallographic structures, K35 points toward the solvent. However, in the simulations, when E was present at position −1 (with the single exception of IRS1-895), it interacted electrostatically with K35 (BC1). Interestingly, the trajectories in which this happened (GAB1 and IRS1-546) were the same in which the K35-pY ion pair was observed, as discussed above (Table ). Probably, the negative residue in −1 favors a conformational transition that brings the K35 side chain from being solvent-exposed to pointing toward the domain core and interacting with the pY, where it partially replaces K55 (Figure ). A high mobility of K35 is supported by the observation that its side chain is not resolved in the electron density of several crystallographic structures (Table ). In addition, during the simulations, polar residues in −1 could also form marginally stable H-bonds with amino acids of the βD strand. Finally, our simulations showed that some interactions are also possible for negatively charged or polar residues in +6. An aspartate in that position can interact electrostatically with K91 (BG7) (although without forming a stable ion pair due to the flexibility of the C-terminal end of the peptide). By contrast, in the crystallographic structure 4QSY, D +6 and K91 are very distant. In addition, the side chain of residue +6 can also form an H-bond with the EF or BG loops.

The N-SH2 Domain Populates Different Conformations

The data reported above on interactions in the pY binding pocket in the MD simulations indirectly suggested a significant conformational variability of this region. This is clearly shown by an overall analysis of the domain mobility in the 12 trajectories. As shown in Figure , the most mobile regions were the BC loop, which forms the pY pocket, and the EF and BG loops, which control access to the hydrophobic specificity region.

Figure 9

N-SH2 domain conformational variability in the MD simulations. Root-mean-square fluctuations (RMSF) of the N-SH2 domain backbone in the cumulative trajectory including all 12 simulations. The domain secondary structure is reported at the bottom for reference. The most mobile loops are highlighted in red in the figure. The blue-shaded area represents the standard deviation of the RMSF profile calculated between the twelve 1 μs trajectories.

Figure analyzes the conformation of these flexible regions. For the BC loop, it reports its average distance from T42, which is located in the pY pocket, on the βC strand (βC3) (Figure , left panel). While this loop is closed in all X-ray structures of the N-SH2 domain of SHP2, in our simulations, we find that it can change its structure significantly, also populating a more open conformation. Since this region is highly conserved in SH2 domains, we compared the MD conformations to those observed in experimental structures (both crystallographic and NMR, obtained in solution) of other SH2 domains. An open conformation of the BC loop is observed in only a few of the crystallographic structures but is significantly populated in solution according to NMR data. Therefore, our simulations might have observed a conformation of the pY loop that had not been previously reported for the N-SH2 domain of SHP2, possibly because it is disfavored by the crystal environment and by intermolecular crystallographic contacts.
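The RMSF profile and its spread across trajectories, as described in the Figure 9 caption, can be sketched in a few lines. The sketch assumes that all frames have already been superposed on a common reference, that one coordinate triplet per residue is used, and that the spread is a population standard deviation; none of these choices is stated in this section, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def rmsf(frames):
    """Per-atom RMSF for pre-aligned frames.

    `frames` is a list of frames; each frame is a list of (x, y, z)
    tuples, one per atom, in the same order in every frame.
    """
    n_frames = len(frames)
    profile = []
    for i in range(len(frames[0])):
        center = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - center[k]) ** 2 for k in range(3)) for f in frames
        ) / n_frames
        profile.append(math.sqrt(msd))
    return profile


def rmsf_spread(trajectories):
    """Per-atom mean and standard deviation of RMSF across trajectories."""
    profiles = [rmsf(t) for t in trajectories]
    per_atom = list(zip(*profiles))
    return [mean(v) for v in per_atom], [pstdev(v) for v in per_atom]


# Two toy single-atom trajectories (coordinates in angstrom).
traj_a = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
traj_b = [[(0.0, 0.0, 0.0)], [(0.0, 6.0, 0.0)]]
print(rmsf_spread([traj_a, traj_b]))
```

An RMSF over a cumulative trajectory would be obtained by concatenating the frames of all trajectories before calling `rmsf`.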

Figure 10

Structural parameters in simulated and experimental structures. Left: conformation of the pY pocket, as measured from the average distance between residues in the pY-loop (BC, residues 34–38) and T42 (βC3) in N-SH2 or structurally equivalent residues in other SH2 domains (see the Supporting Information). Center: conformation of the loops controlling access to the selectivity-determining region, as measured from the minimum distance between the EF loop (residues 66–69) and BG loop (residues 84–96). Right: conformation of the central β sheet as measured from the interstrand distance between the C atom of D40 (βC1) and N atom of Q57 (βD’1) or structurally equivalent residues in other SH2 domains. Data from the overall MD simulation of 12 N-SH2:peptide complexes are shown in black, along with analogous data from X-ray (red) and NMR (green) structures of SH2 domains. Values for experimental structures of isolated N-SH2 domains are shown as blue (when phosphopeptide-bound) or red dots (with no bound peptide). Values for structures of the domain in the whole SHP2 protein are reported as cyan (autoinhibited conformation) or orange dots (active conformation). Average ± standard deviation values of distances spanned by the individual simulations are indicated by black horizontal bars, reported in the order of Table , with GAB1_10 at the bottom and IMHOF5 at the top.

The EF and BG loops, which regulate the accessibility of the specificity region, are distant in all structures of phosphopeptide/N-SH2 complexes and more closed in the structure of the autoinhibited state of SHP2. Indeed, based on structural data, this transition has been hypothesized to be part of the allosteric switch controlling SHP2 activity and binding affinity.[15,75,76] In our simulations, we find that the loops can attain a significantly closed conformation even when a phosphopeptide is present in the binding cleft. In some trajectories (GAB1_10, GAB1_13, SWEENEY12, and IRS1-546), they stably embraced the peptide, clasping it tightly and coming into contact with each other. The high sequence variability of the BG loop does not allow a quantitative comparison with the structures of other SH2 domains in this case. However, while such closed conformations have never been observed in X-ray structures of SHP2, for other SH2 domains, the EF and BG loops have been described as a “set of jaws” that clamp down on the peptide.[12,72]

Another element of structural flexibility that we observed in our simulations is a variable length for the central β sheet. As shown in Figure , values going from ∼5 to ∼12 Å are populated for the distance between the N-terminal residue of the C strand (D40, βC1) and the opposite residue in strand D (Q57, βD’1). A similar variability (although in a smaller range) is present in the X-ray structures of the N-SH2 domain and also in the experimental structures of other SH2 domains. However, to the best of our knowledge, this important feature of conformational flexibility has not been previously discussed. These different conformational features are illustrated in Figure , which reports the most representative structures of two simulations.
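The three structural parameters defined in the Figure 10 caption (an average loop-to-residue distance, a minimum loop-to-loop distance, and a single interatomic distance) reduce to elementary geometry on a single frame. The sketch below assumes that each residue is represented by one coordinate triplet; which atoms were actually used for the loop distances is not stated in this section, and the function names and coordinates are hypothetical.

```python
import math


def mean_distance_to_reference(group, reference):
    """Average distance from each point in `group` to a reference point."""
    return sum(math.dist(p, reference) for p in group) / len(group)


def minimum_distance(group_a, group_b):
    """Smallest pairwise distance between two groups of points."""
    return min(math.dist(a, b) for a in group_a for b in group_b)


# Toy coordinates (angstrom); not taken from any structure.
bc_loop = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]   # stand-in for residues 34-38
t42 = (0.0, 0.0, 0.0)                          # stand-in for T42
ef_loop = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]  # stand-in for residues 66-69
bg_loop = [(0.0, 3.0, 4.0), (10.0, 0.0, 1.0)]  # stand-in for residues 84-96

print(mean_distance_to_reference(bc_loop, t42))  # pY-pocket opening
print(minimum_distance(ef_loop, bg_loop))        # EF/BG loop separation
print(math.dist((0.0, 0.0, 0.0), (0.0, 0.0, 7.5)))  # interstrand distance
```

Applying these functions frame by frame yields the per-trajectory distributions whose averages and standard deviations are summarized in the figure.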

Figure 11

Conformational variability of the peptide-bound N-SH2 domain. Most representative structures of simulations IRS1-1172_9 and IRS1-1172_11, showing the conformational transitions of BC (purple), EF (orange), and BG (red) loops and of the central β sheet connecting strands C and D. The DE loop is highlighted in blue. The peptide surface is shown in yellow.

As shown in Figure , each individual simulation populated only one region of the overall conformational space. This finding could be due to an effect of the peptide sequence on the conformational properties of the domain, but it could also be caused by insufficient sampling of the conformational space in the single simulations. Further studies will be required to clarify these aspects.

Conclusions

This work analyzed the structural determinants of the binding affinity and selectivity of the N-SH2 domain of SHP2. Some of the features responsible for the sequence preferences of this domain were already visible in the previously published crystallographic structures. The simulations confirmed that, even in solution and notwithstanding the significant motions of the domain and of the bound peptide, these interactions are conserved. In particular, residues −2 to +5 interact stably with the domain, and this region of the peptide adopts an extended conformation (particularly from 0 to +3). The pY is stabilized in its pocket by multiple electrostatic and H-bonding interactions, while hydrophobic residues are needed at positions +1, +3, and +5, where they interact with apolar side chains of the domain binding groove. These properties are common to type II SH2 domains.

However, the simulations confirmed some peculiarities of the N-SH2 domain of SHP2, which differentiate it from other SH2 domains and might contribute to its selectivity. Specifically, in place of the commonly conserved R αA2, the N-SH2 domain of SHP2 has G13. As a consequence, a hydrophobic peptide residue at position −2 can insert in the space left free by the missing side chain and interact with the accessible side chain of V14 αA3, as well as with the phenol ring of pY, stabilizing its orientation and the overall complex. Indeed, selectivity for residues N-terminal to the pY is peculiar to the N-SH2 domain. Another characteristic property of the N-SH2 domain of SHP2 is the nonconserved T42 in βC3, which forms a stable H-bond with the pY phosphate.

More importantly, the simulations highlighted some features that were not visible in the crystal structures, thus providing novel insights into the binding preferences of the N-SH2 domain. A peculiarity of this domain is the K-X-K motif in the region of the BG loop facing toward the peptide binding groove. Anionic residues at positions +2 and +4 strongly interact with the two K side chains. Even polar amino acids at these positions in the peptide sequence can interact with them through H-bonds. These observations are supported by the frequent presence of acidic residues at these positions in the sequences of natural binding partners, while a similar sequence selectivity did not emerge clearly from peptide library studies. Another feature characterizing the N-SH2 domain is that, in some cases, interactions extended up to residue +6 through H-bond or ion-pair formation with the EF or BG loops. This previously unexplored possibility warrants further investigation. Polar amino acids at +1 can form H-bonds with residues in the corresponding domain pocket. This finding explains why a T residue was shown to be permitted at that position by library studies (in addition to hydrophobic amino acids). Surprisingly, the conserved K βD6 does not form an ion pair with the pY phosphate in crystallographic structures. MD simulations indicated that, in solution, a slight rearrangement of the pY binding pocket might allow salt-bridge formation. Another cationic residue is present in the pY pocket (K35, BC1), but in the crystallographic structures, it points toward the solvent, without interacting with the pY. Our simulations showed that the presence of an acidic residue at position −1 of the phosphopeptide can favor a conformational transition that brings K35 toward the domain. In this new orientation, it interacts both with pY and with the residue in −1.

Finally, we observed in our simulations a significant conformational flexibility of the domain. These conformational transitions were associated with the BC loop (which forms the pY pocket), with the EF and BG loops controlling access to the peptide binding groove, and with the central βC and βD strands, and were broader than those previously hypothesized based on the different crystallographic structures of the domain. Investigation of the possible role of these motions in the function of SHP2 will require a more extensive exploration of the conformational properties of the N-SH2 domain.

64 in total

Review 1. Molecular recognition by SH2 domains.

Authors: J Michael Bradshaw; Gabriel Waksman
Journal: Adv Protein Chem Date: 2002

2. Sequence specificity of SHP-1 and SHP-2 Src homology 2 domains. Critical roles of residues beyond the pY+3 position.

Authors: Diana Imhof; Anne-Sophie Wavreille; Andreas May; Martin Zacharias; Susheela Tridandapani; Dehua Pei
Journal: J Biol Chem Date: 2006-05-15 Impact factor: 5.157

3. Canonical sampling through velocity rescaling.

Authors: Giovanni Bussi; Davide Donadio; Michele Parrinello
Journal: J Chem Phys Date: 2007-01-07 Impact factor: 3.488

4. Comparison of multiple Amber force fields and development of improved protein backbone parameters.

Authors: Viktor Hornak; Robert Abel; Asim Okur; Bentley Strockbine; Adrian Roitberg; Carlos Simmerling
Journal: Proteins Date: 2006-11-15

5. Structural and functional effects of disease-causing amino acid substitutions affecting residues Ala72 and Glu76 of the protein tyrosine phosphatase SHP-2.

Authors: Gianfranco Bocchinfuso; Lorenzo Stella; Simone Martinelli; Elisabetta Flex; Claudio Carta; Francesca Pantaleoni; Basilio Pispisa; Mariano Venanzi; Marco Tartaglia; Antonio Palleschi
Journal: Proteins Date: 2007-03-01

6. Crystal structures of peptide complexes of the amino-terminal SH2 domain of the Syp tyrosine phosphatase.

Authors: C H Lee; D Kominos; S Jacques; B Margolis; J Schlessinger; S E Shoelson; J Kuriyan
Journal: Structure Date: 1994-05-15 Impact factor: 5.006

7. Recognition of a high-affinity phosphotyrosyl peptide by the Src homology-2 domain of p56lck.

Authors: M J Eck; S E Shoelson; S C Harrison
Journal: Nature Date: 1993-03-04 Impact factor: 49.962

8. SH2 domains from suppressor of cytokine signaling-3 and protein tyrosine phosphatase SHP-2 have similar binding specificities.

Authors: David De Souza; Louis J Fabri; Andrew Nash; Douglas J Hilton; Nicos A Nicola; Manuel Baca
Journal: Biochemistry Date: 2002-07-23 Impact factor: 3.162

9. Protein tyrosine phosphatase SHP2/PTPN11 mistargeting as a consequence of SH2-domain point mutations associated with Noonan Syndrome and leukemia.

Authors: Pia J Müller; Kristoffer T G Rigbolt; Dirk Paterok; Jacob Piehler; Jens Vanselow; Edwin Lasonder; Jens S Andersen; Fred Schaper; Radoslaw M Sobota
Journal: J Proteomics Date: 2013-04-11 Impact factor: 4.044

10. Phosphotyrosine recognition domains: the typical, the atypical and the versatile.

Authors: Tomonori Kaneko; Rakesh Joshi; Stephan M Feller; Shawn Sc Li
Journal: Cell Commun Signal Date: 2012-11-07 Impact factor: 5.712
