In-silico analysis of nonsynonymous genomic variants within CCM2 gene reaffirm the existence of dual cores within typical PTB domain.

Akhil Padarti^1,2, Ofek Belkin¹, Johnathan Abou-Fadel¹, Jun Zhang¹.

Abstract

PURPOSE: The objective of this study is to validate the existence of dual cores within the typical phosphotyrosine binding (PTB) domain and to identify potentially damaging and pathogenic nonsynonymous coding single nucleotide polymorphisms (nsSNPs) in the canonical PTB domain of the CCM2 gene, mutations in which cause cerebral cavernous malformations (CCMs).
METHODS: The nsSNPs within the coding sequence for the PTB domain of the human CCM2 gene, retrieved from extensive database searches, were analyzed for their functional and structural impact using a series of bioinformatic tools. The effects of these nsSNPs on the tertiary structure of the PTB cores in the human CCM2 protein were then predicted.
RESULTS: Our mutation analysis, through alignment of protein structures between wildtype and mutant CCM2, predicted that the structural impact of each pathogenic nsSNP is biophysically limited to the region spatially adjacent to the substituted amino acid, with minimal structural influence on the neighboring core of the PTB domain, suggesting that both cores are independently functional and essential for proper CCM2 PTB function.
CONCLUSION: Utilizing a combination of protein conservation and structure-based analysis, we analyzed the structural effects of inherited pathogenic mutations within the CCM2 PTB domain. Our results predicted that the pathogenic amino acid substitutions lead to only subtle changes locally, confined to the surrounding tertiary structure of the PTB core within which it resides, while no structural disturbance to the neighboring PTB core was observed, reaffirming the presence of independently functional dual cores in the CCM2 typical PTB domain.

Keywords: Amino acid substitution; CCMs, cerebral cavernous malformations; CSC, CCM signaling complex; CUPSAT, Cologne University Protein Stability Analysis Tool; HOPE, Have (y)Our Protein Explained; I-TASSER, the iterative threading assembly refinement; INDELs, insertions/deletions; In-silico analysis; MAF, minor allele frequency; Nonsynonymous single nucleotide polymorphisms (nsSNPs); PANTHER, Protein ANalysis THrough Evolutionary Relationship; PDB, protein data bank; PH, pleckstrin homology; POLYPHEN-2, Polymorphism Phenotyping; PROVEAN, Protein Variation Effect Analyzer; PTB, phosphotyrosine binding; PTCs, premature termination codons; SIFT, Sorting Intolerant From Tolerant; Single nucleotide polymorphisms (SNPs); Superimposition of protein structures; Tertiary structure; nsSNP, nonsynonymous single nucleotide polymorphism

Year: 2022 PMID: 35128084 PMCID: PMC8808078 DOI: 10.1016/j.bbrep.2022.101218

Source DB: PubMed Journal: Biochem Biophys Rep ISSN: 2405-5808

Introduction

Scaffold proteins have essential roles in various pivotal cellular signaling cascades [1]. One recurring domain shared by scaffolding proteins is the phosphotyrosine binding (PTB) domain [2,3]. As the second-largest family of phosphotyrosine recognition domains, the PTB domain was shown to have evolved from pleckstrin homology (PH) domains, as PTB and PH domains are structurally and functionally comparable and share the PH superfold [4]. In contrast to the single binding pocket in PH and FERM domains, our lab discovered that the full-length typical PTB domain contains two equal, unique, versatile, and independent binding pockets (PTB dual cores), allowing the domain to bind simultaneously to multiple NPXY motifs present in the cytoplasmic tails of membrane receptors [5]. In this report, we used confirmed genetic data, together with the corresponding biological phenotypes, to validate the existence of PTB dual cores, which can help refine PTB domain binding partners within the cell. CCM2 is a PTB domain-containing protein that binds to CCM1 through either one [6] or two [7] NPXY motifs. CCM1 and CCM3 bind to CCM2, which serves as a docking site, to form the CCM signaling complex (CSC) in combination with other CSC members; the CSC plays a key role in multiple essential cellular processes [8]. To validate the existence of two functional binding pockets in the CCM2 typical PTB domain, which is used to bind CCM1, this report focuses solely on CCM1/CCM2 interactions via the CCM2 typical PTB domain. Several mutations that disrupt the CCM1/CCM2 interaction have been implicated in cerebral cavernous malformations (CCMs), an autosomal dominant disease with incomplete penetrance: even individuals in the same family who carry the same mutation have variable clinical outcomes, with some developing CCMs while others never show any clinical CCM symptoms. Mutations on both alleles of one of the CCM genes have been reported only in CCM lesions [8].
While it has been experimentally shown that both PTB cores (Core1 and Core2) in CCM2 are independently capable of binding to NPXY motifs on CCM1 [9], the biological relevance of dual PTB cores remains elusive. To date, more than 150 distinct pathogenic CCM1/CCM2/CCM3 germline mutations causing CCMs (OMIM 116860) have been implicated in 87–98% of familial CCMs [10]. Approximately 80 genomic variants have been reported in the CCM2 gene, and half of them are nonsense loss-of-function mutations or frameshifting insertions/deletions (INDELs) leading to premature termination codons (PTCs) [11]. In general, amino acid substitutions can affect protein function through altered folding/stability, aberrant post-translational modifications, disruption of domain functions, and/or altered sites of interaction [12]. Experimental methods for protein structure determination, notably X-ray crystallography and NMR spectroscopy, are well established and have generated substantial structural data. With an enormous amount of protein structure information available in the Protein Data Bank (PDB), various computational (in-silico) approaches have been developed to model the 3D structure of a protein from its known amino acid sequence [13,14,15]. Since it is unclear whether both CCM2 PTB cores are necessary and essential for CCM2 functionality, we reasoned that the effect of a nonsynonymous single nucleotide polymorphism (nsSNP) at the protein level should be studied with in-silico methodology, both to predict the effect of single amino acid substitutions during CCM pathogenesis and to validate the existence of functional dual PTB cores in CCM2. This will be accomplished through phenotype/genotype correlation during CCM pathogenesis to evaluate the functionality of the CCM2 PTB cores.

Materials and methods

To thoroughly investigate the clinical relevance of the nonsynonymous genomic variants within the CCM2 gene, we searched well-known databases including HGMD (the Human Gene Mutation Database, http://www.hgmd.cf.ac.uk), ClinVar (a public archive with interpretations of clinically relevant variants, https://www.ncbi.nlm.nih.gov/clinvar/), gnomAD (the Genome Aggregation Database, https://gnomad.broadinstitute.org), the 1000 Genomes Project (https://www.ncbi.nlm.nih.gov/variation/tools/1000genomes), dbSNP (the Single Nucleotide Polymorphism database, https://www.ncbi.nlm.nih.gov/snp/), OMIM (the Online Mendelian Inheritance in Man, https://www.omim.org), Angioma Alliance (https://www.angioma.org/), and ESP (NHLBI Exome Sequencing Project, https://evs.gs.washington.edu/EVS/). To predict the impact of nonsynonymous genomic variants within the CCM2 protein, we utilized well-known bioinformatic tools, which can be categorized into three major groups: protein conservation-based analysis, protein structure-based analysis [16], or a combination thereof, with a selected array of in-silico predictive algorithms to evaluate the genomic variants [16,17]. For evolutionary conservation-based approaches, we used SIFT (Sorting Intolerant From Tolerant, https://sift.bii.a-star.edu.sg/www/Extended_SIFT_chr_coords_submit.html) [18], PANTHER (Protein ANalysis THrough Evolutionary Relationship, http://www.pantherdb.org/tools/) [19], and MUTATION ASSESSOR (http://mutationassessor.org/r3/). For homology-based prediction (sequence similarity/alignment), we utilized PROVEAN (Protein Variation Effect Analyzer, https://provean.jcvi.org/index.php) [20]. For protein structure/function prediction combined with an evolutionary conservation-based approach, we selected POLYPHEN-2 (Polymorphism Phenotyping Ver. 2.0, https://genetics.bwh.harvard.edu/pph2/) [21] and MUTATION TASTER (http://www.mutationtaster.org/) [22].
For protein structure stability measurements, we utilized MUPRO (http://mupro.proteomics.ics.uci.edu/), I-MUTANT (http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant3.0/I-Mutant3.0.cgi) [23], HOPE (Have (y)Our Protein Explained, https://www3.cmbi.umcn.nl/hope/method/), and CUPSAT (Cologne University Protein Stability Analysis Tool, http://cupsat.tu-bs.de/) [24] with different parameters to define nonsynonymous variants. The minor allele frequency (MAF) was acquired for each nsSNP from the SNP database. MAF represents the incidence of the gene variant in the general population. We hypothesized that if the MAF of an nsSNP is less than the overall prevalence of symptomatic CCMs in the general population (0.04%), the nsSNP is likely to be pathogenic. Currently, there is only one X-ray crystallography structure of the CCM2 PTB domain (bound with an NPXY motif ligand) deposited in PDB (4WJ7) [6]. To better serve our purpose, we utilized MODELLER (https://salilab.org/modeller/9.16/release.html), which combines homology modeling with ab initio methods to produce solutions that satisfy a set of spatial rules derived from probability density functions and statistical analysis of PTB domain-containing protein structures deposited in PDB [25]. One limitation of MODELLER is that the program depends on the deposited X-ray structure data of the CCM2 PTB domain, which is bound to a ligand at the C-terminus, so the predicted structure of the CCM2 PTB domain is truncated at the C-terminus. Therefore, MODELLER was unable to model nsSNPs located in the C-terminus of the CCM2 PTB structure (such as the A179S mutant).
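The MAF screening rule stated in this section (an nsSNP whose MAF falls below the 0.04% prevalence of symptomatic CCMs is treated as a candidate pathogenic variant) can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' pipeline: the function names are hypothetical, and the MAF values are copied from Table 1 for a small subset of variants.

```python
# Prevalence of symptomatic CCMs in the general population (0.04%), from the text.
CCM_PREVALENCE = 0.0004

# MAF values copied from Table 1 (subset); None marks a variant with no reported MAF.
TABLE1_MAF = {
    "I76T": 8.00e-06,
    "L115R": None,
    "V120I": 0.175,
    "A141T": 8.00e-06,
    "R146W": 4.80e-05,
}


def classify_by_maf(maf, threshold=CCM_PREVALENCE):
    """Hypothetical helper: label a variant by comparing its MAF to the threshold."""
    if maf is None:
        return "unknown"  # no frequency data, so the rule cannot be applied
    return "rare" if maf < threshold else "common"


def screen(maf_by_variant, threshold=CCM_PREVALENCE):
    """Return {variant: label} for every variant in the input mapping."""
    return {name: classify_by_maf(maf, threshold)
            for name, maf in maf_by_variant.items()}


if __name__ == "__main__":
    for variant, label in screen(TABLE1_MAF).items():
        print(f"{variant}: {label}")
```

A "rare" label only marks a variant as a candidate under this hypothesis; in the study it is weighed together with the conservation-based and structure-based predictions.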
In parallel, we also used the integrated platform I-TASSER (the iterative threading assembly refinement, https://zhanglab.dcmb.med.umich.edu/I-TASSER/), an automated protein structure and function prediction software based on the sequence-to-structure-to-function paradigm, which uses multiple threading alignments to perform iterative structural assembly simulations [26]. Three-dimensional (3D) atomic models generated by either I-TASSER or MODELLER were then visualized and compared using the molecular visualization software PYMOL (http://www.pymol.org/), CHIMERA (http://www.cgl.ucsf.edu/chimera) [25], and RASWIN (http://www.openrasmol.org/). This process was performed for wildtype CCM2 and each identified CCM2 mutant. The tertiary structures of the wildtype and mutant proteins were superimposed and analyzed for any structural differences.
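The comparison of superimposed wildtype and mutant models can be illustrated with a minimal Python sketch that measures per-residue C-alpha displacement and reports which core contains displaced residues. Several things here are assumptions made only for illustration: the study performed its comparisons visually in PYMOL, CHIMERA, and RASWIN rather than with this code, the two models are assumed to be already superimposed, the core boundaries and the 1.0 Å cutoff are hypothetical, and the coordinates are toy values.

```python
import math

# HYPOTHETICAL residue ranges for the two cores, chosen only so the example runs.
CORE_RANGES = {"Core1": (60, 130), "Core2": (131, 230)}


def per_residue_shift(wildtype, mutant):
    """Distance between matching C-alpha atoms of two pre-aligned models.

    Both arguments map residue number -> (x, y, z) in angstroms.
    Residues missing from either model are skipped.
    """
    shared = sorted(set(wildtype) & set(mutant))
    return {res: math.dist(wildtype[res], mutant[res]) for res in shared}


def core_of(residue, core_ranges=CORE_RANGES):
    """Return the name of the core containing the residue, or None."""
    for name, (start, end) in core_ranges.items():
        if start <= residue <= end:
            return name
    return None


def perturbed_cores(shifts, core_ranges=CORE_RANGES, cutoff=1.0):
    """Sorted names of cores with at least one residue displaced beyond cutoff."""
    hit = set()
    for residue, distance in shifts.items():
        name = core_of(residue, core_ranges)
        if distance > cutoff and name is not None:
            hit.add(name)
    return sorted(hit)


if __name__ == "__main__":
    # Toy coordinates: only residue 115 moves appreciably in the "mutant".
    wildtype = {115: (0.0, 0.0, 0.0), 116: (3.8, 0.0, 0.0), 155: (10.0, 0.0, 0.0)}
    mutant = {115: (0.0, 1.5, 0.0), 116: (3.8, 0.2, 0.0), 155: (10.0, 0.0, 0.0)}
    print(perturbed_cores(per_residue_shift(wildtype, mutant)))
```

With real models, the coordinate dictionaries would be filled from the C-alpha records of the aligned structure files, and a substitution with purely local effects would be expected to report only the core in which it resides.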

Results

Define genomic variants within the PTB domain of CCM2

By searching all available databases, we identified 66 nsSNPs at 49 amino acid positions in the PTB domain of CCM2. In addition, two in-frame deletions in exon 2 were identified, making a total of 68 nonsynonymous genomic variants within the PTB domain (Suppl. Table 1). The interpretations of our in-silico analysis for all nsSNPs are shown (Table 1, Suppl. Table 1). Among known genetic mutants and our selected candidate mutants, there is consistent agreement among the results predicted by our various in-silico analyses (Table 1).

Table 1

Pathogenic nsSNPs in both cores of CCM2 PTB domain. The known pathogenic and several potentially pathogenic nsSNPs of the CCM2 PTB domain are presented. The nsSNPs were chosen based on genetic results from familial CCM cases and a high predicted probability that the mutation is pathological. All 66 recorded mutations within the CCM2 PTB domain are shown in Supplementary Table 1. The secondary structural motif and the core location of each substitution are shown. The SNP nomenclature and MAF for known mutations are also shown if available. Each nsSNP was further evaluated with various in-silico tools for pathogenicity. The reference for each pathogenic nsSNP reported in human genetic studies, evidenced by phenotype/genotype correlation, is also provided.

Mutation	Exon number	PTB Core	Secondary structure	SNP nomenclature	MAF	SIFT	MUTATION ASSESSOR	PANTHER	CUPSAT	MUPRO	HOPE	I-MUTANT-2.0	PROVEAN	POLYPHEN-2	MUTATION TASTER	References
I76T	3	Core1	β1-α2	rs756431644	8.00E-06	Functional	low	probably benign	Destabilising	Decrease stability	mutation can disturb this domain and abolish its function	Destabilising	Deleterious	probably damaging	Disease-Causing
A98T	4	Core1	α2	rs780867674	1.70E-05	Tolerated	low	probably damaging	Stabilising	Decrease stability	mutation can disturb this domain and abolish its function	Destabilising	Neutral	benign	POLYMORPHISM
A111P	4	Core1	β2	rs750889112	4.00E-06	Tolerated	medium	probably damaging	Destabilising	Decrease stability	might disturb the core structure of this domain.	Stabilising	Deleterious	probably damaging	Deleterious	[35]
L113P	4	Core1	β2	rs11552377	0.06	Functional	low	probably benign	Stabilising	Decrease stability	might disturb the core structure of this domain.	Destabilising	Deleterious	likely damaging	Deleterious	[34]
L115R	4	Core1	β2	N/A	N/A	Functional	medium	probably damaging	Destabilising	Decrease stability	residue is located near a highly conserved position	Destabilising	Deleterious	probably damaging	Deleterious	[6]
V120I	4	Core1	β3	rs11552377	0.175	Tolerated	low	probably benign	Stabilising	Decrease stability	polymorphism	Destabilising	Neutral	benign	Benign	[40]
V120D	4	Core1	β3	rs745788686	4.00E-06	Functional	medium	probably benign	Destabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Deleterious	probably damaging	Deleterious	[36]
A141T	4	Core2	β5	rs1562908094	8.00E-06	Functional	medium	probably damaging	Destabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Deleterious	likely damaging	Deleterious
R146W	4	Core2	β5	rs769929401	4.80E-05	Functional	medium	probably damaging	Destabilising	Decrease stability	residue is located near a highly conserved position	Destabilising	Deleterious	benign	Disease-Causing
L152M	4	Core2	β6	rs760117074	8.00E-06	Functional	medium	probably damaging	Destabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Neutral	likely damaging	Deleterious
V154G	4	Core2	β6	rs141353947	4.00E-06	Functional	medium	probably damaging	Destabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Deleterious	probably damaging	Deleterious
L155P	4	Core2	β6	rs373239614	4.00E-06	Functional	medium	probably damaging	Stabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Deleterious	probably damaging	Deleterious	[6]
A179S	5	Core2	α3	rs373136857	1.60E-05	Tolerated	neutral	probably benign	Destabilising	Decrease stability	might be damaging to the protein and abolish its function	Destabilising	Neutral	benign	POLYMORPHISM
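One simple way to summarize how far the tools in Table 1 agree is to count "damaging" calls per variant. The sketch below does this for three of the prediction columns (PROVEAN, POLYPHEN-2, and the final prediction column, MutationTaster per the URL given in the Methods). It is not a scoring scheme from the paper: the mapping of labels to "damaging" and the function name are assumptions, and only four rows are copied from the table.

```python
# Labels counted as a "damaging" call for each tool (an assumed mapping for this sketch).
DAMAGING_LABELS = {
    "PROVEAN": {"deleterious"},
    "POLYPHEN-2": {"probably damaging", "likely damaging"},
    "MutationTaster": {"disease-causing", "deleterious"},
}

# Calls copied from Table 1 for four variants, in the same order as the keys above.
TABLE1_CALLS = {
    "A98T": ("Neutral", "benign", "POLYMORPHISM"),
    "L115R": ("Deleterious", "probably damaging", "Deleterious"),
    "V120I": ("Neutral", "benign", "Benign"),
    "R146W": ("Deleterious", "benign", "Disease-Causing"),
}


def damaging_votes(calls):
    """Hypothetical helper: count how many tools label the variant as damaging."""
    tools = list(DAMAGING_LABELS)  # dicts keep insertion order in Python 3.7+
    return sum(
        1
        for tool, call in zip(tools, calls)
        if call.lower() in DAMAGING_LABELS[tool]
    )


if __name__ == "__main__":
    for variant, calls in TABLE1_CALLS.items():
        print(f"{variant}: {damaging_votes(calls)} of {len(calls)} tools")
```

A vote count of this kind treats every tool as equally reliable, which is a simplification; it is meant only to make the pattern in the table easier to scan.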

Genomic variants with in-frame deletions leading to conformational changes in both PTB cores

A 58 amino acid in-frame deletion in exon 2 was one of the first genomic variants discovered in CCM2 mutation screening [27,28], and has been consistently observed since [29,30]. Experimental data showed that this in-frame deletion abolishes the interaction between CCM1 and CCM2, indicating that this portion of the peptide sequence plays an important role in the interaction [28]. Our analysis indicates that the deletion encompasses a portion of the β1 strand within Core1 (Fig. 1A, Suppl. Video 1A). β-Sheet 1 contains only 3 β-strands in the deletion mutant, one fewer than the wildtype. The majority of the PTB domain still overlaps between the mutant and the wildtype; however, the α3 helix in Core2 is surprisingly distorted, indicating that this large deletion results in overwhelming structural changes to both Core1 and Core2 of the CCM2 PTB domain.

Fig. 1

N-terminus in-frame deletion in Core 1 leads to conformational changes in both cores of CCM2 PTB domain. Deletions were modelled by I-TASSER and alignments were visualized using PYMOL. Wildtype CCM2 (white) and mutant (green) are shown. The α3 helix is shown superior and the α2 helix is shown inferior, while β-sheet 1 is shown anterior to β-sheet 2. A) The 58 amino acid deletion mutant in Core1 is illustrated. B) The in-frame 4 amino acid deletion mutant (65-KEVK-68) is illustrated. Although the β-sheet overlaps between the wildtype and mutant, the α3 helix is shown to be mismatched, indicating conformational change in the C-terminus. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Another, smaller four amino acid in-frame deletion in the same region of the PTB domain was also reported in two Japanese familial CCM cases [31], indicating the importance of this region for the PTB domain. In this deletion, only the N-terminal portion of Core1 is affected, with intact upstream flanking sequences. Our analysis indicates that these four amino acids are in the β1 strand between Core1 and Core2 (Fig. 1B, Suppl. Video 1B). Similarly, in this deletion mutant, β-Sheet 1 contains only 3 β-strands, one fewer than the wildtype, while β-Sheet 2 contains 2 β-strands, two fewer than the wildtype. While the majority of the PTB domain remains aligned between the wildtype and mutant, the α3 helix is misaligned in the mutant, similar to the large exon 2 deletion. It is interesting that this four amino acid deletion alters the structure of the β-barrel more severely than the large 58 amino acid deletion, suggesting that these four amino acids may be essential for Core1 function. The structural changes resulting from this deletion encompass both Core1 and Core2 of the PTB domain.

Genomic variants with nsSNPs only have local effects

CCM-associated nsSNPs in Core2, such as L198R and L213P, were among the first genomic variants identified in CCM mutation screening [32,33,34]. Six nsSNPs/mutations found in Core2 of CCM2 (all with MAF <4e-4, the incidence of symptomatic CCMs in the general population) were modelled with I-TASSER and MODELLER (A141T, R146W, L152M, V154G, L155P, A179S; Fig. 2, Suppl. Fig 1, Suppl. Video 2). As mentioned above, MODELLER generated tertiary structures with a truncated C-terminus and was therefore unable to generate a valid structure for the C-terminal A179S mutant. A141T (Fig. 2, Suppl. Fig. 1A, Suppl. Videos 2A, 2E) is predicted to be a destabilising/unfavorable and rare nsSNP (MAF <10e-5). This mutation is in the β5 strand, with tertiary structure changes encompassing β5 and a portion of the β5/β6 loop, due to the polar side chain of threonine relative to alanine in the wildtype. L152M (Fig. 2, Suppl. Fig. 1B, Suppl. Videos 2B, 2F) and V154G (Fig. 2, Suppl. Fig. 1C, Suppl. Videos 2C, 2G) are two other destabilising/unfavorable and rare nsSNPs (MAF <10e-4) in the β6 strand, and the structural perturbations in both mutants are limited to the β6 strand. The amino acid side chains of both CCM2 mutants are more polar than those of the wildtype CCM2. L155P (Fig. 2, Suppl. Fig. 1D, Suppl. Videos 2D, 2H) is a known CCM2 mutation resulting in a CCM phenotype [6]. This mutation is in the β6 strand, resulting in structural changes in the β6 strand and β6/β7 loop. While leucine and proline both have hydrophobic side chains, proline introduces a twist in the peptide backbone, resulting in local structural disturbance. All six Core2 nsSNPs/mutations (A141T, R146W, L152M, V154G, L155P, A179S) are predicted to cause only local structural alterations; none of them leads to backbone distortion or has structural effects on the neighboring Core1 of the CCM2 PTB domain (Fig. 2).

Fig. 2

nsSNPs in Core 2 lead to local disturbance of substituted amino acids without perturbing 3D conformation in Core 1 of CCM2 PTB domain in I-TASSER and MODELLER. For the I-TASSER 3D structure orientation (Row 1, 2), the α3 helix is shown superior and the α2 helix is shown inferior, while β-sheet 1 is shown anterior to β-sheet 2. For the MODELLER 3D structure orientation (Row 3, 4), the α2 helix is shown superior, while the full β-sheet 1 is shown inferior and the partial β-sheet 2 is shown on the right. For the ribbon models (Row 1, 3), the nsSNP mutant (green) is superimposed on the wildtype (white) in the ribbon conformation, while the substituted amino acids are shown in stick configuration to demonstrate local conformational change. For the backbone models (Row 2, 4), the same nsSNP mutant is superimposed on the wildtype in backbone conformation, with mutant amino acids in line conformation to explore any possible peptide backbone distortion. In the backbone models, each core is shown in a different color to highlight the two PTB cores: wildtype Core1 (white), wildtype Core2 (gray), mutant Core1 (red), and mutant Core2 (green); the red and green mutant backbones can only be seen where there is a distortion between mutant and wildtype. Several 3D PDB images are also provided as Supplements. The nsSNPs are A141T (first column), R146W (second column), L152M (third column), V154G (fourth column), L155P (fifth column), A179S (last column). The C-terminal portion (α3 helix and partial β-sheet 2) of the CCM2 PTB domain model is absent in the MODELLER structures because the CCM2 X-ray crystallographic structure in the PDB database emphasizes ligand binding in the C-terminal PTB Core2; for the same reason, the A179S mutant could not be generated with MODELLER. Only subtle local disturbance was seen surrounding the substituted amino acids and no amino acid backbone distortion was observed. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)

Interestingly, more pathogenic nsSNPs have been reported in the N-terminus of the PTB domain, corresponding to Core1, than in Core2. Six nsSNPs/mutations in Core1 have been modelled with I-TASSER and MODELLER (I76T, A98T, A111P, L113P, L115R, V120D) (Fig. 3, Suppl. Fig. 2, Suppl. Video 3). A111P (Fig. 3, Suppl. Fig. 2A, Suppl. Videos 3A, 3E) is a pathogenic mutation in the β2 strand (β-sheet 1), with minimal overlap between the amino acid side chains in both tertiary structure modalities [35]. There is misalignment in the β2 strand, β2/3 loop, and α1/β2 loop between the two structures. Proline is more sterically bulky than alanine and introduces a larger local twist in the peptide backbone, resulting in a shortened β2 strand. L113P (Fig. 3, Suppl. Fig. 2B, Suppl. Videos 3B, 3F), a known mutation that results in familial CCMs [34], is present in the β2 strand, with predicted structural perturbations in the β1/β2 strands and β2/β3 loop. The side chain of proline twists the peptide backbone, resulting in delayed formation of the β2 strand. L115R (Fig. 3, Suppl. Fig. 2C, Suppl. Videos 3C, 3G), a known pathological CCM mutation [6], resides in the β2 strand, with all predicted structural differences limited to the β2 strand and β2/β3 loop. Arginine has a charged side chain, and all hydrophobic interactions made by the wildtype leucine are severed. V120D (Fig. 3, Suppl. Fig. 2D, Suppl. Videos 3D, 3H), a known mutation found in a Japanese familial CCM pedigree [36], is located in the β3 strand, with all structural differences limited to the β3 strand and β3/4 loop. Similar to L115R, V120D modelling predicts loss of all hydrophobic interactions of valine because of the charged amino acid side chain of aspartate.
In sum, like Core2, all six nsSNPs/mutations within Core1 (I76T, A98T, A111P, L113P, L115R, V120D) are predicted to show only local structural alterations, but none of these mutations lead to backbone distortion or have structural disruption on the adjacent Core2 of the PTB domain (Fig. 3).

Fig. 3

nsSNPs in Core 1 lead to local disturbance of substituted amino acids without perturbing 3D conformation in Core 2 of CCM2 PTB domain in I-TASSER and MODELLER. The diagram layout is similar to Fig. 2: the I-TASSER (Row 1, 2) and MODELLER (Row 3, 4) 3D structure orientations and the ribbon (Row 1, 3) and backbone (Row 2, 4) models are displayed in the same fashion. The nsSNPs are I76T (first column), A98T (second column), A111P (third column), L113P (fourth column), L115R (fifth column), V120D (last column). Similar to Core2 mutants, only subtle local disturbance was seen surrounding the substituted amino acids and no amino acid backbone distortion was observed.

It is interesting to note that the nsSNP rs11552377 (c.358G > A, NM_001167935, p.Val120Ile) has been reported either as pathogenic [29,36,37,38,39,40] and associated with increased risk of CCMs [41,42], or as lying at the “benign” end of the CCM phenotypic spectrum [37]. Some data suggest that this nsSNP affects splicing [41,43], although its allele frequency (MAF range 0.13–0.17, Table 1) is much higher than the incidence of CCMs. Furthermore, the various in-silico testing methodologies consistently concluded that V120I is benign/polymorphic (Table 1), challenging previous mutational screening reports. Our in-silico analysis indicates that the V120I mutation is in the β3 strand in β-sheet 1 of Core1 (Suppl. Figs. 3, 4, Suppl. Video 4). The associated tertiary structure perturbation is limited to the β3 strand and β3/4 loop, which is the more flexible region of the PTB domain. The side chains of valine and isoleucine overlap, are sterically similar, and are equally hydrophobic, so the hydrophobic side chain interactions are maintained. Therefore, we conclude that this mutation (V120I) is polymorphic and not pathogenic. Further mutation screening in the newly described novel coding exons of CCM2 [9] might be warranted for these familial CCM cases.
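The side chain arguments made in this section (valine and isoleucine are similarly hydrophobic, whereas arginine and aspartate replace a hydrophobic residue with a charged one) can be illustrated with a standard hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is a swapped-in measure chosen for illustration and not one of the tools used in this study; the function name is hypothetical, and only the residues needed for the substitutions in Table 1 are included.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic) for the residues
# involved in the Table 1 substitutions. Not a tool used in the study itself.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "D": -3.5, "G": -0.4, "I": 4.5, "L": 3.8,
    "M": 1.9, "P": -1.6, "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9,
}


def hydropathy_change(substitution):
    """Hypothetical helper: parse e.g. 'V120I' and return mutant minus wildtype hydropathy."""
    wildtype, mutant = substitution[0], substitution[-1]
    return round(KYTE_DOOLITTLE[mutant] - KYTE_DOOLITTLE[wildtype], 1)


if __name__ == "__main__":
    for name in ("V120I", "V120D", "L115R", "A141T", "V154G"):
        print(f"{name}: {hydropathy_change(name):+.1f}")
```

On this scale V120I barely changes hydropathy, while V120D and L115R produce large negative shifts, which is in line with the contrast drawn in the text between the polymorphic V120I and the pathogenic charged substitutions. Hydropathy alone ignores backbone effects such as those described for the proline substitutions.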

Discussion

Since the identification of the causative genes of CCMs, there have been many attempts to use in-silico analysis with bioinformatics tools to interpret the genetic variants identified within CCM genes [29,30,34,41,42,44,45,46,47]. However, the majority of the genetic variants targeted among the three known CCM genes are either nonsense or frameshift mutations, making the outcomes of the in-silico analysis irrelevant to protein structural investigations. To date, one protein tertiary structure of the CCM2 PTB domain bound to an NPXY motif, determined by X-ray crystallography, has been deposited in PDB (4WJ7) [6]. Although that work focused primarily on the traditionally recognized PTB functional pocket, Core2, it provided the foundation for one of our protein structural simulation programs, the probability density function-based MODELLER [25], making our novel integrated in-silico/structural simulation analysis more thorough. Furthermore, a recent in-silico analysis of a Core1 genetic variant that co-segregates in a large Chinese CCM pedigree [35] provides additional strong evidence supporting this methodology. On the structural level, each amino acid substitution frequently resulted in perturbations within the tertiary structure of the PTB domain. These perturbations were due to disruption of electrostatic/hydrophobic interactions, yet the structural alteration was limited to the region surrounding the single amino acid substitution within the PTB domain. The two in-frame deletions are the exception to this observation. Both in-frame N-terminal deletions resulted in alteration of both Core1 and Core2, strongly suggesting that the β1 strand/α2 helix at the N terminus may be as essential as the C terminus for maintaining structural stability of the entire PTB domain, in contrast with previous reports [6,48].
Our combined structure and in-silico bioinformatics analyses provided strong evidence that nsSNPs in Core1 did not disturb neighboring Core2 and vice versa, indicating that both binding pockets have independent functional roles in their interactions with NPXY motifs, and that dysfunction of either binding pocket is sufficient to initiate pathogenesis of CCMs. A significant portion of CCM patients have no known causative mutations identified [41]. It may be difficult to differentiate between non-pathogenic and pathogenic nsSNPs in these patients. In-silico methodology can be used to identify high-risk nsSNPs as potential causal mutations for these patients while ruling out certain nsSNPs as benign polymorphisms. Our in-silico analysis revealed that one missense mutation, V120I, currently considered a causal mutation of CCMs, is in fact a relatively common polymorphism and unlikely to result in a phenotype. Similarly, this methodology also helped us to identify several potential candidate pathological nsSNPs along with the known pathogenic nsSNPs previously reported (Table 1). Those candidate pathogenic nsSNPs can be further evaluated through their MAF and their performance in in-silico analytical tools. In sum, our integrated structural and in-silico analysis will be useful in future CCM mutational screenings to identify pathogenic mutations and exclude normal variants (Suppl. Table 1).

Conclusion

This analysis predicted that both PTB cores in CCM2 have independent functional binding pockets and that mutations in either binding pocket can result in a CCM phenotype without disrupting the conformation of the neighboring core, validating our PTB dual core theory. One important limitation of our methodology is that it cannot assess the extent of cooperative binding between the two PTB cores. It is rather likely that such cooperative binding exists, optimizing the binding ability of the CCM2 PTB domain; however, its extent cannot be evaluated through in-silico analysis. Nonetheless, with the advent of robust computational tools, in-silico analysis has become a valid and robust methodology for analyzing protein domains. Our integrated approach can also be utilized to verify potential pathogenic mutations in subsequent mutant screening analyses. Future efforts will be made to further explore the dual core nature of the PTB domains and to identify potential unique cellular partners for each binding pocket.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

48 in total

1. CCM2 gene polymorphisms in Italian sporadic patients with cerebral cavernous malformation: a case-control study.

Authors: Rosalia D'Angelo; Concetta Scimone; Carmela Rinaldi; Giuseppe Trimarchi; Domenico Italiano; Placido Bramanti; Aldo Amato; Antonina Sidoti
Journal: Int J Mol Med Date: 2012-02-28 Impact factor: 4.101

Review 2. The function of PTB domain proteins.

Authors: B Margolis; J P Borg; S Straight; D Meyer
Journal: Kidney Int Date: 1999-10 Impact factor: 10.612

3. Structural basis for the disruption of the cerebral cavernous malformations 2 (CCM2) interaction with Krev interaction trapped 1 (KRIT1) by disease-associated mutations.

Authors: Oriana S Fisher; Weizhi Liu; Rong Zhang; Amy L Stiegler; Sondhya Ghedia; James L Weber; Titus J Boggon
Journal: J Biol Chem Date: 2014-12-18 Impact factor: 5.157

4. Interaction between krit1 and malcavernin: implications for the pathogenesis of cerebral cavernous malformations.

Authors: Jun Zhang; Daniele Rigamonti; Harry C Dietz; Richard E Clatterbuck
Journal: Neurosurgery Date: 2007-02 Impact factor: 4.654

5. Genetic variations within KRIT1/CCM1, MGC4607/CCM2 and PDCD10/CCM3 in a large Italian family harbouring a Krit1/CCM1 mutation.

Authors: Silvana Pileggi; Serena Buscone; Claudia Ricci; Maria Cristina Patrosso; Alessandro Marocchi; Paola Brunori; Stefania Battistini; Silvana Penco
Journal: J Mol Neurosci Date: 2010-04-24 Impact factor: 3.444

6. First large genomic inversion in familial cerebral cavernous malformation identified by whole genome sequencing.

Authors: Stefanie Spiegler; Matthias Rath; Sabine Hoffjan; Philipp Dammann; Ulrich Sure; Axel Pagenstecher; Tim Strom; Ute Felbor
Journal: Neurogenetics Date: 2017-12-02 Impact factor: 2.660

7. A single-center study on 140 patients with cerebral cavernous malformations: 28 new pathogenic variants and functional characterization of a PDCD10 large deletion.

Authors: Grazia Nardella; Grazia Visci; Vito Guarnieri; Stefano Castellana; Tommaso Biagini; Luigi Bisceglia; Orazio Palumbo; Marina Trivisano; Carmela Vaira; Massimo Scerrati; Davide Debrasi; Vincenzo D'Angelo; Massimo Carella; Giuseppe Merla; Tommaso Mazza; Marco Castori; Leonardo D'Agruma; Carmela Fusco
Journal: Hum Mutat Date: 2018-09-24 Impact factor: 4.878

8. The rs61742690 (S783N) single nucleotide polymorphism is a suitable target for disrupting BCL11A-mediated foetal-to-adult globin switching.

Authors: Sayed Abdulazeez; Shaheen Sultana; Noor B Almandil; Dana Almohazey; B Jesvin Bency; J Francis Borgio
Journal: PLoS One Date: 2019-02-15 Impact factor: 3.240

9. A Novel CCM2 Missense Variant Caused Cerebral Cavernous Malformations in a Chinese Family.

Authors: Guoqing Han; Li Ma; Huanhuan Qiao; Lin Han; Qiaoli Wu; Qingguo Li
Journal: Front Neurosci Date: 2021-01-05 Impact factor: 4.677

10. Alternatively spliced isoforms reveal a novel type of PTB domain in CCM2 protein.

Authors: Xiaoting Jiang; Akhil Padarti; Yanchun Qu; Shen Sheng; Johnathan Abou-Fadel; Ahmed Badr; Jun Zhang
Journal: Sci Rep Date: 2019-11-01 Impact factor: 4.379