The Protein-Protein Interaction Network of Hereditary Parkinsonism Genes Is a Hierarchical Scale-Free Network.

Yun Joong Kim^1,2, Kiyong Kim^3, Heonwoo Lee^4, Junbeom Jeon^4, Jinwoo Lee^4, Jeehee Yoon^4.

Abstract

PURPOSE: Hereditary parkinsonism genes consist of causative genes of familial Parkinson's disease (PD) with a locus symbol prefix (PARK genes) and hereditary atypical parkinsonian disorders that present atypical features and limited responsiveness to levodopa (non-PARK genes). Although studies have shown that hereditary parkinsonism genes are related to idiopathic PD at the phenotypic, gene expression, and genomic levels, no study has systematically investigated connectivity among the proteins encoded by these genes at the protein-protein interaction (PPI) level.
MATERIALS AND METHODS: Topological measurements and physical interaction enrichment analyses were performed to assess PPI networks constructed using some or all of the proteins encoded by hereditary parkinsonism genes (n=96), which were curated using the Online Mendelian Inheritance in Man database and the literature.
RESULTS: Non-PARK and PARK genes were involved in common functional modules related to autophagy, mitochondrial or lysosomal organization, catecholamine metabolic process, chemical synapse transmission, response to oxidative stress, neuronal apoptosis, regulation of cellular protein catabolic process, and vesicle-mediated transport in synapse. The hereditary parkinsonism proteins formed a single large network comprising 51 nodes and 83 edges, along with three separate PPI pairs. The probability of degree distribution followed a power-law scaling behavior, with a degree exponent of 1.24 and a correlation coefficient of 0.92. LRRK2 was identified as a hub gene with the highest degree and betweenness centrality. The physical interaction enrichment score of the hereditary parkinsonism proteins was 1.28, which was highly significant.
CONCLUSION: Both PARK and non-PARK genes show high connectivity at the PPI and biological functional levels. © Copyright: Yonsei University College of Medicine 2022.

Keywords: LRRK2; Parkinson's disease; causative gene; hereditary parkinsonian disorder; protein-protein interaction

Year: 2022 PMID: 35914754 PMCID: PMC9344267 DOI: 10.3349/ymj.2022.63.8.724

Source DB: PubMed Journal: Yonsei Med J ISSN: 0513-5796 Impact factor: 3.052

INTRODUCTION

More than 20 causative genes/loci showing Mendelian inheritance have been shown to be associated with familial Parkinson’s disease (PD), and these genes have been assigned a locus symbol prefix, “PARK”.1 In hereditary atypical parkinsonian disorders, also known as heredodegenerative parkinsonian disorders, parkinsonism is either concomitant or predominant with other phenotypes of movement disorders, and the causative genes are not assigned the PARK prefix (non-PARK genes). Typically, PARK genes are considered as Mendelian genes in PD because their mutations are associated with key features of idiopathic PD, including levodopa-responsiveness; whereas, in cases with non-PARK gene mutations, either no response or a transient response to levodopa is observed. Recent studies have shown that non-PARK genes are related to idiopathic PD at the phenotypic, gene expression, and genomic levels.234 However, to date, no study has investigated the connectivity between PARK and non-PARK genes at the protein-protein interaction (PPI) level. PPIs are pivotal for the proper functioning of a biological system. Understanding PPIs can help in predicting the biological functions of a protein and improving the characterization of pathways or functional modules.5 A PPI network (PPIN) can be used to identify novel drugs and new disease-related genes, subnetworks, and molecular mechanisms.6 Although PPINs have been used to explore PD-related functional pathways and discover novel genes, no study has investigated the relationship between PARK and non-PARK genes.789 Moreover, no study has systematically analyzed PPINs formed by proteins encoded by all PARK and non-PARK genes. Meanwhile, Goh, et al.10 applied graph theory to Mendelian genes and disorders from the Online Mendelian Inheritance in Man (OMIM) database and suggested that genes associated with similar disorders show a higher likelihood of physical interactions among their products. 
Given that both PARK and non-PARK genes share the parkinsonism phenotype, we speculate that their encoded proteins may interact often and may be involved in the same functional modules of PD pathogenesis. Here, we systematically curated PARK and non-PARK genes and performed functional annotation analyses to explore functional and physical connectivity between the hereditary parkinsonism genes. We also analyzed PPINs between PARK and non-PARK proteins and showed that hereditary parkinsonism genes are highly connected in a hierarchical scale-free network.

MATERIALS AND METHODS

Constructing gene sets

A custom gene set of hereditary parkinsonism genes, consisting of 96 causative genes for Mendelian disorders with a parkinsonism phenotype, was constructed via manual curation from the OMIM database11 and previous studies24 (Table 1 and Supplementary Table 1, only online). We also included DNAJC13, TMEM230, LRP10, and ATXN10,4 which are recently reported genes. Although somewhat controversial, DNAJC13 and TMEM230 have been assigned to the PARK21 locus [Mendelian Inheritance in Man (MIM) number: %616361]. LRP10 has been reported in cases with familial PD and dementia with Lewy bodies, although no PARK locus has been assigned to it. ATXN10 mutations present with levodopa-responsive parkinsonism with presynaptic loss of dopamine, as documented with 99mTc-TRODAT-1 SPECT.12 The final gene set consisted of 19 PARK genes and 77 non-PARK genes. Genes with multiple OMIM phenotype IDs that covered both a PARK locus and a hereditary atypical parkinsonian disorder (e.g., ATP13A2 and PLA2G6) were categorized as PARK genes. For comparison, a gene set of PD risk genes (n=86 in 90 loci) was sourced from a meta-analysis of genome-wide association studies (meta-GWAS),13 and a gene set of Alzheimer’s disease (AD) risk genes (n=129) was constructed based on a recent meta-GWAS.14 The AD gene set included genes that were nearest to the GWAS loci, had a rare variant association, or had a documented gene-based association with AD. Genes expressed in the central nervous system were derived from the Human Protein Atlas.15

Table 1

List of Genes in the Hereditary Parkinsonism Gene Set

Gene sets		Genes
Hereditary parkinsonism genes (n=96)	PARK genes (n=19)	SNCA, UCHL1, EIF4G1, VPS35, PARK7, PRKN, PLA2G6, SYNJ1, FBXO7, HTRA2, PINK1, DNAJC6, VPS13C, LRRK2, ATP13A2, GIGYF2, DNAJC13, CHCHD2, TMEM230
Hereditary parkinsonism genes (n=96)	Non-PARK genes (n=77)	PSEN1, APP, ANG, SLC6A3, PDYN, FTL, GRN, MAPT, SLC20A2, CSF1R, PDGFRB, POLG, PRNP, SPR, ATP1A3, PDGFB, TH, SLC18A2, MECP2, SLC6A8, PLP1, WDR45, ATP6AP2, RAB39B, FMR1, TAF1, TBP, GCH1, NOTCH3, UBTF, PPT1, PSEN2, CACNA1A, VCP, DCTN1, ATXN2, SQSTM1, CHD3, SNCB, KIF5A, PDE8B, PRKRA, PPP2R2B, AFG3L2, TOR1A, XPR1, JPH3, OPA1, AAAS, VPS13A, DNAJC12, TWNK, PANK2, GBA, ATP7B, LYST, CLN3, ATXN3, ATM, SMPD1, NPC1, SLC39A14, COASY, LRP10, RNF216, PDE10A, SLC30A10, DNAJC5, GLB1, PTS, HTT, AP5Z1, C9orf72, C19orf12, CHCHD10, TMEM240, ATXN10
PD-related genes from meta-GWAS (n=86)		GRN, SNCA, GCH1, UBTF, DYRK1A, SCARB2, GBA, GALC, VPS13C, LRRK2, MCCC1, ASXL3, BAG3, BIN3, BST1, C5orf24, CAB39L, CAMK2D, CASC16, CD19, CHD9, CHRNB1, CLCN3, CRHR1, CRLS1, CTSB, DEG2, DNAH17, ELOVL7, FAM47E, FAM47E-STBD1, FAM49B, FBRSL1, FCGR2A, FGF20, FYN, GAK, GBF1, GPNMB, GS1-124K5.11, HIP1R, HLA-DRB5, IGSF9B, INPP5F, IP6K2, ITGA8, ITPKB, KCNIP3, KCNS3, KPNA1, KRTCAP2, LCORL, LINC00693, LOC100131289, MAP4K4, MBNL2, MED12L, MED13, MEX3C, MIPOL1, NOD2, NUCKS1, PAM, PMVK, RAB29, RETREG3, RIMS1, RIT2, RNF141, RPS12, RPS6KL1, SATB1, SCAF11, SETD1A, SH3GL2, SIPA1L2, SPPL2B, SPTSSB, STK39, SYT17, TMEM163, TMEM175, TRIM40, UBAP2, VAMP4, WNT3
AD-related genes from meta-GWAS (n=129)		ABCA7, ABI3, ACE, ADAM10, ADAMTS1, ADAMTS4, AGFG2, ALPK2, AP4M1, APH1B, APOC1, APOC2, APOC4, APOE, BCAM, BCKDK, BCL3, BCL7C, BIN1, BTNL2, BZRAP1, C14orf93, C1orf116, C6orf10, CASS4, CBLC, CCDC6, CD2AP, CD33, CD3EAP, CEACAM16, CEACAM19, CEACAM20, CELF1, CHRNA2, CKM, CLASRP, CLNK, CLPTM1, CLU, CNN2, CNTNAP2, CR1, CR1L, CSTF1, CYB561, DEDD, ECHDC3, EED, ENAH, EPHA1, ERCC1, ERCC2, EXOC3L2, FBXL19, FERMT2, GATS, GEMIN7, GPC2, GPR141, HESX1, HLA-DQA1, HLA-DRA, HLA-DRB1, ICA1L, IGSF23, IL34, INPP5D, IQCK, KAT8, KLC3, MARK4, MEF2C, MINDY2, MS4A2, MS4A3, MS4A4A, MS4A4E, MS4A6A, MS4A6E, NECTIN2, NKPD1, NYAP1, OARD1, PICALM, PILRA, PLCG2, PLCH1, PLD3, PPP1R37, PRL, PSMC3, PTK2B, PVR, PVRIG, PVRL2, RAB8B, RABEP1, RELB, RP11-81K2.1, SCARA3, SCIMP, SLC24A4, SLC39A13, SLTM, SORL1, SPI1, SPPL2A, STAG3, STYX, TAOK2, TMEM184A, TOMM40, TRAPPC6A, TREM2, TREML2, UNC5CL, USP6, USP6NL, WDR12, WIPI2, WWOX, ZCWPW1, ZFP3, ZKSCAN1, ZNF232, ZNF646, ZNF652, ZYX

Parkinson’s disease (PD)-related genes and Alzheimer’s disease (AD)-related genes were acquired from meta-analyses of genome-wide association studies (meta-GWAS). PARK genes indicate causative genes of familial PD with a locus symbol prefix, “PARK.” Non-PARK genes indicate causative genes of hereditary atypical parkinsonian disorders accompanied by atypical features. The gene/locus Mendelian Inheritance in Man (MIM) number and phenotype MIM number for individual genes are shown in Supplementary Table 1 (only online).

Gene ontology analysis

To identify the biological processes enriched in the hereditary parkinsonism genes, we used g:Profiler, a web-based gene ontology (GO) analysis tool16 (archive: Ensembl 100, Ensembl Genomes 47, rev f46603d; database built on 2020-09-21). For functional annotation, we compared the enrichment results of all hereditary parkinsonism genes, only PARK genes, and only non-PARK genes with each other. The g:Profiler settings used were as follows: GO enrichment restricted to biological processes (GO-BP), Fisher’s one-tailed test as the statistical method, Bonferroni correction for multiple testing, statistical domain size comprising only annotated genes, and no hierarchical filtering. Bonferroni-corrected p-values <0.05 were considered significant. Fold enrichment of the GO terms was calculated as follows: (a/b)/{(c-a)/(d-b)}, where a is the number of genes in a gene set associated with a GO term, b is the total number of genes in that gene set, c is the total number of genes in the genome associated with that GO term, and d is the total number of annotated human genes present in Ensembl. We further filtered the overrepresented GO-BP terms by term size, excluding small terms with ≤3 associated genes and broad terms with >1000 associated genes.
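The fold enrichment formula above can be checked with a few lines of Python. This is a minimal sketch, not the authors' script: the function name is hypothetical, and the example plugs in the autophagy row of Table 2 for the PARK gene analysis (a=12, b=19, c=537, d=18017), assuming all 19 PARK genes count as annotated.

```python
def fold_enrichment(a: int, b: int, c: int, d: int) -> float:
    """Fold enrichment as defined in the text: (a/b) / ((c-a)/(d-b)).

    a: genes in the gene set annotated with the GO term
    b: total genes in the gene set
    c: genes in the genome annotated with the GO term
    d: total annotated human genes
    """
    if b <= 0 or d <= b or c <= a:
        raise ValueError("need b > 0, d > b and c > a for a finite ratio")
    return (a / b) / ((c - a) / (d - b))


# Autophagy (GO:0006914), analysis with PARK genes, values from Table 2.
# b=19 is an assumption (all PARK genes annotated).
fe_park = fold_enrichment(a=12, b=19, c=537, d=18017)
print(round(fe_park, 1))  # 21.7, matching the FE reported in Table 2
```

The background rate in the denominator excludes the genes of the set itself, which is why it uses (c-a)/(d-b) rather than c/d.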

Construction of PPINs for the hereditary parkinsonism genes

For constructing PPINs, only direct PPIs of human proteins were downloaded as an MITAB 2.5 file from the IntAct database17 (accessed on 2020-08-27) using the PSICQUIC platform (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml). The IntAct database is one of the most comprehensive databases in terms of number of PPIs, species, and molecular interaction data.1718 An MITAB file is a tab-delimited text file wherein each row represents one interaction between two proteins, reporting the features of the interaction and the PubMed identifier. Uniprot protein identifiers were mapped to HGNC gene symbols using the human proteome annotation table downloaded from Uniprot (https://www.uniprot.org/uniprotkb?facets=model_organism%3A9606&query=Human; accessed on 2020-09-01). This translation ensured that each protein was mapped to a unique gene symbol. Next, the PPI annotations underwent quality control and filtering, as previously described.812 Briefly, all non-human TaxID annotations, all annotations with non-assigned interaction detection methods or PubMed identifiers, all annotations with generic specifications, and all proteins whose transcripts were not expressed in the brain (https://www.proteinatlas.org/humanproteome/brain/human+brain) were removed. Then, a threshold was determined, and the filtered interactions were scored by evaluating the number of publications and detection methods reporting an interaction. A final score (number of PubMed IDs+number of methods) was calculated, and a score ≥2 was considered. We referred to this quality-controlled PPI network as “Human_brain_PPI.” The Human_brain_PPI network contained 200398 interactions with a score ≥2 between 14326 proteins and 55582 interactions with a score ≥3 between 9688 proteins. Here, we used Human_brain_PPI with an interaction score ≥2 as the general PPI network. 
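The scoring rule described above (number of PubMed IDs plus number of detection methods per interaction) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the simplified four-field record layout, and the toy protein names and identifiers are all hypothetical, and real MITAB 2.5 rows carry more columns that would need parsing first.

```python
from collections import defaultdict


def score_interactions(records, min_score=2):
    """Score each unordered protein pair by distinct PubMed IDs + distinct methods.

    records: iterable of (protein_a, protein_b, detection_method, pubmed_id).
    Rows with a missing method or PubMed ID are dropped, as in the text.
    """
    pubs = defaultdict(set)
    methods = defaultdict(set)
    for a, b, method, pmid in records:
        if not method or not pmid:
            continue  # non-assigned detection method or PubMed identifier
        pair = tuple(sorted((a, b)))  # A-B and B-A are the same interaction
        pubs[pair].add(pmid)
        methods[pair].add(method)
    scores = {pair: len(pubs[pair]) + len(methods[pair]) for pair in pubs}
    return {pair: s for pair, s in scores.items() if s >= min_score}


# Toy records (hypothetical values, for illustration only).
records = [
    ("P1", "P2", "two hybrid", "pmid:1"),
    ("P2", "P1", "pull down", "pmid:2"),
    ("P1", "P3", "two hybrid", "pmid:1"),
    ("P3", "P4", "", "pmid:3"),  # dropped: no detection method
]
print(score_interactions(records))               # {('P1', 'P2'): 4, ('P1', 'P3'): 2}
print(score_interactions(records, min_score=3))  # {('P1', 'P2'): 4}
```

Under this formula, an interaction reported once with one method already scores 2, so the ≥3 threshold is the one that requires a second publication or a second method.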
Next, we constructed subnetworks by including all protein interactions from the gene sets of all hereditary parkinsonism genes, PARK genes, non-PARK genes, PD risk genes, and AD risk genes. In the hereditary parkinsonism gene network, encoded proteins were identified as “seeds” and their interactions as undirected edges. We also built an extended network that comprised the seed proteins, their direct neighboring proteins, and the PPIs between them.
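As a rough illustration of the two network types described above, the sketch below extracts a seed-only subnetwork and an extended network from an edge list. The function names and toy edges are hypothetical. It also assumes that "the PPIs between them" includes interactions between two neighbor proteins, which is one reading of the text.

```python
def seed_subnetwork(edges, seeds):
    """Keep only interactions whose two endpoints are both seed proteins."""
    seeds = set(seeds)
    return {(a, b) for a, b in edges if a in seeds and b in seeds}


def extended_network(edges, seeds):
    """Seeds, their first neighbors, and every interaction among those nodes."""
    seeds = set(seeds)
    neighbors = set()
    for a, b in edges:
        if a in seeds:
            neighbors.add(b)
        if b in seeds:
            neighbors.add(a)
    nodes = seeds | neighbors
    return {(a, b) for a, b in edges if a in nodes and b in nodes}


# Toy undirected edge list (hypothetical protein names).
edges = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")}
seeds = {"A", "B"}
print(seed_subnetwork(edges, seeds))           # {('A', 'B')}
print(sorted(extended_network(edges, seeds)))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

Here C enters the extended network as a first neighbor of both seeds, while D and E stay out because they are two or more steps away from any seed.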

Analysis of PPIN topology

Four measurement criteria, degree (k), betweenness centrality (BC), closeness centrality (CC), and clustering coefficient, were used to evaluate the PPIN nodes.1920 The degree (k) of a node is defined as the number of edges adjacent to that node. Analyzing the degree distribution can capture the structure and modular organization of a network. A node with a high k is identified as a “hub” that has many neighbors. The BC of a node is defined as the sum, over all pairs of other nodes, of the fraction of the shortest paths between the pair that pass through the node; it measures how often a node acts as a “bridge” along the shortest paths between two other nodes. A node with a high BC is a bottleneck node that has greater control over a network than a node with low BC. The CC of a node, a measure of network centrality, is defined as the reciprocal of the sum of the shortest path lengths from the node to all other nodes in the network. Since the sum of the distances depends upon the number of nodes in a network, closeness is normalized by the total number of nodes in the network. A node with higher CC is closer to other nodes and is more central in the network. Finally, the clustering coefficient of a node is the fraction of pairs of the node’s neighbors that are adjacent to each other. A node with a high clustering coefficient has a small “world,” and its neighbors are closer to each other. Global topological measurements of a network include average degree (⟨k⟩), which is the mean of all degrees in a network; diameter (D), which is the maximum shortest path length between a pair of nodes in a network; mean shortest path length (mspl), which is the average of the path distances required to connect every pair of nodes via their shortest path and indicates the network’s overall connectedness; and average clustering coefficient (acc), which is the mean of the clustering coefficients of all individual nodes and indicates the local interconnectedness of a network. 
According to the Watts and Strogatz model,21 a network is referred to as a “small-world network” if it has a high clustering coefficient and a low mspl.
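The node-level and global measures defined above can be sketched in plain Python. This is an illustrative implementation, not the authors' code: all function names are hypothetical, the graph is assumed to be undirected, unweighted, and connected, closeness is normalized as (n-1) divided by the sum of distances (one common convention), and betweenness is computed with Brandes' algorithm without normalization.

```python
from collections import deque


def adjacency(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest path length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def clustering(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves adjacent."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


def closeness(adj, node):
    dist = bfs_distances(adj, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness(adj):
    """Brandes' algorithm; unnormalized, each unordered pair counted once."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}


def global_measures(adj):
    n = len(adj)
    all_d = [d for s in adj for t, d in bfs_distances(adj, s).items() if t != s]
    return {
        "avg_degree": sum(len(nb) for nb in adj.values()) / n,
        "diameter": max(all_d),
        "mspl": sum(all_d) / len(all_d),
        "acc": sum(clustering(adj, v) for v in adj) / n,
    }


# Toy graph: a triangle A-B-C with a pendant node D attached to C.
adj = adjacency([("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")])
print({v: len(nb) for v, nb in adj.items()})  # degree k of each node
print(betweenness(adj)["C"])                  # 2.0: C bridges A-D and B-D
print(global_measures(adj))
```

In the toy graph, C is both the hub (highest k) and the bottleneck (highest BC), which mirrors the way the text distinguishes the two roles even though they can coincide in one node.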

Calculation of physical interaction enrichment

Physical cohesiveness of the PPINs was assessed using a physical interaction enrichment (PIE) analysis.22 A set of proteins of interest is more functionally cohesive if it has a higher interaction enrichment in the PPIN. However, PPINs show interaction enrichment biases for proteins that are often studied. To address this, we utilized the PIE algorithm described by Sama and Huynen,22 which extracts random sets of proteins from the general PPIN that have the same node (protein) and edge (interaction) biases as the set of proteins of interest. The PIE algorithm then measures whether the proteins in the set of interest have more interactions among themselves than the proteins in the random sets do. The PIE score for randomly chosen proteins is 1.0. In total, we generated 100000 random protein sets with the same size and degree distribution as the set of proteins of interest and analyzed the number of interactions in the random sets to obtain a frequency distribution. This distribution was used to calculate the physical cohesiveness (PIE score) of the set of proteins of interest and its significance (p-value). All calculations and algorithms were implemented using custom Python scripts.
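The idea of comparing a protein set against degree-matched random sets can be sketched as below. This is a simplified stand-in, not the published PIE algorithm of Sama and Huynen: it matches each protein only on its exact degree in the general network, computes an enrichment ratio of observed to mean random internal interactions, and derives an empirical p-value. The function names, the toy network, and the small number of random sets are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def internal_edges(adj, proteins):
    """Count interactions whose two endpoints both lie in the protein set."""
    ps = set(proteins)
    return sum(1 for p in ps for q in adj.get(p, ()) if q in ps and p < q)


def degree_matched_enrichment(adj, proteins, n_random=1000, seed=0):
    """Enrichment ratio and empirical p-value against degree-matched random sets."""
    rng = random.Random(seed)
    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        by_degree[len(nbrs)].append(node)
    observed = internal_edges(adj, proteins)
    counts = []
    for _ in range(n_random):
        sample = set()
        for p in proteins:
            # draw, without replacement, a protein with the same degree as p
            pool = [c for c in by_degree[len(adj[p])] if c not in sample]
            sample.add(rng.choice(pool))
        counts.append(internal_edges(adj, sample))
    mean_random = sum(counts) / n_random
    score = observed / mean_random if mean_random else float("inf")
    p_value = (sum(c >= observed for c in counts) + 1) / (n_random + 1)
    return score, p_value


# Toy network: a 4-clique plus a cube graph, so every node has degree 3.
clique = ["a1", "a2", "a3", "a4"]
edge_list = [(p, q) for i, p in enumerate(clique) for q in clique[i + 1:]]
edge_list += [(f"c{i}", f"c{i ^ (1 << b)}")
              for i in range(8) for b in range(3) if i < i ^ (1 << b)]
adj = defaultdict(set)
for a, b in edge_list:
    adj[a].add(b)
    adj[b].add(a)

score, p_value = degree_matched_enrichment(adj, clique, n_random=200, seed=1)
print(internal_edges(adj, clique), score > 1.0, p_value)
```

Because every node in the toy network has the same degree, any excess of internal interactions in the clique cannot be explained by degree alone, which is the kind of bias the degree matching is meant to remove.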

RESULTS

Enriched biological processes shared by PARK and non-PARK genes

To investigate whether the hereditary parkinsonism genes were associated with distinct biological processes, we performed functional annotation and gene set enrichment analyses of all hereditary parkinsonism genes (both PARK and non-PARK genes combined) and of PARK and non-PARK genes separately. Overall, 303 GO-BP terms were enriched in the hereditary parkinsonism genes (Supplementary Table 2, only online). A total of 241 GO-BP terms were significantly enriched in PARK genes (n=175 GO-BP terms) or non-PARK genes (n=90 GO-BP terms), of which 24 GO-BP terms were enriched in both PARK and non-PARK genes. GO-BP terms with significant enrichment only in PARK genes also included non-PARK genes and vice versa. In total, 121 of 175 (69.1%) GO-BP terms enriched in PARK genes included non-PARK genes, and 77 of 90 (85.6%) GO-BP terms enriched in non-PARK genes included PARK genes. When all hereditary parkinsonism genes (both PARK and non-PARK genes combined) were analyzed, gene set enrichment analysis revealed an additional 62 significantly enriched GO-BP terms, which were not enriched in the analyses using PARK genes or non-PARK genes only. The enriched GO-BP terms could be categorized into the following functional semantic clusters: 1) autophagy/cellular catabolic processes, 2) mitochondria or lysosome organization, 3) dopamine- or catecholamine-containing compound metabolic processes, 4) synapse organization and chemical synaptic transmission, 5) response to oxidative stress, 6) neuron apoptosis process/neuronal death, 7) regulation of cellular protein catabolic processes, and 8) transport of lysosomes or mitochondria and vesicle-mediated transport in synapse (Table 2 and Fig. 1). Taken together, both PARK and non-PARK genes were involved in common functional modules in the cells.

Table 2

Representative GO Terms of the Enriched BP of PARK Genes, Non-PARK Genes, and All Hereditary Parkinsonism Genes

Functional clusters	GO term (GO term ID)	Number of input genes represented by a GO term	Population, hits	Population, total	PARK genes	Non-PARK genes	FE (analysis with PARK genes)	p value (analysis with PARK genes)	FE (analysis with non-PARK genes)	p value (analysis with non-PARK genes)	FE (analysis with all hereditary parkinsonism genes)	p value (analysis with all hereditary parkinsonism genes)
Autophagy/regulation of cellular catabolic processes	Autophagy (GO:0006914)	26	537	18017	ATP13A2, EIF4G1, FBXO7, HTRA2, LRRK2, PRKN, PARK7, PINK1, SNCA, UCHL1, VPS13C, VPS35	ATM, C19ORF12, C9ORF72, CLN3, GBA, HTT, MAPT, NPC1, PSEN1, RAB39B, SQSTM1, VCP, VPS13A, WDR45	21.7	3.81E-11	6.5	1.09E-04	9.8	5.76E-15
Autophagy/regulation of cellular catabolic processes	Positive regulation of cellular catabolic process (GO:0031331)	19	395	18017	FBXO7, GIGYF2, LRRK2, PRKN, PARK7, PINK1, SNCA, VPS13C, VPS35	APP, ATM, ATXN3, C9ORF72, FMR1, GBA, HTT, PSEN1, TAF1, VCP	22.1	1.71E-07	6.3	1.58E-02	9.7	5.06E-10
Autophagy/regulation of cellular catabolic processes	Negative regulation of cellular catabolic process (GO:0031330)	12	264	18017	ATP13A2, EIF4G1, HTRA2, LRRK2, PARK7, PINK1, SNCA, VPS35	FMR1, NPC1, PSEN1, TAF1	29.6	2.64E-07		ns	9.2	4.41E-05
Mitochondria or lysosome organization	Mitochondrion organization (GO:0007005)	22	560	18017	ATP13A2, CHCHD2, FBXO7, HTRA2, LRRK2, PRKN, PARK7, PINK1, PLA2G6, SNCA, VPS13C, VPS35	AFG3L2, CHCHD10, GBA, HTT, MAPT, OPA1, PANK2, SQSTM1, VPS13A, WDR45	20.7	6.28E-11		ns	7.9	2.80E-10
Mitochondria or lysosome organization	Lysosome organization (GO:0007040)	8	68	18017	LRRK2, VPS35	ATP6AP2, CLN3, GBA, GRN, LYST, PPT1		ns	23.5	1.17E-03	25.7	8.46E-06
Dopamine or catecholamine containing compound metabolic process	Dopamine metabolic process (GO:0042417)	8	39	18017	PRKN, PARK7, SNCA, VPS35	GCH1, SLC6A3, SNCB, TH	108.3	1.49E-04		ns	49.7	7.95E-08
Dopamine or catecholamine containing compound metabolic process	Catechol-containing compound metabolic process (GO:0009712)	8	54	18017	PRKN, PARK7, SNCA, VPS35	GCH1, SLC6A3, SNCB, TH	75.8	5.67E-04		ns	33.5	1.26E-06
Synapse organization and chemical synaptic transmission	Synapse organization (GO:0050808)	17	417	18017	EIF4G1, LRRK2, SNCA, VPS35	AFG3L2, APP, CACNA1A, CHCHD10, CLN3, DCTN1, MAPT, MECP2, OPA1, PRNP, PSEN1, RAB39B, SNCB		ns	7.8	4.44E-05	8.2	1.56E-07
Synapse organization and chemical synaptic transmission	Chemical synaptic transmission (GO:0007268)	29	694	18017	LRRK2, PRKN, PARK7, PINK1, PLA2G6, SNCA, SYNJ1	APP, ATXN3, CACNA1A, CLN3, DNAJC5, FMR1, JPH3, KIF5A, MAPT, MECP2, PDYN, PLP1, PPT1, PRNP, PSEN1, RNF216, SLC18A2, SLC6A3, SNCB, SQSTM1, TH, TOR1A	9.7	8.61E-03	7.9	1.03E-10	8.4	2.18E-15
Response to oxidative stress	Response to oxidative stress (GO:0006979)	17	470	18017	ATP13A2, FBXO7, HTRA2, LRRK2, PRKN, PARK7, PINK1, SNCA	APP, C19ORF12, GCH1, MAPT, PDGFRB, PRKRA, PRNP, PSEN1, TOR1A	16.4	2.49E-05		ns	7.2	9.96E-07
Neuron death/neuron apoptosis process	Neuron apoptotic process (GO:0051402)	16	258	18017	PRKN, PARK7, PINK1, SNCA	APP, ATM, CACNA1A, CLN3, DNAJC5, GRN, MECP2, PPT1, PRNP, PSEN1, SLC30A10, SNCB		ns	11.8	1.93E-06	12.7	1.11E-09
Neuron death/neuron apoptosis process	Neuron death (GO:0070997)	24	371	18017	ATP13A2, EIF4G1, FBXO7, HTRA2, LRRK2, PRKN, PARK7, PINK1, SNCA, VPS35	APP, ATM, CACNA1A, CLN3, DNAJC5, GBA, GRN, MAPT, MECP2, PPT1, PRNP, PSEN1, SLC30A10, SNCB	26.2	2.00E-09	9.5	9.66E-07	13.3	1.91E-16
Regulation of cellular protein catabolic process	Regulation of cellular protein catabolic process (GO:1903362)	12	262	18017	ATP13A2, LRRK2, PRKN, PARK7, PINK1, VPS35	ATXN3, FMR1, GBA, PSEN1, TAF1, VCP	22.2	4.34E-04		ns	9.3	4.05E-05
Transport of lysosome or mitochondria and vesicle mediated transport in synapse	Lysosomal transport (GO:0007041)	11	108	18017	ATP13A2, LRRK2, PRKN, PINK1, VPS35	C9ORF72, CLN3, GRN, LYST, NPC1, VCP	46.0	1.61E-04	14.3	1.80E-02	21.9	3.43E-08
Transport of lysosome or mitochondria and vesicle mediated transport in synapse	Mitochondrial transport (GO:0006839)	11	270	18017	FBXO7, HTRA2, LRRK2, PRKN, PINK1, VPS35	AFG3L2, CHCHD10, OPA1, PSEN1, PSEN2	21.5	5.18E-04		ns	8.2	5.71E-04
Transport of lysosome or mitochondria and vesicle mediated transport in synapse	Vesicle-mediated transport in synapse (GO:0099003)	12	188	18017	DNAJC6, LRRK2, SNCA, SYNJ1, VPS35	CACNA1A, DNAJC5, FMR1, PSEN1, SLC18A2, TH, TOR1A	25.9	2.54E-03	9.4	4.04E-02	13.1	9.29E-07

GO, gene ontology; BP, biological processes; ID, identification; FE, fold enrichment; p-value, Bonferroni corrected p-value; ns, not significant.

FE and p-values were calculated for all three gene sets.

Fig. 1

Schematic representation of a tripartite network of hereditary parkinsonian disorders, biological processes, and genes. The first layer represents two groups of hereditary parkinsonian disorders: familial Parkinson’s disease (PD) associated with PARK genes and hereditary atypical parkinsonian disorders caused by mutations in non-PARK genes. The second layer represents the major gene ontology-biological process terms (GO-BP), which were significantly enriched in either PARK or non-PARK genes. GO-BP terms enriched in PARK genes were connected to the familial PD node, whereas those enriched in the non-PARK genes were connected to the hereditary atypical parkinsonian disorder node. The third layer represents the causative genes of hereditary parkinsonian disorders. Nodes and edges of PARK genes are red and those of non-PARK genes are blue.

PPINs of proteins encoded by PARK and non-PARK genes

Given that enriched biological processes were shared by PARK and non-PARK genes, we hypothesized that the PPINs formed by proteins encoded by PARK or non-PARK genes (PARK or non-PARK proteins) were close to each other. We first constructed independent PPINs using PARK, non-PARK, and both PARK and non-PARK proteins as the seed proteins (Fig. 2 and Supplementary Table 3, only online). From a total of 19 PARK proteins, 11 proteins formed a single PPIN with 12 PPIs. After removing five controversial PARK proteins (UCHL1, GIGYF2, HTRA2, EIF4G1, and TMEM230), a single PPIN with seven proteins and eight PPIs was formed (Fig. 2A and B). The PPIN of the non-PARK proteins (n=77) consisted of 28 proteins with 36 PPIs and small modules, including one PPI trimer and four PPI pairs (Fig. 2C). The PPIN comprising both PARK and non-PARK proteins (n=96) consisted of a giant component with 51 proteins connected via 83 PPIs and three PPI pairs (Fig. 2D). Twenty-five (30.1%) of the 83 PPIs in the giant component were direct interactions between PARK and non-PARK proteins, and six proteins in the 25 PPIs were linked to the giant component only when both PARK and non-PARK proteins were used. The probability of degree distribution P(k) of a PPIN follows a power-law scaling behavior represented by $P(k) \sim k^{-\gamma}$, where γ is the degree exponent. A network with γ<2 is hierarchical, whereas a network with γ>3 loses various scale-free features and becomes random.1923 In the PPIN consisting of PARK and non-PARK proteins, the probability of degree distribution P(k) followed a power-law scaling behavior, $P(k) \sim k^{-\gamma}$, with degree exponent γ=1.24, where a straight line qualified as the fitted curve for the data points ($P(k) \sim k^{-1.24}$) with a correlation coefficient $r^2$=0.92 (Fig. 2G). In contrast, the 86 proteins encoded by genes associated with PD risk alleles13 formed a single PPIN with 17 proteins and 18 PPIs, along with three PPI pairs (Fig. 2E). 
After removing seven proteins that overlapped with PARK or non-PARK proteins, 13 proteins were connected in small modules, including three pairs, one trimer, and one tetramer (Fig. 2F). A comparison of global topological measurements observed between the PPINs formed by PARK and/or non-PARK proteins is summarized in Supplementary Table 3 (only online). In the PPIN formed by both PARK and non-PARK proteins, the top five proteins with the highest degree connectivity (k) were LRRK2, MAPT, APP, HTT, and SNCA (Supplementary Table 4, only online). The top five proteins with the highest BC, which formed the backbone network,24 were LRRK2, APP, MAPT, HTT, and FMR1 (Supplementary Fig. 1A, only online). Additionally, we found two cliques comprising four proteins (LRRK2-SNCA-MAPT-FMR1 and MAPT-APP-PRNP-PLP1), where all proteins were connected with each other (Supplementary Fig. 1B, only online). Chemical synaptic transmission was the common GO-BP term shared by all proteins in both cliques (Supplementary Table 2, only online). Taken together, these results suggested that the PPIN formed by both PARK and non-PARK proteins is a single hierarchical scale-free network, where both PARK and non-PARK proteins are hub proteins involved in a backbone network and functional modules (Supplementary Fig. 1C, only online).
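A degree exponent and a fit quality of the kind reported above can be estimated with a straight-line fit in log-log space. The sketch below is an assumption about the general approach, not the authors' procedure: it uses ordinary least squares on log10 P(k) versus log10 k, and the function names and toy degree list are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """P(k): fraction of nodes with degree k (isolated nodes ignored)."""
    counts = Counter(d for d in degrees if d > 0)
    n = sum(counts.values())
    return {k: c / n for k, c in sorted(counts.items())}


def fit_power_law(pk):
    """Least-squares line through (log10 k, log10 P(k)); returns (gamma, r2)."""
    xs = [math.log10(k) for k in pk]
    ys = [math.log10(p) for p in pk.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r2 = sxy * sxy / (sxx * syy)
    return -slope, r2  # P(k) ~ k^(-gamma), so gamma is minus the slope


# Toy degree list built so that P(k) is exactly proportional to k^-2.
degrees = [1] * 16 + [2] * 4 + [4] * 1
gamma, r2 = fit_power_law(degree_distribution(degrees))
print(round(gamma, 2), round(r2, 2))  # 2.0 1.0
```

On real PPIN data the points scatter around the line, so the fitted γ comes with an $r^2$ below 1, as in the values reported for Fig. 2G.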

Fig. 2

Protein-protein interaction networks (PPINs) of PARK (red) and/or non-PARK proteins (blue). Nodes represent the proteins encoded by Parkinson’s disease (PD)-related genes, and edges represent known protein-protein interactions with a quality score ≥2. Each PPIN was constructed using Cytoscape v3.8.2 with the following proteins: PARK proteins (A), PARK proteins excluding those encoded by controversial genes (B), non-PARK proteins (C), all hereditary parkinsonism proteins, i.e., both PARK and non-PARK proteins (D), proteins encoded by PD-related genes (gray) from a meta-analysis of genome-wide association studies (GWAS) (E), and proteins encoded by PD-related genes, with PARK and non-PARK proteins excluded (F). Probabilities of degree distribution P(k) with the degree exponent (γ) and correlation coefficient ($r^2$) values of the PPIN formed by PARK and non-PARK proteins (G) and the PPIN formed by PARK, non-PARK, and direct interacting first neighbor proteins (H).

Next, we constructed an extended PPIN using PARK and non-PARK proteins and their direct interacting neighbor proteins (first neighbors). This single giant PPIN consisted of 4493 proteins connected via 75125 PPIs (Supplementary Table 3, only online). As expected, the probability of degree distribution of the extended PPIN also followed a power-law scaling behavior with γ=1.34 and $r^2$=0.82 (Fig. 2H). The backbone network of the extended PPIN consisted of the top 1% of nodes with the highest BC and included both PARK and non-PARK proteins, such as LRRK2, APP, PRNP, MAPT, FMR1, HTT, PLP1, DCTN1, VCP, GRN, and SQSTM1 (Supplementary Table 5, only online). Next, we calculated the PIE scores and associated p-values for all hereditary parkinsonism proteins, PARK proteins, and non-PARK proteins, separately (Table 3). The PIE score was significant for PARK and non-PARK proteins, independently. Notably, the PIE score of both PARK and non-PARK proteins as a single group was highly significant (p=0.00018), suggesting that these proteins interacted more than randomly expected and that this cannot be attributed to publication bias. In contrast, the PIE scores of the PD GWAS proteins and of the AD GWAS proteins were not significant. On applying more stringent data curation criteria (quality score ≥3), the results remained the same except for PD GWAS proteins (before excluding PARK and non-PARK proteins), which were marginally significant (Supplementary Table 6, only online).

Table 3

Physical Cohesiveness between the Proteins Encoded by PD-Related Genes and AD-Related Genes

Category	No. of nodes	Mean no. of interactions per node	PIE score	p value
PARK proteins	19	1.26	1.53	0.00058
PARK proteins excluding controversial genes	14	1.14	1.43	0.01083
Non-PARK proteins	77	1.09	1.21	0.00587
Both PARK and non-PARK proteins	96	1.79	1.28	0.00018
Both PARK and non-PARK proteins excluding controversial genes	91	1.67	1.22	0.00225
Proteins encoded by PD-related genes from meta-GWAS	86	0.54	1.07	0.18961
Proteins encoded by PD-related genes from meta-GWAS excluding PARK and non-PARK proteins	79	0.23	1.06	0.18617
Proteins encoded by AD-related genes from meta-GWAS	129	0.25	1.07	0.09377

PD, Parkinson’s disease; AD, Alzheimer’s disease; meta-GWAS, meta-analysis of genome-wide association studies.

Physical cohesiveness in a gene set was measured by the physical interaction enrichment (PIE) score and its p-value. Only protein-protein interactions with a quality score of two or more were included in the analysis.

DISCUSSION

To the best of our knowledge, this is the first study to systematically analyze the PPINs and PIE scores of PARK and non-PARK proteins and demonstrate that both PARK and non-PARK proteins are highly connected at the PPI level and form a hierarchical scale-free PPIN. Moreover, GO functional enrichment analysis revealed that both PARK and non-PARK proteins are involved in shared functional modules. Several studies correlating genotype with phenotype have shown that mutations in non-PARK genes may present with idiopathic PD.252627 At the pathological level as well, there are overlapping features between PARK and non-PARK gene mutations: for instance, loss of dopaminergic neurons is common in hereditary parkinsonian disorders caused by non-PARK mutations. Lewy body pathology, which is a pathological hallmark of idiopathic PD, is found in some hereditary atypical parkinsonian disorders.2829303132 Furthermore, a close relationship between PARK and non-PARK genes has also been found in other system layers. At the gene expression level, meta-analyses have shown that PARK and non-PARK genes are overrepresented in dysregulated genes in the substantia nigra of patients with idiopathic PD, particularly as downregulated genes.23 Recent studies have suggested that the burden of rare coding variants in non-PARK and PARK genes is increased in idiopathic PD genome, thereby supporting the possible role of non-PARK genes in the pathogenesis of idiopathic PD.4 The purpose, methods, and results of the PPIN analysis in our study differ from those in previous studies. The primary aim of our study was to explore the relationship between PARK and non-PARK proteins in a PPIN; whereas, previous studies have constructed PPINs to find novel PD-related genes89 or to explore the functional connectivity of newly identified genes with PD-related genes.33 In previous studies, the criteria for defining PD genes were unclear, and only a small number of non-PARK genes was included in the analyses. 
In contrast, we systematically curated public databases and the literature to assemble an unbiased set of 77 non-PARK genes. Although we considered controversial genes, their inclusion or exclusion did not affect our results.

Our results suggest that the PPINs of PARK and non-PARK proteins are part of a single scale-free hierarchical network rather than two independent networks. In the gigantic component of the PPIN consisting of both PARK and non-PARK proteins, the network topology measures and the degree distribution, which followed power-law scaling, strongly suggested that these proteins were nodes of a single hierarchical network. The PIE results revealed that, after the PARK and non-PARK proteins were merged, they interacted 1.28 times more often than expected by chance, supporting the view that these proteins form a single hierarchical PPIN. Because PIE analysis corrects for publication bias, this interaction enrichment was not due to a bias toward frequently studied proteins [22]. Further, the close relationship between PARK and non-PARK proteins at the PPIN level was consistent with the results of our GO functional enrichment analysis: both PARK and non-PARK genes belonged to all eight major functional clusters of the enriched GO terms.

In contrast to the PARK and non-PARK proteins, there was no enrichment of PPIs among proteins encoded by PD-related risk alleles, regardless of whether the PARK and non-PARK genes were included in or excluded from the GWAS-related genes. This may be because genes near statistically associated risk alleles, which we selected as PD-related genes, are not necessarily candidate genes for the disease [14,34]. Another possible explanation is that the risk alleles modulate the PD-related genetic network indirectly but are not part of it per se.
Recently, Li, et al. [33] showed that PD-associated risk alleles influence innate immune mechanisms by modulating gene expression in monocytes, and lysosomal function via changes in gene expression or RNA splicing in the dorsolateral prefrontal cortex.

In biological networks, degree, CC, and BC have been used to identify influential nodes that can be used for biomarker discovery, drug design, and drug repurposing [35,36]. In the gigantic component of the PPIN formed by PARK and non-PARK proteins, LRRK2, MAPT, APP, HTT, SNCA, and FMR1 were identified as hub proteins. Among these, LRRK2 had the highest degree and BC; thus, we suggest that LRRK2 may be the best target for designing and repurposing drugs for parkinsonian disorders. An unstable trinucleotide repeat expansion in the 5'-UTR of FMR1 causes fragile X syndrome or fragile X tremor/ataxia syndrome [37]. The relationship between FMR1 and idiopathic PD has been studied previously: reduced FMR1 expression and reduced levels of FMRP, the protein encoded by FMR1, have been observed in the substantia nigra in idiopathic PD and in incidental Lewy body disease, which is considered a preclinical stage of idiopathic PD [2,38]. FMRP reduction is mediated by inhibition of PKC-CREB signaling by alpha-synuclein [38].

Although there are exceptions, patients with non-PARK gene mutations generally present with broader phenotypes of parkinsonism and additional movement disorders, with or without cognitive dysfunction.
Given the close physical connectivity in the PPIN between PARK and non-PARK proteins and the shared functional modules (i.e., overlapping clinical features between “pure” parkinsonism and parkinsonism with additional features, as well as shared biological processes), lumping together the hereditary parkinsonian disorders caused by PARK or non-PARK gene mutations mirrors the recently proposed concept of an ataxia-spasticity disease spectrum [39]. Hereditary ataxias and hereditary spastic paraplegias share common phenotypes, pathophysiological pathways, mechanisms, and PPINs [39,40]. The concept suggests a descriptive, unbiased approach of modular phenotyping instead of a divisive, diagnosis-driven classification system for ataxia and hereditary spastic paraplegia. Therefore, it may be reasonable to include non-PARK genes in the PD-related genome along with PARK genes and PD risk alleles discovered by GWAS, although mutations in non-PARK genes in idiopathic PD have not been systematically studied.

The limitations of our study must be acknowledged. Although IntAct is frequently used as a gold standard in various applications, it was the only PPI database used in this study. We analyzed only proteins that are expressed in the brain. Given that the molecular functions and biological processes of genes involved in hereditary parkinsonism are not necessarily limited to the brain, PPINs and related pathogenic mechanisms in non-nervous tissues might have been missed. Further, a functional analysis of the PPIN of PARK and non-PARK proteins was not included in this study. Despite these limitations, we believe that our PPIN comprising PARK and non-PARK proteins can be used in future studies to explore novel PD-related genes or for drug discovery and repurposing.

In conclusion, our results suggest that PARK and non-PARK proteins constitute a single hierarchical PPIN and share molecular mechanisms in the pathogenesis of idiopathic PD.
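
The topological measures discussed above (a power-law degree distribution, and hub identification by degree and BC) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the edge list is a hypothetical toy network with generic node labels rather than IntAct interactions, the function names are illustrative, and the degree exponent is estimated by ordinary least squares in log-log space, which may differ from the fitting procedure actually applied. BC is computed with Brandes' algorithm for unweighted, undirected graphs.

```python
import math
from collections import Counter, deque

# Hypothetical toy edge list with generic labels; NOT the interactions from the study.
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F"),
    ("B", "C"), ("B", "G"), ("C", "H"), ("D", "I"), ("E", "J"),
    ("F", "K"), ("G", "L"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Return {k: P(k)}, the fraction of nodes that have degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in counts.items()}


def fit_power_law(pk):
    """Least-squares line through (log k, log P(k)).

    Returns (gamma, r), where P(k) ~ k**(-gamma) and r is the Pearson
    correlation in log-log space (negative for a decaying distribution).
    """
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx, sxy / math.sqrt(sxx * syy)


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, unweighted graph)."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


adj = build_adjacency(EDGES)
gamma, r = fit_power_law(degree_distribution(adj))
bc = betweenness(adj)
hub_by_degree = max(adj, key=lambda v: len(adj[v]))
hub_by_bc = max(bc, key=bc.get)
print(f"degree exponent ~ {gamma:.2f}, log-log correlation = {r:.2f}")
print(f"hub by degree: {hub_by_degree}, hub by BC: {hub_by_bc}")
```

On a real PPIN, a node that ranks first by both degree and BC, as reported here for LRRK2, would come out as the hub under both criteria. With as few distinct degree values as a network of this size provides, the fitted exponent should be read as a rough descriptive summary rather than a precise estimate.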

40 in total

1. Loss of fragile X mental retardation protein precedes Lewy pathology in Parkinson's disease.

Authors: Yi Tan; Carmelo Sgobio; Thomas Arzberger; Felix Machleid; Qilin Tang; Elisabeth Findeis; Jorg Tost; Tasnim Chakroun; Pan Gao; Mathias Höllerhage; Kai Bötzel; Jochen Herms; Günter Höglinger; Thomas Koeglsperger
Journal: Acta Neuropathol Date: 2019-11-25 Impact factor: 17.088

2. Multicenter analysis of glucocerebrosidase mutations in Parkinson's disease.

Authors: E Sidransky; M A Nalls; J O Aasly; J Aharon-Peretz; G Annesi; E R Barbosa; A Bar-Shira; D Berg; J Bras; A Brice; C-M Chen; L N Clark; C Condroyer; E V De Marco; A Dürr; M J Eblan; S Fahn; M J Farrer; H-C Fung; Z Gan-Or; T Gasser; R Gershoni-Baruch; N Giladi; A Griffith; T Gurevich; C Januario; P Kropp; A E Lang; G-J Lee-Chen; S Lesage; K Marder; I F Mata; A Mirelman; J Mitsui; I Mizuta; G Nicoletti; C Oliveira; R Ottman; A Orr-Urtreger; L V Pereira; A Quattrone; E Rogaeva; A Rolfs; H Rosenbaum; R Rozenberg; A Samii; T Samaddar; C Schulte; M Sharma; A Singleton; M Spitz; E-K Tan; N Tayebi; T Toda; A R Troiano; S Tsuji; M Wittstock; T G Wolfsberg; Y-R Wu; C P Zabetian; Y Zhao; S G Ziegler
Journal: N Engl J Med Date: 2009-10-22 Impact factor: 91.245

3. ^99mTc-TRODAT-1 SPECT Showing Dopaminergic Deficiency in a Patient with Spinocerebellar Ataxia Type 10 and Parkinsonism.

Authors: Giorgio Fabiani; Raul Martins; Tetsuo Ashizawa; Francisco M B Germiniani; Hélio A G Teive
Journal: Mov Disord Clin Pract Date: 2018-11-16

Review 4. Overcoming the divide between ataxias and spastic paraplegias: Shared phenotypes, genes, and pathways.

Authors: Matthis Synofzik; Rebecca Schüle
Journal: Mov Disord Date: 2017-02-14 Impact factor: 10.338

Review 5. The Post-GWAS Era: From Association to Function.

Authors: Michael D Gallagher; Alice S Chen-Plotkin
Journal: Am J Hum Genet Date: 2018-05-03 Impact factor: 11.025

6. Dysregulation of the causative genes for hereditary parkinsonism in the midbrain in Parkinson's disease.

Authors: Yun Joong Kim; Junbeom Jeon; Jaemoon Shin; Nan Young Kim; Jeong Hoon Hong; Jae-Min Oh; SangKyoon Hong; Yeo Jin Kim; Young-Eun Kim; Suk Yun Kang; Hyeo-Il Ma; Unjoo Lee; Jeehee Yoon
Journal: Mov Disord Date: 2017-05-26 Impact factor: 10.338

7. The IntAct molecular interaction database in 2012.

Authors: Samuel Kerrien; Bruno Aranda; Lionel Breuza; Alan Bridge; Fiona Broackes-Carter; Carol Chen; Margaret Duesbury; Marine Dumousseau; Marc Feuermann; Ursula Hinz; Christine Jandrasits; Rafael C Jimenez; Jyoti Khadake; Usha Mahadevan; Patrick Masson; Ivo Pedruzzi; Eric Pfeiffenberger; Pablo Porras; Arathi Raghunath; Bernd Roechert; Sandra Orchard; Henning Hermjakob
Journal: Nucleic Acids Res Date: 2011-11-24 Impact factor: 16.971

8. Meta-Analysis of Differentially Expressed Genes in the Substantia Nigra in Parkinson's Disease Supports Phenotype-Specific Transcriptome Changes.

Authors: Duong My Phung; Jinwoo Lee; SangKyoon Hong; Young Eun Kim; Jeehee Yoon; Yun Joong Kim
Journal: Front Neurosci Date: 2020-12-18 Impact factor: 4.677

9. Identifying functional modules in protein-protein interaction networks: an integrated exact approach.

Authors: Marcus T Dittrich; Gunnar W Klau; Andreas Rosenwald; Thomas Dandekar; Tobias Müller
Journal: Bioinformatics Date: 2008-07-01 Impact factor: 6.937

Review 10. Analyzing of Molecular Networks for Human Diseases and Drug Discovery.

Authors: Tong Hao; Qian Wang; Lingxuan Zhao; Dan Wu; Edwin Wang; Jinsheng Sun
Journal: Curr Top Med Chem Date: 2018 Impact factor: 3.295