Vidya Manian1,2, Jairo Orozco-Sandoval1, Victor Diaz-Martinez1, Heeralal Janwa3, Carlos Agrinsoni3.
Abstract
Skeletal muscle atrophy is a common condition in aging, in diabetes, and in long-duration spaceflight due to microgravity. This article investigates multi-modal gene-disease and disease-drug networks using link prediction algorithms to select drugs that can be repurposed to treat skeletal muscle atrophy. Key target genes that cause muscle atrophy in the left and right extensor digitorum longus, gastrocnemius, quadriceps, and left and right soleus muscles are detected using graph-theoretic network analysis of transcriptomic datasets collected from mice flown in spaceflight and made available by GeneLab. We identified the top muscle atrophy gene regulators using the Pearson correlation and the Bayesian Markov blanket method. The gene-disease knowledge graph (GDKG) was constructed using the scalable precision medicine knowledge engine. We computed node embeddings and random walk measures from the networks. Graph convolutional networks (GCN), graph neural networks (GNN), random forest (RF), and gradient boosting (GB) methods were trained on the embeddings and network features to predict links and rank the top gene-disease associations for skeletal muscle atrophy. Drugs were then selected and a disease-drug knowledge graph (DDKG) was constructed. Link prediction methods were applied to the disease-drug network to identify the top-ranked drugs for therapeutic treatment of skeletal muscle atrophy. The graph convolutional network performed best in link prediction based on receiver operating characteristic curves and prediction accuracies. The key genes involved in skeletal muscle atrophy are associated with metabolic and neurodegenerative diseases. The drugs selected for repurposing using the graph convolutional network method were nutrients, corticosteroids, anti-inflammatory medications, and others related to insulin.
Keywords: diseases; drugs; gradient boosting method; graph convolutional neural networks; graph neural network; knowledge graphs; link prediction; machine learning; node embeddings; random forest; random walk; skeletal muscle atrophy
Year: 2022 PMID: 35328027 PMCID: PMC8953707 DOI: 10.3390/genes13030473
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Figure 1. Workflow pipeline showing the order of steps involved in constructing the GDKG and DDKG and in applying link prediction methods to find key diseases associated with skeletal muscle atrophy genes and drugs for repurposing.
Figure 2. Graph convolutional network (GCN) trained on the GDKG network. The figure shows the sparse GCN layer, the ReLU activation function, the graph embedding, and the decoded GCN output with predicted links between the gene and disease nodes. g1 and g2 are gene nodes; d1, d2, and d3 are disease nodes. The predicted links are shown as red dotted lines.
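The pipeline in the Figure 2 caption (GCN layer, ReLU, embedding, decoded link scores) can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: the toy edges among g1, g2, d1, d2, and d3, the one-hot features, the layer sizes, the symmetric normalization with self-loops, and the inner-product decoder are all illustrative choices, not details confirmed by the source. The weights are random and untrained, so the resulting probabilities carry no meaning beyond showing the data flow.

```python
import math
import random

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, a common GCN normalization (assumed here)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def relu(m):
    return [[max(0.0, v) for v in row] for row in m]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def random_weights(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Node names follow the caption; the edges are hypothetical.
nodes = ["g1", "g2", "d1", "d2", "d3"]
edges = [("g1", "d1"), ("g1", "d2"), ("g2", "d2")]
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)
adj = [[0.0] * n for _ in range(n)]
for u, v in edges:
    adj[index[u]][index[v]] = adj[index[v]][index[u]] = 1.0

rng = random.Random(0)
features = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # one-hot inputs
w1 = random_weights(n, 4, rng)  # untrained weights, hidden size 4 (assumption)
w2 = random_weights(4, 2, rng)  # untrained weights, embedding size 2 (assumption)

a_norm = normalize_adjacency(adj)
hidden = relu(matmul(matmul(a_norm, features), w1))  # GCN layer followed by ReLU
embedding = matmul(matmul(a_norm, hidden), w2)       # node embeddings

def link_probability(u, v):
    """Inner-product decoder: sigmoid of the dot product of two embeddings."""
    zu, zv = embedding[index[u]], embedding[index[v]]
    return sigmoid(sum(a * b for a, b in zip(zu, zv)))

print("score for candidate link g2-d1:", round(link_probability("g2", "d1"), 3))
```

In a trained model the weights would be fitted so that observed gene-disease edges receive high scores, and high-scoring unobserved pairs would then be reported as predicted links.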
Figure 3. Gene disease network (red nodes: genes; blue nodes: diseases).
Ranking of genes and diseases with new predicted links using GCN.
| Gene | Disease Code | Link Prediction Probability | Disease Name |
|---|---|---|---|
| | ICD10:C22 | 0.92 | Malignant neoplasm of liver and intrahepatic bile ducts |
| | DOID:178 | 0.83 | vascular disease |
| | ICD10:G969 | 0.77 | Disorder of central nervous system |
| | DOID:0050589 | 0.79 | inflammatory bowel disease |
| | DOID:10273 | 0.95 | heart conduction disease |
| | DOID:1289 | 0.79 | neurodegenerative disease |
| | ICD10:I5 | 0.97 | Non-ischemic myocardial injury (non-traumatic) |
| | ICD10:C25 | 0.79 | Malignant neoplasm of the pancreas |
| | DOID:8857 | 0.84 | lupus erythematosus |
| | ICD10:H8 | 0.8 | disorder of vestibular function |
| | ICD10:N429 | 0.73 | Disorder of prostate |
| | DOID:6364 | 0.71 | migraine |
| | DOID:2007 | 0.73 | Pesticide residues in food |
| | ICD10:N399 | 0.78 | Disorder of urinary system |
| | ICD10:G93 | 0.89 | brain disorder |
| | DOID:0050890 | 0.84 | synucleinopathy |
| | ICD10:K0 | 0.81 | Diseases of the oral cavity and salivary glands |
| | ICD10:N399 | 0.86 | Disorder of urinary system |
| | DOID:0050687 | 0.89 | cell type cancer |
| | ICD10:C64 | 0.83 | Malignant neoplasm of kidney |
Figure 4. Receiver operating characteristic (ROC) curve showing true positive and false positive scores for link prediction in the GDKG using the five methods.
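For readers unfamiliar with how a ROC curve is summarized, the sketch below computes the area under the curve (AUROC) for a link predictor using its rank-based interpretation: the probability that a randomly chosen true link is scored above a randomly chosen non-link, with ties counted as one half. The labels and scores are made-up toy values, and the `roc_auc` helper is a hypothetical name; nothing here reproduces the evaluation code of the source.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 1 marks a true link, 0 a sampled non-link (hypothetical values).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUROC = {auc:.3f}")
```

Here eight of the nine positive-negative pairs are ordered correctly, so the AUROC is 8/9. The pairwise form is quadratic in the number of examples, which is fine for a sketch but would be replaced by a sort-based computation on large edge sets.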
Ten-fold cross-validation accuracies (%) for link prediction using RF, GB, GNN, and GCN in the GDKG.
| Methods | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | AUROC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 89.39 | 88.64 | 90.54 | 91.76 | 89.09 | 88.57 | 86.04 | 91.03 | 87.66 | 90.94 | 88.75 |
| GB | 85.28 | 84.62 | 87.03 | 86.82 | 85.62 | 86.52 | 83.22 | 87.33 | 82.89 | 88.06 | 85.69 |
| GNN | 87.70 | 90.13 | 89.78 | 89.96 | 88.79 | 90.59 | 85.60 | 89.44 | 88.35 | 90.26 | 88.63 |
| GCN | 88.95 | 90.87 | 91.21 | 92.79 | 93.00 | 93.20 | 94.29 | 95.05 | 95.72 | 96.00 | 96.11 |
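The per-fold columns in the table above come from ten-fold cross-validation. As a rough sketch of that procedure, the code below splits a dataset into ten folds, fits a model on nine, and records the accuracy on the held-out fold. The synthetic scores, the `k_fold_indices` and `fit_threshold` helpers, and the simple threshold classifier are all placeholders chosen to keep the example self-contained; the source trains RF, GB, GNN, and GCN models on node embeddings and network features instead.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def fit_threshold(scores, labels):
    """Placeholder model: threshold at the midpoint between the two class means."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

# Synthetic one-dimensional "edge features" (hypothetical data).
rng = random.Random(1)
labels = [i % 2 for i in range(200)]
scores = [rng.gauss(0.7 if y else 0.3, 0.15) for y in labels]

fold_accuracies = []
for train, test in k_fold_indices(len(labels), k=10):
    threshold = fit_threshold([scores[j] for j in train], [labels[j] for j in train])
    correct = sum((scores[j] > threshold) == bool(labels[j]) for j in test)
    fold_accuracies.append(100.0 * correct / len(test))

mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print([round(a, 2) for a in fold_accuracies], round(mean_accuracy, 2))
```

Each sample is held out exactly once, so the ten accuracies together cover the whole dataset without any test example having been seen during fitting.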
Figure 5. Disease drug network (red nodes: drugs; blue nodes: diseases).
Figure 6. Receiver operating characteristic (ROC) curve showing the true positive and false positive scores for link prediction in the DDKG using the five methods.
Ranking of drugs and diseases with new predicted links using GCN.
| Drugs | Disease | Link Prediction Probability |
|---|---|---|
| L-CARNITINE | Metabolic disease | 1 |
| THIAMINE | Autoimmune disease of the musculoskeletal system | 1 |
| TELITHROMYCIN | Breast cancer | 0.98 |
| FLUOCINOLONE ACETONIDE | Uterine disease | 0.96 |
| RIBOFLAVIN | Autoimmune disease of the musculoskeletal system | 0.94 |
| AZATHIOPRINE | Cardiovascular system disease | 0.94 |
| IVERMECTIN | Allergic rhinitis | 0.9 |
| INSULIN LISPRO | Urinary system disease | 0.9 |
| NELARABINE | Hypervitaminosis | 0.9 |
| SURAMIN | Allergic rhinitis | 0.89 |
| TETRACYCLINE | Male reproductive organ cancer | 0.86 |
| INSULIN DETEMIR | Urinary system disease | 0.85 |
| PRAMLINTIDE | Type 2 diabetes mellitus | 0.84 |
| ARCITUMOMAB | Breast cancer | 0.83 |
| CLINDAMYCIN | Influenza and pneumonia | 0.83 |
| L-ORNITHINE | Vasomotor and allergic rhinitis | 0.83 |
| BUDESONIDE | Autoimmune thyroiditis | 0.82 |
| GOLIMUMAB | Benign neoplasm | 0.82 |
| ARCITUMOMAB | Skin disease | 0.82 |
| INSULIN, ISOPHANE | Unspecified diabetes mellitus | 0.82 |
| HYDROCORTISONE | Integumentary system cancer | 0.82 |
| CHLOROQUINE | Bone inflammation disease | 0.82 |
| L-CARNITINE | Malignant neoplasm | 0.82 |
| INSULIN GLARGINE | Disease of the genitourinary system | 0.81 |
| KETOCONAZOLE | Allergic rhinitis | 0.8 |
| WARFARIN | Generalized skin eruption | 0.79 |
| ARCITUMOMAB | Nasal cavity disease | 0.79 |
| KETOCONAZOLE | Malignant neoplasm of prostate | 0.79 |
| VITAMIN C | Lung disease | 0.78 |
| GALSULFASE | Malignant neoplasm of other endocrine glands | 0.77 |
| L-ORNITHINE | Atrial fibrillation | 0.75 |
| LUCINACTANT | Mood disorder | 0.75 |
| VITAMIN C | Mental, behavioral and neurodevelopmental disorders | 0.74 |
| TETRACYCLINE | Allergic rhinitis | 0.74 |
| SURAMIN | Other disorders of central nervous system | 0.73 |
| SULFASALAZINE | Other and unspecified noninfective gastroenteritis and colitis | 0.71 |
| TINIDAZOLE | Bronchial disease | 0.71 |
Ranking of drugs and diseases with new predicted links using GNN.
| Drugs | Disease Name | Link Prediction Probability |
|---|---|---|
| MEMANTINE | Carcinoma | 0.98 |
| CINNARIZINE | Carcinoma | 0.97 |
| MEMANTINE | Heart Disease | 0.97 |
| IXABEPILONE | Complications and Ill-Defined Descriptions of Heart Disease | 0.96 |
| PREDNISOLONE | Malignant Neoplasm of Other and Unspecified Urinary Organs | 0.95 |
| CLINDAMYCIN | Artery Disease | 0.93 |
| CLINDAMYCIN | Urinary System Disease | 0.93 |
| LUCINACTANT | Malignant Neoplasm of Other and Unspecified Major Salivary Glands | 0.93 |
| CINNARIZINE | Cancer | 0.93 |
| ETOPOSIDE | Artery Disease | 0.93 |
| L-ORNITHINE | Carcinoma | 0.92 |
| LUCINACTANT | Disorder Of Urinary System | 0.92 |
| IMATINIB | Heart Conduction Disease | 0.91 |
| L-ORNITHINE | Heart Disease | 0.89 |
| NELARABINE | Heart Conduction Disease | 0.88 |
| NIMODIPINE | Abscess of Lung and Mediastinum | 0.87 |
| METHOTREXATE | Integumentary System Cancer | 0.86 |
| PREDNISOLONE | In Situ Neoplasms | 0.85 |
| MELATONIN | Cognitive Disorder | 0.85 |
| TEMOZOLOMIDE | Other Disorders of Urinary System | 0.84 |
| ANASTROZOLE | Malignant Neoplasm of Other Endocrine Glands and Related Structures | 0.82 |
| FLUOCINOLONE ACETONIDE | Other Diseases of Liver | 0.79 |
| AGALSIDASE β | Carbohydrate Metabolism Disease | 0.77 |
| CALCIUM ACETATE | Type 2 Diabetes Mellitus | 0.75 |
| CYSTEAMINE | Other Disorders of Carbohydrate Metabolism | 0.74 |
| VITAMIN C | Type 2 Diabetes Mellitus | 0.74 |
| L-CARNITINE | Autosomal Dominant Disease | 0.70 |
| IBUPROFEN | Cardiovascular System Disease | 0.70 |
Ten-fold cross-validation accuracies (%) for link prediction using RF, GB, GNN, and GCN in the DDKG.
| Methods | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | AUROC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RF | 96.69 | 99.44 | 99.60 | 98.05 | 99.88 | 99.65 | 98.34 | 98.86 | 99.68 | 99.52 | 98.09 |
| GB | 92.10 | 97.12 | 99.80 | 91.60 | 99.69 | 96.83 | 97.07 | 94.86 | 97.32 | 98.39 | 96.19 |
| GNN | 95.55 | 99.36 | 95.56 | 95.42 | 98.62 | 99.22 | 97.98 | 95.18 | 99.86 | 100.00 | 97.70 |
| GCN | 99.75 | 100.00 | 99.75 | 99.87 | 99.87 | 100.00 | 99.75 | 100.00 | 100.00 | 99.87 | 99.19 |
Graph theoretic measures for the GDKG and DDKG networks.
| Network Measure | GDKG | DDKG |
|---|---|---|
| Spectral gap | 37.5218 | 99.7221 |
| Density | 0.0221 | 0.0452 |
| Average number of neighbors | 26.423 | 13.345 |
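Two of the measures in this table have simple closed forms for an undirected graph without self-loops or repeated edges: density is 2m / (n(n - 1)) and the average number of neighbors is 2m / n, for n nodes and m edges. The sketch below evaluates both on a toy graph and, assuming the source uses these same definitions (which it does not state), shows that their ratio gives a rough estimate of n. The helper names are hypothetical, the estimates are limited by the rounding of the tabulated values, and the spectral gap is left out because it requires an eigenvalue computation.

```python
def density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present."""
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))

def average_neighbors(n_nodes, n_edges):
    """Mean node degree of an undirected graph."""
    return 2.0 * n_edges / n_nodes

def implied_node_count(avg_neighbors, dens):
    """Since avg_neighbors / density = n - 1, recover an estimate of n."""
    return avg_neighbors / dens + 1.0

# Toy graph: 5 nodes and 3 edges (hypothetical).
toy_density = density(5, 3)
toy_average = average_neighbors(5, 3)
print(toy_density, toy_average, implied_node_count(toy_average, toy_density))

# Rough node-count estimates from the tabulated values, valid only
# under the definitions assumed above.
gdkg_nodes = implied_node_count(26.423, 0.0221)
ddkg_nodes = implied_node_count(13.345, 0.0452)
print(round(gdkg_nodes), round(ddkg_nodes))
```

If the reported measures were computed with different conventions (for example on a directed or bipartite representation), the node-count estimates would not apply.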