Heewon Chung1, Yunju Jo2,3, Dongryeol Ryu2,3, Changwon Jeong3,4, Seong-Kyu Choe3,5, Jinseok Lee1.
Abstract
BACKGROUND: Sarcopenia is defined as muscle wasting, characterized by a progressive loss of muscle mass and function due to ageing. Diagnosis of sarcopenia typically involves both muscle imaging and the physical performance of people exhibiting signs of muscle weakness. Despite its worldwide prevalence, a molecular method for accurately diagnosing sarcopenia has not been established.
Keywords: Artificial intelligence; Diagnosis; Muscle wasting; Sarcopenia; Transcriptome
Year: 2021 PMID: 34704369 PMCID: PMC8718042 DOI: 10.1002/jcsm.12840
Source DB: PubMed Journal: J Cachexia Sarcopenia Muscle ISSN: 2190-5991 Impact factor: 12.910
Figure 1. Proposed DSnet‐v1 with a four‐layer deep neural network (DNN) for the diagnosis of sarcopenia.
Feature importance
| Rank | Feature name | Gene name/Ensembl gene ID | Random forest | XGBoost | AdaBoost | Mean |
|---|---|---|---|---|---|---|
| 1 | H4C3 | H4 clustered histone 3 | 0.5715 | 1.0000 | 0.2857 | 0.6191 |
| 2 | PSMA6 | Proteasome subunit alpha 6 | 0.2263 | 0.8000 | 0.4286 | 0.4850 |
| 3 | TSPY26P | Testis specific protein, Y‐linked 26, pseudogene | 0.0000 | 0.0667 | 1.0000 | 0.3556 |
| 4 | CRHR2 | Corticotropin releasing hormone receptor 2 | 0.0943 | 0.2000 | 0.7143 | 0.3362 |
| 5 | GRTP1‐AS1 | Growth hormone regulated TBC protein 1‐antisense | 1.0000 | 0.0000 | 0.0000 | 0.3333 |
| 6 | SUMO1P3 | SUMO1 Pseudogene 3 | 0.9620 | 0.0000 | 0.0000 | 0.3207 |
| 7 | STAG3L3 | Stromal antigen 3‐like 3, transcribed_unprocessed_pseudogene | 0.0000 | 0.0000 | 0.8571 | 0.2857 |
| 8 | KAT2A | Lysine acetyltransferase 2A | 0.4174 | 0.0000 | 0.4286 | 0.2820 |
| 9 | PEF1 | Penta‐EF‐hand domain containing 1 | 0.0000 | 0.0667 | 0.7143 | 0.2603 |
| 10 | SMIM26 | Small integral membrane protein 26 | 0.5975 | 0.1333 | 0.0000 | 0.2436 |
| 11 | FKBP1C | FKBP prolyl isomerase family member 1C | 0.0000 | 0.0000 | 0.7143 | 0.2381 |
| 12 | TEX261 | Testis expressed 261 | 0.7084 | 0.0000 | 0.0000 | 0.2361 |
| 13 | PFKFB4 | 6‐Phosphofructo‐2‐kinase/fructose‐2,6‐biphosphatase 4 | 0.2055 | 0.3333 | 0.1429 | 0.2272 |
| 14 | AC116913.1 | No NCBI gene ID yet, novel noncoding transcript, antisense to MAP2K1 and SNAPC5/ENSG00000261351 | 0.2651 | 0.4000 | 0.0000 | 0.2217 |
| 15 | TBC1D8 | TBC1 domain family member 8 | 0.0690 | 0.0000 | 0.5714 | 0.2135 |
| 16 | MYF5 | Myogenic factor 5 | 0.0000 | 0.3333 | 0.2857 | 0.2063 |
| 17 | TPSAB1 | Tryptase alpha/beta 1 | 0.1825 | 0.0000 | 0.4286 | 0.2037 |
| 18 | AC002070.1 | LOC105370027/ENSG00000248636 | 0.0000 | 0.4667 | 0.1429 | 0.2032 |
| 19 | RASSF1 | RAS association domain family member 1 | 0.3554 | 0.2000 | 0.0000 | 0.1851 |
| 20 | AC006971.1 | No NCBI gene ID yet, novel noncoding transcript, ARHGAP5 pseudogene/ENSG00000218586 | 0.0000 | 0.2667 | 0.2857 | 0.1841 |
| 21 | SNX12 | Sorting nexin 12 | 0.4790 | 0.0000 | 0.0000 | 0.1597 |
| 22 | ANKRD23 | Ankyrin repeat domain 23 | 0.0000 | 0.3333 | 0.1429 | 0.1587 |
| 23 | AC104564.5 | No NCBI gene ID yet, novel noncoding transcript/ENSG00000265625 | 0.0430 | 0.0000 | 0.4286 | 0.1572 |
| 24 | LINC00893 | Long intergenic non‐protein coding RNA 893 | 0.0430 | 0.0000 | 0.4286 | 0.1572 |
| 25 | PCK1 | Phosphoenolpyruvate carboxykinase 1 | 0.0000 | 0.4667 | 0.0000 | 0.1556 |
| 26 | CENPC | Centromere protein C processed_pseudogene | 0.0000 | 0.4667 | 0.0000 | 0.1556 |
| 27 | VPS35L | | 0.1680 | 0.0000 | 0.2857 | 0.1512 |
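The Mean column in the table above is consistent with the arithmetic mean of the three per-model importance scores. A minimal sketch of that calculation follows, using the first three rows of the table. The names `importances` and `mean_importance` are illustrative and do not come from the source, and the sketch does not reproduce how the authors derived or scaled the per-model scores.

```python
# Per-model importance scores (random forest, XGBoost, AdaBoost)
# copied from the first three rows of the feature importance table.
importances = {
    "H4C3": (0.5715, 1.0000, 0.2857),
    "PSMA6": (0.2263, 0.8000, 0.4286),
    "TSPY26P": (0.0000, 0.0667, 1.0000),
}


def mean_importance(scores):
    """Arithmetic mean of the per-model importance scores."""
    return sum(scores) / len(scores)


# Mean importance per feature, rounded to four decimals as in the table.
means = {gene: round(mean_importance(s), 4) for gene, s in importances.items()}

# Features ordered from highest to lowest mean importance.
ranked = sorted(means, key=means.get, reverse=True)

for rank, gene in enumerate(ranked, start=1):
    print(rank, gene, means[gene])
```

Note that a feature can rank highly on the mean while scoring zero in one or two of the models, as TSPY26P does here.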
Figure 2. Accuracy, balanced accuracy and area under the receiver operating characteristics (AUROC) according to the number of selected top features.
Comparison of cross‐validation evaluation metrics (mean ± standard deviation)
| Model | Sensitivity | Specificity | Accuracy | Balanced accuracy |
|---|---|---|---|---|
| RF | 0.7448 ± 0.15832 | 0.8790 ± 0.1441 | 0.8503 ± 0.0.096 | 0.8119 ± 0.0813 |
| XGBoost | 0.7162 ± 0.1866 | 0.86485 ± 0.1415 | 0.8292 ± 0.1153 | 0.7905 ± 0.1014 |
| AdaBoost | 0.7848 ± 0.1705 | 0.8971 ± 0.0674 | 0.8719 ± 0.0487 | 0.8418 ± 0.0593 |
| DNN | 0.8772 ± 0.1072 | 0.9825 ± 0.0317 | 0.9583 ± 0.0439 | 0.9286 ± 0.0596 |
DNN, deep neural network; RF, random forest.
Comparison of prediction performance among the models on the test dataset
| Model | TN | FP | FN | TP | Sensitivity | Specificity | Accuracy | Balanced accuracy | AUROC |
|---|---|---|---|---|---|---|---|---|---|
| RF | 13 | 4 | 1 | 6 | 0.8571 | 0.7647 | 0.7917 | 0.8109 | 0.7479 |
| XGBoost | 12 | 5 | 1 | 6 | 0.8571 | 0.7059 | 0.7500 | 0.7815 | 0.7563 |
| AdaBoost | 14 | 3 | 1 | 6 | 0.8571 | 0.8235 | 0.8333 | 0.8403 | 0.8319 |
| DNN | 16 | 1 | 0 | 7 | 1.0000 | 0.9412 | 0.9583 | 0.9706 | 0.9916 |
AUROC, area under the receiver operating characteristics; DNN, deep neural network; RF, random forest.
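The sensitivity, specificity, accuracy and balanced accuracy columns above can be recomputed from the confusion-matrix counts (TN, FP, FN, TP) using the standard definitions. The sketch below does this for the DNN and RF rows. The function name `confusion_metrics` is illustrative and not from the source, and AUROC is omitted because it cannot be derived from the counts alone.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": round(sensitivity, 4),
        "specificity": round(specificity, 4),
        "accuracy": round(accuracy, 4),
        "balanced_accuracy": round(balanced_accuracy, 4),
    }


# Counts taken from the DNN and RF rows of the test-dataset table.
dnn = confusion_metrics(tn=16, fp=1, fn=0, tp=7)
rf = confusion_metrics(tn=13, fp=4, fn=1, tp=6)
print(dnn)
print(rf)
```

With only 7 sarcopenic and 17 healthy subjects in the test set, a single reclassified subject shifts sensitivity by about 0.14, which is why balanced accuracy is reported alongside plain accuracy.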
Figure 3. The developed AI model (DSNet‐v1) was successfully deployed on a public website (http://sarcopeniaAI.ml/).
Figure 4. Visualization of the expression and correlation of the 27 AI‐featured genes. (A) Boxplots showing the relative expression of each gene in healthy (white box) and sarcopenic (green box) elderly subjects. Yellow (HSS), pink (JSS) or blue (SSS) dots represent the gene expression of each individual subject in three races. Data are median ± interquartile range. *P < 0.05, **P < 0.01, ***P < 0.001; P values were calculated using the two‐tailed Wilcoxon rank sum test. (B) Correlogram matrices display Spearman's rho between the genes facing each side of the square. The depth of the shading in the correlation matrices indicates the magnitude of the correlation, as shown in the scale.
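Spearman's rho, as used in the correlograms, is the Pearson correlation computed on ranks, with tied values given their average rank. The following is a minimal pure-Python sketch of that definition. The two expression vectors are hypothetical values chosen for illustration, not data from the study, and the helper names are not from the source.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values for two genes across five subjects.
gene_a = [2.1, 3.4, 1.8, 5.0, 4.2]
gene_b = [0.9, 1.5, 1.1, 2.8, 2.0]
print(spearman_rho(gene_a, gene_b))
```

Because only ranks enter the calculation, rho is insensitive to monotonic transformations of expression values such as log scaling.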
Figure 5. Co‐expression and human phenotype ontology assay categorizing the 27 artificial intelligence (AI)‐featured genes into three groups related to skeletal muscle function, metabolism and diseases. (A) Correlogram matrices display Spearman's rho of the two genes facing each side of the square. The shading intensity of the correlation matrices displays Spearman's rho as presented in the scale (left‐hand side of the correlogram). The red, yellow and green triangles on the correlogram mark groups α, β and γ, respectively. (B) Gene network showing co‐expression within groups α, β and γ. The Spearman's rho between two nodes (genes) determines the colour and depth of each edge. The colour of each node (pink, yellow or green) indicates its group (α, β or γ). The gene symbols are indicated in . (C) Three networks generated by Enrichr showing the Human Phenotype Ontology (HPO) terms associated with each group. The colour of the nodes (HPO terms) indicates the group, and an edge indicates that two terms share common genes.