Gerardo Alfonso Perez, Javier Caballero Villarraso.
Abstract
Huntington disease (HD) is a degenerative neurological disease that severely affects the patient's quality of life and is eventually fatal. In this paper we present an approach for building a biomarker that identifies HD patients from DNA CpG methylation data. DNA CpG methylation is a well-known epigenetic marker of disease state, and technological advances have made it possible to analyze hundreds of thousands of CpGs quickly. This large volume of information can introduce noise, because not all DNA CpG methylation levels are necessarily related to the presence of the illness. Using a non-linear approach, we reduced the number of CpGs considered from hundreds of thousands to 237. We show that these 237 CpGs, combined with non-linear techniques such as artificial neural networks, make it possible to accurately differentiate between control and HD patients. An underlying assumption of this paper is that nothing indicates the process is linear, so non-linear techniques such as artificial neural networks are a valid tool for analyzing this complex disease. The proposed approach accurately distinguishes between control and HD patients using DNA CpG methylation data as input and non-linear forecasting techniques. It should be noted that the dataset analyzed is relatively small. However, the results appear relatively consistent, and the analysis can be repeated with larger datasets as they become available.
Keywords: DNA methylation; Huntington disease; neural networks
Year: 2022 PMID: 35456203 PMCID: PMC9032851 DOI: 10.3390/jcm11082110
Source DB: PubMed Journal: J Clin Med ISSN: 2077-0383 Impact factor: 4.964
Figure 1. Illustration showing the concept of DNA methylation.
Figure 2. Forecasting accuracy and required computational time using different ANN architectures.
Forecasting precision obtained with the different neural network configurations (after the second part of the algorithm). The second column shows the results using control, pre-manifest and manifest cases while the third column includes only control and pre-manifest cases. The fourth column shows the computational time required for training the neural network.
| N. Layers | Max Precision (Control & Manifest & Pre-Manifest) | Max Precision (Control & Pre-Manifest) | Training Time (Days) |
|---|---|---|---|
| 1 | 0.80 | 0.76 | 3.45 |
| 2 | 0.84 | 0.81 | 3.78 |
| 3 | 0.88 | 0.86 | 4.12 |
| 4 | 0.92 | 0.81 | 4.61 |
| 5 | 0.88 | 0.76 | 5.82 |
| 6 | 0.88 | 0.71 | 6.17 |
| 7 | 0.84 | 0.71 | 7.56 |
| 8 | 0.80 | 0.67 | 8.43 |
| 9 | 0.80 | 0.67 | 9.62 |
| 10 | 0.84 | 0.62 | 10.38 |
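
The trade-off in the table above can be checked with a short script. This is only a reading aid, not part of the original analysis: the rows are transcribed from the table, and the helper name `best_depth` and the tie-breaking rule (prefer the shallower network) are assumptions made for illustration.

```python
# Rows transcribed from the table above:
# (layers, max precision with control + manifest + pre-manifest,
#  max precision with control + pre-manifest, training time in days)
RESULTS = [
    (1, 0.80, 0.76, 3.45),
    (2, 0.84, 0.81, 3.78),
    (3, 0.88, 0.86, 4.12),
    (4, 0.92, 0.81, 4.61),
    (5, 0.88, 0.76, 5.82),
    (6, 0.88, 0.71, 6.17),
    (7, 0.84, 0.71, 7.56),
    (8, 0.80, 0.67, 8.43),
    (9, 0.80, 0.67, 9.62),
    (10, 0.84, 0.62, 10.38),
]


def best_depth(rows, column):
    """Return the row with the highest value in `column`.

    Ties are broken in favor of fewer layers (an illustrative choice,
    since shallower networks also train faster in the table).
    """
    return max(rows, key=lambda row: (row[column], -row[0]))


best_all = best_depth(RESULTS, 1)  # control + manifest + pre-manifest
best_pre = best_depth(RESULTS, 2)  # control + pre-manifest only

print(f"All groups: {best_all[0]} layers, precision {best_all[1]:.2f}")
print(f"Control vs pre-manifest: {best_pre[0]} layers, precision {best_pre[2]:.2f}")
```

In both settings the maximum is reached at a moderate depth (4 and 3 layers respectively), while training time grows steadily with every added layer.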
Forecasting accuracy results. The second column shows the results using control, pre-manifest and manifest cases while the third column includes only control and pre-manifest cases.
| Field | Control & Manifest & Pre-Manifest (fraction) | Control & Pre-Manifest (fraction) |
|---|---|---|
| Correct classification | 0.92 | 0.86 |
| Sensitivity | 0.95 | 0.88 |
| Specificity | 0.80 | 0.80 |
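
For readers less familiar with these metrics, the sketch below shows how correct classification, sensitivity, and specificity follow from a binary confusion matrix. The function name and the counts are hypothetical: the source does not report a confusion matrix, and the example counts were chosen only because they yield the rounded values in the second column of the table above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics.

    tp: HD cases classified as HD        (true positives)
    fn: HD cases classified as control   (false negatives)
    tn: controls classified as control   (true negatives)
    fp: controls classified as HD        (false positives)
    """
    total = tp + fn + tn + fp
    return {
        "correct_classification": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # fraction of HD cases detected
        "specificity": tn / (tn + fp),  # fraction of controls recognized
    }


# Hypothetical counts, NOT taken from the paper.
metrics = classification_metrics(tp=19, fn=1, tn=4, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are computed within the HD and control groups separately, so a high overall rate of correct classification can coexist with a lower specificity when the control group is small.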