Gerardo Alfonso Perez, Javier Caballero Villarraso.
Abstract
Multiple sclerosis (MS) is a relatively common neurodegenerative illness that frequently causes a high level of disability in patients. While its cause is not fully understood, it is likely due to a combination of genetic and environmental factors. Diagnosing multiple sclerosis through a simple clinical examination can be challenging, as the evolution of the illness varies significantly from patient to patient, with some patients experiencing long periods of remission. In this regard, a quick and inexpensive tool to help identify the illness, such as DNA CpG (cytosine-phosphate-guanine) methylation, might be useful. This paper presents a technique, based on the concept of Shannon Entropy, to select CpGs as inputs for non-linear classification algorithms. The approach is shown to generate accurate classifications that are a statistically significant improvement over using all the available data or randomly selecting the same number of CpGs. The analysis controlled for factors such as the age, gender and smoking status of the patient. The approach reduced the number of CpGs used while significantly increasing the accuracy.
Keywords: DNA methylation; entropy; multiple sclerosis
Year: 2022 PMID: 35330398 PMCID: PMC8948909 DOI: 10.3390/jpm12030398
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
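The abstract describes selecting CpGs with a Shannon Entropy criterion but does not spell out the computation. The following is a minimal sketch of one way such a filter could work, not the authors' implementation. It assumes methylation beta values in [0, 1], discretises them into equal-width bins, and keeps the CpGs with the highest entropy. The bin count, the ranking direction, the function names `shannon_entropy` and `select_cpgs`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of beta values in [0, 1].

    Values are discretised into n_bins equal-width bins (an assumption;
    the abstract does not state how the distribution is estimated).
    """
    if not values:
        raise ValueError("values must be non-empty")
    # Map each beta value to a bin index; a value of 1.0 goes to the last bin.
    bins = [min(int(v * n_bins), n_bins - 1) for v in values]
    counts = Counter(bins)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def select_cpgs(methylation, k, n_bins=10):
    """Return the k CpG identifiers with the highest entropy.

    Ranking from highest to lowest entropy is an assumed convention.
    """
    ranked = sorted(
        methylation,
        key=lambda cpg: shannon_entropy(methylation[cpg], n_bins),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: CpG identifier -> beta values across four samples.
toy = {
    "cpg_a": [0.05, 0.05, 0.05, 0.05],  # constant across samples
    "cpg_b": [0.05, 0.95, 0.05, 0.95],  # two equally likely levels
    "cpg_c": [0.05, 0.35, 0.65, 0.95],  # four distinct levels
}

selected = select_cpgs(toy, k=2)
print(selected)
```

In this sketch the selected CpGs would then serve as inputs to a classifier. A constant CpG has zero entropy and is dropped first, which matches the intuition that a site with no variation across patients carries no information for classification.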
Figure 1. Graphical illustration of neurological damage in MS.
Figure 2. Illustration of CpG islands.
Figure 3. DNA methylation illustration.
Basic descriptive information of the patients.
| Description | Amount |
|---|---|
| Male | 77 |
| Female | 202 |
| Smokers | 138 |
| Non-smokers | 141 |
| Age (years; min, max) | 16, 77 |
Figure 4. Error rate comparison between direct approach and Shannon Entropy filtered approach.
Average classification forecasting accuracy.
| Accuracy Measure | Percentage |
|---|---|
| Average successful classification | 80.1% |
| Sensitivity | 78.3% |
| Specificity | 81.8% |
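For reference, the three measures in the table above follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions, assuming MS patients are the positive class. The function name `classification_metrics` and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp / fn: positive cases (assumed here to be MS patients) classified
             correctly / incorrectly.
    tn / fp: negative cases (controls) classified correctly / incorrectly.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```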
Figure 5. A sample confusion matrix (after p-value prefiltering and Shannon Entropy filtering).
Figure 6. ROC (after p-value prefiltering and Shannon Entropy filtering).
Figure 7. Error rate comparison between the Shannon Entropy filtered approach and random selection of the same size.
Figure 8. Sensitivity analysis according to the standard deviation of the value of the CpGs. Error rate as a function of the number of CpGs selected according to their standard deviation.