Linear data mining the Wichita clinical matrix suggests sleep and allostatic load involvement in chronic fatigue syndrome.

Brian M Gurbaxani, James F Jones, Benjamin N Goertzel, Elizabeth M Maloney.

Abstract

OBJECTIVES: To provide a mathematical introduction to the Wichita (KS, USA) clinical dataset, which comprises all of the nongenetic data (no microarray or single nucleotide polymorphism data) from the 2-day clinical evaluation, and to show the preliminary findings and limitations of popular matrix algebra-based data mining techniques.
METHODS: An initial matrix of 440 variables by 227 human subjects was reduced to 183 variables by 164 subjects. Variables were excluded if they correlated strongly with chronic fatigue syndrome (CFS) case classification by design (for example, the multidimensional fatigue inventory [MFI] data), if they were otherwise self-reported and also tended to correlate strongly with CFS classification, or if they were sparse or nonvarying between case and control. Subjects were excluded if they did not clearly fall into well-defined CFS classifications, had comorbid depression with melancholic features, or met other medical or psychiatric exclusion criteria. Two popular data mining techniques, principal component analysis (PCA) and linear discriminant analysis (LDA), were used to determine how well the data separated into groups. Two different feature selection methods helped identify the most discriminating parameters.
RESULTS: Although purely biological features (variables) were found to separate CFS cases from controls, including many allostatic load and sleep-related variables, most parameters were not statistically significant individually. However, biological correlates of CFS, such as heart rate and heart rate variability, require further investigation.
CONCLUSIONS: Feature selection of a limited number of variables from the purely biological dataset produced better separation between groups than a PCA of the entire dataset. Feature selection highlighted the importance of many of the allostatic load variables studied in more detail by Maloney and colleagues in this issue [1], as well as some sleep-related variables. Nonetheless, matrix linear algebra-based data mining approaches appeared to be of limited utility when compared with more sophisticated nonlinear analyses on richer data types, such as those found in Maloney and colleagues [1] and Goertzel and colleagues [2] in this issue.
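
The variable-reduction and feature-selection steps described under METHODS can be illustrated with a minimal sketch. The abstract does not name the two feature selection methods, so the univariate Welch t-statistic ranking below is a stand-in chosen for illustration, not the authors' procedure. The function names, the 20% missingness cutoff, the reading of "nonvarying" as "constant across all subjects", and the toy values are all assumptions; none of the numbers come from the Wichita dataset.

```python
import math
from statistics import mean, variance


def filter_variables(columns, max_missing=0.2):
    """Drop variables that are sparse or constant.

    columns maps a variable name to one value per subject (None = missing).
    The 0.2 cutoff is an arbitrary illustrative threshold.
    """
    kept = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        missing_fraction = 1 - len(present) / len(values)
        if missing_fraction > max_missing:
            continue  # too sparse to be useful
        if len(set(present)) < 2:
            continue  # takes a single value for every subject
        kept[name] = values
    return kept


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_variables(columns, is_case):
    """Order variables by |t| between cases and controls, largest first."""
    scores = {}
    for name, values in columns.items():
        cases = [v for v, c in zip(values, is_case) if c and v is not None]
        controls = [v for v, c in zip(values, is_case) if not c and v is not None]
        if len(cases) < 2 or len(controls) < 2:
            continue  # not enough observations to estimate a variance
        scores[name] = abs(welch_t(cases, controls))
    return sorted(scores, key=scores.get, reverse=True)


# Invented toy matrix: 4 hypothetical variables by 6 hypothetical subjects.
is_case = [True, True, True, False, False, False]
columns = {
    "var_a": [78, 82, 80, 66, 70, 68],            # differs between groups
    "var_b": [20, 35, 15, 18, 30, 22],            # same mean in both groups
    "var_c": [1, 1, 1, 1, 1, 1],                  # constant, dropped
    "var_d": [None, None, 3.1, None, 2.9, None],  # sparse, dropped
}

kept = filter_variables(columns)
ranking = rank_variables(kept, is_case)
print(sorted(kept))  # variables surviving the filter
print(ranking)       # most discriminating variable first
```

A ranking of this kind scores each variable in isolation, so it cannot detect variables that discriminate only in combination; multivariate methods such as the LDA mentioned in the abstract address that case.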

Year: 2006; PMID: 16610955; DOI: 10.2217/14622416.7.3.455

Source DB: PubMed; Journal: Pharmacogenomics; ISSN: 1462-2416; Impact factor: 2.533