Ashley R Valdez, Elizabeth E Hancock, Seyi Adebayo, David J Kiernicki, Daniel Proskauer, John R Attewell, Lucinda Bateman, Alfred DeMaria, Charles W Lapp, Peter C Rowe, Charmian Proskauer.
Abstract
Techniques of data mining and machine learning were applied to a large database of medical and facility claims from commercially insured patients to determine the prevalence, gender demographics, and costs for individuals with provider-assigned diagnosis codes for myalgic encephalomyelitis (ME) or chronic fatigue syndrome (CFS). The frequency of diagnosis was 519-1,038/100,000, and the relative risk of females being diagnosed with ME or CFS, compared to males, was 1.238 and 1.178, respectively. While the percentage of women diagnosed with ME/CFS is higher than the percentage of men, ME/CFS is not a "women's disease." Thirty-five to forty percent of diagnosed patients are men. Extrapolating from this frequency of diagnosis and based on the estimated 2017 population of the United States, a rough estimate for the number of patients who may be diagnosed with ME or CFS in the U.S. is 1.7 million to 3.38 million. Patients diagnosed with CFS appear to represent a more heterogeneous group than those diagnosed with ME. A machine learning model based on characteristics of individuals diagnosed with ME was developed and applied, resulting in a predicted prevalence of 857/100,000 (p > 0.01), or roughly 2.8 million in the U.S. Average annual costs for individuals with a diagnosis of ME or CFS were compared with those for lupus (all categories) and multiple sclerosis (MS), and found to be 50% higher for ME and CFS than for lupus or MS, and three to four times higher than for the general insured population. A separate aspect of the study attempted to determine if a diagnosis of ME or CFS could be predicted based on symptom codes in the insurance claims records. Due to the absence of specific codes for some core symptoms, we were unable to validate that the information in insurance claims records is sufficient to identify diagnosed patients or suggest that a diagnosis of ME or CFS should be considered based solely on the presence of those symptoms.
These results show that a prevalence rate of 857/100,000 for ME/CFS is not unreasonable; therefore, it is not a rare disease, but in fact a relatively common one.
Keywords: ME/CFS; chronic fatigue syndrome; costs; data mining; machine learning; myalgic encephalomyelitis; prevalence
Year: 2019 PMID: 30671425 PMCID: PMC6331450 DOI: 10.3389/fped.2018.00412
Source DB: PubMed Journal: Front Pediatr ISSN: 2296-2360 Impact factor: 3.418
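The population extrapolation described in the abstract can be reproduced with simple arithmetic. The sketch below is illustrative only: the 2017 U.S. population figure (325.7 million) is an assumption, since the abstract does not state the value used, and the function name `extrapolate` is hypothetical.

```python
# Assumed 2017 U.S. population (approximate; not stated in the abstract).
US_POPULATION_2017 = 325_700_000


def extrapolate(rate_per_100k: int, population: int = US_POPULATION_2017) -> int:
    """Scale a rate per 100,000 to an expected head count in `population`."""
    return round(rate_per_100k * population / 100_000)


low = extrapolate(519)     # lower bound of the diagnosis frequency
high = extrapolate(1038)   # upper bound of the diagnosis frequency
model = extrapolate(857)   # prevalence predicted by the machine learning model

print(f"{low:,} to {high:,} diagnosed; model estimate {model:,}")
```

Under this assumed population, the results land close to the abstract's rounded figures of 1.7 million, 3.38 million, and 2.8 million.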
Figure 1. Graphic showing sequence of steps in this study.
ICD codes used for diagnosis of ME and CFS.
| Diagnosis | ICD-9 code | ICD-10 code |
| ME (Myalgic Encephalomyelitis) | 323.9 | G93.3 |
| CFS (Chronic Fatigue Syndrome) | 780.71 | R53.82 |
ICD codes used for the diagnosis of multiple sclerosis and lupus.
| Diagnosis | ICD-9 code | ICD-10 code |
| MS (Multiple Sclerosis) | 340 | G35 |
| Lupus (includes all subcategories) | 710 | M32, L93 |
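As an illustration of how the diagnosis codes in the two tables above might be used to label claim records, here is a minimal sketch. The dictionary `DIAGNOSIS_CODES` and the function `classify` are hypothetical names, and the matching rule (exact match, or a dotted subcategory of a listed code) is an assumption rather than the study's documented procedure.

```python
from typing import Optional

# Codes copied from the tables above (ICD-9 and ICD-10 variants together).
DIAGNOSIS_CODES = {
    "ME": ("323.9", "G93.3"),
    "CFS": ("780.71", "R53.82"),
    "MS": ("340", "G35"),
    "Lupus": ("710", "M32", "L93"),
}


def classify(code: str) -> Optional[str]:
    """Return the diagnosis label for a claim code, or None if it is not listed.

    Assumption: a code matches when it equals a listed code or is a dotted
    subcategory of one (for example "M32.10" under "M32").
    """
    normalized = code.strip().upper()
    for label, listed in DIAGNOSIS_CODES.items():
        for base in listed:
            if normalized == base or normalized.startswith(base + "."):
                return label
    return None


print(classify("G93.3"), classify("r53.82"), classify("M32.10"), classify("R53.83"))
```

Subcategory matching matters mainly for lupus, which the table describes as including all subcategories.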
Prevalence of ME diagnosis for varying lengths of continuous enrollment (any years).
| Years of continuous enrollment | Individuals diagnosed with ME | Individuals enrolled | Prevalence per 100,000 |
| 0–1 | 2,668 | 3,648,421 | 73 |
| 1–2 | 11,070 | 13,422,797 | 82 |
| 2–3 | 7,883 | 7,339,562 | 107 |
| 3–4 | 6,337 | 4,438,630 | 143 |
| 4–5 | 6,375 | 3,660,868 | 174 |
| 5–6 | 2,971 | 1,670,694 | 178 |
| 6–7 | 4,925 | 2,629,342 | 187 |
| Total | 42,229 | 36,810,314 | 115 |
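The last column of the enrollment table is consistent with a rate per 100,000 computed from the two count columns. The sketch below recomputes it; the column interpretation (diagnosed count, enrolled count) is inferred from the arithmetic, and the helper name is hypothetical.

```python
# (years enrolled, diagnosed with ME, enrolled) copied from the table above.
rows = [
    ("0-1", 2_668, 3_648_421),
    ("1-2", 11_070, 13_422_797),
    ("2-3", 7_883, 7_339_562),
    ("3-4", 6_337, 4_438_630),
    ("4-5", 6_375, 3_660_868),
    ("5-6", 2_971, 1_670_694),
    ("6-7", 4_925, 2_629_342),
]


def prevalence_per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 members of the population."""
    return cases / population * 100_000


rates = [round(prevalence_per_100k(cases, enrolled)) for _, cases, enrolled in rows]

total_cases = sum(cases for _, cases, _ in rows)
total_enrolled = sum(enrolled for _, _, enrolled in rows)
total_rate = round(prevalence_per_100k(total_cases, total_enrolled))

for (years, _, _), rate in zip(rows, rates):
    print(f"{years} years: {rate} per 100,000")
print(f"Total: {total_rate} per 100,000")
```

The recomputed rates rise steadily with enrollment length, matching the pattern in the table.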
Summary of prevalence of ME and CFS in three studied cohorts.
| Main dataset | 16,305 | 9,263 | 25,568 | 51 | 140,947 | 99,929 | 240,876 | 482 |
| Subset 1 | 1,044 | 1,030 | 2,074 | 81 | 6,635 | 10,234 | 16,869 | 661 |
| Subset 2 | 10,196 | 3,945 | 14,141 | 121 | 87,282 | 57,614 | 144,896 | 1,236 |
Summary of prevalence of ME + CFS in the three studied cohorts, and with duplicates eliminated.
| Cohort | ME + CFS diagnoses | Prevalence per 100,000 | ME or CFS individuals (duplicates eliminated) | Prevalence per 100,000 | Cohort population |
| Main dataset | 266,444 | 533 | 259,275 | 519 | 49,963,500 |
| Subset 1 | 18,943 | 742 | 17,074 | 669 | 2,553,722 |
| Subset 2 | 159,037 | 1,357 | 121,632 | 1,038 | 11,720,401 |
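The effect of eliminating duplicates can be checked directly from the two summary tables. This is a sketch under two assumptions: that the ME and CFS totals are the third and seventh numeric columns of the preceding table (each equals the sum of the two columns before it), and that the difference between the combined and deduplicated counts represents individuals carrying both an ME and a CFS code.

```python
# cohort: (ME total, CFS total, deduplicated ME-or-CFS individuals, cohort population)
cohorts = {
    "Main dataset": (25_568, 240_876, 259_275, 49_963_500),
    "Subset 1": (2_074, 16_869, 17_074, 2_553_722),
    "Subset 2": (14_141, 144_896, 121_632, 11_720_401),
}

overlap = {}
dedup_rate = {}
for name, (me, cfs, unique, population) in cohorts.items():
    # Counted once under ME and once under CFS before deduplication (assumed).
    overlap[name] = me + cfs - unique
    dedup_rate[name] = round(unique / population * 100_000)

for name in cohorts:
    print(f"{name}: overlap {overlap[name]:,}, {dedup_rate[name]} per 100,000")
```

The deduplicated rates correspond to the 519 to 1,038 per 100,000 range quoted in the abstract.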
Figure 2. ME or CFS gender demographics by age and prevalence (main dataset, non-continuous enrollment).
Figure 4. ME gender demographics by age and prevalence (Subset 2, continuous enrollment from 2 to 4 years).
ME or CFS gender demographics by age and prevalence vs. reference population in the main dataset (Main dataset: non-continuous enrollment).
| Age (years) | Male prevalence per 100,000 | Female prevalence per 100,000 | Overall prevalence per 100,000 | Female:male ratio | Male share of prevalence | Female share of prevalence | Males diagnosed | Females diagnosed | Total diagnosed | Male population | Female population | Total population |
| 0 to 9 | 40.0 | 34.3 | 37.2 | 0.86: 1 | 53.83% | 46.17% | 978 | 796 | 1,774 | 2,446,920 | 2,321,941 | 4,768,861 |
| 10 to 19 | 148.8 | 261.8 | 204.8 | 1.76: 1 | 36.24% | 63.76% | 4,023 | 6,934 | 10,957 | 2,702,876 | 2,648,306 | 5,351,182 |
| 20 to 29 | 259.5 | 478.1 | 379.3 | 1.84: 1 | 35.18% | 64.82% | 7,725 | 17,245 | 24,970 | 2,976,670 | 3,606,744 | 6,583,414 |
| 30 to 39 | 365.8 | 681.9 | 539.0 | 1.86: 1 | 34.91% | 65.09% | 11,679 | 26,390 | 38,069 | 3,192,640 | 3,869,894 | 7,062,534 |
| 40 to 49 | 482.1 | 909.1 | 708.4 | 1.89: 1 | 34.65% | 65.35% | 15,264 | 32,472 | 47,736 | 3,166,430 | 3,571,767 | 6,738,197 |
| 50 to 59 | 510.4 | 879.5 | 705.2 | 1.72: 1 | 36.72% | 63.28% | 16,865 | 32,459 | 49,324 | 3,304,031 | 3,690,758 | 6,994,789 |
| 60 to 69 | 499.6 | 739.6 | 628.1 | 1.48: 1 | 40.31% | 59.69% | 14,433 | 24,610 | 39,043 | 2,889,185 | 3,327,290 | 6,216,475 |
| 70 to 79 | 623.7 | 822.8 | 731.3 | 1.32: 1 | 43.12% | 56.88% | 9,976 | 15,479 | 25,455 | 1,599,520 | 1,881,168 | 3,480,688 |
| 80 to 89 | 814.2 | 958.0 | 900.3 | 1.18: 1 | 45.94% | 54.06% | 7,766 | 13,608 | 21,374 | 953,817 | 1,420,412 | 2,374,229 |
| Total | 381.8 | 645.4 | 521.9 | 1.69: 1 | 37.17% | 62.83% | 88,709 | 169,993 | 258,702 | 23,232,089 | 26,338,280 | 49,570,369 |
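The derived columns in the demographic table can be reproduced from the counts in a single row. The sketch below uses the 10 to 19 age group and assumes the column order is male, then female, then total, which is inferred from the arithmetic rather than stated in the source. It also shows that the two percentage columns appear to be shares of the sex-specific rates, not shares of the case counts.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Cases per 100,000 members of the population."""
    return cases / population * 100_000


# Ages 10 to 19, main dataset (values copied from the table above).
male_cases, female_cases = 4_023, 6_934
male_pop, female_pop = 2_702_876, 2_648_306

male_rate = rate_per_100k(male_cases, male_pop)
female_rate = rate_per_100k(female_cases, female_pop)

ratio = female_rate / male_rate                           # female:male ratio
male_share_of_rates = male_rate / (male_rate + female_rate)
male_share_of_cases = male_cases / (male_cases + female_cases)

print(f"male {male_rate:.1f}, female {female_rate:.1f}, ratio {ratio:.2f}:1")
print(f"male share of rates {male_share_of_rates:.2%}")
print(f"male share of cases {male_share_of_cases:.2%}")
```

Normalizing by rates removes the effect of unequal male and female population sizes, which is why the two shares differ slightly.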
ME or CFS gender demographics by age and prevalence vs. reference population (Subset 1: continuous enrollment 2011–2016).
| Age (years) | Male prevalence per 100,000 | Female prevalence per 100,000 | Overall prevalence per 100,000 | Female:male ratio | Male share of prevalence | Female share of prevalence | Males diagnosed | Females diagnosed | Total diagnosed | Male population | Female population | Total population |
| 0–9 | 0.0 | 0.0 | 0.0 | | 0.00% | 0.00% | 0 | 0 | 0 | 4 | 5 | 9 |
| 10–19 | 179.7 | 288.7 | 233.1 | 1.61: 1 | 38.36% | 61.64% | 242 | 374 | 616 | 134,694 | 129,563 | 264,257 |
| 20–29 | 297.3 | 584.0 | 443.8 | 1.96: 1 | 33.74% | 66.26% | 405 | 831 | 1,236 | 136,205 | 142,303 | 278,508 |
| 30–39 | 481.6 | 881.7 | 695.9 | 1.83: 1 | 35.33% | 64.67% | 270 | 570 | 840 | 56,062 | 64,646 | 120,708 |
| 40–49 | 515.3 | 956.9 | 749.2 | 1.86: 1 | 35.00% | 65.00% | 780 | 1,631 | 2,411 | 151,354 | 170,449 | 321,803 |
| 50–59 | 530.7 | 947.4 | 749.4 | 1.79: 1 | 35.91% | 64.09% | 1,155 | 2,277 | 3,432 | 217,625 | 240,347 | 457,972 |
| 60–69 | 527.7 | 819.9 | 682.5 | 1.55: 1 | 39.16% | 60.84% | 1,073 | 1,877 | 2,950 | 203,345 | 228,921 | 432,266 |
| 70–79 | 640.2 | 848.8 | 752.9 | 1.33: 1 | 43.00% | 57.00% | 695 | 1,082 | 1,777 | 108,557 | 127,478 | 236,035 |
| 80–89 | 791.5 | 909.7 | 862.8 | 1.15: 1 | 46.52% | 53.48% | 1,385 | 2,420 | 3,805 | 174,985 | 266,008 | 440,993 |
| Total | 507.7 | 807.6 | 668.6 | 1.59: 1 | 38.60% | 61.40% | 6,005 | 11,062 | 17,067 | 1,182,831 | 1,369,720 | 2,552,551 |
Figure 3. ME or CFS gender demographics by age and prevalence (Subset 1, continuous enrollment from 2011 to 2016).
ME gender demographics by age and prevalence vs. reference population (Subset 2, continuous enrollment 2 to 4 years).
| Age (years) | Male prevalence per 100,000 | Female prevalence per 100,000 | Overall prevalence per 100,000 | Female:male ratio | Male share of prevalence | Female share of prevalence | Males diagnosed | Females diagnosed | Total diagnosed | Male population | Female population | Total population |
| 0–9 | 32.7 | 25.9 | 29.4 | 0.79: 1 | 55.81% | 44.19% | 207 | 157 | 364 | 633,270 | 606,551 | 1,239,821 |
| 10–19 | 42.0 | 58.1 | 50.0 | 1.38: 1 | 41.97% | 58.03% | 307 | 412 | 719 | 730,324 | 708,729 | 1,439,053 |
| 20–29 | 57.4 | 97.8 | 77.5 | 1.70: 1 | 37.00% | 63.00% | 462 | 778 | 1,240 | 804,286 | 795,311 | 1,599,597 |
| 30–39 | 65.8 | 111.1 | 88.6 | 1.69: 1 | 37.20% | 62.80% | 545 | 932 | 1,477 | 827,930 | 838,616 | 1,666,546 |
| 40–49 | 86.9 | 142.5 | 114.7 | 1.64: 1 | 37.89% | 62.11% | 668 | 1,099 | 1,767 | 768,545 | 771,386 | 1,539,931 |
| 50–59 | 118.5 | 168.9 | 144.1 | 1.43: 1 | 41.22% | 58.78% | 903 | 1,335 | 2,238 | 762,241 | 790,364 | 1,552,605 |
| 60–69 | 164.0 | 207.6 | 187.0 | 1.27: 1 | 44.13% | 55.87% | 1,090 | 1,540 | 2,630 | 664,725 | 741,738 | 1,406,463 |
| 70–79 | 234.3 | 268.7 | 253.1 | 1.15: 1 | 46.58% | 53.42% | 824 | 1,135 | 1,959 | 351,709 | 422,440 | 774,149 |
| 80–89 | 317.9 | 358.1 | 342.5 | 1.13: 1 | 47.02% | 52.98% | 621 | 1,099 | 1,720 | 195,363 | 306,871 | 502,234 |
| Total | 98.1 | 141.9 | 120.4 | 1.45: 1 | 40.87% | 59.13% | 5,627 | 8,487 | 14,114 | 5,738,393 | 5,982,006 | 11,720,399 |
Figure 5. Overlap of individuals with some symptoms of ME/CFS vs. those diagnosed.
Top predictive features for ME machine learning model.
| Importance | Feature | Description |
| 0.105990 | age | |
| 0.017413 | gender | |
| 0.016377 | icd_R53 | Malaise and fatigue |
| 0.014899 | cpt_00175 | Qualitative_or_Semiquantitative_Immunoassays |
| 0.014763 | icd_N39 | Other disorders of urinary system |
| 0.014508 | icd_E55 | Vitamin D deficiency |
| 0.014083 | cpt_00123 | Diagnostic_Radiology_(Diagnostic_Imaging)_Procedures_of_the_Head_and_Neck |
| 0.013081 | cpt_00124 | Diagnostic_Radiology_(Diagnostic_Imaging)_Procedures_of_the_Chest |
| 0.012911 | icd_R07 | Pain in throat and chest |
| 0.012452 | cpt_00128 | Diagnostic_Radiology_(Diagnostic_Imaging)_Procedures_of_the_Abdomen |
| 0.011824 | icd_R51 | Headache |
| 0.011824 | cpt_00217 | Cardiography_Procedures |
| 0.011178 | cpt_00168 | Urinalysis_Procedures |
| 0.010957 | icd_R06 | Abnormalities of breathing |
| 0.010499 | icd_R00 | Abnormalities of heart beat |
| 0.010465 | icd_R94 | Abnormal results of function studies |
| 0.010431 | cpt_00289 | Subsequent_Hospital_Care_Services |
| 0.010074 | icd_R50 | Fever of other and unknown origin |
| 0.009819 | icd_D64 | Other anemias |
| 0.009751 | icd_E03 | Other hypothyroidism |
| 0.009429 | cpt_00367 | Temporary_National_Codes_(Non-Medicare) |
| 0.009378 | cpt_00220 | Echocardiography_Procedures |
| 0.008987 | icd_K59 | Other functional intestinal disorders |
| 0.008885 | cpt_00350 | Ambulance_and_Other_Transport_Services_and_Support |
| 0.008596 | cpt_00174 | Hematology_and_Coagulation_Procedures |
| 0.008392 | icd_Z51 | Encounter for other aftercare and medical care |
| 0.008307 | icd_M62 | Other disorders of muscle |
| 0.00739 | icd_R79 | Other abnormal findings of blood chemistry |
| 0.007339 | cpt_00160 | Diagnostic_Nuclear_Medicine_Procedures |
| 0.007203 | icd_R26 | Abnormalities of gait and mobility |
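One way to read the feature table is to group the importance scores by feature type. The sketch below does this for the ten highest-ranked features only. The grouping by name prefix (`icd_` for diagnosis codes, `cpt_` for procedure categories, everything else demographic) is an assumption about the naming scheme, and nothing here reflects how the study's model computed the scores.

```python
from collections import defaultdict

# Ten highest-ranked features copied from the table above: (importance, name).
top_features = [
    (0.105990, "age"),
    (0.017413, "gender"),
    (0.016377, "icd_R53"),
    (0.014899, "cpt_00175"),
    (0.014763, "icd_N39"),
    (0.014508, "icd_E55"),
    (0.014083, "cpt_00123"),
    (0.013081, "cpt_00124"),
    (0.012911, "icd_R07"),
    (0.012452, "cpt_00128"),
]


def feature_kind(name: str) -> str:
    """Classify a feature by its name prefix (assumed naming convention)."""
    if name.startswith("icd_"):
        return "icd"
    if name.startswith("cpt_"):
        return "cpt"
    return "demographic"


totals = defaultdict(float)
for importance, name in top_features:
    totals[feature_kind(name)] += importance

for kind, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{kind}: {total:.6f}")
```

Among these ten features, age alone carries more weight than all listed diagnosis and procedure features combined.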
Average yearly medical costs for diagnosed vs. reference population.
| Year | Reference population | ME or CFS | Lupus | Multiple sclerosis |
| 2016 | $ 8,500 | $ 30,600 | $ 22,600 | $ 23,220 |
| 2015 | $ 7,800 | $ 32,400 | $ 21,100 | $ 22,090 |
| 2014 | $ 7,500 | $ 31,300 | $ 20,100 | $ 21,050 |
| 2013 | $ 7,700 | $ 34,300 | $ 20,100 | $ 22,780 |
| 2012 | $ 7,300 | $ 25,700 | $ 16,900 | $ 19,160 |
| Average | $ 7,760 | $ 30,860 | $ 20,160 | $ 21,660 |
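The cost comparison in the abstract can be checked against the yearly figures. In the sketch below, the assignment of columns to groups (reference population, ME or CFS, lupus, MS) is an assumption inferred from the averages that reappear in the later comparison table; the variable names are hypothetical.

```python
from statistics import mean

# year: (reference, ME or CFS, lupus, MS) in U.S. dollars; column order assumed.
yearly_costs = {
    2016: (8_500, 30_600, 22_600, 23_220),
    2015: (7_800, 32_400, 21_100, 22_090),
    2014: (7_500, 31_300, 20_100, 21_050),
    2013: (7_700, 34_300, 20_100, 22_780),
    2012: (7_300, 25_700, 16_900, 19_160),
}
groups = ("reference", "me_cfs", "lupus", "ms")

# Five-year average cost per group.
averages = {
    group: mean(row[i] for row in yearly_costs.values())
    for i, group in enumerate(groups)
}

# How many times higher the ME or CFS average is than each comparison group.
ratios = {
    group: averages["me_cfs"] / averages[group]
    for group in ("reference", "lupus", "ms")
}

for group, ratio in ratios.items():
    print(f"ME or CFS vs {group}: {ratio:.2f}x")
```

The ratios are roughly 4x the reference population and about 1.4x to 1.5x lupus and MS, in line with the abstract's "three to four times" and "50% higher" statements.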
Comparison of prevalence rates.
| Source | Sample size | Prevalence per 100,000 | Female (%) | Method |
| Diagnosed with ME (subset 2, continuous enrollment 2–4 years) | 11.7M | 121 | 60.1% | Insurance Claim Data |
| Nacul et al. ( | 143,000 | 200 | 51.0% | Community Health Study |
| Reyes et al. ( | 90,316 | 240 | 81.8% | Community Health Study |
| Jason et al. ( | 18,675 | 420 | 71.9% | Community Health Study |
| Diagnosed with ME or CFS (main dataset, non-continuous enrollment) | 50M | 519 | 65.7% | Insurance Claim Data |
| Diagnosed with ME or CFS (subset 1, continuous enrollment 2011–2016) | 2.5M | 669 | 64.7% | Insurance Claim Data |
| Projected prevalence of ME using machine learning | 2.7M | 857 | 57.9% | Machine Learning Predictive Model |
| Diagnosed with ME or CFS (subset 2, continuous enrollment 2 to 4 years) | 11.7M | 1,038 | 65.0% | Insurance Claim Data |
| National ME/FM Action Network ( | 65,000 | 1,400 | 63.4% | Survey |
| Lin et al. ( | 54,695 | 1,600 | 80.0% | Survey |
Comparison of several factors relating to ME/CFS, lupus and multiple sclerosis.
| ME/CFS | 1,726,000–3,746,000 | 519–1,038/100,000 0.52–1.04% | 714,000 ( | $30,860 | $15MM |
| Lupus | 785,000 | 241/100,000 0.241% ( | No data available | $20,160 | $109MM |
| Multiple Sclerosis | 486,000 | 149/100,000 0.15% ( | 300,200 ( | $21,000 | $111MM |
| Reference population | | | | $7,760 | |