Annika M Jödicke, Urs Zellweger, Ivan T Tomka, Thomas Neuer, Ivanka Curkovic, Malgorzata Roos, Gerd A Kullak-Ublick, Hayk Sargsyan, Marco Egbring.
Abstract
BACKGROUND: Rising health care costs are a major public health issue. Thus, accurately predicting future costs and understanding which factors contribute to increases in health care expenditures are important. The objective of this project was to predict the development of patients' health care costs in the subsequent year and to identify factors contributing to this prediction, with a particular focus on the role of pharmacotherapy.
Keywords: Boosted decision tree; Claims data; Health care costs; Health care utilisation; Machine learning; Neural network; Pharmacology
Year: 2019 PMID: 31829224 PMCID: PMC6907182 DOI: 10.1186/s12913-019-4616-x
Source DB: PubMed Journal: BMC Health Serv Res ISSN: 1472-6963 Impact factor: 2.655
Study population characteristics (2014)
| | All | Train | Validation | Test |
|---|---|---|---|---|
| Patients, n (%) | 373′264 (100%) | 298′611 (80%) | 37′326 (10%) | 37′327 (10%) |
| Demographics | ||||
| Age, median [IQR] | 63.8 [49.2, 75.1] | 63.8 [49.3, 75.1] | 63.6 [49.0, 75.2] | 63.8 [49.1, 75.1] |
| Gender [female], n (%) | 226′085 (60.6%) | 180′730 (60.5%) | 22′565 (60.5%) | 22′790 (61.1%) |
| Language Area, n (%) | ||||
| | 275′025 (73.7%) | 219′998 (73.7%) | 27′469 (73.6%) | 27′558 (73.8%) |
| | 69′120 (18.5%) | 55′292 (18.5%) | 6′927 (18.6%) | 6′901 (18.5%) |
| | 29′119 (7.8%) | 23′321 (7.8%) | 2′930 (7.8%) | 2′868 (7.7%) |
| Deductible, n (%) | ||||
| | 250′287 (67.1%) | 200′211 (67.0%) | 25′026 (67.0%) | 25′050 (67.1%) |
| | 92′274 (24.7%) | 73′900 (24.7%) | 9′175 (24.6%) | 9′199 (24.6%) |
| | 30′703 (8.2%) | 24′500 (8.2%) | 3′125 (8.4%) | 3′078 (8.2%) |
| Cost | ||||
| Total Costs (CHF), median [IQR] | 3′932 [1′944, 8′597] | 3′935 [1′946, 8′586] | 3′948 [1′958, 8′642] | 3′894 [1′915, 8′642] |
| Cost Difference (CHF)*, median [IQR] | 93 [−1′746, 2′365] | 93 [−1′750, 2′365] | 62 [−1′791, 2′305] | 122 [−1′668, 2′441] |
| Increase†, n (%) | 193′766 (51.9%) | 155′058 (51.9%) | 19′130 (51.3%) | 19′578 (52.4%) |
| Drug Therapy | ||||
| Number of drugs‡, median [IQR] | 9 [6, 15] | 9 [6, 15] | 9 [6, 15] | 9 [6, 15] |
| Number of prescriptions§, median [IQR] | 19 [11, 34] | 19 [11, 34] | 19 [11, 34] | 19 [11, 34] |
| Route of administration, n (%) | ||||
| | 369′101 (98.9%) | 295′313 (98.9%) | 36′926 (98.9%) | 36′862 (98.8%) |
| | 122′361 (32.8%) | 97′764 (32.7%) | 12′427 (33.3%) | 12′170 (32.6%) |
| Health Care Utilisation | ||||
| Number of visits||, median [IQR] | 8 [4, 13] | 8 [4, 13] | 8 [4, 13] | 8 [4, 13] |
| Hospitalisation [yes], n (%) | 66′427 (17.8%) | 53′085 (17.8%) | 6′688 (17.9%) | 6′654 (17.8%) |
Descriptive statistics such as median, interquartile range (IQR), and absolute and relative frequencies were computed using R (Version 3.3.1). Age was used as 18 age categories in the models, but is shown as a continuous variable in this table for easier interpretation. *Cost Difference = Total Costs 2015 − Total Costs 2014 (CHF = Swiss Francs), †Increase = Cost Difference > 0, ‡Number of different drugs defined by active components, §Number of prescribed drugs, identified by GTIN, ||Number of outpatient physician office visits
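The two footnote definitions above (Cost Difference and Increase) fully determine the binary prediction target. A minimal Python sketch of that labelling is given below; the function names and the toy cost values are illustrative assumptions and are not taken from the study data or the authors' code.

```python
def cost_difference(total_2014: float, total_2015: float) -> float:
    """Cost Difference = Total Costs 2015 - Total Costs 2014 (in CHF)."""
    return total_2015 - total_2014


def is_increase(total_2014: float, total_2015: float) -> bool:
    """Increase = Cost Difference > 0; a difference of exactly 0 is not an increase."""
    return cost_difference(total_2014, total_2015) > 0


def share_increase(cost_pairs) -> float:
    """Percentage of patients labelled 'increase' among (2014, 2015) cost pairs."""
    labels = [is_increase(c14, c15) for c14, c15 in cost_pairs]
    return 100.0 * sum(labels) / len(labels)


# Toy (hypothetical) patients: (total costs 2014, total costs 2015) in CHF.
toy_patients = [(4000.0, 4100.0), (8000.0, 6500.0), (2000.0, 2000.0), (1500.0, 3000.0)]
print(share_increase(toy_patients))
```

Note that, under this definition, patients with unchanged costs fall into the "no increase" class.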
Comparison of prediction performance of logistic regression (LR), boosted decision tree (BDT) and feedforward neural network (FNN) using different sets of features
| Model | Size | LR Acc, % | LR AUC | BDT Acc, % | BDT AUC | FNN Acc, % | FNN AUC |
|---|---|---|---|---|---|---|---|
| Demographic model* | 7 | 51.2 | 0.52 | 51.3 | 0.52 | 52.2 | 0.53 |
| + number of different drugs | 8 | 58.0 | 0.61 | 58.1 | 0.61 | 58.7 | 0.61 |
| + number of individual prescriptions | 8 | 55.3 | 0.58 | 56.9 | 0.60 | 57.5 | 0.60 |
| + number of hospitalisations | 8 | 61.0 | 0.62 | 61.0 | 0.62 | 61.1 | 0.63 |
| + number of outpatient physician office visits | 8 | 59.4 | 0.63 | 60.1 | 0.63 | 60.4 | 0.64 |
| + chronic conditions | 29 | 54.8 | 0.57 | 57.0 | 0.59 | 57.5 | 0.60 |
| Extended model† | 33 | 62.8 | 0.67 | 63.1 | 0.68 | 64.0 | 0.69 |
| + additional features | 297 | 64.8 | 0.70 | 66.3 | 0.72 | 66.1 | 0.72 |
| + features representing pharmacotherapy | 482 | 64.5 | 0.69 | 65.4 | 0.71 | 65.6 | 0.71 |
| + total costs | 34 | 62.1 | 0.67 | 64.8 | 0.71 | 65.7 | 0.71 |
| + additional features + total costs | 298 | 65.0 | 0.70 | 67.0 | 0.74 | 67.0 | 0.73 |
| Complete model‡ without total costs | 746 | 65.3 | 0.71 | 66.5 | 0.73 | 66.5 | 0.72 |
| Complete model | 747 | 65.2 | 0.70 | 67.4 | 0.74 | 67.4 | 0.73 |
| Backward Deletion | 36 | – | – | 66.9 | 0.73 | – | – |
| Complete model‡ without total costs | 746 | 65.9 | 0.71 | 66.8 | 0.73 | 66.4 | 0.72 |
| Complete model | 747 | 65.7 | 0.71 | 67.6 | 0.74 | 67.2 | 0.73 |
| Backward Deletion | 36 | – | – | 67.1 | 0.73 | – | – |
Acc = Accuracy, AUC = Area under the curve, Size = Number of features in the model
*Demographic model = age + gender + area of residence + deductible + insurance model,
†Extended model = Demographic model + number of different drugs + number of individual prescriptions + number of hospitalisations + number of outpatient physician office visits + chronic conditions
‡Complete model = Extended model + additional predictors + features representing pharmacotherapy + total costs
Bold data are significant
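For readers unfamiliar with the two metrics reported in this table, the following Python sketch computes accuracy and AUC for a binary "increase" classifier. It is an illustration only: the 0.5 decision threshold, the function names, and the toy scores are assumptions, and the pairwise AUC formula is the standard rank-based definition rather than the authors' implementation.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of patients whose thresholded score matches the true label.

    The 0.5 threshold is an assumption made for this sketch.
    """
    predictions = [1 if score >= threshold else 0 for score in y_score]
    correct = sum(pred == truth for pred, truth in zip(predictions, y_true))
    return correct / len(y_true)


def auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = cost increase) and hypothetical predicted probabilities.
labels = [1, 1, 0, 0]
scores = [0.8, 0.4, 0.6, 0.2]
print(accuracy(labels, scores), auc(labels, scores))
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; on a dataset the size of the one described here, a sort-based implementation would be the practical choice.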
Fig. 1 Area under the receiver operating characteristic curve (AUC): Comparison of prediction performance. LR = logistic regression, BDT = boosted decision tree, FNN = feedforward neural network
Probabilities of cost increase and decrease for patient groups, conditioned on drug groups and hospitalisation
| | | All | | Hospitalisation | | No Hospitalisation | |
|---|---|---|---|---|---|---|---|
| | | N | P (Increase), % | N | P (Increase), % | N | P (Increase), % |
| | Total study population (irrespective of drug groups) | 373′264 | 51.9 | 66′427 | 23.1 | 306′837 | 58.1 |
| ATC | Name | N | P (Increase), % | N | P (Increase), % | N | P (Increase), % |
| N06DA | Anticholinesterases | 2′214 | 61.1 | 587 | 43.4 | 1′627 | 67.4 |
| N04BA | Dopa and dopa derivatives | 3′722 | 56.1 | 1′158 | 35.1 | 2′564 | 65.6 |
| B01AA | Vitamin K antagonists | 18′893 | 51.6 | 6′087 | 27.0 | 12′806 | 63.2 |
| B03BB | Folic acid and derivatives | 10′467 | 52.6 | 3′177 | 28.4 | 7′290 | 63.1 |
| C01BD | Antiarrhythmics, class III | 4′622 | 47.3 | 1′890 | 25.7 | 2′732 | 62.3 |
| C03CA | Sulfonamides, plain | 29′741 | 50.3 | 10′793 | 29.5 | 18′948 | 62.1 |
| B03AD | Iron in combination with folic acid | 4′483 | 41.6 | 1′919 | 14.8 | 2′564 | 61.7 |
| C03BA | Sulfonamides, plain | 3′256 | 52.7 | 961 | 31.2 | 2′295 | 61.7 |
| G04CB | Testosterone-5α-reductase inhibitors | 2′238 | 53.2 | 552 | 27.5 | 1′686 | 61.6 |
| A10BB | Sulfonylureas | 11′418 | 54.7 | 2′298 | 27.8 | 9′120 | 61.4 |
| ATC | Name | N | P (Decrease), % | N | P (Decrease), % | N | P (Decrease), % |
| S01FA | Anticholinergics | 2′959 | 72.4 | 626 | 78.9 | 2′333 | 70.7 |
| A03BA | Belladonna alkaloids, tertiary amines | 7′851 | 73.4 | 1′964 | 82.1 | 5′887 | 70.5 |
| N01AH | Opioid anaesthetics | 9′715 | 72.3 | 2′235 | 81.9 | 7′480 | 69.5 |
| S01HA | Local anaesthetics | 7′783 | 71.5 | 1′598 | 79.3 | 6′185 | 69.5 |
| A04AA | Serotonin (5HT3) antagonists | 3′771 | 70.9 | 1′637 | 77.3 | 2′134 | 66.0 |
| N01AX | Other general anaesthetics | 29′106 | 68.2 | 7′299 | 80.3 | 21′807 | 64.1 |
| S01EC | Carbonic anhydrase inhibitors | 9′510 | 66.7 | 2′120 | 79.7 | 7′390 | 63.0 |
| S01BC | Antiinflammatory agents, non-steroids | 13′218 | 65.3 | 2′935 | 77.9 | 10′283 | 61.7 |
| C01CA | Adrenergic and dopaminergic agents | 13′690 | 65.3 | 3′085 | 79.7 | 10′605 | 61.2 |
| A03BB | Belladonna alkaloids, semisynthetic | 7′718 | 66.1 | 2′089 | 80.7 | 5′629 | 60.8 |
| ATC | Name | N | P (Decrease), % | N | P (Decrease), % | N | P (Decrease), % |
| B05BB | Solutions affecting the electrolyte balance | 66′792 | 64.0 | 20′392 | 77.9 | 46′400 | 58.0 |
| V08AB | Low osmolar X-ray contrast media* | 29′654 | 63.7 | 10′870 | 78.1 | 18′784 | 55.3 |
| S01CA | Corticosteroids and antiinfectives in combination | 28′050 | 59.7 | 5′774 | 78.0 | 22′276 | 55.0 |
| N01BB | Amides | 57′144 | 59.6 | 14′887 | 77.0 | 42′257 | 53.5 |
Bold data are considered table section headings. N = Number of patients per group (characterised by drug group and hospitalisation/no hospitalisation). P (Increase), P (Decrease) = Probability of cost increase or cost decrease in the respective group, displayed in %. The probability of increase in costs conditioned on hospitalisation was computed for all drug classes
Selection criteria: (1) prescribed to ≥1500 patients who did not have a hospitalisation, arranged by descending probability of increase or decrease (top 10 shown); (2) prescribed to more than 10′000 patients who did not have a hospitalisation, arranged by descending probability of increase or decrease (the top 4 not already included under (1) are shown)
ATC = “Anatomical Therapeutic Chemical” classification code, *Water-soluble, nephrotropic, low osmolar X-ray contrast media
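The conditional probabilities in this table are group-wise relative frequencies. The Python sketch below shows one way to compute N and P (Increase) for a drug group, optionally conditioned on hospitalisation. The record layout (`atc`, `hospitalised`, `increase`) and the toy patients are assumptions made for illustration. As a plausibility check, it also pools the two hospitalisation strata of the total study population using the figures reported in the table.

```python
def p_increase(records, atc=None, hospitalised=None):
    """Return (N, P(Increase) in %) for the selected patient group.

    atc: ATC code the patient must have received (None = any drug group).
    hospitalised: True/False to condition on hospitalisation (None = all).
    """
    group = [
        r for r in records
        if (atc is None or atc in r["atc"])
        and (hospitalised is None or r["hospitalised"] == hospitalised)
    ]
    n = len(group)
    if n == 0:
        return 0, None
    return n, 100.0 * sum(r["increase"] for r in group) / n


def pooled_percentage(strata):
    """Size-weighted average of percentages over (N, percentage) strata."""
    total = sum(n for n, _ in strata)
    return sum(n * pct for n, pct in strata) / total


# Toy (hypothetical) patient records.
toy_records = [
    {"atc": {"N06DA"}, "hospitalised": False, "increase": True},
    {"atc": {"N06DA", "B01AA"}, "hospitalised": False, "increase": True},
    {"atc": {"N06DA"}, "hospitalised": True, "increase": False},
    {"atc": {"B01AA"}, "hospitalised": False, "increase": False},
]
print(p_increase(toy_records, atc="N06DA", hospitalised=False))

# Consistency check with the reported totals: hospitalised (66'427, 23.1%)
# and non-hospitalised (306'837, 58.1%) patients pool to roughly 51.9%.
print(round(pooled_percentage([(66427, 23.1), (306837, 58.1)]), 1))
```

The pooled value agrees with the 51.9% reported for the total study population, which illustrates how the low probability of increase among hospitalised patients is offset by the much larger non-hospitalised group.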
Weight analysis: Contribution of drug classes to the prediction
| ATC | Name | Acc, % | N | | |
|---|---|---|---|---|---|
| N06DA | Anticholinesterases | 77.3 | 88 | ||
| A12CC | Magnesium | 71.4 | 56 | ||
| B03BB | Folic acid and derivatives | 69.1 | 408 | ||
| A10AE | Insulins and analogues for injection, long-acting | 68.1 | 91 | ||
| B01AA | Vitamin K antagonists | 67.3 | 1065 | ||
| A03FA | Propulsives | 65.5 | 58 | ||
| | | Without Hospitalisation | | With Hospitalisation | |
| ATC | Name | Acc, % | N | Acc, % | N |
| A12CC | Magnesium | 78.6 | 28 | 78.8 | 33 |
| B01AC | Platelet aggregation inhibitors excluding heparin | 76.9 | 26 | 82.9 | 345 |
| C01BD | Antiarrhythmics, class III | 73.3 | 30 | 81.0 | 58 |
| N06CA | Antidepressants in combination with psycholeptics | 69.5 | 59 | 75.0 | 44 |
| B01AA | Vitamin K antagonists | 68.8 | 173 | 74.5 | 471 |
| A03FA | Propulsives | 65.8 | 146 | 84.6 | 143 |
All numbers were calculated on the test dataset. Drug groups contributing at least 5% to the overall positive or negative score (complete model without costs). Additionally, the drug classes must have contributed for at least N = 40 patients (increase) or N = 20 patients (decrease without hospitalisation). The top 6 drug classes are provided, arranged by descending order of accuracy
Acc = Accuracy
N = Number of patients for whom the drug class contributed at least 5% to the overall positive or negative score
ATC = “Anatomical Therapeutic Chemical” classification code
Examples of subgroups derived from the decision tree
| | Subgroup | N | PI | Gain |
|---|---|---|---|---|
| #1 | Patients younger than 35 years with at least 1 prescription for folic acid (B03BB), no more than two outpatient office visits in the second quarter of the year, and fewer than 12 drug prescriptions | 634 | 0.88 | 0.23 |
| #2 | Patients with at least 1 prescription for magnesium (A12CC), no hospitalisation (≤ 1 day), at least 5 outpatient office visits with a gynaecologist during the year, no more than 1 outpatient visit in the first quarter of the year overall, and at least 4 visits in the third quarter of the year | 265 | 0.86 | 0.16 |
| #3 | Patients with at least 1 prescription for iron (trivalent, oral preparations, B03AB), a deductible > 1000 Swiss francs for 2014, no change in this deductible for 2015, at least 6 prescribed drugs, and no more than 5 prescriptions* | 114 | 0.78 | 0.21 |
| #4 | Patients with at least 1 prescription for anticholinesterases for dementia (N06DA), no home care (≤ 1 day), no more than 1 prescription in February, and few prescriptions filled by pharmacies | 276 | 0.70 | 0.13 |
| | Subgroup | N | PD | Gain |
| #5 | Patients with at least 2 prescriptions for anticholinergics for ophthalmologic use (S01FA), no concomitant therapy with Vitamin K antagonists, no more than 11 prescriptions, and no more than 6 outpatient physician visits in the third quarter of the year | 303 | 0.89 | 0.45 |
| #6 | Patients with at least 4 prescriptions for platelet aggregation inhibitors (excluding heparin, B01AC), who were hospitalised (cardiac-related major disease category) and had frequent home care (mean interval < 3.3 days) | 3777 | 0.83 | 0.04 |
| #7 | Patients with at least 2 prescriptions for any other general anaesthetics (N01AX), with a mode of administration ‘intravenously’ and fewer than 2 prescriptions for sulfonamides (C03CA), and no hospitalisation in the first year | 4237 | 0.74 | 0.18 |
| #8 | Patients with at least 1 prescription for both a beta blocking agent (S01ED) and a corticosteroid and anti-infective (S01CA) for ophthalmologic use within one year, who had a maximum of 2 outpatient visits in December | 2229 | 0.67 | 0.08 |
N = size of subgroup; PI, PD = conditional probability of increase or decrease for the cuts, P(increase | cut); Gain = |difference P(increase | cut) − P(increase | cut without drug class)|; Maximal number of cuts = 5, *defined as different purchase dates
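The Gain defined in the footnote compares the probability of increase within a subgroup to the same probability when the drug-class cut is left out. The Python sketch below implements one reading of that definition: "cut without drug class" is taken to mean the subgroup selected by all remaining cuts. That interpretation, the predicate-based representation of cuts, and the toy patients are assumptions, not the authors' code.

```python
def probability_of_increase(patients):
    """P(increase) as a relative frequency within a group of patients."""
    if not patients:
        raise ValueError("empty subgroup")
    return sum(p["increase"] for p in patients) / len(patients)


def gain(patients, other_cuts, drug_cut):
    """Gain = |P(increase | cut) - P(increase | cut without drug class)|.

    other_cuts: predicates for all cuts except the drug-class cut.
    drug_cut: predicate for the drug-class cut.
    """
    without_drug_cut = [p for p in patients if all(cut(p) for cut in other_cuts)]
    full_cut = [p for p in without_drug_cut if drug_cut(p)]
    return abs(probability_of_increase(full_cut)
               - probability_of_increase(without_drug_cut))


# Toy (hypothetical) patients; field names are illustrative only.
toy = [
    {"age": 30, "folic_acid": True, "increase": True},
    {"age": 28, "folic_acid": True, "increase": True},
    {"age": 33, "folic_acid": False, "increase": True},
    {"age": 25, "folic_acid": False, "increase": False},
    {"age": 60, "folic_acid": True, "increase": False},
]
younger_than_35 = lambda p: p["age"] < 35
has_folic_acid = lambda p: p["folic_acid"]
print(gain(toy, [younger_than_35], has_folic_acid))
```

In the toy data, the probability of increase is 0.75 among patients younger than 35 and 1.0 once the folic acid cut is added, so the drug-class cut accounts for a gain of 0.25.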