Minlei Liao, Yunfeng Li, Farid Kianifard, Engels Obi, Stephen Arcona.
Abstract
BACKGROUND: Cluster analysis (CA) is a frequently used applied statistical technique that helps to reveal hidden structures and "clusters" found in large data sets. However, this method has not been widely used in large healthcare claims databases where the distribution of expenditure data is commonly severely skewed. The purpose of this study was to identify cost change patterns of patients with end-stage renal disease (ESRD) who initiated hemodialysis (HD) by applying different clustering methods.
Year: 2016 PMID: 26936756 PMCID: PMC4776444 DOI: 10.1186/s12882-016-0238-2
Source DB: PubMed Journal: BMC Nephrol ISSN: 1471-2369 Impact factor: 2.388
Common agglomerative algorithms for forming clusters
| Algorithm | Description |
|---|---|
| Average-Linkage | • The distance between 2 clusters is defined as the average distance between all pairs of the 2 clusters’ members |
| Centroid Method | • Cluster centroids are defined as the mean values of the observations on the variables of the cluster |
| Single-Linkage | • Also known as the “nearest-neighbor” method |
| Complete-Linkage | • Also known as the “farthest-neighbor” method |
| Flexible-Beta | • Uses a weighted average distance between pairs of |
| McQuitty’s Similarity | • Assumes that each entity is a separate cluster |
| Ward’s Method | • The similarity between two clusters is the sum of squares within the clusters summed over all variables |
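The linkage definitions in the table above can be illustrated with a minimal sketch. It assumes Euclidean distance and represents each patient as a hypothetical (pre-HD cost, post-HD cost) pair; the function names and the toy clusters `low` and `high` are illustrative assumptions, not taken from the study.

```python
from itertools import product
from math import dist  # Euclidean distance (Python 3.8+)


def single_linkage(a, b):
    """Nearest-neighbor: smallest distance over all cross-cluster pairs."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Farthest-neighbor: largest distance over all cross-cluster pairs."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Average distance over all cross-cluster pairs."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid(cluster):
    """Mean of the observations on each variable of the cluster."""
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def centroid_distance(a, b):
    """Distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))


# Hypothetical toy clusters (arbitrary units), not study data.
low = [(0.0, 0.0), (0.0, 2.0)]
high = [(4.0, 0.0), (4.0, 2.0)]

for name, fn in [("single", single_linkage), ("complete", complete_linkage),
                 ("average", average_linkage), ("centroid", centroid_distance)]:
    print(f"{name:>8}: {fn(low, high):.3f}")
```

For any pair of clusters the single-linkage distance is never larger than the average-linkage distance, which in turn is never larger than the complete-linkage distance, so the choice of linkage changes which clusters are merged first.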
Strengths and weaknesses of hierarchical and K-means CA methods
| Method | Advantages | Disadvantages |
|---|---|---|
| Hierarchical CA | • Offers a simple yet comprehensive portrayal of clustering solutions | • Susceptible to impact of outliers in the data |
| K-means CA | • Results less susceptible to outliers in the data, influence of chosen distance measure, or the inclusion of inappropriate or irrelevant variables | • Different solutions for each set of seed points and no guarantee of optimal clustering of observations |
CA, cluster analysis
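The seed-point sensitivity listed as a K-means disadvantage can be shown with a small sketch. It assumes a plain Lloyd-style K-means with Euclidean distance; the function `kmeans`, the four toy points, and both seed choices are hypothetical and are not the procedure or data used in the study.

```python
from math import dist


def kmeans(points, seeds, max_iter=100):
    """Plain K-means: alternate assignment and centroid updates until stable."""
    centers = [tuple(s) for s in seeds]

    def assign(cs):
        # Label each point with the index of its nearest center.
        return [min(range(len(cs)), key=lambda k: dist(p, cs[k])) for p in points]

    for _ in range(max_iter):
        labels = assign(centers)
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[k])  # leave an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers

    labels = assign(centers)
    # Within-cluster sum of squared distances for the final solution.
    wss = sum(dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centers, wss


# Hypothetical points at the corners of a wide rectangle.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

labels_a, _, wss_a = kmeans(points, seeds=[(0.0, 0.0), (4.0, 0.0)])
labels_b, _, wss_b = kmeans(points, seeds=[(0.0, 0.0), (0.0, 1.0)])
print(labels_a, wss_a)  # left/right split
print(labels_b, wss_b)  # bottom/top split with a much larger within-cluster sum of squares
```

Both runs converge, but to different partitions, which is why K-means is commonly rerun from several seed sets and the solutions compared.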
Fig. 1 Patient selection diagram. Abbreviations: ESRD, end-stage renal disease; HD, hemodialysis
All-cause medical costs in the 12-month baseline and follow-up periods
| Variables | Mean (SD) | Min | Median | 75th Percentile | 90th Percentile | 95th Percentile | 99th Percentile | Max |
|---|---|---|---|---|---|---|---|---|
| All-cause medical costs (pre-HD period) | $45,145 ($109,596) | 0 | $16,905 | $42,758 | $102,722 | $178,250 | $461,317 | $4,771,412 |
| All-cause medical costs (post-HD period) | $48,713 ($108,506) | 0 | $16,330 | $47,995 | $123,513 | $194,050 | $495,240 | $2,664,338 |
SD standard deviation, Min minimum, Max maximum
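The table above shows the typical signature of right-skewed cost data: the mean sits well above the median and the SD exceeds the mean. A minimal sketch of how a single extreme value produces that pattern follows; the `costs` list is entirely hypothetical and is not drawn from the study.

```python
from statistics import mean, median, stdev

# Hypothetical annual costs in dollars; NOT data from the study.
costs = [0, 5_000, 10_000, 15_000, 20_000, 1_000_000]

avg = mean(costs)    # pulled upward by the single extreme value
mid = median(costs)  # unaffected by how extreme the largest value is
sd = stdev(costs)    # sample standard deviation, inflated by the outlier

print(f"mean={avg:,.0f}  median={mid:,.0f}  sd={sd:,.0f}")
```

Because distance-based clustering works on the raw values, a few such extreme observations can dominate the distances between patients.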
Summary of results from clustering analysis methods applied
| Clustering Approach | Linkage Type | Number of Clusters (a) | Cluster Sample Size (Smallest in Bold) |
|---|---|---|---|
| Hierarchical | Average | 3 | 18,376; 3; |
| Hierarchical | Average | 4 | 18,376; 2; |
| Hierarchical | Average | 5 | 18,312; 64; 2; |
| Hierarchical | Centroid | 3 | 18,365; 14; |
| Hierarchical | Centroid | 4 | 18,351; 14; 14; |
| Hierarchical | Centroid | 5 | 18,351; 13; 14; |
| Hierarchical | Single-Linkage | 3 | 18,378; |
| Hierarchical | Single-Linkage | 4 | 18,377; |
| Hierarchical | Single-Linkage | 5 | 18,376; |
| Hierarchical | Complete-Linkage | 3 | 18,367; 7; |
| Hierarchical | Complete-Linkage | 4 | 18,118; 249; 7; |
| Hierarchical | Complete-Linkage | 5 | 18,118; 249; 6; 6; |
| Hierarchical | Flexible-Beta | 3 | 13,416; 3,732; |
| Hierarchical | Flexible-Beta | 4 | 13,416; 3,732; 1,059; |
| Hierarchical | Flexible-Beta | 5 | 8,919; 4,497; 3,732; 1,059; |
| Hierarchical | McQuitty’s Similarity | 3 | 18,373; 6; |
| Hierarchical | McQuitty’s Similarity | 4 | 18,367; 6; 6; |
| Hierarchical | McQuitty’s Similarity | 5 | 18,205; 162; 6; 6; |
| Hierarchical | Ward’s Method | 3 | 15,718; 2,315; |
| Hierarchical | Ward’s Method | 4 | 15,718; 2,315; 284; |
| Hierarchical | Ward’s Method | 5 | 15,718; 2,315; 239; 63; |
| Non-hierarchical | N/A | 3 | 336; 17,909; |
| Non-hierarchical | N/A | 4 | 113; 16,624; 1,554; |
| Non-hierarchical | N/A | 5 | 116; 594; 16,162; |
N/A, not applicable. (a) Number of clusters in the model
Fig. 2 Scatter plot by cluster of all-cause medical costs in pre- and post-HD periods by K-means CA with the four-cluster solution (a). Footnote: (a) Pseudo F Statistic = 13,979.98; Approximate Expected Over-All R-Squared = 0.79; Cubic Clustering Criterion = −63.93. Each cluster is labeled by its corresponding number
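The pseudo F statistic quoted in the Fig. 2 footnote is, in its usual (Calinski-Harabasz) form, the ratio of between-cluster to within-cluster variation, each divided by its degrees of freedom: F = [BSS / (k − 1)] / [WSS / (n − k)]. The sketch below computes it under that assumption for hypothetical points and labels; it does not reproduce the study's value, and it does not compute the expected R-squared or the cubic clustering criterion.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Mean of each coordinate across the points."""
    n = len(points)
    return tuple(sum(values) / n for values in zip(*points))


def pseudo_f(points, labels):
    """Pseudo F = [BSS / (k - 1)] / [WSS / (n - k)]; needs k > 1 and WSS > 0."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    n, k = len(points), len(groups)
    grand = centroid(points)
    wss = 0.0  # within-cluster sum of squares
    bss = 0.0  # between-cluster sum of squares
    for members in groups.values():
        c = centroid(members)
        wss += sum(sq_dist(p, c) for p in members)
        bss += len(members) * sq_dist(c, grand)
    return (bss / (k - 1)) / (wss / (n - k))


# Hypothetical points and two candidate partitions, not study data.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
tight = pseudo_f(points, [0, 0, 1, 1])  # compact left/right clusters
loose = pseudo_f(points, [0, 1, 0, 1])  # stretched bottom/top clusters
print(tight, loose)
```

Larger values indicate tighter, better-separated clusters, so the statistic is typically compared across candidate numbers of clusters rather than read in isolation.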
Fig. 3 All-cause medical costs in pre- and post-HD periods by cluster
Demographic and clinical characteristics of patients grouped into 4 proposed clusters using K-means CA
| Characteristic | Cluster 1: Average to High (n = 113) | Cluster 2: Very High to High (n = 89) | Cluster 3: Average to Average (n = 16,624) | Cluster 4: Increasing Costs, High at Both Points (n = 1,554) |
|---|---|---|---|---|
| Age (y), mean (SD) | 57.6 (11.6) | 55.5 (14.8) | 63.9 (14.0) | 56.2 (12.8) |
| Age (y), n (%) | | | | |
| 18-24 | 0 (0.0) | 4 (4.5) | 121 (0.7) | 33 (2.1) |
| 25-34 | 2 (1.8) | 7 (7.9) | 355 (2.1) | 54 (3.5) |
| 35-44 | 15 (13.3) | 6 (6.7) | 1026 (6.2) | 156 (10.0) |
| 45-54 | 24 (21.2) | 19 (21.3) | 2401 (14.4) | 375 (24.1) |
| 55-64 | 50 (44.2) | 33 (37.1) | 4652 (28.0) | 609 (39.2) |
| 65+ | 22 (19.5) | 20 (22.5) | 8069 (48.5) | 327 (21.0) |
| Sex, n (%) | | | | |
| Male | 66 (58.4) | 48 (53.9) | 9599 (57.7) | 924 (59.5) |
| Female | 47 (41.6) | 41 (46.1) | 7025 (42.3) | 630 (40.5) |
| Region in the United States, n (%) | | | | |
| Northeast | 12 (10.6) | 12 (13.5) | 1843 (11.1) | 192 (12.4) |
| North central | 32 (28.3) | 18 (20.2) | 6084 (36.6) | 444 (28.6) |
| South | 38 (33.6) | 39 (43.8) | 6354 (38.2) | 625 (40.2) |
| West | 30 (26.5) | 19 (21.3) | 2235 (13.4) | 286 (18.4) |
| Unknown | 1 (0.9) | 1 (1.1) | 108 (0.6) | 7 (0.5) |
| Health insurance type, n (%) | | | | |
| FFS | 87 (77.0) | 71 (79.8) | 13,967 (84.0) | 1237 (79.6) |
| HMO and POS capitation | 20 (17.7) | 17 (19.1) | 2304 (13.9) | 270 (17.4) |
| Missing | 6 (5.3) | 1 (1.1) | 353 (2.1) | 47 (3.0) |
| Comorbidity score indices (a), pre-HD period / post-HD period | | | | |
| ECI, mean (SD) | 6.9 (3.4) / 10.8 (3.5) | 9.0 (4.1) / 9.3 (3.7) | 5.7 (2.5) / 6.8 (2.8) | 6.5 (3.0) / 9.5 (3.2) |
| CCI, mean (SD) | 5.0 (2.7) / 7.1 (2.4) | 5.6 (3.2) / 6.2 (2.9) | 4.6 (2.2) / 5.1 (2.3) | 5.0 (2.5) / 6.5 (2.7) |
FFS, fee-for-service; HD, hemodialysis; HMO, health maintenance organization; POS, point-of-service; ECI, Elixhauser comorbidity index; CCI, Charlson comorbidity index; SD, standard deviation. (a) Identification was based on non-rule-out diagnoses
Medical codes indicating HD
| Code | Description | Source |
|---|---|---|
| 458.21 | Hypotension from HD | ICD-9-CM diagnosis |
| V56.31 | Adequacy testing for HD | ICD-9-CM diagnosis |
| 39.95 | HD | ICD-9-CM procedure |
| A4680 | Activated carbon filter for HD (each) | HCPCS |
| A4690 | Dialyzer (artificial kidneys); all types and all sizes for HD | HCPCS |
| A4706 | Bicarbonate concentrate solution per gallon for HD | HCPCS |
| A4707 | Bicarbonate concentrate powder per packet for HD | HCPCS |
| A4708 | Acetate concentrate solution per gallon for HD | HCPCS |
| A4709 | Acid concentrate solution per gallon for HD | HCPCS |
| A4730 | Fistula cannulation set for HD | HCPCS |
| A4740 | Shunt accessory for HD (any type, each) | HCPCS |
| A4750 | Blood tubing, arterial or venous for HD (each) | HCPCS |
| A4755 | Blood tubing, arterial and venous combined for HD (each) | HCPCS |
| A4802 | Protamine sulphate per 50 mg for HD | HCPCS |
| A4870 | Plumbing and/or electrical work for home HD equipment | HCPCS |
| A4890 | Contracts, repair, and maintenance for HD equipment | HCPCS |
| A4918 | Venous pressure clamp for HD (each) | HCPCS |
| E1520 | Heparin infusion pump for HD | HCPCS |
| E1530 | Air bubble detector for HD (each, replacement) | HCPCS |
| E1540 | Pressure alarm for HD (each, replacement) | HCPCS |
| E1550 | Bath conductivity meter for HD (each) | HCPCS |
| E1560 | Blood leak detector for HD (each, replacement) | HCPCS |
| E1575 | Transducer protectors/fluid barriers for HD | HCPCS |
| E1580 | Unipuncture control system for HD | HCPCS |
| E1590 | HD machine | HCPCS |
| E1600 | Delivery and/or installation charges for HD equipment | HCPCS |
| E1610 | Reverse osmosis water purification system for HD | HCPCS |
| E1615 | Deionizer water purification system for HD | HCPCS |
| E1620 | Blood pump replacement for HD | HCPCS |
| E1625 | Water-softening system for HD | HCPCS |
| E1636 | Sorbent cartridges for HD | HCPCS |
| G0365 | Vessel mapping of vessels for HD access | HCPCS |
| G0392 | Transluminal balloon angioplasty, percutaneous, for maintenance of hemodialysis access, arteriovenous fistula or graft, arterial | HCPCS |
| G0393 | Transluminal balloon angioplasty, percutaneous for maintenance of HD access, arteriovenous fistula or graft, venous | HCPCS |
| G8081 | ESRD patient requiring HD vascular access documented to have received autogenous AV fistula | HCPCS |
| G8082 | ESRD patient requiring HD vascular access documented to have received vascular access other than autogenous AV fistula | HCPCS |
| G8085 | ESRD patient requiring hemodialysis vascular access was not candidate for autogenous arteriovenous fistula | HCPCS |
| S9335 | Home therapy for HD | HCPCS |
| 90935 | HD procedure with single evaluation by a physician or other qualified health care professional | CPT |
| 90937 | HD procedure requiring repeated evaluation(s) with or without substantial revision of dialysis prescription | CPT |
| 90940 | HD access flow study to determine blood flow in grafts and arteriovenous fistulae by an indicator method | CPT |
| 93990 | Duplex scan of HD access | CPT |
| 36800 | Insertion of cannula for other purpose for HD (separate procedure); vein to vein | CPT |
| 36810 | Insertion of cannula for other purpose for HD (separate procedure); arteriovenous, external | CPT |
| 36815 | Insertion of cannula for other purpose for HD (separate procedure); arteriovenous, external revision, or closure | CPT |
HD hemodialysis, ESRD end-stage renal disease, HCPCS Healthcare Common Procedure Coding System, CPT Current Procedural Terminology, ICD-9-CM International Classification of Diseases, 9th Revision, Clinical Modification
Top 10 CCS disease categories in the pre- and post-HD periods
| Cluster and Descriptive Costs (n) | CCS Disease Categories in the Pre-HD Period (%) | CCS Disease Categories in the Post-HD Period (%) |
|---|---|---|
| Cluster 1: Average to High (n = 113) | 1. Acute and unspecified ESRD (82 %) | 1. CKD (99 %) |
| Cluster 2: Very High to High (n = 89) | 1. Acute and unspecified ESRD (87 %) | 1. CKD (97 %) |
| Cluster 3: Average to Average (n = 16,624) | 1. CKD (92 %) | 1. CKD (100 %) |
| Cluster 4: Increasing Costs, High at Both Points (n = 1,554) | 1. CKD (85 %) | 1. CKD (92 %) |
CCS Clinical Classifications Software, ESRD end-stage renal disease, CKD chronic kidney disease