Amarawan Pentrakan, Cheng-Chia Yang, Wing-Keung Wong.
Abstract
The lack of an efficient approach to managing pharmaceutical prices in the procurement system has placed a substantial burden on government budgets. In Thailand, although a reference price policy was implemented to contain drug expenditure, challenges remain with the price dispersion of medicines and the transparency of pricing information. This situation calls for an algorithm that can estimate appropriate prices for medical products. To this end, we developed a model based on the sequential minimal optimization (SMO) algorithm to predict the price range of each medicine, using the Waikato Environment for Knowledge Analysis (WEKA) software, and applied feature selection techniques to examine whether they improve predictive accuracy. We used a dataset of 2424 records listed on the procurement system in Thailand from January to March 2019 and validated the model with 10-fold cross-validation. The results showed that the model derived by the SMO algorithm with the gain ratio selection method performed well, with an accuracy of approximately 92.62% and high sensitivity and precision. Additionally, we found that the model can distinguish differences in medicine prices in the pharmaceutical market using the eight major features that provided the highest predictive accuracy: the segmented buyers, the generic product groups, trade product names, procurement methods, dosage forms, pack sizes, manufacturers, and total purchase budgets. Our findings are useful to health policymakers, who could employ the proposed model to monitor medicine prices and to give hospital purchasing managers direct feedback suggesting the best possible price based on the feature inputs in their procurement system.
Keywords: feature selection; medicine price; prediction model; sequential minimal optimization
Year: 2021 PMID: 34063965 PMCID: PMC8196718 DOI: 10.3390/ijerph18115523
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
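The abstract reports validating the model with 10-fold cross-validation on 2424 records. As a rough illustration of how such folds can be formed, the sketch below builds plain shuffled folds. It is not the authors' WEKA procedure: the folds here are unstratified, and the function name `k_fold_indices` and the seed are assumptions made for this example.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train, test) index lists for k roughly equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test

folds = list(k_fold_indices(2424, k=10))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)
```

Each record lands in exactly one test fold, so every record is used once for testing and nine times for training.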
The features and the definitions of the dataset in our study.
| Features | Descriptions |
|---|---|
| DEPT | Departments that purchase medicines for hospitals |
| GPU | The generic product use name in the database, comprising the virtual therapeutic moiety and strength (e.g., Omeprazole 40 mg) |
| TPU | The name of trade product use or brand (e.g., Losec®, Omezole®) |
| METHOD | Procurement method (e.g., bidding method, specific selection method) |
| WINNER | Supplier who sells the medical product |
| UNIT | The dosage form of the drug product (e.g., powder for solution for injection) |
| SIZE | The number of units per pack (e.g., 14 or 28 tablets per box) |
| TOTAL | Purchase budget for each medical product (Thai Baht) |
| PRICE a | Procurement price per unit (Thai Baht) |
a Output variable corresponding to the range of prices for pharmaceutical products.
Figure 1. Pre-processing stage and discretization in WEKA with eight class labels.
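The eight class labels correspond to the price intervals listed in the performance evaluation table, whose cut points are evenly spaced (7.82 Thai Baht apart). A minimal sketch of mapping a unit price to its class, assuming right-closed intervals as written in that table, could look as follows; the function name `price_class` is an assumption, and this is not the WEKA discretization filter itself.

```python
from bisect import bisect_left

# Upper cut points of classes 1-7, copied from the performance table (Thai Baht).
CUT_POINTS = [8.26, 16.08, 23.9, 31.72, 39.54, 47.36, 55.18]

def price_class(price):
    """Return the 1-based class label for a unit price, using (low, high] bins."""
    # bisect_left keeps a price equal to a cut point in the lower class.
    return bisect_left(CUT_POINTS, price) + 1

print(price_class(8.26), price_class(17.1), price_class(62.1))  # 1 3 8
```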
Figure 2. Summarized conceptual framework and research procedure.
Figure 3. Sequential minimal optimization classifier in WEKA Software.
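To give a feel for what an SMO classifier does, the sketch below trains a binary linear support vector machine by repeatedly optimizing two Lagrange multipliers at a time. It is a simplified textbook variant written for this illustration, not WEKA's implementation: it handles only two classes, uses a linear kernel, picks the second multiplier with a simple largest-error-gap rule, and runs on hypothetical toy points. All names (`train_smo`, `predict`) and parameter values are assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_smo(X, y, C=1.0, tol=1e-3, max_sweeps=1000):
    """Simplified SMO for a binary linear SVM; labels must be -1 or +1."""
    n = len(X)
    K = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]  # linear kernel
    alpha, b = [0.0] * n, 0.0

    def error(i):
        # Decision value minus the true label for training point i.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b - y[i]

    for _ in range(max_sweeps):
        changed = 0
        for i in range(n):
            Ei = error(i)
            # Skip points that already satisfy the KKT conditions within tol.
            if not ((y[i] * Ei < -tol and alpha[i] < C) or (y[i] * Ei > tol and alpha[i] > 0)):
                continue
            # Second multiplier: the point whose error differs most from Ei.
            j = max((k for k in range(n) if k != i), key=lambda k: abs(Ei - error(k)))
            Ej = error(j)
            ai_old, aj_old = alpha[i], alpha[j]
            # Box bounds that keep 0 <= alpha <= C and sum(alpha * y) == 0.
            if y[i] != y[j]:
                low, high = max(0.0, aj_old - ai_old), min(C, C + aj_old - ai_old)
            else:
                low, high = max(0.0, ai_old + aj_old - C), min(C, ai_old + aj_old)
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if low == high or eta >= 0:
                continue
            # Analytic optimum for alpha[j] along the constraint line, clipped to the box.
            aj_new = min(high, max(low, aj_old - y[j] * (Ei - Ej) / eta))
            if abs(aj_new - aj_old) < 1e-8:
                continue
            ai_new = ai_old + y[i] * y[j] * (aj_old - aj_new)
            b1 = b - Ei - y[i] * (ai_new - ai_old) * K[i][i] - y[j] * (aj_new - aj_old) * K[i][j]
            b2 = b - Ej - y[i] * (ai_new - ai_old) * K[i][j] - y[j] * (aj_new - aj_old) * K[j][j]
            alpha[i], alpha[j] = ai_new, aj_new
            b = b1 if 0 < ai_new < C else b2 if 0 < aj_new < C else (b1 + b2) / 2
            changed += 1
        if changed == 0:  # a full sweep without updates: stop
            break
    return alpha, b

def predict(X, y, alpha, b, x):
    score = sum(a * yi * dot(xi, x) for a, yi, xi in zip(alpha, y, X)) + b
    return 1 if score >= 0 else -1

# Hypothetical, linearly separable toy points (not from the procurement dataset).
X = [[1, 1], [2, 1], [1, 2], [5, 5], [6, 5], [5, 6]]
y = [-1, -1, -1, 1, 1, 1]
alpha, b = train_smo(X, y)
print(predict(X, y, alpha, b, [0, 0]), predict(X, y, alpha, b, [8, 8]))
```

Updating only two multipliers per step lets each sub-problem be solved in closed form, which is the idea behind the algorithm's name.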
Figure 4. Example of distribution in drug prices across different brands of the omeprazole (40 mg) powder for solution for injection drug (ATC: A02BC01).
Selected features ordered by most relevant for different feature selection methods and corresponding accuracy.
| Selection Methods | Selected Features | % Accuracy Measure |
|---|---|---|
| CFS | (1) GPU, (2) UNIT, (3) DEPT | 84.15% |
| Wrapper Subset Evaluator | (1) GPU, (2) UNIT, (3) DEPT, (4) TOTAL, (5) SIZE | 88.57% |
| Information Gain | (1) GPU, (2) UNIT, (3) DEPT, (4) TOTAL, (5) SIZE, (6) TPU, (7) WINNER | 89.21% |
| Gain Ratio | (1) GPU, (2) UNIT, (3) DEPT, (4) TOTAL, (5) SIZE, (6) TPU, (7) WINNER, (8) METHOD | 92.62% |
Note. CFS = Correlation-Based Feature Subset Selection, TPU = Trade Product Name, GPU = Generic Product Name, UNIT = Dosage Form, DEPT = The Segmented Buyers, WINNER = Manufacturer/Vendor, TOTAL = Total Purchase Budget, METHOD = Procurement Method, SIZE = The Number of Units Per Pack.
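Two of the selection methods in the table, information gain and gain ratio, score a nominal feature by how much it reduces the entropy of the class label; gain ratio additionally divides by the entropy of the feature's own values. The sketch below shows those standard definitions on hypothetical toy records. It is not WEKA's evaluator code, and the function names and sample values are assumptions.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(values):
    """Shannon entropy (bits) of a list of nominal values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())

def information_gain(feature, labels):
    """Class entropy minus the class entropy remaining after splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

# Hypothetical toy records: dosage form (UNIT), procurement METHOD, price class.
unit = ["injection", "injection", "tablet", "tablet"]
method = ["bidding", "selection", "bidding", "selection"]
price_label = ["high", "high", "low", "low"]

print(gain_ratio(unit, price_label), gain_ratio(method, price_label))
```

In this toy data the dosage form fully determines the price class while the procurement method carries no information, so the first score is the maximum and the second is zero.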
Figure 5. Examples of price prediction by using the model derived by the SMO algorithm with selection technique of the gain-ratio feature.
Performance evaluation results.
| Class Labels | n (%) | TP Rate | FP Rate | Precision | Recall | F-Measure | ROC Area |
|---|---|---|---|---|---|---|---|
| class 1 = (−inf–8.26] | 5 (0.2) | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| class 2 = (8.26–16.08] | 1238 (51.1) | 0.935 | 0.067 | 0.935 | 0.935 | 0.935 | 0.943 |
| class 3 = (16.08–23.9] | 802 (33.1) | 0.908 | 0.055 | 0.891 | 0.908 | 0.899 | 0.935 |
| class 4 = (23.9–31.72] | 127 (5.2) | 0.882 | 0.000 | 1.000 | 0.882 | 0.937 | 0.978 |
| class 5 = (31.72–39.54] | 28 (1.2) | 0.857 | 0.000 | 1.000 | 0.857 | 0.923 | 0.974 |
| class 6 = (39.54–47.36] | 98 (4.0) | 0.990 | 0.003 | 0.942 | 0.990 | 0.965 | 0.990 |
| class 7 = (47.36–55.18] | 48 (2.0) | 0.938 | 0.001 | 0.957 | 0.938 | 0.947 | 0.977 |
| class 8 = (55.18–inf) | 78 (3.2) | 0.974 | 0.001 | 0.974 | 0.974 | 0.974 | 0.987 |
| Weighted Average | | 0.926 | 0.053 | 0.927 | 0.926 | 0.926 | 0.947 |
Note. TP Rate = True Positive Rate, FP Rate = False Positive Rate, ROC Area = Receiver Operating Characteristic Area.
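The F-measure and weighted-average figures in the performance table can be checked from the per-class values. The sketch below assumes the usual definitions (F-measure as the harmonic mean of precision and recall, averages weighted by the number of instances per class) and uses the rounded values printed in the table, so small rounding differences are possible; the function names are assumptions.

```python
# (instances, precision, recall) per class, copied from the performance table.
CLASSES = [
    (5, 1.000, 1.000), (1238, 0.935, 0.935), (802, 0.891, 0.908), (127, 1.000, 0.882),
    (28, 1.000, 0.857), (98, 0.942, 0.990), (48, 0.957, 0.938), (78, 0.974, 0.974),
]

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def weighted_average(values, weights):
    """Average of values weighted by class size."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

counts = [n for n, _, _ in CLASSES]
f_scores = [f_measure(p, r) for _, p, r in CLASSES]
weighted_precision = weighted_average([p for _, p, _ in CLASSES], counts)
weighted_recall = weighted_average([r for _, _, r in CLASSES], counts)

print([round(f, 3) for f in f_scores])
print(round(weighted_precision, 3), round(weighted_recall, 3))  # 0.927 0.926
```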
Summary results.
| Parameters | Results |
|---|---|
| Correctly Classified Instances | 2245 (92.62%) |
| Incorrectly Classified Instances | 179 (7.38%) |
| Kappa statistic | 0.8813 |
| Mean absolute error | 0.1883 |
| Root mean squared error | 0.2925 |
| Total Number of Instances | 2424 |
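The accuracy in the summary follows directly from the instance counts, while the kappa statistic needs the full confusion matrix, which is not reproduced in this text. The sketch below therefore checks the accuracy with the reported counts and illustrates Cohen's kappa on a hypothetical two-class matrix; the function names and the toy matrix are assumptions.

```python
def accuracy(matrix):
    """Share of instances on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def cohen_kappa(matrix):
    """Kappa = (p_o - p_e) / (1 - p_e), with p_e the agreement expected by chance."""
    total = sum(sum(row) for row in matrix)
    p_o = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)

# Reported counts from the summary table: 2245 correct out of 2424.
reported_accuracy = round(100 * 2245 / 2424, 2)
print(reported_accuracy)  # 92.62

# Hypothetical 2-class matrix (rows = actual, columns = predicted).
toy = [[40, 10], [5, 45]]
print(accuracy(toy), cohen_kappa(toy))
```

Kappa discounts the agreement that would occur by chance given the class proportions, which is why it is lower than the raw accuracy.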
Figure 6. Normalized confusion matrix.
Descriptive statistics of medicine code A02BC01 (omeprazole 40 mg, parenteral) used in this study.
| Trade Names | Records (n) | Median | Mean | Minimum | Maximum | % CV |
|---|---|---|---|---|---|---|
| BRAND A | 278.0 | 16.1 | 17.1 | 12.4 | 29.0 | 16.9 |
| BRAND B | 117.0 | 56.0 | 55.1 | 43.6 | 62.1 | 10.0 |
| BRAND C | 4.0 | 14.7 | 15.0 | 13.4 | 17.1 | 10.9 |
| BRAND D | 56.0 | 17.1 | 18.4 | 12.4 | 35.0 | 35.6 |
| BRAND E | 8.0 | 15.2 | 18.2 | 13.7 | 29.0 | 35.1 |
| BRAND F | 24.0 | 42.4 | 42.9 | 38.5 | 50.0 | 7.4 |
| BRAND G | 255.0 | 15.3 | 16.6 | 12.8 | 26.8 | 17.4 |
| BRAND H | 19.0 | 16.2 | 17.0 | 13.4 | 23.0 | 18.7 |
| BRAND I | 15.0 | 62.1 | 61.1 | 47.1 | 62.1 | 6.3 |
| BRAND J | 719.0 | 18.0 | 18.8 | 13.2 | 31.0 | 22.5 |
| BRAND K | 195.0 | 13.2 | 14.4 | 12.4 | 30.0 | 21.6 |
| BRAND L | 68.0 | 15.3 | 15.9 | 12.5 | 23.0 | 18.7 |
| BRAND M | 12.0 | 15.5 | 18.7 | 12.5 | 26.0 | 29.5 |
| BRAND N | 10.0 | 15.0 | 15.4 | 13.9 | 20.7 | 13.0 |
| BRAND O | 22.0 | 46.9 | 46.7 | 41.0 | 52.0 | 5.2 |
| BRAND P | 50.0 | 15.5 | 17.8 | 12.4 | 29.0 | 28.0 |
| BRAND Q | 12.0 | 15.5 | 17.6 | 12.5 | 29.0 | 32.5 |
| BRAND R | 9.0 | 15.5 | 15.9 | 13.2 | 21.5 | 16.2 |
| BRAND S | 58.0 | 43.2 | 42.3 | 35.0 | 50.3 | 10.1 |
| BRAND T | 463.0 | 13.6 | 14.5 | 8.1 | 25.0 | 17.8 |
| BRAND U | 4.0 | 14.0 | 14.2 | 13.4 | 15.5 | 6.4 |
| BRAND V | 26.0 | 16.6 | 20.9 | 13.2 | 38.0 | 42.7 |
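The % CV column expresses price dispersion relative to the mean. A minimal sketch of that calculation is given below, assuming the conventional definition (standard deviation divided by the mean, times 100) and the sample standard deviation; the source does not state which standard deviation was used, and the prices shown are hypothetical rather than taken from the dataset.

```python
import statistics

def percent_cv(prices):
    """Coefficient of variation in percent: 100 * sample SD / mean."""
    return 100 * statistics.stdev(prices) / statistics.mean(prices)

# Hypothetical unit prices (Thai Baht) for one brand; not taken from the dataset.
prices = [10.0, 12.0, 14.0]
print(statistics.median(prices), statistics.mean(prices), round(percent_cv(prices), 1))  # 12.0 12.0 16.7
```

Because % CV is unit-free, it allows dispersion to be compared across brands whose mean prices differ widely.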