Colm Crowley, Steven Guitron, Joseph Son, Oleg S Pianykh.
Abstract
Limited resources and increased patient flow highlight the importance of optimizing healthcare operational systems to improve patient care. Accurate prediction of exam volumes, workflow surges and, most notably, patient delay and wait times is known to have a significant impact on quality of care and patient satisfaction. The main objective of this work was to investigate the choice of different operational features to achieve (1) more accurate and concise process models and (2) more effective interventions. To exclude process modelling bias, data from four different workflows were considered, including a mix of walk-in, scheduled, and hybrid facilities. A total of 84 features were computed, based on previous literature and our independent work, all derivable from a typical Hospital Information System. The features were categorized into five subgroups: congestion, customer, resource, task and time features. Two models were used in the feature selection process: linear regression and random forest. Independent of the workflow and of the model used for selection, congestion features yielded the models most predictive of operational processes, while requiring a smaller number of predictors.
Year: 2020 PMID: 32525888 PMCID: PMC7289434 DOI: 10.1371/journal.pone.0233810
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Feature set divided into 5 groups.
| Group | Name | Description |
|---|---|---|
| Congestion | LineCount0Strict | Number of patients in line with scheduled times after current time. |
| | LineCount0/1/2/3/4 | Number of patients in line when the patient arrives and 15, 30, 45 and 60 minutes before. |
| | FlowCount30/60 | Number of patients starting exams in the 30- and 60-minute window before patient arrived. |
| | ScheduledFlowCount30/60 | Number of patients scheduled in the 30- and 60-minute window before patient arrived. |
| | FutureFlowCount30/60 | Number of patients scheduled in the 30- and 60-minute window after patient arrived. |
| | AheadCount | Number of patients scheduled before current patient for the day. |
| | IsFirst | First scheduled patient for the day. |
| | IsLast | Last scheduled patient for the day. |
| | NoneInLine | No patients in line. |
| | SumWaits | Sum of the wait times for patients in line. |
| | NumCustomersLast30/60/120 | Number of customers who have arrived in the last 30, 60 and 120 minutes. |
| | NumScheduledNextSlot | Number of patients scheduled in next slot. |
| | NumScheduledNext60 | Number of people scheduled in next 60 minutes. |
| | AvgWaitForDay | Average delay/wait for patients for that day. |
| | NumCompletedInLast30/60/120 | Number of exams completed in last 30, 60 and 120 minutes. |
| | NumCompletedToday | Number of exams completed up to current time of day. |
| | DelayedInLine | The number of patients in line who are delayed. |
| | MinTime | Minimum wait time for the day. |
| | MaxTime | Maximum wait time for the day. |
| | DelayCount | Number of delayed exams up to current time of day. |
| | DelayCountLastHour | Number of delayed exams in last hour. |
| | AvgWaitLast30/60/120 | Average wait time in the last 30, 60 and 120 minutes. |
| | SumTimeToCompleteNextSlot | Expected time to completion of exams in next slot. |
| | SumTimeToCompleteNext60 | Expected time to completion of exams scheduled in next hour. |
| | InProgressSize | Number of exams in progress for facility. |
| | SumTimeToCompleteInProgress | The sum of the expected times to complete of the exams in progress. |
| | NoneCompleted | No exams completed that day. |
| | NoneInProgress | No exams in progress. |
| | SumInProgress | Sum of length of time exams have been in progress. |
| | MostRecent1/2/3/4/5 | Delay/wait time for most recent patient, 2nd, 3rd, 4th and 5th most recent patients. |
| | AvgWaitLast2/4/8Customers | Average wait for the last 2, 4 and 8 customers. |
| | Median5 | Median delay/wait time for 5 most recent customers. |
| | NumAddOnsToday | Number of people who have been added to the schedule for today. |
| | NumAddOnsLast60 | Number of people who have been added to the schedule in last 60 minutes. |
| | SumHowEarlyWaiting | Sum of how early the patients in line are for their appointment. |
| | AvgHowEarlyWaiting | Average of how early the patients in line are for their appointment. |
| | SumDelayWaitingInLine | Sum of delays/waits of patients in line. |
| | SumDelayInProgress | Sum of delays/waits of exams in progress. |
| Customer | AvgAgePeopleWaiting | Average age of the patients in line. |
| | OutpatientWaitingCount | Number of outpatients waiting in line. |
| Resource | NumScannersInUseToday | Number of scanners in facility that have been used on that day. |
| Task | WithContrastCountWaiting | Number of patients waiting for an exam with contrast. |
| | WithandWithoutContrastCountWaiting | Number of patients waiting for an exam with and without contrast. |
| | WithContrastCountInProgress | Number of exams in progress with contrast. |
| | WithandWithoutContrastCountInProgress | Number of exams in progress with and without contrast. |
| | ExpectedDelayNextExam | Expected delay of the next scheduled exam. |
| | SumDelayWaitingByExamCode | Sum of delays of patients in line by exam type. |
| | AvgWaitByTaskTypeLine | Average waits of patients in line by exam type. |
| | SumWaitByTaskTypeLine | Sum of waits of patients in line by exam type. |
| | MSKCount | Number of patients waiting for musculoskeletal exam. |
| | CardiacCount | Number of patients waiting for cardiac exam. |
| | VascularCount | Number of patients waiting for vascular exam. |
| | AbdominalCount | Number of patients waiting for abdominal exam. |
| | NeuroCount | Number of patients waiting for neuro exam. |
| | PediatricCount | Number of patients waiting for pediatric exam. |
| | ThoracicCount | Number of patients waiting for thoracic exam. |
| Time | DayOfYear | The day of the year the exam is scheduled. |
| | Month | The month of the year the exam is scheduled. |
| | DayOfWeek | The day of the week the exam is scheduled. |
| | StartTime/2/3/4 | Hour of arrival, and 2nd, 3rd and 4th powers of hour of arrival to account for nonlinear trends. |
| | BeforeSlot | Time since last appointment slot. |
| | AfterSlot | Time until next appointment slot. |
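To make a few of the definitions above concrete, the following is a minimal sketch of how some congestion and time features might be derived from arrival and exam-start timestamps. The function name, the `visits` record layout, and the retrospective "in line" rule (arrived but exam not yet started) are assumptions made for illustration; they are not taken from the source.

```python
from datetime import datetime, timedelta


def congestion_and_time_features(arrival, visits):
    """Compute a handful of features for a patient arriving at `arrival`.

    `visits` is a hypothetical list of (arrival_time, exam_start_time) pairs
    for the other patients of that day; exam_start_time is None when the
    exam has not started. Features are computed retrospectively.
    """
    def in_line_at(t):
        # Assumed rule: a patient is in line at t if they have arrived by t
        # and their exam has not started by t.
        return sum(1 for a, s in visits if a <= t and (s is None or s > t))

    features = {}
    # LineCount0..4: line length at arrival and 15, 30, 45, 60 minutes before.
    for k in range(5):
        features[f"LineCount{k}"] = in_line_at(arrival - timedelta(minutes=15 * k))
    # FlowCount30/60: exams started in the 30/60 minutes before arrival.
    for window in (30, 60):
        start = arrival - timedelta(minutes=window)
        features[f"FlowCount{window}"] = sum(
            1 for _, s in visits if s is not None and start <= s < arrival
        )
    # NoneInLine: indicator that nobody is waiting at arrival.
    features["NoneInLine"] = int(features["LineCount0"] == 0)
    # StartTime and its 2nd to 4th powers (hour of arrival as a decimal).
    hour = arrival.hour + arrival.minute / 60
    features["StartTime"] = hour
    for power in (2, 3, 4):
        features[f"StartTime{power}"] = hour ** power
    return features
```

In a real Hospital Information System the timestamps would come from scheduling and modality logs, and the remaining features in the table would need scheduled times, completion times and exam codes as well.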
Fig 1. Feature selection process for linear regression, which was performed separately for each facility, on multiple data subsets.
Features were added to the best model, at each step, based on predictive performance. This process was repeated 100 times, each with different subsets of the data, to avoid data bias.
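The procedure described in this caption can be sketched as greedy forward selection repeated over random data subsets. The sketch below is an illustration only: the 70/30 split, the use of test MSE as the "predictive performance" criterion, and all function names are assumptions, and the small normal-equations solver stands in for whatever regression software the authors used.

```python
import random


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda i: abs(A[i][c]))
        A[c], A[piv] = A[piv], A[c]
        if abs(A[c][c]) < 1e-12:
            raise ValueError("singular design matrix")
        for i in range(p):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def mse(beta, X, y):
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def forward_select(X, y, max_features, seed=0):
    """One greedy forward-selection pass on a random 70/30 split (assumed)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    cut = int(0.7 * len(idx))
    train, test = idx[:cut], idx[cut:]

    def score(cols):
        def take(rows):
            return [[X[i][c] for c in cols] for i in rows]
        try:
            beta = fit_ols(take(train), [y[i] for i in train])
        except ValueError:
            return float("inf")  # skip collinear candidates
        return mse(beta, take(test), [y[i] for i in test])

    chosen = []
    while len(chosen) < max_features:
        rest = [c for c in range(len(X[0])) if c not in chosen]
        # Add the feature that gives the lowest test MSE at this step.
        chosen.append(min(rest, key=lambda c: score(chosen + [c])))
    return chosen


def selection_frequency(X, y, max_features, repeats=100):
    """Fraction of repeated runs in which each feature index is selected."""
    counts = {}
    for seed in range(repeats):
        for c in forward_select(X, y, max_features, seed):
            counts[c] = counts.get(c, 0) + 1
    return {c: n / repeats for c, n in counts.items()}
```

The frequencies returned by `selection_frequency` are the kind of quantity plotted on the horizontal axis of Fig 3, where 100% means a feature was selected in every run.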
Fig 2. Reduction in the percentage of test error (Testing Percentage Error) as features (N) are added to the linear regression model.
Testing error is the ratio of the test MSE of the model with N features to the test MSE of the intercept-only model, so a Testing Percentage Error of 100% corresponds to the error of the intercept-only model. As features are added, the test error plateaus after 20 features for all four facilities, with F1 being the last facility to do so.
Fig 3. Features selected in the linear regression model for each facility.
The congestion features are highlighted in blue and the time features highlighted in orange; no other feature types were identified by the selection algorithm. 100% on the horizontal axis indicates that a variable was selected in all the linear regression models for that facility.
Predictive performance, measured by mean absolute error, of reduced feature and full linear regression models for each facility.
The mean absolute errors of both models are reported in minutes.
| Facility | Full feature set (minutes) | Best 10-feature set (minutes) |
|---|---|---|
| F1 | 17.60 | 19.20 |
| F2 | 8.55 | 9.32 |
| F3 | 22.95 | 23.07 |
| F4 | 3.91 | 3.91 |
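The cost of reducing the linear regression model to 10 features can be read off the table as a relative increase in mean absolute error. The short sketch below does that arithmetic on the tabulated values; the variable and function names are illustrative only.

```python
# Mean absolute error in minutes: (full feature set, best 10-feature set),
# copied from the linear regression table above.
LINEAR_REGRESSION_MAE = {
    "F1": (17.60, 19.20),
    "F2": (8.55, 9.32),
    "F3": (22.95, 23.07),
    "F4": (3.91, 3.91),
}


def relative_increase_percent(full, reduced):
    """Percentage increase in error when moving from full to reduced set."""
    return 100.0 * (reduced - full) / full


increase = {
    facility: round(relative_increase_percent(full, reduced), 1)
    for facility, (full, reduced) in LINEAR_REGRESSION_MAE.items()
}
print(increase)
```

On these figures the reduced model is roughly 9% worse for F1 and F2, about 0.5% worse for F3, and unchanged for F4.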
Union of features for F1, F2 and F3 as selected by forward stepwise linear regression.
| Feature | Group |
|---|---|
| LineCount0Strict | Congestion |
| AheadCount | Congestion |
| StartTime4 | Time |
| DelayedInLine | Congestion |
| InProgressSize | Congestion |
| NumCompletedToday | Congestion |
| NumScheduledNextSlot | Congestion |
| LineCount0 | Congestion |
| NumScheduledNext60 | Congestion |
| AvgWaitForDay | Congestion |
| SumHowEarlyWaiting | Congestion |
| AfterSlot | Time |
| SumWaits | Congestion |
| ScheduledFlowCount30 | Congestion |
| SumDelayInProgress | Congestion |
| BeforeSlot | Time |
| IsFirst | Congestion |
Best predictive features for F4 as selected by linear regression.
| Feature | Group |
|---|---|
| LineCount0 | Congestion |
| AheadCount | Congestion |
| NoneInLine | Congestion |
| NoneInProgress | Congestion |
| NoneCompleted | Congestion |
| StartTime4 | Time |
| InProgressSize | Congestion |
| NumCompletedToday | Congestion |
| NumCompletedInLast30 | Congestion |
| AvgWaitLast30 | Congestion |
Transferability of features selected by linear regression: How well the features selected as optimal for each facility (columns) approximate the data from each facility (rows).
The ratio represents the mean of the predictions made by the transferred feature set compared to the predictions made by the facility optimal feature set. A ratio of 1 indicates that the predictions are equivalent and a ratio of 1.06 indicates that the predictions made by the transferred set are 6% worse than the predictions made by the facility optimal feature set.
| Facility data | F1 features | F2 features | F3 features | F4 features |
|---|---|---|---|---|
| F1 | - | 1.14 | 1.09 | 1.15 |
| F2 | 1.12 | - | 1 | 1.07 |
| F3 | 1.08 | 1.08 | - | 1.14 |
| F4 | 1.02 | 1.02 | 1.01 | - |
*When using F1, F2 and F3 features to predict F4, the best 10 predictive features that were suitable for F4 were used. When using the Union to predict F4, there were 8 features that were applicable to F4.
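One way to read the ratios in this table is as an error ratio between two models evaluated on the same facility's data. The sketch below assumes the compared quantity is mean absolute error, which is the metric reported elsewhere in this work; the caption itself does not name it, so treat that choice and the function names as assumptions.

```python
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def transfer_ratio(y_true, pred_transferred, pred_own):
    """Error with another facility's feature set divided by the error with
    the facility's own optimal feature set (assumed to be MAE here).

    A value of 1 means equivalent performance; 1.06 means 6% worse.
    """
    return (mean_absolute_error(y_true, pred_transferred)
            / mean_absolute_error(y_true, pred_own))
```

Filling the table would mean fitting one model per (row, column) pair on the row facility's data using the column facility's selected features, then calling `transfer_ratio` against the row facility's own model.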
Predictive performance, measured by mean absolute error, of reduced feature and full feature random forest models for each facility.
The mean absolute errors of both models are reported in minutes.
| Facility | Full feature set (minutes) | Best 10-feature set (minutes) |
|---|---|---|
| F1 | 19.23 | 18.24 |
| F2 | 9.92 | 10.02 |
| F3 | 24.47 | 24.55 |
| F4 | 3.98 | 3.98 |
Union of features for F1, F2 and F3 as selected by importance in random forest.
| Feature | Group |
|---|---|
| LineCount0Strict | Congestion |
| AheadCount | Congestion |
| StartTime | Time |
| StartTime2 | Time |
| StartTime3 | Time |
| StartTime4 | Time |
| NumCompletedToday | Congestion |
| DelayedInLine | Congestion |
| AvgWaitLastK3Customers | Congestion |
| Median5 | Congestion |
| SumDelayInProgress | Congestion |
| BeforeSlot | Time |
| SumWaits | Congestion |
| DelayCount | Congestion |
Best predictive features for F4 as selected by random forest.
| Feature | Group |
|---|---|
| LineCount0 | Congestion |
| AheadCount | Congestion |
| StartTime | Time |
| StartTime2 | Time |
| StartTime3 | Time |
| StartTime4 | Time |
| SumTimeToCompleteInProgress | Congestion |
| SumDelayInProgress | Congestion |
| NumCustomersInLast30 | Congestion |
| SumWaits | Congestion |
Transferability of features selected by random forest.
Given the random nature of the tree-based algorithm, 95% confidence intervals are provided in parentheses. The ratio represents the mean of the predictions made by the transferred feature set compared to the predictions made by the facility optimal feature set (a ratio of 1.06 indicates that predictions made with the transferred set are 6% worse).
| Facility data | F1 features | F2 features | F3 features | F4 features | Union |
|---|---|---|---|---|---|
| F1 | - | 1 (0.9, 1.17) | 0.99 (0.91, 1.08) | 1.04 (0.95, 1.15) | 0.97 (0.88, 1.07) |
| F2 | 1.03 (0.95, 1.1) | - | 1 (0.97, 1.03) | 1.04 (0.96, 1.13) | 0.98 (0.9, 1.02) |
| F3 | 1.05 (0.99, 1.13) | 1.04 (0.98, 1.11) | - | 1.1 (0.99, 1.21) | 1 (0.96, 1.07) |
| F4 | 1.02 (0.97, 1.06) | 1 (0.96, 1.05) | 1.02 (0.99, 1.07) | - | 0.99 (0.96, 1.05) |
*When using F1, F2 and F3 features to predict F4, the best 10 predictive features that were suitable for F4 were used. When using the Union to predict F4, there were 12 features that were suitable for F4.
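Because a random forest gives slightly different results on each run, an interval such as those in the table can be obtained by repeating the fit and summarizing the resulting ratios. The source does not state how its 95% intervals were computed; the sketch below shows one common option, a percentile interval over repeated runs, and every name in it is hypothetical.

```python
def percentile_interval(ratios, level=0.95):
    """Mean and percentile interval of ratios collected over repeated runs.

    This is an assumed method, not necessarily the one used in the source.
    Quantiles are linearly interpolated between order statistics.
    """
    s = sorted(ratios)

    def quantile(p):
        pos = p * (len(s) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (pos - lo) * (s[hi] - s[lo])

    tail = (1.0 - level) / 2.0
    mean = sum(s) / len(s)
    return mean, quantile(tail), quantile(1.0 - tail)
```

For example, passing the transfer ratios from 100 differently seeded forests would return a (mean, lower, upper) triple in the same shape as a table cell such as 1.04 (0.95, 1.15).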