Robert Andrews, Moe T Wynn, Kirsten Vallmuur, Arthur H M Ter Hofstede, Emma Bosley, Mark Elcock, Stephen Rashford.
Abstract
While noting the importance of data quality, existing process mining methodologies (i) do not provide details on how to assess the quality of event data, (ii) do not consider how the identification of data quality issues can be exploited in the planning, data extraction and log building phases of any process mining analysis, and (iii) do not highlight potential impacts of poor quality data on different types of process analyses. As our key contribution, we develop a process-centric, data quality-driven approach to preparing for a process mining analysis which can be applied to any existing process mining methodology. Our approach, adapted from elements of the well-known CRISP-DM data mining methodology, includes conceptual data modeling, quality assessment at both attribute and event level, and trial discovery and conformance checking to develop an understanding of system processes and data properties that informs data extraction. We illustrate our approach in a case study involving the Queensland Ambulance Service (QAS) and Retrieval Services Queensland (RSQ). We describe the detailed preparation for a process mining analysis of retrieval and transport processes (ground and aero-medical) for road-trauma patients in Queensland. Sample datasets obtained from QAS and RSQ are utilised to show how quality metrics, data models and exploratory process mining analyses can be used to (i) identify data quality issues, (ii) anticipate and explain certain observable features in process mining analyses, (iii) distinguish between systemic and occasional quality issues, and (iv) reason about the mechanisms by which identified quality issues may have arisen in the event log. We contend that this knowledge can be used to guide the data extraction and pre-processing stages of a process mining case study to properly align the data with the case study research questions.
Keywords: GEMS; HEMS; data quality; methodologies and best practice for PODS4H; pre-hospital transport and care; process mining in healthcare
Year: 2019 PMID: 30934913 PMCID: PMC6479847 DOI: 10.3390/ijerph16071138
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. The PDM methodology [16].
Figure 2. The L* methodology [18].
Figure 3. The PM methodology [20].
Figure 4. Our approach adapted from the CRISP-DM methodology [8].
Figure 5. BPMN model of emergency incident management: ground and aero-medical call centre, asset deployment and patient transport.
Figure 6. ORM model of QAS data.
Figure 7. ORM model of RSQ data.
QAS—Data attributes and descriptions.
| Column Name | Description |
|---|---|
| T_INCIDENT | Incident identifier |
| D_RECEIVED_CAD | Date/time the QAS emergency call centre is notified of an incident and |
| FIRST_ASSIGNED_CAD | Date/time when the |
| CAD (Vehicle) Waypoints | |
| ON_SCENE_CAD | Date/time when a unit arrives at the incident scene. |
| DEPART_SCENE_CAD | Date/time when a unit departs the incident scene. |
| AT_DEST_CAD | Date/time when a unit arrives at destination. This is usually a hospital |
| CLEAR_CAD | Date/time when a unit indicates its involvement in the incident is finished |
| eARF (Patient) Waypoints | |
| EN_ROUTE_VACIS | Date/time recorded by a unit indicating it has commenced travelling |
| AT_SCENE_VACIS | Date/time recorded by a unit when it arrives at the incident scene. |
| AT_PAT_VACIS | Date/time recorded by a unit when paramedics arrive at a patient. May |
| LOADED_VACIS | Date/time recorded by a unit when a patient is loaded in the unit ready |
| NOTIFY_VACIS | Date/time recorded by a unit on leaving incident scene. |
| OFF_STRETCHER_VACIS | Date/time recorded by a unit when a patient is unloaded from the unit |
QAS—Column-level data quality summary for a sample of columns.
| Column Name | Data Type | Completeness | Precision | Uniqueness |
|---|---|---|---|---|
| T_INCIDENT | int | 100% | 100% | |
| D_RECEIVED_CAD | datetime | 100% | 83% | 100% |
| FIRST_ASSIGNED_CAD | datetime | 100% | 83% | 58% |
| CAD (Vehicle) Waypoints | | | | |
| ON_SCENE_CAD | datetime | 97% | 83% | 83% |
| DEPART_SCENE_CAD | datetime | 57% | 83% | 82% |
| AT_DEST_CAD | datetime | 57% | 83% | 82% |
| CLEAR_CAD | datetime | 100% | 83% | 83% |
| eARF (Patient) Waypoints | | | | |
| EN_ROUTE_VACIS | datetime | 94% | 66% | 84% |
| AT_SCENE_VACIS | datetime | 95% | 66% | 86% |
| AT_PAT_VACIS | datetime | 90% | 66% | 90% |
| LOADED_VACIS | datetime | 52% | 66% | 89% |
| NOTIFY_VACIS | datetime | 7% | 66% | 53% |
| OFF_STRETCHER_VACIS | datetime | 50% | 66% | 93% |
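
Column-level metrics of the kind summarised above can be computed with a few lines of code. The following is a minimal sketch, not the authors' implementation. It assumes that completeness is the share of rows with a non-empty value and that uniqueness is the share of distinct values among the non-empty ones; the record does not state the exact definitions, and precision is left out because its definition is not given here. The function name `column_quality` and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Optional


def column_quality(rows: List[Dict[str, Any]], column: str) -> Dict[str, Optional[float]]:
    """Return completeness and uniqueness (as fractions) for one column.

    Assumed definitions (not taken from the source):
      completeness = non-empty values / all rows
      uniqueness   = distinct non-empty values / non-empty values
    """
    total = len(rows)
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    completeness = len(present) / total if total else None
    uniqueness = len(set(present)) / len(present) if present else None
    return {"completeness": completeness, "uniqueness": uniqueness}


# Hypothetical sample rows using column names from the QAS table.
sample = [
    {"T_INCIDENT": 1, "ON_SCENE_CAD": "2015-01-01 10:05:00", "NOTIFY_VACIS": None},
    {"T_INCIDENT": 2, "ON_SCENE_CAD": "2015-01-01 10:05:00", "NOTIFY_VACIS": None},
    {"T_INCIDENT": 3, "ON_SCENE_CAD": "2015-01-01 11:30:00", "NOTIFY_VACIS": "2015-01-01 11:45:00"},
    {"T_INCIDENT": 4, "ON_SCENE_CAD": None, "NOTIFY_VACIS": None},
]

for col in ("T_INCIDENT", "ON_SCENE_CAD", "NOTIFY_VACIS"):
    print(col, column_quality(sample, col))
```

A column that is both sparsely populated and low in uniqueness, as NOTIFY_VACIS is in the table, would stand out in such a summary as a candidate for closer inspection before extraction.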
RSQ—Data attributes and descriptions.
| Column Name | Description |
|---|---|
| SOURCE_ID | Incident identifier. |
| DATE_RETRIEVAL_REQUESTED | Date the RSQ is notified of an incident and of a request for |
| TEAM_ACTIVATED | Date/time when the medical team crewing is alerted to fly. |
| READY_TO_DEPART | Date/time when the unit is ready to depart base. |
| DEPART_WITH_MEDICAL_TEAM | Date/time when a unit actually departs base. |
| LAND_AT_DESTINATION | Date/time when a unit arrives at the incident scene. |
| AT_SCENE_PATIENT | Date/time when the medical team arrives at a patient. May |
| DEPARTURE_READY | Date/time when a unit is ready to depart from the incident scene. |
| ACTUAL_TIME_DEPART | Date/time when a unit actually departs the incident scene. |
| ARRIVE_AT_RECIEVING_HOSPITAL | Date/time when a unit arrives at the receiving hospital. |
| DEPART_RECIEVING_HOSPITAL | Date/time when a unit departs from the receiving hospital |
| ARRIVE_BACK_AT_BASE | Date/time when a unit returns to base. |
| AVAILABLE_FOR_NEXT_TASKING | Date/time when a unit is refitted and ready for re-tasking. |
| MECHANISM_OF_INJURY | Mode of injury necessitating aero-medical rather than
RSQ—Column-level data quality summary for a sample of columns.
| Column Name | Data Type | Completeness | Precision | Uniqueness |
|---|---|---|---|---|
| SOURCE_ID | int | 100% | 100% | |
| DATE_RETRIEVAL_REQUESTED | datetime | 100% | 33% | 7% |
| TEAM_ACTIVATED | datetime | 100% | 65% | 95% |
| READY_TO_DEPART | datetime | 100% | 65% | 96% |
| DEPART_WITH_MEDICAL_TEAM | datetime | 100% | 65% | 95% |
| LAND_AT_DESTINATION | datetime | 100% | 65% | 96% |
| AT_SCENE_PATIENT | datetime | 100% | 65% | 97% |
| DEPARTURE_READY | datetime | 100% | 65% | 96% |
| ACTUAL_TIME_DEPART | datetime | 100% | 65% | 96% |
| ARRIVE_AT_RECIEVING_HOSPITAL | datetime | 100% | 65% | 96% |
| DEPART_RECIEVING_HOSPITAL | datetime | 100% | 65% | 96% |
| ARRIVE_BACK_AT_BASE | datetime | 100% | 66% | 93% |
| AVAILABLE_FOR_NEXT_TASKING | datetime | 100% | 63% | 92% |
| MECHANISM_OF_INJURY | string | 25% | 100% | 27% |
RSQ—milestone activity ordering violations.
| Milestone Activities | | | |
|---|---|---|---|
| 434 | 66 | 0 | |
| 470 | 29 | 1 | |
| 315 | 105 | 80 | |
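
The column headings of this table did not survive in this record, so the sketch below does not attempt to reproduce its figures. It only illustrates one way an ordering check between two milestone timestamps could be made: each case is classified as in order, same time, out of order, or missing. The category names, the timestamp format, the function names and the sample rows are all assumptions, as is the choice of READY_TO_DEPART preceding DEPART_WITH_MEDICAL_TEAM as the expected order.

```python
from collections import Counter
from datetime import datetime
from typing import Any, Dict, Iterable, Optional

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def classify_pair(earlier: Optional[str], later: Optional[str]) -> str:
    """Classify one case for a pair of milestones expected in this order."""
    if not earlier or not later:
        return "missing"
    t1 = datetime.strptime(earlier, FMT)
    t2 = datetime.strptime(later, FMT)
    if t1 < t2:
        return "in_order"
    if t1 == t2:
        return "same_time"
    return "violation"


def ordering_summary(rows: Iterable[Dict[str, Any]], first: str, second: str) -> Counter:
    """Count cases per category for the milestone pair (first, second)."""
    return Counter(classify_pair(row.get(first), row.get(second)) for row in rows)


# Hypothetical sample rows using column names from the RSQ table.
rsq_sample = [
    {"SOURCE_ID": 1, "READY_TO_DEPART": "2015-03-01 08:00:00",
     "DEPART_WITH_MEDICAL_TEAM": "2015-03-01 08:10:00"},
    {"SOURCE_ID": 2, "READY_TO_DEPART": "2015-03-01 09:00:00",
     "DEPART_WITH_MEDICAL_TEAM": "2015-03-01 09:00:00"},
    {"SOURCE_ID": 3, "READY_TO_DEPART": "2015-03-01 10:30:00",
     "DEPART_WITH_MEDICAL_TEAM": "2015-03-01 10:20:00"},
    {"SOURCE_ID": 4, "READY_TO_DEPART": None,
     "DEPART_WITH_MEDICAL_TEAM": "2015-03-01 11:00:00"},
]

summary = ordering_summary(rsq_sample, "READY_TO_DEPART", "DEPART_WITH_MEDICAL_TEAM")
print(dict(summary))
```

Keeping identical timestamps separate from true violations is a deliberate choice in this sketch: ties can come from coarse timestamp precision rather than from a genuine ordering problem.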
Figure 8. Event log creation from sample data.
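
The figure itself is not reproduced in this record. As a rough illustration of what building an event log from such data can involve, the sketch below applies a common wide-to-long transformation: each timestamp column of a case row becomes one event (case identifier, activity name, timestamp), and empty timestamps are skipped. This is an assumption about the general technique, not a description of the authors' procedure; the function name `build_event_log`, the timestamp format and the sample rows are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Sequence, Tuple

FMT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format

Event = Tuple[Any, str, datetime]  # (case id, activity, timestamp)


def build_event_log(rows: Sequence[Dict[str, Any]], case_column: str,
                    activity_columns: Sequence[str]) -> List[Event]:
    """Turn one-row-per-case data into one-row-per-event data."""
    events: List[Event] = []
    for row in rows:
        for column in activity_columns:
            raw = row.get(column)
            if not raw:
                continue  # missing timestamp: no event is emitted
            events.append((row[case_column], column, datetime.strptime(raw, FMT)))
    # Sort by case, then by time; the stable sort keeps column order on ties.
    events.sort(key=lambda event: (event[0], event[2]))
    return events


# Hypothetical sample rows using column names from the QAS table.
qas_rows = [
    {"T_INCIDENT": 2, "D_RECEIVED_CAD": "2015-01-01 09:00:00",
     "ON_SCENE_CAD": "2015-01-01 09:20:00", "DEPART_SCENE_CAD": None},
    {"T_INCIDENT": 1, "D_RECEIVED_CAD": "2015-01-01 08:00:00",
     "ON_SCENE_CAD": "2015-01-01 08:15:00", "DEPART_SCENE_CAD": "2015-01-01 08:40:00"},
]

log = build_event_log(qas_rows, "T_INCIDENT",
                      ["D_RECEIVED_CAD", "ON_SCENE_CAD", "DEPART_SCENE_CAD"])
for case_id, activity, timestamp in log:
    print(case_id, activity, timestamp.isoformat())
```

Skipping empty timestamps means incomplete columns translate directly into missing events, which is one way column-level quality issues can surface later in discovered process models.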
Figure 9. QAS process model derived from sample data.
Figure 10. QAS conformance model derived from sample data.
Figure 11. RSQ process model derived from sample data.
Figure 12. RSQ conformance model derived from sample data.