Matthew C Aalsma, Katherine Schwartz, Konrad A Haight, G Roger Jarjoura, Allyson L Dir.
Abstract
CONTEXT: Integrating electronic health records (EHR) with other sources of administrative data is key to identifying factors affecting the long-term health of traditionally underserved populations, such as individuals involved in the justice system. Linking existing administrative data from multiple sources overcomes many of the limitations of traditional prospective studies of population health, but the linking process assumes high levels of data quality and consistency within administrative data. Studies of EHR, unlike other types of administrative data, have provided guidance to evaluate the utility of big data for population health research.

CASE DESCRIPTION: Here, an established EHR data quality framework was applied to identify and describe the potential shortcomings of administrative juvenile justice system data collected by one of four case management systems (CMSs) across 12 counties in a Midwest state. The CMS data were reviewed for logical inconsistencies and compared along the data quality dimensions of plausibility and completeness.

MAJOR THEMES: After applying the data quality framework, several patterns of logical inconsistencies within the data were identified. To resolve these inconsistencies, recommendations regarding data entry, review, and extraction are offered.
Keywords: administrative data; data linking; data quality; electronic health records; juvenile justice system
Year: 2019 PMID: 31328133 PMCID: PMC6625535 DOI: 10.5334/egems.258
Source DB: PubMed Journal: EGEMS (Wash DC) ISSN: 2327-9214
Logical inconsistencies violating data plausibility.
| Inconsistency | Definition and examples |
|---|---|
| Inconsistent petition | If a case is diverted, meaning that the case was dismissed or resolved through “warn and release,” an informal adjustment (written agreement between the juvenile and the Juvenile Court Probation department), or a referral to a treatment program, there should be no petition filed for formal court processing. If the data extraction suggested that a case had been both diverted and petitioned, often the early decision to divert was overturned by the prosecutor, who then filed a petition. In this scenario, the data would be corrected to reflect “no” for diverted and “yes” for petitioned. |
| Implausible case | For this plausibility error, a case record was often incorrect because the youth was arrested in one county but held in another county’s detention center. Sometimes the inconsistency meant that a youth was being housed in the local detention center for a case in another county (i.e., a courtesy detention). To correct these data errors, these cases would not be included in the total number of cases within a county, since the court processing occurred in another jurisdiction. |
| Excess information | For the purpose of recording a diverted case, the diversion should be the last decision point. Thus, when the data extracts for a diverted case included information on subsequent decision points, it was likely that subsequent arrests for the same youth were erroneously linked to the original diversion. In most cases, the excess information was not applicable to the diversion in question, meaning that the excess information would apply to separate arrests. In other cases, the diversion field was incorrectly filled, requiring a simple correction. |
| Inconsistent waiver | Similar to diversion cases, if the youth is waived to adult court, there should be no further activity in juvenile court related to the same arrest. The data extractions showed that “waived” was the decision point most likely to be incorrectly noted in the CMSs. Data entry corrections resolved these errors. |
| Inconsistent adjudication | An adjudicatory hearing in which the arrest was found “not true” is another way to conclude a juvenile case. There should be no formal probation or confinement in a correctional facility if the youth is not adjudicated delinquent. In most of these inconsistent adjudication cases, the error occurred because the adjudication field was left blank, and subsequent decision points accurately reflected the case outcome. |
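The five rules above can be read as boolean checks on a single case record. The following is a minimal sketch of that reading, not the authors' extraction or cleaning code. All field names (`diverted`, `petitioned`, `detained`, `adjudicated`, `probation`, `confined`, `waived`, `arrest_county`, `detention_county`) are hypothetical, and each decision point is assumed to be stored as `True`, `False`, or `None` (blank). Treating "excess information" as any post-petition decision point on a diverted case is also an assumption made for illustration.

```python
# Field names are hypothetical; each decision point is True, False, or None (blank).
LATER_POINTS = ("adjudicated", "probation", "confined")


def plausibility_flags(case: dict) -> dict:
    """Return one boolean flag per logical inconsistency for a single case record."""

    def yes(field: str) -> bool:
        return case.get(field) is True

    arrest = case.get("arrest_county")
    detention = case.get("detention_county")
    later_activity = any(yes(f) for f in LATER_POINTS)

    return {
        # A diverted case should have no petition filed.
        "inconsistent_petition": yes("diverted") and yes("petitioned"),
        # Youth held in a detention center outside the county of arrest.
        "implausible_case": (
            yes("detained")
            and None not in (arrest, detention)
            and arrest != detention
        ),
        # Diversion should be the last decision point (assumed definition).
        "excess_information": yes("diverted") and (later_activity or yes("waived")),
        # A waiver to adult court should end juvenile court activity.
        "inconsistent_waiver": yes("waived") and later_activity,
        # No probation or confinement without an adjudication of delinquency.
        "inconsistent_adjudication": (
            not yes("adjudicated") and (yes("probation") or yes("confined"))
        ),
    }


def percent_with_any(cases: list) -> float:
    """Percentage of cases with at least one flagged inconsistency."""
    if not cases:
        return 0.0
    flagged = sum(1 for c in cases if any(plausibility_flags(c).values()))
    return round(100 * flagged / len(cases), 1)
```

A flag here only marks a record for review. As the definitions note, the appropriate correction differs by rule (for example, recoding a field versus excluding a courtesy detention from a county's totals), so the sketch deliberately does not modify records.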
Case Description by CMS (N = 16,013)*.
| | CMS 1 | | CMS 2 | | CMS 3 | | CMS 4 | |
|---|---|---|---|---|---|---|---|---|
| | n = 1,474 | | n = 870 | | n = 12,662 | | n = 1,007 | |
| | n | (%) | n | (%) | n | (%) | n | (%) |
| Male | 923 | (62.6) | 604 | (69.5) | 8,527 | (67.3) | 669 | (66.4) |
| Female | 551 | (37.4) | 266 | (30.5) | 4,135 | (32.7) | 338 | (33.6) |
| African American/Black | 181 | (12.3) | 63 | (7.2) | 3,799 | (30.0) | | |
| Asian | <10 | | <10 | | 71 | (0.6) | | |
| Hawaiian/Pacific Islander | <10 | | <10 | | <10 | | | |
| Hispanic/Latino | 79 | (5.3) | 25 | (2.9) | 624 | (4.9) | | |
| Native American or Native Alaskan | <10 | | <10 | | 27 | (0.2) | | |
| White | 1,172 | (79.5) | 752 | (86.4) | 7,080 | (55.9) | | |
| Other | 30 | (2.1) | <10 | | 1,057 | (8.3) | | |
| Age 10 | 21 | (1.4) | <10 | | 97 | (0.8) | <10 | |
| Age 11 | 33 | (2.2) | 14 | (1.6) | 231 | (1.8) | 18 | (1.8) |
| Age 12 | 90 | (6.1) | 23 | (2.7) | 503 | (4.0) | 38 | (3.8) |
| Age 13 | 157 | (10.6) | 73 | (8.4) | 1,011 | (8.0) | 97 | (9.7) |
| Age 14 | 215 | (14.6) | 116 | (13.4) | 1,790 | (14.1) | 146 | (14.5) |
| Age 15 | 314 | (21.3) | 156 | (17.9) | 2,475 | (19.5) | 201 | (19.9) |
| Age 16 | 330 | (22.4) | 225 | (25.8) | 3,186 | (25.2) | 254 | (25.2) |
| Age 17 | 300 | (20.4) | 254 | (29.2) | 3,206 | (25.3) | 233 | (23.2) |
| Age 18 | 15 | (1.0) | <10 | | 163 | (1.3) | 10 | (1.0) |
| Felony A | <10 | | <10 | | <10 | | | |
| Felony B | 28 | (1.9) | 15 | (1.7) | 254 | (2.0) | | |
| Felony C | 61 | (4.1) | 26 | (3.0) | 253 | (2.0) | | |
| Felony D | 336 | (22.8) | 214 | (24.6) | 1,616 | (12.8) | | |
| Misdemeanor A | 312 | (21.2) | 238 | (27.4) | 3,132 | (24.7) | | |
| Misdemeanor B | 206 | (14.0) | 83 | (9.5) | 1,805 | (14.3) | | |
| Misdemeanor C | 101 | (6.8) | 180 | (20.7) | 226 | (1.8) | | |
| Status Offense | 423 | (28.7) | 111 | (12.8) | 2,623 | (20.7) | | |
| Violation of Probation | <10 | | <10 | | 2,751 | (21.7) | | |
* Cell values under 10 individuals are not reported to limit possible identification.
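The small-cell rule in the footnote can be illustrated with a short formatting helper. This is a sketch only: the function name `format_cell` and its `threshold` parameter are hypothetical, and the output layout simply mirrors the `n (%)` cells in the table.

```python
def format_cell(count: int, total: int, threshold: int = 10) -> str:
    """Format a table cell, suppressing counts below the reporting threshold."""
    if count < threshold:
        # Counts under the threshold are masked to limit possible identification.
        return f"<{threshold}"
    percent = 100 * count / total
    return f"{count:,} ({percent:.1f})"


# Reproduces two cells from the table: males and age-18 cases.
print(format_cell(923, 1474))   # CMS 1, Male
print(format_cell(10, 1007))    # CMS 4, age 18 (exactly 10, so it is reported)
print(format_cell(7, 870))      # a hypothetical suppressed count
```

Because the rule is "under 10", a cell with exactly 10 individuals is still reported, which matches the CMS 4 age-18 entry.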
Violations of data completeness: Percentage of cases with missing data at each decision point by CMS, pre- and post-data cleaning.
| | CMS 1 | | CMS 2 | | CMS 3 | | CMS 4 | |
|---|---|---|---|---|---|---|---|---|
| | n = 1,474 | | n = 870 | | n = 12,662 | | n = 1,007 | |
| Decision point | Pre- | Post- | Pre- | Post- | Pre- | Post- | Pre- | Post- |
| Diverted | 0.0% | 0.0% | 0.0% | 0.0% | 0.6% | 0.6% | 0.0% | |
| Detained | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | 0.0% | |
| Petitioned | 0.0% | 0.0% | 0.0% | 0.0% | 0.3% | 0.6% | 0.0% | |
| Adjudicated delinquent | 0.0% | 0.0% | 0.0% | 0.0% | 0.4% | 0.8% | 29.7% | |
| Placed on probation | 2.2% | 2.2% | 0.0% | 0.0% | 0.4% | 0.8% | 29.7% | |
| Confined | 100.0% | 31.3% | 0.0% | 0.0% | 0.4% | 0.8% | 29.7% | |
| Waived | 100.0% | 18.0% | 0.0% | 0.0% | 0.6% | 0.6% | 0.0% | |
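A completeness figure of this kind is the share of case records in which a decision-point field is blank. The sketch below shows one way to compute it, assuming (hypothetically) that each case is a dictionary and that a blank field is stored as `None` or is absent. The field names are illustrative and are not taken from any of the four CMSs.

```python
def percent_missing(cases: list, fields: list) -> dict:
    """Percentage of cases with a missing value, per decision-point field."""
    total = len(cases)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        # A field counts as missing when it is absent or explicitly None.
        field: round(100 * sum(1 for c in cases if c.get(field) is None) / total, 1)
        for field in fields
    }


# Toy records, not study data.
example_cases = [
    {"diverted": True, "confined": None},
    {"diverted": False, "confined": None},
    {"diverted": None, "confined": True},
    {"diverted": True, "confined": False},
]
print(percent_missing(example_cases, ["diverted", "confined"]))
```

Running the same function on the extract before and after cleaning would give the paired pre- and post- columns. Note that cleaning can raise as well as lower missingness if implausible values are reset to blank.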
Data plausibility violations: Percentage of cases with logical inconsistencies by CMS, pre- and post-data cleaning.
| | CMS 1 | | CMS 2 | | CMS 3 | | CMS 4 | |
|---|---|---|---|---|---|---|---|---|
| | n = 1,474 | | n = 870 | | n = 12,662 | | n = 1,007 | |
| Inconsistency | Pre- | Post- | Pre- | Post- | Pre- | Post- | Pre- | Post- |
| Inconsistent petition | 0.0% | 0.0% | 24.9% | 0.0% | 0.0% | 0.0% | 0.0% | |
| Implausible case | 3.6% | 2.0% | 20.1% | 0.1% | 8.4% | 1.3% | 48.5% | |
| Excess information | 0.7% | 1.3% | 0.1% | 0.0% | 0.0% | 0.0% | 6.4% | |
| Inconsistent waiver | 0.0% | 0.1% | 0.0% | 0.0% | 0.0% | 0.0% | 18.7% | |
| Inconsistent adjudication | 9.9% | 10.7% | 26.0% | 0.0% | 4.0% | 4.9% | 22.4% | |
| At least one inconsistency | 14.2% | 14.1% | 71.2% | 0.1% | 12.4% | 6.2% | 95.5% | |