Houyu Zhao1, Jiaming Bian2, Li Wei3, Liuyi Li4, Yingqiu Ying5, Zeyu Zhang6, Xiaoying Yao1, Lin Zhuo7, Bin Cao6, Mei Zhang2, Siyan Zhan8.
Abstract
OBJECTIVE: We aimed to evaluate the validity of an algorithm to classify diagnoses according to the appropriateness of outpatient antibiotic use in the context of Chinese free text.
SETTING AND PARTICIPANTS: A random sample of 10 000 outpatient visits was selected between January and April 2018 from a national database for monitoring rational use of drugs, which included data from 194 secondary and tertiary hospitals in China.
Keywords: antibiotics; drug utilisation; electronic health records; prescriptions; validation
Year: 2020 PMID: 32198296 PMCID: PMC7103794 DOI: 10.1136/bmjopen-2019-031191
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Basic characteristics of the sample outpatient visits*
| Subgroups | No of prescriptions | Percentage (%) |
| Type of patients | ||
| Outpatient clinic | 9105 | 91.1 |
| Emergency department | 895 | 9.0 |
| Age group | ||
| 0–5 | 430 | 4.3 |
| 6–17 | 480 | 4.8 |
| 18–44 | 4180 | 41.8 |
| 45–64 | 3129 | 31.3 |
| ≥65 | 1742 | 17.4 |
| Unknown | 39 | 0.4 |
| Gender of patients | ||
| Female | 5099 | 51.0 |
| Male | 4893 | 48.9 |
| Unknown | 8 | 0.1 |
| Hospital level | ||
| Second | 711 | 7.1 |
| Third | 9289 | 92.9 |
| Region of China† | ||
| Eastern | 5476 | 54.8 |
| Central | 542 | 5.4 |
| Western | 3021 | 30.2 |
| North-eastern | 961 | 9.6 |
| Top five most used antibiotics | ||
| Cefdinir | 127 | 9.0 |
| Azithromycin | 126 | 8.9 |
| Levofloxacin | 111 | 7.8 |
| Cefixime | 103 | 7.3 |
| Moxifloxacin | 79 | 5.6 |
| No of diagnoses | ||
| Non-valid diagnosis‡ | 16 | 0.2 |
| One diagnosis | 6597 | 66.0 |
| Two diagnoses | 1921 | 19.2 |
| Three diagnoses | 753 | 7.5 |
| Four diagnoses | 330 | 3.3 |
| Five diagnoses | 163 | 1.6 |
| >5 diagnoses | 220 | 2.2 |
| Length of diagnosis text§ | ||
| Non-valid diagnosis | 16 | 0.2 |
| 1–4 characters | 3645 | 36.5 |
| 5–9 characters | 3662 | 36.6 |
| 10–14 characters | 1477 | 14.8 |
| 15–19 characters | 574 | 5.7 |
| ≥20 characters | 626 | 6.3 |
*For rounding reasons, the sum of the percentages of some subgroups may not be exactly equal to 100%.
†Regions of China were divided according to the National Bureau of Statistics of China. http://www.stats.gov.cn/ztjc/zthd/sjtjr/dejtjkfr/tjkp/201106/t20110613_71947.htm
‡Diagnoses that contained only numbers, punctuation (eg, comma, semicolon, exclamation mark) and other non-Chinese characters gave no information about the indication for antibiotics and were defined as non-valid diagnoses.
§Whitespace and punctuation were not counted.
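The two footnote rules above (a diagnosis is non-valid when it contains no Chinese characters, and text length ignores whitespace and punctuation) can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, "Chinese character" is assumed to mean the CJK Unified Ideographs block, and "punctuation" is assumed to mean the Unicode `P*` categories.

```python
import re
import unicodedata

# Assumption: "Chinese characters" = CJK Unified Ideographs (U+4E00 to U+9FFF).
CHINESE_CHAR = re.compile(r"[\u4e00-\u9fff]")


def is_valid_diagnosis(text: str) -> bool:
    """True if the diagnosis text contains at least one Chinese character."""
    return CHINESE_CHAR.search(text) is not None


def diagnosis_length(text: str) -> int:
    """Count characters, skipping whitespace and Unicode punctuation."""
    return sum(
        1
        for ch in text
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def length_bin(text: str) -> str:
    """Map a diagnosis text to the length subgroups used in the table."""
    if not is_valid_diagnosis(text):
        return "Non-valid diagnosis"
    n = diagnosis_length(text)
    if n <= 4:
        return "1–4 characters"
    if n <= 9:
        return "5–9 characters"
    if n <= 14:
        return "10–14 characters"
    if n <= 19:
        return "15–19 characters"
    return "≥20 characters"


print(length_bin("上呼吸道感染"))   # six counted characters
print(length_bin("肺炎， 高血压"))  # comma and space are skipped
print(length_bin("12;!"))           # no Chinese characters
```

Whether digits and symbols inside an otherwise valid diagnosis were counted is not stated in the footnotes; this sketch counts them.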
Confusion matrix for the regular expression (RE)-based diagnosis classification algorithm*
| Classified by the RE algorithm | Classified by manual prescription review | | | | |
| | Tier 1A | Tier 1B | Tier 2A | Tier 2B | Tier 3 |
| Tier 1A | 348 | 0 | 0 | 0 | 0 |
| Tier 1B | 1 | 41 | 0 | 0 | 0 |
| Tier 2A | 2 | 1 | 1074 | 0 | 0 |
| Tier 2B | 0 | 0 | 4 | 63 | 0 |
| Tier 3 | 4 | 0 | 17 | 1 | 8428 |
*Tier 1A: tier 1 diagnoses without uncertainty. Tier 1B: tier 1 diagnoses with uncertainty. Tier 2A: tier 2 diagnoses without uncertainty. Tier 2B: tier 2 diagnoses with uncertainty.
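The confusion matrix above is enough to derive one-vs-rest sensitivity, specificity, PPV and NPV for each tier. The sketch below does that arithmetic, assuming that manual review is the reference standard (columns) and the RE algorithm is the index test (rows). The function name `one_vs_rest` and the grouping of tiers 1A/1B and 2A/2B into "Tier 1" and "Tier 2" are illustrative choices, not taken from the source.

```python
TIERS = ["Tier 1A", "Tier 1B", "Tier 2A", "Tier 2B", "Tier 3"]

# Rows: RE algorithm. Columns: manual review (assumed reference standard).
MATRIX = [
    [348, 0, 0, 0, 0],
    [1, 41, 0, 0, 0],
    [2, 1, 1074, 0, 0],
    [0, 0, 4, 63, 0],
    [4, 0, 17, 1, 8428],
]


def one_vs_rest(matrix, idxs):
    """Collapse the tiers in `idxs` into 'positive' and compute the four measures (%)."""
    total = sum(map(sum, matrix))
    tp = sum(matrix[i][j] for i in idxs for j in idxs)
    predicted = sum(sum(matrix[i]) for i in idxs)          # algorithm says positive
    actual = sum(row[j] for row in matrix for j in idxs)   # manual review says positive
    fp = predicted - tp
    fn = actual - tp
    tn = total - tp - fp - fn
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


groups = {name: [i] for i, name in enumerate(TIERS)}
groups["Tier 1"] = [0, 1]  # tier 1A + tier 1B
groups["Tier 2"] = [2, 3]  # tier 2A + tier 2B

results = {name: one_vs_rest(MATRIX, idxs) for name, idxs in groups.items()}
for name, measures in results.items():
    print(name, {key: round(value, 1) for key, value in measures.items()})
```

For example, tier 1A sensitivity is 348/355 and tier 2B PPV is 63/67, because 355 visits were tier 1A on manual review and the algorithm assigned 67 visits to tier 2B.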
Validation of the regular expression-based classification algorithm
| Diagnosis tiers* | Sensitivity (%, 95% CI) | Specificity (%, 95% CI) | PPV (%, 95% CI) | NPV (%, 95% CI) |
| Tier 1A | 98.0 (96.0 to 99.2) | 100.0 (100.0 to 100.0) | 100.0 (98.9 to 100.0) | 99.9 (99.9 to 100.0) |
| Tier 1B | 97.6 (87.4 to 99.9) | 100.0 (99.9 to 100.0) | 97.6 (87.4 to 99.9) | 100.0 (99.9 to 100.0) |
| Tier 2A | 98.1 (97.1 to 98.8) | 100.0 (99.9 to 100.0) | 99.7 (99.2 to 99.9) | 99.8 (99.6 to 99.9) |
| Tier 2B | 98.4 (91.6 to 100.0) | 100.0 (99.9 to 100.0) | 94.0 (85.4 to 98.3) | 100.0 (99.9 to 100.0) |
| Tier 1 | 98.2 (96.4 to 99.3) | 100.0 (100.0 to 100.0) | 100.0 (99.1 to 100.0) | 99.9 (99.8 to 100.0) |
| Tier 2 | 98.4 (97.6 to 99.1) | 100.0 (99.9 to 100.0) | 99.7 (99.2 to 99.9) | 99.8 (99.7 to 99.9) |
| Tier 3 | 100.0 (100.0 to 100.0) | 98.6 (97.9 to 99.1) | 99.7 (99.6 to 99.8) | 100.0 (99.8 to 100.0) |
*Tier 1A: tier 1 diagnoses without uncertainty. Tier 1B: tier 1 diagnoses with uncertainty. Tier 2A: tier 2 diagnoses without uncertainty. Tier 2B: tier 2 diagnoses with uncertainty.
NPV, negative predictive value; PPV, positive predictive value.
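The 95% CIs in this table are intervals for binomial proportions. The method is not named in this excerpt; the sketch below assumes the exact (Clopper-Pearson) interval and solves for its limits by bisection using only the standard library. The helper names are hypothetical, and the direct summation is intended for small counts only (for large denominators such as tier 3, a statistics library would be the practical choice).

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation (small n only)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(func, target, increasing, iterations=100):
    """Bisection on [0, 1] for a monotone function of p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for k successes out of n trials."""
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | p) = alpha/2, which increases with p.
        lower = _solve(lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    if k == n:
        upper = 1.0
    else:
        # Upper limit: P(X <= k | p) = alpha/2, which decreases with p.
        upper = _solve(lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


# Tier 1B sensitivity: 41 of the 42 visits that manual review placed in tier 1B.
low, high = clopper_pearson(41, 42)
print(f"{100 * 41 / 42:.1f} ({100 * low:.1f} to {100 * high:.1f})")
```

Under this assumption, the tier 1B counts (41 of 42) give limits of roughly 87.4% and 99.9%, in line with the tabulated interval.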
Reasons for inaccurate classifications of the regular expression (RE)-based algorithm
| Classified by the RE algorithm* | Classified by manual prescription review* | Reasons for inaccurate classifications† |
| Tier 1B | Tier 1A | Multiple diagnoses incorrectly concatenated together (n=1). |
| Tier 2A | Tier 1A | The infectious disease written after the fifth diagnosis (n=2). |
| Tier 2A | Tier 1B | The infectious disease written after the fifth diagnosis (n=1). |
| Tier 2B | Tier 2A | Multiple diagnoses incorrectly concatenated together (n=4). |
| Tier 3 | Tier 1A | Single diagnosis improperly split (n=1); |
| Tier 3 | Tier 2A | Multiple diagnoses incorrectly concatenated together (n=5); |
| Tier 3 | Tier 2B | The infectious disease written after the fifth diagnosis (n=1); |
*Tier 1A: tier 1 diagnoses without uncertainty. Tier 1B: tier 1 diagnoses with uncertainty. Tier 2A: tier 2 diagnoses without uncertainty. Tier 2B: tier 2 diagnoses with uncertainty.
†In one visit (classified as tier 3 by the algorithm and tier 2A by manual review), one diagnosis was split into two and the second part was concatenated with another diagnosis; thus, the total number of incorrect classifications across the different reasons in this table was 31.