J MacRae, B Darlow, L McBain, O Jones, M Stubbe, N Turner, A Dowell.
Abstract
OBJECTIVE: To develop a natural language processing software inference algorithm to classify the content of primary care consultations using electronic health record Big Data and subsequently test the algorithm's ability to estimate the prevalence and burden of childhood respiratory illness in primary care.
Keywords: PRIMARY CARE
Year: 2015 PMID: 26297364 PMCID: PMC4550741 DOI: 10.1136/bmjopen-2015-008160
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Figure 1. Hierarchy for classification of consultations, using free text notes, diagnostic Read codes and medication prescription. GP, general practitioner; URTI, upper respiratory tract infection; LRTI, lower respiratory tract infection; Wheeze-ill, wheeze-related illness.
Respiratory classification categories and the conditions included in each
| Classification category | Respiratory conditions included within category* |
|---|---|
| Upper respiratory tract infections | Cold; Croup; Influenza-like illness; Viral influenza in the absence of associated signs or symptoms indicative of lower respiratory tract infection; Scarlet fever; Tracheitis; Cough in the absence of associated signs or symptoms indicative of asthma or lower respiratory tract infection |
| Lower respiratory tract infections | Bronchitis; Bronchopneumonia; Chest infection; Chronic lung disease; Cystic fibrosis; Lung abscess/bronchiectasis; Pertussis; Pleurisy; Pneumonia; Tuberculosis; Whooping cough |
| Wheeze-related illness | Bronchiolitis; Virus-induced transient wheeze; Persistent wheeze (non-atopic or atopic); Asthma |
| Throat infections | Infectious mononucleosis; Laryngitis; Pharyngitis; Pharyngotonsillitis; Tonsillitis |
| Otitis media | Acute otitis media; Chronic suppurative otitis media; Otitis media with effusion; Glue ear |
| Other respiratory | Conditions with very low prevalence for which there are not individual categories; Allergic rhinitis; Hay fever; Rhinitis; Sinusitis; Consultations in which respiratory symptoms are present but there is insufficient GP entered data to enable classification; Consultations in which respiratory symptoms are present with sufficient GP entered data to enable classification but the algorithm fails to classify the consultation |
*These classifications are based purely on the information within the electronic health record, including consultation notes, medications prescribed and diagnostic Read codes created on the day of the consultation. They do not include subsequent laboratory tests.
GP, general practitioner.
Figure 2. Use of independent consultation record sets to train, test and validate the algorithm. GP, general practitioner.
Gold standard consultation record sets
Blind agreement by GP clinical experts*

| Gold standard | Records included | Respiratory consultations identified | Agreement if consultation is respiratory or not respiratory | Complete agreement for all respiratory classifications included in consultations |
|---|---|---|---|---|
| Training set (set 1) | 1200 | 529 (0.44; 0.40 to 0.48) | 1139 (0.95; 0.93 to 0.97) | 1037 (0.86; 0.84 to 0.89) |
| Test set (set 2) | 1200 | 556 (0.46; 0.42 to 0.51) | 1146 (0.96; 0.94 to 0.97) | 1060 (0.88; 0.85 to 0.91) |
| Validation set (set 3) | 1200 | 553 (0.46; 0.42 to 0.51) | 1151 (0.96; 0.94 to 0.98) | 1046 (0.87; 0.84 to 0.90) |
Data are n (proportion; 95% CI).
*Consensus was reached for all records with discordant classifications following initial independent coding.
GP, general practitioner.
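
The proportions in the gold standard table can be recomputed from the counts, since each set contains 1200 records. The sketch below does only that. The source does not state how the 95% CIs were derived, so no interval method is assumed here. The names `TOTAL_RECORDS`, `reported` and `proportion` are illustrative and do not come from the paper.

```python
# Each set contains 1200 records (from the "Records included" column).
TOTAL_RECORDS = 1200

# (count, reported proportion) pairs transcribed from the table, in column order:
# respiratory consultations identified, respiratory/non-respiratory agreement,
# complete agreement on all respiratory classifications.
reported = {
    "Training set (set 1)": [(529, 0.44), (1139, 0.95), (1037, 0.86)],
    "Test set (set 2)": [(556, 0.46), (1146, 0.96), (1060, 0.88)],
    "Validation set (set 3)": [(553, 0.46), (1151, 0.96), (1046, 0.87)],
}


def proportion(n, total=TOTAL_RECORDS):
    """Return n as a fraction of the records in a set."""
    return n / total


for name, cells in reported.items():
    for n, p in cells:
        print(f"{name}: {n}/{TOTAL_RECORDS} = {proportion(n):.3f} (reported {p:.2f})")
```

Every recomputed value agrees with the reported proportion to within two-decimal rounding.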
Figure 3. Demographic characteristics of the funded population, the Normal Hours cohort and children enrolled in the practices that were included in the validation set (set 3).
Automated software inference algorithm measures of performance in the validation set (set 3)
| Diagnostic category | Incidence (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Positive predictive value (95% CI) | Negative predictive value (95% CI) | F-measure (95% CI) |
|---|---|---|---|---|---|---|
| Respiratory | 0.46 (0.42 to 0.50) | 0.72 (0.67 to 0.78) | 0.95 (0.93 to 0.98) | 0.93 (0.89 to 0.97) | 0.80 (0.76 to 0.84) | 0.81 (0.77 to 0.85) |
| LRTI | 0.04 (0.02 to 0.06) | 0.61 (0.39 to 0.83) | 0.99 (0.98 to 1.00) | 0.76 (0.55 to 0.95) | 0.98 (0.97 to 0.99) | 0.67 (0.47 to 0.85) |
| URTI | 0.21 (0.18 to 0.25) | 0.54 (0.45 to 0.64) | 0.98 (0.96 to 0.99) | 0.86 (0.78 to 0.94) | 0.89 (0.86 to 0.92) | 0.66 (0.57 to 0.74) |
| Wheeze ill | 0.09 (0.06 to 0.12) | 0.96 (0.90 to 1.00) | 0.96 (0.94 to 0.98) | 0.70 (0.59 to 0.82) | 1.00 (0.99 to 1.00) | 0.81 (0.73 to 0.89) |
| Throat infections | 0.10 (0.08 to 0.13) | 0.50 (0.37 to 0.64) | 0.99 (0.99 to 1.00) | 0.91 (0.79 to 1.00) | 0.95 (0.92 to 0.96) | 0.64 (0.51 to 0.76) |
| Otitis media | 0.12 (0.10 to 0.15) | 0.58 (0.45 to 0.71) | 0.99 (0.98 to 1.00) | 0.90 (0.81 to 1.00) | 0.94 (0.92 to 0.96) | 0.71 (0.59 to 0.81) |
| Other | 0.02 (0.01 to 0.04) | 0.66 (0.38 to 0.92) | 0.99 (0.98 to 1.00) | 0.68 (0.40 to 1.00) | 0.99 (0.98 to 1.00) | 0.66 (0.42 to 0.87) |
LRTI, lower respiratory tract infections; Other, other respiratory condition; URTI, upper respiratory tract infections; Wheeze ill, wheeze-related illness.
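
The final column of the performance table is numerically consistent with an F-measure, the harmonic mean of positive predictive value (precision) and sensitivity (recall). That reading is an inference from the numbers, not something this record states. A minimal sketch to check it against the point estimates, with `f_measure` and `rows` as illustrative names:

```python
def f_measure(ppv, sensitivity):
    """Harmonic mean of positive predictive value and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates transcribed from the table:
# category -> (sensitivity, PPV, value in the final column)
rows = {
    "Respiratory": (0.72, 0.93, 0.81),
    "LRTI": (0.61, 0.76, 0.67),
    "URTI": (0.54, 0.86, 0.66),
    "Wheeze ill": (0.96, 0.70, 0.81),
    "Throat infections": (0.50, 0.91, 0.64),
    "Otitis media": (0.58, 0.90, 0.71),
    "Other": (0.66, 0.68, 0.66),
}

for category, (sens, ppv, last_column) in rows.items():
    f = f_measure(ppv, sens)
    print(f"{category}: computed {f:.3f}, final column {last_column:.2f}")
```

Because the inputs are themselves rounded to two decimals, the computed values can differ from the tabulated ones by about 0.01. All seven rows fall within that margin.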