Shang-Ming Zhou1, Fabiola Fernandez-Gutierrez1, Jonathan Kennedy1, Roxanne Cooksey1, Mark Atkinson1, Spiros Denaxas2, Stefan Siebert3, William G Dixon4, Terence W O'Neill4, Ernest Choy5, Cathie Sudlow6, Sinead Brophy1.
Abstract
OBJECTIVES: 1) To use a data-driven method to examine clinical codes (risk factors) of a medical condition in primary care electronic health records (EHRs) that can accurately predict a diagnosis of the condition in secondary care EHRs. 2) To develop and validate a disease phenotyping algorithm for rheumatoid arthritis using primary care EHRs.
Year: 2016 PMID: 27135409 PMCID: PMC4852928 DOI: 10.1371/journal.pone.0154515
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Performance of the data mining algorithm under worst-case and best-case assumptions.
| | Outpatient data (Cellma-ABMU): RA | Outpatient data: Not RA | Outpatient data: No data |
|---|---|---|---|
| GP data: RA | 1323 | 396 | 2560 |
| GP data: Not RA | 265 | 10084 | 459727 |
| GP data: No data | 138 | 1087 | |
| Worst case | Best case | | |
*Worst case: a patient with no outpatient record is assumed not to have RA.
**Best case: a patient with no outpatient record is assumed to have RA that is treated elsewhere.
Performance of the QOF criteria under worst-case and best-case assumptions.
| | Outpatient data (Cellma-ABMU): RA | Outpatient data: Not RA | Outpatient data: No data |
|---|---|---|---|
| GP data: RA | 1377 | 513 | 2851 |
| GP data: Not RA | 211 | 9967 | 459436 |
| GP data: No data | 138 | 1087 | |
| Worst case: sensitivity 86.7%, specificity 99%, positive predictive value 29% | Best case: sensitivity 95%, specificity 99%, positive predictive value 89% | | |
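The worst-case and best-case figures for the QOF criteria can be traced back to the counts in the table. The sketch below is one plausible reading, not the authors' published code: it assumes the first data row holds GP-positive patients and the second GP-negative patients, that the "no data" column of GP-positive patients is counted as false positives in the worst case and as true positives in the best case, and that GP-negative patients without outpatient data are treated as true negatives in both cases. The names `Counts` and `metrics` are hypothetical. Under these assumptions the reported sensitivity and positive predictive value are reproduced after rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """GP classification cross-tabulated against outpatient status."""
    pos_ra: int       # GP positive, outpatient RA
    pos_not_ra: int   # GP positive, outpatient not RA
    pos_no_data: int  # GP positive, no outpatient record
    neg_ra: int       # GP negative, outpatient RA
    neg_not_ra: int   # GP negative, outpatient not RA
    neg_no_data: int  # GP negative, no outpatient record


def metrics(c: Counts, best_case: bool) -> dict[str, float]:
    """Sensitivity, specificity and PPV under the worst- or best-case reading."""
    if best_case:
        # Best case: GP-positive patients without an outpatient record
        # are assumed to have RA that is treated elsewhere.
        tp = c.pos_ra + c.pos_no_data
        fp = c.pos_not_ra
    else:
        # Worst case: no outpatient record is taken to mean "not RA".
        tp = c.pos_ra
        fp = c.pos_not_ra + c.pos_no_data
    fn = c.neg_ra
    # Assumption: GP-negative patients without outpatient data are
    # counted as true negatives in both cases.
    tn = c.neg_not_ra + c.neg_no_data
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Counts taken from the QOF table above.
QOF = Counts(pos_ra=1377, pos_not_ra=513, pos_no_data=2851,
             neg_ra=211, neg_not_ra=9967, neg_no_data=459436)

if __name__ == "__main__":
    for label, best in (("worst case", False), ("best case", True)):
        result = metrics(QOF, best)
        print(label, {k: round(100 * v, 1) for k, v in result.items()})
```

The same function can be applied to the counts of the other two algorithms by constructing a `Counts` instance from the corresponding table.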
Performance of the expert knowledge-driven algorithm under worst-case and best-case assumptions.
| | Outpatient data (Cellma-ABMU): RA | Outpatient data: Not RA | Outpatient data: No data |
|---|---|---|---|
| GP data: RA | 1333 | 570 | 2139 |
| GP data: Not RA | 255 | 9910 | 460148 |
| GP data: No data | 138 | 1087 | |
| Worst case: sensitivity 83.9%, specificity 99%, positive predictive value 28% | Best case: sensitivity 94.2%, specificity 99%, positive predictive value 88% | | |
Fig 1. RA phenotyping model: (Left) C5.0 decision tree. The value of a variable is the number of its occurrences. At each final node, RA = 1 indicates a positive classification for RA and RA = 0 a negative classification. (Right) Decision rules generated from the C5.0 decision tree.
Algorithm code definitions.
| Variable | Definition |
|---|---|
| RARTH_COD | Rheumatoid arthritis codes as defined in the NHS Wales QOF Rheumatoid Arthritis Indicator Set (RA Indicator Set NHS QOF Wales, 2014). |
| INTENSITY_RA | A categorical variable with 5 levels, inspired by the reasoning of Thomas et al. (2008) for defining the intensity of rheumatoid conditions. Class 1 is the most severe diagnosis of rheumatoid arthritis and includes sero-positive rheumatoid arthritis. Class 2 contains codes related to the classification of rheumatoid arthritis. Class 3 contains conditions that are related to rheumatoid arthritis, such as rheumatoid nodule. Class 4 identifies patients who have sero-negative rheumatoid arthritis. The final class, class 9, identifies patients who do not have any history of rheumatoid arthritis. |
| ALTERNATIVE_RA | Alternative arthropathy Read codes, inspired by the reasoning of Thomas et al. (2008). |
| PSORIATIC_CD | Psoriatic arthritis Read codes. |
| PREDNISOLONE | Read codes related to different dosages of prednisolone. Prednisolone is a corticosteroid drug for controlling a local flare in a joint, and it is used effectively in combination with other drugs (Ter Wee et al., 2014). |
| METHOTREXATE_CD | Disease-modifying antirheumatic drugs (DMARDs) grouped by their active ingredient: methotrexate. |
| SULPHASALAZINE_CD | DMARDs grouped by their active ingredient: sulphasalazine. |
| LEFLUNOMIDE_CD | DMARDs grouped by their active ingredient: leflunomide. |
Fig 2. Prevalence of RA for 2000–2010 in the ABMU region, Wales, UK.