Tim Wilkinson, Christian Schnier, Kathryn Bush, Kristiina Rannikmäe, David E Henshall, Chris Lerpiniere, Naomi E Allen, Robin Flaig, Tom C Russ, Deborah Bathgate, Suvankar Pal, John T O'Brien, Cathie L M Sudlow.
Abstract
Prospective, population-based studies that recruit participants in mid-life are valuable resources for dementia research. Follow-up in these studies is often through linkage to routinely collected healthcare datasets. We investigated the accuracy of these datasets for dementia case ascertainment in a validation study using data from UK Biobank, an open-access, population-based study of > 500,000 adults aged 40-69 years at recruitment in 2006-2010. From 17,198 UK Biobank participants recruited in Edinburgh, we identified those with ≥ 1 dementia code in their linked primary care, hospital admissions or mortality data and compared their coded diagnoses to clinical expert adjudication of their full-text medical record. We calculated the positive predictive value (PPV, the proportion of cases identified that were true positives) for all-cause dementia, Alzheimer's disease and vascular dementia for each dataset alone and in combination, and explored algorithmic code combinations to improve PPV. Among 120 participants, PPVs for all-cause dementia were 86.8%, 87.3% and 80.0% for primary care, hospital admissions and mortality data respectively, and 82.5% across all datasets. We identified three algorithms that balanced a high PPV with reasonable case ascertainment. For Alzheimer's disease, PPVs were 74.1% for primary care, 68.2% for hospital admissions, 50.0% for mortality data and 71.4% in combination. PPV for vascular dementia was 43.8% across all sources. UK routinely collected healthcare data can be used to identify all-cause dementia in prospective studies. PPVs for Alzheimer's disease and vascular dementia are lower. Further research is required to explore the geographic generalisability of these findings.
Keywords: Alzheimer disease; Cohort studies; Data accuracy; Dementia; Predictive value of tests; Validation studies
Year: 2019 PMID: 30806901 PMCID: PMC6497624 DOI: 10.1007/s10654-019-00499-1
Source DB: PubMed Journal: Eur J Epidemiol ISSN: 0393-2990 Impact factor: 8.082
Fig. 1 Flow diagram of participant selection
Fig. 2 Area proportional Euler diagram indicating the datasets from which dementia cases were identified (n = 120). Distribution of cases identified at any time until end of follow-up (September 2015)
Fig. 3 Adjudicator subtype diagnoses of the 99 cases adjudicated to have dementia. AD Alzheimer’s disease, VaD vascular dementia, Mixed mixed AD/VaD, DLB/PDD dementia with Lewy bodies and Parkinson’s disease dementia, FTD frontotemporal dementia
Adjudicator agreement
| Agreement | Number agreed/total number | Percentage agreement (%) | Kappa coefficient (95% CI) |
|---|---|---|---|
| All-cause dementia | 27/30 | 90 | 0.76 (0.48–1.00) |
| Dementia subtypes (a) | 13/20 | 65 | 0.57 (0.29–0.84) |
Percentage agreement and kappa coefficients for whether adjudicators agreed on the presence or absence of dementia or the particular subtype
(a) Among 20 cases where both adjudicators thought that all-cause dementia was present
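The relationship between the percentage agreement and the kappa coefficient in the table above can be made explicit with a short worked equation. This is only a sketch: it assumes the reported coefficient is Cohen's kappa, and the symbols $p_o$ (observed agreement) and $p_e$ (agreement expected by chance) are introduced here for illustration. Because the published kappa values are rounded, the implied $p_e$ values are approximate.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\quad\Longrightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa}

% All-cause dementia: p_o = 27/30 = 0.90, \kappa = 0.76
p_e \approx \frac{0.90 - 0.76}{1 - 0.76} = \frac{0.14}{0.24} \approx 0.58

% Dementia subtypes: p_o = 13/20 = 0.65, \kappa = 0.57
p_e \approx \frac{0.65 - 0.57}{1 - 0.57} = \frac{0.08}{0.43} \approx 0.19
```

Under these assumptions, chance agreement is much higher for the two-category question (dementia present or absent) than for the multi-category subtype question, which is why a 25-point gap in raw agreement narrows to a smaller gap in kappa.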
Fig. 4 Positive predictive values for datasets, alone and in combination, stratified by dementia subtype. FTD frontotemporal dementia, PDD Parkinson’s disease dementia, DLB dementia with Lewy bodies. PPVs displayed for datasets where n ≥ 5. FTD, PDD and DLB combined into ‘other dementias’ category due to small numbers for each disease alone
Fig. 5 Effect of additional criteria on positive predictive values and numbers of cases ascertained. FTD frontotemporal dementia, PDD Parkinson’s disease dementia, DLB dementia with Lewy bodies. ≥ 10 and ≥ 20 Alzheimer’s disease codes not shown due to small numbers (< 5). *Any Alzheimer’s disease, vascular dementia, FTD, PDD or DLB code to identify dementia of any cause
Positive predictive value and case ascertainment in suggested algorithms to identify all-cause dementia cases in UK Biobank
| Algorithm | Number of codes required | Dataset | PPV (95% CI) | Total (TP) cases identified |
|---|---|---|---|---|
| Any dementia code in any dataset | ≥ 1 code in any dataset | P, H & M | 82.5% (74.5–88.8) | 120 (99) |
| Two or more dementia codes in any dataset | ≥ 2 codes in any dataset | P, H & M | 88.5% (80.4–94.1) | 96 (85) |
| Any diagnostic code in primary care data* | ≥ 1 diagnostic code | P | 90.1% (82.5–95.1) | 101 (92) |
P Primary care, H hospital admissions, M mortality, PPV positive predictive value, CI confidence intervals. TP true positive
*Administrative Read codes excluded
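As an illustration of how the PPV column relates to the "Total (TP)" column, the sketch below recomputes PPV as true positives divided by total cases identified, using the counts from the first two rows of the table. The confidence interval method is an assumption: the record does not state which one the authors used, so an exact binomial (Clopper-Pearson) interval is shown as one common choice. The function names `ppv`, `binom_cdf` and `exact_ci` are hypothetical helpers written for this example.

```python
from math import comb


def ppv(true_positives, total):
    """Positive predictive value: true positives / all cases identified."""
    return true_positives / total


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=60):
    """Root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(true_positives, total, alpha=0.05):
    """Clopper-Pearson interval (assumed method, not confirmed by the source)."""
    k, n = true_positives, total
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Counts taken from the "Total (TP) cases identified" column.
rows = {
    "Any dementia code in any dataset": (99, 120),
    "Two or more dementia codes in any dataset": (85, 96),
}

for name, (tp, total) in rows.items():
    low, high = exact_ci(tp, total)
    print(f"{name}: PPV {100 * ppv(tp, total):.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f})")
```

The point estimates follow directly from the counts (99/120 and 85/96); the interval bounds depend on the assumed method and may differ slightly from the published values if the authors used a different one.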
Demographics of participants who were adjudicated to be false positives, true positives and whole validation group
| Group | Number of participants | Median age at recruitment (range) | Female (%) | Median age at first code (range) | Median number of codes (range) | Died during follow-up (%) | Median TDI (range) |
|---|---|---|---|---|---|---|---|
| All | 120 | 67 years (43–70) | 64 (53.3) | 70 years (41–77) | 5 (1–35) | 25 (20.8) | 1 (1–5) |
| True positives | 99 | 67 years (51–70) | 54 (54.5) | 71 years (52–77) | 6 (1–35) | 20 (20.2) | 1 (1–5) |
| False positives | 21 | 67 years (43–70) | 10 (47.6) | 68 years (41–76) | 2 (1–8) | 5 (23.8) | 3 (1–5) |
TDI Townsend deprivation index, divided into quintiles (1 = lowest deprivation, 5 = highest deprivation) based on 2001 census data [18]