
Systematic review of validated case definitions for diabetes in ICD-9-coded and ICD-10-coded data in adult populations.

Bushra Khokhar1, Nathalie Jette2, Amy Metcalfe3, Ceara Tess Cunningham4, Hude Quan1, Gilaad G Kaplan1, Sonia Butalia5, Doreen Rabi6.   

Abstract

OBJECTIVES: With steady increases in 'big data' and data analytics over the past two decades, administrative health databases have become more accessible and are now used regularly for diabetes surveillance. The objective of this study is to systematically review validated International Classification of Diseases (ICD)-based case definitions for diabetes in the adult population. SETTING, PARTICIPANTS AND OUTCOME MEASURES: Electronic databases, MEDLINE and Embase, were searched for validation studies where an administrative case definition (using ICD codes) for diabetes in adults was validated against a reference and statistical measures of the performance reported.
RESULTS: The search yielded 2895 abstracts, and of the 193 potentially relevant studies, 16 met criteria. Diabetes definition for adults varied by data source, including physician claims (sensitivity ranged from 26.9% to 97%, specificity ranged from 94.3% to 99.4%, positive predictive value (PPV) ranged from 71.4% to 96.2%, negative predictive value (NPV) ranged from 95% to 99.6% and κ ranged from 0.8 to 0.9), hospital discharge data (sensitivity ranged from 59.1% to 92.6%, specificity ranged from 95.5% to 99%, PPV ranged from 62.5% to 96%, NPV ranged from 90.8% to 99% and κ ranged from 0.6 to 0.9) and a combination of both (sensitivity ranged from 57% to 95.6%, specificity ranged from 88% to 98.5%, PPV ranged from 54% to 80%, NPV ranged from 98% to 99.6% and κ ranged from 0.7 to 0.8).
CONCLUSIONS: Overall, administrative health databases are useful for undertaking diabetes surveillance, but an awareness of the variation in performance being affected by case definition is essential. The performance characteristics of these case definitions depend on the variations in the definition of primary diagnosis in ICD-coded discharge data and/or the methodology adopted by the healthcare facility to extract information from patient records. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/


Keywords:  administrative data; case definition; diabetes; validation studies


Year:  2016        PMID: 27496226      PMCID: PMC4985868          DOI: 10.1136/bmjopen-2015-009952

Source DB:  PubMed          Journal:  BMJ Open        ISSN: 2044-6055            Impact factor:   2.692


Our systematic review was comprehensive as it had a broad search strategy that bore no language or time restriction. All included studies captured patient information at the population level with clear case definitions encompassing a broad spectrum of patients. There is the potential for a language bias as studies where full texts were not available in English were not considered. There are potential limitations for all reference standards used to validate administrative definitions for diabetes.

Background

Diabetes is a chronic disease that has increased substantially during the past 20 years.1 At present, diabetes is the leading cause of blindness,2 renal failure3 and non-traumatic lower limb amputations4 and is a major risk factor for cardiovascular disease.5 Owing to its chronic nature, the severity of its complications and the means required to control it, diabetes is a costly disease. The healthcare costs associated with this condition are substantial and can account for up to 15% of national healthcare budgets.6 Understanding the distribution of diabetes and its complications in a population is important to understand disease burden and to plan for effective disease management. Diabetes surveillance systems using administrative data can efficiently and readily analyse routinely collected health-related information from healthcare systems, provide reports on risk factors, care practices, morbidity and mortality and estimate incidence and prevalence at a population level.7 With steady increases in ‘big data’ and data analytics over the past two decades, administrative health databases have become more accessible to health services researchers and are now used regularly to study the processes and outcomes of healthcare. However, administrative health data are not collected primarily for research or surveillance. There is a need for health administrative data users to examine the validity of case ascertainment in their data sources before use.8 By definition, surveillance depends on a valid case definition that is applied constantly over time. A case definition is a set of uniform criteria used to define a disease for surveillance.9 However, a variety of diabetes case definitions exist, resulting in variation in reported diabetes prevalence estimates.
A systematic review and meta-analysis of validation studies on diabetes case definitions from administrative records has been performed.10 This review aimed to determine the sensitivity and specificity of a commonly used diabetes case definition, “two physician claims or one hospital discharge abstract record within a two-year period” and its potential effect on diabetes prevalence estimation. Our study extends this body of work by systematically reviewing validated International Classification of Diseases (ICD), 9th edition (ICD-9)-based and ICD-10-based case definitions for diabetes and comparing the validity of different case definitions across studies and countries.
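As a rough illustration of how the commonly used definition quoted above ("two physician claims or one hospital discharge abstract record within a two-year period") might be applied to one patient's records, the following minimal sketch may help. The record layout, the function names and the 730-day window are assumptions made for this example and are not taken from any of the reviewed studies; the code families (ICD-9 250.x, ICD-10 E10 to E14) are those listed in the review's tables.

```python
from datetime import date, timedelta

# Hypothetical record layout (an assumption for this sketch):
# (patient_id, source, service_date, icd_code), source is "physician" or "hospital".

def is_diabetes_code(code: str) -> bool:
    """Return True for ICD-9 250.x or ICD-10 E10-E14 codes."""
    c = code.strip().upper()
    return c.startswith("250") or c[:3] in {"E10", "E11", "E12", "E13", "E14"}

def meets_case_definition(records, window_days: int = 730) -> bool:
    """Apply 'one hospital record OR two physician claims within the window'
    to the records of a single patient."""
    # Keep only diabetes-coded encounters, ordered by date.
    hits = sorted((d, s) for (_pid, s, d, code) in records if is_diabetes_code(code))
    if any(source == "hospital" for _d, source in hits):
        return True
    claim_dates = [d for d, source in hits if source == "physician"]
    # With sorted dates, checking adjacent pairs is enough to find any two
    # claims that fall within the window.
    return any(later - earlier <= timedelta(days=window_days)
               for earlier, later in zip(claim_dates, claim_dates[1:]))
```

In this sketch two claims on the same day count as two claims; a real surveillance system may treat that case differently.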

Methods

Search strategy

This systematic review was performed using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines11 (see online supplementary appendix A). Two citation databases, MEDLINE and Embase, were searched using an OVID platform from 1980 until September 2015. The search strategy consisted of the following set of terms (see online supplementary appendix B): (1) (health services research or administrative data or hospital discharge data or ICD-9 or ICD-10 or medical record or health information or surveillance or physician claims or claims or hospital discharge or coding or codes) AND (2) (validity or validation or case definition or algorithm or agreement or accuracy or sensitivity or specificity or positive predictive value or negative predictive value) AND (3) medical subject heading terms for diabetes. Searches were limited to human studies published in English. The broad nature of the search strategy allowed for the detection of modifications of ICD codes, such as international clinical modification (eg, ICD-9-CM).

Study selection

Studies were evaluated in duplicate for eligibility in a two-stage procedure. In stage 1, all identified titles and abstracts were reviewed and in stage 2, a full text review was performed on all studies that met the predefined eligibility criteria. If either reviewer defined a study as eligible in stage 1, it was included in the full text review in stage 2. Disagreements were resolved by discussion or consultation with a third reviewer.

Inclusion/exclusion criteria

A study was included in the systematic review if it met the following criteria: (1) the study population included those ≥18 years of age with type 1 diabetes mellitus or type 2 diabetes mellitus; (2) statistical estimates (sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) or κ) were reported or could be calculated; (3) an ICD-9 or ICD-10 case definition for diabetes was reported and validated; (4) a satisfactory reference standard (eg, self-report from population-based surveys or patient medical chart reviews) was used and (5) it reported on original data. Studies validating diabetes in specialised populations (eg, cardiovascular disease) were excluded to ensure that the diabetes case definitions would be generalisable. Studies whose diabetes case definition did not rely solely on medical encounter data (eg, those that included pharmacy or laboratory data) were also excluded, as the independent validity of such definitions could not be calculated. Bibliographies of included studies were manually searched for additional studies, which were then screened and reviewed using the same methods described above.

Data extraction and quality assessment

Primary outcomes were sensitivity, specificity, PPV, NPV and κ reported for each of the ICD-coded diabetes case definitions. Other extracted data included sample size and ICD codes used. If statistical estimates were not reported in the original paper, estimates were calculated from data available. Calculating a pooled estimate of surveillance performance measures using meta-analytic techniques was deemed inappropriate given the heterogeneity of diabetes case definitions and reference standards used across studies. Data were tabulated by the type of administrative health data used. Study quality was evaluated using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS) criteria.12
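Where a paper reports only the 2×2 cross-classification of the administrative case definition against the reference standard, the five measures can be derived from the cell counts. The sketch below uses the standard textbook formulas; the function name and argument names are hypothetical, and it is not claimed to be the procedure the authors used for their hand calculations.

```python
def validity_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute sensitivity, specificity, PPV, NPV and Cohen's kappa
    from a 2x2 table (administrative definition vs reference standard)."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # reference-positive cases flagged by the definition
    specificity = tn / (tn + fp)   # reference-negative cases not flagged
    ppv = tp / (tp + fp)           # flagged cases that are reference-positive
    npv = tn / (tn + fn)           # unflagged cases that are reference-negative
    observed = (tp + tn) / n       # observed agreement
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }
```

The function assumes every marginal total is non-zero; a table with an empty row or column would need separate handling.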

Results

Identification and description of studies

A total of 2895 abstracts were identified with 193 studies reviewed in full text, of which 16 studies met all eligibility criteria (figure 1). Eight of these studies were conducted in the USA,13–20 seven in Canada21–27 and one in Australia.28 Thirteen studies used ICD-9 codes,13–19 21–23 26–28 and the remaining three studies used ICD-9 and ICD-10 codes.23–25 None of the studies differentiated or commented as to whether a particular code of interest was in the primary or in one of the secondary diagnostic positions. Of the 16 studies reviewed, 8 used medical records13 14 21 23–26 28 and 8 used either self-reported surveys or telephone surveys to validate the diabetes diagnosis.15–20 22–27 Eight studies used physician claims data,13–16 18–20 23 four studies used hospital discharge data22 24 26 28 and four studies used a combination of both.17 21 25 27 Two studies used electronic medical records (EMRs) as their health data source,29 30 but these were removed from the review since EMRs were not a part of our search strategy.
Figure 1

Study flow chart. ICD, International Classification of Diseases.

The QUADAS scores (table 1) ranged from 9 to 13 of a maximum of 14. Five questions were selected from QUADAS to constitute the ‘bias assessment’. Regardless of quality assessment scores, all 16 studies are discussed in this systematic review.
Table 1

Study quality characteristics using QUADAS tool

| QUADAS tool item | Hux et al21 | Robinson et al22 | Borzecki et al13 | Wilchesky et al23 | Crane et al14 | So et al24 | Chen et al25 | Nedkoff et al28 | Quan et al26 | Young et al27 | Hebert et al15 | Ngo et al16 | Rector et al17 | Miller18 | Singh19 | O’Connor et al20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Was the spectrum of patients representative of the patients who will receive the test in practice? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Were selection criteria clearly described? | Yes | Yes | No | No | Yes | No | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes |
| Is the reference standard likely to correctly classify the target condition? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Is the time period between reference standard and index test short enough to be reasonably sure that the target condition did not change between the two tests? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Did the whole sample or a random selection of the sample, receive verification using a reference standard of diagnosis? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Did patients receive the same reference standard regardless of the index test result? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Was the reference standard independent of the index test (ie, the index test did not form part of the reference standard)? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Was the execution of the index test described in sufficient detail to permit replication of the test? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Was the execution of the reference standard described in sufficient detail to permit its replication? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Were the index test results interpreted without knowledge of the results of the reference standard? | Yes | Yes | Yes | Yes | Unclear | Unclear | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Were the reference standard results interpreted without knowledge of the results of the index test? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Were the same clinical data available when test results were interpreted as would be available when the test is used in practice? | Unclear | Unclear | Yes | Yes | Yes | Yes | Unclear | Yes | Yes | Yes | Unclear | Yes | Yes | Yes | Yes | Unclear |
| Were uninterpretable/intermediate test results reported? | No | No | No | Yes | No | No | No | No | No | No | No | No | No | No | No | Yes |
| Were withdrawals from the study explained? | Unclear | Unclear | No | No | No | No | Unclear | Yes | Unclear | Unclear | Unclear | No | No | No | Unclear | Unclear |
| Score (maximum 14) | 11 | 11 | 10 | 12 | 10 | 9 | 11 | 13 | 12 | 12 | 11 | 11 | 12 | 12 | 12 | 12 |
| Bias assessment (maximum 5) | 5 | 5 | 5 | 5 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 5 |

QUADAS tool is extracted from table 2 of Whiting et al.12

QUADAS, Quality Assessment Tool for Diagnostic Accuracy Studies.

The sample size varied from 93 to ∼3 million people. Sensitivity and specificity values were available from all 18 studies, PPV in 16 studies, NPV in 12 studies and κ in 6 studies. All 16 studies were categorised by the type of administrative health data source being used.

Physician claims data

Table 2 lists the eight studies13–16 18–20 23 using physician claims data. In these studies, the sensitivity ranged from 26.9% to 97%, specificity ranged from 94.3% to 99.4%, PPV ranged from 71.4% to 96.2%, NPV ranged from 95% to 99.6% and κ ranged from 0.8 to 0.9. Four of the eight studies using physician claims data had at least one diabetes case definition where sensitivity and specificity exceeded 80%.
Table 2

Study characteristics and test measures of studies for physician claims data

| Country | Study years | Author (reference) | Reference | Type of administrative data | Diabetes case definition | ICD codes used | Study, N | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | κ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Canada | 1995–1996 | Wilchesky et al23 | Medical chart | Physician claims | Using only diagnoses recorded in the claims of study physicians | ICD-9 250.0-.9 | 2752 | 51.78 (49.9 to 53.6) | 98.41 (98.2 to 98.6) | | | |
| | | | | | Using diagnostic codes recorded on claims made by all physicians who provided medical services to patients in the year prior to the start of the study | ICD-9 250.0-.9 | | 64.43 (62.6 to 66.2) | 96.82 (96.5 to 97.1) | | | |
| USA | 1997–2001 | Crane et al14 | Clinician documentation in EMR progress notes | Physician claims | At least one clinician-coded diagnoses | ICD 9 250.0, .1, .2, .3 | 1441 | 93 (86 to 100) | 99 (99 to 100) | 91 (83 to 99) | | |
| USA | 1998–1999 | Borzecki et al13 | Medical charts | Physician claims | At least one diagnosis in National Department of Veterans Affairs (VA) database, Outpatient Clinic file over 1 year | ICD 9 250.x | 1176 | 97 | 96 | | | 0.92 |
| | | | | | At least two diagnoses in National Department of Veterans Affairs (VA) database, Outpatient Clinic file over 1 year | ICD 9 250.x | | | | | | 0.91 |
| | | | | | At least one diagnosis in National Department of Veterans Affairs (VA) database, Outpatient Clinic file over 2 years | ICD 9 250.x | | | | | | 0.89 |
| | | | | | At least two diagnoses in National Department of Veterans Affairs (VA) database, Outpatient Clinic file over 2 years | ICD 9 250.x | | | | | | 0.93 |
| USA | 1992–1995 | Hebert et al15 | Self-reported survey | Physician claims | One or more diagnoses of diabetes in any claim file over 1-year period | ICD 9-CM 250.00-.93, 357.2, 362.0-362.02, 366.41 | | 71.6 | 96.6 | 79 | | |
| | | | | | One or more diagnoses of diabetes in any claim file over 2-year period | ICD 9-CM 250.00-.93, 357.2, 362.0-362.02, 366.41 | | 79.1 | 94.3 | 71.4 | | |
| USA | 1993–1994 | O’Connor et al20 | Telephone survey | Physician claims | Two or more ICD-9 diagnostic codes | ICD 9 250.x | 1976 | 92.22* | 98.62* | 76.15* | 99.63* | |
| USA | 1996–1998 | Singh19 | Self-reported survey | Physician claims | Veterans Affairs databases | ICD 9 250 | | 76 (75 to 76) | 98 (98 to 98) | 91 (91 to 91) | 95 (94 to 95) | 0.79 (0.79 to 0.80) |
| USA | 1997 | Ngo et al16 | Self-reported survey | Physician claims | Oregon Medicaid Claims Data, any claim ≤24 months before interview with a diabetes diagnosis code | ICD 9 250, 357.2, 362, 366.41 | 21 564 | 83.9 | 97.9 | 81.9 | 98.2 | 0.81 (0.77 to 0.85) |
| | | | | | Oregon Medicaid Claims Data, any claim ≤12 months before interview with a diabetes diagnosis code | ICD 9 250, 357.2, 362, 366.41 | | 88.7 | 97.4 | 76.4 | 98.9 | 0.8 (0.76 to 0.85) |
| USA | 1997–2000 | Miller et al18 | Self-reported survey | Physician claims (Medicare) | Any diagnostic code | ICD 9 250, 357.2, 362.0, 366.41 | 2 924 148 | 78.3 | 95.7 | 85.3 | | |
| | | | | | Any outpatient diagnostic code | ICD 9 250, 357.2, 362.0, 366.41 | | 77.5 | 95.9 | 85.8 | | |
| | | | | | ≥2 any diagnostic code | ICD 9 250, 357.2, 362.0, 366.41 | | 73.1 | 98.3 | 93.4 | | |
| | | | | | ≥2 outpatient codes | ICD 9 250, 357.2, 362.0, 366.41 | | 72.2 | 98.4 | 93.7 | | |
| | | | | | ≥3 any diagnostic code | ICD 9 250, 357.2, 362.0, 366.41 | | 69 | 98.4 | 95.2 | | |
| | | | | | ≥3 outpatient codes | ICD 9 250, 357.2, 362.0, 366.41 | | 68 | 98.9 | 95.4 | | |
| | | | | | ≥4 any diagnostic code | ICD 9 250, 357.2, 362.0, 366.41 | | 65 | 99.1 | 96 | | |
| | | | | | ≥4 outpatient codes | ICD 9 250, 357.2, 362.0, 366.41 | | 63.8 | 99.2 | 96.2 | | |

Superior performance characteristics within studies have been highlighted in bold.

*Sensitivity, specificity, PPV and NPV are all hand-calculated:

sensitivity identifies the proportion of patients who truly do have the disease/condition;

specificity identifies the proportion of patients who truly do not have the disease/condition;

PPV is the probability that participants with a positive screening test truly have the disease/condition;

NPV is the probability that participants with a negative screening test truly do not have the disease/condition;

κ is an inter-rater agreement statistic to evaluate the agreement between two classifications on ordinal or nominal scales.

EMR, electronic medical record; ICD 10-AM, International Classification of Diseases, Tenth Revision, Australian Modification; ICD, International Classification of Diseases; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; NPV, negative predictive value; PPV, positive predictive value.

Studies comparing physician claims-based case definitions over multiple years13 15 16 consistently show increases in sensitivity and a slight decrease in specificity and PPV over time. This relationship is consistent with the study18 that examined how the statistical estimates change as the number of diagnostic codes required by the case definition increases: sensitivity was highest when any diagnostic code (inpatient or outpatient) was used, whereas specificity and PPV were highest when the largest number of outpatient diagnostic codes was required.
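The direction of this trade-off can be illustrated with a toy calculation. The patient list below is synthetic and purely illustrative (it is not data from any reviewed study), and the function name is hypothetical; the sketch only shows that raising the minimum number of diabetes-coded claims required by a definition cannot increase sensitivity and cannot decrease specificity.

```python
# Synthetic, illustrative data only:
# (number of diabetes-coded claims, has diabetes per reference standard)
patients = (
    [(0, False)] * 90 + [(1, False)] * 8 + [(2, False)] * 2
    + [(0, True)] * 2 + [(1, True)] * 3 + [(2, True)] * 5 + [(4, True)] * 10
)

def sens_spec(patients, min_claims: int):
    """Sensitivity and specificity of the rule 'at least min_claims claims'."""
    tp = sum(1 for n, diseased in patients if diseased and n >= min_claims)
    fn = sum(1 for n, diseased in patients if diseased and n < min_claims)
    tn = sum(1 for n, diseased in patients if not diseased and n < min_claims)
    fp = sum(1 for n, diseased in patients if not diseased and n >= min_claims)
    return tp / (tp + fn), tn / (tn + fp)

for k in (1, 2, 3, 4):
    sens, spec = sens_spec(patients, k)
    print(f">= {k} claims: sensitivity {sens:.2f}, specificity {spec:.2f}")
```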

Hospital discharge data

Table 3 lists the four studies22 24 26 28 using only hospital discharge data. In these studies, the sensitivity ranged from 59.1% to 92.6%, specificity ranged from 95.5% to 99%, PPV ranged from 62.5% to 96%, NPV ranged from 90.8% to 99% and κ ranged from 0.6 to 0.9. Two of the four studies using hospital discharge data had at least one diabetes case definition where sensitivity and specificity exceeded 80%. In contrast to the physician claims-based case definitions, the sensitivity seemed to improve when a longer duration was used in the case definition; however, the specificity and the PPV behaved inversely.
Table 3

Study characteristics and test measures of studies for hospital discharge data

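| Country | Study years | Author (reference) | Reference | Type of administrative data | Diabetes case definition | ICD codes used | Study, N | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | κ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Canada | 1995–2000 | So et al24 | Medical chart | Hospital discharge data | Diabetes with complications | ICD-9 250.1-.9 | 93 | 80 (51.91 to 95.67) | 98.3 (95.15 to 99.65) | 80 (51.91 to 95.67) | 98.3 (95.15 to 99.65) | |
| | 2001–2004 | | | | Diabetes with complications | ICD-10 E10.0-.8, E11.0-.8, E12.0-.8, E13.0-.8, E14.0-.8 | | 66.7 (38.38 to 88.18) | 98.9 (96.00 to 99.86) | 83.3 (51.59 to 97.91) | 97.2 (93.67 to 99.10) | |
| Canada | 2003 | Quan et al26 | Medical chart | Hospital discharge data | Diabetes with chronic complications | ICD 9 250.4-.7 | 4008 | 63.6 | 98.9 | 62.5 | 99 | 0.62 |
| | | | | | Diabetes with chronic complications | ICD 10 E10.2-.5, E10.7, E11.2-.5, E11.7, E12.2-.5, E12.7, E13.2-.5, E13.7, E14.2-.5, E14.7 | | 59.1 | 99 | 63.1 | 98.9 | 0.6 |
| | | | | | Diabetes without chronic complications | ICD 9 250.0-.3, 250.8, .9 | | 77.7 | 98.4 | 86.5 | 97 | 0.8 |
| | | | | | Diabetes without chronic complications | ICD 10 E10.0, .1, .6, .8, .9, E11.0, .1, .6, .8, .9, E12.0, .1, .6, .8, .9, E13.0, .1, .6, .8, .9, E14.0, .1, .6, .8, .9 | | 75.8 | 98.7 | 88.5 | 96.8 | 0.79 |
| Western Australia | 1998 | Nedkoff et al28 | Medical chart | Hospital discharge data | Look back period: Index admission | ICD 9/ICD-9 CM 250 | 1685 | 91.1 | 98.7 | 93.3 | 97.4 | 0.912 |
| | | | | | Look back period: 1 year | | | 91.6 | 98.1 | 92.8 | 97.6 | 0.902 |
| | | | | | Look back period: 2 years | | | 92.1 | 97.9 | 92.1 | 97.8 | 0.903 |
| | | | | | Look back period: 5 years | | | 92.4 | 97.7 | 91.9 | 97.8 | 0.9 |
| | | | | | Look back period: 10 years | | | 92.6 | 97.6 | 91.4 | 97.8 | 0.9 |
| | | | | | Look back period: 15 years | | | 92.6 | 97.5 | | 97.8 | 0.897 |
| | 2002–2004 | | | | Look back period: Index admission | ICD 10-AM E10-E14 | 2258 | 81.5 | 98.2 | 96 | 90.8 | 0.825 |
| | | | | | Look back period: 1 year | | | 86.3 | 97.3 | 94.4 | 93 | 0.853 |
| | | | | | Look back period: 2 years | | | 87.3 | 96.7 | 93.5 | 93.4 | 0.854 |
| | | | | | Look back period: 5 years | | | 89.3 | 95.9 | 92.2 | 94.4 | 0.859 |
| | | | | | Look back period: 10 years | | | 89.6 | 95.6 | 91.6 | 94.5 | 0.856 |
| | | | | | Look back period: 15 years | | | 89.6 | 95.5 | 91.5 | 94.5 | 0.855 |
| Canada | 1989–1990 | Robinson et al22 | Self-reported survey | Hospital discharge data and physician claims | 1, 2 or 3 physician claim or 1 hospitalisation over 3 years | ICD 9 CM | 2651 | 72 | 98 | 76 | 98 | 0.72 (0.67–0.77) |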

Superior performance characteristics within studies have been highlighted in bold.

Sensitivity identifies the proportion of patients who truly do have the disease/condition;

specificity identifies the proportion of patients who truly do not have the disease/condition;

PPV is the probability that participants with a positive screening test truly have the disease/condition;

NPV is the probability that participants with a negative screening test truly do not have the disease/condition;

κ is an inter-rater agreement statistic to evaluate the agreement between two classifications on ordinal or nominal scales.

ICD 10-AM, International Classification of Diseases, Tenth Revision, Australian Modification; ICD, International Classification of Diseases; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; NPV, negative predictive value; PPV, positive predictive value.


Combination of physician claims and hospital discharge data

Table 4 lists the four studies17 21 25 27 using a combination of physician claims and hospital discharge data. In these studies, the sensitivity ranged from 57% to 95.6%, specificity ranged from 88% to 98.5%, PPV ranged from 54% to 80%, NPV ranged from 98% to 99.6% and κ ranged from 0.7 to 0.8. Using a combination of two or more data sources increases the minimum value of the range for sensitivity compared to using either physician claims or hospital discharge data-based definitions individually. All four of the studies using a combination of physician claims and hospital discharge data had at least one case definition where sensitivity and specificity exceeded 80%.
Table 4

Study characteristics and test measures of studies for physician claims data and hospital discharge data

| Country | Study years | Author (reference) | Reference | Type of administrative data | Diabetes case definition | ICD codes used | Study, N | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | κ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Canada | 1992–1999 | Hux et al21 | Medical chart | Physician claims and hospital discharge data | One physician service claims or one hospitalisation with diagnosis of diabetes | ICD-9 250.x | 3317 | 91 | 92* | 61 | 99* | |
| | | | | | Two physician service claims or one hospitalisation with diagnosis of diabetes | ICD-9 250.x | | 86 | 97* | 80 | 98* | |
| Canada | 2000–2002 | Chen et al25 | Medical chart | Physician claims and hospital discharge data | 3 years observation period data | ICD 9 250.xx, ICD 10 E10.x-14.x | 3362 | 95.6 (92.5 to 97.7) | 92.8 (91.9 to 93.7) | 54 (49.6 to 58.5) | 99.6 (99.4 to 99.8) | 0.65 (0.61 to 0.69) |
| | | | | | 2 years observation period data | ICD 9 250.xx, ICD 10 E10.x-14.x | | 86.4 (82.4 to 90.5) | 97.1 (96.5 to 97.7) | 72.4 (67.5 to 77.3) | 98.8 (98.4 to 99.2) | 0.77 (0.73 to 0.81) |
| | | | | Physician claims | 3 years observation period data | ICD 9 250.xx, ICD 10 E10.x-14.x | | 91.2 (87.9 to 94.6) | 97.6 (97.1 to 98.1) | 72.1 (67.5 to 76.9) | 99.2 (98.9 to 99.5) | 0.82 (0.78 to 0.85) |
| | | | | | 2 years observation period data | ICD 9 250.xx, ICD 10 E10.x-14.x | | 76.6 (71.5 to 81.6) | 99.3 (99.0 to 99.6) | 90.9 (87.2 to 94.6) | 98 (97.5 to 98.4) | 0.82 (78.0 to 85.5) |
| USA | 1999 | Rector et al17 | Telephone surveys | Hospital discharge data and physician claims | One 1999 claim with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | 3633 | 90 | 93 | | | |
| | | | | | One 1999 face-to-face encounter claim with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 82 | 96 | | | |
| | | | | | One 1999 face-to-face encounter claim with primary dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 72 | 98 | | | |
| | | | | | Two 1999 claims with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 85 | 96 | | | |
| | | | | | Two 1999 face-to-face encounter claims with primary dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 70 | 98 | | | |
| | | | | | Two 1999 face-to-face encounter claims with primary dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 57 | 99 | | | |
| | 1999–2000 | | | | One 1999 or 2000 claim with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 95 | 88 | | | |
| | | | | | One 1999 or 2000 face-to-face encounter claim with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 94 | 92 | | | |
| | | | | | One 1999 or 2000 face-to-face encounter claim with primary dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 87 | 96 | | | |
| | | | | | Two 1999 or 2000 claims with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 93 | 93 | | | |
| | | | | | Two 1999 or 2000 face-to-face encounter claims with dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 91 | 95 | | | |
| | | | | | Two 1999 or 2000 face-to-face encounter claims with primary dx | ICD 9 250.xx, 357.2x, 362.0x, 366.41 | | 77 | 98 | | | |
| Canada | 1980–1984 | Young et al27 | Self-reported survey | Hospital admission and physician claims | (Hospital admissions of provincial residents claims for which are submitted to the MHSC) AND (Hospital admissions of provincial residents claims for which are submitted to the MHSC AND Claims by the physician to the MHSC or payment) | ICD 9-CM | 1000 | 82.7 | 96.3 | | | |
| | | | | | (Hospital admissions of provincial residents claims for which are submitted to the MHSC AND Claims by the physician to the MHSC or payment) AND (Claims by the physician to the MHSC or payment) | ICD 9-CM | | 82.1 | 98.5 | | | |
| | | | | | (Hospital admissions of provincial residents claims for which are submitted to the MHSC) AND (Hospital admissions of provincial residents claims for which are submitted to the MHSC AND Claims by the physician to the MHSC or payment) AND (Claims by the physician to the MHSC or payment) | ICD 9-CM | | 83.9 | 95.8 | | | |

Superior performance characteristics within studies have been highlighted in bold.

*Sensitivity, specificity, PPV and NPV are all hand-calculated:

sensitivity identifies the proportion of patients who truly do have the disease/condition;

specificity identifies the proportion of patients who truly do not have the disease/condition;

PPV is the probability that participants with a positive screening test truly have the disease/condition;

NPV is the probability that participants with a negative screening test truly do not have the disease/condition;

κ is an inter-rater agreement statistic to evaluate the agreement between two classifications on ordinal or nominal scales.

ICD 10-AM, International Classification of Diseases, Tenth Revision, Australian Modification; ICD, International Classification of Diseases; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; MHSC, Manitoba Health Services Commission; NPV, negative predictive value; PPV, positive predictive value.

Another factor affecting the statistical estimates is the number of claims used in the definition. Rector et al's study17 shows consistent results: sensitivity is higher when at least one claim is required by the definition, but specificity is higher when at least two are required. Finally, Young et al's study27 demonstrates the highest sensitivity when two physician claims and two hospital discharge data are used in the definition and the highest specificity when one physician claim and two hospital claims are used in the definition.
A secondary tabulation of data was performed by the type of ICD coding system used. Eight studies using ICD-9 coding systems are from the USA and four studies from Canada. Four studies use ICD-9 and ICD-10 coding systems; three of these are from Canada and one from Western Australia. In studies using ICD-9 codes, sensitivity ranged from 26.9% to 100%, specificity ranged from 88% to 100%, PPV ranged from 21% to 100%, NPV ranged from 74% to 99.6% and κ ranged from 0.6 to 0.9; whereas, in the studies using ICD-10 codes, the ranges for sensitivity (59.1% to 89.6%) and specificity (95.5% to 99%) narrowed significantly, and PPV ranged from 63.1% to 96%, NPV ranged from 90.8% to 98.9% and κ ranged from 0.6 to 0.9.

Discussion

In this systematic review, case definitions appear to perform better when more data sources are used over a longer observation period. The outcomes with respect to sensitivity, specificity and PPV for each of these studies seem to differ due to variations in the definition of primary diagnosis in ICD-coded health data, the use of hospital discharge versus physician billing claims and by the geographical location. The validity of diabetes case definitions varies significantly across studies, but we identified definition features that were associated with better performance. The combinations of more than one data source, physician claim and/or hospital discharge encounter along with an observation period of more than 1 year consistently demonstrated higher sensitivity with only a modest decline in specificity. These definition characteristics are present in the definition used by the National Diabetes Surveillance System to identify Canadians with diabetes mellitus.31 The performance of this particular definition has been widely studied, and a meta-analysis pooling the results of these studies demonstrates a pooled sensitivity of 82.3% (95% CI 75.8% to 87.4%) and a specificity of 97.9% (95% CI 96.5% to 98.8%).10 This systematic review provides new knowledge on factors that are associated with enhanced definition performance and outlines the trade-offs one encounters with respect to sensitivity and specificity (and secondarily PPV and NPV) related to data source and years of follow-up. The development of an administrative case definition of diabetes is often related to pragmatic considerations (type of data on hand); however, this systematic review provides health services researchers with important information on how case definitions may perform given definition characteristics. There was considerable ‘within-data definition’ variation in measures of validity. 
This variation likely reflects that neither physician claims nor hospital discharge data are primarily collected for surveillance; hence, the accuracy of diagnoses coded in these data sources remains suspect. Physician claims, while potentially rich in clinical information, are not recorded in a standardised manner. Billing practices do vary by practitioner, which may in turn be influenced by the nature of physician reimbursement (salary vs fee for service).23 32 33 Furthermore, patients with diabetes commonly carry multiple comorbidities, so while patients may have diabetes and be seen by a physician, providers will file billing claims for conditions other than diabetes.34 35 In contrast, hospital discharge data are limited to clinical information that is relevant to an individual hospitalisation, capturing diagnostic and treatment information usually for a brief window of time. The advantage of hospital discharge data for surveillance is that discharge diagnostic and medical procedure information are recorded by medical coders with standardised training with a detailed review of medical charts. However, the standard method of discharge coding does vary regionally, and thus variation around validity estimates based on these differences in coding practices will be observed. Ideal performance parameters will vary based on the clinical condition of interest, the nature of surveillance and the type of data being used for surveillance. When studying diabetes trends and incidence rate, a case definition that has high but balanced measures of sensitivity and PPV is preferred. This will ensure maximal capture of potential patients and that patients captured likely have diabetes. This systematic review suggests that the commonly used two physician outpatient billings and/or one hospitalisation within a certain period of time is appropriate. 
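As a rough illustration of the commonly used definition mentioned above (two physician outpatient billings and/or one hospitalisation within a certain period), the sketch below applies that rule to one patient's diabetes-coded encounters. It assumes the dates passed in have already been restricted to encounters carrying a diabetes ICD code and to the observation period of interest. The function name, the 730-day default window and the choice to count same-day claims once are assumptions made for the example, not details specified in this review.

```python
from datetime import date, timedelta


def meets_case_definition(
    physician_claim_dates: list[date],
    hospitalisation_dates: list[date],
    window_days: int = 730,  # assumed window; the review says only "a certain period"
) -> bool:
    """Apply a 'two physician claims or one hospitalisation' rule to one patient.

    Both lists are assumed to contain only diabetes-coded encounters
    that fall inside the observation period.
    """
    # One diabetes-coded hospitalisation is sufficient on its own.
    if hospitalisation_dates:
        return True

    # Assumption: several claims on the same day count as a single claim.
    claims = sorted(set(physician_claim_dates))
    window = timedelta(days=window_days)

    # If any two claims fall within the window, some adjacent pair in the
    # sorted list does too, so checking neighbours is enough.
    return any(
        later - earlier <= window
        for earlier, later in zip(claims, claims[1:])
    )


# Hypothetical patients, for illustration only.
two_claims_close = meets_case_definition([date(2010, 1, 1), date(2011, 6, 1)], [])
two_claims_far = meets_case_definition([date(2010, 1, 1), date(2013, 1, 1)], [])
one_admission = meets_case_definition([], [date(2012, 5, 5)])
print(two_claims_close, two_claims_far, one_admission)
```

Lengthening `window_days` or adding a second data source makes the rule easier to satisfy, which is consistent with the higher sensitivity and modest loss of specificity reported for such definitions.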
It is also important to recognise that the data source used may affect the type of patient identified with administrative data definitions. Hospital discharge data (when used in isolation) will potentially identify patients with more advanced disease or more complications and therefore may not be fully representative of the entire diabetes population. Similarly, physician claims data may identify a comparatively well, ambulatory population that has access to physician care in the community. The greatest strength of this systematic review is its inclusiveness: the search strategy was not restricted by region, time or any particular case definition of diabetes. However, 15 of the 16 studies included in the qualitative analysis were conducted in North America. Across these studies, sensitivity and specificity estimates were high both when administrative data were compared with medical records and when they were compared with population-based surveys, suggesting that public administrative data are a viable substitute for these reference sources in diabetes surveillance. Finally, the study quality across all studies included was generally high as measured by the QUADAS Scale. There is the potential for a language bias as studies whose full texts were not available in English were not considered. There are potential limitations for all reference standards used to validate administrative case definitions for diabetes. The accuracy of chart reviews depends principally on physician documentation, availability of records and the accuracy of coding.36 Self-reported surveys and telephone surveys are prone to recall bias, social desirability bias, poor understanding of survey questions or respondents' incomplete knowledge of their diagnosis. Self-reported surveys can also suffer from participation biases, as patients with low diabetes risk may be less willing to participate whereas certain patients with advanced diabetes may be too unwell to participate.
Age, sex and a patient's level of education can have an effect on the reporting of diabetes.37–39 Those with poorly controlled diabetes have been found to underreport their disease status.40 The ideal reference standard would be a clinical measure (such as glucose or HbA1c); however, a clinical reference standard is not often used. In addition to the limitations of the reference standards used for validation, it should also be noted that even clinical measures are imperfect as a reference standard, because glucose and HbA1c are surrogates of the underlying disease process. It should also be noted that glucose and HbA1c thresholds for diagnosis have changed (albeit modestly) over the past 20 years. Changes in the clinical definition over time have significant implications for diabetes surveillance. Understanding changing diagnostic thresholds is critical to interpreting surveillance data. However, the validity of an administrative data case definition is conceptually related to, but somewhat separate from, the clinical definition. If we understand the clinical definition as a biological or physiological definition that denotes the presence or absence of disease, the administrative data definitions are a surrogate of disease and denote the presence or absence of disease based on care for the disease. The administrative definitions identify patients with a diagnosis of diabetes based on an interaction with the healthcare system in which they received care for diabetes. Therefore, the application of this definition follows the application of the clinical definition. There is a presumption that the clinical definition, whatever it may have been at the time of application, was valid. Finally, the difference between type 1 diabetes mellitus and type 2 diabetes mellitus is not clear in studies using administrative databases. In this systematic review, we included only the adult population (≥18 years of age), which is primarily the type 2 diabetes population.

Generalisability

Fifteen of the 16 included studies were conducted in North America, and therefore it is not surprising that the validation studies report comparable results. However, even though these studies are nested in the general population, the selected diabetes cohorts used in the validation studies may not always be truly representative of the general population.
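One reason the representativeness of a validation cohort matters is that, for fixed sensitivity and specificity, PPV and NPV change with the prevalence of diabetes in the population to which the definition is applied. The sketch below illustrates this using the pooled sensitivity (82.3%) and specificity (97.9%) quoted in the Discussion. The prevalence values and function name are hypothetical, and the calculation assumes that sensitivity and specificity stay constant across populations, which may not hold in practice.

```python
def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> tuple[float, float]:
    """Return (PPV, NPV) implied by Bayes' theorem for a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled estimates quoted in the Discussion of this review.
POOLED_SENSITIVITY = 0.823
POOLED_SPECIFICITY = 0.979

# Hypothetical prevalence values, chosen only to show the direction of change.
for prevalence in (0.05, 0.10, 0.20):
    ppv, npv = predictive_values(POOLED_SENSITIVITY, POOLED_SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumptions, PPV rises and NPV falls as prevalence increases, so predictive values reported from a cohort enriched for diabetes may not carry over directly to the general population.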

Conclusions

Most studies included in this review use similar case definitions that require one or more diagnoses of diabetes. The performance characteristics of these case definitions depend on the variations in the definition of primary diagnosis in ICD-coded discharge data and/or the methodology adopted by the healthcare facility to extract information from patient records. The purpose of surveillance and the type of data being used should determine the performance parameters required of an administrative case definition. Approaches used in developing case definitions for diabetes can be simple and practical and result in high sensitivity, specificity and PPV. Overall, administrative health databases are useful for undertaking diabetes surveillance,21 25 but an awareness of how performance varies with case definition is essential.