
Systematic review: development of a consensus code set to identify cirrhosis in electronic health records.

Jessica E Shearer1,2, Juan J Gonzalez3, Thazin Min1, Richard Parker1, Rebecca Jones1, Grace L Su3, Elliot B Tapper3, Ian A Rowe1,2.   

Abstract

BACKGROUND: Electronic health records (EHRs) collate longitudinal data that can be used to facilitate large-scale research in patients with cirrhosis. However, there is no consensus code set to define the presence of cirrhosis in EHR. This systematic review aims to evaluate the validity of diagnostic coding in cirrhosis and to synthesise a comprehensive set of ICD-10 codes for future EHR research.
METHOD: MEDLINE and EMBASE databases were searched for studies that used EHR to identify cirrhosis and cirrhosis-related complications. Validated code sets were summarised, and the performance characteristics were extracted. Citation analysis was done to inform development of a consensus code set. This was then validated in a cohort of patients.
RESULTS: One thousand six hundred twenty-six records were screened, and 18 studies were identified. The positive predictive value (PPV) was the most frequently reported statistical estimate and was ≥80% in 17/18 studies. Citation analyses showed continued variation in those used in contemporary research practice. Nine codes were identified as those most frequently used in the literature and these formed the consensus code set. This was validated in diverse patient populations from Europe and North America and showed high PPV (83%-89%) and greater sensitivity for the identification of cirrhosis than the most often used code set in the recent literature.
CONCLUSION: There is variation in code sets used to identify cirrhosis in contemporary research practice. A consensus set has been developed and validated, showing improved performance, and is proposed to align EHR study designs in cirrhosis to facilitate international collaboration and comparisons.
© 2022 The Authors. Alimentary Pharmacology & Therapeutics published by John Wiley & Sons Ltd.

Year:  2022        PMID: 35166399      PMCID: PMC9302659          DOI: 10.1111/apt.16806

Source DB:  PubMed          Journal:  Aliment Pharmacol Ther        ISSN: 0269-2813            Impact factor:   9.524


INTRODUCTION

Cirrhosis is recognised as a growing public health burden, accounting for 1.3 million deaths worldwide each year. The economic impact of cirrhosis is considerable, with higher rates of unemployment, years of life lost and reduced quality of life. The ability to identify large cohorts of patients with chronic liver disease can improve understanding of the natural history of cirrhosis and liver‐related complications. Electronic health records (EHRs) and administrative databases collate longitudinal data generated throughout the course of routine clinical care, often abstracted using diagnostic and procedure coding systems such as ICD‐9 and ICD‐10. These data are easily accessible and can provide comprehensive information regarding “real‐world” care patterns, costs and outcomes. The meaning and value of these data are directly related to both their validity and applicability to the population with cirrhosis. Several studies have evaluated the validity of diagnostic codes in identifying patients with cirrhosis. As there are many codes relating to liver disease and its complications, there is variation among studies in terms of the codes used to define the presence of cirrhosis. The increasing importance of EHR‐based research and the role of real‐world evidence in clinical decision making demand a critical appraisal of the tools used to identify cirrhosis. The aim of this systematic review was therefore to evaluate the current evidence assessing the validity of diagnostic coding to identify cirrhosis using electronic health record databases. By comparing definitions of cirrhosis based on sets of existing diagnostic and procedural codes across studies and countries, the review also aims to synthesise and validate a comprehensive code set that can be used in future EHR studies of patients with cirrhosis.

METHODS

Data sources and search strategy

A search was completed using the OVID platforms of MEDLINE and EMBASE electronic bibliographic databases from inception (1946 and 1947 respectively) to March 2020 including “In‐Process” citations of all peer reviewed literature and conference abstracts. The full search strategy is included in Table S1. The search was limited to articles published in English and human studies, and the studies were de‐duplicated prior to evaluation. To identify additional studies, bibliography lists were hand searched. Once the search was completed, abstracts were screened for relevance and the identified studies were reviewed in full text and assessed for eligibility against the inclusion and exclusion criteria. The systematic review protocol was prospectively registered with PROSPERO (International prospective register of systematic reviews), registration ID: CRD 42019118848. It was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta‐Analysis (PRISMA) guidelines and checklist (Table S2).

Study selection

Studies were evaluated for inclusion in two stages. In the first stage all identified titles and abstracts were screened. In the second stage relevant studies were retrieved and a full text review was done on all studies which met the predefined inclusion and exclusion criteria. We included all observational cohort and cross‐sectional validation studies, which assessed the validity of diagnostic and procedural codes (ICD‐9 and ICD‐10) used to identify cirrhosis. Studies had to report the code set or algorithm employed to search the electronic database.

Inclusion and exclusion criteria

A study was included in the systematic review if it met the following predefined criteria: age >18 years, information regarding hospital admissions stored in electronic records as part of routine care, and ICD‐9 or ICD‐10 codes explicitly defined and validated in medical record review. Studies using laboratory data to identify and define those patients with cirrhosis were excluded, as these data are not routinely available through EHR data alone. Where conference abstracts and full manuscripts of the same study were identified, data were extracted from the full manuscript.

Data extraction and quality assessment

The full text of each article was reviewed. Data were extracted, tabulated and summarised onto a standardised template. The information gathered included study author, year of publication and site, start date and duration of data collection, electronic data source, sample size, ICD codes or algorithm employed. If statistical estimates were not reported in the original study, estimates were calculated from the available data. This included sensitivity, specificity, positive predictive value, negative predictive value and kappa value (a measure of agreement beyond that expected by chance). As there is no validated quality assessment tool for non‐comparator retrospective studies, we used an adaptation of the QUADAS‐2 tool (Quality Assessment of Diagnostic Accuracy Studies) to evaluate the quality of the included studies.
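The statistical estimates listed above can all be derived from a 2×2 table of code status against the chart-review reference standard. The following is a minimal sketch of that calculation, not the procedure used in the review; the class name `ConfusionCounts`, the function `performance` and the example counts are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of code status against the chart-review reference standard."""
    tp: int  # code present, cirrhosis confirmed
    fp: int  # code present, cirrhosis not confirmed
    fn: int  # code absent, cirrhosis confirmed
    tn: int  # code absent, cirrhosis not confirmed


def _ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def performance(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV, NPV and Cohen's kappa as proportions."""
    n = c.tp + c.fp + c.fn + c.tn
    kappa = None
    if n:
        observed = (c.tp + c.tn) / n
        # Agreement expected by chance, from the row and column totals.
        expected = ((c.tp + c.fp) * (c.tp + c.fn)
                    + (c.fn + c.tn) * (c.fp + c.tn)) / (n * n)
        kappa = _ratio(observed - expected, 1 - expected)
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "kappa": kappa,
    }


# Hypothetical counts, for illustration only (not data from any included study).
example = performance(ConfusionCounts(tp=40, fp=10, fn=20, tn=30))
```

An estimate is returned as `None` when its denominator is zero, for example PPV in a sample where no patient carries a code.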

Data synthesis and citation analysis

Data were synthesised qualitatively, with the authors reviewing the data extraction table and then re‐reviewing the relevant articles. Citation analysis was conducted using the web resource Scopus to assess the impact, geographical reach and applicability of the studies. This analysis was conducted in September 2020. Abstracts were excluded and only those studies in which the primary objective was validation of codes within liver disease were included, as it was felt that this would be a more accurate reflection of the impact and use of these validated code sets. Only those studies published at least 5 years ago were included, and citations were analysed per publication year.

Validation of the consensus code set

ICD‐9 codes were converted to the closest possible ICD‐10 equivalent and the most common codes and definitions used across all studies were identified and considered for inclusion in the consensus code set. The consensus code set was validated using four independent cohorts. To determine sensitivity, we first evaluated a cohort of 300 patients (UK cohort [sensitivity]) from a secondary/tertiary care centre, the Leeds Liver Unit, United Kingdom, who had advanced chronic liver disease and a median liver stiffness of ≥15 kPa as measured by transient elastography between 2012 and 2017. Only patients with codes occurring after transient elastography were included in the primary analysis and out‐patient codes were not used. In a sensitivity analysis, patients with decompensation before transient elastography (n = 33) were also included to describe the sensitivity of the consensus code set. Second, we evaluated a cohort of 113 patients seen at the University of Michigan Hepatology Clinic (US cohort [sensitivity]) who were enrolled prospectively in a chronic disease monitoring system between 2010 and 2015 and followed for at least 3 years. As described elsewhere, all patients had a CT scan within 365 days of enrolment, received their diagnosis of cirrhosis based on imaging, laboratory and/or histological parameters from a board‐certified transplant hepatologist, and were followed clinically thereafter. All diagnosis codes were entered in or mapped to ICD‐10 in the electronic medical record. In each case the full medical record was reviewed. Basic demographic information was extracted and all events following the identification of fibrosis were recorded. This included out‐patient visits in the hepatology clinic and admissions to hospital with decompensation (variceal bleeding, ascites and hepatic encephalopathy). Data held within the EHR were extracted and coded information relating to hospital admissions, investigations and procedures was collected.
Following this, we evaluated the positive predictive value (PPV) of the consensus code set. PPV was evaluated in separate cohorts because the cohorts described above did not include patients without cirrhosis, making it impossible to assess PPV. First, we evaluated a cohort of 335 patients admitted to Leeds Teaching Hospital NHS Trust (UK cohort [PPV]), United Kingdom, in 2019 with one or more codes from the consensus code set. Two experienced clinicians (JS and TM) independently reviewed the medical record to confirm whether the diagnosis of cirrhosis was correct. A positive diagnosis of cirrhosis was made following review on one or more of the following criteria: histological confirmation of cirrhosis, portal hypertension on imaging (varices/ascites), documentation in the medical record by a Specialist Gastroenterologist or Hepatologist of an episode of decompensation (ascites, variceal bleeding, hepatic encephalopathy) or synthetic dysfunction consistent with cirrhosis (albumin ≤30, bilirubin ≥20, INR ≥1.2). Additionally, we evaluated PPV in 241 patients identified by any one or more of the codes in the consensus code set with an out‐patient encounter in May or June 2021 at the University of Michigan (US cohort [PPV]). The full medical record was reviewed by an experienced clinician (EBT) to determine whether the patient had a confirmed diagnosis of cirrhosis, based on the criteria outlined above.

RESULTS

Study characteristics

A total of 1975 abstracts were identified. After de‐duplication 1626 abstracts remained. One hundred and thirty‐eight studies were reviewed in full text. A further 29 studies were identified and reviewed through hand searching of reference lists. Of the discounted records, 66 were conference abstracts, which did not contain sufficient information for analysis. Overall, 18 studies met the inclusion criteria and were included in the final qualitative analysis. No additional suitable studies were identified through hand searching of bibliographies. A flowchart showing the number of studies screened and included is shown in Figure 1. The studies and a description of their characteristics are shown in Table 1.
FIGURE 1

Study flow chart. ICD, international classification of diseases

TABLE 1

Study characteristics and validation standards in order of publication year

| Author (year) | Country | Study years | Source population | Type of database | Sample size | Records validated | Definition of validation | Validator |
|---|---|---|---|---|---|---|---|---|
| Quan et al. 24 | Canada | 1996–1997 | Patients admitted to one of three hospitals within the Calgary Regional Health Authority | AD | 1200 | 1200 | Details not given in study | One clinician |
| Hachem et al. 12 | US | 1995–2005 | Veterans registered at VA medical clinics in Houston, Texas | AD | 84 | 84 | Pathology +/− radiology +/− evidence in medical records | One clinician |
| Kramer et al. 9 | US | 1998–2004 | Veterans registered at VA medical clinics in Houston, Texas | AD | 331 | 331 | Stage 4 cirrhosis on liver biopsy or ≥2 of cirrhosis, ascites/peritonitis, varices, HCC, HRS, HE on imaging (CT/MRI/USS) or in notes or ≥2 albumin <30 g/L, bilirubin >2.0 mg/dl, INR >1.2 (or 1 of laboratory parameters with one of above) | One clinician, 20% by second clinician, 10% by third clinician |
| Re et al. 18 | US | 2005 | Patients enrolled in the Veterans Ageing Cohort Study | EHR | 137 | 137 | Radiological evidence of ascites (CT/MRI/USS) or evidence of peritoneal fluid analysis +/− polymorphonuclear leucocyte count ≥250 cells/mL or bacterascites or bleeding varices on endoscopy report or documentation of mental confusion in the absence of non‐hepatic causes or diagnosis of HCC on biopsy or radiology (CT/MRI) | One non‐clinician, results reviewed by two clinicians |
| Thygesen et al. 22 | Denmark | 1998–2007 | Patients registered in the Danish National Registry in the North Jutland Region, Denmark | NR | 950 | 50 | Discharge summary/medical record describing exact diagnosis | One clinician, one arbitrator |
| Singal et al. 19 | US | 2008–2009 | Patients admitted to one hospital in Dallas County | EHR | 1589 | 1589 | Consistent histology +/− cirrhotic‐appearing liver on imaging with evidence of portal hypertension (ascites, HE, varices or splenomegaly with thrombocytopenia) a | One clinician |
| Goldberg et al. 11 | US | 1997–2011 | Patients receiving IP or OP care at two tertiary care hospitals in Pennsylvania | AD | 266 | 244 | Liver biopsy demonstrating cirrhosis or radiological evidence of cirrhosis (CT/MRI/USS), or documentation of cirrhosis based on biopsy/radiology | One clinician |
| Kanwal et al. 27 | US | 2000–2007 | Patients receiving IP or OP care at 3 VA medical centres and 15 clinics in the Midwest | EHR | 774 | 300 | Documentation, laboratory or radiological evidence of ascites, HE, in‐patient GI bleeding, paracentesis or SBP | One clinician, 10% by second clinician |
| Rakoski et al. 17 | US | 2008 | Patients enrolled in the national Health and Retirement Study and receiving care at University of Michigan | AD | 317 | 100 | Liver biopsy demonstrating cirrhosis or radiological evidence of cirrhotic liver with splenomegaly + platelet count of <120 000/mm3 or evidence of decompensated cirrhosis with HE, HRS, ascites or variceal bleeding | One clinician |
| Fialla et al. 21 | Denmark | 1996–2006 | Patients enrolled in the Funen Patient Administrative System registry in Denmark | AD | 1369 | 1369 | Consistent histology cirrhosis or evidence of portal hypertension with hepatic wedge pressure of >8 mmHg or INR >1.5 or cirrhotic liver on USS or perioperatively or evidence of complications such as varices, ascites +/− HE | N/A |
| Rabin et al. 16 | US | 2013 | Patients enrolled in the Chronic Hepatitis Cohort Study in Detroit, Michigan | EHR | 283 | 283 | Radiology, laboratory parameters, biopsy and clinical events | Two clinicians, one arbitrator |
| Nehra et al. 15 | US | 2008–2011 | Patients receiving IP or OP care at one hospital in Dallas County | EHR | 2893 | 2893 | Stage 4 cirrhosis on liver biopsy or radiological evidence of cirrhosis + evidence of portal hypertension on imaging or clinical evidence of portal hypertension/complications (ascites, varices, HE, HCC) | One clinician |
| Ratib et al. 25 | England | 1998–2009 | Patients enrolled in primary and secondary registries in England | EHR | 5118 | 2282 | Search of primary and secondary care records and ONS death registry data for codes related to liver disease + examination of FTD for any of the following terms: “cirrhosis,” “ascites,” “varices,” “liver,” “portal hypertension,” “hepatic,” “jaundice” or “paracentesis” | N/A |
| Chang et al. 10 | US | 2013–2015 | Patients receiving IP or OP care at four hospitals in Los Angeles | EHR | 5343 | 168 | Stage 4 cirrhosis on liver biopsy, radiological evidence of cirrhosis (CT/MRI/USS) or documented clinical diagnosis | One clinician, one non‐clinician |
| Lu et al. 13 | US | 2015–2016 | Patients enrolled in the Chronic Hepatitis Cohort Study in Detroit, Michigan | EHR | 296 | 296 | Documented evidence of HE or GI bleeding due to portal hypertension or jaundice with bilirubin >2.5 mg/dl or ascites/hydrothorax due to portal hypertension, or HCC | Two clinicians, one arbitrator |
| Mapakshi et al. 14 | US | 2015–2016 | Patients with data stored within the VA Corporate Data Warehouse | EHR | 325 | 325 | Stage 4 cirrhosis on liver biopsy or documentation of cirrhosis or complications in medical record, radiological or endoscopic evidence of cirrhosis | One clinician |
| Lapointe‐Shaw et al. 23 | Canada | 2006–2013 | Patients receiving IP or OP care at two tertiary care hospitals in Ontario, Canada | AD | 6714 | 6714 | Stage 4 cirrhosis on liver biopsy or cirrhotic appearance on USS, non‐invasive test result consistent with F4 fibrosis or evidence in clinical record of ascites, bleeding varices, encephalopathy, use of spironolactone or nadolol without alternative indication or explicit mention of cirrhosis/decompensation/non‐bleeding varices | Two clinicians, one arbitrator, 5% by second clinician |
| Driver et al. 26 | UK | 2007–2016 | Patients diagnosed with hepatocellular carcinoma in two NHS cancer centres in England | EHR | 339 | 339 | Documentation of cirrhosis in MR or MDT minutes, radiological/endoscopic evidence of portal hypertension, cirrhosis on liver biopsy, consistent TE result | Three clinicians |

AD, administrative database; MR, medical record; IP, in‐patient; OP, out‐patient; EHR, electronic health record; VA, veterans affairs; NR, national registry; HCC, hepatocellular carcinoma; HRS, hepatorenal syndrome; HE, hepatic encephalopathy; CT, computerised tomography; MRI, magnetic resonance imaging; USS, ultrasound scan; SBP, spontaneous bacterial peritonitis; TE, transient elastography.

Information not in original abstract deduced from subsequent paper (14).

The sample size ranged between 84 and 6714 people, with a total of 18 704 patients included. Twelve studies were conducted in the United States, two in Denmark, two in Canada and two in the United Kingdom. Of those studies from the United States, five used cohorts from the Veterans Administration (VA) population. In two studies, the evaluation was carried out in a single hospital setting. Seventeen of the studies used medical record review to validate the diagnosis of cirrhosis. In these studies, the full medical record was retrieved and compared with the diagnostic codes of interest. Among the 17 studies, 13 outlined an explicit definition of their primary outcome measure. All of these included histological and/or radiological evidence of liver disease and five also included specific laboratory parameters. One study searched primary and secondary care records and death registry data for codes or free‐text terms relating to cirrhosis as their validation standard. Ten studies evaluated codes using electronic health records and seven used administrative databases, the majority of which reported on in‐patient and out‐patient data. One study used a national registry database. Validation was the primary outcome measure in 14 studies. Two of these studies focussed on validation of the comorbidity variables which constitute the Charlson index, of which liver disease was extracted separately. Seven of the validation studies analysed disease severity, that is, codes representing decompensation events in addition to cirrhosis codes. One study validated an algorithm using ICD codes with and without the addition of a natural language processing algorithm. A description of the validation standards is included in Table 1.

Study quality

Study quality was assessed using an adapted QUADAS tool. A detailed copy of the tool and a breakdown of individual scores for each study are included in Tables S3 and S4. The QUADAS scores ranged from 7 to 11 out of a maximum of 14 (median 10). Three studies used a selected population of patients: patients enrolled in the chronic hepatitis cohort study and patients with an ICD‐10 code for hepatocellular carcinoma. Two studies did not adequately describe their selection criteria. Three studies used a random selection from their total sample to verify as a gold standard comparison. Seven studies stated that the individual abstracting data from the medical record was blinded to the database coding, while the rest did not specify. Seven studies used a single clinician to conduct chart review; the remaining 10 studies used more than one clinician, often in addition to an arbitrator.

Quality of coding sets

Details of the type and number of codes used are shown in Table 2. Ten studies used ICD‐9 codes, three used ICD‐10 codes and the remaining five used a combination of ICD‐9 and/or ICD‐10 codes combined with procedural codes. Aside from one study, which specified that it used only the primary diagnostic code, none of the studies commented upon whether the code was the designated primary diagnosis code or one of the subsequent 20 diagnosis codes which can be associated with an in‐patient or out‐patient encounter. It is therefore assumed that the codes of interest occurred at any position.
TABLE 2

Details of code dictionary and number of codes used in each study

| Author (year) | Codes used | Case definition | No. of codes |
|---|---|---|---|
| Quan et al. 24 | ICD‐9 | ≥1 code (IP only) | 14 |
| Hachem et al. 12 | ICD‐9 | ≥1 code (IP or OP) | 2 a |
| Kramer et al. 9 | ICD‐9 | ≥1 code (IP or OP) | 3 |
| Re et al. 18 | ICD‐9 | 1 IP + 2 OP codes | 22 a |
| Thygesen et al. 22 | ICD‐10 | 1st listed code (IP or OP) | 11 |
| Singal et al. 19 | ICD‐9 | ≥3 codes | 11 b |
| Goldberg et al. 11 | ICD‐9 | ≥2 codes (IP or OP) | 58 a |
| Kanwal et al. 27 | ICD‐9 | ≥2 codes (IP or OP) | 12 |
| Rakoski et al. 17 | ICD‐9 | ≥1 code (IP or OP) | 12 a |
| Fialla et al. 21 | ICD‐10 | ≥1 code (IP or OP) | 4 |
| Rabin et al. 16 | ICD‐9 + CPT | ≥1 code | 41 |
| Nehra et al. 15 | ICD‐9 | ≥1 code (IP or OP) | 11 |
| Ratib et al. 25 | ICD‐10 + OPCS4 | ≥1 code | 21 |
| Chang et al. 10 | ICD‐9 | ≥1 code (IP or OP) | 16 |
| Lu et al. 13 | ICD‐9/10 + CPT | ≥1 code (IP or OP) | 43 |
| Mapakshi et al. 14 | ICD‐10 | ≥1 code (IP or OP) | 7 |
| Lapointe‐Shaw et al. 23 | ICD‐9/10 + CCP | ≥1 code (IP or OP) | 40 |
| Driver et al. 26 | ICD‐10 + OPCS4 | ≥1 code (IP only) | 33 |

ICD, international classification of diseases; CPT, current procedural terminology; ONS, office for national statistics; CCP, Canadian classification of diagnostic, therapeutic and surgical procedures.

Information not in original abstract deduced from subsequent paper (30).

Paper uses ICD‐9‐CM (clinical modification) classification.
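The case definitions in Table 2 differ in how many qualifying codes are required and in which care settings count. As an illustration only, they can be expressed as one small predicate over a single patient's coded entries. The names `CodedEntry` and `meets_case_definition` are hypothetical, and counting every qualifying entry (rather than distinct encounters or dates) is an assumption, since the included studies did not all specify this.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class CodedEntry(NamedTuple):
    """One diagnosis code recorded for a patient (illustrative structure)."""
    setting: str  # "IP" (in-patient) or "OP" (out-patient)
    code: str     # diagnosis code exactly as recorded


def meets_case_definition(
    entries: Iterable[CodedEntry],
    code_set: Set[str],
    min_codes: int = 1,
    settings: Tuple[str, ...] = ("IP", "OP"),
) -> bool:
    """True if at least `min_codes` entries from `code_set` occur in `settings`.

    Assumption: each qualifying entry counts once, in any diagnosis position.
    """
    hits = sum(
        1 for entry in entries
        if entry.setting in settings and entry.code in code_set
    )
    return hits >= min_codes


# Two ICD-9 cirrhosis codes taken from Table 5, used here as a toy code set.
toy_code_set = {"571.2", "571.5"}
patient = [CodedEntry("IP", "571.5"), CodedEntry("OP", "571.2")]

at_least_one = meets_case_definition(patient, toy_code_set)              # ≥1 code (IP or OP)
at_least_two = meets_case_definition(patient, toy_code_set, min_codes=2) # ≥2 codes (IP or OP)
ip_only = meets_case_definition(patient, toy_code_set, settings=("IP",)) # ≥1 code (IP only)
```

Raising `min_codes` or restricting `settings` makes the definition stricter, which is the kind of trade-off between PPV and sensitivity that the validation studies report.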

Fifteen studies reported specific ICD codes used to define liver disease in their cohort. The remaining three studies did not specify the codes; however, it was possible to obtain the information from other related studies. Seven studies adopted ICD code sets which had previously been used and validated by other authors, while 11 studies developed their own selection of codes. Quan et al used a coding algorithm developed previously by Deyo et al., which included 14 ICD‐9 codes in total. The “mild” liver disease category included three codes for cirrhosis, and this was therefore combined with the codes for “moderate or severe” liver disease. Thygesen et al used a larger number of codes to define “mild” liver disease which included codes we considered to be less specific for cirrhosis (K71; K74; K76.0). For this reason, we included only the coding algorithm which was employed for “moderate or severe liver disease.” There was significant variation in the number and type of codes used. Overall, there were a total of 63 ICD‐9 codes and 54 ICD‐10 codes as well as 77 procedural codes used to identify cirrhosis in the included studies (Tables S5–S8). Among studies using the ICD‐10 classification, codes were drawn from five disease manifestation categories (B15.0–94.2; C22; E80‐E84.5; I81‐I98.3; K22‐K92.2) and two symptom‐related and external causation categories (R16‐R18.8; T86). Three ICD‐9 and four ICD‐10 codes appeared as clustered codes, denoting that all the sub‐codes were used.
Five studies incorporated procedural codes into their code sets. In one study the specific procedural codes were unavailable. In the remaining four studies the number of procedural codes used ranged between 7 and 60. While there were similarities between some of the code sets used, none of the studies used the same codes from the same ICD dictionary.

Assessment of validation in the literature

The validation statistics are shown in Table 3. The positive predictive value (PPV) was available in all but one study and was >90% in 10 studies with a range of 71%–100%. Negative predictive value (NPV) was reported in seven studies with a range of 72%–99%. Nine studies reported sensitivity and/or specificity values, which ranged from 20% to 98% and from 43% to 99% respectively. Kappa values were reported in only four studies and ranged from 0.48 to 0.71. Of the 10 studies which reported a PPV of >90%, six included codes taken from both the in‐patient and out‐patient setting (Table 3).
TABLE 3

Performance characteristics of each study

Author (year)Se (%)Sp (%)PPV (%)NPV (%)Kappa (κ)
Quan et al. 24 72998099 a 0.75
Hachem et al. 12 89
Kramer et al. 9 9087 a 0.70
Re et al. 18 20 b 99 b 9199 b 0.48 b
Thygesen et al. 22 100
Singal et al. 19 95
Goldberg et al. 11 94
Kanwal et al. 27 91
Rakoski et al. 17 6788
Fialla et al. 21 71
Rabin et al. 16 91727191 a
Nehra et al. 15 , b 9843 c 7891 c 0.71
Ratib et al. 25 90
Chang et al. 10 47979272 a
Lu et al. 13 , d 838985
Mapakshi et al. 14 93
Lapointe‐Shaw et al. 23 , †† 67–8277–90
Driver et al. 26 86989979 a

Se, Sensitivity; Sp, Specificity; PPV, positive predictive value; NPV, negative predictive value.

NPV defined as probability that cirrhosis was absent among those patients without a code.

Estimated performance statistics using random sample of 100 patients without codes/hepatic decompensation.

Authors validated sensitivity using cohort of 285 patients prospectively determined to have cirrhosis. NPV validated using 116 patients with liver disease but no codes for cirrhosis.

Paper uses a specific combination of codes to achieve these performance characteristics.

Range given as results separated into three separate cohorts.

The median number of codes used was 13. There was no improvement in the statistical estimates in those studies that used more codes within their definition (≤13 codes PPV range 71%–100%; >13 codes PPV range 71%–91%). However, four studies which validated diagnostic codes found that combinations of codes improved sensitivity in comparison to a single code. There was no difference in the range of PPV between studies using ICD‐9 codes (71%–95%) and those using ICD‐10 codes (71%–100%). There was also no discernible difference in PPV depending upon the type of database from which coded information was extracted (administrative database 71%–94%; electronic health record 71%–99%). The study which used the Danish national registry reported PPV of 100%, although only 50 patient records were reviewed. We observed an increase in the minimal value of the PPV range in the five studies conducted in the VA population (89%–93%) in comparison to the remaining studies (71%–100%). The 18 studies included were published over a 17‐year period (2002–2019). The duration of data collection varied widely from 1 to 14 years, with a median of 4 years, and four of the studies collected data over more than 10 years. None of the studies commented upon any longitudinal changes in statistical estimates during the study collection period. PPV did not differ between earlier and later publications: in the six earliest studies published between 2002 and 2012, the PPV ranged between 71% and 100%, while in the most recent studies published between 2013 and 2018, the PPV ranged between 71% and 99%.
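The median of 13 codes can be checked against the “No. of codes” column of Table 2. The short sketch below only re-computes that figure from the tabulated values; the dictionary name is illustrative and study names are abbreviated to the first author.

```python
from statistics import median

# Number of codes per study, transcribed from Table 2.
codes_per_study = {
    "Quan": 14, "Hachem": 2, "Kramer": 3, "Re": 22, "Thygesen": 11,
    "Singal": 11, "Goldberg": 58, "Kanwal": 12, "Rakoski": 12,
    "Fialla": 4, "Rabin": 41, "Nehra": 11, "Ratib": 21, "Chang": 16,
    "Lu": 43, "Mapakshi": 7, "Lapointe-Shaw": 40, "Driver": 33,
}

# With 18 studies the median is the mean of the 9th and 10th ordered
# values (12 and 14).
median_codes = median(codes_per_study.values())
```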

Citation Analysis

We conducted citation analysis focussing on those manuscripts cited most frequently over the last 3 years. The total number of citations per study, the mean number of citations per year over that period and the field‐weighted citation impact (FWCI), which compares how frequently a document is cited in comparison to similar documents (values greater than 1.00 indicate that a publication is cited more than expected according to the average), are shown in Table 4. Over that period, the code set most frequently cited was from Kramer et al, but those from Nehra et al and Goldberg et al were also often reported. This use of different code sets between studies highlights the need for a consensus approach to EHR research in the identification of patients with cirrhosis.
TABLE 4

Details of citation analysis

| Author (year) | Total number of citations | Number of citations within last 3 years (2018, 2019 and 2020) | Field‐weighted citation impact | Mean number of citations per year |
|---|---|---|---|---|
| Kramer et al. 9 | 166 | 56 (18, 21, 17) | 2.67 | 12.8 |
| Re et al. 18 | 76 | 29 (10, 7, 12) | 1.87 | 8.4 |
| Goldberg et al. 11 | 77 | 46 (8, 15, 23) | 1.45 | 9.6 |
| Nehra et al. 15 | 86 | 46 (8, 20, 18) | 2.97 | 10.3 |

Total number of citations since publication is shown alongside the number of citations within the most recent 3 years.


Consensus code set synthesis

The most common codes used across all studies (Table 5) were considered for inclusion in the consensus code set (Table 6). The most frequently used codes were (when mapped to ICD‐10) K70.3 (alcoholic cirrhosis) and K74.6 (other/unspecified cirrhosis). Other commonly used codes related to complications of cirrhosis and portal hypertension, including the presence of oesophageal varices and ascites. Since ascites can occur in conditions unrelated to liver disease (e.g. cardiac or renal failure, or intra‐abdominal malignancy), we considered this code to be of low specificity and excluded it from the proposed consensus code set. This is supported by previous studies, which have found that using the code for ascites alone, rather than in combination with other codes for chronic liver disease, yields a PPV between 43% and 63%.
TABLE 5

Most common codes used to identify cirrhosis with sensitivity for the prediction of cirrhosis in combined UK and US cohorts (sensitivity)

| ICD‐9 code | ICD‐10 code | Description (ICD‐10 version) | Number of authors using code | Sensitivity of individual codes in validation group (total 413 patients), % (n) |
|---|---|---|---|---|
| 571.5 | K74.6 | Other and unspecified cirrhosis of the liver | 16 | 43% (177) |
| 571.2 | K70.3 | Alcoholic cirrhosis of the liver | 16 | 18% (74) |
| 456 | I85 | Oesophageal varices | 14 | 24% (99) |
| −456.0 | I85.0 | With bleeding | | |
| −456.1 | I85.9 | Without bleeding | | |
| −456.2 | I98 | Oesophageal varices in diseases classified elsewhere | | |
| −456.21 | I98.2 | Without bleeding | | |
| −456.20 | I98.3 | With bleeding | | |
| 572.3 | K76.6 | Portal hypertension | 13 | 37% (153) |
| 572.2 | K72.9 | Hepatic failure, unspecified | 12 | 7% (29) |
| 572.4 | K76.7 | Hepatorenal syndrome | 9 | 1% (4) |
| 571.6 | K74.4 | Secondary biliary cirrhosis | 9 | 0 |
| | K74.5 | Biliary cirrhosis, unspecified | | |
| 572.8 | K72.1 | Chronic hepatic failure | 8 | 0 |
| 789.5 | R18.0 | Ascites | 8 | 14% (58) |

Approximate conversions from ICD‐9 to ICD‐10 dictionary have been used to determine the most appropriate code(s). The number of authors using the code includes those papers which used the code in either ICD‐9 or ICD‐10 format. In the sensitivity calculation an individual patient can have multiple codes contributing to the identification of cirrhosis.
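The ICD‐9 to ICD‐10 conversion described above can be sketched as a simple lookup. The mapping below is transcribed from Table 5; the function name `to_icd10` and the assignment of both K74.4 and K74.5 to ICD‐9 571.6 are assumptions made for illustration, not the conversion dictionary the authors used.

```python
# Approximate ICD-9 -> ICD-10 conversions, transcribed from Table 5.
# One ICD-9 code may map to more than one ICD-10 code.
# Assumption: 571.6 maps to both K74.4 and K74.5, as the table layout suggests.
ICD9_TO_ICD10 = {
    "571.5": ["K74.6"],
    "571.2": ["K70.3"],
    "456.0": ["I85.0"],
    "456.1": ["I85.9"],
    "456.20": ["I98.3"],
    "456.21": ["I98.2"],
    "572.3": ["K76.6"],
    "572.2": ["K72.9"],
    "572.4": ["K76.7"],
    "571.6": ["K74.4", "K74.5"],
    "572.8": ["K72.1"],
    "789.5": ["R18.0"],
}


def to_icd10(icd9_codes):
    """Translate ICD-9 codes to a sorted list of ICD-10 codes.

    Codes that are not in the lookup are skipped rather than guessed.
    """
    mapped = set()
    for code in icd9_codes:
        mapped.update(ICD9_TO_ICD10.get(code.strip(), []))
    return sorted(mapped)


print(to_icd10(["571.5", "456.20"]))
```

Because the conversions are approximate, a study applying this kind of lookup would still need to check the mapped codes against the coding dictionary in use locally.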

TABLE 6

Consensus code set

| ICD‐10 code | Description |
|---|---|
| K74.6 | Other and unspecified cirrhosis of the liver |
| K70.3 | Alcoholic cirrhosis of the liver |
| I85 | Oesophageal varices |
| I85.0 | With bleeding |
| I85.9 | Without bleeding |
| I98 | Oesophageal varices in diseases classified elsewhere |
| I98.2 | Without bleeding |
| I98.3 | With bleeding |
| K76.6 | Portal hypertension |
| K72.9 | Hepatic failure, unspecified |
| K76.7 | Hepatorenal syndrome |

Final code set used to define cirrhosis in electronic health records.
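As a minimal sketch of how the consensus code set might be applied, the snippet below flags a patient as code‐positive when any in‐patient or out‐patient diagnosis code matches one of the nine codes in Table 6. The function names and the normalisation step (upper‐casing and re‐inserting a missing dot) are assumptions for illustration; real extracts may need different handling, for example where national modifications of ICD‐10 use longer subcodes.

```python
# The nine ICD-10 codes of the consensus set (Table 6). The headings I85 and
# I98 are not matched as prefixes: only the listed subcodes are included.
CONSENSUS_CODES = frozenset({
    "K74.6", "K70.3",
    "I85.0", "I85.9",
    "I98.2", "I98.3",
    "K76.6", "K72.9", "K76.7",
})


def normalise(code):
    """Upper-case, trim, and insert the dot if missing (e.g. 'k746' -> 'K74.6').

    Hypothetical helper: assumes codes are at most one character past the dot.
    """
    code = code.strip().upper()
    if "." not in code and len(code) > 3:
        code = code[:3] + "." + code[3:]
    return code


def has_cirrhosis_code(patient_codes):
    """Return True if one or more of the patient's codes is in the consensus set."""
    return any(normalise(c) in CONSENSUS_CODES for c in patient_codes)


# Example: a record with a type 2 diabetes code and an undotted cirrhosis code.
print(has_cirrhosis_code(["E11.9", "k746"]))
```

Exact matching is used deliberately here: ascites (R18.0) and other I98 subcodes fall outside the set and therefore do not produce a positive result.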


Validation of code set

We used two independent samples to validate the sensitivity of the consensus code set. In the UK and US cohorts (sensitivity), a result was positive if the EHR contained one or more of these codes, either as an in‐patient or out‐patient where available. This was compared to the code set used most frequently from Kramer and colleagues. In the UK cohort (sensitivity) 300 patients were included. Sixty‐three per cent were male, the mean age at time of diagnosis was 55 years, and the majority had either non‐alcoholic fatty liver disease or alcohol associated liver disease. In the US cohort (sensitivity), 113 patients were included. The mean age was 64 years, 59% were male, and the commonest liver disease aetiology was hepatitis C virus infection. Further details are included in Table S9. The sensitivity for individual codes within the consensus code set was low (Table 5). There were three codes (K74.4, K74.5 and K72.1) which did not appear within either the UK or US cohorts (sensitivity). Given the additional benefit gained from including these codes was likely to be negligible, these were subsequently excluded from the proposed consensus code set (Table 6). The final consensus code set improved the sensitivity in the UK cohort from 44% using the Kramer et al code set to 61% using the consensus code set (P < 0.0001, McNemar’s test). The consensus code set was further evaluated in the subset of the UK validation cohort using different liver stiffness measurements (LSMs) to define cirrhosis. When using a threshold of >20 kPa rather than >15 kPa, the sensitivity for the detection of cirrhosis was improved from 61% to 68% in 227 patients. If the threshold was raised to >25 kPa LSM the sensitivity improved to 74% in 156 individuals. In comparison to the Kramer et al codes the sensitivity was 51% and 58% for patients with a liver stiffness measurement of >20 kPa and >25 kPa respectively. 
Sensitivity in the US cohort was also improved from 89% to 100% (P = 0.0015, McNemar’s test), highlighting the utility of the consensus code set in diverse patient populations. To understand whether relevant information was lost by excluding the term for ascites, we repeated the analyses including this code. In these analyses the sensitivity was not significantly changed; in the UK cohort the sensitivity was 60%, while in the US dataset sensitivity was maintained at 100%. To determine if the inclusion of patients with evidence of prior decompensation altered the performance characteristics, we reviewed the medical records of an additional 33 patients with decompensation events prior to index transient elastography. Twenty‐three of these patients would have been identified by the consensus code set as being cirrhotic. When combined with the UK cohort the overall sensitivity was unchanged at 61% (204/333 patients correctly identified).

We used two further independent samples to validate the positive predictive value of the code set. In the UK cohort (PPV), 335 patients were included. In the US cohort (PPV), 241 patients were included, and in both cohorts alcohol‐related liver disease was the most common underlying aetiology. Additional clinical information is included in Table S9. Of the 335 patients in the UK cohort, 278 patients had cirrhosis confirmed in the medical records, giving a PPV of 83%. In the US cohort 214 of 241 patients had a confirmed diagnosis of cirrhosis, equating to a PPV of 89%.
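The arithmetic behind these estimates can be sketched as follows. Sensitivity is the proportion of patients with confirmed cirrhosis who are code‐positive, and PPV is the proportion of code‐positive patients in whom cirrhosis is confirmed. The paired comparison uses McNemar's test on discordant pairs; the source does not report the discordant counts or whether a continuity correction was applied, so the counts passed to `mcnemar_p` below are hypothetical values inferred from the rounded US percentages, and the corrected chi‐squared form is an assumption.

```python
import math


def proportion(numerator, denominator):
    """Simple proportion, used here for both sensitivity and PPV."""
    return numerator / denominator


def mcnemar_p(b, c):
    """McNemar chi-squared test with continuity correction (1 degree of freedom).

    b: pairs positive by the first code set only.
    c: pairs positive by the second code set only.
    """
    if b + c == 0:
        return 1.0
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-squared variable with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))


# Counts reported in the text.
ppv_uk = proportion(278, 335)            # UK cohort (PPV)
ppv_us = proportion(214, 241)            # US cohort (PPV)
sens_uk_combined = proportion(204, 333)  # UK cohort plus 33 decompensated patients

# HYPOTHETICAL discordant counts, not reported in the source: 12 patients
# identified only by the consensus set, 0 identified only by the Kramer set.
p_us = mcnemar_p(12, 0)

print(f"PPV UK {ppv_uk:.0%}, PPV US {ppv_us:.0%}, "
      f"combined UK sensitivity {sens_uk_combined:.0%}, McNemar P {p_us:.4f}")
```

Only discordant pairs enter the test, which is why it suits a comparison of two code sets applied to the same patients.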

DISCUSSION

Accurate assessments of the population burden and the impact of cirrhosis in EHR research depend on the performance and validity of the coding algorithms used to identify cases. The aim of this study was to synthesise and validate an approach that can be used to facilitate future research to improve the applicability of EHR research findings internationally. We found that there was substantial variation in the codes used to define cirrhosis. We extracted the most frequently used and relevant codes and combined them into a consensus code set (Table 6), with a positive result indicated by the presence of one or more of the included codes in in‐patient or out‐patient records. This code set was validated in two diverse patient populations from Europe and North America. In contrast to the most frequently used code set for cirrhosis, we found that our consensus code set improved sensitivity for the identification of cirrhosis with maintained high PPV. It is intended that this code set is used in future EHR research, where cirrhosis is defined by the presence of one or more of the codes in the set in the in‐patient or out‐patient setting. The code set will enable researchers to collaborate internationally and compare diverse populations of patients with cirrhosis using EHR data.

The purpose and context of diagnostic coding

The increasing importance of EHR‐based research and the role of real‐world evidence in clinical decision making demand a critical appraisal of the tools used to identify cirrhosis in such studies. When reviewing the literature to determine the validity of diagnostic coding one must consider the study purpose, location and the data source from which the codes were extracted. The provision of healthcare and the databases in use vary considerably worldwide, and in developed countries the most important factor to consider is the role of medical billing. In the UK and most Scandinavian countries healthcare is financed through tax payments. European countries such as Germany and France use insurance systems, and Canada employs a government‐led, publicly funded model, with the option of privately paid insurance as a supplement. In the United States there are numerous systems in place, the majority of which rely upon medical billing and coding. Administrative and physician claims databases were developed primarily for the purpose of billing and financial reimbursement. While the accuracy of these databases in identifying diseases has been widely reported, how accurately these findings translate to countries where databases and healthcare systems differ, and where medical billing does not exist, remains unclear.

The need for a consensus code set

We identified important differences in sensitivity between our validation cohorts. This highlights the challenges in translating coding approaches derived from one dataset to another and the importance of reporting validation from different settings when these approaches are being developed and used. The lack of out‐patient (OP) codes in the UK validation cohort likely contributed to the comparatively low sensitivity. While diagnosis and procedural codes are included in the Hospital Episodes Statistics (HES) OP dictionary, they are not frequently recorded alongside OP attendances, and this has been highlighted as an important area of improvement for studies using HES‐derived datasets. Where available, both in‐patient (IP) and OP codes should be used. The most widely used coding algorithm within the literature to date is adopted from Kramer et al, which was developed within the VA system. The VA system differs from the rest of healthcare provided in the United States in structure, funding and demographics. The vast majority of VA patients with cirrhosis are middle‐aged males with a higher prevalence of hepatitis C and comorbidities than the general population. Despite this, more than half of studies citing the Kramer code set were from outside the VA system, suggesting wide adoption of these codes for EHR research, particularly in the United States. However, to facilitate international collaboration and comparison, a consensus code set that is better able to identify cirrhosis has several advantages, and indeed consensus code sets have gained traction in several other disease areas.

Assessing code set performance

There was variation in the measures of performance of the various code sets reported. Most frequently the positive predictive value was reported, and this was often related to the study design, in which the medical records reviewed were already selected to enrich for the presence of cirrhosis. Several factors can improve the sensitivity of code sets, recognising that there is a balance to be found between sensitivity and PPV. Increasing numbers of codes used, codes from both the in‐patient and out‐patient setting, and codes that encompass the whole range of cirrhosis complications all yield improvements in the sensitivity of the described code sets. This increase in sensitivity, however, must be considered in the light of any reductions in the PPV. For example, Nehra and colleagues reported that the inclusion of multiple codes relating to liver decompensation, except for ascites, maximised the sensitivity for the detection of cirrhosis with an acceptable PPV (78.0%). Additionally, they found that almost 5% of patients with cirrhosis had a code for a complication of cirrhosis without a specific cirrhosis code, supporting their inclusion within a code set. The consensus code set incorporates each of these aspects in response to the observations made during the review.

Limitations

There are several limitations to this systematic review. First, the studies reported were often of small validation sets from single institutions without external validation, with inherent bias in the assessment of the presence of cirrhosis in the medical chart review. Second, the weight of importance of the individual codes analysed in the primary reports was seldom reported, meaning that a quantitative analysis was not possible to define the codes carrying the most information in the EHR and how this varied between studies. Third, developing a consensus code set that can be used across all healthcare systems is a challenge and we recognise that no two systems are the same. Fourth, validation using chart review has inherent limitations with the potential for misclassification, although the extraction was done blind to the code set evaluation. The approaches taken in the qualitative synthesis recognise these limitations and validation in four diverse patient populations addresses, to some extent, issues regarding the validity of the consensus code set across healthcare systems. Fifth, it is recognised that the sensitivity in the UK cohort was comparatively low at 60%. This was in part owing to the population, which comprised patients who had undergone transient elastography in the out‐patient setting, and in part owing to the lack of out‐patient coded data, meaning a proportion of patients did not have any coded information that could be used. Sixth, as the patients in the assessment of PPV were identified using the consensus code set, we were unable to assess its specificity or negative predictive value since no code set negative cases were identified to enter the cohort. This is also a limitation to the description of existing code sets where these measures are infrequently reported. The potential impact of the uncertainty regarding the specificity of the consensus code set should be considered in the design of EHR‐based studies.
Finally, as the validation was conducted in two tertiary care systems, further evaluation of the performance of the consensus code set in other healthcare systems would be appropriate.

CONCLUSIONS

A large number of diagnostic codes and combinations of these codes have been proposed to define cirrhosis in EHR research. In this systematic review we have defined a consensus code list of nine codes that increase sensitivity for the identification of cirrhosis in patients from both Europe and the United States with maintained high positive predictive values. This consensus code set is proposed to align EHR study designs in cirrhosis to facilitate international collaboration and comparisons.

AUTHORSHIP

Guarantor of the article: None. Author contributions: Jessica E. Shearer and Ian A. Rowe were involved in study concept and design. Jessica E. Shearer, Juan J. Gonzalez, Thazin Min, Grace L. Su, and Elliot B. Tapper acquired, analysed, and interpreted data. Jessica E. Shearer drafted the manuscript. Ian A. Rowe, Richard Parker, and Rebecca Jones critically revised the manuscript. All authors approved the final manuscript. Supplementary Tables S1‐S10 are provided as an additional data file.