Validity of ICD-9-CM codes for breast, lung and colorectal cancers in three Italian administrative healthcare databases: a diagnostic accuracy study protocol.

Iosief Abraha1, Diego Serraino2, Gianni Giovannini1, Fabrizio Stracci3, Paola Casucci4, Giuliana Alessandrini4, Ettore Bidoli2, Rita Chiari5, Roberto Cirocchi6, Marcello De Giorgi4, David Franchini4, Maria Francesca Vitale7, Mario Fusco7, Alessandro Montedori1.   

Abstract

INTRODUCTION: Administrative healthcare databases are useful tools to study healthcare outcomes and to monitor the health status of a population. Patients with cancer can be identified through disease-specific codes, prescriptions and physician claims, but prior validation is required to achieve an accurate case definition. The objective of this protocol is to assess the accuracy of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in identifying patients diagnosed with the respective disease in three Italian administrative databases.
METHODS AND ANALYSIS: Data from the administrative databases of Umbria Region (910,000 residents), Local Health Unit 3 of Napoli (1,170,000 residents) and Friuli-Venezia Giulia Region (1,227,000 residents) will be considered. In each administrative database, patients with the first occurrence of diagnosis of breast, lung or colorectal cancer between 2012 and 2014 will be identified using the following groups of ICD-9-CM codes in primary position: (1) 233.0 and (2) 174.x for breast cancer; (3) 162.x for lung cancer; (4) 153.x for colon cancer and (5) 154.0-154.1 and 154.8 for rectal cancer. Only incident cases will be considered, that is, excluding cases that have the same diagnosis in the 5 years (2007-2011) before the period of interest. A random sample of cases and non-cases will be selected from each administrative database and the corresponding medical charts will be assessed for validation by pairs of trained, independent reviewers. Case ascertainment within the medical charts will be based on (1) the presence of a primary nodular lesion in the breast, lung or colon-rectum, documented with imaging or endoscopy and (2) a cytological or histological documentation of cancer from a primary or metastatic site. Sensitivity and specificity with 95% CIs will be calculated.
DISSEMINATION: Study results will be disseminated widely through peer-reviewed publications and presentations at national and international conferences. Published by the BMJ Publishing Group Limited.

Keywords:  administrative database; breast, lung and colorectal cancers; validating ICD-9 codes

Year:  2016        PMID: 27016247      PMCID: PMC4809074          DOI: 10.1136/bmjopen-2015-010547

Source DB:  PubMed          Journal:  BMJ Open        ISSN: 2044-6055            Impact factor:   2.692


The study will evaluate the validity of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for breast, lung and colorectal cancers in three large Italian administrative databases. A strength of this study is that it will use medical chart review to ascertain cancer cases. Once these administrative databases are validated for breast, lung and colorectal cancers, they can be used for outcome research, including pharmacoepidemiology, health services research and quality of care research. This study will be the first to validate ICD-9-CM codes for three cancers in three large administrative databases in Italy. The results of validation studies of administrative data are specific to the context in which they are conducted and are not generalisable to other settings.

Introduction

As computer technology continues to advance, administrative databases are growing in healthcare settings worldwide. These databases anonymously store data on the healthcare assistance that patients receive, including birth, death or disease treatment. Usually, the diagnosis of a disease is associated with a specific code from the International Classification of Diseases, Ninth Revision (ICD-9) or 10th Revision (ICD-10). The ICD is designed to map health conditions to corresponding generic categories together with specific variations.1 Merging individual patient data from administrative databases with other sources (eg, prescription and laboratory data) makes it possible to investigate a wide range of relevant and often unique public health questions,2 monitor population health status over time and perform population-based pharmacoepidemiological research.2–4 To constitute a reliable source for research studies, administrative healthcare databases must be adequately validated. While non-clinical information in healthcare databases, such as demographic and prescription data, is highly accurate,5 6 the validity of registered diagnoses and procedures is variable.6 7 Determining the accuracy of the latter two categories of clinical information is vitally important to all potential users and involves confirming the consistency of information within the databases with the corresponding clinical records of patients.5 In Italy, all the Regional Health Authorities maintain large healthcare information systems containing patient data from all hospital and territorial sources. These databases have the potential to address important issues in postmarketing surveillance,8 9 epidemiology,10 quality performance and health services research.11 However, there is a concern that their considerable potential as a source of reliable healthcare information has not been realised since they have not been widely validated.
A systematic review of ICD-9 code validation in Italian administrative databases12 reported that only a few regional databases have been validated for a limited number of ICD-9 codes of diseases including stroke,13 14 gastrointestinal bleeding,15 thrombocytopenia,16 epilepsy,17 infections,18 chronic obstructive pulmonary disease,19 20 Guillain-Barré syndrome21 and cancers.22 23 In addition, the use of these databases was scarce, as only six administrative databases served as sources for published research articles based on the validated ICD-9 codes. Hence, it is imperative that Regional Health Authorities systematically validate their databases for critical diseases to productively use the information they contain. Breast, colorectal and lung cancers are the most commonly diagnosed neoplasms worldwide, as well as in Italy.24 Consequently, they are of interest to the scientific community and to industry as targets for the development of new drugs, and to governments, given that they impose an important public health and economic burden. For example, variation in the epidemiology of breast,25 colorectal26 27 and lung28 cancers, the treatment (pharmacological or surgical) administered to patients with these cancers and potential clinical and economic outcomes29–31 can all be evaluated using validated administrative databases. The objective of the present protocol is to evaluate the accuracy of the ICD-9-CM codes related to breast, lung and colorectal cancers in correctly identifying the respective diseases using three large Italian administrative healthcare databases.

Methods

Setting and data source

Administrative databases

Starting from the early 1990s, local and regional Italian healthcare administrative databases have collected information from all patient medical records from public and private hospitals including demographics, hospital admission and discharge dates, vital statistics, the admitting hospital department, the principal diagnosis and a maximum of five secondary discharge diagnoses, and the principal and five secondary, surgical and diagnostic procedures. In addition, these databases contain the records of all drug prescriptions listed in the National Drug Formulary and the basic characteristics of patients' physicians. Each resident has a unique national identification code with which it is possible to link the various types of information, corresponding to each person, within the database. In Italy, healthcare assistance is covered almost entirely by the Italian National Health System (NHS); therefore, most residents' significant healthcare information can be found within the healthcare databases. The target administrative databases for the present study will be from the Umbria Region (910 000 residents), Local Health Unit 3 of Napoli (1 170 000 residents) and the Friuli-Venezia Giulia Region (1 227 000 residents). For each database, the corresponding Unit (Regional Health Authority of Umbria for Umbria Region, Registro Tumori Regione Campania for Local Health Unit 3 of Napoli and Centro di Riferimento Oncologico Aviano for Friuli-Venezia Giulia Region) will conduct the same validation process.

Source population

The source population will be represented by permanent residents aged 18 years or above of Umbria Region, Local Health Unit 3 of Napoli and the Friuli-Venezia Giulia Region. Any resident who has been discharged from hospital with a diagnosis of breast, lung or colorectal cancer will be considered. Residents who have been hospitalised outside the regional territory of competence will be excluded from analysis due to the difficulty in obtaining the medical charts.

Case selection and sampling method

In each administrative database, patients with the first occurrence of diagnosis of breast, lung or colorectal cancer between 2012 and 2014 will be identified using the following groups of ICD-9-CM codes located in primary position: (1) 233.0 and (2) 174.x for breast cancer; (3) 162.x for lung cancer; (4) 153.x for colon cancer and (5) 154.0–154.1 and 154.8 for rectal cancer. Only incident cases will be considered, that is, excluding cases with the same diagnosis (ICD-9-CM codes in any position) in the 5 years (2007–2011) before the period of interest. Subsequently, for each of the above reported groups of ICD-9-CM codes, a random sample of cases will be selected from each administrative database. Table 1 displays the description of the ICD-9-CM codes for each of the cancer diseases of interest.
Table 1

Description of the ICD-9-CM codes related to breast, lung and colorectal cancers

Condition: ICD-9-CM diagnosis code

Breast
    Carcinoma in situ: 233.0
    Malignant neoplasm of the female breast:
        174.0 nipple and areola
        174.1 central portion
        174.2 upper-inner quadrant
        174.3 lower-inner quadrant
        174.4 upper-outer quadrant
        174.5 lower-outer quadrant
        174.6 axillary tail
        174.8 other specified sites of the female breast
        174.9 breast female, unspecified

Lung
    Malignant neoplasm of the trachea, bronchus and lung:
        162.0 trachea
        162.2 main bronchus
        162.3 upper lobe, bronchus or lung
        162.4 middle lobe, bronchus or lung
        162.5 lower lobe, bronchus or lung
        162.8 other parts of the bronchus or lung
        162.9 bronchus and lung, unspecified

Colorectal
    Malignant neoplasm of the colon:
        153.0 hepatic flexure
        153.1 transverse colon
        153.2 descending colon
        153.3 sigmoid colon
        153.4 caecum
        153.5 appendix
        153.6 ascending colon
        153.7 splenic flexure
        153.8 other specified sites of the large intestine
        153.9 colon, unspecified
    Malignant neoplasm of the rectum and rectosigmoid junction:
        154.0 rectosigmoid junction
        154.1 rectum
        154.8 other

ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification.

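As an illustration only, the case selection rule described in this section (a code of interest in primary position during 2012-2014, and no record of the same diagnosis in any position during 2007-2011) could be sketched in Python as follows. The record layout, the field names, the use of the discharge date as the reference date and the reading of "same diagnosis" as "same code group" are assumptions made for this sketch; the protocol does not specify them.

```python
from dataclasses import dataclass
from datetime import date

# Code groups listed in the protocol, expressed as string prefixes.
CODE_GROUPS = {
    "breast_in_situ": ("233.0",),
    "breast_invasive": ("174.",),
    "lung": ("162.",),
    "colon": ("153.",),
    "rectum": ("154.0", "154.1", "154.8"),
}


@dataclass
class Discharge:
    """Hypothetical discharge record; the field names are illustrative."""
    patient_id: str
    discharge_date: date
    diagnoses: list  # ICD-9-CM code strings; index 0 is the principal diagnosis


def group_of(code):
    """Return the code group a diagnosis code belongs to, or None."""
    for group, prefixes in CODE_GROUPS.items():
        if code.startswith(prefixes):
            return group
    return None


def incident_cases(records,
                   start=date(2012, 1, 1),
                   end=date(2014, 12, 31),
                   lookback_start=date(2007, 1, 1)):
    """Map each code group to the sorted patient ids of incident cases."""
    prior = set()       # (patient, group) seen in any position during look-back
    candidates = set()  # (patient, group) seen in primary position in the window
    for rec in records:
        if lookback_start <= rec.discharge_date < start:
            for code in rec.diagnoses:
                group = group_of(code)
                if group is not None:
                    prior.add((rec.patient_id, group))
        elif start <= rec.discharge_date <= end and rec.diagnoses:
            group = group_of(rec.diagnoses[0])
            if group is not None:
                candidates.add((rec.patient_id, group))
    result = {}
    for patient_id, group in candidates - prior:
        result.setdefault(group, []).append(patient_id)
    return {group: sorted(ids) for group, ids in result.items()}
```

A random sample per code group could then be drawn from each returned list, for example with `random.sample`.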

Chart abstraction and case ascertainment

The corresponding medical charts of the randomly selected sample cases will be obtained from hospitals for validation purposes. From each medical chart, the following information will be retrieved: initials of the patient, date of birth, sex, dates of hospital admission and discharge, any diagnostic procedure that contributed to the diagnosis of the cancer, and any pharmacological or surgical intervention that was provided for the treatment of the cancer. Within each unit, two reviewers will receive training on data abstraction. An initial consensus chart review will be performed with each reviewer independently examining the same set of medical charts (n=20). The inter-rater agreement regarding the presence or absence of breast, lung or colorectal cancer between the pair of reviewers within each unit will be calculated using the κ statistic. This process will be repeated until the strength of agreement between the reviewers is near perfect (κ statistic between 0.81 and 1.00). Any discrepancies will be discussed and resolved through third party involvement (RC). Case ascertainment of cancer within a medical chart will be based on (1) the presence of a primary nodular lesion in the breast, lung or colon–rectum, documented with imaging or endoscopy and (2) the cytological or histological documentation of cancer from a primary or metastatic site. Following consensus review, data abstraction will be completed independently. To ensure consistency among all the reviewers, cases with uncertainty will be discussed and resolved through third party involvement (RC).
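To make the agreement check concrete, the following is a minimal sketch of Cohen's κ for two reviewers rating the same charts, together with the 0.81 threshold stated above. The protocol does not say which κ variant or software will be used, so the unweighted two-rater form and the function names are assumptions of this sketch.

```python
def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of charts")
    n = len(rater_a)
    # Observed agreement: share of charts on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal proportions.
    labels = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(lab) / n) * (rater_b.count(lab) / n)
                   for lab in labels)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every chart
    return (observed - expected) / (1 - expected)


def agreement_reached(kappa, threshold=0.81):
    """True when kappa falls in the 0.81-1.00 band named in the protocol."""
    return kappa >= threshold


# Made-up ratings for 20 training charts (1 = cancer present, 0 = absent).
reviewer_1 = [1] * 10 + [0] * 10
reviewer_2 = [1] * 9 + [0] * 11   # disagrees with reviewer_1 on one chart
kappa = cohen_kappa(reviewer_1, reviewer_2)
```

With these invented ratings the reviewers agree on 19 of 20 charts and chance agreement is 0.5, so κ is 0.9 and the training round would not need to be repeated.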

Validation criteria

For non-invasive breast cancer, we will consider the ICD-9-CM code 233.0 valid when there is evidence of a breast nodule documented with imaging (eg, mammography) and a histological diagnosis of ductal or lobular breast carcinoma in situ (pTis). For invasive breast cancer, we will consider the ICD-9-CM codes 174.x valid when there is evidence of a breast nodule documented with imaging (eg, mammography) and a cytological or histological diagnosis from a primary or metastatic site positive for ductal or lobular adenocarcinoma. For lung cancer, we will consider the ICD-9-CM codes 162.x valid when there is evidence of a pulmonary nodule documented with imaging (eg, CT scan) and a cytological or histological diagnosis from a primary or metastatic site positive for either small cell lung cancer (microcitoma) or non-small cell lung cancer (NSCLC). For colon cancer, we will consider the ICD-9-CM codes 153.x valid when there is evidence of a neoplastic lesion within the colon, documented with endoscopy (eg, colonoscopy) or imaging (eg, barium enema), and a histological diagnosis from a primary or metastatic site positive for adenocarcinoma, squamous cell carcinoma or neuroendocrine carcinoma. For rectal cancer, we will consider the ICD-9-CM codes 154.0–154.1 and 154.8 valid when there is evidence of a neoplastic lesion in the rectosigmoid junction or the rectum, documented with endoscopy (eg, colonoscopy) or imaging (eg, barium enema), and a histological diagnosis from a primary or metastatic site positive for adenocarcinoma or squamous cell carcinoma.

Statistical analysis

We calculated that a sample of 130 charts of cases will be necessary to obtain an expected sensitivity of 80% with a precision of 10% and a power of 80%. To calculate specificity, we will randomly select non-cases, that is, records without the ICD-9-CM codes of interest, from the administrative database. The corresponding medical charts will be retrieved and evaluated. We calculated that a sample of 94 charts of non-cases will be necessary to obtain an expected specificity of 90% with a precision of 10% and a power of 80%. Overall, each unit will evaluate 1120 charts. Sensitivity and specificity will be analysed separately for each ICD-9-CM code by constructing 2×2 tables. Sensitivity expresses the proportion of ‘true positives’ (ie, cancer cases classified as positive by both the administrative database and medical record review) among all cases deemed positive by the medical chart review. Specificity expresses the proportion of ‘true negatives’ (ie, records classified as without cancer by both the administrative database and medical record review) among all cases deemed negative by the medical chart review. For both sensitivity and specificity, 95% CIs will be calculated.
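The 2×2 calculation described above can be sketched as follows. The protocol does not state which method will be used for the 95% CIs, so the Wilson score interval is an assumption of this sketch, as are the function names and the cell counts in the example, which are invented and are not study results. The sketch also checks that the stated total of 1120 charts per unit is consistent with five code groups, each with 130 case charts and 94 non-case charts.

```python
import math

# Five code groups (233.0, 174.x, 162.x, 153.x, 154.0-154.1/154.8),
# each with 130 case charts and 94 non-case charts.
CHARTS_PER_UNIT = 5 * (130 + 94)


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity, each as (estimate, lower, upper).

    tp: code present, cancer confirmed by chart review
    fp: code present, cancer not confirmed
    fn: code absent, cancer found on chart review
    tn: code absent, no cancer on chart review
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": (sensitivity, *wilson_ci(tp, tp + fn)),
        "specificity": (specificity, *wilson_ci(tn, tn + fp)),
    }


# Invented counts for one code group: 130 coded charts and 94 uncoded charts.
result = accuracy_from_2x2(tp=104, fp=26, fn=2, tn=92)
```

The Wilson interval is used here because it stays within 0 and 1 for proportions close to the boundaries; an exact (Clopper-Pearson) interval would be an equally reasonable choice.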

Reporting

Complete, transparent and accurate reporting is essential in diagnostic accuracy studies because it allows readers to assess internal validity as well as to evaluate the generalisability and applicability of results.32 To ensure quality reporting, any reporting or publication of the results from this study will follow recommended guidelines based on the criteria published by the Standards for Reporting of Diagnostic accuracy (STARD) initiative for the accurate reporting of investigations of diagnostic studies.32–34

Discussion

In this protocol, we present the approach we will use to analyse the validity of ICD-9-CM codes for breast, lung and colorectal cancers in administrative databases representing northern, central and southern Italy. Administrative databases constitute a valid alternative in situations in which randomised trials are not able to provide the required evidence for practical or economic reasons. In addition, although epidemiological studies on cancer are frequently based on cancer registries,35–37 administrative databases can add further value, especially in pharmacoepidemiology3 12 38 and health services research.39 40 Accurate identification of cancer cases using the ICD-9-CM codes may contribute to monitoring cancer trends and to proposing interventions to improve cancer care. In 2008, an Italian study developed and validated an algorithm using a regional administrative database to determine incident cases of breast, lung and colorectal cancers and found a sensitivity of 76.7%, 80.8% and 72.4%, respectively, for the three cancers.22 This study will add value to the knowledge of the three cancer diseases given that it covers different areas of Italy.

Ethics and dissemination

Study results will be disseminated widely through peer-reviewed publications and presentations at national and international conferences.
  35 in total

Review 1.  A systematic review of discharge coding accuracy.

Authors:  S E Campbell; M K Campbell; J M Grimshaw; A E Walker
Journal:  J Public Health Med       Date:  2001-09

2.  Epidemiology of primary and secondary thrombocytopenia: first analysis of an administrative database in a major Italian institution.

Authors:  Michela Galdarossa; Fabrizio Vianello; Fabiana Tezza; Emanuele Allemand; Martina Treleani; Pamela Scarparo; Fabrizio Fabris
Journal:  Blood Coagul Fibrinolysis       Date:  2012-06       Impact factor: 1.276

3.  Fresh evidence confirms links between newer contraceptive pills and higher risk of venous thromboembolism.

Authors:  Susan S Jick
Journal:  BMJ       Date:  2015-05-26

4.  [The role of the quality of hospital discharge records on the comparative evaluation of outcomes: the example of chronic obstructive pulmonary disease (COPD)].

Authors:  Valeria Fano; Mariangela D'Ovidio; Katiuscia del Zio; Davide Renzi; Daniela Tariciotti; Nera Agabiti; Lucia Argenti; Maria Sofia Cattaruzza; Antonio Fortino
Journal:  Epidemiol Prev       Date:  2012 May-Aug       Impact factor: 1.901

5.  EU-ADR healthcare database network vs. spontaneous reporting system database: preliminary comparison of signal detection.

Authors:  Gianluca Trifirò; Vaishali Patadia; Martijn J Schuemie; Preciosa M Coloma; Rosa Gini; Ron Herings; Julia Hippisley-Cox; Giampiero Mazzaglia; Carlo Giaquinto; Lorenza Scotti; Lars Pedersen; Paul Avillach; Miriam C J M Sturkenboom; Johan van der Lei
Journal:  Stud Health Technol Inform       Date:  2011

6.  Identification of metastatic cancer in claims data.

Authors:  Beth L Nordstrom; Joanna L Whyte; Marilyn Stolar; Catherine Mercaldi; Joel D Kallich
Journal:  Pharmacoepidemiol Drug Saf       Date:  2012-05       Impact factor: 2.890

Review 7.  Towards complete and accurate reporting of studies of diagnostic accuracy: the STARD initiative.

Authors:  Patrick M Bossuyt; Johannes B Reitsma; David E Bruns; Constantine A Gatsonis; Paul P Glasziou; Les M Irwig; Jeroen G Lijmer; David Moher; Drummond Rennie; Henrica C W de Vet
Journal:  BMJ       Date:  2003-01-04

8.  Accuracy of ICD-9 codes in identifying ischemic stroke in the General Hospital of Lugo di Romagna (Italy).

Authors:  R Rinaldi; L Vignatelli; M Galeotti; G Azzimondi; P de Carolis
Journal:  Neurol Sci       Date:  2003-06       Impact factor: 3.307

9.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

10.  Chronic disease prevalence from Italian administrative databases in the VALORE project: a validation through comparison of population estimates with general practice databases and national survey.

Authors:  Rosa Gini; Paolo Francesconi; Giampiero Mazzaglia; Iacopo Cricelli; Alessandro Pasqua; Pietro Gallina; Salvatore Brugaletta; Daniele Donato; Andrea Donatini; Alessandro Marini; Carlo Zocchetti; Claudio Cricelli; Gianfranco Damiani; Mariadonata Bellentani; Miriam C J M Sturkenboom; Martijn J Schuemie
Journal:  BMC Public Health       Date:  2013-01-09       Impact factor: 3.295

