
Bayesian adjustment for trend of colorectal cancer incidence in misclassified registering across Iranian provinces.

Sajad Shojaee1, Nastaran Hajizadeh2, Hadis Najafimehr3, Luca Busani4, Mohamad Amin Pourhoseingholi3, Ahmad Reza Baghestani2, Maryam Nasserinejad2, Sara Ashtari1, Mohammad Reza Zali3.   

Abstract

Misclassification error is a common problem of cancer registries in developing countries that leads to biased cancer rates. The purpose of this research is to use a Bayesian method to correct misclassification in the registered cancer incidence of eighteen provinces in Iran. Incidence data of patients with colorectal cancer were extracted from the Iranian annual national cancer registration reports from 2005 to 2008. A province with proper medical facilities can always be compared to its neighbors. The misclassification rate was estimated at almost 28% between East Azarbaijan and West Azarbaijan, 56% between Fars and Hormozgan, 43% between Isfahan and Chaharmahal and Bakhtyari, 46% between Isfahan and Lorestan, 58% between Razavi Khorasan and North Khorasan, 50% between Razavi Khorasan and South Khorasan, 74% between Razavi Khorasan and Sistan and Balochestan, 43% between Mazandaran and Golestan, 37% between Tehran and Qazvin, 45% between Tehran and Markazi, 42% between Tehran and Qom, and 47% between Tehran and Zanjan. Correcting the regional misclassification and obtaining the correct rates of cancer incidence in different regions is necessary for planning cancer control and prevention programs and for allocating healthcare resources.


Year:  2018        PMID: 30543626      PMCID: PMC6292591          DOI: 10.1371/journal.pone.0199273

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Colorectal cancer (CRC) is the third most common cancer among men (10.0% of the total) and the second among women (9.2% of the total) worldwide. Mortality is lower (694,000 deaths, 8.5% of the total), with more deaths (52%) in the less developed regions of the world, reflecting poorer survival in these regions [1]. In Iran, CRC is the fourth most common type of cancer (the third most common among females and the fifth among males) and accounts for 8.4% of all cancers in the country [2,3]. Incidence varies widely across the world: the highest estimated rates are in Australia/New Zealand and the lowest in Western Africa. About 55% of cases occur in more developed regions, partly because of their advanced diagnostic and registration capabilities [1]. Inflammatory bowel disease, family history of CRC, obesity, dietary habits, smoking, physical inactivity [2,4] and diabetes [5] are well-known risk factors for CRC. Furthermore, environmental risk factors play an important role in the incidence and development of CRC [4]. People who live in the same or adjacent areas, and are therefore exposed to the same environmental risk factors, are expected to have similar cancer incidence rates. Accurate population-based information on the occurrence of cancer is extremely valuable as the foundation for identifying risk factors and making purposeful cancer prevention policies, because cancer is a leading cause of morbidity and mortality worldwide [6-8]. Cancer registries, as the main sources of epidemiological data, collect information on the burden of cancers by systematically recording the incidence, prevalence, survival and mortality of different cancers [9-11]. Their role has since expanded to detecting the impact of cancer control interventions, evaluating screening programs and specifying future needs for material and manpower resources.
However, deficiencies in registering individual information, including the patient’s permanent residence, primary site of tumor, date of diagnosis and date of death [8], make the recorded data too inaccurate to use in future studies. In many developing countries like Iran, most cancer patients prefer to obtain diagnostic and treatment services in the capital or in neighboring provinces, since health facilities are not distributed evenly throughout the country [12]. Some patients never mention their permanent residence and are registered in the province where they are treated. This causes misclassification error in cancer registry data. Misclassification error is the disagreement between the observed value and the true value in categorical data. The expected coverage of new cancer cases in different provinces provides evidence of misclassification error in registered cancer incidence: the observed number of incident cases is higher than the expected number in some provinces and lower than expected in a neighboring province [13]. This occurs even though the rate of cancer incidence is expected to be about the same in adjacent provinces, since their people adopt very similar lifestyles and traditions and are exposed to the same environmental conditions. There are two approaches to correcting misclassification error. The first is to validate a small sample of the data by rechecking medical records and to extend the results to the target population [14]. The second is to employ the Bayesian method, a statistical approach that lets us take prior evidence into account [15] by specifying prior information for some of the parameters [16-18]. The aim of this study is to investigate the trend of colorectal cancer incidence across provinces in Iran after estimating the misclassification rate in registered cancer incidence with a Bayesian method and re-estimating the incidence rate in each province.

Material and methods

Cancer reports are obtained from different sources such as pathology laboratories, hospitals and death certificates. Data on registered cancer cases for 2005 to 2008 were extracted from the Iranian annual national cancer registration reports. These reports are produced with software created by the Ministry of Health, through which cancer cases are collected, registered and centralized, and which has been used for data analysis in recent years. All newly diagnosed cancer cases held in the temporary information bank are sent periodically from the medical universities to the Ministry of Health. After duplicate checking and coding of the recorded cancers according to the 10th revision of the International Classification of Diseases, the information is registered in the permanent information bank, and all changes are sent back to the medical universities at specified intervals, so that the permanent information bank of each medical university is kept identical to that of the Ministry of Health. Each medical university therefore has an observed number of cancer cases and an expected number of cancer cases, the latter taken to be 100 per 100,000 population in every year except 2008, when it was 113 per 100,000. Dividing the observed number by the expected number of cancer cases gives the percentage of expected coverage for each province [13]. Earlier this year, the national population-based cancer registry of the Islamic Republic of Iran was established, the International Agency for Research on Cancer (IARC) accepted Iran as a new Participating State, and these registry data have been submitted to IARC to contribute to the next publications of GLOBOCAN and Cancer Incidence in Five Continents [19]. Because comparing crude rates, i.e. all cancer cases in the total population regardless of age group, can give a misleading picture, age-standardized rates (ASR) were calculated for all provinces of Iran using the direct standardization method.
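The two calculations named in this paragraph, the percentage of expected coverage and a directly standardized rate, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the population, the case counts, the person-years and the weights are all hypothetical (the weights are not the actual world standard population), and only the reference rates of 100 per 100,000 (113 per 100,000 for 2008) come from the text.

```python
def expected_coverage_percent(observed_cases, population, expected_rate_per_100k=100.0):
    """Observed cases as a percentage of the number expected at the reference rate."""
    expected_cases = population * expected_rate_per_100k / 100_000
    return 100.0 * observed_cases / expected_cases


def direct_asr(cases, person_years, weights):
    """Directly standardized rate per 100,000: weighted mean of age-specific rates."""
    if not (len(cases) == len(person_years) == len(weights)):
        raise ValueError("all inputs must have one entry per age group")
    # Age-specific rate per 100,000 person-years in each age group.
    rates = [100_000 * c / py for c, py in zip(cases, person_years)]
    # Dividing by the weight total reduces to sum(W * a) when the weights sum to 1.
    return sum(w * a for w, a in zip(weights, rates)) / sum(weights)


# Hypothetical province: 1,208 registered cases in a population of 1,000,000.
coverage = expected_coverage_percent(1208, 1_000_000)

# Hypothetical counts, person-years and standard weights for four age groups.
asr = direct_asr(
    cases=[2, 10, 40, 30],
    person_years=[300_000, 500_000, 150_000, 50_000],
    weights=[0.30, 0.50, 0.15, 0.05],
)
print(f"coverage = {coverage:.1f}%, ASR = {asr:.2f} per 100,000")
```

A coverage above 100% means a province registered more cases than the reference rate predicts, which is the signal the study uses to pair it with a low-coverage neighbor.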
The direct method is based on first selecting a standard population and then calculating the rate that this population would have if it experienced the age-specific rates of each of the populations being compared. Age groups were first defined in 5-year intervals. The world standard population, the most commonly used standard, supplied the weights ($W_i$). The age-specific rate $a_i$ was calculated per 100,000 by dividing the number of incident cases by the person-years of observation. Finally, for four age groups (0–14, 15–49, 50–69 and 70 years or older) and for both genders, the ASR was calculated so that the statistics can be compared internationally: $\mathrm{ASR} = \sum_i W_i a_i$ [20-22]. Two vectors, $Y_1$ and $Y_2$, were used to enter the data into the Bayesian model: $Y_1 = [Y_{11},\dots,Y_{r1}]$ for the province with an expected coverage of less than 100%, containing the correctly registered ASR, and $Y_2 = [Y_{12},\dots,Y_{r2}]$ for a neighboring province with an expected coverage of more than 100%, whose ASR includes cases from the first group incorrectly labeled as belonging to the misclassified group. The subscript $r$ is the number of covariate patterns for the age and sex group combinations. A Poisson distribution was considered for the count data $Y_1$ and $Y_2$, an approach first introduced by Stamey et al [20], then developed by Pourhoseingholi et al for cancer mortality and adopted by Hajizadeh et al for Iranian cancer incidence [22-24]: $Y_{i1} \sim \mathrm{Poisson}(P_i\mu_{i1})$ and $Y_{i2} \sim \mathrm{Poisson}(P_i\mu_{i2})$, in which $\mu_{i1} = \lambda_{i1}(1-\theta)$ and $\mu_{i2} = \lambda_{i1}\theta + \lambda_{i2}$. The joint distribution of the count data $Y_1$ and $Y_2$ is proportional to:

$$f(Y_1, Y_2 \mid \lambda_1, \lambda_2, \theta) \propto \prod_{i=1}^{r} \left[\lambda_{i1}(1-\theta)\right]^{Y_{i1}} \left[\lambda_{i1}\theta + \lambda_{i2}\right]^{Y_{i2}} \exp\{-P_i(\lambda_{i1}+\lambda_{i2})\}$$

An informative beta prior distribution was assumed for $\theta$, the probability that an observation from the first group is incorrectly registered in the misclassified group, so $\theta \sim \mathrm{Beta}(a,b)$. To select the prior parameters of the beta distribution, the calculated expected coverage of the medical university with less than 100% expected coverage was used as $b$, and $a$ was calculated by subtracting $b$ from 100.
Thus $a/(a+b)$, the expectation of the beta distribution, approximates the misclassification rate. A latent variable $U_i$, the number of events from the first group that are incorrectly registered in the misclassified group, was given a binomial distribution, i.e. $U_i \mid Y_1, Y_2, \theta, \lambda_1, \lambda_2 \sim \mathrm{Binomial}(Y_{i2}, p_i)$, where under the Poisson model $p_i = \lambda_{i1}\theta/(\lambda_{i1}\theta + \lambda_{i2})$. Now, if $\theta$, $Y_1$ and $Y_2$ are treated as unknown, we have: [equation not preserved in the source]. But since $Y_1$ and $Y_2$ are the known ASR values of two neighboring provinces, only $\theta$ is unknown. Employing a latent variable approach to correct the misclassification effect according to Paulino et al. [25,26], Liu et al. [27] and Stamey et al. [20], with a Gibbs sampling algorithm, the posterior takes the following form: [equation not preserved in the source]. To decide which provinces to pair, a low-facility province with an expected coverage of less than 100% was identified, and the province with a coverage of more than 100% (usually adjacent) was the one adjusted. Low-facility status was based on the local annual statistical reports of unemployment, average income, etc. After the misclassification rate between each pair of neighboring provinces had been estimated, the rates of colorectal cancer incidence for each province were re-estimated and the trend of colorectal cancer from 2005 to 2008 was examined. The analyses were performed in R software version 3.3.1.
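To make the latent variable scheme concrete, here is a minimal Gibbs sampler sketch in Python with NumPy. It is not the authors' R code, and several pieces are assumptions introduced only for illustration: the function name, the Gamma(shape, rate) priors on the rates λ (the text does not state priors for them), the exposure values standing in for P, and the example counts. Under those assumptions the full conditionals are U_i ~ Binomial(Y_i2, p_i) with p_i = λ_i1 θ / (λ_i1 θ + λ_i2), λ_i1 ~ Gamma(shape + Y_i1 + U_i, rate + P_i), λ_i2 ~ Gamma(shape + Y_i2 − U_i, rate + P_i) and θ ~ Beta(a + ΣU_i, b + ΣY_i1), with the prior parameters b = coverage of the low-coverage province and a = 100 − b as described in the text.

```python
import numpy as np


def gibbs_misclassification(y1, y2, exposure, coverage_low, n_iter=4000,
                            burn_in=1000, gamma_shape=0.5, gamma_rate=1e-4, seed=0):
    """Posterior draws of theta for the two-province Poisson misclassification model.

    y1: counts registered in the low-coverage province (one per covariate pattern)
    y2: counts registered in the high-coverage neighbor
    exposure: person-years per covariate pattern (hypothetical stand-in for P)
    coverage_low: expected coverage (%) of the low-coverage province, used as b
    """
    if not 0.0 < coverage_low < 100.0:
        raise ValueError("coverage_low must lie strictly between 0 and 100")
    rng = np.random.default_rng(seed)
    y1 = np.asarray(y1, dtype=np.int64)
    y2 = np.asarray(y2, dtype=np.int64)
    exposure = np.asarray(exposure, dtype=float)

    # Beta prior from the text: b = coverage, a = 100 - b.
    b = float(coverage_low)
    a = 100.0 - b

    # Starting values: prior mean for theta, crude rates for the lambdas.
    theta = a / (a + b)
    lam1 = (y1 + 1.0) / exposure
    lam2 = (y2 + 1.0) / exposure

    draws = np.empty(n_iter - burn_in)
    for t in range(n_iter):
        # Latent number of province-1 cases hidden inside each y2 count.
        p = lam1 * theta / (lam1 * theta + lam2)
        u = rng.binomial(y2, p)
        # ASSUMPTION: conjugate gamma priors on the rates (NumPy uses scale = 1/rate).
        lam1 = rng.gamma(gamma_shape + y1 + u, 1.0 / (gamma_rate + exposure))
        lam2 = rng.gamma(gamma_shape + y2 - u, 1.0 / (gamma_rate + exposure))
        # Beta full conditional of the misclassification probability.
        theta = rng.beta(a + u.sum(), b + y1.sum())
        if t >= burn_in:
            draws[t - burn_in] = theta
    return draws


if __name__ == "__main__":
    # Hypothetical counts and exposures for two covariate patterns.
    sample = gibbs_misclassification([30, 26], [600, 1200], [9e5, 8e5], coverage_low=19.0)
    print(f"posterior mean of theta: {sample.mean():.2f}")
```

Because θ and the λ values are only weakly identified by the counts alone, the informative beta prior carries much of the information in a model of this kind, which is why the choice of a and b from the expected coverage matters.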

Results

Registered cases of colorectal cancer from all provinces of Iran from 2005 to 2008 were included in the study. The ASR of CRC incidence in 2005 was 8.02 per 100,000 population for men (2255 cases) and 7.4 per 100,000 for women (1801 cases). By 2008, the ASR had reached 12.7 per 100,000 for men (3527 cases) and 11.12 per 100,000 for women (2658 cases). The trend of CRC from 2005 to 2008 for both sexes is shown in Fig 1.
Fig 1

Age standardized rate of colorectal cancer incidence and its trend for male and female in Iran (2005–2008).

Among the 30 provinces, 18 provinces in which the number of cancer cases differed from the expected number were selected, based on the percentage of expected cancer coverage, for correction of the misclassification error in the registered data of neighboring provinces. For example, the reported expected coverage of CRC for Fars, a province with suitable medical facilities and services, was 120.8% in 2008. This means that Fars covered 20.8% more new cases than expected, while Hormozgan, which is adjacent to Fars, had an expected coverage of only 19%, indicating clear misclassification in the registration of cancer cases. The expected coverage for the provinces in Iran between 2005 and 2008 is reported in Table 1. The estimated misclassification rates between provinces in 2005–2008 are reported in Table 2.
Table 1

Expected coverage of cancer cases in provinces of Iran (2005–2008).

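| Province | 2005 | 2006 | 2007 | 2008 |
|---|---|---|---|---|
| East Azarbayejan | 108.2 | 110.9 | 138.5 | 123.6 |
| Isfahan | 113.9 | 116.2 | 119.6 | 107.5 |
| Razavi Khorasan | 114.1 | 109.11 | 120.9 | 155.5 |
| Tehran | 157.11 | 162.25 | 145.7 | 155.6 |
| Fars | 84.2 | 119.4 | 143.3 | 120.8 |
| Mazandaran | 77 | 78 | 76.1 | 102.1 |
| West Azarbayejan | 81.9 | 75.3 | 82.5 | 69 |
| Hormozgan | 25.4 | 25.11 | 25.3 | 19 |
| Chaharmahal | 40.7 | 34.3 | 40.7 | 38 |
| Lorestan | 40.2 | 41.5 | 47.1 | 76 |
| North Khorasan | 30.8 | 40.4 | 44.8 | 34.8 |
| South Khorasan | 30.3 | 45.1 | 41.02 | 41.4 |
| Sistan | 27.2 | 19.1 | 18.85 | 19.5 |
| Golestan | 50.7 | 58.6 | 58.2 | 50.8 |
| Qazvin | 65.1 | 71.4 | 72.8 | 66.3 |
| Markazi | 43.3 | 53.06 | 57.4 | 69.6 |
| Qom | 38.6 | 62.7 | 60.9 | 53.9 |
| Zanjan | 52.9 | 48.5 | 54.3 | 46.4 |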
Table 2

Bayesian estimates of the misclassification rate between provinces (2005–2008).

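Estimated misclassification rate

| Province | Neighboring province | 2005 | 2006 | 2007 | 2008 |
|---|---|---|---|---|---|
| East Azarbayejan | West Azarbayejan | 0.19 | 0.19 | 0.3 | 0.43 |
| Fars | Hormozgan | 0.45 | 0.61 | 0.58 | 0.58 |
| Isfahan | Chaharmahal | 0.44 | 0.43 | 0.38 | 0.47 |
| Isfahan | Lorestan | 0.52 | 0.51 | 0.51 | 0.32 |
| Razavi Khorasan | North Khorasan | 0.57 | 0.66 | 0.53 | 0.55 |
| Razavi Khorasan | South Khorasan | 0.6 | 0.3 | 0.56 | 0.55 |
| Razavi Khorasan | Sistan | 0.73 | 0.74 | 0.76 | |
| Mazandaran | Golestan | 0.44 | 0.53 | 0.34 | 0.43 |
| Tehran | Qazvin | 0.33 | 0.36 | 0.36 | 0.43 |
| Tehran | Markazi | 0.49 | 0.44 | 0.41 | 0.45 |
| Tehran | Qom | 0.48 | 0.41 | 0.33 | 0.45 |
| Tehran | Zanjan | 0.45 | 0.48 | 0.35 | 0.58 |

Only three values are present for the Razavi Khorasan and Sistan row in the source, so their assignment to years is uncertain.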
For example, using the Bayesian method, the misclassification rate between Fars and Hormozgan was estimated at 58% in 2008. After Bayesian correction, the ASR and the number of incident cancer cases therefore decrease for Fars province and increase for Hormozgan province. The ASR and the number of incident cases before and after Bayesian correction from 2005 to 2008 are reported in Tables 3 and 4.
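As a cross-check on the reported figures, the per-pair percentages quoted in the Abstract appear to coincide with the four-year means of the yearly estimates in Table 2. The short sketch below recomputes those means. The dictionary layout and names are illustrative, the values are transcribed from Table 2 and the Abstract, and the Razavi Khorasan and Sistan pair is left out because only three yearly values are available for it.

```python
# Yearly misclassification estimates (2005, 2006, 2007, 2008) transcribed from Table 2.
table2 = {
    ("East Azarbayejan", "West Azarbayejan"): [0.19, 0.19, 0.30, 0.43],
    ("Fars", "Hormozgan"): [0.45, 0.61, 0.58, 0.58],
    ("Isfahan", "Chaharmahal"): [0.44, 0.43, 0.38, 0.47],
    ("Isfahan", "Lorestan"): [0.52, 0.51, 0.51, 0.32],
    ("Razavi Khorasan", "North Khorasan"): [0.57, 0.66, 0.53, 0.55],
    ("Razavi Khorasan", "South Khorasan"): [0.60, 0.30, 0.56, 0.55],
    ("Mazandaran", "Golestan"): [0.44, 0.53, 0.34, 0.43],
    ("Tehran", "Qazvin"): [0.33, 0.36, 0.36, 0.43],
    ("Tehran", "Markazi"): [0.49, 0.44, 0.41, 0.45],
    ("Tehran", "Qom"): [0.48, 0.41, 0.33, 0.45],
    ("Tehran", "Zanjan"): [0.45, 0.48, 0.35, 0.58],
}

# Percentages quoted in the Abstract for the same province pairs.
abstract_percent = {
    ("East Azarbayejan", "West Azarbayejan"): 28,
    ("Fars", "Hormozgan"): 56,
    ("Isfahan", "Chaharmahal"): 43,
    ("Isfahan", "Lorestan"): 46,
    ("Razavi Khorasan", "North Khorasan"): 58,
    ("Razavi Khorasan", "South Khorasan"): 50,
    ("Mazandaran", "Golestan"): 43,
    ("Tehran", "Qazvin"): 37,
    ("Tehran", "Markazi"): 45,
    ("Tehran", "Qom"): 42,
    ("Tehran", "Zanjan"): 47,
}


def mean_rate(values):
    """Arithmetic mean of the yearly misclassification estimates."""
    return sum(values) / len(values)


for pair, yearly in table2.items():
    print(f"{pair[0]} / {pair[1]}: mean = {100 * mean_rate(yearly):.2f}%, "
          f"abstract = {abstract_percent[pair]}%")
```

Each recomputed mean lies within about half a percentage point of the corresponding Abstract value, which is what rounding to whole percentages would produce.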
Table 3

Age standardized rate of colorectal cancer incidence before and after Bayesian correction in Iranian provinces 2005–2008.

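| Province | 2005 (before) | 2006 (before) | 2007 (before) | 2008 (before) | 2005 (after) | 2006 (after) | 2007 (after) | 2008 (after) |
|---|---|---|---|---|---|---|---|---|
| East Azarbayejan | 3.52 | 3.26 | 10.68 | 14.15 | 2.51 | 1.95 | 8.60 | 11.15 |
| Isfahan | 8.19 | 9.26 | 9.05 | 9.54 | 4.74 | 5.00 | 5.78 | 4.14 |
| Razavi Khorasan | 6.10 | 9.44 | 11.04 | 12.96 | 3.53 | 5.32 | 7.04 | 9.65 |
| Tehran | 9.26 | 10.78 | 7.56 | 11.60 | 7.30 | 9.21 | 3.55 | 6.64 |
| Fars | 5.10 | 6.65 | 9.96 | 18.71 | 3.30 | 5.28 | 8.39 | 16.59 |
| Mazandaran | 9.03 | 10.08 | 10.36 | 12.76 | 6.30 | 6.20 | 13.32 | 9.21 |
| West Azarbayejan | 5.31 | 6.46 | 7.59 | 6.30 | 6.54 | 8.09 | 10.35 | 10.23 |
| Hormozgan | 3.46 | 3.00 | 4.19 | 4.99 | 9.59 | 10.29 | 13.80 | 20.23 |
| Chaharmahal | 6.00 | 7.83 | 10.15 | 7.55 | 12.49 | 17.65 | 19.63 | 16.88 |
| Lorestan | 3.96 | 4.52 | 5.45 | 9.52 | 9.08 | 10.07 | 11.35 | 13.53 |
| North Khorasan | 3.00 | 1.70 | 3.20 | 5.02 | 8.55 | 4.48 | 6.99 | 12.94 |
| South Khorasan | 2.48 | 2.89 | 2.88 | 4.40 | 7.39 | 4.81 | 6.81 | 10.24 |
| Sistan | 1.79 | 2.10 | 1.75 | 1.86 | 6.20 | 10.13 | 8.62 | 9.09 |
| Golestan | 5.44 | 7.69 | 9.12 | 7.73 | 10.16 | 14.65 | 14.45 | 14.27 |
| Qazvin | 7.92 | 6.66 | 5.80 | 9.23 | 11.93 | 10.02 | 8.67 | 15.22 |
| Markazi | 4.88 | 6.19 | 6.16 | 7.58 | 10.40 | 11.32 | 10.56 | 12.48 |
| Qom | 5.66 | 6.06 | 9.06 | 9.49 | 12.70 | 10.02 | 13.97 | 17.42 |
| Zanjan | 5.43 | 5.38 | 9.21 | 5.23 | 10.05 | 10.70 | 15.15 | 11.77 |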
Table 4

Number of colorectal cancer incidence and the percent of change before and after Bayesian correction in Iranian provinces 2005–2008.

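| Province | 2005 (before) | 2006 (before) | 2007 (before) | 2008 (before) | 2005 (after) | 2006 (after) | 2007 (after) | 2008 (after) |
|---|---|---|---|---|---|---|---|---|
| East Azarbayejan | 95 | 85 | 293 | 379 | 68 | 51 | 236 | 299 |
| Isfahan | 268 | 312 | 279 | 302 | 155 | 168 | 178 | 131 |
| Razavi Khorasan | 307 | 396 | 377 | 432 | 178 | 223 | 241 | 322 |
| Tehran | 819 | 1034 | 315 | 481 | 646 | 883 | 148 | 276 |
| Fars | 166 | 212 | 951 | 1870 | 108 | 132 | 801 | 1658 |
| Mazandaran | 192 | 214 | 217 | 274 | 134 | 132 | 279 | 198 |
| West Azarbayejan | 118 | 135 | 157 | 129 | 145 | 169 | 214 | 209 |
| Hormozgan | 33 | 33 | 44 | 56 | 91 | 113 | 145 | 227 |
| Chaharmahal | 41 | 46 | 65 | 50 | 85 | 104 | 126 | 112 |
| Lorestan | 53 | 70 | 70 | 115 | 122 | 156 | 146 | 163 |
| North Khorasan | 19 | 10 | 18 | 31 | 54 | 26 | 39 | 80 |
| South Khorasan | 9 | 11 | 12 | 18 | 27 | 18 | 28 | 42 |
| Sistan | 31 | 39 | 33 | 34 | 107 | 188 | 163 | 167 |
| Golestan | 67 | 91 | 106 | 90 | 125 | 173 | 168 | 166 |
| Qazvin | 59 | 57 | 48 | 75 | 89 | 86 | 72 | 124 |
| Markazi | 49 | 65 | 63 | 76 | 104 | 119 | 108 | 125 |
| Qom | 44 | 47 | 72 | 74 | 99 | 78 | 111 | 136 |
| Zanjan | 39 | 38 | 66 | 42 | 72 | 76 | 109 | 95 |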

Discussion

Neighboring provinces, which share similar eating habits, lifestyle and climate, are expected to have similar health outcomes [13]. However, analysis of registered data sometimes shows that neighboring provinces not only have different outcomes but that these outcomes are also inconsistent. This situation implies that there is misclassification in the registered data. The problem is important in medicine because it may distort health planning and the allocation of health resources [28]. Such distortion could cause irreparable damage on a national scale. The aim of the present study was to help reduce misclassification error in registered colorectal cancer data in Iran. Two elements create the misclassification error: first, the availability of health resources in better-resourced provinces and, second, the lack of health facilities in their neighboring provinces. Some studies have already been conducted in Iran to correct misclassification errors in registered cancer mortality and morbidity data for liver [29], gastric [22] and colorectal cancers [23]. The results of those studies may be more reliable to apply, since they re-estimated the rates and produced corrected data. According to the results of our research, the estimated misclassification rate among adjacent provinces was non-ignorable. The highest estimated misclassification parameters belong to North Khorasan, Hormozgan and Sistan, which are in the east and south of Iran. The real rates of CRC in those provinces are therefore higher than the rates reported by the cancer registry system. By contrast, studies that use cancer registry data and ignore misclassification error report that the highest incidence rates of CRC in Iran are found in the central, northern and western provinces, and that the southwest provinces have the lowest incidence rates in the country [2].
Therefore, ignoring the misclassification error in registered data leads to a wrong picture of the distribution of CRC incidence across the country. Expected cancer coverage revealed that 18 of the 30 provinces need misclassification correction. These provinces differ in economic situation: some contain better-resourced centers to which patients probably travel for better health care, so they receive more patients than their capacity. On the other hand, some provinces, because they have fewer facilities, receive fewer patients. Table 2 indicates how the data of some provinces are registered in adjacent provinces. For example, some patients from provinces neighboring Tehran, such as Qom, were referred to Tehran. Identifying the exact distribution of a disease in different areas is a suitable way of finding the geographic pattern of the disease and its causes, assessing the factors that influence disease incidence [30,31], and quantifying the potential for disease control and prevention [32,33]. However, the spatial analysis usually deployed for this purpose is based on registered data, and the existence of misclassification is often ignored. In spatial analysis, the morbidity or mortality rates for each province are combined with local information for the same province, and the result may be presented as an integrated geographical map. Maps of this type are helpful for comparing provinces in terms of disease incidence rates or probable risk factors [34]. To this end, we prepared geographical maps of the incidence distribution of registered colorectal cancer data before and after misclassification correction (Fig 2). Fig 2 shows that, after correction, the southern provinces have high incidence rates, whereas previous studies that ignored misclassification reported low incidence rates in the southern provinces [35].
Fig 2

Distribution of colorectal cancer incidence in Iran before and after misclassification error correction from 2005 to 2008.

(a) 2005, (b) 2006, (c) 2007, (d) 2008.

The maps of the present study also revealed considerable changes in some provinces relative to their status before correction. Thus, there are major differences in the registered incidence of CRC, although the incidence of cancer is expected to be similar in adjacent provinces. This can be explained by misclassification error in registering the permanent address of patients who are diagnosed in neighboring provinces with better facilities. It leads to overestimation of the CRC rate in some provinces and underestimation in some neighboring provinces. For future research, we suggest using our corrected colorectal cancer data to identify high-risk spatial clusters. The misclassification estimates from this study could also be compared with, and validated against, a random sample of provinces in future studies. In conclusion, proper planning for cancer control and prevention, and the allocation of healthcare facilities to different areas, require an increase in the quality and accuracy of the registration system in different provinces and the correction of existing deficiencies, especially misclassification error in registering the patient’s permanent residence. Hardware and software resources need to be enhanced, more educated staff need to be trained in the different sectors of the cancer registry program, and the opinions of expert researchers in medicine, biostatistics and epidemiology need to be implemented [36]. In the absence of valid data, the Bayesian method can be adopted as a fast and cost-effective way to correct regional misclassification error.

Original data on colorectal cancer incidence by sex and age group, with the calculated ASR, for all Iranian provinces included in this study, 2005–2008.

  28 in total

1.  Binomial regression with misclassification.

Authors:  Carlos Daniel Paulino; Paulo Soares; John Neuhaus
Journal:  Biometrics       Date:  2003-09       Impact factor: 2.571

2.  A note on estimating crude odds ratios in case-control studies with differentially misclassified exposure.

Authors:  Robert H Lyles
Journal:  Biometrics       Date:  2002-12       Impact factor: 2.571

3.  Poisson regression with misclassified counts: application to cervical cancer.

Authors:  A S Whittemore; G Gong
Journal:  J R Stat Soc Ser C Appl Stat       Date:  1991       Impact factor: 1.864

4.  Multivariate disease mapping of seven prevalent cancers in Iran using a shared component model.

Authors:  Behzad Mahaki; Yadollah Mehrabi; Amir Kavousi; Mohammad E Akbari; Thomas Waldhoer; Volker J Schmid; Mehdi Yaseri
Journal:  Asian Pac J Cancer Prev       Date:  2011

Review 5.  The evolution of the population-based cancer registry.

Authors:  Donald M Parkin
Journal:  Nat Rev Cancer       Date:  2006-08       Impact factor: 60.716

Review 6.  Review of cancer registration and cancer data in Iran, a historical prospect.

Authors:  Mohammad Ali Mohagheghi; Alireza Mosavi-Jarrahi
Journal:  Asian Pac J Cancer Prev       Date:  2010

7.  Cancer registry databases: an overview of techniques of statistical analysis and impact on cancer epidemiology.

Authors:  Ananya Das
Journal:  Methods Mol Biol       Date:  2009

8.  Data quality and quality control of a population-based cancer registry. Experience in Finland.

Authors:  L Teppo; E Pukkala; M Lehtonen
Journal:  Acta Oncol       Date:  1994       Impact factor: 4.089

9.  Meta-analyses of colorectal cancer risk factors.

Authors:  Constance M Johnson; Caimiao Wei; Joe E Ensor; Derek J Smolenski; Christopher I Amos; Bernard Levin; Donald A Berry
Journal:  Cancer Causes Control       Date:  2013-04-06       Impact factor: 2.506

10.  Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012.

Authors:  Jacques Ferlay; Isabelle Soerjomataram; Rajesh Dikshit; Sultan Eser; Colin Mathers; Marise Rebelo; Donald Maxwell Parkin; David Forman; Freddie Bray
Journal:  Int J Cancer       Date:  2014-10-09       Impact factor: 7.396

