Fine Particulate Matter Exposure and Cancer Incidence: Analysis of SEER Cancer Registry Data from 1992-2016.

Nathan C Coleman¹, Richard T Burnett², Majid Ezzati³, Julian D Marshall⁴, Allen L Robinson⁵, C Arden Pope¹.

Abstract

BACKGROUND: Previous research has identified an association between fine particulate matter (PM2.5) air pollution and lung cancer. Most of the evidence for this association, however, is based on research using lung cancer mortality, not incidence. Research that examines potential associations between PM2.5 and incidence of non-lung cancers is limited.
OBJECTIVES: The primary purpose of this study was to evaluate the association between the incidence of cancer and exposure to PM2.5 using >8.5 million incident cancer cases from U.S. registries. Secondary objectives included evaluating the sensitivity of the associations to model selection, spatial control, and latency period as well as estimating the exposure-response relationship for several cancer types.
METHODS: Surveillance, Epidemiology, and End Results (SEER) program data were used to calculate incidence rates for various cancer types in 607 U.S. counties. County-level PM2.5 concentrations were estimated using integrated empirical geographic regression models. Flexible semi-nonparametric regression models were used to estimate associations between PM2.5 and cancer incidence for selected cancers while controlling for important county-level covariates. Primary time-independent models using average incidence rates from 1992-2016 and average PM2.5 from 1988-2015 were estimated. In addition, time-varying models using annual incidence rates from 2002-2011 and lagged moving averages of annual estimates for PM2.5 were also estimated.
RESULTS: The incidences of all cancer and lung cancer were consistently associated with PM2.5. The incident rate ratios (IRRs), per 10-μg/m3 increase in PM2.5, for all and lung cancer were 1.09 (95% CI: 1.03, 1.14) and 1.19 (95% CI: 1.09, 1.30), respectively. Less robust associations were observed with oral, rectal, liver, skin, breast, and kidney cancers.
DISCUSSION: Exposure to PM2.5 air pollution contributes to lung cancer incidence and is potentially associated with non-lung cancer incidence. https://doi.org/10.1289/EHP7246.

Year: 2020 PMID: 33035119 PMCID: PMC7546438 DOI: 10.1289/EHP7246

Source DB: PubMed Journal: Environ Health Perspect ISSN: 0091-6765 Impact factor: 9.031

Introduction

Toxicology research indicates that the carcinogenic compounds contained in fine particulate matter (PM2.5; particles ≤2.5 μm in aerodynamic diameter) contribute to chronic systemic inflammation (Loomis et al. 2013), oxidative stress (Risom et al. 2005), and DNA damage (Newby et al. 2015) in the lungs. Furthermore, extensive epidemiological evidence indicates that PM2.5 is associated with lung cancer mortality (Crouse et al. 2015; Yin et al. 2017; Lepeule et al. 2012; Turner et al. 2011; Pope et al. 2019). For example, a recent meta-analysis estimated the hazard ratio (HR) for the association between PM2.5 and lung cancer to be 1.14 [95% confidence interval (CI): 1.08, 1.21] (Pope et al. 2020). Much of the epidemiological evidence to support this association, however, is based on prospective cohort studies that examined lung cancer mortality, not lung cancer incidence. Although several recent studies have used incidence data to estimate the association between PM2.5 and lung cancer (IARC 2013; Bai et al. 2020; Zhang et al. 2020), further research is needed to confirm the association and examine the sensitivity of the results to modeling choices and exposure windows. In addition to lung cancer, several cohort studies have found limited evidence of an association between air pollution and mortality from or incidence of various non-lung cancers (Coleman et al. 2020; Turner et al. 2017; Wong et al. 2016; Ancona et al. 2015; Raaschou-Nielsen et al. 2011). However, these studies were inconsistent in their findings and often limited by small sample size. Furthermore, mortality follow-up is insufficient to address the effect of air pollution on the burden of disease for cancer because of the difficulty in addressing latency, the difficulty in accurately analyzing cancers that are highly survivable, and possible confounding from mortality due to other causes.
Further evidence using cancer incidence data instead of mortality contributes to evaluating whether non-lung cancer sites are associated with exposure to PM2.5. The primary purpose of the present study was to evaluate the association between the incidence of cancer and exposure to PM2.5, using available cancer incidence data from U.S. cancer registries. Secondary objectives included evaluating the sensitivity of the associations to various lag structures and exposure windows, exploring the sensitivity of results to modeling assumptions, and evaluating potential nonlinearities in the exposure–response relationship for various types of cancers.

Methods

Cancer Incidence Data

The U.S. National Cancer Institute’s (NCI) Surveillance, Epidemiology, and End Results (SEER) program contains all cancer cases across cancer registries that cover approximately 34.6% of the U.S. population (NCI 2019b). The SEER program contains individual-level cancer incidence from 1975–2016 collected from cancer registries located in California, Connecticut, Detroit, Georgia, Iowa, Kentucky, Louisiana, New Jersey, New Mexico, Seattle (Puget Sound), and Utah (NCI 2019b). A detailed description of the location of registries is contained in Table S1. These data are publicly available but require a signed SEER research data use agreement (NCI 2019a). County-level incidence rates were calculated from the SEER program’s cancer case data to estimate the association between PM2.5 and cancer incidence. First, cancer cases were totaled for every county-year and grouped by the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10; WHO 2016) codes as follows: oral and oropharyngeal (defined by ICD-10 Codes C00–C14), esophageal (C15), stomach (C16), small intestine (C17), colon (C18), rectal (C19–C21), liver and biliary tract (C22–C24), pancreatic (C25), nose (C30–C31), laryngeal and trachea (C32–C33), lung and bronchus (C34), bone (C40–C41), skin (C43–C44), connective and soft tissue (C45–C49), breast (C50), cervical (C53), uterine (C54–C55), ovarian (C56), prostate (C61), other male (C60, C62–C63), kidney (C64–C65), bladder (C67), brain (C71), endocrine (C73–C75), and ill-defined cancers (C76–C80). Next, yearly cancer incidence rates per 100,000 for each county were calculated for every cancer type by dividing the case count by the yearly population (provided by the SEER program via the U.S. Census) and multiplying by 100,000 (NCI 2019d). For the primary analysis, the yearly cancer incidence rates were averaged for each county from 1992–2016 to allow for harmonizing several key variables and for use in a time-independent model.
Average incidence rates from 2008–2016 were also calculated for use in a latency sensitivity analysis. After removing counties that were missing PM2.5 estimates or other covariates, 607 counties remained. In addition to the time-independent analysis, cancer incidence data were also used to generate annual-average incidence rates at the county-year level in the 607 counties contained in the SEER program data for a time-varying model. Due to covariate limitations, only incidence data from 2002–2011 were available.
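
The rate calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the record layout `(county, year, cases, population)` and the function names are assumptions made for the example, and averaging is done as a simple unweighted mean of yearly rates.

```python
from collections import defaultdict


def incidence_rate_per_100k(cases, population):
    """Yearly incidence rate per 100,000 for one county-year."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


def county_mean_rates(records):
    """Average the yearly rates within each county.

    `records` is an iterable of (county, year, cases, population) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    rates_by_county = defaultdict(list)
    for county, _year, cases, population in records:
        rates_by_county[county].append(incidence_rate_per_100k(cases, population))
    # Unweighted mean across the available years for each county.
    return {county: sum(rates) / len(rates) for county, rates in rates_by_county.items()}


example_records = [
    ("County A", 1992, 50, 10_000),
    ("County A", 1993, 70, 10_000),
    ("County B", 1992, 10, 20_000),
]
mean_rates = county_mean_rates(example_records)
```

Restricting `records` to a year range (for example 2008–2016) before calling `county_mean_rates` would give the averages used in a latency-style analysis.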

Air Pollution Exposure

Regulatory monitoring data for PM2.5 were collected nationwide starting in 1999. These regulatory data, within an integrated empirical geographic regression modeling framework, were used to generate county-level annual-average PM2.5 concentrations for 1999–2015. Hold-out cross-validation (CV) indicated good model performance (10-fold CV: 0.78, 0.90). More details describing this approach are found elsewhere (Pope et al. 2019; Kim et al. 2020). All annual estimates for PM2.5 are available at the Center for Air, Climate, and Energy Solutions’ website (https://www.caces.us/). To better account for the lagged effect of PM2.5 on cancer incidence, backcasted estimates for 1988–1998 were also calculated. The estimated PM10 (particles ≤10 μm in aerodynamic diameter) concentration in each county from 1988–1998 was multiplied by the county’s mean PM2.5-to-PM10 ratio from 1999–2003 to generate estimates of the PM2.5 concentration in each county from 1988–1998 (Pope et al. 2019). The mean PM2.5 concentrations for 1999–2015 and 1988–2015 were highly correlated. Average PM2.5 exposures from 1999–2015 and from 1988–2015 were linked to average cancer incidence rates from 1992–2016 by county for use in the primary time-independent model. For the latency sensitivity analysis, average PM2.5 exposures from 1988–2007 were linked to cancer incidence rates from 2008–2016 to allow for a lag period. Finally, for the time-varying model, 1-, 5-, 10-, and 15-y lagged moving averages of PM2.5 were estimated and linked to annual incidence rates in each of the counties.
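
A small sketch of the two exposure constructions described above, under stated assumptions: the backcast is taken to scale an earlier proxy series (assumed here to be PM10) by the mean of the yearly PM2.5-to-proxy ratios over a reference period, and a k-year lagged moving average is taken to be the mean of the k years immediately preceding the incidence year. Neither detail is spelled out in the text, and the function names and numbers are illustrative only.

```python
def backcast_by_ratio(proxy_by_year, target_by_year, ratio_years, backcast_years):
    """Backcast a target series by scaling a proxy series.

    Assumption: the scaling factor is the mean of the yearly
    target-to-proxy ratios over `ratio_years`.
    """
    ratios = [target_by_year[y] / proxy_by_year[y] for y in ratio_years]
    mean_ratio = sum(ratios) / len(ratios)
    return {y: proxy_by_year[y] * mean_ratio for y in backcast_years}


def lagged_moving_average(series_by_year, year, window):
    """Mean of the `window` years immediately before `year` (assumed lag definition)."""
    years = range(year - window, year)
    return sum(series_by_year[y] for y in years) / window


# Made-up numbers for one hypothetical county.
proxy = {1997: 20.0, 1998: 30.0, 1999: 20.0, 2000: 40.0}   # assumed PM10 series
observed = {1999: 10.0, 2000: 20.0}                        # assumed PM2.5 series
backcast = backcast_by_ratio(proxy, observed, [1999, 2000], [1997, 1998])
full_series = {**backcast, **observed}
two_year_lag = lagged_moving_average(full_series, 2001, 2)
four_year_lag = lagged_moving_average(full_series, 2001, 4)
```

Longer windows reach back into the backcast years, which is why the 10- and 15-y averages depend on the pre-1999 estimates.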

Additional Covariates

The SEER program provides additional county covariate information collected from the U.S. Census and American Community Survey, including the following: percentage male; percentage white, black, Hispanic, and other race/ethnicity; percentage of the population in each 5-y age group from 0 through 85; educational attainment (percentage who did not graduate high school, percentage who graduated high school, and percentage with some college education); median income (adjusted to 2017 U.S. dollars); median home value and rent; percentage below 150% poverty; percentage unemployed; percentage working class; and percentage of the population of the county living in rural regions of the county (NCI 2019c, 2019d). The Behavioral Risk Factor Surveillance System and the National Health and Nutrition Examination Survey were used to obtain additional county-level information including percentage smoking (available from 1996–2012) (Dwyer-Lindgren et al. 2014), percentage alcohol consumption (available from 2002–2012) (Dwyer-Lindgren et al. 2015), and percentage physically active and obese (available from 2001–2011) (Dwyer-Lindgren et al. 2013). For the primary time-independent analysis, covariate data were averaged over the available time and linked by county to create a cross-sectional data set. For the latency analysis that used 2008–2016 incidence rate data, only covariate data for years before 2008 were averaged and linked. For the time-varying model, covariate information from 2002–2011 was linked by county-year. In addition, spatial indicator variables for urban vs. rural (classified as urban if more than 50% of a county’s population lived in an urbanized area of ≥50,000 people or an urban cluster of between 2,500 and 50,000 people), state, and region (Pacific, West, Midwest, Northeast, or South) were constructed.

Statistical Methods

Flexible semi-nonparametric regression models were used to estimate associations between PM2.5 and cancer incidence for selected cancers while controlling for important county-level covariates [generalized additive model procedure in SAS (version 9.4; SAS Institute, Inc.)]. In the primary analysis, incident rate ratios (IRRs) and 95% CIs (per 10-μg/m3 increase of PM2.5) were estimated by regressing the natural logarithm of the average incidence rate for selected cancer types in 607 counties from 1992–2016 on county-level mean PM2.5 concentrations from 1988–2015. Specifically, locally weighted smoothing (LOESS) models with three degrees of freedom (df) were used to flexibly control for possible confounders including percentage of the county in various age buckets; percentage male; percentage white, black, Hispanic, and other; percentage who did not graduate high school, graduated high school, or obtained more education than high school; median income, rent, and home value; percentage below 150% poverty; percentage working class; percentage unemployed; percentage living in a rural area; percentage smokers; percentage alcohol consumption; percentage who are physically active; and percentage of individuals in a county who are obese. Indicator variables for urban vs. rural and state were also included in the model. After estimating the IRRs and nominal p-values, two approaches were used to adjust the p-values to account for multiple testing. The first approach was Holm’s method, a common modification of the Bonferroni approach that controls the family-wise error rate while providing somewhat more power for multiple significance testing (Hochberg and Benjamini 1990). The nominal p-values for all hypotheses tested are ordered from smallest to largest and ranked (the smallest is given a rank of 1). The Holm-adjusted p-values are the nominal p-values multiplied by the total number of tests minus the rank plus one.
The second approach, the false discovery rate (FDR) method, controls the false discovery rate and is an alternative to the Bonferroni approach with more power than Holm’s method (Benjamini and Hochberg 1995). FDR-adjusted p-values are obtained by multiplying the nominal p-values by the total number of tests divided by the rank order. In addition to the primary analysis, time-varying linear regression models that accounted for changes in air pollution and cancer incidence over time were estimated using county-year–level cancer incidence data from 2002–2011. IRRs (per 10-μg/m3 increase of PM2.5) were estimated by regressing the natural logarithm of the yearly incidence rate on mean PM2.5 concentrations for 1-, 5-, 10-, and 15-y lagged moving averages (to explore alternative cancer latency periods). To account for potential correlations within the same counties over time, 95% CIs were based on robust covariance estimators [Taylor series linearization, using SURVEYREG in SAS (version 9.4; SAS Institute, Inc.)]. To flexibly control for general changes in cancer incidence over time, annual indicator variables for each year (2002–2011) were included. Annual values of all other covariables (percentage of the county in various age buckets; percentage male; percentage white, black, Hispanic, and other; percentage who did not graduate high school, graduated high school, or obtained more education than high school; median income, rent, and home value; percentage below 150% poverty; percentage working class; percentage unemployed; percentage living in a rural area; percentage smokers; percentage alcohol consumption; percentage who are physically active; and percentage of individuals in a county who are obese) were also included. In addition, indicator variables for urban/rural and state were included.
To determine whether the results were sensitive to modeling choices, the following additional models based on the primary (time-independent) model were estimated: a) a LOESS model that used a cross-validated approach to select the number of degrees of freedom; b) a natural smoothing spline model that used a cross-validated approach to select the number of degrees of freedom; c) a LOESS model with 3 df, but without state indicator variables; d) model c, but with regional rather than state indicator variables; e) model c, but with SEER registry rather than state indicator variables; f) a linear regression assuming a Poisson distribution; g) a linear regression with standard errors estimated using the sandwich method (White 1980) [using the ROBUST option of the Regress Command in STATA (release 16; StataCorp.)]; h) a LOESS model with 3 df that measured PM2.5 exposure as the average exposure from 1999–2015 instead of 1988–2015; and i) a LOESS model with 3 df that used the average county incident rates from 2008–2016 instead of 1992–2016 and PM2.5 exposure from 1988–2007 as well as county averages for all covariates from 1992–2007. Finally, to determine whether the results of the primary model were sensitive to the inclusion of specific covariates, a sensitivity analysis was performed by progressively adding control variables into the primary model for selected cancer types. The shapes of the exposure–response curves for PM2.5 and several cancer types were estimated using a LOESS model with 3 df. In addition, exposure–response curves for the association between the percentage of a county who identify as smokers and several cancer types were also created using a LOESS model with 3 df. The effect of an increase in the prevalence of smoking in a county on cancer incidence was then compared with the effect of an increase of PM2.5 in a county on cancer incidence.
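
The two multiple-testing adjustments described above can be sketched in a few lines. The multipliers follow the text (number of tests minus rank plus one for Holm; number of tests divided by rank for FDR). The monotonicity steps and the cap at 1 are the usual conventions for reporting adjusted p-values and are an assumption here, since the text does not state them; the function names and example p-values are hypothetical.

```python
def holm_adjust(pvalues):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order, start=1):
        value = min(1.0, pvalues[idx] * (m - rank + 1))
        # Assumed convention: adjusted values never decrease with rank.
        running_max = max(running_max, value)
        adjusted[idx] = running_max
    return adjusted


def fdr_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        value = min(1.0, pvalues[idx] * m / rank)
        # Assumed convention: adjusted values never increase as rank falls.
        running_min = min(running_min, value)
        adjusted[idx] = running_min
    return adjusted


nominal = [0.01, 0.04, 0.03, 0.005]  # made-up p-values for four tests
holm = holm_adjust(nominal)
fdr = fdr_adjust(nominal)
```

With these inputs every FDR-adjusted value is no larger than its Holm counterpart, which is the sense in which the FDR method is the less conservative of the two.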

Results

Figure 1 illustrates the average PM2.5 concentration from 1988–2015 and average cancer incidence from 1992–2016 for counties contained in the SEER database. Additional information regarding counties in the SEER registries is provided in Table S1. The average PM2.5 concentration across the counties contained in the SEER program was 11.50 μg/m3 (Table 2), and the average incidence rate for all cancer was 588.8/100,000. Table 1 contains the total number of cases for each cancer site in the SEER program database for the primary analysis (SEER program counties from 1992–2016) and for a sensitivity analysis (SEER program counties from 2008–2016). In addition, the average yearly incidence rate is provided for both the primary analysis and the sensitivity analysis for all cancer sites. Table 2 contains the mean and standard deviation for county characteristics that were included in the time-independent model (SEER program counties from 1992–2016), the latency sensitivity analysis (SEER program counties from 1992–2007), and the time-varying model (SEER program counties from 2002–2011).

Figure 1.

Table 1

Summary of the total number of cancer cases in counties covered by the SEER program from 1992–2016 and 2008–2016 as well as the mean and standard deviation of incidence rates per 100,000 across counties.

Cancer	ICD-10 code(s)	Cancer cases (n), 1992–2016	Cancer cases (n), 2008–2016	Yearly incidence rate (mean±SD), 1992–2016	Yearly incidence rate (mean±SD), 2008–2016
All cancers	C00–C80	8,658,955	4,130,604	588.83±119.07	636.60±130.65
Digestive tract
Oral	C00–C14	214,295	105,500	15.34±4.18	17.21±5.49
Esophagus	C15	77,996	36,654	5.74±2.04	6.26±2.90
Stomach	C16	150,349	67,087	8.27±2.52	8.54±3.35
Small intestine	C17	42,103	22,324	2.90±1.10	3.45±1.68
Colon	C18	599,263	249,664	43.68±13.10	41.08±12.59
Rectum	C19–C21	282,683	129,418	19.50±5.26	19.91±6.14
Liver	C22–C24	185,012	102,183	10.05±2.98	12.79±4.68
Pancreas	C25	208,078	106,060	13.78±3.72	15.75±4.82
Respiratory
Nose	C30–C31	15,186	7,302	0.97±0.55	1.04±0.87
Larynx and trachea	C32–C33	67,281	29,083	5.98±2.53	6.15±3.39
Lung	C34	1,043,065	469,176	85.95±30.06	88.84±33.16
Bone/tissue
Bone	C40–C41	484,403	256,882	33.13±8.67	39.38±11.65
Skin	C43–C44	680,627	372,095	39.23±16.28	50.56±23.17
Soft tissue	C45–C49	80,524	39,761	4.86±1.42	5.25±2.19
Sex-specific^a
Breast	C50	1,473,349	705,738	84.50±17.45	89.80±20.38
Cervix	C53	74,991	31,013	4.70±1.60	4.20±2.10
Uterine	C54–C55	237,965	120,334	7.60±2.30	7.00±3.00
Ovary	C56	126,294	53,269	14.60±4.60	16.50±5.90
Prostate	C61	1,151,454	490,964	74.63±19.08	70.87±19.22
Other male specific	C60, C62–C63	63,570	29,972	3.57±1.29	3.84±2.08
Urinary tract
Kidney	C64–C65	254,706	136,978	18.51±4.92	22.15±6.59
Bladder	C67	346,681	162,991	23.40±7.51	24.92±8.83
Other
Brain	C71	127,898	63,270	8.15±2.14	9.09±3.14
Endocrine	C73–C75	253,243	156,581	14.47±4.07	19.67±6.56
Ill defined	C76–C80	417,939	186,305	27.79±6.78	28.08±8.38

Note: ICD-10, International Statistical Classification of Diseases and Related Health Problems, Tenth Revision; SEER, Surveillance, Epidemiology, and End Results (SEER) program.

^a Sex-specific cancer incidence rates are calculated using the entire population, not just one sex.

Table 2

Summary of baseline county characteristics (mean±SD for continuous variables and percentages for indicator variables) from 1992–2016, 1992–2007, and 2002–2011.

Variable	1992–2016 counties	1992–2007 counties	2002–2011 counties
PM2.5 exposure (y)
1988–2015	11.50±2.60	—	—
1999–2015	10.00±2.20	—	—
1988–2007	—	12.70±3.10	—
PM2.5 moving average (y)
1	—	—	10.16±2.58
5	—	—	10.77±2.69
10	—	—	11.35±2.87
15	—	—	12.00±3.06
Age buckets [y (%)]
0	1.29±0.23	1.22±0.23	1.31±0.26
1–4	5.22±0.81	5.06±0.84	5.27±0.90
5–9	6.76±0.91	6.56±0.99	6.66±0.98
10–14	7.13±0.89	6.66±0.92	7.11±0.93
15–19	7.17±1.02	6.74±1.00	7.28±1.08
20–24	6.47±2.19	6.50±2.17	6.47±2.27
25–29	6.06±1.13	6.07±1.21	5.99±1.20
30–34	6.27±0.88	6.06±0.92	6.10±1.00
35–39	6.62±0.73	5.95±0.77	6.49±0.98
40–44	6.96±0.68	6.22±0.79	7.08±0.95
45–49	7.07±0.66	6.75±0.75	7.43±0.77
50–54	6.85±0.75	7.23±0.77	7.10±0.88
55–59	6.21±0.86	6.97±0.93	6.35±1.08
60–64	5.35±0.98	6.25±1.16	5.29±1.20
65–69	4.46±0.98	5.12±1.19	4.21±1.03
70–74	3.56±0.87	3.81±0.96	3.37±0.88
75–79	2.77±0.76	2.81±0.75	2.70±0.78
80–84	1.99±0.64	2.03±0.63	2.00±0.68
≥85	1.81±0.74	1.98±0.81	1.77±0.80
Race (%)
White	76.07±20.47	74.25±20.91	75.92±20.47
Black	12.76±16.51	12.99±16.55	12.79±16.55
Hispanic	8.46±13.43	9.71±14.10	8.57±13.49
Other	2.71±5.56	3.04±5.82	2.72±5.60
Sex (%)
Male	49.68±2.01	49.88±2.20	49.72±2.06
Education (%)
No high school	23.59±9.24	26.89±10.37	21.03±9.40
Graduate of high school	34.23±6.30	34.03±6.19	34.68±7.03
More than high school	42.17±12.47	39.08±12.80	44.29±12.94
Income
Median income (2017 adjusted)	37,319±10,992	32,777±9,669	41,196±13,111
Median home value	107,347±73,877	86,868±58,429	124,913±97,211
Median rent	528±171	435±145	585±220
Below 150% poverty (%)	28.21±9.93	27.74±10.33	27.66±9.95
Unemployed (%)	7.36±2.51	6.79±2.60	7.78±3.34
Working class (%)	68.89±5.74	69.82±5.82	68.97±6.48
Health (%)
Smokers	26.13±4.75	26.80±4.65	25.78±5.13
Consume alcohol	44.98±13.69	44.06±14.29	44.83±13.93
Obese (BMI>29)	34.90±4.62	33.31±4.49	35.33±5.25
Physically active	71.45±6.51	71.15±6.77	71.60±6.55
Urban vs. rural (%)
Rural counties	44.06	44.06	44.06
Individuals in rural	54.61±31.99	54.61±31.99	54.59±31.99
Region (%)
Northeast	4.78	4.78	4.79
Midwest	16.80	16.80	16.67
South	56.50	56.50	56.60
Pacific West	11.70	11.70	11.72
Mountain West	10.22	10.22	10.23
State (%)^a
California	9.56	9.56	9.57
Connecticut	1.32	1.32	1.32
Georgia	26.19	26.19	26.24
Iowa	16.31	16.31	16.17
Kentucky	19.77	19.77	19.80
Louisiana	10.54	10.54	10.56
Michigan	0.49	0.49	0.50
New Jersey	3.46	3.46	3.47
New Mexico	5.44	5.44	5.45
Utah	4.78	4.78	4.79
Washington	2.14	2.14	2.15

Note: —, not applicable; BMI, body mass index; PM2.5, particles ≤2.5 μm in aerodynamic diameter; SD, standard deviation; SEER, Surveillance, Epidemiology, and End Results (SEER) program.

^a SEER registries cover all cancer cases in each state excluding Michigan and Washington, which are limited to cases in the Detroit and Puget Sound area, respectively. See Table S1 for more detail.

Figure 1 caption: Estimated (A) population-weighted mean (1988–2015) PM2.5 concentrations (μg/m3) and (B) average incidence rate of all cancers for counties in the SEER database. Note: PM2.5, particles ≤2.5 μm in aerodynamic diameter; SEER, Surveillance, Epidemiology, and End Results (SEER) program.

Table 3 contains IRRs and 95% CI estimates for the association between a 10-μg/m3 increase of PM2.5 from 1988–2015 and selected cancer sites. Statistically significant positive associations were found for oral, rectal, liver, lung, skin, and kidney cancers as well as all cancer in aggregate. A borderline statistically significant effect was also found for breast cancer. However, after multiple comparisons adjustments using Holm’s method, only lung [IRR = 1.19 (95% CI: 1.09, 1.30)], liver [IRR = 1.32 (95% CI: 1.11, 1.57)], and all cancer [IRR = 1.09 (95% CI: 1.03, 1.14)] remained significant at a 0.05 level. Using the less conservative FDR method, significant adverse associations were also observed with skin and kidney cancers.

Table 3

Incident rate ratios and 95% confidence interval [IRR (95% CI)] estimates for the association between cancer incidence from 1992–2016 and a 10-μg/m3 increase of PM2.5 exposure from 1988–2015.

Cancer	LOESS (3 df) [IRR (95% CI)]	Unadjusted p-value	Holm’s method p-value	FDR p-value
All cancer	1.09 (1.03, 1.14)	<0.01	0.04	0.02
Digestive tract
Oral	1.18 (1.03, 1.36)	0.03	0.42	0.09
Esophagus	1.08 (0.88, 1.32)	0.48	1.00	0.69
Stomach	0.96 (0.79, 1.16)	0.68	1.00	0.83
Small intestine	1.13 (0.87, 1.47)	0.35	1.00	0.59
Colon	1.05 (0.96, 1.15)	0.29	1.00	0.54
Rectal	1.15 (1.01, 1.30)	0.03	0.60	0.10
Liver	1.32 (1.11, 1.57)	<0.01	0.04	0.02
Pancreas	0.98 (0.85, 1.12)	0.73	1.00	0.83
Respiratory
Nose	0.57 (0.35, 0.93)	0.03	0.60	0.10
Larynx	1.19 (0.97, 1.46)	0.09	1.00	0.21
Lung	1.19 (1.09, 1.30)	<0.01	<0.01	<0.01
Bone/tissue
Bone	1.03 (0.91, 1.16)	0.67	1.00	0.83
Skin	1.22 (1.06, 1.41)	<0.01	0.15	0.04
Soft tissue	1.06 (0.86, 1.29)	0.60	1.00	0.82
Sex-specific
Breast	1.07 (1.00, 1.16)	0.06	1.00	0.17
Cervix	1.16 (0.93, 1.45)	0.20	1.00	0.43
Uterine	0.99 (0.85, 1.15)	0.87	1.00	0.87
Ovarian	0.98 (0.82, 1.17)	0.81	1.00	0.84
Prostate	0.96 (0.87, 1.06)	0.42	1.00	0.64
Other male	1.12 (0.88, 1.43)	0.36	1.00	0.59
Urinary tract
Kidney	1.21 (1.06, 1.39)	<0.01	0.13	0.04
Bladder	1.05 (0.93, 1.19)	0.77	1.00	0.83
Other
Brain	1.10 (0.93, 1.29)	0.27	1.00	0.54
Endocrine	1.19 (0.98, 1.44)	0.07	1.00	0.18
Ill defined	1.04 (0.94, 1.17)	0.77	1.00	0.83

Note: Adjusted for percentage of the county in various age buckets; percentage male; percentage white, black, Hispanic, and other; percentage who did not graduate high school, graduated high school, or obtained more education than high school; median income, rent, and home value; percentage below 150% poverty; percentage working class; percentage unemployed; percentage living in a rural area; percentage smokers; percentage who consume alcohol; percentage who are physically active; and percentage of individuals in a county who are obese using LOESS models with 3 df. A of 1 indicates a value . df, degrees of freedom; FDR, false discovery rate; LOESS, locally weighted smoothing model.
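
As a rough illustration of how an IRR per 10-μg/m3 increment relates to a log-linear rate model: if the natural logarithm of the incidence rate is regressed on PM2.5 and the PM2.5 term enters linearly with coefficient `beta` per 1 μg/m3 (an assumption made for this sketch, not a detail confirmed by the text), the IRR for an increment is `exp(beta * increment)`, and a Wald-type 95% CI exponentiates the coefficient's interval in the same way. The function name and inputs are hypothetical.

```python
import math


def irr_per_increment(beta, se, increment=10.0, z=1.96):
    """IRR and Wald-type CI for a given exposure increment.

    `beta` and `se` are the coefficient and standard error per 1 unit of
    exposure on the log-rate scale (assumed linear exposure term).
    """
    point = math.exp(beta * increment)
    lower = math.exp((beta - z * se) * increment)
    upper = math.exp((beta + z * se) * increment)
    return point, lower, upper


# A coefficient chosen so that the IRR per 10 units is 1.19; the standard
# error is made up for the example.
beta_example = math.log(1.19) / 10
point, lower, upper = irr_per_increment(beta_example, se=0.004)
```

The interval is symmetric on the log scale, so it is asymmetric around the IRR itself, as in the intervals reported in the table.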

Figure 2 compares the IRR estimates for the base model with estimates from time-varying models using various lagged moving average estimates (1-, 5-, 10-, and 15-y) of PM2.5 exposure for all cancers that were nominally significant at a 0.05 level in the primary analysis (all, lung, oral, rectal, liver, skin, breast, and kidney cancers). Numeric results for all cancer types are provided in Table S2. The associations between PM2.5 and all, lung, oral, rectal, skin, and breast cancers were similar for the primary time-independent model and the time-varying models, especially the time-varying models that used the relatively longer-lagged moving average exposure periods (10 or 15 y). However, for liver and kidney cancers, associations were substantially sensitive to these modeling choices.

Figure 2.

Estimated incident rate ratios (95% CIs) for the association between a 10-μg/m3 increase of PM2.5 and selected cancer type incidence from 2002–2011 using time-varying models, compared with the base (time-independent) model. Numerical estimates are included in Table S2. Open circles indicate estimates that were not statistically significant at a 0.05 level. Diamonds represent the base (time-independent) model. Models adjusted for percentage of the county in various age buckets; percentage male; percentage white, black, Hispanic, and other; percentage who did not graduate high school, graduated high school, or obtained more education than high school; median income, rent, and home value; percentage below 150% poverty; percentage working class; percentage unemployed; percentage living in a rural area; percentage smokers; percentage alcohol consumption; percentage who are physically active; and percentage of individuals in a county who are obese as well as indicator variables for urban/rural, state, and year. The primary (time-independent) model used a LOESS model with 3 df for all covariates. The linear models used linear yearly estimates for all covariates and 1-, 5-, 10-, and 15-y moving average estimates for PM2.5 exposure. The LOESS model was a locally weighted smoothing model with 3 df for all covariates with a 15-y moving average lagged estimate for PM2.5 exposure. Note: CI, confidence interval; df, degrees of freedom; PM2.5, particles ≤2.5 μm in aerodynamic diameter; LOESS, locally weighted smoothing model.

Figure 3 contains a forest plot that illustrates the sensitivity analysis performed on those cancer sites that were statistically significant based on the nominal p-values in the primary model. Numerical results for all cancer sites are provided in Table S3. The results were most statistically robust across modeling choices for lung cancer. All, oral, and skin cancers were largely statistically significant across modeling choices, whereas rectal, liver, breast, and kidney cancers varied substantially across modeling choices. Figure S1 illustrates the sensitivity analysis where covariates were progressively added to the model for the selected cancer types. The estimated IRRs were sensitive to the inclusion of the various levels of covariates. The adverse PM2.5–lung cancer association was observed in all models and was most strongly affected by controlling for smoking.

Figure 3.

Estimated incident rate ratios and 95% CIs for the association between a 10-μg/m3 increase of PM2.5 from 1988–2015 and average selected cancer type incidence in SEER counties from 1992–2016 across various models. Numerical estimates are included in Table S3. Open circles indicate estimates that were not statistically significant at a 0.05 level. Diamonds represent the primary time-independent and time-varying models. Models are adjusted for percentage of the county in various age buckets; percentage male; percentage white, black, Hispanic, and other; percentage who did not graduate high school, graduated high school, or obtained more education than high school; median income, rent, and home value; percentage below 150% poverty; percentage working class; percentage unemployed; percentage living in a rural area; percentage smokers; percentage who consume alcohol; percentage who are physically active; and percentage of individuals in a county who are obese as well as indicator variables for urban/rural and state. All models use the average incidence rate from 1992–2016 (primary time-independent model) unless indicated otherwise. The models include the following: the primary (time-independent) model, a LOESS model with 3 df for all covariates; a time-varying LOESS model with 3 df for all covariates, with an additional indicator variable for year and a 15-y moving average lagged estimate for PM2.5 to estimate exposure for individuals living in SEER counties from 2002–2011; a cross-validated LOESS model for all covariates; a cross-validated spline model for all covariates; a LOESS model with 3 df for all covariates, with the state removed from the model; a LOESS model with 3 df for all covariates, with the state removed from the model and replaced with a region control; a LOESS model with 3 df for all covariates, with the state removed from the model and replaced with a SEER registry control; a linear regression model with only linear terms for the covariates, assuming a Poisson distribution; a linear regression model with only linear terms for the covariates and with the sandwich method used to estimate standard errors; a LOESS model with 3 df for all covariates, with mean PM2.5 exposure from 1999–2015; and a LOESS model with 3 df for all covariates, with mean PM2.5 exposure from 1988–2007 on SEER counties from 2008–2016. Note: CI, confidence interval; df, degrees of freedom; LOESS, locally weighted smoothing model; PM2.5, particles ≤2.5 μm in aerodynamic diameter; SEER, Surveillance, Epidemiology, and End Results (SEER) program.

Figure 4.

Estimated exposure–response relationship between lung cancer incidence and (A) smoking and (B) PM2.5. Smoking is estimated as the average percentage of the county’s population that identified as smokers from 1996–2012. PM2.5 is measured as the population-weighted average concentration in a county from 1988–2015. A locally weighted smoothing (LOESS) model with 3 df is used to allow for nonlinearity. Note: df, degrees of freedom; PM2.5, particles ≤2.5 μm in aerodynamic diameter.

Discussion

A growing body of evidence indicates that lung cancer incidence is associated with exposure to PM2.5 (IARC 2013; Bai et al. 2020; Zhang et al. 2020). The present study supports this evidence, with a statistically significant IRR of 1.19 (95% CI: 1.09, 1.30), even after conservatively adjusting for multiple comparisons. Furthermore, the lung cancer IRR is remarkably robust across modeling choices, spatial controls, and various exposure windows. Although the present study estimates an IRR that is somewhat higher than the estimate in a recent meta-analysis that examined the association between PM2.5 exposure and lung cancer incidence [95% CI: 1.03, 1.12] (Huang et al. 2017), the IRR from the present study is comparable to the estimate from the meta-analysis mentioned previously of the association between exposure to PM2.5 and lung cancer incidence or mortality [95% CI: 1.08, 1.21] (Pope et al. 2020). Finally, the exposure–response curve provides evidence that although smoking is a much larger risk factor for lung cancer incidence, PM2.5 also contributes to the risk of lung cancer.

The results for non-lung cancers are less conclusive. Although statistically significant associations were found for oral, rectal, liver, skin, and kidney cancers in the base model, none of these associations was highly robust across sensitivity analyses. Furthermore, no association was found between PM2.5 and liver or kidney cancer when time-varying models were used. Previous studies have found statistically significant associations between PM2.5 and mortality or incidence from oral and oropharyngeal (Chu et al. 2019), colorectal (Coleman et al. 2020; Turner et al. 2017; Ancona et al. 2015), liver (Coleman et al. 2020; Ancona et al. 2015; Deng et al. 2017; Pan et al. 2016; VoPham et al. 2018), skin (Datzmann et al. 2018) (used PM10 instead of PM2.5), breast (Coleman et al. 2020; Ancona et al. 2015; Wong et al. 2016; Hu et al. 2013; White et al. 2019; DuPré et al. 2019), and kidney cancers (Turner et al. 2017; Raaschou-Nielsen et al. 2017).
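The IRRs discussed here are expressed per 10-μg/m3 increase in PM2.5. As a rough illustration only, the sketch below backs out a per-unit log-rate coefficient from the reported lung cancer IRR of 1.19 (95% CI: 1.09, 1.30) and rescales it to another increment. It assumes a log-linear exposure term and a symmetric Wald-type interval on the log scale; neither is confirmed as the procedure used in the study, whose primary models are flexible LOESS models. The function names are hypothetical.

```python
import math

Z_95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def log_coef_from_irr(irr, lower, upper, increment=10.0):
    """Back out the per-unit log-rate coefficient and its standard error
    from an IRR and 95% CI reported per `increment` units of exposure.

    Assumption: log-linear exposure term, Wald interval on the log scale.
    """
    beta = math.log(irr) / increment
    se = (math.log(upper) - math.log(lower)) / (2.0 * Z_95 * increment)
    return beta, se


def irr_for_increment(beta, se, increment):
    """Return (IRR, lower, upper) for an arbitrary exposure increment."""
    point = math.exp(beta * increment)
    lower = math.exp((beta - Z_95 * se) * increment)
    upper = math.exp((beta + Z_95 * se) * increment)
    return point, lower, upper


# Reported lung cancer estimate, per 10 ug/m3 of PM2.5
beta, se = log_coef_from_irr(1.19, 1.09, 1.30)

irr10 = irr_for_increment(beta, se, 10.0)  # recovers the reported scale
irr5 = irr_for_increment(beta, se, 5.0)    # rescaled to a 5 ug/m3 increment

print("per 10 ug/m3: %.2f (%.2f, %.2f)" % irr10)
print("per  5 ug/m3: %.2f (%.2f, %.2f)" % irr5)
```

Because the published interval is rounded to two decimals, the recovered limits match the reported ones only approximately. The same rescaling can help when comparing estimates from studies that report associations on different exposure increments.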
Furthermore, the association between all cancer incidence and PM2.5 was statistically significant [IRR = 1.09 (95% CI: 1.03, 1.14)], even after adjusting for multiple comparisons, indicating that the effect of exposure to PM2.5 on cancer sites may not be limited to the lungs.

The present study has several strengths. First, the analysis is based on well-documented cancer registry data that contain >8.5 million cases of cancer. Second, this study was able to flexibly control for many relevant county-level risk factors, including smoking, obesity, alcohol consumption, physical activity, income, and education. Third, this study used incidence data instead of mortality data, which avoids the risk of confounding from other causes of death. Finally, the cancer incidence, covariate, and air pollution exposure data are all publicly available.

This study has several limitations. First, this ecological study was unable to control for individual-level risk factors or pollution exposure; therefore, the association between cancer incidence and PM2.5 exposure found in this study may not reflect the individual-level association between PM2.5 and cancer incidence. However, other studies that have used individual-level data and controlled for a greater variety of risk factors have found comparable associations between cancer mortality and PM2.5. Further, this study was unable to control for all potential risk factors of cancer incidence. Potential unmeasured confounders include occupational exposures, dietary patterns, diabetes status, and chronic hepatitis B and C virus infection status. Furthermore, the present study found that progressively adding covariates to the model had an impact on the association between PM2.5 and cancer incidence, which suggests a possible risk of residual confounding. Finally, the present study does not estimate cancer incidence rates for various age, sex, and race/ethnicity categories.
Future studies should examine these associations to determine whether differences in exposure across various substrata, especially race/ethnicity, lead to a substantial difference in PM2.5–cancer incidence associations (Zou et al. 2014).

The present study is also limited in its ability to directly measure exposures. County-level PM2.5 concentrations are generated using population-weighted averages of U.S. Census block-level modeled estimates that cannot account for the full range of spatial variability. Sensitivity analyses suggest that most cancer associations are not highly sensitive to regional, state, or SEER cancer registry spatial control. It is unclear, however, how the estimates would be affected if the analysis could be conducted at the U.S. Census tract or block level. In addition, the present study had a limited ability to identify the most relevant exposure window for cancer incidence. The present study found that the associations between PM2.5 and cancer incidence are not sensitive to changing the exposure window among 1988–2015, 1999–2015, and 1988–2007. However, stronger associations were observed for 10- or 15-y lagged moving averages than for 1- or 5-y lagged moving averages, especially for lung cancer, which is indicative of a relatively long latency period. This study was unable to generate reliable exposure estimates before 1988. Finally, the primary index of air pollution used in this analysis is PM2.5, which does not account for spatial differences in the constituents or characteristics of PM2.5 or of various co-pollutants.

The present study supports the growing body of evidence that increased PM2.5 exposure is associated with lung cancer incidence. Furthermore, it provides moderate evidence that PM2.5 exposure may be associated with the incidence of cancer at other sites, such as oral and oropharyngeal, rectal, liver, skin, breast, and kidney. Although PM2.5 is likely not a primary risk factor for cancer incidence, the pervasive nature of air pollution exposure makes further study essential to public health.
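The two exposure constructions mentioned in this discussion, population-weighted county averages of block-level estimates and lagged moving averages of annual concentrations, can be sketched as follows. This is a minimal illustration and not the study's code: the function names and all numeric values are hypothetical, and the window alignment (the N years ending at, and including, the incidence year) is an assumption.

```python
def population_weighted_mean(concentrations, populations):
    """County-level concentration as the population-weighted mean of
    block-level modeled estimates (one value and one population per block)."""
    if len(concentrations) != len(populations):
        raise ValueError("inputs must have the same length")
    total_pop = sum(populations)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    weighted = sum(c * p for c, p in zip(concentrations, populations))
    return weighted / total_pop


def lagged_moving_average(annual, year, window):
    """Mean of the annual values for the `window` years ending at `year`
    (inclusive). Returns None if any year in the window is missing.

    Assumption: the window includes the index year itself.
    """
    years = range(year - window + 1, year + 1)
    if any(y not in annual for y in years):
        return None
    return sum(annual[y] for y in years) / window


# Hypothetical block-level estimates (ug/m3) and block populations
block_pm25 = [8.0, 12.0, 10.0]
block_pop = [100, 300, 600]
county_pm25 = population_weighted_mean(block_pm25, block_pop)

# Hypothetical annual county series, 1988-2011, declining 0.2 ug/m3 per year
annual_pm25 = {y: 15.0 - 0.2 * (y - 1988) for y in range(1988, 2012)}
lag15 = lagged_moving_average(annual_pm25, 2002, 15)  # uses 1988-2002
lag5 = lagged_moving_average(annual_pm25, 2002, 5)    # uses 1998-2002

print(county_pm25, lag15, lag5)
```

Under this alignment, a 15-y window for incidence year 2002 reaches back exactly to 1988, the earliest year with exposure estimates; a longer window returns None because earlier years are unavailable.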

32 in total

1. Expert position paper on air pollution and cardiovascular disease.

Authors: David E Newby; Pier M Mannucci; Grethe S Tell; Andrea A Baccarelli; Robert D Brook; Ken Donaldson; Francesco Forastiere; Massimo Franchini; Oscar H Franco; Ian Graham; Gerard Hoek; Barbara Hoffmann; Marc F Hoylaerts; Nino Künzli; Nicholas Mills; Juha Pekkanen; Annette Peters; Massimo F Piepoli; Sanjay Rajagopalan; Robert F Storey
Journal: Eur Heart J Date: 2014-12-09 Impact factor: 29.983

2. Long-term ambient fine particulate matter air pollution and lung cancer in a large cohort of never-smokers.

Authors: Michelle C Turner; Daniel Krewski; C Arden Pope; Yue Chen; Susan M Gapstur; Michael J Thun
Journal: Am J Respir Crit Care Med Date: 2011-10-06 Impact factor: 21.405

3. Fine particulate air pollution and human mortality: 25+ years of cohort studies.

Authors: C Arden Pope; Nathan Coleman; Zachari A Pond; Richard T Burnett
Journal: Environ Res Date: 2019-11-14 Impact factor: 6.498

4. Mortality and morbidity in a population exposed to multiple sources of air pollution: A retrospective cohort study using air dispersion models.

Authors: Carla Ancona; Chiara Badaloni; Francesca Mataloni; Andrea Bolignano; Simone Bucci; Giulia Cesaroni; Roberto Sozzi; Marina Davoli; Francesco Forastiere
Journal: Environ Res Date: 2015-02-18 Impact factor: 6.498

5. Exposure to ambient air pollution and the incidence of lung cancer and breast cancer in the Ontario Population Health and Environment Cohort.

Authors: Li Bai; Saeha Shin; Richard T Burnett; Jeffrey C Kwong; Perry Hystad; Aaron van Donkelaar; Mark S Goldberg; Eric Lavigne; Scott Weichenthal; Randall V Martin; Ray Copes; Alexander Kopp; Hong Chen
Journal: Int J Cancer Date: 2019-07-30 Impact factor: 7.396

6. Ambient PM_2.5 air pollution exposure and hepatocellular carcinoma incidence in the United States.

Authors: Trang VoPham; Kimberly A Bertrand; Rulla M Tamimi; Francine Laden; Jaime E Hart
Journal: Cancer Causes Control Date: 2018-04-25 Impact factor: 2.506

7. Chronic exposure to fine particles and mortality: an extended follow-up of the Harvard Six Cities study from 1974 to 2009.

Authors: Johanna Lepeule; Francine Laden; Douglas Dockery; Joel Schwartz
Journal: Environ Health Perspect Date: 2012-03-28 Impact factor: 9.031

8. Relationship between exposure to PM2.5 and lung cancer incidence and mortality: A meta-analysis.

Authors: Feifei Huang; Bing Pan; Jun Wu; Engeng Chen; Liying Chen
Journal: Oncotarget Date: 2017-06-27

9. Air Pollution, Clustering of Particulate Matter Components, and Breast Cancer in the Sister Study: A U.S.-Wide Cohort.

Authors: Alexandra J White; Joshua P Keller; Shanshan Zhao; Rachel Carroll; Joel D Kaufman; Dale P Sandler
Journal: Environ Health Perspect Date: 2019-10-09 Impact factor: 9.031

10. Association between fine particulate matter and oral cancer among Taiwanese men.

Authors: Yu-Hua Chu; Syuan-Wei Kao; Disline Manli Tantoh; Pei-Chieh Ko; Shou-Jen Lan; Yung-Po Liaw
Journal: J Investig Med Date: 2018-10-09 Impact factor: 2.895
