
Lung Function in African American Children with Asthma Is Associated with Novel Regulatory Variants of the KIT Ligand KITLG/SCF and Gene-By-Air-Pollution Interaction.

Satria Sajuthi1, Jaehyun Joo2, Shujie Xiao3, Patrick M Sleiman4,5, Marquitta J White6, Hakon Hakonarson4,5, Blanca E Himes2, L Keoki Williams3, Max A Seibold2, Angel C Y Mak7, Eunice Y Lee6, Benjamin Saef2, Donglei Hu6, Hongsheng Gui3, Kevin L Keys6,8, Fred Lurmann9, Deepti Jain10, Gonçalo Abecasis11, Hyun Min Kang11, Deborah A Nickerson12,13,14, Soren Germer15, Michael C Zody15, Lara Winterkorn15, Catherine Reeves15, Scott Huntsman6, Celeste Eng6, Sandra Salazar6, Sam S Oh6, Frank D Gilliland16, Zhanghua Chen16, Rajesh Kumar17, Fernando D Martínez18, Ann Chen Wu19, Elad Ziv6, Esteban G Burchard6,20.   

Abstract

Baseline lung function, quantified as forced expiratory volume in the first second of exhalation (FEV1), is a standard diagnostic criterion used by clinicians to identify and classify lung diseases. Using whole-genome sequencing data from the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine project, we identified a novel genetic association with FEV1 on chromosome 12 in 867 African American children with asthma (P = 1.26 × 10−8, β = 0.302). Conditional analysis within 1 Mb of the tag signal (rs73429450) yielded one major and two other weaker independent signals within this peak. We explored statistical and functional evidence for all variants in linkage disequilibrium with the three independent signals and identified nine variants as the most likely candidates responsible for the association with FEV1. Hi-C data and expression QTL analysis demonstrated that these variants physically interacted with KITLG (KIT ligand, also known as SCF), and their minor alleles were associated with increased expression of the KITLG gene in nasal epithelial cells. Gene-by-air-pollution interaction analysis found that the candidate variant rs58475486 interacted with past-year ambient sulfur dioxide exposure (P = 0.003, β = 0.32). This study identified a novel protective genetic association with FEV1, possibly mediated through KITLG, in African American children with asthma. This is the first study that has identified a genetic association between lung function and KITLG, which has an established role in orchestrating allergic inflammation in asthma.
Copyright © 2020 by the Genetics Society of America.

Keywords:  African American; FEV1; gene-by-environment interaction; GWAS; GxE; KITLG; SCF; air pollution

Year:  2020        PMID: 32327564      PMCID: PMC7337089          DOI: 10.1534/genetics.120.303231

Source DB:  PubMed          Journal:  Genetics        ISSN: 0016-6731            Impact factor:   4.562


ASTHMA, a chronic pulmonary condition characterized by reversible airway obstruction, is one of the hallmark diseases of childhood in the United States (World Health Organization 2017). Asthma is also the most disparate common disease in the pediatric clinic, with significant variation in prevalence, morbidity, and mortality among U.S. racial/ethnic groups (Oh ). Specifically, African American children carry a higher asthma disease burden compared to their European American counterparts (Akinbami ; Akinbami 2015). Forced expiratory volume in the first second (FEV1), a measurement of lung function, is a vital clinical trait used by physicians to assess overall lung health and diagnose pulmonary diseases such as asthma (Johnson and Theurer 2014). We have previously shown that genetic ancestry plays an important role in FEV1 variation and that African Americans have lower FEV1 compared to European Americans, regardless of asthma status (Kumar ; Pino-Yanes ). The disparity in lung function between populations may explain disparities in asthma disease burden. Understanding the factors that influence FEV1 variation among individuals with asthma could lead to improved patient care and therapeutic interventions. Twin and family-based studies estimate that the heritability of FEV1 ranges from 26 to 81%, supporting the combined contribution by genetic and environmental factors in FEV1 variation (Chatterjee and Das 1995; Chen ; Palmer ; Hukkinen ; Yamada ; Sillanpaa ; Tian ). Genome-wide association studies (GWAS) of FEV1, including among individuals with asthma, have identified many variants that contribute to lung function (Repapi ; Soler Artigas , 2015; Li ; Liao ; Wain ). A search in the National Human Genome Research Institute-European Bioinformatics Institute (NHGRI-EBI) GWAS Catalog (version e98_r2020-03-08) on baseline lung function (FEV1) alone revealed 349 associations (Buniello ). 
However, most of these previous GWAS were performed in adult populations of European descent, and their results may not generalize across populations or across the life span of an individual (Carlson ; A. R. Martin ; Wojcik ). Previous GWAS results are also limited due to their reliance on genotyping arrays. In particular, variation in noncoding regions of the genome is not adequately covered by many genotyping arrays because they were not designed to account for the population-specific genetic variability of all populations (Zhang and Lupski 2015; Kim ). Whole-genome sequencing (WGS) is a newer technology that captures nearly all common variation from coding and noncoding regions of the genome, and is unencumbered by genotype array design constraints and differences in linkage disequilibrium (LD) patterns among populations. To date, no large-scale WGS studies of lung function have been performed in African American children with asthma (A. R. Martin ). FEV1 is a complex trait that is significantly influenced by both genetic variation and environmental factors, such as air pollution (Chatterjee and Das 1995; Palmer ; Hukkinen ; Yamada ; Tian ; Sillanpaa ). Exposure to ambient air pollution has been consistently associated with poor respiratory outcomes, including reduced FEV1 (Brunekreef and Holgate 2002; Barraza-Villarreal ; Ierodiakonou ; Wise 2019). We previously showed that exposure to sulfur dioxide (SO2), an air pollutant emitted by the burning of fossil fuels, is significantly associated with reduced FEV1 in African American children with asthma in the Study of African Americans, Asthma, Genes & Environments (SAGE II) study (Neophytou ). Because the genetic variants associated with FEV1 thus far do not account for the majority of its estimated heritability, considering gene–environment (GxE) interactions, specifically gene-by-air-pollution, may improve our understanding of lung function genetics (Moore 2005; Moore and Williams 2009). Here, we performed a genome-wide association analysis using WGS data to identify common genetic variants associated with FEV1 in African American children with asthma in SAGE II and investigated the effect of GxE (SO2) interactions on FEV1 associations.

Materials and Methods

Study population

This study examined African American children 8–21 years of age with physician-diagnosed asthma from the SAGE II study. All SAGE II participants were recruited from the San Francisco Bay Area. The inclusion and exclusion criteria were previously described in detail (Oh ; White ). Briefly, participants were eligible if they were 8–21 years of age, self-identified as African American, and had four African American grandparents. Study exclusion criteria included the following: (1) any smoking within 1 year of the recruitment date; (2) 10 or more pack-years of smoking; (3) pregnancy in the third trimester; and (4) history of lung diseases other than asthma (for cases) or chronic illness (for cases and controls). Baseline lung function, defined as FEV1, was measured by spirometry prior to administering albuterol, as previously described (Oh ).

Trans-Omics for Precision Medicine WGS data

SAGE II DNA samples were sequenced as part of the Trans-Omics for Precision Medicine (TOPMed) WGS program (Taliun preprint). WGS was performed at the New York Genome Center and Northwest Genomics Center on a HiSeq X system (Illumina, San Diego, CA) using a paired-end read length of 150 bp, with a minimum of 30× mean genome coverage. DNA sample handling, quality control, library construction, clustering and sequencing, read processing, and sequence data quality control are described in detail on the TOPMed website (National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine (TOPMed) Program 2019). Variant calls were obtained from TOPMed data freeze 8 variant call format (VCF) files corresponding to the GRCh38 human genome assembly. Variants with a minimum read depth of 10 (DP10) were used for analysis unless otherwise stated.

Genetic principal components, global ancestry, and kinship estimation

Genetic principal components (PCs), global ancestry, and kinship estimation on genetic relatedness were computed using biallelic single-nucleotide polymorphisms (SNPs) with a PASS flag from TOPMed freeze 8 DP10 data. PCs and kinship estimates were computed using the PC-Relate function from the GENESIS R package (Conomos , 2016) using a workflow available from the Summer Institute in Statistical Genetics Module 17 course website (Summer Institute in Statistical Genetics 2019). African global ancestry was computed using the ADMIXTURE package (Alexander ) in supervised mode using European (CEU), African (YRI), and Native American (NAM) reference panels as previously described (Mak ).

FEV1 GWAS

Nonnormality of the distribution of FEV1 values was tested with the Shapiro–Wilk test in R using the shapiro.test function. Since FEV1 was not normally distributed (P = 1.41 × 10−8 for FEV1 and P = 1.05 × 10−8 for log10 FEV1), FEV1 was regressed on all covariates (age, sex, height, controller medications, sequencing centers, and the first five genetic PCs) and the residuals were inverse-normalized. These inverse-normalized residuals (FEV1.res.rnorm) were the main outcome of the discovery GWAS. The controller medication covariate included the use of inhaled corticosteroids (ICS), long-acting β-agonists (LABA), leukotriene inhibitors, and/or an ICS/LABA combo in the 2 weeks prior to the recruitment date. Genome-wide single-variant analysis was performed on the ENCORE server (https://github.com/statgen/encore) using the linear Wald test (q.linear) originally implemented in EPACTS (https://genome.sph.umich.edu/wiki/EPACTS) and TOPMed freeze 8 data (DP0 PASS), with a minor allele frequency (MAF) filter of 0.1%. All pairs of participants with third-degree or closer relatedness (kinship values > 0.044) were identified, and one participant of each related pair was chosen at random and removed prior to analysis. All covariates used to obtain FEV1.res.rnorm were also included as covariates in the GWAS as recommended in a recent publication (Sofer ). The association analysis was repeated using untransformed FEV1 and FEV1 percent predicted (FEV1.perc.predicted). FEV1 percent predicted was defined as the percentage of measured FEV1 relative to predicted FEV1, estimated by the Hankinson lung function prediction equation for African Americans (Hankinson ). A secondary analysis that included smoking-related covariates (smoking status and number of smokers in the family) was performed in PLINK 1.9 (version 1p9_2019_0304_dev) (Purcell and Chang 2013; Chang ).
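The inverse normalization of covariate-adjusted residuals can be illustrated with a minimal sketch. The text states only that the residuals were inverse-normalized; the rank offset of 0.5, the average-rank handling of ties, and the function name inverse_normalize are assumptions made for illustration and are not taken from the authors' workflow.

```python
from statistics import NormalDist


def inverse_normalize(values, offset=0.5):
    """Rank-based inverse normal transform (illustrative sketch).

    Each value is replaced by the standard normal quantile of
    (rank - offset) / n. Tied values share their average rank.
    The offset of 0.5 is an assumption; other conventions exist.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current group of tied values.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical residuals from regressing FEV1 on the covariates.
residuals = [0.3, -1.2, 0.0, 2.5, -0.4]
transformed = inverse_normalize(residuals)
```

The transform preserves the ordering of the residuals while forcing their distribution toward a standard normal, which is why the same covariates are still included in the association model afterwards.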
To study whether association with FEV1 is specific to SAGE II participants with asthma, we repeated the association analysis adjusting for age, sex, height, and the first five genetic PCs in SAGE II participants without asthma on the ENCORE server. All of these participants were sequenced in the same center. Regional association results were plotted using LocusZoom 1.4 (Pruim ) with a 500 kb flanking region. LD (R2) was estimated in PLINK 1.9. An LD plot was generated using recoded genotype files (plink --recode 12) in Haploview (Barrett ). The function effectiveSize in the R package CODA was used to estimate the actual effective number of independent tests, and CODA-adjusted statistical and suggestive significance P-value thresholds were defined as 0.05 and 1, divided by the effective number of tests, respectively (Duggal ). We compared the CODA-adjusted statistical significance threshold and the widely used 5 × 10−8 GWAS genome-wide significance threshold (Pe’er ), and selected the more stringent threshold for genome-wide significance. The following WGS quality control steps were applied to all reported variant association results from the ENCORE server to ensure WGS variant quality: (1) The variant had VCF FILTER = PASS; (2) variant quality was confirmed via manual inspection on the BRAVO server based on TOPMed freeze 5 data (University of Michigan and National Heart, Lung and Blood Institute Trans-Omics for Precision Medicine (TOPMed) Program 2018); and (3) variants were reanalyzed with linear regression using PLINK 1.9 by applying the arguments --mac 5 --geno 0.1 --hwe 0.0001 using TOPMed freeze 8 DP10 PASS data. To determine if the rs73429450 association with FEV1 was only identifiable using WGS data, we repeated the linear regression association analysis on signals that passed the genome-wide significance threshold using PLINK 1.9 and genotype data generated with the Axiom Genome-Wide LAT 1 array (Affymetrix, Santa Clara, CA; dbGaP phs000921.v1.p1).
These array genotype data were imputed into the following reference panels: 1000 Genomes Project (1000G) phase 3 version 5, Haplotype Reference Consortium (HRC) r1.1, the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), and the TOPMed freeze 5 panels on the Michigan Imputation Server (Das ). It should be noted that 500 SAGE II subjects were part of the TOPMed freeze 5 reference panel. A total of 349 GWAS FEV1-associated entries were retrieved from the NHGRI-EBI GWAS Catalog version 1.0.2-associations_e98_r2020-03-08 (Buniello ) using the trait names “Lung function (FEV1)”, “FEV1,” “Lung function (forced expiratory volume in 1 sec),” or “Prebronchodilator FEV1.” After adding 100-kb flanking regions to each of the 349 entries, a total of 230 nonoverlapping regions were obtained. To assess whether we replicated previous GWAS loci while limiting the multiple-testing penalty, we used only the 279,495 common variants (MAF ≥ 0.01) that overlapped with the 230 regions. The 279,495 common variants were equivalent to 17,755 effective tests based on CODA, and 5.63 × 10−5 (1/17,755) was used as the suggestive P-value threshold for replication.
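Two steps of the lookup described above, merging flanked catalog entries into nonoverlapping regions and converting an effective number of tests into P-value thresholds, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the (chromosome, position) input format, and the toy coordinates are hypothetical, and the effective number of tests is taken as given rather than estimated with CODA.

```python
def merge_flanked_regions(positions, flank=100_000):
    """Add a flank to each (chrom, pos) entry and merge overlapping intervals.

    Returns a sorted list of nonoverlapping (chrom, start, end) tuples.
    """
    intervals = sorted((c, max(0, p - flank), p + flank) for c, p in positions)
    merged = []
    for chrom, start, end in intervals:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps the previous region on the same chromosome: extend it.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


def significance_thresholds(effective_tests, alpha=0.05):
    """Return (statistical, suggestive) thresholds: alpha/N and 1/N."""
    if effective_tests <= 0:
        raise ValueError("effective_tests must be positive")
    return alpha / effective_tests, 1.0 / effective_tests


# Toy coordinates (hypothetical, not catalog entries).
regions = merge_flanked_regions([("12", 1_000_000), ("12", 1_150_000), ("5", 1_000_000)])

# 17,755 effective tests is the value reported in the text.
statistical, suggestive = significance_thresholds(17_755)
```

With 17,755 effective tests, the suggestive threshold evaluates to about 5.63 × 10−5, matching the value reported in the text.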

Conditional analysis

Conditional analysis was performed to identify all independent signals in a GWAS peak using PLINK 1.9. All TOPMed freeze 8 DP10 variants within 1 Mb of the tag association signal, and with association P-value of 1 × 10−4 or smaller in the discovery GWAS, were included in the analysis. Variants were first ordered by ascending P-value. A variant was considered to be an independent signal if the association P-value after conditioning (conditional P-value) on the tag signal was < 0.05. Newly identified independent signals were included with the tag signal for conditioning on the next variant.
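The stepwise procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the PLINK-based workflow used in the study: the callable conditional_p stands in for refitting the association model with the already-selected signals as covariates, and the variant identifiers and LD groups in the toy example are hypothetical.

```python
def find_independent_signals(variants, tag, conditional_p, alpha=0.05):
    """Stepwise conditional analysis (illustrative sketch).

    variants: list of (variant_id, discovery_p) within the locus.
    tag: identifier of the tag (strongest) signal.
    conditional_p: callable (variant_id, conditioned_ids) -> P-value of the
        variant after conditioning on conditioned_ids (hypothetical helper).
    """
    independent = [tag]
    # Test variants in order of ascending discovery P-value.
    for variant_id, _ in sorted(variants, key=lambda v: v[1]):
        if variant_id in independent:
            continue
        if conditional_p(variant_id, tuple(independent)) < alpha:
            # Newly identified signals join the conditioning set.
            independent.append(variant_id)
    return independent


# Toy stand-in for a regression refit: a variant loses its signal once any
# variant from the same (hypothetical) LD group has been conditioned on.
ld_group = {"tag": 0, "a": 0, "b": 1, "c": 1}


def toy_conditional_p(variant_id, conditioned_ids):
    groups = {ld_group[v] for v in conditioned_ids}
    return 0.9 if ld_group[variant_id] in groups else 0.01


signals = find_independent_signals(
    [("tag", 1e-9), ("a", 1e-6), ("b", 2e-6), ("c", 5e-5)], "tag", toy_conditional_p
)
```

In the toy example, the variant "b" is retained as a second signal, after which "c" is no longer significant because it belongs to the same group as "b".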

Region-based association analysis

Region-based association analyses were performed in 1-kb sliding windows with 500-bp increments in a 1-Mb flanking region of the tag GWAS signal using the SKAT_CommonRare function from the SKAT R package v1.3.2.1 (Ionita-Laza ). Default settings were used with method = “C” and test.type = “Joint.” A MAF threshold of 0.01 was used as the cutoff to distinguish rare and common variants. Variants were annotated in TOPMed using the Whole Genome Sequencing Annotator (WGSA) pipeline (Liu ). Since SKAT imputes missing genotypes by default by assigning mean genotype values (impute.method=“fixed”), we chose to use low-coverage genotypes instead of SKAT imputation, and hence TOPMed freeze 8 DP0 variants with a VCF FILTER of PASS were included in the analysis. The function effectiveSize in the R package CODA (Plummer ) was used to estimate the effective number of independent hypothesis tests for accurate Bonferroni multiple testing corrections. P-value thresholds for statistical significance and suggestive significance were defined as 0.05 and 1 divided by the effective number of tests, respectively (Duggal ). If a region was suggestively significant, region-based analyses were repeated with functional variants and/or rare variants (MAF ≤ 0.01) to assess contributions of common, rare, and/or functional variants. Region-based analyses using rare variants only were performed using SKAT-O (Lee ). The WGSA annotation filters used to define functional variants are provided in Supplemental Material, File S1. To study the contribution of individual variants to a region-based association P-value, drop-one variant analysis was performed by repeating the region-based analysis multiple times and dropping one variant only at a time.
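The window layout and the drop-one design described above can be sketched as follows. This is a minimal illustration, not the SKAT workflow itself: the function names are hypothetical, windows are assumed to be half-open intervals that lie entirely within the flanking region, and the association test for each window is left out.

```python
def sliding_windows(center, flank=1_000_000, width=1_000, step=500):
    """Return (start, end) windows of `width` bp, advancing by `step` bp,
    covering center - flank to center + flank (half-open convention assumed)."""
    start, stop = center - flank, center + flank
    return [(s, s + width) for s in range(start, stop - width + 1, step)]


def drop_one_sets(variant_ids):
    """For drop-one analysis: pair each dropped variant with the remaining set."""
    return [
        (dropped, [v for v in variant_ids if v != dropped])
        for dropped in variant_ids
    ]


# Position of the tag signal rs73429450 as given in the text (GRCh38).
windows = sliding_windows(88_846_435)

# Hypothetical variant labels for one window.
designs = drop_one_sets(["v1", "v2", "v3"])
```

Under these assumptions the 1-Mb flanking region yields 3999 overlapping windows. Because adjacent windows share half of their positions, the tests are strongly correlated, which is why the multiple testing correction is based on an effective number of independent tests instead of the raw window count.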

Functional annotations and prioritization of genetic variants

The Hi-C Unifying Genomic Interrogator (HUGIN) (Ay ; Schmitt ; J. S. Martin ) was used to assign potential gene targets to each variant. HUGIN uses the Hi-C data generated from the primary human tissues from four donors used in the Roadmap Epigenomics Project (Schmitt ). Encyclopedia of DNA Elements (ENCODE) annotations (ENCODE Project Consortium 2011, 2012) were based on overlap of the variants with functional data downloaded from the University of California, Santa Cruz (UCSC) Table Browser (Karolchik ). These data included DNase I hypersensitivity peak clusters (hg38 wgEncodeRegDnaseClustered table), transcription factor chromatin immunoprecipitation-sequencing (ChIP-Seq) clusters (hg38 encRegTfbsClustered table), and histone modification ChIP-Seq peaks (hg19 wgEncodeBroadHistone StdPk tables). For DNase I hypersensitivity and transcription factor binding sites, we focused on blood, bone marrow, lung, and embryonic cells. For histone modification ChIP-Seq, we focused on H3K27ac and H3K4me3 modifications in human blood (GM12878), bone marrow (K562), lung fibroblast (NHLF), and embryonic stem cells (H1-hESC). The LiftOver tool (Hinrichs ) was used to convert genomic coordinates from hg19 to hg38. Candidate cis-regulatory elements (ccREs) were a subset of representative DNase hypersensitivity sites with epigenetic activity further supported by histone modification (H3K4me3 and H3K27ac) or CTCF-binding data from the ENCODE project. Overlaps of variants with ccREs were detected using the Search Candidate cis-Regulatory Elements by ENCODE (SCREEN) web interface (ENCODE Project Consortium 2011, 2012). Prioritization of genetic variants was based on the presence of statistical, functional, and/or bioinformatic evidence as described in the Diverse Convergent Evidence prioritization framework (Ciesielski ).
The priority score of each variant was obtained by counting the number of pieces of statistical, functional, and/or bioinformatic evidence that supported a potential biological function for that variant.
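The counting scheme behind the priority score can be sketched as below. The evidence categories, variant labels, and function names are hypothetical placeholders chosen for illustration; the actual evidence types are those of the Diverse Convergent Evidence framework cited above.

```python
def priority_score(evidence):
    """Count the pieces of evidence that support a variant.

    evidence: dict mapping an evidence label to True/False.
    """
    return sum(1 for present in evidence.values() if present)


def rank_variants(evidence_by_variant):
    """Return (variant_id, score) pairs sorted by descending score."""
    scored = [(v, priority_score(e)) for v, e in evidence_by_variant.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical evidence table (labels and values are illustrative only).
example = {
    "variant_A": {"independent_signal": True, "eqtl": True, "ccre_overlap": False},
    "variant_B": {"independent_signal": False, "eqtl": True, "ccre_overlap": False},
}
ranking = rank_variants(example)
```

An unweighted count treats every line of evidence as equally informative, which keeps the score easy to interpret but does not distinguish strong from weak support.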

Replication of GWAS associations

All replication analyses were performed in subjects with asthma. Replication of GWAS FEV1 associations was attempted on TOPMed WGS data generated from four cohorts. These cohorts included Puerto Rican (n = 1109) and Mexican American (n = 649) children in the Genes-Environments and Admixture in Latino Americans (GALA II) study (Oh ), African American adults in the Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity (SAPPHIRE, n = 3428) (Levin ), and African American children in the Genetics of Complex Pediatric Disorders (GCPD-A, n = 1464) study (Ong ). Age, sex, height, controller medications, and the first five PCs were used as covariates. Additionally, replication of GWAS FEV1 associations was attempted using data of black UK Biobank subjects who had asthma (n = 627) while adjusting for age, sex, height, and the first five PCs. Asthma status was defined by International Statistical Classification of Diseases and Related Health Problems (ICD) code of 493 or self-reported asthma. UK Biobank genotype data were generated on the Affymetrix UK BiLEVE axiom or UK Biobank Axiom array and imputed into the HRC, 1000G, and UK 10K projects (Bycroft ; Canela-Xandri ). Additional details on the UK Biobank study and the replication procedures are available in File S1 (Text 2).

RNA sequencing and expression QTL analysis

Whole-transcriptome libraries of 370 nasal brushings from GALA II Puerto Rican children with asthma were constructed by using the Beckman Coulter FX automation system (Beckman, Fullerton, CA). Libraries were sequenced with the Illumina HiSeq 2500 system. Raw RNA sequencing (RNA-Seq) reads were trimmed using Skewer (Jiang ) and mapped to human reference genome hg38 using Hisat2 (Kim ). Reads mapped to genes were counted with htseq-count and using the UCSC hg38 Gene Transfer Format (GTF) file as reference (Anders ). Cis-expression QTL (eQTL) analysis of KITLG (KIT ligand) was performed as described in the Genotype-Tissue Expression (GTEx) project version 7 protocol (GTEx Consortium et al. 2017) using age, sex, body mass index, global African and European ancestries, and 60 probabilistic estimation of expression residual (PEER) factors as covariates.

Gene-by-air-pollution interaction analysis

We hypothesized that the effect of genetic variation on lung function in our study population may differ by the levels of exposure to SO2 (Neophytou ). To test for an interaction between a genetic variant and SO2, an additional multiplicative interaction term (variant × SO2 exposure) was included in the original GWAS model (see section FEV1 GWAS). The SO2 estimates used in the interaction analysis were first-year, past-year, and lifetime exposure to ambient SO2, which were estimated as described previously (Neophytou ). Briefly, we obtained regional ambient daily air pollution data from the U.S. Environmental Protection Agency Air Quality System. SO2 estimates for each participant’s residential geographic coordinate were calculated as the inverse distance-squared weighted average from the four closest air pollution monitoring stations within 50 km of the participant’s residence. We estimated yearly exposure at the reported residential address by averaging all available daily measures (daily average of 1-hr SO2) in a given year. If the participant had a change of residential address in a given year, we estimated yearly exposure as a time-weighted estimate based on the number of months spent at each different address in that year. Average lifetime exposures were estimated using all available yearly average estimates over the lifetime of the participant until the day of spirometry testing. Since not all pollutants were measured daily, there are location- and pollutant-dependent missing values. Residuals of FEV1 were plotted against exposure to SO2 and stratified by the number of copies of the minor allele of a variant. Residuals of FEV1 were obtained as described in section FEV1 GWAS.
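The exposure estimation steps described above can be sketched as follows. This is a minimal illustration, assuming that station distances have already been computed and that each station contributes a single SO2 value; the function names and input formats are hypothetical and do not reproduce the previously published pipeline.

```python
def idw_exposure(stations, max_km=50.0, k=4):
    """Inverse distance-squared weighted average of station values.

    stations: list of (distance_km, so2_ppb) for monitoring stations.
    Uses at most the k closest stations within max_km; returns None if
    no station qualifies.
    """
    nearby = sorted(
        (d, v) for d, v in stations if d <= max_km and v is not None
    )[:k]
    if not nearby:
        return None
    for d, v in nearby:
        if d == 0:
            return v  # A station at the residence itself: use its value directly.
    weights = [1.0 / d ** 2 for d, _ in nearby]
    return sum(w * v for w, (_, v) in zip(weights, nearby)) / sum(weights)


def time_weighted_yearly(address_estimates):
    """Yearly exposure for a participant who moved within the year.

    address_estimates: list of (months_at_address, exposure_ppb).
    """
    total_months = sum(m for m, _ in address_estimates)
    return sum(m * v for m, v in address_estimates) / total_months


# Hypothetical inputs: two stations in range, one beyond 50 km.
estimate = idw_exposure([(10.0, 1.0), (20.0, 2.0), (60.0, 9.0)])  # about 1.2 ppb
yearly = time_weighted_yearly([(3, 1.0), (9, 2.0)])               # 1.75 ppb
```

Squaring the distance makes the nearest station dominate the estimate: in the example, the station at 10 km receives four times the weight of the station at 20 km.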

Data availability

Local institutional review boards approved the studies (number 10-02877). All subjects and legal guardians provided written informed consent. TOPMed WGS and phenotype data from SAGE II are available on dbGaP under accession number phs000921.v4.p1. Supplemental material and normalized gene count data for KITLG are available at figshare: https://doi.org/10.25386/genetics.12152196.

Results

Novel lung function associations

Subject characteristics of the 867 African American children with asthma included in this study are shown in Table 1, and the distribution of their FEV1 measurements (mean = 2.56 L, SD = 0.79 L) is in Figure S1. The CODA-adjusted statistical significance thresholds 2.10 × 10−8 and 4.19 × 10−7 were used as the genome-wide and suggestive significance thresholds, respectively. According to this threshold, one SNP in chromosome 12 (chromosome 12:88846435, rs73429450, G > A) was associated with FEV1.res.rnorm (Figure 1, P = 9.01 × 10−9, β = 0.801) at genome-wide significance. The association between rs73429450 and lung function remained statistically significant when the association was repeated using untransformed FEV1 (P = 1.26 × 10−8, β = 0.302) as the outcome variable. The association between rs73429450 and lung function was suggestive using FEV1.perc.predicted (P = 1.69 × 10−7, β = 0.100). Twenty suggestive associations corresponding to four tag signals are reported in File S2. None of the suggestive associations overlapped with any of the previously reported FEV1-associated loci. When considering only common variants and applying a P-value threshold of 5.63 × 10−5, we found replication in 6 out of 230 previously reported FEV1 associations (Table S1). Our top FEV1 association, rs73429450, did not overlap with any previously reported loci and it is a novel association with FEV1 in this study population.
Table 1

Descriptive characteristics of 867 African American children with asthma included in this study

Characteristic                               African American (n = 867)
Age
  Mean (SD)                                  14.1 (3.64)
  Median (25%, 75%)                          13.8 (10.98, 17.11)
Sex
  Male                                       439 (50.6%)
  Female                                     428 (49.4%)
Height (m)
  Mean (SD)                                  1.58 (0.145)
  Median (25%, 75%)                          1.60 (1.47, 1.68)
Any control medications* in last 2 weeks
  No                                         543 (62.6%)
  Yes                                        324 (37.4%)
ICS in last 2 weeks
  No                                         211 (24.3%)
  Yes                                        306 (35.3%)
  Missing                                    350 (40.4%)
LABA in last 2 weeks
  No                                         5 (0.6%)
  Yes                                        94 (10.8%)
  Missing                                    768 (88.6%)
Leukotriene inhibitor in last 2 weeks
  No                                         11 (1.3%)
  Yes                                        68 (7.8%)
  Missing                                    788 (90.9%)
African ancestry
  Mean (SD)                                  0.792 (0.129)
  Median (25%, 75%)                          0.826 (0.759, 0.869)
Smoking status
  Never                                      793 (91.5%)
  Past                                       72 (8.3%)
  Current                                    0 (0%)
  Missing                                    2 (0.2%)
Number of smokers in family
  0                                          469 (54.1%)
  1                                          137 (15.8%)
  2                                          42 (4.8%)
  3+                                         10 (1.2%)
  Missing                                    209 (24.1%)
SO2 first year exposure (ppb)
  Mean (SD)                                  1.59 (0.961)
  Median (25%, 75%)                          1.50 (1.24, 1.87)
  Missing                                    227 (26.2%)
SO2 past year exposure (ppb)
  Mean (SD)                                  1.10 (0.302)
  Median (25%, 75%)                          1.08 (0.910, 1.27)
  Missing                                    206 (23.8%)
SO2 lifetime exposure (ppb)
  Mean (SD)                                  1.50 (0.371)
  Median (25%, 75%)                          1.47 (1.40, 1.54)
  Missing                                    206 (23.8%)
TOPMed sequencing center and phase
  Phase 1, CAAPA                             6 (0.7%)
  Phase 1, NYGC                              460 (53.1%)
  Phase 3, NW                                401 (46.3%)

25% and 75%, 25th and 75th percentiles. Control medications include ICS, LABA, leukotriene inhibitor, and/or ICS/LABA combo. SO2 exposures are hourly exposures averaged over the specified time periods before spirometry testing, as previously described in Neophytou . CAAPA, Consortium on Asthma among African-ancestry Populations in the Americas; NYGC, New York Genome Center; NW, Northwest Genomics Center; ICS, inhaled corticosteroid; LABA, long-acting β-agonist; ppb, parts per billion or µg/m3; TOPMed, Trans-Omics for Precision Medicine.

Figure 1

Manhattan and LocusZoom plots from genome-wide association study of lung function. (A) Manhattan plot from genome-wide association study of lung function using linear regression on the ENCORE server. FEV1.res.rnorm was used as the phenotype for the association testing. Red horizontal line: CODA-adjusted genome-wide significance P-value of 2.10 × 10−8. Blue horizontal line: CODA-adjusted suggestive significance P-value of 4.19 × 10−7. (B) LocusZoom plot of rs73429450 (chr12: 88846435) and 500-kb flanking region. Colors show linkage disequilibrium in the study population. chr12, chromosome 12; FEV1, forced expiratory volume in the first second of exhalation.

Secondary analysis that included covariates correcting for smoking status and number of smokers in the family showed that smoking-related factors were not significantly associated with FEV1 in our pediatric SAGE II cohort: using 657 out of 867 individuals with available smoking-related covariates, the FEV1.res.rnorm association P-values before and after including the smoking-related covariates were 2.01 × 10−6 and 1.89 × 10−6. Neither smoking status (P = 0.27) nor number of smokers in the family (P = 0.54) was a significant covariate. Conditional analysis was performed on 45 variants with association P < 1 × 10−4 located within 1 Mb of the strongest association signal (rs73429450).
Two weaker independent signals (rs17016065 and rs58475486) were identified (Table S2). None of the 45 variants showed association with FEV1.res.rnorm in 251 SAGE II children without asthma (Table S3). The minor allele frequency of rs73429450 in continental populations from the 1000G is 3% in Africans and < 1% in Admixed Americans, Europeans, and Asians (1000 Genomes Project Consortium et al. 2015). SNP rs73429450 was not included on the Affymetrix LAT 1 genotyping array on which SAGE II participants were previously genotyped. To determine if the rs73429450 association with FEV1 was only identifiable using WGS data, we attempted to reproduce our results by imputing the genotype of rs73429450 in 851 SAGE II participants with available array data using 1000G phase 3 (n = 2504), HRC r1.1 (n = 32,470), CAAPA (n = 883), and TOPMed freeze 5 (n = 62,784) reference panels. Our results remained statistically significant when using the 1000G phase 3 (P = 4.97 × 10−8, β = 0.79, imputation R2 = 0.95) and TOPMed freeze 5 (P = 1.22 × 10−8, β = 0.80, imputation R2 = 0.98) reference panels, but lost statistical significance when rs73429450 genotypes were imputed using the HRC (P = 4.35 × 10−7, β = 0.68, imputation R2 = 0.94) and CAAPA (P = 1.95 × 10−7, β = 0.80, imputation R2 = 0.71) reference panels. Region-based association analysis including all variants conditioned on the association signal from rs73429450 was performed in its 1-Mb flanking region (chromosome 12:87846435-89846435). No windows were significantly associated after Bonferroni multiple testing correction (P < 2.80 × 10−4, Figure S2), but 20 windows were suggestively associated with FEV1.res.rnorm (P < 5.60 × 10−3, Table S5). Two of 20 windows retested using only functional variants were suggestively significant (regions 4 and 16). Both of these windows were no longer suggestively significant after removing the common variants, indicating that association signals from these regions were mostly driven by common variants.
Further investigation of region 16 using drop-one analysis of the two rare and one common functional variants confirmed the major contribution by the common variant, rs1895710, as shown by the major increase in P-value when it was dropped (Table S6). The singleton rs990979778 also made a small contribution to the signal. Drop-one analysis was not performed on region 4 because there was only one common and one rare variant. A Hi-C assay couples a chromosome conformation capture assay with next-generation sequencing to capture long-range interactions in the genome. We identified a statistically significant long-range chromatin interaction between the GWAS peak and the KITLG [also known as stem cell factor (SCF)] gene in human fetal lung fibroblast cell line IMR90 (Table S7). The long-range interaction detected in human primary lung tissue was not significant, suggesting that the potential long-range interactions are specific to tissue type or developmental stage.

Potential regulatory role of FEV1-associated variants on KITLG expression

To further elucidate potential regulatory relationships between the GWAS association peak and KITLG, we analyzed whether variants in the peak were eQTL of KITLG in previously published whole-blood RNA-Seq data available from the same study participants (Mak ). However, the whole blood RNA-Seq data did not yield evidence of expressed KITLG, consistent with results in GTEx. We subsequently used RNA-Seq data from nasal epithelial cells of 370 Puerto Rican children with asthma from the GALA II study, and found that 5 out of 45 variants were eQTL of KITLG (Table S8). While Puerto Ricans are a different population compared with African Americans, they are both admixed populations with substantial African genetic ancestry, and therefore could share eQTLs. All five eQTLs corresponded to one signal in a region with strong LD (r2 > 0.8, Figure S3).

Replication of genetic association with FEV1

Subject characteristics of our four replication cohorts (SAPPHIRE, GCPD-A, UK Biobank, and GALA II) are shown in Table S9. We attempted to replicate the association of the 45 SNPs in our primary FEV1 GWAS in each cohort. We used 0.05 as the suggestive P-value threshold and 0.0167 as the Bonferroni-corrected P-value threshold after correcting for three independent signals (see conditional analysis in the Results section). A total of 20 variants were replicated at P < 0.05 with consistent direction of effect in black UK Biobank participants; 14 variants in SAPPHIRE and 2 variants in GCPD-A were significant but had an opposite direction of effect (Table S10). We attempted to replicate the FEV1.res.rnorm association in Mexican American (n = 649) and Puerto Rican (n = 1109) children with asthma from the GALA II study. In Mexican Americans, we excluded 19 variants with MAF < 0.1% and associations for the remaining 26 variants did not replicate (Table S11). In Puerto Ricans, the associations were not replicated but we observed the same protective effect in 38 of the 45 variants in the locus (Table S11).
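The Bonferroni-corrected replication threshold quoted above is the family-wise significance level divided by the number of independent tests. A minimal sketch of that arithmetic, assuming a family-wise α of 0.05; the function name `bonferroni_threshold` is illustrative and is not taken from the study's code:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Replication: three independent signals from the conditional analysis.
replication_threshold = bonferroni_threshold(0.05, 3)
print(f"{replication_threshold:.4f}")  # 0.0167
```

The same function applied to nine tests (three signals by three SO2 exposure windows) gives 0.0056, rounded to four decimal places.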

Incorporating statistical and functional evidence for candidate variant prioritization

We combined and summarized all functional evidence for the top 45 variants, along with eQTL findings from nasal epithelial RNA-Seq and replication results (Figure 2, Table 2, and Table S12). To facilitate interpretation of the variant association with FEV1, the effect sizes and P-values of both FEV1 (β and P) and FEV1.res.rnorm (βnorm and Pnorm) associations are also reported. Combined Annotation Dependent Depletion (CADD) functional prediction score and ENCODE histone modification ChIP-Seq peaks in embryonic, blood, bone marrow, and lung-related tissues were also examined, but not reported because none of the variants had a CADD score > 10 and none overlapped with histone modification sites. SNP rs73440122 received the highest priority score of 3 based on replication in the UK Biobank, overlap with a DNase I hypersensitivity site in B lymphoblastoid cells (GM12865), and overlap with an SPI1-binding site in acute promyelocytic leukemia cells. Eight other variants were prioritized with a score ≥ 2 or evidence of being an eQTL for KITLG in nasal epithelial cells (Table 2, score marked with ^ or #, respectively). These nine candidate variants were selected for gene-by-air-pollution interaction analyses.
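The prioritization rule can be written as a simple filter. The sketch below is an illustration rather than the authors' pipeline. It assumes the cut-off is a priority score of at least 2, which matches the "two or more pieces of evidence" wording of the Figure 2 legend. The `Variant` class and `is_candidate` function are hypothetical names, and the four example rows use scores and eQTL flags as listed in Table 2:

```python
from dataclasses import dataclass


@dataclass
class Variant:
    rsid: str
    score: int        # priority score (Table 2, "Score" column)
    nasal_eqtl: bool  # eQTL of KITLG in nasal epithelial cells ("#" in Table 2)


def is_candidate(v: Variant) -> bool:
    """Assumed rule: keep a variant with score >= 2 or a nasal eQTL."""
    return v.score >= 2 or v.nasal_eqtl


# Four of the 45 variants, for illustration only.
variants = [
    Variant("rs73440122", 3, False),
    Variant("rs58475486", 2, False),
    Variant("rs17016065", 1, True),
    Variant("rs73429450", 1, False),
]

candidates = [v.rsid for v in variants if is_candidate(v)]
print(candidates)  # ['rs73440122', 'rs58475486', 'rs17016065']
```

Under this rule the lead variant rs73429450 is not selected, because it has a score of 1 and is not a nasal eQTL.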
Figure 2

Integration of statistical and functional evidence for variant prioritization. Numbers and different shades of black in the LD plot represent LD in R2. The three independent signals identified in the conditional analysis are marked with “*”. Insertions/deletions are marked with “&”. Nasal eQTL, eQTLs of KITLG in nasal epithelial cells. ENCODE, DNase I hypersensitivity site and/or transcription factor ChIP-Seq peaks overlapping with the variants. UK Biobank, SAPPHIRE, and GCPD-A: replication results using blacks in the UK Biobank and African Americans in the SAPPHIRE and GCPD-A cohorts (R = replicated at P < 0.05; F = flip-flop association at P < 0.05). Candidate, candidate variants prioritized because of presence of two or more pieces of evidence, or because they are a nasal eQTL. + indicates presence of evidence. Boxes in the top panel were shaded gray if results were not available. ccREs, candidate cis-regulatory elements in SCREEN registry; ChIP-Seq, chromatin immunoprecipitation sequencing; ENCODE, Encyclopedia of DNA Elements; eQTL, expression QTL; GCPD-A, Genetics of Complex Pediatric Disorders study; LD, linkage disequilibrium; SAPPHIRE, Study of Asthma Phenotypes and Pharmacogenomic Interactions by Race-Ethnicity; SCREEN, Search Candidate cis-Regulatory Elements by ENCODE.

Table 2

Genome-wide lung function association in SAGE II children with asthma

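| M | rsID | Alt | β | P | βnorm | Pnorm | MAF | 1000 Genomes ALL | 1000 Genomes AFR | 1000 Genomes AMR | 1000 Genomes EUR | Score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | rs11835305 | T | 0.126 | 1.93E-05 | 0.320 | 3.69E-05 | 0.104 | 0.036 | 0.119 | 0.012 | 0.001 | 0 |
| 2 | rs17015963 | C | 0.126 | 1.93E-05 | 0.320 | 3.69E-05 | 0.104 | 0.036 | 0.120 | 0.012 | 0.001 | 0 |
| 3 | rs58475486 (a) | T | 0.127 | 1.45E-05 | 0.323 | 2.81E-05 | 0.105 | 0.037 | 0.123 | 0.012 | 0.001 | 2^ |
| 4 | rs17015979 | T | 0.127 | 1.45E-05 | 0.323 | 2.81E-05 | 0.105 | 0.037 | 0.123 | 0.012 | 0.001 | 0 |
| 5 | rs57692452 | C | 0.245 | 1.63E-06 | 0.654 | 1.06E-06 | 0.033 | 0.011 | 0.030 | 0.006 | 0.001 | 2^ |
| 6 | rs112585732 | T | 0.235 | 4.35E-07 | 0.625 | 3.06E-07 | 0.041 | 0.016 | 0.050 | 0.006 | 0.001 | 0 |
| 7 | rs113837356 | T | 0.270 | 3.44E-06 | 0.719 | 2.49E-06 | 0.025 | 0.008 | 0.027 | 0.006 | 0.001 | 1 |
| 8 | rs61441836 | G | 0.252 | 8.19E-07 | 0.671 | 5.46E-07 | 0.033 | 0.010 | 0.030 | 0.006 | 0.001 | 1 |
| 9 | rs73438172 | A | 0.252 | 8.19E-07 | 0.671 | 5.46E-07 | 0.033 | 0.010 | 0.030 | 0.006 | 0.001 | 1 |
| 10 | rs1044043958 (b) | A | 0.270 | 3.44E-06 | 0.719 | 2.49E-06 | 0.025 | — | — | — | — | 0 |
| 11 | rs73438182 | G | 0.138 | 1.68E-04 | 0.378 | 9.18E-05 | 0.064 | 0.020 | 0.068 | 0.006 | 0.001 | 0 |
| 12 | rs73438185 | A | 0.138 | 1.68E-04 | 0.378 | 9.18E-05 | 0.064 | 0.020 | 0.068 | 0.006 | 0.001 | 0 |
| 13 | rs73438188 | A | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.010 | 0.027 | 0.006 | 0.001 | 1 |
| 14 | rs73438190 | C | 0.181 | 1.86E-05 | 0.486 | 1.22E-05 | 0.048 | 0.016 | 0.047 | 0.006 | 0.001 | 0 |
| 15 | rs73438195 | A | 0.181 | 1.86E-05 | 0.486 | 1.22E-05 | 0.048 | 0.016 | 0.047 | 0.006 | 0.001 | 0 |
| 16 | rs111857459 | T | 0.181 | 1.86E-05 | 0.486 | 1.22E-05 | 0.048 | 0.016 | 0.047 | 0.006 | 0.001 | 0 |
| 17 | rs144369986 (b) | T | 0.285 | 1.21E-06 | 0.756 | 9.44E-07 | 0.025 | 0.008 | 0.026 | 0.006 | 0.001 | 0 |
| 18 | rs73440106 | G | 0.181 | 1.86E-05 | 0.486 | 1.22E-05 | 0.048 | 0.016 | 0.047 | 0.006 | 0.001 | 0 |
| 19 | rs73440107 | A | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.010 | 0.027 | 0.006 | 0.001 | 1 |
| 20 | rs111453514 | C | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.010 | 0.027 | 0.006 | 0.001 | 1 |
| 21 | rs73440112 | T | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.010 | 0.027 | 0.006 | 0.001 | 1 |
| 22 | rs73440115 | G | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.011 | 0.028 | 0.006 | 0.001 | 1 |
| 23 | rs11312747 (b) | A | 0.133 | 1.43E-05 | 0.357 | 8.51E-06 | 0.100 | 0.036 | 0.121 | 0.010 | 0.001 | 0 |
| 24 | rs73440120 | A | 0.285 | 1.21E-06 | 0.756 | 9.44E-07 | 0.025 | 0.008 | 0.026 | 0.006 | 0.001 | 1 |
| 25 | rs111289668 | G | 0.297 | 6.97E-08 | 0.792 | 4.42E-08 | 0.028 | 0.010 | 0.027 | 0.006 | 0.001 | 2^ |
| 26 | rs73440122 | C | 0.292 | 2.08E-07 | 0.775 | 1.55E-07 | 0.027 | 0.011 | 0.030 | 0.006 | 0.001 | 3^ |
| 27 | rs73440123 | G | 0.292 | 2.08E-07 | 0.775 | 1.55E-07 | 0.027 | 0.011 | 0.030 | 0.006 | 0.001 | 1 |
| 28 | rs17016065 (a) | G | 0.112 | 3.19E-06 | 0.296 | 2.54E-06 | 0.177 | 0.075 | 0.217 | 0.017 | 0.009 | 1# |
| 29 | rs17016066 | A | 0.112 | 3.19E-06 | 0.296 | 2.54E-06 | 0.177 | 0.075 | 0.217 | 0.017 | 0.009 | 1# |
| 30 | rs147400083 (b) | T | 0.112 | 3.19E-06 | 0.296 | 2.54E-06 | 0.177 | — | — | — | — | 0 |
| 31 | rs866852270 | T | 0.112 | 3.19E-06 | 0.296 | 2.54E-06 | 0.177 | — | — | — | — | 0 |
| 32 | rs141293300 (b) | C | 0.292 | 2.08E-07 | 0.775 | 1.55E-07 | 0.027 | 0.011 | 0.030 | 0.006 | 0.001 | 1 |
| 33 | rs1398303 | A | 0.104 | 1.22E-05 | 0.274 | 1.12E-05 | 0.186 | 0.077 | 0.223 | 0.020 | 0.009 | 1# |
| 34 | rs61924868 | T | 0.104 | 1.24E-05 | 0.275 | 1.14E-05 | 0.185 | 0.078 | 0.223 | 0.020 | 0.009 | 1# |
| 35 | rs73440134 | T | 0.292 | 2.08E-07 | 0.775 | 1.55E-07 | 0.027 | 0.011 | 0.030 | 0.006 | 0.001 | 1 |
| 36 | rs73429413 | G | 0.292 | 2.08E-07 | 0.775 | 1.55E-07 | 0.027 | 0.011 | 0.030 | 0.006 | 0.001 | 1 |
| 37 | rs73429415 | A | 0.096 | 5.13E-05 | 0.253 | 4.84E-05 | 0.189 | 0.078 | 0.225 | 0.022 | 0.009 | 1# |
| 38 | rs112449284 | T | 0.242 | 4.64E-06 | 0.640 | 3.87E-06 | 0.031 | 0.012 | 0.035 | 0.007 | 0.001 | 1 |
| 39 | rs111981782 | C | 0.296 | 5.78E-08 | 0.786 | 4.09E-08 | 0.029 | 0.012 | 0.033 | 0.006 | 0.001 | 1 |
| 40 | rs150942400 | T | 0.293 | 6.01E-08 | 0.780 | 4.01E-08 | 0.029 | 0.012 | 0.034 | 0.006 | 0.002 | 1 |
| 41 | rs147527487 | C | 0.086 | 8.49E-05 | 0.226 | 8.14E-05 | 0.205 | 0.095 | 0.249 | 0.016 | 0.005 | 0 |
| 42 | rs111243672 | A | 0.258 | 6.41E-07 | 0.690 | 3.99E-07 | 0.032 | 0.014 | 0.037 | 0.007 | 0.004 | 1 |
| 43 | rs73429450 (a) | A | 0.302 | 1.26E-08 | 0.801 | 9.01E-09 | 0.031 | 0.012 | 0.033 | 0.009 | 0.002 | 1 |
| 44 | rs758775577 | C | 0.217 | 2.22E-06 | 0.574 | 1.85E-06 | 0.041 | — | — | — | — | 0 |
| 45 | rs142679473 (b) | C | 0.285 | 6.30E-08 | 0.756 | 4.62E-08 | 0.031 | 0.012 | 0.033 | 0.009 | 0.002 | 0 |

Score, priority score based on statistical and functional evidence, which is reported in Table S12. M, marker number that corresponds to those in Figure 2 and Table S12. Candidate variants were prioritized if they had a priority score of ≥ 2 (^) or if they were expression QTL of KITLG in nasal epithelial cells (#). β (P) and βnorm (Pnorm) are the effect sizes (P-values) of the genetic associations of the alternate allele (Alt) with FEV1 and FEV1.res.rnorm, respectively. ALL/AFR/AMR/EUR, 1000 Genomes minor allele frequency from all/African/American/European populations. —, not available. MAF, minor allele frequency.

(a) The three independent signals identified in the conditional analyses.

(b) Insertions/deletions.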


Gene-by-air-pollution interaction of rs58475486

We previously found that first year of life and lifetime exposure to SO2 were associated with FEV1 in African American children (Neophytou ). We investigated whether the effect of the nine prioritized genetic variants associated with lung function varied by SO2 exposure (first year of life, past-year, and lifetime exposure). Since the nine variants represent three independent signals (see conditional analysis in the Results section), the Bonferroni-corrected P-value threshold was set to P = 0.0056 (correction for nine tests; three signals and three exposure periods to SO2). We observed a single statistically significant interaction between the T allele of rs58475486 and past-year exposure to SO2 that was positively associated with FEV1 (P = 0.003, β = 0.32; Figure 3A and Table 3). This interaction remained significant (P = 0.003, β = 0.32) in secondary analyses adjusted for smoking status, or a multiplicative interaction term of rs58475486 and smoking status as additional covariates. Interestingly, six of the remaining eight variants also displayed interaction effects with past-year exposure to SO2 that were suggestively associated (P < 0.05) with FEV1 (Table 3). We also found a suggestive interaction of the C allele of rs73440122 with first-year exposure to SO2 that was associated with decreased FEV1 (P = 0.045, β = −0.32; Figure 3B). The same allele also showed interaction with past year of exposure to SO2 that was suggestively associated with FEV1 in the opposite direction (P = 0.051, β = 0.39).
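One way to read the interaction coefficients is through the exposure slope they imply for each genotype. The sketch below assumes a linear model with additive allele coding, FEV1 = b0 + bG·G + bE·E + bGxE·G·E + covariates. That model form is an assumption made for illustration and is not stated in this section. The function name `exposure_slope` is hypothetical, the coefficients are the rs58475486_T past-year SO2 values from Table 3, and slopes are in the model's own units:

```python
def exposure_slope(beta_e: float, beta_gxe: float, copies: int) -> float:
    """Slope of FEV1 on exposure for a carrier of `copies` effect alleles,
    under the assumed model FEV1 = b0 + bG*G + bE*E + bGxE*G*E + covariates."""
    return beta_e + beta_gxe * copies


# Table 3, rs58475486_T with past-year SO2: exposure beta = 0.05, GxE beta = 0.32.
slopes = {g: round(exposure_slope(0.05, 0.32, g), 2) for g in (0, 1, 2)}
print(slopes)  # {0: 0.05, 1: 0.37, 2: 0.69}
```

Because the exposure main effect for past-year SO2 was not itself significant (P = 0.362), the slope for non-carriers is best read as approximately flat; the point of the sketch is only that each additional T allele shifts the exposure slope by the interaction coefficient.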
Figure 3

Gene-by-environment interaction analysis on FEV1. FEV1 residuals, residuals after FEV1 was regressed on the covariates age, sex, height, controller medications, sequencing centers, and the first five genetic PCs. FEV1 residuals were plotted against (A) past-year exposure to SO2 stratified by the number of copies of T allele of rs58475486 and (B) first year of life exposure to SO2 stratified by the number of copies of C allele of rs73440122. FEV1, forced expiratory volume in the first second of exhalation; PC, principal component.

Table 3

Gene-and-environment analysis on FEV1

| Variant | Exposure | n | Variant β | Variant P | Exposure β | Exposure P | GxE β | GxE P |
|---|---|---|---|---|---|---|---|---|
| rs58475486_T (1) | SO2 first year | 640 | 0.13 | 6.62E-06 | −0.05 | 0.003 | −0.03 | 0.658 |
| rs57692452_C (2) | SO2 first year | 640 | 0.25 | 1.16E-06 | −0.05 | 0.003 | −0.24 | 0.091 |
| rs111289668_G (2) | SO2 first year | 640 | 0.31 | 2.53E-08 | −0.05 | 0.003 | −0.27 | 0.079 |
| rs73440122_C (2, a) | SO2 first year | 640 | 0.31 | 7.78E-08 | −0.05 | 0.003 | −0.32 | 0.045 |
| rs17016065_G (3) | SO2 first year | 640 | 0.11 | 8.82E-06 | −0.05 | 0.003 | −0.08 | 0.108 |
| rs17016066_A (3) | SO2 first year | 640 | 0.11 | 8.82E-06 | −0.05 | 0.003 | −0.08 | 0.108 |
| rs1398303_A (3) | SO2 first year | 640 | 0.10 | 3.15E-05 | −0.05 | 0.003 | −0.09 | 0.082 |
| rs61924868_T (3) | SO2 first year | 640 | 0.10 | 3.30E-05 | −0.05 | 0.003 | −0.08 | 0.088 |
| rs73429415_A (3) | SO2 first year | 640 | 0.09 | 1.09E-04 | −0.05 | 0.003 | −0.09 | 0.069 |
| rs58475486_T (1, b) | SO2 past year | 661 | 0.13 | 6.62E-06 | 0.05 | 0.362 | 0.32 | 0.003 |
| rs57692452_C (2) | SO2 past year | 661 | 0.25 | 1.16E-06 | 0.05 | 0.362 | 0.29 | 0.100 |
| rs111289668_G (2, a) | SO2 past year | 661 | 0.31 | 2.53E-08 | 0.05 | 0.362 | 0.41 | 0.037 |
| rs73440122_C (2) | SO2 past year | 661 | 0.31 | 7.78E-08 | 0.05 | 0.362 | 0.39 | 0.051 |
| rs17016065_G (3, a) | SO2 past year | 661 | 0.11 | 8.82E-06 | 0.05 | 0.362 | 0.20 | 0.026 |
| rs17016066_A (3, a) | SO2 past year | 661 | 0.11 | 8.82E-06 | 0.05 | 0.362 | 0.20 | 0.026 |
| rs1398303_A (3, a) | SO2 past year | 661 | 0.10 | 3.15E-05 | 0.05 | 0.362 | 0.21 | 0.021 |
| rs61924868_T (3, a) | SO2 past year | 661 | 0.10 | 3.30E-05 | 0.05 | 0.362 | 0.21 | 0.023 |
| rs73429415_A (3, a) | SO2 past year | 661 | 0.09 | 1.09E-04 | 0.05 | 0.362 | 0.20 | 0.026 |
| rs58475486_T (1) | SO2 lifetime | 661 | 0.13 | 6.62E-06 | −0.13 | 0.001 | 0.26 | 0.173 |
| rs57692452_C (2) | SO2 lifetime | 661 | 0.25 | 1.16E-06 | −0.13 | 0.001 | 0.47 | 0.221 |
| rs111289668_G (2) | SO2 lifetime | 661 | 0.31 | 2.53E-08 | −0.13 | 0.001 | 0.32 | 0.444 |
| rs73440122_C (2) | SO2 lifetime | 661 | 0.31 | 7.78E-08 | −0.13 | 0.001 | 0.29 | 0.489 |
| rs17016065_G (3) | SO2 lifetime | 661 | 0.11 | 8.82E-06 | −0.13 | 0.001 | −0.19 | 0.143 |
| rs17016066_A (3) | SO2 lifetime | 661 | 0.11 | 8.82E-06 | −0.13 | 0.001 | −0.19 | 0.143 |
| rs1398303_A (3) | SO2 lifetime | 661 | 0.10 | 3.15E-05 | −0.13 | 0.001 | −0.17 | 0.184 |
| rs61924868_T (3) | SO2 lifetime | 661 | 0.10 | 3.30E-05 | −0.13 | 0.001 | −0.17 | 0.199 |
| rs73429415_A (3) | SO2 lifetime | 661 | 0.09 | 1.09E-04 | −0.13 | 0.001 | −0.16 | 0.207 |

n, sample sizes for the gene-by-SO2 interaction analysis. Numbers 1 to 3 in parentheses in the variant column, variants that are in linkage disequilibrium with the three independent signals, rs58475486, rs73429450, and rs17016065, respectively. β (P), effect sizes (P-values) from the main effects of the variants, exposure, and GxE interaction. GxE, gene–environment interaction.

(a) GxE P < 0.05.

(b) GxE P < Bonferroni P-value of 0.0056.


Discussion

Variant rs73429450 (MAF = 0.030) was identified as the strongest association signal with FEV1. Each additional copy of the protective A allele of rs73429450 was associated with a 0.3 L increase in FEV1. We did not find any statistically significant contribution of rare variants to the association signal in a 1-kb sliding window analysis of the 1-Mb flanking region centered on rs73429450. We were surprised to identify a novel common variant (MAF = 0.030) associated with lung function using WGS data in a population that was previously analyzed for associations with lung function using genotype array data. Further investigation revealed that our discovered variant, rs73429450, was not captured by the LAT 1 genotyping array, and the association with lung function depended on the reference panel used to impute the variant into our population. More surprisingly, our statistically significant finding was only suggestively significant using data imputed from the CAAPA reference panel (P = 1.95 × 10−7, β = 0.80). Of the imputation reference panels that we assessed, CAAPA is one of the more relevant reference panels for our study population because it is based on African populations in the Americas. However, we note that the effect size estimated from CAAPA-imputed data was comparable to that generated from WGS data. While WGS data are usually praised for enabling analysis of rare-variant contributions to phenotype variability, our results show the utility of WGS data for the reliable analysis of common variants as well when relevant imputation panels are lacking. Although rs73429450 had the lowest P-value from our WGS association analysis, we did not find the required amount of functional evidence to prioritize this marker for inclusion in downstream gene-by-air-pollution analyses.
Another variant, rs73440122, was in moderate-to-strong LD (r2 = 0.76) with rs73429450 and had a similar MAF (0.027) in our study population, but was only suggestively associated with FEV1 in our association analysis (P = 2.08 × 10−7; Table 2). In contrast to rs73429450, there were multiple lines of evidence suggesting the functional relevance of rs73440122: rs73440122 received the highest priority score based on its replicated FEV1 association in black UK Biobank participants and overlap with ENCODE gene regulatory regions, making it one of the most likely drivers of FEV1 variability among individuals, possibly mediated through KITLG. Bioinformatic interrogation of rs73440122 revealed that the variant overlapped with a ccRE (SCREEN accession EH37E0279310), DNase I hypersensitivity site, and SPI1 ChIP-Seq clusters that were indicative of a candidate open chromatin gene regulatory region (Table S12). The binding evidence of SPI1 is highly relevant to the role of KITLG in type 2 inflammation (see below). Variant rs73440122 is located in a region that physically interacts with KITLG based on Hi-C data in fetal lung fibroblast cells. Additionally, five neighboring FEV1 associated variants were identified as eQTLs of KITLG, although they appeared to be an independent signal (r2 < 0.2). Overall, these results support regulatory interactions between our novel locus and KITLG. Atopic or type 2 high asthma is the most common form of asthma in children (Comberiati ). KITLG, more commonly known as SCF, is a ligand of the KIT tyrosine kinase receptor. It plays an important role in type 2 inflammation in atopic asthma, especially in inflammatory processes mediated through mast cells, IgE, and group 2 innate lymphoid cells (Oliveira and Lukacs 2003; Da Silva and Frossard 2005; Da Silva ; Fonseca ). 
In the airways, KITLG is expressed in bronchial epithelial cells, lung fibroblasts, bronchial smooth muscle cells, endothelial cells, peripheral blood eosinophils, dendritic cells, and mast cells (Valent ; Wen ; Kassel ; Hsieh ; Oriss ). KITLG is a major growth factor of mast cells [reviewed in Galli , 1995), Broudy (1997), and Da Silva )]. It promotes recruitment of mast cell progenitors into tissues [reviewed in Oliveira and Lukacs (2003)], prevents mast cell apoptosis (Mekori ; Iemura ), and promotes release of inflammatory mediators such as proteases, histamine, chemotactic factors, and cytokines [reviewed in Borish and Joseph (1992) and Amin (2012)]. While KITLG promotes the production of cytokines like IL-13 upon IgE-receptor cross-linking on the surface of mast cells (Kobayashi ), IL-13 has also been reported to upregulate KITLG (Rochman ). Consistent with the critical role of KITLG for mast cells and type 2 inflammation, we found our prioritized variant, rs73440122, overlapped with an SPI1 (also known as PU.1) ChIP-Seq cluster. The transcription factor SPI1 was demonstrated in SPI1 knockout mice to be necessary for the development of B cells, T cells, neutrophils, macrophages, dendritic cells, and mast cells (Scott , 1997; McKercher ; Anderson ; Guerriero ; Walsh ). It plays an essential role in macrophage differentiation in asthmatic and other allergic inflammation (Qian ; Yashiro ). It was also shown to regulate the cell fate between mast cells and monocytes (Nishiyama ,b; Ito , 2009). The presence of an SPI1 binding site in a candidate regulatory region of KITLG is therefore highly relevant given the critical role of KITLG in mast cell survival and activation. Higher levels of KITLG (Al-Muhsen ; Da Silva ; Tayel ) and an increased number of mast cells in the lung (Fajt and Wenzel. 2013; Cruse and Bradding. 2016; Méndez-Enriquez and Hallgren 2019) were detected in individuals with asthma. 
The percentage of a subpopulation of circulating blood mast cell progenitors (Lin+ CD34hi CD117int/hi FcεRI+) was higher in individuals with reduced lung function (Dahlin ). These findings suggested that higher KITLG expression and/or number of mast cells may be a contributing factor to lower lung function. This notion was inconsistent with the association of our novel locus with higher KITLG expression and increased lung function in SAGE II children with asthma. Interestingly, a study of 20 subjects with severe asthma found that an increase in the number of chymase-positive mast cells in the small airway was associated with increased lung function (Balzar ). Overall, while there is still controversy on the direction of effect, previous findings support the association of our novel KITLG locus with lung function, especially in patients with allergic asthma. Our novel locus likely represents part of a complex regulatory mechanism that modulates immune cell differentiation, survival, and activation in highly cell-specific and context-dependent manners. Further studies are required to study how this locus is regulated in different airway and immune cells to affect lung function outcome in the context of asthma. GxE interactions likely account for a portion of the “missing” heritability of many complex phenotypes (Moore and Williams 2009). We previously found that lung function in SAGE II participants was associated with first year of life and lifetime exposures to SO2 [1.66% decrease (95% C.I. = −2.92 to −0.37) for first year of life and 5.30% decrease (95% C.I. = −8.43 to −2.06) for lifetime exposures in FEV1 per 1 ppb increases in SO2] (Neophytou ). We hypothesized that a significant portion of the heritability of lung function was due, in part, to gene-by-air-pollution (SO2) interaction effects. The interaction between rs58475486 and past-year exposure to SO2 that was significantly associated with lung function supports our hypothesis. 
The T allele of rs58475486 is common (8–14%) in African populations and showed a protective effect on lung function in the presence of past-year SO2 exposure. SNP rs58475486 is located in a ccRE (SCREEN accession EH37E0279296) and a FOXA1 binding site in the A549 lung adenocarcinoma cell line. FOXA1 has a known compensatory role with FOXA2 during lung morphogenesis in mice (Wan ). Deletion of both FOXA1 and FOXA2 inhibited cell proliferation, epithelial cell differentiation, and branching morphogenesis in fetal lung tissue. Further functional validation of the effect of rs58475486 on the binding affinity of FOXA1 is necessary to confirm whether the role of FOXA1 in this ccRE is important for KITLG regulation and lung function. The higher frequencies of the protective alleles of both rs73440122 and rs58475486 in African populations appear to contradict previous findings that African ancestry is associated with lower lung function (Kumar ). One possible explanation for this seeming inconsistency is that FEV1 is a complex trait, whose variation is influenced by many genetic variants of small-to-moderate effect sizes whose influences on lung function may vary by exposure to environmental factors. We found suggestive evidence that the interaction between rs73440122 and first-year exposure to SO2 reverses the positive association of rs73440122 with lung function to a negative one (Table 3). When assessed independently, our genetic association analysis showed that the protective C allele of rs73440122 was associated with higher lung function. However, with increasing levels of SO2 exposure in the first year of life, increasing copies of the C allele of rs73440122 were associated with decreased lung function.
Air pollution is known to negatively impact lung function, and we have previously shown that the deleterious effects of air pollution on lung phenotypes may be significantly increased in African American children compared to other populations experiencing the same amount of exposure (Nishimura ). It has also been reported that Latino and African American populations often live in neighborhoods with high levels of air pollution (Mott 1995). The increased susceptibility to negative pulmonary effects from air pollution exposure coupled with the disproportionate exposure to air pollution experienced by the African American population may also contribute to the lower lung function seen in this population, despite the presence of protective alleles. The overlap of the SPI1 binding site with rs73440122 further supports gene-by-SO2 interaction at this locus, since SPI1 plays a critical role in the development of type 2 inflammation in the airways through macrophage polarization (Qian ). We noted that the rs73440122 C allele also showed an interaction approaching the suggestive threshold with past-year exposure to SO2 that was positively associated with FEV1. The difference is not surprising because age of exposure may significantly impact the effect of air pollution on lung function [reviewed in Usemann ]. Further studies are required to better understand the effect of this suggestive interaction on lung function. One strength of this study is the interrogation of independent lung function-associated signals at our novel locus. We identified evidence of three independent signals: the replicated signal that showed evidence of regulatory functions (an open chromatin region with an SPI1/PU.1 binding site), one signal that showed a statistically significant gene-by-SO2 interaction on lung function, and one signal that represents KITLG eQTLs in the nasal epithelial cells together with suggestive gene-by-SO2 interaction.
Our results offer a glimpse of the complicated genetic architecture behind complex traits. One limitation of this study is that the FEV1 genetic association and the eQTL analyses with KITLG were performed in different populations due to data availability constraints. Although we did not have RNA-Seq data from lung tissues from our study subjects, we previously demonstrated that there is a high degree of overlap in gene expression profiles between nasal and bronchial epithelial cells (Poole ). The direction of effect of the association was the same in GALA II Puerto Rican children with asthma but not statistically significant. This may be due to: (1) the lower minor allele frequency in Puerto Ricans, who have significantly lower African ancestry than African Americans, and (2) the modest sample size of the replication study, which, combined with the weak effect of the protective allele, can leave true associations undetected (Altshuler ). We replicated 20 of 45 variants in black UK Biobank subjects and observed conflicting “flip-flop” associations in African Americans from the SAPPHIRE and GCPD-A studies. In the past, flip-flop associations were deemed spurious results. The traditional association-testing approach studies the effect of each variant on phenotype independently, which increases the chance of flip-flop associations being detected between studies. Differences in study design, sampling variation that leads to variation in LD patterns, and lack of consideration of other disease-influencing genetic and/or environmental factors are all potential causes of flip-flop associations (Lin ; Kraft ). Hence, it is not surprising that flip-flop associations were observed when gene and environment interactions were detected at our FEV1 GWAS locus. It was previously shown that flip-flop associations can occur between and within populations, even in the presence of a genuine genetic effect (Lin ; Kraft ).
Further functional analysis is thus required to validate the relationship between the candidate variants, KITLG, and FEV1. This may include reporter assays to validate potential enhancer or repressor activity, and clustered regularly interspaced short palindromic repeats (CRISPR)-based editing assays to validate the regulatory role of the candidate variants on KITLG. Although literature exists describing KIT signaling for lung function in mice (Lindsey ), additional knockout experiments in a model animal system are necessary to study how KITLG contributes to variation in lung function. The average concentration of ambient SO2 exposure in our participants (Table 1) was lower than National Ambient Air Quality Standards. It is possible that SO2 acted as a surrogate for other unmeasured toxic pollutants emitted from local point sources. Major sources of SO2 in the San Francisco Bay Area during the recruitment years of 2006–2011 include airports, petroleum refineries, gas and oil plants, calcined petroleum coke plants, electric power plants, cement manufacturing factories, chemical plants, and landfills (United States Environmental Protection Agency 2008, 2011). The Environmental Protection Agency’s national emissions inventory data also show that these facilities emit volatile organic compounds, heavy metals (lead, mercury, chromium, and arsenic), formaldehyde, ethyl benzene, acrolein, 1,3-butadiene, 1,4-dichlorobenzene, and tetrachloroethylene into the air along with SO2. These chemicals are highly toxic and inhaling even a small amount may contribute to poor lung function. Another possibility is that exposure to SO2 captures unmeasured confounding socioeconomic factors. This study identified a novel protective allele for lung function in African American children with asthma. The protective association with lung function intensified with increased past-year exposure to SO2.
Our findings showcase the complexity of the relationship between genetic and environmental factors impacting variation in FEV1, highlight the utility of WGS data for genetic research of complex phenotypes, and underscore the importance of including diverse study populations in our exploration of the genetic architecture underlying lung function.
  112 in total

1.  PU.1 is required for myeloid-derived but not lymphoid-derived dendritic cells.

Authors:  A Guerriero; P B Langmuir; L M Spain; E W Scott
Journal:  Blood       Date:  2000-02-01       Impact factor: 22.113

2.  Familial aggregation and heritability of adult lung function: results from the Busselton Health Study.

Authors:  L J Palmer; M W Knuiman; M L Divitini; P R Burton; A L James; H C Bartholomew; G Ryan; A W Musk
Journal:  Eur Respir J       Date:  2001-04       Impact factor: 16.671

3.  Roles of PU.1 in monocyte- and mast cell-specific gene regulation: PU.1 transactivates CIITA pIV in cooperation with IFN-gamma.

Authors:  Tomonobu Ito; Chiharu Nishiyama; Nobuhiro Nakano; Makoto Nishiyama; Yoshihiko Usui; Kazuyoshi Takeda; Shunsuke Kanada; Kanako Fukuyama; Hisaya Akiba; Tomoko Tokura; Mutsuko Hara; Ryoji Tsuboi; Hideoki Ogawa; Ko Okumura
Journal:  Int Immunol       Date:  2009-06-05       Impact factor: 4.823

Review 4.  Stem cell factor and hematopoiesis.

Authors:  V C Broudy
Journal:  Blood       Date:  1997-08-15       Impact factor: 22.113

5.  Robust inference of population structure for ancestry prediction and correction of stratification in the presence of relatedness.

Authors:  Matthew P Conomos; Michael B Miller; Timothy A Thornton
Journal:  Genet Epidemiol       Date:  2015-03-23       Impact factor: 2.135

6.  Mast cells, their subtypes, and relation to asthma phenotypes.

Authors:  Merritt L Fajt; Sally E Wenzel
Journal:  Ann Am Thorac Soc       Date:  2013-12

7.  c-Kit is essential for alveolar maintenance and protection from emphysema-like disease in mice.

Authors:  James Y Lindsey; Koustav Ganguly; David M Brass; Zhuowei Li; Erin N Potts; Simone Degan; Huaiyong Chen; Brian Brockway; Soman N Abraham; Annerose Berndt; Barry R Stripp; W Michael Foster; George D Leikauf; Holger Schulz; John W Hollingsworth
Journal:  Am J Respir Crit Care Med       Date:  2011-03-11       Impact factor: 21.405

8.  An atlas of genetic associations in UK Biobank.

Authors:  Oriol Canela-Xandri; Konrad Rawlik; Albert Tenesa
Journal:  Nat Genet       Date:  2018-10-22       Impact factor: 38.330

9.  HTSeq--a Python framework to work with high-throughput sequencing data.

Authors:  Simon Anders; Paul Theodor Pyl; Wolfgang Huber
Journal:  Bioinformatics       Date:  2014-09-25       Impact factor: 6.937

10.  Generalization and dilution of association results from European GWAS in populations of non-European ancestry: the PAGE study.

Authors:  Christopher S Carlson; Tara C Matise; Kari E North; Christopher A Haiman; Megan D Fesinmeyer; Steven Buyske; Fredrick R Schumacher; Ulrike Peters; Nora Franceschini; Marylyn D Ritchie; David J Duggan; Kylee L Spencer; Logan Dumitrescu; Charles B Eaton; Fridtjof Thomas; Alicia Young; Cara Carty; Gerardo Heiss; Loic Le Marchand; Dana C Crawford; Lucia A Hindorff; Charles L Kooperberg
Journal:  PLoS Biol       Date:  2013-09-17       Impact factor: 8.029
