Literature DB >> 29871630

A candidate gene identification strategy utilizing mouse to human big-data mining: "3R-tenet" in COPD genetic research.

Sangeetha Vishweswaraiah1, Leema George1, Natarajan Purushothaman2, Koustav Ganguly3,4.   

Abstract

BACKGROUND: Early life impairments leading to lower lung function by adulthood are considered as risk factors for chronic obstructive pulmonary disease (COPD). Recently, we compared the lung transcriptomic profile between two mouse strains with extreme total lung capacities to identify plausible pulmonary function determining genes using microarray analysis (GSE80078). Advancement of high-throughput techniques like deep sequencing (eg. RNA-seq) and microarray have resulted in an explosion of genomic data in the online public repositories which however remains under-exploited. Strategic curation of publicly available genomic data with a mouse-human translational approach can effectively implement "3R- Tenet" by reducing screening experiments with animals and performing mechanistic studies using physiologically relevant in vitro model systems. Therefore, we sought to analyze the association of functional variations within human orthologs of mouse lung function candidate genes in a publicly available COPD lung RNA-seq data-set.
METHODS: Association of missense single nucleotide polymorphisms, insertions, deletions, and splice junction variants were analyzed for susceptibility to COPD using RNA-seq data of a Korean population (GSE57148). Expression of the associated genes were studied using the Gene Paint (mouse embryo) and Human Protein Atlas (normal adult human lung) databases. The genes were also assessed for replication of the associations and expression in COPD-/mouse cigarette smoke exposed lung tissues using other datasets.
RESULTS: Significant association (p <  0.05) of variations in 20 genes to higher COPD susceptibility have been detected within the investigated cohort. Association of HJURP, MCRS1 and TLR8 are novel in relation to COPD. The associated ADAM19 and KIT loci have been reported earlier. The remaining 15 genes have also been previously associated to COPD. Differential transcript expression levels of the associated genes in COPD- and/ or mouse emphysematous lung tissues have been detected.
CONCLUSION: Our findings suggest strategic mouse-human datamining approaches can identify novel COPD candidate genes using existing datasets in the online repositories. The candidates can be further evaluated for mechanistic role through in vitro studies using appropriate primary cells/cell lines. Functional studies can be limited to transgenic animal models of only well supported candidate genes. This approach will lead to a significant reduction of animal experimentation in respiratory research.

Entities:  

Keywords:  3R; Alternate models; Asthma; COPD; Gene; Lung; Transcriptomics

Mesh:

Year:  2018        PMID: 29871630      PMCID: PMC5989378          DOI: 10.1186/s12931-018-0795-y

Source DB:  PubMed          Journal:  Respir Res        ISSN: 1465-9921


Background

Progress in the genomics technologies continue to tremendously advance our understanding of chronic lung diseases like asthma, chronic obstructive pulmonary disease (COPD), and idiopathic pulmonary fibrosis. COPD alone is the 4th leading cause of death globally [http://www.who.int/mediacentre/factsheets/fs310/en/]. Genetic predisposition is considered to be an important risk factor for COPD susceptibility. This is evident from the fact that only 15–20% of smokers develop COPD [1, 2]. Thus, candidate gene identification has been a major focus for COPD research. This has also lead to the extensive use of inbred mouse strains for screening experiments and also to the development of transgenic mouse models to identify genetic susceptibility, elucidation of molecular patho-mechanisms and toxicity testing in COPD research. However, a spin-off of the popularity of transgenic strains to explore gene-function relationships is the increased animal usage [3]. Another corresponding concern is the large number of animals bred that are genetically unsuited for the experiment. Breeding surplus often counts for 50% of the offspring [3]. Moreover, the relevance of a mouse with a single gene inserted or knocked out for studying human diseases is also questioned. This is mainly because complex traits are multi-gene controlled that do not follow Mendelian pattern of inheritance. Pulmonary function and COPD are classic examples of such phenomenon [4-18]. Yet we believe, transgenic models may continue to serve as important resources for studying gene-function relationships particularly in the field of respiratory research. However, the strategy to select candidate genes for using transgenic models to study COPD and other chronic lung diseases is an important issue that warrants attention. Practice of the “3R tenet”-replacement, reduction and refinement warrants a scientist to adequately evaluate non-animal alternatives prior to performing animal experiments [19, 20]. Strategic genomics data mining using the public repositories can put in practice the “3R-tenet” more effectively by: i) reducing screening experiments with animals, ii) performing mechanistic studies in physiologically relevant alternate in vitro model systems and using advanced technologies like RNAi or CRISPR-Cas9 for understanding gene-function relationships, and iii) performing in vivo functional testing using transgenic animal models limited to well supported candidate genes. An accelerated decline in lung function is considered to be the earliest indicator for predisposition, onset and COPD severity assessment. We previously identified mouse strains (C3H/HeJ and JF1/MsJ) with extreme total lung capacities [5, 21, 22]. Recently, we performed a large-scale microarray study (GSE80078) to compare the lung transcript expression profiles of C3H/HeJ and JF1/MsJ mice at the completion of: (I) embryonic lung development; (II) bulk alveolar formation and (III) lung growth and maturity [18]. The generated microarray data provides a publicly available resource for performing genetic association studies as well as functional and mechanistic investigations to understand pulmonary function development and chronic lung disease (eg. COPD) susceptibility [18]. Lung developmental pathways are recollected in genetic subroutines during repair and remodeling processes following lung injury. Therefore, it is plausible that an individual with hindered lung development may have an inefficient repair/remodeling process thereby predisposing them to chronic lung diseases like COPD [23-25]. A study by Lange et al. [26] showed that forced expiratory volume in 1 s (FEV1) in early adulthood is important for the genesis of COPD and that accelerated decline in FEV1 is not an obligate feature of COPD. Therefore, in this work, we performed an in-silico study, testing the association of functional variations within human orthologs of mouse lung function candidate genes [18] in a publicly available RNAseq dataset of a COPD cohort [27].

Methods

Figure 1 illustrates the overall analysis strategy followed in this study. We focused on the missense single nucleotide polymorphisms (SNPs), insertions, deletions and splice site variations for detecting the functional relevance of the associations. Lung transcriptome data (RNA-seq; GSE57148) from a Korean cohort [27] were analyzed to call the variants and to identify the SNPs with significant (p <  0.05) allelic frequency differences between the COPD cases and controls.
Fig. 1

Strategic workflow to screen mouse lung developmental genes for their association within a human chronic obstructive pulmonary disease (COPD) cohort transcriptomic (RNAseq) data

Strategic workflow to screen mouse lung developmental genes for their association within a human chronic obstructive pulmonary disease (COPD) cohort transcriptomic (RNAseq) data

Selection of mouse genes

Mouse lung microarray dataset was retrieved (GSE80078) from our recently completed project contrasting C3H/HeJ (large total lung capacity) and JF1/MsJ (small total lung capacity) [18]. Genes exhibiting increased/decreased transcript expression levels by ≥2 fold in the lungs of JF1/MsJ mice compared to C3H/HeJ were selected for performing the association studies. We also included the top 20 genes identified in Kim et al. [27] study and other COPD associated genes by literature survey resulting in a total of 494 genes for screening. Human orthologs of some genes were not found and many were RIKEN or expressed sequence tags. Therefore, the final search list constituted of 355 genes (Additional file 1: Table S1).

Human lung transcriptome data

A publicly available RNA-seq dataset from a Korean cohort consisting of 98 COPD cases and 91 control subjects was selected for the analysis [27]. Based on our search term [(COPD RNA seq human) and “Homo sapiens”] this was the largest available COPD RNA-seq dataset at the Gene expression Omnibus (GEO) database. The raw FASTQ files of paired end reads representing the transcriptome of control and cases were retrieved from the GEO database at the National Centre for Biological Information (NCBI) through accession number GSE57148 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57148) [27].The quality of the raw FASTQ files were analyzed using FASTQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) for the presence of sequencing adapters and low-quality bases (Phred quality score 30). The quality filtered FASTQ files (Paired end) for each sample were then mapped against the Human Reference Genome build hg19 (http://hgdownload.soe.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz)usingthe Burrows Wheeler alignment (BWA) tool version 0.7.10 (http://bio-bwa.sourceforge.net/). The whole genome alignment was performed using ‘BWA-MEM’ algorithm with default parameters [28]. The aligned reads in the Sequence Alignment/Map (SAM) format were then sorted using ‘SortSam’ algorithm of Picard tool v.1.118 (https://sourceforge.net/projects/picard/). The Sorted SAM file was converted to binary version of a SAM file (BAM file) using the SAMtools (http://samtools.sourceforge.net/). The resulting BAM file was then sorted and indexed using SAMtools (http://samtools.sourceforge.net/) for variant calling. The ‘mpileup’ algorithm of SAM tools was used for calling variants from the sorted BAM file using default parameters. The resulting variant calling file (VCF) containing SNPs was used for the further downstream analysis. The VCF files generated from COPD cases and controls were separately combined using CombineVariants command in Genome Analysis Tool Kit (GATK) v.2.3.9 (https://www.broadinstitute.org/gatk/). The allele frequency in cases and controls were calculated using VCF tools v.0.1.12a (http://vcftools.sourceforge.net/). The calculated allelic frequencies were considered to compare the differences in SNPs frequencies among the COPD cases and the controls.

Statistics

The relative odds with the “cross-products” ratio was used for calculating statistical significance. Followed by odds ratio estimation, the confidence interval was calculated. Ninety five percent confidence level was considered for the estimation [29]. The odds ratio and the significance of the associations were calculated using a statistical tool MedCalc (https://www.medcalc.org/calc/odds_ratio.php). Single variant analysis was performed and the raw p <  0.05 was considered as significant.

In silico assessment of functional consequence of the associated variations on protein biochemistry

The polymorphisms with the significant allelic frequency differences between the COPD cases and controls were further analyzed using the visualization tool ‘Golden Helix GenomeBrowse’ (http://www.goldenhelix.com) to assess the plausible effect of SNPs on protein biochemistry or splicing events. Prosite’ tool of ExPASy [30] was used to analyze the effect of amino acid changes on the functional domains of proteins.

In silico lung expression domain studies of associated genes

Transcript expression of the significantly associated genes were screened in embryonic mouse lungs using the online database “GenePaint” [31]. “The Human Protein Atlas” database [32] was used to identify the immuno-positive lung cells for the significantly associated genes in normal adult human lung.

Lung transcript expression levels of the associated genes in COPD and cigarette smoke exposed mice

The associated 20 genes were scanned for differential transcript expression in several COPD and/ or emphysematous lung tissues (GSE: 29133, 22,148, 1650, 47,460 and 54,837) [33-37] as well as in mouse cigarette smoke exposed lungs (GSE: 8790, 7310, 17,737, and 76,205) [38-40] using microarray/RNA-seq datasets from GEO database.

Results

A stringent cut off ratio of ≥2 fold increased/decreased was used to select the mouse lung function developmental genes (GSE80078) for association studies in the RNA-seq dataset of the investigated Korean COPD cohort (GSE57148). Our study identified significant association of 16 non-synonymous SNPs, 4 splice junction variations and 3 insertions involving 20 genes out of the 355 screened genes to higher COPD susceptibility in the investigated cohort (Table 1).
Table 1

Details of the gene and corresponding single nucleotide polymorphism (SNP) associated to chronic obstructive pulmonary disease (COPD) susceptibility

GeneGene nameEntrez IDAssociated SNPChromosomal LocationRef allele/Alt alleleRef AA/Alt AAOdds ratio95% CIz-statisticSignificance levelAssociation to COPD
ABCA10 ATP-binding cassette, sub-family A, member 1010,349rs496884917: 67178316A/GM/T4.091.11 to 15.012.1250.0336Novel loci, gene associated previously (NLGAP)
ADAM19 ADAM metallopeptidase domain 198728rs14227955: 156936364T/CS/G6.212.26 to 17.003.5550.0004Loci and gene associatedpreviously
BHLHE41 Basic helix-loop-helix family, member e4179365rs1104841312: 26275555G/AA/V5.292.71 to 10.354.879<  0.0001NLGAP
CD200CD200 molecule4345rs11311993: 112059768C/GS/C2.37731.11 to 5.052.2480.0246NLGAP
CYBB Cytochrome b-245, beta polypeptide1536NovelX: 37658269C/AQ/K2.90911.35 to 6.262.7260.0064NLGAP
GATM Glycine amidinotransferase2628rs128877515: 45661678T/AQ/H2.43091.35 to 4.362.9740.0029NLGAP
GBP1 Guanylate binding protein 1, interferon-inducible2633rs10484251: 89522646G/CT/S3.26111.70 to 6.243.5660.0004NLGAP
HJURP Holliday junction recognition protein55355rs22864302: 234761225C/TE/K3.361.42 to 7.942.7680.0056Novel
KIT V-Kit, sarcoma viral oncogene homolog3815rs38222144: 55593464A/CM/L5.051.07 to 23.742.0540.04Loci and gene associated previously
LEPR Leptin receptor3953rs11371011: 66058513A/GQ/R10.395.00 to 21.586.28<  0.0001NLGAP
LMO7 LIM domain 74008Insertion13: 76383319A to G Insertion3.63161.99 to 6.624.209<  0.0001Novel insertion, gene associatedPreviously (NIGAP)
LMO7 LIM domain 74008Insertion13: 76429504T insertion3.55311.50 to 8.362.9030.0037NIGAP
LRP1 Low density lipoprotein receptor-related protein 14035Splice junction12: 57605134G/C10.221.28 to 81.582.1950.0282Novel splice site; gene associatedPreviously (NSSGAP)
MCRS1 Microspherule protein 110445Splice junction12: 49957330C/T3.3651.57 to 7.193.1270.0018Novel
POP4 Processing of precursor 4, ribonuclease P/MRP subunit (S. cerevisiae)10775Splice junction19: 30101540G/A2.76691.31 to 5.832.6730.0075NSSGAP
PTCH1 Patched 15727Splice junction9: 98242373G/T11.412.58 to 50.373.2130.0013NSSGAP
SCN7A Sodium channel, voltage-gated, type VII, alpha subunit6332rs75650622: 167334085G/TT/N4.41751.97 to 9.903.6060.0003NLGAP
SCN7A Sodium channel, voltage-gated, type VII, alpha subunit6332Insertion2: 167289263AG Insertion3.35611.17 to 9.572.2630.0237NIGAP
SCN7A Sodium channel, voltage-gated, type VII, alpha subunit6332rs67380312: 167279922C/AM/I2.28171.09 to 4.762.1980.028NLGAP
SLFN12L Schlafen family member 12 like100,506,736rs230496817: 33805150T/CY/C2.21.07 to 4.512.1510.0315NLGAP
TLR8 Toll-like receptor 851,311rs3764880X: 12924826A/GM/V2.971.11 to 7.912.1810.0292Novel
TTC5 Tetratricopeptide repeat domain 591875rs374294514: 20770036T/CQ/R2.15911.14 to 4.072.3750.0176NLGAP
VEPH1 Ventricular zone expressed PH domain homolog 1 (zebrafish)79674rs119189743: 157081324A/GS/P1.86681.04 to 3.322.1160.0343NLGAP

AA amino acids, rs reference sequence, Ref reference, Alt altered, CI confidence interval

p < 0.05 was considered as significant

Details of the gene and corresponding single nucleotide polymorphism (SNP) associated to chronic obstructive pulmonary disease (COPD) susceptibility AA amino acids, rs reference sequence, Ref reference, Alt altered, CI confidence interval p < 0.05 was considered as significant

Association of novel and previously reported genes to COPD

The 20 associated genes include: ATP binding cassette subfamily A member 10 (ABCA10); a disintegrin and metallopeptidase domain 19 (ADAM19); basic helix-loop-helix family member e41 (BHLHE41), CD200 molecule (CD200); cytochrome b-245, beta polypeptide (CYBB); glycine amidinotransferasec (GATM); guanylate binding protein 1 (GBP1); holliday junction recognition protein (HJURP); KIT proto-oncogene receptor tyrosine kinase (KIT); leptin receptor (LEPR); LIM domain 7 (LMO7); LDL receptor related protein 1 (LRP1); microspherule protein 1 (MCRS1); processing of precursor 4, ribonuclease P/MRP subunit (POP4); Patched 1 (PTCH1); sodium channel, voltage-gated, type VII, alpha subunit (SCN7A); schlafen family member 12 like (SLFN12L); toll like receptor 8 (TLR8); tetratricopeptide repeat domain 5 (TTC5) and ventricular zone expressed PH domain homolog 1 (VEPH1). Our analysis, identified HJURP (rs2286430), MCRS1 (splice junction), and TLR8 (rs3764880) as three novel COPD associated genes (Table 1). The variations (missense SNPs/splice junction variations) on ABCA10 (rs496849), BHLHE41 (rs11048413), CD200 (rs1131199), CYBB (not reported in dbSNP), GATM (rs1288775), GBP1 (rs1048425), LEPR (rs1137101), LMO7 (2 insertions), LRP1 (splice junction), POP4 (splice junction), PTCH1 (splice junction), SCN7A (rs7565062, rs6738031, 1 insertion), SLFN12L (rs2304968), TTC5 (rs3742945), and VEPH1 (rs11918974) are located on genes previously associated to COPD (Table 1). The associated SNPs on ADAM19 (rs1422795) and KIT (rs3822214) have been previously reported in relation to COPD (Table 1).

In silico protein domain and gene/protein expression analysis

In silico protein domain analysis revealed the ADAM19 (rs1422795) variation at the position of Chr5: T-156936364-C resulting in an amino acid exchange of Ser17Gly (polar to non-polar) to be located within the ADAM metalloprotease domain (Additional file 1: Figure S1). None of the other amino acid changes were located within functional domains of the proteins. In silico transcript expression domain analysis using the Gene Paint database (Additional file 1: Table S2) revealed detectable lung expression of Adam19, Cd200, Cybb, Mfleg (HJURP), Kit, Lepr, Lmo7, Lrp1, Mcrs1, Pop4 and Ptch1 in mouse embryo (E14.5; at pseudoglandular stage of lung development). This further attests the role of the mentioned 11 genes in the process of lung development. Impairment in the regulation and functionality of lung developmental genes may result in predisposition to chronic lung diseases like COPD. In silico lung protein expression domain analysis using the Human Protein Atlas revealed detectable immuno-expression of 18 associated genes in macrophages and/or pneumocytes and/or nasopharynx (respiratory epithelial cells) and/or bronchus (respiratory epithelial cells) (Additional file 1: Table S2). Immuno-expression of BHLHE41 and GATM were not detectable in the normal human lung tissue. Detection of expression of the significantly associated COPD susceptibility genes within specific cell types of the normal human lung further supports their specific role in the normal lung physiology. Additional file 1: Figures S1-S4 shows the expression of HJURP, MCRS1 and TLR8 in mouse embryonic lungs and normal adult human lungs. However, human protein atlas does not provide information on the expression of proteins in COPD tissues. Therefore, we investigated the transcript expression levels of the associated genes using available datasets on the lungs of COPD patients and mouse exposed to cigarette smoke. The associated SNP rs2286430 (C/T) located on HJURP results in an amino acid change of glutamic acid (Glu: acidic, polar and negatively charged) to lysine (Lys: basic, polar and positively charged) in HJURP. Low to medium intensity of HJURP immune positive macrophages, pneumocytes, respiratory epithelial cells have been demonstrated in normal human lung tissue (Additional file 1: Figure S2) (Human Protein Atlas). Hjurp transcripts has been detected in mouse embryonic lungs (Additional file 1: Figure S2). Mcrs1 is expressed in the mouse embryonic lungs (Additional file 1: Figure S3) (Gene Paint). Medium to high intensity immune-positive MCRS1 macrophages, pneumocytes, respiratory epithelial cells have been demonstrated in normal human lung tissue (Additional file 1: Figure S3) (Human Protein Atlas). TLR8 immuno-positive (high intensity) macrophages are reported in normal human lung (Additional file 1: Figure S4). The intensity of TLR8 immuno-positive staining in the respiratory epithelial cells is low (Additional file 1: Figure S4) whereas in pneumocytes and embryonic mouse lung TLR8/Tlr8 was not detectable (Human Protein Atlas; Gene Paint).

Lung transcript expression of the associated genes in other COPD cohorts and mouse studies

We investigated the transcript expression levels of the associated 20 genes in several COPD and/ or emphysematous lung tissue data sets. SLFN12L is the only gene not exhibiting any differential expression in any of the investigated datasets. A summary of the expression pattern of the 20 genes in the investigated COPD lung tissue datasets (GSE: 29133, 22,148, 1650, 47,460 and 54,837) is provided in Additional file 1: Table S3. Mouse cigarette smoke exposure experiments are also another valuable resource to evaluate molecular patho-mechanisms as tobacco smoking is the major risk factor for COPD. We therefore also evaluated the expression of the 20 associated genes in the datasets generated from lungs of mice exposed to cigarette smoke (GSE: 8790, 7310, 17,737, and 76,205) (Additional file 1: Table S4). In case of mouse studies, Gbp1, Mcrs1, Ptch1, Slfn12l, and Ttc5 were the genes not exhibiting altered expression following cigarette smoke exposure. A summary of the expression pattern of the 20 genes in the cigarette smoke exposed mouse lung tissue datasets are provided in the Additional file 1: Table S4. Amongst the 20 candidate COPD genes identified in our study, transcripts of all except GBP1, MCRS1, PTCH1, SLFN12L and TTC5 are differentially expressed in both mouse cigarette smoke exposed lungs and human COPD/emphysematous lungs within the investigated datasets.

Discussion

All datasets investigated in this study originated from the lung samples of human and mouse thereby confirming the tissue specificity (18, 27, 37–40). The dataset GSE57148 from Kim et al. (27) study consisting of 98 COPD patients and 91 control subjects from a Korean population. This was the largest available lung RNA-seq dataset of a COPD cohort in GEO database at the time of study. However, for association studies this is a small sample size. It is important to note that most of the association studies on COPD genetics and genomics of pulmonary function originates from populations with European ancestry. Therefore, the effect of ethnicity on the current findings cannot be ruled out. Additional file 1: Table S5 shows the difference in minor allele frequencies of the associated SNPs between Korean population (http://152.99.75.168/KRGDB/browser/mainBrowser.jsp) and global population (https://www.ncbi.nlm.nih.gov/SNP/) justifying the plausible differences in ethnicity. Apart from lung specific expression of the associated genes, another strength of our study is the focus on missense SNPs (amino acid change), insertions, deletions, and splice junction variations thereby increasing the functional relevance of these associations. A genome-wide analysis of alternative splicing indicated that 40–60% of human genes undergo alternative splicing, often in a tissue specific manner [41-44]. On the other hand, since we performed the study using RNAseq data, our investigation is limited only to the exonic sequences and therefore could not detect any alterations within the promoter or intronic region. RNAseq data provides information only of a single strand. Thus, our study lacks information on the homozygosity of the identified associations. Availability of the genomic sequence of the same individuals would have overcome this drawback. We detected association of 20 genes to higher susceptibility for COPD. Our findings on the association of SNPs located on ADAM19 (rs1422795) and KIT (rs3822214) to higher COPD susceptibility replicate the previous findings by other investigators [12, 45–48]. The rs11048413 SNP on BHLHE41 causing an Ala298Val change have been associated to patient survival in lung adenocarcinoma. The Ala/Val or Val/Val genotype was associated to poor survival rate compared to Ala/Ala genotype [49]. The associated SNP on GATM (rs1288775) has been linked to lung cancer phenotypes with and without emphysema among African-American population but not among white Americans [50]. The SNP rs3764880 on TLR8 has been associated to tuberculosis. The SNP rs3761624 also located on TLR8 which has been associated to allergic rhinites in a Swedish population is in perfect linkage disequibrium with rs3764880 suggesting their complementary relationship [51]. The genes ABCA10, BHLHE41, CD200, CYBB, GATM, GBP1, LEPR, LMO7, LRP1, POP4, PTCH1, SCN7A, SLFN12L, TTC5, and VEPH1 have been previously associated to COPD [52-68]. Moreover, we detected altered transcript expression of ABCA10, ADAM19, BHLHE41, CD200, CYBB, GATM, GBP1, HJURP, KIT, LEPR, LMO7, LRP1, MCRS1, POP4, PTCH1, SCN7A, TLR8, TTC5 and VEPH1 in COPD and emphysematous lungs compared to control subjects in various datasets (GSE: 29133, 22,148, 1650, 47,460 and 54,837; Additional file 1: Table S3) [33-37]. In case of mouse lungs exposed to cigarette smoke, altered transcript expression was detected among Abca8a (ABCA10), Adam19, Bhlhe41, Cd200, Cybb, Gatm, Hjurp, Kit, Lepr, Lmo7, Lrp1, Pop4, Scn7a, Tlr8, and Veph1 (GSE: 8790, 7310, 17,737, and 76,205; Additional file 1: Table S4) [38-40]. Effect of cigarette smoke exposure on COPD development may act as a confounding factor in the analysis of candidate susceptibility genes in this study. However, considering the concept of recapitulation of developmental pathways as genetic subroutines during lung repair/remodeling processes, altered regulation of the associated genes in both COPD-and cigarette smoke exposed mouse lungs seems to be reasonable. SNPs on ADAM19 (rs2277027), PTCH1 (rs16909898), LRP1 (rs11172113) and hedgehog interacting protein (HHIP; rs12504628, rs1980057) have been associated to FEV1/forced vital capacity (FVC) ratio in samples of European ancestry [10, 12]. We previously reported decreased lung Hhip transcript levels in a mouse model lacking secreted phosphoprotein 1 (Spp1) with lower total lung capacity and enlarged alveolar size compared to control [8]. Based on the hypothesis on the origin of chronic lung diseases like COPD during the early life events [60-70], we could detect three novel (HJURP, MCRS1 and TLR8) COPD candidate genes and replicate the findings in 17 other studies using a mouse-human translational datamining approach. Gene set enrichment analysis [71] of the 20 associated genes identified COPD as one of the top enriched diseases (Additional file 1: Figure S5). HJURP is a centromeric protein (chaperone) that plays a central role in the incorporation and maintenance of histone H3-like variant CENPA at centromeres [72-74]. MCRS1 have been implicated in epithelial-mesenchymal transition, metastasis and growth of lung cancer cells [75-77]. TLR8 is also expressed in human monocytes and myeloid dendritic cells and Th1-type immune response cells. Mucus hypersecretion is induced by dual TLR7/8 agonist [78, 79]. Similarly, the murine TLR8 is involved in the activation of innate immune responses [80]. Stimulation of TLR8 causes relaxation of airway smooth muscles thereby preventing broncho-constriction [81]. Association of TLR8 have been also reported for pulmonary tuberculosis [82, 83], asthma and related atopic disorders [84].

Conclusions

Through this study we could demonstrate a candidate gene identification strategy for COPD using mouse-human translational approach using existing genomic datasets in the public repositories. The strategy warrants validation in larger sample size and in multiple cohorts. Cigarette smoke exposure studies in mice are routinely practiced to model emphysema development, a commonly associated COPD phenotype, as it causes increased pulmonary inflammation, protease activity, oxidative stress and apoptosis [85]. However, cigarette smoke exposure in mice does not result in excessive mucus production or mucus cell metaplasia that is characteristic of COPD pathogenesis [85]. It is plausible that the different response to cigarette smoke exposure in human and mouse lungs may be due to their structural differences [85]. The inbred mouse strains also differ significantly in their resistance or susceptibility to emphysema development following cigarette smoke exposure as measured by airspace enlargement [86]. This variable susceptibility among inbred mouse strains to emphysematous change following cigarette smoke exposure may be attributed to their genetic constitution and differences in lung development. Most of the COPD transcriptomic profiling studies have been performed using lung tissue from severely diseased patients requiring lobectomy. On the contrary, COPD pathogenesis occurs over decades. Molecular mechanisms that are active during initial phase of the pathogenesis may be completely different compared to the end stage of the disease. Therefore, creation of a translational profile between mouse and human COPD transcriptomic data is challenging. In this respect, we share similar views as other investigators that it is important to carefully evaluate the common lung-biology and -pathobiology existing between mice and human prior to considering cigarette smoke exposure experiments in mouse models [85]. Single gene driven spontaneous emphysema developing mouse models [47] identified through physiological phenotyping (eg. pulmonary function screening) may serve an important tool to understand molecular patho-mechanism but this requires exhaustive supportive evidence prior to testing the transgenic model. One way of accumulating convincing supportive evidence is explained in the present work. Mechanistic studies to elucidate the role of the novel candidate genes can be performed using appropriate cell lines, primary cells and physiologically relevant in vitro models [87]. This approach would lead to a significant reduction of animal screening experiments in respiratory research. Table S1. List of the genes screened for association to higher Chronic Obstructive Pulmonary Disease (COPD) susceptibility. Table S2. Summary of the transcript (Gene Paint; mouse embryo) and protein (Human Protein Atlas) expression domains of the significantly associated chronic obstructive pulmonary disease (COPD) genes. Table S3. Analysis of lung transcript expression of the associated 20 genes in chronic obstructive pulmonary disease (COPD) and/ or emphysematous lung tissues using available datasets [GSE: 29133, 22,148, 1650, 47,460 and 54,837] in Genome Expression Omnibus (GEO) database. ↓: Decreased ↑: Increased ✓: significantly altered. Table S4. Analysis of l transcript expression of the associated 20 genes in mouse cigarette smoke exposed lungs using available datasets [GSE: 8790, 7310, 17,737, and 6205] in Genome Expression Omnibus (GEO) database. ↓: Decreased ↑: Increased ✓: significantly altered. Table S5. The difference in minor allele frequencies of the associated single nucleotide polymorphisms (SNPs) between Korean population and global population indicates the influence of ethnicity on the findings. The Korean population data was accessed from the KoreanDB: http://152.99.75.168/KRGDB/menuPages/firstInfo.jsp and http://152.99.75.168/KRGDB/browser/mainBrowser.jsp Global SNP data(dbSNP database): https://www.ncbi.nlm.nih.gov/SNP/. Figure S1. Analysis of protein domain and functional sites in the “A Disintegrin and metallopeptidase domain 19” (ADAM19). Figure S2. Transcript (Gene Paint; mouse embryo) and protein expression (Human Protein atlas; normal lung) domain of holliday junction recognition protein (HJURP). Figure S3. Transcript (Gene Paint; mouse embryo) and protein expression (Human Protein atlas; normal lung) domain of microspherule protein 1 (MCRS1). Figure S4. Protein expression (Human Protein atlas; normal lung) domain of toll like receptor 8 (TLR8). Figure S5. Gene-set enrichment analysis for the associated 20 genes for (A) cellular component enrichment (B) biological process enrichment (C) molecular function enrichment (D) diseases enrichment using Enrichr interactive enrichment analysis tool [71]. (PDF 1296 kb)
  81 in total

1.  A genomic view of alternative splicing.

Authors:  Barmak Modrek; Christopher Lee
Journal:  Nat Genet       Date:  2002-01       Impact factor: 38.330

2.  Soluble CD200 Correlates With Interleukin-6 Levels in Sera of COPD Patients: Potential Implication of the CD200/CD200R Axis in the Disease Course.

Authors:  Priya Sakthivel; Angele Breithaupt; Marcus Gereke; David A Copland; Christian Schulz; Achim D Gruber; Andrew D Dick; Jens Schreiber; Dunja Bruder
Journal:  Lung       Date:  2016-11-18       Impact factor: 2.584

3.  Impaired lung homeostasis in neonatal mice exposed to cigarette smoke.

Authors:  Sharon McGrath-Morrow; Tirumalai Rangasamy; Cecilia Cho; Thomas Sussan; Enid Neptune; Robert Wise; Rubin M Tuder; Shyam Biswal
Journal:  Am J Respir Cell Mol Biol       Date:  2007-11-01       Impact factor: 6.914

4.  Superoxide dismutase 3, extracellular (SOD3) variants and lung function.

Authors:  Koustav Ganguly; Martin Depner; Cheryl Fattman; Kiflai Bein; Tim D Oury; Scott C Wesselkamper; Michael T Borchers; Martina Schreiber; Fei Gao; Erika von Mutius; Michael Kabesch; George D Leikauf; Holger Schulz
Journal:  Physiol Genomics       Date:  2009-03-24       Impact factor: 3.107

5.  Genome-wide association study of lung function phenotypes in a founder population.

Authors:  Tsung-Chieh Yao; Gaixin Du; Lide Han; Ying Sun; Donglei Hu; James J Yang; Rasika Mathias; Lindsey A Roth; Nicholas Rafaels; Emma E Thompson; Dagan A Loisel; Rebecca Anderson; Celeste Eng; Maitane Arruabarrena Orbegozo; Melody Young; James M Klocksieben; Elizabeth Anderson; Kathleen Shanovich; Lucille A Lester; L Keoki Williams; Kathleen C Barnes; Esteban G Burchard; Dan L Nicolae; Mark Abney; Carole Ober
Journal:  J Allergy Clin Immunol       Date:  2013-08-06       Impact factor: 10.793

Review 6.  Developmental pathways in lung regeneration.

Authors:  Collin T Stabler; Edward E Morrisey
Journal:  Cell Tissue Res       Date:  2016-12-13       Impact factor: 5.249

7.  Genome-wide association analysis identifies six new loci associated with forced vital capacity.

Authors:  Daan W Loth; María Soler Artigas; Sina A Gharib; Louise V Wain; Nora Franceschini; Beate Koch; Tess D Pottinger; Albert Vernon Smith; Qing Duan; Chris Oldmeadow; Mi Kyeong Lee; David P Strachan; Alan L James; Jennifer E Huffman; Veronique Vitart; Adaikalavan Ramasamy; Nicholas J Wareham; Jaakko Kaprio; Xin-Qun Wang; Holly Trochet; Mika Kähönen; Claudia Flexeder; Eva Albrecht; Lorna M Lopez; Kim de Jong; Bharat Thyagarajan; Alexessander Couto Alves; Stefan Enroth; Ernst Omenaas; Peter K Joshi; Tove Fall; Ana Viñuela; Lenore J Launer; Laura R Loehr; Myriam Fornage; Guo Li; Jemma B Wilk; Wenbo Tang; Ani Manichaikul; Lies Lahousse; Tamara B Harris; Kari E North; Alicja R Rudnicka; Jennie Hui; Xiangjun Gu; Thomas Lumley; Alan F Wright; Nicholas D Hastie; Susan Campbell; Rajesh Kumar; Isabelle Pin; Robert A Scott; Kirsi H Pietiläinen; Ida Surakka; Yongmei Liu; Elizabeth G Holliday; Holger Schulz; Joachim Heinrich; Gail Davies; Judith M Vonk; Mary Wojczynski; Anneli Pouta; Asa Johansson; Sarah H Wild; Erik Ingelsson; Fernando Rivadeneira; Henry Völzke; Pirro G Hysi; Gudny Eiriksdottir; Alanna C Morrison; Jerome I Rotter; Wei Gao; Dirkje S Postma; Wendy B White; Stephen S Rich; Albert Hofman; Thor Aspelund; David Couper; Lewis J Smith; Bruce M Psaty; Kurt Lohman; Esteban G Burchard; André G Uitterlinden; Melissa Garcia; Bonnie R Joubert; Wendy L McArdle; A Bill Musk; Nadia Hansel; Susan R Heckbert; Lina Zgaga; Joyce B J van Meurs; Pau Navarro; Igor Rudan; Yeon-Mok Oh; Susan Redline; Deborah L Jarvis; Jing Hua Zhao; Taina Rantanen; George T O'Connor; Samuli Ripatti; Rodney J Scott; Stefan Karrasch; Harald Grallert; Nathan C Gaddis; John M Starr; Cisca Wijmenga; Ryan L Minster; David J Lederer; Juha Pekkanen; Ulf Gyllensten; Harry Campbell; Andrew P Morris; Sven Gläser; Christopher J Hammond; Kristin M Burkart; John Beilby; Stephen B Kritchevsky; Vilmundur Gudnason; Dana B Hancock; O Dale Williams; Ozren Polasek; Tatijana Zemunik; Ivana Kolcic; Marcy F Petrini; Matthias Wjst; Woo Jin Kim; David J Porteous; Generation Scotland; Blair H Smith; Anne Viljanen; Markku Heliövaara; John R Attia; Ian Sayers; Regina Hampel; Christian Gieger; Ian J Deary; H Marike Boezen; Anne Newman; Marjo-Riitta Jarvelin; James F Wilson; Lars Lind; Bruno H Stricker; Alexander Teumer; Timothy D Spector; Erik Melén; Marjolein J Peters; Leslie A Lange; R Graham Barr; Ken R Bracke; Fien M Verhamme; Joohon Sung; Pieter S Hiemstra; Patricia A Cassano; Akshay Sood; Caroline Hayward; Josée Dupuis; Ian P Hall; Guy G Brusselle; Martin D Tobin; Stephanie J London
Journal:  Nat Genet       Date:  2014-06-15       Impact factor: 38.330

8.  Gene expression analysis of membrane transporters and drug-metabolizing enzymes in the lung of healthy and COPD subjects.

Authors:  Tove Berg; Tove Hegelund Myrbäck; Marita Olsson; Janeric Seidegård; Viktoria Werkström; Xiao-Hong Zhou; Johan Grunewald; Lena Gustavsson; Magnus Nord
Journal:  Pharmacol Res Perspect       Date:  2014-06-12

9.  Molecular mechanisms underlying variations in lung function: a systems genetics analysis.

Authors:  Ma'en Obeidat; Ke Hao; Yohan Bossé; David C Nickle; Yunlong Nie; Dirkje S Postma; Michel Laviolette; Andrew J Sandford; Denise D Daley; James C Hogg; W Mark Elliott; Nick Fishbane; Wim Timens; Pirro G Hysi; Jaakko Kaprio; James F Wilson; Jennie Hui; Rajesh Rawal; Holger Schulz; Beate Stubbe; Caroline Hayward; Ozren Polasek; Marjo-Riitta Järvelin; Jing Hua Zhao; Deborah Jarvis; Mika Kähönen; Nora Franceschini; Kari E North; Daan W Loth; Guy G Brusselle; Albert Vernon Smith; Vilmundur Gudnason; Traci M Bartz; Jemma B Wilk; George T O'Connor; Patricia A Cassano; Wenbo Tang; Louise V Wain; María Soler Artigas; Sina A Gharib; David P Strachan; Don D Sin; Martin D Tobin; Stephanie J London; Ian P Hall; Peter D Paré
Journal:  Lancet Respir Med       Date:  2015-09-21       Impact factor: 30.700

10.  Systemic inflammatory response to smoking in chronic obstructive pulmonary disease: evidence of a gender effect.

Authors:  Rosa Faner; Nuria Gonzalez; Tamara Cruz; Susana Graciela Kalko; Alvar Agustí
Journal:  PLoS One       Date:  2014-05-15       Impact factor: 3.240

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.