Ignazio S Piras1, Christiane Bleul1, Ashley Siniard1, Amanda J Wolfe1, Matthew D De Both1, Alvaro G Hernandez2, Matthew J Huentelman1.
Abstract
Canine idiopathic pulmonary fibrosis (CIPF) is a chronic fibrotic lung disease that is observed at a higher frequency in the West Highland White Terrier dog breed (WHWT) and may have molecular pathological overlap with human lung fibrotic disease. We conducted a genome-wide association study (GWAS) in the WHWT using whole genome sequencing (WGS) to discover genetic variants associated with CIPF. Saliva-derived DNA samples were sequenced using the Riptide DNA library prep kit. After quality control, 28 affected dogs, 44 unaffected dogs, and 1,843,695 informative single nucleotide polymorphisms (SNPs) were included in the GWAS. Data were analyzed at both the single-SNP and gene levels using the GEMMA and GATES methods, respectively. We detected significant signals at the gene level in both the cleavage and polyadenylation specific factor 7 (CPSF7) and succinate dehydrogenase complex assembly factor 2 (SDHAF2) genes (adjusted p = 0.016 and 0.024, respectively), two overlapping genes located on chromosome 18. The top SNP for both genes was rs22669389; however, it did not reach genome-wide significance in the GWAS (adjusted p = 0.078). Our studies provide, for the first time, candidate loci for CIPF in the WHWT. CPSF7 was recently associated with lung adenocarcinoma, further highlighting the potential relevance of our results because IPF and lung cancer share several pathological mechanisms.
Keywords: animal genetics; genomics; pulmonary fibrosis
Year: 2020 PMID: 32486318 PMCID: PMC7349241 DOI: 10.3390/genes11060609
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Details of the top 10 single nucleotide polymorphisms (SNPs) detected in the genome-wide association study (GWAS). p-values were adjusted using the Bonferroni method, accounting for 101,740 independent SNPs.
| Refsnp ID | CHR | BP | A1 | A2 | F_A | F_U | Depth (SD) | Beta | p | Adj p | Ensembl Gene ID | Gene Name | Consequence Type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs22669389 | 18 | 54992254 | T | A | 0.704 | 0.333 | 0.451 ± 0.713 | 0.406 | 7.7 × 10⁻⁷ | 0.078 | | | u, i; u, i |
| rs22647286 | 18 | 54987884 | C | T | 0.704 | 0.333 | 3.183 ± 2.875 | 0.394 | 1.2 × 10⁻⁶ | 0.124 | | | u, i; u, i |
| rs851654341 | 18 | 54986491 | A | G | 0.704 | 0.333 | 2.324 ± 2.123 | 0.394 | 1.2 × 10⁻⁶ | 0.124 | | | u, i; u, i |
| rs852097932 | 18 | 54986070 | A | G | 0.704 | 0.337 | 2.861 ± 2.209 | 0.394 | 1.3 × 10⁻⁶ | 0.131 | | | u, i; u, i |
| rs22686152 | 18 | 54992285 | A | G | 0.704 | 0.345 | 0.732 ± 0.940 | 0.386 | 2.1 × 10⁻⁶ | 0.213 | | | u, i; u, i |
| rs22647289 | 18 | 54987464 | G | T | 0.704 | 0.345 | 5.423 ± 3.702 | 0.391 | 2.1 × 10⁻⁶ | 0.214 | | | 5’ UTR, u; 5’ UTR, u |
| rs850942449 | 18 | 54983627 | A | G | 0.704 | 0.345 | 2.831 ± 2.449 | 0.393 | 2.2 × 10⁻⁶ | 0.223 | | | u, i; u, i |
| - | 18 | 54984004 | G | A | 0.704 | 0.345 | 0.887 ± 0.919 | 0.393 | 2.2 × 10⁻⁶ | 0.223 | | | i |
| rs22647283 | 18 | 54987912 | C | T | 0.692 | 0.326 | 2.535 ± 2.709 | 0.390 | 2.6 × 10⁻⁶ | 0.263 | | | u, i; u, i |
| rs850871193 | 18 | 54986170 | C | T | 0.692 | 0.337 | 3.028 ± 2.646 | 0.387 | 4.1 × 10⁻⁶ | 0.413 | | | u, i; u, i |
A1: minor allele in the total sample; A2: major allele in the total sample; F_A: frequency of A1 in affected; F_U: frequency of A1 in unaffected; u: upstream; i: intron; 5’ UTR: 5’ untranslated region; SDHAF2: succinate dehydrogenase complex assembly factor 2; CPSF7: cleavage and polyadenylation specific factor 7.
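The Bonferroni adjustment named in the table caption can be illustrated with a minimal Python sketch. It assumes the adjusted value is simply the unadjusted p-value multiplied by the 101,740 independent SNPs and capped at 1; the function name `bonferroni_adjust` and the choice of three example rows are illustrative and not taken from the study's own pipeline.

```python
# Minimal sketch of a Bonferroni adjustment (assumed form: p * n_tests, capped at 1).
N_INDEPENDENT_SNPS = 101_740  # number of independent SNPs stated in the caption


def bonferroni_adjust(p, n_tests=N_INDEPENDENT_SNPS):
    """Return the Bonferroni-adjusted p-value, never exceeding 1."""
    return min(1.0, p * n_tests)


# Unadjusted p-values as printed in the table (rounded to two significant digits).
top_snps = {
    "rs22669389": 7.7e-7,
    "rs22647286": 1.2e-6,
    "rs850871193": 4.1e-6,
}

adjusted = {snp: bonferroni_adjust(p) for snp, p in top_snps.items()}
for snp, adj in adjusted.items():
    print(f"{snp}: adjusted p = {adj:.3f}")
```

For rs22669389 this gives 0.078, matching the table. For the other rows the result can differ from the printed value in the third decimal place, which is expected because the unadjusted p-values shown in the table are rounded.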
Figure 1. (A) SNP level analysis: Manhattan plot of the top 500K SNPs ranked by unadjusted p-value. The continuous and dashed lines indicate the genome-wide (p < 4.91 × 10⁻⁷) and suggestive (p < 9.83 × 10⁻⁷) significance thresholds, respectively. The p-value adjustment was conducted using the Bonferroni method, accounting for 101,740 independent SNPs estimated using regional linkage disequilibrium (LD) patterns. Gene names reported are the top 10 according to the SNP level analysis. (B) Gene level analysis: Manhattan plot of all the genes ranked by p-value. The continuous and dashed lines indicate the genome-wide and suggestive significance thresholds, respectively. The adjustment was conducted using the Bonferroni method, accounting for the total number of genes tested (n = 18,110). Gene names reported are the top 10 according to the gene level analysis.
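The two SNP-level thresholds in the caption are consistent with dividing a significance level by the number of independent tests. The sketch below assumes alpha = 0.05 for the genome-wide line and alpha = 0.10 for the suggestive line; these alpha values are inferred from the reported thresholds and are not stated in the caption. The gene-level figure is computed under the same assumption and is not a value reported in the source.

```python
# Sketch: Bonferroni significance thresholds as alpha / number of tests.
# The alpha levels (0.05 and 0.10) are assumptions inferred from the caption's thresholds.
def bonferroni_threshold(alpha, n_tests):
    """Return the per-test p-value threshold for a family-wise level alpha."""
    return alpha / n_tests


N_SNPS = 101_740   # independent SNPs (from the caption)
N_GENES = 18_110   # genes tested (from the caption)

snp_genome_wide = bonferroni_threshold(0.05, N_SNPS)
snp_suggestive = bonferroni_threshold(0.10, N_SNPS)
gene_genome_wide = bonferroni_threshold(0.05, N_GENES)  # not reported in the source

print(f"SNP genome-wide threshold:  {snp_genome_wide:.2e}")
print(f"SNP suggestive threshold:   {snp_suggestive:.2e}")
print(f"Gene genome-wide threshold: {gene_genome_wide:.2e}")
```

Under these assumptions the SNP-level values round to 4.91 × 10⁻⁷ and 9.83 × 10⁻⁷, the thresholds given in the caption.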
Figure 2. Manhattan plots showing the details of the association region on chromosome 18. The continuous and dashed lines indicate the genome-wide and suggestive significance thresholds, respectively. The color of the points indicates the LD (expressed as R) between the top SNP (rs22669389) and nearby SNPs. Values of R range from 0 (absence of LD) to 1 (complete LD). Panel B shows a narrower range of R because the plotted SNPs lie closer to the top SNP. (A) Region ± 1 Mb from the top SNP; (B) region ± 0.1 Mb from the top SNP. Thick sections of the genes represent the actual gene region according to Ensembl; the thin sections represent the surrounding regions (± 1500).
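To illustrate what an LD value between 0 and 1 measures, the sketch below computes one common estimate: the squared Pearson correlation between the genotype dosages (0, 1, or 2 copies of an allele) of two SNPs. This is an assumption about the statistic; the caption does not say how R was calculated, and the genotype vectors here are hypothetical toy data, not data from the study.

```python
# Sketch: genotype-based LD as squared Pearson correlation of allele dosages.
# This estimator and the toy genotypes are illustrative assumptions.
def genotype_r2(x, y):
    """Squared Pearson correlation between two equal-length dosage vectors."""
    if len(x) != len(y) or not x:
        raise ValueError("dosage vectors must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (cov * cov) / (var_x * var_y)


# Hypothetical dosages for six dogs at three SNPs.
top_snp = [0, 1, 2, 0, 1, 2]
linked_snp = [0, 1, 2, 0, 1, 2]    # identical genotypes: complete LD
unlinked_snp = [1, 0, 1, 1, 2, 1]  # no linear relationship with top_snp

print(genotype_r2(top_snp, linked_snp))
print(genotype_r2(top_snp, unlinked_snp))
```

Identical genotype vectors give a value of 1 (complete LD), while vectors with no linear relationship give 0, matching the range described in the caption.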