Michiel J Thiecke, Emma J Yang, Oliver S Burren, Helen Ray-Jones, Mikhail Spivakov.
Abstract
Genetic variants showing associations with specific biological traits and diseases detected by genome-wide association studies (GWAS) commonly map to non-coding DNA regulatory regions. Many of these regions are located considerable distances away from the genes they regulate and come into their proximity through 3D chromosomal interactions. We previously developed COGS, a statistical pipeline for linking GWAS variants with their putative target genes based on 3D chromosomal interaction data arising from high-resolution assays such as Promoter Capture Hi-C (PCHi-C). Here, we applied COGS to COVID-19 Host Genetic Consortium (HGI) GWAS meta-analysis data on COVID-19 susceptibility and severity using our previously generated PCHi-C results in 17 human primary cell types and SARS-CoV-2-infected lung carcinoma cells. We prioritise 251 genes putatively associated with these traits, including 16 out of 47 genes highlighted by the GWAS meta-analysis authors. The prioritised genes are expressed in a broad array of tissues, including, but not limited to, blood and brain cells, and are enriched for genes involved in the inflammatory response to viral infection. Our prioritised genes and pathways, in conjunction with results from other prioritisation approaches and targeted validation experiments, will aid in the understanding of COVID-19 pathology, paving the way for novel treatments.
Keywords: 3D chromosomal architecture; COVID-19; GWAS (genome-wide association studies); enhancers and promoters; regulatory genome
Year: 2021 PMID: 34759959 PMCID: PMC8573080 DOI: 10.3389/fgene.2021.745672
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. The COGS prioritisation scores of genes associated with the A2 COVID-19 host GWAS trait. Gene-level Manhattan plot showing COGS scores generated from A2 COVID-19 host GWAS data and PCHi-C data from Javierre et al. (2016). The top-scoring genes (COGS scores >0.3) are labelled in each locus. Multiple genes are labelled when a locus contains several top-scoring genes with very similar scores, or lower-scoring genes with compelling biological functions. For simplicity, non-coding genes are not labelled unless there are no prioritised protein-coding genes in the same locus. See Supplementary Figures S1–S3 for the COGS gene-level Manhattan plots produced with the other three COVID-19 host GWAS traits based on PCHi-C data from Javierre et al. (2016), and Supplementary Figures S4–S7 for the prioritisation results based on PCHi-C data from Ho et al. (2021).
FIGURE 2. Expression patterns of the prioritised genes. Heatmaps showing the results of k-means clustering of COGS-prioritised genes (scores >0.3) based on their relative expression levels across the tissues profiled by the GTEx consortium (A) and across primary blood cell types profiled by the BLUEPRINT consortium (B). Relative gene expression in (A) represents gene-level TPM values scaled across all GTEx genes, and in (B) gene-level RPKM values scaled across genes with the top 25% of expression in the BLUEPRINT dataset. Each cell in the heatmap represents a cluster, with the gene-to-cluster assignments listed in Supplementary Table S5A and B, respectively. Abbreviations: FPKM, fragments per kilobase of transcript per million mapped reads; TPM, transcripts per million.
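The clustering step described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' code: the gene names, scores and TPM values are invented toy data, the log2 transform and per-gene z-scoring are assumptions (the caption does not specify the exact scaling), and the k-means routine is a plain NumPy implementation of Lloyd's algorithm rather than whatever software was used in the study. Only the >0.3 COGS score cut-off is taken from the caption.

```python
import numpy as np

def zscore_rows(expr):
    """Scale each gene (row) to mean 0 and SD 1 across samples (columns)."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # constant rows become all zeros instead of NaN
    return (expr - mean) / sd

def kmeans(data, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance of every point to every centroid.
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy input: five genes, three tissues (all values invented).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
cogs_scores = np.array([0.82, 0.45, 0.10, 0.61, 0.35])
tpm = np.array([
    [50.0, 5.0, 4.0],
    [40.0, 6.0, 3.0],
    [1.0, 1.0, 1.0],
    [2.0, 30.0, 3.0],
    [1.0, 25.0, 2.0],
])

# Keep genes above the COGS score cut-off used in the caption.
keep = cogs_scores > 0.3
prioritised = [gene for gene, flag in zip(genes, keep) if flag]

# Assumed scaling: log2(TPM + 1), then per-gene z-score across tissues.
scaled = zscore_rows(np.log2(tpm[keep] + 1))
labels, centroids = kmeans(scaled, k=2)

for gene, label in zip(prioritised, labels):
    print(gene, "-> cluster", label)
```

In this toy example, the two genes expressed mainly in the first tissue fall into one cluster and the two expressed mainly in the second tissue fall into the other. Cluster numbering is arbitrary and depends on the random initialisation.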
FIGURE 3. Enrichment of COGS-prioritised genes in COVID-19-response gene sets. Quantitative GSEA analysis using the COGS score for each gene against gene sets from the COVID-19 Drug and Gene Set Library. (A) Diagnostic plots produced by the GSEA software, demonstrating the relationship between the normalised enrichment score (NES) and measures of significance (top plot) and the enrichment across COGS scores for the gene sets with the two highest NES values (middle and bottom plots). (B) Bubble plot showing results for all gene sets. The “up” and “down” suffixes indicate the direction of differential expression in COVID-19 for the gene set in question.
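For readers unfamiliar with how a per-gene score feeds into GSEA, the following is a minimal sketch of the weighted running-sum enrichment score used in pre-ranked GSEA, with a per-gene score (here standing in for the COGS score) as the ranking metric. It is an illustration under stated assumptions, not the GSEA software or the authors' analysis: the function name `enrichment_score` and the toy scores are hypothetical, and the permutation step that turns a raw enrichment score into the NES shown in the figure is omitted.

```python
import numpy as np

def enrichment_score(scores, in_set, weight=1.0):
    """Raw GSEA-style enrichment score for one gene set.

    scores : per-gene ranking metric (higher = ranked first)
    in_set : boolean mask marking genes that belong to the gene set
    weight : exponent applied to |score| for genes in the set
    """
    scores = np.asarray(scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    if not in_set.any() or in_set.all():
        raise ValueError("gene set must contain some, but not all, genes")

    # Rank genes from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    scores, in_set = scores[order], in_set[order]

    # Hits step up in proportion to |score|**weight; misses step down evenly.
    hit_weights = np.where(in_set, np.abs(scores) ** weight, 0.0)
    if hit_weights.sum() == 0:
        raise ValueError("all genes in the set have a zero score")
    hit_step = hit_weights / hit_weights.sum()
    miss_step = (~in_set) / (~in_set).sum()

    # The enrichment score is the largest deviation of the running sum from zero.
    running = np.cumsum(hit_step - miss_step)
    return running[np.argmax(np.abs(running))]

# Hypothetical toy data: five genes with invented scores.
toy_scores = [0.9, 0.8, 0.5, 0.2, 0.1]
es_top = enrichment_score(toy_scores, [True, True, False, False, False])
es_bottom = enrichment_score(toy_scores, [False, False, False, True, True])
print(es_top, es_bottom)
```

A gene set concentrated among the highest-scoring genes gives a positive score, and one concentrated among the lowest-scoring genes gives a negative score. In practice the raw score is normalised against a null distribution before gene sets of different sizes are compared.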
FIGURE 4. The biological functions of the prioritised genes. (A) Bubble plot showing the KEGG pathways enriched among COGS-prioritised genes (score >0.3). (B) Diagram of the NOD-like receptor signalling pathway, with the COGS-prioritised genes highlighted in red. (C) Bubble plot showing the results of a quantitative GSEA analysis using the COGS score for each gene against Hallmark gene sets.