First report on genome wide association study in western Indian population reveals host genetic factors for COVID-19 severity and outcome.

Ramesh Pandit1, Indra Singh1, Afzal Ansari1, Janvi Raval1, Zarna Patel1, Raghav Dixit2, Pranay Shah3, Kamlesh Upadhyay4, Naresh Chauhan5, Kairavi Desai6, Meenakshi Shah7, Bhavesh Modi8, Madhvi Joshi9, Chaitanya Joshi10.   

Abstract

Human populations across the globe have responded differently to SARS-CoV-2 infection, resulting in differing disease severity. Host genetic factors are therefore expected to be directly associated with COVID-19. We identified a total of 6, 6, and 7 genomic loci for the deceased-recovered, recovered-asymptomatic, and deceased-asymptomatic group comparisons, respectively. Unfavourable alleles of markers near genes associated with lung and heart diseases, such as the tumor necrosis factor superfamily members TNFSF4 and TNFSF18, showed a noteworthy association with disease severity and outcome in COVID-19 patients from the western Indian population. Markers significantly associated with disease prognosis or recovery are of value in determining an individual's response to SARS-CoV-2 infection and can be used for risk prediction in COVID-19. In addition, GWAS in other populations from India may help to strengthen the outcome of this study.
Copyright © 2022 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  COVID-19; Genome wide association (GWAS); Host genetic factor; SARS-CoV-2

Year:  2022        PMID: 35680011      PMCID: PMC9169419          DOI: 10.1016/j.ygeno.2022.110399

Source DB:  PubMed          Journal:  Genomics        ISSN: 0888-7543            Impact factor:   4.310


Introduction

Host-pathogen interaction studies are pivotal in understanding infectious disease biology. The genetic interaction of host and pathogen determines the response to an infection as well as its progression and severity. COVID-19, caused by SARS-CoV-2 infection, has shaken the human race for the past two and a half years. Like other RNA viruses, SARS-CoV-2 has a high mutation frequency and considerable genetic variation, which has led to the rapid spread of the virus across the globe [[1], [2], [3]]. One of the most perplexing features of SARS-CoV-2 infection is the diverse range of clinical symptoms observed in different populations and over different waves. COVID-19 mainly leads to respiratory illness and blood clotting, with presentations ranging from asymptomatic to moderate (fever, cough and shortness of breath) or severe (pneumonia, acute respiratory distress, and diffuse alveolar damage), and to death in 2–3% of patients [4]. The severity of disease is also positively correlated with increased age and the presence of comorbidities [[5], [6], [7], [8], [9]]. In addition, the fatality rate varies with age and among different ethnic groups [10], suggesting that complex interactions between the virus and the host genetic makeup determine the disease outcome. Identifying populations at higher risk of developing severe disease is important for the development and implementation of effective control measures. Genome-wide association study (GWAS) is widely used to identify the host genetic factors involved in disease susceptibility, which can then guide suitable drug development [11,12]. Therefore, the host genetic makeup contributing to disease resistance, susceptibility, and severity in COVID-19 needs to be studied in detail in different human populations. To date, several genome-wide association studies on COVID-19 are available [[13], [14], [15], [16], [17], [18], [19], [20], [21], [22]]. Recently, researchers have also mapped epigenetic factors associated with COVID-19 severity [23].
However, no GWAS has been carried out for the Indian population. Therefore, this study was undertaken to identify the host genetic factors involved in the susceptibility to and severity of COVID-19 in patients during the first wave of COVID-19 in Gujarat, India. Patients with asymptomatic to severe infection were enrolled, and their final outcome (recovered or deceased) was monitored. The data were analysed using two different GWAS pipelines, PLINK and Scalable and Accurate Implementation of GEneralized mixed model (SAIGE), to identify key host genetic variants that play a significant role in COVID-19.

Results

Based on the incidence rate during the first wave of COVID-19 in 2020, 571 samples were collected from 25 hospitals in 24 districts across Gujarat, India. Of these 571 samples, 172 were from female and 399 from male patients. The median age of the patients in each group and the percentage of patients with comorbidity increased with disease severity (deceased > recovered > asymptomatic), whereas the trend was reversed for the Ct values (a proxy for viral load; a lower Ct indicates a higher load) of the three viral genes targeted in the RT-PCR (Table 1).
Table 1

Details of male and female patients with reference to three disease states.

| | Deceased | Recovered | Asymptomatic | Total |
|---|---|---|---|---|
| Male | N = 60 | N = 228 | N = 111 | 399 |
| Median age, years (range) | 60.5 (35–86) | 52 (15–90) | 39 (18–84) | NA |
| Comorbidity | 41/60 (68.33%) | 102/228 (44.73%) | 13/111 (11.71%) | 156 (52.0%) |
| Median Ct of N gene | 27.13 | 29.37 | 30.06 | NA |
| Median Ct of ORF gene | 26.32 | 28.5 | 29.88 | |
| Median Ct of S gene | 27.12 | 28.18 | 29.9 | |
| Female | N = 36 | N = 99 | N = 37 | 172 |
| Median age, years (range) | 62 (40–86) | 54 (18–90) | 35 (20–76) | NA |
| Comorbidity | 27/36 (75.0%) | 56/99 (56.56%) | 10/37 (27%) | 151 (87.8%) |
| Median Ct of N gene | 24.95 | 29.88 | 30.7 | NA |
| Median Ct of ORF gene | 24.5 | 29.9 | 30.13 | |
| Median Ct of S gene | 24.93 | 29.21 | 29.37 | |

Quality analysis

We used Axiom Analysis Suite for QC of the raw data. After quality filtering, 561 of the 571 patients and 8,53,878 (98.33%) high-resolution markers out of 8,68,298 were retained. In addition, population outliers were identified from the population stratification PCA plot (Fig. 1) and removed. Variants from the remaining 558 patients (94 deceased, 317 recovered, and 147 asymptomatic) were analysed using two different pipelines, PLINK and SAIGE; the PLINK results are reported in this study. We found several markers at various genomic loci that are associated with COVID-19 severity. In total, 6, 7, and 6 genomic loci with p ≤ 10⁻⁷ were identified for the deceased-recovered, deceased-asymptomatic, and recovered-asymptomatic comparisons, respectively. A comparison of the two GWAS tools, PLINK and SAIGE, is shown in supplementary Table S2, and the respective allele and genotype frequencies are given in supplementary Table S3. The Manhattan plot for SNPs with p ≤ 10⁻⁷ and the Q-Q plots are depicted in Fig. 2, while Manhattan plots for markers with p ≤ 10⁻⁶ are shown in supplementary Fig. S2. Additionally, markers previously reported for COVID-19 within 1 Mb of the markers/loci identified in this study are listed in Supplementary Table S4.
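The retention figures quoted above follow from simple arithmetic. The sketch below recomputes them from the counts given in the text; the helper name `retention_pct` is illustrative and is not part of Axiom Analysis Suite or PLINK.

```python
def retention_pct(kept: int, total: int) -> float:
    """Percentage of items that survive a filtering step."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * kept / total


# Counts reported in the text (digit grouping removed).
samples_total, samples_after_qc = 571, 561
markers_total, markers_after_qc = 868_298, 853_878

# Group sizes after removal of population outliers, as reported in the text.
groups = {"deceased": 94, "recovered": 317, "asymptomatic": 147}
samples_analysed = sum(groups.values())

print(f"samples passing QC: {retention_pct(samples_after_qc, samples_total):.2f}%")
print(f"markers passing QC: {retention_pct(markers_after_qc, markers_total):.3f}%")
print(f"samples analysed after outlier removal: {samples_analysed}")
```

The three group sizes add up to the 558 patients carried into the association analysis.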
Fig. 1

Population stratification by principal component analysis using PLINK. Scatter plots depict principal component one (PC1) vs principal component two (PC2), (A) without removing outliers and (B) after removing population outliers.

Fig. 2

Manhattan and Q-Q plots of the association statistics from the meta-analysis of the three-group comparison using PLINK. (A) Manhattan plot and (B) Q-Q plot. For the Manhattan plots, p-values from the GWAS analysis are plotted and the threshold was set at P ≤ 10⁻⁶. The quantile-quantile (Q-Q) plots show the quantile distribution of observed p-values (on the y-axis) versus the quantile distribution of expected p-values, to show the genomic inflation (λ) for each analysis.
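The genomic inflation factor λ shown on Q-Q plots is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.4549). The following is a minimal sketch of that convention, not the authors' code; the example statistics are made up for illustration and are not data from this study.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chisq_stats):
    """Return lambda = median(observed chi-square) / median expected under the null."""
    stats = list(chisq_stats)
    if not stats:
        raise ValueError("need at least one test statistic")
    return median(stats) / CHI2_1DF_MEDIAN


# Hypothetical test statistics, for illustration only.
example_stats = [0.02, 0.10, 0.30, 0.46, 0.90, 1.60, 3.10]
print(f"lambda = {genomic_inflation(example_stats):.3f}")
```

A value close to 1 indicates little inflation of the test statistics, for example from residual population stratification.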

Deceased vs recovered (mortality)

For this comparison, imputation yielded a total of 29,48,95,933 SNPs, of which 63,32,698 passed the MAF > 0.05 cutoff. Analysis of these 63,32,698 SNPs using PLINK gave 6 significant markers with p ≤ 10⁻⁷ (Table 2). The genes associated with these six significant genomic loci are TNFSF4, TNFSF18, DHX15, RP1-15D23.2, GOT2P2, WAC, PPARGC1A, CTD-2036A18.2, PTP4A1P4, and LINC00540-AL354828.1. The marker rs17300100 (chr1:173115604:T:G; 1q25.1), with a p-value of 9.14E-07 (CHISQ 24.1), has as its nearest genes tumor necrosis factor superfamily member 4 (TNFSF4; upstream, 68.127 kb), tumor necrosis factor superfamily member 18 (TNFSF18; downstream, 64.641 kb) and GOT2P2 (downstream, 25.496 kb). Here, the frequency of the altered allele (G) is 9.3% higher among deceased patients than among those who recovered. The regional association plot for this marker is shown in Fig. 3A. Similarly, two closely spaced markers (1536 bp apart) at chromosome 4p15.2, rs73246461 (chr4:24511798:A:G; p-value 4.54E-07; CHISQ 25.45) and rs12651262 (chr4:24513334:A:C; p-value 9.84E-07; CHISQ 23.96), are located in the downstream region of DHX15. The regional association plots for these two markers are shown in Fig. 3B&C. For these two positions, the frequency of the altered allele was 6.8% lower in deceased patients than in those who recovered. The Manhattan plot for markers with p-value ≤ 10⁻⁶ is shown in supplementary Fig. S2A, and the markers are listed in supplementary Table S6.
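The CHISQ and p-value columns can be cross-checked against each other. The sketch below assumes, as is usual for a basic allelic association test, that CHISQ is a chi-square statistic with 1 degree of freedom; under that assumption the upper-tail p-value is erfc(sqrt(x/2)). The function name is illustrative.

```python
import math


def chisq_1df_to_p(chisq: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    if chisq < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chisq / 2.0))


# CHISQ values as listed in Tables 2 and 4 of this article.
for rsid, chisq in [("rs17300100", 24.10), ("rs73246461", 25.45), ("rs34279101", 30.09)]:
    print(f"{rsid}: CHISQ = {chisq:.2f} -> p = {chisq_1df_to_p(chisq):.2e}")
```

Under this assumption the tabulated statistics reproduce the reported p-values to within rounding, which makes the check useful for spotting transcription errors.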
Table 2

Significant markers for the deceased-recovered comparison. Imputed data were analysed using PLINK with MAF > 0.05, and markers with p ≤ 10⁻⁷ were considered significant.

| CHR | SNP | BP | rsID | Band position | Ref | Alt | CHISQ | p-value | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 10 | chr10:28606315:C:T | 28,606,315 | rs12773860 | 10p12.1 | C | T | 26.02 | 3.38E-07 | WAC-AS1, BAMBI, WAC-AS1, RNU4ATAC6P, TPRKBP1, RNU6-1067P, snRNA |
| 4 | chr4:24511798:A:G | 24,511,798 | rs73246461 | 4p15.2 | A | G | 25.45 | 4.54E-07 | PPARGC1A, DHX15 |
| 5 | chr5:86011821:T:G | 86,011,821 | rs4424029 | 5q14.3 | T | G | 25.07 | 5.52E-07 | CTD-2036A18.2, PTP4A1P4 |
| 13 | chr13:22495675:AG:A | 22,495,675 | rs10714879; rs398021874; rs398077102 | 13q12.11 | AG | G | 24.99 | 5.78E-07 | LINC00540, AL354828.1 |
| 1 | chr1:173115604:T:G | 173,115,604 | rs17300100 | 1q25.1 | T | G | 24.1 | 9.14E-07 | TNFSF4, TNFSF18, RP1-15D23.2, GOT2P2 |
| 4 | chr4:24513334:A:C | 24,513,334 | rs12651262 | 4p15.2 | A | C | 23.96 | 9.84E-07 | DHX15, RN7SL16P, PPARGC1A |
Fig. 3

Regional association plots for region around the significant loci for deceased-recovered comparison. (A) rs17300100, (B) rs73246461, and (C) rs12651262. These plots were generated using LocusZoom using all the population. The most strongly associated SNPs are highlighted as purple diamond. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Recovered vs asymptomatic (susceptibility)

For this comparison, we obtained a total of 6 significant genomic loci (Table 3). An interesting marker for this comparison is rs72663004 (chr1:20166359:C:T; p-value 7.92E-07; CHISQ 24.38; 1p36.12); the genes near this marker are PLA2G2D, PLA2G2C, PLA2G5, and UBXN10. For this marker, the frequency of the altered allele is 12.27% lower in asymptomatic patients than in recovered patients. A regional association plot for this marker is shown in Fig. 4A. Two further markers on chromosome 9 (9p24.2), rs72699049 (chr9:3320390:C:A; p-value 3.21E-07; CHISQ 26.12) and rs72699016 (chr9:3258654:T:G; p-value 3.59E-07; CHISQ 25.91), have RFX3 as the nearby gene. At these two loci, the frequency of the altered allele is 1.5% and 4.94% lower, respectively, in patients who remained asymptomatic after SARS-CoV-2 infection than in those who recovered. When the frequency at these loci was compared with that in deceased patients, it was again lower in the asymptomatic group. Regional association plots for both markers are shown in Fig. 4B&C. Two other nearby markers (within an 874 bp region) on chromosome 4 (4q35.2) are rs72717619 (chr4:188628742:C:G; p-value 9.56E-07; CHISQ 24.02) and rs1734523522 (chr4:188627868:CTCT:C; p-value 9.56E-07; CHISQ 24.02). Only pseudogenes are present near these loci. The Manhattan plot for markers with p-value ≤ 10⁻⁶ is shown in Fig. S2B, and the markers are listed in supplementary Table S6.
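The allele-frequency differences and chi-square statistics discussed here come from comparing allele counts between two groups. The sketch below shows a standard Pearson chi-square on a 2x2 allele-count table (1 df, no continuity correction). It is an assumption that this matches the test behind the CHISQ column; the counts are hypothetical, chosen only to match the group sizes (317 recovered and 147 asymptomatic patients, i.e. 634 and 294 alleles), and are not the study's genotype data.

```python
def allele_freq(alt: int, ref: int) -> float:
    """Frequency of the altered allele given altered and reference allele counts."""
    return alt / (alt + ref)


def allelic_chisq(alt_a: int, ref_a: int, alt_b: int, ref_b: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 allele-count table."""
    n = alt_a + ref_a + alt_b + ref_b
    row_a, row_b = alt_a + ref_a, alt_b + ref_b
    col_alt, col_ref = alt_a + alt_b, ref_a + ref_b
    denom = row_a * row_b * col_alt * col_ref
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (alt_a * ref_b - ref_a * alt_b) ** 2 / denom


# Hypothetical allele counts (NOT taken from the study).
recovered = {"alt": 120, "ref": 514}      # 317 patients -> 634 alleles
asymptomatic = {"alt": 20, "ref": 274}    # 147 patients -> 294 alleles

diff = allele_freq(**recovered) - allele_freq(**asymptomatic)
chisq = allelic_chisq(recovered["alt"], recovered["ref"],
                      asymptomatic["alt"], asymptomatic["ref"])
print(f"frequency difference: {100 * diff:.2f} percentage points, CHISQ = {chisq:.2f}")
```

Because the counts are invented, the printed values illustrate the calculation only and should not be read as results.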
Table 3

Significant markers for the recovered-asymptomatic comparison. Imputed data were analysed using PLINK with MAF > 0.05, and markers with p ≤ 10⁻⁷ were considered significant.

| CHR | SNP | BP | rsID | Band position | Ref | Alt | CHISQ | p-value | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 9 | chr9:3320390:C:A | 3,320,390 | rs72699049 | 9p24.2 | C | A | 26.12 | 3.21E-07 | RFX3 |
| 9 | chr9:3258654:T:G | 3,258,654 | rs72699016 | 9p24.2 | T | G | 25.91 | 3.59E-07 | RFX3, LINC01231 |
| 10 | chr10:130802064:T:G | 130,802,064 | rs12253652 | 10q26.3 | T | G | 25.17 | 5.24E-07 | AC016816.1-MIR378C |
| 1 | chr1:20166359:C:T | 20,166,359 | rs72663004 | 1p36.12 | C | T | 24.38 | 7.92E-07 | PLA2G2D, UBXN10, LINC01757, PLA2G5, UBXN10-AS1, PLA2G2F, Z98257.1, PLA2G2C |
| 4 | chr4:188628742:C:G | 188,628,742 | rs72717619 | 4q35.2 | C | G | 24.02 | 9.56E-07 | AC093909.3, LINC01060, RNU7-192P, snRNA, LINC01060 |
| 4 | chr4:188627868:CTCT:C | 188,627,868 | rs1734523522 | 4q35.2 | CTCT | C | 24.02 | 9.56E-07 | AC093909.5, LINC01060, RNU7-192P, snRNA |
Fig. 4

Regional association plot for region around the significant loci for recovered-asymptomatic comparison. (A) rs72663004, (B) rs72699049, and (C) rs72699016. These plots were generated using LocusZoom using all the population. The most strongly associated SNPs are highlighted as purple diamond. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Deceased vs asymptomatic (morbidity)

For this comparison, imputation yielded a total of 30,76,54,523 variants, of which 64,68,284 passed the MAF > 0.05 threshold. Here, we found a total of 7 significant markers; the associated genes are ANGEL1, LRRC74A, ANO3, MOK, CINP, TECPR2, and many uncharacterized loci (Table 4). The marker rs34279101 (chr14:76832814:C:CT; p-value 4.12E-08; CHISQ 30.09; 14q24.3) has the nearby genes ANGEL1 and LRRC74A. Here, the frequency of the altered allele is 7.4% higher in deceased patients than in the asymptomatic group. A regional association plot for rs34279101 is shown in Fig. 5. Two other closely spaced markers on chromosome 14 (14q32.31) are rs11160678 (chr14:102356233:A:G; p-value 5.23E-07; CHISQ 25.18) and rs12323812 (chr14:102363920:C:T; p-value 9.51E-07; CHISQ 24.02). Here, the nearest genes are MOK, CINP and TECPR2. Similarly, three significant markers on chromosome 16 (16p12.3) are rs1453512 (chr16:16924042:C:A; p-value 4.60E-07; CHISQ 25.42), rs1597988 (chr16:16900377:T:C; p-value 9.44E-07; CHISQ 24.04), and rs4371135 (chr16:16897099:G:C; p-value 9.44E-07; CHISQ 24.04). These three markers on chromosome 16 have no gene nearby; instead, all fall within an uncharacterized locus, the pseudogene AC098965.1. Moreover, the difference in the frequency of the altered allele is minor. The Manhattan plot for markers with p-value ≤ 10⁻⁶ is shown in supplementary Fig. S2C, and the markers are listed in supplementary Table S7.
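The marker identifiers used in the text and tables follow a chrom:pos:ref:alt pattern, so spacing between neighbouring markers and the variant type can be derived directly from them. A small sketch under that assumption (the `Variant` class and helper names are illustrative):

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str

    @property
    def kind(self) -> str:
        """'SNV' when both alleles are single bases, otherwise 'indel'."""
        return "SNV" if len(self.ref) == 1 and len(self.alt) == 1 else "indel"


def parse_variant(variant_id: str) -> Variant:
    """Parse an identifier of the form chrom:pos:ref:alt."""
    chrom, pos, ref, alt = variant_id.split(":")
    return Variant(chrom, int(pos), ref, alt)


def distance_bp(a: Variant, b: Variant) -> int:
    """Absolute distance in base pairs between two variants on the same chromosome."""
    if a.chrom != b.chrom:
        raise ValueError("variants are on different chromosomes")
    return abs(a.pos - b.pos)


# Identifiers as given in the text for the deceased-asymptomatic comparison.
rs11160678 = parse_variant("chr14:102356233:A:G")
rs12323812 = parse_variant("chr14:102363920:C:T")
rs34279101 = parse_variant("chr14:76832814:C:CT")

print(distance_bp(rs11160678, rs12323812), "bp between the two 14q32.31 markers")
print("rs34279101 is an", rs34279101.kind)
```

The same helper can be used to confirm the spacing quoted for the marker pairs in the other two comparisons.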
Table 4

Significant markers for the deceased-asymptomatic comparison. Imputed data were analysed using PLINK with MAF > 0.05, and markers with p ≤ 10⁻⁷ were considered significant.

| CHR | SNP | BP | rsID | Band position | Ref | Alt | CHISQ | p-value | Gene |
|---|---|---|---|---|---|---|---|---|---|
| 14 | chr14:76832814:C:CT | 76,832,814 | rs34279101 | 14q24.3 | C | CT | 30.09 | 4.12E-08 | ANGEL1, VASH1-AS1, AC007376.2, RN7SKP17, misc_RNA, AF111169.1, LRRC74A, RPL22P2, AF111169.4 |
| 11 | chr11:26675470:T:G | 26,675,470 | rs10835056 | 11p14.2 | T | G | 25.55 | 4.32E-07 | ANO3, SLC5A12 |
| 16 | chr16:16924042:C:A | 16,924,042 | rs1453512 | 16p12.3 | C | A | 25.42 | 4.60E-07 | AC098965.1 |
| 14 | chr14:102356233:A:G | 102,356,233 | rs11160678 | 14q32.31 | A | G | 25.18 | 5.23E-07 | MOK, TECPR2, ZNF839, CINP |
| 16 | chr16:16900377:T:C | 16,900,377 | rs1597988 | 16p12.3 | T | C | 24.04 | 9.44E-07 | AC098965.1 |
| 16 | chr16:16897099:G:C | 16,897,099 | rs4371135 | 16p12.3 | G | C | 24.04 | 9.44E-07 | AC098965.1 |
| 14 | chr14:102363920:C:T | 102,363,920 | rs12323812 | 14q32.31 | C | T | 24.02 | 9.51E-07 | MOK, CINP, TECPR2, ZNF839 |
Fig. 5

Regional association plot for the region around rs34279101 for the deceased-asymptomatic comparison. This plot was generated using LocusZoom using all the population. The most strongly associated SNP is shown as a purple diamond. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Discussion

Currently, researchers across the globe are working on different aspects of SARS-CoV-2 infection, including epidemiology [10,24], viral mutations [25,26], host transcriptome signatures [27], and in silico analysis of different mutations [28,29], to tackle the viral infection. Researchers are also looking for herbal remedies [30,31] to prevent and cure SARS-CoV-2 infection. While all this information is crucial for understanding viral transmissibility and severity, the host genetic makeup is an equally important factor. Although comorbidities and age remain the major contributors to mortality, host genetics appears to be a significant component of the observed differences in individual response to SARS-CoV-2 infection, disease progression, and severity [[32], [33], [34]]. As mentioned previously, several GWAS on COVID-19 patients from various ethnic groups have been carried out and have identified several genes. Very recently, the COVID-19 Host Genetics Initiative published GWAS data from 46 studies across 19 countries and reported 13 significant (P < 1.67 × 10⁻⁸) genomic loci for COVID-19 [17]. Similarly, 8 super-variants have been reported for COVID-19 mortality [35]. Here, we aimed to identify the host SNPs and genes associated with COVID-19 severity in asymptomatic, recovered and deceased patients from Gujarat, India. The lungs are the major organ affected by respiratory diseases, including COVID-19. Thrombosis-related heart and lung malfunctions are common in COVID-19 patients [[36], [37], [38]]. Similarly, high D-dimer in COVID-19 patients is associated with mortality [39]. Additionally, cardiac arrest is prominent both in COVID-19 patients and after COVID-19 [40,41]. Genes associated with heart and lung disease are therefore crucial and worth studying in relation to COVID-19 complications. The analysis of deceased vs recovered patients identified six genomic loci.
The Tumor Necrosis Factor Ligand Superfamily member genes TNFSF4 and TNFSF18 are located in the vicinity of a significantly associated marker (rs17300100) in this study, and these genes have a key role in inflammatory disease conditions and in the inflammatory activation of macrophage/microglia cells [42]. Moreover, mutations in TNFSF4 have a known role in myocardial infarction [43] and systemic lupus erythematosus (SLE) [44]. Several studies have discussed the role of the T-cell response in COVID-19 [[45], [46], [47], [48]]. TNFSF18 is also associated with T-cell responses; it can act as a co-stimulator and lower the threshold for T-cell activation and proliferation [[49], [50], [51]]. Another gene near this marker is GOT2P2, i.e. Glutamic-Oxaloacetic Transaminase 2-Like 2. Two SNPs (rs6691738 and rs10158467) near rs17300100 have been reported for asthma [52,53]. Additionally, five SNPs/markers (rs114680188, rs12117214, rs147327230, rs9425716, and 1:173368150) within 1 Mb of TNFSF4 and TNFSF18 have already been reported to be associated with COVID-19 [54]. For this marker, the frequency of the altered allele is 9.3% higher in deceased patients than in those who recovered. When compared with other populations, this frequency is also high in the European population (which also experienced higher mortality than the South Asian (SAS) population) and matches the frequency in the deceased group. The frequency of this allele in the South Asian population is similar to that of the recovered group in this study. Two other significant markers are located on 4p15.2, and the nearby genes are DHX15, RN7SL16P, and PPARGC1A. Several members of the DExD/H-box helicase family, including DHX15, have been reported to have a key role in innate immunity against viral infection [[55], [56], [57], [58]].
Further, knockdown of DHX15 impairs the capacity of myeloid dendritic cells to synthesize IFN-β, IL-6, and TNF-α in response to dsRNA and RNA viruses [55], and NF-κB regulation of cytokines and the ERK and TNF-α signalling pathways play an important role in inflammation [59]. In the same way, DDX1 has been reported to interact with NSP14 of the infectious bronchitis coronavirus and enhance viral replication [60]. Moreover, in support of this study, numerous SNPs (with p-value ≤ 1 × 10⁻⁴) in the 1 Mb region downstream of the DHX15 gene have been reported to be associated with COVID-19. Another gene near these two loci is PPARGC1A (PGC-1α), a key regulator of mitochondrial function [61,62]. Recent evidence suggests that SARS-CoV-2 takes over mitochondrial function and specifically disrupts immune function in COVID-19 patients [[63], [64], [65], [66], [67]]. Downregulation of PPARGC1A during SARS-CoV-2 infection has already been reported [68]. Thus, the findings of this study are comparable with those of previous studies. In contrast to rs17300100, here the reference alleles were associated with mortality/severity, while the altered allele was protective (recovery). Here also, the allele frequency in the European population is very close to the altered allele frequency of the deceased patients. In one study, the authors performed a transcriptome-wide association study (TWAS), reported genetic regulation of CXCR6, and correlated it with COVID-19 severity [69]. Thus, the findings of this study also agree with previously reported data. Taken together, these data suggest that TNFSF4, TNFSF18, GOT2P2, DHX15, and PPARGC1A may have a vital role in COVID-19 severity and mortality. Analysis of recovered vs asymptomatic patients revealed a significant association of six genomic loci with SARS-CoV-2 infection.
Among these, rs72663004 (chr1:20166359:C:T) has as nearby genes the secretory calcium-dependent phospholipase A2 group genes (PLA2G5, PLA2G2D, PLA2G2F, and PLA2G2C) and UBXN10. Lipid metabolism plays an important role in viral endocytosis and exocytosis and is also a putative target for antiviral therapy [70,71]. Upregulation of sphingomyelins, GM3s, and glycerophosphocholines has been reported in COVID-19 patients [72,73]. A function of phospholipase A2 group IID in age-related susceptibility to SARS-CoV infection has already been reported [74]. Moreover, it has been anticipated that inhibition of phospholipases A2 may help in the treatment of COVID-19 patients. Another gene near marker rs72663004 is UBXN10, which encodes a ubiquitin regulatory X (UBX) domain protein; members of this family have been reported to inhibit the life cycle of retroviruses and lentiviruses via regulation of genes involved in pathways related to cell adhesion and immune system signalling. The innate immune system plays an important role during the early stage of infection by any pathogen. Cilia in the respiratory tract play an essential role in innate immunity against respiratory infections by removing inhaled particles [75,76]. The RFX family of transcription factors, including regulatory factor X3 (RFX3), is indispensable for ciliogenesis [[77], [78], [79]]. It has also been reported that RFX3 cooperates with FOXJ1 in cilia formation in the human airway epithelium, and that mutations in this gene may cause primary ciliary dyskinesia [75,79,80]. In this study, we also found two markers (rs72699049 and rs72699016) on chromosome 9p24.2, both located near RFX3. For the markers rs72699016, rs12253652, rs72717619, and rs1734523522, the frequency of the altered allele is much higher in both the recovered and asymptomatic groups than in the European and South Asian populations.
Comparing only the recovered and asymptomatic groups in this study, among the above-mentioned locations, rs72699016, rs12253652, and rs72663004 have a higher altered allele frequency in the patients who recovered. Together, these data suggest that the altered allele at these genomic positions might have a protective role against COVID-19 in the Western Indian population, i.e. carriers who are infected with SARS-CoV-2 may remain asymptomatic or recover. For the analysis of deceased vs asymptomatic patients, seven significant loci were identified. The most important of these is rs34279101, and the nearby genes are ANGEL1, LRRC74A, VASH1-AS1, and uncharacterized genes. As mentioned earlier, respiratory distress syndrome and lung pathology are commonly observed in severe cases of COVID-19. When comparing deceased and asymptomatic patients in the present study, we found two putative genes, LRRC74A and ANGEL1, near rs34279101 that may have a role in COVID-19 severity in deceased patients. In support of this study, five locations within 1 Mb of rs34279101 have already been reported for COVID-19, although with lesser p-values. The function of leucine-rich repeat proteins such as LRRC10 in cardiomyopathy [81] and primary ciliary dyskinesia [82] has already been reported. Similarly, allelic variants in LRRC56 are associated with primary ciliary dyskinesia, a disorder associated with chronic respiratory tract infections [83]. Another gene near this locus is ANGEL1/Ccr4e, which has likewise been associated with cardiac disease such as loss of myocardial cells. At this locus, the frequency of the altered allele is higher in deceased individuals than in asymptomatic patients. However, the allele frequency in other populations is quite different. The marker rs10835056 on 11p14.2 has the nearby genes ANO3 and SLC5A12.
Elevated levels of lactate and lactate dehydrogenase, possibly due to hypoxia [84,85] or inflammation, have been reported in COVID-19 patients and are also associated with mortality in septic patients [[86], [87], [88], [89]]. Low- and high-affinity SLC5A12 transporters transport lactate [90]. Previous studies suggest that SLC5A12 transports lactate into T cells at the site of inflammation and controls their function [91,92]. At this locus, the reference allele was associated with mortality, as its frequency was higher in the deceased group than in the asymptomatic group.

Conclusion

In summary, the present study suggests that polymorphic loci around genes involved in lung and heart diseases, such as the tumor necrosis factor superfamily members TNFSF4 and TNFSF18, GOT2P2, and LRRC74A, as well as genes connected with the innate immune system (DHX15) and mitochondrial function (PPARGC1A), are significantly associated with COVID-19 severity in the Western Indian population. In contrast, the altered alleles near the RFX3 and UBXN10 genes were found to be protective in COVID-19 patients in the study population. Our findings suggest that the identified genomic markers may be decisive for COVID-19 progression and severity in the Western Indian population. Therefore, the unfavourable alleles of the markers associated with disease severity and outcome can be used for risk prediction during SARS-CoV-2 infection.

Material and methods

Recruitment of patients

In this study, we recruited 571 COVID-19 patients at different stages of disease severity. Samples were collected from 25 hospitals in 24 districts across the state of Gujarat, India (Supplementary Fig. S1). Metadata such as age, sex, and any comorbidity were recorded (Supplementary Table S1). SARS-CoV-2 infection was confirmed in all patients by RT-PCR of nasopharyngeal swab samples using the TaqPath™ 1-Step RT-qPCR kit on an Applied Biosystems 7500 Fast Dx Real-Time PCR system (Thermo Fisher Scientific). Based on clinical manifestations and disease severity, all patients were broadly categorized as either symptomatic or asymptomatic. Asymptomatic patients were those who experienced very mild symptoms such as cough or body aches but did not require hospitalisation. Symptomatic patients had major symptoms including cold, fever, breathlessness, and sore throat and, importantly, required ventilation or oxygenation in the intensive care unit (ICU). Symptomatic patients were followed up for their final outcome and further divided into two groups, recovered and deceased. The final analysis therefore compared three groups: asymptomatic, symptomatic but recovered, and deceased. With these criteria, a total of 148, 327 and 96 patients were classified as asymptomatic, recovered and deceased, respectively.
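The grouping rule described above can be written as a small decision function. This is a sketch of the stated criteria only; the argument names `required_icu_support` and `died` are hypothetical field names, not columns of the study's metadata, and the example patient records are invented.

```python
from collections import Counter
from enum import Enum


class Group(Enum):
    ASYMPTOMATIC = "asymptomatic"
    RECOVERED = "recovered"
    DECEASED = "deceased"


def assign_group(required_icu_support: bool, died: bool) -> Group:
    """Assign a patient to one of the three comparison groups.

    Patients who did not need ventilation/oxygenation in the ICU are treated
    as asymptomatic; the rest are split by final outcome.
    """
    if not required_icu_support:
        if died:
            # This combination is not covered by the criteria stated in the text.
            raise ValueError("outcome not covered by the stated criteria")
        return Group.ASYMPTOMATIC
    return Group.DECEASED if died else Group.RECOVERED


# Invented records: (required_icu_support, died).
patients = [(False, False), (True, False), (True, True), (True, False)]
counts = Counter(assign_group(icu, died) for icu, died in patients)
print({group.value: n for group, n in counts.items()})

# Group sizes reported in the text; they add up to the 571 recruited patients.
reported = {Group.ASYMPTOMATIC: 148, Group.RECOVERED: 327, Group.DECEASED: 96}
print(sum(reported.values()))
```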

Sample processing, genotyping, imputation and data GWAS analysis

DNA was isolated from blood samples using John's method [93]. The quantity of extracted DNA was estimated using a DNA High Sensitivity assay kit on a Qubit fluorimeter v 4.0 (Thermo Fisher Scientific). The quality of extracted DNA was assessed using agarose gel electrophoresis and the QIAxpert system (QIAGEN). For genotyping, we used the Axiom™ Precision Medicine Diversity Array (PMDA) Plus Kit, 96-format, containing 8,68,298 markers selected for high genomic coverage (Thermo Fisher Scientific), on a GeneTitan Multi-Channel (MC) Instrument (Thermo Fisher Scientific). The best markers were selected using Axiom Analysis Suite following the best practices workflow with the following parameters: sample QC Threshold; QC call_rate: ≥97, SNP QC Threshold; scr-cutoff: ≥95. Markers with high resolution were analysed further. For imputation, we used the TOPMed Imputation Server (https://imputation.biodatacatalyst.nhlbi.nih.gov/). To perform GWAS, the imputed chromosome files were merged and the VCF files were converted to PLINK format. Population stratification was assessed using PLINK v1.9. GWAS analysis was performed using PLINK v1.9 [94] and SAIGE v 0.44.5 [95] at a minor allele frequency (MAF) > 0.05. The groups were compared as follows: deceased vs recovered, deceased vs asymptomatic, and recovered vs asymptomatic patients.
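The MAF > 0.05 filter can be illustrated with a short calculation. The sketch below assumes hard-called genotypes (imputed data are often stored as dosages, which this does not handle) and uses invented genotype counts that sum to 558 samples; it is not the PLINK implementation.

```python
def minor_allele_freq(n_ref_hom: int, n_het: int, n_alt_hom: int) -> float:
    """Minor allele frequency from counts of the three diploid genotypes."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotyped samples")
    alt_freq = (n_het + 2 * n_alt_hom) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def passes_maf(genotype_counts, threshold: float = 0.05) -> bool:
    """True when the variant's MAF is strictly above the threshold."""
    return minor_allele_freq(*genotype_counts) > threshold


# Hypothetical genotype counts (ref/ref, ref/alt, alt/alt), for illustration only.
variants = {
    "variant_A": (500, 55, 3),    # MAF about 0.055 -> kept
    "variant_B": (540, 18, 0),    # MAF about 0.016 -> dropped
    "variant_C": (10, 100, 448),  # alt allele is the major allele; MAF about 0.108 -> kept
}
kept = [name for name, counts in variants.items() if passes_maf(counts)]
print(kept)
```

Taking the minimum of the two allele frequencies matters for variants such as `variant_C`, where the altered allele is the common one in the sample.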

Declaration of Competing Interests

All the authors of this manuscript declare no competing interests.

Funding

This work is funded by the Department of Science and Technology (DST), Government of Gujarat, Gandhinagar, Gujarat, India.

Ethical approval

The present study involving human participants was reviewed and approved by the Institutional Ethical Committees of Gujarat Biotechnology Research Centre (GBRC), Gandhinagar; B. J. Medical College and Civil Hospital, Ahmedabad (reference No. EC/Approval/38/2020); and GMERS Medical College, Gandhinagar (reference No. GMERS/MCG/IEC/06/2020).

Availability of data

The analysed data from the current study have been submitted to https://www.covid19hg.org/.

Author contributions

Chaitanya Joshi: Conceptualization, Funding acquisition, Project administration, Methodology, Editing manuscript, Supervision. Madhvi Joshi: Funding acquisition, Investigation, Review and editing manuscript, Supervision, Project administration. Ramesh Pandit: Formal analysis, Writing original draft. Indra Singh and Afzal Ansari: Formal analysis. Janvi Raval and Zarna Patel: Data curation. Raghav Dixit, Pranay Shah, Kamlesh Upadhyay, Naresh Chauhan, Kairavi Desai, Meenakshi Shah, and Bhavesh Modi: Resources, Review manuscript.