Literature DB >> 33730015

Gene variants of coagulation related proteins that interact with SARS-CoV-2.

David Holcomb¹, Aikaterini Alexaki¹, Nancy Hernandez¹, Ryan Hunt¹, Kyle Laurie¹, Jacob Kames¹, Nobuko Hamasaki-Katagiri¹, Anton A Komar², Michael DiCuccio³, Chava Kimchi-Sarfaty¹.

Abstract

Thrombosis is a recognized complication of Coronavirus disease of 2019 (COVID-19) and is often associated with poor prognosis. There is a well-recognized link between coagulation and inflammation, however, the extent of thrombotic events associated with COVID-19 warrants further investigation. Poly(A) Binding Protein Cytoplasmic 4 (PABPC4), Serine/Cysteine Proteinase Inhibitor Clade G Member 1 (SERPING1) and Vitamin K epOxide Reductase Complex subunit 1 (VKORC1), which are all proteins linked to coagulation, have been shown to interact with SARS proteins. We computationally examined the interaction of these with SARS-CoV-2 proteins and, in the case of VKORC1, we describe its binding to ORF7a in detail. We examined the occurrence of variants of each of these proteins across populations and interrogated their potential contribution to COVID-19 severity. Potential mechanisms, by which some of these variants may contribute to disease, are proposed. Some of these variants are prevalent in minority groups that are disproportionally affected by severe COVID-19. Therefore, we are proposing that further investigation around these variants may lead to better understanding of disease pathogenesis in minority groups and more informed therapeutic approaches.

Entities: CellLine Chemical Disease Gene Mutation Species

Year: 2021 PMID： 33730015 PMCID： PMC8007013 DOI： 10.1371/journal.pcbi.1008805

Source DB: PubMed Journal: PLoS Comput Biol ISSN： 1553-734X Impact factor: 4.475

Introduction

The Coronavirus disease of 2019 (COVID-19) has been associated with coagulopathy, particularly microclots in the lungs [1-5], that correlates with disease severity [6-9]. There is extensive cross-talk between inflammation and coagulation, and inflammation is presumed to have a role in the observed coagulation phenotype. However, the widespread thrombotic events that are seen in severe COVID-19 patients suggest that there may be a more direct link. In a study conducted before the onset of the COVID-19 pandemic, the severe acute respiratory syndrome (SARS) coronavirus (CoV)-host interactome was investigated. A few proteins related to the coagulation cascade were experimentally identified to interact with viral proteins (Fig 1). Poly(A) Binding Protein Cytoplasmic 4 (PABPC4) was shown to interact with the nucleocapsid (N) protein. Serine/Cysteine Proteinase Inhibitor Clade G Member 1 (SERPING1 or C1 inhibitor) was shown to interact with nsp14, ORF14, ORF3b, ORF7b, nsp2, nsp8 and nsp13. In addition, Vitamin K epOxide Reductase Complex subunit 1 (VKORC1) was shown to interact with the SARS protein ORF7a. The interactions were initially identified by a high-throughput yeast two-hybrid system and confirmed with LUMIER assay [10].

Fig 1

Graphic summary of ORF7a-VKORC1 interaction and possible effects.

The interaction between ORF7a and VKORC1 and possible effects of this interaction.

Graphic summary of ORF7a-VKORC1 interaction and possible effects.

The interaction between ORF7a and VKORC1 and possible effects of this interaction. PABPC4 localizes primarily to the cytoplasm and binds to the poly(A) tail present at the 3-prime end of mRNA. However, it is also found in the surface of thrombin-activated platelets, and therefore it is known as activated-platelet protein-1 (APP-1) [11,12]. PABPC4 may also be involved in the regulation of protein translation in platelets and megakaryocytes may participate in the binding or stabilization of polyadenylates in platelet dense granules [13]. SERPING1 is a plasma protease involved in the complement, intrinsic coagulation and fibrinolytic pathways. In the coagulation cascade, SERPING1 inactivates plasma kallikrein, factor XIIa and factor XIIf. The absence of sufficient levels of functional SERPING1 leads to hereditary angioedema (HAE), which is mediated by sustained activation of kallikrein leading to cleavage of high molecular weight kininogen (HMWK), producing bradykinin [14]. ORF7a is a viral protein that has not been well studied. While it counteracts the anti-viral properties of tetherin (BST2) [15,16], allowing for easier dispersal of virions, this protein has been found to be dispensable for viral replication in cell culture [17]. ORF7a may bind to Integrin beta chain-2 (ITGB2), a protein which is necessary for movement and phagocytosis in lymphocytes [18]. VKORC1 is an enzyme critical for coagulation due to its role in converting vitamin K epoxide into active vitamin K [19], the rate-limiting step in the physiological process of vitamin K recycling. Importantly, vitamin K is necessary for the carboxylation of glutamic acid residues to produce Gla residues. Several human proteins have domains with Gla residues, including coagulation factors II, VII, IX, X, and anticoagulant proteins C, S, and Z. VKORC1 is expressed in all tissues, but particularly in the liver, lungs, and female reproductive system. It is generally embedded in the endoplasmic reticulum [20]. Dietary vitamin K deficiency is associated with coagulopathy, specifically bleeding. Vitamin K antagonists are anticoagulant drugs that work by inhibiting the activity of VKORC1, reducing the levels of available active vitamin K and coagulation factors. Of the vitamin K antagonists, warfarin is most commonly used. Some variants in VKORC1, particularly those common in African and African American populations, are reported to result in warfarin resistance. Warfarin response is also dependent on dietary factors and liver function [21]. For these reasons, dosing warfarin is complicated, and genotyping of VKORC1 to determine the presence of known polymorphisms (such as c.1173C>T) is recommended before initiating warfarin treatment. The impact of viral protein interactions with VKORC1, SERPING1 and PABPC4 on patient outcomes in COVID-19 infection is unknown. While comorbidities, age, and other factors will impact the predisposition to thrombosis or coagulopathy, binding of viral proteins to coagulation related proteins may be partially responsible for the prothrombotic phenotype that is seen in COVID-19 patients. Through computational modeling, we examined the binding of VKORC1, SERPING1 and PABPC4 to SARS-CoV-2 proteins and generated additional evidence for the binding of ORF7a to VKORC1. To further study the impact of these protein-protein interactions, we analyzed the effect of variants on patient outcomes and on protein function. We analyzed COVID-19 genome-wide association study (GWAS) results to find the most influential variants from these genes and characterize them to find potential causes of effect. Then, we investigated several VKORC1, SERPING1 and PABPC4 variants that may impact protein function and coagulopathy, and we identified some VKORC1 variants that may result in warfarin resistance. In particular, we highlight two variants, which are enriched in certain ethnic groups. Better understanding of the contribution of these genes and their variants to COVID-19 pathogenesis may lead to new therapeutic avenues and improved prognosis. This may be of crucial importance for minority groups that are disproportionally affected by severe COVID-19.

Methods

Structural similarities and computational docking of proteins

To assess the binding of SARS-CoV-2 ORF7a and human VKORC1, we used I-TASSER [22-24] to generate homology models for both proteins. However, all ORF7a models showed an interaction between the luminal and transmembrane domains, which would cause clashing between the plasma membrane and the ORF7a luminal domain. For this reason, we used only the transmembrane domains of the models. Then, using the model with the best C-score, we used Zdock [25] to find potential binding sites. From Zdock, we used the protein-protein complex with the N-terminus of the ORF7a fragment closest to the luminal portion of VKORC1, which would introduce the least amount of clashing between the ORF7a luminal domain and the plasma membrane. This complex was used as input to Rosetta Prepack and Rosetta Dock [26-29] to further refine the models by using rigid body perturbations. The top five models were retained. We examined known SARS-CoV-2 ORF7a variants [30,31]. However, all encountered variants were in the luminal domain and are unlikely to impact the interaction with VKORC1. In addition, to verify the binding of PABPC4 and SERPING1 with SARS-CoV-2 proteins, we created homology models for each using I-TASSER and Robetta [32,33]. Template structures for all models are given in S1 Table. However, because segments of PABPC4 and SERPING1 have not been crystallized, these regions in the models were of low quality. For this reason, we used Blast and Clustal Omega to create multiple sequence alignments (MSAs) of proteins similar to interacting SARS proteins, and computed the percent of columns of the homologous SARS-CoV-2 protein matching the SARS protein, as well as a loglikelihood score to measure the probability that the SARS-CoV-2 homolog would be included in the MSA (Table 1). In addition, the MSAs were filtered to remove duplicate sequences by performing affinity propagation clustering with the Levenshtein distance matrix formed from the sequences. Only the cluster centers, SARS, and SARS-CoV-2 sequences were used in the MSA. This was done to account for the large number of very similar sequences, generally from different strains of SARS-CoV-2.

Table 1

Sequence homology of selected SARS and SARS-CoV-2 proteins.

Protein	Fraction Matching	Loglikelihood
N	0.888626	-0.04824
ORF7a	0.827869	-0.05688
nsp14	0.884393	-0.19572
ORF7b	0.795455	-0.01103
nsp3	0.75078	-0.46736
nsp2	0.681818	-0.44764
nsp8	0.969697	-0.12897
nsp13	0.948767	-0.1215

MSA likelihood is the fraction of sequences in an MSA matching SARS-CoV-2 for a given column. Assuming all columns are independent, ∏P(x) gives the probability of finding the SARS-CoV-2 sequence in the MSA sequences, which ranges between 0 and 1. Taking log of this value gives log(∏P(x)) = ∑(log P(x)), an additive loglikelihood score which is nonpositive, with lower values indicating more positions in the SARS-CoV-2 sequence that differ from the MSA sequences.

MSA fraction matching is the fraction of positions in the SARS-CoV-2 protein matching the homologous SARS protein, when both are aligned in an MSA. Higher number indicates more conserved position and the range is between 0 and 1. MSA likelihood is the fraction of sequences in an MSA matching SARS-CoV-2 for a given column. Assuming all columns are independent, ∏P(x) gives the probability of finding the SARS-CoV-2 sequence in the MSA sequences, which ranges between 0 and 1. Taking log of this value gives log(∏P(x)) = ∑(log P(x)), an additive loglikelihood score which is nonpositive, with lower values indicating more positions in the SARS-CoV-2 sequence that differ from the MSA sequences. We used the ORF7a homology model to query Dali [34] for similar protein structures. The top structures in sequence and structural similarity were the ORF7a proteins for SARS and SARS-CoV-2 (PDBs 1YO4 and 6W37). All human proteins interacting with VKORC1 were taken from BIOGRID, the Biological General Repository for Interaction Datasets [35,36]. In addition, we queried Dali against all other viral protein structures, as modeled in I-TASSER.

Relevant variants from COVID19 HGI GWAS metastudies

All variants from the genomic region containing VKORC1, SERPING1, and PABPC4 ±6000 bp were taken from the ANA2, ANA5, and ANA7 metastudies from COVID19 Host Genetics Initiative [37] and The Severe Covid-19 GWAS Group [38] (Tables 2 and S2). We filtered the resulting variants to keep only those with metastudy p-value below 0.05. The resulting variants were all in non-coding regions, therefore, amino acid and codon features do not apply.

Table 2

Possible predicted effect of variants in VKORC, SERPING1 and PABPC4.

Transcript	Location	Fraction matching in MSA	Change in splicing	Average change in mRNA MFE (Z-score)	miRNA summary
VKORC1
NM_024006.4:c.-4931C>T	5’ UTR	0.431818		2.20025
NM_024006.4:c.-4851C>T	5’ UTR	0.754967		1.06847
NM_024006.4:c.-2834C>A	5’ UTR	0.950943			miRNA gained
NM_024006.4:c.-1639G>A	5’ UTR	0.0625	Possible splicing change
NM_024006.4:c.174-136C>T	Intron	0.020228		0.83549
NM_024006.4:c.283+124G>C	Intron	0.133333		-1.44871
NM_024006.4:c.283+837T>C	Intron	0.146727		1.35367
SERPING1
NM_000062.2:c.-3537C>G	5’ UTR	0.009615			miRNA decrease
NM_000062.2:c.-2415G>A	5’ UTR	0.033708	Possible splicing change	0.77565
NM_000062.2:c.-1675G>A	5’ UTR	0.426901			miRNA gained
NM_000062.2:c.52-696C>T	Intron	0.068027	Likely splicing change	0.30003
NM_000062.2:c.52-130C>T	Intron	0.833333
NM_000062.2:c.52-130C>T	Intron	0.833333		-0.28155
NM_000062.2:c.550+794C>A	Intron	0.693694		0.71906
NM_000062.2:c.685+88G>A	Intron	0.769231		-0.75902
NM_000062.2:c.685+659C>T	Intron	0.581818		-0.47852	miRNA decrease
NM_000062.2:c.685+659C>T	Intron	0.679245		-0.68662
NM_000062.2:c.685+1100C>T	Intron	0.772313		2.01226
NM__000062.2:c.685+1391C>T	Intron	0.841912	Likely splicing change
NM_000062.2:c.685+1550G>T	Intron	0.926641
NM_000062.2:c.685+1770C>T	Intron	0.793594		0.84106
NM_000062.2:c.1029+926G>T	Intron	0.015723			miRNA gained
NM_000062.2:c.1029+1443G>C	Intron	0.595745			miRNA gained
NM_000062.2:c.1029+2110T>C	Intron	0.393617	Likely splicing change
NM_000062.2:c.1029+2111G>A	Intron	0.687117			miRNA gained
NM_000062.2:c.1030-2243T>G	Intron	0.026616		-1.68665
NM_000062.2:c.1030-1975G>C	Intron	0.02551		1.92651
NM_000062.2:c.1030-1436T>C	Intron	0.823529
NM_000062.2:c.1030-20A>G	Intron	0.25		-0.76751
NM_000062.2:c.1438G>A	Exon	0.399177	Possible splicing change
NM_000062.2:c.*1323G>A	3’ UTR	0.878788			miRNA lost
NM_000062.2:c.*1521G>T	3’ UTR	0.65873
NM_000062.2:c.*2614A>T	3’ UTR	0.016129			miRNA gained
PABPC4
NM_003819.3:c.-5600T>C	5’ UTR	0.48			miRNA lost
NM_003819.3:c.-4432G>A	5’ UTR	0.009317			miRNA gained
NM_003819.3:c.-4428A>G	5’ UTR	0.221505
NM_003819.3:c.-3677T>G	5’ UTR	0.021645	Possible splicing change	-0.59763
NM_003819.3:c.-3636G>A	5’ UTR	0.022378		2.31727
NM_003819.3:c.-3198T>C	5’ UTR	0.856079	Possible splicing change		miRNA lost
NM_003819.3:c.-2286T>G	5’ UTR	0.210526
NM_003819.3:c.-650C>T	5’ UTR	0.829978			miRNA lost
NM_003819.3:c.193+796C>G	Intron	0.666667
NM_003819.3:c.504-254C>A	Intron	0.247191		0.26322
NM_003819.3:c.738+85T>C	Intron	0.333333			miRNA lost
NM_003819.3:c.877-387C>T	Intron	1			miRNA lost
NM_003819.3:c.972+53A>T	Intron	1
NM_003819.3:c.972+704C>G	Intron	0.5
NM_003819.3:c.1333+26C>G	Intron	0.3125	Likely splicing change
NM_003819.3:c.1621-348C>G	Intron	1
NM_003819.3:c.*765C>A	3’ UTR	1		1.96974
NM_003819.3:c.*1261C>T	3’ UTR	0.771242		-0.91257	miRNA decrease
NM_003819.3:c.*4685A>G	3’ UTR	0.054945		-3.4524	miRNA decrease
NM_003819.3:c.*5316C>T	3’ UTR	0.696181	Possible splicing change		miRNA lost

Change in splicing is presented when all tools find a change in splicing and all hexamer scores are greater than one standard deviation from the mean, and is marked in red when the variant appears in an intron. mRNA MFE changes are normalized (converted into a Z-score) for KineFold, remuRNA, and mFold, then averaged. When all three mRNA MFE changes are above one standard deviation, we mark the value in underline. miRNA summaries are presented when all miRNA changes agree in direction, and the total change is at least 5. miRNA changes are underlined when the variant appears upstream. We characterized these variants in terms of splicing, using hexamer scoring tools [39,40], ESEfinder [41,42], ExonScan [43-45], and FAS-ESS [43]. Where ESEfinder, ExonScan, and FAS-ESS found a change in splicing potential between the wild type (WT) and mutant, the change was reported in Table 2 as “Change in splicing”. When the variant occurred in an intron as opposed to a UTR, we further highlighted the value. Then, we calculated mRNA mean free energy using Kinefold [46], mFold [47-49], and remuRNA [50]. When all three tools were in agreement regarding the direction of the change, the changes in mRNA MFE were converted into Z-scores using mean and standard deviation values computed by randomly sampling WT and mutant sequences. The average of the three Z-scores is reported in Table 2 as “Average change in mRNA MFE (Z-score)”. We also analyzed miRNA binding changes using miRDB [51,52]. For any variant, there may be multiple affected miRNA species. miRNA binding scores are provided for both the WT and mutant flanking 501 nucleotides in S3 Table, and a summary of miRNA binding changes is provided in Table 2. When all miRNA binding changes were in the same direction, we summarized the effect. We analyzed conservation using fraction matching in a nucleotide MSA, computed as the fraction of sequences in the MSA matching the wild type sequence in the appropriate column. This value is included in Table 2 as “Fraction matching in MSA”. Finally, we collected population prevalence data from dbSNP (Tables 3 and S3).

Table 3

Population frequencies (gnomAD) of GWAS variants of VKORC1, SERPING1, and PABPC4.

Transcript	Global	African	American	Ashkenazi Jewish	East Asian	European	Other
VKORC1
NM_024006.4:c.-4931C>T	0.5758	0.5408	0.542	0.514	0.1007	0.63146	0.619
NM_024006.4:c.-4851C>T
NM_024006.4:c.-2834C>A	0.0049	0.0001	0.001	0	0	0.00762	0.007
NM_024006.4:c.-1639G>A	0.326	0.1009	0.444	0.476	0.8996	0.37236	0.369
NM_024006.4:c.174-136C>T	0.3261	0.1009	0.443	0.476	0.8995	0.37264	0.37
NM_024006.4:c.283+124G>C	0.4163	0.2564	0.442		0.8849
NM_024006.4:c.283+837T>C	0.6431	0.7907	0.546	0.517	0.1017	0.62682	0.628
NM_014699.3:c.*2082G>C	0.0049	0.0001	0.001	0	0	0.00763	0.007
NM_014699.3:c.*2737G>T	0.0048	0.017	0.002	0	0	0.00005	0
SERPING1
NM_000062.2:c.-3537C>G	0.0275	0.0078	0.021	0.017	0	0.03846	0.041
NM_000062.2:c.-2415G>A	0.0939	0.0866	0.059	0.141	0.1113	0.09708	0.087
NM_000062.2:c.-1675G>A	0.0937	0.0862	0.059	0.141	0.1105	0.09697	0.087
NM_000062.2:c.52-696C>T	0.3927	0.4767	0.529	0.452	0.7655	0.31673	0.386
NM_000062.2:c.52-130C>T	0.385	0.448	0.525	0.455	0.7668	0.31739	0.385
NM_000062.2:c.52-130C>T	0.385	0.448	0.525	0.455	0.7668	0.31739	0.385
NM_000062.2:c.550+794C>A	0.3936	0.4761	0.531	0.451	0.7697	0.31779	0.388
NM_000062.2:c.685+88G>A	0.2225	0.1006	0.15	0.262	0.1157	0.28733	0.273
NM_000062.2:c.685+1391C>T	0.0248	0.0059	0.022	0.035	0	0.03499	0.038
NM_000062.2:c.685+659C>T	0.3901	0.4743	0.541	0.455	0.769	0.31084	0.381
NM_000062.2:c.685+659C>T	0.3901	0.4743	0.541	0.455	0.769	0.31084	0.381
NM_000062.2:c.685+1100C>T	0.2253	0.1124	0.147	0.262	0.1208	0.28765	0.274
NM_000062.2:c.685+1550G>T	0.2251	0.1127	0.152	0.264	0.1207	0.28734	0.269
NM_000062.2:c.685+1770C>T	0.2216	0.0992	0.15	0.262	0.1184	0.28696	0.27
NM_000062.2:c.1029+926G>T	0.2279	0.1	0.15	0.264	0.1224	0.29523	0.284
NM_000062.2:c.1029+1443G>C	0.2282	0.1004	0.15	0.269	0.1198	0.29577	0.284
NM_000062.2:c.1029+2110T>C	0.612	0.5191	0.469	0.538	0.2347	0.69271	0.624
NM_000062.2:c.1029+2111G>A	0.227	0.1003	0.15	0.264	0.1183	0.29407	0.283
NM_000062.2:c.1030-2243T>G	0.6129	0.5195	0.47	0.541	0.2387	0.69335	0.626
NM_000062.2:c.1030-1975G>C	0.0113	0.0022	0.008	0.024	0	0.01647	0.008
NM_000062.2:c.1030-1436T>C	0.0045	0.0014	0.001	0.003	0	0.00645	0.004
NM_000062.2:c.1030-20A>G	0.6134	0.5197	0.472	0.541	0.2461	0.69353	0.623
NM_000062.2:c.1438G>A	0.2282	0.1007	0.15	0.269	0.1202	0.29561	0.285
NM_000062.2:c.*1323G>A	0.2283	0.1009	0.151	0.269	0.1175	0.29578	0.285
NM_000062.2:c.*1521G>T	0.1496	0.0855	0.166		0.0942
NM_000062.2:c.*2614A>T	0.6058	0.4936	0.463	0.538	0.2277	0.69504	0.626
PABPC4
NM_003819.3:c.-5600T>C	0.8127	0.5571	0.918	0.945	0.9909	0.90666	0.891
NM_003819.3:c.-4432G>A	0.0403	0.0093	0.095	0.024	0.3712	0.02506	0.047
NM_003819.3:c.-4428A>G	0.0432	0.0096	0.1	0.024	0.3712	0.02918	0.052
NM_003819.3:c.-3677T>G	0.1792	0.0535	0.122	0.247	0.1111	0.24258	0.219
NM_003819.3:c.-3636G>A	0.0052	0.0027	0.004	0.007	0	0.0068	0.007
NM_003819.3:c.-3198T>C	0.0025	0.001	0.002	0	0	0.00329	0.004
NM_003819.3:c.-2286T>G	0.0136	0.0031	0.014	0.01	0	0.01952	0.015
NM_003819.3:c.-650C>T	0.0079	0.0027	0.002	0	0	0.01172	0.008
NM_003819.3:c.193+796C>G	0.8013	0.509	0.913	0.945	0.9904	0.90714	0.895
NM_003819.3:c.504-254C>A	0.1438	0.0321	0.079	0.2	0.1093	0.19797	0.183
NM_003819.3:c.738+85T>C	0.0573	0.1955	0.012	0.01	0.0013	0.00354	0.014
NM_003819.3:c.877-387C>T	0.113	0.0243	0.065	0.131	0.1086	0.15452	0.146
NM_003819.3:c.972+53A>T	0.0018	0.0065	0	0	0	0	0
NM_003819.3:c.972+704C>G	0.0025	0.001	0.002	0	0	0.00328	0.004
NM_003819.3:c.1333+26C>G	0.0006	0.0001	0	0	0	0.00095	0
NM_003819.3:c.1621-348C>G	0.0003	0.0001	0	0	0	0.00042	0
NM_003819.3:c.*765C>A	0.0403	0.0086	0.096	0.024	0.3656	0.02561	0.049
NM_003819.3:c.*1261C>T	0.0073	0.0256	0	0	0.0006	0.00005	0.003
NM_003819.3:c.*4685A>G	0.7986	0.4999	0.911	0.945	0.991	0.90696	0.894
NM_003819.3:c.*5316C>T	0.007	0.0028	0.008	0	0	0.00955	0.008

Population frequencies are taken from dbSNP. Populations with greater distance from global distribution are underlined.

Characterization of synonymous and missense variants of coagulation genes of interest

We found all synonymous (S4 Table) and missense (S5 Table) variants of VKORC1, SERPING1 and PABPC4 genes [53] from NCBI’s Single Nucleotide Polymorphism Database (dbSNP) [54] and characterized them in terms of (i) population prevalence in the Genome Aggregation Database (gnomAD) [55,56], (ii) the percent of sequences matching the WT at that position in a multiple sequence alignment (MSA) [57], (iii) likelihood of the variant in the column of an MSA, (iv) mRNA MFE computed by both Kinefold and mFold, (v) relative synonymous codon usage (RSCU) and (vi) relative synonymous codon pair usage (RSCPU) [58,59], (vii) rare codon enrichment [60], (viii) and %MinMax codon usage [61]. For nonsynonymous variants, we additionally used amino acid fraction matching in an MSA, likelihood of the variant amino acid in an amino acid MSA, SIFT [62,56], and Polyphen [63,56]. The fraction matching and MSA likelihood measures use sequence homology and may imply selection against the variant. SIFT uses sequence homology as well as physical properties of amino acids, while Polyphen uses multiple sequence and structural features to predict the effect of amino acid substitutions. MFE of mRNA may affect stability of mRNA transcripts, which will affect transcript abundance and translation. Codon and codon pair usage have been shown to impact translation kinetics [64,65], and their metrics may be useful in assessing the impact of synonymous mutations on protein conformation and function [58]. For all variants, we provide the corresponding identifier in dbSNP (”rs” ID) [54]. We applied filters based on codon usage changes, mRNA MFE changes, and position conservation to identify variants that were potentially impactful on protein expression or conformation, which may affect interactions with SARS-CoV-2 proteins. Then, based on population frequencies, we computed the probability of the presence of at least one filtered variant in each population, and compared with the overall probability. For a summary of the meaning, use, and range of all scoring tools, see S6 Table.

Results

Computational verification of SARS-CoV-2 viral protein interactions

To study the role of coagulation in COVID-19 pathogenesis, we explored the interactions of VKORC1, SERPING1 and PABPC4 with viral proteins through computational docking. VKORC1 and ORF7a were confirmed to have strong binding affinity. Interactions are generally limited to transmembrane helices as opposed to intervening loops where warfarin is known to bind [66]. The top scoring complexes are shown in Fig 2. Plots of interface energy in Rosetta energy units against interface root mean square error for the RosettaDock results are given in Fig 3. The plots show convergence toward the minimum energy state.

Fig 2

Predicted dock of VKORC1 and ORF7a transmembrane domain.

A. Five protein-protein docks depict one main binding site (teal, grey, yellow, green, blue). B. The lowest interface-energy model is shown as a surface representation. C. The lowest interface-energy model, with side chains shown in wheat for amino acids at the interface. D. Another view of the lowest interface-energy model, with side chains shown in wheat at the interface and hydrophobics shown in blue. Amino acids of VKORC1 necessary for vitamin K binding (83F, 80N, 135C, 55F) or warfarin binding (134V, 133I) are given in green.

Fig 3

Plots of interface energy (I_sc) against interface root mean square error (I_rms).

Each point represents a complex formed from one of the top 5 ZDock outputs of VKORC1 and ORF7a proteins, using 10,000 decoys. All plots form energy funnels.

Predicted dock of VKORC1 and ORF7a transmembrane domain.

Plots of interface energy (I_sc) against interface root mean square error (I_rms).

Each point represents a complex formed from one of the top 5 ZDock outputs of VKORC1 and ORF7a proteins, using 10,000 decoys. All plots form energy funnels. When modeling the entire ORF7a protein, the models showed the luminal and transmembrane domains bound to one another. Due to this interaction, any dockings of ORF7a and VKORC1 implied the luminal domain of ORF7a would clash with the plasma membrane. We believe this interaction between the ORF7a transmembrane and luminal domains is predicted because the transmembrane domain has not been structurally characterized, and because the protein structure prediction software will minimize the folding energy by creating this bond between the two domains. To verify this, we used EVmutation [67] to compute the position co-evolution parameters, which measure dependency between positions in the amino acid sequence and are used to predict amino acid contacts and bonds. We found less coupling between the transmembrane and luminal domains than within each domain, which indicates lower likelihood of interaction between the domains. For these reasons, and because the luminal and transmembrane domains of ORF7a are connected by a flexible loop region, we repeated the docking, excluding the ORF7a luminal domain. Docking showed strong interaction with the minimum energy -36.078 Rosetta energy units at the interface between ORF7a and VKORC1. The Rosetta energy in this case measures the impact of interactions and bonds at the interface between ORF7a and VKORC1. Regarding SERPING1 and PABPC4, due to the lack of structural data for some segments, portions of the models for PABPC4 and SERPING1 were of low quality. Therefore, we continued our analysis by examining sequence homology of SARS-CoV-2 proteins to SARS proteins. Predictably, the homology was high (Table 1), suggesting that homologous SARS-CoV-2 proteins maintain interactions with human proteins as observed for SARS proteins. Specifically, several SARS proteins were found to interact with SERPING1, so it is likely that SARS-CoV-2 proteins interact with SERPING1 too. In addition, PABPC4 was found experimentally to bind to SARS-CoV-2 N protein [68].

Variants that may impact COVID-19 severity

GWAS metastudies on COVID-19 outcomes recently became available [37,38]. We focused on the impact of VKORC1, SERPING1, and PABPC4 gene variants on COVID-19 severity. While over 700 variants from these genes were found in the studies, only 55 variants had a p-value less than 0.05; these are listed in Tables 2 and 3. However, none of them are significantly impactful when controlling for multiple hypothesis testing. Only one variant is a coding variant and may affect protein-protein interactions. However, the non-coding variants may affect translation or splicing, leading to lower availability of protein. We characterized the 55 variants in terms of miRNA binding, splicing, mRNA minimum free energy, and sequence conservation, to understand how they may affect disease outcomes. miRNAs are involved in post-transcriptional regulation by binding to mRNA transcripts, resulting in degradation of the mRNA or less efficient translation. Therefore, higher binding will most likely result in lower expressing protein. Summaries of miRNA changes are given in Tables 2 and S2, and full data is given in S3 Table. Interestingly, for variants which effected a change in miRNA binding potential, most caused a reduction in miRNA binding potential, which may increase protein expression. The mean change between variant and wild type miRNA affinity predictions is -11.72414, and the median is -1. Splicing is involved in the production of mature mRNAs for many genes. Changes in splicing may produce alternative mature mRNAs, preventing accurate translation, and thus resulting in a protein with altered potency or affinity to the virus. While we consider splicing dysregulation as potentially impacting gene expression and disease outcome, it has rarely been shown experimentally. In vitro testing of some of these variants did not reveal differences between the splice forms and WT or substantial differences in expression. For example, Wang et al [69] examined the VKORC1 polymorphisms -1639G>A (rs9923231), 1173C>T (rs9934438), and c.-4931C>T (rs7196161) in various cell lines and did not detect any differences in expression levels. We found several intronic variants in all three genes which resulted in large changes in predicted splicing potential (Table 2). Of these, NM_000062.2:c.52-696C>T is more common in East Asian populations, NM_000062.2:c.1029+2110T>C is more common in European populations, and NM_000062.2:c.685+1391C>T and NM_003819.3:c.1333+26C>G are comparatively rare globally. Finally, sequence conservation gives an evolutionary view of the significance of any position in a sequence, but it is dependent on the conservation model and the quality of sequence and structural data. Several PABPC4 variants show perfect conservation at the variant position. The full data are given in S2 Table. We found several upstream variants in VKORC1 that resulted in higher predicted miRNA binding affinity, suggesting lower expression of the protein. Of these, NM_000062.2:c.-1675G>A is relatively common in all populations (9.37% MAF). We also found several upstream variants in PABPC4 that resulted in lower predicted miRNA binding affinity suggesting higher expression of the protein. mRNA molecules will form secondary structures based on nucleotide arrangement and affinity, which impact their structural stability. We found several variants resulting in large changes in mRNA stability. For example, NM_000062.2:c.685+1100C>T, NM_000062.2:c.1030-1975G>C, and NM_003819.3:c.*765C>A are all strongly predicted to destabilize their respective mRNA transcripts. Higher MFE may suggest higher possibility for mRNA degradation, which leads to decreased availability of transcripts and lower expression. These variants may increase mRNA degradation, reducing protein expression. In addition, known clinical consequences of these variants are summarized in Table 4.

Table 4

Variations’ clinical impact.

Variation	Clinical Impact based on Literature
VKORC1
NM_024006.4: c.283+837T>C	South Indians carrying the C nucleotide require lower warfarin dosages relative to WT (T) [70].
NM_024006.4: c.283+124G>C	European Americans carrying the G nucleotide require lower warfarin dosages relative to WT (C) [71].
NM_024006.4: c.174-136C>T	Turkish carrying the T nucleotide require lower warfarin dosages [72]; African Americans and European Americans carrying T nucleotide require lower warfarin dosages relative to WT (C) [73].
NM_024006.4: c.-1639G>A	Chinese carrying the A nucleotide require lower warfarin dosages relative to WT (G) [74].
NM_024006.4: c.-4931C>T	South Indians carrying the T nucleotide require increased warfarin dosages relative to WT (C) [75].
SERPING1
NM_000062.2: c.52-130C>T	Patients carrying the T nucleotide depicted worsened progression for age-related macular degeneration relative to WT (C) [76]; Chinese and Japanese carrying the T nucleotide lack an association with age-related macular degeneration, seen in Caucasian population studies, although was predicted as pathogenic [77].
NM_000062.2: c.1029+2110T>C	European and Mediterranean patients carrying the C nucleotide did not depict a higher association with hereditary angioedema relative to WT [78].
NM_000062.2: c.1030-1975G>C	The intronic polymorphism 1030 +1975G>C has no pathogenic influence on hereditary angioedema although predicted as pathogenic [78].
NM_000062.2: c.1030-20A>G	Association of the G allele with age-related macular degeneration was predicted to decrease the variant splicing form SERPING1, decrease protein expression and potentially limit the regulation of the compliment system [79]. No association was observed for Chinese Han carrying the G nucleotide with age-related macular degeneration [80].
NM_000062.2: c.-2415G>A	Chinese Han patients carrying the A nucleotide did not demonstrate an increased risk of polypoidal choroidal vasculopathy relative to WT (G) [81]. South Korean patients carrying the A nucleotide did not show association with an increased risk of leukemia relative to WT (G) [82]. Caucasians carrying the A nucleotide did not exhibit an increased risk of age-related macular degeneration relative to WT (G) [83].
NM_000062.2: c.52-696C>T	Patients carrying the T nucleotide did not display an increased risk for anterior uveitis relative to WT (C) [84]. Chinese Han carrying the T nucleotide did not display an increased risk for polypoidal choroidal vasculopathy relative to WT (C) [81]. Caucasians carrying the T nucleotide did not display an increased risk for age-related macular degeneration relative to WT (C) [83]. Chinese carrying the T nucleotide did not display an increased risk for diabetic retinopathy relative to WT (C) [85]. European and Mediterranean’s carrying the T nucleotide did not display an increased risk for hereditary angioedema relative to WT (C) [78].
NM_000062.2: c.52-130C>T	Chinese carrying the T nucleotide did not display a different association with age-related macular degeneration relative to WT (C) [80]. Caucasians carrying the T nucleotide displayed worsened progression of age-related macular degeneration relative to WT (C) [86]. Patients carrying the T nucleotide depicted worsened symptoms of age-related macular degeneration relative to WT (C) [76]. Chinese carrying the T nucleotide responded poorer to anti-VEGF treatment relative to WT (C) [87].
NM_000062.2: c.685+659C>T	Caucasians carrying the A nucleotide failed to depict a greater association with AMD relative to WT (G) [83]. South Korean patients carrying the A nucleotide did not depict a greater association with leukemia relative to WT (G) [88]. Han Chinese carrying the A nucleotide did not depict a significantly greater association with age-related macular degeneration relative to WT (G) [81].
NM_000062.2: c.685+1100C>T	European and Mediterranean patients carrying the T nucleotide failed to show a greater association with hereditary angioedema relative to WT (C) [78].
NM_000062.2: c.1029+926G>T	European and Mediterranean patients carrying the T nucleotide failed to show a greater association with hereditary angioedema relative to WT (G) [78]. Chinese Han carrying the T nucleotide did not depict a greater association with polypoidal choroidal vasculopathy relative to WT (G) [81].
NM_000062.2: c.1029+1443G>C	European and Mediterranean patients carrying the C nucleotide failed to show a greater association with hereditary angioedema relative to WT (G) [78].
NM_000062.2: c.1029+2111G>A	European and Mediterranean patients carrying the A nucleotide failed to show a greater association with hereditary angioedema relative to WT (G) [78].
NM_000062.2: c.1438G>A	Patients carrying the A nucleotide did not depict a change in Tacrolimus dosage requirements for transplant operations relative to WT (G) [89]. Chinese Han patients carrying the A nucleotide did not show a higher association with age-related macular degeneration or polypoidal choroidal vasculopathy relative to WT (G) [80].
PABPC4
NM_003819.3: c.504-254C>A	Increased risk for type 2 diabetes with the 40035928G>T polymorphism based on GWAS studies [90].

Prevalence of VKORC1 variants across populations

COVID-19 has spread to the entire world, affecting people with variable genetic and racial backgrounds. Therefore, we explored ORF7a interactions with variants of VKORC1 found across races. There are 160 missense VKORC1 variants in dbSNP and at least 27 which affect warfarin sensitivity [91]. The most common variants are shown in Table 5. The locations of the warfarin sensitive variants are shown in Fig 4. However, many warfarin resistance-causing variants are not listed in dbSNP, and some do not include population frequency information. In addition, there are several intronic, upstream and downstream variants which impact warfarin dosage [92]. For example, rs9923231 (c.-1639G>A, NG_011564.1:g.3588G>A), which causes warfarin sensitivity, is very common in East Asian populations (89.95%) and comparatively less common in African populations (10.09%), with intermediate frequency for other populations.

Table 5

Population frequencies of missense and synonymous VKORC1 variants.

	VKORC1 variant	Warfarin Sensitivity	Prevalence
	VKORC1 variant	Warfarin Sensitivity	Overall	African	Latino	Ashkenazi Jewish	East Asian	Finnish	Non-Finnish European	Other groups	South Asian
Missense	106GT	Resistance	0.00241	0	0.00166	0.03857	0	5.6E-05	0.00069	0.00460	0.00091
	203AG		0.00044	0	0	0	0.00571	0	8.8E-06	0.00016	9.8E-05
	352GC		0.00036	0	0	0	0.00359	0	0	0.00016	0.00078
	202CT		0.00036	6.2E-05	0	0	0	0.00083	0.00059	0.00065	0
	79CG		0.00031	0	0	0	0	0	6.8E-05	0	0.00217
	196GA		0.00020	0.00265	0.00014	0	0	0	0	0	3.3E-05
	427GA		0.00017	0	0.00116	0	0	0	8.8E-06	0.00016	0
	390TG		0.00012	0	0	0	0.00169	0	0	0	0
	157CA		0.00011	0	0	0	0.00139	0	0	0.00017	0
	163TC		1.0E-04	0	0.00067	0	0	0	0	0.00017	0
Synonymous	358CT	Resistance	0.01558	0.19520	0.01186	0.00626	0.00011	0	0.00170	0.01044	0.00065
	36GA	Resistance*	0.01511	0.00210	0.01583	0.02262	5.6E-05	0.07348	0.01171	0.02296	0.00200
	129CT		0.003643	0.00115	0.00093	0.00115	0	0.00504	0.00567	0.00415	0.00282
	54GT		7.7E-05	0	0.00030	0	0.00023	0	2.9E-05	0	3.3E-05
	234AG		3.6E-05	0	0.00012	0	0	0	3.5E-05	0.00016	0
	54GC		2.6E-05	7.1E-05	3.0E-05	0	0	0	2.9E-05	0	3.3E-05
	18GA		1.7E-05	0	9.0E-05	0	0	0	9.8E-06	0	0
	111GA		1.7E-05	0	3.0E-05	0	0.00011	0	9.8E-06	0	0
	72CT		1.7E-05	0	0	0	0	0	3.9E-05	0	0
	186TG		1.6E-05	0	0	0	0	0	3.5E-05	0	0

Warfarin sensitivity is determined by literature review.

Fig 4

Locations of warfarin dosage affecting nonsynonymous variants in VKORC1.

VKORC1 is shown in salmon, while ORF7a is shown in grey. Warfarin dosage affecting nonsynonymous variants are shown in blue. Active site amino acids are shown in green.

Locations of warfarin dosage affecting nonsynonymous variants in VKORC1.

VKORC1 is shown in salmon, while ORF7a is shown in grey. Warfarin dosage affecting nonsynonymous variants are shown in blue. Active site amino acids are shown in green. Warfarin sensitivity is determined by literature review. In the United States, COVID-19 has disproportionally affected African American populations. We sought to investigate whether VKORC1 variants could be implicated in the susceptibility of this population. We found that African and African American populations were much more likely to have at least one synonymous variant that significantly changes codon and codon pair usage in a relatively conserved position. Upon further investigation, we find that this is due to a single synonymous variant, VKORC1:c.358C>T, which is very common in African and African American populations (19.52%) while comparatively rare elsewhere (maximum 1.19% among other populations). This variant is in a relatively conserved position enriched in common codons, with negative changes in relative synonymous codon and codon pair usage (RSCU, RSCPU). This variant was not predicted to change mRNA MFE, while it showed mixed results for splicing effects: while hexamer splicing scoring tools and ESEfinder showed changes in splicing near the variant, FAS ESS and Exonscan found no changes. Furthermore, this variant is associated with warfarin resistance [93-95], and in linkage disequilibrium with another variant upstream of the coding sequence (CDS), NG_011564.1:g.3350A>G, which is also common in African and African American populations (36.40%) and associated with warfarin resistance. In addition, we identified one nonsynonymous variant, VKORC1:c.106G>T, which is relatively common in Ashkenazi Jewish populations (3.857%) and rare in other populations (max 0.4599% among other populations). This variant is predicted to be deleterious by both SIFT and Polyphen and associated with warfarin resistance. This variant appears at the end of a transmembrane helix near a loop, and likely impacts loop conformation near the warfarin binding site. These two variants were interesting, primarily due to their significant population skew. There are many other variants with different prevalence in different populations, but all others are much rarer or much more common across all populations. We additionally characterized the population prevalence of the variants identified from the GWAS studies, finding great variance in prevalence for some. For example, NM_024006.4:c.283+837T>C is very common in all populations (64.3% MAF globally), but less common in East Asian populations (10.17%). Furthermore, some nonsynonymous variants were identified from literature to impact drug response or disease status. Associated nucleotide changes are not always given for these variants, so characterizing them has not been possible. Of the 129 and 127 synonymous and missense VKORC1 variants, respectively, that we considered, 17 synonymous and 29 missense variants were in the VKORC1-ORF7a interface in at least one of our final models. Of these missense variants, only c.355T>A is predicted to be deleterious in both SIFT and Polyphen-2, and is likely to impact protein structure and binding to ORF7a. However, c.157C>T, c.157C>A, c.184G>A, c.229A>T, c.261C>A, c.277T>C, c.280T>C, c.280T>A, c.326G>A, c.344C>G, c.355T>A, c.378C>A, and c.379G>A are predicted to be deleterious by SIFT only, and may also affect binding.

VKORC1 paralog and variants that are impactful on warfarin dosage

VKORC1L1 is a VKORC1 paralog with similar function but reduced warfarin sensitivity [96,97]. We aligned VKORC1 with VKORC1L1 and analyzed the differences between them in the positions of variants, for additional insight into their impact on warfarin sensitivity and possible binding to ORF7a. In the alignment of VKORC1 and VKORC1L1, seven out of twenty positions for the nonsynonymous variants impacting warfarin dosage are not conserved. This is unsurprising because the non-conserved variants are localized to the loop between transmembrane helices one and two, which is near the warfarin binding site (Fig 4). Swapping this region between VKORC1 and VKORC1L1 causes warfarin resistance in VKORC1 and warfarin sensitivity in VKORC1L1 [96]. In addition, we examined similarities of ORF7a with VKORC1 interacting proteins. Two human proteins are structurally similar to ORF7a and interact with VKORC1: CXADR, a Coxsackievirus and Adenovirus receptor [98], and PCDH1, a Hantavirus receptor [99]. Both proteins are involved in cell-cell adhesion. The structural similarity of ORF7a protein, CXADR, and PCDH1 additionally supports the interaction of ORF7a and VKORC1. The structurally aligned regions are shown in Fig 5. Structural overlap is limited to the beta sheets, with small potential for biomimicry.

Fig 5

Structural alignment of ORF7a, CXADR, and PCDH1 proteins.

The alignment is largely confined to the beta sheets.

Structural alignment of ORF7a, CXADR, and PCDH1 proteins.

The alignment is largely confined to the beta sheets.

Discussion

COVID-19 illness is characterized by a prothrombotic phenotype that is incompletely understood [1,2,3,4,6]. Developing more effective treatments will require a thorough understanding of the root causes of COVID-19 associated coagulopathy. Although several possible mechanisms have been proposed [100-102] to explain pathologic thrombosis in COVID-19, many aspects remain unexplored. Three proteins, VKORC1, SERPING1 and PABPC4, which influence coagulation have been shown to interact with SARS proteins. We investigated computationally the binding of these proteins to SARS-CoV-2 proteins. Additionally, we identified genetic variants of these proteins and examined their prevalence across populations. We explored mechanisms by which these variants may impact COVID-19, specifically as it relates to COVID-19 associated coagulopathy. We conclude that each of these proteins may provide a potential link between COVID-19 and coagulation. VKORC1 is crucial for maintaining active vitamin K levels and hence for the function of several essential coagulation factors. We computationally characterized the interaction between VKORC1 and SARS-CoV-2 ORF7a, guided by previous experimental data showing an interaction between VKORC1 and SARS-CoV ORF7a. Whether the ORF7a-VKORC1 interaction would be inhibitory, neutral or possibly potentiate VKORC1 function is difficult to determine in the absence of experimental testing. However, the importance of this interaction should be considered. Indeed, SARS-CoV-2 interaction with the host vitamin K cycle has again been demonstrated in more recent experimental work. Gordon and colleagues expressed SARS-CoV-2 proteins in HEK cells and found gamma-glutamyl carboxylase to interact with the SARS-CoV-2 M protein, which could similarly impact vitamin K-dependent proteins [103]. The ORF7a-VKORC1 interaction theoretically could lead to insufficient carboxylation of vitamin K dependent coagulation factors. However, COVID-19 associated coagulopathy is not typified by coagulation factor deficiencies as measured using common clinical coagulation laboratory assays [104]. Vitamin K dependent proteins outside of the coagulation cascade also contribute to platelet activation and thrombosis. An example is Gas6, which contributes to platelet activation through tyrosine kinase receptors. However, loss of Gas6 signaling is protective against thrombosis in mice [105]. Clearly therefore other mechanisms linking the VKORC1- ORF7a interaction to COVID-19 pathogenesis and coagulopathy warrant consideration. Interestingly, reduced vitamin K levels are associated with worse prognosis in COVID-19 [106]. A deficiency of Vitamin K dependent proteins that protect against pulmonary and vascular elastic damage has been postulated to underlie this relationship. There is also a recognized inverse relationship of vitamin K and the inflammatory response [107,108] and interleukin-6 (IL-6) levels [109]. Inflammation and the immune response are thought to contribute significantly to the pathogenesis of COVID-19 illness [110]. Inflammation is also directly linked to coagulation activation [111]. Inflammation resulting from COVID-19 infection is not unique to adults, as children have also been found to suffer from Kawasaki disease-like symptoms [112]. Finally, the VKROC1—ORF7a interaction may also have an impact on tetherin function. SARS ORF7a is known to inhibit tetherin [113,15], also known as BST-2. Tetherin inhibits virion dispersal [114], and several viruses, including HIV, have auxiliary proteins to counter this effect. The structures of tetherin and VKORC1 are noticeably similar, sharing a coiled-coil architecture: VKORC1 has four consecutive alpha helices [66], while tetherin exists as a homomer of four alpha helices [16,115]. Of note, ORF7a has the highest RSCU of any SARS-CoV-2 protein [116], which may result in more efficient translation and high expression levels compared to other viral proteins, to more effectively counter the effect of tetherin. While many extrinsic factors can influence the presentation of COVID-19, the impact of host genetic variants on viral protein interaction has not received much attention to date. Modulating the strength of these interactions or the availability of host proteins may reduce the effectiveness of viral protein function, viral replication and ultimately the severity of infection. It is interesting to consider the impact of genetic variants in VKORC1. Individuals carrying VKORC1 variants may lead to altered protein conformation and differential binding to either warfarin or ORF7a. For instance, the synonymous variant 358C>T is characterized by a large change both in RSCPU and RSCU suggesting that it may be associated with altered cotranslational folding. Reduced VKORC1 expression or binding to ORF7a, as it may occur in individuals with VKORC1 gene variants, may increase the availability of ORF7a to bind and inhibit tetherin increasing the severity of SARS-CoV-2 infection. Due to the lack of structural data for some segments, homology models of PABPC4 and SERPING1 could not be constructed with high confidence, precluding the ability to create complexes to model and analyze the interactions between these proteins and viral proteins. While PABPC4 has been found to interact with SARS-CoV-2 N protein experimentally [68], interactions between SERPING1 and SARS-CoV-2 proteins have yet to be directly tested. SERPING1 encodes C1 esterase inhibitor, a plasma protein that inhibits the C1 complex of classical pathway of complement. C1 esterase inhibitor also is the primary inhibitor of plasma kallikrein, which produces bradykinin from high-molecular-weight kininogens. Viral interactions with SERPING1 may therefore result in excessive levels of complement activation, bradykinin production and angioedema. This impact could be more pronounced in individuals with genetic variants that result in lower expression of SERPING1 or SERPING1 activity. The relationship between the ACE2 (primary receptor for SARS-CoV-2) and the kinin system could exacerbate this impact. ACE2 inactivates des-Arg9 bradykinin (DABK) [117,118], an active bradykinin metabolite. Reduced ACE2 activity is associated with enhanced signaling of DABK, angioedema, and neutrophil infiltration in the lungs [119,117]. The combined effect of viral suppression of ACE2 expression and function [120] concurrent with SERPING1 inhibition may result in excessively high levels of bradykinin and pulmonary fluid accumulation. Altogether, the interaction of VKORC1, SERPING1 and PABPC4 with viral proteins may result in dysregulated coagulation and immune response. Genetic variants in these genes may impact the host-viral protein interaction by altering protein conformation or expression. Because these genetic variants appear at different frequencies in different populations, this may contribute to differential outcomes for COVID-19 patients from various ethnic groups. Indeed, COVID-19 has had an unequal impact on populations across the globe [121,122]. In the United States, as elsewhere, it is clear that demographic subgroups are more susceptible to severe COVID-19 disease. Certainly a large number of non-genetic factors influence clinical outcomes within populations, including age, access to health care, and presence of comorbidities [123,124]. The genetic underpinnings of host-viral protein interaction may also play an underappreciated role in determining the course of COVID-19 illness and COVID-19 associated coagulopathy.

Template crystal structures used for all protein models, from I-TASSER.