Literature DB >> 35314694

Genetic alteration of human MYH6 is mimicked by SARS-CoV-2 polyprotein: mapping viral variants of cardiac interest.

Praveen Anand1, Patrick J Lenehan2, Michiel Niesen2, Unice Yoo2, Dhruti Patwardhan1, Marcelo Montorzi2,3, A J Venkatakrishnan4, Venky Soundararajan5,6.   

Abstract

Acute cardiac injury has been observed in a subset of COVID-19 patients, but the molecular basis for this clinical phenotype is unknown. It has been hypothesized that molecular mimicry may play a role in triggering an autoimmune inflammatory reaction in some individuals after SARS-CoV-2 infection. Here we investigate if linear peptides contained in proteins that are primarily expressed in the heart also occur in the SARS-CoV-2 proteome. Specifically, we compared the library of 136,704 8-mer peptides from 144 human proteins (including splicing variants) to 9926 8-mers from all the viral proteins in the reference SARS-CoV-2 proteome. No 8-mers were exactly identical between the reference human proteome and the reference SARS-CoV-2 proteome. However, there were 45 8-mers that differed by only one amino acid when compared to the reference SARS-CoV-2 proteome. Interestingly, analysis of protein-coding mutations from 141,456 individuals showed that one of these 8-mers from the SARS-CoV-2 Replicase polyprotein 1a/1ab (KIALKGGK) is identical to an MYH6 peptide encoded by the c.5410 C > A (Q1804K) genetic variation, which has been observed at low prevalence in Africans/African Americans (0.08%), East Asians (0.3%), South Asians (0.06%), and Latino/Admixed Americans (0.003%). Furthermore, analysis of 4.85 million SARS-CoV-2 genomes from over 200 countries shows that viral evolution has already resulted in 20 additional 8-mer peptides that are identical to human heart-enriched proteins encoded by reference sequences or genetic variants. Whether such mimicry contributes to cardiac inflammation during or after COVID-19 illness warrants further experimental evaluation. We suggest that SARS-CoV-2 variants harboring peptides identical to human cardiac proteins should be investigated as "viral variants of cardiac interest".
© 2022. The Author(s).

Entities:  

Year:  2022        PMID: 35314694      PMCID: PMC8935120          DOI: 10.1038/s41420-022-00914-9

Source DB:  PubMed          Journal:  Cell Death Discov        ISSN: 2058-7716


Introduction

Cardiac injury is a prevalent complication associated with COVID-19 [1]. In a study of 100 recently recovered COVID-19 patients, cardiovascular magnetic resonance imaging revealed cardiac involvement or ongoing myocardial inflammation in 78 and 60 patients, respectively [2]. In another study of 39 consecutive autopsies from patients who died of COVID-19, viral RNA was detectable in the heart of 24 (62%) patients. A large nationwide study from Israel reported that SARS-CoV-2 infection is associated with increased rates of myocarditis, arrhythmia, myocardial infarction, and pericarditis [3]. Myocarditis has also been reported in a small fraction of individuals after receiving an mRNA COVID-19 vaccine [4-6]. Despite these phenotypic associations, the mechanisms underlying myocardial inflammation in the setting of COVID-19 infection and vaccination remain unclear. A prevalent hypothesis, known as molecular mimicry, posits that T lymphocytes and/or antibodies that recognize SARS-CoV-2 antigens and mediate virus neutralization may also cross-react against host cardiac proteins and trigger an autoimmune response against cardiomyocytes [7]. This mechanism has also been suggested to contribute to other inflammatory conditions seen in the context of COVID-19 infection [8-10]. Indeed, autoimmune sequelae of other infectious diseases have been attributed to mimicry between host and microbial antigens [11-16]. Advances in next-generation sequencing technologies have facilitated the rapid development of large-scale multi-omic datasets and genomic epidemiology resources to better understand the COVID-19 pandemic. Bulk and single-cell RNA-sequencing datasets have elucidated the transcriptional signatures of most healthy human tissues and cell types [17-20]. Amino acid sequences of human proteins, including genetic variants and immunologic epitopes, are available in UniProt [21], gnomAD [22], and Immune Epitope Database (IEDB) [23]. The GISAID database currently hosts 4.85 million SARS-CoV-2 genomes from more than 200 countries[24]. The availability of such genome-scale data enables us to investigate the potential for molecular mimicry between SARS-CoV-2 and human cardiac proteins. Here we present a systematic comparison of peptides from human cardiac proteins and SARS-CoV-2 proteins. We show that no 8-mer peptides are identical between the reference sequences of these two groups of proteins. However, when including human and viral genetic variants in this comparison, we found 21 8-mer peptides to be identical between human cardiac proteins and SARS-CoV-2 proteins. Among these, a human genetic variant of MYH6 (c.5410 C > A; Q1804K) is identical to a peptide of the reference SARS-CoV-2 replicase polyprotein. Finally, we propose that the SARS-CoV-2 variants that have peptides identical to human cardiac proteins should be studied as potential “viral variants of cardiac interest”.

Results

Identification of genes that are overexpressed in cardiac tissue

To identify heart-enriched proteins, we compared the expression of all human protein-coding genes in heart samples (n = 861) versus all non-striated muscle samples (n = 15,718) from the Genotype-Tissue Expression (GTEx) project. There were 137 genes expressed at least fivefold higher in the heart with a Cohen’s D value greater than or equal to 0.5 (Fig. 1 and Supplementary Table 1). Similarly, we compared the expression of all genes between cardiomyocytes (n = 8.9 K cells) and non-cardiomyocytes (n = ~2.5 million cells) across a database of 52 single-cell RNA-sequencing studies covering 62 tissues[18]. There were 46 genes overexpressed in cardiomyocytes based on the same criteria outlined above (Fig. 1 and Supplementary Table 1). Combining the lists of genes identified from bulk and single-cell RNA-sequencing analyses, we identified a total of 144 candidate cardiac proteins.
Fig. 1

Identification of mimicked peptides between SARS-CoV-2 and human proteins.

a Identification of cardiac-specific proteins based on analysis of bulk RNAseq and single-cell RNA-seq data and identification of SARS-CoV-2 proteins. b Comparison of peptide libraries of human cardiac proteins and SARS-CoV-2 proteins.

Identification of mimicked peptides between SARS-CoV-2 and human proteins.

a Identification of cardiac-specific proteins based on analysis of bulk RNAseq and single-cell RNA-seq data and identification of SARS-CoV-2 proteins. b Comparison of peptide libraries of human cardiac proteins and SARS-CoV-2 proteins.

MYH6 variant is mimicked by an epitope of SARS-CoV-2 Replicase polyprotein 1a/1ab

We computed peptide 8-mers from the reference sequences of the 144 cardiac proteins (including isoforms) and all 17 proteins from the reference SARS-CoV-2 sequence. We then systematically compared the pairwise sequence identity (using Hamming Distance; see Methods) for the 136,704 cardiac protein 8-mers with the 9,926 8-mers derived from the SARS-CoV-2 proteins. No peptides were fully identical between these two groups. However, 45 8-mers were nearly identical, with only a single mismatched amino acid (Table 1). To determine whether human genetic variation results in any 8-mers, which exactly match the reference SARS-CoV-2 proteome, we then analyzed amino acid mutations from 141,456 individuals using the gnomAD database (including 83,623 mutations in the 144 cardiac proteins) [22].
Table 1

List of peptide pairs from SARS-CoV-2 proteins and human cardiac proteins that have a Hamming distance less than or equal to 1.

SARS-CoV-2 ProteinsHuman cardiac-enriched proteins
Protein nameAmino acid positionsAmino acid sequenceProtein nameIsoforms containing sequenceAmino acid sequence
Spike glycoprotein WT/2 P491–498PLQSYGFQKLHL411, 2PLQSYFFQ
Spike glycoprotein WT/2 P856–863NGLTVLPPFHOD31, 2, 3, 4IGLTVLPP
Spike glycoprotein WT/2 P857–864GLTVLPPLFHOD31, 2, 3, 4GLTVLPPP
Spike glycoprotein WT/2 P1087–1094AHFPREGVCMYA51AHFPAEGV
Replicase polyprotein 1ab4937–4944KYAISAKNTTN1, 3, 2, 5, 12, 13, 7, 8, 4, 9, 10, 11KYIISAKN
Replicase polyprotein 1ab5604–5611LQGPPGTGMYLK31, 4KQGPPGTG
Replicase polyprotein 1ab5605–5612QGPPGTGKMYLK31, 4QGPPGTGR
Replicase polyprotein 1ab5813–5820NRPQIGVVCASQ21, 2FRPQIGVV
Replicase polyprotein 1ab5814–5821RPQIGVVRCASQ21, 2RPQIGVVN
Replicase polyprotein 1ab5955–5962DTKFKTEGTTN1, 3, 2, 5, 12, 13, 7, 8, 4, 9, 10, 11DTKFKTTG
Replicase polyprotein 1ab5955–5963DTKFKTEGLTTN1, 3, 2, 5, 12, 13, 7, 8, 4, 9, 10, 11DTKFKTTGL
Replicase polyprotein 1ab5956–5963TKFKTEGLTTN1, 3, 2, 5, 12, 13, 7, 8, 4, 9, 10, 11TKFKTTGL
Replicase polyprotein 1ab6516–6523KPVPEVKICMYA51KPSPEVKI
Replicase polyprotein 1ab6516–6523KPVPEVKITTN1, 2, 5, 12, 13, 7, 8, 4, 11KPVPEEKI
Replicase polyprotein 1a/1ab207–214RAGKASCTTTN1, 2, 5, 12, 13, 7, 8, 4, 11EAGKASCT
Replicase polyprotein 1a/1ab208–215AGKASCTLTTN1, 2, 5, 12, 13, 7, 8, 4, 11AGKASCTT
Replicase polyprotein 1a/1ab345–352GTENLTKETNNI3K1, 3, 4GTESLTKE
Replicase polyprotein 1a/1ab459–466VNINIVGDENO31, 3, 2VNIQIVGD
Replicase polyprotein 1a/1ab492–499KGLDYKAFCASQ22KKLDYKAF
Replicase polyprotein 1a/1ab512–519TKGKAKKGMYH71GKGKAKKG
Replicase polyprotein 1a/1ab513–520KGKAKKGAMYH71KGKAKKGS
Replicase polyprotein 1a/1ab879–886VIKTLQPVLMOD31VIKTLKPV
Replicase polyprotein 1a/1ab963–970GATSAALQANKRD21, 2GAQSAALQ
Replicase polyprotein 1a/1ab1143–1150VLLAPLLSHJV1, 2, 3TLLAPLLS
Replicase polyprotein 1a/1ab1144–1151LLAPLLSAHJV1, 2, 3LLAPLLSG
Replicase polyprotein 1a/1ab1197–1204KQVEQKIAGOT11, 2KKVEQKIA
Replicase polyprotein 1a/1ab2246–2253STAALGVLSLC4A31, 2, 3STAVLGVL
Replicase polyprotein 1a/1ab2533–2540KGSLPINVTTN1, 2, 5, 12, 13, 7, 8, 4, 11KGSLPITV
Replicase polyprotein 1a/1ab2550–2557EESSAKSAMYH7B1EESKAKSA
Replicase polyprotein 1a/1ab2630–2637LSTFISAATENM21, 2LSTFFSAA
Replicase polyprotein 1a/1ab2757–2764KIALKGGKMYH61QIALKGGK
Replicase polyprotein 1a/1ab2757–2764KIALKGGKMYH71QIALKGGK
Replicase polyprotein 1a/1ab2758–2765IALKGGKIMYH61IALKGGKK
Replicase polyprotein 1a/1ab2758–2765IALKGGKIMYH71IALKGGKK
Replicase polyprotein 1a/1ab3908–3915FEKMVSLLMYH61EEKMVSLL
Replicase polyprotein 1a/1ab3908–3915FEKMVSLLMYH71EEKMVSLL
Replicase polyprotein 1a/1ab3909–3916EKMVSLLSMYH61EKMVSLLQ
Replicase polyprotein 1a/1ab3909–3916EKMVSLLSMYH71EKMVSLLQ
Replicase polyprotein 1a/1ab4137–4144VKLQNNELTBX201VKLTNNEL
Putative ORF9c protein47–54AAVGELLLASB101, 2, 3AAVVELLL
ORF7b protein14–21LAFLLFLVTMEM1822, 1, 3LAGLLFLV
ORF7a protein43–50NSPFHPLAFLNC1, 2NSPFHVLA
Nucleoprotein192–199NSSRNSTPCMYA51NSSRSSTP
Nucleoprotein374–381KKADETQAMYPN1EKADETQA
Nucleoprotein375–382KADETQALMYPN1KADETQAR
List of peptide pairs from SARS-CoV-2 proteins and human cardiac proteins that have a Hamming distance less than or equal to 1. Interestingly, one specific 8-mer from the SARS-CoV-2 Replicase polyprotein 1a/1ab (KIALKGGK) is identical to a mutant peptide encoded by the c.5410 C > A(Gln1804Lys) variation in human MYH6 (KIALKGGK), which is a subunit of a cardiac motor protein. This genetic variation has been identified in Africans/African Americans (0.08% prevalence), East Asians (0.3% prevalence), South Asians (0.06% prevalence) and Latino/Admixed Americans (0.003% prevalence) [22]. Analysis of peptides from IEDB shows that the non-mutated 7-mer in this peptide (IALKGGK) also overlaps with a known B-cell epitope from SARS-CoV-2 (IALKGGKIVNNWLKQ, IEDB ID:1039277) [25]. Whether the mimicry between SARS-CoV-2 Replicase polyprotein 1a/1ab and wild-type or mutant MYH6 contributes to cardiac inflammation in the setting of COVID-19 warrants further investigation.

SARS-CoV-2 variants harbor epitopes that are identical to peptides of cardiac proteins

We next examined whether SARS-CoV-2 evolution has given rise to variants that harbor peptides identical to human cardiac proteins. To this end, we analyzed 4.85 million SARS-CoV-2 genomes from over 200 countries obtained from the GISAID database [24]. We identified 21 8-mer peptides from SARS-CoV-2 variants that are identical to cardiac proteins encoded by reference sequences or genetic variants (Tables 2 and 3). Of these 21 peptides, the one present in the highest number of viral sequences was STAVLGVL, which mapped to the NSP3 protein (part of Replicase polyprotein sequence) in 4501 (0.09%) SARS-CoV-2 genomes. This sequence is present in a transmembrane helix of all three described isoforms of the cardiac protein SLC4A3 (UniProtKB: P48751), an anion exchange protein that is associated with short QT syndrome (Tables 2 and 3) [21, 26]. Although no epitopes containing this full sequence were present in IEDB, a BLAST search at 90% similarity identified a similar epitope (HEAQAVLGVLL) that is reported to bind to both HLA-B*40:1 and HLA-B*58:01 [25, 27]. We additionally analyzed the temporal and geographical emergence of the five variant SARS-CoV-2 8-mers that are identical to cardiac protein 8-mers and that occur in over 100 reported SARS-CoV-2 genomes (Supplementary Fig. S1). Variants with exactly matching 8-mers have occurred sporadically in different geographic locations, but they have not persisted over time (Supplementary Fig. S1).
Table 2

List of identical cardiac peptides found in SARS-CoV-2 variants in GISAID.

No. of GISAID entries with identical matchSARS-CoV-2 geneMimicked peptideApprox. start-end in SCOV2 proteinCardiac proteinCardiac protein start-endExact IEDB epitope matchIdentical IEDB epitope seq (HLA/antigen infor) [≥90% identity]
4501NSP3STAVLGVL1429–1436SLC4A3746–753NoHEAQAVLGVLL (HLA-B*40:01); HEAQAVLGVLL (HLA-B*40:01)
1322NSP2GTESLTKE166–173TNNI3K294–301NoNA
580NNSSRSSTP193–200CMYA588–95NoTINSSRSSQESY (B-cell epitopes, MHC ligands)
205NS9cAAVVELLL48–55ASB10307–314NoMVDPQLDGPQLAALA AVVELGSFDA ()
118NSP2KGKAKKGS334–341MYH7635–642YesAGADAPIEKGKGKAKKGSS (MHC ligand)
33NSP3VIKTLKPV60–67LMOD3514–521NoKVAIKTLKPGTMS (HLA-A*02:01);
TKVAIKTLKPGTMSPE (HLA-A*02:01)
17NSP3KKVEQKIA380–387GOT155–62GEKVEQKIEGKWVNEKKAQEDKLQ
(MHC Class I, II, B-cell epitope)
14NS7aNSPFHVLA44–51FLNC1727–1734NoNA
11NEKADETQA375–382MYPN89–96NoADETQALPQRQKKQQ (HLA Class II)
9NSP3TLLAPLLS304–311HJV409–416NFNQHEVLLAPLLS (B-cell epitope and MHC ligand)
6NSP3LSTFFSAA1811–1818TENM21036–1043NoAGTLSTFFGVPLVLT (HLA class II MHC restriction)
6NSP3LLAPLLSG327–334HJV410–417NoENFNQHEVLLAPLLS
(B cell and MHC)
6SpikeIGLTVLPP854–861FHOD3971–978NoGFIKQYGDCLGDIAA RDLICAQKFNGLTVL PPLLTDEMIAQYT (T cell, B cell and MHC ligands)
2NSP13QGPPGTGR282–289MYLK3373–380NoILYGPPGTGK (HLA-A*03:01)
1NSP15KPVPEEKI65–72TTN10277–10284NoEAPLYVVDKPVPEESE (HLA-DRB1*04:01)
1NSP3KGSLPITV1716–1722TTN5437–5444NoSLPITVYYAV (T cell, B cell and MHC ligands)

The NSP proteins are cleaved products of the replicase polyprotein.

Table 3

List of mutated cardiac peptide n-mers from human genetic variants identical to SARS-CoV-2 variants.

Human cardiac genersIDMutation consequenceMutated cardiac peptideWild-type cardiac peptideNo. of GISAID genomes (Pango lineage distribution in %)SARS-CoV-2 geneIEDB epitope exact matchIEDB epitope info (≥90 % seq identity)
FLNCrs374848954p.Val1732LeuNSPFHLLANSPFHVLA1661 (AY.4: 30.8% AY.44: 10.6%; B.1.1.7: 9.4%; AY.43: 6.85%;NS7aYesNSPFH (HLA-A*01:01)
LMOD3rs370869958p.Lys519ArgVIKTLRPVVIKTLKPV85 (B.1.617.2: 24.7%; AY.43: 20.98%; AY.4: 8.64%; B.1.1: 6.17%; B.1.1.7: 6.17%; P.1: 6.17%;)NSP3NoNA
TMEM182rs774398171p.Gly215ValLAVLLFLVLAGLLFLV21 AY.3: 57.14%; B.1.243: 14.28%; B.1.1.7: 9.52%; B.1: 4.76%)NS7bNoNA
MYLK3rs771870674p.Arg380CysQGPPGTGCQGPPGTGR1 (B.1)NSP13NoGPPGTGKSHFAIGLA (B cell, T cell, MHC ligand)

The NSP proteins are cleaved products of the replicase polyprotein.

List of identical cardiac peptides found in SARS-CoV-2 variants in GISAID. The NSP proteins are cleaved products of the replicase polyprotein. List of mutated cardiac peptide n-mers from human genetic variants identical to SARS-CoV-2 variants. The NSP proteins are cleaved products of the replicase polyprotein.

Discussion

Understanding the mechanistic basis of acute cardiac injury in COVID-19 patients is important to develop countermeasures. Viral infections have previously been proposed to trigger autoimmune reactions, and it has been hypothesized that molecular mimicry plays a role in mediating autoimmune reactions [7–9, 11–13, 28]. Here, a systematic analysis of gene expression from cardiac tissues, based on bulk RNA-seq data and single-cell RNA-seq data, combined with analysis of human genetic variation and SARS-CoV-2 genomes has led to the identification of candidate proteins and peptide regions therein that might be involved in immune cross-reactivity (Tables 2 and 3). These newly identified identical peptides expand the set of known shared peptides between the human proteome and SARS-CoV-2, such as the furin cleavage site “RRARSVAS” present in both human ENaC-ɑ and the Spike glycoprotein[10, 29, 30]. Further research is warranted to ascertain whether these mimicked peptides contribute to autoinflammatory pathology in the context of COVID-19 infection. Additionally, given the occurrence of myocarditis in some individuals shortly after receiving an mRNA COVID-19 vaccine, the potential for molecular mimicry between the antigen encoded by these vaccines (i.e., the pre-fusion stabilized Spike glycoprotein) and human cardiac proteins should be evaluated[4–6, 31, 32]. There are a few limitations to this study. First, the human proteins that we surveyed were shortlisted based on their overexpression in cardiac tissue. There could be mimicked proteins that are shared between cardiac tissues and other tissues that are not accounted for in the current analysis. Second, there are other mechanisms that could contribute to autoimmunity after a viral infection such as bystander activation, epitope spreading, and viral persistence [33]. Third, the presence of identical peptides in cardiac proteins and the SARS-CoV-2 proteome could occur due to chance. Comparing all SARS-CoV-2 8-mers to a set of brain enriched proteins and a set of skin enriched proteins shows a similar probability distribution of Hamming distance (Supplementary Fig. S2), suggesting that the observed similarity with SARS-CoV-2 peptides is not specific to human cardiac proteins. Fourth, it is possible that peptides with lower degrees of similarity could contribute to immunologic mimicry, as T cells can be highly cross-reactive against different major histocompatibility complex (MHC)-presented peptides [34-38]. Taken together, by studying the intersection of human genetic variation in cardiac proteins and SARS-CoV-2 evolution, we have identified candidates of molecular mimicry that have the potential to contribute to cardiac inflammation in the context of COVID-19. It will be important to perform follow-up functional studies evaluating the potential of SARS-CoV-2 reactive T cells and antibodies (e.g., from active or recovering COVID-19 patients) to cross-react with these peptides. Thus, we propose that SARS-CoV-2 variants harboring peptides identical to host heart-enriched proteins should be studied as “viral variants of cardiac interest”. We highlight that a similar strategy can be applied to identify and categorize plausible mimicry candidates from any human tissues that are targeted by other autoimmune responses in COVID-19 patients.

Methods

Identification of proteins enriched in cardiac tissue

Bulk RNA-sequencing (RNA-seq) data was accessed from the Genotype-Tissue Expression (GTEx) project V8 [17]. For each sample, FASTQ files were processed using Salmon (in mapping-based mode) to quantify gene expression in transcripts per million (TPM). Specifically, the expression of each transcript isoform was first determined by passing FASTQ files to Salmon quant with the following parameters passed: validateMappings, rangeFactorizationBins 4, gcBias, biasSpeedSamp 10. All isoforms are then summed via a transcript-to-gene map, generating a gene-level expression value. GRCh38 was used as the reference, including cDNA and non-coding RNA. For single-cell RNA-seq studies, processed count matrices were accessed from Gene Expression Omnibus or other publicly available data repositories. There were two datasets analyzing heart tissues which captured cardiomyocytes, our main cell type of interest for this report [20, 39]. Other datasets captured a wide variety of immune, stromal, and parenchymal cell types from tissues including the respiratory tract, gastrointestinal tract, genitourinary tract, hepatobiliary system, skeletal muscle, brain, skin, eyes, and endocrine organs. Each dataset was processed using Scrublet and Seurat v3.0 as described previously [18, 40–42]. Cell type annotations were obtained from associated metadata files if available; otherwise, annotation was performed manually, guided by the cell types reported in the associated publication. To identify genes that are overexpressed in cardiac tissue, we calculated fold change and Cohen’s D values between defined sample cohorts. For bulk RNA-seq data, Cohort A was defined as all GTEx heart samples (n = 861), and Cohort B was defined as all remaining GTEx samples except for those derived from skeletal muscle (n = 15,718). For single-cell RNA-seq data, Cohort A was defined as all cells annotated as cardiomyocytes (n ~8900 cells), and Cohort B was defined as all other cells from all processed studies (n ~2.5 million cells). Fold change and Cohen’s D were calculated as follows: Cohen’s D = , where the pooled standard deviation SDpooled is defined as: SD = , where NCohort A and NCohort B are the number of samples in Cohorts A and B, respectively, and SDCohort A and SDCohort B are the standard deviation of TPM values for the given gene in Cohorts A and B, respectively. Genes with fold change ≥5 and Cohen’s D ≥ 0.5 from either the bulk or single-cell RNA-seq analysis were considered to be enriched in cardiac tissue. In the volcano plots used to visualize these analyses, we filtered to genes with a TPM or CP10K value ≥1 in either Cohort A or Cohort B, and genes meeting the criteria for overexpression in heart or cardiomyocytes are colored in red. For a control analysis, we also identified genes overexpressed in the brain or skin by bulk RNA-sequencing from the GTEx project. We used the same approach as described above, except that Cohort A was defined as either all brain samples (n = 2351) or all skin samples (n = 1305), and Cohort B was defined as all other samples.

Comparison of 8-mers from reference sequences of cardiac proteins and SARS-CoV-2 proteins

The translated proteome from reference to the SARS-CoV-2 genome (NC_045512.2) was downloaded from UniProt (https://covid-19.uniprot.org/) [21, 43]. A sliding window approach was used to enumerate all 8-mers from the 17 proteins in this viral proteome. Similarly, we used a sliding window approach to generate all 8-mers from the reference amino acid sequences of the previously defined 144 cardiac proteins, including the canonical isoforms and all described isoforms indicated in UniProt. We then performed a pairwise comparison of all 8-mers in these two groups by calculating the Hamming distance using the stringdist function from the stringdist package (version 0.9.8) in R (version 4.0.3). In a control analysis, we used the same approach to calculate the Hamming distance between all SARS-CoV-2 8-mers and the control sets of 369 human proteins enriched in the brain or 198 human proteins enriched in the skin (described above).

Assessing the impact of human and SARS-CoV-2 variants on cardiac peptide matches

To assess the impact of human genetic variation on potential molecular mimicry, we retrieved all missense variants from the gnomAD database for the previously identified cardiac proteins that had at least one 8-mer similar to a peptide in the SARS-CoV-2 reference proteome (Hamming distance = 1) [22]. We used the gnomad-api (https://gnomad.broadinstitute.org/api) to fetch the variant calls from the gnomad_r2_1 version from the Human GRCh37 genome assembly. The variants in this gnomad version (GRCh37/hg19) are derived from 125,748 exome sequences and 15,708 whole-genome sequences from unrelated individuals sequenced as part of various disease-specific and population genetic studies. For any variants that alter the amino acid sequence of a potentially mimicked peptide, we determined whether the mutation resulted in an exact match (Hamming distance = 0) to the corresponding 8-mer from the SARS-CoV-2 reference proteome. To assess the impact of viral evolution on potential molecular mimicry, we queried the cardiac 8-mers with Hamming distance of 1 (including any alterations of these 8-mers arising from human genetic variation as described above) against all protein variants encoded in 4,854,709 SARS-CoV-2 genomes deposited in the GISAID database (last accessed 11/5/2021) [24]. Here, we determined whether any mutations in viral genomes (relative to the reference sequence) resulted in 8-mers which exactly match one or more cardiac peptides.

Evaluation of mimicked peptides for inclusion in immune epitopes

For any 8-mers which showed an exact match between a cardiac peptide (reference or variant sequences) and a SARS-CoV-2 peptide (reference or variant sequences), we queried the 8-mer using the Immune Epitope Database (IEDB; www.iedb.org) and Analysis Resource [23]. We searched for any linear peptide epitope with a Blast similarity of at least 90% from any human host that had positive experimental evidence in any assay (T cell, B cell, or MHC Ligand). No MHC class restrictions or disease filters were applied. Supplementary Material
  42 in total

Review 1.  Autoimmunity and molecular mimicry in the pathogenesis of post-streptococcal heart disease.

Authors:  Madeleine W Cunningham
Journal:  Front Biosci       Date:  2003-05-01

2.  A single autoimmune T cell receptor recognizes more than a million different peptides.

Authors:  Linda Wooldridge; Julia Ekeruche-Makinde; Hugo A van den Berg; Anna Skowera; John J Miles; Mai Ping Tan; Garry Dolton; Mathew Clement; Sian Llewellyn-Lacey; David A Price; Mark Peakman; Andrew K Sewell
Journal:  J Biol Chem       Date:  2011-11-18       Impact factor: 5.157

3.  Acute Myocarditis Following COVID-19 mRNA Vaccination in Adults Aged 18 Years or Older.

Authors:  Anthony Simone; John Herald; Aiyu Chen; Neil Gulati; Albert Yuh-Jer Shen; Bruno Lewin; Ming-Sum Lee
Journal:  JAMA Intern Med       Date:  2021-12-01       Impact factor: 44.409

4.  Benchmarking evolutionary tinkering underlying human-viral molecular mimicry shows multiple host pulmonary-arterial peptides mimicked by SARS-CoV-2.

Authors:  A J Venkatakrishnan; Nikhil Kayal; Praveen Anand; Andrew D Badley; George M Church; Venky Soundararajan
Journal:  Cell Death Discov       Date:  2020-10-02

5.  The universal protein resource (UniProt).

Authors: 
Journal:  Nucleic Acids Res       Date:  2007-11-27       Impact factor: 16.971

6.  Molecular mimicry: good artists copy, great artists steal.

Authors:  Uthaman Gowthaman; Veraragavan P Eswarakumar
Journal:  Virulence       Date:  2013-07-17       Impact factor: 5.882

7.  Knowledge synthesis of 100 million biomedical documents augments the deep expression profiling of coronavirus receptors.

Authors:  A J Venkatakrishnan; Arjun Puranik; Akash Anand; David Zemmour; Xiang Yao; Xiaoying Wu; Ramakrishna Chilaka; Dariusz K Murakowski; Kristopher Standish; Bharathwaj Raghunathan; Tyler Wagner; Enrique Garcia-Rivera; Hugo Solomon; Abhinav Garg; Rakesh Barve; Anuli Anyanwu-Ofili; Najat Khan; Venky Soundararajan
Journal:  Elife       Date:  2020-05-28       Impact factor: 8.140

8.  The Immune Epitope Database (IEDB): 2018 update.

Authors:  Randi Vita; Swapnil Mahajan; James A Overton; Sandeep Kumar Dhanda; Sheridan Martini; Jason R Cantrell; Daniel K Wheeler; Alessandro Sette; Bjoern Peters
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

9.  The mutational constraint spectrum quantified from variation in 141,456 humans.

Authors:  Konrad J Karczewski; Laurent C Francioli; Grace Tiao; Beryl B Cummings; Jessica Alföldi; Qingbo Wang; Ryan L Collins; Kristen M Laricchia; Andrea Ganna; Daniel P Birnbaum; Laura D Gauthier; Harrison Brand; Matthew Solomonson; Nicholas A Watts; Daniel Rhodes; Moriel Singer-Berk; Eleina M England; Eleanor G Seaby; Jack A Kosmicki; Raymond K Walters; Katherine Tashman; Yossi Farjoun; Eric Banks; Timothy Poterba; Arcturus Wang; Cotton Seed; Nicola Whiffin; Jessica X Chong; Kaitlin E Samocha; Emma Pierce-Hoffman; Zachary Zappala; Anne H O'Donnell-Luria; Eric Vallabh Minikel; Ben Weisburd; Monkol Lek; James S Ware; Christopher Vittal; Irina M Armean; Louis Bergelson; Kristian Cibulskis; Kristen M Connolly; Miguel Covarrubias; Stacey Donnelly; Steven Ferriera; Stacey Gabriel; Jeff Gentry; Namrata Gupta; Thibault Jeandet; Diane Kaplan; Christopher Llanwarne; Ruchi Munshi; Sam Novod; Nikelle Petrillo; David Roazen; Valentin Ruano-Rubio; Andrea Saltzman; Molly Schleicher; Jose Soto; Kathleen Tibbetts; Charlotte Tolonen; Gordon Wade; Michael E Talkowski; Benjamin M Neale; Mark J Daly; Daniel G MacArthur
Journal:  Nature       Date:  2020-05-27       Impact factor: 69.504

10.  Outcomes of Cardiovascular Magnetic Resonance Imaging in Patients Recently Recovered From Coronavirus Disease 2019 (COVID-19).

Authors:  Valentina O Puntmann; M Ludovica Carerj; Imke Wieters; Masia Fahim; Christophe Arendt; Jedrzej Hoffmann; Anastasia Shchendrygina; Felicitas Escher; Mariuca Vasa-Nicotera; Andreas M Zeiher; Maria Vehreschild; Eike Nagel
Journal:  JAMA Cardiol       Date:  2020-11-01       Impact factor: 14.676

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.