João H C Campos1, Gustavo V Alves1, Juliana T Maricato2, Carla T Braconi2, Fernando M Antoneli1, Luiz Mario R Janini2, Marcelo R S Briones1.
Abstract
The epitranscriptomics of the SARS-CoV-2 infected cell reveals its response to viral replication. Among the various types of RNA nucleotide modifications, m6A is the most common and is involved in several crucial processes: RNA intracellular localization, maturation, half-life and translatability. This epitranscriptome contains a mixture of viral RNAs and cellular transcripts. In a previous study we presented the analysis of SARS-CoV-2 RNA m6A methylation based on direct RNA sequencing and characterized DRACH motif mutations in different viral lineages. Here we present the analysis of m6A transcript methylation in Vero cells (derived from African green monkeys) and Calu-3 cells (human) upon infection by SARS-CoV-2, using direct RNA sequencing data. Analysis of these data by nonparametric statistics and two computational methods (m6anet and EpiNano) shows that m6A levels are higher in RNAs of infected cells. Functional enrichment analysis reveals increased m6A methylation of transcripts involved in translation and in peptide and amine metabolism. This analysis allowed the identification of differentially methylated transcripts and of m6A sites unique to the infected cell transcripts. The results presented here indicate that the cell response to viral infection changes not only the levels of mRNAs, as previously shown, but also its epitranscriptional pattern. In addition, transcriptome-wide analysis shows strong nucleotide biases in DRACH motifs of cellular transcripts, both in Vero and Calu-3 cells, which use the signature GGACU, whereas in viral RNAs the signature is GAACU. We hypothesize that the differences in DRACH motif biases might force the convergent evolution of the viral genome, resulting in better adaptation to the target sequence preferences of writer, reader and eraser enzymes. To our knowledge, this is the first report on the m6A epitranscriptome of SARS-CoV-2 infected Vero cells by direct RNA sequencing, which is the sensu stricto RNA-seq.
Keywords: RNA methylation; SARS-CoV-2 Genome; direct RNA sequencing; epigenetics; epitranscriptome; m6A
Year: 2022 PMID: 36051243 PMCID: PMC9425070 DOI: 10.3389/fcimb.2022.906578
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 6.073
Distribution of read lengths, qualities and mapping in different datasets analyzed in this study.
| Dataset | Total raw reads | Average percent identity | Fraction of bases aligned | Mean read length | Mean read quality | Median percent identity | Median read length | Median read quality | Number of mapped reads | Read length N50 | STDEV read length | Total bases | Total bases aligned |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Uninfected Vero ( | 1,439,291 | 89.7 | 0.8 | 1,084.3 | 10.6 | 90.8 | 831 | 10.7 | 1,045,491 | 1,371 | 829.1 | 1,133,635,322 | 922,214,398 |
| Infected Vero | 879,679 | 89.3 | 0.8 | 1,059.6 | 10.4 | 90.4 | 798 | 10.5 | 189,127 | 1,333 | 881.3 | 200,397,442 | 160,417,377 |
| Infected Vero ( | 680,347 | 90.1 | 0.8 | 1,067.4 | 10.8 | 91.3 | 803 | 11 | 390,641 | 1,408 | 843 | 416,982,983 | 340,053,176 |
| Infected Vero ( | 22,601 | 88.5 | 0.8 | 849 | 11.6 | 89.1 | 732 | 11.7 | 11,727 | 874 | 379.1 | 9,956,327 | 8,336,639 |
| SARS-CoV-2 in lysate ( | 879,679 | 90.8 | 1 | 2,585.5 | 10.9 | 91.6 | 1,738 | 11.2 | 645,942 | 3,440 | 2,512.6 | 1,670,076,092 | 1,613,107,683 |
| SARS-CoV-2 in lysate ( | 680,347 | 90.8 | 1 | 1,827.1 | 11 | 91.4 | 1,577 | 11.3 | 210,202 | 2,575 | 1,602.7 | 384,054,811 | 373,263,396 |
| SARS-CoV-2 in lysate ( | 22,601 | 91.1 | 1 | 1,262.8 | 11.7 | 91.6 | 1,128 | 11.8 | 7,842 | 1,602 | 698.9 | 9,902,573 | 9,636,856 |
| SARS-CoV-2/Supernatant ( | 430,923 | 88.8 | 1 | 1,083.5 | 9.9 | 89.6 | 811 | 10.1 | 18,266 | 1,515 | 950.6 | 19,790,887 | 19,270,761 |
| SARS-CoV-2/Supernatant ( | 1,488,392 | 89.6 | 1 | 1,376.6 | 10.8 | 90.2 | 1,091 | 11 | 1,721 | 1,905 | 1,103.4 | 2,369,172 | 2,271,417 |
| Uninfected Calu-3 ( | 952,606 | 89.5 | 0.9 | 1,111.7 | 10.7 | 90.4 | 864 | 10.8 | 916,464 | 1,434 | 799.9 | 1,018,788,064 | 962,906,633 |
| Infected Calu-3 ( | 1,068,683 | 89.6 | 0.9 | 1,106.4 | 10.8 | 90.4 | 865 | 10.9 | 935,132 | 1,428 | 789.9 | 1,034,668,826 | 976,315,097 |
| SARS-CoV-2 Calu-3 ( | 1,070,290 | 89.7 | 0.9 | 1,653 | 11 | 90.5 | 1,524 | 11.2 | 98,204 | 2,399 | 1,418 | 162,327,257 | 150,141,191 |
| Uninfected Vero Cell ( | 1,452,561 | 88.3 | 0.8 | 1,123.1 | 10.4 | 89.3 | 845 | 10.4 | 1,262,145 | 1,473 | 855.5 | 1,417,475,246 | 1,143,645,900 |
All data are from Vero cell samples except for the three rows indicating samples from Calu-3 cells.
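Of the statistics tabulated above, the read length N50 is the least self-explanatory: it is the length L such that reads of length L or longer together contain at least half of all sequenced bases. The following is a minimal Python sketch under that usual definition; the function name and the toy read lengths are illustrative assumptions and are not taken from the study's pipeline.

```python
def read_length_n50(lengths):
    """Return the N50 of a collection of read lengths.

    Reads are visited from longest to shortest; the N50 is the length of
    the read at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths): 1,500 bases in total.
# 500 + 400 = 900 bases is the first running total to reach 750.
toy_lengths = [100, 200, 300, 400, 500]
print(read_length_n50(toy_lengths))  # 400
```

Because long reads dominate the base count, the N50 is always at least as large as the median read length, which is consistent with every row of the table.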
Distribution of m6A sites in the epitranscriptome of the infected Vero cell sample from (Campos et al., 2021).
| Transcript stable ID | Gene coordinate | Gene name | Gene description | Transcript position | Number of reads | Probability of modification |
|---|---|---|---|---|---|---|
| ENSCSAT00000015767.1 | 28:16195607-16199048 | | – | 758 | 31 | 0.9920354 |
| ENSCSAT00000001773.1 | 11:86509722-86519010 | | lumican | 1106 | 49 | 0.971132 |
| ENSCSAT00000011971.1 | 22:13362375-13364975 | | – | 554 | 40 | 0.9563033 |
| ENSCSAT00000009109.1 | 19:7000793-7001757 | | macrophage migration inhibitory factor | 435 | 33 | 0.8981868 |
| ENSCSAT00000015767.1 | 28:16195607-16199048 | | – | 966 | 41 | 0.87318575 |
| ENSCSAT00000008980.1 | 23:54222838-54238006 | | secreted protein acidic and cysteine rich | 1997 | 37 | 0.8047131 |
| ENSCSAT00000017330.1 | 20:88010616-88015477 | | ribosomal protein S8 | 1082 | 37 | 0.8037766 |
| ENSCSAT00000010859.1 | 21:54602398-54632780 | | collagen type I alpha 2 chain | 3830 | 43 | 0.77824545 |
| ENSCSAT00000011971.1 | 22:13362375-13364975 | | – | 546 | 31 | 0.770081 |
| ENSCSAT00000007846.1 | 15:68596968-68599389 | | – | 685 | 33 | 0.7656064 |
| ENSCSAT00000005189.1 | 26:48105216-48110896 | | actin alpha cardiac muscle 1 | 1632 | 30 | 0.73731476 |
| ENSCSAT00000000028.1 | MT:10751-12128 | | mitochondrially encoded NADH:ubiquinone oxidoreductase core subunit 4 | 960 | 31 | 0.733373 |
| ENSCSAT00000007475.1 | 12:60353817-60357759 | | – | 420 | 46 | 0.7216429 |
| ENSCSAT00000009292.1 | 5:1859454-1862305 | | ribosomal protein S2 | 667 | 99 | 0.71115077 |
| ENSCSAT00000011687.1 | 8:68898693-68901812 | | ribosomal protein L7 | 998 | 34 | 0.7100177 |
| ENSCSAT00000010859.1 | 21:54602398-54632780 | | collagen type I alpha 2 chain | 4270 | 36 | 0.6952559 |
| ENSCSAT00000013675.1 | 21:14525637-14528513 | | myosin light chain 7 | 730 | 43 | 0.6849283 |
| ENSCSAT00000018059.1 | 12:21401910-21402595 | | – | 675 | 42 | 0.6834394 |
| ENSCSAT00000011706.1 | 26:14002075-14004474 | | ribosomal protein lateral stalk subunit P1 | 198 | 53 | 0.66291857 |
| ENSCSAT00000001773.1 | 11:86509722-86519010 | | lumican | 654 | 45 | 0.6566237 |
| ENSCSAT00000013859.1 | 21:12905280-12914205 | | insulin like growth factor binding protein 3 | 2371 | 31 | 0.63804483 |
| ENSCSAT00000006068.1 | 23:82798085-82805936 | | receptor for activated C kinase 1 | 1250 | 40 | 0.6353171 |
| ENSCSAT00000018668.1 | 18:70826193-70826794 | | – | 218 | 41 | 0.60119855 |
| ENSCSAT00000000697.1 | 16:66963460-66968731 | | – | 783 | 34 | 0.59995514 |
| ENSCSAT00000010859.1 | 21:54602398-54632780 | | collagen type I alpha 2 chain | 3800 | 39 | 0.58406496 |
| ENSCSAT00000011706.1 | 26:14002075-14004474 | | ribosomal protein lateral stalk subunit P1 | 513 | 56 | 0.58338195 |
| ENSCSAT00000013859.1 | 21:12905280-12914205 | | insulin like growth factor binding protein 3 | 2053 | 38 | 0.5830461 |
| ENSCSAT00000018059.1 | 12:21401910-21402595 | | – | 75 | 49 | 0.57314473 |
| ENSCSAT00000008980.1 | 23:54222838-54238006 | | secreted protein acidic and cysteine rich | 1943 | 38 | 0.5681411 |
| ENSCSAT00000016933.1 | 6:50663431-50670230 | | ribosomal protein S5 | 398 | 45 | 0.55939466 |
| ENSCSAT00000019107.1 | 16:59453767-59454246 | | – | 8 | 49 | 0.5383941 |
| ENSCSAT00000006068.1 | 23:82798085-82805936 | | receptor for activated C kinase 1 | 1030 | 37 | 0.537564 |
| ENSCSAT00000013016.1 | 4:76392625-76396331 | | – | 275 | 43 | 0.52486014 |
| ENSCSAT00000000555.1 | 6:42666353-42671124 | | – | 1015 | 90 | 0.5181154 |
| ENSCSAT00000013675.1 | 21:14525637-14528513 | | myosin light chain 7 | 519 | 43 | 0.50369656 |
“–” indicates non-annotated genes in the reference genome.
Differentially methylated known genes obtained by comparison of uninfected cell (U) and infected cell (I) datasets from (Kim et al., 2020).
| Transcript stable ID | Gene coordinate | Gene name | Gene description | Position (U) | Reads (U) | Prob. (U) | Position (I) | Reads (I) | Prob. (I) |
|---|---|---|---|---|---|---|---|---|---|
| ENSCSAT00000000076.1 | 11:118913078-118928281 | | transmembrane p24 trafficking protein 2 | 884 | 189 | 0.90227556 | 1394 | 33 | 0.80859786 |
| ENSCSAT00000000076.1 | 11:118913078-118928281 | | transmembrane p24 trafficking protein 2 | 1703 | 251 | 0.8100335 | – | – | – |
| ENSCSAT00000000890.1 | 6:41640636-41650080 | | KDEL endoplasmic reticulum protein retention receptor 1 | 1107 | 237 | 0.8039204 | 1640 | 38 | 0.8297022 |
| ENSCSAT00000001690.1 | 1:77213836-77324875 | | phosphatidylinositol binding clathrin assembly protein | 173 | 49 | 0.8412698 | 895 | 33 | 0.8023325 |
| ENSCSAT00000001942.1 | 1:66788240-66799072 | | serpin family H member 1 | 1561 | 224 | 0.96509045 | 2082 | 31 | 0.94127834 |
| ENSCSAT00000001942.1 | 1:66788240-66799072 | | serpin family H member 1 | 436 | 87 | 0.8628933 | – | – | – |
| ENSCSAT00000001942.1 | 1:66788240-66799072 | | serpin family H member 1 | 1233 | 178 | 0.8468479 | – | – | – |
| ENSCSAT00000002495.1 | 7:30805749-30810720 | | heterogeneous nuclear ribonucleoprotein D like | 2078 | 30 | 0.9407665 | 1576 | 68 | 0.85807246 |
| ENSCSAT00000003445.1 | 9:95672648-95689247 | | tripartite motif containing 8 | 1412 | 46 | 0.9639772 | 2315 | 31 | 0.83026695 |
| ENSCSAT00000003445.1 | 9:95672648-95689247 | | tripartite motif containing 8 | 1568 | 88 | 0.94590265 | – | – | – |
| ENSCSAT00000003445.1 | 9:95672648-95689247 | | tripartite motif containing 8 | 1486 | 69 | 0.90411556 | – | – | – |
| ENSCSAT00000003903.1 | 16:33752556-33799102 | | clathrin heavy chain | 3347 | 241 | 0.8372275 | 4409 | 40 | 0.8006732 |
| ENSCSAT00000003903.1 | 16:33752556-33799102 | | clathrin heavy chain | 252 | 139 | 0.8231322 | – | – | – |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | 870 | 76 | 0.9798185 | 2058 | 42 | 0.87876064 |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | – | – | – | 1657 | 37 | 0.8740219 |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | 1396 | 108 | 0.8597186 | – | – | – |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | 2907 | 34 | 0.83353394 | – | – | – |
| ENSCSAT00000004698.1 | 10:117477904-117486774 | | nucleolin | 2309 | 274 | 0.8647666 | 2014 | 33 | 0.87197196 |
| ENSCSAT00000005041.1 | 5:20128176-20175564 | | methyltransferase like 9 | 847 | 248 | 0.801356 | 992 | 32 | 0.8637458 |
| ENSCSAT00000005327.1 | 14:96801349-96828716 | | protein disulfide isomerase family A member 6 | 471 | 290 | 0.90395755 | 1509 | 50 | 0.8532138 |
| ENSCSAT00000005327.1 | 14:96801349-96828716 | | protein disulfide isomerase family A member 6 | 1723 | 490 | 0.8176958 | – | – | – |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 623 | 37 | 0.994408 | 479 | 30 | 0.96617603 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | – | – | – | 916 | 34 | 0.8894909 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | – | – | – | 463 | 30 | 0.8717268 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | – | – | – | 940 | 38 | 0.809437 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 686 | 65 | 0.970193 | – | – | – |
| ENSCSAT00000008192.1 | 8:122282630-122288384 | | MYC proto-oncogene, bHLH transcription factor | 1336 | 38 | 0.9933568 | 2231 | 42 | 0.8794924 |
| ENSCSAT00000008192.1 | 8:122282630-122288384 | | MYC proto-oncogene, bHLH transcription factor | 2248 | 89 | 0.8645151 | – | – | – |
| ENSCSAT00000009503.1 | 9:32202204-32258154 | | integrin subunit beta 1 | 758 | 338 | 0.80759764 | 1725 | 55 | 0.8165182 |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 2434 | 39 | 0.99647045 | 3301 | 30 | 0.95335376 |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | – | – | – | 3136 | 34 | 0.82845277 |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 3165 | 90 | 0.9791137 | – | – | – |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 2372 | 37 | 0.93598187 | – | – | – |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 3394 | 71 | 0.86949 | – | – | – |
| ENSCSAT00000013859.1 | 21:12905280-12914205 | | insulin like growth factor binding protein 3 | 1093 | 343 | 0.8282119 | 1378 | 404 | 0.8150883 |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5866 | 69 | 0.9969953 | 7534 | 31 | 0.884178 |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5434 | 41 | 0.984392 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5226 | 39 | 0.98384225 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5129 | 38 | 0.98288566 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 6065 | 61 | 0.9710264 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 6453 | 81 | 0.9707899 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 6145 | 73 | 0.96241 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5332 | 43 | 0.91376925 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 5659 | 62 | 0.8688911 | – | – | – |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 6577 | 99 | 0.80982953 | – | – | – |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 2060 | 114 | 0.98355216 | 2315 | 32 | 0.93140286 |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 2302 | 173 | 0.98312795 | – | – | – |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 1977 | 135 | 0.94395334 | – | – | – |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 2010 | 156 | 0.89053315 | – | – | – |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 2093 | 173 | 0.8239867 | – | – | – |
| ENSCSAT00000017703.1 | 20:116843218-116863010 | | EF-hand domain family member D2 | 232 | 39 | 0.8788946 | 1959 | 40 | 0.8244804 |
| ENSCSAT00000018156.1 | 14:87249005-87249592 | | ras homolog family member B | 260 | 71 | 0.8870451 | 122 | 31 | 0.8521231 |
| ENSCSAT00000018156.1 | 14:87249005-87249592 | | ras homolog family member B | 359 | 167 | 0.8001116 | – | – | – |
| ENSCSAT00000018928.1 | 6:11479407-11480450 | | JunB proto-oncogene, AP-1 transcription factor subunit | 1032 | 89 | 0.9923255 | 99 | 36 | 0.9681243 |
| ENSCSAT00000018928.1 | 6:11479407-11480450 | | JunB proto-oncogene, AP-1 transcription factor subunit | – | – | – | 893 | 60 | 0.9514262 |
Reads = coverage; Prob. = probability of m6A methylation as calculated by the m6anet program.
Unique m6A sites in known genes of the infected sample, obtained by comparing the uninfected and infected cell datasets (Kim et al., 2020).
| Transcript stable ID | Gene coordinate | Gene name | Gene description | Transcript position | Number of reads | Probability of modification |
|---|---|---|---|---|---|---|
| ENSCSAT00000000076.1 | 11:118913078-118928281 | | transmembrane p24 trafficking protein 2 | 1394 | 33 | 0.80859786 |
| ENSCSAT00000000813.1 | 6:42108267-42112471 | | protein phosphatase 1 regulatory subunit 15A | 1760 | 30 | 0.85488987 |
| ENSCSAT00000000813.1 | 6:42108267-42112471 | | protein phosphatase 1 regulatory subunit 15A | 2022 | 35 | 0.8545834 |
| ENSCSAT00000000890.1 | 6:41640636-41650080 | | KDEL endoplasmic reticulum protein retention receptor 1 | 1640 | 38 | 0.8297022 |
| ENSCSAT00000001690.1 | 1:77213836-77324875 | | phosphatidylinositol binding clathrin assembly protein | 895 | 33 | 0.8023325 |
| ENSCSAT00000001942.1 | 1:66788240-66799072 | | serpin family H member 1 | 2082 | 31 | 0.94127834 |
| ENSCSAT00000002458.1 | 1:56799838-56813650 | | eukaryotic translation initiation factor 3 subunit F | 503 | 44 | 0.80193645 |
| ENSCSAT00000002495.1 | 7:30805749-30810720 | | heterogeneous nuclear ribonucleoprotein D like | 1576 | 68 | 0.85807246 |
| ENSCSAT00000003445.1 | 9:95672648-95689247 | | tripartite motif containing 8 | 2315 | 31 | 0.83026695 |
| ENSCSAT00000003903.1 | 16:33752556-33799102 | | clathrin heavy chain | 4409 | 40 | 0.8006732 |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | 2058 | 42 | 0.87876064 |
| ENSCSAT00000004018.1 | 16:31833160-31900963 | | protein phosphatase, Mg2+/Mn2+ dependent 1D | 1657 | 37 | 0.8740219 |
| ENSCSAT00000004351.1 | 5:26221337-26225083 | | Tu translation elongation factor, mitochondrial | 1336 | 35 | 0.8196384 |
| ENSCSAT00000004698.1 | 10:117477904-117486774 | | nucleolin | 2014 | 33 | 0.87197196 |
| ENSCSAT00000005041.1 | 5:20128176-20175564 | | methyltransferase like 9 | 992 | 32 | 0.8637458 |
| ENSCSAT00000005327.1 | 14:96801349-96828716 | | protein disulfide isomerase family A member 6 | 1509 | 50 | 0.8532138 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 479 | 30 | 0.96617603 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 916 | 34 | 0.8894909 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 463 | 30 | 0.8717268 |
| ENSCSAT00000006347.1 | 17:41712528-41714126 | | immediate early response 3 | 940 | 38 | 0.809437 |
| ENSCSAT00000006678.1 | 22:52391813-52417575 | | ribophorin I | 1615 | 36 | 0.8343291 |
| ENSCSAT00000007637.1 | 23:75149710-75200192 | | ATPase H+ transporting V0 subunit e1 | 845 | 44 | 0.8655202 |
| ENSCSAT00000007678.1 | 23:74929863-74934532 | | dual specificity phosphatase 1 | 1649 | 49 | 0.9319972 |
| ENSCSAT00000007678.1 | 23:74929863-74934532 | | dual specificity phosphatase 1 | 870 | 33 | 0.86039066 |
| ENSCSAT00000007678.1 | 23:74929863-74934532 | | dual specificity phosphatase 1 | 1281 | 41 | 0.80852 |
| ENSCSAT00000008192.1 | 8:122282630-122288384 | | MYC proto-oncogene, bHLH transcription factor | 2231 | 42 | 0.8794924 |
| ENSCSAT00000009011.1 | 24:72156784-72163500 | | serpin family A member 1 | 1109 | 40 | 0.84530336 |
| ENSCSAT00000009503.1 | 9:32202204-32258154 | | integrin subunit beta 1 | 1725 | 55 | 0.8165182 |
| ENSCSAT00000011100.1 | 6:309506-321785 | | basigin (Ok blood group) | 1931 | 31 | 0.90602887 |
| ENSCSAT00000011100.1 | 6:309506-321785 | | basigin (Ok blood group) | 1863 | 30 | 0.81299096 |
| ENSCSAT00000012419.1 | 4:90497754-90605481 | | calpastatin | 1944 | 60 | 0.8325281 |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 3301 | 30 | 0.95335376 |
| ENSCSAT00000012714.1 | 25:24016930-24036641 | | NUAK family kinase 2 | 3136 | 34 | 0.82845277 |
| ENSCSAT00000012734.1 | 14:45086920-45111271 | | chaperonin containing TCP1 subunit 4 | 685 | 55 | 0.8591496 |
| ENSCSAT00000013859.1 | 21:12905280-12914205 | | insulin like growth factor binding protein 3 | 1378 | 404 | 0.8150883 |
| ENSCSAT00000013875.1 | 21:11292788-11554933 | | tensin 3 | 7534 | 31 | 0.884178 |
| ENSCSAT00000014245.1 | 4:54629194-54636032 | | polo like kinase 2 | 210 | 36 | 0.8035949 |
| ENSCSAT00000015796.1 | 28:15672442-15694148 | | KDEL endoplasmic reticulum protein retention receptor 2 | 790 | 50 | 0.83011115 |
| ENSCSAT00000015906.1 | 28:13491483-13519507 | | cytochrome P450 family 3 subfamily A member 5 | 2297 | 52 | 0.81895936 |
| ENSCSAT00000016112.1 | 28:12047529-12060202 | | serpin family E member 1 | 2083 | 139 | 0.8208885 |
| ENSCSAT00000017104.1 | 20:47783935-47785997 | | cellular communication network factor 1 | 411 | 104 | 0.8570591 |
| ENSCSAT00000017358.1 | 20:90002030-90008033 | | solute carrier family 2 member 1 | 2315 | 32 | 0.93140286 |
| ENSCSAT00000017703.1 | 20:116843218-116863010 | | EF-hand domain family member D2 | 1959 | 40 | 0.8244804 |
| ENSCSAT00000018156.1 | 14:87249005-87249592 | | ras homolog family member B | 122 | 31 | 0.8521231 |
| ENSCSAT00000018928.1 | 6:11479407-11480450 | | JunB proto-oncogene, AP-1 transcription factor subunit | 99 | 36 | 0.9681243 |
| ENSCSAT00000018928.1 | 6:11479407-11480450 | | JunB proto-oncogene, AP-1 transcription factor subunit | 893 | 60 | 0.9514262 |
| ENSCSAT00000019531.1 | 14:37013880-37014947 | | poly(rC) binding protein 1 | 866 | 58 | 0.90854704 |
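The comparison behind this table can be sketched as a set difference over (transcript, position) pairs. The Python sketch below is a hedged illustration and not the authors' pipeline: the cutoffs `min_prob=0.8` and `min_reads=30` are assumptions chosen only because every tabulated site meets them, matching sites by exact transcript position is likewise an assumption, and the `Site` type and function names are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    transcript: str  # transcript stable ID
    position: int    # transcript position of the candidate m6A site
    reads: int       # coverage at the site
    prob: float      # probability of modification (e.g. from m6anet)


def confident(sites, min_prob=0.8, min_reads=30):
    """Keep (transcript, position) keys passing the ASSUMED cutoffs."""
    return {
        (s.transcript, s.position)
        for s in sites
        if s.prob >= min_prob and s.reads >= min_reads
    }


def unique_to_infected(uninfected, infected, **cutoffs):
    """Sites called in the infected sample but not in the uninfected one."""
    return sorted(confident(infected, **cutoffs) - confident(uninfected, **cutoffs))


# Values for one transcript, copied from the differential methylation table.
tid = "ENSCSAT00000000076.1"
uninfected = [Site(tid, 884, 189, 0.90227556), Site(tid, 1703, 251, 0.8100335)]
infected = [Site(tid, 1394, 33, 0.80859786)]
print(unique_to_infected(uninfected, infected))
# [('ENSCSAT00000000076.1', 1394)]
```

For this transcript the sketch returns position 1394, which is the first entry of the table above.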
Figure 1. Summary of the number of m6A sites, transcripts and known genes in the host cell epitranscriptome data analyzed here. The Venn diagrams show the number of m6A sites in uninfected and infected Vero cells (derived from African green monkeys) (Kim et al., 2020) (A), and in uninfected and infected Calu-3 cells (human derived) (Chang et al., 2021) (B). In each panel, the numbers of m6A sites, transcripts and known genes are displayed for each host cell type.
Figure 2. Violin plots of the distributions of differentially methylated transcripts in uninfected and infected Vero cell datasets, obtained with the program m6anet. The areas indicate the data distribution of each sample and the horizontal bars in the middle of the areas indicate the medians. The effect sizes and p-values were obtained by the Wilcoxon-Mann-Whitney test as described in Material and Methods. The abscissas indicate the samples and the ordinates indicate the quantity S = log(methylated sites/transcript), the base-10 logarithm of the proportion of methylated sites per transcript, for all transcript types in each dataset.
Figure 3. Violin plots of the distributions of differentially methylated transcripts in uninfected and infected Vero cell datasets, obtained with the program EpiNano. The areas indicate the data distribution of each sample and the horizontal bars in the middle of the areas indicate the medians. The effect sizes and p-values were obtained by the Wilcoxon-Mann-Whitney test as described in Material and Methods. The abscissas indicate the samples and the ordinates indicate the quantity S = log(methylated sites/transcript), the base-10 logarithm of the proportion of methylated sites per transcript, for all transcript types in each dataset.
Results of the Wilcoxon-Mann-Whitney test (WMW) for comparison of differentially methylated transcripts between Uninfected (U) and Infected (I) Vero cell datasets using programs m6anet and EpiNano as indicated.
| Comparison (WMW) | # Transcripts | Effect Size | Confidence (95%) | p-value |
|---|---|---|---|---|
| Kim (U) x Kim (I) | 1871 x 137 | 0.259 | 0.145 - 0.371 | 1.04 × 10⁻⁵ |
| Kim (U) x Taiaroa (I) | 1871 x 544 | 0.099 | 0.039 - 0.159 | 1.13 × 10⁻³ |
| Chang (U) x Kim (I) | 2372 x 137 | 0.231 | 0.117 - 0.344 | 7.77 × 10⁻⁵ |
| Chang (U) x Taiaroa (I) | 2372 x 544 | 0.072 | 0.013 - 0.131 | 1.58 × 10⁻² |
| * Kim (U) x Chang (U) | 1871 x 2372 | 0.026 | -0.011 - 0.065 | 0.168 |
| | | | | |
| Kim (U) x Kim (I) | 1064 x 119 | 0.353 | 0.240 - 0.465 | 3.433 × 10⁻⁹ |
| Kim (U) x Taiaroa (I) | 1064 x 302 | 0.146 | 0.069 - 0.222 | 1.636 × 10⁻⁴ |
| Chang (U) x Kim (I) | 1566 x 119 | 0.314 | 0.200 - 0.428 | 1.632 × 10⁻⁷ |
| Chang (U) x Taiaroa (I) | 1566 x 302 | 0.107 | 0.032 - 0.182 | 4.927 × 10⁻³ |
| * Kim (U) x Chang (U) | 1064 x 1566 | 0.037 | -0.008 - 0.084 | 0.111 |
(*) Comparison between two uninfected Vero cell datasets. Datasets are as indicated in Table 1: Kim (Kim et al., 2020), Taiaroa (Taiaroa et al., 2020) and Chang (Chang et al., 2021).
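To make the quantities in this comparison concrete, the following Python sketch computes S for a transcript and runs a two-sided Wilcoxon-Mann-Whitney test on two samples of S values. It is a simplified illustration under stated assumptions, not the procedure from Material and Methods: the denominator of S is passed in explicitly because the exact normalization is not given here, the p-value uses the normal approximation with tie correction and no continuity correction, the effect size and confidence interval reported in the table are not reproduced, and all function names are hypothetical.

```python
import math


def log10_site_fraction(n_methylated, n_candidates):
    """S = log10(methylated sites / transcript); the denominator is an
    ASSUMPTION (e.g. candidate sites or transcript length)."""
    return math.log10(n_methylated / n_candidates)


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value).

    Normal approximation with tie correction, no continuity correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j and share their average rank.
        avg_rank = (i + 1 + j) / 2
        group = j - i
        tie_term += group ** 3 - group
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a shift
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))


# Toy S values (hypothetical), not data from the study.
uninfected_s = [log10_site_fraction(k, 1000) for k in (1, 2, 3)]
infected_s = [log10_site_fraction(k, 1000) for k in (4, 5, 6)]
u, p = mann_whitney_u(uninfected_s, infected_s)
print(u, round(p, 4))
```

In practice a tested library routine such as `scipy.stats.mannwhitneyu` would be preferable; the pure-Python version is shown only to expose the rank arithmetic.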
Figure 4. Methylated DRACH motifs reveal different sequence profiles in the cellular epitranscriptome and the viral epigenome. DRACH sequences containing predicted m6A sites (plus 5 nucleotides at each end) were aligned and stacked together to provide an overview of the informational content of methylated regions. Motif profiles in epitranscriptomes were obtained from uninfected Vero cells (A), and from different samples of infected Vero cells (B–D). DRACH profiles in SARS-CoV-2 epigenomes were obtained from different samples of Vero cell lysates (E–G), and from different samples of supernatants (H, I). Motif profiles in epitranscriptomes were obtained from uninfected Calu-3 cells (J) and infected Calu-3 cells (K). DRACH profiles in the SARS-CoV-2 epigenome were obtained from infected Calu-3 cells (L). Direct RNA sequencing data were obtained from Kim et al., 2020 (in A, B, E) and Taiaroa et al., 2020 (in C, F, H). Sequencing data obtained in the Campos et al., 2021 study are presented in (D, G, I). Data from Chang et al., 2021 on Calu-3 cells are presented in (J–L). Numbers in the abscissa indicate the DRACH motif positions (from 6 to 10) and the flanking 5’ (from 1 to 5) and 3’ (from 11 to 15) positions. The ordinates indicate the score in bits as it deviates from the null hypothesis; in other words, the stronger the bias, the higher the score. A and C have 100% frequency in positions 8 and 9 of all DRACH motifs analyzed, whereas positions with zero or near-zero scores indicate that the four canonical bases are at the equilibrium frequency f = 0.25 in the same sampling space.
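The score described in this caption can be illustrated with a short Python sketch. It assumes the standard sequence-logo definition of information content, 2 minus the Shannon entropy of the base frequencies at each aligned position, without small-sample correction; whether the figure was produced with exactly this formula is an assumption. The DRACH pattern follows the IUPAC codes (D = A/G/U, R = A/G, H = A/C/U), and the function names and toy sequences are hypothetical.

```python
import math
import re
from collections import Counter

# DRACH in the RNA alphabet; the lookahead lets overlapping motifs be reported.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")


def find_drach(seq):
    """Return (0-based start, motif) for every DRACH occurrence in seq."""
    rna = seq.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in DRACH.finditer(rna)]


def information_content(aligned):
    """Bits per column of equal-length aligned sequences: 2 - entropy."""
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    bits = []
    for column in zip(*aligned):
        counts = Counter(column)
        n = len(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(2.0 - entropy)
    return bits


print(find_drach("CCGGACUCC"))  # [(2, 'GGACU')]

# Toy alignment of four motifs (hypothetical, not data from the study).
motifs = ["GGACU", "GAACU", "AGACA", "UGACC"]
print(information_content(motifs))
```

In the toy alignment the invariant A and C columns score the maximum of 2 bits, while a column with all four bases at frequency 0.25 would score 0, mirroring the two extremes described in the caption.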
Figure 5. Functional enrichment analysis of the infected Vero cell m6A epitranscriptome (dataset from Campos et al., 2021). A total of 24 transcripts common to infected cells was used in the enrichment analysis, with Gene Ontology and KEGG biological pathways as data sources for overrepresentation. The analysis was performed with default gProfiler web server options, using the g:SCS algorithm to compute the multiple testing correction for p-values. Terms are grouped by data source (Gene Ontology classifications or KEGG biological pathways). The GO categories are in the left columns, green bars indicate the -log10 of the p-value, and blue and black squares indicate significant positive hits of the transcript IDs (vertical top columns) with GO categories. GO:MF, Molecular Function; GO:BP, Biological Process; GO:CC, Cellular Component; KEGG, Kyoto Encyclopedia of Genes and Genomes.