Sequence complementarity between human noncoding RNAs and SARS-CoV-2 genes: What are the implications for human health?

Rossella Talotta, Shervin Bahrami, Magdalena Janina Laska.

Abstract

OBJECTIVES: To investigate in silico the presence of nucleotide sequence complementarity between the RNA genome of Severe Acute Respiratory Syndrome CoronaVirus-2 (SARS-CoV-2) and human non-coding (nc)RNA genes.
METHODS: The FASTA sequence (NC_045512.2) of each of the 11 SARS-CoV-2 isolate Wuhan-Hu-1 genes was retrieved from NCBI.nlm.nih.gov/gene and the Ensembl.org library interrogated for any base-pair match with human ncRNA genes. SARS-CoV-2 gene-matched human ncRNAs were screened for functional activity using bioinformatic analysis. Finally, associations between identified ncRNAs and human diseases were searched in GWAS databases.
RESULTS: A total of 252 matches were found between the nucleotide sequence of SARS-CoV-2 genes and human ncRNAs. With the exception of two small nuclear RNAs, all of them were long non-coding (lnc)RNAs expressed mainly in testis and central nervous system under physiological conditions. The percentage of alignment ranged from 91.30% to 100% with a mean nucleotide alignment length of 17.5 ± 2.4. Thirty-three (13.09%) of them contained predicted R-loop forming sequences, but none of these intersected the complementary sequences of SARS-CoV-2. However, in 31 cases matches fell on ncRNA regulatory sites, whose adjacent coding genes are mostly involved in cancer, immunological and neurological pathways. Similarly, several polymorphic variants of detected non-coding genes have been associated with neuropsychiatric and proliferative disorders.
CONCLUSION: This pivotal in silico study shows that SARS-CoV-2 genes have Watson-Crick nucleotide complementarity to human ncRNA sequences, potentially disrupting ncRNA epigenetic control of target genes. It remains to be elucidated whether this could result in the development of human disease in the long term.
Copyright © 2021. Published by Elsevier B.V.

Keywords:  Bioinformatics; COVID-19; Epigenetics; Long non-coding RNA; SARS-CoV-2; Small nuclear RNA

Year:  2021        PMID: 34662705      PMCID: PMC8518135          DOI: 10.1016/j.bbadis.2021.166291

Source DB:  PubMed          Journal:  Biochim Biophys Acta Mol Basis Dis        ISSN: 0925-4439            Impact factor:   5.187


Introduction

The COronaVIrus Disease-19 (COVID-19) pandemic, which broke out in December 2019 [1], continues to challenge the health and economic systems of countries worldwide. The infection, caused by Severe Acute Respiratory Syndrome CoronaVirus-2 (SARS-CoV-2), is characterized by a multifaceted and unpredictable clinical presentation, which considerably complicates management, and by a higher risk of hospitalization and mortality compared to seasonal influenza [2]. Besides Acute Respiratory Distress Syndrome (ARDS) and disseminated intravascular coagulation (DIC), which are the most threatening complications, many other manifestations have been reported to occur in the short or medium term in SARS-CoV-2-infected or recovered individuals. These include immunological disorders [3], cardiac arrhythmias [4], neurological complications [5] and the occurrence of dysmetabolic conditions, such as diabetes mellitus [6]. On the other hand, the long-term effects of COVID-19 on human health are still undetermined. SARS-CoV-2 infection may trigger a cascade of events in the host, ranging from activation of the innate and acquired immune response [3] to coagulopathy [7] and pro-fibrotic pathways [8]. Indeed, the immune system plays a central role in coordinating the various steps of COVID-19 pathogenesis. Both viral proteins and nucleic acids are highly immunogenic and therefore capable of inducing and perpetuating inflammation [9,10]. The exaggerated immune response that occurs in predisposed individuals in response to SARS-CoV-2 infection could eventually lead to immune-mediated disorders, cancer or cardiovascular disease. The development of autoimmunity or autoinflammation may follow an external trigger, such as a viral infection, and several cases of autoimmune disorders have been reported in COVID-19 patients [3]. Similarly, the role of viruses in the induction of oncogenesis is well known [11].
Viruses may directly or indirectly favor cancer cell transformation by producing oncogenic proteins, chronically stimulating immune cells, and evading tumor suppressor signaling. Similarly, cumulative evidence suggests that viruses may be considered as new players in the pathogenesis of neurodegenerative and cardiovascular diseases [12–14]. Indeed, the interaction between host and virus is crucial for the containment or spread of infection. It is now clear that viruses may establish a nucleic acid crosstalk within host cells based on the production of non-coding (nc)RNAs. More specifically, viral genomes or transcripts may interact with ncRNAs produced by target cells or produce ncRNAs themselves, ultimately affecting viral lifecycle and antiviral response [15]. NcRNAs represent an area that has only recently been rediscovered. They are now widely recognized as protagonists of several human diseases, including cancer, autoimmunity and neurodegenerative disorders [16–18], all of which may be initially triggered by infections. By definition, these transcripts are unable to code for proteins and function mainly as epigenetic controllers of crucial cellular processes, such as proliferation, differentiation, migration and apoptosis [17]. Similar to other viruses, it is very likely that SARS-CoV-2 infection can induce or accelerate the progression of oncological, immunological, neurological or cardiovascular diseases. These events likely result from an epigenetic imbalance between the SARS-CoV-2 genome and host non-coding transcripts. SARS-CoV-2 is an enveloped RNA virus that has a single-stranded positive-sense genome of about 30,000 nucleotides [19]. Sequencing of the Wuhan-Hu-1 SARS-CoV-2 genome has revealed the presence of 11 coding genes, namely ORF1ab, ORF3a, ORF6, ORF7a, ORF7b, ORF8, ORF10, S, E, M and N (NCBI.nlm.nih.gov/gene, reference sequence NC_045512.2).
In addition, SARS-CoV-2 could also produce ncRNAs, as shown in a recent computational study [20]. The aberrant expression of human ncRNAs following SARS-CoV-2 infection is supported by a number of recent studies [21–23]. However, it has not been clarified whether sequences within the SARS-CoV-2 genome could be directly complementary to human ncRNAs and interfere with associated pathways. Testing this hypothesis would be critical to understanding the potential long-term impact of COVID-19 on human health. Indeed, the physical interaction of SARS-CoV-2 sequences with host ncRNAs could, over time, lead to epigenetic disruption of physiological cellular cascades, which in turn could be a precursor of human disease.

Aim

The primary objective of this in silico study was to evaluate the presence of Watson-Crick nucleotide sequence complementarity between the RNA genome of SARS-CoV-2 and human ncRNA genes. Secondary outcomes were the functional characterization of the detected ncRNAs and the evaluation of their potential associations with human pathological conditions.

Methods

Identification of SARS-CoV-2-complementary human ncRNA genes

The FASTA sequence of each of the 11 genes of the SARS-CoV-2 isolate Wuhan-Hu-1 (ORF1ab, ORF3a, ORF6, ORF7a, ORF7b, ORF8, ORF10, S, E, M, N) was retrieved from NCBI Reference Sequence: NC_045512.2 (https://www.ncbi.nlm.nih.gov/gene/?term=NC_045512). The nucleotide sequence of each SARS-CoV-2 gene was reversed using an online bioinformatics tool (https://www.bioinformatics.org/sms/rev_comp.html) [24] and used as a key input to search for matching human ncRNA genes in the Ensembl.org library (Human GRCh38.p13) [25]. Briefly, we queried the Ensembl.org database by entering the nucleotide sequence of the SARS-CoV-2 transcripts and selecting 100 as the maximum number of hits to be reported, 10 as the maximum E-value for reported alignments, and the range 1–3 as match/mismatch scores. BLASTN analysis was performed for human ncRNA genes only.
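The sequence manipulation described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool used in the study: the function names and the 12-nt example fragment are hypothetical. It shows three related operations (reverse, complement, reverse complement) and a simple check for antiparallel Watson-Crick pairing, under the assumption that sequences are written 5'→3' and that U and T are interchangeable.

```python
# Illustrative helpers only; names and the example fragment are hypothetical.
# Maps each base to its Watson-Crick partner (DNA alphabet output; U is read as T).
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse(seq: str) -> str:
    """Return the sequence read backwards (no base substitution)."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the base-by-base complement, keeping the original order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement read backwards (the antiparallel partner strand)."""
    return complement(seq)[::-1]


def is_watson_crick_pair(a: str, b: str) -> bool:
    """True if b pairs with a base-for-base in antiparallel orientation."""
    normalized_b = b.upper().replace("U", "T")
    return len(a) == len(b) and reverse_complement(a).upper() == normalized_b


fragment = "ATGGCCTTAGCA"  # hypothetical fragment, not from NC_045512.2
print(reverse(fragment))
print(complement(fragment))
print(reverse_complement(fragment))
```

A sequence prepared this way would then be submitted as a BLASTN query; the search itself is not reproduced here.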

Analysis of molecular interactions and biological function of retrieved human ncRNAs

Human ncRNAs matching SARS-CoV-2 sequences were analyzed for their functional activity and molecular interactions by consulting the bioinformatics tools freely available online: Ensembl.org [25] for genomic location, annotation of neighboring genes and detection of variants or regulatory sites, UCSC Genome Browser GRCh38/hg38 (http://genome.ucsc.edu) for genomic location and annotation of neighboring genes, RNAct [26], RNAInter (https://www.rna-society.org/rnainter/home.html) [27] and IntaRNA (http://rna.informatik.uni-freiburg.de/IntaRNA/Input.jsp) [28] for prediction of protein–RNA, RNA-RNA and DNA-RNA interactions. Human ncRNA FASTA sequences complementary to SARS-CoV-2 genes were also entered as input to the online tool R-loop Forming Sequence (RLFS) finder (http://rloop.bii.a-star.edu.sg/?pg=qmrlfs-finder) [29] and QmRLFS models m1 and m2 were selected.

Analysis of polymorphic variants of detected human ncRNAs and associated diseases

Associations between polymorphic variants of retrieved ncRNAs and human diseases were searched in the NHGRI-EBI Genome-Wide Association Study (GWAS) Catalog (https://www.ebi.ac.uk/gwas) [30], GeneCards (https://www.genecards.org) and Genome Aggregation Database (gnomAD) [31], which also provided information, when available, on tissue and intracellular localization of ncRNAs.

Results

Detection of human ncRNAs showing a sequence complementarity to SARS-CoV-2 genes

A total of 252 matches were found between SARS-CoV-2 genes and human ncRNAs (ORF1ab: 28, ORF3a: 9, ORF6: 50, ORF7a: 31, ORF7b: 16, ORF8: 23, ORF10: 5, S: 24, E: 17, M: 32, N: 17), with percentage of alignment ranging from 91.30% to 100% and a mean nucleotide alignment length of 17.5 ± 2.4 (Table 1 and Supplementary Files S1 and S2). Specifically, SARS-CoV-2 genes overlapped with the transcripts of 130 long non-coding (lnc)RNAs and two small nuclear (sn)RNAs. Thirty-eight (28.7%) and 32 (24.2%) of the identified ncRNA transcripts were reported to be expressed under physiological conditions in testis and central nervous system, respectively. Because many of them are still poorly characterized, cellular localization was available only for the SARS-CoV-2 E gene-matching lncRNA COX10-AS1 (nucleus); the SARS-CoV-2 ORF6 gene-matching lncRNAs SLFN12L (cytosol, cell membranes), NUTM2A-AS1 (extracellular), MEG8 (nucleus), and LINC02872 (mitochondrion); the SARS-CoV-2 M gene-matching lncRNA KIAA1614-AS1 (cytoskeleton); the SARS-CoV-2 ORF7a gene-matching lncRNA FAM30A (plasma membrane, extracellular, nucleus and cytosol); and the SARS-CoV-2 ORF10 gene-matching lncRNA MIR100HG (plasma membrane, extracellular, nucleus and cytosol). Two lncRNAs showed complementarity to more than one SARS-CoV-2 gene: AC010198.2 matched nucleotide sequences within the ORF1ab, N and ORF6 genes, and XACT matched sequences within the S and ORF7b genes.
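The per-gene counts quoted above can be tallied to confirm the reported total. The sketch below is a bookkeeping check only, using the numbers from the text. The variable names are illustrative, and the denominator used for the tissue-expression percentages (the 132 ncRNA genes) is an assumption, since the text does not state it explicitly.

```python
# Per-gene match counts as quoted in the Results text.
matches_per_gene = {
    "ORF1ab": 28, "ORF3a": 9, "ORF6": 50, "ORF7a": 31, "ORF7b": 16,
    "ORF8": 23, "ORF10": 5, "S": 24, "E": 17, "M": 32, "N": 17,
}

total_matches = sum(matches_per_gene.values())

# 130 lncRNAs plus 2 snRNAs, as stated in the text.
ncrna_genes = 130 + 2


def share(count: int, total: int) -> float:
    """Percentage of `count` over `total`."""
    return 100 * count / total


print(f"total matches: {total_matches}")
for gene, count in sorted(matches_per_gene.items(), key=lambda kv: -kv[1]):
    print(f"{gene:>7}: {count:3d} ({share(count, total_matches):.1f}% of matches)")

# Assumption: tissue percentages are computed over the 132 ncRNA genes.
print(f"testis: {share(38, ncrna_genes):.2f}%")
print(f"central nervous system: {share(32, ncrna_genes):.2f}%")
```

Under that assumption the central nervous system figure rounds to the reported 24.2%, while the testis figure comes out slightly above the reported 28.7%, which would be consistent with truncation rather than rounding.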
Table 1

List of the human ncRNAs (gene and transcripts) displaying a nucleotide sequence complementarity to SARS-CoV-2 genes.

SARS-CoV-2 gene | Human ncRNA (transcript, gene) | Chromosome | Base-pairing (nt) | Alignment % | Type
ORF1ab | ENST00000548564.1, LINC02354 | 12 | 21 | 100% | lncRNA
ORF1ab | ENST00000550720.5, LINC02354 | 12 | 21 | 100% | lncRNA
ORF1ab | ENST00000550909.1, LINC02354 | 12 | 21 | 100% | lncRNA
ORF1ab | ENST00000546523.1, LINC02354 | 12 | 21 | 100% | lncRNA
ORF1ab | ENST00000550684.1, LINC02354 | 12 | 21 | 100% | lncRNA
ORF1ab | ENST00000651612.1, AC095060.1 | 4 | 25 | 96% | lncRNA
ORF1ab | ENST00000653602.1, AC000065.1 | 7 | 20 | 100% | lncRNA
ORF1ab | ENST00000650674.1, AL162253.2 | 9 | 20 | 100% | lncRNA
ORF1ab | ENST00000663562.1, DIRC3 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000663156.2, DIRC3 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000654616.2, DIRC3 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000474063.5, DIRC3 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000657418.1, DIRC3-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000655995.1, DIRC3-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000600489.1, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000610954.4, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000630532.2, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000609532.5, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000609890.5, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000628535.2, MYO3B-AS1 | 2 | 19 | 100% | lncRNA
ORF1ab | ENST00000660742.1, AC009107.2 | 16 | 19 | 100% | lncRNA
ORF1ab | ENST00000656934.1, AC009107.2 | 16 | 19 | 100% | lncRNA
ORF1ab | ENST00000569580.2, AC009107.2 | 16 | 19 | 100% | lncRNA
ORF1ab | ENST00000650198.1, AC010198.2 | 12 | 19 | 100% | lncRNA
ORF1ab | ENST00000648050.1, AC010198.2 | 12 | 19 | 100% | lncRNA
ORF1ab | ENST00000562691.2, AC010168.2 | 12 | 19 | 100% | lncRNA
ORF1ab | ENST00000654499.1, AL133166.1 | 14 | 19 | 100% | lncRNA
ORF1ab | ENST00000425058.1, AP001136.1 | 21 | 27 | 92.59% | lncRNA
S | ENST00000665074.1, AL118523.1 | 20 | 22 | 100% | lncRNA
S | ENST00000668185.1, AL118523.1 | 20 | 22 | 100% | lncRNA
S | ENST00000444436.1, AL118523.1 | 20 | 22 | 100% | lncRNA
S | ENST00000654444.1, AC009230.1 | 2 | 27 | 92.59% | lncRNA
S | ENST00000622355.1, AL139260.2 | 1 | 19 | 100% | lncRNA
S | ENST00000325660.3, ZNRF3-AS1 | 22 | 19 | 100% | lncRNA
S | ENST00000654363.1, AL606970.5 | 6 | 18 | 100% | lncRNA
S | ENST00000511921.2, AC034199.1 | 5 | 18 | 100% | lncRNA
S | ENST00000654434.1, AC068989.1 | 4 | 28 | 92.86% | lncRNA
S | ENST00000674361.1, XACT | X | 18 | 100% | lncRNA
S | ENST00000657367.1, AC092447.5 | 7 | 23 | 95.65% | lncRNA
S | ENST00000669438.1, AC092447.5 | 7 | 23 | 95.65% | lncRNA
S | ENST00000429367.1, AC092447.5 | 7 | 23 | 95.65% | lncRNA
S | ENST00000559783.2, AC104574.2 | 15 | 18 | 100% | lncRNA
S | ENST00000668041.1, LINC01515 | 10 | 18 | 100% | lncRNA
S | ENST00000601926.6, LINC01515 | 10 | 18 | 100% | lncRNA
S | ENST00000670657.1, LINC01515 | 10 | 18 | 100% | lncRNA
S | ENST00000667597.1, LINC01515 | 10 | 18 | 100% | lncRNA
S | ENST00000562669.1, AC110597.1 | 18 | 18 | 100% | lncRNA
S | ENST00000657322.1, LINC01515 | CHR_HSCHR10_1 | 18 | 100% | lncRNA
S | ENST00000634691.2, LINC01515 | CHR_HSCHR10_1 | 18 | 100% | lncRNA
S | ENST00000658144.1, LINC01515 | CHR_HSCHR10_1 | 18 | 100% | lncRNA
S | ENST00000665231.1, LINC01515 | CHR_HSCHR10_1 | 18 | 100% | lncRNA
S | ENST00000575446.1, AC110597.2 | CHR_HSCHR18_2 | 18 | 100% | lncRNA
N | ENST00000659452.1, AC092957.1 | 3 | 22 | 95.45% | lncRNA
N | ENST00000506892.1, AC008667.3 | 5 | 18 | 100% | lncRNA
N | ENST00000668999.1, AC008555.2 | 19 | 18 | 100% | lncRNA
N | ENST00000671069.1, AC096577.1 | 4 | 22 | 95.45% | lncRNA
N | ENST00000506386.1, AC096577.1 | 4 | 22 | 95.45% | lncRNA
N | ENST00000506148.5, AC096577.1 | 4 | 22 | 95.45% | lncRNA
N | ENST00000605778.1, AC018647.2 | 7 | 18 | 100% | lncRNA
N | ENST00000670642.1, CCNT2-AS1 | 2 | 17 | 100% | lncRNA
N | ENST00000659940.1, LINC01358 | 1 | 17 | 100% | lncRNA
N | ENST00000635571.1, LINC01358 | 1 | 17 | 100% | lncRNA
N | ENST00000649638.1, AC008170.1 | 2 | 17 | 100% | lncRNA
N | ENST00000648050.1, AC010198.2 | 12 | 17 | 100% | lncRNA
N | ENST00000665899.1, LINC01033 | 5 | 17 | 100% | lncRNA
N | ENST00000665869.1, LINC01033 | 5 | 17 | 100% | lncRNA
N | ENST00000662351.1, LINC01033 | 5 | 17 | 100% | lncRNA
N | ENST00000667636.1, LINC01033 | 5 | 17 | 100% | lncRNA
N | ENST00000630399.1, INE2 | X | 21 | 95.24% | lncRNA
E | ENST00000430640.1, AL449983.1 | 9 | 23 | 95.65% | lncRNA
E | ENST00000660804.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000656685.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000652924.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000664394.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000664612.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000623598.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000653162.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000661551.1, COX10-AS1 | 17 | 17 | 100% | lncRNA, TEC
E | ENST00000428283.5, AC092162.2 | 2 | 17 | 100% | lncRNA, retained intron
E | ENST00000445785.6, LINC00102 | X | 17 | 100% | lncRNA
E | ENST00000577698.1, AC005332.1 | 17 | 16 | 100% | lncRNA
E | ENST00000608299.1, AF250324.1 | 4 | 16 | 100% | lncRNA
E | ENST00000664890.1, AL022098.1 | 6 | 16 | 100% | lncRNA
E | ENST00000625875.1, AF250324.2 | CHR_HSCHR4_6 | 16 | 100% | lncRNA
E | ENST00000626001.1, AF250324.4 | CHR_HSCHR4_7 | 16 | 100% | lncRNA
E | ENST00000627559.1, AF250324.6 | CHR_HSCHR4_3 | 16 | 100% | lncRNA
ORF8 | ENST00000584544.5, LINC02864 | 18 | 19 | 100% | lncRNA, retained intron
ORF8 | ENST00000664364.1, LINC02465 | 4 | 17 | 100% | lncRNA
ORF8 | ENST00000654940.1, AC093765.3 | 4 | 21 | 95.24% | lncRNA
ORF8 | ENST00000563286.1, AC107068.1 | 4 | 17 | 100% | lncRNA
ORF8 | ENST00000662591.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000668168.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000664410.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000469218.6, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000608654.6, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000671527.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000665453.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000626474.3, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000470712.2, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000656335.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000498432.6, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000666244.1, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000468646.6, LINC00877 | 3 | 16 | 100% | lncRNA
ORF8 | ENST00000650029.1, LINC00251 | 8 | 16 | 100% | lncRNA
ORF8 | ENST00000502083.2, AC107959.1 | 8 | 16 | 100% | lncRNA
ORF8 | ENST00000654493.1, MCM3AP-AS1 | 21 | 16 | 100% | lncRNA
ORF8 | ENST00000421927.1, MCM3AP-AS1 | 21 | 16 | 100% | lncRNA
ORF8 | ENST00000444998.1, MCM3AP-AS1 | 21 | 16 | 100% | lncRNA
ORF8 | ENST00000432735.5, MCM3AP-AS1 | 21 | 16 | 100% | lncRNA
ORF6 | ENST00000652420.1, CDKN2B-AS1 | 9 | 18 | 100% | lncRNA
ORF6 | ENST00000468603.7, CDKN2B-AS1 | 9 | 18 | 100% | lncRNA
ORF6 | ENST00000651507.1, SLFN12L | 17 | 17 | 100% | lncRNA
ORF6 | ENST00000457356.9, MSC-AS1 | 8 | 17 | 100% | lncRNA
ORF6 | ENST00000655314.1, MSC-AS1 | 8 | 17 | 100% | lncRNA
ORF6 | ENST00000610270.1, AC027271.1 | 4 | 17 | 100% | lncRNA
ORF6 | ENST00000661271.1, CHROMR | 2 | 21 | 95.24% | lncRNA
ORF6 | ENST00000665039.1, CHROMR | 2 | 21 | 95.24% | lncRNA
ORF6 | ENST00000438049.5, LINC00689 | 7 | 16 | 100% | lncRNA
ORF6 | ENST00000658288.1, AC091544.4 | 15 | 20 | 95% | lncRNA
ORF6 | ENST00000654742.1, AC091544.4 | 15 | 20 | 95% | lncRNA
ORF6 | ENST00000620192.1, AC091544.4 | 15 | 20 | 95% | lncRNA
ORF6 | ENST00000665942.1, AC091544.2 | 15 | 20 | 95% | lncRNA
ORF6 | ENST00000612985.1, RNVU1-4 | 1 | 16 | 100% | snRNA
ORF6 | ENST00000425211.5, FAM106A | 17 | 15 | 100% | lncRNA, retained intron
ORF6 | ENST00000665060.1, AC239809.3 | 1 | 15 | 100% | lncRNA
ORF6 | ENST00000655320.1, LINC01965 | 2 | 15 | 100% | lncRNA
ORF6 | ENST00000607671.1, WAKMAR2 | 6 | 15 | 100% | lncRNA
ORF6 | ENST00000448942.5, WAKMAR2 | 6 | 15 | 100% | lncRNA
ORF6 | ENST00000515337.1, AC008691.1 | 5 | 20 | 95% | lncRNA
ORF6 | ENST00000602934.3, LINC02532 | 6 | 15 | 100% | lncRNA
ORF6 | ENST00000660173.1, LINC02208 | 5 | 15 | 100% | lncRNA
ORF6 | ENST00000669704.1, ZBED5-AS1 | 11 | 15 | 100% | lncRNA
ORF6 | ENST00000670949.1, AC055807.1 | 15 | 15 | 100% | lncRNA
ORF6 | ENST00000607979.1, AL365434.2 | 10 | 15 | 100% | lncRNA
ORF6 | ENST00000654503.1, NUTM2A-AS1 | 10 | 15 | 100% | lncRNA
ORF6 | ENST00000638012.2, MEG8 | 14 | 15 | 100% | lncRNA
ORF6 | ENST00000668725.1, MEG8 | 14 | 15 | 100% | lncRNA
ORF6 | ENST00000646849.1, AC103718.1 | 8 | 15 | 100% | lncRNA
ORF6 | ENST00000648050.1, AC010198.2 | 12 | 19 | 94.74% | lncRNA
ORF6 | ENST00000654635.1, LMCD1-AS1 | 3 | 15 | 100% | lncRNA
ORF6 | ENST00000441861.5, LMCD1-AS1 | 3 | 15 | 100% | lncRNA
ORF6 | ENST00000660413.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000665927.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000663312.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000662259.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000670507.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000659481.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000659250.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000659794.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000666213.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000669638.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000652440.1, LINC01446 | 7 | 15 | 100% | lncRNA
ORF6 | ENST00000558940.1, MGC15885 | 15 | 15 | 100% | lncRNA
ORF6 | ENST00000568092.1, AC126323.6 | 15 | 15 | 100% | lncRNA
ORF6 | ENST00000609599.1, AC009570.1 | 4 | 15 | 100% | lncRNA
ORF6 | ENST00000623052.1, LINC02872 | 9 | 19 | 94.74% | lncRNA
ORF6 | ENST00000659662.1, AP001021.1 | 18 | 15 | 100% | lncRNA
ORF6 | ENST00000664630.1, AP001021.1 | 18 | 15 | 100% | lncRNA
ORF6 | ENST00000375713.1, AL359649.1 | 13 | 15 | 100% | lncRNA
M | ENST00000523083.1, AC015909.2 | 17 | 18 | 100% | lncRNA
M | ENST00000611237.1, LINC02809 | 1 | 18 | 100% | lncRNA
M | ENST00000623471.1, LINC02809 | 1 | 18 | 100% | lncRNA
M | ENST00000563931.1, AC135012.1 | 16 | 18 | 100% | lncRNA
M | ENST00000661161.1, TMEM30A-DT | 6 | 18 | 100% | lncRNA
M | ENST00000585065.1, AC015813.1 | 17 | 16 | 100% | lncRNA
M | ENST00000577267.1, AC015813.1 | 17 | 16 | 100% | lncRNA
M | ENST00000582096.1, AC015813.1 | 17 | 16 | 100% | lncRNA
M | ENST00000415647.1, KIAA1614-AS1 | 1 | 25 | 92% | lncRNA
M | ENST00000435411.6, LINC01934 | 2 | 16 | 100% | lncRNA
M | ENST00000564619.1, AP000997.3 | 11 | 16 | 100% | lncRNA
M | ENST00000511013.2, LINC02753 | 11 | 16 | 100% | lncRNA
M | ENST00000528316.5, LINC02753 | 11 | 16 | 100% | lncRNA
M | ENST00000652445.1, AC012020.1 | 3 | 16 | 100% | lncRNA, retained intron
M | ENST00000656340.1, AC139795.2 | 5 | 16 | 100% | lncRNA
M | ENST00000665249.1, AC139795.2 | 5 | 16 | 100% | lncRNA
M | ENST00000499900.2, AC139795.2 | 5 | 16 | 100% | lncRNA
M | ENST00000668367.1, AL591519.1 | 6 | 16 | 100% | lncRNA
M | ENST00000588761.5, AL445465.1 | 6 | 16 | 100% | lncRNA
M | ENST00000591821.6, AL445465.1 | 6 | 16 | 100% | lncRNA
M | ENST00000418031.2, GRM3-AS1 | 7 | 16 | 100% | lncRNA
M | ENST00000648211.1, AC100801.1 | 8 | 16 | 100% | lncRNA, retained intron
M | ENST00000649460.1, AC004129.3 | 7 | 16 | 100% | lncRNA
M | ENST00000424662.1, AL035250.1 | 20 | 20 | 95% | lncRNA
M | ENST00000661565.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000658610.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000660928.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000657824.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000663622.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000667336.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000667673.1, LINC00382 | 13 | 21 | 95.24% | lncRNA
M | ENST00000427918.2, LINC00382 | 13 | 21 | 95.24% | lncRNA
ORF7a | ENST00000664048.1, AC092881.2 | 12 | 18 | 100% | lncRNA
ORF7a | ENST00000549651.1, PRANCR | 12 | 18 | 100% | lncRNA
ORF7a | ENST00000670041.1, PRANCR | 12 | 18 | 100% | lncRNA
ORF7a | ENST00000656495.1, PRANCR | 12 | 18 | 100% | lncRNA
ORF7a | ENST00000652952.1, AC012500.1 | 2 | 16 | 100% | lncRNA
ORF7a | ENST00000669743.1, LINC02405 | 12 | 16 | 100% | lncRNA
ORF7a | ENST00000484703.1, PRICKLE2-AS2 | 3 | 16 | 100% | lncRNA
ORF7a | ENST00000654828.1, FBXO30-DT | 6 | 16 | 100% | lncRNA
ORF7a | ENST00000663890.1, FBXO30-DT | 6 | 16 | 100% | lncRNA
ORF7a | ENST00000606388.6, FBXO30-DT | 6 | 16 | 100% | lncRNA
ORF7a | ENST00000670304.1, AC109811.1 | 4 | 16 | 100% | lncRNA
ORF7a | ENST00000669995.1, AC109811.1 | 4 | 16 | 100% | lncRNA
ORF7a | ENST00000512833.1, AC109811.1 | 4 | 16 | 100% | lncRNA
ORF7a | ENST00000606629.1, AL359715.3 | 6 | 16 | 100% | lncRNA
ORF7a | ENST00000630242.2, FAM30A | 14 | 16 | 100% | lncRNA
ORF7a | ENST00000456049.1, VSTM2A-OT1 | 7 | 16 | 100% | lncRNA
ORF7a | ENST00000669200.1, LINC01606 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000659585.1, LINC01606 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000667730.1, LINC01606 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000654770.1, LINC01606 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000519160.5, LINC01606 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000662371.1, AC080132.1 | 4 | 16 | 100% | lncRNA
ORF7a | ENST00000660388.1, AC080132.1 | 4 | 16 | 100% | lncRNA
ORF7a | ENST00000660833.1, AL033539.2 | 6 | 21 | 95.24% | lncRNA
ORF7a | ENST00000520890.5, AC083973.1 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000518994.2, AC083973.1 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000521802.6, AC083973.1 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000661382.1, AC083973.1 | 8 | 16 | 100% | lncRNA
ORF7a | ENST00000665933.1, LINC02405 | CHR_HSCHR12_9 | 16 | 100% | lncRNA
ORF7a | ENST00000633766.1, FAM30A | CHR_HSCHR14_3 | 16 | 100% | lncRNA
ORF7a | ENST00000633454.1, LINC01606 | CHR_HSCHR8_1 | 16 | 100% | lncRNA
ORF3a | ENST00000664367.1, SPANXA2-OT1 | X | 19 | 100% | lncRNA
ORF3a | ENST00000666172.1, SPANXA2-OT1 | X | 19 | 100% | lncRNA
ORF3a | ENST00000665569.1, SPANXA2-OT1 | X | 19 | 100% | lncRNA
ORF3a | ENST00000666501.1, SPANXA2-OT1 | X | 19 | 100% | lncRNA
ORF3a | ENST00000660273.1, LINC02418 | 12 | 17 | 100% | lncRNA
ORF3a | ENST00000567788.1, LINC02418 | 12 | 17 | 100% | lncRNA
ORF3a | ENST00000291374.11, LINC02418 | 12 | 17 | 100% | lncRNA
ORF3a | ENST00000562284.1, AC107398.3 | 4 | 21 | 95.24% | lncRNA
ORF3a | ENST00000558967.1, INO80-AS1 | 15 | 21 | 95.24% | lncRNA
ORF7b | ENST00000605233.3, POC1B-AS1 | 12 | 18 | 100% | lncRNA
ORF7b | ENST00000425205.1, AL590640.1 | 1 | 16 | 100% | lncRNA
ORF7b | ENST00000674361.1, XACT | X | 16 | 100% | lncRNA
ORF7b | ENST00000674361.1, XACT | X | 15 | 100% | lncRNA
ORF7b | ENST00000446091.1, LINC01991 | 3 | 15 | 100% | lncRNA
ORF7b | ENST00000626826.1, HELLPAR | 12 | 15 | 100% | lncRNA
ORF7b | ENST00000567148.2, AC009053.3 | 16 | 15 | 100% | lncRNA
ORF7b | ENST00000434579.6, LHFPL3-AS1 | 7 | 15 | 100% | lncRNA, retained intron
ORF7b | ENST00000417290.6, LHFPL3-AS1 | 7 | 15 | 100% | lncRNA, retained intron
ORF7b | ENST00000416376.6, LHFPL3-AS1 | 7 | 15 | 100% | lncRNA, retained intron
ORF7b | ENST00000411448.5, LHFPL3-AS1 | 7 | 15 | 100% | lncRNA, retained intron
ORF7b | ENST00000449764.5, LHFPL3-AS1 | 7 | 15 | 100% | lncRNA, retained intron
ORF7b | ENST00000555772.2, LINC01579 | 15 | 15 | 100% | lncRNA
ORF7b | ENST00000442753.1, LINC02621 | 10 | 15 | 100% | lncRNA
ORF7b | ENST00000665487.1, LINC00278 | Y | 19 | 94.74% | lncRNA
ORF7b | ENST00000651090.1, LINC00278 | Y | 19 | 94.74% | lncRNA
ORF10 | ENST00000649558.1, AC090644.1 | 3 | 18 | 100% | lncRNA
ORF10 | ENST00000648163.1, MIR100HG | 11 | 15 | 100% | lncRNA
ORF10 | ENST00000660256.1, AL356124.1 | 6 | 23 | 91.30% | lncRNA
ORF10 | ENST00000562632.1, AC106754.1 | 5 | 15 | 100% | lncRNA
ORF10 | ENST00000411280.1, RNU4-74P | 7 | 15 | 100% | snRNA
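The alignment percentages in Table 1 can be read as the fraction of matched bases over the alignment length. That reading is an assumption (it is the usual definition of BLAST percent identity, but the source does not spell it out), and the helper names below are illustrative. Under it, every sub-100% value in the table corresponds to one or two mismatched bases, as this minimal sketch shows for the distinct (length, percentage) pairs that occur in the table.

```python
def identity_pct(matched: int, length_nt: int) -> float:
    """Percent identity, assuming identity = matched bases / alignment length."""
    return round(100 * matched / length_nt, 2)


def mismatches(length_nt: int, pct: float) -> int:
    """Number of mismatched bases implied by a length and a percent identity."""
    return round(length_nt * (1 - pct / 100))


# Distinct (alignment length, alignment %) pairs below 100% found in Table 1.
table_pairs = [
    (23, 91.30), (25, 92.00), (27, 92.59), (28, 92.86), (19, 94.74),
    (20, 95.00), (21, 95.24), (22, 95.45), (23, 95.65), (25, 96.00),
]

for length_nt, pct in table_pairs:
    n_mis = mismatches(length_nt, pct)
    recomputed = identity_pct(length_nt - n_mis, length_nt)
    print(f"{length_nt} nt at {pct:.2f}% -> {n_mis} mismatch(es), recomputed {recomputed:.2f}%")
```

For example, the lowest value in the table (91.30% over 23 nt) is what two mismatches in a 23-nt alignment would give.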

Biological characterization of the detected human ncRNAs

Characterization of biological function and associated pathways was not available for most of the detected ncRNAs. After a literature search, we found that some of them have been associated with proliferation and metabolic processes as well as the cellular response to hypoxia, being consequently hyper-expressed during carcinogenesis and wound healing [32,33]. A total of 33 (13.09%) detected ncRNAs were predicted to contain R-loop-forming sequences, which are sites of triple interaction with DNA (RNA-DNA-DNA) that affect chromatin stability and accessibility to the transcriptional machinery [34]. Overall, SARS-CoV-2 gene-complementary human lncRNAs were calculated to form 539 R-loops, which however did not overlap the nucleotide sequence complementary to SARS-CoV-2 genes. In 31 cases, the complementary SARS-CoV-2 sequences fell into ncRNA regulatory regions (1 open chromatin site; 16 promoter flanks; 4 enhancers; 8 promoters; 1 CTCF-binding site; 1 transcription factor- and CTCF-binding site) (Table 2). Given the epigenetic role played by ncRNAs on the transcription of neighboring genes [35], we analyzed the flanking chromatin regions of the 31 ncRNAs that matched the SARS-CoV-2 sequences on regulatory sites, by consulting both the Ensembl.org database and the human UCSC Genome Browser GRCh38/hg38. Interestingly, we found that neighboring coding genes, listed in Table 2, were involved in cancer pathways in 15 cases, regulation of immune response in 10 cases, neurogenesis and nervous system health in 7 cases, metabolic processes in 6 cases, cardiovascular physiology in 5 cases, lung physiology in 3 cases, and mineralization and striated muscle function in 2 and 1 cases, respectively (Fig. 1).
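The counts reported in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the text; the dictionary names are illustrative, and taking the 252 matches as the denominator of the R-loop percentage is an assumption.

```python
# Regulatory-site categories and counts as quoted in the text.
regulatory_sites = {
    "open chromatin": 1,
    "promoter flank": 16,
    "enhancer": 4,
    "promoter": 8,
    "CTCF-binding site": 1,
    "TF- and CTCF-binding site": 1,
}

# Pathway categories of the neighboring coding genes, as quoted in the text.
pathway_cases = {
    "cancer": 15,
    "immune response": 10,
    "nervous system": 7,
    "metabolism": 6,
    "cardiovascular": 5,
    "lung": 3,
    "mineralization": 2,
    "striated muscle": 1,
}

total_sites = sum(regulatory_sites.values())
total_pathway_cases = sum(pathway_cases.values())

# Assumption: the R-loop percentage is taken over the 252 matches.
rlfs_pct = 100 * 33 / 252

print(f"regulatory-site matches: {total_sites}")
print(f"pathway assignments: {total_pathway_cases}")
print(f"R-loop-forming share: {rlfs_pct:.2f}%")
```

The regulatory-site categories add up to the 31 cases stated in the text. The pathway assignments add up to more than 31, which would be expected if some neighboring genes fall into more than one category; this is an interpretation, not something the source states.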
Table 2

Human ncRNAs having a SARS-CoV-2 sequence complementarity within a regulatory site and list of neighboring coding genes.

SARS-CoV-2 gene | Complementary human ncRNA | Function of the ncRNA regulatory domain | Adjacent coding gene | Function of coded protein
ORF1ab | AC095060.1 | Open chromatin | GABRA2 | Component of the GABA receptor; mediates the GABA inhibitory neurotransmission and regulates the formation of functional inhibitory GABAergic synapses
ORF1ab | AC000065.1 | Promoter flank | CDK6 | Cell cycle, cell division and differentiation
ORF1ab | AC010198.2 | Promoter flank | CAPRIN2 | Increased canonical Wnt signaling through the phosphorylation of the Wnt coreceptor LRP6; mRNA-binding and expression modulator of several proteins involved in synaptic plasticity; control of erythroblast growth and differentiation; involved in apoptosis
S | AC009230.1 | Promoter flank | LYPD6B | Modulator of nicotinic acetylcholine receptor activity
S | AL139260.2 | Promoter | GJA9 | Involved in the formation of gap junctions
N | CCNT2-AS1 | Promoter flank | TMEM163 | Binds zinc and other divalent cations sequestering them into vesicular organelles
N | CCNT2-AS1 | Promoter flank | MAP3K19 | Serine/threonine-protein kinase and transferase activity
N | CCNT2-AS1 | Promoter flank | ACMSD | Implicated in the metabolism of alpha-amino-beta-carboxymuconate-epsilon-semialdehyde and tryptophan and, consequently, in the pathogenesis of neurodegenerative disorders
N | INE2 | Promoter flank | CA5B | Zinc metalloenzyme with a mitochondrial localization catalyzing the hydration of carbon dioxide; involved in several biological processes, like acid-base balance, bone resorption and calcification, and respiration
E | AC005332.1 | Promoter | ARSG | Calcium-binding hydrolase with a lysosomal localization, involved in hormone biosynthesis, cell signaling control and degradation of macromolecules
E | AC005332.1 | Promoter | SLC16A6 | Proton-linked monocarboxylate transporter, presiding over the transport of monocarboxylate across the plasma membrane
E | AC005332.1 | Promoter | AMZ2 | Zinc metalloprotease with antagonizing activity against angiotensin-3 in vitro; gene defects associated with pulmonary tumorigenesis
ORF6 | SLFN12L | Promoter | SLFN12, SLFN13, SLFN14 | Unfavorable prognostic marker in renal cancer
ORF6 | RNVU1-4 | Promoter | PPIAL4A | Involved in protein folding
ORF6 | AC008691.1 | Promoter flank | IL12B | Cytokine promoting the survival and potentiating the lytic activity of activated T and NK cells, stimulator of IFN-gamma release by resting PBMCs
ORF6 | ZBED5-AS1 | Promoter flank | ZBED5 | Zinc-binding protein displaying a coding sequence mostly derived from Charlie-like DNA transposon; prognostic marker in liver and urothelial cancer
ORF6 | AC055807.1 | Promoter flank | IGF1R | Receptor tyrosine kinase mediating the actions of IGF1, like cell growth and survival and cancer cell transformation
ORF6 | LMCD1-AS1 | Enhancer | LMCD1 | Transcriptional corepressor preventing GATA6, GATA4 and GATA1 activation of downstream target genes. Likely involved in the calcineurin/NFAT signaling pathway and in the development of cardiac hypertrophy and surfactant metabolism
ORF6 | MGC15885 | Promoter flank | TLN2 | Component of the focal adhesion plaque linking integrin to the actin cytoskeleton; involved in cell adhesion and motility
ORF6 | AC009570.1 | Promoter | ENAM | Involved in mineralization
ORF6 | AC009570.1 | Promoter | JCHAIN | Secreted protein linking monomers of either IgM or IgA and favouring their secretion
ORF6 | AC009570.1 | Promoter | UTP3 | Gene silencer; involved in brain development
ORF6 | AC009570.1 | Promoter | RUFY3 | Involved in neuronal polarity and malignant cell migration through the interaction with P21-activated kinase-1
M | AC015909.2 | Promoter flank | SGCA | Prevalently expressed in skeletal muscle where it links F-actin in the cytoskeleton to extracellular matrix fibers
M | AC135012.1 | Promoter flank | IRF8 | Myeloid cell maturation, antiviral response, presumable tumor suppressor
M | AC015813.1 | Promoter | VEZF1 | Presumable metal-binding transcription factor. It may promote the transcription of IL-3
M | AC015813.1 | Promoter | SRSF1 | Splicing activator or repressor RNA-binding protein interacting with many components of the spliceosome
M | AC012020.1 | Promoter flank | IFT57 | DNA-binding protein, required for the formation of cilia; involved in the hedgehog signaling; additional pro-apoptotic function through the recruitment of CASP8; it may regulate the transcription of CASP1, CASP8 and CASP10
M | AC012020.1 | Promoter flank | CD47 | Cell adhesion mediator in platelets and T lymphocytes in which it may enhance superantigen-dependent proliferation and activation; involved in synaptic plasticity, maturation and cytokine secretion of immature and mature dendritic cells; presumably involved in membrane permeability changes during viral infection
ORF7a | PRANCR | Promoter | CNOT2 | mRNA synthesis and degradation regulator within the CCR4-NOT complex; presumably involved in mRNA splicing and transport. It represses gene transcription through the intervention of histone deacetylases and polymerase II
ORF7a | AC012500.1 | CTCF (CCCTC-binding factor) | PDE1A | Cyclic nucleotide phosphodiesterase with specificity for both cAMP and cGMP
ORF7a | FAM30A | TF, CTCF | IGH | Heavy chain of the immunoglobulins
ORF7a | AC083973.1 | Promoter flank | PLAT | Secreted serine protease converting plasminogen to plasmin and inducing fibrinolysis
ORF7a | AC083973.1 | Promoter flank | IKBKB | Phosphorylation of the inhibitor of NF-kB, inducing its dissociation and the activation of NF-kB, with downstream pro-inflammatory effects
ORF3a | INO80-AS1 | Promoter flank | INO80 complex subunit | ATPase belonging to the chromatin remodeling INO80 complex, involved in transcriptional regulation, DNA replication and repair
ORF10 | MIR100HG | Promoter flank | BLID | Apoptosis inducer through a caspase-dependent mechanism
ORF10 | AL356124.1 | Enhancer | LAMA2 | Extracellular protein expressed in the basement membrane mediating cell adhesion and migration

Abbreviations: ACMSD, aminocarboxymuconate semialdehyde decarboxylase; AMZ2, archaelysin family metallopeptidase 2; ARSG, arylsulfatase G; BLID, BH3-like motif containing, cell death inducer; CA5B, carbonic anhydrase 5B; cAMP, cyclic adenosine monophosphate; CAPRIN2, caprin family member 2; CASP, caspases; CD47, cluster of differentiation 47; CDK6, cyclin dependent kinase 6; cGMP, cyclic guanosine monophosphate; CNOT2, CCR4-NOT transcription complex subunit 2; ENAM, enamelin; GABA, gamma-aminobutyric acid; GABRA2, gamma-aminobutyric acid type A receptor alpha2 subunit; GATA, GATA binding protein; GJA9, gap junction protein alpha 9; IFN, interferon; IFT57, intraflagellar transport 57; IGF1, insulin-like growth factor 1; IGF1R, insulin-like growth factor 1 receptor; IGH, immunoglobulin heavy chain locus; IKBKB, inhibitor of nuclear factor kappa B kinase subunit beta; IL12B, interleukin-12B; INO80, INO80 complex subunit; IRF8, interferon regulatory factor 8; JCHAIN, joining chain of multimeric IgA and IgM; LAMA2, laminin subunit alpha 2; LMCD1, LIM and cysteine rich domains 1; LRP6, LDL receptor related protein 6; LYPD6B, LY6/PLAUR domain containing 6B; MAP3K19, mitogen-activated protein kinase kinase kinase 19; NFAT, nuclear factor of activated T-cells; NF-kB, nuclear factor kappa-light-chain-enhancer of activated B cells; NK, natural killer; PBMCs, peripheral blood mononuclear cells; PDE1A, phosphodiesterase 1A; PLAT, plasminogen activator, tissue type; PPIAL4A, peptidylprolyl isomerase A like 4A; RUFY3, RUN and FYVE domain containing 3; SARS-CoV-2, Severe Acute Respiratory Syndrome CoronaVirus-2; SGCA, sarcoglycan alpha; SLC16A6, solute carrier family 16 member 6; SLFN, Schlafen family member; SRSF1, serine and arginine rich splicing factor 1; TLN2, talin 2; TMEM163, transmembrane protein 163; UTP3, UTP3 small subunit processome component; VEZF1, vascular endothelial zinc finger 1; Wnt, Wingless-related integration site; ZBED5, zinc finger BED-type containing 5.

Fig. 1

Regulatory pathways potentially disrupted by binding of SARS-CoV-2 sequences to lncRNA gene regulatory sites.

SARS-CoV-2 genes contain nucleotide sequence homology to the regulatory site of 31 human lncRNA genes whose adjacent coding genes may be involved in oncological, immunological, neurological, cardiovascular, pulmonary, metabolic and musculoskeletal diseases.

Abbreviations: ACMSD, aminocarboxymuconate semialdehyde decarboxylase; AMZ2, archaelysin family metallopeptidase 2; ARSG, arylsulfatase G; BLID, BH3-like motif containing, cell death inducer; CA5B, carbonic anhydrase 5B; CAPRIN2, caprin family member 2; CD47, cluster of differentiation 47; CDK6, cyclin dependent kinase 6; CNOT2, CCR4-NOT transcription complex subunit 2; ENAM, enamelin; GABRA2, gamma-aminobutyric acid type A receptor alpha2 subunit; GJA9, gap junction protein alpha 9; IFT57, intraflagellar transport 57; IGF1R, insulin-like growth factor 1 receptor; IGH, immunoglobulin heavy chain locus; IKBKB, inhibitor of nuclear factor kappa B kinase subunit beta; IL12B, interleukin-12B; INO80, INO80 complex subunit; JCHAIN, joining chain of multimeric IgA and IgM; LAMA2, laminin subunit alpha 2; LMCD1, LIM And Cysteine Rich Domains 1; LYPD6B, LY6/PLAUR domain containing 6B; MAP3K19, mitogen-activated protein kinase 19; PDE1A, phosphodiesterase 1A; PLAT, plasminogen activator, tissue type; PPIAL4A, peptidylprolyl isomerase A like 4A; RUFY3, RUN and FYVE domain containing 3; SGCA, sarcoglycan alpha; SLC16A6, solute carrier family 16 member 6; SLFN, Schlafen family member; SRSF1, serine and arginine rich splicing factor 1; TLN2, Talin2; TMEM163, transmembrane protein 163; UTP3, UTP3 small subunit processome component; VEZF1, vascular endothelial zinc finger 1; ZBED5, zinc finger BED-type containing 5.

RNAct analysis of the 252 ncRNA transcripts revealed that the most plausible protein interactions occurring within the SARS-CoV-2-matching sequences were with the onco-suppressors nischarin (NISCH, mean predicted score 25.1 ± 10.2) and AE Binding Protein 2 (AEBP2, mean predicted score 22.8 ± 6.2), whereas weaker interactions (total predicted score 15.1 ± 4.7) were found with the proteins Proline, Glutamate And Leucine Rich Protein 1 (PELP1), Cysteine Rich Hydrophobic Domain 1 (CHIC1), Coiled-Coil Domain Containing 180 (CCDC180), DDB1 And CUL4 Associated Factor 8 Like 2 (DCAF8L2), POTE Ankyrin Domain Family Member D (POTED) and Suppressor of Ty homolog-5 (SUPT5) (Table 3). Importantly, all these proteins except CHIC1 have been associated with cancer risk and prognosis, as they may act as silencers or enhancers of genes responsible for cell proliferation, differentiation, apoptosis and migration [[36], [37], [38], [39], [40], [41], [42]]. On the other hand, hyper-expression of CHIC1 has been reported in salivary glands of patients with Sjögren's syndrome [43], making it a potential autoimmunity biomarker.
Table 3

RNA-binding proteins that are predicted to interact with human ncRNA transcripts within the SARS-CoV-2-complementary sequences.

| RNA-binding protein | LncRNA transcripts interacting with the RNA-binding protein on SARS-CoV-2-nucleotide complementary sequence | Number of interacting lncRNA transcripts | Protein function | Mean ± SD prediction score |
|---|---|---|---|---|
| NISCH | LINC02354, AC095060.1, DIRC3, MYO3B-AS1, AC009107.2, AL139260.2, ZNRF3-AS1, AC034199.1, AC092447.5, AC104574.2, CCNT2-AS1, LINC01358, INE2, AC092162.2, AC005332.1, AC107068.1, LINC00877, MCM3AP-AS1, CDKN2B-AS1, MSC-AS1, LINC00689, RNVU1-4, FAM106A, WAKMAR2, AC008691.1, LINC02872, AL359649.1, AC015909.2, LINC02809, AC135012.1, KIAA1614-AS1, LINC02753, AC139795.2, AL035250.1, PRANCR, FBXO30-DT, AC109811.1, LINC02418, AC107398.3, INO80-AS1, POC1B-AS1, AL590640.1, AC009053.3 | 53 | Onco-suppressor, regulates cell growth, differentiation and apoptosis, involved in protein cargo traffic | 25.1 ± 10.2 |
| AEBP2 | LINC02354, DIRC3-AS1, MYO3B-AS1, AC010198.2, AL133166.1, AP001136.1, AL118523.1, XACT, LINC01515, AC008667.3, AC096577.1, AL449983.1, COX10-AS1, LINC00102, LINC02864, LINC00877, MCM3AP-AS1, AC027271.1, WAKMAR2, AL365434.2, MEG8, LMCD1-AS1, AC126323.6, AC009570.1, LINC01934, LINC02753, AC012020.1, AL445465.1, LINC00382, AL359715.3, LINC01606, AC083973.1, LINC01991, LHFPL3-AS1, LINC01579, LINC02621, AC106754.1, RNU4-74P | 48 | Onco-suppressor and DNA-binding transcription repressor; involved in rRNA processing in the nucleus and cytosol | 22.8 ± 6.2 |
| PELP1 | DIRC3, MGC15885, FAM30A | 3 | Proto-oncogene and transcription factor inducing estrogen receptor responsive gene transcription and repressing genes activated by other hormone receptors or transcription factors | 18.1 ± 4.3 |
| CHIC1 | LINC02532 | 1 | Protein-coding genes found near the X-inactivation centre | 10.8 |
| CCDC80 | AC135012.1 | 1 | Cancer biomarker, supposed to be involved in regulation of transcription and cell adhesion, abundant in testis and regulated by SRY | 9.8 |
| DCAF8L2 | AC010168.2, AC135012.1, AP000997.3 | 3 | Abundant in testis, binds histone deacetylases. Prognostic cancer biomarker | 16.5 ± 4.5 |
| POTED | VSTM2A-OT1 | 1 | Ankyrin domain family member D; abundant in testis with a plasma membrane localization | 9.2 |
| SUPT5 | AC018647.2 | 1 | Proto-oncogene; regulates transcription elongation by RNA polymerase II | 17.6 |

Abbreviations: AEBP2, AE Binding Protein 2; PELP1, Proline, Glutamate And Leucine Rich Protein 1; CHIC1, Cysteine Rich Hydrophobic Domain 1; CCDC180, Coiled-Coil Domain Containing 180; DCAF8L2, DDB1 And CUL4 Associated Factor 8 Like 2; NISCH, nischarin; POTED, POTE Ankyrin Domain Family Member D; SUPT5, suppressor of Ty homolog-5.


Polymorphisms of SARS-CoV-2-complementary lncRNA genes and associated human diseases

The detected human ncRNA genes were highly polymorphic, and some variants affected nucleotides within the SARS-CoV-2-complementary sequences. When the two GWAS databases were queried, 106 (81.5%) ncRNAs had polymorphic variants predisposing to various human diseases or health problems (Fig. 2). In particular, these included neuropsychiatric disorders (54 ncRNAs), obesity and variations in anthropometric indices (37 ncRNAs), cancer (34 ncRNAs), metabolic disorders (31 ncRNAs), and cardiovascular diseases (28 ncRNAs) (Table 4). Interestingly, dysmetabolism, alterations in anthropometric indices and neurological disorders are known risk factors for symptomatic and severe forms of COVID-19 [44,45]. Single nucleotide polymorphisms (SNPs) in 13 human lncRNA genes matching SARS-CoV-2 RNA have also been described in patients with immune-mediated disorders, such as inflammatory bowel diseases (IBD), multiple sclerosis (MS), psoriasis (PsO), autoimmune arthritis or connective tissue diseases (CTDs) [46]. Furthermore, 15 lncRNA genes have been associated with alterations in immunological pathways, including those involving interleukin-2 (IL-2), IL-6, IL-12, IL-12 receptor (IL-12R), IL-13, IL-17, macrophage-colony stimulating factor (M-CSF), C-X-C motif chemokine ligand 10 (CXCL10), tumor necrosis factor-related apoptosis-inducing ligand receptor (TRAIL-R) 2, IgA synthesis and IgG glycosylation (data not shown).
Fig. 2

Absolute number and percentage of detected ncRNAs whose polymorphic variants are associated with human diseases according to EBI GWAS Catalog and GeneCards database. Abbreviations: BMI, body mass index.

Table 4

Pathological conditions associated with polymorphic variants of SARS-CoV-2-matching ncRNA genes.

| Human health condition, disease | Human ncRNA | Complementary SARS-CoV-2 gene |
|---|---|---|
| Anthropometric indices (height, weight, body mass index, body fat mass, fat-free mass, waist-hip ratio, obesity, visceral adipose tissue measurement, subcutaneous adipose tissue measurement, waist circumference, fat distribution, hip circumference adjusted for body mass index, waist circumference adjusted for body mass index) | DIRC3, DIRC3-AS1, MYO3B-AS1, AC010168.2, AL133166.1 | ORF1ab |
| | AL118523.1, AL139260.2, ZNRF3-AS1, AC104574.2 | S |
| | AC092957.1, AC096577.1, CCNT2-AS1, LINC01358 | N |
| | COX10-AS1 | E |
| | LINC02465, AC107959.1 | ORF8 |
| | CDKN2B-AS1, LINC02208, ZBED5-AS1, AC103718.1, LMCD1-AS1, LINC02872, AL359649.1 | ORF6 |
| | AC135012.1, LINC02753, AC004129.3 | M |
| | FBXO30-DT, AL359715.3, AL033539.2, LINC01606 | ORF7a |
| | LINC01991, HELLPAR, LINC01579 | ORF7b |
| | AC090644.1, MIR100HG, AL356124.1 | ORF10 |
| Cardiovascular diseases (arrhythmia, arterial stiffness measurement, congenital heart diseases, systolic and diastolic blood pressure, coronary artery calcification and disease, aortic root size, artery dissection and aneurysm, venous thromboembolism, heart failure, stroke, carotid atherosclerosis, myocardial infarction, mitral valve prolapse) | AC000065.1, AL162253.2, DIRC3, AL133166.1 | ORF1ab |
| | ZNRF3-AS1 | S |
| | AC096577.1, CCNT2-AS1, LINC01358 | N |
| | AC092162.2, AC005332.1 | E |
| | LINC02864, LINC02465, LINC00877 | ORF8 |
| | CDKN2B-AS1, MSC-AS1, LINC02208, LMCD1-AS1, MGC15885 | ORF6 |
| | AC135012.1, LINC01934, AL035250.1 | M |
| | PRANCR, LINC02405, AL033539.2 | ORF7a |
| | POC1B-AS1, AL590640.1 | ORF7b |
| | MIR100HG | ORF10 |
| Cancer (breast, thyroid, colorectum, melanoma and non-melanoma skin cancer, glioma and glioblastoma, hepatocellular and renal cancer, nasopharyngeal carcinoma, endometrial and prostate cancer, oesophageal squamous cell cancer and adenocarcinoma, oral cavity cancer, acute lymphoblastic leukemia, acute myeloid leukemia, lymphoma, epithelial ovarian cancer, squamous cell lung cancer and lung adenocarcinoma, testicular germ cell tumor, gallbladder and cervical cancer, neuroblastoma, pancreatic cancer) | DIRC3, AL162253.2, AC010198.2 | ORF1ab |
| | AC104574.2, LINC01515 | S |
| | AC096577.1, LINC01358, AC010198.2 | N |
| | LINC02864, LINC02465, AC107959.1, MCM3AP-AS1 | ORF8 |
| | CDKN2B-AS1, MSC-AS1, LINC00689, LINC02532, LINC02208, MEG8, AC010198.2, LMCD1-AS1, AC126323.6, AL359649.1 | ORF6 |
| | LINC01934, LINC00382 | M |
| | PRANCR, PRICKLE2-AS2, AC109811.1, FAM30A, VSTM2A-OT1, AC080132.1, AL033539.2 | ORF7a |
| | LINC01579, LINC02621 | ORF7b |
| | MIR100HG | ORF10 |
| Immune-mediated disorders (inflammatory bowel diseases, acute graft-versus-host disease, systemic lupus erythematosus, multiple sclerosis, psoriasis, psoriatic arthritis, ankylosing spondylitis, rheumatoid arthritis, systemic sclerosis, sarcoidosis, autoimmune thyroiditis, sclerosing cholangitis, celiac disease, type 1 diabetes mellitus, juvenile idiopathic arthritis, Takayasu arteritis, IgA deficit, atopy) | XACT | S |
| | LINC01358 | N |
| | COX10-AS1 | E |
| | AC093765.3 | ORF8 |
| | CDKN2B-AS1, CHROMR, WAKMAR2, AC008691.1, LMCD1-AS1 | ORF6 |
| | LINC01934 | M |
| | XACT, LINC01991, LINC02621 | ORF7b |
| Pulmonary diseases and impairment in pulmonary function tests (FEV1, FVC, post bronchodilator FEV1/FVC ratio, asthma, forced expiratory volume, response to bronchodilator, vital capacity, chronic obstructive pulmonary disease, bronchopulmonary dysplasia, obstructive sleep apnoea during REM sleep, emphysema) | AC095060.1, DIRC3, DIRC3-AS1 | ORF1ab |
| | AC092957.1, AC096577.1, CCNT2-AS1 | N |
| | COX10-AS1 | E |
| | AC093765.3 | ORF8 |
| | MSC-AS1, ZBED5-AS1, LMCD1-AS1, LINC01446, AP001021.1 | ORF6 |
| | AC135012.1, LINC01934, LINC00382 | M |
| | PRANCR, PRICKLE2-AS2, AL359715.3 | ORF7a |
| | POC1B-AS1, LINC01991, HELLPAR | ORF7b |
| | MIR100HG | ORF10 |
| Susceptibility/response to infections (Trypanosoma cruzi, tuberculosis, mumps, rubella, leprosy, severe malaria, scarlet fever, measles, HIV, HCV, H1N1 virus, sepsis) | AC092957.1, AC096577.1 | N |
| | LINC00877 | ORF8 |
| | AC008691.1, LINC02532, LINC02208, LINC01446, AP001021.1 | ORF6 |
| | AC004129.3 | M |
| | PRANCR | ORF7a |
| Neuropsychiatric disorders (Alzheimer's disease and age of onset, general cognitive ability, memory performance, brain volume, mathematical ability, intelligence, cerebral cortical surface area measurement, schizophrenia, autism, generalised epilepsy, anorexia nervosa, attention-deficit/hyperactivity disorder, autism spectrum disorder, bipolar disorder, major depression, obsessive-compulsive disorder, unipolar depression, functional impairment measurement, periventricular white matter hyperintensities and white matter microstructure, PHF-tau measurement, insomnia, Parkinson's disease, education and temperament, spinal muscular atrophy type 1, childhood muscular atrophy, migraine without aura, neurofibrillary tangles, amyotrophic lateral sclerosis, caudal middle frontal gyrus volume, narcolepsy, suicide attempts in bipolar disorder or schizophrenia, sleep pattern and duration, sphingomyelin measurement, Tourette syndrome, risk-taking behaviour, brain connectivity, social communication impairment) | AC095060.1, AL162253.2, DIRC3, MYO3B-AS1, AL133166.1 | ORF1ab |
| | AC009230.1, AC034199.1, AC104574.2, LINC01515, AC110597.1 | S |
| | AC092957.1, AC096577.1, CCNT2-AS1, LINC01358, AC008170.1, LINC01033, INE2 | N |
| | COX10-AS1, AC092162.2 | E |
| | LINC02465, LINC00877, LINC00251, AC107959.1 | ORF8 |
| | CDKN2B-AS1, MSC-AS1, LINC01965, AC008691.1, LINC02532, LINC02208, ZBED5-AS1, MEG8, LMCD1-AS1, LINC01446, MGC15885, AC126323.6, AP001021.1, AL359649.1 | ORF6 |
| | AC135012.1, LINC01934, LINC02753, AC139795.2, AL591519.1, GRM3-AS1, AC100801.1 | M |
| | PRANCR, AC080132.1 | ORF7a |
| | AC107398.3 | ORF3a |
| | HELLPAR, LHFPL3-AS1, LINC01579, LINC02621 | ORF7b |
| | MIR100HG, AL356124.1, RNU4-74P | ORF10 |
| Dysmetabolism (type 1 and 2 diabetes mellitus, dyslipidemia, uric acid serum levels, leptin serum levels) | MYO3B-AS1, AL133166.1 | ORF1ab |
| | AL118523.1, AL139260.2 | S |
| | AC092957.1, AC096577.1, CCNT2-AS1, LINC01033 | N |
| | COX10-AS1, AC092162.2 | E |
| | LINC02465, AC107959.1 | ORF8 |
| | CDKN2B-AS1, MSC-AS1, AC027271.1, CHROMR, AC008691.1, LINC02532, AC103718.1, LMCD1-AS1, AL359649.1 | ORF6 |
| | AC135012.1, LINC00382 | M |
| | PRANCR, LINC02405, AL033539.2, AC083973.1 | ORF7a |
| | LINC02418 | ORF3a |
| | LINC01991 | ORF7b |
| | AC090644.1, MIR100HG | ORF10 |
| Endocrine gland dysfunction (hypothyroidism) | AL139260.2 | S |
| | CCNT2-AS1 | N |
| | LINC01934 | M |
| | LINC02621 | ORF7b |
| Hematopoietic cell disorders (red and white blood cell count, hematocrit, neutrophil/lymphocyte ratio, platelet count and aggregation, mean corpuscular volume, reticulocyte count, red blood cell distribution width) | LINC02354 | ORF1ab |
| | AL118523.1 | S |
| | AC092957.1, AC096577.1, CCNT2-AS1, LINC01033 | N |
| | LINC00877 | ORF8 |
| | CDKN2B-AS1, SLFN12L, MSC-AS1, WAKMAR2, AC008691.1, LINC02532, NUTM2A-AS1, AC103718.1, LMCD1-AS1 | ORF6 |
| | LINC01934 | M |
| | PRANCR | ORF7a |
| | POC1B-AS1, LINC01991 | ORF7b |
| | MIR100HG | ORF10 |
| Renal diseases (estimated glomerular filtration rate, diabetic nephropathy, renal insufficiency) | LINC02532, LMCD1-AS1, MGC15885 | ORF6 |
| | INO80-AS1 | ORF3a |
| | LINC01991, HELLPAR | ORF7b |
| | AC090644.1 | ORF10 |
| Cutaneous diseases (rosacea, eczema) | AC009107.2 | ORF1ab |
| | AC012020.1 | M |
| | LINC01991 | ORF7b |
| Bone disorders (heel and hip bone mineral density) | DIRC3 | ORF1ab |
| | AL118523.1, ZNRF3-AS1 | S |
| | AC096577.1 | N |
| | LINC00251 | ORF8 |
| | LINC02208, ZBED5-AS1, AC103718.1, LINC02872 | ORF6 |
| | FBXO30-DT | ORF7a |
| | LINC01579 | ORF7b |
| | MIR100HG | ORF10 |
| Reproductive disorders (sex hormone serum levels, fertility, endometriosis) | AL162253.2 | ORF1ab |
| | AL118523.1, LINC01515 | S |
| | AC018647.2, LINC01358 | N |
| | COX10-AS1 | E |
| | LINC02465 | ORF8 |
| | CDKN2B-AS1, MEG8, LMCD1-AS1, AL359649.1 | ORF6 |
| | AL033539.2 | ORF7a |
| Gastrointestinal diseases (dysgeusia, hepatitis, pancreatitis, Barrett's oesophagus, dysphagia, velopharyngeal dysfunction, gut microbiota composition) | DIRC3 | ORF1ab |
| | AC092957.1, CCNT2-AS1 | N |
| | LINC00877 | ORF8 |
| | LMCD1-AS1, AL359649.1 | ORF6 |
| | AC135012.1 | M |
| | SPANXA2-OT1 | ORF3a |
| | AC009053.3 | ORF7b |
| | MIR100HG | ORF10 |

Abbreviations: FEV1, forced expiratory volume in the 1st second; FVC, forced vital capacity; HCV, hepatitis C virus; HIV, human immunodeficiency virus; PHF, paired helical filaments; REM, rapid eye movement.

A total of 131 polymorphisms, including SNPs, nucleotide deletions and insertions, were reported from the Ensembl.org database, which fall exactly within the SARS-CoV-2-matched nucleotide sequence of 75 (29.7%) ncRNA genes. Remarkably, no associated phenotype was described for any of them (gnomAD).

Discussion

The results of this in silico analysis show that the reverse nucleotide strand of each of the 11 SARS-CoV-2 genes can be complementary to short nucleotide sequences belonging to 252 transcripts of human ncRNA genes, which are predominantly lncRNAs. Nucleotide alignment reached 100% in 214 (85%) cases. Despite the high polymorphism of the detected ncRNA genes, no pathogenic variants were found in the SARS-CoV-2-matched nucleotide sequences. However, sequence matches occurred in 31 ncRNA gene regulatory sites and in 111 protein-binding sites of ncRNA transcripts. The abnormal binding of the SARS-CoV-2 genome to these sequences might disrupt epigenetic pathways presiding over the control of chromatin stability as well as many other cellular physiological processes, and promote the development of human disease in the long term. LncRNAs consist of non-coding RNA transcripts of >200 nucleotides in length whose role is poorly characterized although they represent the majority of transcribed genes [47]. Thanks to new technologies and computational studies, >14,000 intergenic and intragenic lncRNAs have been identified throughout the human genome [35]. Although their structure is similar to that of coding genes and they likely undergo canonical and alternative splicing, lncRNA genes typically contain 2 or fewer exons that have a severely restricted translation into functional proteins [48]. It seems unlikely that these transcripts represent a mere extension of neighboring genes. Rather, evidence suggests that their main function is the epigenetic control of gene expression, which could be operated by in cis and in trans mechanisms [35]. In this way, lncRNAs would eventually regulate several cellular processes, like proliferation, differentiation, migration and apoptosis. 
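As an illustration of the kind of Watson-Crick complementarity summarized at the start of this Discussion, the following minimal Python sketch computes the reverse complement of a viral RNA fragment and the longest contiguous stretch that is complementary to a host transcript. It is not the pipeline used in the study (which relied on Ensembl.org alignments); the function names and the short sequences are hypothetical and chosen only for demonstration.

```python
# Illustrative sketch only: the sequences are invented and the function names
# are hypothetical; this is not the alignment method used in the study.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA string (T is read as U)."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def longest_complementary_stretch(viral: str, host: str) -> int:
    """Length of the longest contiguous viral stretch whose reverse complement occurs in host.

    This is the longest common substring between the reverse complement of the
    viral fragment and the host sequence, found by dynamic programming.
    """
    target = reverse_complement(viral)
    host = host.upper().replace("T", "U")
    best = 0
    prev = [0] * (len(host) + 1)
    for i in range(1, len(target) + 1):
        curr = [0] * (len(host) + 1)
        for j in range(1, len(host) + 1):
            if target[i - 1] == host[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical fragments, not real SARS-CoV-2 or lncRNA sequences.
viral_fragment = "AUGGCUAGC"
host_fragment = "CCGCUAGCCAUGG"
stretch = longest_complementary_stretch(viral_fragment, host_fragment)
```

For genome-scale searches a dedicated aligner would be the practical choice; the quadratic dynamic programming above is only meant to make the definition of complementarity concrete.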
LncRNAs contain multiple interaction sites with proteins, such as transcription factors, and nucleic acids (microRNA, mRNA, DNA) [49] and could act as scaffolds or guides for the formation of multimolecular complexes that eventually affect the transcriptional activity of chromatin [50]. Using a method based on RNA-antisense purification, researchers showed that both snRNAs and lncRNAs can directly interact with nascent mRNA transcripts and influence their splicing, polyadenylation and cleavage [51]. Consistent with this hypothesis, lncRNAs have been localized mainly in the nucleus rather than in the cytosolic and organelle compartments and their expression appears to be up- or down-regulated in a tissue-specific manner [35]. Some of them are physiologically expressed or silenced during specific developmental stages or diseases and therefore represent ideal candidates as diagnostic or prognostic biomarkers [34]. Although it has been demonstrated that viral genomes or transcripts are capable of physically interacting with human ncRNAs on the basis of Watson-Crick complementarity, current knowledge focuses mainly on microRNAs, tRNAs, and U1 snRNAs [52]. Moreover, it is now clear that viruses may be a source of viral ncRNAs, mainly microRNAs, through which they evade the immune response, enhance their replication, promote the stability of their transcripts or even produce alternative transcripts resulting in viral proteins with increased virulence [15]. Given their role as microRNA sponges, human lncRNAs could be hyperproduced during viral infections, which might serve to sequester viral microRNAs. Interestingly, microRNAs are small RNA molecules of about 22 nucleotides [50], a length comparable to the average number of complementary nucleotides between SARS-CoV-2 genes and human lncRNAs found in our analysis. 
Like cellular microRNAs, viral microRNAs can pair with complementary mRNA sequences of at least 20 nucleotides in length and prevent their expression in RNA-induced silencing complexes (RISCs). Through this mechanism, viral microRNAs may interfere with the expression of genes involved in the antiviral response. A computational study predicted the formation of 8 microRNAs from the SARS-CoV-2 genome capable of disrupting the transforming growth factor-beta (TGF-β) and glucose pathways [20]. In our study, we also found 100% complementarity to SARS-CoV-2 genes involving ≥20 nucleotides in 10 ncRNA transcripts (Table 5). In this case, it may be hypothesized that lncRNAs could sequester short sequences of SARS-CoV-2 RNA that act as microRNAs in order to antagonize the viral-induced epigenetic repression of host antiviral genes.
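The selection criterion described above (100% complementarity over at least 20 nucleotides) can be sketched as a simple filter over a list of alignment hits. This is a minimal illustration under stated assumptions: the `Hit` record and the function name are hypothetical, the first two records reuse values reported in Table 5, and the last two are invented to show what the filter excludes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One alignment between a SARS-CoV-2 gene and a human ncRNA transcript (hypothetical record)."""
    transcript: str
    viral_gene: str
    identity_pct: float
    alignment_length: int


def perfect_long_hits(hits: List[Hit], min_length: int = 20) -> List[Hit]:
    """Keep hits with 100% identity and an alignment length of at least min_length."""
    return [
        h for h in hits
        if h.identity_pct >= 100.0 and h.alignment_length >= min_length
    ]


hits = [
    Hit("ENST00000653602.1", "ORF1ab", 100.0, 20),  # values as listed in Table 5
    Hit("ENST00000665074.1", "S", 100.0, 22),       # values as listed in Table 5
    Hit("HYPOTHETICAL_TX_1", "N", 100.0, 17),       # invented: perfect but too short
    Hit("HYPOTHETICAL_TX_2", "M", 91.3, 23),        # invented: long but imperfect
]

selected = perfect_long_hits(hits)
selected_ids = [h.transcript for h in selected]
```

Raising or lowering `min_length` makes the trade-off explicit: shorter thresholds admit many more matches, which are also more likely to arise by chance.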
Table 5

Human lncRNA transcripts having 100% complementarity to SARS-CoV-2 genes and ≥ 20 nucleotide alignment length.

| Transcript | SARS-CoV-2 matched gene | Alignment length (bp) | DNA regulatory site | RNA protein-binding site (protein) | Adjacent genes | R-loops (n.) | Human diseases or conditions associated with lncRNA gene SNPs (none of these SNPs placed in the SARS-CoV-2 complementary sequence) |
|---|---|---|---|---|---|---|---|
| ENST00000653602.1, AC000065.1 | ORF1ab | 20 | Yes, promoter | No | CDK6, RNU6-10P | 0 | Arrhythmia |
| ENST00000650674.1, AL162253.2 | ORF1ab | 20 | No | No | CD174, RIC1, PDCD1LG2 | 0 | Arterial stiffness; Alzheimer's disease; Immune checkpoints; Female fertility |
| ENST00000548564.1, LINC02354 | ORF1ab | 21 | No | Yes (NISCH) | HDAC7, VDR | 0 | Red blood cell count |
| ENST00000550720.5, LINC02354 | ORF1ab | 21 | No | Yes (NISCH) | HDAC7, VDR | 0 | |
| ENST00000550909.1, LINC02354 | ORF1ab | 21 | No | Yes (NISCH) | HDAC7, VDR | 0 | |
| ENST00000546523.1, LINC02354 | ORF1ab | 21 | No | Yes (NISCH) | HDAC7, VDR | 0 | |
| ENST00000550684.1, LINC02354 | ORF1ab | 21 | No | Yes (AEBP2) | HDAC7, VDR | 0 | |
| ENST00000665074.1, AL118523.1 | S | 22 | No | No | ATG3P1, HSPE1P1 | 0 | Leukocyte count; Anthropometric indexes; Estradiol levels; Metabolic parameters; Mitochondrial DNA measurement; Bone mineral density |
| ENST00000668185.1, AL118523.1 | S | 22 | No | No | ATG3P1, HSPE1P1 | 0 | |
| ENST00000444436.1, AL118523.1 | S | 22 | No | Yes (AEBP2) | ATG3P1, HSPE1P1 | 0 | |

Abbreviations: AEBP2, AE-Binding Protein 2; ATG3P1, autophagy related 3 pseudogene 1; bp, base pair; CD174, cluster of differentiation 174; CDK6, cyclin dependent kinase 6; HDAC7, histone deacetylase 7; HSPE1P1, Heat Shock protein family E (Hsp10) member 1 Pseudogene 1; NISCH, nischarin; PDCD1LG2, programmed cell death 1 ligand 2; RIC1, Rop-Interactive Crib motif-containing protein 1; RNU6-10P, RNA U6 small nuclear 10 pseudogene; SNPs, single nucleotide polymorphisms; VDR, vitamin D receptor.

Host lncRNAs might also directly base-pair with viral mRNAs and interfere with the viral lifecycle by preventing the maturation and translation of viral transcripts. The results of a recent computational study highlight that the human lncRNAs H19, FENDRR, HOTAIR and LINC01505 may potentially interact with the SARS-CoV-2 spike mRNA and this event would be of critical importance in the development of pulmonary complications given the role of H19 in the pathogenesis of pulmonary arterial hypertension [53]. In some cases, lncRNAs may induce the potentiation of immune pathways leading to an antiviral response and, in predisposed individuals, autoimmunity or autoinflammation. Indeed, studies have shown that many lncRNAs are highly expressed in CD4+ and CD8+ T lymphocytes and can alter the balance of T helper (Th)1/Th2 cell subpopulations [47]. In addition, some lncRNAs may control macrophage polarization [54] and Th17 cell differentiation [55]. Following this theory, it could be hypothesized that the immune-mediated manifestations observed in some patients with COVID-19 result in part from the formation of lncRNAs that activate pro-inflammatory pathways. A few lncRNAs present in our database have been associated with immune mechanisms involved in the antiviral response. 
The two lncRNAs SLFN12L and NUTM2A-AS1, which correspond to ORF6, have been reported to be involved in the control of either innate or acquired immunity. Specifically, SLFN12L may be induced by type I interferon (IFN) and is typically downregulated during T-cell activation [56], while NUTM2A-AS1 may modulate the expression of the High-Mobility Group Box 1 (HMGB1) protein secreted by monocyte/macrophage cells in response to pathogens [57]. Regarding B-cell immunity, it has been shown that the lncRNA FAM30A, which contains a complementary sequence to the ORF7a gene, upregulates antibody production and can influence the response to vaccines [58]. Finally, a very recent work demonstrated an association between the lncRNA LINC00278, which matches SARS-CoV-2 ORF7b gene, and the severity of respiratory syncytial virus (RSV)-induced viral bronchiolitis [59]. On this basis, it may be postulated that complementation of human ncRNA to the SARS-CoV-2 genome redirects both innate and acquired immunity in a manner that favors SARS-CoV-2 replication. In addition, viruses can control the expression of lncRNAs involved in metabolic pathways that are beneficial for their survival. In this context, recent research has described the role of the lncRNA ACOD1 in promoting viral replication. Virus-induced upregulation of ACOD1 may actually promote infection by increasing the activity of the metabolic enzyme glutamic-oxaloacetic transaminase 2 (GOT2) via an IFN-independent mechanism [60]. Importantly, ACOD1 was not annotated in our list and to our knowledge its association with COVID-19 remains unexplored. Finally, epigenetic crosstalk between virus and host may also promote carcinogenesis in the long term. 
A recent paper demonstrated the upregulation of the lncRNA CDKN2B-AS1, matching ORF6 in our analysis, in tissue sections from human papillomavirus (HPV)-positive individuals with head and neck squamous cell carcinoma compared to controls [61], providing intriguing insight about the link between SARS-CoV-2 infection and tumorigenesis. With respect to COVID-19, a number of papers show an aberrant expression of lncRNAs in infected individuals [[21], [22], [23]]. These data are consistent with the results of a deep-sequencing study performed in an animal model of SARS-CoV infection and support the hypothesis that lncRNA transcription may represent a common trait of the cellular response to viral infection, which is in turn related to the potentiation of innate immunity [62]. In subjects with COVID-19, GO-analysis showed that hyper-expressed lncRNAs can have a broad spectrum of action in cis- or in trans-regulation. They direct Wingless-related integration site (Wnt)/β-catenin and IL-1-mediated signaling pathways, control protein synthesis, transport, phosphorylation and degradation as well as autophagy, angiogenesis and migration of fibroblasts and immune cells [63]. A whole transcriptome study conducted on peripheral blood mononuclear cells (PBMCs) collected from COVID-19 patients during treatment, convalescence and rehabilitation found 405 differentially expressed lncRNAs that included CCNT2-AS1, SLFN12L, NUTM2A-AS1, LMCD1-AS1 and POC1B-AS1, which were also found in our analysis [64]. Although none of them was significantly associated with a specific disease stage, the results showed the hyper-expression of the snRNA RNVU1-4, which corresponds to the SARS-CoV-2 gene ORF6, during recovery. SARS-CoV-2-infected patients with a more severe course of disease typically show lymphopenia with exhaustion of CD4+ Th1, Treg and CD8+ T cells and an increase in peripheral neutrophils with overproduction of innate immunity cytokines [64]. 
These features may be related to differential genetic landscapes, which also include ncRNA genes [[64], [65], [66]]. Some lncRNAs, such as TSLNC8, MALAT1, NEAT1 and GAS5, have been reported to influence the secretion of IL-6 and the formation of inflammasome platforms, two processes that typically characterize innate immunity [66]. T cell reconstitution during COVID-19 recovery has instead been associated with the lncRNAs CCR7-AS-1, LEF1-AS-1, LINC-CCR7–2, LINC-TCF7–1 and TCF7-AS-1 [64]. Remarkably, none of the above ncRNAs were present in our database. Epigenetic hyperactivation of pathways related to potentiation of the acquired immune response may however lead to long-term transition to established immune-mediated diseases, especially in individuals with poor clearance of nucleic acids and defects in apoptosis [67]. In addition, two studies reported the upregulation of the lncRNA LINC00278 in PBMCs from severe COVID-19 patients compared with non-severe patients [22] and of the lncRNA AL139260.2 in SARS-CoV-2-infected normal human bronchial epithelial (NHBE) cells compared with non-infected cells [23]. According to our analysis, the lncRNAs LINC00278 and AL139260.2 contain a complementary sequence to the SARS-CoV-2 ORF7b and S genes, respectively. LINC00278 is normally expressed in whole blood and its upregulation may represent an attempt to prevent viral replication. AL139260.2, on the other hand, is normally expressed in testis, heart, and adipose tissue, and its polymorphic variants have been associated with obesity and dysmetabolism. The upregulation of AL139260.2 in obese and dysmetabolic individuals may be responsible for a more severe course of COVID-19, as reported in several epidemiological studies [68]. The complementary sequence of SARS-CoV-2 may bind AL139260.2 within a promoter site and thus affect the transcription of neighboring genes, such as AL139260.3, MYCBP and RRAGC, which may induce pulmonary fibrosis via a Myc-dependent mechanism [69]. 
In another RNA-seq study, 21 lncRNAs were found to be up- or downregulated in NHBE cells 24 h after SARS-CoV-2 infection [21]. Among them, the lncRNA FAM106A, which is also present in our database and shows sequence complementarity to the SARS-CoV-2 ORF6 gene, was significantly hypo-expressed. According to the authors, FAM106A interacts with miRNA let-7c, miRNA let-7f and miRNA-185-5p, which are involved in the Janus kinase (JAK)-signal transducer and activator of transcription (STAT), the Wnt/β-catenin and the mitogen-activated protein kinase (MAPK) pathways. Therefore, it may be hypothesized that SARS-CoV-2 can induce the downregulation of FAM106A via direct nucleotide binding, leading to the overexpression of FAM106A-target miRNAs and eventually to pro-inflammatory and pro-fibrotic events. The potentiation of the Wnt/β-catenin pathway during COVID-19 may also be attributed to the interaction between the SARS-CoV-2 ORF6, ORF7a and ORF10 genes and the lncRNAs MSC-AS1/LINC000689, LINC01606 and MIR100HG, respectively [[70], [71], [72], [73]]. LncRNAs found in other studies and associated with COVID-19-induced neurological damage or cytokine storm [65,74] were not present in our database, but these conflicting results might depend on the high tissue-selectivity and time-dependent expression exhibited by these transcripts. For instance, the SARS-CoV-2 gene-matching ncRNAs AC034199.1, COX10-AS1, AC005332.1, AC107068.1, LINC00877, SLFN12L, RNVU1-4, AC100801.1 and LINC00278 retrieved in our analysis show a blood tissue specificity, while pulmonary localization was identified only for the ncRNAs AC110597.1, AC107959.1 and MCM3AP-AS1. As respiratory tissues and leukocytes are the primary targets of infection, these transcripts would play a crucial role in epigenetically controlling the first steps of infection. However, literature data support an alternative SARS-CoV-2 access route via the olfactory tract and thus the central nervous system [75]. 
Interestingly, 32 SARS-CoV-2-complementary human lncRNAs listed in our database have a central nervous system-selective expression, and this would be of utmost importance for the epigenetic regulation of viral replication or clearance at this site. However, as shown in other studies [64], human lncRNA expression in tissue may change longitudinally during the course of infection, thus affecting COVID-19 outcome. The results of our analysis revealed 13 matches between the SARS-CoV-2 S, N, E, ORF8, ORF6, M and ORF7b genes and human lncRNAs whose polymorphic variants have been associated with a spectrum of immunological diseases [[76], [77], [78]]. These include IBD [77,79,80], acute Graft-Versus-Host Disease (aGVHD) [81], systemic lupus erythematosus (SLE) [[82], [83], [84]], MS [[85], [86], [87]], PsO or atopic dermatitis [76,77,88], systemic sclerosis (SSc) [89,90], rheumatoid arthritis (RA) [[91], [92], [93]] and ankylosing spondylitis (AS) [77,94]. However, it cannot be excluded that hyper-expression of SARS-CoV-2-complementary lncRNAs may have a protective role against infection in patients with full-blown autoimmune diseases. Complementary ncRNAs may act as decoys for viral RNA genomes and compete with them for binding to pattern recognition receptors (PRRs) in the cytosol and endosomes. By preventing viral nucleic acid from interacting with sensing platforms, lncRNAs would eventually silence downstream activation of the innate immune response [15]. However, chronic engagement of this mechanism might have a long-term negative effect on immunosurveillance against pathogens and even transformed cells. Polymorphisms of CDKN2B-AS1, a lncRNA gene containing an ORF6-complementary sequence, have been associated with MS and type II diabetes mellitus [85,95], but evidence suggests that this gene also promotes the growth and metastasis of human hepatocellular carcinoma by targeting the microRNA let-7c-5p/NAP1L1 axis [32]. 
The lncRNA WAKMAR2, which also corresponds to a sequence within the ORF6 gene, has been linked to several immune-mediated disorders in GWAS [76,84,85,88,89]. This transcript is particularly abundant in the cytosol and nucleus of keratinocytes [33], where it could be expressed upon stimulation by TGF-β and Smad3 signaling. It has been suggested that WAKMAR2 promotes wound healing and skin re-epithelialization while preventing the expression of chemokines, such as IL-8 and CXCL5, and the activation of the nuclear factor-κB (NF-κB) cascade. Remarkably, TGF-β is a key cytokine in the development of SARS-CoV-2 pulmonary fibrosis [8] and is also associated with carcinogenesis [96], SSc and interstitial lung disease [97,98]. The ORF6-matching lncRNA gene LMCD1-AS1, which is associated with SSc risk in Iranian and Turkish populations [90], has also been described as an oncogene in osteosarcoma [99], cholangiocarcinoma, hepatocellular carcinoma and thyroid cancer [100]. All these data suggest that disruption of this delicate epigenetic balance by SARS-CoV-2 might potentially lead to immune-mediated diseases as well as cancer. Unlike patients with autoimmune diseases [101], in whom hyper-activity of immune pathways related to the antiviral response might even be useful to counteract the infection, cancer patients usually suffer from a burden of additional comorbidities that expose them to more severe forms of COVID-19 compared to the general population [102]. Furthermore, although evidence is lacking, cancer patients may also have impaired clearance of the virus, whose persistence within host cells could epigenetically accelerate cancer progression. In our analysis, the SARS-CoV-2 ORF6 and ORF10 genes contained sequences showing Watson–Crick complementarity to two human snRNAs. These are ncRNAs that regulate transcription, splicing and polyadenylation of nascent mRNA transcripts in the nucleus by recruiting specific adaptors such as the Smith (Sm) proteins [103]. 
Of note, Sm and other small nuclear ribonucleoproteins (snRNPs) contain multiple epitopes recognized by pathognomonic autoantibodies in mixed connective tissue disease (MCTD) and SLE [104,105]. In this case, SARS-CoV-2 sequence complementarity could disrupt mRNA processing or create new epitopes in snRNPs that could fuel autoimmunity against a favorable pro-inflammatory background triggered by infection. In this regard, the hyperexpression of RNVU1-4 during COVID-19 recovery, coinciding with T-cell response reconstitution [64], deserves further investigation. In summary, two scenarios could be depicted based on our findings (Fig. 3). In the first scenario, altered expression of human ncRNAs might be pre-existent in individuals with certain diseases or disease predispositions and not induced by infection, against which these transcripts may instead play a protective role. In the case of up-regulation, these transcripts could sequester SARS-CoV-2 mRNAs, preventing translation into viral proteins and stimulation of PRRs. This could ultimately lead to either a weakening of the innate immune response or an inhibition of viral replication. Although some studies show the opposite [106], it may be hypothesized that this mechanism functions as a kind of “genetic immune system” that blocks the initial steps of viral infections. In support of this view, we found that polymorphisms of most of the detected ncRNA genes were associated with neurodegenerative and neuropsychiatric diseases, and there is evidence that approximately 40% of lncRNAs are expressed in the mammalian brain during neurogenesis and neuronal differentiation [34]. Consequently, humans with neurological diseases may have impaired expression of these ncRNAs, with unfavorable repercussions on SARS-CoV-2 infection. In line with this hypothesis, a recent UK Biobank study found an increased risk of complicated COVID-19 in Alzheimer's disease patients [107].
Fig. 3

Hypothetical scenarios triggered by SARS-CoV-2 and host nucleic acid crosstalk.

In the first scenario (a), ncRNAs are pre-existent and hyper-expressed in a cell undergoing SARS-CoV-2 infection. Due to sequence complementarity to SARS-CoV-2 RNA, these transcripts may intercept the viral genome in the cytosol and prevent translation into functional proteins and interaction with PRRs. In addition, they may compete with viral RNA for PRRs and thus mediate a downstream inhibitory signal on the activation of the immune response.

In the second scenario (b), SARS-CoV-2 infection may alter the expression of ncRNAs. Taking advantage of its sequence complementarity, SARS-CoV-2 RNA may interfere with the binding of transcription factors and other proteins to regulatory sites of lncRNA genes, thereby indirectly affecting the transcription of adjacent genes. This would lead to a profound alteration of the epigenetic landscape that eventually translates into uncontrolled proliferation pathways. Furthermore, binding of the SARS-CoV-2 genome to complementary snRNA sequences may generate novel epitopes within the RNP complex that fuel autoimmunity.

Abbreviations: SARS-CoV-2, Severe Acute Respiratory Syndrome Coronavirus 2; ncRNA, non-coding RNA; PRR, pattern recognition receptor; snRNA, small nuclear RNA.

In the second scenario, SARS-CoV-2 infection would be the starting point for aberrant expression of ncRNAs in human cells, which could lead to long-term health complications. SARS-CoV-2 nucleic acids could enhance or repress the transcription of ncRNAs by binding the corresponding nucleotide sequences on the human genome. We found that SARS-CoV-2 gene complementarities lie within 31 regulatory sites whose neighboring coding genes may be involved in oncological, immunological, neurological, cardiovascular, pulmonary, metabolic, and musculoskeletal diseases. 
In addition, our results show that SARS-CoV-2 sequences may disrupt interactions between lncRNA transcripts and transcription factors or other regulatory RNA- and DNA-binding proteins, potentially leading to abnormal activation of downstream signaling pathways associated with cancer and autoimmunity. Finally, interaction with snRNAs may contribute to the formation of self-epitopes within the RNP complex, increasing the risk of autoimmune diseases. These nuclear effects presuppose that SARS-CoV-2 RNA may cross the nuclear membrane and localize in the nucleus. Interestingly, a recent paper based on computational analysis reported that SARS-CoV-2 RNA may have a subcellular residency within the nucleolus or mitochondrial matrix of host cells [108]. The authors found that, among all the viral sequences examined, ORF3a, S, ORF7b, ORF8, ORF6 and ORF7a showed the strongest residency signal towards the nucleolus. Trafficking of the SARS-CoV-2 RNA, either as a positive or negative strand, within the nucleus could explain a plausible interaction with the human lncRNAs MEG8, FAM30A and MIR100HG, which show a nuclear localization and, according to our analysis, correspond to ORF6, ORF7a and ORF10 sequences, respectively. Further support comes from an in vitro study by Zhang et al. showing that SARS-CoV-2 RNA could be retrotranscribed and integrated into the human genome [109]. This event would occur mainly in individuals with enhanced activity of Long Interspersed Nuclear Elements-1 (LINEs-1) and telomerase, which may be induced by the infection itself or by chronic cytokine stimulation or other signaling pathways occurring in cancer or autoimmune diseases [110,111]. A major limitation of this study lies in its in silico design, which precludes extensive investigation of the dynamic expression of, and interactions between, SARS-CoV-2 genes and host ncRNAs during disease progression. 
Further in vitro or ex-vivo studies are needed to explore how host SARS-CoV-2-complementary lncRNAs change after virus invasion and subsequently affect virus replication.

Conclusion

This in silico study suggests the possibility of Watson-Crick complementarity between SARS-CoV-2 RNA and human ncRNAs, including lncRNAs and snRNAs. The matches may involve either chromatin regulatory sequences or RNA protein-binding sites, thus affecting the transcription of multiple genes associated with human diseases. Although the possibility of direct base-pairing between viral RNA and host ncRNA remains to be further confirmed in vitro, it seems plausible that SARS-CoV-2 infection could lead to aberrant virus-host nucleic acid crosstalk with long-term implications for human health. Polymorphic variants of the retrieved ncRNAs could be associated with different COVID-19 outcomes (e.g., severe forms versus asymptomatic cases) and long-term complications and therefore represent potential biomarkers for identifying individuals at higher risk of severe disease. The following are the supplementary data related to this article.

Fig. S1

Sequence homologies between SARS-CoV-2 genes and human ncRNAs. Figure shows an example of each SARS-CoV-2 gene hit and indicates the complementary human ncRNA sequence having 100% alignment identity and highest alignment score according to the Ensembl.org database.
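
As an illustration of what Watson–Crick complementarity and percent alignment identity mean for short matches of this kind, the following minimal Python sketch scores how well one RNA fragment pairs with another in antiparallel orientation. It is not the pipeline used in this study (which relied on the Ensembl.org database and IntaRNA); the function names and the 23-nt sequence are hypothetical and chosen only for the example.

```python
# Illustrative sketch only: the sequence below is invented, not taken from the study.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(seq: str) -> str:
    """Upper-case a sequence and convert DNA notation (T) to RNA notation (U)."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq: str) -> str:
    """Return the Watson-Crick reverse complement of an RNA sequence."""
    return to_rna(seq).translate(COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions that agree between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def complementarity(viral: str, host: str) -> float:
    """Percentage of bases in `viral` that pair with `host` when the two
    strands are aligned in antiparallel orientation (ungapped)."""
    return percent_identity(to_rna(viral), reverse_complement(host))


if __name__ == "__main__":
    viral = "AUGGCUAGCUAGGCUAACGUACG"      # hypothetical 23-nt viral fragment
    host = reverse_complement(viral)        # a perfectly complementary host fragment
    print(complementarity(viral, host))     # fully paired strands
    mismatched = "GC" + host[2:]            # introduce two mismatches in the host strand
    print(round(complementarity(viral, mismatched), 2))
```

The sketch deliberately ignores gaps, G–U wobble pairs and binding energetics, all of which a dedicated tool such as IntaRNA takes into account.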

Fig. S2

Examples of predicted interactions between SARS-CoV-2 genes and human ncRNAs. Computational analysis was carried out by using IntaRNA (http://rna.informatik.uni-freiburg.de/IntaRNA/Input.jsp).

CRediT authorship contribution statement

Rossella Talotta: Conceptualization, Formal analysis, Investigation, Project administration, Writing – original draft, Writing – review & editing, Visualization. Shervin Bahrami: Conceptualization, Supervision. Magdalena Janina Laska: Conceptualization, Supervision, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  108 in total

1.  Hypomethylation of LINE-1 but not Alu in lymphocyte subsets of systemic lupus erythematosus patients.

Authors:  Jeerawat Nakkuntod; Yingyos Avihingsanon; Apiwat Mutirangura; Nattiya Hirankarn
Journal:  Clin Chim Acta       Date:  2011-04-07       Impact factor: 3.786

2.  QmRLFS-finder: a model, web server and stand-alone tool for prediction and analysis of R-loop forming sequences.

Authors:  Piroon Jenjaroenpun; Thidathip Wongsurawat; Surya Pavan Yenamandra; Vladimir A Kuznetsov
Journal:  Nucleic Acids Res       Date:  2015-04-16       Impact factor: 16.971

3.  Analysis of the genetic component of systemic sclerosis in Iranian and Turkish populations through a genome-wide association study.

Authors:  David González-Serna; Elena López-Isac; Neslihan Yilmaz; Farhad Gharibdoost; Ahmadreza Jamshidi; Hoda Kavosi; Shiva Poursani; Faraneh Farsad; Haner Direskeneli; Guhrer Saruhan-Direskeneli; Sofia Vargas; Amr H Sawalha; Matthew A Brown; Sule Yavuz; Mahdi Mahmoudi; Javier Martin
Journal:  Rheumatology (Oxford)       Date:  2019-02-01       Impact factor: 7.580

4.  Methods for the study of long noncoding RNA in cancer cell signaling.

Authors:  Yi Feng; Xiaowen Hu; Youyou Zhang; Dongmei Zhang; Chunsheng Li; Lin Zhang
Journal:  Methods Mol Biol       Date:  2014

5.  A genome-wide association study identified AFF1 as a susceptibility locus for systemic lupus eyrthematosus in Japanese.

Authors:  Yukinori Okada; Kenichi Shimane; Yuta Kochi; Tomoko Tahira; Akari Suzuki; Koichiro Higasa; Atsushi Takahashi; Tetsuya Horita; Tatsuya Atsumi; Tomonori Ishii; Akiko Okamoto; Keishi Fujio; Michito Hirakata; Hirofumi Amano; Yuya Kondo; Satoshi Ito; Kazuki Takada; Akio Mimori; Kazuyoshi Saito; Makoto Kamachi; Yasushi Kawaguchi; Katsunori Ikari; Osman Wael Mohammed; Koichi Matsuda; Chikashi Terao; Koichiro Ohmura; Keiko Myouzen; Naoya Hosono; Tatsuhiko Tsunoda; Norihiro Nishimoto; Tsuneyo Mimori; Fumihiko Matsuda; Yoshiya Tanaka; Takayuki Sumida; Hisashi Yamanaka; Yoshinari Takasaki; Takao Koike; Takahiko Horiuchi; Kenshi Hayashi; Michiaki Kubo; Naoyuki Kamatani; Ryo Yamada; Yusuke Nakamura; Kazuhiko Yamamoto
Journal:  PLoS Genet       Date:  2012-01-26       Impact factor: 5.917

6.  lncRNA MIR100HG-derived miR-100 and miR-125b mediate cetuximab resistance via Wnt/β-catenin signaling.

Authors:  Yuanyuan Lu; Xiaodi Zhao; Qi Liu; Cunxi Li; Ramona Graves-Deal; Zheng Cao; Bhuminder Singh; Jeffrey L Franklin; Jing Wang; Huaying Hu; Tianying Wei; Mingli Yang; Timothy J Yeatman; Ethan Lee; Kenyi Saito-Diaz; Scott Hinger; James G Patton; Christine H Chung; Stephan Emmrich; Jan-Henning Klusmann; Daiming Fan; Robert J Coffey
Journal:  Nat Med       Date:  2017-10-16       Impact factor: 53.440

7.  Alzheimer's and Parkinson's Diseases Predict Different COVID-19 Outcomes: A UK Biobank Study.

Authors:  Yizhou Yu; Marco Travaglio; Rebeka Popovic; Nuno Santos Leal; Luis Miguel Martins
Journal:  Geriatrics (Basel)       Date:  2021-01-26

8.  Suppressor of Ty homolog-5, a novel tumor-specific human telomerase reverse transcriptase promoter-binding protein and activator in colon cancer cells.

Authors:  Rui Chen; Jing Zhu; Yong Dong; Chao He; Xiaotong Hu
Journal:  Oncotarget       Date:  2015-10-20

9.  The mutational constraint spectrum quantified from variation in 141,456 humans.

Authors:  Konrad J Karczewski; Laurent C Francioli; Grace Tiao; Beryl B Cummings; Jessica Alföldi; Qingbo Wang; Ryan L Collins; Kristen M Laricchia; Andrea Ganna; Daniel P Birnbaum; Laura D Gauthier; Harrison Brand; Matthew Solomonson; Nicholas A Watts; Daniel Rhodes; Moriel Singer-Berk; Eleina M England; Eleanor G Seaby; Jack A Kosmicki; Raymond K Walters; Katherine Tashman; Yossi Farjoun; Eric Banks; Timothy Poterba; Arcturus Wang; Cotton Seed; Nicola Whiffin; Jessica X Chong; Kaitlin E Samocha; Emma Pierce-Hoffman; Zachary Zappala; Anne H O'Donnell-Luria; Eric Vallabh Minikel; Ben Weisburd; Monkol Lek; James S Ware; Christopher Vittal; Irina M Armean; Louis Bergelson; Kristian Cibulskis; Kristen M Connolly; Miguel Covarrubias; Stacey Donnelly; Steven Ferriera; Stacey Gabriel; Jeff Gentry; Namrata Gupta; Thibault Jeandet; Diane Kaplan; Christopher Llanwarne; Ruchi Munshi; Sam Novod; Nikelle Petrillo; David Roazen; Valentin Ruano-Rubio; Andrea Saltzman; Molly Schleicher; Jose Soto; Kathleen Tibbetts; Charlotte Tolonen; Gordon Wade; Michael E Talkowski; Benjamin M Neale; Mark J Daly; Daniel G MacArthur
Journal:  Nature       Date:  2020-05-27       Impact factor: 69.504
