
Dataset of mutational analysis, miRNAs targeting SARS-CoV-2 genes and host gene expression in SARS-CoV and SARS-CoV-2 infections.

Rahila Sardar, Deepshikha Satish, Shweta Birla, Dinesh Gupta.

Abstract

Identifying host miRNAs that target mutated virus genes is crucial for understanding the miRNA-mediated host defense mechanism in virus infections. To understand this mechanism in COVID-19, we collected SARS-CoV-2 genome sequences and their metadata from the GISAID database (submitted till April 2020) and identified mutational changes in the sequences. The dataset consists of genes with mutation event counts and entropy scores. We predicted host miRNAs targeting the genes in these genomes and compared them with those predicted for related viral species. Using the miRNA target prediction software miRanda, we identified 2284 miRNAs targeting MERS genomes, 2074 miRNAs targeting SARS genomes, and 1599 miRNAs targeting SARS-CoV-2 genomes. The host miRNAs targeting SARS-CoV-2 genes were then checked against the literature for reported antiviral activity and for roles in respiratory diseases, which identified 42 conserved antiviral miRNAs. The data could be used to validate the antiviral role of the predicted miRNAs and to design miRNA-based therapeutics against SARS-CoV-2.
© 2020 The Author(s).

Keywords:  Antiviral; COVID-19; Gene expression; Mutation; miRNA

Year:  2020        PMID: 32864402      PMCID: PMC7442128          DOI: 10.1016/j.dib.2020.106207

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the Data

The data consist of mutational information, such as event counts and entropy scores, for SARS-CoV-2 genomes from different countries, together with the predicted miRNA targets. They could be used to understand host-virus interaction mechanisms and to design miRNA-based therapeutics.

The data can be useful to virologists, biologists, bioinformaticians, pharmacologists, and biochemists interested in investigating the role of host miRNAs during SARS-CoV-2 infection.

The 42 previously validated antiviral miRNAs, which are also predicted to target SARS-CoV-2 genes, may be used to design experiments for miRNA-based therapeutics for COVID-19. Of these 42 miRNAs, 12 are artificial miRNAs that have been experimentally validated to inhibit HIV-1 replication without any off-targets.

The gene expression data analysis will help to understand the host response to SARS-CoV-2 infection and may be used for comparative analysis with other viral infections.

Worldwide research efforts are under way to control the COVID-19 pandemic. The dataset provides information that may be explored experimentally towards the development of alternative therapies, and could therefore have an impact on society.

Data description

The SARS-CoV-2 reference genome shares 83% amino acid (aa) sequence similarity with the SARS-CoV genome; the details are presented in Table S1, which also lists the aa and nucleotide variations in the SARS-CoV-2 genome sequence relative to SARS-CoV. Mutational analysis results for the SARS-CoV-2 sequences downloaded from GISAID are stored as an Excel file, Table S2. Sheet1 (named Entropy) contains the entropy scores (between 0 and 1) for the 28 SARS-CoV-2 genes, with the genomic locations of the mutations. Sheet2 (named events) contains the mutation event counts from different countries with the corresponding SARS-CoV-2 genes.

The list of miRNAs targeting the SARS, MERS, and SARS-CoV-2 reference genomes is available as an Excel file named Table S3. The file lists the predicted targets with the corresponding miRanda score, binding energy, miRNA start and end site, target start and end site, alignment length, similarity, and identity percentage. These miRNAs were compared with a list of experimentally proven antiviral miRNAs (against any known virus) from the literature. We found 42 such miRNAs; their miRBase accession ID, miRNA sequence, viral target with its UniProt ID, and PubMed accession number (PMID) are available in Table S3, sheet3 (named 42_miRNA).

The gene expression datasets representing virus-infected human cell lines were retrieved from the GEO database. We downloaded the NCBI GEO SARS-CoV microarray dataset (GEO ID GSE17400) and the SARS-CoV-2 RNA sequencing dataset (GEO ID GSE147507) (Table S4), and analysed the samples collected at 24 h in both datasets. The SARS-CoV-2 data, generated from A549 and NHBE cell lines, showed 131 and 606 differentially expressed genes, respectively. During SARS-CoV infection, 2796 genes were differentially expressed (Table S4).
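The text states only that the entropy scores in Table S2 lie between 0 and 1; it does not give the formula. One common way to obtain a score in that range is the Shannon entropy of an alignment column divided by its maximum value. The Python sketch below illustrates that idea under this assumption. The function names and the toy alignment are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def normalized_entropy(column, alphabet="ACGT"):
    """Shannon entropy of one alignment column, scaled to the range [0, 1].

    Assumption: gaps and ambiguous bases are ignored, and the entropy is
    divided by log2(len(alphabet)), its maximum for that alphabet.
    """
    bases = [b for b in column.upper() if b in alphabet]
    if not bases:
        return 0.0
    total = len(bases)
    counts = Counter(bases)
    entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
    return entropy / math.log2(len(alphabet))


def column_entropies(aligned_seqs):
    """Return the normalized entropy of every column of an alignment."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    return [
        normalized_entropy("".join(seq[i] for seq in aligned_seqs))
        for i in range(length)
    ]


# Toy alignment (made up for illustration, not real viral sequence data).
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTC"]
scores = column_entropies(toy_alignment)
print(scores)  # [0.0, 0.0, 0.5, 0.75]
```

A fully conserved column scores 0, and a column in which all four bases are equally frequent scores 1, so higher values mark more variable positions.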

Experimental design, materials and methods

High-coverage, complete SARS-CoV-2 genome sequences and their corresponding metadata, submitted till 15th April 2020, were retrieved from the GISAID database [1]. The SARS-CoV (NC_004718.3) and MERS (KC164505.2) genomes were downloaded from the NCBI genome database and compared with the SARS-CoV-2 reference genome sequence (NC_045512.2; sequence from Wuhan, China). The complete, high-coverage SARS-CoV-2 genomes from GISAID were subjected to mutational analysis using the Genome Detective Coronavirus subtyping Tool (version 1.1.3) [2]. To remove redundancy in the sequences, the genome nucleotide sequences that share 99.99% similarity with the reference genome (NC_004718.3) were excluded from further analysis.

We downloaded the sequences of all 2654 available mature human miRNAs from miRBase (Release 22.1) [3]. These miRNA sequences were used to identify potential miRNA target sites in the virus genome sequences, using miRanda (version 3.3a) [4] with an energy threshold of −20 kcal/mol, a threshold also used in other studies [5]. SARS-CoV (NC_004718.3), MERS (NC_019843.3), and the SARS-CoV-2 isolate from Wuhan (NC_045512.2) were used as target viral genomes. Additionally, we surveyed the literature to identify experimentally validated antiviral miRNAs, including artificial miRNAs. Comparing our human miRNA target predictions for the three viruses with the list of antiviral miRNAs (against any known virus) from the literature survey, we found 42 miRNAs in common. Twelve of these are artificial miRNAs, although they too are predicted to target the three viral genomes studied here.

Differentially expressed genes with p-value ≤ 0.005 in the SARS-CoV and SARS-CoV-2 expression data were used for the analysis. The details of the analysis of these datasets are available from Sardar et al. [6].
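The comparison described above, which keeps predicted hits at or below the −20 kcal/mol energy threshold and then intersects the miRNAs found for all three viruses with the literature list, can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline. The `Hit` record, the function names, the inclusive handling of the threshold, and the placeholder miRNA names and values are all assumptions made for the example.

```python
from typing import NamedTuple

ENERGY_THRESHOLD = -20.0  # kcal/mol, the threshold stated in the text


class Hit(NamedTuple):
    """One predicted miRNA-target pair (hypothetical record layout)."""
    mirna: str
    virus: str
    score: float
    energy: float  # predicted binding energy in kcal/mol


def passing_mirnas(hits, virus, threshold=ENERGY_THRESHOLD):
    """miRNAs with at least one hit on `virus` at or below the threshold.

    Assumption: the threshold is treated as inclusive.
    """
    return {h.mirna for h in hits if h.virus == virus and h.energy <= threshold}


def common_antiviral(hits, viruses, antiviral_list):
    """Antiviral miRNAs predicted to target every virus in `viruses`."""
    shared = set(antiviral_list)
    for virus in viruses:
        shared &= passing_mirnas(hits, virus)
    return shared


# Placeholder records for illustration only; these are not values from Table S3.
toy_hits = [
    Hit("miR-A", "SARS-CoV", 150.0, -22.1),
    Hit("miR-A", "MERS", 148.0, -20.0),
    Hit("miR-A", "SARS-CoV-2", 155.0, -25.3),
    Hit("miR-B", "SARS-CoV", 160.0, -19.9),  # weaker than the threshold
    Hit("miR-B", "MERS", 152.0, -23.0),
    Hit("miR-B", "SARS-CoV-2", 151.0, -21.4),
    Hit("miR-C", "SARS-CoV", 149.0, -24.0),
    Hit("miR-C", "MERS", 150.0, -22.5),
    Hit("miR-C", "SARS-CoV-2", 153.0, -26.0),
]
toy_antiviral = {"miR-A", "miR-B"}  # stands in for the literature survey list

result = common_antiviral(toy_hits, ["SARS-CoV", "MERS", "SARS-CoV-2"], toy_antiviral)
print(sorted(result))  # ['miR-A']
```

In this toy input, miR-B is dropped because its SARS-CoV hit is weaker than the threshold, and miR-C is dropped because it is not on the antiviral list.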

Ethics statement

The dataset is based on bioinformatic analysis; therefore, no animal has been used and/or harmed in the present investigation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.

Specifications Table

Subject: Bioinformatics, Genetics and Molecular Biology
Specific subject area: Bioinformatics
Type of data: Excel files
How data were acquired: GISAID; NCBI Genome; miRBase; miRanda; GEO database
Data format: Secondary data. The secondary data Excel files have been uploaded.
Parameters for data collection: The Genome Detective tool was used for mutational analysis. miRanda (version 3.3a), with an energy threshold of −20 kcal/mol. Differentially expressed genes were filtered for analysis with p-value ≤ 0.005.
Description of data collection: Entropy calculations were performed using metadata extracted from GISAID. miRNA targets in the virus genomes were obtained using the miRanda tool.
Data source location: GISAID; NCBI Genome; miRBase; GEO dataset IDs GSE17400 and GSE147507
Data accessibility: With the article
Related research article: R. Sardar, D. Satish, S. Birla and D. Gupta. Integrative analyses of SARS-CoV-2 genomes from different geographical locations reveal unique features potentially consequential to host-virus interaction, pathogenesis and clues for novel therapies, Heliyon. In Press.

References

1.  miRBase: microRNA sequences, targets and gene nomenclature.

Authors:  Sam Griffiths-Jones; Russell J Grocock; Stijn van Dongen; Alex Bateman; Anton J Enright
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

Review 2.  The Potential for microRNA Therapeutics and Clinical Research.

Authors:  Johora Hanna; Gazi S Hossain; Jannet Kocerha
Journal:  Front Genet       Date:  2019-05-16       Impact factor: 4.599

3.  Data, disease and diplomacy: GISAID's innovative contribution to global health.

Authors:  Stefan Elbe; Gemma Buckland-Merrett
Journal:  Glob Chall       Date:  2017-01-10

4.  Genome Detective Coronavirus Typing Tool for rapid identification and characterization of novel coronavirus genomes.

Authors:  Sara Cleemput; Wim Dumon; Vagner Fonseca; Wasim Abdool Karim; Marta Giovanetti; Luiz Carlos Alcantara; Koen Deforche; Tulio de Oliveira
Journal:  Bioinformatics       Date:  2020-06-01       Impact factor: 6.937

5.  The microRNA.org resource: targets and expression.

Authors:  Doron Betel; Manda Wilson; Aaron Gabow; Debora S Marks; Chris Sander
Journal:  Nucleic Acids Res       Date:  2007-12-23       Impact factor: 16.971

