Zhidong Tang1, Weiliang Fan1, Qiming Li1, Dehe Wang1, Miaomiao Wen2, Junhao Wang1, Xingqiao Li1, Yu Zhou1,2,3,4.
Abstract
Viral infections pose a major threat to living organisms and cause many diseases, such as COVID-19, caused by SARS-CoV-2, which has led to millions of deaths. To develop effective strategies to control viral infection, we need to understand the molecular events it triggers in host cells. Virus-related functional genomic datasets are growing rapidly; however, an integrative platform for systematically investigating host responses to viruses is missing. Here, we developed a user-friendly multi-omics portal of viral infection named MVIP (https://mvip.whu.edu.cn/). We manually collected available high-throughput sequencing data under viral infection and unified their detailed metadata, including virus, host species, infection time, assay and target. We processed multi-layered omics data of more than 4900 virus-infected samples from 77 viruses and 33 host species with standard pipelines, including RNA-seq, ChIP-seq and CLIP-seq. In addition, we integrated these genome-wide signals into customized genome browsers and developed multiple dynamic charts to display the information, such as time-course and differential gene expression profiles, alternative splicing changes and enriched GO/KEGG terms. Furthermore, we implemented several tools for efficiently mining virus-host interactions by virus, host and gene. MVIP will help users retrieve large-scale functional information and promote the understanding of virus-host interactions.
Year: 2022 PMID: 34718748 PMCID: PMC8689837 DOI: 10.1093/nar/gkab958
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic view of MVIP. (A) Information retrieval of omics data related to viral infections in various public databases. (B) Manual curation of metadata. (C) Overview of the generated data from the analyses of different types of omics data. (D) Overview of the database design and web interface of MVIP.
Summary of metadata in MVIP
| Class | Count |
|---|---|
| Sample | 6586 |
| Dataset | 4255 |
| Assay | 22 |
| Species | 33 |
| Biosample type | 9 |
| Tissue type | 76 |
| Cell type | 114 |
| Host | 183 |
| Virus family | 34 |
| Virus genus | 53 |
| Virus species | 68 |
| Virus name | 77 |
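The abstract lists the metadata fields unified during curation (virus, host species, infection time, assay and target) and describes searching records by virus and host. The following is a minimal sketch of how such a record and filter could be modeled. The class name, field names, and the example values in the usage note are hypothetical and are not taken from the MVIP schema or API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical curated sample record; not MVIP's actual schema."""
    sample_id: str
    virus: str
    host_species: str
    assay: str
    infection_time_h: Optional[float] = None  # hours post infection, if reported
    target: Optional[str] = None              # e.g. antibody target for ChIP-seq


def filter_records(
    records: Iterable[SampleRecord],
    virus: Optional[str] = None,
    host_species: Optional[str] = None,
    assay: Optional[str] = None,
) -> List[SampleRecord]:
    """Return records matching every criterion that is not None (case-insensitive)."""

    def matches(value: str, wanted: Optional[str]) -> bool:
        return wanted is None or value.lower() == wanted.lower()

    return [
        r for r in records
        if matches(r.virus, virus)
        and matches(r.host_species, host_species)
        and matches(r.assay, assay)
    ]
```

For example, `filter_records(records, virus="SARS-CoV-2", assay="RNA-seq")` would keep only the RNA-seq samples for that virus in a list of placeholder records.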
Summary of processed files during analysis
| File type | Count | Description |
|---|---|---|
| FeatureCounts (.tsv) | 4615 | The read counts for annotated genes |
| Stringtie (.tsv) | 4378 | The FPKM and TPM for annotated genes |
| Differential expression (.csv) | 1950 | Differential expression analysis results of viral infection vs. controls |
| GO/KEGG (.tsv) | 1846 | GO/KEGG enrichment analysis results of DEGs |
| Alternative splicing (.csv) | 5465 | Alternative splicing event analysis of virus-infected datasets |
| Peaks (.bed, .tsv, .txt, .png) | 2470 | Peak calling and annotation results of protein binding data |
| Translation efficiency (.txt) | 80 | Translation efficiency of ORFs |
| Methylation (.bed.gz) | 64 | Methylation values in three sequence contexts (CHG, CHH, CpG) |
| Mapping (.bw) | 10361 | Genomic signals from mapped reads |
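The table lists read counts alongside FPKM and TPM values for annotated genes. As a rough illustration of how these units relate, the sketch below derives FPKM and TPM from raw counts and gene lengths using the standard textbook definitions. It assumes each count represents one fragment and uses full gene length in base pairs; it is not StringTie's estimation procedure, and the function and parameter names are hypothetical.

```python
from typing import Dict, Tuple


def fpkm_and_tpm(
    counts: Dict[str, float],
    lengths_bp: Dict[str, float],
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Compute FPKM and TPM per gene from fragment counts and gene lengths (bp)."""
    total_fragments = sum(counts.values())

    # Fragments per kilobase of gene length (length normalization only).
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    rpk_sum = sum(rpk.values())

    if total_fragments == 0 or rpk_sum == 0:
        # No signal at all: report zeros rather than dividing by zero.
        zeros = {g: 0.0 for g in counts}
        return zeros, dict(zeros)

    # FPKM: scale length-normalized counts by library size in millions.
    fpkm = {g: rpk[g] / (total_fragments / 1e6) for g in counts}
    # TPM: scale so that the values of all genes sum to one million.
    tpm = {g: rpk[g] / rpk_sum * 1e6 for g in counts}
    return fpkm, tpm
```

Because TPM values always sum to one million within a sample, they are generally easier to compare across samples than FPKM values.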
Figure 2. Overview of metadata and data models. (A) Sankey diagram of virus, cell types, and host species in metadata. (B) Data counts of different assays by GEO series (GSE) and sample (GSM) for curated data up to September 2019. The inset pie charts represent the distribution of RNA-seq and ChIP-seq samples by species. (C) Data counts of RNA-seq and scRNA-seq by GSE and GSM for SARS-CoV-2 related data up to December 2020. (D) Modeling of metadata. (E) Modeling of data analysis pipeline and processed data.
Figure 3. Main modules and usage of MVIP. (A) Navigation bar of the main modules in MVIP web page. (B) Matrix-like viewer of available multi-omics data. (C) Three search modes for filtering MVIP records. (D) An example of search results including experiment ID, virus, host, species, assay, and target. (E) An example of the summary page describing an experiment. (F) An example of the analysis results for RNA-seq data. (G) An example of ChIP-seq analysis results. (H) An example of JBrowse2 view of omics data signals. (I) An example of UCSC genome browser view via MVIP track hub along with ENCODE ChIP-seq, UCSC conservation data and GTEx RNA-seq data.
Figure 4. Analysis tools in MVIP web server. (A) List of six analysis tools in MVIP. (B–G) Interactive figure examples generated from these six tools.