
VThunter: a database for single-cell screening of virus target cells in the animal kingdom.

Dongsheng Chen1,2, Cong Tan1, Peiwen Ding1,2, Lihua Luo1,2, Jiacheng Zhu1,2, Xiaosen Jiang1,2, Zhihua Ou1,3, Xiangning Ding1,2, Tianming Lan1,2, Yixin Zhu1,2, Yi Jia1, Yanan Wei1,4, Runchu Li1,4, Qiuyu Qin1,4, Chengcheng Sun1,2, Wandong Zhao1,4, Zhiyuan Lv1,4, Haoyu Wang1,2, Wendi Wu1,4, Yuting Yuan1,5, Mingyi Pu1,4, Yuejiao Li1, Yanan Zhang1,6, Ashley Chang1, Guoji Guo7, Yong Bai1, Xin Jin1,8,9, Huan Liu1,2.   

Abstract

Viral infectious diseases are a devastating and continuing threat to human and animal health. Receptor binding is the key step for viral entry into host cells. Therefore, recognizing viral receptors is fundamental for understanding the potential tissue tropism or host range of these pathogens. The rapid advancement of single-cell RNA sequencing (scRNA-seq) technology has paved the way for studying the expression of viral receptors in different tissues of animal species at single-cell resolution, resulting in huge scRNA-seq datasets. However, effectively integrating or sharing these datasets among the research community is challenging, especially for laboratory scientists. In this study, we manually curated up-to-date datasets generated in animal scRNA-seq studies, analyzed them using a unified processing pipeline, and comprehensively annotated 107 viral receptors in 142 viruses and obtained accurate expression signatures in 2 100 962 cells from 47 animal species. Thus, the VThunter database provides a user-friendly interface for the research community to explore the expression signatures of viral receptors. VThunter offers an informative and convenient resource for scientists to better understand the interactions between viral receptors and animal viruses and to assess viral pathogenesis and transmission in species. Database URL: https://db.cngb.org/VThunter/.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.


Year:  2022        PMID: 34634807      PMCID: PMC8728219          DOI: 10.1093/nar/gkab894

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

The COVID-19 pandemic has caused huge loss of human life, economic recession, and social disruption worldwide, underscoring the destructive impact of infectious diseases on human health and global security. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) can infect various animals, including bats, pangolins, cats, dogs, ferrets and minks (1–6). With the occurrence of anthropogenic transmission of SARS-CoV-2 to animals, the potential host range of this virus continues to raise concerns within the scientific community. Moreover, the emergence of SARS-CoV-2 highlights the need for more rapid host range identification when novel pathogens emerge. In recent decades, tremendous progress has been made in characterizing the host range and tissue tropism of viruses using traditional methods such as epidemiological investigations and animal infection experiments. While these approaches are essential to establish bona fide viral infection in animals, large-scale screening of the diverse species that might be susceptible to a given pathogen is impractical because of the limited availability of viruses, animals and experimental resources. Recent advances in scRNA-seq technology have opened up new ways to identify the cell types in various tissues and organs and to profile gene expression landscapes at single-cell resolution. These profiles hold tremendous potential for predicting the cell types, tissues and organs targeted by viruses based on the expression of their receptors across cell types. For example, in a previous study (7) we conducted scRNA-seq on 11 representative species of pets, livestock, poultry and wildlife to build expression profiles of all cell types and to screen the potential target cell types and hosts of SARS-CoV-2; cats were found to be highly susceptible to SARS-CoV-2, in accordance with serological and experimental findings by other researchers (3,6,8).
Because cellular receptors play a critical role in viral entry into cells, identifying tissue tropism is the first step towards understanding the pathogenesis and transmission of viruses in different hosts, thus laying the foundation for the prevention and control of potential outbreaks (9). Predicting host and tissue tropism from comprehensive gene expression patterns at single-cell resolution is promising but presents challenges for laboratory biologists and experimentalists, as the huge amount of data obtained from scRNA-seq studies can be daunting for those with limited backgrounds in bioinformatics. Several databases have been developed to make the rapidly increasing volume of scRNA-seq data available. Raw data and expression matrices produced in scRNA-seq studies can be submitted to freely available primary archives, such as ArrayExpress (10) and Gene Expression Omnibus (GEO) (11), for academic publication. In addition, several value-added databases have been developed through manual curation and comprehensive integration of numerous scRNA-seq datasets, such as CancerSEA (12), CellMarker (13), SC2disease (14) and TISCH (15), which mainly serve research on human diseases and cancers. An integrated, multidimensional analysis of viral receptor information and publicly available scRNA-seq gene expression profiles is urgently needed to determine the host tropism of animal viruses. Unfortunately, no comprehensive database is available for bench scientists and researchers to conveniently obtain viral receptor expression information for tissue- and organ-specific cell types across animal species. To fill this gap, we collected and manually curated 285 up-to-date scRNA-seq datasets, which included 2 100 962 cells from 47 animal species.
We analyzed the datasets using a unified processing pipeline, integrated them with expert-curated receptor information for 142 viruses, and obtained accurate expression signatures of the viral receptors in 47 animal species. Information on viral receptor expression signatures is fundamental for understanding the molecular mechanisms underlying host infection by viruses. We therefore developed a comprehensive and user-friendly database, named VThunter, to ensure that the curated data are publicly available and can be easily used. In short, VThunter is a value-added database with transformative information to facilitate the study of the cross-species transmission mechanisms of animal viruses.

DATA COLLECTION AND DATABASE CONTENT

In total, 285 animal scRNA-seq datasets generated from 2 100 962 cells in 47 animal species were collected and used to predict the cell types targeted by viruses (Figure 1A, Supplementary Data 1 and Supplementary Data 2). The list of these 285 scRNA-seq datasets is available on the database ‘Download’ page and includes detailed metadata for each dataset, such as data source, time, technology, species name, sample tissue, treatment, cell number and URL for the related literature. The scRNA-seq datasets were identified by literature search and downloaded from multiple scRNA-seq data repositories, including Gene Expression Omnibus (NCBI/GEO) (11), Human Cell Atlas Portal (HCA) (16), Single Cell Expression Atlas (EMBL-EBI/SCEA) (17) and Mouse Cell Atlas (MCA) (18). Receptor information for 142 animal viruses was obtained from the Viral Receptor database (19) and UniProt (20) (Supplementary Data 3).
Figure 1.

Overview of data collection, data processing and functional modules of VThunter database. (A) Diagram of species and tissues of collected scRNA-Seq datasets. (B) Steps of unified pipeline to process the curated scRNA-Seq datasets. (C) Overview of functional modules and user interfaces.

The publications describing the studies that generated these scRNA-seq datasets were manually verified by a group of experienced researchers, and all scRNA-seq datasets were processed with a unified analysis pipeline (Figure 1B). Briefly, the datasets were processed using utilities packaged in Seurat v3.0 together with in-house scripts, following previous studies (21,22). First, for each dataset, we performed quality control by filtering out cells with <200 expressed genes and genes expressed in <1 cell. Then, the ‘NormalizeData’ function in Seurat v3.0 was used to normalize the sparse single-cell gene expression matrix. Highly variable genes were identified using the ‘FindVariableGenes’ function, and the top 2000 highly variable genes were used for dimensionality reduction by principal component analysis (PCA). Based on the PCA elbow plot, the top 20 PCs were selected and used for clustering. Based on the transcriptomic profiles derived from the scRNA-seq datasets, the expression patterns of virus receptor genes in various cell types were investigated. In total, expression signatures of 107 viral receptors were generated for all cell types obtained from various tissues of 47 animal species. These 107 viral receptors can be recognized by 142 viruses from 23 families, potentially mediating infection.
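
The quality-control and normalization steps described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline, which relies on Seurat (in R) and in-house scripts. The function names, the order in which the two filters are applied, and the normalization formula log1p(count / cell total × 10,000), which mirrors what is commonly described as Seurat's default log-normalization, are all assumptions made for illustration.

```python
import math


def filter_cells_and_genes(counts, min_genes_per_cell=200, min_cells_per_gene=1):
    """Filter a cells x genes count matrix given as a list of rows.

    Thresholds default to the values quoted in the text; applying the cell
    filter before the gene filter is an assumption of this sketch.
    Returns the filtered matrix and the indices of the genes that were kept.
    """
    # Keep cells that express (count > 0) at least `min_genes_per_cell` genes.
    kept_cells = [
        row for row in counts
        if sum(1 for c in row if c > 0) >= min_genes_per_cell
    ]
    if not kept_cells:
        return [], []
    n_genes = len(kept_cells[0])
    # Keep genes detected in at least `min_cells_per_gene` of the kept cells.
    kept_gene_idx = [
        g for g in range(n_genes)
        if sum(1 for row in kept_cells if row[g] > 0) >= min_cells_per_gene
    ]
    filtered = [[row[g] for g in kept_gene_idx] for row in kept_cells]
    return filtered, kept_gene_idx


def log_normalize(counts, scale_factor=10_000):
    """Scale each cell to `scale_factor` total counts, then apply log1p."""
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append(
            [math.log1p(c / total * scale_factor) if total else 0.0 for c in row]
        )
    return normalized


if __name__ == "__main__":
    # Toy matrix (3 cells x 3 genes); a threshold of 2 genes per cell is used
    # here only so that the tiny example is not filtered out entirely.
    toy = [[1, 0, 3], [0, 0, 0], [2, 0, 2]]
    filtered, genes = filter_cells_and_genes(toy, min_genes_per_cell=2)
    print(filtered, genes)
    print(log_normalize(filtered))
```

Highly variable gene selection, PCA and clustering are omitted from the sketch because they depend on numerical libraries and on parameters that the text does not specify beyond the top 2000 genes and top 20 PCs.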

DATABASE CONSTRUCTION AND USER INTERFACE

VThunter can be accessed publicly and freely through a web browser by bench researchers worldwide. The VThunter web application was implemented on a high-performance Linux server with open-source software and is equipped with a real-time search engine. VThunter's web interface allows users to intuitively browse and precisely query the expression signatures of viral receptors at single-cell resolution. Figure 1C shows the schematic workflow and main functional modules of the database. The navigation menu contains seven icons, ‘Home’, ‘Virus Spectrum’, ‘Host Spectrum’, ‘Demo’, ‘Co-expression’, ‘Download’ and ‘Help’, which lead users to the corresponding functional interfaces. In addition to the header and navigation menu, the ‘Home’ page has four main elements: search forms for virus receptors or virus target genes, galleries of representative viral and animal species, and statistics on the data resources maintained by VThunter (Figure 2). Users interested in the host spectrum of a certain virus can query the virus in the search box or select it in the virus gallery to enter a virus page with comprehensive information on viral receptors and host tropism, including target genes and target species together with the expression profiles of the target genes in tissues and cell types. A researcher who wants to focus on a specific animal species can select the species of interest in the animal species gallery and enter an animal species page, where the receptor expression profiles of all viruses that may potentially infect that species are presented.
Figure 2.

Statistics of viruses with curated receptors and of animal species with scRNA-seq datasets in this database. Pie charts showing the percentage of viruses with curated receptors according to their strand type (A), family (B) and receptors (C), and of curated scRNA-seq datasets according to species classification (D), species system (E) and organs (F).

On the ‘Virus Spectrum’ page, users can browse all the viruses with comprehensive receptor and host tropism information collected in this database. The ‘Check Details’ button under a virus icon leads users to the virus page (Figure 3). Similarly, users can select a certain animal species on the ‘Host Spectrum’ page and click the ‘Check Details’ button under the species icon to enter an animal species page (Figure 4). On the ‘Demo’ page, users can quickly view the content and format of the data retrievable from VThunter. On the ‘Download’ page, links to all the raw data and resultant files maintained in this database are provided so that interested researchers can conduct further analyses tailored to their needs. On the ‘Help’ page, a graphical operation guide helps new users learn how to query relevant information and make full use of the resources in VThunter. In addition, the ‘Co-expression’ module allows users to inspect genes co-expressed with a given viral receptor based on the comprehensive scRNA-seq expression profiles in VThunter (Figure 5).
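
To illustrate the idea behind a co-expression query, the sketch below ranks genes by their correlation with a receptor gene across cells. The text does not state which co-expression measure VThunter uses, so the Pearson correlation, the function names, the placeholder genes GENE_A to GENE_C and all expression values are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of expression values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # A gene with constant expression has no defined correlation; report 0.0.
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_coexpressed(expression, receptor, top_n=5):
    """Rank all other genes by correlation with `receptor`, highest first.

    `expression` maps gene name -> list of per-cell expression values,
    with every list covering the same cells in the same order.
    """
    target = expression[receptor]
    scores = [
        (gene, pearson(target, values))
        for gene, values in expression.items()
        if gene != receptor
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical normalized expression values for four cells.
    toy = {
        "ACE2": [0.0, 1.0, 2.0, 3.0],
        "GENE_A": [0.0, 2.0, 4.0, 6.0],
        "GENE_B": [3.0, 2.0, 1.0, 0.0],
        "GENE_C": [1.0, 1.0, 1.0, 1.0],
    }
    for gene, score in rank_coexpressed(toy, "ACE2"):
        print(gene, round(score, 3))
```

In practice, the sparsity of scRNA-seq data means that correlations between individual genes are noisy, so a ranking of this kind is best read as a starting point for further inspection.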
Figure 3.

Demonstration of the Virus Spectrum module. (A) Searching for a virus via the ‘Home’ page or the ‘Virus Spectrum’ page. (B) Operation to obtain detailed information on the virus page. (C) Screenshot of the virus information page. (D) Operation to view the different receptors of each virus on the virus page. (E, F) Operation and screenshot of checking the scRNA-seq expression profile of viral receptors.

Figure 4.

Demonstration of the Host Spectrum module. (A) Screenshot of searching for an animal species via the ‘Home’ page or the ‘Host Spectrum’ page. (B) Operation to check detailed information on a certain species. (C) Screenshot of the species information page. (D) Operation to check information on potential viruses on the species page. (E, F) Operation and screenshot of checking the scRNA-seq expression profile of virus receptors on the species page.

Figure 5.

Demonstration of gene co-expression module. (A–D) Operations to enter ‘Co-expression’ page and perform co-expression query. (E) Screenshot of co-expression information of given genes.


APPLICATION CASE

VThunter is a comprehensive database designed for virological study, where users can search for viruses of interest to obtain information on host tropism, including target tissues and cell types in particular animal species, and can investigate which viruses potentially infect an animal species. If a researcher wants to know which animal species may be targeted by Rabies lyssavirus, they can take the following steps to obtain the relevant information, as shown in Figure 3: (i) Select ‘Rhabdoviridae’ in the virus family option list → select ‘Rabies lyssavirus’ in the virus option list → select ‘GRM2’ → click ‘Search’. (ii) (Optional) Find Rabies lyssavirus in the representative virus gallery or on the ‘Virus Spectrum’ page, then click the ‘Check Details’ button under the virus icon. (iii) Users are guided to the virus tropism page, which displays the animals with expression records for GRM2. Clicking the ‘Check Details’ button to the right of a specific animal, such as civet, displays the taxonomic lineage of civet together with literature-based general information about Rabies lyssavirus and its receptor. All the scRNA-seq datasets collected in this database and the relevant metadata, including tissue type, animal health status and experimental details, are listed in a data source form. After a dataset is chosen, for example ‘Vthunter_007’, an overall cell type cluster figure is displayed on the left and the expression of GRM2 in each cell type is displayed on the right. Furthermore, a boxplot showing the expression level of GRM2 in different tissues of civet is provided. In addition to querying host tropism for a certain virus, users may also want to know which viruses can affect the health of a certain animal species. Here, we take cat as the animal of interest (Figure 4).
Briefly, the search process involves three steps: (i) Click the cat species icon in the representative gallery on the homepage, or enter the ‘Host Spectrum’ page → select ‘Mammals’ in the classification option list and ‘Cat’ in the species option list → click ‘Search’ → enter the cat page. (ii) This page gives an overview of which viruses might infect cats, which genes are targeted as receptors and which cat tissues express the target genes. (iii) After clicking the link for the target gene ‘ACE2’, the receptor of SARS-CoV-2, users are led to a page showing the details of scRNA-seq studies conducted on cat and the expression levels of ACE2 in each cell type and in different tissues. These simple search steps highlight the user-friendly and highly interactive interface of VThunter, which helps users explore the expression signatures of viral receptors. VThunter provides the expression signatures of viral receptors in all cell types of 47 animal species at single-cell resolution to help clarify the interactions between host cells and viral surface proteins. It also provides quick download access to all raw data and resultant files maintained in the database to meet individual needs. These features make VThunter a reliable and useful database for the study of the cross-species transmission mechanisms of animal viruses.
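
The per-cell-type view of receptor expression described above can be approximated offline from downloaded data with a short summary like the following. It is a sketch under stated assumptions: the function name, the input layout (one cell type label and one expression value per cell) and the summary statistics chosen are illustrative and are not taken from VThunter itself, and the example values are invented.

```python
from collections import defaultdict


def summarize_by_cell_type(cell_types, expression):
    """Summarize one gene's expression per annotated cell type.

    `cell_types` and `expression` are parallel lists with one entry per cell.
    Returns, for each cell type, the number of cells, the fraction of cells
    with non-zero expression, and the mean expression value.
    """
    grouped = defaultdict(list)
    for cell_type, value in zip(cell_types, expression):
        grouped[cell_type].append(value)

    summary = {}
    for cell_type, values in grouped.items():
        expressing = sum(1 for v in values if v > 0)
        summary[cell_type] = {
            "n_cells": len(values),
            "fraction_expressing": expressing / len(values),
            "mean_expression": sum(values) / len(values),
        }
    return summary


if __name__ == "__main__":
    # Hypothetical annotations and normalized ACE2 values for six cells.
    labels = ["epithelial", "epithelial", "epithelial", "T cell", "T cell", "T cell"]
    ace2 = [2.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    for cell_type, stats in summarize_by_cell_type(labels, ace2).items():
        print(cell_type, stats)
```

Reporting the fraction of expressing cells alongside the mean is a deliberate choice here, since a receptor expressed in only a small subset of cells can have a low mean while still marking a potential target population.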

SUMMARY AND FUTURE PERSPECTIVES

With the rapid accumulation of scRNA-seq data from more and more species, it is time to fully archive these resources and apply them in virological study, especially when a novel animal virus emerges. We believe that host range assessment based on archived cellular receptor profiles can serve as an effective surrogate to narrow down the list of suspected hosts and guide experimental design for bench scientists, given that viral entry is the first step of infection and transmission in the complete viral life cycle. Here, we have presented VThunter to reach this goal, in which the expression signatures of 107 viral receptors used by 142 viruses in various cell types of tissues from 47 animal species are freely accessible. However, a virus may still be capable of infecting various tissues or cells even when the expression level of its receptor is low (23). Researchers are therefore advised to keep this limitation in mind and interpret scRNA-seq data with care. The database will be extended in the following respects. First, feedback and suggestions from users will be addressed promptly to improve the performance and scientific value of the database. Second, more comprehensive scRNA-seq datasets produced in future studies, together with the latest findings on viral receptors, will be collected at regular intervals and incorporated into the database. Third, other multi-omics datasets related to animal virus infection and transmission, such as proteomics and metabolomics data, are expected to be manually curated and integrated into the database. Fourth, as new studies and validations of viral infection in various species are released, the information will be continually curated by experts and integrated with the relevant data in this database.
  20 in total

1.  TISCH: a comprehensive web resource enabling interactive single-cell transcriptome visualization of tumor microenvironment.

Authors:  Dongqing Sun; Jin Wang; Ya Han; Xin Dong; Jun Ge; Rongbin Zheng; Xiaoying Shi; Binbin Wang; Ziyi Li; Pengfei Ren; Liangdong Sun; Yilv Yan; Peng Zhang; Fan Zhang; Taiwen Li; Chenfei Wang
Journal:  Nucleic Acids Res       Date:  2020-11-12       Impact factor: 16.971

2.  Mapping the Mouse Cell Atlas by Microwell-Seq.

Authors:  Xiaoping Han; Renying Wang; Yincong Zhou; Lijiang Fei; Huiyu Sun; Shujing Lai; Assieh Saadatpour; Ziming Zhou; Haide Chen; Fang Ye; Daosheng Huang; Yang Xu; Wentao Huang; Mengmeng Jiang; Xinyi Jiang; Jie Mao; Yao Chen; Chenyu Lu; Jin Xie; Qun Fang; Yibin Wang; Rui Yue; Tiefeng Li; He Huang; Stuart H Orkin; Guo-Cheng Yuan; Ming Chen; Guoji Guo
Journal:  Cell       Date:  2018-02-22       Impact factor: 41.582

Review 3.  Virus-Receptor Interactions: The Key to Cellular Invasion.

Authors:  Melissa S Maginnis
Journal:  J Mol Biol       Date:  2018-06-18       Impact factor: 5.469

4.  A pneumonia outbreak associated with a new coronavirus of probable bat origin.

Authors:  Peng Zhou; Xing-Lou Yang; Xian-Guang Wang; Ben Hu; Lei Zhang; Wei Zhang; Hao-Rui Si; Yan Zhu; Bei Li; Chao-Lin Huang; Hui-Dong Chen; Jing Chen; Yun Luo; Hua Guo; Ren-Di Jiang; Mei-Qin Liu; Ying Chen; Xu-Rui Shen; Xi Wang; Xiao-Shuang Zheng; Kai Zhao; Quan-Jiao Chen; Fei Deng; Lin-Lin Liu; Bing Yan; Fa-Xian Zhan; Yan-Yi Wang; Geng-Fu Xiao; Zheng-Li Shi
Journal:  Nature       Date:  2020-02-03       Impact factor: 69.504

5.  Cell membrane proteins with high N-glycosylation, high expression and multiple interaction partners are preferred by mammalian viruses as receptors.

Authors:  Zheng Zhang; Zhaozhong Zhu; Wenjun Chen; Zena Cai; Beibei Xu; Zhiying Tan; Aiping Wu; Xingyi Ge; Xinhong Guo; Zhongyang Tan; Zanxian Xia; Haizhen Zhu; Taijiao Jiang; Yousong Peng
Journal:  Bioinformatics       Date:  2019-03-01       Impact factor: 6.937

6.  Infection of dogs with SARS-CoV-2.

Authors:  Thomas H C Sit; Christopher J Brackman; Sin Ming Ip; Karina W S Tam; Pierra Y T Law; Esther M W To; Veronica Y T Yu; Leslie D Sims; Dominic N C Tsang; Daniel K W Chu; Ranawaka A P M Perera; Leo L M Poon; Malik Peiris
Journal:  Nature       Date:  2020-05-14       Impact factor: 49.962

7.  UniProt: the universal protein knowledgebase in 2021.

Authors: 
Journal:  Nucleic Acids Res       Date:  2021-01-08       Impact factor: 16.971

8.  ArrayExpress update - from bulk to single-cell expression data.

Authors:  Awais Athar; Anja Füllgrabe; Nancy George; Haider Iqbal; Laura Huerta; Ahmed Ali; Catherine Snow; Nuno A Fonseca; Robert Petryszak; Irene Papatheodorou; Ugis Sarkans; Alvis Brazma
Journal:  Nucleic Acids Res       Date:  2019-01-08       Impact factor: 16.971

9.  Susceptibility of ferrets, cats, dogs, and other domesticated animals to SARS-coronavirus 2.

Authors:  Jianzhong Shi; Zhiyuan Wen; Gongxun Zhong; Huanliang Yang; Chong Wang; Baoying Huang; Renqiang Liu; Xijun He; Lei Shuai; Ziruo Sun; Yubo Zhao; Peipei Liu; Libin Liang; Pengfei Cui; Jinliang Wang; Xianfeng Zhang; Yuntao Guan; Wenjie Tan; Guizhen Wu; Hualan Chen; Zhigao Bu
Journal:  Science       Date:  2020-04-08       Impact factor: 47.728

10.  Transmission of SARS-CoV-2 on mink farms between humans and mink and back to humans.

Authors:  Bas B Oude Munnink; Reina S Sikkema; David F Nieuwenhuijse; Robert Jan Molenaar; Emmanuelle Munger; Richard Molenkamp; Arco van der Spek; Paulien Tolsma; Ariene Rietveld; Miranda Brouwer; Noortje Bouwmeester-Vincken; Frank Harders; Renate Hakze-van der Honing; Marjolein C A Wegdam-Blans; Ruth J Bouwstra; Corine GeurtsvanKessel; Annemiek A van der Eijk; Francisca C Velkers; Lidwien A M Smit; Arjan Stegeman; Wim H M van der Poel; Marion P G Koopmans
Journal:  Science       Date:  2020-11-10       Impact factor: 47.728

