Nicholas Parkinson, Natasha Rodgers, Max Head Fourman, Bo Wang, Marie Zechner, Maaike C Swets, Jonathan E Millar, Andy Law, Clark D Russell, J Kenneth Baillie, Sara Clohisey.
Abstract
The increasing body of literature describing the role of host factors in COVID-19 pathogenesis demonstrates the need to combine diverse, multi-omic data to evaluate and substantiate the most robust evidence and inform development of therapies. Here we present a dynamic ranking of host genes implicated in human betacoronavirus infection (SARS-CoV-2, SARS-CoV, MERS-CoV, seasonal coronaviruses). We conducted an extensive systematic review of experiments identifying potential host factors. Gene lists from diverse sources were integrated using Meta-Analysis by Information Content (MAIC). This previously described algorithm uses data-driven gene list weightings to produce a comprehensive ranked list of implicated host genes. From 32 datasets, the top ranked gene was PPIA, encoding cyclophilin A, a druggable target using cyclosporine. Other highly-ranked genes included proposed prognostic factors (CXCL10, CD4, CD3E) and investigational therapeutic targets (IL1A) for COVID-19. Gene rankings also inform the interpretation of COVID-19 GWAS results, implicating FYCO1 over other nearby genes in a disease-associated locus on chromosome 3. Researchers can search and review the gene rankings and the contribution of different experimental methods to gene rank at https://baillielab.net/maic/covid19. As new data are published we will regularly update the list of genes as a resource to inform and prioritise future studies.
Year: 2020 PMID: 33339864 PMCID: PMC7749145 DOI: 10.1038/s41598-020-79033-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Methodologies accepted for inclusion in meta-analysis and associated labels.
| Accepted methodologies | MAIC category |
|---|---|
| CRISPR screen | CRISPR Screen |
| RNAi screen | RNAi |
| Protein–protein interaction e.g. yeast-2-hybrid screen | Protein–protein interaction |
| Host proteins incorporated into virion or virus like particle | Virus |
| Genetic Association Studies Human | Human genetics |
| Genetic Association Studies Non-human | Non-human genetics |
| Proteomic studies e.g. mass-spectrometry | Proteomics |
| Selected gene set screens | Gene set screen |
Figure 1. Overview of MAIC approach. (A) Schematic showing the operation of MAIC. Each entity in a list is given a score, based on overlap with other lists and rank where relevant, and each list is given a weight determined by the scores of its constituent entities. Entity scores are iteratively updated using list weights, and list weights are updated using entity scores, until convergence occurs. (B) Circular plot showing overlap between different data sources included in MAIC. Size of data source blocks is proportional to the summed information content (MAIC scores) of the input list. Lines are coloured according to the dominant data source. Data source categories share the same colour; the largest categories and data sources are labelled (see Supplementary Information for full source data). (C) Relative information contributions (determined by sum of MAIC score contributions) of each experimental category to the evidence base for the top 100 genes in the MAIC output. (D) Distribution of MAIC scores by gene rank. The shaded region indicates the range of possible scores for a gene supported by a single gene list only. Beyond ranks around 700 in this study, gene scores approach baseline, indicating they have little corroborative evidence.
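The alternating update described in panel (A) can be illustrated with a minimal Python sketch. This is not the published MAIC implementation: it assumes unranked lists, scores each gene as the sum of the weights of the lists containing it, sets each list weight to the mean score of its genes, and rescales weights so the largest is 1. The function names (`score_entities`, `maic_sketch`), the rescaling step, the tolerance, and the placeholder gene and list names are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def score_entities(lists: Dict[str, Set[str]],
                   weights: Dict[str, float]) -> Dict[str, float]:
    """Score each gene as the sum of the weights of the lists containing it."""
    scores: Dict[str, float] = {}
    for name, genes in lists.items():
        for gene in genes:
            scores[gene] = scores.get(gene, 0.0) + weights[name]
    return scores


def maic_sketch(lists: Dict[str, Set[str]],
                tol: float = 1e-9,
                max_iter: int = 1000) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Alternate gene-score and list-weight updates until the weights settle.

    Simplified illustration only (assumption): the real algorithm also uses
    ranks and experimental categories, which are ignored here.
    """
    if not lists:
        return {}, {}
    if any(len(genes) == 0 for genes in lists.values()):
        raise ValueError("every input list must contain at least one gene")

    weights = {name: 1.0 for name in lists}  # start with equal list weights
    for _ in range(max_iter):
        scores = score_entities(lists, weights)
        # A list's raw weight is the mean score of its constituent genes.
        raw = {name: sum(scores[g] for g in genes) / len(genes)
               for name, genes in lists.items()}
        top = max(raw.values())
        updated = {name: w / top for name, w in raw.items()}  # rescale to max 1
        change = max(abs(updated[n] - weights[n]) for n in lists)
        weights = updated
        if change < tol:
            break
    return score_entities(lists, weights), weights


# Toy input with placeholder names (not data from the study).
toy_lists = {
    "list_a": {"GENE_1", "GENE_2", "GENE_3"},
    "list_b": {"GENE_1", "GENE_2"},
    "list_c": {"GENE_4"},
}
scores, weights = maic_sketch(toy_lists)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy input, genes shared between two lists score higher than genes found in one list, and a list that shares nothing with the others ends up with a weight close to zero.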
Entry criteria.
| Inclusion | Exclusion |
|---|---|
| Infection of any species with SARS-CoV, SARS-CoV-2, MERS-CoV, HCoV-229E, HCoV-OC43, HCoV-HKU1 or HCoV-NL63 | Candidate in vitro or in vivo gene, transcript or protein studies and screens—defined here as < 50 genes, transcripts or proteins investigated |
| Human studies: in vivo, in vitro, primary human cells, in vitro human cell lines | Candidate-gene human genetic studies |
| Animal studies: in vivo, ex vivo, in vitro, primary cells, in vitro cell lines | < 5 hosts in virus group or control group in patient studies |
| Accepted experimental designs in Table | Meta-analyses, in silico analyses, re-analysis of data published elsewhere |
| | Insufficient data available |
Data extracted from each publication.
| Extracted information | Examples |
|---|---|
| Virus & virus component/modification | SARS-CoV-2, HCoV-229E |
| Method/experimental design | See Table |
| Organism | Human, rodent, non-human primate |
| Cell/tissue type | Vero6, A549, serum |
| Peer reviewed or pre-print | Peer-reviewed, pre-print |
Figure 2. Highest ranked genes in the MAIC output and overlap with other conditions. (A) Heatmap of the top 50 genes implicated in SARS-CoV-2 infection, as ranked by the MAIC algorithm. The heatmap shows the information sources contributing to each of the top genes, by experimental category. Full details of all scored genes, including specific studies contributing to each, are given in Supplementary Table S1. (B) Venn diagram of overlap between the top 500 hits from this study and the top 500 hits from our previous MAIC analysis of Influenza A virus. (C) Venn diagram of overlap between the top 500 hits from this study and manually curated lists from available literature on HLH and ARDS.
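The overlap counts behind a Venn diagram such as panel (B) reduce to set operations on the top-ranked entries of two lists. The sketch below is a hypothetical illustration, not the authors' code: the function name `top_n_overlap` and the placeholder gene symbols are assumptions, and the rankings are taken to be plain Python lists ordered from best to worst.

```python
from typing import List, Set, Tuple


def top_n_overlap(ranking_a: List[str],
                  ranking_b: List[str],
                  n: int = 500) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (shared, only_a, only_b) for the top n entries of two rankings."""
    top_a = set(ranking_a[:n])
    top_b = set(ranking_b[:n])
    return top_a & top_b, top_a - top_b, top_b - top_a


# Toy rankings with placeholder names (not data from the study).
ranking_a = ["GENE_1", "GENE_2", "GENE_3", "GENE_4"]
ranking_b = ["GENE_3", "GENE_1", "GENE_5", "GENE_2"]
shared, only_a, only_b = top_n_overlap(ranking_a, ranking_b, n=3)
```

The sizes of the three returned sets correspond to the three regions of a two-set Venn diagram.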
Figure 3. Gene Set Enrichment Analysis of MAIC rankings. (A) Violin plots of MAIC score distributions of top enriched pathways significant with both FGSEA and Enrichr algorithms, from the KEGG 2019 (Human) and WikiPathways 2019 (Human) databases. Highly similar pathways and irrelevant specific disease terms are not shown. n: number of gene set members included in the overall MAIC output; NES: normalised enrichment score from FGSEA. (B) Information contribution by methodology for selected enriched KEGG terms. Relative contributions of different information sources vary between functional annotations, but no single methodology predominates to drive enrichment.
Figure 4. Cellular functions of the 100 highest ranked genes in the MAIC output. Protein products of these genes have diverse cellular locations and are associated with numerous processes relevant to the viral life cycle and host immune system. Stages of the betacoronavirus life cycle: (1) S protein-mediated attachment to the cell surface. (2) Endocytosis. (3) Membrane fusion and viral genome release into the cytoplasm. (4) Assembly of the replication-transcription complex, translation of mRNA. (5) Viral replication and virion assembly. (6) Virion maturation, budding and translocation of vesicles.