The COVID-19 Drug and Gene Set Library.

Maxim V Kuleshov¹, Daniel J Stein¹, Daniel J B Clarke¹, Eryk Kropiwnicki¹, Kathleen M Jagodnik¹, Alon Bartal¹, John E Evangelista¹, Jason Hom¹, Minxuan Cheng¹, Allison Bailey¹, Abigail Zhou¹, Laura B Ferguson², Alexander Lachmann¹, Avi Ma'ayan¹.

Abstract

In a short period, many research publications that report sets of experimentally validated drugs as potential COVID-19 therapies have emerged. To organize this accumulating knowledge, we developed the COVID-19 Drug and Gene Set Library (https://amp.pharm.mssm.edu/covid19/), a collection of drug and gene sets related to COVID-19 research from multiple sources. The platform enables users to view, download, analyze, visualize, and contribute drug and gene sets related to COVID-19 research. To evaluate the content of the library, we compared the results from six in vitro drug screens for COVID-19 repurposing candidates. Surprisingly, we observe low overlap across screens while highlighting overlapping candidates that should receive more attention as potential therapeutics for COVID-19. Overall, the COVID-19 Drug and Gene Set Library can be used to identify community consensus, make researchers and clinicians aware of new potential therapies, enable machine-learning applications, and facilitate the research community to work together toward a cure.

Entities: Chemical

Keywords: DSML 3: Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems

Year: 2020 PMID: 32838343 PMCID: PMC7381899 DOI: 10.1016/j.patter.2020.100090

Source DB: PubMed Journal: Patterns (N Y) ISSN: 2666-3899

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a novel coronavirus that causes coronavirus disease 2019 (COVID-19). Globally, there are more than 21.5 million confirmed COVID-19 cases and ∼766,000 reported deaths (as of August 15, 2020). Many biomedical researchers have shifted their efforts to investigate different aspects of the COVID-19 pandemic. One area of activity is computationally prioritizing and experimentally testing approved and experimental drugs for repurposing as candidate therapies for COVID-19. Drug-repurposing studies present a promising avenue for quickly offering a treatment because many of these drugs have known safety profiles. So far, drug-repurposing studies can be categorized into two groups: in vitro screens¹⁻⁶ and computational predictions. Computational predictions are mostly based on structural biology methods,⁷⁻¹⁰ but some are based on network analysis and transcriptomics.¹¹⁻¹³ Few studies have validated top computational predictions in cell-based assays. The lists of drugs mentioned in these studies can be analyzed for consensus, and the identified drugs can be grouped by type. At the same time, many researchers are attempting to understand the molecular mechanisms of the SARS-CoV-2 life cycle. Much attention has been given to studies that use profiling with mass spectrometry proteomics and phosphoproteomics. These methods identify host proteins that interact with each of the SARS-CoV-2 proteins, or proteins that are differentially phosphorylated before and after SARS-CoV-2 infection. Another important study produced RNA-sequencing gene expression signatures from various relevant human cell lines, ferret lungs, and human lung biopsies before and after SARS-CoV-2 infection. These are just a few examples of the many studies that produce gene sets that can be organized and compared.
In the past, we have developed a crowdsourcing project whereby we asked the community to identify gene expression signatures from drug, gene, and disease perturbations. The collection of over 6,000 signatures that were collected with the help of more than 70 contributors from around the world enabled us to produce a useful database called CREEDS (https://amp.pharm.mssm.edu/CREEDS/). Similarly, for this project, we developed a crowdsourcing project to integrate drug and gene sets related to COVID-19 research collected with the assistance of the research community. The resource is delivered as a web-based platform that has already been accessed by >1,700 unique users.

Results

Analysis and Visualization of Consensus Drug and Gene Sets

So far, we have collected 173 drug sets composed of 1,620 unique drugs, and 444 gene sets consisting of 18,676 unique human genes. These are presented to users via the COVID-19 Drug and Gene Set Library website in several sortable and searchable tables (Figure 1). The drug sets are subdivided into two categories: experimental (n = 26) and computational (n = 81). The top 20 most frequent drugs and genes across all sets are displayed in Figures 2A–2C. The experimental drugs with the most supporting evidence are remdesivir, chloroquine, hydroxychloroquine, and mefloquine (Figure 2A). Although hydroxychloroquine, chloroquine, and remdesivir have received a lot of attention from the media and are being tested in many clinical trials, mefloquine has received far less attention. Mefloquine, like hydroxychloroquine and chloroquine, is an anti-malaria drug. However, it has a different chemical structure and is known to act via different mechanisms. The top 20 most commonly computationally predicted drugs include several known antivirals such as ritonavir, darunavir, lopinavir, and ribavirin (Figure 2B). This might be due to their pre-selection as candidates for computational docking. The top 20 most frequently submitted genes are all members of the innate immune response (Figure 2C). These genes include the typical interferon and cytokine response genes observed to be involved in the response of human cells to most pathogens.

Figure 1

Screenshot from the Landing Page of the COVID-19 Drug and Gene Set Library

Figure 2

Counts of Library Drugs and Genes

(A) Counts of most common drugs from the collection of experimental studies that reported lists of drugs that inhibit SARS-CoV-2.

(B) Counts of most common drugs from the collection of computational studies that reported lists of drugs that may inhibit COVID-19.

(C) Counts of most common genes from the collection of all gene sets in the library.

While most of the drug sets in the library are from studies that utilized computational methods, several key studies are from large-scale drug screens that include mostly Food and Drug Administration-approved drugs.¹⁻⁶ Using a Venn diagram, we compared the results from these six in vitro SARS-CoV-2 drug screen studies (Figure 3). Overall, there is little overlap across these screens, with only 11 drugs shared across two or more studies (Table 1). Namely, the drugs that appear as hits in more than one screen, in addition to remdesivir and chloroquine, are mefloquine, clofazimine, acitretin, gilteritinib, hexachlorophene, niclosamide, tetrandrine, tioguanine, and almitrine. Apart from remdesivir, clofazimine is the only drug that appeared as a hit in three of the six screens. Clofazimine is a drug used to treat leprosy, and its mechanisms of action suggest that it interferes with DNA synthesis. Acitretin is an anti-inflammatory second-generation retinoid that is used to treat severe psoriasis; it is a metabolite of etretinate. Almitrine is a drug that stimulates respiration by activating receptors of the carotid bodies. It is used for the treatment of chronic obstructive pulmonary disease and, as such, is relevant to COVID-19 symptoms. It should be noted that remdesivir appears as a hit in all six screens, but it was pre-selected as a positive control in half of the studies.
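The overlap comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the name normalization (lowercasing and trimming), and the toy hit lists are assumptions made for the example, and the toy screens do not reproduce the published hit lists.

```python
from collections import Counter
from itertools import combinations


def normalize(name):
    """Assumed normalization: trim whitespace and lowercase the drug name."""
    return name.strip().lower()


def shared_hits(screens, min_screens=2):
    """Map each drug to the number of screens it hit, keeping drugs in >= min_screens screens."""
    counts = Counter(
        drug
        for hits in screens.values()
        for drug in {normalize(h) for h in hits}  # de-duplicate within a screen
    )
    return {drug: n for drug, n in counts.items() if n >= min_screens}


def pairwise_overlap(screens):
    """Number of shared hits for every pair of screens (the cells of a Venn/UpSet view)."""
    sets = {name: {normalize(h) for h in hits} for name, hits in screens.items()}
    return {(a, b): len(sets[a] & sets[b]) for a, b in combinations(sorted(sets), 2)}


# Hypothetical toy input; these are NOT the published hit lists.
toy_screens = {
    "screen_A": ["Remdesivir", "clofazimine", "drug_x"],
    "screen_B": ["remdesivir ", "drug_y"],
    "screen_C": ["clofazimine", "remdesivir", "drug_z"],
}

shared = shared_hits(toy_screens)
overlaps = pairwise_overlap(toy_screens)
print(shared)
print(overlaps)
```

In practice, name normalization is the fragile step: the same compound can appear under different salts or synonyms across screens, so mapping names to a common identifier before counting would likely change the overlap.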

Figure 3

Overlap across Six In Vitro Drug-Repurposing Screens for SARS-CoV-2 Inhibitors

Table 1

Summary of the Six In Vitro COVID-19 Drug Screens Analyzed

Authors	Journal	Hits	Method	Cells
Jeon et al.²	Antimicrob. Agents Chemother.	24	inhibition assay	Vero cells
Touret et al.³	bioRxiv	12	inhibition assay	Vero cells
Ellinger et al.⁴	Research Square	66	inhibition assay	Caco-2
Heiser et al.⁵	bioRxiv	36	image-based assay	HRCE cells
Riva et al.⁶	bioRxiv	18	inhibition assay	Vero cells
Mirabelli et al.¹	bioRxiv	15	image-based assay	Huh-7 cells

The small overlap among the screens may have several causes, including different assay types, cellular contexts, inclusion criteria, original library content, and laboratory protocols. We carefully reviewed and compared these screens with respect to compounds screened, assays, drug concentrations, incubation, multiplicity of infection, and hit criteria. These aspects are summarized in Table S1, and the final drug sets from each study are provided in Table S2. This analysis enabled us to compare the IC50 values reported for those drugs that appeared in multiple screens (Tables 2 and S3). Overall, we observe relative consistency of reported IC50 values across screens. We also checked whether the hits from the six COVID-19 screens appeared as hits in previously published similar screens for other viruses and other diseases (Figure 4; Tables S4 and S5). The hits from the Jeon et al. study overlap with several other screens that reported potential antivirals for Zika, Ebola, and MERS, which may support the quality of the Jeon et al. screen. Next, we examined whether any of the drugs considered as hits across the six COVID-19 screens contain pan-assay interference compounds (PAINS) chemotypes. To this end, we downloaded a list of PAINS filters from ChEMBL and checked whether any hit contains one of the PAINS substructure chemotypes (Table S6). Six of the 195 total hits, namely eltrombopag, ketoconazole, phenazopyridine, posaconazole, SDZ-62-434, and Z-Leu-Val-Gly-diazomethylketone, contain such substructures, although this level of overlap is not statistically significant (Fisher's exact test, p = 0.57).
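For readers unfamiliar with the test used for the PAINS comparison, the following is a minimal pure-Python sketch of a two-sided Fisher's exact test on a 2×2 table. Only the hit row (6 of 195 hits with a PAINS substructure) comes from the text; the background row is a hypothetical placeholder because the comparison set is not reported here, so the resulting p value is not expected to match the reported 0.57.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


# Row 1: screen hits with / without a PAINS substructure (6 of 195, from the text).
# Row 2: HYPOTHETICAL background counts, used only to make the example runnable.
p_value = fisher_exact_two_sided(6, 189, 300, 9700)
print(f"two-sided p = {p_value:.3f}")
```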

Table 2

Compounds that Appear as Hits in Multiple Studies

Drug	Touret et al.³ IC₅₀ (μM)	Heiser et al.⁵ IC₅₀ (μM)	Riva et al.⁶ IC₅₀ (μM)	Ellinger et al.⁴ IC₅₀ (μM)	Jeon et al.² IC₅₀ (μM)	Mirabelli et al.¹ IC₅₀ (μM)	Overlap
Remdesivir	1.65	x	0.62	0.76	11.41	0.10	6
Clofazimine		x	x			0.08	3
Acitretin		x	x				2
Almitrine		x		1.42			2
Gilteritinib					6.76	0.22	2
Hexachlorophene		x			0.90		2
Lopinavir				19.11	9.12		2
Mefloquine				14.15	4.33		2
Niclosamide					0.28	0.14	2
Tetrandrine			1.1		3		2
Tioguanine				1.71		0.022	2

If available, the IC50 value calculated in each study is shown. Otherwise, the hit is marked by an “x.” Note that different studies use different assays and cell lines to measure dose response.

Figure 4

UpSet Plot to Visualize the Hits from the Six COVID-19 Screens (Orange) and 11 Similar Non-COVID-19 Screens (Black)


ACE2 Up- or Downregulation Effects of Drug Hits?

To further explore the molecular effects of the positive hits from the six in vitro drug screens and to demonstrate the utility of the collected library, we developed a case study that asks whether the hits from the six screens up- or downregulate genes that are highly co-expressed with the ACE2 gene. ACE2 is the suspected cell surface receptor for SARS-CoV-2, and cells that do not express this gene have been shown to be less prone to SARS-CoV-2 infection. Since it is still undetermined whether it is desirable to up- or downregulate the ACE2 expression module, we queried drugs from the published in vitro drug screen hits against the Library of Integrated Network-Based Cellular Signatures (LINCS) L1000 data. We identified 61 drug hits from the six screens that have been profiled by the L1000 assay. Two drugs significantly upregulate the ACE2 module (the 50 genes most correlated with ACE2 based on RNA-sequencing data from the Gene Expression Omnibus [GEO]) and one drug significantly downregulates these genes after p-value correction (false discovery rate <0.1) (upregulated: homoharringtonine, 5.32 × 10⁻⁹; alvocidib, 1.58 × 10⁻⁵; downregulated: tazarotene, 5.77 × 10⁻²). Overall, 33 drugs on average upregulate the ACE2 module and 28 downregulate the module (Figure 5), suggesting that upregulating the ACE2 module might be more protective than harmful, which is counterintuitive. However, the relatively balanced division of drugs that induce or suppress this module makes this assertion inconclusive.

Figure 5

L1000 Profiled Drugs' Effects on the ACE2 Module

Average change in overall expression of the ACE2 co-expression module for 61 drug hits from the six published in vitro screens that also have L1000 profiling gene expression data.


Machine Learning to Rank Hits and Prioritize Other Candidates

The positive hits from the six COVID-19 drug screens can be used to train machine-learning models that prioritize the hits and suggest additional compounds that strongly share features with these hits. Using gene expression (GE) and chemical structure (CS) features of the hits and of additional drugs and small molecules profiled via the L1000 assay, we implemented an Extra Trees (ET) classifier as a model that can be used to predict whether a drug is likely to inhibit SARS-CoV-2 in vitro. The ET classifier was able to predict hits from the six SARS-CoV-2 drug screens with an average area under the receiver-operating characteristic curve (AUROC) of 0.76 across cross-validation splits, suggesting that GE and CS features are overall predictive of the types of compounds that could inhibit SARS-CoV-2 infection (Figures 6A and 6B; Table S7). The lower value for the area under the precision-recall curve can be explained by the class imbalance, which causes many non-hits to be ranked above known hits (Tables 3 and 4). Similar training and predictions were done using only GE features as input. In this case, the ET classifier achieved an average cross-validation AUROC of 0.66, which was lower than when CS features were also included but still statistically significant (Figures 6C and 6D; Table S8). It should be noted that the top-ranked predicted drugs are all from the same class of ATPase-inhibiting cardiac drugs (cardiac glycosides such as digoxin, proscillaridin, and ouabain; Table 3), which have a similar structure and a similar GE signature effect in the L1000 assay. These drugs are over-represented in the Jeon et al. screen, so these initial results should be viewed with caution. The classifier also highly ranked lanatoside C, a drug identified as an active compound against MERS-CoV infection. This suggests that the machine-learning method could prioritize compounds that were missed by the six drug screens.
In sum, this simple machine learning classification model is intended to demonstrate the potential for utilizing the drug sets collected for the library for machine-learning applications.
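The gap between the ROC and precision-recall summaries under class imbalance can be illustrated with a small worked example. The sketch below is an assumption-laden toy, not the evaluation code used in the study: it computes AUROC as the fraction of (hit, non-hit) pairs ranked correctly and average precision as a step-wise summary of the precision-recall curve, using made-up labels and scores with 2 hits among 20 compounds.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    found, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            found += 1
            total += found / rank
    return total / found


# Toy data (hypothetical): 20 compounds, 2 hits ranked 3rd and 5th by score.
toy_labels = [0, 0, 1, 0, 1] + [0] * 15
toy_scores = [1.0 - 0.05 * i for i in range(20)]

toy_auroc = auroc(toy_labels, toy_scores)          # 31/36, about 0.86
toy_ap = average_precision(toy_labels, toy_scores)  # (1/3 + 2/5) / 2, about 0.37
print(round(toy_auroc, 3), round(toy_ap, 3))
```

Only three non-hits outrank the hits here, which barely dents the AUROC but cuts the average precision to roughly 0.37, the same qualitative effect described for the classifier above.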

Figure 6

Evaluation of ET Classifiers' Ability to Predict SARS-CoV-2 Inhibitors

(A) ROC curve for L1000 + MACCS-based predictions across cross-validation splits.

(B) PR curve for L1000 + MACCS-based predictions across cross-validation splits.

(C) ROC curve for L1000-only predictions across cross-validation splits.

(D) PR curve for L1000-only predictions across cross-validation splits.

Table 3

Ranked Predictions for Screen Hits Based on L1000 + MACCS Input with p > 0.01

Broad Pert. ID	Drug	Hit	Prediction Probability
BRD-K23478508	digoxin	1	0.8677456
BRD-A34806832	proscillaridin	1	0.61186494
BRD-A68930007	ouabain	1	0.48673511
BRD-K13514097	everolimus	1	0.12437698
BRD-K76674262	omacetaxine mepesuccinate	1	0.03459994
BRD-K88538023	oxiconazole	1	0.02330089
BRD-A29731977	17-hydroxyprogesterone-caproate	1	0.02278362
BRD-K59873006	digitoxin	1	0.02124448
BRD-K06926592	tretinoin	1	0.02050656
BRD-A80908310	cloperastine	1	0.01705306
BRD-K96390176	calcipotriol	1	0.01589157
BRD-K33882852	ZK-93423	1	0.01579197
BRD-K90699611	acitretin	1	0.01383878
BRD-A10070317	propranolol	1	0.01347796
BRD-A99117172	hydroxychloroquine	1	0.01282602
BRD-A50287119	sirolimus	1	0.01201528
BRD-K15409150	penfluridol	1	0.01139704
BRD-A62025033	temsirolimus	1	0.011242
BRD-K74501079	azithromycin	1	0.01123628
BRD-K87909389	alvocidib	1	0.01096243
BRD-K68392338	ZK-93426	1	0.01075777
BRD-K99964838	bosutinib	1	0.01062753
BRD-A62184259	cycloheximide	1	0.01058221
BRD-K12184470	flunarizine	1	0.01058221
BRD-K17561142	amiodarone	1	0.01029646
BRD-A64290322	cyclosporin A	1	0.0101906
BRD-K68246049	TTNPB	1	0.01013295
BRD-A91699651	chloroquine	1	0.01005631

Table 4

Ranked Predictions for Top Additional Compounds Based on L1000 + MACCS Input

Broad Pert. ID	Drug	Prediction Probability
BRD-A80502530	cinobufagin	0.70859567
BRD-A76528577	vincristine	0.3044745
BRD-K51290057	SA-792709	0.25357778
BRD-A68202111	BRD-A68202111	0.1923075
BRD-U19872303	spiramycin	0.186088
BRD-A22783572	vinblastine sulfate	0.18156875
BRD-K04010869	prostaglandin A₁	0.15031656
BRD-K08486545	cymarin	0.14312159
BRD-K01188359	vinblastine	0.12795675
BRD-A57089740	peruvoside	0.12597708
BRD-K67783091	haloperidol	0.10281666
BRD-A44827100	erythromycin	0.10106031
BRD-K36248164	etretinate	0.10086468
BRD-A29322418	canrenoic acid	0.09826209
BRD-K46523383	pramocaine	0.08840484
BRD-A52650764	ingenol	0.08561776
BRD-K80348542	cephaeline	0.08324268
BRD-A29854054	lorglumide	0.0632068
BRD-K03981224	ethisterone	0.06260333
BRD-A90131694	alclometasone	0.06221619
BRD-U66370498	androstanol	0.06103837
BRD-K21667562	AM 404	0.05919457
BRD-A89434049	sarmentogenin	0.05852038
BRD-A94810754	ionomycin	0.05814178
BRD-A37501891	BRD-A37501891	0.05203431


Discussion

Here we describe a platform created to collect drug and gene sets related to COVID-19 research using various methods of data accrual. Many top-ranked frequent genes that are associated with COVID-19 are part of the interferon pathway. This is consistent with our knowledge that type I (IFN-α, IFN-β) and type III (IFN-λ) interferon systems are the primary defense against viral infections. However, it was suggested that one of the evasion mechanisms of SARS-CoV-2 is to dampen the interferon response. It has been hypothesized that hyperinflammation in COVID-19 could drive disease severity and would be amenable to treatment with drugs that reduce inflammation. However, this remains controversial because the high level of antiviral response could be reflective of increased viral burden rather than an inappropriate host response. The most striking result from the meta-analysis applied to the content of the library is the limited overlap across drug screen studies. One would expect experimental validation of drugs that inhibit SARS-CoV-2 in vitro to be more consistent. The inconsistency across these studies could be due to a need to produce results quickly because of the urgency of discovering potential treatments. Regardless, there is some interesting overlap that cannot be explained by artifacts such as PAINS chemotypes. Hence, there is an expectation that as more similar screens are published, the most consistent leads will advance to animal models and human trials for further testing. To prioritize compounds that may treat COVID-19, some researchers have used the strategy of finding drugs that modulate genes related to ACE2 gene expression. We found few hits that also significantly up- or downregulate the genes most correlated with ACE2. However, it is inconclusive whether up- or downregulation of this module is beneficial.
Finally, we have demonstrated how the positive hits across the screens can be pooled to develop machine-learning models that can further prioritize candidates based on accumulating direct experimental evidence about potential SARS-CoV-2 inhibitors. It should be clear that the consensus analysis results should be viewed with caution. The most common drugs are not necessarily the most efficacious or promising treatments. At the same time, the most common genes may not be the most relevant to furthering COVID-19 research. It should be noted that not all drug sets and gene sets have equal weight in quality and relevance. A list of computationally predicted drugs is not as useful for identifying a therapy for COVID-19 as a list of experimentally validated drugs. A list of genes upregulated after SARS-CoV-2 infection of cells may provide more useful information about the virus life cycle than a list of genes returned from a PubMed search using the term SARS. Hence, the users of the data collected for the library should be aware of such limitations. With these limitations in mind, we hope that researchers will be able to better develop or refine their hypotheses by considering the information in the library. In a period of rapid development of methods and data related to COVID-19 research, it is critical to provide the means to organize the accumulated information so that it can be summarized and reused. The COVID-19 Drug and Gene Set Library provides such utility. The library of drug and gene sets can be used to identify community consensus and make researchers and clinicians aware of the developments in new potential therapies as they become available, as well as allow the research community to work together toward a cure for COVID-19.
However, it is important to note that while there are now many drugs that show promise in blocking SARS-CoV-2 in vitro, in vivo studies are needed before any of these drugs can be considered candidate therapeutics.

Experimental Procedures

Resource Availability

Lead Contact

Further information and requests for digital resources should be directed to and will be fulfilled by the Lead Contact, Avi Ma'ayan (avi.maayan@mssm.edu).

Materials Availability

This study generated The COVID-19 Drug and Gene Set Library website available at: https://amp.pharm.mssm.edu/covid19/.

Data and Code Availability

All data collected for this project are available via the website https://amp.pharm.mssm.edu/covid19. The data from the site can be accessed via API. The code behind the site is available on GitHub at https://github.com/maayanlab/covid19_crowd_library. The consensus analysis of the drugs that up- or downregulate the ACE2 module is available from https://github.com/maayanlab/covid19l1000. All code and data are provided openly under the Apache License version 2.0. The supporting tables are provided openly at Mendeley Data at https://data.mendeley.com/datasets/mjbygmkdt3/1 (https://doi.org/10.17632/mjbygmkdt3.1).

Collecting Drug Sets from Publications that Describe SARS-CoV-2 Drug Screens

Since the emergence of the COVID-19 pandemic, thousands of new publications related to COVID-19 research have emerged in just a few months. We continually surveyed these publications to identify research articles that describe drug screens and manually extracted drug sets from these studies to populate the COVID-19 Drug and Gene Set Library database. We also submitted to the platform published drug sets from historical sources such as those from studies that listed drugs showing antiviral activity for other related viruses. To assist us with developing and maintaining the collection, we have received help from the research community by allowing researchers to upload drug and gene sets to the database. These submissions are manually evaluated before making them publicly available.

Collecting SARS Signatures from GEO with GEO2Enrichr, BioJupies, and GEN3VA

Gene expression signatures resulting from infection with different coronaviruses across different cell types and tissues, with expression data originating from the GEO database, were processed using GEO2Enrichr and BioJupies, and stored on the GEN3VA platform. The entries were submitted to the COVID-19 crowdsourcing platform, with an upregulated and a downregulated gene set associated with each signature.

Collecting COVID-19-Related Gene Sets with Geneshot

Geneshot is a platform that can be used to convert PubMed searches into gene sets. Using Geneshot, gene sets associated with the search terms: SARS, SARS-CoV, MERS-CoV, ACE2, and TMPRSS2 were created using both the AutoRIF and GeneRIF options. Additionally, top COVID-19 drug-repurposing candidates reported in recent literature were included as search terms. Predictions of additional genes potentially associated with the genes directly co-mentioned with these terms were also added to the database. These predictions were based on five strategies: co-occurrence via AutoRIF, GeneRIF, Enrichr, or Tagger, and co-expression using data from ARCHS4.

Developing the COVID-19 Drug and Gene Set Library Website

The COVID-19 Drug and Gene Set Library website has two sortable and searchable tables that list the drug and gene sets. Sorting can be based on the date of submission, alphabetical ordering, or list size. The tables are searchable via metadata terms such as title, authors, and descriptions, as well as via data search for specific drug or gene terms. Users can download each drug or gene set as well as the entire library. In addition, each gene set is provided with the option to perform gene set enrichment analysis with Enrichr, while genes are linked to Harmonizome for further interrogation. Similarly, drug sets can be analyzed with DrugEnrichr, a drug set enrichment analysis tool. The individual drugs that map to known compounds are linkable to their corresponding DrugBank landing pages. The website enables users to submit drug and gene sets related to COVID-19 research by completing a simple form. The form includes a dataset title, a URL source, and a description that explains how the set is relevant to COVID-19 research. The submitter is also provided with mechanisms to add additional metadata terms that can describe the cell type, tissue, organism, and other critical information about the submitted set. Users can specify the category of the additional metadata, allowing for a broad set of expanded annotations for each submitted set. Users can also submit their contact information; this information is kept private, but users can opt-in to make it public. Once a user submits a contribution to the site, their dataset is directed to a review queue in which we manually examine the validity and relevance of the contribution. The reviewing process enables an administrator to approve or reject the submitted set. If approved, the set is added to the database. To make it easy for contributors to submit multiple sets, users can access the site via API. The code behind the site is open source and available at https://github.com/maayanlab/covid19_crowd_library.

Expression Analysis of In Vitro Screen Hits

Drug sets extracted from the six in vitro screens¹⁻⁶ were matched to drugs profiled by the L1000 assay available from GEO: GSE92742. Average signatures for each drug were computed by taking the mean Z score for each gene. To quantify the average change in expression of genes co-expressed with ACE2, we obtained the top 50 genes most strongly co-expressed with ACE2 from the ARCHS4 resource. We then calculated the mean Z score of these top 50 ACE2-correlated genes and compared that value against a null distribution obtained by sampling 50 random genes 10,000 times. The p values were calculated against the sampled distribution and corrected for multiple hypothesis testing by applying the Benjamini-Hochberg method, consistent with the false discovery rate threshold reported in the Results. The code behind this analysis is open source and available at https://github.com/maayanlab/covid19l1000.
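The sampling procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the linked repository: the function names, the two-sided comparison on absolute mean Z score, the add-one smoothing of the empirical p value, and the toy signature are all assumptions, and Benjamini-Hochberg is used here as one common way to control the false discovery rate quoted in the Results.

```python
import random


def module_empirical_p(z_by_gene, module_genes, n_samples=10_000, seed=0):
    """Compare the mean Z score of a gene module against random gene sets of equal size.

    Returns (observed mean, empirical two-sided p value).
    """
    rng = random.Random(seed)
    all_genes = list(z_by_gene)
    module = [g for g in module_genes if g in z_by_gene]
    k = len(module)
    observed = sum(z_by_gene[g] for g in module) / k
    extreme = 0
    for _ in range(n_samples):
        null_mean = sum(z_by_gene[g] for g in rng.sample(all_genes, k)) / k
        if abs(null_mean) >= abs(observed):
            extreme += 1
    # Add-one smoothing (an assumption) keeps the p value strictly positive.
    return observed, (extreme + 1) / (n_samples + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy signature: 978 genes with a 50-gene module shifted upward.
toy_rng = random.Random(1)
toy_z = {f"gene_{i}": toy_rng.gauss(0.0, 1.0) for i in range(978)}
toy_module = [f"gene_{i}" for i in range(50)]
for gene in toy_module:
    toy_z[gene] += 1.0

observed, p_value = module_empirical_p(toy_z, toy_module, n_samples=2000)
adjusted = benjamini_hochberg([p_value, 0.5, 0.9])  # the other two p values are placeholders
print(round(observed, 2), p_value, adjusted)
```

With 10,000 samples the smallest attainable empirical p value is about 10⁻⁴, so much smaller values would require a parametric approximation of the null distribution rather than direct counting.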

Identifying Drug Sets from Previously Published Drug Screens for Other Diseases

To identify publications that describe similar in vitro drug screens from other contexts, we followed these steps. (1) We first queried PubMed for studies that contain the term [“drug screen” AND “in vitro”]. (2) The text from these studies was processed such that papers containing a table with drug names were saved for further manual inspection. (3) We then manually selected studies that performed drug screens comparable with the published screens for SARS-CoV-2. The study selection criteria required the identification of in vitro studies that included quantitative measures of many drugs' efficacy against a disease cell-based model.

Machine-Learning Approach to Prioritize Compounds Based on In Vitro Screens

A list of 195 drug hits from the six in vitro screens¹⁻⁶ (Table S1) was used as positives for applying a machine-learning method to prioritize these compounds and additional compounds. GE L1000 signatures for 19,777 drugs measuring the response of 978 landmark genes and their associated 166 MACCS molecular fingerprints were obtained from the SEP-L1000 project. The binary MACCS key association matrix was TF-IDF normalized to account for the frequency of different chemical structures. The dataset included 19,777 different drugs, of which 96 matched the 195 hits from the drug screens. After removing compounds from the library that appeared to be structurally similar, 8,787 compounds remained, of which 72 were hits. ET classifiers were trained to identify drug screen hits from the GE and CS features and evaluated using 3-fold cross-validation. Class weights were set inversely proportional to the class frequencies to address class imbalance. Otherwise, all ET parameters were the default Scikit-learn values. Feature selection was performed by recursive feature elimination to retain 128 features when both GE and CS data were used, or 64 features when only GE data were used. Additionally, prediction probabilities were calibrated across cross-validation splits.
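Two of the preprocessing steps above, TF-IDF weighting of binary structure keys and class weights inversely proportional to class frequency, can be sketched in plain Python. The exact TF-IDF variant is not stated in the text, so the unsmoothed `log(N / df)` form below is an assumption, as are the function names and the toy key matrix. The class-weight formula `n_samples / (n_classes * class_count)` is the common "balanced" heuristic and is applied to the 72 hits among 8,787 compounds mentioned above.

```python
from math import log


def tfidf_binary(matrix):
    """Weight a binary presence/absence matrix by inverse document frequency.

    Assumed variant: idf = log(N / df), so keys present in every compound
    receive zero weight and rare keys receive the largest weight.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    df = [sum(row[j] for row in matrix) for j in range(n_cols)]
    idf = [log(n_rows / d) if d else 0.0 for d in df]
    return [[row[j] * idf[j] for j in range(n_cols)] for row in matrix]


def balanced_class_weights(labels):
    """Weights inversely proportional to class frequency: n / (n_classes * count)."""
    n = len(labels)
    classes = sorted(set(labels))
    return {c: n / (len(classes) * labels.count(c)) for c in classes}


# Hypothetical toy matrix: 4 compounds x 3 structure keys.
toy_keys = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 1],
    [1, 0, 0],
]
weighted = tfidf_binary(toy_keys)

# Class sizes taken from the text: 72 hits among 8,787 compounds.
labels = [1] * 72 + [0] * (8787 - 72)
weights = balanced_class_weights(labels)
print(weighted[0], weights)
```

Under this heuristic each hit counts roughly 61 times and each non-hit roughly 0.5 times, so the two classes contribute equally to the training objective in aggregate.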

32 in total

Review 1. Acitretin Use in Dermatology.

Authors: Lyn C Guenther; Rod Kunynetz; Charles W Lynde; R Gary Sibbald; John Toole; Ronald Vender; Catherine Zip
Journal: J Cutan Med Surg Date: 2017-09-27 Impact factor: 2.092

2. GEO2Enrichr: browser extension and server app to extract gene sets from GEO and analyze them for biological functions.

Authors: Gregory W Gundersen; Matthew R Jones; Andrew D Rouillard; Yan Kou; Caroline D Monteiro; Axel S Feldmann; Kevin S Hu; Avi Ma'ayan
Journal: Bioinformatics Date: 2015-05-13 Impact factor: 6.937

3. Drug-induced adverse events prediction with the LINCS L1000 data.

Authors: Zichen Wang; Neil R Clark; Avi Ma'ayan
Journal: Bioinformatics Date: 2016-04-01 Impact factor: 6.937

4. The Global Phosphorylation Landscape of SARS-CoV-2 Infection.

Authors: Mehdi Bouhaddou; Danish Memon; Bjoern Meyer; Kris M White; Veronica V Rezelj; Miguel Correa Marrero; Benjamin J Polacco; James E Melnyk; Svenja Ulferts; Robyn M Kaake; Jyoti Batra; Alicia L Richards; Erica Stevenson; David E Gordon; Ajda Rojc; Kirsten Obernier; Jacqueline M Fabius; Margaret Soucheray; Lisa Miorin; Elena Moreno; Cassandra Koh; Quang Dinh Tran; Alexandra Hardy; Rémy Robinot; Thomas Vallet; Benjamin E Nilsson-Payant; Claudia Hernandez-Armenta; Alistair Dunham; Sebastian Weigang; Julian Knerr; Maya Modak; Diego Quintero; Yuan Zhou; Aurelien Dugourd; Alberto Valdeolivas; Trupti Patil; Qiongyu Li; Ruth Hüttenhain; Merve Cakir; Monita Muralidharan; Minkyu Kim; Gwendolyn Jang; Beril Tutuncuoglu; Joseph Hiatt; Jeffrey Z Guo; Jiewei Xu; Sophia Bouhaddou; Christopher J P Mathy; Anna Gaulton; Emma J Manners; Eloy Félix; Ying Shi; Marisa Goff; Jean K Lim; Timothy McBride; Michael C O'Neal; Yiming Cai; Jason C J Chang; David J Broadhurst; Saker Klippsten; Emmie De Wit; Andrew R Leach; Tanja Kortemme; Brian Shoichet; Melanie Ott; Julio Saez-Rodriguez; Benjamin R tenOever; R Dyche Mullins; Elizabeth R Fischer; Georg Kochs; Robert Grosse; Adolfo García-Sastre; Marco Vignuzzi; Jeffery R Johnson; Kevan M Shokat; Danielle L Swaney; Pedro Beltrao; Nevan J Krogan
Journal: Cell Date: 2020-06-28 Impact factor: 41.582

5. A Screen of FDA-Approved Drugs for Inhibitors of Zika Virus Infection.

Authors: Nicholas J Barrows; Rafael K Campos; Steven T Powell; K Reddisiva Prasanth; Geraldine Schott-Lerner; Ruben Soto-Acosta; Gaddiel Galarza-Muñoz; Erica L McGrath; Rheanna Urrabaz-Garza; Junling Gao; Ping Wu; Ramkumar Menon; George Saade; Ildefonso Fernandez-Salas; Shannan L Rossi; Nikos Vasilakis; Andrew Routh; Shelton S Bradrick; Mariano A Garcia-Blanco
Journal: Cell Host Microbe Date: 2016-07-28 Impact factor: 21.023

6. Mefloquine. A review of its antimalarial activity, pharmacokinetic properties and therapeutic efficacy. [Review]

Authors: K J Palmer; S M Holliday; R N Brogden
Journal: Drugs Date: 1993-03 Impact factor: 9.546

7. Lamprene (clofazimine) in leprosy. Basic information. [Review]

Authors: S J Yawalkar; W Vischer
Journal: Lepr Rev Date: 1979-06 Impact factor: 0.537

8. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool.

Authors: Edward Y Chen; Christopher M Tan; Yan Kou; Qiaonan Duan; Zichen Wang; Gabriela Vaz Meirelles; Neil R Clark; Avi Ma'ayan
Journal: BMC Bioinformatics Date: 2013-04-15 Impact factor: 3.169

9. Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.

Authors: Zichen Wang; Caroline D Monteiro; Kathleen M Jagodnik; Nicolas F Fernandez; Gregory W Gundersen; Andrew D Rouillard; Sherry L Jenkins; Axel S Feldmann; Kevin S Hu; Michael G McDermott; Qiaonan Duan; Neil R Clark; Matthew R Jones; Yan Kou; Troy Goff; Holly Woodland; Fabio M R Amaral; Gregory L Szeto; Oliver Fuchs; Sophia M Schüssler-Fiorenza Rose; Shvetank Sharma; Uwe Schwartz; Xabier Bengoetxea Bausela; Maciej Szymkiewicz; Vasileios Maroulis; Anton Salykin; Carolina M Barra; Candice D Kruth; Nicholas J Bongio; Vaibhav Mathur; Radmila D Todoric; Udi E Rubin; Apostolos Malatras; Carl T Fulp; John A Galindo; Ruta Motiejunaite; Christoph Jüschke; Philip C Dishuck; Katharina Lahl; Mohieddin Jafari; Sara Aibar; Apostolos Zaravinos; Linda H Steenhuizen; Lindsey R Allison; Pablo Gamallo; Fernando de Andres Segura; Tyler Dae Devlin; Vicente Pérez-García; Avi Ma'ayan
Journal: Nat Commun Date: 2016-09-26 Impact factor: 14.919

10. In Silico Discovery of Candidate Drugs against Covid-19.

Authors: Claudia Cava; Gloria Bertoli; Isabella Castiglioni
Journal: Viruses Date: 2020-04-06 Impact factor: 5.048

30 in total

1. Longitudinal Study of DNA Methylation and Epigenetic Clocks Prior to and Following Test-Confirmed COVID-19 and mRNA Vaccination.

Authors: Alina P S Pang; Albert T Higgins-Chen; Florence Comite; Ioana Raica; Christopher Arboleda; Hannah Went; Tavis Mendez; Michael Schotsaert; Varun Dwaraka; Ryan Smith; Morgan E Levine; Lishomwa C Ndhlovu; Michael J Corley
Journal: Front Genet Date: 2022-06-03 Impact factor: 4.772

2. Structure-based drug repurposing against COVID-19 and emerging infectious diseases: methods, resources and discoveries.

Authors: Yosef Masoudi-Sobhanzadeh; Aysan Salemi; Mohammad M Pourseif; Behzad Jafari; Yadollah Omidi; Ali Masoudi-Nejad
Journal: Brief Bioinform Date: 2021-11-05 Impact factor: 11.622

3. IL10RB as a key regulator of COVID-19 host susceptibility and severity.

Authors: Georgios Voloudakis; Gabriel Hoffman; Sanan Venkatesh; Kyung Min Lee; Kristina Dobrindt; James M Vicari; Wen Zhang; Noam D Beckmann; Shan Jiang; Daisy Hoagland; Jiantao Bian; Lina Gao; André Corvelo; Kelly Cho; Jennifer S Lee; Sudha K Iyengar; Shiuh-Wen Luoh; Schahram Akbarian; Robert Striker; Themistocles L Assimes; Eric E Schadt; Miriam Merad; Benjamin R tenOever; Alexander W Charney; Kristen J Brennand; Julie A Lynch; John F Fullard; Panos Roussos
Journal: medRxiv Date: 2021-06-02

4. Identification and Development of Therapeutics for COVID-19.

Authors: Halie M Rando; Nils Wellhausen; Soumita Ghosh; Alexandra J Lee; Anna Ada Dattoli; Fengling Hu; James Brian Byrd; Diane N Rafizadeh; Ronan Lordan; Yanjun Qi; Yuchen Sun; Christian Brueffer; Jeffrey M Field; Marouen Ben Guebila; Nafisa M Jadavji; Ashwin N Skelly; Bharath Ramsundar; Jinhui Wang; Rishi Raj Goel; YoSon Park; Simina M Boca; Anthony Gitter; Casey S Greene
Journal: mSystems Date: 2021-11-02 Impact factor: 6.496

5. Virtual and In Vitro Antiviral Screening Revive Therapeutic Drugs for COVID-19.

Authors: Giovanni Bocci; Steven B Bradfute; Chunyan Ye; Matthew J Garcia; Jyothi Parvathareddy; Walter Reichard; Surekha Surendranathan; Shruti Bansal; Cristian G Bologa; Douglas J Perkins; Colleen B Jonsson; Larry A Sklar; Tudor I Oprea
Journal: ACS Pharmacol Transl Sci Date: 2020-10-14

6. SARSCOVIDB-A New Platform for the Analysis of the Molecular Impact of SARS-CoV-2 Viral Infection.

Authors: Rafael Lopes da Rosa; Tung Sheng Yang; Emanuela Fernanda Tureta; Laura Rascovetzki Saciloto de Oliveira; Amanda Naiara Silva Moraes; Juliana Miranda Tatara; Renata Pereira Costa; Júlia Spier Borges; Camila Innocente Alves; Markus Berger; Jorge Almeida Guimarães; Lucélia Santi; Walter Orlando Beys-da-Silva
Journal: ACS Omega Date: 2021-01-21

7. Phospholipidosis is a shared mechanism underlying the in vitro antiviral activity of many repurposed drugs against SARS-CoV-2.

Authors: Tia A Tummino; Veronica V Rezelj; Benoit Fischer; Audrey Fischer; Matthew J O'Meara; Blandine Monel; Thomas Vallet; Ziyang Zhang; Assaf Alon; Henry R O'Donnell; Jiankun Lyu; Heiko Schadt; Kris M White; Nevan J Krogan; Laszlo Urban; Kevan M Shokat; Andrew C Kruse; Adolfo García-Sastre; Olivier Schwartz; Francesca Moretti; Marco Vignuzzi; Francois Pognan; Brian K Shoichet
Journal: bioRxiv Date: 2021-03-24

8. Deleterious Effects of SARS-CoV-2 Infection on Human Pancreatic Cells.

Authors: Syairah Hanan Shaharuddin; Victoria Wang; Roberta S Santos; Andrew Gross; Yizhou Wang; Harneet Jawanda; Yi Zhang; Wohaib Hasan; Gustavo Garcia; Vaithilingaraja Arumugaswami; Dhruv Sareen
Journal: Front Cell Infect Microbiol Date: 2021-06-23 Impact factor: 5.293

9. A Workflow of Integrated Resources to Catalyze Network Pharmacology Driven COVID-19 Research.

Authors: Gergely Zahoránszky-Kőhalmi; Vishal B Siramshetty; Praveen Kumar; Manideep Gurumurthy; Busola Grillo; Biju Mathew; Dimitrios Metaxatos; Mark Backus; Tim Mierzwa; Reid Simon; Ivan Grishagin; Laura Brovold; Ewy A Mathé; Matthew D Hall; Samuel G Michael; Alexander G Godfrey; Jordi Mestres; Lars J Jensen; Tudor I Oprea
Journal: bioRxiv Date: 2020-11-05

10. Day-night and seasonal variation of human gene expression across tissues.

Authors: Valentin Wucher; Reza Sodaei; Raziel Amador; Manuel Irimia; Roderic Guigó
Journal: bioRxiv Date: 2022-01-11