Laura R Ganser1, Janghyun Lee2, Atul Rangadurai1, Dawn K Merriman3, Megan L Kelly1, Aman D Kansal1, Bharathwaj Sathyamoorthy1, Hashim M Al-Hashimi4,5. 1. Department of Biochemistry, Duke University School of Medicine, Durham, NC, USA. 2. Department of Chemistry, University of Michigan, Ann Arbor, MI, USA. 3. Department of Chemistry, Duke University, Durham, NC, USA. 4. Department of Biochemistry, Duke University School of Medicine, Durham, NC, USA. hashim.al.hashimi@duke.edu. 5. Department of Chemistry, Duke University, Durham, NC, USA. hashim.al.hashimi@duke.edu.
Abstract
Dynamic ensembles hold great promise in advancing RNA-targeted drug discovery. Here we subjected the transactivation response element (TAR) RNA from human immunodeficiency virus type-1 to experimental high-throughput screening against ~100,000 drug-like small molecules. Results were augmented with 170 known TAR-binding molecules and used to generate sublibraries optimized for evaluating enrichment when virtually screening a dynamic ensemble of TAR determined by combining NMR spectroscopy data and molecular dynamics simulations. Ensemble-based virtual screening scores molecules with an area under the receiver operator characteristic curve of ~0.85-0.94 and with ~40-75% of all hits falling within the top 2% of scored molecules. The enrichment decreased significantly for ensembles generated from the same molecular dynamics simulations without input NMR data and for other control ensembles. The results demonstrate that experimentally determined RNA ensembles can significantly enrich libraries with true hits and that the degree of enrichment is dependent on the accuracy of the ensemble.
The discovery of regulatory non-coding RNAs (ncRNAs) has been accompanied by a growing interest in targeting RNA using small molecules for therapeutics development[1-5]. Small molecules enjoy favorable pharmacological properties and do not suffer from delivery limitations inherent to oligonucleotide-based therapeutics[5]. However, targeting RNA with small molecules comes with a unique set of challenges. Most ncRNAs are non-enzymatic, making it difficult to directly screen for inhibitors. High-throughput screening (HTS) assays targeting RNA often yield hits with low specificity, unfavorable pharmacological properties, and/or poor activity in cell-based assays. Additionally, libraries used in HTS are biased toward compounds that bind the deep hydrophobic pockets of proteins, not the polar and solvent-exposed pockets typical of RNA targets. Rational approaches to identify small molecules that bind specific RNA secondary structures have had some success[6], but achieving the desired selectivity and efficacy is difficult given the prevalence of similar secondary structural motifs across the transcriptome.
Structure-based approaches such as computational docking[7,8] potentially provide a powerful means to broadly pre-screen compound libraries and generate sub-libraries enriched with diverse compounds that selectively bind the unique pockets of ncRNAs. However, applying virtual screening (VS) to RNA drug targets is complicated by the high flexibility of RNA and its propensity to undergo large conformational changes upon small molecule binding[9]. Several approaches have been developed to address protein flexibility including ‘soft docking’[10], methods that vary side chain rotamers[11], and induced-fit docking[12]. Unfortunately, none of these approaches can treat the large conformational changes accompanying RNA recognition while maintaining the high computational efficiency needed for VS applications. 
An alternative approach treats the receptor as an ensemble of many conformations each of which is subjected to VS[13-15] (reviewed in [7,8]). However, the force fields used in molecular dynamics (MD) simulations to generate ensembles of conformations remain underdeveloped and poorly tested for RNA[16,17]. Because of this, and the much higher flexibility of RNA[16,17], there is a greater risk of including artifactual conformations in the ensemble that are rarely sampled in solution, leading to false positives in VS[18-20]. There is also a greater risk of not sampling conformers with favorable binding pockets because of RNA’s more rugged energy landscape and high propensity for kinetic traps[21], thus increasing the likelihood of false negatives.
Recent approaches that combine experimental data with computational methods are making it possible to determine ensembles of proteins and nucleic acids at atomic resolution[22-26]. Interestingly, ensembles of the apo-state determined using these hybrid approaches often include conformations similar to those observed for the biomolecule when bound to cognate partners[23-25]. Inspired by these discoveries, we[9] and others[27,28] have carried out ensemble-based VS (EBVS) using experimentally informed ensembles. The utility of this approach in targeting RNA was demonstrated in a prospective study[9] utilizing an ensemble of the transactivation response element (TAR) RNA from human immunodeficiency virus type-1 (HIV-1) (Fig. 1a). The ensemble was determined using two sets of NMR residual dipolar coupling (RDC) data[29,30] to guide selection of conformations from a pool generated using MD simulations[23]. RDCs depend on the orientation of bond vectors in a biomolecule relative to a molecule-fixed alignment frame and are sensitive to internal motions spanning a broad range of timescales (picosecond-to-millisecond)[29,30]. 
The top 57 scoring small molecules out of a screen of 51,000 compounds included six molecules that bind TAR in vitro. These include the first example of a small molecule that binds an RNA apical loop and an aminoglycoside that binds TAR with high selectivity, inhibiting HIV replication (IC50∼20 μM) in an indicator cell line[9].
Figure 1
Experimental HTS of HIV-1 TAR RNA to generate libraries for EBVS a. Secondary structure of HIV-1 TAR. b. HTS workflow identifying hits and non-hits. c. Chemical property distributions of hits (blue) and non-hits (gray) for the Full, Filtered and Optimized libraries.
A critical evaluation of VS requires retrospective studies that test the ability of docking to discriminate between known hits and non-hits[31]. Such studies are routine in protein applications but have been scarce for RNA. Thus far, no study has evaluated the utility of experimentally informed RNA ensembles in enriching true hits using EBVS. In addition, most RNA studies employ non-hits that are not experimentally verified but rather selected using decoy-generation approaches developed for proteins that have not been validated for RNA[32-34]. Here, we generated a rich dataset by subjecting HIV-1 TAR to experimental HTS against ~100,000 drug-like organic molecules (Fig. 1b). This represents one of the largest RNA-small molecule screens reported to date. After augmenting the ~100,000 compound library with 170 known TAR binders, we generated experimentally validated datasets of hits and non-hits optimized for testing VS, following the general protocol used to generate the database of useful (docking) decoys enhanced (DUD-E) in protein applications[31]. The results demonstrate that experimentally determined RNA ensembles significantly enrich libraries with true hits and that the degree of enrichment is dependent on the accuracy of the ensemble.
RESULTS
Experimental high-throughput screening to identify TAR hits and non-hits
Using a Tat peptide displacement assay, we subjected HIV-1 TAR (Fig. 1a) to experimental HTS against ~100,000 drug-like molecules following the workflow shown in Figure 1b (details in Methods). The library was initially tested in a primary screen employing single point measurements and 260-fold excess small molecule. The 2,812 primary screen hits were subjected to a secondary confirmation screen employing triplicate measurements. The 267 confirmed hits were tested in dose response assays, yielding 17 hits with CD50 values (the competitive dose required to displace 50% of Tat peptide) < 100 μM. These compounds were repurchased and re-tested for TAR binding using the displacement assay and NMR chemical shift mapping experiments. This yielded six confirmed hits (Table 1 and Supplementary Fig. 1) and identified three false positives (see Methods and Supplementary Fig. 2). To limit false negatives, we re-tested 56 non-hits with chemical similarity to the hits using dose response assays and NMR experiments. This resulted in the identification of one additional hit (Table 1 and Supplementary Fig. 1) and confirmation of many non-hits with high chemical similarity to our hits (examples in Supplementary Fig. 3). The fact that small structural changes can ablate binding is consistent with the hit molecules making specific interactions with TAR.
Table 1
Chemical structure and CD50 values (with and without 100-fold excess tRNA) for TAR hits identified through HTS. Reported values represent the mean and s.d. from n=3 independent experiments.
Molecule Name | CD50 (μM) | 100X tRNA CD50 (μM)
CCG-133994 | 12 ± 4 | 16 ± 5
CCG-133895 | 17 ± 1 | 29 ± 17
CCG-133868 | 31 ± 7 | 21 ± 10
CCG-133905 | 53 ± 30 | 12 ± 1
CCG-133879 | 29 ± 8 | 24 ± 2
CCG-208662 | 41 ± 14 | NA
CCG-208677 | 55 ± 13 | NA
To test for false negatives, we re-tested 10 non-hits that scored in the top 5% of EBVS using NMR. Four molecules were identified that bind TAR, including an aminoglycoside which was missed in HTS due to insolubility in DMSO, a weak binder that does not satisfy our hit criteria, and two compounds whose binding affinities could not be verified due to fluorescence interference effects (see Methods and Supplementary Fig. 4). These compounds were removed from the VS libraries to avoid biasing results. These results highlight potential weaknesses in experimental HTS and provide a blind test for EBVS to identify TAR binders.
Overall, HTS yielded seven hits, which represent two novel classes of RNA binding small molecules (Table 1). Five of the hit molecules share an anthraquinone scaffold while the other two have naphthyl and quinazoline cores. The anthraquinone molecules show selectivity relative to tRNA (Table 1) and insignificant activity in a microRNA screen (Dr. A.L. Garner, University of Michigan, personal communication). Of particular relevance to this study, the HTS yielded 103,349 experimentally verified non-hits that can be used as decoys to test the performance of EBVS.
Building small molecule libraries for EBVS evaluation
The HTS library was augmented with 170 diverse small molecules reported in the literature to bind TAR with dissociation or inhibition constants that satisfy our hit criteria (see Supplementary Note 1 and Supplementary Table 1). The hits include derivatives of beta-carboline, quinolone, diphenylfuran, nucleosides, aminoglycosides, and many others as well as 36 molecules with demonstrated activity in cell (or cell-extract) based assays. To avoid bias and maximize chemical diversity, the 177 hits were clustered based on Bemis-Murcko atomic frameworks and the compound with highest affinity selected as a representative of each scaffold. This resulted in the “Full” library consisting of 78 hits (19 with cell-based activity) and 103,349 non-hits.
The chemical properties of hits and non-hits in the Full library are markedly different (Fig. 1c). On average, the hits, which include several aminoglycosides, have larger molecular weight, charge, number of rotatable bonds, hydrogen bond donors and acceptors as well as lower LogP values. Similar differences between RNA binders and compound libraries used in HTS have been noted previously[35]. Such differences can lead to artificial enrichment in VS by biasing docking scores for hits versus non-hits based solely on differences in 1D chemical properties and not 3D structure complementarity[31,36]. We therefore generated two additional property-matched libraries that provide a more stringent test for docking-based enrichment (see Supplementary Note 1). A “Filtered” library containing 26 hits (8 with cell-based activity) and 102,307 non-hits was generated by omitting small molecules with outlier chemical properties (Fig. 1c and Supplementary Fig. 5). 
An “Optimized” library containing 14 hits (5 with cell-based activity) and 637 non-hits was generated following the general protocol for decoy generation used in the DUD-E[31] for protein applications where a set number of property-matched and topologically distinct non-hits are selected for each hit (Fig. 1c, Supplementary Fig. 5, and Supplementary Fig. 6a-b). Together, the three small molecule libraries provide the means to robustly evaluate the performance of VS against TAR RNA.
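The scaffold-based selection of representative hits described above can be illustrated with a minimal sketch. It assumes that each hit has already been assigned a Bemis-Murcko scaffold label by a cheminformatics toolkit and that affinity is expressed so that a lower value means tighter binding. The function name, molecule names, scaffold labels, and affinity values are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def pick_scaffold_representatives(hits):
    """Return {scaffold: name of the tightest binder} for a list of hits.

    hits: iterable of (name, scaffold_label, affinity_uM) tuples, where a
    lower affinity_uM (e.g. Kd or CD50) means tighter binding (assumption).
    """
    by_scaffold = defaultdict(list)
    for name, scaffold, affinity_uM in hits:
        by_scaffold[scaffold].append((affinity_uM, name))
    # min() on (affinity, name) tuples selects the lowest affinity value
    return {scaffold: min(members)[1] for scaffold, members in by_scaffold.items()}

# Hypothetical toy data: five hits spread over three scaffolds
toy_hits = [
    ("mol_a", "scaffold_1", 12.0),
    ("mol_b", "scaffold_1", 31.0),
    ("mol_c", "scaffold_2", 41.0),
    ("mol_d", "scaffold_2", 5.0),
    ("mol_e", "scaffold_3", 55.0),
]
representatives = pick_scaffold_representatives(toy_hits)
print(representatives)
```

Keeping one compound per scaffold prevents a heavily populated chemotype from dominating the enrichment statistics, which is the stated motivation for reducing the 177 hits to 78.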
Ensemble based virtual screening
EBVS was carried out against a recently reported RDC-informed dynamic ensemble (E0,4rdc) of HIV-1 TAR RNA (Fig. 2a)[24]. The ensemble contains twenty unique and equally populated (5% each) conformations[24]. Compared to the previous TAR ensemble used in VS[9,23] (see E1,2rdc below), this ensemble was determined using four rather than two sets of RDCs[17] and a longer MD simulation (8.2 μs versus 80 ns) to generate the starting pool of TAR conformations[24]. The TAR ensemble displays a high degree of flexibility; the pairwise RMSD between any two conformations is >1.9 Å and on average 5.9 Å. This substantially exceeds the flexibility of most protein targets, presenting a significant challenge to docking-based approaches.
Figure 2
Evaluating EBVS against the RDC TAR dynamic ensemble a. The twenty conformers of the TAR dynamic ensemble (E0,4rdc). b. ROC curve analysis showing EBVS enrichment of all hits (blue) and cell-active hits (orange) for all three libraries. c. ROC AUC and ROC(2%) scores for docking against individual conformers of the E0,4rdc ensemble, a randomly selected MD ensemble (E0,ran), and the lowest energy NOE-based structures for apo-TAR (PDB 1ANR) and tRNA (PDB 1EHZ) for the Filtered library. Dashed lines indicate the values for the full N=20 E0,4rdc ensemble. Results for the Full and Optimized libraries are shown in Supplementary Fig. 7. ROC plots were generated from one run of docking all molecules to all receptors.
Each small molecule was docked against every TAR conformer using Internal Coordinate Mechanics (ICM)[37]. Each small molecule was assigned a docking score corresponding to the best score across the 20 conformers, a Boltzmann-weighted average score, or an arithmetic average score (see Methods). The global enrichment of true binders was assessed based on the area under the curve (AUC) of a receiver operator characteristic (ROC) curve, with AUC=1.0 representing perfect enrichment and AUC=0.5 representing random selection of hits and non-hits. For the Full library, optimal enrichment was obtained using the Boltzmann average or best score, whereas the arithmetic average yielded slightly better enrichment for the Filtered and Optimized libraries (Supplementary Fig. 6c). The Full library had larger variation in enrichment across scoring approaches because it contains molecules with highly varied docking scores across conformers. In what follows, we use the Boltzmann average score for the Full library and arithmetic average score for both Filtered and Optimized libraries. Results for all scoring approaches and for including all hits without clustering are presented in Figure 2b and Supplementary Figure 6c.
EBVS globally enriches the Full library with ROC AUC=0.88 and 42% of hits are identified after screening only 2% of non-hits (ROC(2%)=42%) (Fig. 2b). This corresponds to a hit rate of 1.6% as compared to 0.075% when screening the entire library, and an enrichment factor EF(2%)=21. Similar levels of enrichment were obtained for the Filtered (ROC AUC=0.85 and ROC(2%)=50%) and Optimized (ROC AUC=0.90 and ROC(2%)=57%) libraries (Fig. 2b). EBVS significantly enriches hits with cell-based activity with ROC AUC=0.91-0.94 and ROC(2%)=40-75% (Fig. 2b). 
This performance is comparable to best-case results when docking to known bound structures of proteins[38-40].
The enrichment was lower for individual TAR conformers derived from the ensemble and decreased further for single conformers randomly selected from the MD pool (Fig. 2c and Supplementary Fig. 7a). Docking against the lowest energy NOE-based structure of free TAR (PDB 1ANR)[41] generally performed better than other single conformers, but consistently worse than EBVS (Fig. 2c and Supplementary Fig. 7a). Enrichment was also lower for an NMR structure of tRNA (PDB 1EHZ)[42] compared to the TAR ensemble (Fig. 2c and Supplementary Fig. 7a). The TAR binders, including those with cell activity, had higher scores on average for docking against tRNA compared to TAR, suggesting that VS would have identified these as selective TAR binders (Supplementary Table 2). The similar level of enrichment observed across the three libraries when virtually screening the ensemble, single conformers, or tRNA argues against significant artificial enrichment in the Full library.
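The scoring and enrichment metrics used in this section can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' implementation: it assumes that lower docking scores are better, that the Boltzmann weighting treats scores as energies with an assumed kT of 0.593 (the Methods define the actual scheme), and that ROC(2%) is the fraction of hits scoring better than the non-hit sitting at the 2% false positive mark. All function names and toy scores are hypothetical. The final lines reproduce the quoted Full library hit rate and enrichment factor from the reported counts under these assumptions.

```python
import math

def combine_scores(scores, mode="best", kT=0.593):
    """Collapse per-conformer docking scores for one molecule (lower = better).

    kT is an assumed value in the same units as the scores.
    """
    if mode == "best":
        return min(scores)
    if mode == "mean":
        return sum(scores) / len(scores)
    if mode == "boltzmann":
        s_min = min(scores)  # shift by the minimum for numerical stability
        weights = [math.exp(-(s - s_min) / kT) for s in scores]
        return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    raise ValueError(f"unknown mode: {mode}")

def roc_auc(hit_scores, nonhit_scores):
    """Probability that a random hit outscores a random non-hit (ties = 0.5)."""
    wins = 0.0
    for h in hit_scores:
        for n in nonhit_scores:
            if h < n:
                wins += 1.0
            elif h == n:
                wins += 0.5
    return wins / (len(hit_scores) * len(nonhit_scores))

def roc_at_fpr(hit_scores, nonhit_scores, fpr=0.02):
    """Fraction of hits recovered before more than fpr of non-hits are screened."""
    ranked = sorted(nonhit_scores)          # best-scoring non-hits first
    k = int(fpr * len(ranked))              # number of non-hits allowed
    cutoff = ranked[min(k, len(ranked) - 1)]
    return sum(1 for h in hit_scores if h < cutoff) / len(hit_scores)

# Hypothetical toy scores: 7 of the 8 hit/non-hit pairs are ranked correctly
toy_auc = roc_auc([-30.0, -25.0], [-28.0, -10.0, -5.0, -1.0])

# Full library arithmetic using the counts reported in the text
n_hits, n_nonhits, roc_2pct = 78, 103_349, 0.42
hits_found = roc_2pct * n_hits              # hits recovered at the 2% mark
nonhits_screened = 0.02 * n_nonhits         # non-hits screened at the 2% mark
hit_rate = hits_found / (hits_found + nonhits_screened)
baseline_rate = n_hits / (n_hits + n_nonhits)
enrichment_factor = hit_rate / baseline_rate
print(f"AUC={toy_auc:.3f} hit rate={hit_rate:.1%} "
      f"baseline={baseline_rate:.3%} EF={enrichment_factor:.0f}")
```

The pairwise AUC shown here scales as the product of the two list sizes, which is acceptable for a sketch but would be replaced by a rank-based calculation for a library of ~100,000 molecules.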
Enrichment depends on ensemble size
On average, enrichment decreased when using smaller sub-ensembles derived from the full N=20 ensemble, reaching a minimum at N=1 (Fig. 3a and Supplementary Fig. 7b). This is despite the fact that increasing the ensemble size increases the risk of including artifactual conformations that can lead to false positives[18-20]. The N=20 TAR ensemble represents the smallest ensemble that satisfies the RDC data, with smaller ensembles failing to reproduce the RDCs to within experimental uncertainty[24]. Accordingly, the sub-ensembles have diminishing accuracy as measured by their agreement with the four RDC datasets (RDC RMSD) (Fig. 3b). Consequently, enrichment decreases on average with increasing RDC RMSD (Fig. 3c). Similar trends were observed for all libraries and for other ensembles (see below and Supplementary Fig. 7c-d). These results show that all 20 conformations contribute to the high enrichment observed for TAR and suggest a correlation between enrichment and ensemble accuracy.
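The sub-ensemble analysis can be sketched as an exhaustive enumeration over conformer subsets. The example below is a toy illustration under stated assumptions: per-conformer docking scores are already available as one row per molecule, the ensemble score is the arithmetic average over the selected conformers, lower scores are better, and the spread is reported as a population standard deviation. The score matrices and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev

def roc_auc(hit_scores, nonhit_scores):
    """Fraction of hit/non-hit pairs in which the hit scores better (ties = 0.5)."""
    wins = sum(1.0 if h < n else 0.5 if h == n else 0.0
               for h in hit_scores for n in nonhit_scores)
    return wins / (len(hit_scores) * len(nonhit_scores))

def enrichment_by_size(hit_matrix, nonhit_matrix):
    """Map each sub-ensemble size to (mean AUC, s.d. of AUC) over all subsets.

    Each matrix row holds one molecule's docking score against every conformer.
    """
    n_conf = len(hit_matrix[0])
    summary = {}
    for size in range(1, n_conf + 1):
        aucs = []
        for subset in combinations(range(n_conf), size):
            hits = [mean(row[i] for i in subset) for row in hit_matrix]
            nonhits = [mean(row[i] for i in subset) for row in nonhit_matrix]
            aucs.append(roc_auc(hits, nonhits))
        summary[size] = (mean(aucs), pstdev(aucs))
    return summary

# Hypothetical scores for 2 hits and 3 non-hits against 3 conformers
toy_hits = [[-30, -10, -20], [-12, -28, -20]]
toy_nonhits = [[-15, -15, -15], [-5, -8, -11], [-20, -2, -2]]
summary = enrichment_by_size(toy_hits, toy_nonhits)
for size, (avg, sd) in summary.items():
    print(f"N={size}: mean AUC={avg:.3f}, s.d.={sd:.3f}")
```

In this toy case the mean AUC rises with sub-ensemble size because each hit docks well to a different conformer. For an N=20 ensemble the same enumeration covers 2^20 − 1 = 1,048,575 non-empty sub-ensembles, so per-conformer scores should be computed once and reused.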
Figure 3
Dependence of EBVS enrichment on ensemble size and ensemble accuracy a. Dependence of the ROC AUC and ROC(2%) scores on ensemble size for the Filtered library. b. Dependence of the RDC RMSD on the ensemble size. c. Dependence of the ROC AUC and ROC(2%) scores on the RDC RMSD for the Filtered library. For a, b, and c the mean and s.d. values over all possible sub-ensembles of each ensemble size are plotted. d. Distinct ensembles of apo-TAR with variable accuracy as assessed based on RDC RMSD (shown in parentheses). e. Dependence of the ROC AUC and ROC(2%) on RDC RMSD for all hits (blue) and cell-active hits (orange) of the Filtered library. f. Mean and s.d. of EBVS scores for hits (blue) and non-hits (gray) of the Filtered library for all ensembles. Dashed lines represent the values for the E0,4rdc ensemble. Results for the Full and Optimized libraries are shown in Supplementary Fig. 7 and 9. All ROC values were generated from one run of docking all molecules to all receptors.
Although all conformations contribute to enrichment, some conformers are predicted to be more or less preferentially bound across the Full library (Supplementary Fig. 8a) and the preferences are different for hit molecules relative to the Full library (Supplementary Fig. 8b). Interestingly, conformer 5, which most resembles a known ligand-bound TAR conformation, yields the lowest docking score for many molecules across the Full library but is favored by a smaller percentage of hit molecules, suggesting a favorable but non-selective binding pocket (Supplementary Fig. 8b). Conformers 8, 10, and 17, which also most resemble known ligand-bound TAR conformations, yield the best docking score for more hits than non-hits although conformer 17 is also often selected by false positive hits (top 2% scored non-hits) (Supplementary Fig. 8b).
Hyper-enriching sub-ensembles, which exhibit higher enrichment than the N=20 parent ensemble, tend to be enriched in conformers that score highly for hits compared to the Full library, such as conformer 2 (Supplementary Fig. 8c). On the other hand, conformers that are not favored by hit molecules relative to the library, such as conformers 5 and 15, are found in fewer hyper-enriching ensembles. Despite small variations, most conformers are not significantly under- (<20%) or overrepresented (>80%) in hyper-enriching ensembles, supporting the conclusion that all conformers contribute to enrichment. Taken together, these results highlight how a given conformer can contribute positively to enrichment when placed within an appropriate sub-ensemble even though it may have poor enrichment when considered in isolation or in a different ensemble context.
Enrichment depends on ensemble accuracy
We carried out EBVS on six additional N=20 TAR ensembles with varying degrees of accuracy as assessed by RDC RMSD (Fig. 3d). E0,4rdc, determined using four sets of RDCs and an MD-generated pool (MD0), predicts the four RDC data sets with an optimal RMSD=4.0 Hz. Three additional ensembles were generated from the same MD pool by randomly selecting conformations (E0,ran), clustering the MD pool by heavy atom RMSD (E0,clus), or by selecting an ensemble that poorly satisfies the RDCs (E0,anti). These ensembles predict RDCs with less favorable RMSDs of 10.4 Hz, 9.0 Hz, and 16.2 Hz, respectively. Additionally, we examined a previously reported TAR ensemble (E1,2rdc; RMSD=7.2 Hz) determined using two sets of RDCs and a different MD pool (MD1)[23] as well as a corresponding control ensemble (E1,ran; RMSD=11.0 Hz) obtained by randomly selecting 20 conformations from the MD1 pool. Finally, we examined the NOE-based bundle of HIV-1 TAR structures (ENOE; RMSD=8.6 Hz)[41]. Note that the high RDC RMSD observed for ENOE is not surprising considering that NOE-derived distance restraints are orthogonal to RDC-derived orientational restraints and that the bundle of structures is not a statistical ensemble but rather a collection of single structures that satisfy the experimental constraints.
As shown in Figure 3e, Supplementary Fig. 9a and Table 2, E0,4rdc, which best satisfies the RDCs, robustly yields the highest enrichment across the three libraries whereas E0,anti, which least satisfies the RDCs, generally yields the lowest enrichment. The enrichment observed for the remaining ensembles falls between these extremes, and is generally better for the two experimentally informed ensembles (E1,2rdc and ENOE). The three purely computational ensembles (E0,ran, E0,clus and E1,ran) show significant variations in enrichment, highlighting the risks of generating ensembles without experimental input. 
E0,ran and E0,clus have very similar RDC RMSDs but E0,clus consistently yields higher enrichment, showing that RDC RMSD is not the only predictor of enrichment. This is not surprising considering that RDCs are insensitive to translational aspects of RNA structure that are likely important for predicting binding and that multiple degenerate ensembles can satisfy a given set of RDCs[43].
Differences in docking scores and binding pockets help explain the different enrichment levels observed across different ensembles (Fig. 3f and Supplementary Fig. 9b). The difference in scores between hits and non-hits is greater for E0,4rdc relative to other ensembles. The average scores of hits for E0,4rdc are lower than most other ensembles, consistent with formation of optimal pockets. E1,2rdc and E1,ran have comparatively lower average scores for non-hits, increasing the likelihood of false positives. Conformers in these ensembles tend to have larger binding pockets relative to other ensembles (Supplementary Fig. 9c). The average scores for E0,ran and E0,anti are significantly elevated for both hits and non-hits and correspondingly they have smaller binding pockets on average (Fig. 3f and Supplementary Fig. 9b-c). All other ensembles had similar binding pocket sizes and accessibility, indicating that enrichment is not determined only by these gross binding pocket features (Supplementary Fig. 9c).
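The RDC RMSD used here as the measure of ensemble accuracy can be illustrated with a short sketch. It assumes that an RDC has already been back-calculated for every bond vector in every conformer (a step that requires an alignment tensor and is not shown), that the ensemble prediction is the population-weighted average over conformers, and that conformers are equally populated unless weights are supplied. The function name and the numerical values are hypothetical.

```python
import math

def ensemble_rdc_rmsd(per_conformer_rdcs, measured_rdcs, weights=None):
    """RMSD (Hz) between ensemble-averaged and measured RDCs.

    per_conformer_rdcs: one list of back-calculated RDCs per conformer,
    ordered to match measured_rdcs. Equal populations are assumed when
    weights is None.
    """
    n_conf = len(per_conformer_rdcs)
    if weights is None:
        weights = [1.0 / n_conf] * n_conf
    averaged = [
        sum(w * conf[j] for w, conf in zip(weights, per_conformer_rdcs))
        for j in range(len(measured_rdcs))
    ]
    squared = [(p - m) ** 2 for p, m in zip(averaged, measured_rdcs)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values: two conformers, three measured couplings (Hz)
toy_predicted = [[10.0, -20.0, 30.0], [20.0, -10.0, 10.0]]
toy_measured = [12.0, -11.0, 20.0]
rmsd = ensemble_rdc_rmsd(toy_predicted, toy_measured)
print(f"RDC RMSD = {rmsd:.2f} Hz")
```

Because the average is taken before comparison with experiment, two ensembles with very different members can give the same RMSD, which is consistent with the degeneracy noted in the text.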
Enrichment correlates to overlap with ligand-bound conformations
We examined how well the different ensembles encompass six previously determined NMR structures of ligand-bound TAR (acetylpromazine (1LVJ)[44], rbt550 (1UTS)[45], rbt203 (1UUD)[46], rbt205 (1UUI)[46], neomycin B (1QD3)[47], and arginine (1ARJ)[48]). First, we focused on the relative orientation of TAR helices, which is an important determinant of RNA binding pockets[49] and is the least well modeled aspect of TAR in MD simulations[16]. The average inter-helical orientation has also been independently validated for three ligand-bound TAR conformations based on order tensor analysis of RDCs for arginine[50], acetylpromazine[51], and neomycin B[51]. However, these RDC studies also highlighted uncertainty in the NOE-based structures due to deviations in the local geometry and/or unaccounted flexibility (Supplementary Note 2 and Supplementary Fig. 10a).
Overall, ensembles that best overlap with the ligand-bound conformations showed the best EBVS enrichment (Fig. 4a). As noted previously[24], E0,4rdc encompasses the six ligand-bound TAR inter-helical conformations despite a very broad MD0 pool. In contrast, E0,anti, which shows the weakest enrichment, shows the poorest overlap with the ligand-bound conformations. Interestingly, E0,clus, which shows better enrichment than either E0,anti or E0,ran, shows the most significant overlap among the three computational ensembles. The MD1 pool has a different spread of inter-helical angles than MD0 that does not overlap as well with the ligand-bound conformations. The ensembles derived from MD1 (E1,2rdc, E1,ran, and ENOE) all show intermediate overlap with the ligand-bound conformations.
Figure 4
Assessing EBVS-predicted small molecule bound TAR conformations. a. Inter-helical bend (βh) and twist (αh+γh) angles[59] (negative and positive twist angles correspond to over- and under-twisting, respectively) for each TAR ensemble (colored) compared to its respective parent MD pool (gray) and all ligand-bound TAR NOE-based NMR structures (black, mean and s.d. values over all deposited structures). b. For each small molecule, the inter-helical angles of the ligand-bound NMR structures (black, mean and s.d. values over all deposited structures) are compared to the conformers of the E0,4rdc ensemble (open squares), the average values over all conformers (green), the Boltzmann-weighted EBVS-predicted structures (blue circles, mean and s.d. values over n=20 independent docking runs), and all conformers predicted to be > 25% populated over n=20 independent docking runs (blue open squares).
Excluding neomycin B, the average inter-helical conformation predicted by EBVS against E0,4rdc is within error of the NMR structures for four out of five molecules and all five bend angles are within error (Fig. 4b). In contrast, only two structures are within error for EBVS against E0,ran (Supplementary Fig. 10b). In the case of neomycin B, docking prefers a conformer that differs considerably from the NMR structure. Here, the larger size of neomycin B likely contributes to greater uncertainty in the docking predictions, as is observed in benchmark studies (Fig. 5 and Ref [52]).
Figure 5
Evaluating ligand-bound poses predicted using EBVS a. Success rates for a benchmark re-docking X-ray (black) or NMR (gray) structures of RNA bound to ligands (see Supplementary Table 3). Data shown for molecules with number of rotatable bonds Nflex< 11 (solid line) and Nflex> 11 (dashed line). RMSD values correspond to the best scoring pose over n=20 independent docking runs. b. Benchmark RMSDs when re-docking ligands to their X-ray (123 structures) or NMR (26 structures) RNA structure and for molecules with Nflex< 11 (17 NMR structures and 90 X-ray structures) and Nflex> 11 (9 NMR structures and 33 X-ray structures). Also shown are the RMSDs over n=20 independent docking runs for each ligand-bound TAR NMR structure after re-docking to the NMR structure (yellow) or when carrying out EBVS against E0,4rdc (blue) and E0,ran (red). (center line, median; center square, mean; box limits, 25th and 75th percentiles; whiskers, 5th and 95th percentiles; points, outliers) c. Lowest RMSD bound poses over n=20 independent docking runs based on re-docking the NMR structure (yellow) or when carrying out EBVS against E0,4rdc (blue) or E0,ran (red). All poses are superimposed onto the NMR structure (black) using the binding pocket and ligand.
Comparison of ligand-bound poses reveals that, with the exception of neomycin B, EBVS correctly places the ligands within or near the RNA binding pocket defined by the NOE-based NMR structure. A more quantitative comparison is complicated by many factors, including the fact that EBVS predicts an ensemble of bound conformations rather than a single structure, differences between the NMR and EBVS-predicted RNA structures that complicate alignment, and evidence for uncertainty in local aspects of the NOE-based NMR structure[50], which may arise from the dynamic nature of these complexes (Supplementary Note 2 and Supplementary Fig. 10a). Notwithstanding the above complications, we compared the EBVS predicted ligand poses with the NMR structures.
We first carried out benchmark studies by re-docking known RNA ligand-bound structures (Supplementary Table 3) and computing the ligand RMSD between the re-docked pose and the original structure. For X-ray structures of RNA bound to ligands with fewer than 11 rotatable bonds (Nflex<11), we obtained a success rate of 72% for an RMSD cutoff of 2.5 Å (Fig. 5a). However, the success rate dropped significantly for NMR structures or molecules with Nflex>11 (Fig. 5a). These results highlight the fact that our docking protocol is able to recapitulate bound poses when the structure is well-defined and the molecule is not highly flexible.
To compare EBVS predicted ligand poses to the NMR structures, we computed the ligand RMSD after superimposing structures using both the RNA binding pocket and ligand (see Methods). 
On average, EBVS predicts the ligand-bound poses (RMSD= 8.3 ± 3.5 Å (acetylpromazine), 7.0 ± 2.0 Å (arginine), 9.2 ± 1.8 Å (rbt205), 10.3 ± 3.4 Å (rbt550), 10.1 ± 3.3 Å (rbt203) and 17.1 ± 4.5 Å (neomycin B)) with an accuracy that is comparable to, albeit consistently slightly poorer than, that obtained when re-docking the ligands against their NMR structure (RMSD= 4.9 ± 0.6 Å (acetylpromazine), 6.8 ± 0.9 Å (arginine), 7.1 ± 1.3 Å (rbt205), 8.9 ± 0.9 Å (rbt550), 8.0 ± 2.2 Å (rbt203) and 12.0 ± 1.8 Å (neomycin B)) (Fig. 5b). These RMSDs are on the high end for re-docking NMR structures (Fig. 5b). This could be because the apo-ensemble does not perfectly reproduce the ligand-bound TAR conformations and/or because of uncertainty in the NOE-based NMR structures due to lack of RDC restraints and/or unaccounted flexibility. When only considering the lowest RMSD pose over 20 docking runs, EBVS against E0,4rdc agrees better with the NMR structure than re-docking the NMR structure itself for three of the six ligands, and EBVS against E0,ran yields the poorest agreement for all ligands except neomycin B (Fig. 5c). In light of our benchmark study, the poor pose prediction of neomycin B may be attributed in part to its large number of rotatable bonds.
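The pose comparison rests on two simple quantities, the ligand RMSD and the success rate at a chosen cutoff, which can be sketched as follows. The sketch assumes the docked and reference poses have already been superimposed and that atoms are listed in the same order in both. The coordinates and the list of benchmark RMSDs are hypothetical, and counting a pose exactly at the cutoff as a success is a choice made here for illustration.

```python
import math

def ligand_rmsd(pose, reference):
    """Heavy-atom RMSD (Å) between two poses with matching atom order.

    Both poses are lists of (x, y, z) tuples that are assumed to be
    superimposed already in a common frame.
    """
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, reference)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(reference))

def success_rate(rmsds, cutoff=2.5):
    """Fraction of re-docked poses with RMSD at or below the cutoff (Å)."""
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)

# Hypothetical two-atom ligand displaced by 1 Å along z
reference_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
toy_rmsd = ligand_rmsd(docked_pose, reference_pose)

# Hypothetical benchmark RMSDs for four re-docked complexes
toy_rate = success_rate([1.0, 2.4, 3.0, 8.3], cutoff=2.5)
print(f"RMSD = {toy_rmsd:.2f} Å, success rate = {toy_rate:.0%}")
```

Applied to the average EBVS RMSDs reported above (7.0 to 17.1 Å), a 2.5 Å cutoff would count none of the six TAR complexes as a success, which is why the comparison in the text is made relative to re-docking the NMR structures instead.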
Discussion
Advances in hybrid experimental-computational methods are enabling the determination of dynamic ensembles with ever increasing accuracy. One of the emerging themes from studies thus far is that bound conformations of biomolecules are often significantly populated in the apo-state ensemble. Even though ensemble-based docking is becoming a popular method for treating flexibility during VS[7,8,13-15], only three studies have subjected experimentally informed ensembles to VS[9,27,28]. Rather, static structures or purely computational ensembles are typically subjected to VS. Here, we present the first perspective study evaluating the enrichment performance of VS against experimentally informed ensembles and comparing it to that of computational ensembles.

While an ensemble of structures can often be identified that outperforms single X-ray or NMR structures in retrospective enrichment studies[53,54], identifying the successful ensemble in advance of VS can prove difficult[19,55]. This is a significant problem for RNA given that a handful of conformers have to be selected from thousands of conformations as representatives of a broad conformational landscape. Our results emphasize the potential importance of conformational penalties[27] when developing and testing scoring functions against highly flexible RNA targets[9,56]. In the case of TAR, the performance varies significantly when drawing N=20 ensembles from the same MD pool without guidance from experimental data (Fig. 3 and Table 2). Data from NMR, X-ray, or other methods can guard against artifactual conformations and guide identification of the most populated conformations, which carry the least conformational penalties for ligand binding[22-26,57]. Experimentally informed conformer populations can also be directly translated into scoring penalties during EBVS[27].
Additionally, experimental data can define an optimally small ensemble for VS applications, whereas there is no general recipe for selecting ensemble size without experimental input[53,55]. In the case of RDCs, it has been shown that the minimum sized ensembles that satisfy the data represent a data driven clustering of the real ensemble[43]. Here, the ensemble size is naturally tuned to the level of dynamics with greater flexibility calling for larger ensembles to satisfy the RDCs[43].
Table 2
Enrichment scores for all TAR ensembles. ROC values were generated from one run of docking all molecules to all receptors.
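| Ensemble | RDC RMSD (Hz) | Full library, Boltzmann weighting: AUC | Full library, Boltzmann weighting: ROC 2% | Filtered library, arithmetic average: AUC | Filtered library, arithmetic average: ROC 2% | Optimized library, arithmetic average: AUC | Optimized library, arithmetic average: ROC 2% |
|---|---|---|---|---|---|---|---|
| E0,4rdc | 4.0 | 0.88 | 42% | 0.85 | 50% | 0.90 | 57% |
| E0,ran | 10.4 | 0.47 | 21% | 0.56 | 8% | 0.51 | 0% |
| E0,clus | 9.0 | 0.79 | 29% | 0.75 | 23% | 0.82 | 36% |
| E0,anti | 16.2 | 0.51 | 14% | 0.49 | 4% | 0.47 | 0% |
| E1,2rdc | 7.2 | 0.87 | 50% | 0.76 | 31% | 0.86 | 36% |
| E1,ran | 11.0 | 0.81 | 29% | 0.78 | 23% | 0.86 | 50% |
| ENOE | 8.6 | 0.73 | 31% | 0.80 | 27% | 0.76 | 36% |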
Our study also highlights future challenges and opportunities in RNA VS applications. First, while our results indicate that EBVS significantly enriches compounds with activity in cell-based (or cell extract based) assays, there is a need to more directly assess the RNA binding selectivity of hits and to assess the ability of EBVS to enrich for selective RNA binders. Second, rigorous evaluation of pose predictions from EBVS against flexible ncRNA targets will require more high-resolution structures of RNA-small molecule complexes by X-ray or NMR, so long as RDCs and other experimental restraints are used to improve the accuracy of NMR structures. Finally, there is room to further refine ensemble determination approaches by including low-populated conformational states that may have optimal binding pockets. For example, as noted previously[23,24], the experimentally informed TAR ensemble does not contain conformers with the U23-A27-U38 base triple which forms on ARG recognition[50,58]. The integration of conformational penalties from experimentally informed ensembles may help identify pitfalls in docking scoring functions that are currently obscured by treatment of RNA receptors as static structures. Notwithstanding the above future challenges, our results indicate that EBVS can immediately be applied to significantly enrich compound libraries with RNA binders.
ONLINE METHODS
HTS library composition
The small molecule library used in experimental HTS consisted of 103,498 drug-like small molecules available at the Center for Chemical Genomics (CCG), University of Michigan, Ann Arbor. Of these, 100,000 were synthetic organic molecules with drug-like properties (ChemDiv). The other 3,498 compounds consisted of 2,000 bioactive molecules (MicroSource Discovery Systems Inc.), 446 molecules from the National Institutes of Health Clinical Collection, and 1,052 molecules that the CCG had previously found to be active against other targets. The library was stored as 2-5 mM stock solutions in DMSO for ~3 years for initial screens. Repurchased molecules were stored as 3-20 mM stock solutions in DMSO for ~1 year, except for CCG-39701, which was stored as a powder and dissolved in water before use.
Preparation of HIV-1 TAR RNA and Tat peptide
HIV-1 TAR for NMR and binding assays was prepared by in vitro transcription using a DNA template containing the T7 promoter (Integrated DNA Technologies). The DNA template was annealed at 50 μM DNA in 3 mM MgCl2 by heating to 95°C for 5 min and cooling on ice for 30 min. The transcription reaction was carried out at 37°C for 12 hours with T7 RNA polymerase (New England BioLabs) in the presence of 13C/15N labeled or unlabeled nucleotide triphosphates (Cambridge Isotope Laboratories, Inc). RNA was purified using 20% (w/v) denaturing polyacrylamide gel electrophoresis with 8 M urea and 1X TBE. Purified RNA was extracted from the gel by electroelution in 1X TAE buffer and purified by ethanol precipitation. Purified RNA was dissolved in water to 50 μM RNA, heated to 95°C for 5 min and cooled on ice for 1 hour to anneal. For NMR experiments, 13C/15N labeled RNA was exchanged into NMR buffer [15 mM NaH2PO4/Na2HPO4, 25 mM NaCl, 0.1 mM EDTA, 10% (v/v) D2O at pH 6.4]. For in vitro assays, unlabeled RNA was diluted to 150 nM in Tris-HCl assay buffer [50 mM Tris-HCl, 50 mM KCl, 0.01% (v/v) Triton X-100 at pH 7.4].

The Tat peptide used in HTS, (5-FAM)-AAARKKRRQRRRAAA-Lys(TAMRA), was purchased (LifeTein) with purity > 95% as assessed by Electrospray Ionization Mass Spectrometry. The peptide was stored at −20°C as a 100 μM stock solution in Tris-HCl assay buffer and diluted to 60 nM with assay buffer for use in HTS.
High-throughput screening
Assay
HTS utilized a previously described TAR-Tat displacement assay[60]. The Tat peptide is highly flexible when free in solution and becomes structured upon binding to TAR[61-63]. When the Tat peptide is flexible, its two terminal fluorophores, fluorescein and TAMRA, interact and their fluorescence is quenched. Alternatively, in its extended form bound to TAR, the fluorophores are held at a distance allowing fluorescence resonance energy transfer (FRET) from fluorescein to TAMRA. Thus, as an inhibitor displaces Tat, there is a decrease in fluorescence signal (excitation: 485 nm, emission: 590 nm). For these assays, we used 50 nM TAR and 20 nM Tat because this ratio gave the maximal fluorescence signal. In the literature, this assay commonly uses a 1:1 ratio of TAR to Tat, so the excess TAR in our assay results in higher CD50 values and a relatively more stringent test of binding. Using neomycin B as a control, we found that the CD50 obtained using our assay (CD50 = 0.96 ± 0.42 μM) is slightly higher than that obtained with the same assay at a 1:1 ratio of TAR to Tat (CD50 = 0.32 ± 0.10 μM).

The library was tested in a primary screen using a single point measurement (n=1) and 260-fold excess molecule [50 nM TAR, 20 nM Tat, and 13 μM molecule] followed by a confirmation screen of triplicate measurements (n=3) for the 2812 molecules that showed activity, defined as a change in fluorescence signal three standard deviations above the negative control (Tat alone). Molecules were pin-tooled (200 nL) into opaque 384-well microplates using a Biomek FX 384-well nanoliter HDR (Beckman) and a Mosquito X1 (TTP Labtech). TAR and Tat were dispensed with a Multidrop reagent dispenser (Thermo Scientific). Assay mixtures were incubated at room temperature for 10–15 minutes prior to fluorescence measurements using a Pherastar plate reader (BMG Labtech). Each plate during HTS contained 16 wells of TAR and Tat without molecule (negative control) and 16 wells of Tat only (positive control).
The Z-factor[64] was calculated for each microplate; the average Z-factor throughout the screening campaign was 0.71.
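As an illustration of the per-plate quality metric, the sketch below computes a Z-factor from the control wells of one plate using the commonly used definition, Z = 1 − 3(σpos + σneg)/|μpos − μneg|. The function name and the example intensities are hypothetical, and the use of the sample standard deviation is an assumption; this is not the screening facility's own code.

```python
from statistics import mean, stdev


def z_factor(positive_wells, negative_wells):
    """Z-factor for one plate from its control-well intensities.

    Assumes the common definition
    Z = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|
    with sample standard deviations (an assumption of this sketch).
    """
    separation = abs(mean(positive_wells) - mean(negative_wells))
    if separation == 0:
        raise ValueError("control means are identical; Z-factor is undefined")
    spread = stdev(positive_wells) + stdev(negative_wells)
    return 1.0 - 3.0 * spread / separation


# Hypothetical control intensities (arbitrary units) for a single plate.
tat_only = [100.0, 102.0, 98.0, 101.0]
tar_plus_tat = [10.0, 12.0, 8.0, 11.0]
plate_z = z_factor(tat_only, tar_plus_tat)
```

Averaging the per-plate values over all plates would give a campaign-level figure comparable to the 0.71 reported above.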
Dose response assays
A total of 267 molecules with reproducible activity were tested in a dose response assay and those with CD50 < 100 μM were considered hits. Dose-response assays were performed such that the final assay concentrations were 50 nM TAR, 20 nM Tat, and 1-1000 μM molecule in assay buffer. Assays were performed in parallel with and without 100-fold excess bulk yeast tRNA to test specificity and in the absence of RNA (Tat only) to measure background signal. There were 137 molecules that caused fluorescence intensity change with Tat alone, suggesting they bound Tat; these were removed from further analysis. Assays were performed in opaque 384-well microplates and read with a Clariostar plate reader (BMG Labtech). Fluorescence signal was normalized to the highest intensity after subtracting background signal. Dose response curves were fit to Equation 1 with OriginPro (OriginLab) using the instrumental weighting method. Equation 2 was used to obtain CD50 values,

$$y = A_1 + \frac{A_2 - A_1}{1 + 10^{(\log x_0 - x)\,p}} \qquad (1)$$

$$\mathrm{CD}_{50} = 10^{\log x_0} \qquad (2)$$

where y is the normalized fluorescence signal; x is the logarithm to base 10 of the molecule concentration; A1 and A2 are the lowest and highest signals, respectively; p is the Hill slope; and logx0 is the logarithm to base 10 of the concentration at half response. All variables were allowed to float during the fit. Assays were measured in triplicate and the mean and standard deviation (s.d.) are reported.
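To make the roles of the fitted parameters concrete, the following sketch evaluates a four-parameter dose-response curve and converts the fitted midpoint into a CD50. The functional form is an assumption chosen to match the parameter definitions given for Equations 1 and 2 (lowest signal A1, highest signal A2, Hill slope p, and logx0); the function names are hypothetical and no fitting is performed here.

```python
def dose_response(log_conc, a1, a2, log_x0, p):
    """Signal at log10(concentration) for an assumed four-parameter curve.

    a1, a2: lowest and highest signals; log_x0: log10 of the concentration
    at half response; p: Hill slope (negative for a signal that decreases
    with concentration, as in a displacement assay).
    """
    return a1 + (a2 - a1) / (1.0 + 10.0 ** ((log_x0 - log_conc) * p))


def cd50_from_fit(log_x0):
    """CD50 in the same concentration units used for the fit."""
    return 10.0 ** log_x0


# Hypothetical fitted parameters: midpoint at 10 uM, decreasing signal.
half_signal = dose_response(1.0, a1=0.0, a2=1.0, log_x0=1.0, p=-1.2)
cd50_um = cd50_from_fit(1.0)
```

With these example parameters the curve sits at half of its range at the midpoint, and a molecule would count as a hit under the CD50 < 100 μM criterion.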
Validation of hits
The 17 small molecule hits from the dose response assays were re-purchased and re-tested for activity in addition to 56 molecules with chemical similarity to these hits, defined as having >80% similarity based on sphere exclusion clustering performed with the JKlustor package (ChemAxon). Next, 32 molecules, including all 17 hits and 15 chemically similar molecules with possible activity in the assay, were tested for TAR binding by NMR chemical shift titrations employing [13C-1H] SOFAST-HMQC NMR experiments[65] performed at 298 K on 600 MHz and 800 MHz Agilent spectrometers equipped with triple-resonance HCN cryogenic probes. 13C/15N-labeled TAR was exchanged into NMR buffer. Concentrated stocks of molecule in DMSO were added to TAR such that no more than 10% (v/v) DMSO was added to the buffer. Free TAR controls had equivalent volumes of DMSO to compensate for minor changes that may be induced by DMSO. Spectra were processed using nmrPipe[66] and SPARKY[67].

Nine molecules were inactive in both the displacement assay and NMR when retested with fresh molecule, suggesting that the original activity was due to contamination or degradation. One of the 56 molecules with chemical similarity to the hits, CCG-133994, was active in both the displacement assay and NMR, despite not being identified as a hit in the primary screen. Three molecules had activity in the assay, but did not bind based on NMR chemical shift titrations. Inspection of the Tat-only control for these molecules suggests that they likely bind Tat rather than TAR in the displacement assay (Supplementary Fig. 2). These should have been identified earlier in the workflow, but the fluorescence change in the presence of Tat may not have been large enough. Overall, seven molecules were confirmed to bind TAR RNA based on their activity in the TAR-Tat displacement assay and their ability to induce chemical shift perturbations in the TAR NMR spectra (Table 1 and Supplementary Fig. 1).
Hit molecules
The anthraquinone hits and chemically similar molecules exhibited a color change from orange to blue when diluted from 100% DMSO to an aqueous solution, likely due to DMSO reacting with the anthraquinone to form DMSO-anthraquinone, as described previously[68]. All experiments were performed with the derivatives in the blue state. The addition of the small molecule hits to TAR resulted in large chemical shift perturbations or line broadening in 2D NMR spectra for several residues throughout TAR (Supplementary Fig. 1b). As expected, hits with similar chemical structures induce similar chemical shift perturbations indicating that they interact with TAR via similar binding modes (Supplementary Fig. 1b). There are however two interesting exceptions. One of the five anthraquinone molecules, CCG-133905, induces significantly more broadening consistent with tighter binding and/or partial aggregation (Supplementary Fig. 1). CCG-133994, which contains an ester and an amine, induces chemical shift perturbations that are distinct from the other anthraquinone molecules, suggesting a distinct binding mode for this molecule (Supplementary Fig. 1). Furthermore, NMR reveals that CCG-133994 is in slow exchange, which is in agreement with the fact that it is the tightest binder in the TAR-Tat displacement assays.
Identification of false negatives
To investigate possible false negatives in the HTS, we selected ten molecules in the top 5% of docking scores and tested them for TAR binding using NMR. Four of the ten molecules did in fact bind TAR under NMR conditions (Supplementary Fig. 4). Closer analysis revealed that different factors led to the exclusion of these small molecules from HTS during the primary screen. One aminoglycoside molecule, CCG-39701, was insoluble in DMSO but was active in the assay when dissolved in water (Supplementary Fig. 4). CCG-174885 does not displace the Tat peptide strongly enough to be a hit in our assay, but NMR clearly shows that it does bind TAR. The other two molecules, CCG-208298 and CCG-100975, had fluorescence interference at high concentration, preventing determination of an accurate CD50 (Supplementary Fig. 4). To avoid biasing results, these molecules were not included in EBVS. Although these results demonstrate sources of uncertainty in our HTS results, our database is still based on more experimental data than the current standard of docking decoys and our Optimized library should limit the number of false negatives by removing molecules topologically similar to hit molecules (see below). These results also provide a blind test of EBVS since we were able to identify TAR binders.
Virtual Screening
VS was performed using the docking program Internal Coordinate Mechanics (ICM, Molsoft)[37]. The protocol allowed full ligand flexibility and rigid receptors. Docking was set up as described previously[19]. Briefly, each of the 20 conformers of the TAR dynamic ensemble[34] was uploaded to ICM in PDB format and converted to ICM objects using the default options (waters deleted and hydrogens optimized). Binding pockets were identified with the ICM PocketFinder Module using a tolerance value of 4.6. The volume and buriedness of the binding pocket are given by ICM. Receptor maps were generated to include all atoms within 5 Å of the predicted binding pockets with atom occupancy weighted. Docking was run with a thoroughness value of 1, flexible ring sampling level 2, and covalent geometry relaxed. Protonation states of the small molecules were assigned in ICM at pH 7 with the exception of neomycin B which was manually assigned a charge of +5 based on previous reports[69]. The full library was docked to each ensemble a single time for the enrichment studies. Docking against the parent E0,4rdc ensemble was replicated and shown to give similar scores/enrichment (ROC AUC/ROC2%= 0.88/42%, 0.81/35%, 0.87/50% for Full, Filtered and Optimized libraries respectively).
Ensemble-Based Docking Scores
The docking scores provided by ICM represent predicted binding energies in kcal/mol. For each molecule, a composite score across all conformers was assigned as the arithmetic average, the top score, or the Boltzmann-weighted average. To calculate the Boltzmann-weighted average, the fractional population of all 20 TAR conformers was calculated using the Boltzmann distribution (Eq. 3). The population of each conformer was multiplied by its docking score and these values were summed over all conformers to calculate the population-weighted score of each molecule (Eq. 4),

$$p_i = \frac{e^{-\varepsilon_i / RT}}{\sum_{j=1}^{M} e^{-\varepsilon_j / RT}} \qquad (3)$$

$$\mathrm{Score} = \sum_{i=1}^{M} p_i\,\varepsilon_i \qquad (4)$$

where p_i is the population of conformer i, ε_i is the docking score of conformer i, R is the gas constant (1.987×10−3 kcal K−1 mol−1), T is temperature (298 K), and M is the number of conformers in the ensemble.
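The three composite scores described above can be sketched in a few lines. This is a minimal illustration rather than the authors' script: the function names are hypothetical, the per-conformer scores are made-up values, and the "top score" is assumed to be the most negative (most favorable) docking score.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal K^-1 mol^-1, as stated in the text


def boltzmann_populations(scores, temperature=298.0):
    """Fractional population of each conformer from its docking score."""
    rt = R_KCAL * temperature
    lowest = min(scores)  # shift by the minimum for numerical stability
    weights = [math.exp(-(s - lowest) / rt) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def composite_scores(scores, temperature=298.0):
    """Arithmetic, top, and Boltzmann-weighted scores for one molecule."""
    populations = boltzmann_populations(scores, temperature)
    return {
        "arithmetic": sum(scores) / len(scores),
        "top": min(scores),  # assumes more negative means better
        "boltzmann": sum(p * s for p, s in zip(populations, scores)),
    }


# Hypothetical docking scores (kcal/mol) of one molecule against 3 conformers.
example = composite_scores([-30.0, -20.0, -10.0])
```

Because RT is only about 0.59 kcal/mol at 298 K, the Boltzmann-weighted score is dominated by the best-scoring conformer whenever scores differ by several kcal/mol, whereas the arithmetic average treats all conformers equally.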
Receiver Operator Characteristic Curves
An in-house Python script was used to generate the ROC plots using Equations 5 and 6 and to calculate the ROC scores (ROC AUC, ROC(2%)),

$$\mathrm{TPR} = \frac{n_{TP}}{n_{TP} + n_{FN}} \qquad (5)$$

$$\mathrm{FPR} = \frac{n_{FP}}{n_{FP} + n_{TN}} \qquad (6)$$

where n is the number of true negatives (TN), true positives (TP), false negatives (FN) or false positives (FP) at every possible score threshold.
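The ROC analysis can be sketched as follows. This is not the in-house script referred to above; it is a minimal stand-in that assumes lower docking scores rank first, computes the true and false positive rates at every score threshold, integrates the curve by the trapezoid rule, and takes ROC(2%) to be the highest true positive rate reached at a false positive rate of 2% or less (an assumed reading of that metric). All names and example values are hypothetical.

```python
def roc_points(scores, is_hit):
    """(FPR, TPR) pairs at every score threshold, best (lowest) scores first."""
    n_pos = sum(1 for h in is_hit if h)
    n_neg = len(is_hit) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one hit and one non-hit")
    ranked = sorted(zip(scores, is_hit), key=lambda pair: pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Molecules with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def roc_auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def roc_at(points, fpr_cutoff=0.02):
    """Highest TPR reached with FPR at or below the cutoff."""
    return max(tpr for fpr, tpr in points if fpr <= fpr_cutoff)


# Hypothetical composite scores and hit labels for four molecules.
curve = roc_points([-10.0, -9.0, -8.0, -7.0], [True, False, True, False])
auc = roc_auc(curve)
early = roc_at(curve)
```

For a library of the size screened here, the 2% cutoff corresponds to a few thousand top-ranked molecules, so ROC(2%) reports early enrichment while the AUC summarizes the whole ranking.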
Generating TAR ensembles
The RDC-derived TAR ensembles (E0,4rdc and E1,2rdc) were determined as reported previously[23,24]. Note that no RDCs were measured in the TAR apical loop and this structure is not directly informed by experimental data. The NOE-based NMR ensemble (ENOE) consists of all 20 models of apo-TAR downloaded from the PDB (1ANR)[41]. The randomly selected ensembles (E0,ran and E1,ran) were constructed by using a random number generator to randomly select 20 structures from the two pools of TAR conformations generated using MD simulations[23,24] containing 10,000 (MD0) and 80,000 (MD1) conformations, respectively. Another ensemble was generated by clustering MD0 into 20 clusters by heavy-atom RMSD of all non-terminal nucleotides and taking representative structures from each cluster (E0,clus). Finally, an ensemble that poorly agrees with all four RDC data sets (E0,anti) was generated using a sample and select (SAS) Monte Carlo selection scheme to maximize the χ2 function assessing the agreement between measured and predicted RDCs (Eq. 7)[23],
where i runs over all the RDCs measured for the different constructs j and δ is the weight used to normalize different RDC data sets, and is set at one tenth of the range of RDCs measured for each TAR construct[24]. Dexp are the experimentally measured RDCs and Dcalc are the predicted RDCs that were calculated by PALES[70,71] as described below.

The quality of the various TAR ensembles used in this study was determined by evaluating how well they agree with four sets of RDC data measured on variably elongated TAR RNA molecules as described previously[24]. Briefly, the program PALES[70,71] was used to calculate predicted RDCs based on the structures in the ensemble, after in silico elongation as described previously[24]. A scaling factor was used to account for variations in experimental conditions. The predicted RDCs are averaged for all structures of the ensemble assuming equal probabilities (Eq. 8),

$$\langle D_{i,j}^{\mathrm{calc}} \rangle = \frac{\lambda_j}{N} \sum_{k=1}^{N} D_{i,j}^{k} \qquad (8)$$

where k runs over the N conformers of the ensemble, λj is the scaling factor for the jth TAR construct and Di,j is the ith coupling in the jth construct. These calculated RDCs were then compared to measured RDCs and the RMSD (Hz) was calculated.
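The ensemble averaging and RMSD comparison can be illustrated with the sketch below. It assumes the per-conformer RDCs have already been predicted (by PALES in the study) and are supplied as plain lists, applies a single scaling factor to the equal-weight average, and reports the RMSD against measured values. The function names and numbers are hypothetical.

```python
import math


def ensemble_average_rdcs(per_conformer_rdcs, scale=1.0):
    """Equal-weight average of predicted RDCs over conformers, then scaled.

    per_conformer_rdcs: one list of predicted couplings (Hz) per conformer,
    all in the same coupling order, for a single TAR construct.
    """
    n_conformers = len(per_conformer_rdcs)
    n_couplings = len(per_conformer_rdcs[0])
    return [
        scale * sum(conf[i] for conf in per_conformer_rdcs) / n_conformers
        for i in range(n_couplings)
    ]


def rdc_rmsd(calculated, measured):
    """Root-mean-square deviation (Hz) between calculated and measured RDCs."""
    if len(calculated) != len(measured):
        raise ValueError("coupling lists must have the same length")
    squared = [(c - m) ** 2 for c, m in zip(calculated, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical two-conformer ensemble with two couplings each.
averaged = ensemble_average_rdcs([[10.0, -4.0], [20.0, 0.0]])
rmsd_hz = rdc_rmsd(averaged, [12.0, 2.0])
```

In a multi-construct analysis such as the one described, the same calculation would be repeated per construct with its own scaling factor before pooling the deviations.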
Fitting RDCs to ligand-bound NMR structures
Previously published one-bond C-H RDCs[49-51] were used to assess the quality of NOE-based NMR structures of TAR in complex with arginine (1ARJ)[48], acetylpromazine (1LVJ)[44] and neomycin B (1QD3)[47]. Specifically, we computed the RMSD between the measured RDCs and the values calculated using the best-fit order tensor determined with RAMAH[72].
Benchmarking docking predicted poses
Using an updated set of ligand-bound RNA structures from the PDB that includes 123 X-ray structures (90 with Nflex ≤ 11) and 26 NMR structures (17 with Nflex ≤ 11) (Supplementary Table 3), we re-docked each structure 20 times using the same docking procedure as described above. The binding pockets in the NMR structures were defined as any residue within 5 Å of the small molecule. Complexes with metal interactions near the binding site were not included in this benchmark. The RMSD between the re-docked structure and the original pose was calculated using the heavy atoms of the ligand for the best scoring pose over twenty runs.
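A re-docking benchmark of this kind reduces to two small calculations: a heavy-atom RMSD between the re-docked and original ligand coordinates, and the fraction of complexes falling under an RMSD cutoff. The sketch below assumes both poses are already in the receptor frame with atoms listed in the same order, so no superposition is applied; the function names are hypothetical, and the 2.5 Å default mirrors the cutoff quoted in the Results.

```python
import math


def heavy_atom_rmsd(pose, reference):
    """RMSD (Å) between matched heavy-atom coordinates, without superposition."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and have matching atoms")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pose, reference)
    )
    return math.sqrt(squared / len(pose))


def success_rate(rmsds, cutoff=2.5):
    """Fraction of complexes whose best-scoring pose is within the cutoff."""
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)


# Hypothetical two-atom ligand displaced by 1 Å along z.
shift = heavy_atom_rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                        [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
rate = success_rate([1.0, 2.0, 3.0, 4.0])
```

A symmetry-aware atom matching would be needed for ligands with equivalent atoms; that refinement is omitted here.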
Computing inter-helical angles
EBVS was used to predict inter-helical angles for six TAR-ligand complexes and values were compared to inter-helical angles in the NOE-based NMR structures of the complexes (acetylpromazine (1LVJ)[44], rbt550 (1UTS)[45], rbt203 (1UUD)[46], rbt205 (1UUI)[46], neomycin B (1QD3)[47], and arginine (1ARJ)[48]). For each of these molecules, docking against a TAR ensemble was repeated twenty times using the protocol described above. The inter-helical angles (αh, βh, γh) were computed for each conformer of all ensembles as well as for each model of the bound TAR NMR structures using in-house software as described previously[49]. For this calculation, the lower helix was defined by base pairs C19-G43, A20-U42 and G21-C41 and the upper helix was defined by base pairs G26-C39, A27-U38 and G28-C37. For each docking run, the inter-helical angles were population-weighted based on the Boltzmann-weighted docking scores and averaged over all twenty replicates. The inter-helical angles for the NOE-based NMR bundles were averaged over all models assuming equal populations.
Analysis of ligand-bound poses
Ligand poses predicted by EBVS were compared to the NOE-based NMR structures for six TAR-ligand complexes by computing the heavy-atom RMSD between ligands after superimposing structures by both the ligand and RNA binding pocket (defined as any residue within 5 Å of the ligand in the NMR structure). As a control, we first re-docked all ligands to the lowest energy NMR structure twenty times using the same docking protocol as above, defining the binding pocket as all residues within 5 Å of the ligand. The RMSD values were calculated for the best scoring pose over all twenty runs. Next, each ligand was docked to E0,4rdc or E0,ran ensembles twenty times using the same docking protocol. For each run, the ligand RMSD was calculated for the best scoring pose(s) from EBVS (some runs resulted in two significantly (>25%) populated poses) to all structures in the NMR bundle and the best-fit RMSDs over all 20 runs were averaged.
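The final averaging step can be sketched as below: for each docking run, take the lowest RMSD to any model in the NMR bundle, then report the mean and standard deviation over runs. The input is assumed to be a precomputed table of RMSDs (one row per run, one column per NMR model); the function name is hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def best_fit_rmsd_summary(rmsd_by_run):
    """Mean and sample s.d. of the per-run best-fit RMSD to the NMR bundle.

    rmsd_by_run: one list per docking run holding the ligand RMSD (Å) of
    that run's best-scoring pose to each model in the NMR bundle.
    """
    best_per_run = [min(run) for run in rmsd_by_run]
    spread = stdev(best_per_run) if len(best_per_run) > 1 else 0.0
    return mean(best_per_run), spread


# Hypothetical RMSDs for two runs against a three-model bundle.
average, deviation = best_fit_rmsd_summary([[3.0, 1.0, 2.0], [5.0, 4.0, 6.0]])
```

Taking the single lowest value across all runs instead of the mean would correspond to the "lowest RMSD pose over 20 docking runs" comparison mentioned in the Results.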
Data Availability
Results from the high-throughput screen have been deposited on PubChem (AID: 1259389). The SDF files for the Full, Filtered and Optimized libraries have been made available at https://sites.duke.edu/alhashimilab/resources/. All other data can be made available upon request.
Code Availability
All custom scripts have been made available at https://sites.duke.edu/alhashimilab/resources/ or can be provided upon request.