Sanghoon Lee1, Jeffrey A van Santen1, Nima Farzaneh1, Dennis Y Liu1, Cameron R Pye2, Tim U H Baumeister1, Weng Ruh Wong3, Roger G Linington1. 1. Department of Chemistry, Simon Fraser University, 8888 University Drive, Burnaby, British Columbia V5A 1S6, Canada. 2. Unnatural Products Inc., 2161 Delaware Avenue Suite A, Santa Cruz, California 95060, United States. 3. Department of Chemistry and Biochemistry, University of California, Santa Cruz, Santa Cruz, California 95064, United States.
Abstract
Few tools exist in natural products discovery to integrate biological screening and untargeted mass spectrometry data at the library scale. Previously, we reported Compound Activity Mapping as a strategy for predicting compound bioactivity profiles directly from primary screening results on extract libraries. We now present NP Analyst, an open online platform for Compound Activity Mapping that accepts bioassay data of almost any type, and is compatible with mass spectrometry data from major instrument manufacturers via the mzML format. In addition, NP Analyst will accept processed mass spectrometry data from the MZmine 2 and GNPS open-source platforms, making it a versatile tool for integration with existing discovery workflows. We demonstrate the utility of this new tool for both the dereplication of known compounds and the discovery of novel bioactive natural products using a challenging low-resolution antimicrobial bioassay data set. This new platform is available at www.npanalyst.org.
Traditionally,
natural products discovery has been a linear endeavor,
with projects completed on a sample-by-sample basis. This approach
has been successful in discovering some of our most valuable therapeutics
such as Taxol, rapamycin, and cyclosporine A,[1−3] but is becoming
increasingly inefficient as rates of rediscovery rise.[4] The development of accurate and accessible omics technologies,
including genome sequencing, untargeted metabolomics, and high-throughput
screening, is changing the discovery landscape for this field. Omics-data
integration between these platforms has the potential to improve both
the speed and accuracy of compound discovery by leveraging information
from orthogonal data types at the system level, rather than evaluating
individual samples sequentially.[5,6] Currently, however,
there are few open access tools that integrate metabolomics and bioassay
data sets, limiting applications of these approaches in the natural
products community.[7]

Several approaches
exist for integrating biological activity data
with untargeted metabolomics data. Bioactivity-based molecular networking
predicts the bioactivity of each MS feature by calculating the Pearson
correlation between activity profiles and intensity profiles for each
feature in the sample set using a combination of open-source tools
and custom R scripts.[8] Ory et al. developed
a method termed FInd BIoactive COmpounds (FiBiCo) which combines results
from four different statistical models (Spearman, F-PCA, PLS, PLS-DA)
to prioritize MS features that correlate positively with bioactivity
profiles.[9] This tool is written in R and
is available from the Supporting Information of the original article.
Finally, Olivon et al. have developed a strategy that incorporates
metabolomic, taxonomic, and bioactivity data into a single data matrix
for bioactive compound prioritization.[10] This approach color codes molecular networks based on the biological
activities of active fractions to permit manual prioritization of
MS features. Several of these methods are labor-intensive to implement
and require tailored and customized workflows; none are available
as stand-alone platforms for data integration that include both data
analysis and data visualization components.

Previously, we developed
an approach, termed Compound Activity Mapping,
to directly predict bioactive constituents and modes of action from
complex mixtures using a combination of image-based screening and
untargeted metabolomics.[11] We now present
NP Analyst, a versatile, open access platform for Compound Activity
Mapping (www.npanalyst.org). Importantly, NP Analyst is designed to work with biological data
from any assay platform and accepts mass spectrometry in several common
data formats including the standard open mzML format,[12] output peak lists from MZmine 2,[13] and network files from the Global Natural Products Social molecular
networking platform (GNPS).[14] The inclusion
of these input formats makes NP Analyst compatible with bioassay data
from most bioassay types and MS data from all of the major instrument
manufacturers. We have developed new strategies for processing metabolomics
data and integrating these results with bioactivity profiles, and
have packaged these tools in an open access online environment. NP
Analyst provides the research community with a new discovery platform
designed to accelerate compound dereplication, highlight priority
metabolites for isolation, and generate global network views of biologically
active chemical space for large extract libraries.
Results
Platform Design
Originally, Compound Activity Mapping
was designed to work with one specific biological assay (cytological
profiling) and relied on highly customized in-house scripts for data
integration.[11,15] In designing the NP Analyst platform
(Figure 1), we identified
three key attributes required for adoption by the natural products
community:
(i) an open, freely accessible online interface
(ii) a workflow capable of accepting data from any biological assay
(iii) mass spectrometry data import functions compatible with both open data formats for raw data and common mass spectrometry data processing packages for processed data
Figure 1
Structure of the NP Analyst platform. Users input both mass spectrometry
files and biological activity data for sample sets, which are scored
to prioritize compounds with strong predicted biological activities
for isolation and secondary screening.
Developing NP Analyst as an online resource offered several advantages
over desktop deployment. It eliminates the need to support multiple
operating systems and allows dynamic allocation of storage and processing
resources, making the platform both responsive and scalable. In addition,
updates and upgrades can be made immediately available without
requiring users to update local software, improving the user experience.

The
online interface includes a suite of data validation and quality
control checks to assist users with data upload and formatting. This
feedback identifies issues with input data and allows users to make
corrections before job submission, reducing job failures related
to data format issues. Each results file is assigned a unique job
number which is used to create a permanent hyperlink to the results
pages, providing a facile mechanism for collaborators to retrieve
and share data.

Compound Activity Mapping works by comparing
the distribution of
mass spectrometry features in the sample set against the biological
signatures of each of these samples. This requires processing of mass
spectrometry data to generate a complete list of all unique mass spectrometry
features (m/z vs retention time
pairs) in the sample set and the distribution of these features among
the samples in the set. To ensure that NP Analyst would work for the
broadest cross section of users we created a new data processing pipeline
that requires only MS1 level MS data in the standard mzML
open data format. This strategy is sufficient to describe the vast
majority of unique mass spectrometry features in the data set without
requiring MS2 data of a specific format from one of the
myriad possible data acquisition modes. This approach also enables
laboratories without MS2 capability (e.g., UPLC-TOF instruments)
to use the NP Analyst platform.

To extend this approach beyond
our original study using image-based
screening data, we have developed a new algorithm for data integration
that is agnostic to bioassay data format. Unlike the previous strategy,
which required normalized data sets with values between −1
and +1 and a precise number of features, this new approach will accept
data sets with any number of biological features and accepts different
assay data formats (inhibition/no inhibition, percent growth, etc.).
This eliminates the requirement for users to have access to sophisticated
screening infrastructure (e.g., high-content imaging microscope) and
permits the use of existing biological screening data in the NP Analyst
platform. The only requirement is that the bioactivity file contains
multiple biological features to create biological fingerprints for
each sample. These can either be multiple readouts from a single assay
(e.g., gene expression profiles) or single readouts from multiple
assays (e.g., activity across a panel of bacterial pathogens). These
data should be provided as a flat CSV file containing one row for
each sample in the set and one column for each bioassay readout. For
full instructions on bioassay file formatting requirements see the
online documentation at https://liningtonlab.github.io/npanalyst_documentation/NPAnalyst/file-import/
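The file layout described above, together with the column-type and missing-data rules stated in this article, can be illustrated with a minimal sketch. This is not the NP Analyst implementation: the function names, the Boolean vocabulary, and the choice to encode Boolean terms as 1.0/0.0 are assumptions made for illustration only, and the online documentation remains the authority on accepted formats.

```python
import csv
import io

# Assumed Boolean vocabulary; the terms NP Analyst actually accepts may differ.
TRUE_TERMS = {"true", "yes", "inhibition"}
FALSE_TERMS = {"false", "no", "no inhibition"}


def parse_cell(cell):
    """Return (kind, value) for one bioassay cell; empty cells are missing data."""
    text = cell.strip()
    if text == "":
        return ("missing", None)
    if text.lower() in TRUE_TERMS:
        return ("bool", 1.0)
    if text.lower() in FALSE_TERMS:
        return ("bool", 0.0)
    # float() raises ValueError for text entries such as "not tested".
    return ("number", float(text))


def load_fingerprints(csv_text):
    """Parse a flat CSV with one row per sample and one column per readout."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    readouts = header[1:]
    if len(set(readouts)) != len(readouts):
        raise ValueError("column headers must be unique")
    names = [row[0] for row in body]
    if len(set(names)) != len(names):
        raise ValueError("sample names must be unique")

    kinds = [set() for _ in readouts]  # data types observed in each column
    fingerprints = {}
    for row in body:
        parsed = [parse_cell(cell) for cell in row[1:]]
        for column_kinds, (kind, _) in zip(kinds, parsed):
            if kind != "missing":
                column_kinds.add(kind)
        fingerprints[row[0]] = [value for _, value in parsed]

    for readout, column_kinds in zip(readouts, kinds):
        if len(column_kinds) > 1:
            raise ValueError(f"column {readout!r} mixes numerical and Boolean data")
    return readouts, fingerprints
```

Calling `load_fingerprints("sample,E. coli,MRSA\nA,0.5,yes\nB,,no\n")` yields one biological fingerprint per sample, with the empty cell for sample B kept as a missing value rather than a text placeholder.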
Algorithms
NP Analyst contains three main functions:
an optional step to process the mass spectrometry data to create a
list of unique MS features, feature score determination based on bioactivity
profiles, and generation of output files for visualization.
Mass Spectrometry
Data Processing
Mass spectrometry
data can be uploaded in one of three formats: as individual mzML files
for each sample, as a comprehensive peak list obtained through third-party
software (e.g., MZmine 2), or as a GNPS-derived graphML network file.
In the first case (mzML), NP Analyst will perform replicate comparison
and feature alignment on the individual files as discussed below.
In the latter two cases (MZmine 2 and GNPS), data processing and alignment
are performed in these external platforms, and only the final aligned
peak list is supplied to NP Analyst. This allows users to either process
their data in GNPS or MZmine 2 using existing workflows or use the
alignment and replicate comparison tools built into the NP Analyst
platform.

For the mzML input, files are uploaded as peak-picked,
centroided data. If technical replicates are available, then samples
are first compared to identify signals that are present consistently
between replicates. For n replicates, the default
requirement is that a signal be present in n – 1 replicates. Because
both retention times (rt) and m/z values can vary between analytical runs, a processing method to
align replicate signals is required. We designed an alignment method
based on R-trees[16] that dynamically groups
signals between samples into groups that conform to allowed tolerances
of rt and m/z values (Experimental Section). This approach ensures that groups remain
limited to the defined errors in both dimensions by subdividing groups
that grow to include members with too wide a range of rt or m/z values. An advantage of this R-trees
approach is that it is easily extensible to additional dimensions.
This permits future incorporation of additional data types, such as
drift times/CCS values from ion mobility spectrometry. Following replicate
comparison, consensus signals from each sample are aligned using the
same R-trees method. This analysis yields a single file containing
all the unique mass spectral features in the sample set, describing
these signals’ distribution within the set.

For MZmine
2 data, preprocessing, replicate comparison, and alignment
are performed using in-built tools in the MZmine package, and a single
aligned peak list is exported into NP Analyst without further manipulation.
For GNPS data the standard GNPS graphML network file is imported and
reformatted by the NP Analyst package to generate a list of unique m/z features and sample distributions compatible
with the NP Analyst pipeline. Instructions for both of these export
methods are provided in the software documentation (https://liningtonlab.github.io/npanalyst_documentation/).
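To make the replicate comparison step for mzML input more concrete, the following sketch groups (m/z, rt) signals across technical replicates and keeps those seen in enough runs. It deliberately swaps the R-tree index described above for a simple greedy comparison against the first signal in each group, so it illustrates the tolerance and replicate rules only, not the published algorithm. The tolerance values (10 ppm, 0.05 min), the function names, and the reading of the default rule as "at least n – 1 replicates" are assumptions.

```python
def within_tolerance(a, b, mz_ppm, rt_tol):
    """True if two (m/z, rt) signals agree within both tolerances."""
    mz_a, rt_a = a
    mz_b, rt_b = b
    return abs(mz_a - mz_b) <= mz_a * mz_ppm * 1e-6 and abs(rt_a - rt_b) <= rt_tol


def consensus_features(replicates, mz_ppm=10.0, rt_tol=0.05):
    """Return averaged (m/z, rt) signals found in at least n - 1 of n replicates.

    `replicates` is a list of peak lists, one per replicate run; each peak
    list holds (m/z, rt) tuples. Tolerance defaults are illustrative only.
    """
    required = max(1, len(replicates) - 1)
    groups = []  # each group: seed signal, member signals, contributing runs
    for run_index, peaks in enumerate(replicates):
        for peak in peaks:
            for group in groups:
                # Comparing against the seed keeps every member within the
                # allowed error of one reference signal.
                if within_tolerance(group["seed"], peak, mz_ppm, rt_tol):
                    group["members"].append(peak)
                    group["runs"].add(run_index)
                    break
            else:
                groups.append({"seed": peak, "members": [peak], "runs": {run_index}})

    consensus = []
    for group in groups:
        if len(group["runs"]) >= required:
            count = len(group["members"])
            mean_mz = sum(mz for mz, _ in group["members"]) / count
            mean_rt = sum(rt for _, rt in group["members"]) / count
            consensus.append((mean_mz, mean_rt))
    return consensus
```

With three replicates, a signal detected in only one run is discarded, while a signal detected in two or three runs is retained as a single averaged feature. The same grouping idea can then be applied across samples to align consensus signals into a unique feature list.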
Bioactivity Profile Integration
MS features in the
unique feature list are scored for the strength and consistency of
their predicted biological profiles using the Activity Score and Cluster
Score metrics employed in our previous study.[11] NP Analyst has been configured to automatically adjust to the dimensionality
of the bioassay data file (e.g., 5 bioassay features vs 250) and is
capable of handling data from most assay readouts (IC50, percent inhibition etc.). In addition, Boolean data types (e.g.,
True/False, Inhibition/no inhibition, Yes/No) are also accepted. The
only requirements are that each data column must contain a consistent
data type (i.e., no columns containing a mixture of numerical and
Boolean data) and that missing data must be represented by empty cells
rather than “not tested” or other text entries.

Activity Score is a measure of the strength of the phenotype for
a given mass spectrometry feature. It is determined by calculating
the sum of the mean of the squares of the bioactivity values for each
extract containing a given mass spectrometric feature. For example,
if a feature is present in three samples, the Activity Score is calculated
by averaging the squares of the values in each column of the bioassay
data across those three samples and summing the resulting values.

Cluster Score is a measure
of the consistency of the biological
fingerprints between all of the samples that contain a given mass
spectrometric feature. It is determined by taking the average Pearson
similarity scores between the biological fingerprints of all extracts
containing a given mass spectrometric feature.

Users can define
Activity and Cluster Score cutoffs during the
job submission process. The analysis workflow calculates both scores
for every MS feature and retains MS features that meet the minimum
values for both parameters. Increasing the cutoff values for either
score reduces the complexity of the results files by removing MS features
that do not have strong bioactivity profiles and/or do not correlate
with specific biological phenotypes. These two parameters can be used
in concert to select either MS features with strong biological profiles
(Activity Score) or features with consistent activity profiles independent
of spectrum of activity (Cluster Score). The range of Activity Score
values depends on the bioassay data format. For bioassay data that
has been normalized to a scale of 0 to 1, the maximum Activity Score
is equal to the number of bioassay parameters. By contrast, Cluster
Score values always range from −1 to +1. In practice, setting
low positive cutoff values (0.3 for Activity Score, 0.1 for Cluster
Score) is sufficient to remove many of the inactive features, simplifying
both data visualization and data interpretation. The definition of
“high” scores is dependent on the type of bioassay data
used, but as a general guide values greater than 30% of maximum for
Activity Score and 50% of maximum for Cluster Score can be considered
“high”.
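The two scores defined in this section can be written out as a short sketch that follows the prose definitions directly. It is an illustration under stated assumptions rather than the NP Analyst source code: the function names are hypothetical, fingerprints are assumed to be complete numerical lists with no missing values, and the value 0.0 returned when a Pearson correlation is undefined (a constant fingerprint, or a feature found in only one extract) is a placeholder choice.

```python
from itertools import combinations
from math import sqrt


def activity_score(fingerprints):
    """Sum over readouts of the mean squared value across extracts."""
    n_extracts = len(fingerprints)
    n_readouts = len(fingerprints[0])
    return sum(
        sum(fp[j] ** 2 for fp in fingerprints) / n_extracts
        for j in range(n_readouts)
    )


def pearson(x, y):
    """Pearson correlation of two equal-length fingerprints."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        return 0.0  # assumption: undefined correlation treated as 0
    return sum(a * b for a, b in zip(dx, dy)) / denominator


def cluster_score(fingerprints):
    """Average pairwise Pearson similarity between extract fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0  # assumption: a single extract has no pairwise similarity
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def passes_cutoffs(fingerprints, activity_cutoff=0.3, cluster_cutoff=0.1):
    """Keep a feature only if it meets both minimum score values."""
    return (
        activity_score(fingerprints) >= activity_cutoff
        and cluster_score(fingerprints) >= cluster_cutoff
    )
```

Here `fingerprints` is the list of bioassay fingerprints for the extracts that contain one MS feature. For data normalized to a 0 to 1 scale, a feature whose extracts are fully active in every readout reaches an Activity Score equal to the number of readouts, which matches the maximum described above.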
Data Visualization
Network Visualization
NP Analyst provides three complementary
data visualization options: network, scatter plot, and community visualizations.
The network view (Figure S1) displays samples
and associated mass spectrometry features, distributed by chemical
relatedness. Extract nodes (squares) are connected by edges to their
associated mass spectrometry features (circles), with the network
containing all MS features that pass the Activity and Cluster Score
filters. The value of this representation is that samples are grouped
together in the network based on the presence of shared bioactive
mass spectrometry features. MS features that possess high interconnectivity
within each group derive from molecules predicted to be responsible
for the observed biological phenotypes. These can include different
adducts and in-source fragments from the same molecule or features
from groups of related molecules with similar biological profiles.
Therefore, the network view is valuable for describing the range and
breadth of predicted bioactive metabolites within the sample set and
how they are distributed between samples.

As part of the analysis
pipeline the network is divided into communities using the Louvain
method for community detection. Each community is given a unique color
and community ID used in the Community view (Figure S2). These communities contain highly interrelated nodes indicative
of shared biologically active MS features and are a helpful resource
for selecting priority extracts for further investigation. Switching
the “Color by Community” toggle in the network view
changes the network color scheme from activity color-coding to node
colors based on community membership, which is valuable for understanding
how the full sample set is distributed between communities and how
those communities are interconnected. Finally, clicking on any node
highlights that node and the nodes to which it is directly connected,
providing a mechanism for exploring the network. The graph is interactive,
permitting zoom and pan functions, as well as autolabeling based on
zoom level.
Scatter Plot Visualization
The scatter
plot visualization
(Figure S3) presents a plot of the bioactive
MS features (retention time vs m/z ratio), providing a more targeted, compound-centric data representation
of the full data set. This plot can be filtered by Activity Score
and Cluster Score to retain only features with strong bioactivity
predictions. In addition, users can select samples of interest from
the interactive list below the plot to retain only MS features from
a defined subset of samples. This provides an interactive interface
for assessing the predicted bioactive mass spectrometry features for
any sample or set of samples in the data set and is particularly valuable
when combined with information from the Community view.
Community
Visualization
The community visualization
provides data on bioactive MS features for specific communities from
the Louvain community detection algorithm (Figure 2). Each community page contains a network
view of the extract and MS feature nodes in that community, two scatter
plots (rt vs m/z and Cluster Score
vs Activity Score), and a bioactivity heatmap for the extracts in
the community. Together these plots allow users to assess the biological
similarities between samples within each community, select bioactive
MS features that interconnect these samples, and determine the rt
and m/z values for these features
for subsequent dereplication and compound isolation.
Figure 2
NP Analyst community
view (A) including network diagram (B), plot
of retention time (x-axis) vs m/z ratio (y-axis) for bioactive MS features
(C), plot of Cluster Score (x-axis) vs Activity Score
(y-axis) for bioactive MS features (D), and activity
profiles for extracts in the community (E), in which rows contain bioactivity data
for each sample and columns contain bioassay values for each bioassay
readout. In plots B–D, extracts are represented by square
gray nodes, while colored circular nodes represent MS features. The
color of MS feature nodes is defined by Cluster Score (blue to red,
−1 to +1). The diameter of these nodes is defined by the Activity
Score (normalized scale from minimum to maximum Activity Score values).
The original webpage can be accessed from www.npanalyst.org by selecting
the “Open Sample Output” button, choosing the “Communities”
tab, and selecting community 15 from the dropdown menu at the top
of the page.
User Interface
The user interface is designed to assist
users with data import, quality control, and visualization, while
prioritizing ease of use. The platform requires no registration or
login details and is W3C compliant to ensure functionality on all
major operating systems and browsers. To start a new analysis, users
optionally enter an email address (in order to receive notifications
on job status) and then upload a comma separated value (CSV) file
containing the biological activity data. Because mass spectrometry
files are typically large and uploads are therefore often slow, NP
Analyst performs several key validation steps on the bioassay data
file prior to MS data upload. This reduces failure rates and improves
user experience by correcting errors early in the submission pipeline.
These validation steps include verification that sample names and
column headers are unique and that results columns contain exclusively
numerical values or allowed Boolean terms. In addition, a warning
is raised if null values are detected.

Following bioassay data
validation, users select the mass spectrometry data type they wish
to analyze and upload the file(s). If the mzML type is selected, then
files are first reviewed to ensure that every sample has the same
number of replicates. Filenames are displayed in an interactive “drag
and drop” layout that allows users to correct errors with file
selection, naming, replicate assignment, etc. Once the correct replicate
files are associated with each sample, MS files are reviewed to ensure
that sample names align between the bioassay and the mass spectrometry
data. Mismatches between bioassay and MS file sample names raise
a warning listing the missing sample names.

Once all validation
steps are passed, the submit button is enabled.
Clicking submit initializes the upload of mass spectrometry files,
generates a unique job number for the experiment, and starts the data
analysis pipeline. This job number is used to access the data in the
visualization section of the Web site. It is also part of the unique
URL that is sent to users who opt to supply an email address, providing
a convenient mechanism to share results between collaborators.

Upon completion, the results section is displayed, including interactive
tabs for scatter plot, network, and community views, as well as a
downloads page to export the results files. Export options include
a graphML network file for use in network visualization tools (Cytoscape,
Gephi) and CSV for use in spreadsheets and graph plotting tools (e.g.,
Excel, Tableau, JMP, Spotfire).

For advanced users, the underlying
processing algorithms are freely
available as a command line tool via a Docker container at https://github.com/liningtonlab/npanalyst. The advantage of local installation is that it eliminates the slow
step of uploading mass spectrometry data to the online server, a particular
issue with mzML files. The disadvantages are that (i) programming
experience is required to deploy the Docker container, (ii) some of
the client-side quality control steps for individual files built into
the web interface are not included in the command line tool, and (iii)
the online interface cannot be used for data visualization.
Bioactive
Compound Discovery
In order to evaluate the
value of NP Analyst for bioactive compound discovery, we analyzed
a set of 925 prefractions from our in-house marine Actinobacterial
strain library. For biological profiling we deliberately selected
a low-density bioassay data set that would test the performance limits
of the platform. Data from our previously developed BioMAP antibacterial
profiling platform[17] were combined to generate
a data set containing inhibition/no inhibition results against 15
bacterial pathogens (6 Gram-positive, 9 Gram-negative) for all 925
prefractions. Because each of the 15 readouts is binary, this bioassay offers a maximum of 32,768 unique phenotypes
(2^15). However, antibacterial compounds tend to have broad
activity against either Gram-positive or Gram-negative strains (or
both), meaning that the effective number of phenotypes is significantly
lower. Therefore, the BioMAP biological profiles provided a valuable
test case for the NP Analyst platform due to their coarse-grained
nature.

Submission of these data and the associated MS data
for all prefractions (UPLC-ESI-qTOF, positive mode, three replicates,
mzML format) yielded the NP Analyst network shown in Figure 3. Examination of the bioactive
features in each community on the Communities results page highlighted
several communities (Communities 12, 15, and 16) containing MS features
with strong predicted biological activities that were prioritized for
further analysis.
Figure 3
NP Analyst network for 925 microbial natural products
prefractions.
Square gray nodes represent prefractions. Circular nodes represent
MS features. The color of MS feature nodes is defined by Cluster Score
(blue to red, −1 to +1). The diameter of these nodes is defined
by Activity Score (normalized scale from minimum to maximum Activity
Score values). A high-resolution version of this figure, including
text annotations for each node and a color scale bar, is available
in the Supporting Information (Figure S1).
Discovery of Dracolactam C
Community 12 contained 23
prefractions with very similar biological profiles (Figure 4A). Sixteen of the 23 prefractions
in this community were connected by a single MS feature with an m/z of 452.2788 and a retention time of
3.12 min. The molecule responsible for this MS feature was therefore
prioritized for isolation. Refermentation of the producing organism
followed by mass-guided purification yielded a compound with a precursor
[M + H]+ peak at 470.2888 and a prominent [M – H2O + H]+ peak at 452.2791. Dereplication against
the Natural Products Atlas database[18] and
comparison of the 1H NMR data against the published literature
identified this compound as the polyene macrolactam micromonolactam
(1).[19] However, during the
isolation of this metabolite we identified two compounds from this
fraction with similar MS features, one of which was isobaric with
micromonolactam. To determine the identity of the active species unequivocally,
we isolated these two additional metabolites and identified them using
extensive 1D and 2D NMR experiments. One of these compounds was the
known metabolite dracolactam A (2),[20] which is proposed to derive from the intramolecular Diels–Alder
cyclization of micromonolactam (Scheme S1). The second was a new compound, dracolactam C (3),
which was identified as a different intramolecular cyclization product
of micromonolactam (Scheme S1). For a full
description of the structure elucidation of this new compound, see
the Supporting Information and Figure S4. Screening of all three compounds (Figure 4B) in the BioMAP panel revealed that micromonolactam
possessed a similar antibacterial profile to the profile for community
12, while dracolactams A and C were largely inactive (Table 1).
Figure 4
(A) Community 12 network.
Prefractions illustrated as square gray
nodes. Predicted bioactive MS features illustrated as circular red
nodes. m/z values for priority MS
feature node annotated on the network. Peripheral nodes removed for
clarity. Full community presented in Figure S2. (B) Structures of isolated compounds. (C) Extracted ion chromatogram
(EIC) traces for predicted bioactive MS feature (top), micromonolactam
(middle), and dracolactam C (bottom) illustrating alignment between
the predicted bioactive constituent and the purified bioactive metabolite
(micromonolactam) and the absence of predicted bioactivity for the
inactive isomer dracolactam C.
Table 1
Comparison of BioMAP Screening Data for Selected Communities
Note 1: Each color (HEX code) indicates activity against the test organism in the BioMAP screening
panel. (1) Red (#ff0000) – A. baumannii, (2) Orange (#ff6600) – B. subtilis, (3) Tangerine
Yellow (#ffcc00) – K. aerogenes, (4) Fluorescent Yellow (#ccff00) – E. coli,
(5) Bright Green (#66ff00) – E. faecium, (6) Lime Green (#00ff00) – L. ivanovii, (7) Spring
Green (#00ff66) – MRSA, (8) Bright Turquoise (#00ffcc) – O. anthropi, (9) Deep Sky Blue (#00ccff) – P. aeruginosa, (10) Navy Blue (#0066ff) – P. alcalifaciens, (11) Blue (#0000ff) – S. aureus, (12) Electric Indigo (#6600ff) – S. epidermidis, (13) Electric Purple (#cc00ff) – S. enterica, (14) Hot Magenta (#ff00cc) – V. cholerae, (15) Vivid Raspberry (#ff0066) – Y. pseudotuberculosis.
Note 2: No screening results available for collismycin A or amychelin C against S. epidermidis.
These bioassay results
align with the prediction from NP Analyst,
which prioritized a single MS feature (m/z 452.2788; rt 3.12 min). Inspection of the original UPLC-HRMS
metabolomics data for these fractions and comparison against the retention
times for compounds 1–3 using the
same analytical method revealed that all three compounds were present
in multiple prefractions in the library. However, only the distribution
of micromonolactam aligned with the activity profile for community
12 (Figure C), explaining
why only one of these metabolites was predicted as the active component.
This represents the first instance of reported biological activity
for micromonolactam.
Dereplication of Collismycin Compound Family
Community 15 (Figure 5A) presented
a more complicated scenario. The community contained three prefractions
with broad-spectrum antibiotic activities and a large number of candidate
bioactive features. However, six of these features possessed strong
Activity Scores (large diameter red nodes connecting prefractions
RLUS-2108C and RLUS-2108D in Figure 5A, highlighted rows in Figure 5E). A review of the retention times for these
features in the scatter plot view (Figure 5B) highlighted three sets of features with
related retention times (2.64, 3.42, and 3.97 min) that were consistent
with adducts and fragments from three separate molecules (Figure S5). This was supported by the peak shapes
for extracted ion chromatograms (EICs) for these features in the original
MS data, which were grouped into three sets (Figure 5D).
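The grouping of co-eluting features described above can be sketched in a few lines of Python. This is an illustration only and is not part of NP Analyst: the function name, the 0.05 min tolerance, and the m/z values are hypothetical placeholders. Only the three retention times (2.64, 3.42, and 3.97 min) come from the text.

```python
from typing import List, Tuple

Feature = Tuple[float, float]  # (m/z, retention time in minutes)


def group_by_retention_time(features: List[Feature], tol: float = 0.05) -> List[List[Feature]]:
    """Group features whose retention times lie within `tol` min of the
    previous feature in the sorted list (simple single-linkage chaining)."""
    ordered = sorted(features, key=lambda f: f[1])
    groups: List[List[Feature]] = []
    for feat in ordered:
        if groups and feat[1] - groups[-1][-1][1] <= tol:
            groups[-1].append(feat)  # co-elutes with the current group
        else:
            groups.append([feat])    # start a new candidate molecule
    return groups


# Placeholder m/z values; retention times follow the three sets in the text.
candidate_features = [
    (301.1, 2.64), (323.1, 2.65),
    (302.1, 3.42), (324.1, 3.41),
    (245.1, 3.97), (267.1, 3.98),
]

rt_groups = group_by_retention_time(candidate_features)
for group in rt_groups:
    mean_rt = sum(rt for _, rt in group) / len(group)
    print(f"rt ~ {mean_rt:.2f} min: m/z {[mz for mz, _ in group]}")
```

Each resulting group is a candidate set of adducts and fragments from one molecule, which should still be confirmed by inspecting EIC peak shapes as described above.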
Figure 5
(A) Community 15 network. Prefractions illustrated
as square gray
nodes. Predicted bioactive MS features illustrated as circular red
nodes. The color of MS feature nodes is defined by Cluster Score (blue
to red, −1 to +1). The diameter of these nodes is defined by
Activity Score (normalized scale from minimum to maximum Activity
Score values). (B) Scatter plot (Activity Score ≥ 0.3) illustrating
retention time alignment for bioactive MS features. (C) Structures
of isolated compounds. (D) Alignment of EICs for bioactive MS features
from community 15 illustrating retention time and peak shape alignment
for three major components in the mixture. (E) Results table for all
bioactive features from the community, highlighting features related
to compounds in panel C (green = collismycin A, blue = collismycin
B, red = SF2738D).
Isolation and characterization
of two of these metabolites by 1D
and 2D NMR experiments (Figures S22–S25) identified the related bipyridyl compounds collismycin A (4) and SF2738D (5) (Figure 5C).[21,22] The third metabolite
was not present in sufficient quantity for analysis by NMR but was
isobaric with collismycin A and possessed very similar MS2 fragmentation and UV absorbance spectra (Figure S6), consistent with the known stereoisomer collismycin B.[22] Surprisingly, screening of collismycin A and
SF2738D in the BioMAP assay revealed that, while collismycin A possessed
antibacterial activity against a range of strains, SF2738D was completely
inactive. This result is in line with previously published screening
data for these compounds, which did not identify any antimicrobial
activity for SF2738D (Table ).[21]
This result highlights
one of the limitations of the NP Analyst
platform. In cases where inactive compounds are always coproduced
with bioactive molecules in microbial cultures, it is not possible
to differentiate between the activities of the two metabolites. The
central premise of the method is that metabolites that are present
in both active and inactive fractions will have low Activity and Cluster
Scores. However, inactive metabolites that are only present in active
fractions will have high Activity and Cluster Scores, provided that
the active fractions they are present in all have similar biological
profiles. This issue is particularly acute for small communities with
low numbers of samples (<10), as the chance of coexpression of
active and inactive metabolites is higher if the number of samples
is low. Therefore, users should take care to consider the values for
Cluster and Activity Scores when selecting priority MS features to
isolate from small communities containing multiple candidate masses.
Specifically, users should examine the connections between MS features
and samples, and the activity profiles of those samples on the Communities
page. This information should be used in combination with the relative
magnitudes of the Activity and Cluster Scores to select priority molecules
for downstream isolation and biological evaluation. It is recommended
that users prioritize features with the highest interconnectivity
between samples with similar activity profiles, and that MS features
with additional connections to inactive samples are given lower priority.
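The coproduction limitation discussed above can be illustrated with a small sketch. The scoring functions here are simplifying assumptions made for illustration (Activity Score taken as the mean profile magnitude and Cluster Score as the mean pairwise Pearson correlation of the profiles of samples containing a feature); they are not presented as the definitions implemented in NP Analyst. All sample names, feature names, and assay values are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical activity profiles (5 assays) for four prefractions.
ASSAY_PROFILES = {
    "S1": [0.9, 0.8, 0.1, 0.0, 0.7],  # active
    "S2": [0.8, 0.9, 0.0, 0.1, 0.6],  # active, similar profile to S1
    "S3": [0.0, 0.1, 0.0, 0.1, 0.0],  # essentially inactive
    "S4": [0.1, 0.0, 0.1, 0.0, 0.0],  # essentially inactive
}

# Hypothetical MS features and the samples in which each is detected.
FEATURE_SAMPLES = {
    "active_metabolite": ["S1", "S2"],
    "coproduced_inactive": ["S1", "S2"],       # always made alongside the active one
    "ubiquitous": ["S1", "S2", "S3", "S4"],    # also present in inactive samples
}


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def score_feature(sample_ids):
    """Return (activity, cluster) under the assumptions above; needs >= 2 samples."""
    profiles = [ASSAY_PROFILES[s] for s in sample_ids]
    activity = sum(sqrt(sum(v * v for v in p)) for p in profiles) / len(profiles)
    pairs = list(combinations(profiles, 2))
    cluster = sum(pearson(a, b) for a, b in pairs) / len(pairs)
    return activity, cluster


scores = {name: score_feature(ids) for name, ids in FEATURE_SAMPLES.items()}
for name, (activity, cluster) in scores.items():
    print(f"{name:20s} activity={activity:.2f} cluster={cluster:.2f}")
```

In this toy case the coproduced inactive feature receives exactly the same scores as the active metabolite, because the two share the same sample distribution, whereas the feature that also occurs in inactive samples is penalized on both scores. This is why sample connectivity should be examined alongside the scores in small communities.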
Discovery of Amychelin C
Finally, community 16 contained
5 prefractions, 2 of which possessed very similar bioactivity profiles
and were connected by 3 mass spectrometry features. Evaluation of
the scatter plot and EIC traces for these prefractions indicated the
presence of two related molecules, one of which contained both the
precursor [M + H]+ feature at m/z 753.3243 and a prominent mass fragment at m/z 623.2319 (Figure 6A). Searching the Natural Products Atlas database did not yield any
candidate structures for this compound pair, so this community was
prioritized for further investigation.
Figure 6
(A) Community 16 network.
Prefractions illustrated as square gray
nodes. Predicted bioactive MS features illustrated as circular red
nodes. Circle diameter proportional to Activity Score. Peripheral
nodes removed for clarity. Full community presented in Figure S2. (B) Substructures determined from
NMR data. (C) Comparison of MS fragmentation patterns for unlabeled
and labeled amychelin C, illustrating the position of labeled [1,2,3-13C]-(l)-serine (red moiety) and 13C-labeled
carbons (blue circles). (D) Structures of amychelin C (6) and amychelin (7).
Fermentation of the producing organism, followed by mass-guided
isolation by HPLC-MS, yielded 1.18 mg of an off-white solid with an m/z of 753.3053. 1D and 2D NMR analyses,
coupled with high-resolution mass spectrometry, suggested a formula
for this new metabolite of C31H45N8O14 (m/z 753.3053 [M
+ H]+ (calcd. 753.3050)). Examination of the gCOSY, gHSQC,
and gHMBC spectra identified three subunits (Figure 6B). Signals for subunit B were broad and
asymmetric, and possessed integrations in multiples of three, suggesting
three repeating motifs. Examination of the MS2 spectrum
revealed sequential neutral losses consistent with the loss of the
amino acid subunits proposed in Figure 6C. This analysis, coupled with key gHMBC correlations
between subunits (Figure S7A), afforded
the planar structure for this new metabolite, amychelin C (6; Figure 6C). Amychelin
C is related to a previously reported siderophore, amychelin (7),[23] but differs by the inclusion
of a methyl-oxazoline moiety in place of the oxazoline subunit present
in amychelin. To determine the absolute configurations of the amino
acid-derived stereocenters, we performed Marfey’s analysis (Figure S8), which identified the presence of l-threonine, l-ornithine, and a 2:1 ratio of d- and l-serine.
Because amychelin C contained both d- and l-serine,
an additional experiment was required to determine the position of
the l-serine residue. Refermentation of the producing organism
in the presence of a 1:1 mixture of [1,2,3-13C]-(l)-serine and [1,2,3-12C]-(d)-serine yielded an
isotopically labeled version of amychelin C whose [M +
H]+ signal was shifted by +3 Da. Interpretation of the MS fragmentation
data for this isotopically labeled derivative (Figure S9) identified the position of isotopic labeling (Figure 6C), completing the
full absolute configurational analysis. Interestingly, amychelin C
is enantiomeric to amychelin at all shared chiral centers (Figure 6D). This finding
supports a recent study on the evolutionary origins of this compound
class which noted several examples with opposite configurations at
some or all positions.[24] Screening of purified
amychelin C in the BioMAP assay recapitulated the spectrum of activity
predicted from the original community, confirming this molecule as
the predominant active component in this community (Table ).
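The calculated m/z for the C31H45N8O14 ion and the approximately 3 Da shift expected for a triply 13C-labeled serine residue, both reported above, can be checked with a short calculation. This is a minimal sketch using standard monoisotopic masses; the helper names are hypothetical, and the formula parser only handles simple formulas without brackets.

```python
import re

# Standard monoisotopic masses (Da) for the elements needed here.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
ELECTRON = 0.00054858
C13 = 13.00335484


def parse_formula(formula: str) -> dict:
    """Parse a simple formula such as 'C31H45N8O14' into element counts."""
    counts: dict = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def cation_mz(ion_formula: str, charge: int = 1) -> float:
    """m/z of a cation whose formula already includes the added proton(s)."""
    mass = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(ion_formula).items())
    return (mass - charge * ELECTRON) / charge


def ppm_error(observed: float, calculated: float) -> float:
    return (observed - calculated) / calculated * 1e6


calc_mz = cation_mz("C31H45N8O14")           # [M + H]+ ion formula from the text
error = ppm_error(753.3053, calc_mz)         # observed value from the text
label_shift = 3 * (C13 - MONOISOTOPIC["C"])  # three 13C atoms from labeled serine

print(f"calcd m/z {calc_mz:.4f}, error {error:.2f} ppm, label shift {label_shift:.4f} Da")
```

The calculated value rounds to the reported 753.3050, and the observed 753.3053 lies within 1 ppm of it.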
Discussion
Prioritization of bioactive constituents from complex mixtures
has been a longstanding challenge for the field of natural products.[25] This is often the rate-limiting step for bioactive
compound discovery and frequently leads to the rediscovery of known
molecules.[26,27] NP Analyst offers a target-agnostic
platform for predicting biological activities of MS features directly
from complex mixtures that addresses this bottleneck. Importantly,
this platform is not tailored to a particular biological assay or
target class, making it suitable for use with any numerical or Boolean
assay readout. This extends its utility beyond drug discovery to other
scientific areas. For example, this platform could be used to identify
molecules related to behavioral phenotypes in chemical ecology studies,
given response data for extract libraries in ecologically relevant
assay systems.
In principle, every natural product will have
a defined profile
across all of the biological space,[4,28] meaning that
NP Analyst is not restricted to multiparametric assays but can also
be used with individual results from a panel of different assay platforms.
For example, laboratories may have results for the same extract library
against a range of different bioassay targets (cancer cell lines,
bacterial or fungal pathogens, protozoan parasites, viruses, etc.).
Although acquired as individual results in each screen, these data
are suitable for assembly into biological profiles for use in NP Analyst,
provided that the data are normalized so that each column uses the
same scale. Detailed instructions for data normalization are included
in the Web site documentation (https://liningtonlab.github.io/npanalyst_documentation/). This tool is therefore immediately applicable in laboratories
that have legacy bioassay results, provided that companion mass spectrometry
data also exist for these samples.
For optimal performance,
users should carefully consider the quality
of the input MS data that they import. While it is possible to run
NP Analyst without any data preprocessing, this typically increases
the number of candidate bioactive MS features, because raw mass lists
can include many “junk” MS features.[29−31] Replicate comparison
can significantly reduce noise peaks in untargeted metabolomics data
sets[29,31,32] and is strongly
recommended where possible. In addition, it is useful to consider
the likely limit of detection of biological assays and to select an
appropriate signal intensity cutoff for the MS data. While it is tempting
to select the minimum possible cutoff value so as not to miss any
potentially important features, if the assay is insensitive this may
dramatically increase the complexity of the resulting networks without
including any additional biologically informative MS features.
An advantage of the NP Analyst approach is that communities are
created based on bioactive MS feature distribution, rather than sample
activity profiles. Therefore, molecules with different structures
but similar biological profiles will form separate communities, even
if the samples that contain them have indistinguishable biological
fingerprints. For example, the prefractions containing micromonolactam
are part of a large group of 79 prefractions with activity against
the same 6 pathogens (S. aureus, MRSA, P. alcalifaciens, B. subtilis, E. faecium, L. ivanovii). However, the
micromonolactam-containing cluster includes just 23 prefractions.
The remaining prefractions with this biological profile are distributed
across other communities in the network, including several (such as
community 3) that have clear candidate MS features for future development
that are distinct from micromonolactam.
In cases where communities
do not have clear individual bioactive
MS features, an effective strategy for prioritizing bioactive molecules
is to identify sets of active MS features with the same retention
time in the plot of retention time vs m/z ratio in the community view (Figure C). It is widely recognized that many compounds form
multiple MS features during the ionization process (adducts, fragments,
and multiply charged species).[29,31] Because NP Analyst
scores each MS feature independently, all MS features from a given
bioactive molecule will have similar Activity and Cluster Scores but
different m/z ratios. These appear
as a vertical “stripe” of features with the same retention
time in the retention time vs m/z ratio plot (Figure C), which provides retention time and MS properties for bioactive
molecules for direct isolation without the need for further bioassay-guided
fractionation. Examples of this phenomenon are communities 3 and 15,
both of which contain multiple MS features for individual priority
molecules.
Overall, NP Analyst offers a powerful suite of data
visualizations
for exploring the bioactive component of extract libraries. The scatter
plot, network, and community views provide three complementary viewpoints
that should be used interdependently to identify priority MS features
for further chemical and biological evaluation. Provision of the raw
data from all three visualizations in the downloads page allows users
to manipulate these data in other third-party data visualization tools
(e.g., Tableau), enabling the development of tailored data analysis
workflows as required.
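The Discussion notes that results from separate screens can be assembled into biological profiles provided that each column uses the same scale. One possible way to do this is per-column min-max scaling, sketched below. This is an assumption chosen for illustration and may differ from the procedure recommended in the platform documentation; the sample names, assay names, and values are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(table: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Scale every assay column to the range 0-1.

    `table` maps an assay name to one value per sample, in a fixed sample order.
    A constant column carries no information and is mapped to all zeros.
    """
    normalized = {}
    for assay, values in table.items():
        low, high = min(values), max(values)
        span = high - low
        normalized[assay] = [0.0 if span == 0 else (v - low) / span for v in values]
    return normalized


samples = ["extract_1", "extract_2", "extract_3"]
legacy_results = {
    "cytotoxicity_pct_inhibition": [0.0, 50.0, 100.0],  # percent scale
    "antifungal_hit": [0.0, 1.0, 0.0],                  # Boolean readout
    "zone_of_inhibition_mm": [2.0, 12.0, 7.0],          # millimetres
}

profiles = min_max_normalize(legacy_results)
for i, sample in enumerate(samples):
    print(sample, [round(profiles[assay][i], 2) for assay in legacy_results])
```

Readouts where a lower number means higher activity (for example, MIC values) would need to be inverted before scaling so that all columns point in the same direction.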
Conclusion
A new open online platform
has been developed for the direct prediction
of metabolite bioactivity profiles from complex mixtures. This platform
accepts a wide range of bioassay data types and is compatible with
both the mzML mass spectrometry open data format and output files
from two commonly used open-source mass spectrometry data processing
platforms (GNPS and MZmine 2). This platform is available at www.npanalyst.org.
We
validated this new platform by analyzing a “low-resolution”
antimicrobial bioassay data set for 925 natural product prefractions.
From these results, three communities were selected for further study,
leading to the isolation of three classes of bioactive metabolites
including two new compounds (dracolactam C (3) and amychelin
C (6)). In addition, this analysis afforded accurate
predictions of biological activities for all three compound classes,
including the first reported biological activity for the polyene macrolactam
micromonolactam (1). Together, these results demonstrate
the utility of this new platform as a rapid, accurate, and flexible
strategy for the discovery of novel bioactive natural products from
complex mixtures.
Experimental Section
General Experimental Procedures
Optical rotations were
measured on a Model 341 Polarimeter (PerkinElmer). Ultraviolet absorption
spectra were recorded on a Cary 300 UV–vis spectrophotometer
(Agilent Technologies). HR-ESI-MS and MS2 fragmentation
spectra were recorded on a SYNAPT UPLC-ESI-qTOF (Waters). NMR spectra
were measured on an AVANCE II 600 MHz spectrometer equipped with a
5 mm QNP cryoprobe (Bruker). MPLC (CombiFlash, Teledyne ISCO) was
carried out on RediSep Rf solid load cartridge (5 g, Teledyne ISCO).
HPLC separations were performed on either a Waters autopurification
system equipped with a SQ Detector 2 quadrupole MS detector or an
Agilent 1200 series HPLC equipped with a binary pump and a diode array
detector using either Synergi Fusion-RP or Kinetex XB-C18 columns
(Phenomenex).
Collection of Samples
Sediment samples
were collected
into sterile 15 mL Falcon tubes by SCUBA. Sample RL09-219 was collected
from Crowbar Canyon, CA. Samples RL12-176 and RL12-145 were collected
from Bell Point and Dinner Island, respectively, under permit number
12-034 from the Washington Department of Fish and Wildlife.
Isolation of Bacteria
Sediment samples were plated
onto Actinobacteria-specific isolation media with added antifungal
and Gram-negative antibacterial agents (cycloheximide and nalidixic
acid; 50 mg/L each) by radial stamping with sterile cotton swabs.
Morphologically distinct colonies were picked and replated on Difco
marine broth agar plates repeatedly until pure isolates were obtained.
Isolated colonies of Actinobacteria were subjected to liquid medium
culturing using our standard fermentation conditions[15] and cryopreserved as glycerol stock solutions at −80
°C.
Isolate Fermentation, Extraction, and Prefractionation
Bacterial frozen stocks were inoculated on solid media (10.0 g of
glucose, 5.0 g of NZ-amine, 1.0 g of CaCO3, 20.0 g of starch,
5.0 g of yeast extract, 20.0 g of agarose, and 1.0 L of water) and
incubated at room temperature until discrete colonies became visible.
The colonies were inoculated into 40.0 mL culture tubes containing
7.0 mL of liquid media (31.2 g of Instant Ocean, 10.0 g of soluble
starch, 4.0 g of peptone, 2.0 g of yeast extract per 1 L of water).
Liquid cultures were incubated at room temperature and shaken at 200
rpm. After 3 days, 3.0 mL of the small-scale cultures were used to
inoculate 60.0 mL of the same media in 250.0 mL wide-neck Erlenmeyer
flasks with a metal spring and milk filter top. After 5 days, 50.0
mL of the medium scale cultures were inoculated into 1.0 L of media
in 2.8 L wide-mouth Fernbach flasks containing a large spring with
20.0 g of Amberlite XAD-16 adsorbent resin. Large-scale cultures were
fermented for 7 days, and the cells and resin were filtered using
Whatman glass microfiber filters and washed with sterile water. The
filtered cells and resin with the filter paper were extracted with
1:1 dichloromethane (CH2Cl2) and methanol (MeOH).
The organic extracts were removed from the cells and resin by vacuum
filtration, and the extracts were evaporated under vacuum. The dried
extracts were prefractionated by MPLC using a MeOH/H2O
step gradient system (10% MeOH wash (discarded), 20% MeOH, 40% MeOH,
60% MeOH, 80% MeOH, 100% MeOH, and 100% EtOAc) to afford six prefractions
(A–F). Growth, extraction, and prefractionation of each bacterium
(RL09-219-HVF-D (RLUS-2105), RL12-176-HVF-A (RLUS-2108), and RL12-145-NTF-A
(RLUS-2152)) were carried out under the conditions described above.
Compound Purification
Micromonolactam (1) and
dracolactams A (2) and C (3) were
isolated from prefractions RLUS-2152C and RLUS-2152D. RLUS-2152D was
separated by RP-HPLC (Phenomenex Synergi Fusion-RP 80A, 250 mm ×
10.0 mm, 10 μm) and eluted using a gradient of 5% MeCN to 95%
MeCN with 0.02% formic acid (0–22 min) at a flow rate of 8.0
mL/min to afford two subfractions (RLUS-2152D-1 and RLUS-2152D-2).
RLUS-2152D-1 was purified by HPLC on a Phenomenex Kinetex XB-C18 column
(100 mm × 4.6 mm, 26 μm, 1.0 mL/min) using an isocratic
elution profile (25% MeCN with 0.02% formic acid) to afford micromonolactam
(1) and dracolactam C (3). RLUS-2152C was
subjected to ODS HPLC (Phenomenex Kinetex XB-C18, 100
mm × 4.6 mm, 26 μm) using an isocratic elution system (20%
MeCN + 0.02% formic acid, 1.0 mL/min) to furnish dracolactam A (2).
RLUS-2108C was separated by RP-HPLC (Phenomenex Synergi
Fusion-RP, 250 mm × 10.0 mm, 10 μm) using a gradient of
5% MeCN to 95% MeCN with 0.02% formic acid at a flow rate of 8.0 mL/min
to give collismycin A (4, RLUS-2108C-1) and two fractions
(RLUS-2108C-2 and RLUS-2108C-3). The RLUS-2108C-2 fraction was purified
by RP-HPLC (Phenomenex Kinetex XB-C18, 100 mm ×
4.6 mm, 26 μm) using an isocratic separation (50% MeCN + 0.02%
formic acid, 1.0 mL/min) to afford SF2738D (5).
Amychelin C (6) was purified from RLUS-2105D by
RP-HPLC (Phenomenex Synergi Fusion-RP, 250 mm × 10.0 mm, 10 μm)
using a gradient elution profile (5% MeCN to 95% MeCN + 0.02% formic
acid, 8.0 mL/min).
Antimicrobial Screening Methods and Data
Antimicrobial
susceptibility tests were performed using a miniaturized high-throughput
assay adapted from the broth microdilution method outlined by the
Clinical and Laboratory Standards Institute (CLSI). Bacterial test
strains were individually grown on fresh Nutrient Broth (NB, ATCC
Medium 3) agar, Tryptic Soy Broth (TSB, ATCC Medium 18) agar, or Brain
Heart Infusion (BHI, ATCC Medium 44) agar, respectively (Table S4), as recommended by the American Type
Culture Collection (ATCC) cultivation protocol. Individual colonies
were used to inoculate 3 mL of sterile NB, TSB, or BHI media and grown
overnight with shaking (200 rpm; 37 °C). Listeria ivanovii (ATCC BAA-139) and Streptococcus pneumoniae (ATCC 49619) were incubated overnight but not shaken (37 °C;
5% CO2). Saturated overnight cultures were diluted in Cation-Adjusted
Mueller-Hinton Broth (CAMHB, BBL BD) or 1:1 CAMHB/BHI media according
to turbidity to achieve approximately 5 × 10^5 CFU
of final inoculum density and dispensed into sterile clear polystyrene
384-well microplates (Thermo Scientific 265202) with a final screening
volume of 30 μL. DMSO solutions of test compounds and antibiotic
controls were prepared as 1:1 dilution series and pinned into each
assay plate (200 nL) using a high-throughput pinning robot (Tecan
Freedom EVO 100) to achieve final screening concentrations ranging
from 128 μM to 3.91 nM per compound. In each 384-well plate,
lane 1 was reserved for DMSO vehicle and culture medium; lane 2 reserved
for DMSO vehicle, culture medium, and target bacteria; lanes 23 and
24 reserved for antibiotic controls, DMSO vehicle, culture medium,
and target bacteria. After compound pinning, assay plates were read
as T0 at OD600 using an automated
plate reader system (Thermo Scientific Spinnaker Microplate Robot;
BioTek Synergy Neo2 plate reader) and then every hour for 20 hours
(T1–T20), while incubating in an ambient room temperature carousel (25 °C).
Resulting growth curves from the dilution series of each compound
and control were used to determine their minimum inhibitory concentration
(MIC) values following standard procedures.
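The concentration range quoted above is consistent with a 16-point two-fold (1:1) dilution series starting at 128 μM, which the sketch below reproduces. It also shows one simple way an MIC could be read from end-point growth values. The growth threshold, the function names, and the example growth data are hypothetical, and the rule used here is an assumption rather than the procedure applied in this study.

```python
from typing import Dict, List, Optional


def twofold_series(top_um: float = 128.0, n_points: int = 16) -> List[float]:
    """Concentrations (in micromolar) of a 1:1 serial dilution from `top_um`."""
    return [top_um / 2 ** i for i in range(n_points)]


def estimate_mic(growth_by_conc: Dict[float, float], threshold: float = 0.1) -> Optional[float]:
    """Lowest concentration at which growth stays below `threshold`, walking
    down from the highest concentration and stopping at the first well that grows.

    `growth_by_conc` maps concentration to the OD600 change over the run
    (final read minus T0). Returns None if the top concentration shows growth.
    """
    mic = None
    for conc in sorted(growth_by_conc, reverse=True):
        if growth_by_conc[conc] < threshold:
            mic = conc
        else:
            break
    return mic


series = twofold_series()
print(f"{len(series)} points, from {series[0]} uM down to {series[-1] * 1000:.2f} nM")

# Hypothetical end-point data: wells at 8 uM and above show no growth.
example_growth = {c: (0.02 if c >= 8.0 else 0.60) for c in series}
print("Estimated MIC (uM):", estimate_mic(example_growth))
```

Because the assay records hourly OD600 reads, a fuller treatment would use the whole growth curve rather than a single end-point difference.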
Safety Statement
Caution! Biological agents Staphylococcus aureus, methicillin-resistant Staphylococcus
aureus (MRSA), Vibrio cholerae, Salmonella enterica ser.
Typhimurium, Pseudomonas aeruginosa, and Yersinia pseudotuberculosis were handled following
BSL-2 protocols. No other unexpected or unusually high safety hazards
were encountered in the work reported.