Shunji Yamada1,2, Kengo Ito2, Atsushi Kurotani2, Yutaka Yamada2, Eisuke Chikayama2,3, Jun Kikuchi1,2,4. 1. Graduate School of Bioagricultural Sciences, Nagoya University, 1 Furo-cho, Chikusa-ku, Nagoya, Aichi 464-0810, Japan. 2. RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan. 3. Department of Information Systems, Niigata University of International and Information Studies, 3-1-1 Mizukino, Nishi-ku, Niigata-shi, Niigata 950-2292, Japan. 4. Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
Abstract
InterSpin (http://dmar.riken.jp/interspin/) comprises integrated, supportive, and freely accessible preprocessing webtools and a database to advance signal assignment in low- and high-field NMR analyses of molecular complexities ranging from small molecules to macromolecules for food, material, and environmental applications. To support handling of the broad spectra obtained from solid-state NMR or low-field benchtop NMR, we have developed and evaluated two preprocessing tools: sensitivity improvement with spectral integration, which enhances the signal-to-noise ratio by spectral integration, and peaks separation, which separates overlapping peaks by several algorithms, such as non-negative sparse coding. In addition, the InterSpin Laboratory Information Management System (SpinLIMS) database stores numerous standard spectra ranging from small molecules to macromolecules in solid and solution states (dissolved in polar/nonpolar solvents), and can be searched under various conditions using the following molecular assignment tools. SpinMacro supports easy assignment of macromolecules in natural mixtures via solid-state 13C peaks and dimethyl sulfoxide-dissolved 1H-13C correlation peaks. InterAnalysis improves the accuracy of molecular assignment by integrated analysis of 1H-13C correlation peaks and 1H-J correlation peaks of small molecules dissolved in D2O or deuterated methanol, which supports easy narrowing down of metabolite candidates. Finally, by enabling database interoperability, SpinLIMS's client software will ultimately support scientific discovery by facilitating sharing and reusing of NMR data.
Environmental problems
such as marine pollution; destruction of
land and fresh water ecosystems; depletion of resources including
energy, raw materials, and food; and health problems are some of the
global challenges of modern society. The realization of a materials-circulating
society, including use of renewable energy and production of sustainable
food and materials, is increasingly important. With the rapid development
of information and communication technology in recent years, it is
expected that innovations in environmental science, sustainable resources,
materials, foods, and medicine will be integrated by effectively connecting
the accumulating scientific data and real-world information.[1−3] As a result, digital innovations in the analyses of natural mixtures,
such as biogeochemical samples from the environment and molecular
complexities from biological tissues, are becoming important both
for a sustainable society and for healthcare.[4,5]

NMR approaches to natural mixture analysis are being developed
as a strategy[6] to evaluate homeostatic
stages via molecular compositional changes in healthcare,[7−11] foods,[12,13] natural materials,[14−17] biomass utilization,[18,19] and environmental ecology.[20−24] In parallel, there have been many advances in NMR technology, including
high-field NMR over 1 GHz using high-temperature superconducting materials,[25] hyperpolarization,[26] photodetection NMR using diamond nitrogen-vacancy centers,[27] zero-magnetic-field NMR,[28] and compact and benchtop NMR instruments that have become
highly cost-effective owing to the marked progress in permanent magnet
materials.[29,30] These innovations in NMR hardware
are likely to be applied not only to precise analysis by high-magnetic-field
NMR in the laboratory but also to homeostatic assessments of environment
and health, and quality control in the fields of agriculture, forestry,
and fishery.

Thus, identification of molecules contained in
mixtures is an important
task in NMR analysis. Because the physical and chemical properties
of these molecules can be extremely diverse, various sample preparation
methods and pulse sequences have been used for mixture analysis.[4,5,31] Depending on the target molecules
under analysis, the sample preparation method may range from solid-state
to polar and nonpolar solvent systems.[32,33] When targeting
small molecules, for example, solution NMR in a polar or semipolar
solvent system such as deuterated water (D2O) or deuterated
methanol (MeOD) is generally used.[34,35] On the other
hand, macromolecules can be evaluated using a dimethyl sulfoxide (DMSO)-solubilized
system[36] or solid-state 13C
cross-polarization magic-angle spinning (CP-MAS) NMR. One-dimensional
(1D)-NMR and two-dimensional (2D)-NMR such as 1H–13C heteronuclear single quantum coherence (HSQC) and 2D-1H–J resolved (2D-Jres) spectroscopy are also useful for applications where
stable isotope labeling experiments cannot be applied.

Nevertheless,
such molecular assignments remain difficult owing
to the problems of spectral overlap, and a lack of available reference
spectra or convenient molecular assignment tools specific to the molecules
and conditions of interest. Databases and analytical tools for traditional
major metabolomics studies such as HMDB,[37] BMRB,[38] BML-NMR,[39] MMCD,[40] NMRShiftDB,[41] TOCCATA,[42] COLMAR,[43] MetaboLights,[44] MetaboAnalyst,[45] SpinAssign,[46] and
SpinCouple[47] focus on the analysis of low-molecular-mass
metabolites by high-magnetic-field solution NMR. For the analysis
of macromolecular mixtures derived from environmental samples and
living organisms, however, solid-state CP-MAS spectral data can characterize
insoluble samples, whereas HSQC spectral data in a DMSO solvent are
required to characterize soluble samples. BMRB contains reference
NMR data on biomolecules in various solvents such as DMSO and methanol,
but it is limited to partial structural data for polysaccharides.
In addition, Bm-Char of ECOMICS[48] can be
used to characterize chemical structures from the HSQC spectrum of
a biomass sample. As opposed to many other databases of metabolites,
GISSMO[49,50] offers the complete spin system for a large
number of metabolites, making analysis possible regardless of the
magnetic field. Nevertheless, there remain insufficient databases
and analytical tools for complex mixtures of similarly structured
macromolecules, or for solid CP-MAS NMR, which typically has very
low resolution, or low-field benchtop NMR.

To overcome these
problems, here we have developed InterSpin, an
integrated supportive webtool comprising freely accessible preprocessing
tools, a database, and molecular assignment webtools to advance signal
assignment in low- and high-field NMR analyses of small- to macromolecular
mixtures (Figure ).
InterSpin comprises the following three elements: (1) spectrum-preprocessing
tools, (2) molecular assignment tools, and (3) the InterSpin Laboratory
Information Management System (SpinLIMS) database.
Figure 1
Overview of InterSpin.
InterSpin is a freely accessible integrated
supportive webtool for advanced performance of NMR signal assignment
in low- and high-field NMR analyses of small- to macromolecular mixtures.
InterSpin comprises the following three elements. (1) Spectrum-preprocessing
tools. In the case of a broad spectrum obtained from low-field benchtop 1H-NMR or solid-state 13C CP-MAS, sensitivity improvement
with spectral integration (SENSI) helps to overcome the problem of
low signal-to-noise ratio by increasing resolution through the integration
of multiple spectra, whereas PKSP supports effective peak separation
by a multivariate spectral decomposition method. (2) Molecular assignment
tools. SpinMacro supports simplifying the macromolecular assignment
of a solid CP-MAS spectrum or a DMSO-solubilized 1H–13C HSQC spectrum. SpinAssign searches the SpinLIMS database
for a compound corresponding to the HSQC NMR peaks. SpinCouple can
assign 1H–J 2D-Jres NMR peaks. InterAnalysis is a Venn-diagram-type highly
accurate annotation tool that helps to narrow down candidate molecules
using correlation peaks from both the HSQC spectrum and the 2D-Jres spectrum. In the bottom right of the figure,
blue, yellow, and red circles represent a set of search results; the
green star represents the narrowed-down set. (3) InterSpin Laboratory
Information Management System (SpinLIMS) database. The database includes
reference solid-state CP-MAS spectra and solution-state HSQC spectra
(DMSO) for macromolecules, and reference solution-state HSQC and 2D-Jres spectra (D2O and MeOD) for small
molecules.
Results and Discussion
Signal Enhancement and Peak Separation of Benchtop NMR Spectra by SENSI and PKSP
To support preprocessing of a broad spectrum,
InterSpin uses peaks separation (PKSP) and SENSI1D, which have been
newly developed as webtools (Figure ). SENSI1D is intended to increase signal intensities
and to overcome the problem of low signal-to-noise (S/N) ratio by
the integration of multiple spectra without additional measurements. On the other hand, PKSP is a multivariate method of spectral decomposition
that includes the algorithms for non-negative sparse coding (NNSC),[51,52] which separates the spectrum into non-negative sparse components;
multivariate curve resolution-alternate least squares (MCR-ALS); fast
independent component analysis (FastICA); and non-negative matrix
factorization (NMF). We have previously described the spectrum-preprocessing
methods of SENSI,[53] MCR-ALS,[15] and NMF;[16] here,
we have integrated them into InterSpin as a freely available webtool.

First, we verified the effectiveness of the new function NNSC in
PKSP using multicomponent test data with increasing numbers of components.
MCR-ALS and NMF required significant computing time when processing
more than 100 components, whereas NNSC and FastICA were fast, maintaining
speed even as the component number increased (Figure ). In terms of resolving the spectrum of
mixtures of 10 standard compounds (Supporting Information Table S1) with reference to the spectrum of each
standard compound by PKSP, the Durbin–Watson (DW) plot approached
2 (Supporting Information Figure S1, white)
with 10 components identified by all algorithms, the residual sum
of squares (RSS) plot converged to 0, and the spectrum was separated
into the correct number of components (Supporting Information Figure S1). NNSC, MCR-ALS, and NMF generally
showed good separation of all components from the mixed spectra (Supporting
Information Figure S2). In NNSC, a sharp
peak was observed in the broad part of the spectrum (3–4 ppm)
for glucose. In FastICA, a large error in the original spectrum occurred
for glucose and sucrose. In NNSC, NMF, and MCR-ALS, the ratio of components
in the mixture was well estimated, but FastICA showed an error for
alanine, phenylalanine, proline, valine, and glucose, although its
calculation speed was fast (Supporting Information Figure S3).
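The two diagnostics mentioned above, the Durbin–Watson (DW) statistic and the residual sum of squares (RSS), can be sketched as follows. This is a minimal illustration and not the PKSP implementation: the function names are hypothetical, and it is assumed here that the residuals are taken point by point along the chemical-shift axis between an original spectrum and its reconstruction from the separated components.

```python
def residuals(original, reconstructed):
    """Point-by-point difference between a spectrum and its reconstruction."""
    return [o - r for o, r in zip(original, reconstructed)]


def rss(original, reconstructed):
    """Residual sum of squares; it approaches 0 as the reconstruction improves."""
    return sum(e * e for e in residuals(original, reconstructed))


def durbin_watson(original, reconstructed):
    """Durbin-Watson statistic of the residuals (assumed order: chemical shift).

    Values near 2 suggest noise-like, uncorrelated residuals; values near 0
    suggest that structured signal is still left unexplained.
    """
    e = residuals(original, reconstructed)
    denominator = sum(x * x for x in e)
    if denominator == 0:
        raise ValueError("residuals are all zero; DW is undefined")
    numerator = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    return numerator / denominator
```

In practice both values would be computed for each trial number of components, and the smallest number at which DW approaches 2 and RSS levels off near 0 would be taken as a reasonable choice.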
Figure 2
Comparison of the analysis speed of each algorithm in
peaks separation
(PKSP). (a) Analysis time, averaged over three runs, for 25, 50, 100, and 198
components (i.e., compounds to be separated by each algorithm of PKSP).
(b) Analysis speed, averaged over three runs, for 25, 50, 100, and 198 components.
As a demonstration of the integrated
use of SENSI and PKSP webtools, Figure shows that histidine,
creatine, and lactate were well separated as major components of Thunnus
muscle measured by benchtop 60 MHz NMR (Figure ). For this demonstration, the 60 MHz NMR
spectra from 51 samples of 40 fish foods (Supporting Information Table S2) and 11 standard compounds (Supporting
Information Table S3) first showed that
the SENSI tool strengthened 25 peaks of the 11 standards 66-fold on
average and improved the S/N ratio 5.5-fold (Supporting Information Figure S4 and Table S4). Subsequently, peak separation
of the benchtop NMR spectrum was performed by NNSC of PKSP, which
led to the separation of 17 components (Supporting Information Figure S5). Note that where there are multiple
signals for the same molecule, their coefficients of variation (CVs)
indicate that their signal intensities vary together. This information
can support signal attribution. Thus, histidine, creatine, and lactate
could be identified using the CV value of peaks detected by SENSI
(Supporting Information Figure S4b) and
the individual components obtained by PKSP (Supporting Information Figure S5d).
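The use of CV values for signal attribution can be illustrated with a short sketch. It is only an approximation of the idea under stated assumptions: point-by-point summation stands in for the spectral integration step (the published SENSI procedure may differ), the population standard deviation is used for the CV, and all function names are hypothetical. Spectra are assumed to be aligned lists of intensities of equal length.

```python
from statistics import mean, pstdev


def integrate_spectra(spectra):
    """Sum intensities point by point across aligned spectra."""
    return [sum(column) for column in zip(*spectra)]


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population standard deviation assumed)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return pstdev(values) / m


def peak_cvs(spectra, peak_indices):
    """CV of the intensity at each picked peak position across all spectra."""
    return {
        i: coefficient_of_variation([spectrum[i] for spectrum in spectra])
        for i in peak_indices
    }
```

Peaks whose intensities rise and fall together across samples give similar CV values, which is why peaks with matching CVs are plausible candidates for signals of the same molecule.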
Figure 3
Molecular assignment of a mixture using
peaks separated by PKSP
(NNSC). (a) Original and separated spectra of No. 33 Thunnus sample
measured by benchtop 60 MHz NMR. Original and separated component
spectra of (b) histidine, (c) creatine, and (d) lactate.
To evaluate peak separation by the four algorithms
in PKSP, here
we determined the appropriate number of separated components using DW and
RSS plots, where the RSS is the sum of squared residuals between the original
matrix and the matrix reconstructed by each model.[54] Although FastICA produced negative values in the separated
matrices, it determined the number of components more quickly than
the other algorithms (Figure ). When the number of components was large, however, NNSC
provided a realistic approximate spectrum at high speed and with non-negative
values. For the analysis of large numbers of components, therefore,
we considered that it would be most efficient to determine the number
of components with FastICA and then perform accurate spectral separation
with NNSC.

For analysis in SENSI and PKSP, the peak maximum
must have exactly
the same chemical shift for each signal. Thus, peak alignment to correct
chemical shifts altered because of pH, temperature, or magnetic field
inhomogeneity caused by magnetic material in the sample is an important
process.

The calculation algorithm used for PKSP is a multivariate
analysis;
therefore, it requires a set of M spectral
data. However, because 2D-NMR has data in the f2
direction, PKSP can be applied to data from the 2D matrix of one or
more spectra of 2D-NMR. Therefore, if a user has difficulty with 2D-NMR
peak-picking of, for example, saccharides and lipids, our approach
can support objective peak-picking by helping to separate peaks via
PKSP. When PKSP’s MCR-ALS algorithm was applied to a single 2D-Jres spectrum, the spectrum
was separated into three components (Supporting Information Figure S6), and eight compounds (valine, lactate, alanine, creatine, trimethylamine
N-oxide, betaine, glycine, and glucose) were assigned.

In general,
quality control is essential in modern food production.
In many cases, however, the primary production or distribution sites
(i.e., farms or fishing grounds) are located far from laboratories
or analytical centers (i.e., food companies or facilities). In such
cases, benchtop NMR may potentially revolutionize the quality control
processes that identify metabolic changes in food resulting from storage
and fermentation. As a practical tool, we previously developed FoodPro,[55] a database and webtool for predicting the taste
and longevity of foods on the basis of the similarity of desktop NMR
spectra of food substances. As shown in this study, SENSI and PKSP
are expected to lead to improved cost-effectiveness of this approach
by supporting the annotation of the broad spectra obtained from in
situ low-field NMR.
Assignment of Macromolecules by SpinMacro
InterSpin’s
annotation tools consist of the newly developed SpinMacro and InterAnalysis,
and the re-implemented SpinAssign and SpinCouple, which were previously
developed (Figure ). SpinMacro is a webtool that simplifies the molecular
assignment of macromolecules in solid-state 13C CP-MAS
spectra and in 1H–13C HSQC spectra recorded
in a DMSO solvent (Figure , Supporting Information Figure S7). As reference data for SpinMacro, solid CP-MAS peaks and HSQC peaks
of compounds in a DMSO solvent have been stored in the InterSpin Laboratory
Information Management System (SpinLIMS; see Figure ) database.
Figure 4
How to assign macromolecules in a mixture
using “SpinMacro”.
The flow of data through SpinMacro is shown. User queries of CP-MAS
peaks or HSQC peaks are interpreted by PHP. The SpinLIMS database is then
searched for candidate molecules within the set range of 13C chemical shifts for CP-MAS, or 1H and 13C
chemical shifts for HSQC.
The steps for using SpinMacro are as follows. (1) PHP interpretation
of the user query for CP-MAS peaks or HSQC peaks. (2) Connection to the
SpinLIMS database and search for candidate molecules within the set
range of 13C chemical shifts for CP-MAS, or 1H and 13C chemical shifts for HSQC. (3) Conversion of
results to HTML and JavaScript for convenient and quick display. Here,
the previously reported solid-state CP-MAS spectra of Euglena gracilis[14] and
standards (paramylon, peptides, and lipids) were queried using SpinMacro
and SENSI–PKSP. First, the CV value was determined for peaks
picked by SENSI (Figure ), and then, the components were identified by PKSP (Supporting Information Figure S8). As a result, paramylon, peptides,
and lipids were separated as the main three components of E. gracilis. Ultimately, as a result of retrieving
the peaks picked by SENSI with SpinMacro, it was possible to verify
their assignment (Supporting Information Figure S7). In a previous study of general lipids and general peptides
of E. gracilis,[56] we conducted experiments that required considerable measurement
time, such as 2D-/3D-NMR pulse sequences of solid-state NMR (i.e.,
INADEQUATE, SHA+, and 3D-DARR). Because the peak separation by NNSC
corresponds to 1D-CP-MAS, this tool supports a more rapid evaluation
of macromolecular mixtures.
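The range search at the core of SpinMacro, as described in the steps above, can be sketched in a few lines. This is an illustrative stand-in and not the SpinMacro code (which is written in PHP against MySQL): the record type, function names, and default tolerances are all hypothetical, and reference entries without a 1H shift are assumed to represent solid-state CP-MAS peaks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferencePeak:
    """One reference peak; h1_ppm is None for a solid-state CP-MAS entry."""
    molecule: str
    c13_ppm: float
    h1_ppm: Optional[float] = None


def search_cpmas(query_c13, references, tol_c13=1.0):
    """For each query 13C shift, list molecules with a CP-MAS peak in range."""
    return {
        q: sorted({r.molecule for r in references
                   if r.h1_ppm is None and abs(r.c13_ppm - q) <= tol_c13})
        for q in query_c13
    }


def search_hsqc(query_pairs, references, tol_h1=0.03, tol_c13=0.5):
    """For each (1H, 13C) query pair, list molecules with an HSQC peak in range."""
    return {
        (h, c): sorted({r.molecule for r in references
                        if r.h1_ppm is not None
                        and abs(r.h1_ppm - h) <= tol_h1
                        and abs(r.c13_ppm - c) <= tol_c13})
        for h, c in query_pairs
    }
```

A query peak that matches no reference entry simply returns an empty candidate list, so unassigned peaks remain visible in the result rather than being dropped.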
Figure 5
CV of peaks picked by SENSI from E. gracilis CP-MAS spectrum. (a) SENSI results. Red
circles are the picked peaks.
The enlarged view (top left) shows the raw spectrum of paramylon from
the data used for SENSI of a sugar region. (b) CV of peaks picked
by SENSI. Blue circles indicate lipid signals, black circles indicate
peptide signals, and red circles indicate paramylon signals.
The database and mixture analysis
tools for macromolecular and
solid CP-MAS NMR of complex and similar structures have room for development.
SpinMacro developed herein retrieves the peak of the whole macromolecular
structure from SpinLIMS and provides candidate molecules in analyses
of environmental and biological macromolecules. In the future, it
should be possible to improve assignment accuracy by discriminating
macromolecules with similar structures through the extraction of features
of chemical structures using machine learning algorithms based on
macromolecular databases.
InterSpin Laboratory Information Management System (SpinLIMS) Database
Within InterSpin, SpinLIMS (Figure ) is a relational database comprising several
entities or “tables” implemented in MySQL (Supporting
Information Figures S9a and S10a, core
tables). To make the database extensible, SpinLIMS client software
was developed to incorporate a simple registration system. After registering
in the user table (“limsuser”), the researchers can
associate their NMR spectrum (“spectrum”) with the chemical
shift (“cs”) or the J value (“jval”)
tables, as well as the molecular value (“metabolite”)
table by means of the assignment table (cs_assign, hc_pk (h_pk for 1H-1D-NMR, c_pk for 13C-1D-NMR), and hj_pk) via
the client software. For the NMR spectrum, there is an associated
pulse-type table (“pulse”), solvent table (“solvent”),
and standard substance table (“stdref”). For chemical
shifts and J values, there is an associated peak-shaped
table (“pkshape”). Molecular name (“metabolitename”),
atom (“atom”), and nuclide (“nucleus”)
tables are associated with the molecule.

SpinLIMS contains numerous
reference spectra of small molecules to macromolecules recorded in
solid state and solution state (polar and nonpolar solvent systems)
that can be used to support mixture analysis of various samples. Overall,
there are 34 data tables in SpinLIMS, as well as tables for managing
the information from NMR experiments (Supporting Information Figure S10b). In addition to HSQC in D2O (705 spectra) and 2D-Jres in D2O (623 spectra), SpinLIMS has several newly added spectra
from CP-MAS (35 spectra), HSQC in MeOD (947 spectra) and deuterated
DMSO (171 spectra), and 2D-Jres in MeOD
(357 spectra). SpinMacro, InterAnalysis, SpinAssign, and SpinCouple
are connected to the MySQL server via a local network in InterSpin
(Supporting Information Figure S9b). As
a result, the re-implemented SpinAssign and SpinCouple facilitate
chemical shift searches in MeOD. SpinAssign also facilitates searches
in a deuterated DMSO/pyridine solvent.
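The relational layout described in this section can be illustrated with a reduced sketch. It uses Python's built-in sqlite3 module in place of MySQL so that it runs without a server, it keeps only six of the tables named above (limsuser, solvent, metabolite, spectrum, cs, and cs_assign), and every column name is an assumption made for illustration; the actual SpinLIMS schema is given in the Supporting Information.

```python
import sqlite3

# Reduced, hypothetical schema: table names follow the text, columns do not.
SCHEMA = """
CREATE TABLE limsuser   (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE solvent    (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE metabolite (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE spectrum (
    id          INTEGER PRIMARY KEY,
    limsuser_id INTEGER NOT NULL REFERENCES limsuser(id),
    solvent_id  INTEGER NOT NULL REFERENCES solvent(id)
);
CREATE TABLE cs (
    id          INTEGER PRIMARY KEY,
    spectrum_id INTEGER NOT NULL REFERENCES spectrum(id),
    ppm         REAL NOT NULL
);
CREATE TABLE cs_assign (
    cs_id         INTEGER NOT NULL REFERENCES cs(id),
    metabolite_id INTEGER NOT NULL REFERENCES metabolite(id)
);
"""


def build_demo_db():
    """Create an empty in-memory database with the reduced schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def metabolites_near(conn, ppm, tol, solvent):
    """Molecules with an assigned chemical shift within ppm +/- tol in a solvent."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.label
        FROM cs
        JOIN spectrum   s  ON s.id  = cs.spectrum_id
        JOIN solvent    sv ON sv.id = s.solvent_id
        JOIN cs_assign  a  ON a.cs_id = cs.id
        JOIN metabolite m  ON m.id  = a.metabolite_id
        WHERE sv.name = ? AND cs.ppm BETWEEN ? AND ?
        ORDER BY m.label
        """,
        (solvent, ppm - tol, ppm + tol),
    ).fetchall()
    return [row[0] for row in rows]
```

Routing the peak-to-molecule link through a separate assignment table, as the text describes, lets one spectrum carry many peaks and one peak carry more than one candidate molecule, and the solvent join is what allows the same search to be restricted to, for example, MeOD or D2O reference data.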
Venn-Diagram-Type Annotation by InterAnalysis
Within
InterSpin, the new tool InterAnalysis is a Venn-diagram-type annotation
tool that supports simultaneous searches of two kinds of correlation
peaks, 1H–13C HSQC and 2D-Jres, to narrow down candidate molecules (Figure ). The flow of data through
InterAnalysis is as follows: (1) PHP interpretation of user queries,
(2) connection to the SpinLIMS database and conversion to HTML, and
(3) JavaScript execution for a convenient and rapid view.
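The Venn-diagram-type narrowing can be sketched as a set intersection. This is a simplified reading of the approach rather than the InterAnalysis code: it assumes that the narrowed-down set consists of molecules returned by both the HSQC search and the 2D-Jres search, and the function names and input layout (a mapping from each query peak to its candidate molecules) are hypothetical.

```python
def candidates(hits_per_peak):
    """Union of candidate molecules over all query peaks of one spectrum type."""
    found = set()
    for molecules in hits_per_peak.values():
        found.update(molecules)
    return found


def narrow_down(hsqc_hits, jres_hits):
    """Molecules supported by both HSQC and 2D-Jres search results."""
    return candidates(hsqc_hits) & candidates(jres_hits)


def retained_percent(n_narrowed, n_original):
    """Size of the narrowed-down set as a percentage of an original result set."""
    return 100.0 * n_narrowed / n_original
```

For instance, `retained_percent(25, 223)` and `retained_percent(25, 107)` give about 11% and 23%, which corresponds to narrowing 223 and 107 single-spectrum candidates down to 25 shared ones.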
Figure 6
Result of InterAnalysis
for 1H–13C
HSQC and 1H–J 2D-Jres peaks from Acanthogobius flavimanus body muscle extract in MeOD. The summary shows the number of query
peaks, the number of assigned molecules, and the narrowed-down set
of molecules. The table shows some of the molecular assignment results
for each query peak.
Here, we demonstrated the application of InterAnalysis to
HSQC
and 2D-Jres peaks from A. flavimanus (Yellowfin goby) body muscle extracts
in MeOD (Figure )
and deuterated potassium phosphate (Supporting Information Figure S11). For data acquired in MeOD extract,
SpinAssign and SpinCouple assigned 223 and 107 molecules, respectively. By contrast,
InterAnalysis assigned 25 molecules, narrowing the candidates
to 11% and 23% of those totals, respectively (Figure ). From previous studies,[12,21,23] seven metabolites (L-valine, L-leucine, L-phenylalanine, L-histidine, L-proline, linoleic acid, and capric acid) were confirmed as
well-known metabolites that should be present in fish.

In the
analysis of natural mixtures, molecular assignment based
on two kinds of 2D-NMR spectra, HSQC and 2D-Jres, is a powerful strategy to increase assignment accuracy.
The previous tools, SpinAssign and SpinCouple, acquired two separate
results of correlation peak attribution; thus, it was highly time-consuming
to narrow down candidate molecules. The newly developed Venn-diagram-type
webtool, InterAnalysis, supports the annotation of environmentally
and biologically derived small-molecule mixtures.
Conclusions
As shown above, InterSpin provides free access to a suite of tools
whose goal is to support the interpretation of low-resolution NMR
spectra, similar to the spectra recorded for food, material, and environmental
applications. The tools of InterSpin are interoperable in supporting low-resolution NMR spectrum
analysis, as demonstrated, for example,
by the peak attribution of E. gracilis using SENSI and the NNSC algorithm of PKSP, and confirmation of metabolite candidates by SpinMacro.
Furthermore, 2D-Jres and HSQC are pulse
sequences that are frequently used in high-magnetic-field NMR; conventionally,
SpinAssign and SpinCouple have had to be applied individually, but
InterAnalysis supports their simultaneous application.

NMR has the great advantage that chemical
shifts and coupling constants
are absolute physical constants that have high repeatability and interchangeability
among different agencies. Therefore, NMR provides data that are suitable
for reuse globally. For NMR analyses that target the molecular complexity
of living bodies and environments, InterSpin provides an integrated
supportive resource, consisting of an extensible SpinLIMS database
and webtools that are easily accessible to varied and numerous researchers.
SpinLIMS’s client software will ultimately promote scientific
discovery through the open circulation of knowledge by facilitating
data sharing and reusing, as well as the interoperability of NMR data
for the achievements of researchers to be recognized fairly and with
transparency.

In conclusion, InterSpin comprises integrated
supportive webtools
that are effective not only for precision analysis in laboratories
but also for on-site analysis by benchtop NMR. As a platform linking
the laboratory and the real world, it will support sustainable development
on the basis of NMR data.
Experimental Section
SENSI and PKSP
The SENSI and PKSP webtools were developed
using the Shiny package based on previously reported R scripts.[15,16,53] Here, we incorporated a new method,
NNSC,[51,52] into PKSP.
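To make the idea of non-negative spectral decomposition concrete, the following sketch factorizes a matrix of mixture spectra into mixing ratios and component spectra. Note the substitutions: it is written in Python rather than R, it implements plain NMF with multiplicative updates rather than NNSC (which additionally penalizes non-sparse solutions), and all function names and defaults are hypothetical. It is therefore an illustration of the family of methods, not the PKSP code.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rss(x, w, h):
    """Residual sum of squares between X and its reconstruction W H."""
    wh = matmul(w, h)
    return sum((xv - rv) ** 2 for xr, rr in zip(x, wh) for xv, rv in zip(xr, rr))


def nmf(x, k, n_iter=200, seed=0, eps=1e-12):
    """Factor non-negative X (spectra x points) into W (spectra x k) and H (k x points).

    Rows of H play the role of component spectra; rows of W hold their ratios.
    """
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # Update W: W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h
```

Because each update only multiplies by non-negative ratios, W and H stay non-negative throughout, which is the property that makes the separated components interpretable as spectra.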
Database and Client Software of SpinLIMS
SpinLIMS was
developed in MySQL. It integrated previously reported data from SpinAssign[46] and SpinCouple.[47] In addition, it newly implemented NMR spectra for solid-state CP-MAS
and solution-state in DMSO/pyridine and MeOD solvents. The SpinLIMS
client software was developed with Java.
SpinMacro, InterAnalysis, SpinAssign, and SpinCouple
SpinMacro and InterAnalysis were
developed in HTML, PHP, JavaScript,
and MySQL. SpinAssign[46] and SpinCouple[47] were completely re-implemented within the program
and were connected to the SpinLIMS database to run within InterSpin.
Evaluation of Benchtop NMR Signal Assignment Performance by SENSI and PKSP
To evaluate the performance of SENSI and PKSP,
we used 1H-NMR data of mixtures of 10 standard compounds
(two kinds of composition; Supporting Information Table S1), 40 fish-based food mixtures (Supporting Information Table S2), and 11 standard compounds (Supporting
Information Table S3) measured by benchtop
60 MHz NMR (Nanalysis, Alberta, Canada). All NMR spectra were phased,
baseline corrected, and aligned using the Mnova software (Mestrelab
Research, A Coruña, Spain). Each aligned spectrum was normalized
to the square root of the sum of the squared values of all its
variables. The mixtures of 10 standard compounds were analyzed
by PKSP using all four algorithms (NNSC, NMF, MCR-ALS, and FastICA). The
51 samples (40 fish foods and 11 standard compounds)
were analyzed by the NNSC method in PKSP with 17 components. In addition, the computational
speed of the four PKSP methods (NNSC, NMF, MCR-ALS, and FastICA) was
evaluated for each number of components using
similarly processed spectra from 198 plant and algal biomass samples measured
by 500 MHz 13C CP-MAS (Bruker Biospin, Rheinstetten, Germany).
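The normalization step described above amounts to scaling by the Euclidean (L2) norm. A minimal sketch follows, assuming that the scaling is applied to each spectrum separately; the function name is hypothetical.

```python
import math


def l2_normalize(spectrum):
    """Divide every intensity by the square root of the sum of squared intensities."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in spectrum]
```

After this step the squared intensities of a spectrum sum to 1, so spectra recorded at different overall concentrations contribute on a comparable scale to the multivariate decomposition.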
Evaluation of Macromolecular Assignment by SpinMacro with SENSI and PKSP
To evaluate the molecular attribution strategy of
SpinMacro using SENSI and PKSP, we used the previously reported CP-MAS
spectrum of E. gracilis[14] and spectra of standard compounds (paramylon,
peptide, and lipid). These NMR spectra were phased and baseline
corrected, and all spectra were then aligned using the Mnova software.
Each aligned spectrum was normalized to the square root of the sum of the squared
values of all its variables. Subsequently, the processed data were analyzed
by SENSI and PKSP (NNSC method, three components).
Comparison of Small-Molecule Assignment by InterAnalysis, SpinAssign, and SpinCouple
To evaluate the performance of InterAnalysis,
HSQC and 2D-Jres peaks in 700 MHz NMR
(Bruker Biospin, Rheinstetten, Germany) spectra of body muscle extract
of A. flavimanus were assigned to molecules
by InterAnalysis, SpinAssign, and SpinCouple.