Anjana Singh1,2, Ved Prakash2, Nikhil Gupta3,4, Ashish Kumar4, Ravi Kant1, Dinesh Kumar3. 1. All India Institute of Medical Sciences (AIIMS), Rishikesh, Uttarakhand 249201, India. 2. Pulmonary & Critical Care Medicine, King George's Medical University, Lucknow, Uttar Pradesh 226003, India. 3. Centre of Biomedical Research (CBMR), SGPGIMS, Lucknow, Uttar Pradesh 226014, India. 4. Department of Chemistry, Banaras Hindu University, Varanasi, Uttar Pradesh 221005, India.
Abstract
Detection of metabolic disturbances in lung cancer (LC) has the potential to aid early diagnosis/prognosis and hence improve disease management strategies through reliable grading, staging, and determination of neoadjuvant status in LC. However, a majority of previous metabolomics studies compare the normalized spectral features which not only provide ambiguous information but further limit the clinical translation of this information. Various such issues can be resolved by performing the concentration profiling of various metabolites with respect to formate as an internal reference using commercial software Chenomx. Continuing our efforts in this direction, the serum metabolic profiles were measured on 39 LC patients and 42 normal controls (NCs, comparable in age/sex) using high-field 800 MHz NMR spectroscopy and compared using multivariate statistical analysis tools to identify metabolic disturbances and metabolites of diagnostic potential. Partial least-squares discriminant analysis (PLS-DA) model revealed a distinct separation between LC and NC groups and resulted in excellent discriminatory ability with the area under the receiver-operating characteristic (AUROC) = 0.97 [95% CI = 0.89-1.00]. The metabolic features contributing to the differentiation of LC from NC samples were identified first using variable importance in projection (VIP) score analysis and then checked for their statistical significance (with p-value < 0.05) and diagnostic potential using the ROC curve analysis. The analysis revealed relevant metabolic disturbances associated with LC. Among various circulatory metabolites, six metabolites, including histidine, glutamine, glycine, threonine, alanine, and valine, were found to be of apposite diagnostic potential for clinical implications. These metabolic alterations indicated altered glucose metabolism, aberrant fatty acid synthesis, and augmented utilization of various amino acids including active glutaminolysis in LC.
Lung cancer (LC) is
the foremost contributor to cancer-related
deaths globally, and the number is increasing every year due to several
factors including poor air-quality index. LC is also the most lethal
cancer type, and for about 15% of LC diagnosed cases, the averaged
survival time is no longer than five years.[1,2] The
primary contributing factor to the lethality of LC is lack of reliable
noninvasive clinical markers, which can be used to distinguish LC
from other lung diseases after a symptomatic stage and predict its
prognosis.[3] Therefore, a majority of patients
are diagnosed at advanced stages following several clinical procedures
or diagnostic tests including the painful procedure of bronchoscopy/lung
biopsy.[4] However, clinical intervention
is unlikely to succeed if LC is diagnosed at an advanced stage, and
therefore, there is paucity and need for other noninvasive diagnostic
markers for LC to facilitate timely diagnosis and staging and for
guiding treatment decisions.[5,6] Starting our efforts
in this direction, we explored here the metabolomics
approach to identify reliable serum metabolic signatures that can
help to improve the early diagnostic and prognostic screening of LC.
Metabolomics studies on different cancer types revealed that tumor
progression often led to aberrant metabolic changes in biofluids including
blood serum.[3,5,7−16] Analysis of the NMR spectra of biological fluids (such as urine,
serum, bile, cerebrospinal fluid, etc.) using multivariate statistical
analysis tools is an emerging metabolomics approach for the identification
of the diagnostic panel of metabolic biomarkers for clinical surveillance.[17] The serum-based metabolomics studies carried
out in the recent past well demonstrate that the circulatory levels
of serum metabolites can differentiate LC types and can predict the
severity; thus, the approach has exquisite potential to aid early
diagnosis of LC and its clinical management.[5,15,16,18−27] Most of these metabolomics studies have been carried out in European,
Chinese, Japanese, and Korean populations. Although the prevalence of
LC is progressively increasing in India,[28] such metabolomics studies on the
Indian population remain scarce. The present study, therefore, aims to perform
the serum metabolomics analysis in north Indian patients using NMR
which offers several technical advantages like reproducibility and
minimal sample handling. Clinically, LC is broadly classified into
two subtypes: small-cell LC (SCLC) and non-small-cell LC (NSCLC);
the ratio of NSCLC to SCLC is about 5:1.[29,30] Among NSCLC, the most important histological subtypes include adenocarcinoma
(ADC), squamous cell lung carcinoma (SqCC), and large-cell carcinoma,
of which ADC and SqCC represent ∼90% of all cases.[30] In the present study, we attempted to demonstrate
the utility of NMR-based metabolomics analysis in LC detection and
further to identify the distinctive metabolic patterns of clinical
subtypes of LC.
Results and Discussion
Patient Characteristics
The study involved 39 LC patients
(36 male and 3 female) with median age at presentation 54 years, and
the median disease duration was 3 years. The histological classification
and other clinical and demographic characteristics of patients are
tabulated in Table 1. As evident, there are 5 SCLC patients and 34 NSCLC patients, and
among 34 NSCLC patients, there are 14 ADC patients and 20 SqCC patients.
For a comparative analysis, the study involved 42 normal healthy
subjects comparable in age and sex.
Table 1
Clinical and Demographic Characteristics of Subjects Included in the Study
parameter | LC | NC
total number (M/F) | 39 (36:3) | 42 (34:8)
age (range) | 54.13 ± 10.96 (29–72) | 48.98 ± 7.27 (38–74)
age of male subjects (comparable, with no significant difference) | 53.42 ± 1.852 (N = 36) | 49.08 ± 1.303 (N = 36)
age of female subjects (significantly different, p-value < 0.001) | 62.67 ± 0.88 (N = 3) | 48.33 ± 0.88 (N = 6)
disease duration (months) | 18 (12–24) |
smoking | 24 (61.5%) | 26 (61.9%)
Serum Metabolic Disturbances Associated with LC
Figure 1 compares the cumulative
1D 1H Carr–Purcell–Meiboom–Gill (CPMG)
NMR spectra of serum samples obtained from LC patients and normal
control (NC) subjects. The NMR peaks in the spectra are annotated
as per the chemical shift assignments of metabolites tabulated in
the Supporting Information (Table S1).
Visual inspection revealed possible metabolic differences in the most
abundant metabolites such as lipid/lipoproteins, glucose, and lactate.
However, to confirm these changes and evaluate other subtle metabolic
differences, we further performed the multivariate statistical analysis.
Figure 1
Stacking
of cumulative 1D 1H NMR (CPMG) spectra recorded
on serum samples of LC patients (red) and NC subjects (blue): the
spectra in panel A represent chemical shift range δ(0.55–4.65)
ppm, whereas the spectra in panel B represent that from δ(5.8–8.5)
ppm with 8 times magnification compared to spectral region δ(0.55–4.65)
ppm for the purpose of clarity. Abbreviations used: His: histidine;
3HB: 3-hydroxybutyrate; LDL/VLDL: low-/very-low-density lipoprotein;
TMAO: trimethylamine-N-oxide; NAG: N-acetyl
glycoprotein; NAAL: N-alpha acetyl lysine.
First, the normalized spectral features were compared,
and the
score plot derived from the partial least-squares discriminant analysis
(PLS-DA) is shown in Figure 2A. An exquisite separation between LC and NC samples
and clustering of samples in each study group were clearly evident
from the score plot, suggesting that the two study groups are distinctively
different in terms of their serum metabolic profiles. This was further
corroborated by the high validation parameters (R2 > 0.9; Q2 > 0.7)
of the PLS-DA model (Figure 2B), suggesting good predictive power of the discriminatory
model. Next, the serum metabolic features of discriminatory potential
were identified based on the indexing of variable importance in projection
(VIP) scores derived from the PLS-DA analysis (Figure 2C). A VIP score > 1.0 was considered
as the criterion for significant contribution to the discriminatory
model. Compared to NC, the sera of LC patients were characterized
by (a) elevated levels of glucose and N-acetyl-glycoproteins
and (b) decreased levels of alanine, lactate, myo-inositol, and different
membrane/lipid metabolites (including VLDL/LDL, polyunsaturated fatty
acid, choline, etc.). These metabolic changes clearly suggested that
there is coexistence of hyperglycemia and chronic inflammation in
LC patients. Such metabolic changes have been reported previously
in several other cancer conditions.[31] A
plasma-based metabolomics study performed at a very high field of
900 MHz revealed similar metabolic changes such as increased glucose
and decreased lipid profiles in the sera of LC patients.[27] However, the VIP score plot in our study showed
a higher degree of metabolic redundancy, and it is clearly evident
that the resonances of lipid/membrane metabolites, glucose, and lactate
were dominating the VIP score plot (Figure 2C).
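For reference, the R2 and Q2 validation parameters quoted for the PLS-DA model are conventionally defined as below. This is the usual textbook form, given here as an assumption, since the text does not state the exact variant computed by the software. Here y_i is the class label of sample i, \hat{y}_i is the fitted value from the full model, \hat{y}_{(i)} is the value predicted for sample i by a model trained without the cross-validation fold containing it, and \bar{y} is the mean label.

```latex
R^2 = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2},
\qquad
Q^2 = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_{(i)} \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2}
```

Under these definitions, R2 measures goodness of fit while Q2 measures cross-validated predictive ability, so a Q2 well below R2 would indicate overfitting.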
Figure 2
Normalized spectral features analyzed for discriminatory
analysis
using the PLS-DA method. The resulting 2D score plot is shown in (A)
where the semitransparent or shaded areas represent the 95% confidence
regions: red and blue colored regions correspond to the LC and NC
study groups. (B) Bar plot showing the three performance measures
obtained after 10-fold CV analysis. (C) VIP score plot used for identifying
the metabolites of discriminatory relevance.
In order to avoid this redundancy and to screen other circulatory
metabolites for their discriminatory relevance, the NMR spectra of
LC and NC subjects were further analyzed, and the concentrations of
25 serum metabolites were estimated explicitly using Chenomx NMR suite
program. These metabolites are 3-hydroxybutyrate, lactate, acetate,
citrate, acetoacetate, pyruvate, succinate, alanine, betaine, creatine,
creatinine, glutamine, glutamate, glucose, glycine, leucine, isoleucine,
valine, proline, phenylalanine, threonine, tyrosine, histidine, trimethylamine-N-oxide (TMAO), and myo-inositol. As described previously,[32−35] these concentrations were further used to estimate four other relevant
metabolic ratios: glutamate to glutamine ratio (EQR), phenylalanine
to tyrosine ratio (PTR), branched-chain amino acid to tyrosine ratio
[BTR, also referred to as the Fischer ratio; estimated as (leucine + isoleucine
+ valine)/tyrosine],[36,37] and lactate-to-pyruvate ratio
(LPR). The estimated mean and median values of these 29 serum metabolic
features are tabulated in the Supporting Information (Table S2).
Next, these 29 metabolic concentration profiles
were compared between
the study groups using principal component analysis (PCA) and PLS-DA
with orthogonal signal correction (OPLS-DA). Although PCA score plots
revealed poor discrimination between the two groups (results not shown),
sample clustering and group separation between the two groups were
clearly evident in the 2D score plots of OPLS-DA (with model validation
parameters R2X = 0.729, R2Y = 0.648, and Q2 = 0.2; Figure 3A). The results based on the OPLS-DA analysis were
very similar to those based on the PLS-DA analysis (see Supporting Information, Figure S1). Further, the VIP score
plot analysis was performed to identify the metabolites of discriminatory
potential (Figure 3C, Supporting Information, Table S3).
The VIP scores provide indexing of the metabolic features contributing
to the discriminatory model.[39] For normalized
spectral features (more than 100 in number), the cutoff value often
selected is 1.0.[39] However, for a limited
number of variables (e.g., 29 in our present study), the VIP score
cutoff value can be adjusted to allow a reasonable selection of discriminatory
features, as described previously.[39] For
example, a cutoff value of 0.9 resulted in 15 metabolic features of
discriminatory potential.
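The role of the cutoff can be seen from the commonly used definition of the VIP score, sketched below. This is the standard form, offered as an assumption, since the text does not state the exact expression evaluated by the software. Here p is the number of variables (29 in this analysis), A is the number of latent components, w_{ja} is the weight of variable j in component a, and SS_a is the variance of the response explained by component a.

```latex
\mathrm{VIP}_j = \sqrt{ \frac{ p \sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert w_a \rVert \right)^2 }{ \sum_{a=1}^{A} SS_a } },
\qquad
\frac{1}{p} \sum_{j=1}^{p} \mathrm{VIP}_j^2 = 1
```

Because the squared VIP scores average to one by construction, a cutoff of 1.0 selects variables with above-average contribution. A slightly lower cutoff such as 0.9 relaxes this threshold, which retains more features when only a few variables are modeled.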
Figure 3
Discriminatory analysis performed based on OPLS-DA
modeling for
29 serum metabolic concentration profiles (25 serum metabolic entities
and 4 serum metabolic ratios as shown in Table ). The resulted score plot is shown in (A).
The plot in (B) shows the validation performance of the OPLS-DA model
in terms of correlation between scrambled models and the original
model using 100 permutations. (C) VIP score plot derived from the
OPLS-DA analysis in SIMCA highlighting the metabolites of discriminatory
significance. (D,E) ROC curves generated in the Biomarker module of
MetaboAnalyst based on the CV performance: the ROC curves averaged
from all CV runs are shown in (D), and the ROC curve based on all
29 features and computed with the 95% CI is shown in (E).
All 29 circulatory profiles were further evaluated for their
diagnostic
potential and resulted in excellent discriminatory ability (Figure 3D,E). Among various
multivariate receiver operating characteristic (ROC) curves generated,
the curve based on top 10 discriminatory metabolites (selected based
on the highest VIP scores in the PLS-DA model) showed exquisite diagnostic
potential with the area under the ROC curve (AUROC) value = 0.97 [95%
CI = 0.93–1.00] (Figure 3D), which was as good as the cumulative ROC curve generated
based on all 29 circulatory profiles (AUROC value = 0.97 [95% CI
= 0.89–1.00]; Figure 3E), suggesting that the serum-based metabolic profiles estimated
in this study have good diagnostic potential as well.
Compared
to NC, the LC patients showed significant alterations
for 17 serum metabolites. The sera of LC patients were characterized
by decreased serum levels of citrate, betaine, creatinine, and most
of the amino acids (including valine, leucine, isoleucine, glycine,
alanine, glutamine, proline, threonine, tyrosine, and histidine),
whereas the circulatory levels of pyruvate, acetoacetate, and 3-hydroxybutyrate
were significantly increased. The metabolic changes
observed in this study were largely consistent with previous serum-based studies,[3,18,23,24] though some disagreements were observed as well, as evident from Table 2. The observed discrepancies
may be attributed to the fact that the present study for the first
time involved concentration profiling using formate as an internal
reference. The example to be discussed here is that of circulatory
glucose which, in principle, decreases in cancer patients due to Warburg
effect (i.e., increased aerobic glycolysis in cancer).[40] However, the majority of metabolomics studies in
the literature have failed to show significant difference in glucose
levels. In our study, the comparison based on normalized spectral
features clearly revealed that the circulatory glucose levels are
elevated in LC patients, suggesting that analysis needs to be rectified
and cross-checked. On the other hand, the circulatory glucose levels
estimated with respect to formate (as an internal reference) clearly
showed that the circulatory glucose levels are almost comparable between
cancer patients and normal healthy controls. According to a recent
metabolomics study,[38] the circulating formate
levels also decrease in LC patients relative to healthy control subjects;
therefore possibly, the net decrease in glucose levels is counterbalanced
by its normalization with respect to formate. Nevertheless, from the
metabolite concentrations estimated for other circulatory metabolites
including various amino acids and organic acids, a remarkable pattern
of metabolic alterations has been found and demonstrated to be well
consistent with various previous metabolomics studies.[3,18,23,24] In simple words, the metabolite concentrations reported in this
study can also be considered as ratiometric metabolic profiles.
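The counterbalancing argument above can be illustrated with a minimal sketch. It assumes hypothetical concentrations in arbitrary units (not data from this study) and a hypothetical helper, `relative_to_reference`, that expresses every metabolite as a ratio to formate. It is not the procedure implemented in Chenomx.

```python
import math


def relative_to_reference(concentrations, reference="formate"):
    """Divide each metabolite concentration by that of the reference metabolite."""
    ref = concentrations[reference]
    if ref <= 0:
        raise ValueError("reference concentration must be positive")
    return {
        name: value / ref
        for name, value in concentrations.items()
        if name != reference
    }


# Hypothetical absolute concentrations (arbitrary units, NOT study data).
control = {"glucose": 5.0, "formate": 0.05, "histidine": 0.08}
# Illustrative scenario: glucose and formate both drop by 20%, histidine by 50%.
patient = {"glucose": 4.0, "formate": 0.04, "histidine": 0.04}

control_ratio = relative_to_reference(control)
patient_ratio = relative_to_reference(patient)

# Glucose looks unchanged once referenced to formate,
# while histidine still appears lower in the patient.
glucose_unchanged = math.isclose(control_ratio["glucose"], patient_ratio["glucose"])
histidine_lower = patient_ratio["histidine"] < control_ratio["histidine"]
```

In this toy scenario a real decrease in glucose is masked because the reference falls by the same proportion, whereas a metabolite that falls more steeply than the reference still shows a decrease.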
Table 2
Diagnostic Potential of Serum Metabolites Evaluated Based on AUROC for Differentiating LC from HC^a
metabolite | AUROC | p-value | fold change | relative change | study references (consistent; not consistent)
histidine | 0.923 | 0.000 | –1.261 | ↓*** | refs (18)(23)(24)
glutamine | 0.880 | 0.000 | –0.956 | ↓*** | refs (18)(23)(24)
glycine | 0.869 | 0.000 | –1.062 | ↓*** | ref (18)
threonine | 0.868 | 0.000 | –1.028 | ↓*** | refs (18)(38); ref (3)
alanine | 0.833 | 0.000 | –0.875 | ↓*** | refs (23)(24); ref (18)
valine | 0.826 | 0.000 | –0.812 | ↓*** | ref (24); ref (18)
citrate | 0.788 | 0.000 | –0.977 | ↓*** | ref (18)
tyrosine | 0.774 | 0.000 | –0.714 | ↓*** | ref (24); ref (18)
proline | 0.771 | 0.000 | –0.767 | ↓*** | ref (23); ref (3)
leucine | 0.763 | 0.000 | –0.679 | ↓*** | ref (18)
isoleucine | 0.721 | 0.000 | –0.525 | ↓*** | ref (18)
pyruvate | 0.695 | 0.002 | 0.710 | ↑** | refs (18)(24)(38)
3-hydroxybutyrate | 0.694 | 0.003 | 0.658 | ↑** | ref (3)
betaine | 0.694 | 0.004 | –0.507 | ↓** |
succinate | 0.683 | 0.492 | –0.482 | ↓ | ref (18)
creatinine | 0.667 | 0.018 | –0.444 | ↓* |
lactate | 0.660 | 0.006 | –0.417 | ↓** | refs (18)(24)
creatine | 0.645 | 0.186 | –0.340 | ↓ | ref (18)
acetoacetate | 0.628 | 0.011 | 0.440 | ↑* | ref (18)
myo-inositol | 0.616 | 0.529 | –0.219 | ↓ |
phenylalanine | 0.606 | 0.109 | –0.163 | ↓ | refs (18)(23)
TMAO | 0.586 | 0.318 | –0.234 | ↓ | ref (18)
acetate | 0.579 | 0.952 | –0.193 | ↓ | ref (18)
glutamate | 0.529 | 0.575 | –0.116 | ↓ | ref (3); ref (18)
glucose | 0.505 | 0.714 | –0.053 | ↓ | ref (18)
formate | | | | | refs (18)(27); ↓**[24,38]
^a Abbreviations used: TMAO: trimethylamine-N-oxide. Note: the discriminatory analysis based on normalized
spectral features revealed no change in formate. This was also well
consistent with a previous NMR-based plasma metabolomics study performed
at 900 MHz field strength[27] and formed
the basis for us to use it as an internal reference compound for performing
concentration profiling in the software program Chenomx. The study
showing decreased circulatory levels of formate in LC was based on
gas chromatography–mass spectrometry.[38]
The present study also
aimed to evaluate the diagnostic utility
of serum metabolic profiles estimated by NMR for differentiating LC
patients from NC subjects. For this, the ROC curves were generated
for serum metabolites of discriminatory relevance between LC and NC
groups, and the AUROC analysis was performed to test their diagnostic
ability. Setting the criteria for diagnostic potential as AUROC value
more than 0.8 and p-value less than 0.001, six key
metabolic entities (histidine, glutamine, glycine, threonine, alanine
and valine) were selected as diagnostic markers of LC (Figure 4, Table 2). Among the selected metabolic ratios, the
AUROC values for LPR were found to be greater than 0.8 (0.82, down)
suggesting its diagnostic potential in LC as well, whereas the AUROC
values for PTR (0.80, up) and EQR (0.7, up) were found to be in the
moderate range and that for BTR (0.51, down) is of no diagnostic potential
(see Supporting Information, Table S4).
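The selection logic described above can be sketched in a few lines. This is an illustrative sketch only, not the MetaboAnalyst implementation used in the study. The function names are hypothetical, the toy concentration values are invented, and the AUROC is computed as the rank-based probability that a randomly chosen case scores higher than a randomly chosen control. The BTR formula and the thresholds (AUROC > 0.8, p-value < 0.001) are taken from the text.

```python
import math


def auroc(cases, controls):
    """Probability that a random case value exceeds a random control value (ties count 0.5)."""
    if not cases or not controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def direction_free_auroc(cases, controls):
    """Report the AUROC on the 0.5 to 1.0 scale, whether the marker goes up or down."""
    area = auroc(cases, controls)
    return max(area, 1.0 - area)


def btr(leucine, isoleucine, valine, tyrosine):
    """Branched-chain amino acid to tyrosine ratio, as defined in the text."""
    return (leucine + isoleucine + valine) / tyrosine


def is_candidate_marker(auroc_value, p_value, min_auroc=0.8, max_p=0.001):
    """Apply the selection criteria stated in the text."""
    return auroc_value > min_auroc and p_value < max_p


# Toy values (arbitrary units, NOT study data): a marker that is lower in cases.
toy_cases = [1.0, 2.0, 3.0]
toy_controls = [2.5, 3.5, 4.0]
toy_area = direction_free_auroc(toy_cases, toy_controls)

# Toy BTR from invented concentrations.
toy_btr = btr(0.10, 0.05, 0.20, 0.07)

# AUROC and p-values as reported in Table 2.
histidine_selected = is_candidate_marker(0.923, 0.000)
citrate_selected = is_candidate_marker(0.788, 0.000)
pyruvate_selected = is_candidate_marker(0.695, 0.002)
```

With the Table 2 values, histidine passes both criteria, citrate fails on AUROC, and pyruvate fails on both, which matches the selection reported in the text.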
Figure 4
Top 9
putative metabolic biomarkers selected after the ROC curve
analysis was performed with all 29 serum metabolic entities tabulated
in Table . The ROC
curve plots shown here are the diagnostic potential of these metabolic
entities between LC and NC groups as evident from the AUROC value,
and the computed 95% CI is in the faint blue background. The metabolic
differences are further evident from the box-cum-whisker plots shown
on the right side of each ROC curve plot.
Compared to NC, the decreased serum levels of various amino acids
in LC might be related to their augmented utilization in LC to regulate
various biological functions. Consistent with previous reports, the
significantly decreased serum levels of glutamine might be related
to activated glutaminolysis in LC patients to replenish the energy
demand required for regulating complex immune-mediated inflammatory
responses.[41] Well-consistent with previous
reports,[42,43] the elevated serum levels of 3HB and acetoacetate
(the end products of lipid-metabolism) in LC patients were indicative
of aberrant lipid metabolism (or active fatty acid synthesis) in cancer
including LC. Particularly, the important metabolite of interest found
in this study is histidine which is significantly decreased in the
sera of LC patients and further provided satisfactory sensitivity
and specificity with AUROC equal to 0.93 [95% confidence interval
(CI) = 0.86–0.97]. Various clinical and preclinical studies
suggest that histidine has strong antioxidative and anti-inflammatory
effects.[44−46] Further, it is a precursor for histamine which serves
as a key mediator for many pathological responses including immune-mediated
chronic/acute inflammatory and hypersensitivity responses.[47,48] A recent study from our lab has also demonstrated that the circulatory
histidine levels significantly decrease in Takayasu arteritis patients
with active disease (i.e., immune-mediated active inflammation).[49] Therefore, the decreased serum levels of histidine
might be related to its augmented utilization under conditions of
elevated oxidative stress and inflammation (a common clinical manifestation
of LC[50]). The pathophysiological states,
that is, oxidative stress and inflammation, were further indicated
by the elevated serum levels of PTR and NAG as per previous reports.[35,51] The summary of these metabolic pathway alterations is shown in Figure 5.
Figure 5
Summary of key metabolic
changes and their association with the
underlying disease pathophysiology.
Differential Metabolic Signatures of ADC and SqCC
The
sample size of 14 for ADC patients and 20 for SqCC allowed us to further
compare the serum metabolic profiles between ADC and SqCC patients.
The 3D score scatter plot derived from the PLS-DA analysis based on
the concentration profiles of 29 metabolic entities listed in Table
S2 is shown in Supporting Information (Figure
S2A). A trend for clustering of ADC and SqCC samples and separation
between these clustered samples was clearly evident from the score
plot (shown in Supporting Information,
Figure S2A), suggesting that serum metabolic profiles between two
groups are distinctively different. However, the lower cross-validation
(CV) parameters (>0.2) (see Supporting Information, Figure S2B) revealed that the generated PLS-DA model does not exhibit
good discrimination and predictive ability. The poor performance of
the discriminatory model may be attributed partly to the small sample
size. The VIP score plot in combination with Student's t-test was used to identify the metabolic profiles of discriminatory
and statistical significance (see Supporting Information, Figure S2C,D). Further, we performed the ROC curve analysis, and
key metabolic entities (glycine, proline, creatinine, phenylalanine,
myo-inositol, and glutamine) were selected as biomarkers of diagnostic
potential for discrimination between ADC and SqCC groups (Figure 6).
Figure 6
Top six putative metabolic
biomarkers selected after the ROC curve
analysis performed with all 29 serum metabolic entities tabulated
in Table for testing
their diagnostic potential between ADC (in blue) and SqCC (in yellow)
study groups as evident from the AUROC value, and computed 95% CI
is in the faint blue background. The metabolic differences are further
evident from the box-cum-whisker plots shown on the right side of
each ROC curve plot.
We also observed that
the circulatory metabolic ratios showed no
statistically significant difference between ADC and SqCC groups, and
the AUROC values were also less than 0.6 for each of these metabolic
entities (see Supporting Information, Figure
S3 and Table S5), suggesting their similar metabolic response in both
the clinical subtypes of LC. It should be noted that the results
based on discriminatory analysis between ADC and SqCC are preliminary,
and future studies on a larger cohort of patients are required to validate
these findings. As a reference for future studies, the results of
PLS-DA-based discriminatory analysis performed between ADC and SqCC
groups with respect to NC are summarized in Figure 7.
Figure 7
(A) 3D score plot derived from the PLS-DA model-based
discriminatory
analysis and showing exquisite separation of ADC (in red) and SqCC
(in green) serum samples with respect to the NC group. (B,C) Univariate
analysis performed to evaluate the quantitative variations in the
concentration profiles of serum metabolites. (B) Box-cum-whisker plots
of 20 metabolic entities for which the statistical significance is
at the level of Tukey’s p ≤ 0.05 in
the one-way ANOVA test compared to NC. (C) In each plot, the boxes
denote interquartile ranges, the horizontal line inside the box denotes
the median, and the bottom and top boundaries of the boxes are the
25th and 75th percentiles, respectively. Lower and upper whiskers
are the 5th and 95th percentiles, respectively.
Concluding Remarks
In conclusion, the present study on Indian
LC patients further
supported that NMR-based serum metabolomics analysis has exquisite
potential to provide metabolic markers for discriminating LC and NC
subjects and has the ability to distinguish clinical subtypes of LC
as well. The metabolic disturbances observed in LC patients were suggestive
of augmented utilization of amino acids to support anabolic metabolism
and cancer-induced inflammation and also the elevated oxidative stress,
activated glutaminolysis, and altered energy metabolism (Figure ). However, for translating
these findings into clinical procedure, further studies are required
on a large patient sample size, especially in each of the clinical
subtype cohort. Additionally, procedural optimization will be required
to improve the accuracy of the NMR-based tests. The limitations of
the study are the following: (i) the sample size for the clinical subtypes
of LC is small and (ii) the impact of staging on the serum
metabolic profile was not evaluated. Nevertheless, the alternative strategy
described in this study (i.e., concentration profiling of serum metabolites)
and the consistency of the findings with the majority of previous
reports suggest that the approach will open a new avenue for scientists
involved in NMR-based serum metabolomics studies for encompassing
the benefits of recent software tools like Chenomx used here for concentration
profiling of circulatory metabolites in the NMR spectra of serum samples.
Materials and Methods
Ethical Approval
The study was approved by the Human
Ethics Committee of King George’s Medical University (KGMU),
Lucknow 226003, Uttar Pradesh, India (IEC no. 1758/Ethics/R.Cell-17;
dated: 08/02/2017). The work was performed in strict accordance with
the guidelines of the Institutional Ethics Committee. Before blood was
drawn from the subjects, the purpose of the study was explained
to all participants, and signed written informed consent was obtained.
Patient Selection and Sample Collection
A total of
39 consecutive, newly diagnosed, treatment-naïve LC patients
who consulted at the OPD of the Department of Pulmonary
Surgery, King George’s Medical University (KGMU), Lucknow 226003,
Uttar Pradesh, India, were enrolled. Histopathological examinations of the biopsied
or resected tissue samples were conducted to classify the LC patients
into clinical subtypes according to the 7th edition of the TNM
staging system.[52] For comparative analysis,
42 healthy subjects (comparable in age and sex) were recruited as
NCs. From each participant, a 2.0 mL blood sample was drawn into plain
vacutainer tubes (Becton Dickinson), and serum was extracted as described
previously[53] and stored at −80 °C.
Sample Preparation
Before carrying out the NMR experiments,
the stored serum samples were thawed and homogenized using a vortex
mixer for 5 min. The NMR samples were prepared following the procedure
described previously.[32] Briefly, 250
μL of 0.9% saline sodium phosphate buffer of strength 50 mM
(pH 7.4, prepared in 100% deuterium oxide i.e., D2O) was
added to 250 μL of serum to minimize the variation in pH. The
resultant sample mixture was centrifuged at 16,278g for 5 min, and then 450 μL of supernatant of this mixture
was transferred to 5 mm NMR tubes (Wilmad Glass, USA). A sealed capillary
tube holding 1.0 mM TSP (sodium salt of 3-trimethylsilyl (2,2,3,3-d4)-propionic acid) dissolved in D2O was inserted in the NMR tubes as an external reference (resulting
final TSP concentration ∼ 0.1 mM). The NMR solvent (deuterium
oxide with deuteration degree min. 99.95%) and the sodium salt of
TSP were purchased from Merck Millipore and Sigma-Aldrich (St. Louis,
MO, USA), respectively.
Data Collection and Preprocessing
The serum samples
were prepared and examined on an 800 MHz NMR spectrometer following the procedure
described previously.[32] The recorded
NMR spectra were analyzed using the PROCESSOR module of commercial
software Chenomx (v8.2, Edmonton, Canada), and the CPMG data matrix
containing 0.02 ppm spectral bins normalized with respect to the total
spectral intensity was prepared for multivariate analysis as described
previously.[32] The normalized spectral bins
(or features) were assigned for various metabolites as per the spectral
assignments reported in our recent serum metabolomics studies.[32,53] As glucose, lactate, and lipid/membrane metabolites (including low-/very-low-density
lipoproteins) significantly and variably contribute to the total spectral
intensity, the discriminatory analysis based on normalized spectral
features may provide unreliable results as demonstrated previously.[32] To avoid any such possibility, we additionally
estimated the concentrations of 25 serum metabolites with respect
to formate (used as an internal reference, with the sample pH set
to 7.0 ± 0.5). Formate was selected because its singlet
NMR signal is the most downfield signal in the NMR spectrum of serum
and does not overlap with any other signal of serum metabolites.
Further, the software Chenomx provides the option to use formate as
a calibration standard to overcome/minimize the analytical variations.
A recent NMR-based serum metabolomics study showed that formate does
not change significantly in the sera of LC patients compared to controls.[27] This further supports the use of formate as
an internal reference. For spectral calibration, the concentration
of formate was set to 10 μM, which is close to the detection
limit of an 800 MHz NMR spectrometer and well within the circulatory
range reported in the literature.[38,54] The estimated concentrations
of serum metabolites have been reported here in micromolar except
for glucose and lactate for which the concentrations have been reported
in millimolar (see Supporting Information, Table S2) and can simply be considered as ratiometric metabolic
profiles estimated with respect to formate.
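As a rough illustration of the two quantification routes described above, the following Python sketch bins a spectrum into 0.02 ppm buckets normalized to the total intensity and, separately, scales a metabolite signal against the formate singlet fixed at 10 μM. The function names, the toy peak areas, and the simple per-proton area ratio are assumptions made for illustration only; Chenomx fits library line shapes and is not expected to work this way internally.

```python
from collections import defaultdict

BIN_WIDTH_PPM = 0.02     # bin width stated in the text
FORMATE_CONC_UM = 10.0   # formate calibration value stated in the text


def bin_and_normalize(ppm, intensity, width=BIN_WIDTH_PPM):
    """Sum intensities into fixed-width bins and scale them to unit total."""
    bins = defaultdict(float)
    for shift, value in zip(ppm, intensity):
        bins[round(shift / width)] += value  # index of the nearest bin center
    total = sum(bins.values())
    if total <= 0:
        raise ValueError("total spectral intensity must be positive")
    return {round(i * width, 2): v / total for i, v in sorted(bins.items())}


def conc_relative_to_formate(area, n_protons, formate_area,
                             formate_conc=FORMATE_CONC_UM):
    """Hypothetical helper: per-proton peak area scaled to the formate singlet (1 H)."""
    if formate_area <= 0:
        raise ValueError("formate peak area must be positive")
    return formate_conc * (area / n_protons) / formate_area


# Toy peak areas (invented): same alanine and formate, different glucose.
samples = {
    "low_glucose": {"alanine": 90.0, "formate": 1.0, "glucose": 500.0},
    "high_glucose": {"alanine": 90.0, "formate": 1.0, "glucose": 1000.0},
}
for name, areas in samples.items():
    fraction = areas["alanine"] / sum(areas.values())
    # The alanine methyl doublet arises from 3 protons.
    conc = conc_relative_to_formate(areas["alanine"], 3, areas["formate"])
    print(f"{name}: total-normalized alanine = {fraction:.3f}, "
          f"formate-referenced alanine = {conc:.0f} uM")
```

In this toy case the total-normalized alanine value drops when glucose doubles although the alanine signal itself is unchanged, whereas the formate-referenced estimate stays the same, which is the motivation given above for ratiometric profiling.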
Multivariate Statistical Analysis
The multivariate
data derived from NMR-based serum metabolic profiling were analyzed
for comparison between the LC and NC groups using multivariate statistical
analysis tools, namely unsupervised PCA and supervised PLS-DA, performed
using the freely available web-based server MetaboAnalyst.[17,55,56] The PLS-DA analysis was performed
following the details described previously.[32] The performance of the PLS-DA models was further improved by integrating
them with orthogonal signal correction. This analysis is referred to
as OPLS-DA and removes variability not relevant to class separation.[57] The OPLS-DA analysis was performed using the commercial
software SIMCA (v14.0, Umetrics, Umeå, Sweden: https://umetrics.com/kb/simca-14). The spectral features or metabolites of statistical significance
and diagnostic potential were finally identified by performing
ROC curve analysis integrated with Student's t-test using the Biomarker module of MetaboAnalyst. The level of statistical
significance was set at p < 0.05. Continuous
variables were expressed as the mean ± SD and categorical variables
as percentages.
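The univariate screening step can be sketched for a single metabolite as follows. This is a minimal, standard-library illustration and not the MetaboAnalyst implementation: the AUROC is computed from the Mann-Whitney statistic and the two-sample Student's t statistic uses a pooled variance. The function names and the concentration values are invented for the example, and converting the t statistic to a p-value would additionally need the t-distribution CDF (for example from a statistics package).

```python
from math import sqrt
from statistics import mean, variance


def auroc(cases, controls):
    """AUROC as P(case > control) + 0.5 * P(tie), via pairwise comparison."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def student_t(a, b):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Invented concentrations for one metabolite in the two groups.
lc = [4.0, 5.0, 6.0]
nc = [1.0, 2.0, 5.0]

print(f"AUROC = {auroc(lc, nc):.3f}")
print(f"t = {student_t(lc, nc):.2f} with {len(lc) + len(nc) - 2} degrees of freedom")
```

For a metabolite that is lower in LC than in NC, this AUROC falls below 0.5; in that case 1 − AUROC expresses the same discriminatory ability with the direction reversed.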