Yingdi Zhu1, Andreas Lesch2, Xiaoyun Li3,4, Tzu-En Lin5, Natalia Gasilova1, Milica Jović1, Horst Matthias Pick6, Ping-Chih Ho3,4, Hubert H Girault1. 1. Institute of Chemical Sciences and Engineering, School of Basic Sciences, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland. 2. Department of Industrial Chemistry "Toso Montanari", Universita degli Studi di Bologna, 40136 Bologna, Italy. 3. Department of Fundamental Oncology, Université de Lausanne, 1066 Epalinges, Switzerland. 4. Ludwig Institute for Cancer Research, Université de Lausanne, 1066 Epalinges, Switzerland. 5. Institute of Biomedical Engineering, College of Electrical and Computer Engineering, National Chiao Tung University, 30010 Hsinchu, Taiwan. 6. Environmental Engineering Institute, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland.
Abstract
Skin problems are often overlooked due to a lack of robust and patient-friendly monitoring tools. Herein, we report a rapid, noninvasive, and high-throughput analytical chemical methodology, aiming at real-time monitoring of skin conditions and early detection of skin disorders. Within this methodology, adhesive sampling and laser desorption ionization mass spectrometry are coordinated to record skin surface molecular mass in minutes. Automated result interpretation is achieved by data learning, using similarity scoring and machine learning algorithms. Feasibility of the methodology has been demonstrated after testing a total of 117 healthy, benign-disordered, or malignant-disordered skins. Remarkably, skin malignancy, using melanoma as a proof of concept, was detected with 100% accuracy already at early stages when the lesions were submillimeter-sized, far beyond the detection limit of most existing noninvasive diagnosis tools. Moreover, the malignancy development over time has also been monitored successfully, showing the potential to predict skin disorder progression. Capable of detecting skin alterations at the molecular level in a nonsurgical and time-saving manner, this analytical chemistry platform is promising to build personalized skin care.
Human skin acts as
a barrier to protect the body from the environment.
As the largest organ of the body, the skin is at especially high risk
of suffering from harmful impacts both externally and internally.
Skin disorders are thus some of the most common human health problems
worldwide. The disorder can result from diverse factors, like wounding,
infection, allergy, inflammation, genetic mutation, and so forth.[1] Some inner body health problems can also cause
symptoms on the skin. For instance, immune system abnormalities, cardiovascular
diseases, neurological diseases, diabetes, and microbial infections
like the current COVID-19 epidemic can trigger specific physical or
chemical changes on the skin.[2,3] Many skin disorders
could be life-threatening if not treated properly in time. Taking
skin cancer as an example, the cancer cells can spread from where
they arise, normally in the epidermis or the epidermal–dermal
junction, deep into the dermis and further invade other parts of the
body, resulting in a dangerous situation with a high fatality rate.[4] Regular monitoring of the skin status, therefore,
is of great value for public health strategy.

Clinically,
skin monitoring mainly relies on microscopic, imaging,
and molecular tools. The initial screening is often based on visual
and physical evaluations, noting the skin lesion size, shape, color,
texture, and evolution, as well as the occurrence of bleeding, oozing, or
crusting, using tools like dermoscopy. Once suspected as a skin malignancy,
a surgical biopsy is carried out to establish a definitive diagnosis.
During the process, a small amount of skin tissue is removed from
the suspicious region and analyzed using histopathological or genetic
techniques like hematoxylin–eosin staining, immunohistochemistry,
in situ hybridization, and gene expression profiling.[5,6] Lymph node biopsy is conducted when there are signs of malignant
spread. Body imaging tests are ordered to check for metastasis at
distant sites, using tools like computed tomography, ultrasonography,
or magnetic resonance imaging.[7] Over the
years, more analytical tools have been proposed to analyze biopsied
skin samples. Examples include Raman spectroscopy, scanning electrochemical
microscopy, quantitative chemical sensors, quantitative proteomics,
and mass spectrometry imaging, to mention a few.[8−10] Despite these
advancements, strategies that are not only robust but also rapid and
patient-friendly are in short supply.

Herein, we develop an
analytical chemical methodology to achieve
quick, noninvasive, and high-throughput skin monitoring. Within
it, a well-established adhesive sampling procedure is used in conjunction
with matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF)
mass spectrometry to record skin surface mass profiles. The mass profiles,
determined by the skin molecular composition, can act as “fingerprints”
for the characterization of the skin state. They are afterward subjected
to similarity scoring and machine learning to reach an automated result
interpretation. The adhesive sampling provides a mild collection of
cells from the epidermal skin layer using adhesive discs. This sampling
procedure is noninvasive, painless, readily performed, and has been
widely accepted in skin pharmacological and physiological studies.[11,12] MALDI-TOF mass spectrometry (MS) is now implemented in clinical
laboratories for a broad range of applications such as microbial identification.
It detects analytes precisely according to their molecular weights
with a turnaround time of only several minutes. The mass spectrometer
target is designed to accommodate hundreds of sample spots, so that
the assays can be conducted in a high-throughput manner. The mass
spectral data are easy to process due to the presence of mostly singly
charged analyte ions.[13,14] Automated data interpretation,
using artificial intelligence approaches like machine learning, data
mining, or complex network analysis, is one of the most promising
trends in clinical diagnosis.[15] It eliminates
the heavy reliance on skilled physicians with prior clinical experience
and enables the processing of large complex data sets in a short time,
e.g., analysis of hundreds of mass spectra in milliseconds. After
testing on a total of 117 skin samples, the developed methodology
is demonstrated to allow a routine examination of skin state, an early
and accurate detection of skin disorders, and a dynamic monitoring
of the disorder progression.
Results
Methodology Description
As illustrated in Figure 1A, the methodology
is composed of three steps, including (i) adhesive collection of cells
from the surface of suspicious skin regions using sterile adhesive
sampling discs (“sampling”), (ii) recording mass fingerprints
of the collected skin cells on the sampling discs by MALDI-TOF mass
spectrometry (“measurement”), and (iii) interpretation
of the skin surface mass fingerprint data using similarity scoring
algorithms, peak tracking method, and machine learning tools (“data
interpretation”). The whole detection process is rather rapid,
with each step taking only a few minutes.
Figure 1
Methodology description.
(A) Schematic illustration of the proposed
methodology for skin monitoring and diagnosis. (B) Illustration of
the mass spectrometry ionization process with the presence of a sampling
disc. All of the measurements were conducted under linear positive
mode with 400 ns delayed extraction, using sinapinic acid as the matrix.
The sampling discs used herein (photo in Figure S1) have been validated by human skin safety tests. The discs
carrying the collected skin cells were directly covered with the MALDI
matrix for the mass recording. This sample preparation process is
straightforward, without any pretreatment like cell lysis, component
extraction, or analyte labeling. During the MALDI measurements, the
presence of a sampling disc made of nonconducting materials would
hinder the electrical charge dissipation from the sample surface and
thus affect the ionization process (Figure 1B).[16] This could
result in a decrease of absolute peak intensities, as observed during
a comparative test with in vitro grown cells (Figure S2). Nevertheless, the mass spectra still displayed
a good mass resolution with a high signal-to-noise ratio. For each investigated skin region, the mass spectrum was
obtained with 5 × 250 laser shots throughout the sample region
to reduce the possible impact from “sweet” spots and
to generate a panorama of the whole skin region. For intersample comparison,
the normalized spectra with relative peak intensities were considered
to minimize the influence from sample quality or experimental conditions.
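As a minimal sketch of the normalization step, the snippet below rescales a peak list so that the strongest peak has a relative intensity of 1. Scaling to the base peak is an assumption made for illustration; the text only states that relative intensities were used, and the function name and the example peak lists are hypothetical.

```python
def to_relative_intensity(peaks):
    """Scale intensities so that the strongest peak equals 1.0.

    `peaks` is a list of (mz, absolute_intensity) tuples. Base-peak scaling
    is an assumption; the source only mentions relative intensities.
    """
    base = max((intensity for _, intensity in peaks), default=0.0)
    if base <= 0:
        return [(mz, 0.0) for mz, _ in peaks]
    return [(mz, intensity / base) for mz, intensity in peaks]


# Hypothetical example: the same sample recorded at two signal levels.
strong = [(4690.0, 800.0), (9366.0, 4000.0), (10871.0, 1600.0)]
weak = [(4690.0, 80.0), (9366.0, 400.0), (10871.0, 160.0)]

print(to_relative_intensity(strong))
print(to_relative_intensity(weak))
```

After scaling, the two hypothetical recordings yield the same relative profile, which is the property that makes intersample comparison less sensitive to absolute signal strength.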
Tests on Healthy Human Skin and Benign Skin Disorders
The
methodology was first tested on a normal skin region, i.e., a
region without moles, scars, pimples, hair, etc.,
on the forearm of a healthy adult volunteer. The skin region was sampled
four times in a row, using a new disc each time. Stratified cells
with a lateral diameter of ∼30 μm were clearly observed
on each disc (Figure 2A–D). The collected cell layer had a mean thickness of ∼3
μm (height profile in Figure 2E). Most of the cells were dead, as confirmed by trypan
blue staining of a freshly collected sample (Figure 2F). The cells are expected to derive from the stratum
corneum, the skin layer composed of dead corneocytes with a thickness
between 10 and 40 μm on healthy humans.[17] Hence, the sampling process is noninvasive and painless, as the
removal of corneocytes will not damage the deep skin tissue, and the
removed cells will be regenerated naturally within around 14 days.[18] Mass fingerprints generated from the four sequential
samplings closely resembled each other, displaying mutual cosine similarity
of 0.912(±0.037) (Figure 2G). The cosine similarity was calculated using the cosine
correlation algorithm. It considers both the location (mass-to-charge
ratio, m/z) and the relative intensity
of each mass spectral peak, scoring spectral similarity between 0
(completely different) and 1 (totally identical). The result indicates
that a suspicious skin region can be sampled multiple times to ensure
reliability of the detection. The detectable mass peaks were mainly
located in the mass range of 2000–20,000 m/z, mostly derived from proteins in the skin cells.
Notably, the sampling disc itself did not bring any interference mass
signal (Figure S3).
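The cosine scoring described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: peaks are given as (m/z, relative intensity) pairs, two peaks are paired when their m/z values agree within a ppm tolerance (the default of 1000 ppm and the greedy nearest-peak pairing are assumptions), and unpaired peaks contribute only to the vector norms.

```python
import math


def cosine_similarity(spec_a, spec_b, ppm=1000.0):
    """Cosine similarity between two peak lists of (mz, relative_intensity).

    Peaks are paired when their m/z values agree within `ppm`. Unpaired peaks
    add to the norms but not to the dot product, which lowers the score.
    """
    b_sorted = sorted(spec_b)
    used = set()
    dot = 0.0
    for mz_a, int_a in sorted(spec_a):
        tol = mz_a * ppm * 1e-6
        best = None
        for j, (mz_b, _) in enumerate(b_sorted):
            if j in used or abs(mz_b - mz_a) > tol:
                continue
            if best is None or abs(mz_b - mz_a) < abs(b_sorted[best][0] - mz_a):
                best = j
        if best is not None:
            used.add(best)
            dot += int_a * b_sorted[best][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical peak lists for illustration only.
first = [(4690.0, 0.2), (9366.0, 1.0), (10871.0, 0.4)]
shifted = [(4692.0, 0.2), (9370.0, 1.0), (10875.0, 0.4)]  # small m/z drift
unrelated = [(5000.0, 1.0)]

print(cosine_similarity(first, shifted))
print(cosine_similarity(first, unrelated))
```

Spectra that share peak positions and relative intensities score close to 1, while spectra without any shared peak score 0, matching the range given in the text.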
Figure 2
Tests on healthy human
skin and benign human skin disorders. (A–G)
Characterization of a normal skin region on the forearm of a healthy
adult volunteer, sampled adhesively four times in a row: (A–D)
3D laser scanning microscopic images from the four samples, (E) microscopic
height profile of the third sample, (F) microscopic image after trypan
blue staining (10 min immersion in 0.02% trypan blue) of a freshly
collected sample, and (G) mass fingerprints from the four samples.
(H,I) Analysis of skin surface mass fingerprints from three skin conditions,
i.e., normal healthy skins, skin moles, and dry-itchy dehydrated skin,
derived from different healthy volunteers at different body sites:
(H) averaged mass fingerprint at each skin condition (shaded area:
interquartile range) and (I) hierarchical clustering using cosine
similarity and average linkage, correlated with the fingerprint peak
intensity profiles.
The methodology is able
to distinguish between normal healthy skin
and benign skin disorders, as demonstrated by testing 66 skin regions
from nine healthy adult volunteers. The volunteers, aged between 25
and 63, came from different countries in Asia, Europe, and South America.
The test was conducted on three common skin conditions, i.e., normal
healthy skin, skin moles, and dehydrated skin, at different body sites,
i.e., hand, arm, neck, chest, waist, and leg (Table S1). A skin mole occurs when melanocytes, the pigment-producing
skin cells, grow in a cluster instead of spreading throughout the
epidermis. All the tested moles have been confirmed as benign by a
local dermatologist. The investigated dehydrated skins were characterized
by the symptoms of dryness, itching, and slight cracking. The dehydration
was mostly caused by the cold winter weather with low humidity, typical
in Alpine Switzerland where the volunteers currently live. Mass fingerprints
generated from all the investigated skins are shown in Figure S4, with the averaged spectrum (with interquartile
range) at each condition displayed in Figure 2H. In order to explore their intercorrelations,
hierarchical cluster analysis (HCA) was conducted using the spectral
cosine similarity. HCA is an unsupervised machine learning method
to group objects in such a way that the objects in the same group
are more similar to each other than to those in other groups. Results
showed that mass fingerprints derived from the same skin condition
were generally more similar to each other than to fingerprints from
other conditions, with corresponding characteristic peak distribution
profiles (Figure 2I,
with peak details in Table S2). Specifically,
mass fingerprints from normal healthy skin displayed mutual spectral
cosine similarity of 0.861(±0.058), with a 60POP common peak
rate (the ratio between the number of peaks detected with more than
60% of presence among the mass spectra and the averaged peak number
per spectrum) reaching 76% (34/45). This high similarity is consistent
with the fact that molecular compositions are generally conserved
among healthy humans.[19] The slight variations
among samples could come from the differences in age, gender, skin
tone, or body site (investigation in Figure S5). The occurrence of skin dehydration or skin moles changed the epidermal
constituents and thus made the mass fingerprints different, with the
similarity to normal healthy skins decreased to 0.299(±0.101)
and 0.519(±0.143), respectively. The dehydrated skin regions
were generally similar to each other in morphology and shared mutual
similarity as high as 0.890(±0.054), with a 60POP common peak
rate of 76% (35/46). The mutual similarity among the skin moles was
lower, i.e., 0.664(±0.216), with a 60POP common peak rate of
66% (31/47). Some moles even displayed higher similarity to their
corresponding normal skins than to the moles from other individuals.
This could be explained by their difference in morphology, with the
lateral diameter ranging from 1 to 5 mm and the color increasing from
light brown to dark black.
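The 60POP common peak rate defined above can be written out as a short calculation. The sketch below assumes that the spectra have already been aligned, so that each spectrum is a set of shared peak labels; the alignment step, the function name, and the example data are hypothetical.

```python
from collections import Counter


def common_peak_rate(spectra, min_presence=0.60):
    """Ratio of peaks present in more than `min_presence` of the spectra
    to the average number of peaks per spectrum.

    `spectra` is a list of sets of aligned peak labels (an assumption).
    """
    counts = Counter(label for peaks in spectra for label in peaks)
    common = sum(1 for c in counts.values() if c / len(spectra) > min_presence)
    mean_peaks = sum(len(peaks) for peaks in spectra) / len(spectra)
    return common / mean_peaks


# Hypothetical aligned spectra: labels 1, 2, and 3 occur in more than 60%.
spectra = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 4, 6}, {1, 2, 3, 7}, {1, 3, 4, 8}]
print(common_peak_rate(spectra))

# The rates quoted in the text follow from the reported counts.
print(round(34 / 45 * 100), round(35 / 46 * 100), round(31 / 47 * 100))
```

In the hypothetical data, label 4 occurs in exactly 60% of the spectra and is therefore not counted, because the definition requires more than 60% presence.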
Detection and Monitoring of Malignant Skin
Disorder in Mice
Feasibility of the methodology was further
demonstrated through
a systematic in vivo study on a malignant skin disorder, i.e., melanoma-type
skin cancer. Because of its propensity for lethal metastasis and therapeutic
resistance, melanoma is one of the deadliest forms of human malignancy.[20] Due to ethical considerations, the study was
conducted on mouse models, instead of on human patients directly.
The malignancy was induced using a well-developed genetically engineered
method, by treating Tyr::CreER;BRaf;Pten transgenic mice with 4-hydroxytamoxifen (4-HT) topically on the
back skin.[21,22] This cancer model carried the
most common genetic mutation in human melanoma and had the potential
of metastatic outgrowth. By controlling the tumor growth time, the
mice developed skin cancer at different progression stages, including
(i) pretumor lesions with submillimeter-sized black speckles, (ii)
radial growth phase tumors with cancer cells confined in the epidermis
and epidermal–dermal junction, (iii) vertical growth phase
tumors with cancer cells invading the dermis, and (iv) metastatic
tumors with cancer cells invading lymph nodes and distant sites.[23]

Adhesive sampling was applied to the 4-HT-treated
skin regions after optimization of the sampling procedure (Figures S6 and S7). Representative mass fingerprints
generated throughout the disorder progression are shown in Figure 3. Among them, Figure 3A (more in Figure S8) was from healthy controls. Figure 3B,C was from the
fourth week after 4-HT administration, when the cancer was successfully
initiated in the skin with the development of submillimeter-sized
black speckles. The speckle number varied among mice, implying the
individual difference in the cancer initiation dynamics. Figure 3D–F was from
the sixth week, when small tumor nodules formed, with the superficial
surface area between 10 and 24 mm² among mouse individuals. Figure 3G,H came from the
seventh week, when the tumor size increased to 50–70 mm². Figure 3I–L
came from the ninth week, when the tumors reached 130–182 mm² and started metastasis, as indicated by the emergence of
new tumor nodules located more than 15 mm away from the primary tumors.[24] Along with the disorder progression, the fingerprint
patterns changed gradually, with difference observed on peaks like
Peak 1 (4690 m/z), Peak 2 (7493 m/z), Peak 3 (7811 m/z), Peak 4 (8712 m/z),
Peak 5 (9366 m/z), Peak 6 (9979 m/z), Peak 7 (10,165 m/z), Peak 8 (10,598 m/z), Peak 9 (10,871 m/z), Peak 10
(11,546 m/z), Peak 11 (14,975 m/z), and Peak 12 (15,615 m/z) (±1000 ppm mass tolerance). These peaks
will be discussed in more detail later. Due to the accumulation
of melanin pigment produced by the cancer cells, the lesions were
typically blue-black in color (representative photos in Figure S9). Compared to the healthy controls,
the collected skin cells displayed obvious differences in size and
shape when the tumor size was larger than 70 mm² (microscopic
images in Figure S10). The sampling from
the tumor lesions was estimated to collect cancer cells together with
the surrounding texture, like the resident and infiltrating host cells,
secreted factors, and extracellular matrix proteins.
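The ±1000 ppm mass tolerance quoted for the labeled peaks translates into an m/z window that widens with mass, for example about ±4.7 at 4690 m/z and about ±15.6 at 15,615 m/z. The sketch below assigns an observed peak to the nearest labeled peak within that window. The peak positions are taken from the text; the function names and the nearest-match rule are assumptions made for illustration.

```python
# Labeled peak positions (m/z) as listed in the text.
MARKER_PEAKS = {
    1: 4690, 2: 7493, 3: 7811, 4: 8712, 5: 9366, 6: 9979,
    7: 10165, 8: 10598, 9: 10871, 10: 11546, 11: 14975, 12: 15615,
}


def within_ppm(observed_mz, reference_mz, ppm=1000.0):
    """True when the observed m/z lies within +/- ppm of the reference."""
    return abs(observed_mz - reference_mz) <= reference_mz * ppm * 1e-6


def assign_marker(observed_mz, markers=MARKER_PEAKS, ppm=1000.0):
    """Return the label of the closest marker within tolerance, else None."""
    hits = [
        (abs(observed_mz - mz), label)
        for label, mz in markers.items()
        if within_ppm(observed_mz, mz, ppm)
    ]
    return min(hits)[1] if hits else None


print(assign_marker(4693.0))   # 3 m/z away from Peak 1, inside the window
print(assign_marker(4700.0))   # 10 m/z away from Peak 1, outside the window
print(assign_marker(15600.0))  # 15 m/z away from Peak 12, inside the window
```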
Figure 3
Representative mass fingerprints
during malignant skin disordering.
The skin conditions are (A) healthy, (B,C) lesions with different
numbers of black speckles (B_F, B_M), and (D–L) tumor lesions with superficial surface area of 10–182
mm² (T10, T15, T24, T50, T70, T130, T140, T168, T182). Mass peaks associated with the malignancy growth
are labeled in order of their m/z values. All spectra were obtained from the fourth samplings on the
skin regions treated with 4-HT.
An automated result interpretation, using data learning tools,
showed that the disorder could be detected at early stages with high
confidence. Figure 4A displays the correlations among the skin surface mass fingerprints
during the disorder progression, obtained from hierarchical clustering
based on fingerprint cosine similarity (cluster distance in Figure S11, similarity scores in Table S3). The similarity among all healthy skins
was higher than 0.960, with a mean value of 0.981(±0.007). When
the cancer developed, the mass fingerprints diverged
from the healthy cluster and the similarity decreased, specifically to
0.865(±0.013) for the very early lesions with submillimeter-sized
speckles and 0.153(±0.059) for the 10–182 mm² tumors. At early stages with speckles only, although the similarity
to healthy skin was not as low as that from tumors, the fingerprint
peaks were considerably different from the healthy ones with 35% (32/91
peaks) newly detected or increased in intensity and 9% (8/91 peaks)
disappeared or decreased in intensity (absolute log2 fold change
≥1, paired t test P value
≤0.05) (Figure 4B, peak list in Table S4). Many of these
differing peaks were detected at overall low intensity and thus made
a limited contribution to the fingerprint cosine similarity. If compared
in a way that increases the weighting of peak presence but reduces
or eliminates the weighting of peak intensity, the differentiation
between the speckle lesions and healthy skins is expected to be easier.[25] Relative Euclidean distance
and the Jaccard index were proposed as such measures, with the similarity
scores decreased to 0.326(±0.049) and 0.446(±0.058), respectively
(Figure 4C). Both scores
were significantly lower (t test P values 4 × 10⁻⁷⁰ and 9 × 10⁻⁷⁰) than those of their counterparts among the healthy skins, i.e.,
0.679(±0.072) and 0.810(±0.070), respectively (more investigation
in Figures S12 and S13). To further evaluate
the detection performance, the mass fingerprints were subjected to
machine learning classification. Fingerprints from altogether 51 skin
samples were investigated, 18 healthy and 33 cancerous (including
speckle and tumor lesions at different progression stages). After
a total of 84 classifiers from commonly used learning models were
screened (on a Weka machine learning workbench),[26] 28 of them produced satisfactory results, with the detection
accuracy (correct classification rate) higher than 0.98, the kappa
statistic (a metric comparing observed accuracy with expected accuracy)
higher than 0.96, and other performance measures like F-measure (the
weighted harmonic mean of precision and recall) and ROC area (area
under the receiver operating characteristic curve) higher than 0.98
(10-fold cross-validation) (Figure 4D, Table S5). Notably, 12
of these classifiers even provided 100% correct detection (true positive
rate 1, false positive rate 0, for both healthy and cancerous skin)
with all the performance measures reaching 1. They were MultiClassClassifier
(MCC), KStar, IB1, Winnow, SMO, Logistic, LibSVM, HNB, BayesianLogisticRegression,
WAODE, AODEsr, and AODE. These classifiers could be the top choices
for a diagnosis in the future. Such an outcome also demonstrated the
robustness of the proposed data collection procedure for skin disorder
detection, as it showed no strong dependence on the choice of classification
algorithms for the data interpretation. The limit of detection for
the skin cancer was thus displayed as early as the occurrence of submillimeter-sized
black speckles. Such a detection limit is much better than many existing
noninvasive methods, for instance, the widely used “ABCDE”
dermoscopic examinations often requiring cancer lesions larger than
6 mm in diameter (∼30 mm² area).[27]
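The argument that a presence-weighted measure separates speckle lesions from healthy skin more easily than cosine similarity can be illustrated with a toy comparison. The sketch below uses the standard Jaccard index on peak presence and a cosine score on aligned peak intensities. The peak labels and intensities are hypothetical, and the relative Euclidean distance measure is not reproduced because its exact definition is given only in the cited reference.

```python
import math


def jaccard_index(peaks_a, peaks_b):
    """Shared peaks divided by all peaks observed in either spectrum."""
    a, b = set(peaks_a), set(peaks_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine(spec_a, spec_b):
    """Cosine similarity for spectra given as {peak_label: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[k] * spec_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical spectra: the lesion keeps the dominant healthy peaks and
# gains three new peaks of low relative intensity.
healthy = {"p1": 1.0, "p2": 0.8, "p3": 0.5}
speckles = {"p1": 1.0, "p2": 0.8, "p3": 0.5, "n1": 0.05, "n2": 0.05, "n3": 0.05}

print(cosine(healthy, speckles))         # dominated by the intense shared peaks
print(jaccard_index(healthy, speckles))  # every new peak counts equally
```

In this toy case the low-intensity new peaks barely move the cosine score, whereas the Jaccard index drops to one half, which mirrors the reasoning given for the speckle-stage lesions.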
Figure 4
Data interpretation by analysis of the mass fingerprints on the
whole. The skin disorder underwent the following progression stages:
healthy state (Healthy), black speckle lesions (Speckles), early tumors (Early T), medium-sized
tumors (Medium T), and metastatic tumors (Metastatic T). (A) Hierarchical clustering of the skin surface
mass fingerprints using cosine similarity and average linkage. (B)
Comparison of the relative intensity of each peak between Healthy and Speckles, with the t test P value plotted versus the magnitude of log2 fold change.
(C) Fingerprint similarity among Healthy, or between Healthy and Speckles, scored by the algorithm
of relative Euclidean distance, Jaccard index, or cosine correlation
(mean value bar, data dot distribution, mean difference ΔM between the two groups of data sets). (D) Performance of
28 machine learning algorithms for the detection of the malignant
skin disorder, evaluated by Kappa statistic, accuracy, F-measure,
and ROC area (weighted average value among classes). (E) Principal
component analysis based on the first three principal components,
with a 95% confidence ellipsoid applied to each progression stage.
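Two of the classifier performance measures used here, accuracy and the kappa statistic, can be computed directly from a confusion matrix. The sketch below is a generic implementation of Cohen's kappa, not the Weka code used in the study. The class sizes (18 healthy, 33 cancerous) come from the text, while the matrix with one misclassified sample is hypothetical.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples of true class i that were
    predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Expected agreement by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(n)
    ) / total ** 2
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined in this case
    return observed, (observed - expected) / (1.0 - expected)


# Rows: true healthy, true cancerous. Columns: predicted healthy, cancerous.
perfect = [[18, 0], [0, 33]]
one_error = [[18, 0], [1, 32]]  # hypothetical: one cancerous sample missed

print(accuracy_and_kappa(perfect))
print(accuracy_and_kappa(one_error))
```

Kappa corrects the observed accuracy for the agreement expected by chance, so it is more conservative than accuracy when the classes are unbalanced, as they are here.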
More than detection of the disorder, progression
of the disorder
over time was also monitored with convenience by the new methodology.
As shown in Figure 4A, along with the disorder development, the skin lesions with speckles,
early tumors (10–24 mm²), medium-sized tumors (50–70
mm²), and further metastatic tumors (130–182 mm²) deviated gradually from the healthy cluster, with cosine
similarity to the healthy decreased to 0.865(±0.013), 0.205(±0.082),
0.113(±0.014), and 0.134(±0.010), respectively. It was also
found that lesions at the same or nearby progression stages had more
similar mass fingerprints compared to the lesions at stages far away
from each other. For instance, the 10 mm² early skin tumor
displayed fingerprint cosine similarity of 0.888(±0.012) to other
early tumors (10–24 mm²), while its similarity to
the earlier stage (speckles), the later stage (medium-sized tumors),
and further to the metastatic tumors was decreased to 0.542(±0.095),
0.776(±0.011), and 0.537(±0.118), respectively. Such findings
were further confirmed by principal component analysis (PCA), another
type of unsupervised machine learning. PCA explains the variation
of a large number of original responses (here, the hundreds of mass
fingerprint peaks) using a smaller number of factors (here, the first
three principal components PC0, PC1, and PC2). Due to the dimensionality
reduction, it is possible to reveal small differences from the fingerprint
peaks. The PCA plot of the mass fingerprints from different progression
stages is displayed in Figure 4E (PCA scores in Table S6), with
the grouping result confirmed through a statistical analysis of the
PCA scores by one-way analysis of variance (ANOVA) (Figure S14 and Table S7). It was observed that the 95% confidence
ellipsoids of different tumor stages gradually separated along with
the disorder progression. This indicates that the proposed platform
could monitor skin disordering over time and predict the disorder
progression stage through pattern matching of the fingerprints. Note
that the PCA plot showed a partial overlap between the nearby stages
of progression, for instance, between early tumors and medium-sized
tumors or between medium-sized tumors and metastatic tumors. This
could be explained by a gradual change of the molecular composition
in the tumor microenvironment along with the disease progression.
The variance in tumor growth dynamics among mouse individuals could
also make an impact.

In addition to analysis of the fingerprint
pattern on the whole,
the building of marker mass peaks also facilitated the disorder detection
and monitoring. To select peaks differentiating the cancerous from
the healthy skins, Fisher’s exact test was applied after
a comparison of the fingerprint peak location, number, and relative
intensity (r.int.) (details in Figure S15). The top five most relevant peaks were found to
be Peaks 1, 2, 3, 5, and 9 labeled previously in Figure 3. As statistically validated
in Figure 5A, each
of them was detected from all the tumor lesions, especially from all
the early tumors, with a considerable intensity (r.int. 0.19 ± 0.09, 0.12 ± 0.04, 0.15 ± 0.09, 0.92 ±
0.16, 0.36 ± 0.20, respectively), but not detectable or detected
with an extremely low intensity from healthy skins (r.int. 0, 0, 0, 0.02 ± 0.02, 0, respectively) (P values ≤0.001). When only these peaks were considered, the
skin cancer was also detected with high confidence, especially at
early stages, using machine learning models selected from Figure 4D (Figure 5B for the detection of speckle
lesions, Figure 5C
for all cancerous lesions). Similarly, some of the fingerprint peaks
were found to be useful for monitoring the disorder progression. They
were Peaks 3, 4, 7, 8, 10, 11, and 12 labeled in Figure 3. As statistically validated
in Figure 5D, intensities
of these peaks increased significantly (P values
≤0.05) along with the tumor growth, and reached a considerable
level at the metastatic stage (mostly more than quadrupled from early
tumors). They allowed a confident classification among the cancer
progression stages, as investigated with the same machine learning
models used in Figure 5B,C (Figure 5E). By
tracking the marker peaks, noting their presence, absence, or intensity
changes, it is possible to make a quick prediction of the disorder
occurrence, seriousness, and development trend.
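Fisher's exact test, used above to select marker peaks, can be sketched for a single peak as a test on a 2 × 2 presence table (cancerous or healthy versus peak detected or not detected). Framing the selection as a presence/absence table is an assumption made for illustration, as the exact table construction is described only in the supporting figure. The counts below are hypothetical apart from the group sizes of 33 cancerous and 18 healthy samples.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at least as unlikely as the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Hypothetical peak detected in all 33 cancerous and none of 18 healthy skins.
print(fisher_exact_two_sided(33, 0, 0, 18))

# Hypothetical peak detected equally often in both groups.
print(fisher_exact_two_sided(5, 5, 5, 5))
```

A peak that is present in every cancerous sample and absent from every healthy one gives a very small P value, whereas a peak found equally often in both groups gives a P value of 1.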
Figure 5
Data interpretation by
tracking marker peaks. (A–C) Marker
mass peaks helpful for skin disorder detection, with the relative
peak intensity statistically compared among Healthy, Early T, and all tumors by one-way ANOVA (A);
performance of these marker peaks for the classification between Healthy and Speckles (B) or between Healthy and all cancerous lesions (C), using machine learning
classifiers. (D,E) Marker mass peaks useful to monitor the cancer
progression, with the relative peak intensity statistically compared
among Early T, Medium-sized T, and Metastatic T by one-way ANOVA (D); performance of these
marker peaks for the classification among cancer lesions at different
progression stages (i.e., Speckles, Early
T and Medium-sized T, and Metastatic
T) using machine learning classifiers (E). Star rating: *P value ≤0.05, **P value ≤0.01,
***P value ≤0.001. In (B,C,E), machine learning
algorithms with top performance in Figure D were selected for the classification. Classification
performance measures: Kappa statistic, accuracy, F-measure, and ROC
area.
Molecular identities of the above
marker peaks were tentatively
clarified through correlation with top-down proteomic data obtained
from the excised skin tissues. The peak assignment is feasible as
both MALDI-TOF and top-down proteomics measure intact proteins with
the signal intensity highly related to the protein abundance.[28] Accordingly, Peak 1 could be assigned to thymosin
β-4, an actin monomer-binding protein impacting cell mobility
by regulating the cell adhesion.[29] Peak
2 possibly represents vimentin, the only type of intermediate filament
contained by melanoma cells to maintain the cell integrity and a suggested
biomarker for melanoma diagnosis.[30] Peak
3 could originate from LINE-1 retrotransposable element ORF1 protein,
whose expression is a hallmark of many types of malignancy.[31] Peaks 4 and 10 might be from actin β and
actin γ cytoplasmic 1, highly expressed in skin cancers to regulate
the cell proliferation, motility, and migration.[32] Peak 5 is likely from ATPase inhibitory factor 1, overexpressed
to promote cancer cell survival under temporary anoxic conditions
possibly by preserving cellular ATP despite mitochondrial dysfunction.[33] The S100 calcium binding proteins including S100a8 (Peak 7) and S100b (Peak 8) are
involved in many phenotypic features of cancer cells and are widely
used as skin cancer diagnosis markers.[34] Peak 9 is possibly from heat shock protein 1, released from malignant
cells to the extracellular space to form a fostering environment beneficial
for the tumor growth.[35] Peaks 11 and 12
could be from hemoglobin subunit α and β-2, whose level
in solid tumors can be influenced by the hypoxic tumor microenvironment
due to aberrant vascularization and poor blood supply.[36] Peak assignment information was given in Data File S1.
As a validation of the above
detection results, quantitative proteomics
was conducted with the excised skin tissues.[37] Compared to the healthy controls, the tumor lesions had more than
30% of proteins differentially expressed, including ∼28% up-regulated
and ∼8% down-regulated (Figure A by spectral counting, Figure S16A by label-free quantification, more details in Data File S2). It clearly showed that the skin
protein composition was significantly changed when the disorder happened.
This coincided with changes of the skin surface mass fingerprints,
demonstrating the capability of the new method for skin state investigations
at the molecular level. Among the up-regulated proteins, we found
six biomarkers widely used for the diagnosis and prognosis of melanoma.
They were premelanosome protein (Pmel), tyrosinase
(Tyr), S100 calcium binding protein B (S100b), chondroitin sulfate proteoglycan 4 (Cspg4), fibronectin
1 (Fn1), and melanoma cell adhesion molecule (Mcam) (Figure B, Figure S16B, and Data File S3).[38] This confirmed
the investigated tumor lesions as melanoma. Notably, among these biomarkers, S100b was also observed on the skin surface mass fingerprints
as Peak 8 (10,598 m/z), exclusively
from the tumor lesions and not from the healthy controls, further showing the
feasibility of the new methodology. The other biomarkers were not
observed on the fingerprints as they were large proteins exceeding
the 2000–20,000 m/z mass
recording window.
Figure 6
Result validation via quantitative proteomic analysis.
Quantitative
proteomics was performed with the excised mouse skin tissues using
Scaffold proteome software. Three tumor lesions (70, 130, and 182
mm²) were analyzed as representatives, with the quantitative
value (i.e., normalized total spectral count) of each identified protein
statistically compared to the healthy controls. (A) Comparison of
all the protein quantitative values between tumorous and healthy skins,
with the paired t test statistical significance (P value) plotted versus the magnitude of log2 fold change.
(B) Comparison of quantitative values from six well-established melanoma
biomarkers between tumorous and healthy skins (mean value bars, median
value lines, data dot distribution; star rating: *P value ≤0.05, **P value ≤0.01, ***P value ≤0.001).
Extension Test on Adjacent Nondisorder Skin on Mice
Along
with the tumor growth, mass fingerprints of the adjacent nontumor
skin regions (located ∼15 mm away from the tumor lesions) were
also found to change. At early stages with tumor size smaller
than 70 mm², the adjacent nontumor skin mass fingerprints
resembled the healthy controls with cosine similarity higher than
0.900 (representatives in Figure A, Healthy, T50, T70). The skin appeared healthy, displaying
color and texture similar to the healthy controls. However, the skin became slightly
bluish and stiffer when the tumor started to metastasize, and the mass
fingerprints differed from the healthy ones, with a marked decline
of the dominant peak at 9979 m/z (representatives in Figure A, T130, T168, T182). The fingerprint difference could be explained by the collection
of fewer epidermal cells due to the increase of skin stiffness. During
the tumor growth, the cancer cells produced a high content of melanin
pigment, which could be obtained by the surrounding skin cells through
intercellular transfer and cell fusion.[39] Intracellularly, the melanin is present in the form of microsized
granules, i.e., melanosomes, which are quite hard to deform.[40] The accumulation of melanosomes thus led to
the increase of stiffness and the mildly bluish color observed on
the adjacent regions. The increased stiffness could reduce cell adhesion
tendency to the sampling discs, and therefore, fewer cells were collected
(microscopic images in Figure S17). This
phenomenon might also be used as a sign to indicate the cancer metastasis.
A deeper investigation was not included here. Notably, although different
from healthy controls, the adjacent nontumor skin fingerprints remained
distinct from the tumor lesions with cosine similarity lower than
0.200 (Figure S18).
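The cosine similarity thresholds quoted above (higher than 0.900 versus healthy controls, lower than 0.200 versus tumor lesions) can be made concrete with a minimal sketch. It assumes the standard cosine definition applied to fingerprints that have already been aligned onto a common list of peak positions. The exact formulation used in the study is described in its Supporting Information, and the intensity vectors below are invented for illustration only.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two aligned relative-intensity vectors.

    Position i of both vectors must refer to the same aligned peak;
    an absent peak is encoded as intensity 0.
    """
    if len(u) != len(v):
        raise ValueError("fingerprints must be aligned to the same peak list")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty fingerprint is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


# Hypothetical relative intensities on four aligned peak positions.
healthy = [1.0, 0.2, 0.0, 0.1]
adjacent_early = [0.9, 0.25, 0.0, 0.1]
tumor = [0.05, 0.1, 1.0, 0.4]

print(cosine_similarity(healthy, adjacent_early))  # close to 1: similar pattern
print(cosine_similarity(healthy, tumor))           # close to 0: distinct pattern
```

Because the score depends on both which peaks are present and how intense they are, the loss of a single dominant peak is enough to lower it substantially.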
Figure 7
Extension tests on adjacent
skins and comparison test with blood.
(A) Mass fingerprints from adjacent nontumor skins (located around
15 mm away from the tumor lesions) when the tumor size was 50, 70,
130, 168, or 182 mm². (B,C) Analysis of the skin malignancy
by a blood test, i.e., detection of blood-circulating exosomes using
MALDI-TOF mass spectrometry: (B) hierarchical clustering based on
spectral cosine similarity and average linkage and (C) principal component
analysis by the first three principal components with 95% confidence
ellipsoids applied to the malignancy progression stages. (D,E) Comparison
of the blood test and the new methodology for the differentiation
among the disorder progression stages, (D) by scoring spectral cosine
similarities (mean value bar and label, median value line, data dot
distribution) or (E) by statistical analysis of the principal component
scores (mean difference bar, P value ≤0.05
for significant difference).
Comparison Test with Blood in Mice
The relevance of
this new methodology for the detection and monitoring of skin disorders
was further demonstrated through comparison with a blood test. Blood
testing is frequently used in clinics due to the rich bioinformation
in blood and the easy sampling procedure. For the study of malignancy
like cancer, blood-circulating exosomes are the “hot”
target analytes.[41] They participate in
malignancy progression in a variety of ways and are not difficult
to isolate.[42] Here, the blood exosomes
derived throughout the skin cancer growth were analyzed by MALDI-TOF
MS to correlate with the detection results from the new methodology.
Along with the cancer growth, the exosomal mass fingerprints also
changed gradually (Figure S19), with the
different mass peaks mostly coming from commonly used blood biomarkers
for malignancy prediction, like complement component 3, fibrinogen
α chain, haptoglobin, serum amyloid A, and hemoglobin subunits
(Data File S4).[43,44] The differentiation results between cancerous and healthy mice according
to the exosomal fingerprints, through either HCA (Figure B, cluster distance in Figure S20, cosine similarities in Table S8) or PCA (Figure C, scores in Table S9), were generally consistent with the results from the new methodology.
This once again validated the reliability of the new method. Comparing
the two approaches, the new method was found to perform better in
differentiating between stages of the cancer progression.
It generated mutual cosine similarity of 0.601(±0.147), 0.721(±0.162),
and 0.862(±0.063) among the early tumors, medium-sized tumors,
and metastatic tumors, whereas the mutual similarity from the blood
test was much higher, being 0.942(±0.008), 0.957(±0.007),
and 0.964(±0.009), respectively (Figure D). This was also indicated by the PCA plot,
as the new method produced less overlapping of the 95% confidence
ellipsoids (Figure E versus Figure C)
and larger statistical difference among the principal component scores
(Figure E, Figure S14 versus Figure S21, Table S7 versus Table S10) among the tumor stages. Considering
the analyte location, a large proportion of blood background components
produced by various types of body cells could suppress the signals
from the skin disorder and thus limit the blood test sensitivity.
The new method directly investigated the skin, where the disorder
arose, and therefore provided much more specific bioinformation to
assist the disorder analysis. This argument should also hold
true for other types of skin malignancy.
Discussion
A methodology
for skin monitoring has been developed here, through
the coordination of skin surface adhesive sampling, MALDI-TOF mass
recording, and algorithm-directed data interpretation. As demonstrated
above, the methodology allows a routine investigation of skin state,
an early detection of skin disorder, and a dynamic monitoring of the
disorder progression. Compared to existing skin analysis strategies,
this new approach shows advantages in five aspects: (i) noninvasive
due to the mild adhesive sampling, (ii) time-saving with each step
taking only a few minutes, (iii) high-throughput owing to the hundred-spot
mass spectrometer target, (iv) highly sensitive with skin malignancy
detectable as early as the occurrence of submillimeter-sized lesions,
and (v) allowing automated data interpretation to achieve highly accurate
analysis, e.g., detection of the skin malignancy with 100% accuracy.
Regarding the sampling methods in dermatological studies, skin
biopsy is the state-of-the-art procedure; however, it is often painful,
risky, and constrained by surgery scheduling. Whenever allowed by
a dermatologist, the noninvasive adhesive sampling can partially replace
skin biopsy to provide a patient-friendly diagnosis. In order to provide
a reproducible detection, the sampling protocol should be kept consistent,
as parameters like the materials of sampling discs and the sampling
repetition times could influence the quality of the collected skin
samples. Researchers have made continuous efforts to make the process
user-friendly, with the distribution of commercial kits of high quality
and low price, like the kits from DermTech and D-Squame (used here).
The MALDI-TOF mass fingerprinting stands out from other analytical
strategies due to its speed and simplicity. It can analyze a large
number of samples in a short time without complicated experimental
operations. It is readily compatible with the adhesive sampling, where
the collected skin samples can be analyzed straightforwardly without
any sample pretreatment. This label-free measurement helps the collection
of sample information on the whole without bias. The mass fingerprinting
procedure can also be coupled with mass spectrometry imaging to investigate
the skin from more aspects, for instance, description of spatial distribution
of diagnosis-relevant molecules or detection of the skin lesion boundaries.
By algorithm-directed data analysis, the methodology achieved a
quick and accurate result interpretation. For the scoring of mass
fingerprint similarity, we employed here mostly the algorithm of cosine
correlation, which considers both the peak presence and intensity.
It generally provided satisfactory detection results for the skin
cancer at various stages of progression. To explore minor intersample
differences, for instance, study of very early skin disorder with
minor abnormality or differentiation between nearby stages of disorder
progression, an algorithm with increased weighting of peak presence
and decreased weighting of peak intensity could generate better results.
Jaccard index and Euclidean distance were proposed as such measures
(Figures S12, S13, and S22). Other often-used
measures like Pearson’s correlation were not investigated here,
but they might be options. The machine learning models used here were
extracted from the Weka algorithm library. For the mouse skin cancer,
the learning models listed in Figure performed best with the new method. A deeper comparison
of currently available learning models using much larger skin disorder
data sets is going to be made in the future to select the most appropriate
one or combination to reach the most reliable data interpretation.
A dedicated algorithm could also be realized to reach an easy tracking
of disease-characteristic peaks on the fingerprints, following an
example recently reported for the analysis of liquid chromatography
data.[45] On the whole, the algorithm-directed
data learning helps to reach automated result interpretation, reducing
the heavy reliance on skilled physicians with prior clinical experience.
One more concern is about the protein assignment of the mass fingerprint
peaks. We made the assignment through correlation with top-down proteomic
data by matching the peak mass with the detected protein mass. This
procedure is much more reliable than directly matching the peak mass
with the theoretical protein molecular weight provided by databases
like UniProt. A protein detected from a biological sample is not always
in its theoretical full length as recorded in databases, due to a
variety of cellular processes and possibly experimental operations.
The assignment outcome matched well with a previous proteome study
on melanoma tissue biopsies, showing the rationality of this procedure.[46] Nevertheless, this assignment is still at a
tentative level, as the sample preparation and measurement process
of MALDI-TOF mass fingerprinting are different from the proteomic
analysis. A more precise procedure might be using high-resolution
MALDI tandem mass spectrometry to conduct in situ bottom-up proteomics.
In this work, melanoma was used as a skin disease model. In clinics,
it is often an important task to differentiate between early melanoma
and skin moles, due to their similarity in physical appearance. The
differentiation is, in principle, possible using the new methodology,
as the protein profile of melanoma cells is considerably different
from that of the melanocytes forming the moles.[47] This is also indicated by a preliminary comparison test between
human melanoma cells and skin moles (Figure S23). Generally, it is promising to apply the new methodology to a broad
range of skin conditions, including other types of skin cancer like
mycosis fungoides, basal cell carcinoma, and squamous cell carcinoma,
as well as skin diseases with different pathogenesis like wound (e.g.,
burn), allergy (e.g., atopic dermatitis), infection (e.g., leprosy,
chancroid), keratinocyte hyperproliferation (e.g., psoriasis), and
so on. This is supported by a recent finding that the skin molecular
composition is specific to the disease type and pathogenesis, after
investigation of 311 skin biopsy samples related to 16 common types
of skin diseases.[48] It has also been reported
that the molecular signature is changeable along with the disease
progression or treatment.[49] This indicates
that the methodology we developed could be used widely for skin analysis,
helpful to explore physiological and pathological aspects of various
skin states.
The implementation of this methodology in practice
could be facilitated
by building a comprehensive skin surface mass fingerprint database
from common skin conditions. In order to reach this goal, we are currently
applying for permission to test on human patients through collaboration
with local dermatologists. The database could be integrated with proper
similarity measures, machine learning tools, and peak tracking algorithms
to achieve a straightforward data analysis on a single software platform.
The construction of such a database could be labor-intensive, but
it will greatly promote the result interpretation once completed.
Currently, MALDI-TOF mass spectrometers have been widely installed
in clinical laboratories, opening doors for a routine application
of this methodology. The rapid, high-throughput, and patient-friendly
detection manner also provides the opportunity to build personalized
skin care in the future.
Materials and Methods
Ethics Statement
The studies on mice were performed
in accordance with Swiss federal regulations and procedures approved
by veterinary authority of Canton Vaud. D-Squame brand adhesive sampling
discs employed in this work have passed human skin safety tests and
obtained the European Certification (CE mark). Samples from healthy
human skin surfaces were subjected to MALDI-TOF MS measurements only,
without manipulation of the genetic materials. All biological wastes
were deposited properly according to the biosafety rules issued by
École Polytechnique Fédérale de Lausanne.
Development of Mouse Model Skin Cancer
The skin cancer
was induced using a genetic engineering method, by treating Tyr::CreER;BRaf;Pten transgenic mice with 4-hydroxytamoxifen
(4-HT) topically on the back skin. This cancer development procedure
has been described previously by Dankort et al. in detail.[21,22] The mice harbored a 4-HT-inducible Cre recombinase-estrogen
receptor fusion transgene (CreER) under the control
of melanocyte-specific tyrosinase (Tyr) promoter.
The mice also carried conditional alleles of BRaf (BRaf) and Pten (Pten). First, 1.5 μL of 4-HT (50 mg·mL⁻¹ in dimethyl sulfoxide) was applied topically on the back skin of
the mice at 3 weeks old, using a small paint brush. Activation of CreER by 4-HT led to the melanocyte-specific expression
of BRAF and the
silencing of PTEN. The BRAFV600E
mutation is the most common genetic alteration in human melanoma,
and the silencing of the tumor suppressor gene PTEN promotes
malignant progression. By controlling the tumor growth time, mice
with melanoma at different progression stages were obtained.
Skin Sampling
On healthy human volunteers, the skin
regions to be sampled were washed with soap, rinsed with running water,
and left to dry naturally. D-Squame adhesive sampling discs were used
for the sampling. The disc was composed of transparent polyester film
and an acrylic adhesive layer. The adhesive side was attached to a
suspicious skin region. A mild lateral pressure was applied for 10
s to achieve a good adhesion between the adhesive layer and the skin
surface. The boundary of the suspicious region was outlined on the
polyester film using a watercolor pen. The disc was then stripped
off the skin surface by holding one corner. The sampling on mice was
performed by a similar procedure, after removing the skin hair carefully
using a razor blade and cleaning the skin surface gently using a wipe
containing 70% ethanol. Each skin region was sampled multiple times
in a row (e.g., four times), using a new disc each time. The multiple
samplings helped to ensure reliability of the detection result. The
obtained samples were either analyzed immediately or stored at 4 °C
for later analysis within 24 h. For mass spectrometry measurements,
the sampling discs carrying collected epidermis cells were attached
to the MALDI target using a Scotch double-sided tape (3M Science,
USA), with the cell-collection side facing up. The MALDI matrix was
pipetted to cover the outlined sample region and left to dry at room
temperature to form cell–matrix cocrystals. The target was
then loaded into the mass spectrometer for measurement.
MALDI-TOF MS Measurements
Sinapinic acid, 15 mg·mL⁻¹ in 50/49.9/0.1% (volume percentage) acetonitrile/water/trifluoroacetic
acid, was used as the MALDI matrix. All measurements were conducted
on a Bruker microflex LRF MALDI-TOF mass spectrometer under linear positive
mode with delayed extraction. The positive mode benefits the detection
of cellular proteins as they have high proton affinities with the
tendency to be ionized through protonation.[13] The delayed extraction involves a time delay (400 ns here) between
the laser pulse and the ion-accelerating voltage, providing time-of-flight
compensation for ion velocity spread to improve mass resolution.[50] The instrumental parameters were set as laser
intensity 70%, laser attenuator with 30% offset and 40% range, laser
frequency 20.0 Hz, detector gain 20×, suppress up to 1000 Da,
mass window 2000–20,000 m/z. The mass window lower than 2000 m/z suffered from interference from the matrix, and the window higher
than 20,000 m/z had the distribution
of only low-abundant peaks due to the low ionization efficiency of
large proteins from the whole intact cells. Mass calibration was conducted
with an aqueous solution containing cytochrome c,
myoglobin, and protein A (1 mg·mL⁻¹ for each).
For each skin sample, the mass spectrum was obtained with 5 ×
250 laser shots throughout the sample region to reduce the possible
impact from “sweet” spots and to generate a panorama
of the whole skin region.
Protein Assignment for MALDI-TOF MS Peaks
Skin surface
mass fingerprint peaks were tentatively assigned to proteins by mass
matching with the skin tissue top-down proteomic data. Most ions detected
by MALDI-TOF MS were singly charged, and thus the protein mass was
[(m/z)peak top –
1] under linear positive detection mode. This mass corresponded to
the molecular average mass, obtained by summing the average atomic
mass of each constituent element. For each protein identified by top-down
proteomics, the average mass was calculated according to the measured
protein sequence, using a peptide mass calculator like PeptideSynthetics.
In this regard, the average mass measured by MALDI-TOF MS was correlated
with the average mass obtained from top-down proteomics. Considering
the mass resolution of linear TOF, a 1000 ppm tolerance was allowed for
the mass matching. If one MALDI-TOF MS peak could be correlated with
multiple proteins in the top-down proteomic data according to the
mass value, the peak was then assigned to the protein detected with
the highest number of matching fragments with an E value (expectation value) lower than 0.0001, i.e., the most abundant
protein detected with high confidence around that mass.
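The assignment rule described above can be sketched in a few lines. The sketch assumes singly charged ions, uses the same rounded proton mass of 1 as the text, and applies the stated 1000 ppm tolerance and E value cutoff of 0.0001. The `Candidate` record, the function names, and the example protein entries are hypothetical placeholders, not entries from the study's top-down proteomic data.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    """One protein from a (hypothetical) top-down proteomics result list."""
    name: str
    average_mass: float   # Da, calculated from the measured sequence
    n_fragments: int      # number of matching fragments
    e_value: float        # expectation value of the identification


def protein_mass_from_peak(mz_peak: float) -> float:
    """Neutral mass of a singly protonated ion, using the rounded value m/z - 1."""
    return mz_peak - 1.0


def within_ppm(measured: float, reference: float, tol_ppm: float) -> bool:
    """True if the relative mass deviation is within tol_ppm parts per million."""
    return abs(measured - reference) / reference * 1e6 <= tol_ppm


def assign_peak(mz_peak: float, candidates: Sequence[Candidate],
                tol_ppm: float = 1000.0, max_e_value: float = 1e-4) -> Optional[str]:
    """Return the confident candidate with the most matching fragments, or None."""
    mass = protein_mass_from_peak(mz_peak)
    hits = [c for c in candidates
            if within_ppm(mass, c.average_mass, tol_ppm) and c.e_value < max_e_value]
    if not hits:
        return None
    return max(hits, key=lambda c: c.n_fragments).name


# Hypothetical candidates around the 10,598 m/z peak mentioned in the text.
candidates = [
    Candidate("protein_A", 10600.0, 12, 1e-8),
    Candidate("protein_B", 10590.0, 30, 1e-10),
    Candidate("protein_C", 10700.0, 50, 1e-12),  # outside the mass tolerance
    Candidate("protein_D", 10595.0, 40, 1e-2),   # fails the E value cutoff
]
print(assign_peak(10598.0, candidates))
```

At about 10,600 Da, a 1000 ppm tolerance corresponds to roughly ±10.6 Da, which is why several candidates can fall inside the window and a tie-breaking rule is needed.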
Data Analysis
The MALDI-TOF mass spectra were processed
using Mass-Up software for smoothing (moving average), baseline correction
(Snip algorithm), peak picking (MALDIquant algorithm, signal-to-noise
ratio 3.0, half-window size 60), and peak alignment (Forward algorithm,
1000 ppm mass tolerance).[51] Data analysis
by machine learning, including the unsupervised learning of HCA, PCA,
and the supervised learning of classification, was conducted in the
Weka machine learning environment integrated into the Mass-Up platform.
Mass spectral similarity scoring was conducted on the BacteriaMS software
platform, using the algorithm of Jaccard index, relative Euclidean
distance, intensity-weighted Euclidean distance, or cosine correlation.
These algorithms were explained in the Algorithm Description section
in the Supporting Information. All of the
statistical analyses were conducted via a hypothesis t test (for two groups of data) or ANOVA (for three or more groups
of data) using the OriginLab software, with the statistical significance
conditioned by a P value ≤0.05.
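To show how a presence-based similarity measure differs from an intensity-based one, the following sketch computes a Jaccard index between two peak lists. It assumes the standard definition (shared peaks divided by all distinct peaks) and reuses the 1000 ppm tolerance from the peak alignment step to decide whether two m/z values are the same peak. The formulation actually used in BacteriaMS is given in the Supporting Information and may differ in detail; the function name and peak lists here are hypothetical.

```python
def jaccard_index(peaks_a, peaks_b, tol_ppm=1000.0):
    """Jaccard index of two m/z peak lists, ignoring peak intensities.

    Two peaks count as shared when their m/z values agree within tol_ppm.
    Matching is done greedily on the sorted lists, one-to-one.
    """
    a, b = sorted(peaks_a), sorted(peaks_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        tolerance = a[i] * tol_ppm * 1e-6
        if abs(a[i] - b[j]) <= tolerance:
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    union = len(a) + len(b) - shared
    # Two empty peak lists carry no information; score them as 0 by convention.
    return shared / union if union else 0.0


# Hypothetical peak lists (m/z values only).
sample_1 = [2000.0, 5000.0, 9979.0]
sample_2 = [2001.0, 9979.0, 12000.0]
print(jaccard_index(sample_1, sample_2))
```

Because intensities are ignored, a weak peak that appears or disappears changes the score as much as a dominant one, which is the property that makes such measures attractive for subtle intersample differences.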