Mattias Ohlsson (1,2), Thomas Hellmark (3), Anders A Bengtsson (4,5), Elke Theander (6), Carl Turesson (5,6), Cecilia Klint (7), Christer Wingren (8), Anna Isinger Ekstrand (8)
1. Computational Biology & Biological Physics, Department of Astronomy and Theoretical Physics, Lund University, Sölvegatan 14A, Lund SE-221 00, Sweden.
2. Center for Applied Intelligent Systems Research (CAISR), Halmstad University, Halmstad SE-301 18, Sweden.
3. Department of Clinical Sciences Lund, Nephrology, Skåne University Hospital Lund, Lund University, Lund SE-221 85, Sweden.
4. Rheumatology, Department of Clinical Sciences, Lund, Lund University, Lund SE-221 00, Sweden.
5. Department of Rheumatology, Skåne University Hospital, Lund and Malmö SE-214 28, Sweden.
6. Rheumatology, Department of Clinical Sciences, Malmö, Lund University, Malmö SE-221 00, Sweden.
7. Immunovia AB, Lund 223 81, Sweden.
8. Department of Immunotechnology, Lund University, Medicon Village, Scheelevägen 2, Lund SE-223 81, Sweden.
Abstract
Early and correct diagnosis of inflammatory rheumatic diseases (IRD) poses a clinical challenge due to the multifaceted nature of symptoms, which also may change over time. The aim of this study was to perform protein expression profiling of four systemic IRDs, systemic lupus erythematosus (SLE), ANCA-associated systemic vasculitis (SV), rheumatoid arthritis (RA), and Sjögren's syndrome (SS), and healthy controls to identify candidate biomarker signatures for differential classification. A total of 316 serum samples collected from patients with SLE, RA, SS, or SV and from healthy controls were analyzed using 394-plex recombinant antibody microarrays. Differential protein expression profiling was examined using the Wilcoxon signed rank test, and condensed biomarker panels were identified using advanced bioinformatics and state-of-the-art classification algorithms to pinpoint signatures reflecting each disease (raw data set available at https://figshare.com/s/3bd3848a28ef6e7ae9a9). In this study, we were able to classify the included individual IRDs with high accuracy, as demonstrated by the ROC area under the curve (ROC AUC) values ranging between 0.96 and 0.80. In addition, the groups of IRDs could be separated from healthy controls at an ROC AUC value of 0.94. Disease-specific candidate biomarker signatures and a general autoimmune signature were identified, including several deregulated analytes. This study supports the rationale of using multiplexed affinity-based technologies to reflect the biological complexity of autoimmune diseases. A multiplexed approach for decoding multifactorial complex diseases, such as autoimmune diseases, will play a significant role for future diagnostic purposes, essential to prevent severe organ- and tissue-related damage.
Inflammatory
rheumatic diseases (IRD) are heterogeneous syndromes
that are classified based on clinical phenotypes and key disease markers.
Systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), Sjögren’s
syndrome (SS), and antineutrophil cytoplasmic antibody (ANCA)-associated
vasculitis (SV) represent four IRDs, which, if left untreated, can
lead to severe and sometimes permanent disability, increased morbidity,
and premature mortality.[1,2] Diagnosis at an early
stage plays a crucial role in establishing proper disease monitoring
and enabling therapeutic interventions to prevent or minimize organ-
and tissue-related damage. However, clinical diagnosis remains a challenge
due to fluctuating symptoms over time, including a wide repertoire
of manifestations such as fatigue, joint and muscle pain, and inflammation
symptoms, which are shared among several IRDs, and also with other
conditions mimicking IRDs, e.g., infections, malignancies, etc. In
addition, a patient can be affected by more than one autoimmune disease
at the same time (such as concurrent Sjögren’s syndrome
in SLE and RA patients), which confers an increased risk of misdiagnosis
and/or underdiagnosis.[3−5] Current tools for clinical diagnosis include the
combined information generated from clinical, laboratory, and imaging
findings, where the presence of various autoantibodies, such as antinuclear
antibodies (ANA), anticyclic citrullinated peptides (aCCP), rheumatoid
factor (RF), ANCA (including antiproteinase 3 (anti-PR3) and antimyeloperoxidase
(anti-MPO)) antibodies, anti-double-stranded DNA antibodies (anti-dsDNA),
anti-Ro/SSA, and anti-La/SSB, constitutes important biomarkers in
the diagnostic routine of SLE, RA, SS, and SV.[6−9] However, a positive result for
an autoantibody may not be exclusive for one disease, and the use
of single markers has not reached the required high levels of
specificity.[5,10−12] Identification of new
blood-based biomarkers for correct and early diagnosis is of high
clinical relevance to enable early therapeutic interventions, thereby
saving both lives and cost for society. Considering that underlying
disease biology is still unclear, panels of disease-specific markers
may provide important insights on key disease-specific molecular alterations.
Previous studies have shown that high-performance proteomic technologies,
such as recombinant antibody microarrays, which offer a multiplexed
approach, reflect the complexity of multifactorial diseases better.[13−17] Using this approach, candidate biomarker panels indicative for SLE,
systemic sclerosis, and SLE disease activity have been identified.[15,17,18] The aim of this study was to
perform protein expression profiling of the IRDs SLE, RA, SS, and
SV and of healthy controls (H) and to identify candidate biomarker
signatures for classification. To this end, a total of 316 serum samples
collected from patients with autoimmune disease and from healthy controls
were analyzed on 394-plex antibody microarrays. Using this methodology,
we showed for the first time that classification of IRD could be achieved
at high accuracy. These results highlight the power of using a multiplexed
approach for decoding multifactorial complex diseases, such as IRDs,
which may play a significant role for future diagnostic purposes.
Experimental Procedures
Clinical Samples
This retrospective study included
a total of 316 serum samples collected from healthy controls (n = 77) and patients diagnosed with an IRD (n = 239). All samples were collected from Departments of Rheumatology
and Nephrology at Skåne University Hospital (Malmö or
Lund). Patients were diagnosed with either SLE (n = 39), RA (n = 45), SS (n = 73),
or SV (n = 82) and were considered, according to
their specific clinical criteria, to be in an active (n = 198) or inactive (n = 41) disease when samples
were collected. For SLE patients, disease activity was defined using
the SLEDAI-2000-score[19] (mean score 7,
range 1–19). All patients with RA had fulfilled the 1987 American
College of Rheumatology criteria for RA,[20] and had active disease, with a median CRP of 31 mg/L (interquartile
range 13–55). The majority of RA patients were positive for
anti-CCP (86%) and RF (84%). ANCA specificity in SV patients was defined
according to anti-MPO (n = 40) or anti-PR3 status
(n = 42) and clinical activity according to the BVAS
score.[21] All Sjögren samples were
collected from patients that fulfilled the 2002 American-European
Consensus Group criteria[22] for primary
SS. As controls, serum samples from healthy individuals with no previous
history of autoimmune disease were used. Within the IRD cohort, the
mean age was 59 years and the female to male ratio was 168:71, whereas
the mean age in healthy controls was 60 years and the female to male
ratio was 66:11 (Table 1). Ethical approval for the study was granted by the Regional Ethics
Review Board in Lund, Sweden.
Table 1. Demographic Data of the Patients Diagnosed with an Inflammatory Rheumatic Disease (SLE, RA, SS, or SV) and Healthy Controls (H)
| parameter | SLE | RA | SS | SV | H | total |
|---|---|---|---|---|---|---|
| no. of samples | n = 39 | n = 45 | n = 73 | n = 82 | n = 77 | n = 316 |
| female:male ratio | 33:6 | 32:13 | 71:2 | 32:50 | 66:11 | 234:83 |
| mean age, years (range) | 51 (29–77) | 65 (38–85) | 61 (24–85) | 60 (11–83) | 50 (18–81) | 60 (11–85) |
Columns SLE, RA, SS, and SV are the inflammatory rheumatic diseases; column H is the healthy controls.
Antibody Microarray Production and Analysis
A total
of 394 recombinant scFv antibodies were selected from in-house designed
phage display libraries[23,24] (Supporting Information Table S1). Of these, 379 of the scFv antibodies
were directed against 161 (mainly immunoregulatory) antigens. The
remaining 15 scFv antibodies were directed against 15 short amino
acid motifs (4–6 amino acids long), denoted CIMS antibodies[25] (Supporting Information Table S2). For some analytes, more than one scFv antibody
clone (n = 2–9) targeting different epitopes
was chosen to minimize the risk of impaired antibody activity caused
by epitope masking during the sample labeling process. All scFv antibodies
were produced according to standardized protocols in 15 mL of E. coli cultures and purified from the cell periplasmic
space using the MagneHis Protein Purification System (Promega, Madison,
WI) and a KingFisher96 Robot (Thermo Fisher Scientific, Waltham, MA).
Buffer exchange to PBS was performed using a Zeba 96-well desalt spin
plate (Pierce). Concentration and purity of the scFvs were determined
using a Nanodrop at 280 nm (NanoDrop Technologies, Wilmington) and
SDS-PAGE analysis (Invitrogen, Carlsbad, CA). Subarrays of 26 ×
28 spots were produced using a noncontact printer (SciFlexarrayer
S11, Scienion, Berlin, Germany). Briefly, single droplets
(300 pL) of scFv antibody solutions, PBS (blank) or biotinylated BSA
(position marker), were printed on black polymer Maxisorp slides (NUNC
A/S, Roskilde, Denmark) and allowed to adsorb to the surface. Antibody
microarrays were analyzed, as previously described.[26] In brief, biotinylated samples were added to individual
subarrays, and bound proteins were detected using Alexa-647-labeled
streptavidin. Slides were scanned at 635 nm using the LS Reloaded
laser scanner (Tecan) at a fixed laser scanning setting of 150 PMT
gain.
Data Preprocessing
Data preprocessing was performed
as follows. In brief, spot signal intensities were quantified using
Immunovia Quant software, v1.0 (Immunovia AB, Lund, Sweden). Signal
intensities with local background subtraction were used for data analysis.
Each data point represented the mean value of three technical replicate
spots, unless any replicate coefficient of variation (CV) exceeded 15%, in
which case the worst-performing replicate was eliminated and the average
value of the two remaining replicates was used. The data were normalized
using a two-step strategy. First, the data were normalized according
to the day-to-day variation using the “subtract by group mean”
approach, as previously described,[27] and
secondly, a modified semiglobal normalization was used to minimize
array-to-array variations. In this approach, 15% of the antibodies
displaying the lowest CV values over all samples were identified and
used to calculate a scaling factor, as previously described.[28,29] Quality control and visualization of potential outliers were performed
using Qlucore Omics Explorer 3.1 software (Qlucore AB, Lund, Sweden).
The raw array data set is available at https://figshare.com/s/3bd3848a28ef6e7ae9a9.
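The replicate handling and the second (semiglobal) normalization step described above can be illustrated with a minimal sketch. It is not the authors' implementation. Two details are assumptions made only for illustration: the "worst-performing" replicate is taken to be the one farthest from the replicate median, and the scaling factor of a sample is taken to be its summed signal over the most stable antibodies divided by the mean of that sum over all samples. The function names are hypothetical, and the day-to-day "subtract by group mean" step is not covered.

```python
import statistics


def summarize_replicates(spots, cv_limit=0.15):
    """Collapse technical replicate spots into a single intensity value."""
    mean = statistics.mean(spots)
    cv = statistics.stdev(spots) / mean  # coefficient of variation
    if cv <= cv_limit:
        return mean
    # Assumption: "worst-performing" = replicate farthest from the median.
    median = statistics.median(spots)
    worst = max(spots, key=lambda s: abs(s - median))
    kept = list(spots)
    kept.remove(worst)
    return statistics.mean(kept)


def semiglobal_normalize(data, fraction=0.15):
    """Scale each sample (row) using the antibodies (columns) with lowest CV.

    Assumption: scaling factor = (sum over stable antibodies in the sample)
    divided by (mean of that sum over all samples).
    """
    n_antibodies = len(data[0])
    cvs = []
    for j in range(n_antibodies):
        column = [row[j] for row in data]
        cvs.append(statistics.stdev(column) / statistics.mean(column))
    n_stable = max(1, round(fraction * n_antibodies))
    stable = sorted(range(n_antibodies), key=cvs.__getitem__)[:n_stable]
    sums = [sum(row[j] for j in stable) for row in data]
    mean_sum = statistics.mean(sums)
    return [[value / (s / mean_sum) for value in row]
            for row, s in zip(data, sums)]


if __name__ == "__main__":
    print(summarize_replicates([100, 102, 98]))   # CV below 15%: plain mean
    print(summarize_replicates([100, 104, 200]))  # CV above 15%: outlier dropped
    print(semiglobal_normalize([[10, 20, 30, 40], [20, 40, 60, 80]]))
```

Background-subtracted intensities can be zero or negative, in which case a CV is not meaningful; a production pipeline would need an explicit rule for such spots.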
Data Analysis
A schematic outline of the experimental
analysis process is shown in Figure 1. For differential analysis, leave-one-out
cross-validation (LOO CV), and signature development, one group (H,
SLE, RA, SS, or SV) was set against the remaining groups. Analysis
A in Figure 1 refers
to the identification of a general IRD signature, where healthy controls
(H) were set against the IRDs, meaning H versus SLE+RA+SS+SV. When
performing analysis within the IRD group (Figure 1B–E), each individual disease was
set against the group of the remaining three diseases, as follows:
(B) SLE versus RA+SS+SV, (C) RA versus SLE+SS+SV, (D) SS versus SLE+RA+SV,
and (E) SV versus SLE+RA+SS.
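The five one-versus-rest comparisons (A–E) can be written down compactly. The sketch below is only an illustration of the grouping logic; the function names and the 1/0 label encoding are assumptions, not part of the original analysis.

```python
DISEASES = ["SLE", "RA", "SS", "SV"]


def build_comparisons(diseases=DISEASES, control="H"):
    """Return {analysis letter: (target group, list of groups it is set against)}."""
    comparisons = {"A": (control, list(diseases))}  # H versus all IRDs
    for letter, target in zip("BCDE", diseases):
        rest = [d for d in diseases if d != target]
        comparisons[letter] = (target, rest)        # one IRD versus the other three
    return comparisons


def binary_labels(sample_groups, target, rest):
    """Return (sample index, label) pairs: 1 for the target group, 0 for the rest.

    Samples that belong to neither side (e.g., healthy controls in
    analyses B-E) are skipped.
    """
    return [(i, 1 if group == target else 0)
            for i, group in enumerate(sample_groups)
            if group == target or group in rest]


if __name__ == "__main__":
    for letter, (target, rest) in build_comparisons().items():
        print(letter, target, "versus", "+".join(rest))
```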
Figure 1
Schematic outline of the antibody microarray
process applied on
serum samples from systemic lupus erythematosus (SLE), rheumatoid
arthritis (RA), Sjögren syndrome (SS), ANCA-associated vasculitis
(SV), and healthy controls (H). For each analysis (Wilcoxon, leave-one-out
cross-validation, and signature development), each group was set against
the remaining samples, i.e., (A) H versus SLE+RA+SS+SV, (B) SLE versus
RA+SS+SV, (C) RA versus SLE+SS+SV, (D) SS versus SLE+RA+SV, and (E)
SV versus SLE+RA+SS.
Up or downregulated proteins
were identified using Wilcoxon signed
rank test (q < 0.05) and p-values
were adjusted with the Benjamini and Hochberg method.[30] Venn diagrams were created at http://bioinformatics.psb.ugent.be/webtools/Venn/. For supervised classification analysis, a linear support vector
machine (SVM) (cost parameter = 1) combined with a LOO CV algorithm
was used to evaluate the predictive performance of a model. In the
LOO CV procedure, one sample was removed, and the remaining samples
were used to train the model. The left-out sample was then used to
test the model, and the process was repeated until every sample had
been used as a test sample. A decision value for each excluded sample
was generated, corresponding to the distance to the hyper plane and
a receiver operating characteristic (ROC) curve was constructed. The
area under the curve (AUC) was then calculated and used as a measure
of the prediction performance of the classifier.
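The Benjamini and Hochberg adjustment, the leave-one-out loop, and the AUC computation can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions, not the authors' code. In particular, the classifier below is a simple centroid-based linear scorer used as a stand-in; the study used a linear SVM (cost parameter = 1), for which the decision value is the distance to the hyperplane. All function names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def leave_one_out(X, y, fit, decide):
    """Return one decision value per sample, each from a model trained without it."""
    values = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        values.append(decide(model, X[i]))
    return values


def roc_auc(decision_values, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [d for d, t in zip(decision_values, labels) if t == 1]
    neg = [d for d, t in zip(decision_values, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Stand-in linear classifier (assumption: NOT the linear SVM used in the study).
def fit_centroids(X, y):
    def centroid(label):
        rows = [x for x, t in zip(X, y) if t == label]
        return [sum(col) / len(rows) for col in zip(*rows)]
    return centroid(1), centroid(0)


def decide_centroids(model, x):
    c1, c0 = model
    # Signed score along the line joining the two class centroids.
    return sum((a - b) * (xi - (a + b) / 2) for a, b, xi in zip(c1, c0, x))


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
    y = [0, 0, 0, 1, 1, 1]
    scores = leave_one_out(X, y, fit_centroids, decide_centroids)
    print(roc_auc(scores, y))
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Because every decision value comes from a model that never saw the corresponding sample, the resulting AUC estimates predictive performance rather than goodness of fit on the training data.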
Identification of Disease-Specific Signatures
To define
a condensed biomarker signature for the differential profiling analysis,
a ranking procedure combined with two levels of K-fold cross-validation
loops was used (Supporting Information Figure S1). For each individual analysis (H versus SLE+RA+SS+SV etc.),
the output was a list of proteins ranked according to how important
they were in classification analysis. In short, in the first level
of K-fold cross-validation, the ranking of the proteins was defined
using an inner loop. Here, the data set was randomly divided into
training and validation set 15 times and then repeated 5 times (5-fold
cross-validation strategy). In the end, proteins were ranked according
to their average importance, resulting in a ranking list. In the next
level of K-fold cross-validation, an outer loop was used to test the
biomarker signatures of a given length. The final condensed biomarker
signature, of a given size, was then assembled using all ranking lists
analyzed in the outer loop. For details, see the Supporting Information Materials and Methods section and Figure S1.
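The full procedure is specified in the Supporting Information and is not reproduced here. As a rough illustration of the final step only, the sketch below assembles a signature of a given size from several ranking lists by averaging each antibody's position across the lists. The averaging rule, the function name, and the toy antibody names are assumptions for illustration; the authors' exact aggregation rule may differ.

```python
from collections import defaultdict


def consensus_signature(ranking_lists, size):
    """Pick the `size` antibodies with the best (lowest) mean rank position.

    Assumption: every ranking list contains the same set of antibodies,
    ordered from most to least important.
    """
    positions = defaultdict(list)
    for ranking in ranking_lists:
        for position, antibody in enumerate(ranking):
            positions[antibody].append(position)
    mean_position = {ab: sum(p) / len(p) for ab, p in positions.items()}
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(mean_position, key=lambda ab: (mean_position[ab], ab))
    return ordered[:size]


if __name__ == "__main__":
    # Toy ranking lists, one per outer-loop fold (hypothetical values).
    lists = [
        ["C3", "IL-6", "C4"],
        ["IL-6", "C3", "C4"],
        ["C3", "C4", "IL-6"],
    ]
    print(consensus_signature(lists, size=2))
```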
Results
The aim
of this study was to perform differential protein expression
profiling of IRDs and healthy controls and to identify condensed biomarker
signatures for disease classification. To this end, a total of 316
serum samples, collected from healthy controls (n = 77) and patients diagnosed with SLE (n = 39),
RA (n = 45), SV (n = 82), or SS
(n = 73), were analyzed on 394-plex antibody microarrays.
One sample collected from a patient with Sjögren’s syndrome
was removed from analysis due to technical reasons. One antibody,
targeting Keratin-19, failed during the printing process and was removed
from further analysis, though two clones targeting the same antigen
remained. Altogether, a total of 315 samples and 393 antibodies were
used for final data analysis, differential profiling, and signature
development. Visualization of the data set in Qlucore revealed no
differences in relation to array block, sample labeling day, assay
day, or scanning positions, suggesting that any technical variations
had successfully been removed during normalization.
Differential Protein Expression Profiling of Healthy and Autoimmune Serum Samples
In the first step of analysis, we wanted to
investigate if a signature reflecting IRD (including SLE, RA, SS,
and SV) could be identified. Altogether, we were able to demonstrate
that the IRD samples could be distinguished from healthy controls
and that a biomarker signature indicative of IRD could indeed be
identified. Using an SVM analysis combined with LOO CV, including
all antibodies (n = 393), the IRDs could be separated
from healthy controls with an ROC AUC value of 0.94 (Figure 2A). Since LOO CV analysis utilizes
all antibodies for classification, further analysis was performed
to investigate whether healthy and autoimmune samples could be classified
using a smaller set of antibodies. Using a ranking procedure (see Methods section), the 40 best-performing antibodies
were selected (Supporting Information Table S3), which were able to classify IRD and healthy controls with a predictive
AUC value of 0.93. These results clearly show that these IRDs can
be differentiated from healthy controls using a protein signature,
which could potentially pave the way for a diagnostic test of IRDs.
Figure 2
(A) ROC
curve including AUC value generated from leave-one-out
cross-validation analysis on healthy versus autoimmune diseases (SLE,
RA, SS, and SV). (B) Heatmap from supervised analysis including the
top 18 differentially expressed analytes, represented by 25 scFv clones
(Wilcoxon analysis q < 0.05) between healthy (yellow
bars) and inflammatory rheumatic diseases (blue bars), which include
SLE, RA, SS, and SV. Individual clone suffixes are shown in brackets.
Next, we were interested in which analytes were
downregulated among
the IRDs. Using Wilcoxon, a total of 77 analytes, targeted by 114
antibodies, were found to be differentially expressed (q < 0.05) between IRDs and healthy controls. Among the upregulated
analytes, some of the most interesting immunoregulatory analytes included
apolipoprotein A1, IL-6, IL-12, TNF-α, IL-16, osteopontin, PRKCZ,
and DLG4, whereas antibodies targeting C3, IL-4, VEGF, KKCC1-1, and
SPDLY-1 were found among the downregulated analytes. A heatmap including
the top 25 antibodies and their corresponding analytes revealed some
separation of the two groups (Figure 2B, Supporting Information Table S4). Separation of IRD from healthy
controls could thus be achieved using two different approaches: the
Wilcoxon signed rank test, a nonparametric test that requires correction for multiple
testing, and K-fold cross-validation, a machine learning
procedure for estimating the prediction error. We therefore compared the
list of the top 25 antibodies with the 40-plex signature panel.
Some overlap was observed, including antibodies targeting
the analytes C3, C4, RPS6KA2, KCC2B-3C5, and UBC9. Altogether, these
results indicated that a general IRD signature may indeed be present,
involving upregulation of several analytes with immunoregulatory functions.
Differential Protein Expression Profiling of SLE, RA, SS, and SV
Considering that many autoimmune diseases display similar
symptoms, making clinical diagnosis challenging, we turned our focus
toward the IRDs (Figure 1). Herein, a total of four groups were formed as follows: (B) SLE
versus RA+SS+SV, (C) RA versus SLE+SS+SV, (D) SS versus SLE+RA+SV,
and (E) SV versus SLE+RA+SS. Leave-one-out cross-validation analysis,
including all antibodies, showed that classification of each
IRD type could be achieved at high accuracy, as shown by ROC
AUC values ranging from 0.96 to 0.80 (Figure 3). The best separation was achieved for SLE
with an ROC AUC value of 0.96 (Figure 3A) followed by SV and RA, which were classified at
ROC AUC values of 0.94 (Figure 3B) and 0.86 (Figure 3C), respectively, whereas SS demonstrated an ROC AUC value
of 0.80 (Figure 3D).
Figure 3
ROC curves
with AUC values generated from LOO CV analysis, representing
(A) SLE versus RA+SS+SV, (B) SV versus SLE+RA+SS, (C) RA versus SLE+SS+SV,
and (D) SS versus SLE+RA+SV.
Again, we were interested in whether the different groups could be separated
using shorter biomarker signatures. Condensed biomarker signatures
for SLE, RA, SS, and SV, respectively, were identified using the same
procedure, as described previously (Supporting Information Table S3). Herein, using the disease-specific
signatures, SLE was again found to be classified with highest accuracy
(AUC = 0.96) followed by SV (AUC = 0.94), SS (AUC = 0.80), and RA
(AUC = 0.79). Principal component analysis (PCA) plots of the obtained
condensed biomarker signatures are presented in Figure 4.
Figure 4
PCA plots of supervised analysis based on 40-plex
biomarker panels
representing SLE (A), SV (B), SS (C), and RA (D).
A closer look at these four disease-specific signatures revealed
that antibodies targeting analytes such as C3, C4, apolipoprotein
A1, and factor B were present on more than one list (Supporting Information Table S3). However, analytes unique for each
signature were also identified, such as Lewis x and TNF-α in SLE,
PRKCZ and PTK6 in RA, IL-8 and RANTES in SS, and C1q and IL-18 in
SV, which could indicate the presence of disease-specific markers.
Altogether, by applying 394-plex antibody microarrays interfaced with
stringent data analysis, 40-plex antibody signatures capable of classifying
the autoimmune diseases SLE, RA, SS, and SV at high predictive powers
were pinpointed.
To further explore the serum proteomes of SLE,
RA, SS, and SV,
differentially expressed analytes for each disease type were
identified (Wilcoxon, q < 0.05) (Supporting Information Table S4). In total, the highest number of differentially
expressed analytes was found for SV (n = 326 antibodies
targeting 160 analytes), followed in decreasing order by SS (n = 207 antibodies targeting 127 analytes), SLE (n = 127 antibodies targeting 85 analytes), and RA (n = 114 antibodies targeting 81 analytes).
Considering
the complexity of underlying molecular alterations
in IRD and that both common and disease-specific alterations would
be of interest, we investigated the amount of overlap. First, we investigated
the overlap at the antibody level, i.e., relating to the specific
clones, irrespective of which analytes they targeted. This revealed
a major overlap (Figure 5A), which was not surprising, considering the high number of antibodies
generated from the differential analysis.
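The difference between overlap at the antibody (clone) level and at the analyte level can be made concrete with plain set operations. The sketch below uses invented clone names and hit lists purely for illustration; they are not taken from Supporting Information Table S4, and the function names are hypothetical.

```python
def to_analytes(clones, clone_to_analyte):
    """Map a set of antibody clones to the set of analytes they target."""
    return {clone_to_analyte[clone] for clone in clones}


def unique_to_each(sets_by_group):
    """For each group, return the members found in no other group."""
    unique = {}
    for group, members in sets_by_group.items():
        others = set().union(
            *(m for g, m in sets_by_group.items() if g != group))
        unique[group] = members - others
    return unique


if __name__ == "__main__":
    # Hypothetical clone-to-analyte map and per-disease hit lists.
    clone_to_analyte = {
        "C3 (1)": "C3", "C3 (2)": "C3",
        "TNF-a (1)": "TNF-a", "IL-8 (1)": "IL-8",
    }
    hits = {
        "SLE": {"C3 (1)", "TNF-a (1)"},
        "SS": {"C3 (2)", "IL-8 (1)"},
    }
    # Antibody level: no clone is shared, so every hit looks disease-specific.
    print(unique_to_each(hits))
    # Analyte level: both diseases hit C3, so only TNF-a and IL-8 remain unique.
    by_analyte = {g: to_analytes(c, clone_to_analyte) for g, c in hits.items()}
    print(unique_to_each(by_analyte))
```

Since an analyte may be targeted by more than one clone, the two levels can give different pictures of what is shared and what is disease-specific, which is why both are reported.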
Figure 5
Venn diagrams representing
the overlap of variables generated from
differential analysis (Wilcoxon signed rank test, q < 0.05) for SLE, RA, SS, and SV. Since an analyte may be targeted
by more than one antibody, diagram (A) represents the overlap of antibodies,
whereas (B) represents the overlap on an analyte level. Disease-specific
analytes are outlined in (B). Diagram was created at http://bioinformatics.psb.ugent.be/webtools/Venn/.
A summary including the top 25
antibodies and their specific targets
for each disease is presented in Supporting Information Table S4. Out of those top 25 lists, most analytes
within SLE, RA, and SS were found to be upregulated (15, 21, and 25,
respectively), whereas the opposite, i.e., downregulation of most
analytes (n = 23), was observed in SV. Accordingly,
the overlap with the condensed biomarker signatures for respective
diseases was also investigated, which revealed some overlap. Altogether,
these results indicated that biologic events, including deregulation
of specific analytes for each disease type, could be identified, which
may indicate different pathogenetic routes and which could potentially
be used to further understand the complexity behind disease progression
and to develop further diagnostic tools.
Discussion
Autoimmune
diseases today pose a global health issue, affecting
millions of people around the globe, and there is an urgent need for
refined clinical tools for early and differential diagnosis.[31] Diffuse, general symptoms, such as fatigue,
inflammation, and joint pain, which also change in severity over time,
shared among several diseases, make clinical diagnosis challenging.
In this study, candidate biomarker signatures for the inflammatory
rheumatic diseases RA, SLE, SS, and SV were identified. Altogether,
the results showed that LOO CV analysis including all antibodies (n = 393) could accurately classify individual IRDs at AUC
values ranging between 0.96 and 0.80 (Figure 3). In addition, panels including 40 antibodies
could still classify the autoimmune diseases at high accuracy, with
AUC values ranging between 0.96 and 0.79 (Figure 4). These results show that using a multiplexed
approach to reflect the pathogenetic complexity in rheumatic disorders
looks very promising and is an avenue worth exploring further to identify
new targets for early and differential diagnosis of autoimmune diseases.
There is a clear need for better biomarkers
in autoimmune diseases. Blood-based biomarkers constitute a simple,
noninvasive approach, suitable both for biomarker discovery
and for the clinical setting, and constitute a major focus of research within the
autoimmune community. Although a few biomarkers have been
found as early manifestations of the disease, such as the presence
of antinuclear antibodies (ANAs) in SLE,[32,33] aCCP in RA,[34,35] and anti-SSA/B in SS,[36] many biomarkers display too low specificity
and/or sensitivity and are used one-by-one or too few in concert to
reflect the complexity of the disease.[16,37] Biomarkers
for differential diagnosis are difficult to identify and refined tools
for correct and early diagnosis are of urgent need to prevent severe
organ- and tissue-related damage. This study utilized an antibody
microarray platform targeting mainly immunoregulatory proteins, which
seems to have an advantage when it comes to identifying levels of
proteomic changes within systemic autoimmune disorders, as previously
demonstrated by the delivery of candidate biomarker signatures for
classification of SLE, systemic sclerosis, and SLE disease activity.[13,17,18,29,38,39]
Based
on classification analysis, SLE and SV were found to be the
ones most readily separated from the others (AUC = 0.96 and AUC =
0.94 respectively), while RA and SS were a bit more difficult to separate
(AUC = 0.86 and AUC = 0.80, respectively) (Figures 3 and 4). This may
partly be explained by the fact that Sjögren’s syndrome
may overlap in patients with SLE and RA, and similar pathogenic mechanisms
have been suggested.[3,4] To our knowledge, samples in this
study were collected from patients diagnosed with primary SS. RA,
which has the highest prevalence of the IRDs investigated in this
study, is a heterogeneous condition with complex pathogenesis. This
may be reflected by the overlap of the biomarker signature with that
of other disorders. Analyzing the serum proteome in patients with
primary but also secondary SS, RA, and SLE would indeed be of great
value for decoding underlying molecular pathways and of importance
from a diagnostic and therapeutic perspective.
The low number
of samples used in this study is a limiting
factor since an independent data set for validation was not included.
The use of supervised learning algorithms may pose a problem when
they are applied in small data sets due to the risk of overfitting,
which may lead to poor performance in new sample sets.[40,41] Considering this, the approach used for feature extraction and subsequent
generation of condensed signatures in this study was carefully selected
to avoid the risk of overtraining. Ultimately, a short signature with
high predictive power may always be preferred from a logistical and
cost-effective view. However, there is always a trade-off between
the length of the signature and performance, which is why in this
first study, we compromised to include 40 antibodies in the final
consensus list. Also, the high number of antibodies most likely reflects
that pinpointed diseases do share similar pathogenic pathways, and
thus a higher number of antibodies for differential diagnosis may
be necessary from this perspective. This assumption may also be supported
by the major overlap of analytes observed from the differential analysis
(Wilcoxon) (Figure 5 and Supporting Information Table S4),
which further stresses the significance of larger data sets to achieve
even more stringent analysis.
Based on the differential protein
expression analysis, only a small
number of disease-specific analytes were found (Supporting Information Table S4, Figure 5B). The complement system is highly involved in the
pathogenesis of autoimmune diseases,[42] and
the major overlap of analytes may suggest similar molecular mechanisms
underlying disease progression in autoimmunity. Only one analyte,
UBE2C, was found uniquely in SS. UBE2C is a member of the ubiquitin-conjugating
enzyme family, which is involved in the destruction of
mitotic cyclins and in cell cycle progression.[43,44] Interestingly, Ro52 has previously been identified as an E3 ubiquitin
ligase, whose increased expression may lead to increased apoptosis
and promote autoreactivity as in the generation of Ro52 autoantibodies.[45] Compared to the other IRDs, most analytes were
found to be downregulated among SV samples, which could explain the
high number of differentially expressed analytes within this group.
The reason for this difference remains speculative,
but it may indicate that the underlying molecular events taking place
in systemic vasculitis are different from those in the other three diseases.
Further studies with larger sample sets stratified by disease phenotype
may help to clarify the underlying role of disease-specific analytes
and aid in the search for novel candidate biomarkers for therapeutic
strategies.

In this study, several analytes involved in the immunoregulatory
response were found to be deregulated among the IRDs compared to the healthy
controls (Supporting Information Table S4). One of the upregulated analytes was TNF-α, which has already
been shown to be a useful therapeutic target for treatment with biological
TNF inhibitors, especially in RA.[46,47] Other analytes
included the proinflammatory cytokine IL-6, which is also highly interesting
from a therapeutic perspective. Monoclonal antibodies that block the
IL-6 receptor have been shown to be effective in the treatment of
RA[48] and large vessel vasculitis.[49] The level of osteopontin has previously been
demonstrated to be elevated in SLE patients, which we could confirm
in this study. Osteopontin has been suggested to be associated with
SLE development and to be a potential marker for SLE activity and organ
damage.[50] Altogether, these data suggest
that a more general autoimmune signature may be present, including
several already known and novel markers that may play significant
roles in autoimmunity. In addition, the finding of a candidate biomarker
signature for distinguishing IRDs from healthy controls, which
is also supported by other studies,[15] further strengthens the potential of using our antibody microarray
platform for biomarker discovery in autoimmune diseases. A future
tool capable of functioning as a sensor for autoimmune diseases,
resulting in the referral of patients to the appropriate clinical unit, would
be of high significance for early and correct diagnosis.

The four systemic IRDs analyzed in this study were chosen because,
although they share some clinical symptoms and autoimmune
features, their phenotypes and, in particular, their long-term disease
courses differ substantially. In addition, three of them, namely SLE,
RA, and SS, are among the most common autoimmune diseases. SV is
less common but is associated with a poor prognosis if untreated.
In future studies, it would be interesting to include other
relevant types of immunological diseases and/or nonautoimmune inflammatory
conditions such as septic arthritis, scleroderma, multiple sclerosis,
and spondyloarthritis. Furthermore, samples from patients with early,
clinically undifferentiated disease should be investigated. This would
give an opportunity to identify more relevant markers for differential
diagnosis and would better reflect the everyday diagnostic challenge faced
in the clinic. A special focus is needed
on the clinical challenge of differentiating severe autoimmune
diseases from nonautoimmune inflammatory conditions, which could be
pivotal for early therapeutic interventions. Of note, effective
treatments are currently lacking for some IRDs, e.g., Sjögren’s
syndrome. Better diagnostics would open the way for new and better combinations
of therapy, which would decrease the risk of severe organ- and tissue-related
damage and increase the quality of life for the patients.

In
this study, we conclude that a general IRD biomarker signature
could be delineated and that individual IRDs could be classified with
high accuracy using a multiplexed microarray. These results, together
with previous studies,[15,17,18] suggest that a multiplexed approach is highly suitable
for decoding multifactorial diseases such as autoimmune diseases and
will play a significant role in future early diagnosis,
which is essential to prevent severe organ- and tissue-related damage.
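
The overfitting risk raised in the discussion, where supervised selection of a condensed signature from many analytes in a small sample set may not hold up on new samples, can be illustrated with a small synthetic experiment. The sketch below is not the procedure used in this study; it is a minimal illustration under stated assumptions: pure-noise data with no real class signal, a simple nearest-centroid classifier, and hypothetical helper names (`make_noise_data`, `top_k_features`, `cv_accuracy`). It compares cross-validated accuracy when the signature is selected once on all samples (so test samples leak into the selection) with accuracy when selection is repeated inside each training fold.

```python
import random


def make_noise_data(n_samples, n_features, rng):
    """Synthetic pure-noise data: no feature carries real class information."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced, alternating labels
    return X, y


def top_k_features(X, y, rows, k):
    """Pick the k features with the largest absolute class-mean difference,
    computed only from the samples listed in `rows`."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]

    def score(j):
        mean_pos = sum(X[i][j] for i in pos) / len(pos)
        mean_neg = sum(X[i][j] for i in neg) / len(neg)
        return abs(mean_pos - mean_neg)

    return sorted(range(len(X[0])), key=score, reverse=True)[:k]


def centroid(X, rows, feats):
    """Mean profile of the given samples over the selected features."""
    return [sum(X[i][j] for i in rows) / len(rows) for j in feats]


def predict(x, feats, c0, c1):
    """Nearest-centroid rule: return the class whose centroid is closer."""
    d0 = sum((x[j] - a) ** 2 for j, a in zip(feats, c0))
    d1 = sum((x[j] - b) ** 2 for j, b in zip(feats, c1))
    return 1 if d1 < d0 else 0


def cv_accuracy(X, y, k, n_folds, select_inside_fold):
    """Cross-validated accuracy of a k-feature signature.

    select_inside_fold=False selects features once on ALL samples (leaky);
    select_inside_fold=True repeats the selection on each training fold only.
    """
    rows = list(range(len(X)))
    global_feats = None if select_inside_fold else top_k_features(X, y, rows, k)
    correct = 0
    for fold in range(n_folds):
        test = [i for i in rows if i % n_folds == fold]
        train = [i for i in rows if i % n_folds != fold]
        feats = top_k_features(X, y, train, k) if select_inside_fold else global_feats
        c0 = centroid(X, [i for i in train if y[i] == 0], feats)
        c1 = centroid(X, [i for i in train if y[i] == 1], feats)
        correct += sum(predict(X[i], feats, c0, c1) == y[i] for i in test)
    return correct / len(X)


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_noise_data(n_samples=30, n_features=2000, rng=rng)
leaky = cv_accuracy(X, y, k=10, n_folds=5, select_inside_fold=False)
honest = cv_accuracy(X, y, k=10, n_folds=5, select_inside_fold=True)
print(f"signature selected on all samples: accuracy = {leaky:.2f}")
print(f"signature selected inside folds:   accuracy = {honest:.2f}")
```

Because the data contain no true signal, the fold-internal estimate is expected to stay near chance level, whereas the leaky estimate typically looks far better than chance. This is the kind of optimistic bias that an independent validation set, or selection confined to the training folds, is meant to expose.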