BACKGROUND: Idiopathic pulmonary fibrosis (IPF) is a chronic progressive fibrotic lung disease associated with substantial morbidity and mortality. The objective of this study was to determine whether there is a peripheral blood protein signature in IPF and whether components of this signature may serve as biomarkers for disease presence and progression. METHODS AND FINDINGS: We analyzed the concentrations of 49 proteins in the plasma of 74 patients with IPF and in the plasma of 53 control individuals. We identified a combinatorial signature of five proteins (MMP7, MMP1, MMP8, IGFBP1, and TNFRSF1A) that was sufficient to distinguish patients from controls with a sensitivity of 98.6% (95% confidence interval [CI] 92.7%-100%) and specificity of 98.1% (95% CI 89.9%-100%). Increases in MMP1 and MMP7 were also observed in lung tissue and bronchoalveolar lavage fluid obtained from IPF patients. MMP7 and MMP1 plasma concentrations were not increased in patients with chronic obstructive pulmonary disease or sarcoidosis and distinguished IPF from subacute/chronic hypersensitivity pneumonitis, a disease that may mimic IPF, with a sensitivity of 96.3% (95% CI 81.0%-100%) and specificity of 87.2% (95% CI 72.6%-95.7%). We verified our results in an independent validation cohort composed of patients with IPF, familial pulmonary fibrosis, or subclinical interstitial lung disease (ILD), as well as control individuals. MMP7 and MMP1 concentrations were significantly higher in IPF patients compared to controls in this cohort. Furthermore, MMP7 concentrations were elevated in patients with subclinical ILD and negatively correlated with percent predicted forced vital capacity (FVC%) and percent predicted carbon monoxide diffusing capacity (DLCO%). CONCLUSIONS: To our knowledge, our experiments provide the first evidence for a peripheral blood protein signature in IPF.
The two main components of this signature, MMP7 and MMP1, are overexpressed in the lung microenvironment and distinguish IPF from other chronic lung diseases. Additionally, increased MMP7 concentration may be indicative of asymptomatic ILD and reflect disease progression.
Idiopathic pulmonary fibrosis (IPF), a progressive fibrotic interstitial lung disease (ILD) with median survival of 2.5–3 y, is largely unaffected by currently available medical therapies [1]. The disease is characterized by alveolar epithelial cell injury and activation, fibroblast/myofibroblast foci formation, and exaggerated accumulation of extracellular matrix in the lung parenchyma. Recent studies employing high-throughput genomic technologies to analyze samples from IPF patients or genetically modified animals have highlighted the complexity of the pathways involved in the disease (reviewed in [2-4]). While these studies have improved the understanding of the molecular mechanisms underlying lung fibrosis, they did not translate well into the clinical arena.
Identification of peripheral blood biomarkers may facilitate the diagnosis and follow-up of patients with IPF as well as the implementation of new therapeutic interventions. Currently, establishing a diagnosis of IPF may require surgical lung biopsy in patients with atypical clinical presentations or high-resolution computed tomography (HRCT) scans. Patients with IPF are often evaluated by serial pulmonary physiology measurements and repeated radiographic examinations. These studies provide a general assessment of the extent of disease, but do not provide information about disease activity on a molecular level. Higher serum concentrations of surfactant proteins [5], KL-6 [6], FASL [7], CCL-2 [8], α-defensins [9], and most recently SPP1 [10] have been reported in patients with IPF and other ILDs, but most of these studies were modest in size and assayed only a single or a few protein markers simultaneously.
In this study, we used a multianalyte protein assay system to simultaneously measure concentrations of 49 plasma proteins, including cytokines, chemokines, growth and angiogenic factors, matrix metalloproteases (MMPs), and markers of apoptosis in a derivation cohort comprised of IPF patients and healthy controls. We identified a combinatorial signature of five proteins; of these, we measured concentrations of two metalloproteases, MMP7 and MMP1, in other chronic lung diseases and compared them to the levels observed in IPF patients. Finally, the potential role of MMP7 and MMP1 as IPF peripheral blood biomarkers was tested in an independent validation cohort.
Methods
For detailed description of the methods used in this study, see Text S1.
Initial IPF Derivation Cohort
This study included 74 patients with IPF evaluated at the University of Pittsburgh Medical Center. The diagnosis of IPF was established on the basis of published criteria [11], and surgical lung biopsy when clinically indicated [12] (see Text S1). Clinical data were available through the Simmons Center database. Smoking status was defined as previously described [13]. Samples from 53 control individuals were obtained from the pulmonary division sample collection core. Baseline demographic information is detailed in Table 1. The mean percent predicted forced vital capacity (FVC%) of IPF patients was 61.9 ± 20.8; mean percent predicted carbon monoxide diffusing capacity (DLCO%) was 42.1 ± 17.4.
Table 1
Derivation Cohort Patient Characteristics
Chronic Obstructive Pulmonary Disease
Plasma samples from 73 patients with chronic obstructive pulmonary disease (COPD) evaluated at the University of Pittsburgh were available for this study. Individuals were clinically stable at the time of examination, had tobacco exposure of at least ten pack years, and had no clinical diagnosis of rheumatologic, infectious, or other systemic inflammatory disease. Disease severity was measured using the GOLD classification as previously described [14]. The COPD cohort included 13 patients with GOLD class 0–I, 21 patients with GOLD II, and 39 patients with GOLD III–IV.
Sarcoidosis
Plasma samples from 47 patients with sarcoidosis evaluated at the University of Pittsburgh Medical Center were tested. Patients with lung disease (n = 29) demonstrated an average FVC% of 76.7 ± 22.1, and average DLCO% of 72.9 ± 25.5. The diagnosis and staging of disease was determined according to American Thoracic Society and European Respiratory Society criteria, as previously described [15,16].
Hypersensitivity Pneumonitis
Serum samples from 41 patients with subacute/chronic hypersensitivity pneumonitis (HP) and 34 patients with IPF evaluated at Instituto Nacional de Enfermedades Respiratorias in Mexico were available for this study. Diagnosis of IPF and HP has been previously described for this cohort [17,18]. Briefly, HP patients showed the following features: (a) antecedent bird exposure and positive serum antibodies against avian antigens; (b) clinical and functional features of ILD; (c) HRCT showing diffuse centrilobular poorly defined micronodules, ground glass attenuation, focal air trapping, and mild to moderate fibrotic changes; and (d) greater than 35% lymphocytes in bronchoalveolar lavage (BAL) fluid. Forty-four percent of the patients had a surgical lung biopsy; in all cases lung histology was consistent with the diagnosis of HP. The average FVC% was 60.3 ± 15.3 for HP and 59.1 ± 17.2 for IPF patients.
Independent Validation Cohort
Serum samples from 20 control individuals, eight patients with subclinical idiopathic ILD, 16 patients with familial pulmonary fibrosis, and nine with sporadic IPF, evaluated at the Warren Grant Magnuson Clinical Center of the National Institutes of Health (NIH), were available for this study. Patients with subclinical disease were first-degree relatives of patients with familial pulmonary fibrosis; they were asymptomatic, with normal pulmonary function tests but HRCT findings consistent with early ILD. Familial pulmonary fibrosis was defined as previously described [19]. Normal volunteers were used as controls.
These cohorts have been previously described by us [20,21]. Briefly, the mean FVC% values for patients with sporadic IPF and familial pulmonary fibrosis were 59.4 ± 19.7 and 75.7 ± 16.7, respectively. Eight patients with familial pulmonary fibrosis were diagnosed with early asymptomatic ILD using HRCT [21]; the mean FVC% in this group was 101.3 ± 10.1. Gender, age, ethnic origin, and smoking status for all groups are presented in Table 2.
Table 2
Validation Cohort Patient Characteristics
Lung tissue samples for microarray analysis were obtained through the University of Pittsburgh Health Sciences Tissue Bank as we previously described [22]. Twenty-three samples were obtained from surgical remnants of biopsies or lungs explanted from patients with IPF who underwent pulmonary transplant, and 14 control normal lung tissue samples were obtained from the disease-free margins, with normal histology, of lung cancer resection specimens. The morphologic diagnosis of IPF was based on typical microscopic findings consistent with usual interstitial pneumonia [12,23]. All patients fulfilled the diagnostic criteria for IPF outlined by the American Thoracic Society and European Respiratory Society [11].
All studies were approved by the Institutional Review Board at the University of Pittsburgh, the National Heart, Lung, and Blood Institute, or the National Institute of Respiratory Diseases, Mexico. Informed consent was obtained from all patients.
Blood Samples
Blood (45 ml) was drawn from participants using standardized phlebotomy procedures. Plasma or serum was separated by centrifugation, and all specimens were immediately aliquoted and frozen.
BAL
BAL was performed through flexible fiberoptic bronchoscopy as part of the diagnostic process, as we previously described [18,22,24]. Supernatants were kept at −70 °C until use. BAL samples from 22 IPF patients (age 62.2 ± 7.2 y) and ten normal controls (age 41.5 ± 5 y) were available for this study.
Multiplex Analysis
Assays were performed using Luminex xMAP technology (Luminex Corporation) in 96-well microplate format according to appropriate manufacturers' protocols (Invitrogen and R&D Systems), as previously described [25] and in Text S1.
Bead-Based Immunoassays
A 34-plex assay was performed for IL1A, IL1RA, IL1B, IL2, IL2R, IL4, IL5, IL6, IL7, IL8, IL10, IL12B, IL13, IL15, IL17, TNFA, IFNA, IFNG, GMCSF, EGF, VEGF, GCSF, FGF2, HGF, CXCL9, CXCL10, CCL2, CCL3, CCL4, CCL5, CCL11, TNFRSF1A, TNFRSF1B, and TRAIL-R2 (Invitrogen). MMP assays included MMP1, MMP2, MMP3, MMP7, MMP8, MMP9, MMP12, and MMP13 (R&D Systems). Assays for FAS, EGFR, FASL, Cyfra 21–1 (CKRT19 fragment), IGFBP1, and KLK10 were developed in our Pittsburgh Luminex Core Facility. The assays were validated as described [25].
ELISA
Quantitative sandwich enzyme immunoassays for human MMP1, MMP7, and AGER were performed as recommended by the manufacturer (R&D Systems).
Oligonucleotide Microarray Experiments
Detailed information is provided in Text S1. Briefly, total RNA was used as a template for synthesis of cDNA as recommended by the manufacturer of the arrays (Agilent Technologies). The cDNA was used as a template to generate Cy3-labeled cRNA that was used for hybridization on Agilent Whole Human Genome 4X 44K multipack arrays (Agilent Technologies). After hybridization, scanning, and feature extraction, data files were imported into a microarray database and linked with updated gene annotations using SOURCE (http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceSearch) and then normalized using cyclic LOESS [26]. Differentially expressed genes were identified using significance analysis of microarrays (SAM) [27]. Probes corresponding to the 49 protein markers were identified through their gene symbols. Expression levels for the probes that corresponded to these markers were extracted. In the case of redundant probes, those with the highest expression level and with the lowest Q-value were selected for presentation.
Statistical Analysis
A protein was considered differentially expressed when there was a change of at least 25% in concentration and statistical significance at p < 0.05 corrected for multiple testing. Data are reported as mean ± standard deviation. The Wilcoxon rank-sum test was used to identify potential biomarkers that univariately distinguish IPF samples from controls. For multiple testing the Bonferroni method was used to control the family-wise error rate at 5%. Data were analyzed using the R language for statistical computing (http://www.r-project.org/) [28]. Classification and regression trees (CART) methodology was used to identify potential combinations of peripheral blood biomarkers that could be used to distinguish IPF from controls. CART was performed using the rpart package for recursive partitioning. Classification performance was assessed using the ROCR package (http://rocr.bioinf.mpi-sb.mpg.de/). For oligonucleotide array data analysis, we applied SAM [27]. Data visualization and clustering were performed using Genomica (http://genomica.weizmann.ac.il/index.html) [29] and Spotfire Decision Site 9 (TIBCO).
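The differential-expression rule described above (a change of at least 25% together with Bonferroni-corrected significance) can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study. The function name, the reading of "change" as the relative difference in group means, and all example values are assumptions made for illustration.

```python
def is_differentially_expressed(mean_ipf, mean_control, p_raw, n_tests,
                                min_change=0.25, alpha=0.05):
    """Apply a change threshold and a Bonferroni-adjusted significance test.

    Assumption: "change" is the relative difference in group means,
    |mean_ipf - mean_control| / mean_control.
    """
    # Bonferroni adjustment: multiply the raw p-value by the number of tests
    # (capped at 1), which controls the family-wise error rate at alpha.
    p_adjusted = min(1.0, p_raw * n_tests)
    relative_change = abs(mean_ipf - mean_control) / mean_control
    return relative_change >= min_change and p_adjusted < alpha


# Hypothetical values for illustration only (not study data).
N_MARKERS = 49
large_change_small_p = is_differentially_expressed(3.0, 1.5, 1e-6, N_MARKERS)
small_change_small_p = is_differentially_expressed(1.1, 1.0, 1e-6, N_MARKERS)
large_change_weak_p = is_differentially_expressed(3.0, 1.5, 0.01, N_MARKERS)
print(large_change_small_p, small_change_small_p, large_change_weak_p)
```

With 49 markers, a raw p-value must fall below roughly 0.001 (0.05/49) to survive the correction, so the third hypothetical marker is rejected despite its large change.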
Results
Plasma Proteins Distinguish IPF Patients from Controls in Derivation Cohort
Of 49 markers analyzed, 48 are detectable in plasma (Figure 1A); univariate analysis identified 12 proteins that are differentially expressed in IPF compared to controls (Table 3). Five MMPs (MMP7, MMP1, MMP3, MMP8, MMP9), two chemokines (CXCL10, CCL11), FAS, IL12B, and the soluble TNF receptors (TNFRSF1A, TNFRSF1B) are significantly overexpressed; AGER is significantly underexpressed in plasma of patients with IPF compared to controls. MMP7 and MMP1, which have previously been shown to play a role in IPF pathogenesis, are the top-ranked proteins in univariate analysis (Table 3). Significant differences persist when age, gender, or smoking status is statistically controlled.
Figure 1
Peripheral Blood Proteins Distinguish IPF Patients from Controls
(A) Heatmap of proteins measured in the plasma of IPF and control patients. Columns, individual patients; rows, proteins. Every protein level was divided by the geometric mean of values for the same protein across all patients and log base 2 transformed. Increasing shades of yellow, increased; increasing shades of purple, decreased; gray, unchanged. Proteins were clustered using Genomica. Red vertical line, cluster of proteins increased in IPF; green vertical line, cluster of proteins decreased in IPF.
(B) Classification tree obtained by CART applied to plasma protein concentration data from IPF patients and controls. A blue box identifies a terminal node as control; a red box as IPF. All counts are listed as control/IPF. Concentrations are in ng/ml. In the subgroup with high MMP7 concentration but low MMP1 concentration (14 IPF samples, five control samples), splitting on IGFBP1 and TNFRSF1A improves classification, while in the subgroup with low MMP7, MMP8 improves classification.
(C) ROC curves for using each of five markers, or their combination, to classify samples as IPF or control. Sensitivity, or true positive rate, is plotted on the y-axis, and false positive rate, or 1 − specificity, on the x-axis. The area under each ROC curve is equivalent to the numerator of the Mann-Whitney U-statistic comparing the marker distributions between IPF and control samples. The magenta line labeled “Combined” is for the combinatorial classifier using all five markers. The identity line at 45 ° represents a marker that performed no better than classifying samples as IPF or control by flipping a fair coin.
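The relationship between the area under the ROC curve and the Mann-Whitney U statistic mentioned in the caption can be checked numerically. The sketch below, in Python with made-up concentrations, computes the area two ways: as U divided by the product of the group sizes, and by trapezoidal integration of the empirical ROC curve. Function names and data are hypothetical; the study itself used the ROCR package in R.

```python
def auc_mann_whitney(cases, controls):
    """AUC as U / (n_cases * n_controls), counting ties as one half."""
    u = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(cases) * len(controls))


def auc_trapezoid(cases, controls):
    """AUC by trapezoidal integration of the empirical ROC curve."""
    thresholds = sorted(set(cases) | set(controls), reverse=True)
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for t in thresholds:
        tpr = sum(x >= t for x in cases) / len(cases)
        fpr = sum(y >= t for y in controls) / len(controls)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker concentrations (ng/ml), for illustration only.
ipf_values = [2.5, 3.1, 1.8, 4.0, 2.2]
control_values = [1.0, 1.9, 1.2, 2.2, 0.8]
auc_u = auc_mann_whitney(ipf_values, control_values)
auc_roc = auc_trapezoid(ipf_values, control_values)
print(auc_u, auc_roc)
```

An area of 0.5 corresponds to the 45° identity line, that is, a marker no better than a coin flip.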
Table 3
Plasma Proteins That Distinguish IPF from Controls
To determine whether combinations of these plasma proteins correctly classify IPF patients, we applied recursive partitioning to the entire set of 49 markers and found that plasma protein profiles clearly distinguish IPF patients from normal controls. CART analysis showed that MMP7 and MMP1, in addition to being the two most significant biomarkers, are key components of a combinatorial classifier that also includes MMP8, IGFBP1, and TNFRSF1A (Figure 1B). Sensitivity and specificity of the classifier are 98.6% (95% confidence interval [CI] 92.7%–100%) and 98.1% (95% CI 89.9%–100%), respectively. High concentrations of MMP7 alone (≥1.99 ng/ml) correctly classify 69 of 74 IPF patients (93.2%) but incorrectly classify five normal samples as IPF and five IPF samples as controls, whereas the combination of high plasma concentrations of both MMP7 (≥1.99 ng/ml) and MMP1 (≥2.15 ng/ml) excludes all controls. Thus the combination of high MMP7 and high MMP1 concentrations can distinguish IPF patients from controls. Receiver operating characteristic (ROC) curves (Figure 1C) confirm that MMP7 is the best univariate classifier, although the combination of five markers performs somewhat better, as does the combination of MMP7 and MMP1 (unpublished data).
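The reported sensitivity and specificity are consistent with 73 of 74 IPF patients and 52 of 53 controls being classified correctly. These counts are inferred from the percentages and are not stated in the text. Under the further assumption that the intervals are exact binomial (Clopper-Pearson) intervals, the following Python sketch recomputes them by bisection on the binomial distribution; it appears to reproduce the reported lower bounds after rounding. The helper names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve(f, target, increasing, iters=100):
    """Find p in [0, 1] with f(p) = target for a monotone f, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Counts inferred from the reported percentages (an assumption).
sens_lo, sens_hi = clopper_pearson(73, 74)
spec_lo, spec_hi = clopper_pearson(52, 53)
print(f"sensitivity {73 / 74:.1%}, 95% CI {sens_lo:.1%} to {sens_hi:.1%}")
print(f"specificity {52 / 53:.1%}, 95% CI {spec_lo:.1%} to {spec_hi:.1%}")
```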
MMP7 and MMP1 Are Increased in the Lung and BAL Fluid of Patients with IPF
To determine whether protein concentration differences in peripheral blood reflect gene expression differences present in the lung, we analyzed gene expression patterns in 23 IPF and 14 control lungs using oligonucleotide microarrays (Figure 2A). Of the five plasma proteins in the CART plasma signature (Figure 1B), only the genes for MMP7 and MMP1 are significantly overexpressed in IPF lungs compared to controls (SAM Q value = 0 for both genes; 7.3- and 15.7-fold increase, respectively). Of the ten other proteins that are significantly different in the plasma of patients with IPF (Table 3), the genes for MMP3, AGER, and IL12B are also significantly differentially expressed in IPF lungs (Figure 2A).
Figure 2
MMP7 and MMP1 Gene and Protein Levels Are Significantly Increased in the Lungs of Patients with IPF
(A) Average gene expression levels (log scale) measured using gene expression microarrays of genes that encode the 49 protein markers in IPF lungs (y-axis) compared to control lungs (x-axis). Colored squares (black or red) are genes that encode proteins that changed significantly in plasma. Red squares are genes that changed significantly (SAM Q value <5%) in gene expression data and that encode proteins measured in peripheral blood. Green oblique lines denote 2-fold change.
(B and C) MMP7 (B) and MMP1 (C) concentrations (ng/ml) are significantly (p < 0.00001 and p = 0.018, respectively) higher in BAL fluid of patients with IPF (n = 22) compared to control individuals (n = 10).
To determine whether MMP7 and MMP1 proteins are secreted into the alveolar microenvironment, we measured their concentrations in BAL obtained from 22 patients with IPF and ten control individuals. MMP7 and MMP1 BAL concentrations are significantly higher in IPF patients when compared to controls (p < 0.00001 and p = 0.018, respectively) (Figure 2B and 2C). Hence, elevated MMP7 and MMP1 levels in the lung microenvironment are the most likely source for their increased concentrations in peripheral blood.
MMP7 and MMP1 Are Not Increased in Patients with COPD or Sarcoidosis
To determine whether concentrations of MMP7 and MMP1 are increased in other common chronic lung diseases, we measured plasma concentrations in patients affected with sarcoidosis or COPD. The 47 sarcoidosis patients were stratified into those with evidence for parenchymal lung disease (stage 2 or greater; n = 29) and those with no lung parenchymal involvement (n = 18). As shown in Figure 3, there are no significant differences in plasma concentrations of MMP7 (p = 0.78) (Figure 3A) or MMP1 (p = 0.27) (Figure 3B) between the sarcoidosis groups with or without lung abnormalities when compared to controls. COPD participants were grouped by GOLD class, into 0–I (n = 13), II (n = 21), and III–IV (n = 39). No significant differences are found in plasma concentrations of MMP7 (p = 0.21) or MMP1 (p = 0.85) between groups of COPD patients stratified by GOLD class (Figure 3A and 3B, respectively).
Figure 3
MMP7 and MMP1 Plasma Concentrations Are High in IPF, but Not Sarcoidosis or COPD
Concentrations (ng/ml) of MMP7 (A) and MMP1 (B) are significantly higher in patients with IPF (n = 74; p < 0.00001 and p = 0.018, respectively) than in controls (n = 53), but not in patients with sarcoidosis (n = 47; p = 0.78 and p = 0.28, respectively, compared to controls) or COPD (n = 73; p = 0.21 and p = 0.85, respectively, stratified by GOLD class as 0–I, II, and III–IV).
MMP7 and MMP1 Are Significantly Higher in the Serum of Patients with IPF Compared to Patients with HP
To determine whether peripheral blood concentrations of MMP7 and MMP1 distinguish IPF from other common forms of ILD, we measured their levels in 41 patients with HP and 34 patients with IPF. Univariately, serum concentrations of MMP7 (p = 0.01) and MMP1 (p < 0.001) are significantly higher in IPF compared to HP; fold changes for MMP1 and MMP7 are 2.3 and 1.31, respectively (Figure 4A and 4B).
Figure 4
MMP7 and MMP1 Serum Concentrations Are Higher in IPF, Compared to HP
(A and B) Concentrations (ng/ml) of MMP7 (A) and MMP1 (B) in the blood are significantly higher in patients with IPF (n = 34) than in patients with HP (n = 41).
(C) Average gene expression levels (log scale) in IPF samples (y-axis) compared to HP (x-axis) measured by gene expression microarrays. Gray circles, all genes on the array; red circles, MMP1 and MMP7. Green oblique lines denote 2-fold change.
(D) Combinations of serum MMP7 (y-axis) and MMP1 concentrations (x-axis) in IPF (closed circles) and HP patients (open circles). Corners represent points in which the trade-off between positive predictive value (PPV) and negative predictive value (NPV) are optimal for ruling out IPF (blue) or concluding IPF (red) on the basis of MMP1 and MMP7 concentrations.
(E) ROC curves for using MMP1 or MMP7, or their combination, to classify samples as IPF or HP. Sensitivity, or true positive rate, is plotted on the y-axis, and false positive rate, or 1 − specificity, on the x-axis. The identity line at 45 ° represents a marker that performed no better than classifying samples as IPF or HP by flipping a fair coin.
Similar results are observed in a reanalysis of a previously published DNA microarray dataset comparing gene expression in lung tissue obtained from IPF and HP patients [18]. In this reanalysis, MMP7 and MMP1 levels are significantly higher in IPF compared to HP (false discovery rate [FDR] < 5%); however, as observed in the peripheral blood, the change in MMP7 levels is moderate when compared to the increase in MMP1 (Figure 4C).
Combinations of serum MMP1 and MMP7 concentrations have positive predictive values for determining that a patient has IPF ranging from 91% (MMP7 > 2.6 ng/ml and MMP1 > 8.9 ng/ml) to 66%, and negative predictive values (ruling out IPF) ranging from 96% (MMP7 < 2.9 ng/ml and MMP1 > 3.5 ng/ml) to 70% (Figure 4D). Additionally, the combination of high MMP7 and high MMP1 peripheral blood concentrations distinguishes IPF from HP with 96.3% sensitivity (95% CI 81.0%–100%) and 87.2% specificity (95% CI 72.6%–95.7%) (Figure 4E), further supporting that MMP1 in combination with MMP7 distinguishes IPF from HP.
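To make the predictive-value calculation concrete, the following Python sketch applies a two-marker threshold rule to a small set of made-up samples and reports the positive and negative predictive values. The cutoffs are the ones quoted above for the highest positive predictive value; the sample values, the function name, and the rule "call IPF only when both markers exceed their cutoffs" are assumptions for illustration, not the study's data or exact procedure.

```python
def predictive_values(samples, mmp7_cut, mmp1_cut):
    """Return (PPV, NPV) for the rule: IPF if MMP7 > cut and MMP1 > cut.

    samples: iterable of (mmp7_ng_ml, mmp1_ng_ml, is_ipf) tuples.
    """
    tp = fp = tn = fn = 0
    for mmp7, mmp1, is_ipf in samples:
        predicted_ipf = mmp7 > mmp7_cut and mmp1 > mmp1_cut
        if predicted_ipf and is_ipf:
            tp += 1
        elif predicted_ipf and not is_ipf:
            fp += 1
        elif not predicted_ipf and not is_ipf:
            tn += 1
        else:
            fn += 1
    # PPV: fraction of positive calls that are truly IPF.
    ppv = tp / (tp + fp) if (tp + fp) else None
    # NPV: fraction of negative calls that are truly not IPF (here, HP).
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Hypothetical serum concentrations (ng/ml); True marks IPF, False marks HP.
example_samples = [
    (4.0, 12.0, True), (3.5, 10.1, True), (3.0, 9.5, True), (2.0, 4.0, True),
    (2.8, 9.0, False), (1.5, 3.0, False), (2.0, 2.5, False), (1.0, 5.0, False),
]
ppv, npv = predictive_values(example_samples, mmp7_cut=2.6, mmp1_cut=8.9)
print(ppv, npv)
```

Unlike sensitivity and specificity, both predictive values depend on the proportion of IPF patients in the sample, so values obtained in one cohort need not carry over to a population with a different case mix.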
MMP7 and MMP1 Are Significantly Higher in the Serum of an Independent Validation Cohort
To verify our findings, we measured serum concentrations of MMP7 and MMP1 in an independent validation cohort comprised of patients affected with IPF, familial pulmonary fibrosis, or subclinical ILD, and control individuals. This cohort has been recently described by us [21]. Even though concentrations were measured in serum and not plasma, significantly higher concentrations of MMP7 and MMP1 are found in patients with pulmonary fibrosis compared to controls (p < 0.001 and p = 0.01, respectively). Notably, serum MMP7 concentrations in patients with subclinical ILD are significantly higher compared to control individuals (p = 0.019) and significantly lower compared to patients with full-blown IPF (p < 0.0001) (Figure 5A), suggesting that MMP7 may serve as a biomarker for disease progression. There is no significant difference in MMP7 concentrations between patients with familial or sporadic IPF, consistent with the findings of Yang et al. [30].
Figure 5
MMP7 Concentrations Significantly Distinguish Control from Subclinical ILD, Familial, or Sporadic IPF
(A) Dark solid lines show median concentrations in each group. The interquartile range (IR) or middle 50% of concentrations is delimited by a box. Data are expressed on a log base 2 scale.
(B) ROC curves for using MMP1 or MMP7, or their combination, to classify samples as IPF (sporadic or familial) or control in validation cohort.
(C and D) Serum MMP7 concentrations moderately correlate with decreases in FVC% (C) and DLCO% (D). Linear regressions and 95% CI inversely relate MMP7 concentration (ng/ml) to FVC% and DLCO%. *p < 0.05, **p < 0.01, and ***p < 0.001.
In this cohort, elevated MMP1 concentrations combined with high concentrations of MMP7 can distinguish IPF from controls with 89.2% sensitivity (95% CI 71.8%–91.7%) and 95.0% specificity (95% CI 75.1%–99.9%), supporting the findings in our derivation cohort (Figure 5B).
MMP7 Concentrations Correlate Moderately with Disease Severity
To determine whether concentrations of MMP7 or MMP1 correlate with disease severity, we compared pulmonary function measurements with serum concentrations of MMP7 and MMP1 in the validation cohort. We found a significant correlation between higher MMP7 concentrations and disease severity as measured by FVC% (Figure 5C) and DLCO% (Figure 5D). Fitted models predict a decline of 4.1% in DLCO% (p = 0.002, r = −0.53) and 4.0% in FVC% (p = 0.002, r = −0.51) for each increment of 1 ng/ml in serum MMP7. We did not find any statistically significant correlation between MMP1 concentrations and pulmonary function measurements (unpublished data).
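The slope and correlation coefficient reported for such a fitted model can be obtained from paired measurements by ordinary least squares. The sketch below shows the arithmetic under that assumption; the data values are made up for illustration and are not from the validation cohort, and the function name is hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit y = intercept + slope * x, plus Pearson r."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                # change in y per unit increase in x
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)        # Pearson correlation coefficient
    return slope, intercept, r


# Made-up example values: serum MMP7 (ng/ml) and FVC% for five patients.
mmp7 = [2.0, 4.0, 6.0, 8.0, 10.0]
fvc_pct = [88.0, 75.0, 70.0, 58.0, 54.0]

slope, intercept, r = fit_line(mmp7, fvc_pct)
print(f"FVC% changes by {slope:.2f} per 1 ng/ml MMP7 (r = {r:.2f})")
```

A negative slope corresponds to the inverse relation described above: the fitted line predicts lower FVC% or DLCO% as MMP7 rises, and r indicates how tightly the points follow that line.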
Discussion
To our knowledge, our study provides the first evidence for a peripheral blood protein signature in IPF patients. MMP7 and MMP1, two matrix metalloproteases previously implicated in the pathogenesis of IPF [31], are significantly increased in plasma, serum, BAL fluid, and lung tissue of IPF patients, suggesting that increased MMP7 and MMP1 levels in the peripheral blood are indicative of the pathologic changes that characterize the IPF alveolar microenvironment. Used in combination, blood levels of MMP1 and MMP7 can distinguish IPF patients from diverse types of chronic lung disease including HP, a common interstitial pneumonia that can sometimes be indistinguishable from IPF [32-34]. Increases in MMP7 blood concentrations are observed in patients with subclinical familial pulmonary fibrosis, and higher levels of MMP7 are associated with disease severity. Taken together, our findings support the use of MMP1 and MMP7 as IPF biomarkers and suggest that their role in diagnosis, early detection, and monitoring of disease progression should be further investigated.
Multiple MMPs are among the 12 proteins significantly increased in the blood of IPF patients. The roles of MMPs have been intensively studied and debated in IPF [35]. While multiple and often contrasting roles have been proposed for MMPs in regulating abnormal epithelial response to injury, fibroblast proliferation, extracellular matrix accumulation, and aberrant tissue remodeling, the consensus is that this family of matrix-degrading enzymes is involved in disease pathogenesis [31,36-40]. The two top-ranked proteins in this study are MMPs known to be significantly overexpressed in the activated alveolar epithelium in IPF lungs. MMP1, a matrix metalloprotease that primarily degrades fibrillar collagen, is rarely expressed under normal conditions, but is highly overexpressed in reactive alveolar epithelial cells in IPF lungs [39]. MMP7, a matrix metalloprotease with multiple local inflammatory regulatory roles [41,42], is also highly upregulated in alveolar epithelial cells in IPF [39,43]. Furthermore, MMP7 knockout mice are relatively protected from bleomycin-induced fibrosis [39], suggesting that MMP7 may have a profibrotic effect in IPF. Taken in the above context, our results strongly suggest that activated epithelial cells in IPF lungs are the likely source of elevated peripheral blood concentrations of MMP1 and MMP7, thus supporting their use as biomarkers for disease detection and progression.
Our data show that neither patients with COPD, a chronic progressive lung disease, nor patients with sarcoidosis, a chronic granulomatous ILD, express significantly increased peripheral blood concentrations of MMP7 or MMP1. Further, elevated peripheral blood MMP1 concentrations, in the presence of elevated MMP7 concentrations, distinguish IPF from HP. A similar trend in gene expression of MMP7 and MMP1 is found in the lungs of patients with IPF and HP, further supporting the notion that the changes in peripheral blood concentrations of MMP7 and MMP1 are reflective of the lung gene environment and constitute a disease-specific signal. This finding may be very important clinically, because subacute HP is frequently misdiagnosed as idiopathic nonspecific interstitial pneumonia (NSIP), and in its chronic advanced form HP can be indistinguishable from IPF [32-34]. In fact, recent studies have demonstrated that histopathologic and HRCT abnormalities observed in chronic HP often overlap with those of usual interstitial pneumonia (UIP), representing an important challenge to the differential diagnosis of these conditions [33,34,44]. Thus, the elevated peripheral blood concentrations of MMP7 and MMP1 observed in IPF are not due to a systemic stress response to a chronic lung disease and distinguish COPD, sarcoidosis, and HP from IPF. While we do not advocate at this stage relying solely on peripheral blood concentrations of MMP7 and MMP1 in distinguishing IPF from HP, sarcoidosis, or the less difficult differential diagnosis of COPD, it seems likely that knowing these concentrations will impact clinical decision-making.
We did not compare IPF to other idiopathic interstitial pneumonias such as NSIP. There is nothing in our data to suggest that we can distinguish IPF from these diseases using MMP7 and MMP1 peripheral blood concentrations. In fact, the finding of elevated MMP7 in patients with subclinical ILD may indicate that this increase is also present in other idiopathic ILDs. Furthermore, gene expression patterns were found to be extremely similar in IPF and NSIP [30,45], and BAL MMP7 levels were also recently found to be similar in patients with IPF and NSIP [46]. The major limitation in these studies was the small number of cases with NSIP because of the substantial rarity of isolated NSIP. Therefore, our results should encourage the establishment of multicenter collections of peripheral blood samples of patients with ILD with sufficient power to determine whether NSIP and IPF differ in their peripheral blood protein expression.
In comparison to other studies, major attributes of our analysis include the relatively large size of our derivation cohort and the large number of proteins assayed in this cohort of patients with IPF, the comparison of peripheral blood biomarker levels with their gene expression levels in the lungs and BAL, the comparison with multiple relatively large control populations with other chronic lung diseases to establish specificity of our findings, and the verification of our initial results in an independent validation cohort. A unique feature of our validation cohort is that it contains patients with subclinical ILD who are asymptomatic first-degree relatives of patients affected with familial IPF. These individuals have HRCT findings of early ILD, but do not have pulmonary function abnormalities, cough, or dyspnea [19,21]. Analysis of samples from this cohort allowed us to demonstrate that MMP7 concentrations are significantly higher in patients with early subclinical lung disease, suggesting that MMP7 may be a marker for early asymptomatic ILD. Peripheral blood concentrations of MMP7 also correlate with pulmonary function tests, which are surrogate measures of disease severity and thus may reflect molecular mechanisms of lung remodeling in IPF [31]. Naturally, the use of different platforms and different sample types limits our ability at this stage to set a disease-specific MMP concentration threshold. However, the reproducibility and concordance of our results across different sample types and in multiple cohorts suggest that such a threshold can and should be determined.
In conclusion, in this study we report for the first time to our knowledge the presence of a peripheral blood protein signature in a disease that is confined to the lung. This signature is composed of MMPs, TNF receptors, and some chemokines. Our data demonstrate that peripheral blood increases in two of these markers (MMP1 and MMP7) are also observed in lung and may be specific to IPF. We provide verification of our observations in an independent validation cohort and show that MMP7 correlates with disease severity and is increased in patients with subclinical ILD. While additional studies will determine the value of this protein signature in clinical practice, our results support a potential value of peripheral blood proteins as biomarkers in an organ-confined disease such as IPF. If validated, these biomarkers have the potential to greatly facilitate the introduction of new therapies in IPF and to profoundly affect the management of these patients.
Supporting Information
Russian Translation of the Abstract by Anna E. Lokshin (24 KB DOC)
Spanish Translation of the Abstract by Moises Selman (42 KB DOC)
Japanese Translation of the Abstract by Kazuhisa Konishi (22 KB DOC)
Chinese Translation of the Abstract (38 KB DOC)
Hebrew Translation of the Abstract (69 KB DOC)
Supplementary Methods (148 KB DOC)
Accession Numbers
The Entrez Gene IDs (http://www.ncbi.nlm.nih.gov/sites/entrez) of the proteins discussed in this paper are: AGER, 177; CCL11, 6356; CXCL10, 3627; IL12B, 3593; IGFBP1, 3484; MMP1, 4312; MMP3, 4314; MMP7, 4316; MMP8, 4317; MMP9, 4318; TNFRSF1A, 7132; TNFRSF1B, 7133.