Allan R Brasier1,2,3, Yingxin Zhao1,2,3, Heidi M Spratt2,3,4, John E Wiktorowicz2,3,5, Hyunsu Ju2,4, L Joseph Wheat6, Lindsey Baden7, Susan Stafford8, Zheng Wu8, Nicolas Issa7, Angela M Caliendo9, David W Denning10, Kizhake Soman3,5, Cornelius J Clancy11, M Hong Nguyen12, Michele W Sugrue12, Barbara D Alexander13, John R Wingard12.
Abstract
Invasive pulmonary aspergillosis (IPA) is an opportunistic fungal infection in patients undergoing chemotherapy for hematological malignancy, hematopoietic stem cell transplant, or other forms of immunosuppression. In this group, Aspergillus infections account for the majority of deaths due to mold pathogens. Although early detection is associated with improved outcomes, current diagnostic regimens lack sensitivity and specificity. Patients undergoing chemotherapy, stem cell transplantation and lung transplantation were enrolled in a multi-site prospective observational trial. Proven and probable IPA cases and matched controls were subjected to discovery proteomics analyses using a biofluid analysis platform, fractionating plasma into reproducible protein and peptide pools. From 556 spots identified by 2D gel electrophoresis, 66 differentially expressed post-translationally modified plasma proteins were identified in the leukemic subgroup only. This protein group was rich in complement components, acute-phase reactants and coagulation factors. Low molecular weight peptides corresponding to abundant plasma proteins were identified. A candidate marker panel of host response (9 plasma proteins, 4 peptides), fungal polysaccharides (galactomannan), and cell wall components (β-D glucan) was selected by statistical filtering for patients with leukemia as a primary underlying diagnosis. Quantitative measurements were developed to qualify the differential expression of the candidate host response proteins using selective reaction monitoring mass spectrometry assays, and then applied to a separate cohort of 57 patients with leukemia. In this verification cohort, a machine learning ensemble-based algorithm, generalized pathseeker (GPS), produced a greater case classification accuracy than galactomannan (GM) or host proteins alone.
In conclusion, integration of host response proteins with GM improves the diagnostic detection of probable IPA in patients undergoing treatment for hematologic malignancy. Upon further validation, early detection of probable IPA in leukemia treatment will provide opportunities for earlier interventions and interventional clinical trials.
Year: 2015 PMID: 26581097 PMCID: PMC4651335 DOI: 10.1371/journal.pone.0143165
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Schematic view of panel development for pulmonary IPA.
Schematic view of the strategy for discovery, qualification, and verification of panel-based classifiers. The BAP fractionation platform fractionates proteins and peptides for analysis by automated size exclusion chromatography (SEC). Candidate biomarkers were assembled based on proteins identified in the discovery phase and by previous studies. For each candidate, targeted proteomics assays using stable isotope dilution (SID)-selected reaction monitoring (SRM) were developed, standardized, and used to quantitate the abundance of the candidate biomarker in the discovery population (qualification). Nonparametric statistical filters were used to identify 15 host response proteins/peptides and 2 fungal polysaccharides. SID-SRM-MS measurements of host-response proteins and peptides were used to test RF classifier performance.
Characteristics of study cohorts.
| Cohort | Disease | Age, years (mean ± S.D.) | Gender, No. (M/F) | Underlying Disease, number (Leukemia/Other) | Absolute Neutrophil Count (billion cells/L), median (IQR) |
|---|---|---|---|---|---|
| | Control, N = 34 | 48.35 ± 14.79 | 20/14 | 25/9 | 1.3 (0.3, 3.9) |
| | Auto, N = 17 | 54.17 ± 11.27 | 10/7 | 9/8 | 1.3 (0.4, 3) |
| | Case, N = 34 | 56.97 ± 10.87 | 20/14 | 21/13 | 0.3 (0, 1.8) |
| | Total Samples, N = 85 | 52.96 ± 13.10 | 50/35 | 55/30 | 1.0 (0.1, 3.1) |
| | Total Unique Individuals, N = 68 | 52.66 ± 13.59 | 40/28 | 46/22 | 0.9 (0.1, 3.0) |
| | | | | | |
| | Control, N = 20 | 49.2 ± 16.68 | 13/7 | 20/0 | 2.5 (0.7, 5.3) |
| | Case, N = 17 | 57.12 ± 12.18 | 10/7 | 17/0 | 0.0 (0.0, 0.8) |
| | Total, N = 37 | 52.84 ± 15.12 | 23/14 | 37/0 | 1.1 (0.0, 3.1) |
| | | | | | |
| | Control, N = 30 | 50.27 ± 15.17 | 23/7 | 30/0 | 1.5 (0.1, 3.7) |
| | Case, N = 27 | 55.07 ± 13.31 | 20/7 | 27/0 | 0.1 (0.0, 1.7) |
| | Total, N = 57 | 52.54 ± 14.4 | 43/14 | 57/0 | 0.4 (0.0, 3.0) |
Treatment regimens for case and controls, by diagnosis.
| TREATMENT CATEGORY | IPA CASES | MATCHED CONTROLS |
|---|---|---|
| | 22 | 29 |
| | 21 | 29 |
| | 1 | 0 |
| | 37 | 34 |
| | 24 | 28 |
| | 5 | 1 |
| | 3 | 1 |
| | 2 | 1 |
| | 1 | 2 |
| | 1 | 1 |
| | 1 | 0 |
| | 1 | 1 |
Protein Identification.
| NO. | PROTEIN NAME | ACC NO | pI | MW (kDa) | MS EXPECT. VALUE | P-VALUE | ABUNDANCE RATIO (CASE VS CONTROL) |
|---|---|---|---|---|---|---|---|
| 1 | Fibrinogen beta chain | P02675 | 7.66 | 51 | 1.26E-16 | 0.030148 | 1.16 |
| 2 | Fibrinogen beta chain | P02675 | 5.99 | 49 | 3.97E-14 | 0.049690 | 1.14 |
| 3 | Alpha-mannosidase 2 | Q16706 | 7.51 | 36 | 1.99E+01 | 0.003111 | -1.41 |
| 4 | Ig kappa chain V-III region | P01620 | 8.19 | 35 | 9.98E-12 | 0.000343 | -1.54 |
| 5 | Ig kappa chain V-III region | P01620 | 7.15 | 35 | 1.58E-04 | 0.000673 | -1.82 |
| 6 | Ferritin light chain | P02792 | 7.04 | 31 | 3.97E-13 | 0.000492 | 2.12 |
| 7 | Leucine-rich alpha-2-glycoprotein | P02750 | 8.12 | 17 | 1.58E-24 | 0.001784 | 1.71 |
| 8 | Complement factor B | P00751 | 7.31 | 73 | 7.92E-09 | 0.037614 | -1.31 |
| 9 | Hemopexin HPX | P02790 | 7.38 | 31 | 6.29E-03 | 0.005740 | -1.36 |
| 10 | Serum amyloid A-4 protein | P35542 | 3.76 | 25 | 5.00E-01 | 0.001691 | -1.44 |
| 11 | Fibrinogen alpha chain | P02671 | 5.37 | 19 | 6.29E-37 | 0.031877 | 1.25 |
| 12 | Leucine-rich alpha-2-glycoprotein | P02750 | 8.21 | 17 | 1.26E-26 | 0.002389 | 1.43 |
| 13 | Fibrinogen alpha chain | P02671 | 3.96 | 12 | 1.99E-14 | 0.005660 | -1.37 |
| 14 | Complement C3 | P01024 | 5.60 | 48 | 1.58E-30 | 0.005304 | -1.31 |
| 15 | Complement C4-A | P0C0L4 | 7.25 | 42 | 1.99E+00 | 0.013122 | -1.35 |
| 16 | Histidine protein methyltransferase 1 homolog METTL18 | O95568 | 7.55 | 31 | 3.97E+01 | 0.043937 | -2.32 |
| 17 | Hemopexin HPX | P02790 | 5.22 | 30 | 2.51E-04 | 0.010053 | -1.28 |
| 18 | Hemopexin HPX | P02790 | 6.35 | 30 | 1.58E+01 | 0.037373 | -1.28 |
| 19 | Keratin, type II cytoskeletal 1 | P04264 | 9.26 | 29 | 3.15E-03 | 0.043690 | -1.12 |
| 20 | Serum amyloid A-4 protein | P35542 | 6.68 | 26 | 1.58E-05 | 0.047502 | -1.31 |
| 21 | MEF2-activating motif and SAP domain-containing transcriptional regulator | Q6ZN01 | 8.10 | 20 | 2.51E+01 | 0.046841 | -1.58 |
| 22 | Apolipoprotein A-II | P02652 | 7.03 | 20 | 6.29E-05 | 0.021182 | -1.36 |
| 23 | Alpha-1-antichymotrypsin | P01011 | 5.00 | 19 | 1.99E-60 | 0.005314 | 1.38 |
| 24 | Alpha-1-antichymotrypsin | P01011 | 7.74 | 19 | 9.98E-60 | 0.020735 | 1.33 |
| 25 | Alpha-1-antichymotrypsin | P01011 | 5.51 | 19 | 3.15E-19 | 0.004400 | 1.60 |
| 26 | Alpha-1-antitrypsin | P01009 | 4.87 | 17 | 9.98E-01 | 0.003011 | 1.33 |
| 27 | Leucine-rich alpha-2-glycoprotein | P02750 | 6.33 | 17 | 6.29E-34 | 0.008686 | 1.29 |
| 28 | Leucine-rich alpha-2-glycoprotein | P02750 | 9.18 | 17 | 2.51E-36 | 0.009977 | 1.37 |
| 29 | Alpha-1-antitrypsin | P01009 | 3.85 | 16 | 1.26E-12 | 0.031016 | 1.29 |
| 30 | Alpha-1-antitrypsin | P01009 | 4.81 | 16 | 1.26E-23 | 0.042439 | 1.37 |
| 31 | Alpha-1-acid glycoprotein 1 | P02763 | 5.05 | 14 | 1.58E-05 | 0.001129 | 1.54 |
| 32 | Alpha-1-acid glycoprotein 1 | P02763 | 7.82 | 14 | 1.58E-01 | 0.003841 | 1.51 |
| 33 | Apolipoprotein A-I | P02647 | 8.83 | 13 | 2.51E-20 | 0.024695 | -1.35 |
| 34 | Apolipoprotein A-I | P02647 | 7.59 | 13 | 3.97E-06 | 0.006216 | -1.44 |
| 35 | Apolipoprotein A-I | P02647 | 6.74 | 12 | 3.15E-33 | 0.021568 | -1.33 |
| 36 | Alpha-1-acid glycoprotein 1 | P02763 | 4.81 | 12 | 9.98E-08 | 0.001122 | 1.51 |
| 37 | Serum albumin | P02768 | 6.70 | 35 | 3.97E-06 | 0.022091 | 1.26 |
| 38 | Alpha-1-antichymotrypsin | P01011 | 5.60 | 19 | 3.97E-55 | 0.036387 | 1.35 |
| 39 | Alpha-1-antichymotrypsin | P01011 | 5.75 | 19 | 1.58E-48 | 0.041962 | 1.42 |
| 40 | Alpha-1-antichymotrypsin | P01011 | 6.14 | 19 | 1.26E-48 | 0.006482 | 1.43 |
| 41 | Alpha-1-antichymotrypsin | P01011 | 5.25 | 19 | 9.98E-47 | 0.004255 | 1.50 |
| 42 | Leucine-rich alpha-2-glycoprotein | P02750 | 8.99 | 17 | 2.51E-32 | 0.002945 | 1.35 |
| 43 | Leucine-rich alpha-2-glycoprotein | P02750 | 9.09 | 17 | 3.97E-30 | 0.004495 | 1.33 |
| 44 | Leucine-rich alpha-2-glycoprotein | P02750 | 5.90 | 17 | 3.97E-34 | 0.004792 | 1.31 |
| 45 | Alpha-1-antitrypsin | P01009 | 3.52 | 16 | 9.98E-40 | 0.045945 | 1.33 |
| 46 | Alpha-1-antitrypsin | P01009 | 8.16 | 16 | 3.15E-33 | 0.000936 | 1.48 |
| 47 | Alpha-1-acid glycoprotein 1 | P02763 | 4.57 | 14 | 3.15E-23 | 0.014537 | 1.51 |
| 48 | Alpha-1-acid glycoprotein 1 | P02763 | 3.55 | 14 | 3.97E-19 | 0.005862 | 1.48 |
| 49 | Alpha-1-acid glycoprotein 1 | P02763 | 3.99 | 12 | 9.98E-21 | 0.002073 | 1.45 |
| 50 | Alpha-1-acid glycoprotein 1 | P02763 | 4.64 | 12 | 3.15E-19 | 0.021949 | 1.39 |
| 51 | Serum albumin | P02768 | 9.47 | 109 | 6.29E-42 | 0.027907 | -1.12 |
| 52 | Serum albumin | P02768 | 8.14 | 73 | 3.97E-17 | 0.000148 | -1.62 |
| 53 | Serum albumin | P02768 | 8.20 | 73 | 6.29E-35 | 0.003984 | -1.39 |
| 54 | Complement C4-A | P0C0L4 | 6.22 | 73 | 3.97E-12 | 0.035901 | -1.56 |
| 55 | Fibrinogen alpha chain | P02671 | 6.22 | 12 | 1.99E-14 | 0.045355 | -1.42 |
| 56 | Fibrinogen beta chain | P02675 | 5.52 | 51 | 9.98E-27 | 0.001817 | 1.29 |
| 57 | Fibrinogen beta chain | P02675 | 5.60 | 51 | 9.98E-37 | 0.007696 | 1.27 |
| 58 | Putative uncharacterized protein C6orf50 C6orf50 | Q9HD87 | 6.25 | 26 | 3.15E+01 | 0.001200 | -1.76 |
| 59 | Transthyretin | P02766 | 6.95 | 26 | 1.99E+00 | 0.001374 | -1.48 |
| 60 | Transthyretin | P02766 | 4.00 | 26 | 1.58E-19 | 0.000343 | -1.50 |
| 61 | Transthyretin | P02766 | 5.35 | 26 | 2.51E-20 | 0.000116 | -1.62 |
| 62 | Apolipoprotein C-III | P02656 | 5.36 | 20 | 3.15E-11 | 0.013668 | -1.63 |
| 63 | Zinc-alpha-2-glycoprotein | P25311 | 7.65 | 17 | 1.99E-06 | 0.037615 | 1.25 |
| 64 | Histidine protein methyltransferase 1 homolog METTL18 | O95568 | 5.56 | 12 | 1.99E+01 | 0.022806 | -1.39 |
| 65 | Annexin A10 | Q9UJ72 | 6.84 | 15 | 2.51E+01 | 0.000098 | -2.21 |
| 66 | Transthyretin | P02766 | 6.76 | 14 | 1.26E+00 | 0.000011 | -1.97 |
Protein identification was performed using a Bayesian algorithm where matches were characterized by an expectation score, which represents an estimate of the number of matches that would be expected in that database if the matches were completely random. Significance was determined by one-way ANOVA using log2 transformed spot volumes.
*, significantly different after Benjamini-Hochberg correction for multiple hypothesis testing.
These values are for spot 4, p = 0.038105; spot 5, p = 0.046795; spot 6, p = 0.039113; spot 52, p = 0.020548; spot 60, p = 0.031786; spot 61, p = 0.021551; spot 65, p = 0.027383; and spot 66, p = 0.005949.
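The Benjamini-Hochberg step-up adjustment named in the footnote can be sketched as follows. This is a minimal illustration of the standard procedure, not the authors' code; the function name `benjamini_hochberg` and the toy p-values are assumptions made for the example and are not taken from the table.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy raw p-values (hypothetical, for illustration only)
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
# A spot would be called significant at a 5% false discovery rate if adj < 0.05
significant = [a < 0.05 for a in adj]
```

Each raw p-value is scaled by the number of tests divided by its rank, so the adjusted value depends on how many spots were tested together.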
Fig 2. Reference gel of plasma proteins dysregulated by IPA.
Shown is a reference gel of 2DE of SEC-fractionated and IgY-depleted plasma proteins from the study subjects. The locations of 9 spots, identified as candidate discriminant proteins, are shown. Spot #1 is FIBB; #2 is ALBU; #3 is AATL; #4 is A1AT; #5 is LRG1; #6 is A1AG1; #7 is APO A1; #8 is FIBA; and #9 is APO C3. The image shown represents the pI range of 3–10 and the molecular size range of 25–200+ kDa. Inset, the average abundance of each protein spot is shown graphically for Case (Left) and Control (Right), as shown for protein spot #1.
SID-SRM-MS assays for candidate plasma proteins.
| Protein Name | Accession # | Gene Name | Sequence | Q1 m/z | Q3 m/z | CE (V) | Pre Z | Prod Z | Ion type |
|---|---|---|---|---|---|---|---|---|---|
| Serum albumin | P02768 | ALBU | | 575.3111 | 694.3765 | 23 | 2 | 1 | y6 |
| | | | | 575.3111 | 823.4191 | 23 | 2 | 1 | y7 |
| | | | | 575.3111 | 937.462 | 23 | 2 | 1 | y8 |
| | | | | 575.3111 | 1036.53 | 23 | 2 | 1 | y9 |
| Apolipoprotein A-I | P02647 | APOA1 | | 700.8382 | 808.4194 | 27 | 2 | 1 | y8 |
| | | | | 700.8382 | 936.478 | 27 | 2 | 1 | y9 |
| | | | | 700.8382 | 1023.51 | 27 | 2 | 1 | y10 |
| | | | | 700.8382 | 1122.578 | 26 | 2 | 1 | y11 |
| Alpha-1-antichymotrypsin | P01011 | AATC | | 531.2975 | 633.3965 | 21 | 2 | 1 | y5 |
| | | | | 531.2975 | 819.4605 | 21 | 2 | 1 | y7 |
| | | | | 531.2975 | 762.4391 | 21 | 2 | 1 | y6 |
| | | | | 531.2975 | 932.5446 | 21 | 2 | 1 | y8 |
| Leucine-rich alpha-2-glycoprotein | P02750 | LRG1 | | 450.7792 | 501.339 | 19 | 2 | 1 | y5 |
| | | | | 450.7792 | 614.423 | 19 | 2 | 1 | y6 |
| | | | | 450.7792 | 715.4707 | 19 | 2 | 1 | y7 |
| | | | | 450.7792 | 843.5293 | 19 | 2 | 1 | y8 |
| Alpha-1-antitrypsin | P01009 | A1AT | | 508.3109 | 659.4081 | 21 | 2 | 1 | y6 |
| | | | | 508.3109 | 716.4296 | 21 | 2 | 1 | y7 |
| | | | | 508.3109 | 829.5136 | 21 | 2 | 1 | y8 |
| | | | | 508.3109 | 928.582 | 21 | 2 | 1 | y9 |
| Apolipoprotein C-III | P02656 | APOC3 | | 858.929 | 887.469 | 33 | 2 | 1 | y8 |
| | | | | 858.929 | 1016.511 | 33 | 2 | 1 | y9 |
| | | | | 858.929 | 1144.57 | 33 | 2 | 1 | y10 |
| | | | | 858.929 | 1243.638 | 33 | 2 | 1 | y11 |
| Fibrinogen beta chain | P02675 | FIBB | | 422.748 | 531.288 | 18 | 2 | 1 | y4 |
| | | | | 422.748 | 644.372 | 18 | 2 | 1 | y5 |
| | | | | 422.748 | 757.456 | 18 | 2 | 1 | y6 |
| | | | | 422.748 | 844.488 | 18 | 2 | 1 | y7 |
| Alpha-1-acid glycoprotein 1 | P02763 | A1AG1 | | 876.9808 | 982.6191 | 33 | 2 | 1 | y8 |
| | | | | 876.9808 | 1119.678 | 32 | 2 | 1 | y9 |
| | | | | 876.9808 | 1248.721 | 31 | 2 | 1 | y10 |
| | | | | 876.9808 | 835.5507 | 33 | 2 | 1 | y7 |
| Fibrinogen alpha chain | P02671 | FIBA | | 760.871 | 780.363 | 29 | 2 | 1 | y6 |
| | | | | 760.871 | 894.406 | 29 | 2 | 1 | y7 |
| | | | | 760.871 | 993.474 | 29 | 2 | 1 | y8 |
| | | | | 760.871 | 1122.517 | 29 | 2 | 1 | y9 |
| | | | | 760.871 | 1237.544 | 29 | 2 | 1 | y10 |
For each of the candidate plasma proteins, SID-SRM-MS assays were developed. Shown are the protein accession number, common name, signature sequence, quadrupole (Q) mass, optimized collision energy (CE), and ion type measured. Pre, precursor. Prod, product.
SID-SRM-MS assays for BAP peptides.
| Protein Name | Accession # | Gene Name_PreZ | Sequence | Q1 m/z | Q3 m/z | CE (V) | Pre Z | Prod Z | Ion type |
|---|---|---|---|---|---|---|---|---|---|
| Retinol binding protein 4 | Q5VY30 | RBP4_599 | | 599.8163 | 622.3553 | 24 | 2 | 1 | y5 |
| | | | | 599.8163 | 693.3925 | 24 | 2 | 1 | y6 |
| | | | | 599.8163 | 792.4609 | 24 | 2 | 1 | y7 |
| Apolipoprotein A-II | P02652 | APOA2_600 | | 600.3351 | 659.3717 | 24 | 2 | 1 | y6 |
| | | | | 600.3351 | 788.4143 | 24 | 2 | 1 | y7 |
| | | | | 600.3351 | 885.4671 | 24 | 2 | 1 | y8 |
| | | | | 600.3351 | 972.4991 | 24 | 2 | 1 | y9 |
| Apolipoprotein A-II | P02652 | APOA2_486 | | 486.7534 | 546.2877 | 20 | 2 | 1 | y5 |
| | | | | 486.7534 | 659.3717 | 20 | 2 | 1 | y6 |
| | | | | 486.7534 | 788.4143 | 20 | 2 | 1 | y7 |
| Apolipoprotein A-II | P02652 | APOA2_578 | | 578.8504 | 684.4649 | 23 | 2 | 1 | y6 |
| | | | | 578.8504 | 812.5234 | 23 | 2 | 1 | y7 |
| | | | | 578.8504 | 941.566 | 23 | 2 | 1 | y8 |
For each candidate peptide purified by SEC in BAP, SID-SRM-MS assays were developed. Shown are the protein accession number, common name, signature sequence, quadrupole (Q) mass, optimized collision energy (CE), and ion type measured.
Fig 3. SID-SRM-MS measurements of candidate biomarkers in IPA.
A) SID-SRM-MS measurements of the discovery cohort. For each candidate biomarker, the SID-SRM-MS measurements by disease category are shown. Box plots show the interquartile range (25th to 75th percentile) and the median value, indicated by a horizontal dark line. Outliers are shown as circles. Note that the median line is not centered within the box, signaling that the data are not normally distributed. P values are from the Wilcoxon rank-sum test. B) SID-SRM-MS measurements of the validation cohort. Data are presented as in Panel A.
Comparison of performance of machine learning classifiers.
| PREDICTIVE MODEL | Accuracy (Train) | Accuracy (Test) | Accuracy (Δ) | AUC (Train) | AUC (Test) | AUC (Δ) |
|---|---|---|---|---|---|---|
| CART | 0.941 | 0.633 | 0.308 | 0.941 | 0.633 | 0.308 |
| RF | 0.821 | 0.594 | 0.227 | 0.891 | 0.69 | 0.201 |
| MARS | 0.916 | 0.632 | 0.284 | 0.959 | 0.689 | 0.27 |
| GPS | 0.740 | 0.557 | 0.183 | 0.906 | 0.775 | 0.131 |
Abbreviations: CART, classification and regression tree; RF, random forest; MARS, multivariate adaptive regression splines; GPS, generalized pathseeker. For each classifier, accuracy, the area under the ROC curve (AUC), and Δ, the difference between the training and test data sets, are shown. Note that the Δ AUC of the GPS classifier is the lowest (0.131), indicating that this classifier is the most likely to generalize to new data sets.
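The train-to-test gap in the table above can be recomputed directly from the reported AUC values. This is a small arithmetic sketch: the dictionary layout and variable names are illustrative choices, while the numbers are copied from the table.

```python
# (train AUC, test AUC) per classifier, copied from the table above
auc = {
    "CART": (0.941, 0.633),
    "RF": (0.891, 0.69),
    "MARS": (0.959, 0.689),
    "GPS": (0.906, 0.775),
}

# Delta = train AUC minus test AUC; a smaller gap suggests less overfitting
gap = {name: round(train - test, 3) for name, (train, test) in auc.items()}

# Classifier with the smallest gap, and the one with the highest test AUC
smallest_gap = min(gap, key=gap.get)
best_test = max(auc, key=lambda name: auc[name][1])
```

With these values, the same classifier has both the smallest gap and the highest test AUC.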
Comparison of variable performance of GPS classifiers.
| Predictive Variables | Accuracy (Train) | Accuracy (Test) | AUC (Train) | AUC (Test) |
|---|---|---|---|---|
| GM Only | 0.50 | 0.50 | 0.699 | 0.863 |
| Host Proteins Only | 0.475 | 0.519 | 0.921 | 0.753 |
| All 14 variables | 0.74 | 0.557 | 0.906 | 0.775 |
For each set of predictive variables, accuracy and the area under the ROC curve (AUC) are shown for the training and test data sets. Note that the Δ AUC (the difference in AUC between the training and test data sets) of the GPS classifier for all 14 variables is stable, indicating that this classifier will generalize to new data sets. Accuracy represents case group only.
Fig 4. Candidate variable importance of the RF IPA model.
Variable importance is a relative measurement from 0–100% that indicates the level that each marker contributes to the performance of the classifier. Note the top three important variables are host response proteins.
Fig 5. ROC Curve for the RF IPA prediction.
ROC for GPS classifier using host response proteins, BAP peptides, and fungal antigens (BD and GM). AUC values for training and test data sets are given in Table 6.
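For readers unfamiliar with the AUC metric, it can be computed without drawing the curve, as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. The sketch below illustrates that definition; the function name `roc_auc` and the scores are hypothetical and are not study data.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores (higher means more likely IPA)
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.4]
auc_value = roc_auc(cases, controls)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.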