Dan Agranoff, Delmiro Fernandez-Reyes, Marios C Papadopoulos, Sergio A Rojas, Mark Herbster, Alison Loosemore, Edward Tarelli, Jo Sheldon, Achim Schwenk, Richard Pollok, Charlotte F J Rayner, Sanjeev Krishna.
Abstract
BACKGROUND: We investigated the potential of proteomic fingerprinting with mass spectrometric serum profiling, coupled with pattern recognition methods, to identify biomarkers that could improve diagnosis of tuberculosis.
Year: 2006 PMID: 16980117 PMCID: PMC7159276 DOI: 10.1016/S0140-6736(06)69342-2
Source DB: PubMed Journal: Lancet ISSN: 0140-6736 Impact factor: 79.321
Characteristics of patients with tuberculosis and controls
| | Train | Test | Total | |
|---|---|---|---|---|
| Tuberculosis | ||||
| Total number of patients | 102 | 77 | 179 | |
| Symptomatic | 100 (98%) | 74 (96%) | 174 (97%) | |
| Persistent cough | 98 (96%) | 74 (96%) | 171 (96%) | |
| Haemoptysis | 5 (5%) | 1 (1%) | 6 (3%) | |
| Night sweats/fever | 68 (67%) | 53 (67%) | 121 (68%) | |
| Weight loss ≥5% | 86 (84%) | 60 (78%) | 146 (82%) | |
| Weight loss <5% | 11 (11%) | 15 (19%) | 26 (15%) | |
| Mean (range) symptom duration before recruitment in days | 122·6 (13–449) | 129·5 (12–754) | 126 (12–754) | |
| Smear positive | 89 (87%) | 66 (86%) | 155 (87%) | |
| Pulmonary disease | 77 (75%) | 64 (83%) | 141 (79%) | |
| Extrapulmonary disease | 2 (2%) | 2 (3%) | 4 (2%) | |
| Pulmonary and extrapulmonary | 22 (22%) | 11 (14%) | 33 (18%) | |
| Abnormal chest radiograph | 95 (93%) | 67 (87%) | 162 (91%) | |
| Cavitary disease | 66 (65%) | 49 (64%) | 115 (64%) | |
| Previous BCG vaccination | 36 (35%) | 26 (34%) | 62 (35%) | |
| Skin test positive | 56 (55%) | 36 (47%) | 92 (51%) | |
| Controls | ||||
| Total number of patients | 91 | 79 | 170 | |
| Inflammatory bowel disease | 10 (11%) | 6 (8%) | 16 (9%) | |
| Sarcoidosis | 6 (7%) | 7 (9%) | 13 (8%) | |
| Respiratory infections | 27 (30%) | 24 (30%) | 51 (30%) | |
| Malaria ( | 4 (4%) | 3 (4%) | 7 (4%) | |
| HAT ( | 10 (11%) | 9 (11%) | 19 (11%) | |
| Others | 1 (1%) | 2 (3%) | 3 (2%) | |
| Neurological disease | 13 (14%) | 13 (16%) | 26 (15%) | |
| Autoimmune disease | 6 (7%) | 3 (4%) | 9 (5%) | |
| Myeloma/monoclonal gammopathy | 2 (2%) | 3 (4%) | 5 (3%) | |
| Healthy volunteers | 12 (13%) | 9 (11%) | 21 (12%) | |
Data are number (%) unless otherwise specified. HAT=human African trypanosomiasis.
Definite history of BCG vaccination, presence of scar, or both. Data missing for 38 patients.
Mantoux reaction ≥15 mm greatest diameter of induration or Heaf grade ≥3. Data missing for 46 patients.
12 controls were taking high-dose systemic steroids (prednisolone ≥60 mg per day or dexamethasone ≥12 mg per day). BCG history and skin-test data were unavailable for most control patients; tuberculin skin testing was done on only a small minority.
Majority pyogenic respiratory infections (based on presence of consolidation on CXR and prompt clinical response to antibacterial therapy). One patient with pulmonary infarction rather than infection is included in the test set.
Nine patients with HAT had advanced (neurological) disease, based on detection of parasites and/or >5 white cells per mm³ in CSF.
Visceral leishmaniasis (1), meningococcal septicaemia (1), staphylococcal cellulitis (1).
Cerebral neoplasia (12), cerebral abscess in association with infective endocarditis (1), myasthenia gravis (2), multiple sclerosis (5) and lumbar disc prolapse (6).
Rheumatoid arthritis (3), systemic lupus erythematosus (4), systemic sclerosis (1), overlap syndrome (1).
Participant demographics
| | Tuberculosis: train | Tuberculosis: test | Tuberculosis: total | Controls: train | Controls: test | Controls: total | All participants | |
|---|---|---|---|---|---|---|---|---|
| Total number of patients | 102 | 77 | 179 | 91 | 79 | 170 | 349 | |
| Mean (range) age in years | 31 (16–86) | 33 (19–84) | 32 (16–86) | 44 (16–88) | 46 (14–84) | 45 (16–84) | 38 (14–88) | |
| Sex (male:female) | 65:37 | 47:30 | 112:67 | 52:39 | 42:37 | 94:76 | 206:143 | |
| Ethnic origin | ||||||||
| Sub-Saharan African | 81 (79%) | 60 (78%) | 141 (79%) | 29 (32%) | 29 (37%) | 58 (34%) | 199 | |
| African, not specified | 3 (3%) | 1 (1%) | 4 (2%) | 5 (6%) | 4 (5%) | 9 (5%) | 13 | |
| Asian | 13 (13%) | 9 (12%) | 22 (12%) | 6 (7%) | 3 (4%) | 9 (5%) | 31 | |
| White | 5 (5%) | 7 (9%) | 12 (7%) | 49 (54%) | 39 (49%) | 88 (51%) | 100 | |
| Not recorded | .. | .. | .. | 2 (2%) | 4 (5%) | 6 (4%) | 6 | |
| Collection site | ||||||||
| Sub-Saharan Africa | 81 (79%) | 60 (78%) | 141 (79%) | 21 (23%) | 19 (24%) | 40 (24%) | 181 | |
| UK | 21 (21%) | 17 (22%) | 38 (21%) | 70 (77%) | 60 (76%) | 130 (76%) | 168 | |
| HIV serology | ||||||||
| HIV positive | 35 (34%) | 24 (31%) | 59 (33%) | 2 (2%) | 3 (4%) | 5 (3%) | 64 | |
| CD4 count ≥200×10⁶ per mL | 19 | 13 | 32 | .. | .. | .. | .. | |
| CD4 count <200×10⁶ per mL | 15 | 11 | 26 | .. | .. | .. | .. | |
| HIV negative | 60 (59%) | 45 (58%) | 105 (59%) | 12 (13%) | 8 (10%) | 20 (12%) | 125 | |
| Not determined | 7 (7%) | 8 (10%) | 15 (8%) | 77 (85%) | 68 (86%) | 145 (85%) | 160 | |
Percentages refer to proportion of patients in the training and testing set for each demographic category.
12 patients with tuberculosis had received 1–7 days of chemotherapy at time of recruitment.
CD4 counts were available for HIV-seropositive patients; no value was available for six seropositive patients.
Diagnostic performance of classifiers
| Classifier parameters | Predicted class | True TB | True C | Accuracy | Sensitivity | Specificity |
|---|---|---|---|---|---|---|
| Kernel: Gaussian | TB | 72 | 4 | 94·23% | 93·50% | 94·93% |
| Soft margin=10 | C | 5 | 75 | |||
| 100 iterations | TB | 72 | 7 | 92·30% | 93·50% | 91·13% |
| Weight threshold=100 | C | 5 | 72 | |||
| 100 iterations | TB | 71 | 8 | 91·02% | 92·20% | 89·87% |
| Weight threshold=100 | C | 6 | 71 | |||
| Boost=10 | TB | 72 | 10 | 90·38% | 93·51% | 87·34% |
| Global pruning 25% | C | 5 | 69 | |||
| Kernel=polynomial | TB | 71 | 9 | 88·46% | 92·20% | 84·81% |
| Soft margin=1 | C | 6 | 70 | |||
| Normalised | TB | 68 | 12 | 86·54% | 88·31% | 84·81% |
| Shuffled presentation | C | 9 | 67 | |||
| Learning rate=0·3 | TB | 65 | 9 | 86·53% | 84·41% | 88·60% |
| Momentum=0·2 | C | 12 | 70 | |||
| Normalised 500 epochs | ||||||
Contingency table showing number of cases classified for each of the diagnostic classes. Codes in parentheses after classifier names refer to key of figure 1A. TB=tuberculosis; C=controls. ADTree=adaptive decision tree. C4·5 Tree. AdaBoost=adaptive boosting. SLP=single layer perceptron. MLP=multi layered perceptron. HL=hidden layers. N=neurons.
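The accuracy, sensitivity, and specificity columns can be recomputed from the contingency counts. The sketch below does this for the first row of the table. It assumes that rows are the predicted class and columns the true class; this reading is not stated in the text but is consistent with the column sums (72+5=77 tuberculosis and 4+75=79 controls, the test-set sizes). The function name `diagnostic_metrics` is illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return (accuracy, sensitivity, specificity) as percentages.

    tp: tuberculosis cases classified as tuberculosis
    fp: controls classified as tuberculosis
    fn: tuberculosis cases classified as controls
    tn: controls classified as controls
    """
    total = tp + fp + fn + tn
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return accuracy, sensitivity, specificity


# First classifier in the table (assumed layout: predicted TB row = 72, 4;
# predicted C row = 5, 75).
acc, sens, spec = diagnostic_metrics(tp=72, fp=4, fn=5, tn=75)
print(f"accuracy={acc:.2f}% sensitivity={sens:.2f}% specificity={spec:.2f}%")
```

With conventional rounding this gives 94.23%, 93.51%, and 94.94%, which agrees with the tabulated 94·23%, 93·50%, and 94·93% to within 0·01 percentage points; the small differences suggest that some table entries were truncated rather than rounded.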
Figure 1: Performance and validation of classifiers
(A) Classifier performance in ROC space. SVM_1, ADT_2, C4·5_2, C5·0_1, SVM_4, SLP_3, MLP: for names and parameters see table 3. ADT_1=adaptive decision tree without AdaBoost. NCP_3=non-conservative projection (normalised, random presentation). C5·0_2=C5·0 tree with winnowing. C4·5_1=C4·5 tree without AdaBoost. CP_2=conservative projection (normalised). NCP_2=non-conservative projection (normalised). Red line indicates convex hull.
(B) Gaussian kernel support vector machine performance (ten-fold crossvalidation). Each block of three bars shows the values for accuracy (red), sensitivity (green), and specificity (blue) obtained when the sigma of the Gaussian kernel was optimised for each of these criteria.
(C) Averaged ROC using ten-fold crossvalidation on the train set with evaluation on the test set. 100 randomly selected train and test sets with a train:test ratio of 80:20. Parameters were selected with a ten-fold crossvalidation on the train set and performance obtained in the test set. Red line shows averaged ROC curve of classifiers obtained when the kernel parameter is selected on the accuracy criterion. Similar ROC curves were obtained when selecting on sensitivity and specificity (webfigure 1).
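The convex hull in ROC space mentioned for panel A can be illustrated with the seven classifiers tabulated above. The following is a minimal sketch, not the authors' procedure: it converts each contingency table to a (false-positive rate, true-positive rate) point, assuming 77 tuberculosis cases and 79 controls in the test set and the predicted-by-true layout of the table, and then takes the upper hull anchored at (0, 0) and (1, 1). The figure also plots classifiers that are not tabulated, so the hull in the published figure may differ. The helper names `cross` and `roc_convex_hull` are illustrative.

```python
def cross(o, a, b):
    """2D cross product of vectors o->a and o->b (positive = counter-clockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points):
    """Upper convex hull of ROC points, anchored at (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Keep only strict clockwise turns when moving left to right,
        # which leaves the upper boundary of the point set.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# (true positives, false positives) for the seven tabulated classifiers,
# in table order; 77 tuberculosis cases and 79 controls in the test set.
counts = [(72, 4), (72, 7), (71, 8), (72, 10), (71, 9), (68, 12), (65, 9)]
roc_points = [(fp / 79, tp / 77) for tp, fp in counts]

hull = roc_convex_hull(roc_points)
for fpr, tpr in hull:
    print(f"FPR={fpr:.3f} TPR={tpr:.3f}")
```

On these seven points only the first classifier (the Gaussian-kernel row) lies on the hull; every other point falls on or below the segments joining it to (0, 0) and (1, 1).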
Figure 2: Performance of SVM classifiers based on subsets of peak clusters and combinations of identified biomarkers
SAA=serum amyloid A. CRP=C-reactive protein. Gaussian SVMs were trained with the initial train set (table 2) using the specified mass peak clusters or biomarker combination (ten-fold crossvalidation for parameter selection). Classifier performance was then assessed on the initial test set (table 2).
(A) Classification performance of correlated mass clusters. 1=10 positively correlated and 10 negatively correlated; 2=remaining 199; 3=10 positively correlated; 4=remaining 209; 5=10 negatively correlated; 6=remaining 209. Raw values supplied in webtable 1. Red line represents convex hull defined by optimal classifiers (4 and 1).
(B) Biomarkers. 1g=transthyretin. 2g=CRP. 3g=neopterin. 4g=SAA. 5g=neopterin-SAA. 6g=CRP-SAA. 7g=CRP-neopterin. 8g=transthyretin-SAA. 9g=transthyretin-neopterin. 10g=transthyretin-CRP. 11g=transthyretin-CRP-neopterin. 12g=transthyretin-CRP-SAA. 13g=transthyretin-neopterin-SAA. 14g=CRP-neopterin-SAA. 15g=transthyretin-CRP-neopterin-SAA. Raw values supplied in webtable 2. Red line represents convex hull defined by optimal classifiers (2g, 6g, 12g, 9g).
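The 15 biomarker panels in panel B are exactly the non-empty subsets of the four identified biomarkers (4 single markers, 6 pairs, 4 triples, and 1 full set). A short sketch of that enumeration follows; the variable names are illustrative, and the order produced here differs from the 1g to 15g numbering in the caption. Training an SVM on each panel is not shown, because the underlying measurements are supplied only in webtable 2.

```python
from itertools import combinations

# The four biomarkers named in the caption.
biomarkers = ["transthyretin", "CRP", "neopterin", "SAA"]

# Every non-empty subset, from single markers up to the full panel.
panels = [
    combo
    for size in range(1, len(biomarkers) + 1)
    for combo in combinations(biomarkers, size)
]

for panel in panels:
    print("-".join(panel))
```

Each tuple in `panels` names the feature columns that one classifier would be trained on.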
Characteristics of patients and controls in second dataset
| | Tuberculosis | Controls | Total | |
|---|---|---|---|---|
| Total number of patients | 18 | 23 | 41 | |
| Mean (range) age in years | 35 (18–61) | 32 (18–60) | 34 (18–61) | |
| Sex (male:female) | 12:6 | 7:16 | 19:22 | |
| Ethnic origin | ||||
| African | 10 | 13 | 23 | |
| Asian | 6 | 4 | 10 | |
| White | 2 | 6 | 8 | |
| Collection site | ||||
| UK (St George's Hospital) | 9 | 7 | 16 | |
| UK (Hammersmith Hospital) | 9 | 16 | 25 | |
| Symptoms | ||||
| Persistent cough | 14 | 13 | 27 | |
| Haemoptysis | 5 | 2 | 7 | |
| Night sweats/fever | 11 | 11 | 22 | |
| Weight loss | 6 | 3 | 9 | |
| Tuberculosis smear-positive | 10 | N/A | 10 | |
| Tuberculosis site of disease | ||||
| Pulmonary | 16 | N/A | 16 | |
| Extrapulmonary | 1 | N/A | 1 | |
| Pulmonary and extrapulmonary | 1 | N/A | 1 | |
| Abnormal chest radiograph | 14 | 11 | 25 | |
| Cavitary disease | 4 | 0 | 4 | |
| Previous BCG vaccination | 7 | 16 | 23 | |
| Controls with respiratory infections | N/A | 15 | 15 | |
| Controls with inflammatory bowel disease | N/A | 4 | 4 | |
| Healthy volunteers | N/A | 4 | 4 | |
| HIV-negative | 8 | 3 | 11 | |
Controls: 20 undetermined.
Data missing for ten patients with tuberculosis and six controls.
Tuberculosis: one HIV positive, nine undetermined.