E Mossotto, J J Ashton, T Coelho, R M Beattie, B D MacArthur, S Ennis.
Abstract
Paediatric inflammatory bowel disease (PIBD), comprising Crohn's disease (CD), ulcerative colitis (UC) and inflammatory bowel disease unclassified (IBDU), is a complex and multifactorial condition with increasing incidence. An accurate diagnosis of PIBD is necessary for prompt and effective treatment. This study utilises machine learning (ML) to classify disease using endoscopic and histological data for 287 children diagnosed with PIBD. Data were used to develop, train, test and validate an ML model to classify disease subtype. Unsupervised models revealed overlap of CD/UC with broad clustering but no clear subtype delineation, whereas hierarchical clustering identified four novel subgroups characterised by differing colonic involvement. Three supervised ML models were developed using endoscopic data only, histological data only and combined endoscopic/histological data, yielding classification accuracies of 71.0%, 76.9% and 82.7% respectively. The optimal combined model was tested on a statistically independent cohort of 48 PIBD patients from the same clinic, accurately classifying 83.3% of patients. This study employs mathematical modelling of endoscopic and histological data to aid diagnostic accuracy. While unsupervised modelling categorises patients into four subgroups, supervised approaches confirm the need for both endoscopic and histological evidence for an accurate diagnosis. Overall, this paper provides a blueprint for ML use with clinical data.
Year: 2017 PMID: 28546534 PMCID: PMC5445076 DOI: 10.1038/s41598-017-02606-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Model and data processing. Schematic representation of the model construction (blue section), validation (green section) and IBDU reclassification (red section) phases. Solid arrows represent the data stream, while dashed arrows represent the parameter or metric stream. The discovery set was used to identify the optimal penalty parameter (C) and number of features using the recursive feature elimination with cross validation algorithm (RFE-CV). These two elements were then passed to the training and testing set, which was then modelled using a support vector machine (SVM). Three metrics were collected: area under the ROC curve (AUC), accuracy over the 5 folds, and a permutation-generated p-value.
Figure 2. Dimensionality reduction approaches and hierarchical clustering of PIBD data. (A,B) Principal component analysis (A) and multidimensional scaling (B) of clinical data from 239 PIBD patients. The first three PCA components account for 52.2% of the total variance. Important note: UC/CD/IBDU diagnoses were used only to retrospectively colour data points and were not included in actual modelling. (C) Heatmap of endoscopic and histological tissue abnormalities in PIBD patients. Abnormal manifestations are shown in orange, normal in light blue and missing data in white. Asterisks indicate histology features. Ascending colon, transverse colon and descending colon labels were shortened to A-Colon, T-Colon and D-Colon respectively. The left-hand side bar shows the referred diagnosis: CD in red, UC in blue, IBDU in yellow. Again, UC/CD/IBDU diagnoses were not used to model data but only to retrospectively colour each element. The top bar shows the type of investigation: histology in white, endoscopy in black. Identified colorectal groups are shown by dashed boxes and labelled from one (i) to four (iv). (D) Box and whisker plot depicting C-reactive protein (CRP) levels recorded at diagnosis across the four identified groups. Each box spans the first (bottom edge) to the third (top edge) quartile. Red bars and numbers are the median CRP level. Dashed whiskers show the lowest and highest CRP within each group. Black circles are outlier data points.
Preliminary assessment of linear and non-linear models. Linear support vector machine (SVM) was the selected model.
| Method | Accuracy |
|---|---|
| Simple Tree (4 splits) | 78.1% |
| Medium Tree (20 splits) | 75.2% |
| Complex Tree (100 splits) | 76.7% |
| Linear discriminant | 81.0% |
| Linear SVM | 80.5% |
| Quadratic SVM | 78.1% |
| Cubic SVM | 73.8% |
| Boosted Trees | 74.8% |
| Bagged Trees | 77.6% |
Performance of the three optimised supervised models. Asterisks indicate histological features.
| Input | Accuracy % (AUC) | Precision | Recall | F1-score | (#) Features |
|---|---|---|---|---|---|
| Endoscopy | 71.0% (0.78) | 0.89 | 0.68 | 0.75 | (5) Duodenum, Ileum, D-Colon, Rectum, Perianal |
| Histology | 76.9% (0.82) | 0.81 | 0.86 | 0.83 | (1) Ileum |
| Combined (E + H) | 82.7% (0.87) | 0.91 | 0.83 | 0.87 | (8) Duodenum, Ileum, D-Colon, Rectum, Perianal, Oesophagus*, Ileum*, A-Colon* |
All metrics represent the average over the 5-folds of the cross validation.
Figure 3. Supervised classification performance and metrics. (A) Receiver operating characteristic of the combined (light blue), histology (purple) and endoscopy (green) models. The grey dashed line represents the expected performance of a random model. (B) Permutation tests of models: dashed lines represent the observed accuracy of the combined (light blue), histology (purple) and endoscopy (green) models. The endoscopic, histological and combined models have p-values of 3 × 10⁻³, 5 × 10⁻⁶ and 1 × 10⁻⁶ respectively. The grey dashed line represents the average expected performance of a random model. Solid coloured lines show the distribution of random permutations for each model. (C) Classification of IBDU patients with the combined model into Crohn's disease (red) or ulcerative colitis (blue) subtypes. The classification posterior probability indicates the confidence of the model in assigning UC or CD labels. (D) Cumulative confidence in IBDU reclassification represented as the cumulative density function (red line) of posterior probabilities for 29 IBDU patients. Each dot represents an IBDU patient.
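The permutation test in panel (B) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say how many permutations were run or how they were generated. As simplifying assumptions, the sketch shuffles the true labels against a fixed set of predictions (a full test would refit the model for every shuffle), uses toy labels rather than study data, and applies the common add-one correction so the estimate is never exactly zero. The names `accuracy` and `permutation_p_value` are hypothetical.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of positions where the label and the prediction agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_permutations=999, seed=0):
    """Return (observed accuracy, empirical p-value).

    The p-value is the share of label shuffles whose accuracy is at
    least as high as the observed one, with an add-one correction
    (an assumption here, not something stated in the source).
    """
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between labels and predictions
        if accuracy(shuffled, y_pred) >= observed:
            at_least_as_good += 1
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Toy labels, NOT study data: 13 "UC" and 35 "CD", four of them misclassified.
y_true = ["UC"] * 13 + ["CD"] * 35
y_pred = ["UC"] * 11 + ["CD"] * 2 + ["UC"] * 2 + ["CD"] * 33

observed, p_value = permutation_p_value(y_true, y_pred)
print(f"observed accuracy = {observed:.3f}, p = {p_value:.3f}")

# An uninformative predictor (always "CD") scores the same under every
# shuffle, so every permutation ties the observed accuracy.
_, p_null = permutation_p_value(y_true, ["CD"] * 48)
print(f"always-CD predictor: p = {p_null:.3f}")
```

With this estimator the smallest reportable p-value is 1/(n_permutations + 1), so p-values near 10⁻⁶ would need on the order of a million shuffles.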
Performance of the trained combined model over the validation set.
| Validation set | Accuracy % | Precision | Recall | F1-score | Support |
|---|---|---|---|---|---|
| UC | — | 0.65 | 0.85 | 0.73 | 13 |
| CD | — | 0.94 | 0.83 | 0.88 | 35 |
| Average/Total | 83.3% | 0.86 | 0.83 | 0.84 | 48 |
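The per-class scores in the table follow from a 2 × 2 confusion matrix, which this record does not report. The counts in the Python sketch below are therefore an assumption: they were back-calculated from the reported support (13 UC, 35 CD) and the rounded precision and recall, and are included only to show how the metrics relate to one another. The "Average/Total" row is treated as a support-weighted average, which is also an assumption.

```python
# Hypothetical confusion matrix back-calculated from the reported metrics;
# the source gives only the derived scores, not these counts.
# Layout: confusion[true_label][predicted_label]
confusion = {
    "UC": {"UC": 11, "CD": 2},   # 13 patients with a UC diagnosis
    "CD": {"UC": 6, "CD": 29},   # 35 patients with a CD diagnosis
}


def class_metrics(confusion, label):
    """Return (precision, recall, f1, support) for one class."""
    tp = confusion[label][label]
    fn = sum(n for pred, n in confusion[label].items() if pred != label)
    fp = sum(row[label] for true, row in confusion.items() if true != label)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, tp + fn


def summarise(confusion):
    """Return per-class metrics, overall accuracy and weighted averages."""
    per_class = {label: class_metrics(confusion, label) for label in confusion}
    total = sum(m[3] for m in per_class.values())
    correct = sum(confusion[label][label] for label in confusion)
    # Support-weighted mean of precision, recall and F1 across classes.
    weighted = tuple(
        sum(m[i] * m[3] for m in per_class.values()) / total for i in range(3)
    )
    return per_class, correct / total, weighted


per_class, overall_accuracy, weighted = summarise(confusion)
for label, (p, r, f1, n) in per_class.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={n}")
print(f"accuracy={overall_accuracy:.1%}")
print("weighted: precision={:.2f} recall={:.2f} f1={:.2f}".format(*weighted))
```

Under these assumed counts the script reproduces every value in the table after rounding, including the 83.3% accuracy (40 of 48 patients). It also shows that the lower UC precision of 0.65 would correspond to six CD patients being labelled UC.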