Sana Syed, Mohammad Al-Boni, Marium N Khan, Kamran Sadiq, Najeeha T Iqbal, Christopher A Moskaluk, Paul Kelly, Beatrice Amadi, S Asad Ali, Sean R Moore, Donald E Brown.
Abstract
Importance: Duodenal biopsies from children with enteropathies associated with undernutrition, such as environmental enteropathy (EE) and celiac disease (CD), display significant histopathological overlap.

Objective: To develop a convolutional neural network (CNN) to enhance the detection of pathologic morphological features in diseased vs healthy duodenal tissue.

Design, Setting, and Participants: In this prospective diagnostic study, a CNN consisting of 4 convolutions, 1 fully connected layer, and 1 softmax layer was trained on duodenal biopsy images. Data were provided by 3 sites: Aga Khan University Hospital, Karachi, Pakistan; University Teaching Hospital, Lusaka, Zambia; and University of Virginia, Charlottesville. Duodenal biopsy slides from 102 children (10 with EE from Aga Khan University Hospital, 16 with EE from University Teaching Hospital, 34 with CD from University of Virginia, and 42 with no disease from University of Virginia) were converted into 3118 images. The CNN was designed and analyzed at the University of Virginia. The data were collected, prepared, and analyzed between November 2017 and February 2018.

Main Outcomes and Measures: Classification accuracy of the CNN per image and per case and incorrect classification rate identified by aggregated 10-fold cross-validation confusion/error matrices of CNN models.
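The incorrect classification rate named in the outcome measures can be illustrated with a minimal sketch. It assumes that the per-fold confusion matrices are summed cell by cell and that the error rate is the off-diagonal share of the aggregated matrix; the abstract does not spell out the aggregation. The class labels, function names, and the two toy folds below are hypothetical and are not study data.

```python
# Hypothetical class labels for illustration: EE, CD, and no disease.
LABELS = ("EE", "CD", "Normal")


def confusion_matrix(true_labels, predicted_labels, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(true_labels, predicted_labels):
        matrix[index[true]][index[pred]] += 1
    return matrix


def aggregate(matrices):
    """Sum per-fold confusion matrices cell by cell (assumed aggregation)."""
    n = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) for j in range(n)] for i in range(n)]


def misclassification_rate(matrix):
    """Share of all images that fall off the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


# Two toy folds with made-up predictions (a real run would have 10 folds).
fold_1 = confusion_matrix(["EE", "CD", "Normal", "Normal"],
                          ["EE", "CD", "Normal", "CD"])
fold_2 = confusion_matrix(["EE", "EE", "CD", "Normal"],
                          ["EE", "EE", "CD", "Normal"])

aggregated = aggregate([fold_1, fold_2])
error_rate = misclassification_rate(aggregated)
print(aggregated, error_rate)
```

In the toy data, the single off-diagonal count is a no-disease image predicted as CD, so the aggregated matrix shows which class pair is being confused as well as the overall error rate.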
Year: 2019 PMID: 31199451 PMCID: PMC6575155 DOI: 10.1001/jamanetworkopen.2019.5822
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Illustration of Proposed Convolutional Neural Network Classification and Visualization Framework
The convolutional neural network consists of 4 convolution layers and 1 fully connected layer. Each convolution layer consists of 3 sublayers: (1) a convolution layer, (2) a rectified linear unit activation layer, and (3) a max pooling layer. Deconvolution layers increase image resolution and find locations with high activations. The input image represents a hematoxylin-eosin–stained duodenal biopsy image (original magnification ×100).
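The layer stack described in the caption can be traced with a small shape calculation. This is only a sketch: the input size (256×256), the 3×3 kernels with padding 1, the 2×2 pooling windows, and the channel counts are all assumptions chosen for illustration, because the caption does not give these hyperparameters.

```python
def conv_block_shape(height, width, kernel=3, pad=1, pool=2):
    """Spatial size after one block: convolution (stride 1), ReLU, max pooling.

    ReLU leaves the shape unchanged, so only the convolution and the
    pooling step affect height and width.
    """
    height = height + 2 * pad - kernel + 1
    width = width + 2 * pad - kernel + 1
    return height // pool, width // pool


def feature_map_shapes(height, width, channels=(16, 32, 64, 128)):
    """Shapes (channels, height, width) after each of the 4 convolution blocks.

    The channel counts are hypothetical placeholders.
    """
    shapes = []
    for out_channels in channels:
        height, width = conv_block_shape(height, width)
        shapes.append((out_channels, height, width))
    return shapes


shapes = feature_map_shapes(256, 256)
# Number of inputs to the single fully connected layer after flattening.
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
for block_number, shape in enumerate(shapes, start=1):
    print(f"block {block_number}: {shape}")
print("fully connected layer inputs:", flattened)
```

Under these assumptions each block halves the spatial resolution, which is why the deconvolution layers are needed to map high activations back to locations in the original image.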
Background Clinical Characteristics of Patient Population
| Characteristic | Total Participants, No. (%) | Patients With Environmental Enteropathy, Pakistan, No. (%) | Patients With Environmental Enteropathy, Zambia, No. (%) | Patients With Celiac Disease, United States, No. (%) | Patients With No Disease, United States, No. (%) |
|---|---|---|---|---|---|
| Diagnosis | 102 (100) | 10 (9.8) | 16 (15.7) | 34 (33.3) | 42 (41.2) |
| Age, median (IQR), mo | 31.0 (20.3 to 75.5) | 22.0 (20.0 to 23.0) | 16.5 (10.5 to 21.0) | 129.0 (72.5 to 180.8) | 31.5 (22.0 to 49.8) |
| Sex | |||||
| Boys | 53 (51.9) | 5 (50.0) | 10 (62.5) | 12 (35.0) | 26 (62.0) |
| Girls | 49 (48.1) | 5 (50.0) | 6 (37.5) | 22 (65.0) | 16 (38.0) |
| Images | 121 (100) | 29 (24.0) | 16 (13.2) | 34 (28.0) | 42 (34.7) |
| Weight-for-age z score, median (IQR) | −1.00 (−3.10 to 0.06) | −3.40 (−3.78 to −2.46) | −3.75 (−5.23 to −3.21) | −0.14 (−0.77 to 0.24) | −0.36 (−1.28 to 0.93) |
| Length-for-age/height-for-age z score, median (IQR) | −1.00 (−2.33 to 0.31) | −2.85 (−3.47 to −2.35) | −3.06 (−3.84 to −2.29) | −0.12 (−0.91 to 0.67) | −0.36 (−1.15 to 0.47) |
| Weight-for-height z score, median (IQR) | −1.00 (−2.68 to 0.27) | −2.68 (−2.87 to −1.90) | −3.05 (−4.62 to −2.61) | 0.62 (0.40 to 1.07) | −0.23 (−1.06 to 0.50) |
Abbreviation: IQR, interquartile range.
Images refer to the number of hematoxylin-eosin–stained biopsy images made available to the deep learning network; these included both scanned images (celiac disease and no disease) and digitized images (environmental enteropathy from Pakistan and Zambia). For Pakistan, 2 to 3 biopsies were available from each patient, giving 29 digitized biopsy images from 10 patients.
Three patients with celiac disease did not have anthropometric data available, and they were excluded from the analysis for all z scores.
Weight-for-age z scores could only be calculated for approximately 35% of patients with celiac disease because the rest were older than 10 years and there is no reference standard for this age group.
Weight-for-height z scores could only be generated for 7 patients with celiac disease using the current algorithm.
Weight-for-height z scores could only be generated for 38 patients with no disease using the algorithm.
Figure 2. High Activation Areas
A, Hematoxylin-eosin–stained duodenal tissues with diagnosed environmental enteropathy (original magnification ×100). B, Hematoxylin-eosin–stained histologically normal duodenal tissue (original magnification ×100). These images were the areas of high activation identified by the model; we observed secretory cells, specifically Paneth cells and goblet cells, in the mucosa. Our classification model identified these secretory cells to be of high importance for distinguishing biopsies with no disease from biopsies of environmental enteropathy and celiac disease.
Figure 3. Deconvolution Groupings
We selected 151 deconvolutions from hematoxylin-eosin–stained duodenal biopsies for interpretation (original magnification ×40); the 10 groupings the model identified are shown. Red boxes and lines indicate the pixel configuration that the deconvolution model considered an area of importance. Each of these features was used by the model in its decision-making process, but the relative importance of each feature is unknown.
Classification Accuracy of Model Trained on Patients From Pakistan and Evaluated on Patients From Zambia
| Evaluation Method | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 | Model 9 | Model 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Per-image accuracy | 0.94 | 0.93 | 0.92 | 0.07 | 0.81 | 0 | 0.91 | 0.997 | 0.14 | 0.89 |
| Per-case accuracy | 1.00 | 1.00 | 1.00 | 0.13 | 1.00 | 0 | 1.00 | 1.00 | 0.13 | 1.00 |
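The table reports accuracy both per image and per case. This excerpt does not state how per-image predictions were combined into a case-level call, so the sketch below assumes a simple majority vote over each patient's images. The record layout, function names, and toy predictions are hypothetical and are not study data.

```python
from collections import Counter, defaultdict


def per_image_accuracy(records):
    """records: iterable of (case_id, true_label, predicted_label) per image."""
    records = list(records)
    correct = sum(1 for _, true, pred in records if true == pred)
    return correct / len(records)


def per_case_accuracy(records):
    """Assumed rule: a case takes the label predicted for most of its images."""
    votes = defaultdict(Counter)
    truth = {}
    for case_id, true, pred in records:
        votes[case_id][pred] += 1
        truth[case_id] = true
    correct = sum(
        1 for case_id, counts in votes.items()
        if counts.most_common(1)[0][0] == truth[case_id]
    )
    return correct / len(votes)


# Toy example: two cases with environmental enteropathy (EE), five images total.
toy_records = [
    ("case_A", "EE", "EE"),
    ("case_A", "EE", "EE"),
    ("case_A", "EE", "Normal"),  # one misclassified image
    ("case_B", "EE", "EE"),
    ("case_B", "EE", "EE"),
]

image_accuracy = per_image_accuracy(toy_records)
case_accuracy = per_case_accuracy(toy_records)
print(image_accuracy, case_accuracy)
```

Under a voting rule like this, isolated image-level errors are absorbed at the case level, which would be consistent with per-case accuracy reaching 1.00 for models whose per-image accuracy is below 1.00. Tied votes are not handled specially here.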