
Automated multi-class classification for prediction of tympanic membrane changes with deep learning models.

Yeonjoo Choi¹, Jihye Chae², Keunwoo Park², Jaehee Hur², Jihoon Kweon², Joong Ho Ahn¹.

Abstract

BACKGROUND AND OBJECTIVE: Evaluating the tympanic membrane (TM) using an otoendoscope is the first and most important step in various clinical fields. Unfortunately, most lesions of TM have more than one diagnostic name. Therefore, we built a database of otoendoscopic images with multiple diseases and investigated the impact of concurrent diseases on the classification performance of deep learning networks.
STUDY DESIGN: This retrospective study investigated the impact of concurrent diseases in the tympanic membrane on diagnostic performance using multi-class classification. A customized architecture of EfficientNet-B4 was introduced to predict the primary class (otitis media with effusion (OME), chronic otitis media (COM), and 'None' without OME and COM) and secondary classes (attic cholesteatoma, myringitis, otomycosis, and ventilating tube).
RESULTS: Deep-learning classifications accurately predicted the primary class with dice similarity coefficient (DSC) of 95.19%, while misidentification between COM and OME rarely occurred. Among the secondary classes, the diagnosis of attic cholesteatoma and myringitis achieved a DSC of 88.37% and 88.28%, respectively. Although concurrent diseases hampered the prediction performance, there was only a 0.44% probability of inaccurately predicting two or more secondary classes (29/6,630). The inference time per image was 2.594 ms on average.
CONCLUSION: Deep-learning classification can be used to support clinical decision-making by accurately and reproducibly predicting tympanic membrane changes in real time, even in the presence of multiple concurrent diseases.


Year: 2022 PMID: 36215265 PMCID: PMC9550050 DOI: 10.1371/journal.pone.0275846

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752

Introduction

In otology, endoscopic evaluation of the tympanic membrane (TM) and the middle ear is usually the first step for patients complaining of earache or other problems such as hearing loss, dizziness, or facial palsy [1]. To evaluate otologic diseases such as acute/chronic otitis externa or acute/chronic otitis media, it is important to examine the state of the external auditory canal (EAC) and TM using common tools like the otoscope, which allows for simple observation and diagnosis. Apart from being a primary diagnostic step, an accurate otoscopic exam can also guide the correct course of treatment during the follow-up period. Because accurate diagnosis and evaluation of the disease state during follow-up are so important, intensive training is required before a clinician can diagnose these conditions accurately [1].

Unfortunately, misdiagnosis in the clinical field is still fairly common. One study reported that diagnostic accuracy varied among physicians, including otolaryngologists, pediatricians, and family medicine doctors [2]. Another study reported that otolaryngologists diagnosed these otologic diseases with 73% accuracy while pediatricians and general practitioners had accuracy rates of 50% and 64%, respectively [3]. Although there is a clear need for trained otolaryngologists to make accurate diagnoses, the limited number of specialists makes it impossible to meet this need [4]. There is therefore a need for a modality that can accurately evaluate the status of the EAC and TM to support the diagnostic system, specifically a diagnostic algorithm based on otoscopic images.

In recent years, advances in image classification using deep learning networks have been shown to improve the diagnostic performance for middle ear diseases [5-8]. Khan et al. [9] reported that the classification accuracy of a deep network reached 94.9% in the classification of normal, chronic otitis media (COM) with TM perforation, and otitis media with effusion (OME). Detection of tympanic perforation had an accuracy rate of 91% [10]. The ensemble approach, which combines the outputs of multiple networks, enhanced predictability in the categorical classification of otoendoscopic images [11,12]. Deep learning prediction can help clinicians make more accurate decisions [13]. Although previous studies showed the potential applicability of deep learning-based diagnosis, otoendoscopic images showing multiple diseases, which could hamper diagnostic accuracy, were excluded from the prediction. Therefore, in this study, we built a database of otoendoscopic images containing multiple diseases to investigate the impact of concurrent diseases on the classification performance of deep learning networks.

Materials and methods

Data description

Otoendoscopic images of the TM were collected from patients who visited the otologic clinic at Asan Medical Center from Jan 2018 to Dec 2020. In clinical practice, an otoendoscopic video sequence was taken for diagnostic examination, and an image frame visualizing the whole TM was stored in the hospital system without patient-identifiable information. Otoendoscopic images, enrolled based on the date of visit, were completely anonymized before being provided by the hospital system.

The collected images were classified into one primary class and four secondary classes according to their diagnostic classification. The categories of each image were blindly annotated by two otologists with 26 and 5 years of experience, respectively. A total of 6,630 otoendoscopic images labeled identically by the two annotators were included in this study.

The primary class was annotated as one of otitis media with effusion (OME, 1,630 images), chronic otitis media (COM, 1,534 images), and 'None' (3,466 images), meaning the absence of OME and COM. OME refers to effusions in the middle ear cavity, which manifest as an air-fluid level or as an amber-like color change of the TM. COM refers to a perforated TM.

Binary labels were given for the secondary classes of attic cholesteatoma (893 images), myringitis (1,083 images), otomycosis (181 images), and ventilating tube (1,676 images) (Fig 1). Attic cholesteatoma refers to any sign of a retraction pocket in the attic or visible attic destruction. Myringitis is defined as any inflammation of the tympanic membrane, including acute otitis media. Otomycosis refers to a fibrinous accumulation of debris or visible pores of fungus in the external auditory canal. Ventilating tube refers to a tube inserted across the TM.

For example, when a TM was normal, the primary class was 'None' and the secondary classes were 'False' for attic cholesteatoma, myringitis, otomycosis, and ventilating tube (Fig 2B). An otoendoscopic image with only otomycosis was assigned 'None' for the primary class, 'True' for otomycosis, and 'False' for the other secondary classes. For 3,508 images, one or more secondary classes were positive.

The present study is in compliance with the Declaration of Helsinki, and research approval was granted by the Institutional Review Board of the Asan Medical Center with a waiver of research consent (IRB no. 2021–0837).
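The labeling scheme described above (one mutually exclusive primary class plus four independent binary secondary labels) can be sketched as a small data structure. This is only an illustration: the names `Primary`, `TMLabel`, and `n_secondary_positive` are hypothetical and do not come from the study's code.

```python
from dataclasses import dataclass
from enum import Enum


class Primary(Enum):
    """Primary class: exactly one value per image."""
    NONE = "None"  # neither OME nor COM
    OME = "OME"    # otitis media with effusion
    COM = "COM"    # chronic otitis media


@dataclass(frozen=True)
class TMLabel:
    """Hypothetical container for one annotated otoendoscopic image."""
    primary: Primary
    attic_cholesteatoma: bool = False
    myringitis: bool = False
    otomycosis: bool = False
    ventilating_tube: bool = False

    def n_secondary_positive(self) -> int:
        # Number of secondary classes labeled 'True' for this image.
        return sum([self.attic_cholesteatoma, self.myringitis,
                    self.otomycosis, self.ventilating_tube])


# The two labeling examples given in the text.
normal_tm = TMLabel(Primary.NONE)
otomycosis_only = TMLabel(Primary.NONE, otomycosis=True)
print(normal_tm.n_secondary_positive(), otomycosis_only.n_secondary_positive())
```

Keeping the primary class as an enumeration and the secondary classes as separate booleans mirrors the distinction the text draws between non-coexisting and concurrently detectable findings.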

Fig 1

Classification of otoendoscopic images by primary and secondary classes with representative examples.

OME, otitis media with effusion; COM, chronic otitis media.

Fig 2

(a) Schematic diagram of deep learning network for multi-class classification of otoendoscopic images. (b) Labeling examples. For a normal tympanic membrane (TM), the otoendoscopic image was labeled as ’None’ for the primary class and ’False’ for the secondary classes (attic cholesteatoma, myringitis, otomycosis and ventilating tube). When TM was diseased as one of the secondary classes without otitis media with effusion (OME) and chronic otitis media (COM), the primary class was given as ’None’ for the otoendoscopic image.


Deep learning network

The architecture of EfficientNet-B4 [14] was customized to have shared and task-specific layers for multi-task learning (Fig 2A). The task-specific layers consisted of five shallow classifiers corresponding to the primary class and the four secondary classes ('combined model'). Parameters were not shared between the classifiers. The inputs to the deep networks were RGB images reformatted to 256×256×3 with circular cropping (Fig 2A). Data augmentation was performed by randomly applying rotation (−90° to 90°), translation shift (0–20% of image size in horizontal and vertical axes), zoom (0–20%), horizontal flip, brightness change (0–20%), and downscaling (0–50%). ImageNet pre-trained weights were applied for transfer learning. Categorical cross-entropy loss was adopted to train the models for multi-class classification, which is defined as

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} t_{ij}\log\left(p_{ij}\right),$$

where N is the number of training samples, M is the number of classes, t is the ground truth, and p is the output probability. The final output was determined as the class with the highest softmax value.
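As a rough illustration of the categorical cross-entropy and the softmax-based decision described above, the following standard-library sketch computes both for two made-up samples. It assumes the loss is averaged over the N samples and uses one-hot targets. The logits are hypothetical values, and the function names are illustrative, not taken from the authors' PyTorch implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def categorical_cross_entropy(targets, probs, eps=1e-12):
    """Mean over samples of -sum_j t_ij * log(p_ij); eps guards against log(0)."""
    n = len(targets)
    total = 0.0
    for t_row, p_row in zip(targets, probs):
        total += sum(t * math.log(p + eps) for t, p in zip(t_row, p_row))
    return -total / n


# Hypothetical classifier outputs for two images and three primary classes.
logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 3.0]]
targets = [[1, 0, 0], [0, 0, 1]]  # one-hot ground truth

probs = [softmax(row) for row in logits]
loss = categorical_cross_entropy(targets, probs)
# Final output: index of the class with the highest softmax value.
predicted = [max(range(len(p)), key=p.__getitem__) for p in probs]
print(predicted, loss)
```

In the multi-task setting described in the text, one such loss term would be evaluated per classifier head. How the five terms are weighted or summed is not stated in the source.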

Training setup and evaluation metrics

The deep learning model, implemented using PyTorch, was trained on a workstation with an AMD Ryzen 7 5800X CPU (3.8 GHz), 128 GB RAM, and two NVIDIA GeForce RTX 3090 Ti GPUs. Model training was conducted for a maximum of 200 epochs with a mini-batch size of 32. For training, an Adam optimizer was applied with β1 = 0.9 and β2 = 0.9999. The learning rate was initially set to $10^{-3}$ and was reduced by half with a saturation criterion of 50 epochs. The evaluation metrics for each label were precision, sensitivity (recall), specificity, and dice similarity coefficient (DSC), which were defined as precision = TP / (TP + FP), sensitivity = TP / (TP + FN), specificity = TN / (FP + TN), and DSC = 2 × precision × recall / (precision + recall), where TP is true positive, FP is false positive, TN is true negative, and FN is false negative. The per-class accuracy was calculated by dividing the sum of TPs and TNs by the total number of images in a fold. For 5-fold cross validation, the dataset was divided so that each fold contained an equal number of images (n = 1,326). The fold proportion of training, validation, and test sets was fixed at 3:1:1 and their compositions were changed under cyclic permutation.
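The metric definitions and the 3:1:1 cyclic fold assignment can be sketched as follows. The confusion counts are invented for illustration, and `cyclic_splits` is one plausible reading of "cyclic permutation" (rotating the fold order by one position per run), not a confirmed description of the authors' procedure.

```python
def binary_metrics(tp, fp, fn, tn):
    """Per-label metrics as defined in the text (counts must be non-degenerate)."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # recall
    specificity = tn / (fp + tn)
    dsc = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "dsc": dsc, "accuracy": accuracy}


def cyclic_splits(n_folds=5, n_train=3, n_val=1):
    """Rotate fold indices so each fold serves once as the test set."""
    splits = []
    for shift in range(n_folds):
        order = [(shift + k) % n_folds for k in range(n_folds)]
        splits.append({"train": order[:n_train],
                       "val": order[n_train:n_train + n_val],
                       "test": order[n_train + n_val:]})
    return splits


# Hypothetical confusion counts for one secondary class.
m = binary_metrics(tp=80, fp=10, fn=20, tn=890)
splits = cyclic_splits()
print(m)
print(splits[0])
```

With these definitions the DSC equals 2TP / (2TP + FP + FN), so it does not depend on the number of true negatives. This is why a rare class such as otomycosis can combine a high accuracy with a comparatively low DSC.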

Separate prediction for single class as reference

To evaluate the performance of multi-class classification, the deep learning models for the prediction of each class were separately trained (’separate model’). In this setting, only one classifier for the target class remained in the task-specific layers (Fig 2A).

Statistical analysis

Categorical variables are presented as numbers and percentages. The McNemar test was applied to compare DSC values between combined and separate models. Statistical analyses were performed using R.
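For readers unfamiliar with the McNemar test, the sketch below computes a two-sided exact p-value from the two discordant counts of a paired comparison (images that only one of the two models classified correctly). The source does not state which variant of the test or which R function was used, so the exact binomial form and the example counts here are assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: pairs where only model A was correct
    c: pairs where only model B was correct
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, not taken from the study.
p_value = mcnemar_exact(10, 2)
print(round(p_value, 4))
```

Only the discordant pairs enter the test. Images that both models classify correctly, or both incorrectly, do not affect the p-value.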

Results

Classification performance of combined model

In the prediction of the primary class, the overall dice similarity coefficient (DSC) was 95.19%, with COM achieving the highest DSC of 96.09% (Table 1). Misidentification between COM and OME rarely occurred (7 images), and most of the prediction errors appeared as false positives and false negatives in the 'None' class (Fig 3). Among the secondary classes, the ventilating tube was most accurately diagnosed (DSC = 98.89%), followed by attic cholesteatoma and myringitis with DSCs of 88% or higher (Table 1). Otomycosis, which was trained with fewer positive cases, had a lower DSC than the other classes. The AUC values for the primary and secondary classes were ≥ 0.9925 (Fig 4).

Table 1

Prediction performance of combined model for primary and secondary classes.

McNemar test was applied for the comparison with separate models, denoted with the subscript ’sep’.

	DSC	Accuracy	Sensitivity	Precision	Specificity	DSC_sep	DSC − DSC_sep	p-value
Primary class (P)	95.19%	95.32%	95.38%	95.32%	94.65%	94.90%	0.29%	0.360
None	95.68%	-	96.91%	94.49%	93.58%	95.49%	0.19%	-
OME	93.80%	-	91.90%	95.78%	96.44%	93.76%	0.04%	-
COM	96.09%	-	95.37%	96.82%	95.31%	95.46%	0.63%	-
Attic cholesteatoma (S1)	88.37%	96.97%	85.54%	91.39%	98.74%	87.75%	0.62%	0.663
Myringitis (S2)	88.28%	96.21%	87.26%	89.32%	97.96%	88.58%	-0.30%	0.404
Otomycosis (S3)	72.38%	98.69%	62.98%	85.07%	99.69%	68.26%	4.12%	0.030
Ventilating tube (S4)	98.89%	99.44%	98.57%	99.22%	99.74%	98.68%	0.21%	0.879

Fig 3

Confusion matrix of combined model in 5-fold cross validation for the prediction of primary and secondary classes.

GT, ground truth; OME, otitis media with effusion; COM, chronic otitis media.

Fig 4

Receiver operating characteristics (ROC) curves and AUC values for primary and secondary classes.

Micro-average was applied to evaluate the overall predictability of deep learning model for the primary class. AUC, area under the ROC curve; OME, otitis media with effusion; COM, chronic otitis media.


Impact of concurrent diseases

With a greater number of positive secondary classes, the probability of accurate prediction for all classes decreased from 92.57% to 14.29% (Table 2). When two or more secondary classes were positive, the proportion of images with at least one false prediction was over 40%. Nonetheless, the combined model had only a 0.44% probability of incorrectly predicting two or more secondary classes (29/6,630).

Table 2

Comparison of prediction accuracy between combined and separate models according to the number of positives in the secondary classes.

	Number of positives in the secondary classes	Primary class correct					Primary class incorrect
		Number of the secondary classes incorrectly predicted					Number of the secondary classes incorrectly predicted
		4	3	2	1	0	4	3	2	1	0
Combined model	0 (n = 3,122)	-	-	-	127 (4.07%)	2,890 (92.57%)	-	-	-	14 (0.45%)	91 (2.91%)
	1 (n = 3,190)	-	1 (0.03%)	8 (0.25%)	222 (6.96%)	2,777 (87.05%)	-	-	3 (0.09%)	33 (1.03%)	146 (4.58%)
	2 (n = 311)	-	1 (0.32%)	10 (3.22%)	104 (33.44%)	173 (55.63%)	-	1 (0.32%)	2 (0.64%)	12 (3.86%)	8 (2.57%)
	3 (n = 7)	-	-	3 (42.86%)	3 (42.86%)	1 (14.29%)	-	-	-	-	-
	Sum (n = 6,630)	-	2 (0.03%)	21 (0.32%)	456 (6.88%)	5,841 (88.10%)	-	1 (0.02%)	5 (0.08%)	59 (0.89%)	245 (3.70%)
Separate model	0 (n = 3,122)	-	-	4 (0.13%)	122 (3.91%)	2,857 (91.51%)	-	-	-	16 (0.51%)	123 (3.94%)
	1 (n = 3,190)	-	-	13 (0.41%)	269 (8.43%)	2,743 (85.99%)	-	-	1 (0.03%)	16 (0.50%)	148 (4.64%)
	2 (n = 311)	-	-	10 (3.22%)	106 (34.08%)	172 (55.31%)	-	1 (0.32%)	-	14 (4.50%)	8 (2.57%)
	3 (n = 7)	-	-	2 (28.57%)	1 (14.29%)	4 (57.14%)	-	-	-	-	-
	Sum (n = 6,630)	-	-	29 (0.44%)	498 (7.51%)	5,776 (87.12%)	-	1 (0.02%)	1 (0.02%)	46 (0.69%)	279 (4.21%)
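The headline figures quoted in the text can be re-derived from the 'Sum' row of the combined model in Table 2. The short tally below is a reader-side check; the dictionary layout and variable names are illustrative, and the counts are copied from the table.

```python
# (primary class correct?, number of secondary classes mispredicted) -> image count,
# copied from the 'Sum' row of the combined model in Table 2.
combined_sum = {
    (True, 3): 2, (True, 2): 21, (True, 1): 456, (True, 0): 5841,
    (False, 3): 1, (False, 2): 5, (False, 1): 59, (False, 0): 245,
}

total = sum(combined_sum.values())
# Images with two or more secondary classes mispredicted, regardless of the primary class.
two_or_more = sum(n for (_, wrong), n in combined_sum.items() if wrong >= 2)
# Images for which the primary class and all four secondary classes were correct.
all_correct = combined_sum[(True, 0)]

pct_two_or_more = round(100 * two_or_more / total, 2)
pct_all_correct = round(100 * all_correct / total, 2)
print(total, two_or_more, pct_two_or_more, pct_all_correct)
```

The tally reproduces the 29/6,630 (0.44%) and 5,841/6,630 (88.1%) figures reported in the text.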

Comparison with separate models

Compared to the separate models, the combined model slightly improved the DSC of every class except myringitis; the difference reached statistical significance only for otomycosis (p = 0.030, Table 1). The combined model provided correct diagnoses for all classes in 88.1% of the images (5,841/6,630), which was 0.98 percentage points higher than the separate models (Table 2, p = 0.009).

Discussion

In real practice, it is not easy to examine the status of the TM and reach an accurate diagnosis of the middle ear in a short time in crying children or non-cooperative patients. Additionally, when a skilled otologist is not available, an incorrect diagnosis is likely, which can lead to malpractice. Although diagnostic rates have increased dramatically since otoendoscopy was introduced, diagnostic accuracy still differs among physicians [2], and even otolaryngologists sometimes produce inaccurate diagnoses [3]. Therefore, many researchers have worked on various deep learning models for the effective diagnosis of middle ear diseases.

Previous studies have shown that deep-learning classification can predict the diagnosis of otitis media with accuracies of up to 98.26% [8,9,12]. Alhudhaif et al. [8] analyzed a total of 956 otoendoscopic images divided into five classes consisting of otitis externa, ear ventilating tube, foreign bodies in the ear, pseudo-membranes, and tympanosclerosis with an overall accuracy rate of 98.26%. Khan et al. [9] analyzed 2,484 otoendoscopic images divided into three classes consisting of normal, perforation, and middle ear effusion with an overall accuracy rate of 95%. Zeng et al. [12] analyzed 20,542 otoendoscopic images divided into eight classes consisting of normal, cholesteatoma of the middle ear, chronic suppurative otitis media, external auditory canal bleeding, impacted cerumen, otomycosis external, secretory otitis media, and tympanic membrane calcification with an overall accuracy rate of 95.59%. However, these studies were limited in that only one diagnostic label per image was assigned for deep-learning prediction, even though multiple diseases can be detected simultaneously in real practice. For example, some patients with attic cholesteatoma have a ventilating tube to prevent TM retraction, and myringitis can be diagnosed in a patient who has a tympanic perforation with or without tympanosclerosis.

In this study, we proposed a deep-learning method that can predict the diagnosis of TM changes for two non-coexisting diseases (OME and COM) and four concurrently detectable categories (attic cholesteatoma, myringitis, otomycosis, and ventilating tube) with a single network. Our deep-learning classification demonstrated high predictive performance using a database including TMs with up to 4 diseases at the same time. The DSC value of the primary class was greater than 95%, with COM achieving the highest value. Among the secondary classes, the ventilating tube was rarely misidentified (DSC = 98.89%). Therefore, the multi-class classification of TM changes may have higher clinical applicability than previous approaches in which all images were single-labeled.

The combined model, which predicts multiple classes at the same time, produced better outcomes and required less inference time than the separate models, which required per-class training. The combined model made its prediction by comprehensively observing the entire tympanic membrane (Fig 5). The combined model also required about one fifth of the total training and inference time of the separate models (Table 3).

These advantages of deep-learning prediction can help improve the overall diagnostic quality for TM changes. Owing to their high predictability, the deep learning models can also support clinical decision-making for inexperienced clinicians and be used as a training tool for medical staff. The reduced analysis time of the deep learning models can also make real-time application more feasible. In the same regard, deep learning prediction can support more accurate diagnoses beyond the constraints of time and space through telemedicine. Finally, their high reproducibility can enhance the reliability and objectivity of the analysis tool for diagnosis.

Fig 5

Table 3

Computational cost and inference time for application of deep-learning classification for tympanic membrane changes.

Model		Number of parameters (million)	Training time (s)	Training time per epoch (s)	Inference time per image (ms)
Combined		17.57	10,166	50.83	2.594
Separate	Primary class	17.55	8,654	43.27	2.616
	Attic cholesteatoma	17.55	11,365.2	56.83	2.570
	Myringitis	17.55	11,147.2	55.74	2.558
	Otomycosis	17.55	11,272	56.36	2.572
	Ventilating tube	17.55	7,087.8	35.44	2.577
	Sum		49,526.2	247.64	12.893
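The cost comparison in Table 3 reduces to a few lines of arithmetic. The sketch below is a reader-side check with illustrative variable names. It sums the five separate models and compares the totals with the combined model, using the values from the table.

```python
# Values copied from Table 3: (training time in s, inference time per image in ms).
combined = {"train_s": 10166.0, "infer_ms": 2.594}
separate = {
    "Primary class": (8654.0, 2.616),
    "Attic cholesteatoma": (11365.2, 2.570),
    "Myringitis": (11147.2, 2.558),
    "Otomycosis": (11272.0, 2.572),
    "Ventilating tube": (7087.8, 2.577),
}

sep_train = sum(t for t, _ in separate.values())
sep_infer = sum(ms for _, ms in separate.values())

train_ratio = sep_train / combined["train_s"]    # how many times longer separate training takes
infer_ratio = sep_infer / combined["infer_ms"]   # same for per-image inference
images_per_second = 1000.0 / combined["infer_ms"]
print(round(train_ratio, 2), round(infer_ratio, 2), round(images_per_second, 1))
```

Both ratios come out close to five, which matches the statement that the combined model needs roughly one fifth of the time of the five separate models. At 2.594 ms per image the throughput exceeds 380 images per second.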

Fig 5. Grad-CAM visualization of representative examples for combined (upper row) and separate (lower row) models. The red areas indicate the image regions where the model's attention is strong. GT, ground truth; OME, otitis media with effusion.

However, this study still has some limitations. First, even though a large number of samples were collected for analysis, the deep learning dataset was collected from a single center. Second, the small sample size of otomycosis resulted in fewer training opportunities, thus impairing its predictability. Third, as the number of positives in the secondary classes increased, the number of secondary classes correctly predicted decreased, even in multi-class classification. An extended dataset with diverse disease patterns can be used to validate the generality and robustness of our classification and improve the prediction performance for TM changes. In the same vein, applying the method to otoendoscopic video sequences [15] could help overcome the bias of still image-based prediction. Cerumen, which was not included in this study, may limit the information on TMs required for diagnosis. As part of the pre-diagnosis evaluation process, quantifying the amount of cerumen using deep-learning segmentation would help determine whether cleaning of the external acoustic meatus is necessary for accurate diagnosis. Ultimately, it is necessary to develop diagnostic tools that anyone can use to examine the EAC and easily diagnose otologic diseases.

Conclusions

In the present study, we developed a multi-class classification method for predicting TM changes using deep learning. The deep-learning algorithm accurately diagnosed the TM changes on otoendoscopic images, even for multiple concurrent diseases. Using the combined model, the inference time per image was reduced to 2.594 ms (more than 380 images can be processed per second), which indicates that deep-learning prediction is applicable in real time. Therefore, deep-learning classification can support clinical decision-making by accurately and reproducibly predicting tympanic membrane changes in real time, even in the presence of multiple concurrent diseases.

21 Jun 2022

PONE-D-22-12241

Automated Multi-class Classification of Otitis Media using Deep Learning

PLOS ONE

Dear Dr. Ahn,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.


We look forward to receiving your revised manuscript.

Kind regards,

Rafael da Costa Monsanto, M.D.
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Thank you for stating the following financial disclosure: "This work was supported by the National Research Foundation of Korea(NRF) grant funded by the Korea government(MSIT).(No. 2021R1A2C2010048)" Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed. Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

3. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability. Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized. Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access. We will update your Data Availability statement to reflect the information you provide in your cover letter.

Additional Editor Comments:

Please address the comments made by all the reviewers. Although reviewers agree this is an interesting study, there are several concerns that must be addressed before the article is considered for publication. Major concerns included:

- Grammar and syntax review is necessary;
- A more detailed description of the methods is needed (calculation of sample size, validation of methods, experience of examiners, inter-observer agreement, etc);
- The IRB protocol number must be included in the body of the manuscript;
- What the category "None" entails;
- The lack of inclusion / differentiation of acute otitis media and otomycosis.

Please include an itemized, detailed response to the comments made by all reviewers. If not yet done so, please make all data available or provide a detailed explanation why some of the data must not be shared.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly
Reviewer #2: Yes
Reviewer #3: Partly

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: I Don't Know

3. Have the authors made all data underlying the findings in their manuscript fully available?
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No
Reviewer #2: Yes
Reviewer #3: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Dear authors, thank you for this interesting study. This may be helpful for general practitioners, pediatricians and other specialities that may work with some patients and will present some challenges for their diagnosis of middle ear diseases. However, the presentation of data in tables and figures looks even more complete than their explanations in the manuscript. Also, there are some detailed issues with English writing. This manuscript will benefit from a refinement of syntax and grammar. Some phrases look pretty colloquial for the scope of this journal. There are some other questions regarding the manuscript:

1. Why didn't you include acute otitis media? This is one of the most common ear diseases in children and the assessment of this disease will enrich the study.
2. In the study, you only included two researchers and one otologist whose time in practice we do not know; it would be really helpful to know it. Another question is the variability between researchers: did you take it into account during the study? Also, the variability between ears of the same patient needs, I think, to be addressed. You may need to clarify whether both researchers were blinded or not and how the blinding was done.
3. It is a bit tough to know what is the first and the secondary class; throughout the text, this is very confusing for readers.
4. The classification 'none' is sometimes confused with OME and COM; this needs to be clear for readers.
5. It will be very interesting if the discussion is refined and a comparison with other software and tools is made.

Your work is very interesting but needs to be refined in grammar, syntax and some specific details in the methodology and results.

Reviewer #2: Thank you very much for the opportunity to review this quite fascinating manuscript for PLOS ONE. In this retrospective study, the authors aimed to assess the impact of concurrent changes of the tympanic membrane (perforation, myringitis and more) by using deep learning. Despite this interesting approach, I have a few comments.

1. TITLE: I think it would be more appropriate if the authors changed the title to something more related to classification of tympanic membrane changes by using deep learning assessment.
2. MATERIALS AND METHODS: How were the samples collected? What equipment was used and how were images processed? How distant from the tympanic membrane were the photos taken? Was the method reproducible?
3. Please write the IRB number in the manuscript.
4. Was the method already validated previously?
5. It is interesting that ventilation tubes were not confused with TM perforation.
6. I suggest the authors write a paragraph discussing the importance of deep learning in assessing the tympanic membrane also in times of telemedicine.
7. Line 199: "images inyo three classes" (into?)
8. I suggest the authors re-write the last paragraph in order to add the conclusion section.
9. Did the authors find any correlation between the size of the perforation and better results in automated classification?

Reviewer #3: This retrospective study built a database of otoendoscopic images including multiple diseases and investigated the impact of concurrent diseases on the classification performance of a deep learning network, using an algorithm for multi-class classification of otitis media. However, I would like to point out some aspects that need clarification.

1) For the primary classification of the images into the 3 categories, it was not clear which cases would be included in the category "NONE": only images without a diagnosis of chronic otitis media and otitis media with effusion, or also images of normal ears and cerumen? In addition, for the secondary classification, a very small number of certain cases was allocated, mainly of otomycosis, which greatly impaired the accuracy of this diagnosis.
2) As the authors themselves report in the justifications for this study, there are several criteria for the diagnosis of tympanic membrane lesions, and the criteria used by specialists (a skilled otologist) for the diagnosis of myringitis in this study were not described.
3) It needs to be clarified why the construction of this algorithm does not include the diagnosis of acute otitis media and cerumen, which could certainly impact the otoendoscopies performed by other non-ENT professionals who did not clean the external acoustic meatus.
4) Several deep learning models for the diagnosis of middle ear diseases have already been developed, and it is not described what the real differences are between the previous models and the combined model of this study, or whether the images used were still images or otoendoscopic video sequences.
5) It should be further described how this model can reduce inference time and computational resources for diagnostic support.
6) The titles of figures 2, 3 and 4 are in bold and without focus, making them very difficult to read.
7) The conclusion in the abstract is extremely broad, making inferences that cannot be supported by the results presented, and we emphasize that what is described at the end of the manuscript is much more faithful to the results presented in the study.
8) We suggest checking the writing of the manuscript because there are typos.

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free.
Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 11 Aug 2022 Response to the comments Reviewer #1: Dear authors Thank you for this interesting study. This maybe helpful for general practitioners, pediatricians and other specialities that may work with some patients and will present some challenges for their diagnosis of middle ear diseases. However, the presentation of data in tables and figures look even more complete than theie explanations on the manuscript. Also there are some detailed issues with English writing. This manuscript will be totally benefit of a refinement of syntax and grammar. Some phrases look pretty colloquial for the scope of this journal. We thank you for your time and input. We have responded to the comments below. There are some other questions regarding the manuscript 1. Why don't you included the acute otitis media? this is one of the most common ear diseases in children and the assessment of this disease will enrich the study. We agree that acute otitis media (AOM) is one of the common otologic diseases, especially in children. In our study, AOM falls under the secondary class of ‘myringitis’, defined as any inflammation on tympanic membrane. If the deep-learning model correctly predicts for AOM, the primary and secondary classes will be ‘None’ and ‘myringitis’, respectively. We added a sentence describing the classification setting for AOM. Added in page 7: Myringitis is defined as any inflammation of the tympanic membrane, including acute otitis media. 2. In the study, you only included two researchers, and one otologist that we do not know their time at practice, that would be really helpful to know it. Another question is variability between researchers, did you take it during the study? 
and also the variability between ears of a same patient, I think, this needs to be addressed. You may need to clarify if both researchers were blinded or not and how was blinding In our study, two experienced otologists participated in the annotation of otoendoscopic images, which did not contain any patient-identifiable information. They blindly assigned labels to each image. When two annotators labeled an image identically, the image was included in the dataset due to 'complete agreement'. We agree that the current description of the labeling process may raise concerns about bias in data selection. The relevant sentences were changed to clarify the labeling process of our study. Changed in page 7: The categories of each image were blindly annotated by two otologists with 26 and 5 years of experience, respectively. A total of 6,630 otoendoscopic images labeled identically by two annotators were included in this study. 3. It is a bit tough to know what is first and secondary class, throughout the text, is very confusing for readers To explain the combination of the primary class (OME, COM, 'None') and the secondary classes (attic cholesteatoma, myringitis, otomycosis and ventilating tube), Figure 2(b) was added as suggested. Changed in Figure 2: Figure 2 (a) Schematic diagram of deep learning network for multi-class classification of otoendoscopic images. (b) Labeling examples. For a normal tympanic membrane (TM), the otoendoscopic image was labeled as 'None' for the primary class and 'False' for the secondary classes (attic cholesteatoma, myringitis, otomycosis and ventilating tube). When TM was diseased as one of the secondary classes without otitis media with effusion (OME) and chronic otitis media (COM), the primary class was given as 'None' for the otoendoscopic image. 4. 
The classification 'none', sometimes is confused with OME and COM, this needs to be clear for readers The difference between previous approaches and our study is 'None' in the primary classes. When TM falls under any of the secondary classes without OME and COM, the primary class was assigned as 'None'. Since the prediction of the secondary class was binary in nature (True/False), the absence of the disease could be predicted. However, the primary class had two options of OME and COM (multi-class classification), so the absence of COM and OME should be included as a class in the primary class. Together with Figure 2(b), the sentences describing the meaning of 'None' were added as suggested. Added in pages 7 and 8: For example, when a TM was normal, the primary class was 'None' and the secondary classes were 'False' for attic cholesteatoma, myringitis, otomycosis, and ventilating tube (Fig. 2b). An otoendoscopic image with only otomycosis was assigned 'None' for the primary class, 'True' for otomycosis, and 'False' for the other secondary classes. 5. Will be very interesting if the discussion is refined and comparison with other softwares and tools is made We agree that the comparison will be interesting. However, there is currently no open-source or commercial solution for TM diagnosis. We will consider it if/when such a solution is available. The discussion section was reorganized and rephrased as suggested too. Since major revisions were made to the Discussion section, we regret that we cannot provide complete revisions below. Your work is very interesting but needs to be refined in grammar, syntax and some specific details in the methodology and results Grammar, syntax, and other expressions that needed to be improved have been corrected as suggested. We regret that we cannot provide a complete list of revisions due to space issues. Reviewer #2: Thank you very much for the opportunity to review this quite fascinating manuscript for PLOS ONE. 
In this retrospective study, authors aimed to assess the impact of concurrent changes of the tympanic membrane (perforation, myringitis and more) by using deep learning. Despite of this interesting approach, I have few comments. We thank you for your time and input. We have responded to the comments below. 1. TITLE: I think it should me more appropriate if authors change the title for something more related to classification of tympanic membrane changes by using deep learning assessment. Thank you for your suggestion. If the editorial policy of PLoS One permits the change of the publication title, we will change the title as follows: 'Multi-class Classification for Prediction of Tympanic Membrane Changes With Deep Learning Models'. 2. MATERIALS AND METHODS: How the samples were collected? What equipment was used and how images were processed? How distant from the tympanic membrane photos were taken? Was the method reproducible? In routine clinical practice, otologists performed the diagnostic examination using a real-time video sequence and captured an image frame visualizing the whole TM for diagnosis. The completely anonymized otoendoscopic images were stored in the hospital system. After the IRB review, the requested dataset was provided to the researchers. The imaging devices are listed below. The pre-processing of the collected images is described in Figure 2. The distance of the otoendoscope from the tympanic membrane varied according to the shape of the patient's auditory canal. One of the biggest advantages of deep learning is its reproducibility. No differences were found in repeated tests. Information about sample collection and image acquisition conditions was added in the manuscript, as suggested. 
Endoscopy digital processor: Olympus VISERA ELITE 2, Olympus VISERA CLV-S40, Olympus OTV-SP1. Camera head: Olympus CH-S200-XZ-EB, Olympus OTV-SP1H-NA-12E. Added in page 7: In clinical practice, the otoendoscopic video sequence was taken for diagnostic examination and an image frame visualizing the whole TM was stored in the hospital system without patient-identifiable information. Otoendoscopic images enrolled based on the date of visit were completely anonymized before being provided by the hospital system. 3. Please, write the IRB number in the manuscript. IRB number was added in the manuscript as suggested. Added in Page 7: The present study is in compliance with the Declaration of Helsinki and research approval was granted from the Institutional Review Board of the Asan Medical Center with a waiver of research consent (IRB no. 2021-0837). 4. Was the method already validated anteriorly? EfficientNet-B4, which was used as the basis of our deep-learning models, is a network architecture that has been validated through a lot of studies. For the customized network of our study for TM change, the performance was validated through an ablation test. 5. It is interesting that ventilation tubes were not confused with TM perforation. In the prediction of ventilation tube, the deep-learning model in our study failed in only 37 cases (0.56% of entire images) and the mis-identification between the ventilation tube and perforation (COM) was rarely observed (3 cases). Please see the images for the false predictions in attached file. 6. I suggest the authors to write a paragraph discussing the importance of the deep learning in assessing the tympanic membrane also in times of telemedicine. The importance of deep-learning classification in the use of tele-medicine was described in Discussion section. 
Changed in page 14: The combined model for predicting multiple classes at the same time produced better outcomes and required less inference time than the separate models that required a per-class training. The combined model made its prediction by comprehensively observing the entire tympanic membrane (Fig. 5). The combined model also finished the prediction in 1/5 of the training and inference time required for separate models (Table 3). These advantages of deep-learning prediction can help improve the overall diagnostic quality for TM changes. Due to their high predictability, the deep learning models can also support clinical decision-making for inexperienced clinicians and be utilized as a training tool for medical staff. The reduced analysis time of the deep learning models can also make real-time application more feasible. In the same regard, deep learning prediction can help with more accurate diagnoses beyond the constraints of time and space through tele-medicine. Finally, their high reproducibility can enhance the reliability and objectivity of the analysis tool for diagnosis. 7. Line 199: "images inyo three classes" (into?) The typo was corrected as suggested. 8. I suggest the authors to re-write the last paragraph in order to add the conclusion section. The last paragraph of Discussion section was reorganized and rephrased as suggested. Following the Editorial policy of PLoS One, a conclusion section was added. Added in pages 15 and 16: Conclusions In the present study, we developed a multi-class classification method for predicting TM changes using deep-learning. The deep-learning algorithm accurately diagnosed the TM changes on otoendoscopic images, even for multiple concurrent diseases. Using the combined model, the inference time per image was reduced to 2.594 ms (more than 380 images can be processed per second), which indicates that deep-learning prediction can be applicable in real-time. 
Therefore, deep-learning classification can support clinical decision-making by accurately and reproducibly predicting tympanic membrane changes in real time, even in the presence of multiple concurrent diseases. 9. D the authors found any correlation among size of perforation and better results in automated classification? For perforations smaller than a quarter of TM area, the sensitivity was 93.68% (474/506), while the overall sensitivity was 95.37% (1,463/1,534). The area analysis was performed visually. In the future work, quantification analysis of perforation and TM areas will be assessed using deep-learning segmentation. Reviewer #3: This retrospective study built a database of otoendoscopic images including multiple diseases and investigate the impact of concurrent diseases on the classification performance of deep learning network on the diagnostic performance using algorithm for multi-class classification of otitis media. However, I would like to point out some aspects that need clarification. We thank you for your time and input. We have responded to the comment below. 1) For the primary classification of the images in the 3 categories, the category "NONE" was not clear which cases would be included, if images without diagnosis of chronic otitis media and otitis media with effusion, or if there would also be images of normal ears, cerumen? In addition, for the secondary classification, there was the allocation of a very small number of certain cases, mainly of otomycoses, which greatly impaired the accuracy of this diagnosis. 'None' indicates the absence of OME and COM. The primary class had two options of OME and COM (multi-class classification), so 'None' should be included as a class in the primary class. To explain the combination of the primary class (OME, COM, 'None') and the secondary classes (attic cholesteatoma, myringitis, otomycosis and ventilating tube), Figure 2(b) was added and the relevant sentences were added in page 7 as suggested. 
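The labeling scheme described in the response above (one primary class chosen from OME, COM and 'None', plus a True/False flag for each of the four secondary classes) can be illustrated with a minimal Python sketch. The function name `encode_label`, the ordering of the class tuples, and the integer-plus-flags output format are assumptions made for illustration; the correspondence does not state how the authors encoded their labels.

```python
# Illustrative sketch only: class ordering and output format are assumed,
# not taken from the manuscript.
PRIMARY_CLASSES = ("None", "OME", "COM")
SECONDARY_CLASSES = (
    "attic cholesteatoma",
    "myringitis",
    "otomycosis",
    "ventilating tube",
)


def encode_label(primary, secondary=()):
    """Return (primary_index, secondary_flags) for one otoendoscopic image.

    primary   -- exactly one of PRIMARY_CLASSES (mutually exclusive)
    secondary -- any subset of SECONDARY_CLASSES (may coexist)
    """
    if primary not in PRIMARY_CLASSES:
        raise ValueError(f"unknown primary class: {primary!r}")
    unknown = set(secondary) - set(SECONDARY_CLASSES)
    if unknown:
        raise ValueError(f"unknown secondary classes: {sorted(unknown)}")
    primary_index = PRIMARY_CLASSES.index(primary)
    secondary_flags = [name in secondary for name in SECONDARY_CLASSES]
    return primary_index, secondary_flags


# Normal TM: primary 'None', every secondary flag False.
normal = encode_label("None")

# Otomycosis only: primary 'None', True for otomycosis alone.
otomycosis_only = encode_label("None", ["otomycosis"])

# COM together with myringitis: a concurrent-disease example.
com_with_myringitis = encode_label("COM", ["myringitis"])
```

Under this encoding, 'None' is needed as an explicit primary option because the primary head must always pick one class, whereas each secondary flag can express absence on its own by being False.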
We agree that the small number of otomycosis samples impaired the classification performance. However, due to the low prevalence of the disease, it was not possible to increase the dataset size. This issue was described as a limitation in our study. In the future, an extended dataset collected from multiple centers will help improve the predictability. Added in pages 7 and 8: For example, when a TM was normal, the primary class was 'None' and the secondary classes were 'False' for attic cholesteatoma, myringitis, otomycosis, and ventilating tube (Fig. 2b). An otoendoscopic image with only otomycosis was assigned 'None' for the primary class, 'True' for otomycosis, and 'False' for the other secondary classes. Changed in Figure 2: Figure 2 (a) Schematic diagram of deep learning network for multi-class classification of otoendoscopic images. (b) Labeling examples. For a normal tympanic membrane (TM), the otoendoscopic image was labeled as 'None' for the primary class and 'False' for the secondary classes (attic cholesteatoma, myringitis, otomycosis and ventilating tube). When TM was diseased as one of the secondary classes without otitis media with effusion (OME) and chronic otitis media (COM), the primary class was given as 'None' for the otoendoscopic image. 2) As the author himself reports in the justifications for this study, there are several criteria for the diagnosis of tympanic membrane lesions and the criteria used by specialists (a skilled otologist) for the diagnosis of myringitis in this study were not described. The criteria for each class were added as suggested. Added in page 7: OME refers to effusions in the middle ear cavity, which manifest in the air-fluid level or as an amber-like color change of TMs. COM refers to a perforated TM. Attic cholesteatoma refers to any sign of retraction pocket in attic or visible attic destruction. Myringitis is defined as any inflammation of the tympanic membrane, including acute otitis media. 
Otomycosis refers to a fibrinous accumulation of debris or visible pores of fungus in the external auditory canal. Ventilating tube refers to an inserted tube across the TM. 3) It needs to be clarified why the construction of this algorithm does not include the diagnosis of acute otitis media and cerumen, which could certainly impact the otoendoscopies performed by other non-ENT professionals who did not clean the external acoustic meatus. We agree that acute otitis media (AOM) is a rather common otologic disease, especially in children. In our study, AOM falls under the secondary class of ‘myringitis’, defined as any inflammation on tympanic membrane. If the deep-learning model correctly predicts for AOM, the primary and secondary classes will be ‘None’ and ‘myringitis’, respectively. We added a sentence describing the classification setting for AOM. Some previous studies included 'cerumen' as a class for deep-learning classification as you described. Since this study only evaluated the diagnostic performance when sufficient information about the tympanic membrane was available, we assumed that the evaluation of cerumen was performed prior to the TM diagnosis. The impact of cerumen on prediction performance will be assessed using deep learning segmentation in our future work. We added a few sentences describing the issue of cerumen. Added in page 7: Myringitis is defined as any inflammation of the tympanic membrane, including acute otitis media. Added in page 15: Cerumen, which was not included in this study, may limit the information on TMs required for diagnosis. As part of the pre-diagnosis evaluation process, quantifying the amount of cerumen using deep-learning segmentation would be helpful to determine whether cleaning of external acoustic meatus is necessary for accurate diagnosis. 
4) Several deep learning models for the diagnosis of middle ear diseases have already been developed and it is not described what are the real differences of the previous models in relation to the combined model of this study and if the images used were still images or otoendoscopic video sequence? The main difference between previous studies and our method is that our model predicts multiple labels for a single image, while all images were single labeled in previous studies. The combined model also requires less time for deep-learning prediction, allowing real-time application. Our dataset consisted of still images, and the advantage of otoendoscopic video sequences in the deep-learning classification of TM changes was described in Discussion, as shown below. To clarify the difference between previous approaches and our study, the paragraphs below were rephrased and reorganized. Changed in page 14: In this study, we proposed a deep-learning method that can predict the diagnosis of TM changes for two non-coexisting diseases (OME and COM) and four concurrently detectable categories (attic cholesteatoma, myringitis, otomycosis and ventilating tube) with a single network. Our deep-learning classification demonstrated high predictive performance using a database including TMs with up to 4 diseases at the same time. The DSC value of the primary class was greater than 95%, with COM achieving the highest value. In terms of secondary classes, the ventilating tube was rarely misidentified (DSC = 98.89%). Therefore, the multi-class classification for TM changes may have potential for higher clinical applicability than previous approaches in which all images were single labeled. The combined model for predicting multiple classes at the same time produced better outcomes and required less inference time than the separate models that required a per-class training. The combined model made its prediction by comprehensively observing the entire tympanic membrane (Fig. 5). 
The combined model also finished the prediction in 1/5 of the training and inference time required for separate models (Table 3). 5) It should be further described how this model can reduce inference time and computational resources for diagnostic support. Information about the computational cost and inference time of deep-learning prediction was added in Table 3. Descriptions of the inference time were also added to the Abstract and Discussion section. Added in Abstract: The inference time per image was 2.594 ms on average. Added in Table 3: Changes in page 14: The combined model for predicting multiple classes at the same time produced better outcomes and required less inference time than the separate models that required a per-class training. The combined model made its prediction by comprehensively observing the entire tympanic membrane (Fig. 5). The combined model also finished the prediction in 1/5 of the training and inference time required for separate models (Table 3). 6) the titles of figures 2, 3 and 4 are in bold and without focus, making it very difficult to read. Resolutions of figures 2, 3 and 4 were improved (doubled in each axis) as suggested. 7) The conclusion in the abstract is extremely broad, making inferences that cannot be supported with the result presented and we emphasize that what is described at the end of the manuscript is much more faithful to the results presented in the study. The Conclusion section in the Abstract was rephrased to concisely explain the implication of our study. Changed in Abstract: Deep-learning classification can be used to support clinical decision-making by accurately and reproducibly predicting tympanic membrane changes in real time, even in the presence of multiple concurrent diseases. 8) We suggest checking the writing of the manuscript because there are typos. Grammar, syntax, and other expressions that needed to be improved have been corrected as suggested. 
We regret that we cannot provide a complete list of revisions due to space issues. Submitted filename: Rebuttal_multiclass_final.docx Click here for additional data file. 26 Sep 2022 Automated Multi-class Classification for Prediction of Tympanic Membrane Changes with Deep Learning Models PONE-D-22-12241R1 Dear Dr. Ahn, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org. If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org. Kind regards, Rafael da Costa Monsanto, M.D. Academic Editor PLOS ONE Additional Editor Comments (optional): Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. 
If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed Reviewer #3: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #3: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #3: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #3: Yes ********** 5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. 
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #3: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: Dear authors Thank you for addressing all our comments. Now the manuscript looks even clearer and concise. The techniques and the graphics are pretty well explained which highlights the relevance of the methods used. Reviewer #3: Thank you for thoroughly addressing the comments. The authors have provided corresponding information and the manuscript has improved overall. Findings from the additional analysis are interesting and remain relevant. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #3: No ********** 28 Sep 2022 PONE-D-22-12241R1 Automated Multi-class Classification for Prediction of Tympanic Membrane Changes with Deep Learning Models Dear Dr. Ahn: I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. 
Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access. Kind regards, PLOS ONE Editorial Office Staff on behalf of Dr. Rafael da Costa Monsanto Academic Editor PLOS ONE

11 in total

1. Deep Learning for Classification of Pediatric Otitis Media.

Authors: Zebin Wu; Zheqi Lin; Lan Li; Hongguang Pan; Guowei Chen; Yuqing Fu; Qianhui Qiu
Journal: Laryngoscope Date: 2020-12-28 Impact factor: 3.325

2. Deep metric learning for otitis media classification.

Authors: Josefine Vilsbøll Sundgaard; James Harte; Peter Bray; Søren Laugesen; Yosuke Kamide; Chiemi Tanaka; Rasmus R Paulsen; Anders Nymark Christensen
Journal: Med Image Anal Date: 2021-03-14 Impact factor: 8.545

3. Otoscopy simulation training in a classroom setting: a novel approach to teaching otoscopy to medical students.

Authors: Joel Davies; Lucas Djelic; Paolo Campisi; Vito Forte; Albino Chiodo
Journal: Laryngoscope Date: 2014-08-28 Impact factor: 3.325

4. Automatic detection of tympanic membrane and middle ear infection from oto-endoscopic images via convolutional neural networks.

Authors: Mohammad Azam Khan; Soonwook Kwon; Jaegul Choo; Seok Min Hong; Sung Hun Kang; Il-Ho Park; Sung Kyun Kim; Seok Jin Hong
Journal: Neural Netw Date: 2020-04-01