Lili Zhang1,2, Yongzheng Wu1, Bin Zheng3, Lizhong Su3, Yuan Chen4, Shuang Ma4, Qinqin Hu4, Xiang Zou5, Lie Yao5, Yinlong Yang6, Liang Chen5, Ying Mao5, Yan Chen1, Minbiao Ji1,2. 1. State Key Laboratory of Surface Physics and Department of Physics, Fudan University, Shanghai 200433, China. 2. Human Phenome Institute, Multiscale Research Institute of Complex Systems, Key Laboratory of Micro and Nano Photonic Structures (Ministry of Education), Fudan University, Shanghai 200433, China. 3. Department of Otolaryngology, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou 310014, China. 4. Department of Pathology, Zhejiang Provincial People's Hospital, People's Hospital of Hangzhou Medical College, Hangzhou 310014, China. 5. Department of Neurosurgery, Department of Pancreatic Surgery, Huashan Hospital, Fudan University, Shanghai 200040, China. 6. Department of Breast Surgery, Fudan University Shanghai Cancer Center; Department of Oncology, Shanghai Medical College; Fudan University, Shanghai 200040, China.
Abstract
Maximal resection of tumor while preserving the adjacent healthy tissue is particularly important for larynx surgery, hence precise and rapid intraoperative histology of laryngeal tissue is crucial for providing optimal surgical outcomes. We hypothesized that deep-learning based stimulated Raman scattering (SRS) microscopy could provide automated and accurate diagnosis of laryngeal squamous cell carcinoma on fresh, unprocessed surgical specimens without fixation, sectioning or staining. Methods: We first compared 80 pairs of adjacent frozen sections imaged with SRS and standard hematoxylin and eosin histology to evaluate their concordance. We then applied SRS imaging to fresh surgical tissues from 45 patients to reveal key diagnostic features, based on which we constructed a deep-learning based model to generate automated histologic results. A total of 18,750 SRS fields of view were used to train and cross-validate our 34-layer residual convolutional neural network, which was then used to classify 33 untrained fresh larynx surgical samples as normal or neoplastic. Furthermore, we simulated intraoperative evaluation of resection margins on totally removed larynxes. Results: We demonstrated near-perfect diagnostic concordance (Cohen's kappa, κ > 0.90) between SRS and standard histology as evaluated by three pathologists. Deep-learning based SRS correctly classified 33 independent surgical specimens with 100% accuracy. We also demonstrated that our method could identify tissue neoplasia at simulated resection margins that appear grossly normal to the naked eye. Conclusion: Our results indicate that SRS histology integrated with a deep learning algorithm has the potential to deliver rapid intraoperative diagnosis that could aid the surgical management of laryngeal cancer.
Keywords:
head and neck; intraoperative histology; label-free imaging; laryngeal cancer; stimulated Raman scattering
Laryngeal cancer is one of the most common tumors of the respiratory tract, and squamous cell carcinoma (SCC) is the most common malignancy of the larynx 1,2. Surgery remains an essential component in the treatment of laryngeal cancer and aims at the dual goals of cure and organ preservation, because the larynx supports the fundamental physiological functions of breathing, speech, and swallowing 2-4. Patients with early-stage tumors have been shown to benefit from organ preservation-based surgical approaches 5, which require maximal removal of the tumor while sparing the adjacent normal tissues. However, securing the resection margin is challenging because of the complex anatomical structures of the larynx 6, and decisions regarding the extent of resection are crucial during operations. Although the histological differences between healthy and SCC tissues are clear, the two are usually difficult to distinguish with the naked eye, even with visual aids such as narrow-band imaging 7, especially at the tumor boundaries. The current standard intraoperative histology with hematoxylin and eosin (H&E) staining involves a series of time-consuming procedures, such as freezing, sectioning and staining 8. In addition, skilled pathologists are required for intraoperative diagnosis, which complicates the surgical workflow and leads to discrepancies among the subjective results of different pathologists. Therefore, imaging tools that provide rapid and accurate delineation of normal and neoplastic tissues are critically important.
Stimulated Raman scattering (SRS) microscopy is a novel chemical imaging technique that has shown promise in label-free histology, without the need for the tissue processing required by H&E 8-13. SRS amplifies the weak Raman signal via stimulated emission by orders of magnitude to enable fast imaging with molecular specificity inherited from spontaneous Raman spectroscopy 14-16. As a result, SRS microscopy is emerging as a tool for many areas of biomedical research, including lipid metabolism and quantification 17-19, drug delivery 20, tissue imaging 9,10,21,22, and protein misfolding 23, based on the endogenous contrasts from lipids, proteins and nucleic acids with subcellular resolution 24,25. In particular, SRS microscopy has shown success in rapid histopathology for brain tumors in both xenograft models and human surgical specimens, demonstrating diagnosis in near-perfect agreement with conventional H&E 8-10,22. Clinical SRS histology has recently moved into the operating room with a fiber-laser based portable system, generating reliable intraoperative histological results on unprocessed surgical brain tissues 26,27. Despite previous efforts on coherent Raman scattering histopathology for a few types of tissues and diseases 8-13,26,28-33, the potential for SRS to diagnose larynx tissues has never been rigorously investigated.
Moreover, as machine learning (ML) algorithms evolve rapidly, intelligent and precise diagnosis becomes possible with image-based deep learning 34,35. Several ML models, including multilayer perceptron (MLP) and random forests, have been applied to brain tumor diagnosis with SRS microscopy 11,26. The convolutional neural network (CNN) is an advanced neural network with superior capability in recognizing two-dimensional data. For instance, CNNs have shown potential in differentiating diagnostic features of H&E images, including mitotic counting in breast cancer, gland counting and epithelial/stromal segmentation in colon cancer, classification and mutation prediction in small cell lung cancer, as well as tumor grading in brain gliomas 36-40. Recently, GoogLeNet, a well-known CNN model, has shown success in recognizing and classifying clinical photos of diabetic retinopathy and skin cancer 41,42. As a subtype of CNN, the residual convolutional neural network (ResNet) has the further advantage of a reduced error rate for both training and test data, especially when the dataset gets larger and the neural network gets deeper 43. With advancing techniques in fast SRS imaging and the increasing size of SRS image datasets 21,44, deep learning algorithms tailored for SRS histology are in high demand for improving classification accuracy, as well as for accomplishing more complex classification tasks.
In this study, we systematically evaluated the capability of SRS microscopy to provide rapid and diagnostic histologic images of human larynx tissues. Combining two-color SRS with second harmonic generation (SHG) microscopy 45,46, we were able to acquire three-color images representing the distributions of lipids, proteins and collagen fibers. We first used the multi-color SRS system to image frozen sections of both normal and neoplastic surgical tissues, yielding clear cytologic and histoarchitectural features for diagnosis. When correlated with H&E staining of the adjacent sister sections and assessed by three pathologists, SRS reached high diagnostic concordance (κ > 0.90). We further demonstrated that SRS microscopy is capable of imaging fresh tissues without any freezing or sectioning artifacts, capturing the fundamental diagnostic hallmarks for the classification of larynx tissues. More importantly, we developed a 34-layer ResNet (ResNet34) model trained with laryngeal SRS images, which accurately differentiated normal from neoplastic tissue in 33 untrained surgical specimens. Furthermore, we simulated the use of ResNet34-SRS to aid intraoperative evaluation of resection margins on totally removed larynxes by imaging at various distances from the tumor margin. Our approach of deep-learning assisted SRS microscopy holds promise as a rapid and automated histopathologic method that may improve the surgical care of laryngeal squamous cell carcinoma.
Methods
Tissue collection and preparation
All tissue samples were collected from patients at Zhejiang Provincial People's Hospital with informed written consent, and the study was approved by the Ethics Committee (KY2015260). Surgical tissues were removed following standard operative procedures. Laryngeal squamous cell carcinoma tissues were from clinically diagnosed biopsies, and normal tissues were largely taken from vocal cord polyps. To prepare frozen sections, surgical specimens were snap frozen in liquid nitrogen and stored at -80 °C until sectioned with a freezing microtome (CM 1950, Leica). Thin sections of 20 µm thickness were used for SRS imaging, and adjacent 5 µm thick sections were sent for H&E staining. All fresh samples and thin sections were kept at low temperature with dry ice and delivered to Fudan University within 7 hours by express transportation. Fresh tissues were sliced manually with a razor blade and then sealed between two coverslips and a perforated glass slide (0.5 mm thick) for direct SRS imaging. Thin frozen tissue sections were simply covered with coverslips and imaged without further processing. Totally removed larynxes were taken from patients with advanced laryngeal SCC for simulated surgeries and evaluations of resection margins.
In total, 78 patient cases were included in the database for imaging, model training and testing; more detailed information on all cases is shown in Table . For fresh tissues, 45 of the 78 cases (21 normal and 24 cancerous) were used for model training and validation, whereas the remaining 33 cases were kept untouched until the final testing. For frozen section analysis, 15 of the 78 cases (marked in Table ) were used to generate 80 pairs of adjacent sections for SRS and H&E imaging. For all these cases, standard H&E based histopathology was performed on paraffin-embedded sections and served as the “ground truth”.
Microscope setup
The apparatus of our SRS based microscope is illustrated in Figure . A commercial femtosecond (fs) optical parametric oscillator (OPO, Insight DS+, Newport) with dual outputs was used as the light source. The fundamental 1040 nm beam (~200 fs) was used as the Stokes, and the wavelength-tunable output (690-1300 nm, ~150 fs) was used as the pump. Both beams were linearly chirped to several picoseconds (pump: ~3.8 ps, Stokes: ~1.8 ps) through highly dispersive SF57 glass rods to work in the “spectral focusing SRS” mode 17,44, where the target Raman frequency could be adjusted by scanning the time delay between pump and Stokes pulses instead of changing the wavelengths (Figure ). The Stokes beam was intensity modulated by an electro-optical modulator (EOM) at 10 MHz and collinearly combined with the pump beam through a dichroic mirror (DMSP1000, Thorlabs). The combined beam was delivered to the laser scanning microscope (FV1200, Olympus) and focused onto the samples with an objective (UPLSAPO 60XWIR, NA 1.2 water, Olympus). The transmitted stimulated Raman loss (SRL) signal of the pump beam was filtered with a band-pass filter (CARS ET890/220, Chroma), detected with a home-built back-biased photodiode and demodulated with a lock-in amplifier (HF2LI, Zurich Instruments) to generate pixel data for the microscope to form SRS images. In this study, we fixed the pump beam at 802 nm center wavelength and imaged at two time delays, which correspond to the two Raman frequencies of 2845 cm⁻¹ and 2930 cm⁻¹, for lipid/protein decomposition. The SHG signal excited by the pump beam was simultaneously detected with a narrow band-pass filter (FF01-405/10-25, Semrock) and a photomultiplier tube (PMT) in the epi mode, generating images of collagen fiber distributions.
The optical powers of the pump and Stokes beams at the sample were kept at around 30 mW and 40 mW, respectively. Each field of view (FOV) was imaged with a size of 512 × 512 pixels and a 2 µs pixel dwell time. An automated mosaic imaging method was applied to scan across large sample areas, and all FOVs were stitched to form the full-sized images with a custom-written Matlab program. A typical 1 cm² tissue takes ~8 minutes to image under strip mosaicing mode 21.
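As a quick consistency check on the wavelengths quoted above, the Raman shift addressed by a pump/Stokes pair is the difference of their wavenumbers. The sketch below is only an illustration (the function name is hypothetical, not part of the authors' software); it shows that the 802 nm and 1040 nm center wavelengths sit in the CH-stretch region, from which the two target frequencies are then selected by scanning the time delay between the chirped pulses.

```python
def raman_shift_cm1(pump_nm: float, stokes_nm: float) -> float:
    """Raman shift (cm^-1) addressed by a pump/Stokes wavelength pair in nm."""
    # 1 nm = 1e-7 cm, so a wavelength in nm maps to 1e7 / wavelength in cm^-1.
    return 1e7 / pump_nm - 1e7 / stokes_nm


# Center wavelengths quoted in the text: pump 802 nm, Stokes 1040 nm.
shift = raman_shift_cm1(802.0, 1040.0)
print(f"center Raman shift: {shift:.0f} cm^-1")  # about 2853 cm^-1
```

The result lies between 2845 cm⁻¹ and 2930 cm⁻¹, which is consistent with both frequencies being reachable within the bandwidth of the chirped femtosecond pulses.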
H&E staining
H&E staining was performed following the standard procedure. First, the tissue section was immersed in 100% methanol for 30 s and then stained in hematoxylin solution (Harris modified) for 1 minute. The sample was washed in deionized water for 10 s after each step. Next, the section was dipped in bluing reagent [0.1% (v/v) ammonia water solution] for 1 s, washed in deionized water for 1 s, and counterstained in 0.5% eosin solution for 60 s. Finally, the sample was washed, dehydrated in 80%, 95% and 100% ethanol for 2 s each, and dipped in xylene twice for 10 s. Dried sections were sealed with neutral gum and a coverslip. All reagents were purchased from Sigma-Aldrich. The final H&E slides were imaged on a home-built automated system composed of a bright-field microscope (IX73, Olympus), a CCD camera (MG 320 C Speed, Moogee) and a motorized XY stage (Tango, Marzhauser Wetzlar GmbH & Co.). Mosaic imaging and stitching were implemented with custom software written in VB.net and Matlab.
Image processing
Raw SRS images taken at 2845 cm⁻¹ and 2930 cm⁻¹ need to be decomposed into lipid and protein distributions. Because the SRS signal is linearly proportional to chemical concentration, we applied a simple linear algorithm for the decomposition, using the measured SRS spectra of a standard lipid (oleic acid, OA) and a standard protein (bovine serum albumin, BSA) as shown in Figure and in previous works 9,26. We extracted the protein (blue) signal by subtracting the SRS signal at 2845 cm⁻¹ from that at 2930 cm⁻¹, and the lipid (green) signal was taken directly from the SRS signal at 2845 cm⁻¹. SHG data (red) were used without further processing. Because of aberrations of the objective lens, the signal intensity of each FOV is not evenly distributed and is usually brighter in the center. We used the intensity profile measured from a spatially homogeneous sample to correct (flatten) each FOV, and then merged all FOVs with our stitching program.
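The two per-FOV operations described above can be sketched in a few lines. This is a minimal illustration on nested lists, not the authors' Matlab code: the function names are hypothetical, normalizing the flat-field profile to its mean is an assumption, and clipping negative protein values to zero is also an assumption that the text does not state.

```python
def flatten_fov(raw, profile):
    """Flat-field correct one FOV using a profile from a homogeneous sample."""
    n_pixels = len(profile) * len(profile[0])
    mean = sum(sum(row) for row in profile) / n_pixels
    # Assumption: the profile is normalized to its mean before dividing.
    return [[r / (p / mean) for r, p in zip(raw_row, prof_row)]
            for raw_row, prof_row in zip(raw, profile)]


def decompose(s2845, s2930):
    """Return (lipid, protein) channels from the two raw SRS images."""
    # Lipid channel: taken directly from the 2845 cm^-1 image.
    lipid = [row[:] for row in s2845]
    # Protein channel: 2930 cm^-1 minus 2845 cm^-1 (clipped at 0, an assumption).
    protein = [[max(b - a, 0.0) for a, b in zip(row_a, row_b)]
               for row_a, row_b in zip(s2845, s2930)]
    return lipid, protein


# Toy 2x2 FOV whose uneven brightness exactly follows the measured profile.
profile = [[1.0, 2.0], [2.0, 3.0]]
raw = [[2.0, 4.0], [4.0, 6.0]]
print(flatten_fov(raw, profile))        # every pixel becomes 4.0
print(decompose([[1.0, 5.0]], [[3.0, 4.0]]))
```

In the toy example the corrected FOV is uniform, which is the expected behavior when the sample itself is homogeneous.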
Survey and statistical analysis
We collected survey results using a web-based survey tool (LimeSurvey). The survey consisted of 80 pairs of SRS and H&E images from adjacent sister sections, which were mixed and shown in random order. Three blinded pathologists were briefly introduced to the principle and image contrasts of SRS, then read the 160 images and categorized each image as “normal” or “neoplasia” based on the diagnostic features of either cytology or histoarchitecture. The responses were rated against the “ground truth” of standard histopathology on paraffin-embedded sections. For each pathologist, survey responses were used to calculate Cohen's kappa statistic for normal versus neoplasia to determine the concordance between SRS and H&E, using Statistical Product and Service Solutions (SPSS) software 47. We calculated the accuracy of each of the three pathologists as the ratio of the number of correctly classified FOVs to the total number of FOVs.
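For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two sets of ratings with the agreement expected by chance. The sketch below is a plain-Python illustration of that definition, not the SPSS procedure used in the study; the function name and the ten toy ratings are made up for the example.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa between two equally long lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label in both lists.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy data: one pathologist's calls on 10 SRS images and the paired H&E images.
srs = ["neoplasia"] * 5 + ["normal"] * 5
he = ["neoplasia"] * 4 + ["normal"] * 6
print(f"kappa = {cohens_kappa(srs, he):.2f}")
```

Here 9 of 10 calls agree (observed 0.9) against a chance agreement of 0.5, giving κ = 0.8.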
Deep-learning model
We constructed a ResNet34 model on the PyTorch platform (https://pytorch.org/), a tensor and dynamic neural network library written in Python. In addition to the 34 layers of a plain convolutional neural network, ResNet34 contains identity mappings that allow the information of the input or gradient to pass through many layers. ResNet34 has 33 convolutional layers and 1 fully connected (fc) layer. The convolutional layers mostly capture the main local features of images with 3×3 filters, and the last fully connected layer gives a binary classification according to the global feature assembled from all local features. It is worth noting that batch normalization was applied in the same manner at each convolutional layer. To optimize the neural network, the weights of the network were initialized randomly. The loss was calculated as the cross entropy (log loss). The selected optimizer was Adam with the following parameters: lr = 5×10⁻⁵, β1 = 0.9, β2 = 0.99, w = 10⁻⁴, where lr and w represent the learning rate and weight decay, respectively, and β1 and β2 represent the memory lifetimes of the first and second moments 48. Images were fed in batches with a batch size of 100.
Data augmentation techniques were used to produce similar but non-identical data, which could effectively enlarge the image database for the training of deep-learning models 49,50. In this work, we applied rotation, flipping and color jittering to the images. The random rotation range was set to 20 degrees, and the probabilities of horizontal and vertical flipping were each set to 0.5. The color jitter values for image brightness, saturation and contrast fluctuations were set to 0.4.
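For reference, the loss and update rule named above can be written out as follows. This is the standard textbook form of the cross-entropy loss and of the Adam update with the hyperparameters quoted in the text; treating the weight decay w as an L2 term added to the gradient, and the small constant ε, are assumptions, since the text does not specify them.

```latex
% Cross-entropy over a batch of N = 100 tiles; p_{i,c} is the softmax
% probability of class c for tile i, and y_i is its true class label.
\mathcal{L}(\theta) = -\frac{1}{N}\sum_{i=1}^{N} \log p_{i,\,y_i}

% Adam step t with lr = 5\times10^{-5}, \beta_1 = 0.9, \beta_2 = 0.99, w = 10^{-4}
% (weight decay as an L2 gradient term and \epsilon are assumptions).
g_t = \nabla_\theta \mathcal{L}(\theta_{t-1}) + w\,\theta_{t-1}
m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t
v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2}
\hat{m}_t = \frac{m_t}{1-\beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^{t}}
\theta_t = \theta_{t-1} - \mathrm{lr}\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}
```

The two β values set how long past gradients (first moment) and squared gradients (second moment) are remembered, which is what the text calls their memory lifetimes.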
Results
Characterization of SRS microscope
The basic principle of the SRS process is illustrated in Figure . We first calibrated the SRS spectroscopy of our system with standard chemicals of lipid (OA) and protein (BSA) by measuring the SRS intensity as a function of time delay. SRS spectra of OA and BSA showed that the Raman shifts of 2845 and 2930 cm⁻¹ correspond to time delays of 0 and -2.2 ps in our platform, respectively (Figure ). Figure shows the typical SRS spectra of cell nucleus and cytoplasm in larynx tissues. The difference in lipid and protein contents between cell nucleus and cytoplasm provides the basis of contrast for cellular morphology, i.e. cell nuclei contain much less lipid than the surrounding tissues. Note that collagen fibers showed strong SHG signal, which was used as is in this study. We then demonstrated multi-color imaging of larynx tissue with our integrated SRS/SHG microscope as shown in Figure , where the raw SRS images of the CH2 (2845 cm⁻¹) and CH3 (2930 cm⁻¹) vibrations, as well as the SHG image of collagen fibers, are clearly shown. In the SRS image taken at 2845 cm⁻¹, cell nuclei appeared much darker than the surrounding extranuclear structures because of their relatively lower lipid concentration. By contrast, the SRS image obtained at 2930 cm⁻¹ appeared bright throughout the whole cell. Lipid and protein contents could be extracted from the raw SRS images taken at 2845 cm⁻¹ and 2930 cm⁻¹ with the linear decomposition algorithm. We color coded lipids, proteins and collagen fibers as green, blue and red, respectively. Such three-color images could map out tissue architectures and provide detailed structural and chemical contrast for the histopathology of laryngeal tissue (Figure ). Figure shows the SRS and SHG intensity profiles along the dashed line marked in Figure , demonstrating clear interfaces between cellular epithelium layers and collagen-rich connective tissues. Hence, multi-color SRS microscopy may reveal important cellular features and tissue morphologies for larynx histology.
Validation of SRS imaging on thin frozen sections of larynx tissues
We began by evaluating the ability of SRS microscopy to image the architecture of normal larynx tissue sections. SRS and H&E images of adjacent thin tissue sections of a vocal cord polyp are shown in Figure . Both SRS and H&E demonstrated the ability to detect characteristic large-scale histoarchitectural features, including the squamous mucosa layer in the periphery and the underlying connective tissues (Figure ). Zoomed-in SRS images could clearly reveal microscopic features of normal larynx (Figure ), including the intact basal lamina, the regularly patterned basal layer, and the squamous mucosa layer viewed in cross-section and en face (Figure ). Therefore, SRS and H&E generate similar images of the microscopic architectures of normal larynx tissues that correlate well with each other.
We next evaluated the ability of SRS microscopy to characterize the diagnostic features of laryngeal SCC. Figure shows a typical SCC in situ, demonstrating a thickened squamous mucosa layer and increased cellular density, while the basal lamina stays intact. In contrast to SCC in situ, invasive SCC tissue demonstrated infiltrative epithelial cells scattered across the basal lamina into the stroma (Figure ). Furthermore, zoomed-in SRS images could reveal detailed diagnostic features of SCC, including cytological atypia (Figure ), abnormal arrangement of neoplastic cells and lymphocytes (Figure ), cancer nests (Figure ) and keratin pearls (Figure ). SRS microscopy thus clearly differentiated these key histological features of SCC, with high consistency with H&E staining.
We tested the hypothesis that SRS microscopy could provide an alternative method of intraoperative histology, based on its capability to reveal diagnostic features. For each specimen, a pair of adjacent frozen sections was separately imaged with SRS and stained with H&E. We collected 160 images in total (80 SRS and 80 H&E) for the evaluation, a few of which are shown in Figure . Three professional larynx pathologists read the randomly mixed H&E and SRS images following their own clinical practices. Responses were collected regarding the classification as neoplasia or normal based on cytology and histoarchitecture, and the results rated against standard histology are shown in Table . Statistical analysis of the pathologists' diagnostic results on SRS and frozen H&E images yielded high concordance between them (Cohen's kappa, κ = 0.905-0.942). Moreover, pathologists were highly accurate in distinguishing neoplastic from normal larynx tissues based on SRS images (> 90%) (Table ). These results verified that SRS microscopy may serve as an alternative means of intraoperative histology with high accuracy and concordance compared with H&E.
SRS imaging of fresh surgical specimens
We performed SRS imaging on fresh larynx tissues, which were free of freezing and sectioning artifacts and had well-preserved tissue architectures. More importantly, fresh tissue imaging mimics label-free intraoperative histology without complex sample processing. We first carpet-scanned the epithelium layer of a normal larynx specimen to show the detailed structures of epithelial tissues. Figure shows typical SRS images collected at different locations of the tissue, showing the regular cellular morphology and patterns of normal squamous cells.
We next imaged fresh laryngeal SCC tissues to demonstrate the diagnostic capability of SRS on these highly heterogeneous specimens (Figure ). SCC is the most common laryngeal cancer, with abnormal squamous cells being a major pathological feature. Various pathological hallmarks at both cellular and tissue levels could be revealed by SRS imaging. Figure shows typical heteromorphic cell nuclei with diversified sizes and shapes, in strong contrast to the normal squamous cells (Figure ). Figure shows scattered cells with enriched protein contents in the nucleus (yellow arrows), which may be associated with the mitotic figures of proliferating tumor cells that contain elevated protein and DNA levels 25. Figure shows highly disordered cells with disrupted cellular morphologies, indicating high-grade dysplasia. Cancer nests formed by a few clustered cells could be readily identified in Figure . Typical cancer nests surrounded by collagen fibers could be seen in Figure , demonstrating both cytological atypia and structural neoplasia. In highly differentiated laryngeal SCC tissue, a typical keratin pearl could be observed in Figure . Fresh-tissue SRS imaging not only generates higher quality image data than frozen sections, but also simulates rapid intraoperative histology, which may provide even faster and automated diagnosis when combined with proper image processing methods. We have developed a numerical algorithm to quantitatively analyze cellular density, nuclear morphology, and lipid/protein ratio, which could be used to differentiate normal and SCC tissues (Figure ). However, such a feature-based method relies on prior knowledge of quantitative histological details, requires very high image quality, and is computationally inefficient. We thus decided to apply a deep-learning based method to classify SRS images instead.
Construction and training of deep-learning model
We employed the ResNet34 model to assist the diagnosis of larynx tissues based on SRS image data. The network architecture is shown in Figure , and the workflow and data segmentation are illustrated in Figure . To train the ResNet34 model, we incorporated SRS images from 45 training cases (21 normal and 24 neoplastic; Table ) and labelled them as “normal” and “neoplasia”, respectively. Typical SRS images of the two groups are shown in Figure . SRS images of 33 untrained cases were kept untouched as the test set until the very end of the training process. We then sliced all images into small tiles of 200×200 pixels (77×77 µm) and randomly selected 18,750 image tiles from the training cases for model training. The total data sizes of the “normal” and “neoplasia” groups were kept equal. In addition, data augmentation techniques, such as rotation and vertical and horizontal flipping, were applied to inflate the size of the training dataset and reduce overfitting 50. Finally, the augmented training data were fed into the ResNet34 model for iteration to minimize the loss function.
The K-fold cross-validation (K-CV) approach was used to estimate the generalization capability of the ResNet model and eliminate possible correlation between samples. Typically, a K value of 5 or 10 is chosen to achieve balanced bias and variance 51. In our study, 5-fold cross-validation (5-CV) was used: the 18,750 training image tiles were randomly divided into five equal segments, one of which (3,750 tiles) was used for validation, while the remaining four segments (15,000 tiles) were used for training (model building); this process was repeated 5 times, using each of the 5 segments in turn as the validation set, and the averaged accuracy and loss were reported for optimization in the next epoch. We plotted the averaged losses and accuracies of the 5-CV for both training and validation sets in Figure . The results showed a high validation accuracy of up to 95.9% with a standard deviation of 0.4%, and a low validation error of 12.8% with a standard deviation of 1.4%, indicating well-balanced bias and variance. In addition, both accuracies and errors remained stable from 300 to 600 epochs, implying minimal overfitting of our model.
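The 5-fold partition described above can be sketched as follows. This is an illustrative, standard-library version of a random K-fold split over tile indices, not the authors' code; the function name and the fixed random seed are assumptions made for reproducibility of the example.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train, validation) index lists for a random K-fold split."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seed value is an assumption
    # Striding through the shuffled list gives k folds that cover every index.
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# 18,750 training tiles split five ways, as in the text.
for fold_id, (train, validation) in enumerate(k_fold_indices(18750, k=5)):
    print(fold_id, len(train), len(validation))  # 15000 and 3750 per fold
```

Each tile serves as validation data exactly once, so averaging the five validation accuracies uses every training tile without ever scoring a tile with a model that was trained on it.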
Deep-learning assisted tissue histology
We next tested our trained ResNet34 model on SRS images of fresh larynx tissues. To illustrate our method, we show an SRS image of a laryngeal SCC tissue in Figure , which was divided into 9×9 image tiles (200×200 pixels each). Our ResNet34 model made predictions on each image tile, yielding binary results of either normal (grey) or neoplasia (brown), as shown in the right panel of Figure . In the same way, we processed a typical normal laryngeal tissue and plotted the prediction results in Figure . Note that laryngeal SCC tissues are histoarchitecturally heterogeneous and that some specimens may contain a mixture of neoplastic and normal tissues. We therefore judged the diagnostic result at the specimen level by the most common diagnostic class among the tiles, calculating the percentage of neoplastic image tiles within the entire specimen. The above two examples gave percentages of 92.6% for Figure and 3.7% for Figure , generating predicted diagnostic results of “neoplasia” and “normal”, respectively.
The percentages of neoplastic tiles for SRS images of specimens from the 33 untrained laryngeal patient cases (test set, Table ) are shown in Figure . Each large SRS image of a whole specimen contains on average ~1000 image tiles for deep learning prediction. Although the total training process of ~600 epochs took about 10 hours, the prediction on an image of 1000 tiles took only ~20 seconds. Our results showed that the trained ResNet34 differentiated neoplastic from normal specimens with 100% accuracy compared to standard H&E histology using paraffin-embedded tissue sections. These results demonstrated the validity of the ResNet34 model for laryngeal SCC prediction.
The diagnostic capacity of ResNet34 for classifying individual images was demonstrated by evaluating the 80 SRS images included in the above survey for pathologists. Based on the survey results, the receiver operating characteristic (ROC) analysis for our ResNet34 model is shown in Figure , demonstrating its validity for the classification of laryngeal SCC with an area under the curve (AUC) of 0.95 and an accuracy of 90%.
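The specimen-level rule described in this section, in which the fraction of tiles predicted as neoplastic decides the diagnosis, can be sketched as below. The function names are hypothetical, and the 0.5 cut-off is an assumption that encodes "most common class"; the tile counts of 75 and 3 out of 81 are illustrative values chosen because they reproduce the quoted 92.6% and 3.7% on a 9×9 grid, and are not numbers stated in the text.

```python
def neoplastic_fraction(tile_labels):
    """Fraction of tiles in a specimen that were predicted as neoplasia."""
    if not tile_labels:
        raise ValueError("specimen has no tiles")
    return sum(1 for label in tile_labels if label == "neoplasia") / len(tile_labels)


def classify_specimen(tile_labels, threshold=0.5):
    """Majority-vote diagnosis; the 0.5 threshold is an assumption."""
    return "neoplasia" if neoplastic_fraction(tile_labels) > threshold else "normal"


# Illustrative 9x9 grids (81 tiles each).
scc_tiles = ["neoplasia"] * 75 + ["normal"] * 6
normal_tiles = ["neoplasia"] * 3 + ["normal"] * 78
for tiles in (scc_tiles, normal_tiles):
    percent = 100 * neoplastic_fraction(tiles)
    print(f"{percent:.1f}% neoplastic tiles -> {classify_specimen(tiles)}")
```

Aggregating over many tiles makes the specimen-level call tolerant of a small number of misclassified tiles, which may help explain why specimen-level accuracy can exceed tile-level validation accuracy.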
Evaluation of simulated resection margins with deep-learning based SRS
We next demonstrated the possibility of using ResNet34-assisted SRS microscopy to evaluate surgical margins. We used totally removed larynxes to simulate the surgical process (Figure ). On the removed organ, the surgeon could visually identify the gross margin, as well as the estimated resection margin (Figure ) for simulated surgery. Three fresh tissue specimens were collected within the tumor (T), at the gross margin (M), and ~5 mm away from the gross margin (N), as judged by the surgeon with the naked eye based on experience (Figure ). These specimens were imaged with SRS and then sent for standard histology. We present the results of three studied cases in Figure . Only in the first case did the surgeon's assessment match well with the ResNet34-SRS prediction. In the other two cases, residual neoplasm could be detected at the simulated surgical margins. Figure shows an SRS image of a specimen collected at the gross margin and its ResNet34 predicted results, demonstrating mixed normal and neoplastic regions with a detectable boundary. Corresponding image tiles predicted as neoplastic and normal tissues are also shown. These results imply that resection margins identified with the naked eye may still be infiltrated by tumor cells, and that ResNet34-assisted SRS microscopy may provide rapid intraoperative assessment of resection margins.
Discussion
The ideal laryngeal tumor surgery removes all local malignant tissue, leaving no residual viable tumor cells behind. In clinical practice, after tumor resection, surgeons routinely cut several pieces of tissue from the residual margin for evaluation by a pathologist. If cancerous cells still exist at the resection margin, extended resection and examination are required to achieve complete tumor resection. Resection margins considered normal are retained to preserve function, especially for early laryngeal SCC. However, it is difficult to obtain a three-dimensional assessment of tumor edges with traditional histology, and sub-mucosal extension may be left behind with a risk of tumor recurrence despite seemingly clear resection margins. Although the imaging depth of SRS microscopy is limited (<200 μm), it still holds potential for providing 3D histology in the intraoperative setting because of its intrinsic optical sectioning capability. Moreover, since SRS microscopy is non-invasive, surgical tissues imaged by SRS can still be used for further molecular and histological evaluation.
Although simple neural networks with a few hidden layers can approximate any continuous function, they have limited capacity and inadequate data expression ability, and are prone to falling into local minima during optimization. With an increasing number of hidden layers, deep neural networks usually work better at approximating the true distribution of the data, especially when the training dataset is large. However, excessively increasing the depth of a neural network may cause vanishing or exploding gradients and lead to worse training results. For instance, the performance of the well-known VGGNet (16 or 19 layers) and GoogLeNet (22 layers) may become worse when the number of layers is further increased (43).
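The depth problem mentioned above can be made concrete with a toy calculation. The sketch below is a deliberately simplified illustration and not an analysis of any real network: it assumes every layer contributes the same scalar local derivative `w` (real layers have Jacobian matrices), so that the backpropagated gradient through a plain chain of `depth` layers scales as `w ** depth`. For comparison, it also evaluates a chain of blocks with an identity shortcut of the kind used in residual networks, where each block computes y = x + F(x) and therefore has derivative 1 + F'(x). The function names are hypothetical.

```python
def plain_gradient(w: float, depth: int) -> float:
    """Gradient scale through `depth` plain layers, each with local derivative w."""
    return w ** depth


def residual_gradient(w: float, depth: int) -> float:
    """Gradient scale through `depth` residual blocks y = x + F(x).

    Each block has derivative 1 + F'(x); here F'(x) is the constant w.
    """
    return (1.0 + w) ** depth


for depth in (4, 19, 34):
    print(
        f"depth={depth:2d}  "
        f"plain(w=0.5)={plain_gradient(0.5, depth):.3e}  "    # shrinks towards zero
        f"plain(w=1.5)={plain_gradient(1.5, depth):.3e}  "    # grows rapidly
        f"residual(w=0.01)={residual_gradient(0.01, depth):.3f}"  # stays near 1
    )
```

In this toy setting the plain chain either vanishes or explodes as the depth grows, whereas the shortcut keeps a gradient path of order one even when the learned branch contributes very little. The example only illustrates the mechanism; it makes no claim about the actual gradients in the networks discussed here.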
Compared with a plain CNN, ResNet contains additional identity mappings, which allow the input information and the gradient to pass smoothly through many layers. ResNet performs exceptionally well as the network gets much deeper, and it won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2015. The commonly used ResNets have 18, 34, 50, 101 or 152 layers. Considering our computational capacity and image data size, we chose the 34-layer ResNet in this work. Compared with previous SRS studies using random forests and MLP (9,21), the ResNet34 model has the advantage of retaining the two-dimensional structure of SRS data, and it outperformed a 4-layer MLP on the same training dataset in our tests (Fig. ). Thus, ResNet34-based SRS microscopy may provide an alternative for DL-based histopathology with high accuracy.
In principle, the deeper the neural network, the larger the dataset needed. In the current study, the achievable dataset is limited by the number of accessible fresh surgical samples from patients. We partially mitigated this issue by applying data augmentation techniques to effectively enlarge the image dataset and suppress overfitting of our DL model. For the same reason, DL classification of cancerous specimens into subtypes was not possible at the current stage. To increase the size of the SRS image dataset, it is important to accumulate more surgical tissues as well as to increase the imaging speed. With the advancement of multi-color imaging techniques and fast scanning methods (21,44), the imaging speed for large-area tissues may ultimately approach that of digital pathology.
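As an illustration of how data augmentation can enlarge a tile dataset, the sketch below generates the eight flip and 90° rotation variants of a square tile. The choice of transformations is an assumption made for illustration only; the specific augmentations used in this work are not stated in this section. The tile is represented as a plain nested list so that the example runs without external libraries, and the function names are hypothetical.

```python
from typing import List

Tile = List[List[float]]  # a square image tile as rows of pixel values


def rotate90(tile: Tile) -> Tile:
    """Rotate the tile by 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def flip_horizontal(tile: Tile) -> Tile:
    """Mirror the tile left to right."""
    return [row[::-1] for row in tile]


def augment(tile: Tile) -> List[Tile]:
    """Return the 8 rotation/flip variants of a tile (assumed augmentation set).

    Every variant keeps the label of the original tile, since rotating or
    mirroring a tissue image does not change its diagnostic class.
    """
    variants: List[Tile] = []
    current = [list(row) for row in tile]
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants


# A tiny 2 x 2 stand-in for a 200 x 200 pixel tile.
example = [[1.0, 2.0], [3.0, 4.0]]
for variant in augment(example):
    print(variant)
```

With such a scheme each acquired tile contributes up to eight training samples, although the variants are strongly correlated and do not replace additional independent specimens.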
It is expected that larger image datasets will provide opportunities for further development and optimization of DL-based neural networks to accomplish more refined tasks, such as the classification of tumors into different grades and subtypes.
In summary, we have shown that multicolor SRS microscopy can provide label-free histology for larynx tissues, revealing key diagnostic features with results similar to traditional H&E. Moreover, SRS integrated with the ResNet34 deep neural network may provide a rapid and accurate means of intraoperative diagnosis on fresh, unprocessed larynx surgical tissues. With larger image datasets in the future, the ResNet34 model could be further optimized for robust and automated prediction that can help inform surgical goals and improve decision-making workflows. Our work widens the biomedical applications of this emerging technique, and our method may be applied to a broader range of solid tumors that might benefit from rapid intraoperative diagnosis.
Supplementary figures and table; the optical setup; typical SRS and H&E images used for the survey; quantitative image analysis; and typical SRS images used for training the deep-learning model.
Table 1
Comparison of SRS and H&E images from web-based survey results. 80 pairs of both types of images were presented to three pathologists (P1-P3) in random order for evaluation. Each image was rated as “normal” or “neoplasia” and compared with the standard histopathology result.