Maxime Gillot1,2, Baptiste Baquero1,2, Celia Le1,2, Romain Deleat-Besson1,2, Jonas Bianchi1, Antonio Ruellas1, Marcela Gurgel1, Marilia Yatabe1, Najla Al Turkestani1,3, Kayvan Najarian1, Reza Soroushmehr1, Steve Pieper4, Ron Kikinis5, Beatriz Paniagua6, Jonathan Gryak1, Marcos Ioshida1, Camila Massaro1, Liliane Gomes1, Heesoo Oh7, Karine Evangelista1, Cauby Maia Chaves Junior8, Daniela Garib9, Fábio Costa1, Erika Benavides1, Fabiana Soki1, Jean-Christophe Fillion-Robin6, Hina Joshi10, Lucia Cevidanes1, Juan Carlos Prieto10.
Abstract
The segmentation of medical and dental images is a fundamental step in automated clinical decision support systems. It supports the entire clinical workflow, from diagnosis and therapy planning to intervention and follow-up. In this paper, we propose a novel tool that accurately produces a full-face segmentation in about 5 minutes, a task that would otherwise require an average of 7 hours of manual work by experienced clinicians. This work focuses on the integration of the state-of-the-art UNEt TRansformers (UNETR) of the Medical Open Network for Artificial Intelligence (MONAI) framework. We trained and tested our models using 618 de-identified Cone-Beam Computed Tomography (CBCT) volumetric images of the head, acquired with varying parameters at different centers, for a generalized clinical application. Our results on a 5-fold cross-validation showed high accuracy and robustness, with a Dice score up to 0.962 ± 0.02. Our code is available on our public GitHub repository.
Year: 2022 PMID: 36223330 PMCID: PMC9555672 DOI: 10.1371/journal.pone.0275033
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Multi-anatomical skull structure manual segmentation of the full face, combining the mandible, maxilla, cranial base, cervical vertebra, and skin segmentations.
Patient has written consent on file for the use of the images.
Fig 2. Visualization of the contrast adjustment steps on two different scans.
This result is obtained by keeping the intensities between the 1st and 99th percentiles of the cumulative intensity histogram.
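The percentile-based windowing described above can be sketched as follows. This is an illustrative NumPy implementation, not the authors' exact code; the function name and the rescaling to [0, 1] are assumptions.

```python
import numpy as np

def adjust_contrast(volume: np.ndarray, lower_pct: float = 1.0, upper_pct: float = 99.0) -> np.ndarray:
    """Clip voxel intensities to the [lower_pct, upper_pct] percentiles of the
    intensity distribution, then rescale the result to [0, 1]."""
    lo, hi = np.percentile(volume, [lower_pct, upper_pct])
    clipped = np.clip(volume, lo, hi)
    return (clipped - lo) / (hi - lo)
```

Applied to a CBCT volume, this discards the extreme 1% tails of the histogram, which is where scanner-dependent outlier intensities tend to live.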
Fig 3. Overview of the UNETR used.
A 128×128×128×1 cropped volume of the input CBCT is divided into a sequence of non-overlapping 16×16×16 patches, which are projected into a 768-dimensional embedding space using a linear layer. The transformer model is fed this sequence with position embeddings added. Via skip connections, the decoder extracts and merges the encoded representations from different layers of the transformer to produce the final 128×128×128×2 crop segmentation.
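The patch-embedding arithmetic in the caption can be checked with a short NumPy sketch: a 128³ crop with 16³ patches yields (128/16)³ = 512 tokens, each a flattened 4096-voxel patch projected to 768 dimensions. The random projection matrix here is a stand-in for the learned linear layer, not the model's weights.

```python
import numpy as np

crop = np.random.rand(128, 128, 128, 1).astype(np.float32)  # one CBCT crop, single channel

p = 16                                  # patch edge length
n = 128 // p                            # patches per axis -> 8
# Rearrange the volume into a sequence of flattened non-overlapping patches.
patches = crop.reshape(n, p, n, p, n, p, 1).transpose(0, 2, 4, 1, 3, 5, 6)
patches = patches.reshape(n**3, p**3)   # (512, 4096)

W = np.random.rand(p**3, 768).astype(np.float32)  # stand-in for the learned projection
tokens = patches @ W                    # (512, 768) token sequence fed to the transformer
```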
Data augmentation transformations used during training.
| Data | Random crop | Random flip and rotation | Random shift in intensity | Random contrast adjustment |
|---|---|---|---|---|
| Images | Anywhere in the scan | Along the X, Y, and Z axes with a 25% probability per axis | 50% chance of a 0.1 intensity shift | 80% chance of adjusting the image gamma within [0.5, 2] |
| Segmentation | Same crop as the image | Same flip and rotation as the image | N/A | N/A |
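The table's transformations can be sketched in plain NumPy as below. This is an illustrative pipeline, not the authors' implementation (which uses MONAI transforms): spatial transforms are applied jointly to the image and its label map to keep them aligned, while intensity transforms touch the image only.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray, label: np.ndarray):
    """Sketch of the training augmentations (flip/rotation condensed to flips).
    Spatial transforms apply to both image and label; intensity transforms
    apply to the image only, as in the table."""
    # Random flip along each axis with 25% probability per axis.
    for axis in range(3):
        if rng.random() < 0.25:
            image = np.flip(image, axis=axis)
            label = np.flip(label, axis=axis)
    # 50% chance of a 0.1 intensity shift (image only).
    if rng.random() < 0.5:
        image = image + 0.1
    # 80% chance of a gamma adjustment drawn from [0.5, 2] (image only).
    if rng.random() < 0.8:
        gamma = rng.uniform(0.5, 2.0)
        image = np.clip(image, 0.0, 1.0) ** gamma
    return image, label
```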
Comparison of manual and automatic segmentations using AUPRC, AUPRC baseline, Dice, F2 score, accuracy, recall, and precision from the 5-fold cross-validation for the five skull structures.
| Structure | AUPRC | AUPRC Baseline | Dice | F2 Score | Accuracy | Recall | Precision |
|---|---|---|---|---|---|---|---|
| Mandible | 0.926 ± 0.037 | 0.011 ± 0.003 | 0.962 ± 0.020 | 0.961 ± 0.026 | 0.9992 ± 0.0005 | 0.960 ± 0.031 | 0.965 ± 0.026 |
| Maxilla | 0.738 ± 0.096 | 0.011 ± 0.003 | 0.853 ± 0.064 | 0.857 ± 0.061 | 0.996 ± 0.001 | 0.862 ± 0.073 | 0.855 ± 0.099 |
| Cranial base | 0.642 ± 0.127 | 0.018 ± 0.006 | 0.788 ± 0.103 | 0.804 ± 0.109 | 0.992 ± 0.004 | 0.824 ± 0.099 | 0.774 ± 0.135 |
| Cervical vertebra | 0.602 ± 0.145 | 0.008 ± 0.006 | 0.760 ± 0.113 | 0.723 ± 0.164 | 0.995 ± 0.004 | 0.704 ± 0.192 | 0.854 ± 0.033 |
| Skin | 0.947 ± 0.035 | 0.425 ± 0.72 | 0.971 ± 0.018 | 0.982 ± 0.009 | 0.974 ± 0.018 | 0.989 ± 0.009 | 0.954 ± 0.037 |
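Two of the table's overlap metrics are easy to state precisely. Dice is 2|A∩B| / (|A| + |B|), and F2 is the Fβ score with β = 2, i.e. 5PR / (4P + R), which weights recall above precision. A minimal sketch over binary voxel masks:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def f2_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """F2 score (F-beta with beta=2): 5PR / (4P + R), recall-weighted."""
    tp = np.logical_and(pred, truth).sum()
    precision = tp / pred.sum()
    recall = tp / truth.sum()
    return 5.0 * precision * recall / (4.0 * precision + recall)
```

In the paper's setting, `pred` and `truth` would be the automatic and manual segmentation masks of one structure; the table reports these scores averaged over the 5 cross-validation folds.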
Fig 4. Visualization of the automatic maxilla segmentation steps.
Resampling and contrast adjustment of the input image, segmentation with a sliding window using UNETR, and, finally, resampling of the cleaned-up segmentation to the input size.
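The sliding-window step can be sketched as follows. This is a minimal NumPy analogue of the idea (the paper uses MONAI's sliding-window inference); `model` is a hypothetical callable mapping a cubic crop to per-voxel foreground scores of the same shape, and predictions are averaged where windows overlap.

```python
import numpy as np

def sliding_window_segment(volume, model, window=128, step=96):
    """Run `model` over overlapping cubic crops of `volume` and average
    the predictions where windows overlap. Assumes every volume dimension
    is at least `window`."""
    def starts(size):
        s = list(range(0, size - window + 1, step))
        if s[-1] != size - window:   # make sure the far edge is covered
            s.append(size - window)
        return s

    scores = np.zeros(volume.shape)
    counts = np.zeros(volume.shape)
    for z in starts(volume.shape[0]):
        for y in starts(volume.shape[1]):
            for x in starts(volume.shape[2]):
                sl = (slice(z, z + window), slice(y, y + window), slice(x, x + window))
                scores[sl] += model(volume[sl])
                counts[sl] += 1
    return scores / counts
```

Overlapping windows smooth out edge artifacts at crop boundaries, at the cost of running the network on more crops.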
Fig 5. Visualization of the automatic full-face segmentation results.
The prediction, in red, is superposed on the manual segmentation, in transparent green. On the full face, we can see that the models averaged the separation line between the maxilla and the mandible, whereas this separation differs in the manual segmentations; this also explains why the metrics for these structures are lower than those for the mandible.
Fig 6. 3D Slicer module in development for AMASSS-CBCT.
On the left, the module with its different options and parameters; on the right, the visualization of the segmentation applied to one small field-of-view scan with the selected skull structures: the mandible in red, the maxilla in yellow, and the root canals in green.