Maxime Gillot1,2, Baptiste Baquero1,2, Celia Le1,2, Romain Deleat-Besson1,2, Jonas Bianchi1, Antonio Ruellas1, Marcela Gurgel1, Marilia Yatabe1, Najla Al Turkestani1,3, Kayvan Najarian1, Reza Soroushmehr1, Steve Pieper4, Ron Kikinis5, Beatriz Paniagua6, Jonathan Gryak1, Marcos Ioshida1, Camila Massaro1, Liliane Gomes1, Heesoo Oh7, Karine Evangelista1, Cauby Maia Chaves Junior8, Daniela Garib9, Fábio Costa1, Erika Benavides1, Fabiana Soki1, Jean-Christophe Fillion-Robin6, Hina Joshi10, Lucia Cevidanes1, Juan Carlos Prieto10.
Abstract
The segmentation of medical and dental images is a fundamental step in automated clinical decision support systems. It supports the entire clinical workflow, from diagnosis and therapy planning to intervention and follow-up. In this paper, we propose a novel tool that accurately produces a full-face segmentation in about 5 minutes, a task that would otherwise require an average of 7 hours of manual work by experienced clinicians. This work focuses on the integration of the state-of-the-art UNEt TRansformers (UNETR) of the Medical Open Network for Artificial Intelligence (MONAI) framework. We trained and tested our models using 618 de-identified Cone-Beam Computed Tomography (CBCT) volumetric images of the head, acquired with varying parameters at different centers, for a generalized clinical application. Our results on a 5-fold cross-validation showed high accuracy and robustness, with a Dice score up to 0.962 ± 0.02. Our code is available on our public GitHub repository.
Year: 2022 PMID: 36223330 PMCID: PMC9555672 DOI: 10.1371/journal.pone.0275033
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Multi-anatomical skull structure manual segmentation of the full face, combining the mandible, maxilla, cranial base, cervical vertebra, and skin segmentations.
Patient has written consent on file for the use of the images.
Fig 2. Visualization of the contrast adjustment steps on two different scans.
This result is obtained by keeping the intensities between the 1st and 99th percentiles of the cumulative intensity histogram.
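The percentile-based windowing described above can be sketched as follows. This is an illustrative NumPy implementation, not the authors' exact code; the function name and the rescaling to [0, 1] are assumptions.

```python
import numpy as np

def adjust_contrast(volume: np.ndarray, lower_pct: float = 1.0, upper_pct: float = 99.0) -> np.ndarray:
    """Clip voxel intensities to the [lower_pct, upper_pct] percentiles of the
    intensity distribution, then rescale the result to [0, 1]."""
    lo, hi = np.percentile(volume, [lower_pct, upper_pct])
    clipped = np.clip(volume, lo, hi)
    return (clipped - lo) / (hi - lo)
```

Applied to a CBCT volume, this discards the extreme 1% tails of the histogram, which is where scanner-dependent outlier intensities tend to live.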
Fig 3. Overview of the UNETR used.
A 128×128×128×1 cropped volume of the input CBCT is divided into a sequence of non-overlapping 16×16×16 patches, which are projected into a 768-dimensional embedding space using a linear layer. The transformer model is fed this sequence with position embeddings added. Via skip connections, the decoder extracts and merges the encoded representations from different layers of the transformer to produce the final 128×128×128×2 crop segmentation.
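The patch-embedding arithmetic in the caption can be checked with a short NumPy sketch: a 128³ crop with 16³ patches yields (128/16)³ = 512 tokens, each a flattened 4096-voxel patch projected to 768 dimensions. The random projection matrix here is a stand-in for the learned linear layer, not the model's weights.

```python
import numpy as np

crop = np.random.rand(128, 128, 128, 1).astype(np.float32)  # one CBCT crop, single channel

p = 16                                  # patch edge length
n = 128 // p                            # patches per axis -> 8
# Rearrange the volume into a sequence of flattened non-overlapping patches.
patches = crop.reshape(n, p, n, p, n, p, 1).transpose(0, 2, 4, 1, 3, 5, 6)
patches = patches.reshape(n**3, p**3)   # (512, 4096)

W = np.random.rand(p**3, 768).astype(np.float32)  # stand-in for the learned projection
tokens = patches @ W                    # (512, 768) token sequence fed to the transformer
```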
Data augmentation transformations used during training.
| Data | Random crop | Random flip and rotation | Random shift in intensity | Random contrast adjustment |
|---|---|---|---|---|
| Images | Anywhere in the scan | Along the X, Y, and Z axes with a 25% probability per axis | 50% chance of a 0.1 intensity shift | 80% chance of adjusting the image gamma within [0.5, 2] |
| Segmentation | Same crop as the image | Same flip and rotation as the image | N/A | N/A |
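The table's transformations can be sketched in plain NumPy as below. This is an illustrative pipeline, not the authors' implementation (which uses MONAI transforms): spatial transforms are applied jointly to the image and its label map to keep them aligned, while intensity transforms touch the image only.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray, label: np.ndarray):
    """Sketch of the training augmentations (flip/rotation condensed to flips).
    Spatial transforms apply to both image and label; intensity transforms
    apply to the image only, as in the table."""
    # Random flip along each axis with 25% probability per axis.
    for axis in range(3):
        if rng.random() < 0.25:
            image = np.flip(image, axis=axis)
            label = np.flip(label, axis=axis)
    # 50% chance of a 0.1 intensity shift (image only).
    if rng.random() < 0.5:
        image = image + 0.1
    # 80% chance of a gamma adjustment drawn from [0.5, 2] (image only).
    if rng.random() < 0.8:
        gamma = rng.uniform(0.5, 2.0)
        image = np.clip(image, 0.0, 1.0) ** gamma
    return image, label
```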
Comparison of manual and automatic segmentations using AUPRC, AUPRC baseline, Dice, F2 score, accuracy, recall, and precision from the 5-fold cross-validation for the five skull structures.
| Structure | AUPRC | AUPRC Baseline | Dice | F2 Score | Accuracy | Recall | Precision |
|---|---|---|---|---|---|---|---|
| Mandible | 0.926 ± 0.037 | 0.011 ± 0.003 | 0.962 ± 0.020 | 0.961 ± 0.026 | 0.9992 ± 0.0005 | 0.960 ± 0.031 | 0.965 ± 0.026 |
| Maxilla | 0.738 ± 0.096 | 0.011 ± 0.003 | 0.853 ± 0.064 | 0.857 ± 0.061 | 0.996 ± 0.001 | 0.862 ± 0.073 | 0.855 ± 0.099 |
| Cranial base | 0.642 ± 0.127 | 0.018 ± 0.006 | 0.788 ± 0.103 | 0.804 ± 0.109 | 0.992 ± 0.004 | 0.824 ± 0.099 | 0.774 ± 0.135 |
| Cervical vertebra | 0.602 ± 0.145 | 0.008 ± 0.006 | 0.760 ± 0.113 | 0.723 ± 0.164 | 0.995 ± 0.004 | 0.704 ± 0.192 | 0.854 ± 0.033 |
| Skin | 0.947 ± 0.035 | 0.425 ± 0.72 | 0.971 ± 0.018 | 0.982 ± 0.009 | 0.974 ± 0.018 | 0.989 ± 0.009 | 0.954 ± 0.037 |
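Two of the table's overlap metrics are easy to state precisely. Dice is 2|A∩B| / (|A| + |B|), and F2 is the Fβ score with β = 2, i.e. 5PR / (4P + R), which weights recall above precision. A minimal sketch over binary voxel masks:

```python
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def f2_score(pred: np.ndarray, truth: np.ndarray) -> float:
    """F2 score (F-beta with beta=2): 5PR / (4P + R), recall-weighted."""
    tp = np.logical_and(pred, truth).sum()
    precision = tp / pred.sum()
    recall = tp / truth.sum()
    return 5.0 * precision * recall / (4.0 * precision + recall)
```

In the paper's setting, `pred` and `truth` would be the automatic and manual segmentation masks of one structure; the table reports these scores averaged over the 5 cross-validation folds.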
Fig 4. Visualization of the automatic maxilla segmentation steps.
Resampling and contrast adjustment of the input image, segmentation with a sliding window using UNETR, and, finally, resampling of the cleaned-up segmentation to the input size.
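The sliding-window step can be sketched as follows. This is a minimal NumPy analogue of the idea (the paper uses MONAI's sliding-window inference); `model` is a hypothetical callable mapping a cubic crop to per-voxel foreground scores of the same shape, and predictions are averaged where windows overlap.

```python
import numpy as np

def sliding_window_segment(volume, model, window=128, step=96):
    """Run `model` over overlapping cubic crops of `volume` and average
    the predictions where windows overlap. Assumes every volume dimension
    is at least `window`."""
    def starts(size):
        s = list(range(0, size - window + 1, step))
        if s[-1] != size - window:   # make sure the far edge is covered
            s.append(size - window)
        return s

    scores = np.zeros(volume.shape)
    counts = np.zeros(volume.shape)
    for z in starts(volume.shape[0]):
        for y in starts(volume.shape[1]):
            for x in starts(volume.shape[2]):
                sl = (slice(z, z + window), slice(y, y + window), slice(x, x + window))
                scores[sl] += model(volume[sl])
                counts[sl] += 1
    return scores / counts
```

Overlapping windows smooth out edge artifacts at crop boundaries, at the cost of running the network on more crops.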
Fig 5. Visualization of the automatic full-face segmentation results.
The prediction, in red, is superposed on the manual segmentation, in transparent green. On the full face, we can see that the models averaged the separation line between the maxilla and the mandible, whereas this separation differs in the manual segmentations; this also explains why the metrics for these structures are lower than those for the mandible.
Fig 6. 3D Slicer module in development for AMASSS-CBCT.
On the left, the module with its different options and parameters; on the right, the visualization of the segmentation applied to one small field-of-view scan with the selected skull structures: the mandible in red, the maxilla in yellow, and the root canals in green.