| Literature DB >> 35455101 |
Thibaud Brochet, Jérôme Lapuyade-Lahorgue, Alexandre Huat, Sébastien Thureau, David Pasquier, Isabelle Gardin, Romain Modzelewski, David Gibon, Juliette Thariat, Vincent Grégoire, Pierre Vera, Su Ruan.
Abstract
In this paper, we quantitatively compare loss functions based on the parameterized Tsallis–Havrda–Charvat entropy and the classical Shannon entropy for training a deep network on the small datasets usually encountered in medical applications. Shannon cross-entropy is widely used as a loss function for most neural networks applied to image segmentation, classification, and detection; Shannon entropy is a particular case of Tsallis–Havrda–Charvat entropy. In this work, we compare the two entropies on a medical application: predicting recurrence in patients with head–neck and lung cancers after treatment. Based on both CT images and patient information, a multitask deep neural network is proposed that performs a recurrence-prediction task using cross-entropy as a loss function, together with an image-reconstruction task. Tsallis–Havrda–Charvat cross-entropy is a parameterized cross-entropy with parameter α; Shannon entropy is recovered for α = 1. The influence of this parameter on the final prediction results is studied. Experiments are conducted on two datasets totaling 580 patients, of whom 434 suffered from head–neck cancers and 146 from lung cancers. The results show that Tsallis–Havrda–Charvat entropy can achieve better prediction accuracy for some values of α.
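To make the loss family concrete, a minimal sketch of a Tsallis–Havrda–Charvat cross-entropy follows. This is an illustration, not the paper's exact implementation: the function name, the elementwise form H_α(p, q) = Σᵢ pᵢ(1 − qᵢ^(α−1))/(α − 1), and the clipping constant are assumptions. The key property it demonstrates is the one stated in the abstract: as α → 1 the loss reduces to the Shannon cross-entropy −Σᵢ pᵢ log qᵢ.

```python
import numpy as np

def thc_cross_entropy(y_true, y_pred, alpha, eps=1e-12):
    """Tsallis-Havrda-Charvat cross-entropy between a target distribution
    y_true and a predicted distribution y_pred (a sketch; the paper's exact
    loss and reduction may differ).

    For alpha == 1 this falls back to Shannon cross-entropy,
    -sum(y_true * log(y_pred)), which is the limit of the general form.
    """
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0) / 0**negative
    if abs(alpha - 1.0) < 1e-8:
        # Shannon special case (limit of the expression below as alpha -> 1)
        return float(-np.sum(y_true * np.log(y_pred)))
    # General parameterized form: sum_i p_i * (1 - q_i**(alpha-1)) / (alpha-1)
    return float(np.sum(y_true * (1.0 - y_pred ** (alpha - 1.0))) / (alpha - 1.0))
```

Since q^(α−1) = exp((α−1) log q) ≈ 1 + (α−1) log q near α = 1, the general branch converges smoothly to the Shannon branch, so sweeping α (as the paper does from 0.1 to 3.9) passes through the standard cross-entropy at α = 1.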
Keywords: Shannon entropy; Tsallis–Havrda–Charvat entropy; deep neural networks; generalized entropies; head–neck cancer; lung cancer; recurrence prediction
Year: 2022 PMID: 35455101 PMCID: PMC9031340 DOI: 10.3390/e24040436
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Input images: head–neck CT (top) and lung CT (bottom).
Figure 2. Architecture of the multitask neural network for recurrence prediction (T2), aided by an auxiliary task (T1: image reconstruction).
Quantitative clinical data processed through the network.
| Clinical Data | Unit |
|---|---|
| Hemoglobin | g/dL |
| Lymphocytes | Giga/L |
| Leucocytes | Giga/L |
| Thrombocytes | Giga/L |
| Albumin | g/L |
| Treatment duration | Days |
| Total irradiation dose | Gy |
| Number of fractions | Count |
| Average dose per fraction | Gy |
| Weight at the start and end of treatment | kg |
Qualitative clinical data processed through the network.
| Clinical Data | Values |
|---|---|
| Gender | M/F |
| Smoking status | Smoker, non-smoker, former smoker |
| Use of induction chemotherapy | Yes/No |
| Use of concomitant chemotherapy | Yes/No |
| TNM | Tumor, Node, Metastasis |
Figure 3. Original images (left) vs. reconstructed images (right).
Accuracy obtained by the loss function derived from Tsallis–Havrda–Charvat entropy as a function of α for the head–neck cancer dataset (in the original article, rows with p-values lower than 0.05 and accuracies higher than Shannon's are highlighted in blue).
| α | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average | SD | p-value |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.68 | 0.53 | 0.60 | 0.58 | 0.63 | 0.60 | 0.06 | 0.01 |
| 0.3 | 0.60 | 0.70 | 0.70 | 0.70 | 0.43 | 0.63 | 0.12 | 0.28 |
| 0.5 | 0.58 | 0.58 | 0.60 | 0.70 | 0.73 | 0.64 | 0.07 | 0.27 |
| 0.7 | 0.85 | 0.70 | 0.60 | 0.70 | 0.65 | 0.70 | 0.09 | 0.25 |
| 0.9 | 0.58 | 0.60 | 0.60 | 0.60 | 0.68 | 0.61 | 0.04 | 0.07 |
| 1.1 | 0.68 | 0.75 | 0.75 | 0.73 | 0.75 | 0.73 | 0.03 | 0.09 |
| 1.3 | 0.63 | 0.73 | 0.68 | 0.75 | 0.70 | 0.70 | 0.05 | 0.32 |
| 1.7 | 0.80 | 0.73 | 0.63 | 0.75 | 0.78 | 0.74 | 0.07 | 0.07 |
| 2.1 | 0.68 | 0.63 | 0.60 | 0.62 | 0.575 | 0.63 | 0.04 | 0.12 |
| 2.3 | 0.73 | 0.73 | 0.73 | 0.73 | 0.70 | 0.72 | 0.01 | 0.09 |
| 2.5 | 0.75 | 0.58 | 0.68 | 0.68 | 0.60 | 0.66 | 0.07 | 0.34 |
| 2.7 | 0.68 | 0.53 | 0.45 | 0.63 | 0.60 | 0.58 | 0.09 | 0.04 |
| 2.9 | 0.73 | 0.73 | 0.73 | 0.75 | 0.73 | 0.73 | 0.01 | 0.07 |
| 3.1 | 0.70 | 0.55 | 0.65 | 0.57 | 0.63 | 0.62 | 0.06 | 0.02 |
| 3.3 | 0.65 | 0.65 | 0.65 | 0.58 | 0.55 | 0.62 | 0.05 | 0.07 |
| 3.5 | 0.73 | 0.75 | 0.73 | 0.75 | 0.70 | 0.73 | 0.02 | 0.08 |
| 3.7 | 0.73 | 0.70 | 0.60 | 0.60 | 0.58 | 0.64 | 0.07 | 0.20 |
| 3.9 | 0.68 | 0.70 | 0.55 | 0.58 | 0.53 | 0.61 | 0.08 | 0.09 |
Accuracy obtained by the loss function derived from Tsallis–Havrda–Charvat entropy as a function of α for the lung cancer dataset (in the original article, rows with p-values lower than 0.05 and accuracies higher than Shannon's are highlighted in blue).
| α | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Average | SD | p-value |
|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.58 | 0.58 | 0.47 | 0.58 | 0.63 | 0.57 | 0.06 | 0.23 |
| 0.3 | 0.58 | 0.58 | 0.58 | 0.58 | 0.68 | 0.60 | 0.04 | 0.13 |
| 0.5 | 0.63 | 0.58 | 0.52 | 0.58 | 0.52 | 0.56 | 0.03 | 0.26 |
| 0.7 | 0.58 | 0.58 | 0.63 | 0.63 | 0.58 | 0.60 | 0.03 | 0.13 |
| 0.9 | 0.63 | 0.53 | 0.68 | 0.47 | 0.53 | 0.57 | 0.08 | 0.21 |
| 1.1 | 0.63 | 0.63 | 0.52 | 0.52 | 0.52 | 0.56 | 0.06 | 0.18 |
| 1.3 | 0.68 | 0.53 | 0.53 | 0.47 | 0.53 | 0.55 | 0.08 | 0.15 |
| 1.5 | 0.58 | 0.53 | 0.53 | 0.47 | 0.53 | 0.53 | 0.04 | 0.44 |
| 1.7 | 0.73 | 0.47 | 0.63 | 0.57 | 0.42 | 0.56 | 0.12 | 0.37 |
| 3.5 | 0.74 | 0.73 | 0.68 | 0.33 | 0.58 | 0.61 | 0.17 | 0.12 |
| 3.7 | 0.68 | 0.63 | 0.47 | 0.63 | 0.73 | 0.63 | 0.09 | 0.07 |
| 3.9 | 0.78 | 0.53 | 0.47 | 0.47 | 0.63 | 0.58 | 0.13 | 0.07 |