Francisco Javier Pérez-Benito (1), Francois Signol (2), Juan-Carlos Pérez-Cortés (3), Marina Pollán (4), Beatriz Pérez-Gómez (5), Dolores Salas-Trejo (6), María Casals (7), Inmaculada Martínez (8), Rafael Llobet (9).
1. Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera, s/n, València, 46022 Spain. Electronic address: frapebe@doctor.upv.es.
2. Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera, s/n, València, 46022 Spain. Electronic address: fsignol@iti.es.
3. Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera, s/n, València, 46022 Spain. Electronic address: jcperez@iti.upv.es.
4. National Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos, 5, Madrid, 28029 Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública - CIBERESP), Carlos III Institute of Health, Monforte de Lemos, 5, Madrid, 28029 Spain. Electronic address: mpollan@isciii.es.
5. National Center for Epidemiology, Carlos III Institute of Health, Monforte de Lemos, 5, Madrid, 28029 Spain; Consortium for Biomedical Research in Epidemiology and Public Health (CIBER en Epidemiología y Salud Pública - CIBERESP), Carlos III Institute of Health, Monforte de Lemos, 5, Madrid, 28029 Spain. Electronic address: bperez@isciii.es.
6. Valencian Breast Cancer Screening Program, General Directorate of Public Health, València, Spain; Centro Superior de Investigación en Salud Pública CSISP, FISABIO, València, Spain. Electronic address: salas_dol@gva.es.
7. Valencian Breast Cancer Screening Program, General Directorate of Public Health, València, Spain; Centro Superior de Investigación en Salud Pública CSISP, FISABIO, València, Spain. Electronic address: casals_mar@gva.es.
8. Valencian Breast Cancer Screening Program, General Directorate of Public Health, València, Spain; Centro Superior de Investigación en Salud Pública CSISP, FISABIO, València, Spain. Electronic address: martinez_inm@gva.es.
9. Institute of Computer Technology, Universitat Politècnica de València, Camino de Vera, s/n, València, 46022 Spain. Electronic address: rllobet@iti.upv.es.
Abstract
BACKGROUND: The percentage of dense breast tissue on digital mammograms is one of the most commonly used markers for breast cancer risk estimation. Geometric features of the dense tissue within the breast, and texture structures captured by sliding windows that scan the mammogram, may improve predictive ability when combined with the dense tissue percentage.
METHODS: A case/control study nested within a screening program was used to extract geometric and texture features. It covered 1563 women aged 45 to 70 from Comunitat Valenciana (Spain) with craniocaudal and mediolateral-oblique mammograms: 755 controls and 808 cases, for whom the contralateral breast mammograms at the closest screening visit before cancer diagnosis were used. The dense tissue segmentation was performed using DMScan and validated by two experienced radiologists. A model based on Random Forests was trained several times, varying the set of variables. A training dataset of 1172 patients was evaluated with a stratified 10-fold cross-validation scheme. The area under the Receiver Operating Characteristic curve (AUC) was the metric for predictive ability. Results were assessed only on the output of the model applied to the test set, which comprised the remaining 391 patients.
RESULTS: The AUC obtained with the dense tissue percentage alone (0.55) was compared with the results of a machine learning-based classifier. In addition to the percentage of dense tissue in both views, the classifier first included global geometric features such as the distance from the dense tissue to the pectoral muscle, dense tissue eccentricity, and dense tissue perimeter, obtaining an accuracy of 0.56. Adding a global feature based on local histograms of oriented gradients significantly improved the accuracy of the classifier (0.61). The number of correctly classified patients rose from 208 to 236.
CONCLUSION: Relative geometric features of dense tissue over the breast and histograms of standardized local texture features based on sliding windows scanning the whole breast improve risk prediction beyond the dense tissue percentage adjusted by geometrical variables. Other classifiers could improve the results obtained by the conventional Random Forests used in this study.
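The evaluation protocol described in the methods (a held-out test set of 391 patients, stratified 10-fold cross-validation on the remaining 1172, Random Forests scored by AUC) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the use of scikit-learn, the synthetic feature matrix, the number of feature columns, the number of trees, and the random seeds are all hypothetical choices, and the printed scores say nothing about the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic stand-in for the study data: 808 cases and 755 controls.
# The 12 feature columns are hypothetical placeholders for the dense tissue
# percentage, geometric features, and texture histograms.
N_CASES, N_CONTROLS = 808, 755
rng = np.random.default_rng(0)
y = np.concatenate([np.ones(N_CASES, dtype=int), np.zeros(N_CONTROLS, dtype=int)])
X = rng.normal(size=(y.size, 12))
X[:, 0] += 0.3 * y  # weak artificial signal in one column

# Hold out 391 patients as the test set, preserving the case/control ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=391, stratify=y, random_state=0
)

# Assumed hyperparameters; the abstract does not state them.
model = RandomForestClassifier(n_estimators=100, random_state=0)

# Stratified 10-fold cross-validation on the 1172 training patients.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc")

# Final assessment uses only the held-out test set.
model.fit(X_train, y_train)
test_scores = model.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, test_scores)

print(f"Cross-validation AUC: {cv_auc.mean():.3f} +/- {cv_auc.std():.3f}")
print(f"Test AUC: {test_auc:.3f}")
```

Repeating the same run with different column subsets of `X` would mirror the idea of training the model several times while varying the set of variables, with each variant compared on the same held-out patients.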
Authors: Andrés Larroza; Francisco Javier Pérez-Benito; Juan-Carlos Perez-Cortes; Marta Román; Marina Pollán; Beatriz Pérez-Gómez; Dolores Salas-Trejo; María Casals; Rafael Llobet Journal: Diagnostics (Basel) Date: 2022-07-28