Robust Estimation of Breast Cancer Incidence Risk in Presence of Incomplete or Inaccurate Information.

Siva Teja Kakileti, Geetha Manjunath, Andre Dekker, Leonard Wee.

Abstract

PURPOSE: To evaluate the robustness of multiple machine learning classifiers for breast cancer risk estimation in the presence of incomplete or inaccurate information. DATA AND METHODS: Open data for this study were obtained from the BCSC Data Resource (http://breastscreening.cancer.gov/). We conducted two ablation-type experiments to compare the robustness of different classifiers: in one, we randomly switched known information to missing with probability pm; in the other, we randomly corrupted existing information with probability pc. We considered three prominent machine learning classifiers, namely logistic regression (LR), random forests (RF), and a custom neural network (NN) architecture, and compared the degradation of their discrimination performance as the probability of missing or inaccurate data increased.
RESULTS: LR, RF, and the custom NN achieved an area under the curve (AUC) of 0.645, 0.643, and 0.649, respectively, on a test set of 500,000 observations. When we varied the probabilities pm and pc from 0 to 1, the NN achieved a higher AUC than RF and LR as long as less than half the data were missing or inaccurate (that is, for pm < 0.5 and pc < 0.5). For missing (pm) or corruption (pc) probabilities above 0.5, however, LR performed similarly to the custom NN. RF performed worse overall when the data had additional missing or incorrect entries.
CONCLUSION: In cases where the input information is missing or inaccurate, our experiments show that the proposed custom NN provides reliable risk estimates in medical datasets like BCSC. These results are particularly important in health care applications where not every attribute of the individual participant might be available.
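
The two perturbations described in the methods can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes integer-coded categorical features, a hypothetical `MISSING` sentinel of -1, and that "corruption" means swapping a value for a different valid category. The abstract specifies none of these details, and the function names and toy data are invented for the example.

```python
import numpy as np

MISSING = -1  # hypothetical sentinel for "unknown"; the abstract does not state the encoding


def mask_missing(X, p_m, rng):
    """Set each entry to MISSING independently with probability p_m."""
    X = X.copy()
    drop = rng.random(X.shape) < p_m
    X[drop] = MISSING
    return X


def corrupt(X, p_c, n_levels, rng):
    """With probability p_c, replace an entry by a different valid level (assumed scheme)."""
    X = X.copy()
    flip = rng.random(X.shape) < p_c
    for j, k in enumerate(n_levels):
        rows = np.flatnonzero(flip[:, j])
        # Adding an offset in 1..k-1 modulo k always yields a different level.
        offset = rng.integers(1, k, size=rows.size)
        X[rows, j] = (X[rows, j] + offset) % k
    return X


rng = np.random.default_rng(0)
n_levels = [3, 4, 2]  # hypothetical number of categories per feature
X = np.column_stack([rng.integers(0, k, size=1000) for k in n_levels])

X_missing = mask_missing(X, p_m=0.3, rng=rng)
X_corrupt = corrupt(X, p_c=0.3, n_levels=n_levels, rng=rng)
```

Sweeping `p_m` or `p_c` from 0 to 1 and scoring each fitted classifier on the perturbed test set would give AUC-versus-probability curves of the kind the results compare.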

Keywords:  Artificial Neural Networks; Breast cancer risk; Machine Learning; inaccurate data; missing values

Year:  2020        PMID: 32856859      PMCID: PMC7771951          DOI: 10.31557/APJCP.2020.21.8.2307

Source DB:  PubMed          Journal:  Asian Pac J Cancer Prev        ISSN: 1513-7368

