Jihwan Park, Mi Jung Rho, Hyong Woo Moon, Jaewon Kim, Chanjung Lee, Dongbum Kim, Choung-Soo Kim, Seong Soo Jeon, Minyong Kang, Ji Youl Lee.
Abstract
OBJECTIVES: To develop a model to predict biochemical recurrence (BCR) after radical prostatectomy (RP) using artificial intelligence (AI) techniques. PATIENTS AND METHODS: This study collected data from 7,128 patients with prostate cancer (PCa) who underwent RP at 3 tertiary hospitals. After preprocessing, data from 6,755 cases were used to build the BCR prediction model. The model had 16 input variables, with BCR as the outcome variable, and was developed using a random forest. Several sampling techniques were used to address class imbalance.
Keywords: Doctor’s Answer; PROMISE CLIP registry; artificial intelligence; biochemical recurrence; prostate cancer; radical prostatectomy; random forest
Year: 2021 PMID: 34180308 PMCID: PMC8243093 DOI: 10.1177/15330338211024660
Source DB: PubMed Journal: Technol Cancer Res Treat ISSN: 1533-0338
Sixteen Input Variables for BCR Prediction Model.
| No | Variables | Type | Missing values (N) | Missing proportion |
|---|---|---|---|---|
| 1 | Age at diagnosis | Factor (ordinal) | 5 | 0.001 |
| 2 | BMI | Numeric (continuous) | 22 | 0.003 |
| 3 | Marital status | Factor | 1,450 | 0.215 |
| 4 | Education | Factor | 2,729 | 0.404 |
| 5 | Smoking | Factor | 136 | 0.02 |
| 6 | Drinking | Factor | 123 | 0.018 |
| 7 | Family history of prostate cancer | Factor | 1,276 | 0.189 |
| 8 | Initial PSA | Numeric (continuous) | 75 | 0.011 |
| 9 | Gleason group | Factor (ordinal) | 458 | 0.068 |
| 10 | Max positive core count | Numeric (continuous) | 653 | 0.097 |
| 11 | Core ratio | Numeric (continuous) | 551 | 0.082 |
| 12 | Neoplasm high risk malignant | Factor | 175 | 0.026 |
| 13 | Extracapsular extension | Factor | 235 | 0.035 |
| 14 | Seminal vesicle invasion | Factor | 193 | 0.029 |
| 15 | Lymph node metastasis | Factor | 507 | 0.075 |
| 16 | T staging | Factor | 1,810 | 0.268 |
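The N column above is consistent with per-variable missing-value counts: subtracting each count from the 6,755 analysed cases reproduces the per-variable totals in the descriptive statistics table, and dividing by 6,755 reproduces the last column as a fraction rather than a percentage. A minimal sketch of that arithmetic follows; the function name `summarize_missing` is made up for illustration and is not from the study.

```python
# Cases used to build the model (from the abstract).
TOTAL_CASES = 6755

# Missing-value counts for a few variables, copied from the table above.
missing_counts = {
    "Age at diagnosis": 5,
    "Marital status": 1450,
    "Education": 2729,
    "T staging": 1810,
}


def summarize_missing(n_missing, total=TOTAL_CASES):
    """Return (available cases, missing proportion rounded to 3 decimals)."""
    return total - n_missing, round(n_missing / total, 3)


for name, n_missing in missing_counts.items():
    available, proportion = summarize_missing(n_missing)
    print(f"{name}: {available} available, {proportion} missing")
```

For marital status, for example, this gives 5,305 available cases and a missing proportion of 0.215, matching both tables.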
Basic Information.
| Variable | Category | Sample size | % |
|---|---|---|---|
| Hospital | Seoul St. Mary’s Hospital of the Catholic University | 1,719 | 25 |
| | Samsung Medical Center | 2,383 | 35 |
| | Asan Medical Center | 2,653 | 39 |
| BCR | Patients without BCR | 4,555 | 67 |
| | Patients with BCR | 2,200 | 33 |
| Total | | 6,755 | 100 |
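The BCR rows show the class imbalance the sampling techniques were meant to address: roughly two patients without BCR for every patient with BCR. Random oversampling is one of the techniques the study names among its sampling methods. The sketch below illustrates only that idea in plain Python; it is not the authors' pipeline, which combined several resampling methods, and the helper name `random_oversample` and the seed value are assumptions made for illustration.

```python
import random


def random_oversample(labels, seed=0):
    """Return row indices that balance the classes by resampling
    every smaller class with replacement up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        # Draw extra rows with replacement until this class reaches the target size.
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    rng.shuffle(balanced)
    return balanced


# Class counts from the table above: 4,555 without BCR (0), 2,200 with BCR (1).
labels = [0] * 4555 + [1] * 2200
resampled = [labels[i] for i in random_oversample(labels)]
print(resampled.count(0), resampled.count(1))
```

After resampling, both classes contain 4,555 rows. Resampling of this kind is normally applied to the training split only, so that duplicated cases do not leak into the evaluation data.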
Descriptive Statistics for 16 Input Variables.
| Variable | Category | Sample size (or mean) | % | Total |
|---|---|---|---|---|
| Age at diagnosis | 40-44 | 16 | 0.2 | 6,750 |
| | 45-49 | 54 | 0.8 | |
| | 50-54 | 301 | 4.5 | |
| | 55-59 | 846 | 12.5 | |
| | 60-64 | 1,523 | 22.6 | |
| | 65-69 | 1,856 | 27.5 | |
| | 70-74 | 1,645 | 24.4 | |
| | 75-79 | 495 | 7.3 | |
| | 80-84 | 13 | 0.2 | |
| | Over 85 | 1 | 0 | |
| Marital status | Single | 51 | 1 | 5,305 |
| | Married | 5,165 | 97.4 | |
| | Divorced | 24 | 0.5 | |
| | Bereavement | 65 | 1.2 | |
| Education | Uneducated | 492 | 12.2 | 4,026 |
| | Elementary school graduate | 611 | 15.2 | |
| | Middle school graduate | 1,256 | 31.2 | |
| | High school graduate | 1,245 | 30.9 | |
| | University graduate and above | 422 | 10.5 | |
| Smoking | Non-smoker | 3,667 | 55.4 | 6,619 |
| | Ex-smoker | 2,303 | 34.8 | |
| | Smoker | 649 | 9.8 | |
| Alcohol consumption | Drinker | 4,687 | 70.7 | 6,632 |
| | Non-drinker | 1,945 | 29.3 | |
| Family history of PCa | No family history | 4,798 | 87.6 | 5,479 |
| | Family history with first cousin | 567 | 10.3 | |
| | Family history with second cousin | 114 | 2.1 | |
| Gleason group | 3 + 3 = 6 | 1,672 | 26.6 | 6,297 |
| | 3 + 4 = 7 | 2,269 | 36 | |
| | 4 + 3 = 7 | 1,006 | 16 | |
| | 4 + 4 = 8 | 742 | 11.8 | |
| | Gleason sum ≥ 9 | 608 | 9.7 | |
| Neoplasm high risk malignant | No | 1,496 | 22.7 | 6,580 |
| | Yes | 5,084 | 77.3 | |
| Extracapsular extension (ECE) | No | 4,312 | 66.1 | 6,520 |
| | Yes | 2,208 | 33.9 | |
| Seminal vesicle invasion (SVI) | No | 5,725 | 87.2 | 6,562 |
| | Yes | 837 | 12.8 | |
| Lymph node metastasis (LM) | No | 4,708 | 75.4 | 6,248 |
| | Yes | 1,540 | 24.6 | |
| T staging | Stage 1 | 22 | 0.4 | 4,945 |
| | Stage 2 | 3,286 | 66.5 | |
| | Stage 3 | 1,595 | 32.3 | |
| | Stage 4 | 42 | 0.8 | |
| BMI (Mean) | | 34.53 | | 6,733 |
| Initial PSA (Mean) | | 9.72 | | 6,680 |
| Max positive core count (Mean) | | 46.05 | | 6,102 |
| Core ratio (Mean) | | 0.46 | | 6,204 |
Algorithm Performance Results.
| No | Algorithm and sampling methods | Accuracy (%) | Recall (%) | Precision (%) | F1 score (%) | ROC AUC (%) |
|---|---|---|---|---|---|---|
| 1 | Random forest with SMOTE, one-sided selection | 80.39 | 68.75 | 95.4 | 79.73 | 87.77 |
| 2 | Random forest with SMOTE, ADASYN | 76.45 | 70.17 | 80.26 | 76.39 | 84.62 |
| 3 | Random forest with random oversampling, neighborhood cleaning rule | 84.64 | 76.64 | 82.36 | 83.55 | 92.23 |
| 4 | Random forest with SMOTEENN | 95.22 | 94.88 | 95.41 | 94.97 | 96.74 |
| 5 | Random forest with SMOTE Tomek, ENN, and random oversampling | 96.59 | 95.49 | 97.66 | 96.59 | 98.83 |
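For readers less familiar with the metrics in this table, the sketch below computes accuracy, recall, precision, and F1 score from confusion-matrix counts using the standard binary definitions, with BCR treated as the positive class. This excerpt does not say how the authors computed or averaged their metrics, so this is a generic illustration rather than their evaluation code. The function name `binary_metrics` and the counts passed to it are hypothetical. ROC AUC is omitted because it requires predicted scores, not just counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision, and F1 as percentages.

    Assumes all denominators are non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    recall = tp / (tp + fn)        # share of true BCR cases that were found
    precision = tp / (tp + fp)     # share of predicted BCR cases that were correct
    f1 = 2 * precision * recall / (precision + recall)
    values = {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }
    return {name: round(100 * value, 2) for name, value in values.items()}


# Hypothetical counts for illustration only; they are NOT taken from the study.
metrics = binary_metrics(tp=90, fp=10, fn=30, tn=170)
print(metrics)
```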
Figure 1. Dr. Answer AI software for BCR prediction model.