Machine learning methods can more efficiently predict prostate cancer compared with prostate-specific antigen density and prostate-specific antigen velocity.

Satoshi Nitta¹, Masakazu Tsutsumi¹, Shotaro Sakka¹, Tsuyoshi Endo¹, Kenichiro Hashimoto², Morikuni Hasegawa³, Takayuki Hayashi³, Koji Kawai⁴, Hiroyuki Nishiyama⁴.

Abstract

BACKGROUND: Prostate-specific antigen (PSA)-based screening for prostate cancer has been widely performed, but its accuracy is unsatisfactory. To improve accuracy, building an effective statistical model using machine learning methods (MLMs) is a promising approach.
METHODS: Data on continuous changes in the PSA level over the past 2 years were accumulated from 512 patients who underwent prostate biopsy after PSA screening. The age of the patients, PSA level, prostate volumes, and white blood cell count in urinalysis were used as input data for the MLMs. As MLMs, we evaluated the efficacy of three different techniques: artificial neural networks (ANNs), random forest, and support vector machine. Model performance was evaluated using area under the receiver operating characteristic curve (AUC) and compared with the PSA level and the conventional PSA-based parameters: PSA density and PSA velocity.
RESULTS: When using two annual PSA tests, all receiver operating characteristic curves of the three MLMs were above the curves for the PSA level, PSA density, and PSA velocity. The AUCs of ANNs, random forest, and support vector machine were 0.69, 0.64, and 0.63, respectively. Those values were higher than the AUCs of the PSA level, PSA density, and PSA velocity (0.53, 0.41, and 0.55, respectively). The accuracies of the MLMs (71.6% to 72.1%) were also superior to those of the PSA level (39.1%), PSA density (49.7%), and PSA velocity (54.9%). Among the MLMs, ANNs showed the most favorable AUC. The MLMs showed higher sensitivity and specificity than conventional PSA-based parameters. The model performance did not improve when using three annual PSA tests.
CONCLUSION: The present retrospective study results indicate that machine learning techniques can predict prostate cancer with significantly better AUCs than those of PSA density and PSA velocity.

Keywords: Machine learning method; Prostate cancer; Prostate-specific antigen

Year: 2019 PMID: 31485436 PMCID: PMC6713794 DOI: 10.1016/j.prnil.2019.01.001

Source DB: PubMed Journal: Prostate Int ISSN: 2287-8882

Introduction

Prostate-specific antigen (PSA)-based screening for prostate cancer (PCA) has been widely performed in many countries. PSA is produced by prostate epithelium; therefore, it is organ specific but is not a PCA-specific marker. PSA can be elevated in patients with benign prostate hypertrophy, prostatitis, or other non-PCA conditions. The low specificity of PSA can lead to unnecessary biopsy. The sensitivity of PSA is also unsatisfactory: it is limited to around 50% based on a widely used cutoff level of 4 ng/ml. To improve the accuracy of screening systems, various approaches such as measurements of free PSA, PSA density, PSA velocity, and introduction of the age-specific PSA reference range have been proposed [1–3]. The results are promising; however, they do not show sufficiently strong diagnostic accuracy on their own.

To overcome these obstacles, building an effective statistical model using various predictive variables is a promising approach. For this purpose, machine learning techniques have been used extensively in the field of clinical medicine, especially for the construction of prediction models. For prediction of PCA, the most frequently evaluated machine learning technique is the artificial neural network (ANN) [4–12]. ANNs show high area under the curve (AUC) values compared with PSA alone (total PSA or other PSA-based assessments), ranging from 0.67 to 0.87 depending on the selected variables and the examined population. More recently, as alternative machine learning methods (MLMs), support vector machine (SVM) and random forest (RF) have been applied to PSA screening [13, 14]. These new algorithms may help to improve diagnosis, but available information is limited.

Partly because of the low sensitivity and specificity of PSA, many participants in PSA screening receive repeat PSA testing annually. In general, participants with PSA above the cutoff levels are encouraged to undergo prostate biopsy. However, conservative attitudes toward the decision to undergo an invasive procedure sometimes lead participants to choose repeat PSA testing rather than prompt prostate biopsy. In addition, several guidelines recommend annual PSA testing for men with PSA below 4 ng/ml. The Japanese Urological Association (JUA) recommends annual PSA testing for men with PSA levels of 1–4 ng/ml. The NCCN Prostate Cancer Early Detection Panel recommends PSA testing at 1- to 2-year intervals for men aged 45–75 years with PSA levels of 1–3 ng/ml. Such frequent follow-up leads to multiple normal or intermediately abnormal PSA test values. It can result not only in psychological distress for participants but also in inefficient use of medical resources.

Based on this background, we conducted the present retrospective study with the objective of building an effective statistical model using two or three annual PSA tests before prostate biopsy. For this purpose, we evaluated the efficacy of SVM and RF in addition to the traditional machine learning technique, ANNs. To our knowledge, there are no previous studies using annual PSA testing to improve the accuracy of PSA screening.

Patients and methods

Patients

The study was approved by the independent ethics committee of Hitachi General Hospital. Between October 2002 and June 2016, 3,911 patients underwent prostate biopsy at Hitachi General Hospital. We accumulated data on continuous changes in the PSA level over the past 2 years from these 3,911 patients and excluded those patients who received drugs (e.g., dutasteride) that might affect the PSA level. As a result, the overall study cohort (n = 512) was selected from the 3,911 patients. In addition, three continuous and annually measured PSA values were available for 304 patients.

All patients underwent transrectal ultrasonography (TRUS) before biopsy. The TRUS examinations were performed by well-trained technicians. The PSA density was calculated by dividing the PSA by the prostate volume (approximated as an ellipsoid: anteroposterior diameter × lateral diameter × vertical diameter × π/6). The PSA velocity was calculated as the rate of PSA change using the first and last values only, with the equation (p_n − p_1)/(t_n − t_1) [n = total number of PSA tests, p = PSA value, and t = time at PSA test (yr)].

Of the 512 patients, 193 (37.7%) were diagnosed with PCA on the first prostate biopsy, which was performed using the 10-core transrectal approach. Table 1 shows the baseline characteristics of the patients. The mean PSA level and PSA distribution were not significantly different between the patients diagnosed with PCA and those not diagnosed with PCA. The mean prostate volume was significantly higher in the former group compared with the latter group, but there was no significant difference in PSA density and PSA velocity between the two groups. Among the 319 patients not diagnosed with PCA at the first biopsy, 112 patients underwent a second or subsequent prostate biopsy. The rebiopsy was performed using the 12-core transperineal approach. Of the 112 patients, 57 were eventually diagnosed with PCA.
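The two conventional PSA-based parameters defined above can be sketched in a few lines of Python. This is only an illustration of the stated formulas: the function names and the example patient values are hypothetical and are not taken from the study data.

```python
import math


def ellipsoid_volume_cc(ap_cm, lateral_cm, vertical_cm):
    """Approximate prostate volume (cc) as an ellipsoid: AP x lateral x vertical x pi/6."""
    return ap_cm * lateral_cm * vertical_cm * math.pi / 6


def psa_density(psa_ng_ml, volume_cc):
    """PSA density (ng/ml/cc): PSA divided by prostate volume."""
    return psa_ng_ml / volume_cc


def psa_velocity(psa_values, times_yr):
    """PSA velocity (ng/ml/year) using the first and last measurements only."""
    if len(psa_values) != len(times_yr) or len(psa_values) < 2:
        raise ValueError("need at least two PSA values with matching times")
    return (psa_values[-1] - psa_values[0]) / (times_yr[-1] - times_yr[0])


# Hypothetical patient with three annual PSA tests (illustrative numbers only).
volume = ellipsoid_volume_cc(4.0, 5.0, 4.5)
density = psa_density(6.0, volume)
velocity = psa_velocity([5.0, 5.4, 6.0], [0.0, 1.0, 2.0])
print(f"volume={volume:.1f} cc, density={density:.3f} ng/ml/cc, velocity={velocity:.2f} ng/ml/year")
```

Note that the intermediate value (5.4) does not affect the velocity, because only the first and last measurements enter the formula.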

Table 1

Distribution of diagnostic variables in 512 patients who underwent the first biopsy.

Characteristics	Patients diagnosed with PCA (N = 193)	Patients with negative biopsy (N = 319)	P value
Age (year)			0.52
50–59	7	30
60–69	82	149
≧70	104	140
Mean PSA level (ng/ml) a)	8.6	4.5	0.08
PSA (ng/ml) a)			0.67
<4	2	9
4 ≦ PSA < 10	149	260
10 ≦ PSA < 20	38	47
≧20	4	3
Mean prostate volume (cc)	55.6	44.8	<0.05
Mean PSA density (ng/ml/cc)	0.20	0.19	0.87
Mean PSA velocity (ng/ml/year)	0.96	0.71	0.08

PCA, prostate cancer; PSA, prostate-specific antigen.

a) Average of two serial PSA tests.

Machine learning methods

Three types of supervised machine learning algorithms (ANNs, SVM, and RF) were applied in this study. A set of variables comprising age of patients, PSA level (maximum, minimum, median, mean, and variance), prostate volume, and white blood cell (WBC) count in urinalysis was used as input, with the result of biopsy as the outcome, to create the PCA prediction model. Age of patients, PSA level, and prostate volume were entered as continuous variables, and the WBC count in urinalysis was entered in seven categories, from below 1 WBC/high-power field to 50–99 WBC/high-power field. The results of biopsy were entered as binary: 1 (PCA) and 0 (non-PCA). The machine learning models were fit using scikit-learn 0.18 modules for Python throughout this study.

The ANN models were fit using the MLPClassifier class. The multilayer perceptron is a feedforward ANN model that maps sets of input data onto sets of appropriate outputs. There can be one or more nonlinear hidden layers between the input and the output. The input neurons send information to the hidden layer, and the hidden layer sends data to the output layer. Every neuron in the hidden layer has weighted inputs, a nonlinear activation function (which defines the output, given an input), and one output. There are two main tuning parameters: the number of nodes in the hidden layer and the activation function. Training is the weight optimization process in which the error of predictions is minimized until the network reaches a specified level of accuracy.

The SVM model is a machine learning model that finds an optimal boundary between the possible outputs. SVM identifies the optimal separating hyperplane that maximizes the margin between the classes. The SVM models were fit using the SVC class of scikit-learn. There are three main tuning parameters: penalty parameter C, kernel type, and gamma. Parameter C handles the trade-off between maximizing the margin and minimizing the training error; increasing the value improves the classification accuracy for the training data but could lead to overfitting. Kernel type defines the type of nonlinear kernel function (tanh, radial basis function (RBF), polynomial) used for the separation. Gamma is the coefficient of these kernel functions. Increasing the gamma value improves the classification accuracy for the training data, but this could lead to overfitting.

The RF model is a machine learning model built on decision trees. In the decision tree, each node of the tree splits the data into two groups using a cutoff value within one of the features. As the depth increases, the decision tree tends to overfit the training data; the tree fits details of particular data rather than the overall properties of the distributions. The RF method is able to minimize the effect of the overfitting problem by creating an ensemble of randomized decision trees, each of which overfits the data, and averaging their results to find a better classification. Increasing the number of trees improves the accuracy for the training data, but the computation time for learning also increases.

For all three models, the parameters of the estimator were optimized by 10-fold cross-validated grid search over a parameter grid. For each of the 10 "folds," a model was trained using the other nine folds as training data, and the resulting model was validated on the remaining fold. For each parameter set in the parameter grid, the 10-fold cross-validation was evaluated, and the best parameter set was selected.
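The tuning procedure described above can be sketched with current scikit-learn as follows. This is a minimal illustration, not the authors' code: the data are synthetic (the study data are not available here), and the grid values, the feature scaling step, the stratified folds, and the use of AUC as the selection criterion are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the patient features and biopsy labels (assumption).
X, y = make_classification(n_samples=200, n_features=8, n_informative=4, random_state=0)

# 10-fold cross-validation; stratification and shuffling are assumptions.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Each entry: (estimator, parameter grid). Grid values are illustrative only.
candidates = {
    "ANN": (
        make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0)),
        {
            "mlpclassifier__hidden_layer_sizes": [(5,), (10,)],  # nodes in the hidden layer
            "mlpclassifier__activation": ["tanh", "relu"],       # activation function
        },
    ),
    "SVM": (
        make_pipeline(StandardScaler(), SVC()),
        {
            "svc__C": [0.1, 1, 10],                      # penalty parameter C
            "svc__kernel": ["rbf", "sigmoid", "poly"],   # kernel type ("sigmoid" is the tanh kernel)
            "svc__gamma": [0.01, 0.1],                   # kernel coefficient
        },
    ),
    "RF": (
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_depth": [3, None]},  # number and depth of trees
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Every parameter set in the grid is scored by 10-fold cross-validated AUC.
    search = GridSearchCV(estimator, grid, cv=cv, scoring="roc_auc")
    search.fit(X, y)
    results[name] = (search.best_params_, search.best_score_)
    print(name, search.best_params_, round(search.best_score_, 3))
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling inside each training fold, so the held-out fold does not leak into the fitted preprocessing.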

Statistical analysis

Continuous variables were compared using the independent sample Student t test. Model performance was evaluated using area under the receiver operating characteristic (ROC) curve (AUC), which provides a measure of the discriminatory performance of the model; sensitivity, which is the proportion of true positives that are classified as such; specificity, which measures the proportion of correctly identified true negatives; and accuracy, which is the proportion of correct predictions.
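The four performance measures named above can be written out directly. The following is a small, dependency-free sketch on made-up labels and scores (not study data); the function names are illustrative, and the AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels coded 1 (PCA) and 0 (non-PCA)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Proportion of true positives that are classified as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of true negatives that are classified as negative."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Proportion of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def roc_auc(y_true, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: eight cases, model scores, and a cutoff of 0.5.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", sensitivity(tp, fn))
print("specificity:", specificity(tn, fp))
print("accuracy:", accuracy(tp, tn, fp, fn))
print("AUC:", roc_auc(y_true, scores))
```

Sensitivity, specificity, and accuracy depend on the chosen cutoff, whereas the AUC summarizes discrimination over all possible cutoffs.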

Results

Fig. 1 shows the corresponding ROC curves for ANNs, SVM, and RF using two annual PSA tests to predict the pathological diagnosis of the first biopsy in 512 patients. The results were compared with those of the PSA level (cutoff of 4 ng/ml), PSA density (cutoff of 0.20 ng/ml/cc), and PSA velocity (cutoff of 0.75 ng/ml/year). As shown in Fig. 1, all the ROC curves of the three MLMs were above the curves for the PSA level, PSA density, and PSA velocity. In addition, we compared the ROC curves of those methods for predicting the combined result of the first and any subsequent biopsies. After subsequent biopsies, 250 of 512 patients were finally diagnosed as having PCA. In this case as well, the ROC curves of the three MLMs were above the curves for the PSA level, PSA density, and PSA velocity (Fig. 2).

Fig. 1

Receiver operating characteristic curve for prediction of prostate cancer diagnosis on the first biopsy is shown.

Fig. 2

Receiver operating characteristic curve for prediction of prostate cancer diagnosis on the first and second biopsies is shown.

Table 2 shows the AUC, sensitivity, specificity, and accuracy of each prediction method for the results of the first biopsy. As shown in Table 2, the AUC of ANNs was 0.69, which was superior to those of RF and SVM (0.64 and 0.63, respectively). The AUCs of the PSA level, PSA density, and PSA velocity were 0.53, 0.41, and 0.55, respectively, which were lower than those of the MLMs.

Table 2

Diagnostic performance of different prediction methods using two serial PSA tests and of PSA-related parameters.

Outcome	Artificial neural network	Random forest	Support vector machine	PSA density	PSA velocity	PSA
AUC	0.69	0.64	0.63	0.41	0.55	0.53
Sensitivity (%)	56.4	66.7	59.0	37.2	47.1	99.0
Specificity (%)	76.6	56.2	68.7	57.3	60.8	2.8
Accuracy (%)	71.6	72.1	71.6	49.7	54.9	39.1

AUC, area under the receiver operating characteristic curve; PSA, prostate-specific antigen.

As shown in Table 2, the accuracies of the three MLMs were also superior to those of the PSA level, PSA density, and PSA velocity. The sensitivities of the MLMs were 56.4% to 66.7%, which were higher than those of PSA density and PSA velocity. Furthermore, the specificities of the MLMs tended to be higher than those of the conventional PSA-related parameters. We further analyzed the predictive performance of the three MLMs using three annual PSA tests. As shown in Table 3, the AUCs of ANNs, RF, and SVM were 0.70, 0.68, and 0.71, respectively. Those were slightly superior to the AUCs using two annual PSA tests. The sensitivities were also higher than those using two annual PSA tests, whereas the specificities and accuracies were higher for some models and lower for others (for example, the specificity of ANNs was 64.1% versus 76.6%, and the accuracy of RF was 65.8% versus 72.1%).

Table 3

Diagnostic performance of different prediction methods using three serial PSA tests.

Outcome	Artificial neural network	Random forest	Support vector machine
AUC	0.70	0.68	0.71
Sensitivity (%)	59.1	72.7	68.2
Specificity (%)	64.1	64.1	79.5
Accuracy (%)	72.4	65.8	74.1

AUC, area under the receiver operating characteristic curve; PSA, prostate-specific antigen.


Discussion

As machine learning techniques, ANNs were first introduced into PCA diagnosis in 1994 by Snow et al. The authors used age, PSA level, digital rectal examination (DRE), and TRUS findings as input data and reported an excellent AUC of 0.87. Since then, ANNs have been widely used for this purpose. However, reported AUCs ranged from 0.67 to 0.88 depending on the selected variables and the examined population. According to the review by Schröder and Kattan, seven of the eight ANN studies used DRE findings as variables. In six studies, percent free PSA was also introduced as a variable. Here, we showed that machine learning techniques using simpler predictors as input data can efficiently predict PCA.

Our analyses have provided the following interesting findings. In the present study, we used age, PSA level, prostate volume, and WBC count in urinalysis as variables. The WBC count in urinalysis was selected in an attempt to exclude PSA elevation due to prostatitis. We excluded DRE as a variable to avoid subjective factors. All three examined MLMs showed favorable AUCs of 0.63 to 0.69, which were better than those of the PSA level, PSA density, and PSA velocity. We further analyzed the predictive performance of the models by adding PSA density and PSA velocity as variables, but this failed to further improve the AUC (data not shown). Therefore, it is possible that these models had already learned factors associated with PSA density and PSA velocity.

As the JUA recommends annual PSA testing for men with PSA levels of 1–4 ng/ml, a number of Japanese PSA screening participants have data from two or more prior annual PSA tests. The present prediction models can help decision-making between prompt prostate biopsy and further PSA follow-up. However, the models using three annual PSA tests did not improve predictive performance, as shown in Table 3. Therefore, it is possible that PSA levels over the past 2 years were sufficient for this purpose.

Although the JUA recommends annual PSA testing, individualized rescreening intervals have recently been proposed for men with PSA levels in the range of 1–4 ng/ml. Randazzo et al. conducted a population-based prospective screening study and proposed a retest interval of every 3–4 years for men with a baseline PSA of 1–2 ng/ml. The European Association of Urology recommends postponing the PSA follow-up interval to 8 years for men with PSA of <2 ng/ml at the age of 60 years. In the future, if prediction models using PSA testing at optimal intervals are developed, they could be a powerful tool for a more individualized and scientific rescreening strategy.

Because there are no previous studies using annual PSA testing, we compared three different models using the same variables and population to identify the most suitable prediction model. In clinical settings, a large number of variables relative to the sample size tends to cause overfitting in machine learning. The RF model is known to be less prone to overfitting than the traditional ANNs or SVM. Therefore, RF may be a candidate for a suitable model when using annual PSA testing. However, in the present study, the AUC of RF was not superior to that of ANNs, and our study did not have enough power to draw a conclusion on this point.

Although our study revealed important findings, there are several limitations to our analysis. First, the sample size was relatively small; therefore, further investigation is needed to establish a prediction model. Second, many potential biases resulting from the retrospective design of the analysis must be considered. Third, information about the decision-making process for prostate biopsy selection was not fully available.

In conclusion, the present retrospective study showed that machine learning techniques could predict a PCA diagnosis with significantly better AUCs than those of PSA density and PSA velocity. It is possible that the three MLMs developed a tendency to detect PCA from input data more effectively than human intelligence.

Conflicts of interest

No potential conflict of interest relevant to this article was reported.

18 in total

1. Initial biopsy outcome prediction--head-to-head comparison of a logistic regression-based nomogram versus artificial neural network.

Authors: Felix K-H Chun; Markus Graefen; Alberto Briganti; Andrea Gallina; Julia Hopp; Michael W Kattan; Hartwig Huland; Pierre I Karakiewicz
Journal: Eur Urol Date: 2006-08-04 Impact factor: 20.096

2. Novel artificial neural network for early detection of prostate cancer.

Authors: Bob Djavan; Mesut Remzi; Alexandre Zlotta; Christian Seitz; Peter Snow; Michael Marberger
Journal: J Clin Oncol Date: 2002-02-15 Impact factor: 44.544

3. Performance of a neural network in detecting prostate cancer in the prostate-specific antigen reflex range of 2.5 to 4.0 ng/mL.

Authors: R J Babaian; H Fritsche; A Ayala; V Bhadkamkar; D A Johnston; W Naccarato; Z Zhang
Journal: Urology Date: 2000-12-20 Impact factor: 2.649

4. A (-5, -7) proPSA based artificial neural network to detect prostate cancer.

Authors: Carsten Stephan; Hellmuth-Alexander Meyer; Maciej Kwiatkowski; Franz Recker; Henning Cammann; Stefan A Loening; Klaus Jung; Michael Lein
Journal: Eur Urol Date: 2006-05-02 Impact factor: 20.096

5. PSA, PSA density, PSA density of transition zone, free/total PSA ratio, and PSA velocity for early detection of prostate cancer in men with serum PSA 2.5 to 4.0 ng/mL.

Authors: B Djavan; A Zlotta; C Kratzik; M Remzi; C Seitz; C C Schulman; M Marberger
Journal: Urology Date: 1999-09 Impact factor: 2.649

6. Assay-specific artificial neural networks for five different PSA assays and populations with PSA 2-10 ng/ml in 4,480 men.

Authors: Carsten Stephan; Chuanliang Xu; Henning Cammann; Markus Graefen; Alexander Haese; Hartwig Huland; Axel Semjonow; Eleftherios P Diamandis; Mesut Remzi; Bob Djavan; Mark F Wildhagen; Bert G Blijenberg; Patrik Finne; Ulf-Hakan Stenman; Klaus Jung; Hellmuth-Alexander Meyer
Journal: World J Urol Date: 2007-02-28 Impact factor: 4.226

7. Age-specific reference range of prostate-specific antigen and prostate cancer detection in population-based screening cohort in Japan: verification of Japanese Urological Association Guideline for prostate cancer.

Authors: Yasuhide Kitagawa; Kouji Izumi; Kiyoshi Sawada; Atsushi Mizokami; Kazuyoshi Nakashima; Kiyoshi Koshida; Takao Nakashima; Kimiomi Miyazaki; Mikio Namiki
Journal: Int J Urol Date: 2014-06-15 Impact factor: 3.369

8. Multicenter evaluation of an artificial neural network to increase the prostate cancer detection rate and reduce unnecessary biopsies.

Authors: Carsten Stephan; Henning Cammann; Axel Semjonow; Eleftherios P Diamandis; Leon F A Wymenga; Michael Lein; Pranav Sinha; Stefan A Loening; Klaus Jung
Journal: Clin Chem Date: 2002-08 Impact factor: 8.327

Review 9. The comparability of models for predicting the risk of a positive prostate biopsy with prostate-specific antigen alone: a systematic review.

Authors: Fritz Schröder; Michael W Kattan
Journal: Eur Urol Date: 2008-05-22 Impact factor: 20.096

10. Comparison of two different artificial neural networks for prostate biopsy indication in two different patient populations.

Authors: Carsten Stephan; Chuanliang Xu; Patrik Finne; Henning Cammann; Hellmuth-Alexander Meyer; Michael Lein; Klaus Jung; Ulf-Hakan Stenman
Journal: Urology Date: 2007-08-03 Impact factor: 2.649
