Yasunari Miyagi, Toshihiro Habara, Rei Hirata, Nobuyoshi Hayashi.
Abstract
PURPOSE: To identify artificial intelligence (AI) classifiers in images of blastocysts to predict the probability of achieving a live birth in patients classified by age. Results are compared to those obtained by conventional embryo (CE) evaluation.
Keywords: artificial intelligence; blastocyst; deep learning; live birth; neural network
Year: 2019 PMID: 30996683 PMCID: PMC6452012 DOI: 10.1002/rmb2.12266
Source DB: PubMed Journal: Reprod Med Biol ISSN: 1445-5781
Figure 1. A flowchart to make classifiers
Table 1. Univariate regression functions of conventional embryo evaluation parameters used to predict the probability of live birth
| Independent variable | Formula | Coefficient |
|---|---|---|
| Age | | |
| Time of embryo transfer | 1/(1 + Exp( | |
| Anti‐Müllerian hormone (ng/mL) | 1/(1 + Exp( | |
| Blastomere number on day 3 | | |
| Grade on day 3 (Class A = 1, B = 2, C = 3, D = 4) | | |
| Embryo cryopreservation day (Day 5 = 1, Day 6 = 2) | | |
| Inner cell mass (A = 1, B = 2, C = 3) | | |
| Averaged diameter (µm) | 1/(1 + Exp( | |
| Body mass index (kg/m²) | 1/(1 + Exp( | |
Independent variables, which were related to live birth and also used in the multivariate regression, are presented. Each formula was determined to fit the data distribution. Coefficients are shown as the mean ±SE.
The multivariate logistic regression function, 1/(1 + Exp(β₀ + β₁x₁ + … + β₁₀x₁₀)), of the conventional embryo evaluation for predicting live birth
| Independent variable | Coefficient | P value | Odds ratio |
|---|---|---|---|
| Constant (β₀) | | 1.06 × 10⁻⁴⁸ | ‐ |
| Age value | | 1.05 × 10⁻³⁵ | 60.42 |
| Average diameter value | | 1.792 × 10⁻¹⁶ | 163.76 |
| TE value | | 7.226 × 10⁻⁷ | 7.17 |
| Embryo cryopreservation day value | | 8.577 × 10⁻⁶ | 27.10 |
| ET times value | | 0.0000481 | 13.35 |
| ICM value | | 0.0081 | 3.47 |
| AMH value | | 0.1156 | 3.13 |
| Blastomere number value | | 0.280 | 1.84 |
| Body mass index value | | 0.379 | 1.91 |
| Grade on day 3 value | | 0.936 | 0.92 |
AMH, anti‐Müllerian hormone; ET, embryo transfer; ICM, inner cell mass; TE, trophectoderm.
The values of the independent variables (except the constant β₀) are those calculated by the univariate regression functions shown in Table 1. Multicollinearity was not observed between any two independent variables. Coefficients are shown as the mean ± SE.
Figure 2. Accuracy profiles with standard deviation (SD) according to the number of training data. Profiles for patients aged <35, 35‐37, and 38‐39 y are shown in the left column, and those for 40‐41 y, ≥42 y, and all ages are shown in the right column. The number of training data that achieved the best accuracy with the minimum SD was obtained for each age category. For patients aged <35 y and for all ages, the accuracies did not differ; thus, the best number of training data was determined by the maximum sum of the sensitivity and the specificity. The best numbers of training data were 16 665, 17 112, 7848, 8085, 12 144, and 49 245 for patients aged <35, 35‐37, 38‐39, 40‐41, and ≥42 y and all ages, respectively.
Architectures of the best classifier that showed the best accuracy for each age category
| Layers | <35 y | 35‐37 y | 38‐39 y | 40‐41 y | ≥42 y | All ages |
|---|---|---|---|---|---|---|
| 1. Convolution layer | | | | | | |
| Output channels | 50 | 40 | 20 | 40 | 40 | 50 |
| Kernel size | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 |
| 2. ReLU† | | | | | | |
| 3. Pooling layer | | | | | | |
| Kernel size | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 |
| 4. Convolution layer | | | | | | |
| Output channels | 64 | 64 | 64 | 64 | 64 | 64 |
| Kernel size | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 |
| 5. ReLU | | | | | | |
| 6. Pooling layer | | | | | | |
| Kernel size | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 |
| 7. Flatten layer | | | | | | |
| 8. Linear layer size | 210 | 210 | 210 | 210 | 210 | 210 |
| 9. ReLU | | | | | | |
| 10. Linear layer size | 2 | 2 | 2 | 2 | 2 | 2 |
| 11. Softmax layer | | | | | | |
ReLU, rectified linear units.
The proper convolutional neural network structures, each consisting of eleven layers, were obtained by convolutional deep learning. Only the number of output channels in the first convolution layer differed among the age categories.
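To make the layer listing above concrete, the following minimal sketch traces the output shape of each of the eleven layers. The input size (a single-channel 50 × 50 image), the unpadded stride-1 convolutions, and the non-overlapping pooling are assumptions made for illustration only; none of them is stated in this record, and the names `FIRST_CONV_CHANNELS` and `trace_shapes` are hypothetical.

```python
# Shape walk-through for the eleven-layer network in the table above.
# Assumptions (NOT stated in the source): a single-channel 50x50 input,
# unpadded stride-1 convolutions, and non-overlapping 2x2 pooling.

# Output channels of the first convolution layer per age category (from the table).
FIRST_CONV_CHANNELS = {
    "<35": 50, "35-37": 40, "38-39": 20, "40-41": 40, ">=42": 40, "all": 50,
}


def conv_out(size: int, kernel: int) -> int:
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_out(size: int, kernel: int) -> int:
    """Spatial size after non-overlapping pooling."""
    return size // kernel


def trace_shapes(age_group: str, height: int = 50, width: int = 50):
    """Return (layer name, output shape) pairs for layers 1 to 11."""
    channels = FIRST_CONV_CHANNELS[age_group]
    shapes = []
    h, w = conv_out(height, 5), conv_out(width, 5)
    shapes.append(("1. convolution 5x5", (channels, h, w)))
    shapes.append(("2. ReLU", (channels, h, w)))
    h, w = pool_out(h, 2), pool_out(w, 2)
    shapes.append(("3. pooling 2x2", (channels, h, w)))
    h, w = conv_out(h, 5), conv_out(w, 5)
    shapes.append(("4. convolution 5x5", (64, h, w)))
    shapes.append(("5. ReLU", (64, h, w)))
    h, w = pool_out(h, 2), pool_out(w, 2)
    shapes.append(("6. pooling 2x2", (64, h, w)))
    shapes.append(("7. flatten", (64 * h * w,)))
    shapes.append(("8. linear", (210,)))
    shapes.append(("9. ReLU", (210,)))
    shapes.append(("10. linear", (2,)))
    shapes.append(("11. softmax", (2,)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes("38-39"):
        print(f"{name:20s} -> {shape}")
```

Under these assumptions only the first three shapes depend on the age category; from the second convolution layer onward every classifier has the same dimensions.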
Figure 3. The process used to obtain the logistic regression function that predicts the probability of live birth from the confidence score (the estimated probability of belonging to the live birth category), fitted to the data distribution of the patients. A sample of patients aged 35‐37 y is shown. The upper panel shows the histogram of the confidence scores for live and nonlive births that were confirmed by tracking. The lower panel plots the incidence of live birth as a function of the confidence score as dots. The logistic regression model, with extrapolations, was fitted to the dots as a function of the confidence score to predict the probability of live birth.
Coefficients of the logistic regression, y = 1/(1 + Exp(β₀ + β₁x)), showing the probability of live birth as a function of the confidence score, which is the AI‐generated predicted probability of live birth obtained from an image of the blastocyst
| Patient age (y) | β₀ (±SE) | β₁ (±SE) | Odds ratio |
|---|---|---|---|
| <35 | 3.81 (±2.79) | −7.91 (±5.65) | 2724.39 |
| 35‐37 | 3.23 (±3.18) | −6.87 (±7.08) | 962.95 |
| 38‐39 | 2.12 (±1.97) | −3.36 (±4.33) | 28.79 |
| 40‐41 | 3.04 (±2.99) | −3.70 (±5.61) | 40.45 |
| ≥42 | 2.93 (±5.12) | −0.71 (±11.19) | 2.03 |
| All ages | 1.70 (±1.23) | −3.30 (±2.47) | 27.11 |
β₀, β₁, coefficients; SE, standard error; x, confidence score of the blastocyst; y, probability of live birth.
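As an illustration of how the tabulated coefficients are used, the sketch below evaluates y = 1/(1 + Exp(β₀ + β₁x)) for a given confidence score. With this sign convention the odds y/(1 − y) equal Exp(−(β₀ + β₁x)), so a one-unit increase in x multiplies the odds by Exp(−β₁); that quantity reproduces the odds ratio column to within rounding. The function and dictionary names are hypothetical, and the example confidence score of 0.5 is arbitrary.

```python
import math

# (beta0, beta1) per age category, copied from the table above.
COEFFICIENTS = {
    "<35": (3.81, -7.91),
    "35-37": (3.23, -6.87),
    "38-39": (2.12, -3.36),
    "40-41": (3.04, -3.70),
    ">=42": (2.93, -0.71),
    "all": (1.70, -3.30),
}


def live_birth_probability(confidence: float, age_group: str) -> float:
    """Evaluate y = 1 / (1 + exp(beta0 + beta1 * x)) for confidence score x."""
    beta0, beta1 = COEFFICIENTS[age_group]
    return 1.0 / (1.0 + math.exp(beta0 + beta1 * confidence))


def odds_ratio(age_group: str) -> float:
    """Factor by which the odds change per unit increase in x: exp(-beta1)."""
    _, beta1 = COEFFICIENTS[age_group]
    return math.exp(-beta1)


if __name__ == "__main__":
    for group in COEFFICIENTS:
        p = live_birth_probability(0.5, group)
        print(f"{group:>6s}: p(x=0.5) = {p:.3f}, odds ratio = {odds_ratio(group):.2f}")
```

Because β₁ is negative in every age category, the predicted probability rises with the confidence score, most steeply for patients aged <35 y and least steeply for those aged ≥42 y.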
Figure 4. The functions used to predict the probability of live birth are plotted for the age categories <35, 35‐37, 38‐39, 40‐41, and ≥42 y and for all ages. The functions for ages <35 and 35‐37 y appeared similar. As age advanced beyond 35 y, and especially at 42 y or older, the probability of live birth decreased. These functions, which were derived from artificial intelligence, appeared consistent with the significance of age.
Figure 5. Examples of original images of the blastocyst in patients aged 38‐39 y
Live birth prediction by the AI classifiers obtained by deep learning with the convolutional neural network (upper panel) and by the multivariate regression of the conventional embryo evaluation (lower panel)
| Patient age (y) | Actual live birth | Actual nonlive birth | Predicted live birth | Predicted nonlive birth | Accuracy | Sensitivity | Specificity | PPV | NPV | AUC | 95% CI of AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| The AI with deep learning in the convolutional neural network | |||||||||||
| <35 | 176 | 278 | 80 | 374 | 0.639 | 0.261 | 0.878 | 0.575 | 0.652 | 0.592 | 0.538‐0.646 |
| 35‐37 | 77 | 173 | 20 | 230 | 0.708 | 0.156 | 0.954 | 0.600 | 0.717 | 0.634 | 0.558‐0.711 |
| 38‐39 | 33 | 109 | 12 | 130 | 0.782 | 0.212 | 0.954 | 0.583 | 0.800 | 0.671 | 0.560‐0.782 |
| 40‐41 | 26 | 135 | 25 | 136 | 0.807 | 0.385 | 0.889 | 0.400 | 0.882 | 0.713 | 0.594‐0.831 |
| ≥42 | 8 | 127 | 14 | 121 | 0.881 | 0.375 | 0.913 | 0.214 | 0.959 | 0.696 | 0.488‐0.904 |
| All ages | 318 | 821 | 93 | 1046 | 0.721 | 0.148 | 0.944 | 0.505 | 0.741 | 0.574 | 0.537‐0.612 |
| The conventional embryo evaluation | |||||||||||
| <35 | 176 | 278 | 167 | 287 | 0.610 | 0.471 | 0.697 | 0.497 | 0.676 | 0.632 | 0.579‐0.685 |
| 35‐37 | 77 | 173 | 12 | 238 | 0.700 | 0.090 | 0.971 | 0.583 | 0.706 | 0.634 | 0.557‐0.711 |
| 38‐39 | 33 | 109 | 0 | 142 | 0.768 | 0.000 | 1.000 | NA | 0.768 | 0.775 | 0.675‐0.875 |
| 40‐41 | 26 | 135 | 0 | 161 | 0.839 | 0.000 | 1.000 | NA | 0.839 | 0.736 | 0.620‐0.851 |
| ≥42 | 8 | 127 | 0 | 135 | 0.941 | 0.000 | 1.000 | NA | 0.941 | 0.764 | 0.567‐0.960 |
| All ages | 318 | 821 | 172 | 967 | 0.740 | 0.305 | 0.909 | 0.564 | 0.771 | 0.723 | 0.688‐0.757 |
The ability of a test is usually presented as its sensitivity, specificity, and AUC. The clinical outcome is presented as the accuracy. With the AI, the accuracy increased as a function of age (P < 1.5 × 10⁻¹⁰), whereas neither the sensitivity nor the specificity did (Cochran‐Armitage test). The sensitivity and specificity ranged over approximately 0.20‐0.40 and 0.88‐0.95, respectively. With conventional embryo evaluation, the accuracy, sensitivity, and specificity changed as a function of age (P < 1.1 × 10⁻¹⁷, P < 1.3 × 10⁻¹³, and P < 3.6 × 10⁻²⁵, respectively; Cochran‐Armitage test). When patients were older than 37 years, the sensitivities were 0 and the specificities were 1 because of the low probability of a live birth. This occurs when every case is predicted to be a nonlive birth. The conventional embryo evaluation method therefore seems to be ineffective in patients older than 37 years. The sum of the sensitivity and the specificity was 1.196 ± 0.08 for the AI and 1.046 ± 0.07 for the conventional embryo evaluation (P = 0.01 and P = 0.034 by unpaired t test and Mann‐Whitney test, respectively). AUC, area under the curve; CI, confidence interval; NPV, negative predictive value; PPV, positive predictive value.
P < 0.05 for the AI vs the conventional embryo evaluation.
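The relationship between the counts and the derived columns in the table above can be sketched as follows. The table does not list the number of true positives; the value 46 used for the AI row of patients aged <35 y is inferred here from the reported sensitivity (0.261 × 176 ≈ 46) and is an assumption, as is the function name `metrics_from_counts`.

```python
def metrics_from_counts(actual_live, actual_nonlive, predicted_live, true_positives):
    """Derive accuracy, sensitivity, specificity, PPV, and NPV from table counts.

    true_positives is not given in the table; callers must infer it
    (for example, from the reported sensitivity).
    """
    false_positives = predicted_live - true_positives
    false_negatives = actual_live - true_positives
    true_negatives = actual_nonlive - false_positives
    total = actual_live + actual_nonlive
    predicted_nonlive = false_negatives + true_negatives
    return {
        "accuracy": (true_positives + true_negatives) / total,
        "sensitivity": true_positives / actual_live,
        "specificity": true_negatives / actual_nonlive,
        # PPV is undefined (NA in the table) when nothing is predicted as live birth.
        "ppv": true_positives / predicted_live if predicted_live else None,
        "npv": true_negatives / predicted_nonlive if predicted_nonlive else None,
    }


if __name__ == "__main__":
    # AI classifier, patients aged <35 y (true positives inferred as 46).
    ai_under_35 = metrics_from_counts(176, 278, 80, 46)
    # Conventional embryo evaluation, 38-39 y: every case predicted nonlive.
    ce_38_39 = metrics_from_counts(33, 109, 0, 0)
    for label, row in (("AI <35", ai_under_35), ("CE 38-39", ce_38_39)):
        print(label, {k: (None if v is None else round(v, 3)) for k, v in row.items()})
```

The second call shows why a classifier that predicts a nonlive birth for every case has a sensitivity of 0, a specificity of 1, and an undefined PPV, while its accuracy simply equals the proportion of nonlive births.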
Figure 6. The sum of the sensitivity and the specificity for the artificial intelligence (AI) classifier and the laboratory data for the age categories <35, 35‐37, 38‐39, 40‐41, and ≥42 y. As age advanced, the sum of the sensitivity and the specificity increased for the AI and decreased for the laboratory data. The sum values for the AI and the laboratory data were 1.196 ± 0.08 and 1.046 ± 0.07 (mean ± SD), respectively. The AI achieved significantly higher values (P = 0.01 and P = 0.034 by unpaired t test and Mann‐Whitney test, respectively). As age advanced, the AI classifier became more useful.
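The summary values quoted in this caption can be approximately recomputed from the five age‐specific rows of the prediction table. The sketch below uses the rounded sensitivities and specificities as printed, so the means may differ from the reported ones in the third decimal place; the variable and function names are hypothetical, and the significance tests are not reproduced.

```python
from statistics import mean, stdev

# (sensitivity, specificity) for <35, 35-37, 38-39, 40-41, and >=42 y,
# copied from the prediction table (rounded values as printed).
AI_ROWS = [(0.261, 0.878), (0.156, 0.954), (0.212, 0.954), (0.385, 0.889), (0.375, 0.913)]
CE_ROWS = [(0.471, 0.697), (0.090, 0.971), (0.000, 1.000), (0.000, 1.000), (0.000, 1.000)]


def sensitivity_plus_specificity(rows):
    """Sum of sensitivity and specificity for each age category."""
    return [sensitivity + specificity for sensitivity, specificity in rows]


if __name__ == "__main__":
    for label, rows in (("AI", AI_ROWS), ("Conventional", CE_ROWS)):
        sums = sensitivity_plus_specificity(rows)
        # stdev is the sample standard deviation (n - 1 in the denominator).
        print(f"{label}: mean = {mean(sums):.3f}, SD = {stdev(sums):.2f}")
```

A sum of 1 corresponds to a classifier with no discriminating ability, which is the value reached by the conventional evaluation in the three oldest age categories.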