
Feasibility of predicting live birth by combining conventional embryo evaluation with artificial intelligence applied to a blastocyst image in patients classified by age.

Yasunari Miyagi1,2, Toshihiro Habara3, Rei Hirata3, Nobuyoshi Hayashi3.   

Abstract

PURPOSE: To identify the multivariate logistic regression in a combination (combination method) involving artificial intelligence (AI) classifiers in images of blastocysts along with a conventional embryo evaluation (CEE) to predict the probability of accomplishing a live birth in patients classified by maternal age.
METHODS: Retrospectively, a total of 5691 blastocysts were enrolled. Images captured 115 hours or 139 hours if not yet sufficiently large after insemination were classified according to age as follows: <35, 35-37, 38-39, 40-41, and ≥42 years old. The classifiers for each category were created by using convolutional neural networks associated with deep learning. Next, the feasibility of a method combining AI with multivariate logistic model functions by CEE was investigated.
RESULTS: The values of the area under the curve (AUC) and the accuracies to predict live birth achieved by the CEE/AI/combination methods were 0.651/0.634/0.655, 0.697/0.688/0.723, 0.771/0.728/0.791, 0.788/0.743/0.806 and 0.820/0.837/0.888, and 0.631/0.647/0.616, 0.687/0.675/0.671, 0.725/0.697/0.732, 0.714/0.776/0.801, and 0.910/0.866/0.784 for age categories of <35, 35-37, 38-39, 40-41, and ≥42 years old, respectively.
CONCLUSIONS: Though there were mostly no significant differences regarding the AUC and the sensitivity plus specificity in all age categories, the combination method seemed to be the best.
© 2019 The Authors. Reproductive Medicine and Biology published by John Wiley & Sons Australia, Ltd on behalf of Japan Society for Reproductive Medicine.


Keywords:  artificial intelligence; blastocyst; deep learning; live birth; neural network

Year:  2019        PMID: 31607794      PMCID: PMC6780028          DOI: 10.1002/rmb2.12284

Source DB:  PubMed          Journal:  Reprod Med Biol        ISSN: 1445-5781


INTRODUCTION

The goal of assisted reproductive technology (ART) is to secure a live birth. Developmental failure of the embryo, or miscarriage, causes psychological distress, loss of time, and additional costs for patients. Morphological structures, such as smooth endoplasmic reticulum clusters (sERCs), vacuoles, and refractile bodies, have been studied, but none has been found to be prognostic of the developmental ability of oocytes.1 Conventional morphological evaluation has had limited success in identifying aneuploid embryos.2, 3, 4, 5, 6 Although time‐lapse information has been reported to be predictive of aneuploidy, the available evidence may not be sufficient to justify the introduction of time‐lapse microscopy.2 Suboptimal embryos can be euploid, while embryos of good morphological quality may be aneuploid.2, 7, 8 Pre‐implantation genetic testing for aneuploidy (PGT‐A)9, 10 is another approach to examining chromosomal profiles, but it is invasive for the embryo and is the subject of ethical debate. Some countries forbid the transfer of an embryo after biopsy. The chromosomal profile of the biopsied sample does not always reflect that of the rest of the embryo because the embryo may be genetically heterogeneous. Mosaicism has been found in the trophectoderm (TE), and a single TE biopsy may not represent the complete TE.11 One report indicates that classifiers using artificial intelligence applied to an image of a blastocyst that was later implanted had the potential to predict the probability of live birth.12 To date, there have been no established procedures to predict live birth. Age is one of the most critical factors in sterility.13, 14 The number and quality of oocytes decrease as age advances. Patients older than 35 years should receive an expedited evaluation for the cause of sterility.15 Women older than 40 years should receive a more immediate evaluation.16
An aged oocyte shows dysfunction of cellular organelles and an increase in chromosomal abnormalities.17 Advanced age is a risk factor for female sterility, miscarriage, and stillbirth.18 The delivery rate categorized by age (<35, 35‐37, 38‐39, 40‐41, 42‐45 years old) affects the developmental speed of the embryo significantly (P < 0.0001).19 The live birth rates associated with ART for patients aged <35, 35‐37, 38‐39, 40‐41, and ≥42 years old were 0.20, 0.17, 0.12, 0.08, and 0.01, respectively, as reported by the Japan Society of Obstetrics and Gynecology in 2015.20 Thus, age is one of the most critical factors in fertility, yet there are no standard procedures for handling blastocysts or patients according to age. We therefore reported our system, which applies deep learning with a convolutional neural network21, 22, 23, 24 as artificial intelligence (AI) to blastocyst images classified by age, to explore a noninvasive means of predicting live births.25 Deep learning has become popular among machine learning methods, which include logistic regression,26 naive Bayes,27 nearest neighbor,28 neural network,29 and random forest.30 AI can generate a confidence score, which is the estimated probability of belonging to the live birth category. The score can be regarded as a blastocyst ranking. This approach will make it easier for physicians and embryologists to select better blastocysts. Here, we demonstrate the retrospective prediction of live birth using a multivariate regression function that combines a conventional embryo evaluation (CEE) method, which includes clinical information, observation, and grading of the morphological features of blastocysts, with AI applied to blastocyst images categorized by age.
In this article, we demonstrate an improved outcome by introducing the multivariate regression function, defined as the combination method, and by investigating the optimal cutoff points of the receiver operator characteristic (ROC) curves generated by AI, CEE, and the combination method. We present the feasibility of the combination method for predicting the probability of accomplishing a live birth.

MATERIALS AND METHODS

Patients and data preparation

This study was approved by the Institutional Review Board (IRB) at Okayama Couples’ Clinic (IRB no. 18000128‐05). Patients received explanations, and a Web site provided additional information together with an opt‐out option. A total of 5691 blastocysts obtained from patients from January 2009 to April 2017, with fully deidentified data, were enrolled. All blastocysts were tracked, and the outcome (live birth or non‐live birth) was confirmed. All data were divided randomly into training datasets and test datasets at a ratio of 4 to 1 (Figure 1).
Figure 1

A flowchart for making classifiers


CEE

For every blastocyst, the morphological features and clinical information, such as maternal age, body mass index, time of in vitro fertilization, time of embryo transfer, FSH value, anti‐Müllerian hormone value, blastocyst grade on day 3, embryo cryopreservation day, blastomere number on day 3 after insemination, grade of TE, grade of inner cell mass, antral follicle count, average diameter of the blastocyst, existence of oviduct infertility, existence of endometriosis, existence of immune sterility, insemination procedures, ovarian stimulation methods, refractile body, sERC grade, degree of blastocyst expansion, existence of a vacuole, male body mass index, and male age, were collected to assess the outcome of live birth or non‐live birth. The information above is defined as CEE in this study.25 This information was provided by embryologists and doctors who had been engaged in clinical practice for at least twenty years and were regarded as specialists carrying out standardized laboratory practice related to embryo morphology assessment according to the international consensus meeting in 2011.31 The relationships between live birth and each factor in the CEE were examined, and univariate regression functions were then obtained. The significant factors without multicollinearity (a state of strong correlation among the independent variables) were chosen for the multivariate analysis. Then, a multivariate regression function for the CEE to predict live birth was obtained.

Blastocyst images

An image of each blastocyst was captured at about 115 hours after insemination, or at 139 hours if the blastocyst was not yet large enough. The image was saved and deidentified so that it contained no data that could be used to identify the person. The images were sent to the AI system offline. The images were classified by maternal age into five categories: <35, 35‐37, 38‐39, 40‐41, and ≥42 years old. The numbers of non‐live births and live births were 1389 and 876, respectively, in the <35 group; 863 and 381, respectively, in the 35‐37 group; 545 and 164, respectively, in the 38‐39 group; 674 and 130, respectively, in the 40‐41 group; and 633 and 36, respectively, in the ≥42 group. The probabilities of live birth for the age groups <35, 35‐37, 38‐39, 40‐41, and ≥42 years old were 0.387, 0.306, 0.231, 0.162, and 0.054, respectively. Images of blastocysts that resulted in live births and images of blastocysts that resulted in non‐live births, including miscarriages, were used to create the AI classifiers.25
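The per-category probabilities quoted above follow directly from the reported counts. The short Python sketch below recomputes them as a consistency check; the dictionary layout and variable names are illustrative and not part of the original study, which used Mathematica.

```python
# Non-live-birth and live-birth counts per maternal age category, as reported in the text.
counts = {
    "<35": (1389, 876),
    "35-37": (863, 381),
    "38-39": (545, 164),
    "40-41": (674, 130),
    ">=42": (633, 36),
}


def live_birth_ratio(non_live, live):
    """Fraction of blastocysts in a category that resulted in a live birth."""
    return live / (non_live + live)


# Ratio per age category, rounded to three decimals as in the text.
ratios = {age: round(live_birth_ratio(*c), 3) for age, c in counts.items()}

# Totals over all categories.
total_blastocysts = sum(non_live + live for non_live, live in counts.values())
total_live = sum(live for _, live in counts.values())
overall_ratio = round(total_live / total_blastocysts, 3)

print(ratios, total_blastocysts, overall_ratio)
```

The category totals sum to the 5691 enrolled blastocysts.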

Preparation for AI

All deidentified images were transferred to the AI system offline. Each image was cropped to a square and then saved at a size of 100 × 100 pixels. Eighty percent of the training dataset was used as the AI training dataset. The rest of the training dataset was defined as the validation dataset. Thus, the AI training dataset, validation dataset, and test dataset did not overlap. The AI classifier was trained on the AI training dataset with simultaneous validation and then tested with the test dataset. The training datasets were augmented, because rotating a blastocyst image by an arbitrary angle yields images that have different vector data but belong to the same category.25
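The partitioning described above (a 4:1 training-to-test split, then 80% of the training portion for AI training and the rest for validation) can be sketched in Python as follows. This is a minimal illustration, not the authors' Mathematica code: the function name, the fixed seed, the use of rounding, and the assumption that the split is applied within each age category are all hypothetical. Rotation-based augmentation is not shown.

```python
import random


def split_indices(n_items, seed=0):
    """Split item indices into AI-training, validation and test index lists.

    Assumptions (not stated in the source): sizes are rounded to the nearest
    integer and the shuffle uses a fixed, arbitrary seed.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)

    n_test = round(n_items / 5)               # 4:1 training-to-test ratio
    test = indices[:n_test]
    training = indices[n_test:]

    n_ai_train = round(len(training) * 0.8)   # 80% of the training dataset
    ai_train = training[:n_ai_train]
    validation = training[n_ai_train:]         # the rest is the validation dataset
    return ai_train, validation, test


# Category sizes reported in the text for <35, 35-37, 38-39, 40-41 and >=42 years.
for n in (2265, 1244, 709, 804, 669):
    ai_train, validation, test = split_indices(n)
    print(n, len(ai_train), len(validation), len(test))
```

With rounding, the test-set sizes come out as 453, 249, 142, 161, and 134, which agree with the per-category totals (actual live births plus actual non-live births) listed in Table 5.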

AI classifier

AI classifier programs were developed for each age category. The classifiers consisted of convolutional neural networks32 that attempt to imitate the mammalian visual cortex21, 22, 23, 24, 32, 33, 34, 35, 36, 37 and L2 regularization38, 39 to acquire the probability for predicting either live birth or non‐live birth.25 We conducted deep learning with a convolutional neural network of 11 layers consisting of convolution layers with various kernel sizes40, 41, 42 and output channels, pooling layers,43, 44, 45, 46 flattened layers,47 linear layers,48, 49 rectified linear unit layers,50, 51 and one softmax layer.52, 53 The softmax layer generated a confidence score, which was the probability of a live birth (Table 1). We used cross‐validation.54, 55, 56 The suitable number of images for the training data was studied by evaluating variances and accuracy using the fivefold cross‐validation method (Figure 1): First, the test data consisted of the initial twenty percent of the data collected in each category, and a classifier was trained. Next, the test data were changed to the next twenty percent of the data. We repeated this procedure five times to encompass all data. The number of augmented training data was increased until the accuracy and the variance of the accuracies were likely to show their maximum and minimum values, respectively. This procedure reveals the suitable number of training data and validates the prediction performance more accurately by combining the five runs. Then, the best classifier, showing the smallest variance and the best accuracy, was chosen by varying the architecture of the neural network, the hyperparameters, and the image size (40 × 40, 50 × 50, 75 × 75, and 100 × 100 pixels). When the accuracies were not clearly different, the best classifier was selected based on the sum of the specificity and the sensitivity. The AI classifiers were obtained for each age category.
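The fivefold procedure described above, in which consecutive twenty-percent blocks serve in turn as test data, can be sketched as below. This Python sketch only generates the index partitions and summarizes per-fold accuracies; the function names are hypothetical, the handling of a remainder in the last fold is an assumption, and training a classifier on each fold is left out.

```python
def fivefold_splits(n_items, n_folds=5):
    """Yield (train_indices, test_indices), using consecutive blocks as test data."""
    fold_size = n_items // n_folds
    for fold in range(n_folds):
        start = fold * fold_size
        # Assumption: the last fold absorbs any remainder so every item is tested once.
        stop = n_items if fold == n_folds - 1 else start + fold_size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_items))
        yield train, test


def mean_and_variance(accuracies):
    """Mean and sample variance of the per-fold accuracies."""
    mean = sum(accuracies) / len(accuracies)
    variance = sum((a - mean) ** 2 for a in accuracies) / (len(accuracies) - 1)
    return mean, variance


for train, test in fivefold_splits(10):
    print(test)
```

A configuration would then be preferred when the mean accuracy over the five folds is high and the variance is low, as the paragraph above describes.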
Table 1

Architectures of the best classifier that showed the best accuracy for each age category

| Layer | Parameter | <35 | 35‐37 | 38‐39 | 40‐41 | ≥42 | All ages |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1. Convolution layer | Output channels | 50 | 40 | 50 | 64 | 50 | 50 |
| | Kernel size | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 |
| 2. ReLU | | | | | | | |
| 3. Pooling layer | Kernel size | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 |
| 4. Convolution layer | Output channels | 64 | 64 | 64 | 64 | 64 | 64 |
| | Kernel size | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 | 5 × 5 |
| 5. ReLU | | | | | | | |
| 6. Pooling layer | Kernel size | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 | 2 × 2 |
| 7. Flatten layer | | | | | | | |
| 8. Linear layer | Size | 210 | 210 | 210 | 210 | 210 | 210 |
| 9. ReLU | | | | | | | |
| 10. Linear layer | Size | 2 | 2 | 2 | 2 | 2 | 2 |
| 11. Softmax layer | | | | | | | |

The proper convolutional neural network structures, which consisted of eleven layers in convolutional deep learning, were obtained. The numbers of output channels in the first convolution layer were different.

Abbreviation: ReLU, Rectified linear units.
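To make the layer sequence in Table 1 concrete, the Python sketch below traces the tensor shape through the eleven layers for a square input image and implements a numerically stable softmax. The stride of 1 and absence of padding in the convolutions, and the non-overlapping pooling windows, are assumptions made for illustration; the source does not state them, so the flattened size computed here is indicative only. The 50 × 50 default is the best image size reported in the Results.

```python
import math


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution (assumed stride 1, no padding)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, kernel):
    """Spatial size after pooling with non-overlapping windows (assumed)."""
    return (size - kernel) // kernel + 1


def trace_shapes(image_size=50, first_channels=50):
    """Return (layer, shape) pairs for the shape-changing layers of Table 1."""
    shapes = []
    size = conv_output_size(image_size, kernel=5)
    shapes.append(("1. convolution", (first_channels, size, size)))
    size = pool_output_size(size, kernel=2)
    shapes.append(("3. pooling", (first_channels, size, size)))
    size = conv_output_size(size, kernel=5)
    shapes.append(("4. convolution", (64, size, size)))
    size = pool_output_size(size, kernel=2)
    shapes.append(("6. pooling", (64, size, size)))
    shapes.append(("7. flatten", (64 * size * size,)))
    shapes.append(("8. linear", (210,)))
    shapes.append(("10. linear", (2,)))
    return shapes


def softmax(logits):
    """Layer 11: convert two logits into probabilities that sum to one."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


for layer, shape in trace_shapes(50):
    print(layer, shape)
```

Under these assumptions, one of the two softmax outputs plays the role of the confidence score for live birth.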


Live birth prediction function of AI and CEE

Using both the value x1, calculated by the multivariate regression function for the CEE, and the confidence score x2, derived from the AI classifier, multivariate logistic model functions, y = 1/(1 + Exp(β0 + β1x1 + β2x2)), where β0, β1, and β2 are coefficients, were developed to predict the probability of live birth. The area under the receiver operator characteristic curve (AUC); the optimal cut‐point value, corresponding to the point with the lowest distance to the upper‐left corner of the ROC curve57; and the sensitivity, specificity, and accuracy of AI, CEE, and the combination of AI and CEE were obtained.
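The cut-point criterion described above (the ROC point closest to the upper-left corner) and the AUC can be sketched in a few lines of Python. This is an illustration on made-up toy scores, not study data; the function names are hypothetical, and classifying a score as positive when it is greater than or equal to the threshold is an assumption.

```python
import math


def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        # Assumption: a score >= threshold counts as a predicted live birth.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def optimal_cut_point(scores, labels):
    """ROC point with the lowest distance to the upper-left corner."""
    def distance(point):
        _, sensitivity, specificity = point
        return math.hypot(1 - specificity, 1 - sensitivity)
    return min(roc_points(scores, labels), key=distance)


def auc(scores, labels):
    """AUC as the chance that a positive outranks a negative (ties count 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (1 = live birth, 0 = non-live birth).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
toy_labels = [0, 0, 1, 0, 0, 1, 1, 1]

print(optimal_cut_point(toy_scores, toy_labels), auc(toy_scores, toy_labels))
```

The same two functions apply unchanged whether the scores come from the CEE function, the AI confidence score, or the combined logistic function.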

Development environment

The environment for development used in this study was as follows: Intel Core i5, 3.30 GHz, 32 GB (Santa Clara, California, USA) and NVIDIA GeForce GTX 1080 Ti (Santa Clara, California, USA), Windows 10 (Redmond, Washington, USA) and Mathematica 11.3 (Wolfram Research, Champaign, IL, USA).

Statistics

Mathematica 11.3 was used for statistical analyses.

RESULTS

The live birth ratio of all data was 0.279. The live birth ratios for the age groups <35, 35‐37, 38‐39, 40‐41, and ≥42 years old were 0.387 (876/2265), 0.306 (381/1244), 0.231 (164/709), 0.162 (130/804), and 0.054 (36/669), respectively.25 Univariate regression functions and the multivariate regression function of the CEE for predicting the probability of live birth are shown in Tables 2 and 3, respectively. Ten independent variables for the multivariate function with no multicollinearity were detected. The variables demonstrated in Table 3 were acquired using the formulae in Table 2. The results demonstrated that the age seemed to be the most important because the P‐value of the age was the minimum, as shown in Table 3. When the ten values were substituted into the multivariate logistic regression function, the calculated value was the predicted probability of live birth by the CEE.
Table 2

Univariate regression functions of the CEE parameters for predicting the probability of live birth

| Independent variables | Formulae | Coefficients |
| --- | --- | --- |
| Age | k/(1 + Exp(β0 + β1x)) | β0 = −10.742 ± 4.106 (P = 0.0089); β1 = 0.284 ± 0.109 (P = 0.0088); k = 0.451 |
| Times of embryo transfer | 1/(1 + Exp(β0 + β1x)) | β0 = 0.635 ± 1.158 (P = 0.584); β1 = 0.156 ± 0.123 (P = 0.204) |
| Anti‐Müllerian Hormone (ng/ml) | 1/(1 + Exp(β0 + β1x)) | β0 = 1.282 ± 2.640 (P = 0.627); β1 = 0.062 ± 0.139 (P = 0.678) |
| Blastomere number on Day 3 | k/(2πσ^2)^(1/2) Exp(−(x − m)^2/(2σ^2)) | σ = 4.668 ± 0.773 (P = 4.179 × 10^−5); m = 11.624 ± 0.663 (P = 1.969 × 10^−10); k = 4.643 ± 0.611 (P = 3.91 × 10^−6) |
| Grade on Day 3 (Class A = 1, B = 2, C = 3, D = 4) | k/(1 + Exp(β0 + β1x)) | β0 = −7.967 ± 8.012 (P = 0.320); β1 = 2.584 ± 2.582 (P = 0.317); k = 0.319 |
| Embryo cryopreservation day (Day 5 = 1, Day 6 = 2) | β0 + β1x | β0 = 0.435; β1 = −0.131 |
| Inner Cell Mass (A = 1, B = 2, C = 3) | β0 + β1x | β0 = 0.479 ± 0.037 (P = 0.049); β1 = −0.131 ± 0.017 (P = 0.083) |
| Trophectoderm (A = 1, B = 2, C = 3) | β0 + β1x | β0 = 0.526 ± 0.002 (P = 0.0026); β1 = −0.124 ± 0.001 (P = 0.005) |
| Averaged diameter (µm) | 1/(1 + Exp(β0 + β1x)) | β0 = 2.623 ± 5.312 (P = 0.621); β1 = −0.011 ± 0.030 (P = 0.723) |
| Body mass index (kg/m^2) | 1/(1 + Exp(β0 + β1x)) | β0 = −0.631 ± 0.844 (P = 0.454); β1 = 0.079 ± 0.035 (P = 0.026) |

Independent variables, which were related to live birth and were also used in the multivariate regression, are presented. Each formula is determined to fit the data distribution. Coefficients are shown as the mean ± SE.
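As an illustration of how the formulae in Table 2 turn a raw clinical value into a univariate score, the Python sketch below evaluates three of them using the mean coefficients from the table. The function names are hypothetical, and the Gaussian form for the blastomere number is read as k/√(2πσ²) · exp(−(x − m)²/(2σ²)), which assumes that the superscripts were lost when the table was extracted.

```python
import math


def age_value(age_years):
    """Scaled logistic function of maternal age: k/(1 + Exp(b0 + b1*x))."""
    k, b0, b1 = 0.451, -10.742, 0.284
    return k / (1 + math.exp(b0 + b1 * age_years))


def blastomere_value(n_cells):
    """Scaled Gaussian function of the day-3 blastomere number."""
    k, m, sigma = 4.643, 11.624, 4.668
    normaliser = math.sqrt(2 * math.pi * sigma ** 2)
    return k / normaliser * math.exp(-((n_cells - m) ** 2) / (2 * sigma ** 2))


def trophectoderm_value(grade):
    """Linear function of the TE grade, coded A = 1, B = 2, C = 3."""
    codes = {"A": 1, "B": 2, "C": 3}
    return 0.526 - 0.124 * codes[grade]


print(age_value(30), age_value(42))
print(blastomere_value(8), blastomere_value(12))
print(trophectoderm_value("A"), trophectoderm_value("C"))
```

With these coefficients the age function decreases with maternal age, and the blastomere function peaks near m = 11.624 cells.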

Table 3

Multivariate logistic regression function, 1/(1 + Exp(β0 + β1x1 + ... + β10x10)), of the CEE for predicting live birth

| Independent variables | Coefficients | P‐value | Odds ratio |
| --- | --- | --- | --- |
| Constant (β0) | β0 = 6.642 ± 0.500 | 2.81 × 10^−40 | |
| Age value (β1) | β1 = −4.047 ± 0.368 | 3.68 × 10^−28 | 57.24 |
| Average diameter value (β2) | β2 = −4.628 ± 0.749 | 6.359 × 10^−10 | 102.34 |
| TE value (β3) | β3 = −2.164 ± 0.462 | 2.832 × 10^−6 | 8.71 |
| Embryo cryopreservation day value (β4) | β4 = −3.202 ± 0.862 | 2.230 × 10^−4 | 24.59 |
| ET times value (β5) | β5 = −2.652 ± 0.817 | 1.158 × 10^−3 | 14.19 |
| ICM value (β6) | β6 = −0.837 ± 0.537 | 0.119 | 2.31 |
| AMH value (β7) | β7 = −1.078 ± 0.767 | 0.160 | 2.94 |
| Blastomere number value (β8) | β8 = −0.618 ± 0.593 | 0.298 | 1.85 |
| Body mass index value (β9) | β9 = −0.819 ± 0.838 | 0.328 | 2.27 |
| Grade on day 3 value (β10) | β10 = 0.266 ± 0.93 | 0.778 | 1.31 |

The values of the independent variables, except the constant β0, are the values calculated by the univariate regression functions shown in Table 2. Multicollinearity is not observed between any two independent variables. Coefficients are shown as the mean ± SE.

Abbreviations: AMH, Anti‐Müllerian hormone; ET, Embryo transfer; ICM, Inner cell mass; TE, Trophectoderm.
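The multivariate function in Table 3 can be evaluated as sketched below in Python. The dictionary keys and function name are hypothetical, and the inputs are assumed to be the outputs of the corresponding univariate functions rather than raw clinical values. The final check reflects an observation rather than a statement from the source: the odds ratios listed in Table 3 appear to equal exp(|β|) up to rounding.

```python
import math

# (name, coefficient, reported odds ratio) for beta_1 ... beta_10 in Table 3.
TABLE3 = [
    ("age", -4.047, 57.24),
    ("average_diameter", -4.628, 102.34),
    ("te", -2.164, 8.71),
    ("cryopreservation_day", -3.202, 24.59),
    ("et_times", -2.652, 14.19),
    ("icm", -0.837, 2.31),
    ("amh", -1.078, 2.94),
    ("blastomere_number", -0.618, 1.85),
    ("bmi", -0.819, 2.27),
    ("grade_day3", 0.266, 1.31),
]
BETA0 = 6.642


def cee_score(values):
    """Predicted probability of live birth from the ten univariate values.

    `values` maps each name in TABLE3 to the output of its univariate function.
    """
    linear = BETA0 + sum(beta * values[name] for name, beta, _ in TABLE3)
    return 1 / (1 + math.exp(linear))


# Observation (assumption): the odds ratio column matches exp(|beta|) within rounding.
odds_ratios_match = all(
    math.isclose(math.exp(abs(beta)), odds, rel_tol=0.01) for _, beta, odds in TABLE3
)

# Hypothetical inputs, for illustration only.
example = {name: 0.3 for name, _, _ in TABLE3}
print(cee_score(example), odds_ratios_match)
```

Because most coefficients are negative inside Exp(...), larger univariate values generally raise the predicted probability.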

Prior to creating the best AI classifier, the best numbers of the training dataset were determined to be 18 120, 17 910, 8505, 7716, and 12 840 in the groups <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, in this study.25 Then, the best values in the L2 regularization were determined to be 0.01, 0.0005, 0.01, 0.0001, and 0.00015 for the groups <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively. Next, the best AI classifiers were obtained. The best image size was 50 × 50 pixels. It took 0.15 seconds/image to classify and generate the confidence score. Next, the multivariate regression functions in combination with CEE and with AI applied to the blastocyst images of patients categorized by age, which were defined as combination methods, were obtained as shown in Table 4. The AUC values for predicting live birth accomplished by the CEE/AI/combination methods were 0.651 ± 0.027/0.634 ± 0.027/0.655 ± 0.027, 0.697 ± 0.037/0.688 ± 0.036/0.723 ± 0.036, 0.771 ± 0.052/0.728 ± 0.054/0.791 ± 0.050, 0.788 ± 0.063/0.743 ± 0.066/0.806 ± 0.061, and 0.820 ± 0.106/0.837 ± 0.103/0.888 ± 0.089 (mean ± SE) for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, as shown in Figure 2 and Table 5. 
Averaged across the five age categories, the AUC values of the CEE/AI/combination methods were 0.745 ± 0.069/0.726 ± 0.075/0.773 ± 0.088 (mean ± SD), respectively. There were no significant differences between the CEE, AI, and combination methods in any age category. The AUC value increased significantly as a function of age in all methods: the slopes of the linear regression were greater than zero in the CEE/AI/combination methods, with P‐values of 0.00425/0.00580/0.00219, respectively. Furthermore, the optimal cutoff points from the ROC curve of the CEE/AI/combination methods were 0.420/0.414/0.388, 0.279/0.314/0.281, 0.190/0.226/0.219, 0.190/0.268/0.142, and 0.184/0.118/0.037 for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively (Table 5).
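The summary values above can be recomputed from the five per-category AUCs. The Python sketch below does so with the sample standard deviation and also fits a least-squares slope of AUC against the age category. Using the category index 1 to 5 as the regressor is an assumption, since the source does not state which age variable entered the linear regression, and the significance test of the slope is not reproduced.

```python
import statistics

# Per-category AUC values (<35, 35-37, 38-39, 40-41, >=42) from the text.
auc_by_method = {
    "CEE": [0.651, 0.697, 0.771, 0.788, 0.820],
    "AI": [0.634, 0.688, 0.728, 0.743, 0.837],
    "combination": [0.655, 0.723, 0.791, 0.806, 0.888],
}


def least_squares_slope(ys):
    """Slope of y against the category index 1..n (assumed regressor)."""
    xs = list(range(1, len(ys) + 1))
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


summary = {
    method: (
        round(statistics.mean(values), 3),
        round(statistics.stdev(values), 3),   # sample SD (n - 1 denominator)
        least_squares_slope(values),
    )
    for method, values in auc_by_method.items()
}

for method, (mean, sd, slope) in summary.items():
    print(method, mean, sd, round(slope, 4))
```

The means and sample standard deviations reproduce the reported 0.745 ± 0.069, 0.726 ± 0.075, and 0.773 ± 0.088, and all three slopes are positive.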
Table 4

Coefficients of logistic regression, y = 1/(1 + Exp(β0 + β1x1 + β2x2)), that show the probability of live birth as a function of the CEE score and of the confidence score that is the AI‐generated predicted probability for live birth from a blastocyst image

| Patient age (years) | β0 (±SE) | β1 (±SE) | β2 (±SE) |
| --- | --- | --- | --- |
| Age < 35 | 4.326 (±1.686) | −4.150 (±1.010) | −5.634 (±4.648) |
| 35 ≤ age < 38 | 3.628 (±0.672) | −4.266 (±1.685) | −5.508 (±2.408) |
| 38 ≤ age < 40 | 7.314 (±2.346) | −9.123 (±2.896) | −18.988 (±11.408) |
| 40 ≤ age < 42 | 5.044 (±0.833) | −10.929 (±4.119) | −4.031 (±2.593) |
| 42 ≤ age | 8.087 (±1.995) | −19.341 (±7.444) | −20.985 (±12.226) |

Abbreviations: β0, β1, β2, Coefficients; SE, Standard error; x1, Score of the CEE; x2, Confidence score of the blastocyst; y, Probability of live birth.
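A minimal Python sketch of the combination method follows, using the mean coefficients from Table 4 and the combination cut-points reported in the text and in Table 5. The function names are hypothetical, the example scores are invented for illustration, and treating a probability greater than or equal to the cut-point as a predicted live birth is an assumption.

```python
import math

# (beta0, beta1, beta2) per age category, from Table 4.
TABLE4 = {
    "<35": (4.326, -4.150, -5.634),
    "35-37": (3.628, -4.266, -5.508),
    "38-39": (7.314, -9.123, -18.988),
    "40-41": (5.044, -10.929, -4.031),
    ">=42": (8.087, -19.341, -20.985),
}

# Optimal cut-points of the combination method, from Table 5.
CUT_POINTS = {"<35": 0.388, "35-37": 0.281, "38-39": 0.219, "40-41": 0.142, ">=42": 0.037}


def combined_probability(age_group, cee_score, ai_score):
    """y = 1/(1 + Exp(beta0 + beta1*x1 + beta2*x2)) for the given age category."""
    b0, b1, b2 = TABLE4[age_group]
    return 1 / (1 + math.exp(b0 + b1 * cee_score + b2 * ai_score))


def predicts_live_birth(age_group, cee_score, ai_score):
    """Assumption: a probability at or above the cut-point predicts live birth."""
    return combined_probability(age_group, cee_score, ai_score) >= CUT_POINTS[age_group]


# Hypothetical scores for a blastocyst from a patient younger than 35 years.
p = combined_probability("<35", cee_score=0.4, ai_score=0.4)
print(round(p, 3), predicts_live_birth("<35", 0.4, 0.4))
```

Because β1 and β2 are negative in every category, a higher CEE score or a higher AI confidence score raises the combined probability.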

Figure 2

The AUC values for predicting live birth achieved by conventional embryo evaluation (CEE), artificial intelligence (AI) applied to blastocyst images from patients categorized by age and a combination method of CEE and AI. No significant differences were observed in each age category. The AUC value significantly increased as a function of age in all methods because the slopes of the linear regression were more than zero in the CEE/AI/combination methods with P‐values of 0.00425/0.00580/0.00219, respectively, by linear regression analysis. AI, artificial intelligence; CEE, conventional embryo evaluation; CEE + AI, combination method with CEE and AI

Table 5

Comparison of CEE, artificial intelligence (AI) and the combination of CEE and AI

| Method | Patient age (years) | Actual live birth | Actual non‐live birth | Predicted live birth | Predicted non‐live birth | Accuracy | Sensitivity | Specificity | PPV | NPV | AUC | 95% CI of the AUC | Cut‐point |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| The CEE | <35 | 181 | 272 | 196 | 257 | 0.631 | 0.580 | 0.665 | 0.536 | 0.704 | 0.651 | 0.598‐0.703 | 0.420 |
| | 35‐37 | 84 | 165 | 114 | 135 | 0.687 | 0.714 | 0.673 | 0.526 | 0.822 | 0.697 | 0.625‐0.768 | 0.279 |
| | 38‐39 | 33 | 109 | 54 | 88 | 0.725 | 0.727 | 0.725 | 0.444 | 0.898 | 0.771 | 0.670‐0.872 | 0.190 |
| | 40‐41 | 20 | 141 | 54 | 107 | 0.714 | 0.700 | 0.716 | 0.259 | 0.944 | 0.788 | 0.665‐0.911 | 0.190 |
| | ≥42 | 6 | 128 | 14 | 120 | 0.910 | 0.667 | 0.922 | 0.286 | 0.983 | 0.820 | 0.612‐1.03 | 0.184 |
| The AI with deep learning in the convolutional neural network | <35 | 181 | 272 | 171 | 282 | 0.647 | 0.530 | 0.724 | 0.561 | 0.699 | 0.634 | 0.581‐0.687 | 0.414 |
| | 35‐37 | 84 | 165 | 107 | 142 | 0.675 | 0.655 | 0.685 | 0.514 | 0.796 | 0.688 | 0.615‐0.760 | 0.314 |
| | 38‐39 | 33 | 109 | 56 | 86 | 0.697 | 0.697 | 0.697 | 0.411 | 0.884 | 0.728 | 0.622‐0.834 | 0.226 |
| | 40‐41 | 20 | 141 | 42 | 119 | 0.776 | 0.650 | 0.794 | 0.309 | 0.941 | 0.743 | 0.613‐0.873 | 0.268 |
| | ≥42 | 6 | 128 | 22 | 112 | 0.866 | 0.833 | 0.867 | 0.227 | 0.991 | 0.837 | 0.636‐1.039 | 0.118 |
| Combination of the AI and the CEE | <35 | 181 | 272 | 229 | 224 | 0.616 | 0.652 | 0.592 | 0.515 | 0.719 | 0.655 | 0.600‐0.707 | 0.388 |
| | 35‐37 | 84 | 165 | 130 | 119 | 0.671 | 0.786 | 0.612 | 0.508 | 0.849 | 0.723 | 0.653‐0.793 | 0.281 |
| | 38‐39 | 33 | 109 | 55 | 87 | 0.732 | 0.758 | 0.725 | 0.455 | 0.908 | 0.791 | 0.693‐0.889 | 0.219 |
| | 40‐41 | 20 | 141 | 40 | 121 | 0.801 | 0.700 | 0.816 | 0.350 | 0.950 | 0.806 | 0.687‐0.925 | 0.142 |
| | ≥42 | 6 | 128 | 35 | 99 | 0.784 | 1.000 | 0.773 | 0.171 | 1.000 | 0.888 | 0.713‐1.063 | 0.037 |

The value of the CEE was a function of multivariate logistic regression with no multicollinearity of conventional laboratory and clinical factors to predict the probability of live birth. The value of AI was a confidence score that was the AI‐generated predicted probability for live birth from an image of the blastocyst in patients categorized by age. The value of the combination was a function of multivariate logistic regression with CEE and AI as described in Table 4. The optimal cut‐point of live birth was the value corresponding to the point with the lowest distance to the upper‐left corner of the receiver operator characteristic (ROC) curve.62 The accuracies, sensitivities, and specificities were obtained by using cut‐points.

Abbreviations: PPV, Positive predictive value; NPV, Negative predictive value; AUC, The area under the curve; CI, Confidence interval.
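The accuracy, sensitivity, specificity, PPV, and NPV columns of Table 5 all follow from a 2 × 2 confusion matrix. The Python sketch below recomputes two rows as a worked example. The cell counts are not given in the source: they are reconstructed here from the table's marginal totals together with the reported sensitivity, so they should be read as an inference.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard metrics from true/false positive and negative counts."""
    return {
        "accuracy": round((tp + tn) / (tp + fp + fn + tn), 3),
        "sensitivity": round(tp / (tp + fn), 3),
        "specificity": round(tn / (tn + fp), 3),
        "ppv": round(tp / (tp + fp), 3),
        "npv": round(tn / (tn + fn), 3),
    }


# CEE, <35 years: 181 actual live births, 196 predicted live births,
# sensitivity 0.580 -> 105 true positives (reconstructed, see lead-in).
cee_under_35 = diagnostic_metrics(tp=105, fp=91, fn=76, tn=181)

# Combination, >=42 years: 6 actual live births, 35 predicted live births,
# sensitivity 1.000 -> 6 true positives (reconstructed, see lead-in).
combination_42_plus = diagnostic_metrics(tp=6, fp=29, fn=0, tn=99)

print(cee_under_35)
print(combination_42_plus)
```

The second row shows how the combination method in the ≥42 group reaches a sensitivity of 1.000 while its PPV stays low: only 6 of the 35 predicted live births were actual live births.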

The sensitivities at the optimal cut‐point of the CEE/AI/combination methods were 0.580/0.530/0.652, 0.714/0.655/0.786, 0.727/0.697/0.758, 0.700/0.650/0.700, and 0.667/0.833/1.000 for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, as shown in Figure 3 and Table 5. There were no significant differences in any of the methods with respect to the sensitivities in each age category except age <35 years old. In the age category of <35 years old, the sensitivity of the AI was significantly lower than that of the combination method (P = 0.019, by the chi‐square test). The sensitivity significantly increased in all methods as a function of age because the P‐values by the Cochran–Armitage test in the CEE, the AI, and the combination methods were 0.01248, 0.0623, and 0.00453, respectively. The sensitivities of the CEE/AI/combination methods were 0.678 ± 0.059/0.673 ± 0.109/0.779 ± 0.134 (mean ± SD), respectively.
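The chi-square comparison of sensitivities in the <35 category can be illustrated with a 2 × 2 Pearson test. In the Python sketch below, the counts of detected live births (96 of 181 for the AI and 118 of 181 for the combination method) are reconstructed from the sensitivities and the 181 actual live births in Table 5, and the absence of a continuity correction is an assumption. Under those assumptions the result is close to the reported P = 0.019.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    The table is [[a, b], [c, d]]; returns (statistic, two-sided P-value).
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: AI, combination. Columns: detected live births, missed live births.
statistic, p_value = chi_square_2x2(96, 85, 118, 63)
print(round(statistic, 3), round(p_value, 3))
```

Note that both methods were evaluated on the same test blastocysts, so this unpaired form is only a sketch of the reported comparison.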
Figure 3

The sensitivity for predicting live birth achieved by conventional embryo evaluation (CEE), artificial intelligence (AI) applied to blastocyst images from patients categorized by age and a combination method with CEE and AI. There were no significant differences in all methods for sensitivities in each age category except age <35 years. In the age category of <35 years, the sensitivity of the AI was significantly lower than that of the combination method (P = 0.019, by the chi‐square test). The sensitivity significantly increased as a function of age in all methods because the P‐values in the CEE, the AI, and the combination method were 0.01248, 0.0623, and 0.00453, respectively, by the Cochran–Armitage test. AI, artificial intelligence; CEE, conventional embryo evaluation; CEE + AI, combination method with CEE and AI. * P < 0.05

The specificities at the optimal cut‐point of the CEE/AI/combination methods were 0.665/0.724/0.592, 0.673/0.685/0.612, 0.725/0.697/0.725, 0.716/0.794/0.816, and 0.922/0.867/0.773 for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, as shown in Figure 4 and Table 5. There were no significant differences in any of the methods with respect to the specificities in each age category except age <35, 40‐41, and ≥42 years old. In the age category of <35 years old, the specificity of AI was significantly higher than that of the combination method (P = 0.0014, by the chi‐square test). In the age category of 40‐41 years old, the specificity of the combination method was significantly higher than that of CEE (P = 0.049, by the chi‐square test). In the age category of ≥42 years old, the specificity of CEE was significantly higher than that of the combination method (P = 0.001, by the chi‐square test). The specificity significantly increased in all methods as a function of age because the P‐values by the Cochran–Armitage test in the CEE, the AI, and the combination methods were 1.577 × 10^−8, 3.721 × 10^−5, and 2.216 × 10^−8, respectively. 
The specificities of the CEE/AI/combination methods were 0.740 ± 0.105/0.753 ± 0.076/0.704 ± 0.098 (mean ± SD), respectively.
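The Cochran–Armitage test for a trend in proportions across ordered categories can be sketched as follows in Python. The true-negative counts for the CEE are reconstructed from the specificities and the actual non-live-birth totals in Table 5. The equally spaced scores 1 to 5 and the normal approximation are assumptions; the source does not state the scores that were used, so this sketch is not expected to reproduce the reported P-values.

```python
import math


def cochran_armitage(successes, totals, scores=None):
    """Cochran-Armitage trend test; returns (z, two-sided P-value).

    Assumption: equally spaced scores 1..k when none are supplied.
    """
    if scores is None:
        scores = list(range(1, len(totals) + 1))
    n = sum(totals)
    p_bar = sum(successes) / n
    # Score-weighted deviation of observed successes from their expectation.
    t = sum(s * (x - m * p_bar) for s, x, m in zip(scores, successes, totals))
    s1 = sum(s * m for s, m in zip(scores, totals))
    s2 = sum(s * s * m for s, m in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 ** 2 / n)
    z = t / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# CEE true negatives and actual non-live births for <35, 35-37, 38-39, 40-41, >=42
# (true negatives reconstructed from Table 5, see lead-in).
true_negatives = [181, 111, 79, 101, 118]
non_live_births = [272, 165, 109, 141, 128]

z, p_value = cochran_armitage(true_negatives, non_live_births)
print(round(z, 3), p_value)
```

A positive z indicates that the proportion, here the specificity, rises with the age category.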
Figure 4

The specificity for predicting live birth achieved by conventional embryo evaluation (CEE), artificial intelligence (AI) applied to blastocyst images from patients categorized by age and a combination method with CEE and AI. There were no significant differences in all methods for specificities in each age category except age <35, 40‐41 and ≥42 years. In the age category of <35 years, the specificity of the AI was significantly higher than the combination method (P = 0.0014, by the chi‐square test). In the age category of 40‐41 years old, the specificity of the combination method was significantly higher than the CEE (P = 0.049) by the chi‐square test. In the age category of ≥42 years old, the specificity of the CEE was significantly higher than that of the combination method (P = 0.001, by the chi‐square test). The specificity significantly increased as a function of age in all methods because the P‐values in the CEE, the AI and the combination method were 1.577 × 10^−8, 3.721 × 10^−5 and 2.216 × 10^−8, respectively, by the Cochran–Armitage test. AI, artificial intelligence; CEE, conventional embryo evaluation; CEE + AI, combination method with CEE and AI. * P < 0.05; ** P < 0.005

The sensitivities plus specificities at each optimal cut‐point of the CEE/AI/combination methods were 1.245/1.254/1.244, 1.387/1.340/1.398, 1.452/1.394/1.483, 1.416/1.444/1.516, and 1.589/1.700/1.773 for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, as shown in Figure 5 and Table 5. The sensitivity plus specificity significantly increased in all methods as a function of age because the slopes of the linear regression were more than zero in the CEE/AI/combination methods with P‐values of 0.0288/ 0.0199/0.0091, respectively, by the linear regression analysis. The sensitivities plus specificities of the CEE/AI/combination methods were 1.418 ± 0.124/1.426 ± 0.168/1.483 ± 0.193 (mean ± SD), respectively.
Figure 5

The sensitivity plus specificity for predicting live birth achieved by conventional embryo evaluation (CEE), artificial intelligence (AI) applied to blastocyst images from patients categorized by age, and a combination method with CEE and AI. The sensitivity plus specificity significantly increased as a function of age in all methods because the slopes of linear regression were more than zero in the CEE/AI/combination methods with P‐values of 0.0288/0.0199/0.0091, respectively, by linear regression analysis. AI, artificial intelligence; CEE, conventional embryo evaluation; CEE + AI, combination method with CEE and AI

The accuracies at the cut‐points for predicting live birth accomplished by the CEE/AI/combination methods were 0.631/0.647/0.616, 0.687/0.675/0.671, 0.725/0.697/0.732, 0.714/0.776/0.801, and 0.910/0.866/0.784 for <35, 35‐37, 38‐39, 40‐41, and ≥42 years old, respectively, as shown in Figure 6 and Table 5. No significant differences in accuracy among the methods were observed in any age category except ≥42 years old, in which the accuracy of the combination method was significantly lower than that of the CEE (P = 0.004 by the chi‐square test). The accuracy increased significantly with age in all methods: the P‐values by the Cochran–Armitage test in the CEE, the AI, and the combination methods were 3.812 × 10^−10, 7.306 × 10^−8, and 1.238 × 10^−7, respectively. The accuracies of the CEE/AI/combination methods were 0.733 ± 0.105/0.733 ± 0.089/0.721 ± 0.077 (mean ± SD), respectively.
Figure 6

The accuracy for predicting live birth achieved by conventional embryo evaluation (CEE), artificial intelligence (AI) applied to blastocyst images from patients categorized by age, and the combination method with CEE and AI. There were no significant differences among the methods for accuracies in each age category except for age ≥42 years. In the age category of ≥42 years old, the accuracy of the combination method was significantly lower than that of the CEE (P = 0.004, by the chi‐square test). The accuracy significantly increased as a function of age in all methods because the P‐values in the CEE, the AI, and the combination methods were 3.812 × 10^−10, 7.306 × 10^−8, and 1.238 × 10^−7, respectively, by the Cochran–Armitage test. AI, artificial intelligence; CEE, conventional embryo evaluation; CEE + AI, combination method with CEE and AI. * P < 0.005

DISCUSSION

Here, we developed new multivariate logistic functions that combine the CEE with an improved version of the AI that we had previously published,25 to predict live birth in patients categorized by age, as shown in Table 4. The independent variables of the function are the values calculated by the multivariate regression functions of the CEE parameters and the confidence scores generated by the AI classifiers, which were created by deep learning with a convolutional neural network from images of blastocysts categorized by age. This multivariate logistic function, defined as the combination method, demonstrated the best results, although there were mostly no significant differences from those of the CEE or the AI.

We believe the values of AUC, sensitivity, and specificity are the most important statistics for evaluating test methods because they are independent of patient distribution. Accuracy, on the other hand, depends on patient distribution. For example, if a test method that always outputs non‐live birth for any input were applied to the images of patients aged ≥42 years, it would demonstrate very high accuracy because most patients aged ≥42 years belong to the non‐live birth category.

The AUC values for predicting live birth accomplished by the CEE/AI/combination methods are shown in Figure 2 and Table 5. Although there were no significant differences in each age category, the combination method showed the best predictive results in each category. There seems to be no comparable study for predicting live birth. However, in terms of the AUC from pre‐implantation genetic screening, there is a report that a prediction model classifying embryos into low‐, medium‐, or high‐risk categories achieved an AUC of 0.74.58 The AUC values of the combination method in this study exceeded this value for ages ≥38 years old.
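The dependence of accuracy on patient distribution noted above can be illustrated with a few lines of arithmetic. The sketch below is an illustration only: it uses the standard identity that accuracy is the prevalence‐weighted mix of sensitivity and specificity, together with the 94.6% non‐live‐birth incidence that this Discussion reports for patients aged ≥42 years. The function names are hypothetical.

```python
def accuracy(sensitivity, specificity, prevalence):
    """Accuracy of a binary test as a prevalence-weighted mix.

    prevalence is the fraction of positive cases (here, live births).
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Incidence of non-live birth in the >=42 years category, as reported in
# the Discussion (94.6%), converted to live-birth prevalence.
LIVE_BIRTH_PREVALENCE_42_PLUS = 1.0 - 0.946


def always_negative_accuracy(prevalence):
    """Accuracy of a test that always outputs 'non-live birth'.

    Such a test has sensitivity 0 and specificity 1 by construction.
    """
    return accuracy(0.0, 1.0, prevalence)


if __name__ == "__main__":
    acc = always_negative_accuracy(LIVE_BIRTH_PREVALENCE_42_PLUS)
    print(f"always-negative accuracy at >=42 years: {acc:.3f}")
    # The same uninformative test at balanced prevalence:
    print(f"always-negative accuracy at 50%: {always_negative_accuracy(0.5):.3f}")
```

A test with no discriminative ability therefore matches the non‐live‐birth incidence in accuracy, which is why AUC, sensitivity, and specificity are the safer basis for comparing methods across age categories.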
The AUC values of the combination method for age ≥40 years in this study were higher than 0.8, indicating good predictive results.

The sensitivities at the optimal cut‐point of the CEE/AI/combination methods are shown in Figure 3. There were no significant differences among the methods for sensitivities in any age category except <35 years, in which the sensitivity of the AI was significantly lower than that of the combination method. The sensitivities of the AI were also lower than those of the combination method in the other age categories, although these differences were not significant. The AI classifiers created by deep learning with the convolutional neural network architectures in this study seemed to demonstrate moderate sensitivity. Because the incidence of live birth in each category was lower than that of non‐live birth, it might be more difficult to prepare a good training dataset of live births for creating the AI classifiers. Therefore, more data on blastocysts that result in live births might improve the sensitivity.

The specificities at the optimal cut‐point of the CEE/AI/combination methods are shown in Figure 4. There were no significant differences among the CEE, AI and combination methods for specificities in each age category except age <35, 40‐41, and ≥42 years old. In the age category of <35 years old, the specificity of the AI was significantly higher than that of the combination method. In the age category of 40‐41 years old, the specificity of the combination method was significantly higher than that of the CEE. In the age category of ≥42 years old, the specificity of the CEE was significantly higher than that of the combination method. These complicated results suggest that many and various morphological feature types of blastocysts result in non‐live births. Therefore, more data on blastocysts that result in non‐live births might improve the specificity.
Sensitivity plus specificity minus 1, known as Youden's index,59 is a statistic that summarizes the performance of a dichotomous diagnostic test and is often used in ROC analysis. The value of sensitivity plus specificity gives equal weight to false‐negative and false‐positive results. Therefore, we investigated the sensitivity plus specificity. The sensitivities plus specificities at each optimal cut‐point of the CEE/AI/combination methods are shown in Figure 5. The sensitivities plus specificities of the combination method were higher than those of the CEE and the AI in every age category except <35 years, in which the three values were nearly identical (1.245/1.254/1.244 for the CEE/AI/combination methods). The combination method seemed to be the best from the viewpoint of the performance of a dichotomous diagnostic test.

The accuracies to predict live birth accomplished by the CEE/AI/combination methods are shown in Figure 6. There were no significant differences among the methods for accuracies in any age category except ≥42 years, in which the accuracy of the CEE was significantly higher than that of the combination method, probably because the specificity of the CEE, 0.922, was superior to that of the other methods and the incidence of non‐live birth was 94.6%. If a test with high specificity is applied to a dataset in which the incidence of a negative result is close to one, the accuracy will be very high. Therefore, we believe the superiority of the CEE in terms of accuracy for age ≥42 years can be attributed to data bias. The live birth rate for each transfer was reported to be 0.668 based on some clinical factors, such as body mass index.60 Another study reported that the TE grade is the single statistically significant independent factor for predicting live birth and that the live birth probabilities of TE grades A, B, and C are 0.499, 0.339, and 0.080, respectively.61 The accuracies of the combination method we obtained were superior to the accuracies in these reports.
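Choosing a cut‐point by Youden's index, as described above, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in the study: the function name is hypothetical, a case is called positive when its score is at or above the cut‐point, candidate cut‐points are the observed scores, and the first maximum is kept when several cut‐points tie.

```python
def youden_optimal_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing sens + spec - 1.

    scores: predicted probabilities; labels: 1 = live birth, 0 = non-live birth.
    A case is predicted positive when score >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sens, spec = tp / positives, tn / negatives
        j = sens + spec - 1.0  # Youden's index at this cut-point
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)

    _, cutoff, sens, spec = best
    return cutoff, sens, spec


if __name__ == "__main__":
    # HYPOTHETICAL scores and outcomes for illustration; not study data.
    toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60]
    toy_labels = [0, 0, 0, 1, 0, 1, 0, 1]
    print(youden_optimal_cutoff(toy_scores, toy_labels))
```

In this toy dataset the selected cut‐point falls well below 0.5, the kind of shift that can occur when positive outcomes are the minority.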
Attention should be paid to the fact that the predicted probability of live birth cannot reach 1.00 because there are clinical impediments to live birth that the AI classifiers cannot detect, such as uterine factors60 (eg, uterine myomas,62 intrauterine adhesions,63, 64 and endometrial polyps65); endometriosis66; ovarian function67; oviduct obstruction68, 69; immune disorders70, 71; maternal diseases such as diabetes mellitus72 and chronic endometritis73, 74; and the uterine microbiota.75, 76

The more advanced the age, the greater were the values of the AUC, sensitivity, specificity, sensitivity plus specificity, and accuracy for the CEE, AI, and combination methods, and this trend was significant. Because the morphological findings in CEE are not modified by age, we believe that this phenomenon could be partially derived from the appropriate selection of the optimal cutoff point of the multivariate regression functions of the CEE parameters and of the confidence score of the AI. In particular, because the incidence of live birth decreases as age advances, lowering the optimal cutoff point could improve the accuracies and the values of the sensitivity plus specificity. In our previous work,25 in which the cutoff points were set at 0.5 and were derived from the logistic regression model for binary data, the accuracies, sensitivities, and specificities were lower than those in the present study, in which the optimal cutoff points were lower than 0.5 (Table 5). As such, the improvement of the values of AUC, accuracies, sensitivities, and specificities in this study might be achieved not by the progressive skill of the embryologists or physicians but by the improvement of the classifier and the selection of an appropriate cutoff point.

On the other hand, age is very important for live birth from the viewpoint of biology. The contribution of age was observed in the combination method as well as in the CEE and the AI.
Thus, the AI seems to recognize some age‐related findings in the blastocyst images. There could be some differences in the morphological features of the blastocyst at the same timepoint after insemination in patients of different ages.

It is important to create a good AI classifier to acquire a multivariate logistic function of both CEE and AI. Our AI classifiers have a classification potential almost identical to that of the CEE classifiers. Because the CEE in this study was provided by embryologists and physicians regarded as specialists, the AI classifiers might be useful in clinical practice. Better AI classifiers and CEE by well‐trained embryologists and physicians will improve the combination method. Even for highly skilled embryologists and physicians, it would be useful to refer to the prediction values calculated by the combination method. When the morphological grades of the blastocysts77, 78, 79 are identical, the calculated prediction values, which are real numbers with several digits, would help determine the order of blastocyst selection. We think it is desirable for embryologists and physicians to refer to the combination method, which shows the predicted probability of live birth, so that the blastocysts can be ordered by the prediction values for embryo transfer. The combination method might also reassure embryologists and physicians who are using AI in practice for the first time by confirming their past experience.

We emphasize that blastocysts evaluated by the CEE, AI, and combination methods remain intact and can be transferred to the uterus with no ethical problems. It took 0.15 seconds/image to calculate the prediction probability by the AI. Because CEE information is usually obtained as a routine conventional procedure, the prediction probability of the combination method can be generated in a moment.
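The idea of ordering blastocysts by the predicted probability from the combination method can be sketched as below. This is a hedged illustration, not the published model: the coefficients are hypothetical placeholders (the fitted functions are given in Table 4 of the article), the function is assumed to be a standard logistic model that is linear in the CEE regression value and the AI confidence score, and all names are invented for the example.

```python
import math

# HYPOTHETICAL coefficients for illustration only; the fitted values per age
# category are in Table 4 of the article and are not reproduced here.
INTERCEPT, BETA_CEE, BETA_AI = -3.0, 2.0, 2.5


def combined_probability(cee_value, ai_confidence):
    """Logistic function of the CEE regression value and the AI confidence score.

    ASSUMPTION: a logistic model that is linear in both inputs.
    """
    z = INTERCEPT + BETA_CEE * cee_value + BETA_AI * ai_confidence
    return 1.0 / (1.0 + math.exp(-z))


def rank_blastocysts(candidates):
    """Order candidates by descending predicted probability of live birth.

    candidates: list of (blastocyst_id, cee_value, ai_confidence) tuples.
    Returns a list of (blastocyst_id, probability) tuples.
    """
    scored = [(bid, combined_probability(cee, ai)) for bid, cee, ai in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Blastocysts A and B share the same CEE value, so the AI confidence
    # score breaks the tie between them.
    cohort = [("A", 0.40, 0.30), ("B", 0.40, 0.55), ("C", 0.20, 0.35)]
    for bid, p in rank_blastocysts(cohort):
        print(f"{bid}: {p:.3f}")
```

Because the output is a continuous value, two blastocysts with the same conventional evaluation can still be placed in a definite order, which is the use case described above.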
We also emphasize that the images of the blastocyst can be transferred, and the probabilities can be reported either by e‐mail or by saving the information to cloud storage via the Internet, so that expensive facilities and equipment might not be necessary for medical institutes worldwide. The AI classifier seems to be almost as good as specialists and might be superior to novice practitioners for classifying blastocysts. In clinical practice, AI used by embryologists who are insufficiently trained or unskilled could facilitate the prediction of live births from blastocysts to a level similar to that of specialists. Time and financial costs of training could be saved. This efficiency will afford embryologists and physicians time to work on other tasks, such as counseling patients.

We would like to improve the AI, which would result in an improved combination method. Improvements in the architecture of the network and the hyperparameters used for training could produce better classifiers, despite some of the clinical impediments to live birth. The following neural networks have marked progress in the field: LeNet80 in 1998, AlexNet81 in 2012, GoogLeNet82 in 2014, ResNet83 in 2015, and Squeeze‐and‐Excitation networks84 in 2017. We previously tested a modified ResNet on our dataset, but the result was inferior to that of the neural network architectures we created in this study (data not shown). AI for image recognition is still being developed. We believe that we may need more varied patterns of images for datasets. Usually, 500‐1000 images may be required for each class with deep learning.85 Such a large number of images for each age category should improve the AUC, sensitivity, specificity, and accuracy of the classifier with deep learning.
For further study, we are also considering investigating the image size,85, 86 the appropriate number of training datasets, the appropriate timing after insemination to capture images, and the regularization values, to improve the accuracies and to avoid overfitting,87, 88, 89, 90, 91, 92 an error that occurs when a classifier is fitted too closely to a limited set of data. It would be better to study other parameters, such as time‐lapse information, regarding their potential to predict live birth. The time‐lapse information might be included in the future. As the AI progresses, better results will be delivered.

Deep learning with a convolutional neural network was applied to develop classifiers to predict the probability of a live birth from a blastocyst image categorized by age. The combination method with both CEE and AI demonstrated good AUC values in the range of 0.655‐0.888. Less than 0.15 seconds was necessary to complete the analysis of an image. It should be emphasized that this method causes no harm to the embryo. The embryo can be transferred after the predicted probability is acquired. The method could provide financial savings for clinical institutes as well as patients. It could supply a rapid and useful classification and allow testing over distances. We think that this AI will be useful in reproductive medicine. Although further or advanced study may be required for validation, this study indicates that the combination method would be feasible and may offer benefits to both medical workers and patients. The contents of the article have been recognized as patentable in Japan (patent no. 6468576).

CONFLICT OF INTEREST

Yasunari Miyagi, Toshihiro Habara, Rei Hirata, and Nobuyoshi Hayashi declare they have no conflicts of interest.

ETHICAL APPROVAL

Human rights statements and informed consent: All procedures followed were performed in accordance with the ethical standards of the responsible committee on human experimentation and with the Helsinki Declaration of 1964 and its later amendments. Informed consent was obtained from all patients for inclusion. We obtained additional informed consent from all patients for whom identifying information is included. A Web site with additional information including an “opt‐out” option was set up for this study. Animal studies: No animals were used in this study. The protocol for this research project including the use of human patients was approved by the IRB at Okayama Couples’ Clinic (IRB no. 18000128‐05). Clinical trial registry: This study is not a clinical trial.
  56 in total

Review 1.  Logistic regression and artificial neural network classification models: a methodology review.

Authors:  Stephan Dreiseitl; Lucila Ohno-Machado
Journal:  J Biomed Inform       Date:  2002 Oct-Dec       Impact factor: 6.317

2.  Index for rating diagnostic tests.

Authors:  W J YOUDEN
Journal:  Cancer       Date:  1950-01       Impact factor: 6.860

3.  Sensitivity analysis in bayesian classification models: multiplicative deviations.

Authors:  M Ben-Bassat; K L Klove; M H Weil
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  1980-03       Impact factor: 6.226

Review 4.  Clinical applications of preimplantation genetic testing.

Authors:  Paul R Brezina; William H Kutteh
Journal:  BMJ       Date:  2015-02-19

5.  Automation of Detection of Cervical Cancer Using Convolutional Neural Networks.

Authors:  Vidya Kudva; Keerthana Prasad; Shyamala Guruvare
Journal:  Crit Rev Biomed Eng       Date:  2018

6.  Automated Melanoma Recognition in Dermoscopy Images via Very Deep Residual Networks.

Authors:  Lequan Yu; Hao Chen; Qi Dou; Jing Qin; Pheng-Ann Heng
Journal:  IEEE Trans Med Imaging       Date:  2016-12-21       Impact factor: 10.048

Review 7.  Female age-related fertility decline. Committee Opinion No. 589.

Authors: 
Journal:  Fertil Steril       Date:  2014-03       Impact factor: 7.329

8.  Assessment of day-3 morphology and euploidy for individual chromosomes in embryos that develop to the blastocyst stage.

Authors:  Jennifer L Eaton; Michele R Hacker; Doria Harris; Kim L Thornton; Alan S Penzias
Journal:  Fertil Steril       Date:  2008-04-28       Impact factor: 7.329

9.  Estrogen deprivation and cardiovascular disease risk in primary ovarian insufficiency.

Authors:  Jacob P Christ; Marlise N Gunning; Giulia Palla; Marinus J C Eijkemans; Cornelis B Lambalk; Joop S E Laven; Bart C J M Fauser
Journal:  Fertil Steril       Date:  2018-03-28       Impact factor: 7.329

10.  Feasibility of artificial intelligence for predicting live birth without aneuploidy from a blastocyst image.

Authors:  Yasunari Miyagi; Toshihiro Habara; Rei Hirata; Nobuyoshi Hayashi
Journal:  Reprod Med Biol       Date:  2019-02-19