S M Diakiw, J M M Hall, M D VerMilyea, J Amin, J Aizpurua, L Giardini, Y G Briones, A Y X Lim, M A Dakka, T V Nguyen, D Perugini, M Perugini.
Abstract
STUDY QUESTION: Can an artificial intelligence (AI) model predict human embryo ploidy status using static images captured by optical light microscopy?

SUMMARY ANSWER: Results demonstrated predictive accuracy for embryo euploidy and showed a significant correlation between AI score and euploidy rate, based on assessment of images of blastocysts at Day 5 after IVF.

WHAT IS KNOWN ALREADY: Euploid embryos displaying the normal human chromosomal complement of 46 chromosomes are preferentially selected for transfer over aneuploid embryos (abnormal complement), as they are associated with improved clinical outcomes. Currently, evaluation of embryo genetic status is most commonly performed by preimplantation genetic testing for aneuploidy (PGT-A), which involves embryo biopsy and genetic testing. The potential for embryo damage during biopsy, and the non-uniform nature of aneuploid cells in mosaic embryos, have prompted investigation of additional, non-invasive, whole embryo methods for evaluation of embryo genetic status.

STUDY DESIGN, SIZE, DURATION: A total of 15 192 blastocyst-stage embryo images with associated clinical outcomes were provided by 10 different IVF clinics in the USA, India, Spain and Malaysia. The majority of data were retrospective, with two additional prospectively collected blind datasets provided by IVF clinics using the genetics AI model in clinical practice. Of these images, a total of 5050 images of embryos on Day 5 of in vitro culture were used for the development of the AI model. These Day 5 images were provided for 2438 consecutively treated women who had undergone IVF procedures in the USA between 2011 and 2020. The remaining images were used for evaluation of performance in different settings, or otherwise excluded for not matching the inclusion criteria.

PARTICIPANTS/MATERIALS, SETTING,
Keywords: ICSI outcome; IVF; PGT-A; artificial intelligence; assisted reproduction; embryo quality; genetics; machine learning; preimplantation genetic testing for aneuploidy
Year: 2022; PMID: 35674312; PMCID: PMC9340116; DOI: 10.1093/humrep/deac131
Source DB: PubMed; Journal: Hum Reprod; ISSN: 0268-1161; Impact factor: 6.353
Composition of datasets used for development of the Day 5 genetics artificial intelligence model.
| Datasets | Total Day 5 dataset (uncleansed) | Day 5 blind test dataset (uncleansed) | Day 5 blind test dataset (cleansed) |
|---|---|---|---|
| Number of embryo images | 5050 | 1001 | 786 |
| Number of patients | 2438 | 788 | 658 |
| Dates treated | 2011–2020 | 2011–2020 | 2011–2020 |
| Number of cycles | 2485 | 798 | 664 |
| Average cycles per patient (range) | 1.0 (1–3) | 1.0 (1–2) | 1.0 (1–2) |
| Average embryo cohort size (range) | 2.0 (1–17) | 1.3 (1–5) | 1.2 (1–4) |
| Average patient age in years (range) | 36.2 (19–53) | 35.6 (19–53) | 35.2 (19–51) |
| Number of donor gamete(s) used (%) | 1106 (22.0%) | 211 (21.1%) | 170 (21.6%) |
| Number of euploid embryos (%) | 3251 (64.4%) | 645 (64.4%) | 613 (78.0%) |
| Number of aneuploid embryos (%) | 1799 (35.6%) | 356 (35.6%) | 173 (22.0%) |
| Number of transferred embryos (%) | ND | 156 (15.6%) | 148 (18.8%) |
| Number of successful pregnancies (%) | ND | 92 (59.0%) | 87 (58.8%) |
| Number of unsuccessful pregnancies (%) | ND | 64 (41.0%) | 61 (41.2%) |
| Ovation—Austin (TX, USA) | 3328 (65.9%) | 671 (67.0%) | 522 (66.4%) |
| San Antonio IVF (TX, USA) | 538 (10.7%) | 103 (10.3%) | 78 (9.9%) |
| Midwest Fertility Specialists (IN, USA) | 236 (4.7%) | 45 (4.5%) | 37 (4.7%) |
| California Fertility Partners (CA, USA) | 943 (18.7%) | 182 (18.2%) | 149 (19.0%) |
| Ovation—Baton Rouge (LA, USA) | 5 (0.1%) | 0 (0.0%) | 0 (0.0%) |
| Monosomy—n (%) | 483 (26.8%) | 80 (22.5%) | 39 (22.5%) |
| Trisomy—n (%) | 466 (25.9%) | 88 (24.7%) | 35 (20.2%) |
| Full gains or losses—n (%) | 1217 (67.6%) | 237 (66.6%) | 119 (68.8%) |
| Segmental duplications or deletions—n (%) | 382 (21.2%) | 86 (24.2%) | 33 (19.1%) |
| Single chromosomal abnormalities—n (%) | 1093 (60.8%) | 219 (60.6%) | 101 (58.4%) |
| Multiple abnormalities (complex)—n (%) | 657 (36.5%) | 137 (39.4%) | 72 (41.6%) |
| Chromosome 1—n (%) | 83 (4.6%) | 23 (6.6%) | 9 (5.2%) |
| Chromosome 2—n (%) | 120 (6.7%) | 19 (5.5%) | 12 (6.9%) |
| Chromosome 3—n (%) | 77 (4.3%) | 20 (5.8%) | 10 (5.8%) |
| Chromosome 4—n (%) | 111 (6.2%) | 27 (7.8%) | 8 (4.6%) |
| Chromosome 5—n (%) | 102 (5.7%) | 27 (7.8%) | 11 (6.4%) |
| Chromosome 6—n (%) | 86 (4.8%) | 17 (4.9%) | 10 (5.8%) |
| Chromosome 7—n (%) | 99 (5.5%) | 21 (6.1%) | 6 (3.5%) |
| Chromosome 8—n (%) | 110 (6.1%) | 24 (6.9%) | 11 (6.4%) |
| Chromosome 9—n (%) | 104 (5.8%) | 27 (7.8%) | 10 (5.8%) |
| Chromosome 10—n (%) | 86 (4.8%) | 11 (3.2%) | 5 (2.9%) |
| Chromosome 11—n (%) | 98 (5.4%) | 14 (4.0%) | 7 (4.0%) |
| Chromosome 12—n (%) | 71 (3.9%) | 16 (4.6%) | 7 (4.0%) |
| Chromosome 13—n (%) | 140 (7.8%) | 36 (10.4%) | 20 (11.6%) |
| Chromosome 14—n (%) | 121 (6.7%) | 25 (7.2%) | 12 (6.9%) |
| Chromosome 15—n (%) | 196 (10.9%) | 33 (9.5%) | 20 (11.6%) |
| Chromosome 16—n (%) | 255 (14.2%) | 49 (14.1%) | 26 (15.0%) |
| Chromosome 17—n (%) | 75 (4.2%) | 17 (4.9%) | 10 (5.8%) |
| Chromosome 18—n (%) | 127 (7.1%) | 31 (8.9%) | 17 (9.8%) |
| Chromosome 19—n (%) | 113 (6.3%) | 25 (7.2%) | 15 (8.7%) |
| Chromosome 20—n (%) | 78 (4.3%) | 21 (6.1%) | 12 (6.9%) |
| Chromosome 21—n (%) | 221 (12.3%) | 46 (13.3%) | 19 (11.0%) |
| Chromosome 22—n (%) | 323 (18.0%) | 52 (15.0%) | 28 (16.2%) |
| Sex chromosomes—n (%) | 161 (8.9%) | 34 (9.8%) | 15 (8.7%) |
The Day 5 blind test dataset of 1001 images was cleansed by the UDC method to remove poor quality and mislabeled images (remaining n = 786).
Some cohorts consisted of a combination of Day 5 and Day 6 embryos—these were separated according to dataset (see Supplementary Table SI).
Percentage calculated as proportion of aneuploid embryos in dataset. Embryos could have multiple chromosomes involved.
Number of embryos with monosomic/trisomic changes include those with single abnormalities only and those with a single full gain or loss accompanied by segmental changes.
Number of embryos with full/segmental changes include those with single or multiple abnormalities of the same type.
ND, not determined; UDC, untrainable data cleansing.
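The effect of cleansing on class balance can be checked directly from the table. The sketch below uses only the euploid and aneuploid counts reported above for the Day 5 blind test dataset; the variable and function names are illustrative and not part of the study.

```python
# Counts copied from the table above (Day 5 blind test dataset).
uncleansed = {"euploid": 645, "aneuploid": 356}
cleansed = {"euploid": 613, "aneuploid": 173}


def euploid_share(counts):
    """Return the euploid percentage of a dataset, rounded to one decimal."""
    total = counts["euploid"] + counts["aneuploid"]
    return round(100 * counts["euploid"] / total, 1)


# Number of images removed from each class by cleansing.
removed = {label: uncleansed[label] - cleansed[label] for label in uncleansed}

print(euploid_share(uncleansed), euploid_share(cleansed), removed)
```

Of the 215 images removed (1001 to 786), 183 were labeled aneuploid and 32 euploid, which is why the euploid share rises from 64.4% to 78.0% after cleansing.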
Figure 1. Performance of the Day 5 artificial intelligence (AI) algorithm for predicting the likelihood of human embryo euploidy on uncleansed and cleansed blind test datasets. (A) Confusion matrices depicting true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN) for the Day 5 AI model predicting embryo euploid status. Matrices are shown for uncleansed (top panel) and cleansed (bottom panel) blind test datasets. (B) Receiver-operating characteristic (ROC) curves for uncleansed (top panel) and untrainable data cleansing (UDC)-cleansed (bottom panel) Day 5 blind test datasets. The AUC values are depicted. (C) Precision-recall curves (PRC) for uncleansed (top panel) and UDC-cleansed (bottom panel) Day 5 blind test datasets. The AUC values are depicted. (D) The correlation between the genetics AI score and the proportion of euploid embryos was evaluated using four defined euploid likelihood categories as depicted. The statistical method used was Chi-square test for trend (df, degrees of freedom).
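For readers unfamiliar with the quantities in panel A, the following minimal sketch shows how standard summary metrics are derived from the four confusion-matrix cells. The counts are hypothetical placeholders, not values from the figure, and the function name `confusion_metrics` is an assumption made for illustration.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive common summary metrics from confusion-matrix counts.

    'Positive' is taken here to mean a euploid prediction.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,       # all correct calls
        "sensitivity": tp / (tp + fn),       # euploid embryos correctly called
        "specificity": tn / (tn + fp),       # aneuploid embryos correctly called
        "precision": tp / (tp + fp),         # euploid calls that are correct
    }


# Hypothetical counts for illustration only (NOT from the paper).
metrics = confusion_metrics(tp=80, fp=20, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```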
Results of simulated cohort ranking analyses to evaluate the ability of the genetics artificial intelligence model to enrich for euploid embryos over random ranking and the Gardner score.
| Measurement | Proportion of cohorts with top one ranked embryo euploid | Proportion of cohorts with one of top two ranked embryos euploid | Proportion of cohorts with both top two ranked embryos euploid |
|---|---|---|---|
| Genetics AI model versus random ranking | | | |
| Genetics AI model | 82.4% | 97.0% | 66.4% |
| Random | 65.2% | 88.9% | 43.2% |
| Improvement | 26.4% | 9.1% | 53.7% |
| Genetics AI model versus Gardner ranking (3BB threshold) | | | |
| Genetics AI model | 81.1% | 96.3% | 63.7% |
| Gardner | 68.0% | 90.6% | 46.6% |
| Improvement | 19.3% | 6.3% | 36.7% |
| Genetics AI model versus Gardner ranking (four-group system) | | | |
| Genetics AI model | 81.1% | 96.3% | 63.7% |
| Gardner | 71.9% | 92.2% | 50.2% |
| Improvement | 12.8% | 4.4% | 27.4% |
A subset of 918 of the 1001 images in the Day 5 blind test dataset had associated Gardner grades. Gardner ranking was performed using a 3BB threshold as described in Materials and methods section.
A subset of 918 of the 1001 images in the Day 5 blind test dataset had associated Gardner grades. Gardner ranking was performed using a four-group system as described in Materials and methods section.
AI, artificial intelligence.
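The "Improvement" rows appear to be relative improvements, that is, (model − baseline) / baseline. The sketch below recomputes them from the rounded percentages in the table; the function name is illustrative, and this reading of the rows is an inference from the numbers rather than a definition given in this excerpt.

```python
def relative_improvement(model_pct, baseline_pct):
    """Relative improvement of the model over a baseline, in percent."""
    return round(100 * (model_pct - baseline_pct) / baseline_pct, 1)


# Genetics AI model versus random ranking (values from the table above).
vs_random = [
    relative_improvement(82.4, 65.2),  # top one ranked embryo euploid
    relative_improvement(97.0, 88.9),  # one of top two euploid
    relative_improvement(66.4, 43.2),  # both top two euploid
]

# Last Gardner comparison in the table, both top two ranked embryos euploid.
last_gardner_both = relative_improvement(63.7, 50.2)

print(vs_random, last_gardner_both)
```

The values for the random comparison reproduce the table (26.4%, 9.1%, 53.7%). Recomputing the final cell from the rounded percentages gives 26.9% rather than the tabulated 27.4%, possibly because the published figure was derived from unrounded proportions.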
Figure 2. Correlations between the Day 5 genetics artificial intelligence (AI) score and level of mosaicism, monosomic abnormalities, and performance on Day 6 human embryos. (A) Correlation between average genetics AI score and embryos based on ploidy status, including euploid, aneuploid, or mosaic status. (B) Correlation between average AI score and embryo ploidy status, separating mosaic embryos according to level of mosaicism. (C) The correlation between the AI score and the proportion of euploid embryos was evaluated using euploid likelihood categories on a dataset of images taken of blastocyst-stage embryos on Day 6 of in vitro culture. (D) Average genetics AI score in embryos with monosomic or trisomic changes compared to euploid embryos. (E) Correlation between AI score and the proportion of embryos with monosomic changes in different AI score categories. Average AI scores were compared using one-way ANOVA with Tukey’s multiple comparisons post-test (Student’s t-test was used to compare monosomic with trisomic changes), and Chi-square test for trend was used where indicated (df = degrees of freedom). P-values are represented as follows: *P < 0.05, ***P < 0.001.
Figure 3. Performance of the Day 5 genetics artificial intelligence (AI) model in different demographics and using time-lapse images. (A) The correlation between the AI score and the proportion of human euploid embryos was evaluated using euploid likelihood categories on a double-blind test dataset of images from a clinic in India. (B) The correlation between AI score and the proportion of euploid embryos evaluated on a double-blind test dataset of images taken using the GERI time-lapse imaging system by three clinics in Spain. (C) The correlation between AI score and the proportion of euploid embryos evaluated on a blind test dataset of images taken using the EmbryoScope time-lapse imaging system by a clinic in Malaysia. The statistical method used was Chi-square test for trend (df, degrees of freedom).
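The Chi-square test for trend named in the figure legends asks whether the euploid proportion rises steadily across ordered AI score categories. The sketch below is one common formulation (Cochran-Armitage, 1 degree of freedom) and is an assumption about the general technique, not the authors' exact procedure; statistical packages may apply slightly different variance conventions. The counts are hypothetical and do not come from the paper.

```python
import math


def chi_square_trend(events, totals, scores=None):
    """Cochran-Armitage style chi-square test for trend (df = 1).

    events: number of euploid embryos in each ordered category
    totals: number of embryos in each ordered category
    scores: ordinal score per category (defaults to 0, 1, 2, ...)
    """
    if scores is None:
        scores = list(range(len(totals)))
    n = sum(totals)
    p = sum(events) / n  # overall euploid proportion under no trend
    # Score-weighted deviation of observed counts from the no-trend expectation.
    t = sum(s * (e - m * p) for s, e, m in zip(scores, events, totals))
    var = p * (1 - p) * (
        sum(m * s * s for s, m in zip(scores, totals))
        - sum(m * s for s, m in zip(scores, totals)) ** 2 / n
    )
    chi2 = t * t / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT from the paper): four ordered score categories.
euploid = [10, 20, 30, 40]
per_category = [50, 50, 50, 50]
chi2, p_value = chi_square_trend(euploid, per_category)
print(f"chi2 = {chi2:.1f}, P = {p_value:.2e}")
```

With these illustrative counts the statistic is 40.0, far beyond the threshold for P < 0.001; a flat table (equal proportions in every category) would give a statistic of zero.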