Yanran Zhang, Lei Shen, Xinghui Yin, Wenfeng Chen.
Abstract
Background: Natural-cycle in vitro fertilization (NC-IVF) is an in vitro fertilization (IVF) cycle without gonadotropins or any other stimulation of follicular growth. Few previous studies have addressed live-birth prediction for NC-IVF, and their sample sizes were very limited. This study aims to construct a machine learning model that predicts live-birth occurrence in NC-IVF using 57,558 linked cycle records and to help clinicians develop treatment strategies.
Keywords: HFEA; NC-IVF; ensemble learning; live birth; machine learning
Year: 2022 PMID: 35527994 PMCID: PMC9072737 DOI: 10.3389/fendo.2022.838087
Source DB: PubMed Journal: Front Endocrinol (Lausanne) ISSN: 1664-2392 Impact factor: 6.055
Figure 1. The overall model-building framework.
Description of 35 fields in the dataset.
| Field name | Field type | Description |
|---|---|---|
| Patient Age at Treatment | Categorical | Patient age at treatment, banded as follows: 18–34, 35–37, 38–39, 40–42, 43–44, 45–50. |
| Total Number of Previous Treatments, Both IVF and DI at Clinic | Numeric | The number of treatment cycles of IVF and DI the patient has previously had at the clinic associated with this treatment. |
| Total Number of Previous IVF Cycles | Numeric | The number of treatment cycles of IVF the patient has previously had. |
| Total Number of Previous DI Cycles | Numeric | The number of treatment cycles of DI the patient has previously had. |
| Total Number of IVF Pregnancies | Numeric | Times the patient has been pregnant through IVF. |
| Total Number of DI Pregnancies | Numeric | Times the patient has been pregnant through DI. |
| Total Number of Live Births—Conceived through IVF | Numeric | The number of live births the patient has had through IVF. |
| Total Number of Live Births—Conceived through DI | Numeric | The number of live births the patient has had through DI. |
| Type of Infertility—Female Primary | Categorical | 1 if the patient has never been pregnant, 0 otherwise. |
| Type of Infertility—Female Secondary | Categorical | 1 if the patient has ever been pregnant, 0 otherwise. |
| Type of Infertility—Male Primary | Categorical | 1 if the partner has never impregnated any woman, 0 otherwise. |
| Type of Infertility—Male Secondary | Categorical | 1 if the partner has ever impregnated some woman, 0 otherwise. |
| Type of Infertility—Couple Primary | Categorical | 1 if the patient has never been pregnant while the partner has never impregnated any woman, 0 otherwise. |
| Type of Infertility—Couple Secondary | Categorical | 1 if the patient has ever been pregnant while the partner has ever impregnated some woman, 0 otherwise. |
| Cause of Infertility—Tubal Disease | Categorical | 1 if the primary cause of infertility is due to tubal disease, 0 otherwise. |
| Cause of Infertility—Ovulatory Disorder | Categorical | 1 if the primary cause of infertility is due to ovulatory disorder, 0 otherwise. |
| Cause of Infertility—Male Factor | Categorical | 1 if the primary cause of infertility is due to the partner, 0 otherwise. |
| Cause of Infertility—Patient Unexplained | Categorical | 1 if the primary cause of infertility is unknown, 0 otherwise. |
| Cause of Infertility—Endometriosis | Categorical | 1 if the primary cause of infertility is due to endometriosis, 0 otherwise. |
| Cause of Infertility—Cervical Factors | Categorical | 1 if the primary cause of infertility is due to cervical factors, 0 otherwise. |
| Cause of Infertility—Partner Sperm Concentration | Categorical | 1 if the primary cause of infertility is due to partner sperm concentration, 0 otherwise. |
| Cause of Infertility—Partner Sperm Morphology | Categorical | 1 if the primary cause of infertility is due to partner sperm morphology, 0 otherwise. |
| Causes of Infertility—Partner Sperm Motility | Categorical | 1 if the primary cause of infertility is due to partner sperm motility, 0 otherwise. |
| Cause of Infertility—Partner Sperm Immunological Factors | Categorical | 1 if the primary cause of infertility is due to partner sperm immunological factors, 0 otherwise. |
| Main Reason for Producing Embryos Storing Eggs | Categorical | The main reason for producing embryos or storing eggs in this cycle; values include "treatment now" and "for storing eggs". |
| Specific Treatment Type | Categorical | The specific treatment type used in this cycle; values include IVF and ICSI. |
| PGD | Categorical | 1 if this cycle involved the use of preimplantation genetic diagnosis, 0 otherwise. |
| PGD Treatment | Categorical | 1 if this cycle would be contained in the “PGD” CaFC category on the HFEA website, 0 otherwise. |
| PGS | Categorical | 1 if this cycle involved the use of preimplantation genetic screening, 0 otherwise. |
| PGS Treatment | Categorical | 1 if this cycle would be contained in the “PGS” CaFC category on the HFEA website, 0 otherwise. |
| Elective Single Embryo Transfer | Categorical | 1 if this cycle involved the deliberate use of only one embryo, 0 otherwise. |
| Fresh Cycle | Categorical | 1 if this cycle used fresh embryos, 0 otherwise. |
| Frozen Cycle | Categorical | 1 if this cycle used frozen embryos, 0 otherwise. |
| Embryos Transferred | Numeric | The number of embryos transferred into the patient in this cycle. |
| Live-Birth Occurrence | Categorical | 1 if there were one or more live births as a result of this cycle, 0 otherwise. |
IVF, in vitro fertilization; DI, donor insemination; ICSI, intracytoplasmic sperm injection; HFEA, Human Fertilisation and Embryology Authority.
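Because patient age is supplied only as a band, a model needs some numeric representation of it. The following is a minimal sketch of one option, an ordinal code per band. The source does not say how the authors encoded this field, so `AGE_BANDS`, `encode_age_band`, and `band_for_age` are hypothetical names introduced purely for illustration.

```python
# Age bands exactly as listed in the field description table.
AGE_BANDS = ["18–34", "35–37", "38–39", "40–42", "43–44", "45–50"]


def encode_age_band(band: str) -> int:
    """Map a banded age string to its ordinal position (0 = youngest band).

    Assumption: ordinal coding is one plausible choice, not the paper's
    documented preprocessing.
    """
    return AGE_BANDS.index(band)


def band_for_age(age: int) -> str:
    """Return the band containing an integer age; raise if out of range."""
    for band in AGE_BANDS:
        low, high = (int(part) for part in band.split("–"))
        if low <= age <= high:
            return band
    raise ValueError(f"age {age} is outside the banded range 18-50")


code = encode_age_band(band_for_age(41))
```

An ordinal code preserves the ordering of the bands; one-hot encoding would be the alternative if no ordering should be assumed.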
Figure 2. Correlation matrix of 31 features. Redder cells indicate stronger positive correlations between feature pairs, bluer cells indicate stronger negative correlations, and white cells indicate no correlation.
Baseline characteristics of NC-IVF cycles, 2005–2016 (n = 57,558).
| Characteristic | Positive live birth (n = 12,340), n | Positive live birth, % | Negative live birth (n = 45,218), n | Negative live birth, % |
|---|---|---|---|---|
| 18–34 | 6,222 | 50.42 | 18,409 | 40.71 |
| 35–37 | 3,211 | 26.02 | 11,089 | 24.52 |
| 38–39 | 1,642 | 13.31 | 6,824 | 15.09 |
| 40–42 | 1,060 | 8.59 | 6,478 | 14.33 |
| 43–44 | 163 | 1.32 | 1,747 | 3.86 |
| 45–50 | 42 | 0.34 | 671 | 1.48 |
| Female primary | 3,186 | 25.82 | 13,642 | 30.17 |
| Female secondary | 1,667 | 13.51 | 7,597 | 16.80 |
| Male primary | 3,171 | 25.70 | 13,515 | 29.89 |
| Male secondary | 1,655 | 13.41 | 7,608 | 16.83 |
| Couple primary | 3,678 | 29.81 | 15,867 | 35.09 |
| Couple secondary | 1,151 | 9.33 | 5,227 | 11.56 |
| Tubal disease | 2,029 | 16.44 | 8,818 | 19.50 |
| Ovulatory disorder | 1,844 | 14.94 | 6,484 | 14.34 |
| Male factor | 4,916 | 39.84 | 17,007 | 37.61 |
| Patient unexplained | 3,191 | 25.86 | 11,933 | 26.39 |
| Endometriosis | 725 | 5.88 | 2,614 | 5.78 |
| Cervical factors | 6 | 0.05 | 27 | 0.06 |
| Partner sperm concentration | 47 | 0.38 | 235 | 0.52 |
| Partner sperm morphology | 40 | 0.32 | 144 | 0.32 |
| Partner sperm motility | 23 | 0.19 | 116 | 0.26 |
| Partner sperm immunological factors | 2 | 0.02 | 5 | 0.01 |
| IVF | 5,570 | 45.14 | 21,539 | 47.63 |
| ICSI | 6,770 | 54.86 | 23,679 | 52.37 |
IVF, in vitro fertilization; ICSI, intracytoplasmic sperm injection.
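The table reports percentages within each outcome group. The per-band live-birth rate (positive cycles divided by all cycles in that band) can be derived from the same counts. The sketch below does that arithmetic using only the age-band counts from the table; the names `AGE_BAND_COUNTS` and `live_birth_rates` are illustrative.

```python
# (positive live birth, negative live birth) cycle counts per age band,
# copied from the baseline characteristics table.
AGE_BAND_COUNTS = {
    "18–34": (6222, 18409),
    "35–37": (3211, 11089),
    "38–39": (1642, 6824),
    "40–42": (1060, 6478),
    "43–44": (163, 1747),
    "45–50": (42, 671),
}


def live_birth_rates(counts):
    """Return the percentage of cycles with a live birth for each band."""
    return {
        band: 100.0 * pos / (pos + neg)
        for band, (pos, neg) in counts.items()
    }


total_pos = sum(pos for pos, _ in AGE_BAND_COUNTS.values())
total_neg = sum(neg for _, neg in AGE_BAND_COUNTS.values())
rates = live_birth_rates(AGE_BAND_COUNTS)

for band, rate in rates.items():
    print(f"{band}: {rate:.2f}%")
print(f"overall: {100.0 * total_pos / (total_pos + total_neg):.2f}%")
```

The band counts sum to the group totals in the table header (12,340 and 45,218), and the derived rate falls steadily from about 25% in the 18–34 band to about 6% in the 45–50 band.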
Evaluation metrics of all models.
| Model | Accuracy | Recall | Specificity | Precision | NPV | MCC | F1 |
|---|---|---|---|---|---|---|---|
| DT | 74.19 | 61.90 | 86.47 | 82.06 | 69.42 | 49.90 | 70.57 |
| LD | 74.44 | 61.62 | 87.26 | 82.87 | 69.45 | 50.57 | 70.68 |
| LR | 74.34 | 62.27 | 86.42 | 82.10 | 69.59 | 50.17 | 70.82 |
| NB | 57.14 | 15.57 | 98.72 | 92.40 | 53.90 | 25.72 | 26.65 |
| Linear SVM | 74.38 | 60.72 | 88.04 | 83.54 | 69.15 | 50.69 | 70.33 |
| ANN | 74.42 | 62.24 | 86.61 | 82.30 | 69.64 | 50.37 | 70.87 |
| BT | 67.78 | 72.88 | 62.69 | 66.14 | 69.80 | 35.75 | 69.34 |
| AdaBoost | 74.37 | 61.33 | 87.41 | 82.97 | 69.33 | 50.49 | 70.53 |
| GentleBoost | 73.85 | 62.87 | 84.82 | 80.55 | 69.55 | 48.88 | 70.62 |
| LogitBoost | 74.47 | 61.22 | 87.72 | 83.29 | 69.34 | 50.75 | 70.57 |
| RUSBoost | 74.33 | 61.22 | 87.44 | 82.98 | 69.28 | 50.43 | 70.46 |
| RSM | 74.41 | 61.28 | 87.54 | 83.11 | 69.33 | 50.60 | 70.54 |
The values in the table represent percentages.
NPV, negative predictive value; MCC, Matthews correlation coefficient; DT, decision tree; LD, linear discriminant; LR, logistic regression; NB, naive Bayes; SVM, support vector machine; ANN, artificial neural network; BT, bagged tree; RSM, random subspace method.
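All seven metrics in the table follow from the four cells of a confusion matrix. The sketch below shows the standard definitions, parameterized by recall, specificity, and the share of positive cases in the evaluation set. The source does not report the confusion matrices or the class balance of the evaluation set; the default `prevalence=0.5` is an assumption, chosen because the reported accuracy in every row equals the mean of recall and specificity. The function name `metrics_from_rates` is hypothetical.

```python
import math


def metrics_from_rates(recall, specificity, prevalence=0.5):
    """Derive the table's metrics (as percentages) from recall and specificity.

    `recall` and `specificity` are fractions in [0, 1]. `prevalence` is the
    assumed share of positive cases in the evaluation set (an assumption,
    not a value stated in the source).
    """
    # Confusion-matrix cells expressed as fractions of the evaluation set.
    tp = recall * prevalence
    fn = (1 - recall) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)

    precision = tp / (tp + fp)
    npv = tn / (tn + fn)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    values = {
        "accuracy": tp + tn,  # the four cells sum to 1
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "mcc": mcc,
        "f1": 2 * precision * recall / (precision + recall),
    }
    return {name: round(100 * value, 2) for name, value in values.items()}


# Recall and specificity taken from the DT row of the table.
dt = metrics_from_rates(recall=0.6190, specificity=0.8647)
```

Under the balanced-set assumption, the derived values agree with the DT row to within rounding, which suggests the other columns carry no information beyond recall and specificity at this class balance.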
Figure 3. The ROC curves and AUC scores of six machine learning models. (A) The ROC curve and AUC score of the DT model: the deep purple curve is the ROC curve, the area under the curve is shaded light purple, the orange dot marks the threshold corresponding to the optimal operating point, and the AUC score is labeled. (B–F) The ROC curves and AUC scores of LD, LR, NB, Linear SVM, and ANN, respectively. ROC, receiver operating characteristic; AUC, area under the receiver operating characteristic curve; DT, decision tree; LD, linear discriminant; LR, logistic regression; NB, naive Bayes; SVM, support vector machine; ANN, artificial neural network.
Figure 4. The ROC curves and AUC scores of six ensemble learning models. (A) The ROC curve and AUC score of the BT model: the deep purple curve is the ROC curve, the area under the curve is shaded light purple, the orange dot marks the threshold corresponding to the optimal operating point, and the AUC score is labeled. (B–F) The ROC curves and AUC scores of AdaBoost, GentleBoost, LogitBoost, RUSBoost, and RSM, respectively. ROC, receiver operating characteristic; AUC, area under the receiver operating characteristic curve; BT, bagged tree; RSM, random subspace method.
Figure 5. Comprehensive comparison of all models. (A) ROC curves of the six machine learning models: the larger the area under the curve, the better the performance. (B) ROC curves of the six ensemble learning models. (C) Comparison of two ROC curves: the ANN model, i.e., the best machine learning model, and the LogitBoost model, i.e., the best ensemble learning model. (D) In this histogram, the metrics of the twelve models (accuracy, recall, specificity, precision, NPV, MCC, F1, and AUC) are stacked into columns; hence, the higher the column, the better the performance. ROC, receiver operating characteristic; ANN, artificial neural network; NPV, negative predictive value; MCC, Matthews correlation coefficient; AUC, area under the receiver operating characteristic curve.
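The ROC curves, AUC scores, and operating points in Figures 3 to 5 can be illustrated with a small sketch: sweep the decision threshold from high to low, record the false-positive and true-positive rates, and integrate with the trapezoidal rule. The source does not say how the optimal operating point was chosen, so Youden's J (the point maximizing TPR minus FPR) is used here as a stand-in criterion. The labels and scores are toy values invented for illustration, not data from the study.

```python
def roc_curve(labels, scores):
    """Return ROC points (fpr, tpr, threshold), sweeping thresholds high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0, float("inf"))]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample sharing this score (ties).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives, score))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def youden_point(points):
    """Point maximizing TPR - FPR (Youden's J); an assumed stand-in criterion."""
    return max(points, key=lambda p: p[1] - p[0])


# Toy data, invented for illustration only.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

points = roc_curve(labels, scores)
area = auc(points)
best_fpr, best_tpr, best_threshold = youden_point(points)
print(f"AUC = {area:.4f}, operating point at threshold {best_threshold}")
```

For the toy data, 13 of the 16 positive-negative pairs are ranked correctly, so the AUC is 0.8125. Other operating-point rules, such as cost-weighted slopes on the ROC curve, can select a different threshold from the same curve.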