Guillaume Bachelot, Rachel Lévy, Charlotte Dupont, Antonin Lamazière, Anne Bachelot, Céline Faure, Sébastien Czernichow.
Abstract
We aimed to develop and evaluate a machine learning model that stratifies infertile and fertile couples on the basis of their bioclinical signature, to help in the management of couples with unexplained infertility. Fertile and infertile couples were recruited in the ALIFERT cross-sectional, case-control, multicentric study between September 2009 and December 2013 (NCT01093378). The study group consisted of 97 infertile couples presenting with primary idiopathic infertility (> 12 months) from 4 French infertility centers, compared with 100 fertile couples (with a spontaneously conceived child < 2 years of age and a time to pregnancy < 12 months) recruited from the healthy population of the areas around the infertility centers. The study group comprised 2 independent sets: a development set (n = 136 from 3 centers) used to train the model and a test set (n = 61 from 1 center) used to provide an unbiased validation of the model. Our results showed that: (i) a couple-modeling approach was more discriminant than models in which men's and women's parameters were considered separately; (ii) the most discriminating variables were anthropometric or related to metabolic and oxidative status; (iii) after variable selection (from 80 to 13 variables), a refined model was able to stratify fertile vs. infertile couples with an accuracy of 73.8%. These influential factors (anthropometric, antioxidative, and metabolic signatures) are all modifiable through the couple's lifestyle. The proposed model has a place in the management of couples with idiopathic infertility, for whom decision-making tools are scarce. Prospective interventional studies are now needed to validate the model's clinical use.
Trial registration: NCT01093378 ALIFERT https://clinicaltrials.gov/ct2/show/NCT01093378?term=ALIFERT&rank=1 . Registered: March 25, 2010.
Year: 2021 PMID: 34907216 PMCID: PMC8671584 DOI: 10.1038/s41598-021-03165-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Data flow diagram for the study: model training iteration. The model was trained and tuned after variable selection on a development set of patients from multiple specialized centers. First, the model was internally evaluated using cross-validation. Then, a test set of couples from a different institution was used for external validation. The model predictions were compared with the expected clinical output. A model training iteration process (i.e. introduced externally) was finally performed to build a refined model with the selected features.
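The data flow in Figure 1 (development set from several centers, internal cross-validation, then external validation on a separate center) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the per-center sizes of the three development centers are hypothetical (only the 136/61 totals come from the abstract), and a simple nearest-centroid classifier stands in for the OPLS-DA model.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in classifier (NOT OPLS-DA): one mean vector per class."""
    return {label: X[y == label].mean(axis=0) for label in np.unique(y)}

def predict(centroids, X):
    """Assign each row of X to the class with the nearest centroid."""
    labels = np.array(list(centroids))
    dists = np.stack(
        [np.linalg.norm(X - centroids[l], axis=1) for l in labels], axis=1
    )
    return labels[dists.argmin(axis=1)]

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

def cross_validate(X, y, k, rng):
    """Internal evaluation: mean accuracy over k shuffled folds."""
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit_centroids(X[train], y[train])
        scores.append(accuracy(y[val], predict(model, X[val])))
    return float(np.mean(scores))

rng = np.random.default_rng(0)

# Synthetic cohort: 97 infertile (1) and 100 fertile (0) couples, 80 variables.
n_couples, n_vars = 197, 80
y = (np.arange(n_couples) < 97).astype(int)
X = rng.normal(size=(n_couples, n_vars))
X[:, :5] += 0.8 * y[:, None]  # artificial signal in the first 5 variables

# Hypothetical center assignment: centers 0-2 form the development set
# (45 + 45 + 46 = 136 couples), center 3 is the external test set (61 couples).
center = rng.permutation(np.repeat([0, 1, 2, 3], [45, 45, 46, 61]))
dev, test = center != 3, center == 3

cv_acc = cross_validate(X[dev], y[dev], k=7, rng=rng)    # internal validation
model = fit_centroids(X[dev], y[dev])                    # refit on all dev data
ext_acc = accuracy(y[test], predict(model, X[test]))     # external validation
print(f"cross-validated accuracy: {cv_acc:.3f}, external accuracy: {ext_acc:.3f}")
```

Splitting by center rather than at random keeps every couple from the test institution out of training, which is what makes the second accuracy figure an external estimate.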
Figure 2. Orthogonal partial least squares discriminant analysis models (training and evaluation) calculated from the men, women and couples development sets. (A) External evaluation of men's fertility status: labelled external test set observations (fertile and infertile as green and red stars, respectively) are projected on the score plots generated from the fertile (green dots) and infertile (red dots) men (135 × 50) development set. R2 (fit) and Q2 (prediction capability) were 0.471 and 0.322, respectively. (B) External evaluation of women's fertility status: labelled external test set observations (fertile and infertile as green and red stars, respectively) are projected on the score plots generated from the fertile (green dots) and infertile (red dots) women (135 × 30) development set. R2 and Q2 were 0.560 and 0.462, respectively. (C) External evaluation of couples: labelled external test set observations (fertile and infertile as green and red stars, respectively) are projected on the score plots generated from the fertile (green dots) and infertile (red dots) couples (135 × 80) development set. R2 and Q2 were 0.624 and 0.487, respectively. Accuracy scores were 0.590, 0.574 and 0.688 for the models based on men, women and couples data, respectively.
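The caption reports R2 and Q2 without defining them. The conventional definitions in PLS-type modeling are given below as a reminder; whether the authors' software computes them in exactly this form is an assumption. Here $y_i$ is the class label of observation $i$, $\hat{y}_i$ its fitted value, $\hat{y}_{(i)}$ its prediction from a cross-validation round in which observation $i$ was left out, and $\bar{y}$ the mean label.

```latex
R^2 = 1 - \frac{\sum_i \left( y_i - \hat{y}_i \right)^2}{\sum_i \left( y_i - \bar{y} \right)^2},
\qquad
Q^2 = 1 - \frac{\sum_i \left( y_i - \hat{y}_{(i)} \right)^2}{\sum_i \left( y_i - \bar{y} \right)^2}
```

Under these definitions Q2 is normally lower than R2, as in all three panels, because it is computed from predictions on observations that were held out of the fit.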
Figure 3. Supervised orthogonal partial least squares discriminant analysis (OPLS-DA) model with 13 variables with variable importance for the projection (VIP) > 1 selected from the 24-variable model: reductionist model. (A) Score plot generated from fertile (green triangles) and infertile (red dots) couples. R2 and Q2 were 0.467 and 0.410, respectively. Accuracy was 0.838 on internal validation and 0.705 on the test set. (B) VIP plot summarizing the influence of the 13 variables on the model. (C) Score plot of the reductionist model from the merged development and test sets, with the test set (61 couples) shown as black stars. (D) Score plot of the reductionist model from the combined development and test sets. Scores are labeled according to fertility status (green or red).
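The VIP > 1 selection rule mentioned in the caption can be illustrated with a short sketch. It makes several assumptions: the data are synthetic, a plain single-response PLS fitted by NIPALS stands in for OPLS-DA (whose VIP computation may differ in detail), and the function names are hypothetical. The VIP formula used is the standard one, in which each variable's squared weights are averaged over components in proportion to the response variance each component explains.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Plain PLS1 via NIPALS (a stand-in here, not OPLS-DA)."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    n, p = Xk.shape
    W = np.zeros((p, n_components))   # unit-norm weight vectors
    T = np.zeros((n, n_components))   # score vectors
    q = np.zeros(n_components)        # y-loadings
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w
        tt = t @ t
        loading = Xk.T @ t / tt
        q[a] = (yk @ t) / tt
        Xk = Xk - np.outer(t, loading)  # deflate X
        yk = yk - q[a] * t              # deflate y
        W[:, a], T[:, a] = w, t
    return W, T, q

def vip_scores(W, T, q):
    """VIP_j = sqrt(p * sum_a SS_a * w_ja^2 / sum_a SS_a), with unit-norm w_a."""
    p = W.shape[0]
    ss = (q ** 2) * np.sum(T ** 2, axis=0)        # y-variance explained per component
    w2 = (W / np.linalg.norm(W, axis=0)) ** 2     # squared normalized weights
    return np.sqrt(p * (w2 @ ss) / ss.sum())

# Synthetic example: 120 observations, 10 variables, only the first two
# variables carry information about the binary class label.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

W, T, q = pls1_nipals(X, y, n_components=2)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1)   # indices of variables kept by the VIP > 1 rule
print("VIP:", np.round(vip, 2))
print("selected variables:", selected)
```

Because the squared VIP values average to exactly 1 across variables, a threshold of 1 keeps the variables that contribute more than the average one, which is the usual rationale for the cutoff.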