
Machine learning to predict mesenchymal stem cell efficacy for cartilage repair.

Yu Yang Fredrik Liu¹, Yin Lu², Steve Oh², Gareth J Conduit¹.

Abstract

Inconsistent therapeutic efficacy of mesenchymal stem cells (MSCs) in regenerative medicine has been documented in many clinical trials. Precise prediction of the therapeutic outcome of an MSC therapy based on the patient's condition would provide a valuable reference for clinicians deciding on treatment strategies. In this article, we performed a meta-analysis of MSC therapies for cartilage repair using machine learning. A small database was generated from published in vivo and clinical studies. The unique features of our neural network model in handling missing data and calculating prediction uncertainty enabled precise prediction of post-treatment cartilage repair scores with a coefficient of determination of 0.637 ± 0.005. From this model, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and the type of cartilage damage as critical properties that significantly impact cartilage repair. A dosage of 17-25 million MSCs was found to achieve optimal cartilage repair. Further, critical thresholds at 6% and 64% of cartilage damage in area, and 22% and 56% in depth, were predicted to significantly compromise the efficacy of MSC therapy. This study, for the first time, demonstrated machine learning of patient-specific cartilage repair post MSC therapy. This approach can be applied to identify and investigate more critical properties involved in MSC-induced cartilage repair, and adapted for other clinical indications.


Year: 2020 PMID: 33027251 PMCID: PMC7571701 DOI: 10.1371/journal.pcbi.1008275

Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475

Introduction

Articular cartilage is a critical tissue with multifaceted mechanical functions. It bears compression, absorbs shock, and enables smooth articulation at the joints. Cartilage injury is unfortunately common due to tears, accidents and arthritis, and often leads to joint pain, stiffness, and inflammation. Cartilage disorders affect millions of people worldwide, including 52.2 million adults in the US [1], and more than 10 million in the UK [2]. In particular, osteoarthritis alone affects more than 200 million people globally [3]. Adult cartilage has limited self-repair capacity due to its avascular nature [4], thus treatments are often necessary to accelerate repair and relieve pain during joint motion. Besides conservative treatments and conventional surgical options, such as microfracture and autologous chondrocyte implantation (ACI), mesenchymal stem cell (MSC) therapy has also been widely investigated in the management of cartilage damage in recent decades [5]. Although significant success has been achieved for MSC therapy in cartilage repair, the efficacy of the therapy has been inconsistent. This is likely attributable to the complex cellular mechanisms and dynamic interplay across the different populations of cells involved in stem cell assisted tissue repair. MSC therapy is also complicated by the heterogeneity of cells, culture conditions, delivery methods, and recipients’ conditions, which are all highly variable in current clinical trials and laboratory studies. Thus, a disconnect between the in vitro, pre-clinical, and clinical performance of MSCs has been broadly observed [5], which has so far rendered the analysis of MSCs’ therapeutic efficacy largely retrospective, rather than predictive. As a result, there is a lack of guidelines on MSC therapy strategy to promote optimal therapeutic efficacy. Setting guidelines for MSC therapy requires identification of the critical properties that affect MSCs’ therapeutic efficacy most significantly. 
To achieve this, a quantitative assessment of the significance of each individual property is needed. However, this is ineffective through conventional controlled biomedical experiments, where one or at most a few properties can be interrogated at a time. To overcome this challenge, we use machine learning to capture multi-property correlations and exploit all of the information in a database. A machine learning model makes predictions based on the training dataset, and each algorithm has a basic set of parameters to fit multidimensional functions that can be changed to improve its accuracy [6]. Deep learning methods are able to predict multiple output properties simultaneously [7]. In this paper, we performed a meta-analysis of MSC therapy for cartilage repair. The data we analyzed were generated by different researchers using different experimental designs; as a result, the properties considered in one study may not always be addressed in another, which has led to a database containing “missing information” in some of its entries. Many machine learning methods do not analyze entries with incomplete information, which often results in a shrinking database with compromised predictive performance. We adapt a neural network formalism [8-12] with a unique capacity to “fill” the missing data by learning the correlations across multiple properties and recursively imputing precise estimates. Furthermore, our machine learning method computes the uncertainty of predictions arising from experimental noise and computational extrapolation, which allows the neural network model to focus on the most confident predictions. The coefficient of determination (R2) of our machine learning model in predicting MSC therapy outcome was 0.637 ± 0.005 in a cross-validation test. Through machine learning, we identified defect area percentage, defect depth percentage, implantation cell number, body weight, tissue source, and cartilage damage type as critical therapy properties of cartilage repair. 
In particular, an optimal dosage range of 17-25 million cells was identified for achieving the best therapeutic outcome. We also predicted that the optimal therapy outcome was most likely to be achieved in patients with cartilage defects less than 6% in area and 22% in depth of the knee cartilage. Larger defects significantly dampen the efficacy of MSC therapy. The capacity of predicting MSCs’ therapeutic outcome using machine learning holds great clinical significance in suggesting critical therapy input properties to maximize the therapeutic benefits. Further development of this technology could extend its applications in other diseases and cell types, and shed light on substantial improvements in cell therapy efficacy and consistency.

Methods

Data sets

We collected data from 36 published articles on PubMed [13-48] to train and validate our machine learning models. Some articles comprised more than one type of cartilage injury model or treatment condition. In total, 15 clinical trial conditions and 29 animal model conditions (1 goat, 6 pigs, 2 dogs, 9 rabbits, 9 rats, and 2 mice) on osteochondral injury or osteoarthritis were included, where MSCs were transplanted to repair the cartilage tissue. We documented each case as an entry of a database. We considered the cell- and treatment target-related factors as input properties, including species, body weight, tissue source, cell number, cell concentration, defect area, defect depth, and type of cartilage damage. The therapeutic outcomes were considered as output properties, which were evaluated using integrated clinical and histological cartilage repair scores, including the International Cartilage Repair Society (ICRS) scoring system, the O’Driscoll score, the Pineda score, the Mankin score, the Osteoarthritis Research Society International (OARSI) scoring system, the International Knee Documentation Committee (IKDC) score, the visual analog score (VAS) for pain, the Knee Injury and Osteoarthritis Outcome Score (KOOS), the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and the Lysholm score. In this study, these scores were linearly normalized to a number between 0 and 1, with 0 representing the worst damage or pain, and 1 representing completely healthy tissue. The entries were combined to form a database.
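The linear normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `normalize_score` and the score ranges used in the examples are assumptions chosen for demonstration. Passing the worst and best values explicitly lets the same mapping handle scales where a higher raw value is worse (such as a pain score).

```python
def normalize_score(raw, worst, best):
    """Linearly map a raw score onto [0, 1].

    0 corresponds to the worst damage or pain and 1 to completely healthy
    tissue. `worst` and `best` are the raw values at the two ends of the scale.
    """
    if worst == best:
        raise ValueError("worst and best must differ")
    return (raw - worst) / (best - worst)


# Hypothetical examples; the ranges below are illustrative assumptions.
pain_normalized = normalize_score(raw=3.0, worst=10.0, best=0.0)     # lower raw value is better
repair_normalized = normalize_score(raw=18.0, worst=0.0, best=24.0)  # higher raw value is better
```

Because the mapping is linear, scores reported on different scales become comparable as a single output property.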

Neural network formalism

We now define the neural network formalism that was used to capture the functional relation between all properties, and predict these relations for new therapies. The core neural network and its critical feature of estimating the uncertainty in predictions are described as follows, before the second novel aspect of handling missing data. Each entry x = (x1, …, xI) to the neural network is a vector of length I, with the first I − 1 variables being the distinct treatment conditions (including species, body weight, tissue source, cell number, cell concentration, defect area, defect depth, and type of cartilage damage); and the final Ith variable is the therapeutic outcome. We intended to find a function f that satisfies the fixed-point equation f(x) ≡ x for all entries in the database. The trivial solution to this fixed-point equation is the identity operator, f(x) = x, but this solution does not allow us to impute data using the function f. We search for a solution to the fixed-point equation that is orthogonal to the identity operator, and allows the function to predict a given component of x from some or all other components. The output (y1, …, yI) is a vector of length I, with the first I − 1 variables being the predicted treatment conditions (if unknown); and the final Ith variable is the therapeutic outcome. A linear superposition of hyperbolic tangents was chosen to model the function f(x) = (y1, …, yI), with $y_j = \sum_{h} C_{hj}\,\eta_{hj} + D_j$ and $\eta_{hj} = \tanh\left(\sum_{i=1}^{I} A_{ihj}\,x_i + B_{hj}\right)$. The neural network with one layer of hidden nodes is shown in Fig 1. Each hidden node ηhj performs a tanh operation on a superposition of input properties xi with variational parameters Aihj and Bhj for 1 ≤ i ≤ I. Each property yj for 1 ≤ j ≤ I is predicted separately as a superposition of all hidden nodes with variational parameters Chj and Dj. There are exactly as many given properties as predicted properties, since all types of properties are treated equally by the artificial neural network (ANN). 
We set Ajhj = 0 so that the network predicts yj without knowledge of the true quantity xj. A hyperbolic tangent activation function was used to constrain the magnitude of ηhj, giving the weights Chj sole responsibility for the amplitude of the output response. The variational parameters were selected to minimize the mean square error of predictions of the training data.
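A forward pass through a network of this shape can be sketched in NumPy as below. This is only an illustration of the structure described above, not the authors' implementation: the array layout (`A[i, h, j]`, `B[h, j]`, `C[h, j]`, `D[j]`), the function name `forward`, and the random parameter values are assumptions.

```python
import numpy as np


def forward(x, A, B, C, D):
    """Predict every property y_j from the other properties.

    x: inputs, shape (I,)
    A: input weights A[i, h, j], shape (I, H, I)
    B: hidden biases B[h, j], shape (H, I)
    C: output weights C[h, j], shape (H, I)
    D: output biases D[j], shape (I,)
    """
    n_props = x.shape[0]
    A = A.copy()
    idx = np.arange(n_props)
    A[idx, :, idx] = 0.0  # A_jhj = 0: y_j never sees its own input x_j
    eta = np.tanh(np.einsum("i,ihj->hj", x, A) + B)  # hidden nodes eta[h, j]
    return (C * eta).sum(axis=0) + D                 # y[j] = sum_h C[h, j] * eta[h, j] + D[j]


# Illustrative random parameters (assumed sizes: 4 properties, 3 hidden nodes).
rng = np.random.default_rng(0)
I_props, H_nodes = 4, 3
A = rng.normal(size=(I_props, H_nodes, I_props))
B = rng.normal(size=(H_nodes, I_props))
C = rng.normal(size=(H_nodes, I_props))
D = rng.normal(size=I_props)
x = rng.normal(size=I_props)
y = forward(x, A, B, C, D)

# Changing x[2] leaves the prediction y[2] unchanged, as required by A_jhj = 0.
x_shifted = x.copy()
x_shifted[2] += 5.0
y_shifted = forward(x_shifted, A, B, C, D)
```

Zeroing the diagonal weights is what rules out the trivial identity solution: each output must be reconstructed from the remaining properties.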

Fig 1

Neural network to model data.


The graph illustrates how different inputs xi are used to calculate the outputs for y1 (top) and y2 (bottom); similar graphs can be drawn for all other yj to compute all the predicted properties. Linear combinations (grey lines) of the given properties (red) are taken by the hidden nodes (blue), where a non-linear tanh operation is applied, and a linear combination (grey lines) of the hidden nodes returns the predicted property (green).

The ANN has to be trained on a provided data set. The parameters {Aihj, Bhj, Chj, Dj} are initialized with random values, and varied following a random walk. The new values are accepted if the new function f(x) models the fixed-point equation f(x) ≡ x better, which is quantitatively measured by the error function $\delta = \sqrt{\frac{1}{N} \sum_{x \in X} \sum_{j=1}^{I} \left[ f_j(x) - x_j \right]^2}$, where the sum runs over the N entries x of the training set X. This form is also known as the root-mean-square error (RMSE) cost function. The optimization is equivalent to the minimization of the RMSE cost function and a steepest descent approach is used. In order to measure the uncertainty in the ANN’s prediction, we train a number of models simultaneously, and treat their average as the overall prediction and their standard deviation as the uncertainty. The pseudocode is shown in Algorithm A; at least 100 training models were used to evaluate the uncertainty. This concept is similar to the estimation of uncertainty in ensemble models, with the underlying model being changed to neural networks, and the uncertainty generated accounts for both experimental uncertainty in the underlying data and the uncertainty in the extrapolation of the training data [49, 50].
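The ensemble step, taking the mean of several models as the prediction and their standard deviation as the uncertainty, can be sketched as follows. The function name `ensemble_predict` and the three toy models are assumptions for illustration; in the study at least 100 trained networks play the role of `models`.

```python
import numpy as np


def ensemble_predict(models, x):
    """Return (mean, standard deviation) of the predictions of several models."""
    predictions = np.array([model(x) for model in models])
    return predictions.mean(axis=0), predictions.std(axis=0)


# Toy stand-ins for independently trained networks (illustrative only).
toy_models = [lambda x, shift=shift: x + shift for shift in (-1.0, 0.0, 1.0)]
mean_prediction, uncertainty = ensemble_predict(toy_models, np.array([0.5]))
```

A wide spread between the models signals that the input lies in a region where the training data constrain the prediction poorly.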

Handling incomplete data

Sometimes a database may contain entries with incomplete input information due to experimental design or data acquisition problems. The possibility of such missing data is higher in a meta-analysis, where results from studies with acceptable differences in design and purpose are pooled to form a database. In our database, for example, the osteochondral defect studies took the area of the defect as a common critical measure for evaluating the severity of injury [14, 17]. However, this information was not always presented in osteoarthritis studies due to difficulties in precisely measuring the area of a defect with complicated geometry [25, 48]. This leads to “missing data” in the entries. We noticed that underlying correlations may exist across the different properties, and can be distilled by a neural network to “fill in” the missing information. A typical neural network formalism requires each property to be either an input or an output of the network, and all inputs must be provided to compute a valid output. In contrast, our neural network takes the known treatment conditions and the therapeutic outcome (if known) as inputs, then outputs the predictions for the unknown treatment conditions and the therapeutic outcome. Then, following the flowchart in Fig 2, the neural network is applied iteratively to cycle the predictions of the unknown treatment conditions and therapeutic outcome until self-consistency is reached, which is an expectation-maximization algorithm [51].

Fig 2

Data imputation algorithm for the vector x.


After checking the missing properties of entries, we set x0 = x, and replace all the missing data by averages from the training data set. We then iteratively compute xn+1 as a function of xn and f(xn) until we reach convergence after n iterations.

The algorithm is shown in Fig 2. For any unknown properties, we first set missing values to the average of the values present in the data set. With estimates for all inputs to the neural network, we then recursively apply the following equation until convergence: $x^{n+1} = \gamma\,x^{n} + (1 - \gamma)\,f(x^{n})$, where n denotes the iteration step and f(xn) is the prediction for x obtained from the neural network. The converged result is then returned instead of f(x). The function f remains fixed on each iteration of the cycle. A softening parameter, γ ∈ [0, 1], is used to combine the results with the existing predictions, and γ > 0 serves to prevent oscillations and divergences of the predictions. Typically, we set γ to 0.5. The performance against the missing data percentage in the database is shown in S1 Text (Fig. B). Thus we were able to utilize the full information in the database, derive a more robust model and enhance the quality of predictions.
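The imputation cycle can be sketched as below, assuming missing values are marked with NaN and that known values are held fixed while only the missing components are updated (an assumption of this sketch). The function `impute`, the tolerance, and the toy predictor are hypothetical; a trained network would take the place of `f`.

```python
import numpy as np


def impute(x, f, column_means, gamma=0.5, tol=1e-8, max_iter=1000):
    """Iterate x_{n+1} = gamma * x_n + (1 - gamma) * f(x_n) on the missing components."""
    missing = np.isnan(x)
    x_n = np.where(missing, column_means, x)  # start from the training-set averages
    for _ in range(max_iter):
        x_next = gamma * x_n + (1.0 - gamma) * f(x_n)
        x_next = np.where(missing, x_next, x)  # assumption: known values stay fixed
        if np.max(np.abs(x_next - x_n)) < tol:
            return x_next
        x_n = x_next
    return x_n


def toy_f(v):
    """Toy fixed predictor standing in for the trained network."""
    return np.array([v[0], 2.0 * v[0]])  # second property predicted as twice the first


entry = np.array([1.5, np.nan])
filled = impute(entry, toy_f, column_means=np.array([0.0, 0.0]))
```

With γ = 0.5 each step moves the estimate halfway towards the network's prediction, which damps oscillations at the cost of a few more iterations.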

Validation

The model was initially fitted on a training dataset, which is a set of entries from our database. The fitted model was then used to predict the outputs in a second validation dataset, to provide an unbiased evaluation of the model. To assess the performance of the model, we adopt the coefficient of determination (R2) metric when evaluating our model on the validation dataset. In this work, we use only the therapeutic outcome as the variable for R2: $R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$, where $y_i$ is the therapeutic outcome from the ith case (patient or animal), $\hat{y}_i$ is the corresponding prediction, and $\bar{y}$ is the average of the outcomes. The value of R2 ranges from negative infinity to 1 and is a measure of the fit to the perfect identity line $\hat{y} = y$, where R2 = 1 means a perfect fit, and R2 = 0 corresponds to making the most naive prediction of all values being the average of the data. To confirm the accuracy of the neural network prediction and avoid overfitting, the R2 was calculated within the leave-one-out cross-validation framework. We removed one entry from the database at a time, trained the model on the remaining entries, and presented the inputs of the unseen entry to predict its output. Finally, we gathered the predicted properties of every entry, and computed the R2 against the actual experimental properties.
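The leave-one-out procedure and the R2 metric can be sketched as follows. The helper names and the straight-line model used as a stand-in for the neural network are assumptions for illustration; any routine that fits on the remaining entries and predicts the held-out one could be passed as `fit`.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def leave_one_out_predictions(X, y, fit):
    """Predict each entry with a model trained on all the other entries."""
    predictions = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit(X[keep], y[keep])
        predictions[i] = model(X[i])
    return predictions


def fit_line(X_train, y_train):
    """Stand-in model: least-squares straight line on the first input column."""
    coefficients = np.polyfit(X_train[:, 0], y_train, 1)
    return lambda x_row: np.polyval(coefficients, x_row[0])


X_toy = np.arange(6, dtype=float).reshape(-1, 1)
y_toy = 2.0 * X_toy[:, 0] + 1.0
loo_r2 = r_squared(y_toy, leave_one_out_predictions(X_toy, y_toy, fit_line))
```

Predicting the average for every entry gives R2 = 0, and predictions worse than that give negative values, which is why some single-property models reported later have negative R2.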

Other machine learning methods

We compare our neural network algorithm with a variety of other machine learning approaches in S1 Text (Fig. A(i)). Random Forest (RF) [52] is a popular method that builds an ensemble of decision trees to predict individual results. However, decision trees require all of their inputs to be present during training, so RF models cannot be built from incomplete entries unless those entries are dropped. We therefore used the imputation algorithm to fill the database first, and RF recorded the second-best R2 value of 0.554, compared to the value of 0.637 from our neural network method. We also tested the K-Nearest Neighbor (KNN) and Multiple Linear Regression (MLR) methods [53], where 3 nearest neighbors with Euclidean distance was chosen as the optimal setting for KNN. Another popular method of analyzing sparse databases is matrix factorization, where the matrix of condition and treatment values is approximately factorized into two lower-rank matrices that are then used to predict the therapeutic outcome for a new patient. We used the modern Collective Matrix Factorization (CMF) [54] implementation for comparison, and the hyperparameter alpha for the CMF model was chosen heuristically as 0.99. The resulting R2 value is -0.003; a possible reason is that the CMF method assumes linearity in the interaction of latent factors, which fails to capture some complex non-linear interactions. We also used leave-one-out cross-validation to determine other hyperparameters of the neural network in S1 Text (Fig. A).

Selection of input properties

Selecting the most appropriate input properties for the neural network is challenging in our meta-analysis because, as discussed before, the available properties vary across different studies, and the same or related properties may be reported in different ways. The input properties were categorized into two types, factual and derived. The factual properties were: species, implantation cell number, defect area, defect depth, type of cartilage damage, body weight, and tissue source. The tissue source can be further classified into bone marrow (BM), adipose tissue (AD), synovial fluid (SF), Wharton’s jelly (WJ), synovial tissue (ST), and umbilical cord blood (UCB). The derived properties emerged from our biological intuition and may not have been used in the previous studies, such as defect area percentage, defect depth percentage, and cell concentration. We first trained a neural network to take only one input property and predict the cartilage repair score. This allowed us to probe the performance of each individual property in Fig 3. It is possible that two or more properties of MSC therapy are individually not impactful to the cartilage repair, but when used in combination they allow the model to capture an important correlation. For example, both implantation cell concentration and defect volume have low R2 values (-0.04 and 0.12 respectively), but the implantation cell number, which is the product of the two former properties, gives an R2 value of 0.41.

Fig 3

Accuracy of the neural network.


R2 values of the neural network model trained with each individual property (bars) and with the combination of the best performing individual properties (red line). The shaded red area represents the uncertainty of each R2 value.

The full set of factual and derived properties was provided as inputs to train the neural network. A correlation test was performed between all properties to make sure that no pairs of input properties are closely correlated, finding that both Pearson’s correlation and Spearman’s rank correlation coefficients are smaller than 0.53. The individual properties’ correlation with the cartilage repair score was computed and sorted in descending order in Fig 3. The most correlated property is defect depth percentage, with an R2 value of 0.55, followed by defect area percentage (0.42), cell number (0.41) and body weight (0.30). The tissue types BM and AD, and the type of cartilage damage, are less correlated, and the cell concentration along with the other tissue types (SF, WJ, ST, and UCB) have negative R2 values for the cartilage repair score. The top four properties give an R2 value of 0.625 ± 0.012, and the combination of all seven positive properties has a maximum R2 of 0.637 ± 0.005. Overfitting was observed as a decrease in R2 with more than seven descriptors; this happens when the model matches the training dataset but fails on unseen data drawn from the validation dataset. Fewer descriptors did not provide a sufficient basis set, so we chose the first seven descriptors, each of which individually yields a positive R2 value; this captured more correlations of clinical properties without overfitting and provided higher quality uncertainty prediction. The tissue types BM and AD have been consolidated into a single tissue type property, so we have a total of six different input properties.

Results

With the six identified critical input properties, the neural network used for our machine learning model achieved an R2 of 0.637 ± 0.005 with blind cross-validation. The neural network also delivered the prediction uncertainty, which is compared with the absolute error between the predicted value and the actual value in Fig 4. The random errors associated with the model followed a normal distribution, so the predicted uncertainties should capture the true uncertainty well. Of the 44 entries in total, 18 lie outside the one standard deviation region. We will exploit this knowledge in the following subsections.

Fig 4

Histogram of errors for predictions using our neural network.

The dotted red line is a fitted normal distribution. Each bin contains four data points. Overpredicted means that the predicted values are better than the post-treatment cartilage repair scores. Underpredicted means that the predicted values are worse than the post-treatment cartilage repair scores.


Imputation

With access to the uncertainties in Fig 4, we can gain further insight from the neural network predictions. In particular, we can discard predictions carrying large uncertainty, and trust only those with smaller uncertainty. The idea is illustrated in Fig 5A, where we select four of the points from Fig 5B: the point with the largest uncertainty, which has the highest likelihood of deviating from the true value and so should give the largest error, and points at the other quartiles in uncertainty. This allows us to focus on the most confident predictions at the expense of reporting fewer predictions, e.g. by discarding the data point with the largest uncertainty (yellow bar) and recalculating the sum of squares for the R2 value. By doing so, the quality of the remaining neural network predictions increases: the root-mean-square error between the predicted and actual values decreases when a smaller fraction of predictions is accepted and validated, as shown in Fig 5B. There, 100% of data validated means that we predicted and validated against every entry in our database, and all of these values contribute to the final R2; 75% of data validated means that we calculate the R2 using only the 75% of the data with the smallest uncertainties in their predictions. The best R2 value of 0.743 was obtained with 82% of data validated, and R2 reached a plateau when less than 70% of data were validated. Validating fewer data can lead to significant noise and is less applicable in the real world, where we wish to impute as much as possible; therefore we focus on the >50% regime. The result confirms that the neural network is able to accurately and truthfully inform us about the uncertainties in its predictions, and so the confidence of predictions is correlated with their accuracy.
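The uncertainty-based filtering can be sketched as below: predictions are sorted by their predicted uncertainty and R2 is recomputed on the most confident fraction only. The function name, the choice to recompute both sums of squares on the retained subset, and the toy numbers are assumptions of this sketch rather than details taken from the study.

```python
import numpy as np


def r2_at_fraction(y_true, y_pred, sigma, fraction):
    """R2 computed on the `fraction` of entries with the smallest uncertainty."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    order = np.argsort(sigma)                          # most confident first
    n_keep = max(2, int(round(fraction * len(order))))
    keep = order[:n_keep]
    ss_res = np.sum((y_true[keep] - y_pred[keep]) ** 2)
    ss_tot = np.sum((y_true[keep] - y_true[keep].mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy data: the least certain prediction is also the least accurate one.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 1.0, 2.0, 6.0]
predicted_sigma = [0.1, 0.2, 0.3, 0.9]
r2_all = r2_at_fraction(observed, predicted, predicted_sigma, 1.0)
r2_confident = r2_at_fraction(observed, predicted, predicted_sigma, 0.75)
```

If the predicted uncertainties are informative, the R2 rises as the least confident predictions are dropped, which is the behaviour reported for Fig 5B.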

Fig 5

Model performance after imputation.


(A) shows an example of making predictions for just four data points. The y-axis is the prediction from the machine learning, and the x-axis delineates four different sample predictions ordered by their uncertainty. The colored dots represent the predicted values, and their uncertainties, which are also predicted by the machine learning method, are shown by the colored bars (magenta, turquoise, green, and yellow). The violin plot represents the probability density distribution for the predicted outcomes, the red dots are the true values (unknown to the machine learning) used for validation, and the difference between predicted and true values is shown by the grey arrows. The sum of squares (SS) value is then normalized to calculate the R2 value. (B) shows the R2 value against the percentage of data validated, and the data points are color-coded by their uncertainty ranking. The blue line is the trend line fitted to the data points. The turquoise, green, and yellow points in (A) are the points at 50%, 75%, and 100% in (B).

We note that this post-processing increases accuracy after a model has been trained, and that the desired level of confidence can be specified and used to return only sufficiently accurate predictions. The projected cartilage repair score, along with the confidence level of the prediction, will be provided once the patient’s condition has been set as input to the model, which will allow clinicians to focus treatments on those most likely to lead to success, and trials to focus on the most illuminating input property space.

Identifying anomalous results

With the computed uncertainties of prediction, we identified entries with particularly high deviations between the predicted and experimental results. Those can then be re-examined and corrected to improve the training dataset. Most predictions of our model were expected to lie within one standard deviation (±1) of the experimental results, as shown in Fig 4. The 18 entries that lay outside the one standard deviation region are shown in Table 1. Three of them were from clinical trials, and the other 15 were from animal studies. A positive number of standard deviations means that our neural network overpredicts the cartilage repair score, and a negative number means underprediction. We analyzed the over- and underpredicted repair scores as follows.

Table 1

The table lists the entries whose predictions differ from the clinical or experimental results by more than one standard deviation (greater than 1 or less than -1), which indicates that our prediction is far from the experiment.

Prediction	Species	Authors	Damage	Standard deviations from experiment
Overpredicted	Rabbit	Katayama et al. [33]	Defect	2.88
	Rat	Dahlin et al. [38]	Defect	2.16
	Rat	Papadopoulou et al. [47]	Arthritis	2.15
	Minipig	Ha et al. [15]	Defect	1.84
	Rat	Zhu et al. [35]	Defect	1.35
	Rat	Zhang et al. [43]	Arthritis	1.20
	Minipig	Wu et al. [17]	Defect	1.10
	Minipig	Lee et al. [23]	Defect	1.04
	Rabbit	Li et al. [34]	Defect	1.02
Underpredicted	Rat	Xue et al. [37]	Defect	-1.03
	Rabbit	Park et al. [14]	Defect	-1.15
	Human	de Windt et al. [18]	Defect	-1.26
	Rabbit	Li et al. [34]	Defect	-1.30
	Human	Koh et al. [22]	Defect	-1.40
	Rabbit	Ma et al. [32]	Defect	-1.62
	Rat	Park et al. [13]	Defect	-1.74
	Human	Fodor et al. [26]	Arthritis	-1.76
	Piglet	Ando et al. [29]	Defect	-2.34
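The last column of Table 1 can be read as a signed deviation measured in units of the predicted uncertainty. A minimal sketch is given below, assuming the deviation is the predicted score minus the observed score, divided by the predicted standard deviation; the function names and the sample numbers are hypothetical and are not taken from the database.

```python
def deviations_out(predicted, observed, sigma):
    """Signed number of standard deviations between prediction and experiment.

    Positive: the model overpredicts the repair score; negative: it underpredicts.
    """
    return (predicted - observed) / sigma


def is_anomalous(predicted, observed, sigma, threshold=1.0):
    """Flag entries lying outside the one standard deviation region."""
    return abs(deviations_out(predicted, observed, sigma)) > threshold


# Hypothetical entries: (predicted score, observed score, predicted uncertainty).
samples = [(0.80, 0.50, 0.12), (0.60, 0.65, 0.10), (0.40, 0.70, 0.15)]
flags = [is_anomalous(p, o, s) for p, o, s in samples]
z_first = deviations_out(*samples[0])
```

Entries flagged in this way are the candidates for re-examination discussed in this subsection.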

In general, our model predicted 80% of the clinical trials with an error smaller than one standard deviation, which was better than the 48% for the animal studies. Three human clinical trial outcomes were underpredicted. In two of the cases, the researchers performed additional surgical procedures besides MSC implantation to repair the damaged cartilage. De Windt et al. implanted debrided autologous chondrocytes together with MSCs in their procedure [18]. The interaction between MSCs and chondrocytes was not considered as an input property in the current neural network, but might promote cartilage repair. Koh et al. performed microfracture surgery before MSC implantation [22]. The recruitment of autologous MSCs from the subchondral bone to the defect cartilage area by the microfracture surgery was likely the cause of the underpredicted outcome from the neural network. The most underpredicted entry, at -2.34 standard deviations away from the actual experimental outcome, appeared in the study from Ando et al., where an MSC-based tissue scaffold was implanted into chondral defects in porcine models [29]. Similarly, Li et al. encapsulated MSCs in microspheres prior to transplantation into rabbit osteochondral defects [34], which yielded a deviation of -1.30 standard deviations. In both cases, the use of a scaffold likely induced pre-differentiation of MSCs towards the chondrogenic lineage, and the production of extracellular matrix proteins before transplantation might have greatly promoted the repair efficacy. Xue et al. also delivered MSCs to their rat model in a tissue-engineered scaffold made from poly (lactide-co-glycolide) (PLGA)/nano-hydroxyapatite (NHA), but the MSCs possibly remained in an undifferentiated state [37]. This resulted in a smaller underprediction by the neural network, with a deviation of -1.03 standard deviations. Another underprediction, with a deviation of -1.62 standard deviations, was seen in the study from Ma et al. [32], where an autologous graft was transplanted together with the MSCs. In this study, the mosaicplasty might have contributed significantly to the repair, which was not analyzed as an input to the neural network. For overpredicted repair scores, Katayama et al. reported their MSC treatment efficacy in rabbit cartilage defects [33] at a much lower level than the neural network prediction, 2.88 standard deviations away. Although the isolation and subculture of MSCs were performed using standard protocols, the authors did not provide sufficient quality control of the cells before the treatment. The uncertainty in cell purity and quality might have resulted in the suboptimal repair. Re-visiting these inaccurately predicted cases has allowed us to gain further insight into the therapeutic efficacy of MSCs in cartilage repair. The majority of the less accurate predictions occurred in animal trials, where special delivery methods or manipulations of the MSCs have been implemented. These findings imply a potential impact of these novel therapy input properties on cartilage repair, although they are not readily applied in clinical trials. We also realized that not all of the less accurately predicted cases were associated with a special delivery strategy or cell modification, and the underlying causes were not obvious. It is reasonable to believe that the potency of MSCs, the secretome profile, and the surgical procedures might all have significant impacts on the therapeutic outcome. Including this information as input properties in the database would empower the neural network to enhance the prediction accuracy.
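The 80% and 48% figures quoted at the start of this discussion follow from counts stated elsewhere in the article: 15 clinical trial conditions and 29 animal model conditions in the database, of which 3 and 15 respectively lie outside one standard deviation. A short arithmetic check (variable names are ours):

```python
clinical_total, animal_total = 15, 29     # conditions in the database (Data sets)
clinical_outside, animal_outside = 3, 15  # entries outside one standard deviation (Table 1)

clinical_within = (clinical_total - clinical_outside) / clinical_total
animal_within = (animal_total - animal_outside) / animal_total

print(f"clinical: {clinical_within:.0%}, animal: {animal_within:.0%}")
# prints "clinical: 80%, animal: 48%"
```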

Influence of properties

The patients’ pre-treatment conditions and therapeutic strategies were encoded within the input properties for the model to make predictions. The relative strength of the properties in predicting the cartilage repair score, defined as the change of R2 on removing a property, is plotted in Fig 6. The pre-treatment conditions, such as defect area percentage, defect depth percentage, and body weight, play important roles in the treatment outcome, whereas the treatment strategy properties, such as the implantation cell number and the tissue source, impact the outcome to a lesser extent. We now study these input properties in descending order of importance.
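The "change of R2 on removing a property" can be sketched as a simple ablation loop. Here `cv_r2` stands for any routine that returns the cross-validated R2 of a model trained on a given list of properties; it, the function `property_importance`, and the toy numbers are hypothetical placeholders, and the sign convention (drop in R2 counted as positive importance) is an assumption of this sketch.

```python
def property_importance(properties, cv_r2):
    """Importance of each property as the drop in R2 when it is removed."""
    full_score = cv_r2(properties)
    importance = {}
    for name in properties:
        reduced = [other for other in properties if other != name]
        importance[name] = full_score - cv_r2(reduced)
    return importance


# Toy scoring function with additive, made-up contributions (illustration only).
toy_gain = {"defect depth %": 0.30, "defect area %": 0.20, "cell number": 0.10}


def toy_cv_r2(selected):
    return sum(toy_gain[name] for name in selected)


importance = property_importance(list(toy_gain), toy_cv_r2)
ranking = sorted(importance, key=importance.get, reverse=True)
```

In practice each call to the scoring routine would retrain and cross-validate the network, so the cost grows with the number of properties examined.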

Fig 6

Illustration of the relative strength of properties used in our model.

Defect area and depth percentage

We first investigated the two most important properties: defect area percentage and defect depth percentage. A surface plot is shown in Fig 7, where the cartilage repair score has been normalized against the full range of scores in the database. It is worth noting that although most of the training data have a defect area percentage less than 30% and a defect depth percentage greater than 40%, our neural network model extrapolated the cartilage repair of a patient with indications beyond the existing range of conditions in the database. The neural network can do this due to its unique ability to handle missing data over the full range of conditions (0-100%). In general, the cartilage repair score drops as the percentage of defect area and depth increases, implying the difficulty for MSC therapy to achieve full recovery in patients with severe cartilage damage.

Fig 7

Surface plot of the normalized cartilage repair score based on defect area percentage and defect depth percentage.

The trajectory of changing area or depth is shown in white arrows.

The study showed that critical thresholds of damage exist for effective cartilage repair to happen, which is similar to the case of volumetric muscle loss [55]. In cartilage repair models, a “critical size” osteochondral defect that cannot effectively repair by itself has been widely used. In most cases, such critical sizes were applied at estimated default values for different animal models. Some studies have attempted to experimentally determine the critical size of the defect in terms of depth and diameter in specific animal models [56]. In our machine learning model, we predicted those “critical size” defects, as we observed a rapid decrease in the normalized cartilage repair score when the defect area percentage increases from 6% to 35%. Another sharp drop was observed at 64%, and minimal repair should be expected when more than 70% of the cartilage area is damaged. These sharp drop-offs identified from the model indicate the presence of multiple “critical sizes” that constrain cartilage repair to different levels post MSC therapy. These quantitative cartilage repair predictions based on the patients’ defect conditions provide useful references for clinicians to make decisions on the therapy.
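
To make the notion of a drop-off threshold concrete, the following sketch scans a predicted score curve and reports the defect area percentages at which the curve starts to fall steeply. Everything numerical here is an assumption for illustration: the piecewise curve is made up so that its drops begin at the 6% and 64% values quoted above, the plateau heights are arbitrary, and the slope cut-off `steep_slope` is a hypothetical parameter, not one taken from the study.

```python
def find_drop_onsets(xs, ys, steep_slope=-0.01):
    """Return the x values where the curve starts to fall faster than steep_slope."""
    onsets = []
    previously_steep = False
    for i in range(len(xs) - 1):
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        steep = slope < steep_slope
        if steep and not previously_steep:
            onsets.append(xs[i])  # first point of a new steep segment
        previously_steep = steep
    return onsets


def illustrative_score(area_percent):
    """Made-up normalized repair score versus defect area percentage."""
    if area_percent <= 6:
        return 1.0
    if area_percent <= 35:
        return 1.0 - 0.5 * (area_percent - 6) / 29
    if area_percent <= 64:
        return 0.5
    if area_percent <= 70:
        return 0.5 - 0.4 * (area_percent - 64) / 6
    return 0.1


area = list(range(0, 101))
score = [illustrative_score(a) for a in area]
onsets = find_drop_onsets(area, score)
print(onsets)  # prints [6, 64]
```

Applied to a real model, `score` would be the model's predictions along one trajectory of the surface in Fig 7, and the choice of slope cut-off would need to be justified against the prediction uncertainty.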

Body weight

As shown in Fig 6, body weight also acts as an important input property in our neural network: heavier species tend to have a better therapeutic outcome. However, this may be attributed to the large inter-species weight differences in the database. The lack of intra-species weight information in the database has made further analysis difficult. This could be a valuable topic for further investigation.

Implantation cell number

The next most important input property is the implantation cell number. Fig 8 shows a near-linear increase in the cartilage repair score for implantation cell numbers below 17 million. The normalized cartilage repair score is above 0.9 between 17 and 25 million implanted cells, and is maintained at around 0.8 in the 25 to 75 million range. A further increase in the implantation cell number results in a sudden drop of the normalized cartilage repair score to below 0.7.

Fig 8

Impact of implantation cell number on the cartilage repair.

The left y-axis shows the predicted patient’s cartilage repair scores normalized to the range of scores on which patients have been evaluated in clinical trials. The right y-axis shows the number of studies (blue histogram) that use a certain cell number in our database.

The determination of MSC dose for therapy remains intuitive in current practice. A wide range of implantation cell numbers has been found in the literature, ranging from a few thousand to 10 billion, with the majority falling between 1 and 100 million [5]. Besides the implantation cell number, these cells were also transplanted at a vast range of concentrations in different animal studies and clinical trials, from a thousand to a billion cells per millilitre of the delivery agent [5]. Conflicting results on the cell dose-dependent influence on cartilage repair have been reported. On one hand, higher cell number and concentration have been associated with better chondrogenesis and cartilage repair [57-62]. The high cell density likely recapitulated the mesenchymal condensation process that occurs during embryonic development of cartilage, and promoted MSC differentiation towards the chondrogenic lineage [63]. On the other hand, native cartilage is an ECM-rich avascular tissue with low cell density. Studies have pointed out limits to cell saturation and survival [64], and high-dose MSC transplantation was likely to increase the risk of synovitis and synovial proliferation [57, 65]. In this study, we untangle the long-lasting controversy through a machine learning approach, and recommend an optimal dose of 17-25 million MSCs for human therapy. This conclusion is partly supported by a dose-dependent MSC Phase II clinical trial [48] to treat osteoarthritis patients, which was unseen by the machine learning model, where MSC doses larger than 25 million resulted in a decline in the patients’ cartilage repair scores.
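The recommended dose lies within the 1 to 100 million range that covers the majority of implantation cell numbers reported in the literature [5], for which no single standard dose has been established.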

Tissue source

The tissue sources of MSCs, bone marrow (BM) and adipose tissue (AD), have been combined to form a new property in our model. These two sources of MSCs are the most widely used and studied, mainly because of the high accessibility of BM and AD. The abundant MSC numbers obtainable from BM and AD also mean that these cells have greater potential to be produced at large scale for allogenic uses. BM and AD MSCs occurred frequently in our database, and the machine learning results suggested that both BM and AD MSCs are beneficial to the treatment. However, more studies are needed to reach a conclusion on the effects of other tissue sources, including synovial fluid (SF), Wharton’s jelly (WJ), synovial tissue (ST), and umbilical cord blood (UCB). Their individual performances were tentatively analyzed and displayed in Fig 3 based on the current database.

Type of cartilage damage

The least important property in this machine learning model is the type of cartilage damage. Although fundamental differences exist in the causes and pathologies of osteochondral defect and osteoarthritis, the mechanisms of cartilage repair through MSC therapy in both cases may share many commonalities, such as differentiation of the MSCs into chondrocytes at the damage site, secretion of regenerative factors, and immune regulation.

Discussion

In this study, we have developed a neural network model that exploits property-property correlations to predict the therapeutic efficacy of MSC transplantation for cartilage repair based on animal results and human clinical trials. We started with cartilage injury models where different MSCs were given and measures of their performance were recorded. We characterized the cartilage repair score and filled in the missing information using the neural network while training the model. The assessment of a new patient would provide input information for the model to make predictions on human clinical trial outcomes and the recommended properties. Clinicians would be given the uncertainty in the prediction along with the confidence level to decide the most suitable therapy for treatment. We reported an optimal implantation cell number of 17-25 million to treat patients with cartilage damage, and quantitatively demonstrated how the key factors, including the number of cells implanted, defect area, and depth, could impact the post-transplantation healing. In particular, the neural network has the ability to systematically estimate the confidence level of each prediction, so that decisions can be based on reliable results and trials can be expedited. The predictive power of our model enables personalized therapy. We predicted the optimal therapeutic outcome based on the individual patient’s disease conditions, including defect area percentage, defect depth percentage, and body weight. For patients with severe cartilage damage beyond the threshold for effective repair, other treatment strategies should be considered. Together, the predictions from our model would serve as important references for clinicians and scientists to design better MSC therapy strategies for cartilage repair, and their findings can be used to further optimize the model. The technology can also be adapted for MSC therapies for other medical indications, and to address other biomedical questions.
The data and code are openly available at https://doi.org/10.17863/CAM.52036.
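
The idea of attaching an uncertainty to each prediction by training several models and measuring their disagreement can be sketched as follows. This is a simplified stand-in under stated assumptions, not the released code: bootstrap-resampled straight-line fits replace the separately trained neural networks, and the data, the noise level, and the function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Slope and intercept of the least-squares straight line through (xs, ys)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def train_ensemble(xs, ys, n_members=50, seed=0):
    """Fit one model per bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    while len(members) < n_members:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        if len(set(bx)) < 2:  # skip degenerate resamples with a single x value
            continue
        members.append(fit_line(bx, [ys[i] for i in idx]))
    return members


def predict_with_uncertainty(members, x):
    """Ensemble mean as the prediction, ensemble spread as its uncertainty."""
    preds = [slope * x + intercept for slope, intercept in members]
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic training data (assumption): inputs only cover the range 0 to 19.
rng = random.Random(1)
xs = [float(x) for x in range(20)]
ys = [1.0 - 0.02 * x + rng.gauss(0.0, 0.05) for x in xs]

members = train_ensemble(xs, ys)
inside = predict_with_uncertainty(members, 10.0)   # within the training range
outside = predict_with_uncertainty(members, 80.0)  # far outside it
print(f"x=10: {inside[0]:.2f} +/- {inside[1]:.3f}")
print(f"x=80: {outside[0]:.2f} +/- {outside[1]:.3f}")
```

The spread grows as the query moves away from the training data, which is the behaviour one would want a clinician to see before relying on an extrapolated prediction.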

We provide additional details, including the algorithm to calculate uncertainties and figures that validate the hyperparameters for our machine learning method.

(PDF)

13 Mar 2020 Dear Mr Liu, Thank you very much for submitting your manuscript "Machine learning mesenchymal stem cell efficacy for cartilage repair" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission.
We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Qing Nie Associate Editor PLOS Computational Biology Arne Elofsson Deputy Editor PLOS Computational Biology *********************** Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: In this manuscript (m.s.), the authors proposed a neural network model to predict the efficacy of MSC therapies in cartilage repair. The training data was collected from 36 published articles on PubMed. Compared with standard neural network approach, the authors claimed the novelties of their methods lied in the ability to 1) impute missing data and 2) quantify the uncertainty in prediction. The model identified several significant factors that affect cartilage repair, and suggested the optimal selection of MSC in treatment. The m.s. also speculated that the method can be applied to other clinical studies. Overall, the m.s. addresses the important biological problem of cartilage repair prediction and contributes an associated machine learning method with certain new features. However, to improve the soundness of their conclusions and enhance readability for the PLoS CB audience, a few technical issues should be stressed as follows prior to the recommendation of publication. Major Points 1. On the presentation of methods. It leaves me the impression that some necessary technical details are omitted in the current m.s., which should be included in the Methods part, or provided as Supplementary Information. 1) The authors claimed uncertainty-quantification and missing data imputation are the unique features of their proposed method. 
Unfortunately, I am not able to check such claim after reading the Methods part of the m.s., since they are currently presented in a highly abstract and rough way, and the details are not provided even in the Supplementary. For example, in line 103-109, to describe the uncertainty-quantification procedure, they wrote, “Several separate networks were trained on the data with different weights, and their variance was a measure of uncertainty in the predictions which accounts for both experimental uncertainty in the underlying data and the uncertainty in the extrapolation of the training data [49,50]. This concept is similar when estimating the uncertainty in ensemble models, with the underlying model being changed to neural networks and the uncertainty estimates generated accurately represent the observed errors in the prediction.” How can different network be obtained by solving a simple minimization problem? How are the weights determined? How many networks are actually trained in author’s data and what about the robustness? Can certain equations/mathematical notations be used to replace such narrative sentences? in line 103-109, when describing the imputation method, they wrote “We treated all properties as both inputs and outputs of the model and adapt an expectation-maximization algorithm to exploit the relationship across the properties using an iterative approach. ” What is the objective function of EM algorithm in current setting? Does the iteration finally converge in real-data implementation? Can the iteration procedure be explicitly written in pseudo-code or mathematical expressions? Therefore, I recommend the authors to revise the writing of methods part and associated SI by adding more details about implementation procedures, making it possible for readers to directly inspect the rationale/plausibility of their methods from the methodology perspective. 2) Implementation details of neural network & reproducibility issue The m.s. 
reported the R-square regarding the neural network performance, while seems to neglect reporting other key parameters. I suggest the authors to disclose other parameters and perhaps discuss robustness in Supplementary material (such as learning rate, robustness for hidden layer number and layer width, robustness for activation function (ReLU v.s.tanh) and trained weights), which is generally vital to performance of machine learning techniques. I also think it would be very helpful to make the training codes publicly available for reproducible check and further benchmarking/ application, which is quite common for machine learning algorithms. 2. Validation of the imputation function As mentioned above, the imputation function was alleged to be the highlight of current m.s. For validation of the effectiveness, I recommend the authors to conduct further computational experiment regarding imputation values. For instance, the authors may treat some known values as missing data in the cartilage data, and then evaluate the performance. I notice similar validation has been done in material data by some of the authors in reference [10], which can also be done here to strengthen the claim of current m.s.. 3. Comparison with other state-of-art machine learning methods. The sample size of cartilage repair data used in this m.s. seems not to be overwhelmingly large, therefore neural network method might not out-perform other classical machine learning approaches by default. Since the m.s. used the word ‘Machine Learning’ instead of ‘Neural Network’ in the article title, I suggest the authors to evaluate the performance of other powerful machine learning methods for continuous variables (e.g. random regression forests) to predict the outcome and compare the results (for instance, R-square in testing dataset) with neural network approach, hence highlighting neural network as the desired machine learning method and providing a baseline R-square for comparison. Minor Points 1. 
On Line 40, it stated that ‘We adapt a deep learning method ’, which seems inaccurate. In fact, the authors only used very simple single-layer neural network in the current m.s., far from the widely perceived ‘deep learning’ that has multi-layer structure and huge amount of parameters. 2. On line 91, the expression might be f(x)=y instead of f(x)=x. Reviewer #2: Liu et al. present a neural network model to predict cartilage healing via MSC therapy. The model is trained on public data and its analysis and results are of interest both from methodological and biological perspective. There are a number of serious concerns I have regarding the detail given in the paper and the presentation of results that are important to be addressed. These are listed below. Major comments 1. Methods: not enough detail, need to provide more details (some could be in supplement), e.g. why was this network topology chosen, number of hidden nodes. “adapt an EM algorithm” - details of this? 2. Methods validation: It seems that this network topology successfully handles missing data by the leave-one-out training strategy. Some discussion of this strategy is needed. I.e. since the input data set is relatively small, would other imputation methods by (e.g.) matrix factorization, followed by a single hidden layer rather than several separate networks, work equally well or better? How does performance of this model compare to alternatives that may not handle missing data explicitly but could do so implicitly (e.g. a variational autoencoder)? 3. Lack of reproducibility: in addition to the lack of written details on methodology, no source code nor any of the data used to train the model is provided. This is recognized as a growing concern in ML for biomedical studies (see https://arxiv.org/abs/2003.00898). For this study to be helpful to others, and to follow PLOS guidelines, working code should be provided along with documentation such that it can be followed by others. 
Data used as input could also be summarized in a supplement. 4. Seven features are reported as provided best predictions, but the difference between 7 and 4 features seems to be very small. What is the R^2 for 4 features? Along with its variance via cross-validation? Are 4 predictors worse than 7? 5. How closely correlated are the defect depth and defect area? And what possible effects may this have on predictions? 6. More explanation of Fig. 4 is needed. It is not at all clear what is being presented here? RMSE zero without imputation and increases with imputation? 7. In Fig. 6, sharp drop offs are observed for both parameters. Is this reflecting a real biological prediction or is it due to lack of enough data to “fill-in” - I.e. make more continuous - the landscape in fig 6? Minor comments - Fig. 5: 3D pie charts are not good ways of representing data as they skew the actual pie chart! (“closer” slices look bigger) Please choose alternative method for visualizing. 2D pie charts are ok. - Fig. 8 roadmap - this figure does not seem pertinent to the work done in the paper. Does not contain info relevant to the work done. Update or remove. - Please edit thoroughly and carefully for English language errors, eg plural of “mechanism” incorrect in multiple places, - Line 267 typo “we now studying” ********** Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information. Reviewer #1: No: The numerical data that underlies graphs or summary statistics has not been provided in spreadsheet form. I also have not found the specific training data link at https://www.openaccess.cam.ac.uk as provided by the author. 
Reviewer #2: No: Summary of the data mined from literature and details of how this was done is not provided. ********** PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us. Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5. Reproducibility: To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. 10 May 2020 Submitted filename: Response letter.pdf
1 Jun 2020 Dear Mr Liu, Thank you very much for submitting your manuscript "Machine learning mesenchymal stem cell efficacy for cartilage repair" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Qing Nie Associate Editor PLOS Computational Biology Arne Elofsson Deputy Editor PLOS Computational Biology *********************** A link appears below if there are any accompanying review attachments. 
If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK] Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: In the resubmitted m.s., Liu et al. improved the presentation of method details -- especially on the data imputation procedure, and included benchmarking results with other machine-learning algorithms. Overall, I appreciate the efforts by the authors to address the major concerns raised in my last review. However, prior to the final recommendation of publication, I suggest some confusions/ambiguities to be clarified in the current revised manuscript. 1. Confusions about the input/output of network In the clean version of m.s. line 69-71, it is said that the input properties are the various factors, and output properties are outcomes measured by scores. However, in line 134, it states all properties are both inputs and outputs of the models. I suggest the authors to be more specific and explicit here, eliminating the inconsistency, since general readers are very likely to have the same confusion with me. I recommend to clarify the exact meanings of the variables in Figure 1, in the concrete text of cartilage repair data – for example, the vector x simply represents various factors in data, or both factors and outcomes? And the authors may explain in detail how to predict outcomes when using only the factors given a new test data? Also, when computing R-square, what variables are used? 2. More details about variable selections in benchmarking When comparing with other methods, the m.s. seems not state how the features are selected optimally for linear regression or KNN. From Methods and Figure 3, we know variable selection is important to improve the R-square of authors’ model. For sake of fair benchmarking, I suggest the authors may also tune the parameters or especially do feature selections for other machine learning algorithms. 
Also, the RF algorithm can run with incomplete data using some naïve imputation strategies, instead of author’s approach, which might also be incorporated into comparison. These details should be clearly recorded in the manuscript or SI to improve the reproducibility and soundness of the paper. 3. Accessibility of data/codes By the time of submitting the review report, I still cannot get access to the data and codes using provided webpage or placeholder doi. It shows the doi not found, and I cannot find where to enter the ID on https://www.openaccess.cam.ac.uk/. I will be grateful if the authors can provide me more instructions to find the data, or the direct link to the database. Reviewer #2: In this revision, the authors have greatly clarified several points and the manuscript is much improved as a result. Below are a couple concerns that I think still need to be addressed. 1. Fig 5A is not at all clear, much more detail needed, e.g. “Values” and “Data points” and not meaningful axis labels. Is the point that the data points are ordered by uncertainty? Perhaps mark specifically where each point in 5A comes from in 5B? It is also not clear what the relationship is between the uncertainty in estimates and the error (distance between true value and predicted) and what the significance of this is? 2. “The data and codes are available at www.openaccess.cam.ac.uk (ID: 6CB38B01-7F11-4041-A4F6-74F863C31946 and placeholder DOI link: https://doi.org/10.17863/CAM.52036)” Neither the doi link nor the ID in openaccess.cam database work for me. Thus there is still no available code associated with this manuscript. ********** Have all data underlying the figures and results presented in the manuscript been provided? 
Reviewer #1: No: By the time of submitting the review report, I still cannot get access to the data and codes using provided webpage or placeholder doi. It shows the doi not found, and I cannot find where to enter the ID on https://www.openaccess.cam.ac.uk/. I will be grateful if the authors can provide me more instructions to find the data, or the direct link to the database. Reviewer #2: No: authors say that a database is provided, but the link doesn't work ********** Do you want your identity to be public for this peer review? Reviewer #1: No Reviewer #2: No 9 Jun 2020 Submitted filename: 2nd_response_letter.pdf 9 Jul 2020 Dear Mr Liu, Thank you very much for submitting your manuscript "Machine learning to predict mesenchymal stem cell efficacy for cartilage repair" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations. Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. 
If eligible, we will contact you to opt in or out [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Qing Nie Associate Editor PLOS Computational Biology Arne Elofsson Deputy Editor PLOS Computational Biology *********************** A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK] Reviewer's Responses to Questions Comments to the Authors: Please note here if the review is uploaded as an attachment. Reviewer #1: In this revision, the authors have addressed most of my concerns regarding the scientific aspects for the manuscript, except that they do not perform the comparison with random forest + naïve imputation strategy, leaving it as a future work. I appreciate their efforts, and will not insist them to perform such specific task, as long as they make the codes publicly available and reproducible for readers to do the benchmarking themselves. However, after reviewing the submitted data and codes (I was not able to access them until this round of revision), I find that the reproducibility issue should be brought into attention before the recommendation of acceptation. In the files provided by authors, I can only find a simple python file defining the class of Neural Network, which is quite routine and not novel. I cannot figure out how the data imputation (EM algorithm), claimed as the major contribution and novelty in current work, is achieved in the code. 
Nor do the authors provide script to reproduce their key findings in the manuscript using the cartilage repair database. Note it is the policy of the PLoS CB that “authors must clearly provide detail, data, and software to ensure readers' ability to reproduce the models, methods, and results”. The reproducibility issue is especially important in the field of machine learning.

I therefore strongly recommend the authors to at least provide 1) the script to reproduce their key results in the manuscript using their dataset, 2) brief tutorials or documentations about their defined functions or class in a user-friendly way, making it convenient for the interested readers to directly implement the algorithm (especially data imputation and uncertainty quantification) in other datasets and do their desired benchmarking to validate the algorithms.

Overall, I do think that open, reproducible scripts and clear documentations about the proposed algorithms are necessary before this manuscript is accepted.

Reviewer #2: The reviewers have addressed all of my concerns.

**********

Have all data underlying the figures and results presented in the manuscript been provided? Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: No: datasets--yes, numerical data that underlies graphs in spreadsheet-- no, reproducible codes -- not convinced

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review?
For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see

17 Aug 2020

Submitted filename: 3rd_response_letter.pdf

20 Aug 2020

Dear Mr Liu,

We are pleased to inform you that your manuscript 'Machine learning to predict mesenchymal stem cell efficacy for cartilage repair' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.
Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Qing Nie
Associate Editor
PLOS Computational Biology

Arne Elofsson
Deputy Editor
PLOS Computational Biology

***********************************************************

30 Sep 2020

PCOMPBIOL-D-20-00130R3

Machine learning to predict mesenchymal stem cell efficacy for cartilage repair

Dear Dr Liu,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.
Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Laura Mallard
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

