The importance of scoring recognition fitness in spheroid morphological analysis for robust label-free quality evaluation.

Kazuhide Shirai¹,², Hirohito Kato¹, Yuta Imai¹, Mayu Shibuta¹, Kei Kanie¹, Ryuji Kato¹,³.

Abstract

Because of the growing demand for human cell spheroids as functional cellular components for both drug development and regenerative therapy, technologies to evaluate their quality non-invasively have emerged. Image-based morphology analysis of spheroids enables high-throughput screening of their quality. However, because spheroids are three-dimensional, their images can have poor contrast at the surface, and spheroid recognition by image processing therefore depends heavily on the person who designs the filter-set to fit their own definition of the spheroid outline. As a result, the reproducibility of morphology measurement is critically affected by the performance of the filter-set, and its fluctuation can disrupt the subsequent morphology-based analysis. Although unexpected failures caused by inconsistent image processing results are a critical issue when analyzing large image datasets for quality screening, the problem has rarely been tackled. To achieve robust analysis performance using morphological features, we investigated the influence of filter-set reproducibility on various types of spheroid data. We propose a new scoring index, the "recognition fitness deviation (RFD)," as a measure to quantitatively and comprehensively evaluate how reproducibly a designed filter-set works across data variations, such as variations among replicate samples, among time-course samples, and among different types of cells (a total of six normal or cancer cell types). Our results show that RFD scoring from 5000 images can automatically rank the most robust filter-set, which yields the best six-cell-type classification model (94% accuracy). Moreover, the RFD score reflected the difference between the worst and the best classification models for morphologically similar spheroids (60% and 89% accuracy, respectively). In addition to RFD scoring, we found that using the time course of morphological features can compensate for fluctuations in spheroid recognition, leading to robust morphological analysis.

Keywords: Cell manufacturing; Label-free quality evaluation; Object recognition; Spheroid; Spheroid morphology

Year: 2020 PMID: 32435672 PMCID: PMC7229423 DOI: 10.1016/j.reth.2020.02.004

Source DB: PubMed Journal: Regen Ther ISSN: 2352-3204 Impact factor: 3.419

Introduction

Spheroids, cellular aggregates cultured three-dimensionally in vitro, have been shown to mimic in vivo biological functions more closely than two-dimensionally cultured cells [1–4]. Therefore, their importance in drug development research has grown. To understand physiological responses when testing pharmaceutical efficacy and safety, human cell-derived spheroids have been studied as a new in vitro drug screening platform that can replace animal models. Cancer spheroids [5–7], liver spheroids [8], and heart spheroids [9–11] represent some of the cutting-edge cell applications in development. Moreover, based on recent advances in stem cell engineering, stem cell-derived spheroids are expected to be applied clinically [12,13]. In tissue engineering applications, spheroids are used as building blocks to build larger-scale tissues or organs [11,14]. One of the most advantageous features of spheroids is the balance between their biological complexity and their scalability. From the aspect of screening, spheroids are highly compatible with high-throughput screening technologies, such as multi-well plate assay systems or high content analysis platforms [15]. From the aspect of manufacturing, spheroids can enable the highest efficiency in large-scale cell source processing, up to the scale of 10^10 cells, such as in induced pluripotent stem cell manufacturing [16] or mesenchymal stem cell-derived implantable tissues [12]. Despite the growing expectations for such spheroid applications, the technology to control the quality of spheroid production is still limited. Although it is essential to prepare massive numbers of spheroids with controlled quality for any application, a spheroid evaluation technology that can balance three important criteria (“efficiency,” “resolution,” and “non-invasiveness”) is still lacking. For spheroid evaluation, conventional biochemical assay techniques can feasibly expand their evaluation throughput.
However, their evaluation per spheroid is limited to measuring the average value over all the cells comprising a spheroid, which makes it difficult to discriminate delicate differences. Conventional molecular biology techniques, such as sequencing or quantitative PCR analysis, can sensitively measure such differences, although they are still costly for high-throughput screening. High content imaging has great potential for obtaining single-cell or intracellular organelle level evaluation data per spheroid. However, imaging resolution commonly correlates negatively with throughput. Moreover, most fluorescent-staining-based techniques are limited to end-point assays; therefore, evaluated spheroids cannot be used in downstream applications. Non-invasive cell evaluation technologies, such as measurements of oxygen gradients [17] and optical coherence tomography [18], have been introduced to evaluate in vitro three-dimensionally cultured cells, including spheroids. Among these, label-free microscopic image-based analysis is one of the technologies that can balance “efficiency,” “resolution,” and “non-invasiveness.” Our group has applied label-free image-based morphology analysis to enable quantitative, high-throughput, and non-invasive profiling of cells [19,20], colonies [21,22], and cell aggregates [23]. Marklein et al. reported high content imaging of early morphological signatures of human mesenchymal stem cells [24]. Oja et al. have also reported image-based analysis to detect aging in clinical-grade mesenchymal stromal cell cultures [25]. Maddah et al. have reported the application of a system for automated morphology-based evaluation of induced pluripotent stem cell cultures [26]. Although such label-free image-based analysis works have been growing in number, studies discussing the robustness of their analysis performance are still scarce.
For better image-based analysis, especially for cell manufacturing applications, it is crucial to investigate the robustness of image-based quality evaluations, balancing accuracy and reproducibility. In this work, we sought to develop a concept that maximizes the “reproducibility” of label-free morphology-based analysis of spheroids. Generally, the workflow of conventional image-based cell evaluation consists of three steps: recognition, measurement, and analysis (Fig. 1, Supplementary information Fig. S1, S2). The very first step, target recognition, is the image processing step, which tries to recognize the “spheroid area” by a combination of image processing filters for further measurement. Although it has a critical impact on all subsequent processes, the recognition of the “whole spheroid” has been a very subjective process rather than an evidence-based one. One of the biggest reasons for this subjectivity is that three-dimensional spheroids show good contrast in their main body, whereas their outer surface region, with its loose aggregates, shows poor contrast (Supplementary information Fig. S3). Because of this ambiguous contrast, the definition of the outline of the whole spheroid can vary significantly between the operators who design the filter-set. In other words, spheroid images contain an “uncertain area” (Supplementary information Fig. S3A) at the outer region of the spheroid, which reflects spheroid quality; however, its recognition level is highly dependent on operators' decisions. Since operators commonly design their filter-sets for label-free images using only a limited number of representative images, and evaluate their performance only by subjective impression, unexpected variation in the “uncertain area” can cause the recognition process to fail and disturb the subsequent analysis (Supplementary information Fig. S3B).
For example, even if a filter-set was designed to recognize spheroids “sharply” within a first small dataset, its recognition can be “loose” in a second dataset because of a new type of “uncertain area” in the new data.

Fig. 1

Schematic illustration of this study. (Left: Orange column) The workflow in the illustration indicates the conventional image analysis scheme for morphology-based analysis. From the original image, the objective target in the image (spheroids in this study) is recognized by image processing (step 1: Recognition). The recognized area (colored in green) is commonly designed to cover the total spheroid area including its outer borders. Then, from the recognized area, morphological features are measured (step 2: Measurement). Using morphological features as multiple descriptors of the objective target, further analysis (step 3: Analysis) is conducted. (Center column) The uncertain area, the gap between the “recognized area” and the “certain area,” is quantified as the uncertain area ratio in the image. Because annotation of the true spheroid area is difficult in label-free images, we defined the uncertain area by calculating the “area with low-intensity SD” to measure the recognition fitness in each image. Practically, within the recognition area (green), the uncertain area (light blue) is flagged, and its total ratio is scored as an “uncertain area ratio.” It should be noted that the “certain area” only defines the region of the main body of the spheroid, and the outer border of the whole spheroid depends on the recipe. Such certain area ratios are called “recognition fitness” in our study. (Right: Grey column) At present, the fitness and performance of a recipe are highly dependent on operators. In this study, we evaluated such recognition fitness with a more objective scoring criterion, the recognition fitness deviation (RFD). This concept proposes evaluating a recipe by summarizing the uncertain area ratios of all images, namely by the SD of the recipe's fitness across data variations. By summarizing the SDs of uncertain area ratios with three types of scores, a radar chart can be drawn to show the robustness of the recipe. The smaller the RFD (the area of the radar chart), the more robust the recipe is.

To solve this basic issue in morphological analysis, we here investigated the influence of non-robust recognition filter-sets and propose the “recognition fitness deviation (RFD)” as a new scoring index to objectively rank the most robust recognition filter-set, which leads to the best analysis performance. To investigate this concept, we compared three types of spheroid recognition filter-sets (designated as recipes) and investigated their effects on cell type classification performance using only label-free images. For this model, phase-contrast microscopic images of spheroids, including cancer cells (A-498, A549, NCI–H23, U-251) and healthy cells (HASMC, NHDF), covering different or similar morphological features, were analyzed. Our RFD scoring, which is designed to reflect the deviation of recognition fitness across different replicate samples, time points, and cell types, was shown to quantitatively indicate the most robust spheroid recognition recipe, which leads to the best cell type classification model using only spheroid morphology. Moreover, our investigation indicated that using time-course morphological features could compensate for the fluctuations of designed recipes and improve the analysis performance in combination with RFD evaluation.

Methods

Cell culture

Healthy human dermal fibroblast cells (NHDF (Lot No. 01439)), human aortic smooth muscle cells (HASMC (Lot No. 01293)), human adenocarcinoma cells derived from lung cancer (A549 (Lot No. 60150896)), human lung adenocarcinoma cells (NCI–H23 (Lot No. 58078626)), human renal cancer cells (A-498 (Lot No. 58033335)), and human astrocytoma cells (U-251 (Lot No. unidentified)) were used. Cell culture was performed using an appropriate medium according to the culture protocol described in the product information sheet (American Type Culture Collection). The cells were seeded on a 10-cm dish (172958, Thermo Fisher Scientific Inc., Waltham, MA, USA) and cultured. Each medium was supplemented with 10% fetal bovine serum (Lot No. 13N059, 172012-500 ML, Nichirei Bioscience, Tokyo, Japan) and 1% penicillin and streptomycin (26253-84, Nacalai Tesque, Kyoto, Japan), and the cells were cultured at 37 °C under 5% CO2. For spheroid formation, cell suspensions were seeded in a Prime Surface 96-well plate (MS9096U, Sumitomo Bakelite, Tokyo, Japan) at a density of 1500 cells/well.

Image acquisition

Phase-contrast images (1000 × 1000 pixels) at 4× magnification were captured using the automatic cell culture observation system BioStation CT (Nikon, Tokyo, Japan) 38 times at intervals of 6 h over approximately 9 days. In this study, we selected time point 24 (= 144 h) to time point 32 (= 192 h), a total of 9 time points, as the period in which spheroids had formed stably, for the analysis. For each cell type, 24 spheroids were prepared as sample replicates, giving a total of 5328 images (24 spheroids × 37 time points × 6 cell types). Only images of successfully formed spheroids without noise were selected for analysis (19 spheroids for A-498, 21 for A549, 18 for NHDF, 17 for HASMC, 23 for NCI–H23, and 19 for U-251 (total 117)).
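The image counts quoted above can be checked with a few lines of arithmetic. This is only a bookkeeping sketch: the variable names are illustrative, the mapping "time point n = n × 6 h" is taken from the two examples given in the text, and the number of time points is set to 37 because that is the value used in the stated product (the text also mentions 38 captures).

```python
# Bookkeeping sketch for the acquisition design (names are illustrative).
SPHEROIDS_PER_CELL_TYPE = 24
TIME_POINTS = 37          # value used in the stated product
CELL_TYPES = 6
INTERVAL_H = 6            # hours between captures

total_images = SPHEROIDS_PER_CELL_TYPE * TIME_POINTS * CELL_TYPES

# Analysis window: time points 24 to 32, inclusive.
selected_time_points = list(range(24, 33))
selected_hours = [tp * INTERVAL_H for tp in selected_time_points]

# Spheroids kept per cell type after excluding failed or noisy ones.
usable_spheroids = {
    "A-498": 19, "A549": 21, "NHDF": 18,
    "HASMC": 17, "NCI-H23": 23, "U-251": 19,
}
usable_total = sum(usable_spheroids.values())

print(total_images, len(selected_time_points), usable_total)
```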

Image processing

CL-Quant software version 3.20 (Nikon, Tokyo, Japan) was used to design filter-sets for recognizing spheroids in images and to measure the recognized area. Combinations of multiple image processing filters (filter-sets) were called “recipes.” Three recipes designed with different concepts were compared. Recipe A used the soft-matching function in CL-Quant, an automated machine-learning algorithm based on user-selected areas. Recipe B consisted of: (1) normalization of the background; (2) soft-matching; (3) removal of objects; (4) filling holes. Recipe C included: (1) normalization of the background; (2) thresholding of the intensity; (3) opening and closing to fill holes; (4) flattening of the background; (5) thresholding of the intensity; (6) opening and closing to fill holes; (7) merging the recognition areas from steps 3 and 6; (8) filling holes. The three recipes were designed by three different operators. Recipe A was designed using 10–20 images of NHDF only and was intended to capture the spheroid surface area information. Recipe B was modified from recipe A and optimized to fit sharply to images of cancer spheroids (from cancer cells different from those used in this study); this work was the first time it was applied to the six cell types. Recipe C was designed with 10–20 images of all six cell types used in this work; however, it was intended to emphasize subtle differences in the spheroid surface areas, which tend to become characteristically loose for some cell types. From the area recognized by each recipe, a total of 11 morphological features (area, compactness, correlation mean, energy mean, entropy mean, homogeneity mean, inertia mean, length: width ratio (ratio of length/width), perimeter, shape factor, and std dev (standard deviation) of intensity) were analyzed (Supplementary information Table S1). The feature calculation details are described in previous works [19–23].
Each feature was normalized with standard normalization for further analysis.
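As a minimal sketch of the normalization step, the code below assumes that "standard normalization" means z-scoring each feature (subtract the mean, divide by the standard deviation). The use of the population SD and the function names `standardize` and `standardize_table` are assumptions made for illustration, not details from the source.

```python
from statistics import mean, pstdev

def standardize(values):
    """Z-score one feature column; a constant column maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def standardize_table(rows):
    """rows: one feature vector per spheroid. Each column is z-scored."""
    columns = [standardize(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

# Toy example: two spheroids, two features (values are invented).
table = standardize_table([[1.0, 10.0], [3.0, 30.0]])
print(table)
```

Standardizing per feature keeps features with large numeric ranges (such as area) from dominating Euclidean distances in later clustering.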

Morphological analysis for comparing the performances of the recipes

To compare the effects of the morphological features extracted with the different recipes, the same set of morphological features was analyzed using hierarchical clustering based on Euclidean distance. To further compare the differences between recipes, ridge regression was performed to classify the six cell types with leave-one-out cross-validation. In the clustering and classification, the effect of morphological features was compared using only time point 3 or all time points. All analyses were performed using R software (version 3.2).
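The classification step can be sketched as follows. The source used R and does not describe how ridge regression was turned into a classifier, so this Python sketch makes its own assumptions: ridge-penalized least squares on one-hot class targets, prediction by the largest score, a penalty that also applies to the intercept (for brevity), and an arbitrary penalty value. All function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit_ridge(x, labels, n_classes, lam):
    """Weights W = (X'X + lam*I)^-1 X'Y with one-hot targets Y."""
    xb = [[1.0] + list(row) for row in x]  # prepend an intercept column
    d = len(xb[0])
    xtx = [[sum(r[i] * r[j] for r in xb) + (lam if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    xty = [[sum(r[i] for r, y in zip(xb, labels) if y == c)
            for c in range(n_classes)] for i in range(d)]
    return solve(xtx, xty)

def predict(w, row):
    """Return the class whose regression score is largest."""
    xb = [1.0] + list(row)
    scores = [sum(xb[i] * w[i][c] for i in range(len(xb)))
              for c in range(len(w[0]))]
    return max(range(len(scores)), key=scores.__getitem__)

def loocv_accuracy(x, labels, n_classes, lam=0.1):
    """Leave-one-out: train on all samples but one, test on the held-out one."""
    hits = 0
    for i in range(len(x)):
        w = fit_ridge(x[:i] + x[i + 1:], labels[:i] + labels[i + 1:],
                      n_classes, lam)
        hits += int(predict(w, x[i]) == labels[i])
    return hits / len(x)

# Toy data: two well-separated groups in a 2-feature space (invented values).
features = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
classes = [0, 0, 0, 0, 1, 1, 1, 1]
acc = loocv_accuracy(features, classes, n_classes=2)
print(acc)
```

Leave-one-out cross-validation suits small sample sizes such as the 117 spheroids here, because every spheroid is used for testing exactly once while the model is trained on all the others.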

Measurement of recognition fitness deviations

To quantitatively evaluate the spheroid recognition fitness in all images, the recognition fitness deviation (RFD) was designed to score the robustness of image processing recipes (Fig. 1). First, for RFD calculations, the “uncertain area ratio” in each image was calculated. In each image, the uncertain area ratio was defined as the ratio of the “uncertain area” (Supplementary information Fig. S1A) within the area recognized by the recipe. The “uncertain area” is the area with poor contrast, where the placement of the outline can therefore vary between operators. The “certain area” is the spheroid main body area, where contrast is clearer and operators tend to recognize the spheroid easily. In our study, we defined the “uncertain area” as a low-intensity value (<55) which repeats within a 5-pixel horizontal window, since in the raw image, the “certain area” of the spheroid commonly shows a high intensity standard deviation (SD), and the rest of the image field shows faint intensity differences (Fig. 1A). It is important to note that the “high-intensity SD area,” which we call the “certain area,” probably includes the true spheroid main body but is not the whole spheroid. Recognizing only this area loses the characteristics of spheroids. Using the quantitative definition of the uncertain area ratio, we could automatically compare, across all images and with the same quantitative criterion, the area where operators differ in their implementation. After the uncertain area ratios were calculated for all images using the different recipes, their SD values were calculated within different sample replicates (score 1), different time points (score 2), and different cell types (score 3). The sum of scores 1 to 3 reflects the deviation of each recipe's recognition performance. If the deviation is high, the recipe is not working reproducibly for some samples. Therefore, we designated this sum as the RFD. As an illustration, the sum of score 1 and score 2 is plotted on each axis of the hexagon for each cell type, and the total area of the radar chart reflects score 3 (Fig. 1).
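The uncertain area ratio of a single image might be computed along the following lines. The wording of the definition leaves room for interpretation, so this sketch adopts one reading only: a recognized pixel counts as "uncertain" when the SD of intensities in a 5-pixel horizontal window centered on it is below the threshold of 55. Applying the threshold to the window SD, truncating the window at image borders, and the function name `uncertain_area_ratio` are all assumptions, not the authors' implementation.

```python
from statistics import pstdev

def uncertain_area_ratio(image, mask, window=5, sd_threshold=55.0):
    """Fraction of recognized pixels lying in low-contrast neighborhoods.

    image: 2D list of pixel intensities.
    mask:  2D list of booleans, True where the recipe recognized the spheroid.
    """
    half = window // 2
    recognized = 0
    uncertain = 0
    for row, mask_row in zip(image, mask):
        width = len(row)
        for x in range(width):
            if not mask_row[x]:
                continue
            recognized += 1
            # Horizontal window around x, truncated at the image border.
            lo, hi = max(0, x - half), min(width, x + half + 1)
            if pstdev(row[lo:hi]) < sd_threshold:
                uncertain += 1
    return uncertain / recognized if recognized else 0.0

# Toy one-row image: a textured left part and a flat right part (invented).
toy_image = [[0, 200, 0, 200, 0, 100, 100, 100, 100, 100]]
toy_mask = [[True] * 10]
ratio = uncertain_area_ratio(toy_image, toy_mask)
print(ratio)
```

A recipe that recognizes "loosely" includes more flat, low-contrast pixels around the spheroid body, so its ratio rises; a "sharp" recipe yields a lower ratio.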

Results

Diversity of spheroid morphology and fluctuation of spheroid recognition

In this study, six types of cells were selected to mimic the varieties of spheroids (Fig. 2A). Even when the same number of cells was seeded per well, the resulting morphologies differed in their details. First, size difference is a clear morphological feature. Some spheroids (U-251 and A-498) shrink to a smaller size than other spheroids during culture. However, a comparison of sizes among the cancer cells (U-251, NCI–H23, A-498, and A549) shows that normal and cancer cells cannot be distinguished simply by size. Second, the tightness of spheroid aggregation, reflected by the intensity distribution in the spheroid area, is also a characteristic feature. Although there is a slight tendency for normal cell spheroids to appear brighter, it is difficult to classify spheroids as normal or cancer cells, or by tissue origin, on this basis. Therefore, it was clear that spheroid morphological features are more complicated than the differences in morphological characteristics seen in two-dimensionally cultured cells.

Fig. 2

Representative images of spheroids and their recognition. Using the same initial phase-contrast image (left column), three different recipes (recipe A, recipe B, and recipe C) were applied to recognize the same spheroid. The black area is the non-spheroid area defined by each recipe, and in the recognized area, the raw spheroid image (left column) is overlaid to indicate the “uncertain area ratio” visually. The rows show the morphological and recognition differences among the six cell types. White bars in U-251, NCI–H23, A-498, and A549: 75 μm; in NHDF and HASMC: 150 μm. In the bottom row, the recognized area outline (green line) visualizes the “outline of whole spheroid,” which varies greatly among the recipes. In other words, it indicates that the recipes differ conceptually in whether they fit sharply or loosely to capture the spheroid surface morphology.

To analyze spheroid varieties using image analysis, we compared three recipes (A–C; Fig. 2A). The three recipes were designed by three operators aiming for the same goal, the morphological analysis of spheroids. However, their analytical situations and concepts for designing their recipes were different. The recipes differed not only in their filter-set combinations but also in the data on which each operator focused to develop the recipe. For the design of recipe A, the operator utilized images of only one cell type (NHDF). The operator attempted to recognize the outermost surface of spheroids since NHDFs tend to aggregate loosely, and some cells float to the surface. For the design of recipe B, the operator used images of other cancer cell spheroids, which were not included in the six prepared cell types, and modified recipe A to fit that cell type. Thus, different filters were added in recipe B. For the design of recipe C, the operator utilized images of all six cell types and attempted to create, from scratch, a recipe that recognizes various types of cells. However, in this recipe design, the operator attempted to accentuate the differences between various spheroids by building a recipe that sensitively recognizes differences in spheroid surface collapse. As a result, the recognition fitness of the recipes was found to vary. However, the characteristics of these recipes were only evident when their recognition results for all cell types were paneled for visualization. For example, comparisons for single cell types, such as U-251 or A-498, did not show clear differences between recipes. In the paneled comparison, recipe A showed overall “fat” recognition, recipe B showed overall “fit” recognition, and recipe C showed fluctuating recognition, which was “invasive” or “disordered,” sensitively reflecting the spheroid surface status. It is essential to realize that the paneled comparison results shown in Fig. 2A are only a small part of the more than 5000 images, showing one time point with one image from three replicates. This result strongly indicates that a recipe evaluated with only a limited number of images does not assure robust performance and can critically disrupt further analysis when the number of images or cell types increases. In other words, the robustness of image processing must be evaluated with a quantitative index in terms of its overall performance across the varieties of data.

Evaluation of a recipe's robustness with recognition fitness deviation

To quantitatively and comprehensively evaluate the recipes, we analyzed the “recognition fitness,” which derives from the gap between the “recognized area (defined by the recipe designer's implementation)” and the “certain area (where the spheroid main body can be clearly defined),” a gap designated as the “uncertain area” (Fig. 1 and Supplementary information Fig. S1). To score such fitness objectively, we introduced an algorithm to measure the “uncertain area ratio.” By analyzing large image data covering the variation in sample replicates, time points, and cell types, we compared the reproducibility of each recipe's recognition fitness (Fig. 3A). In the plots of the uncertain area ratio, we found that even under replicate conditions (17–23 spheroids per condition), there were outliers indicating unexpected recognition results.

Fig. 3

Evaluation of recipes using recognition fitness deviation (RFD). (A) Variation of uncertain area ratios among varieties of data (varieties within sample replicates, time points, and cell types). In each graph, the X-axis shows the time point (6 h), the first Y-axis (left) shows the uncertain area ratio for each plot in color (each spheroid), and the second Y-axis (right) shows the time-course changes in SD summarizing 17–23 plots (red line). Among the six cell types evaluated, the red line pattern was similar among A-498, A549, and NCI–H23; only NCI–H23 is shown as a representative. The alphabetically indicated plots, a–h, are representative spheroid examples to indicate different uncertain area ratios in 3B. The alphabetically indicated (i-1)–(i-4) are representative cases of the different uncertain area ratios found in the same spheroid in 3C. (B) Representative images to indicate the uncertain area ratios between spheroids and their recognition areas. (a–h) indicates the alphabetically indicated plots in 3A. U-251 and NCI–H23, and NHDF and HASMC, are aligned vertically to compare the differences in recognition fitness between pairs of similar spheroid morphologies. White bars in a–d: 65 μm; in e–h: 130 μm. (C) Representative images indicating the uncertain area ratios within the time course of the same spheroid (HASMC). (i-1)–(i-4) indicates the alphabetically indicated plots in 3A. All white bars: 130 μm. (D) The fluctuation of morphological features measured from different recognition fitness within the spheroid (HASMC) captured in 3C. The X-axis indicates the time points (6 h), and the Y-axis indicates the normalized feature values (Area, Length: width ratio, and Inertia mean). (E) The RFD evaluation of the three recipes, as a summary of recognition fitness among all time points, spheroid image replicates, and cell types. Specifically, fitness among the six cell types is indicated with the hexagon axes as a radar chart. If a recipe can be robustly used for a variety of cells, time courses, and image replicates, the RFD (radar area) becomes smaller.

Moreover, outlier deviations of the uncertain area were found in the time course as well (e.g., recipe C for HASMC recognition). When such recognition performances were paneled across all cell types, it was again found that specific cell types show large deviations (e.g., NHDF and HASMC recognition by recipe C). By detailed inspection of each recipe's recognition images, the uncertain area ratio was visually confirmed to reflect unexpected failures in recognition reproducibility (Fig. 3B and C). In our data, NHDF and HASMC were the two spheroid types most difficult to recognize robustly. To further confirm the influence of such fluctuations in recipe performance, we compared their effect on the morphological features measured from the recognized areas (Fig. 3D, and Supplementary information Fig. S3). Even for the same spheroid, the morphological features were found to show significant differences when the deviations of the uncertain area ratios were high. This result indicated that if a recipe is not robust enough, the measured morphological features can contain significant noise. To visually and quantitatively evaluate the total performance of the recipes, we summarized the SDs of the uncertain area ratio within identical replicate samples (score 1), time points (score 2), and cell types (score 3) and visualized them in a radar chart (Fig. 3E). In this visualization, a small and uniform radar area indicates the robustness of a recipe. With this scoring, recipe A can be quantitatively ranked as the best of the three recipes. We further designated these summed SDs of uncertain area ratios as “recognition fitness deviations (RFD).”
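The summary described above can be sketched in code. The source does not give the exact aggregation, so the following is one plausible reading rather than the authors' procedure: for each cell type, score 1 is the SD across replicate spheroids averaged over time points, score 2 is the SD across time points averaged over spheroids, their sum is the radius on that cell type's axis, and the polygon area of the radar chart summarizes the recipe. The names `axis_value` and `radar_area` and the toy numbers are hypothetical; the polygon area formula for equally spaced axes is standard geometry.

```python
import math
from statistics import mean, pstdev

def axis_value(ratios):
    """ratios[s][t]: uncertain area ratio of spheroid s at time point t,
    all for one cell type. Returns score 1 + score 2 (assumed aggregation)."""
    n_time = len(ratios[0])
    # Score 1: spread across replicate spheroids, averaged over time points.
    score1 = mean([pstdev([sph[t] for sph in ratios]) for t in range(n_time)])
    # Score 2: spread across time points, averaged over spheroids.
    score2 = mean([pstdev(sph) for sph in ratios])
    return score1 + score2

def radar_area(radii):
    """Area of the polygon drawn on a radar chart with equally spaced axes."""
    n = len(radii)
    step = 2 * math.pi / n
    return 0.5 * math.sin(step) * sum(
        radii[i] * radii[(i + 1) % n] for i in range(n))

# Toy data: two spheroids, two time points (invented values).
toy = [[0.1, 0.3], [0.3, 0.5]]
print(axis_value(toy))        # about 0.2
print(radar_area([1.0] * 6))  # regular hexagon, about 2.598
```

With six cell types the radii form a hexagon; a recipe whose uncertain area ratios fluctuate for even one or two cell types stretches the polygon along those axes and enlarges its area.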

Effects of a recipe's recognition in further morphology-based analysis

When a recipe results in poor recognition robustness, the subsequently measured morphological features can poorly express the morphological characteristics of spheroids. As a result, the following analysis using morphological features is affected. As a model case to confirm this, we compared the analysis results (step 3 in Fig. 1) based on the three recipes' spheroid recognition and morphological measurement. First, using the morphological features from a single time point (time point 3), we compared the recipes' effect on hierarchical clustering. In this analysis, we focused on how NHDF and HASMC, the most difficult spheroids to discriminate, would be clustered. The NHDF and HASMC spheroids were mostly placed in separate clusters with recipe A, whereas both cell types were combined in the same cluster with the other two recipes (Fig. 4A). Especially with recipe C, the morphological characteristics indicated by the heatmap became faint, because “peaky” morphological values appeared in the total morphological data owing to the fluctuating recognition.

Fig. 4

Comparison of hierarchical clustering results of spheroid morphologies between recipes. (A) Comparison of recipes with morphological features at time point 3. Columns: 17–23 spheroids for each of six cell types (117 spheroids). Rows: 11 morphological features at time point 3 only. (B) Comparison of recipes with morphological features at all time points. Columns: 17–23 spheroids for each of six cell types (117 spheroids). Rows: 11 morphological features × nine time points. The heatmap indicates the normalized value for each feature (blue: low, yellow: high). The color label for each spheroid column under the clustering indicates cell types: pink, U-251; light blue, NCI–H23; yellow, A-498; green, A549; red, NHDF; blue, HASMC. The colored bars indicate the cluster of morphologically similar cell types: red, NHDF; blue, HASMC. The red horizontal line at the tree indicates Euclidean distance = 10.

We further utilized the same morphological features for the classification of the six cell types by ridge regression. Recipe A showed better performance than the other recipes in both (1) the six-cell-type classification and (2) the NHDF/HASMC classification. In particular, for the spheroids most challenging to recognize, the classification of NHDF and HASMC tended to fail with recipe B and recipe C (Fig. 5A). This result indicates that the performance of the morphology-based prediction model can be critically affected by the performance of the image processing recipe used, because the measured morphological features carried less interpretable information. Moreover, recipe A, which showed the lowest RFD score, was found to show the best performance.

Fig. 5

Comparison of the confusion matrices of cell-type classification performance using morphological features from different recipes and time points. (A) The classification performance of models using morphological features at time point 3. (B) The classification performance of models using morphological features at all time points. In each matrix, the numbers indicate the counts of spheroids classified by the model. Grey cells indicate correctly classified spheroids, and red cells indicate misclassified spheroids (≥2). On the right, the total number of misclassified spheroids is indicated.

Second, since our previous studies showed the importance of using time-course morphological features in morphology-based predictions [21], we investigated the effect of using time-course information on the recipes' performance. With hierarchical clustering, the NHDF and HASMC spheroids were clustered in different clusters with recipe A, although they tended to be mixed under closer trees with the other recipes (Fig. 4B). However, with recipe B, the mis-clustering rate was improved compared with the clustering results using only a single time point (Fig. 4A). In classification, the performance of all three recipes increased for both tasks (six-cell-type classification and NHDF/HASMC classification) compared with the classification model using only a single time point (Fig. 5). Moreover, the classification of NHDF or HASMC with recipe B and recipe C was significantly improved. Consequently, by the use of time-course morphological features, both the performance itself and the deviation of performance among recipes were found to be improved. However, it should be noted that even with such performance improvements using time-course data, recipe B and recipe C showed lower performance than recipe A, the recipe with the lowest RFD score.
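For readers who want to relate a confusion matrix to the accuracy and misclassification counts discussed here, the sketch below shows the arithmetic. The matrix entries are invented for illustration and are not results from the study; only the row totals (18 NHDF and 17 HASMC spheroids) follow the spheroid counts given in the Methods. The function names are hypothetical.

```python
def accuracy(matrix):
    """matrix[i][j]: number of class-i samples predicted as class j."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def misclassified(matrix):
    """Total count of off-diagonal (wrongly classified) samples."""
    total = sum(sum(row) for row in matrix)
    return total - sum(matrix[i][i] for i in range(len(matrix)))

# Hypothetical NHDF/HASMC matrix (rows: true class, columns: predicted class).
toy_matrix = [
    [16, 2],   # 18 NHDF spheroids, 2 predicted as HASMC (invented)
    [3, 14],   # 17 HASMC spheroids, 3 predicted as NHDF (invented)
]
print(round(accuracy(toy_matrix), 3), misclassified(toy_matrix))
```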

Discussion

In this work, to obtain robust performance in the morphological image-based evaluation of cells, we propose the RFD scoring concept as a means of objectively selecting a recipe, instead of the conventional selection based on expert experience, to enhance the reproducibility of spheroid image analysis. The reproducibility of image processing has rarely been discussed or quantitatively scored, since most image processing designs have been produced by manual trial and error until the operator was satisfied. This experience-based design has been the standard in most image processing studies in various fields. To explore evidence-based alternatives to the subjective, operator-biased image processing pipeline, we propose an objective scoring concept to evaluate image processing recipes (Supplementary information Fig. S1 and S2). Our strategy is simple. Instead of insisting on a recipe based on limited evaluation, RFD scoring enables comprehensive and automatic evaluation of a recipe's performance across all the variations in the acquired images. Faced with a great variety and volume of images, RFD scoring can perform the evaluation automatically, without requiring image-by-image manual scrutiny by the operator. Therefore, by its nature, RFD scoring can be applied to any type of spheroid image analysis, including spheroid viability prediction, spheroid metabolic potency assessment, spheroid differentiation analysis, and spheroid morphometry measurement, as the first step of image analysis to compare the custom-made recipes for each dataset. As an analogy, our RFD score for selecting a robust recipe can be interpreted as the “melting temperature value for selecting robust primers in quantitative polymerase chain reaction” (Supplementary information Fig. S1).
Our data revealed the risk and importance of evaluating the “reproducibility” of manually designed spheroid recognition filter-sets, and we proposed RFD scoring to evaluate their performance comprehensively and automatically. To the best of our knowledge, this study is the first to show that label-free morphological features can discriminate six different cell qualities (cancer versus normal, or differences among cancer cell types). The study is also the first to investigate the reproducibility of spheroid image analysis with 5000 experimentally obtained spheroid images. Our data clarified that recipes designed with different concepts show significant variations in spheroid recognition, especially when they are checked against a panel of different data variations. By objectively quantifying such recognition fitness, we found that this variability of recipe performance can occur not only between cell types, but also over their time course, and even within replicate samples. Our data therefore suggest that, when designing a robust recipe that can promise reproducible results in big image data analysis, human-dependent or self-proposed recipe evaluation carries a considerable reproducibility risk. For this reason, in this study we evaluated cell recognition performance widely throughout the data with a new quantitative scoring index, the RFD.
To investigate the performance of our RFD scoring, we compared three recipes designed using different concepts, and compared their actual performance not only in the recognition step but also in the morphology-based analysis steps. As a result, we found a good negative correlation between the proposed RFD score and analysis performance: a lower RFD indicates more reproducible recipe performance. For recipe A and recipe B (modified from recipe A), the recognition performance seemed very similar at a glance (Fig. 2A). However, when RFD was scored both in detail and overall, the score indicated higher robustness for recipe A, and recipe A gave the best performance in clustering and classification. In the morphology-based analysis, we examined the effect of using time-course data with the different recipes. As a result, in both clustering and classification, the performance of all recipes, including the lower-performing recipes (recipe B and recipe C), was improved. This suggests that time-course morphological information, which can be obtained from label-free imaging, could partially compensate for the fluctuation in spheroid recognition in less robust recipes. However, it is important to note that even with time-course data, the RFD score still identified the best-performing recipe, recipe A. In other words, it was clear that the recipe with the lower RFD score, which shows a lower rate of uncertain area recognition under any condition, works best for morphology-based analysis models.
In this study, we focused on the issue of “object recognition” in spheroid images, which is presently designed and evaluated manually by an operator. However, with the recent progress of deep learning algorithms, biological image analysis now has new approaches. There are algorithms that detect the object area with high precision by learning the object features through deep learning [27,28]. With such algorithms, the object, the spheroid in our case, can be recognized with higher recognition fitness than with the three recipes compared here. However, although other algorithms may show higher fitness on some data, our work suggests that evaluating their robustness is more important. Moreover, deep learning requires an ever larger volume of annotated spheroid images, which are difficult to obtain, to design robust recognition.
Apart from the object recognition approach, there are also algorithms that use the pixel information of the whole image, including all objects and the background, with convolutional feature extraction [29,30]. With such algorithms, there is no need to evaluate our recognition fitness, and the extracted information can be used directly for further analysis. However, even with such algorithms, our proposed concept of checking the robustness of the algorithm against a variety of data remains essential, because “automatic feature extraction ability” does not guarantee the robustness of image processing. Considering the practical application of image-based cell evaluation technology to various types of patients or lot-to-lot diversity, it is essential to establish robust image processing that provides robust measurement results for subsequent analysis. Our RFD scoring concept will free image processing design from the present human-dependent decisions and lead to a more automated cell recognition process that can be optimized as data grow. Moreover, by introducing such scoring into image processing, the robustness of the recipe can be optimized using automated machine learning algorithms (Supplementary information Fig. S5). Further studies should evaluate the effect of our RFD scoring on more varied images, including different cell types, image magnifications, and images from different microscopes. We believe our work will contribute to the mechanization and automation of image-based in-process monitoring technology in cell processing.

Declaration of Competing Interest

Ryuji Kato received collaborative research support from Nikon Corporation. The first author, Kazuhide Shirai, is an employee of Nikon Corporation and is enrolled as a PhD candidate in the Graduate School of Pharmaceutical Sciences, Nagoya University.

29 in total

1. High Content Imaging of Early Morphological Signatures Predicts Long Term Mineralization Capacity of Human Mesenchymal Stem Cells upon Osteogenic Induction.

Authors: Ross A Marklein; Jessica L Lo Surdo; Ian H Bellayr; Saniya A Godil; Raj K Puri; Steven R Bauer
Journal: Stem Cells Date: 2016-02-29 Impact factor: 6.277

Review 2. The third dimension bridges the gap between cell culture and live tissue.

Authors: Francesco Pampaloni; Emmanuel G Reynaud; Ernst H K Stelzer
Journal: Nat Rev Mol Cell Biol Date: 2007-10 Impact factor: 94.444

3. Imaging cell picker: A morphology-based automated cell separation system on a photodegradable hydrogel culture platform.

Authors: Mayu Shibuta; Masato Tamura; Kei Kanie; Masumi Yanagisawa; Hirofumi Matsui; Taku Satoh; Toshiyuki Takagi; Toshiyuki Kanamori; Shinji Sugiura; Ryuji Kato
Journal: J Biosci Bioeng Date: 2018-06-09 Impact factor: 2.894

4. Impact of the spheroid model complexity on drug response.

Authors: Oliver Ingo Hoffmann; Christian Ilmberger; Stefanie Magosch; Mareile Joka; Karl-Walter Jauch; Barbara Mayer
Journal: J Biotechnol Date: 2015-03-03 Impact factor: 3.307

5. Development and Characterization of a Scaffold-Free 3D Spheroid Model of Induced Pluripotent Stem Cell-Derived Human Cardiomyocytes.

Authors: Philippe Beauchamp; Wolfgang Moritz; Jens M Kelm; Nina D Ullrich; Irina Agarkova; Blake D Anson; Thomas M Suter; Christian Zuppinger
Journal: Tissue Eng Part C Methods Date: 2015-03-16 Impact factor: 3.056

Review 6. 3D tumor spheroids as in vitro models to mimic in vivo human solid tumors resistance to therapeutic drugs.

Authors: Ana S Nunes; Andreia S Barros; Elisabete C Costa; André F Moreira; Ilídio J Correia
Journal: Biotechnol Bioeng Date: 2018-10-27 Impact factor: 4.530

7. Classifying and segmenting microscopy images with deep multiple instance learning.

Authors: Oren Z Kraus; Jimmy Lei Ba; Brendan J Frey
Journal: Bioinformatics Date: 2016-06-15 Impact factor: 6.937

8. Morphology-Based Analysis of Myoblasts for Prediction of Myotube Formation.

Authors: Kiyoshi Ishikawa; Kei Yoshida; Kei Kanie; Kenji Omori; Ryuji Kato
Journal: SLAS Discov Date: 2018-08-13 Impact factor: 3.341

9. Characterization of primary human hepatocyte spheroids as a model system for drug-induced liver injury, liver function and disease.

Authors: Catherine C Bell; Delilah F G Hendriks; Sabrina M L Moro; Ewa Ellis; Joanne Walsh; Anna Renblom; Lisa Fredriksson Puigvert; Anita C A Dankers; Frank Jacobs; Jan Snoeys; Rowena L Sison-Young; Rosalind E Jenkins; Åsa Nordling; Souren Mkrtchian; B Kevin Park; Neil R Kitteringham; Christopher E P Goldring; Volker M Lauschke; Magnus Ingelman-Sundberg
Journal: Sci Rep Date: 2016-05-04 Impact factor: 4.379

10. Biomaterial-Free Three-Dimensional Bioprinting of Cardiac Tissue using Human Induced Pluripotent Stem Cell Derived Cardiomyocytes.

Authors: Chin Siang Ong; Takuma Fukunishi; Huaitao Zhang; Chen Yu Huang; Andrew Nashed; Adriana Blazeski; Deborah DiSilvestre; Luca Vricella; John Conte; Leslie Tung; Gordon F Tomaselli; Narutoshi Hibino
Journal: Sci Rep Date: 2017-07-04 Impact factor: 4.379

1 in total

1. Morphological heterogeneity description enabled early and parallel non-invasive prediction of T-cell proliferation inhibitory potency and growth rate for facilitating donor selection of human mesenchymal stem cells.

Authors: Yuta Imai; Kei Kanie; Ryuji Kato
Journal: Inflamm Regen Date: 2022-01-30

1 in total