Yuchen R He1,2, Shenghua He3, Mikhail E Kandel1,2, Young Jae Lee2,4, Chenfei Hu1,2, Nahil Sobh2,5, Mark A Anastasio1,2,6, Gabriel Popescu1,2,6. 1. Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States. 2. Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States. 3. Department of Computer Science & Engineering, Washington University in St. Louis, St. Louis, Missouri 63130, United States. 4. Neuroscience Program, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States. 5. NCSA Center for Artificial Intelligence Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States. 6. Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States.
Abstract
Traditional methods for cell cycle stage classification rely heavily on fluorescence microscopy to monitor nuclear dynamics. These methods inevitably face the typical phototoxicity and photobleaching limitations of fluorescence imaging. Here, we present a cell cycle detection workflow using the principle of phase imaging with computational specificity (PICS). The proposed method uses neural networks to extract cell cycle-dependent features directly from quantitative phase imaging (QPI) measurements. Our results indicate that this approach attains very good accuracy in classifying live cells into the G1, S, and G2/M stages. We also demonstrate that the proposed method can be applied to study single-cell dynamics within the cell cycle as well as cell population distribution across different stages of the cell cycle. We envision that the proposed method can become a nondestructive tool to analyze cell cycle progression in fields ranging from cell biology to biopharma applications.
The cell
cycle[1] is an orchestrated process that
leads to genetic replication
and cellular division. This precise, periodic progression is crucial
to a variety of processes, such as cell differentiation, organogenesis,
senescence, and disease. Significantly, DNA damage can lead to cell
cycle alteration and serious afflictions, including cancer.[2] Conversely, understanding the cell cycle progression
as part of the cellular response to DNA damage has emerged as an active
field in cancer biology.[3]

Morphologically,
the cell cycle can be divided into interphase
and mitosis. The interphase[1] can further
be divided into three stages: G1, S, and G2. Since the cells are preparing
for DNA synthesis and mitosis during G1 and G2 respectively, these
two stages are also referred to as the “gaps” of the
cell cycle.[4] During the S stage, the cells
are synthesizing DNA, with the chromosome count increasing from 2N
to 4N.

Traditional approaches for distinguishing different stages
within
the cell cycle rely on fluorescence microscopy[5] to monitor the activity of proteins that are involved in DNA replication
and repair, e.g., proliferating cell nuclear antigen (PCNA).[6] A variety of signal processing techniques, including
support vector machine (SVM),[7] intensity
histogram and intensity surface curvature,[8] level-set segmentation,[9] and k-nearest
neighbor,[10] have been applied to fluorescence
intensity images to perform classification. In recent years, with
the rapid development of parallel-computing capability[11] and deep learning algorithms,[12] convolutional neural networks have also been applied to
fluorescence images of single cells for cell cycle tracking.[13,14] Since all these methods are based on fluorescence microscopy, they
inevitably face the associated limitations, including photobleaching,
chemical toxicity and phototoxicity, weak fluorescent signals that require
long exposures, as well as nonspecific binding. These constraints
limit the applicability of fluorescence imaging to studying live cell
cultures over large temporal scales.[15]

Quantitative phase imaging (QPI)[16] is
a family of label-free imaging methods that has gained significant
interest in recent years due to its applicability to both basic and
clinical science.[17] Since the QPI methods
utilize the optical path length as intrinsic contrast, the imaging
is noninvasive and, thus, allows for monitoring live samples over
several days without concerns of degraded viability.[17] As the refractive index is linearly proportional to the
cell density,[18] independent of the composition,
QPI methods can be used to measure the nonaqueous content (dry mass)
of the cellular culture.[19] In the past
two decades, QPI has also been implemented as a label-free tomography
approach for measuring 3D cells and tissues.[20−27] These QPI measurements directly yield biophysical parameters of
interest in studying neuronal activity,[28] quantifying subcellular contents,[29] as
well as monitoring cell growth along the cell cycle.[30−32] Recently, with the parallel advancement in deep learning, convolutional
neural networks were applied to QPI data as universal function approximators[33] for various applications.[34] It has been shown that deep learning can help computationally
substitute chemical stains for cells[35] and
tissues,[36] extract biomarkers of interest,[37] enhance imaging quality,[38] as well as solve inverse problems.[39]

In this article, we present a new methodology for cell cycle
detection
that utilizes the principle of phase imaging with computational specificity
(PICS).[37,40] Our approach combines spatial light interference
microscopy (SLIM),[41] a highly sensitive
QPI method, with the recently developed deep learning network architecture
E-U-Net.[42] We demonstrate on live HeLa
cell cultures that the proposed method classifies cell cycle stages
solely using SLIM images as input. The signals from the fluorescent
ubiquitination-based cell cycle indicator (FUCCI)[43] were only used to generate ground truth annotations during
the deep learning training stage. Unlike previous methods that perform
single-cell classification based on bright-field and dark-field images
from flow cytometry[44] or phase images from
ptychography,[45] our method can classify
all adherent cells in the field of view and perform longitudinal studies
over many cell cycles. Evaluated on a test set consisting of 408 unseen
SLIM images (over 10 000 cells), our method achieves F-1 scores
over 0.75 for both the G1 and S stages. For the G2/M stage, we obtained
a lower score of 0.6, likely due to the round cells going out of focus
in the M-stage. Using the classification output of our method,
we created binary masks that were applied back to the QPI (input) images
to measure single cell area, dry mass, and dry mass density for large
cell populations in the three cell cycle stages. Because our SLIM
imaging is nondestructive, all individual cells can be monitored over
many cell cycles without loss of viability. We envision that our proposed
method can be extended to other QPI imaging modalities and different
cell lines, even those of different morphology, after proper network
retraining for high throughput and nondestructive cell cycle analysis,
thus eliminating the need for cell synchronization.
Results
The experimental setup is illustrated in Figure 1. We utilized spatial light interference
microscopy (SLIM)[41] to acquire the quantitative
phase map of live HeLa cells prepared in six-well plates. By adding
a QPI module to an existing phase contrast microscope, SLIM modulates
the phase delay between the incident field and the scattered field,
and an optical path length map is then extracted from four intensity
images via phase-shifting interferometry.[16] Due to the common-path design of the optical system, we were able
to acquire both the SLIM signals and epi-fluorescence signals of the
same field of view (FOV) using a shared camera. Figure 1B shows the quantitative phase map of live
HeLa cell cultures using SLIM.
Figure 1
Schematic of the imaging system. (A) The
SLIM module was connected
to the side port of an existing phase contrast microscope. This setup
allows us to take colocalized SLIM images and fluorescence images
by switching between transmission and reflection illumination. (B)
Measurements of HeLa cells. (C) mCherry fluorescence signals. (D)
mVenus fluorescence signals. (E) Cell cycle stage masks generated
by using adaptive thresholding to combine information from all three
channels. Scale bar is 100 μm.
To obtain an accurate classification between the three stages within
one cell cycle interphase (G1, S, and G2), we used HeLa cells that
were encoded with fluorescent ubiquitination-based cell cycle indicator
(FUCCI).[43] FUCCI employs mCherry, an hCdt1-based
probe, and mVenus, an hGem-based probe, to monitor proteins associated
with the interphase. FUCCI transfected cells produce a sharp triple
color-distinct separation of G1, S, and G2/M. Figure 1C and 1D show
the acquired mCherry signal and mVenus signal, respectively. We combined
the information from all three channels via adaptive thresholding
to generate a cell cycle stage mask (Figure 1E). The procedure of sample preparation and
mask generation is presented in detail in the Materials
and Methods section and Figure S1.
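As an illustration of how three binary masks could be combined into one stage map, the following is a minimal sketch rather than the code used in this work. It assumes the three boolean masks (cell body, mCherry, mVenus) have already been produced by thresholding, and it assumes the FUCCI(CA)2 readout in which mCherry alone marks G1, mVenus alone marks S, and both together mark G2/M. The function name and the integer label encoding are hypothetical. The sketch labels only pixels where the masks overlap and does not propagate the nuclear label to the whole cell body.

```python
import numpy as np

# Hypothetical integer encoding of the four classes (not specified in the text).
BACKGROUND, G1, S, G2M = 0, 1, 2, 3


def combine_fucci_masks(cell_body, mcherry, mvenus):
    """Combine three boolean masks into one per-pixel stage label map.

    Assumed readout: mCherry only -> G1, mVenus only -> S, both -> G2/M.
    Pixels outside the cell body, or without any fluorescence signal,
    stay labeled as background in this sketch.
    """
    cell_body = np.asarray(cell_body, dtype=bool)
    mcherry = np.asarray(mcherry, dtype=bool)
    mvenus = np.asarray(mvenus, dtype=bool)

    labels = np.full(cell_body.shape, BACKGROUND, dtype=np.uint8)
    labels[cell_body & mcherry & ~mvenus] = G1
    labels[cell_body & ~mcherry & mvenus] = S
    labels[cell_body & mcherry & mvenus] = G2M
    return labels
```

Keeping the combination step separate from the thresholding step makes it possible to inspect each intermediate mask before the labels are assigned.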
Deep Learning
With the SLIM images as input and the
FUCCI cell masks as ground truth, we formulated the cell cycle detection
problem as a semantic segmentation task and trained a deep neural
network to infer each pixel’s category as one of the “G1”,
“S”, “G2/M”, or background labels. Inspired
by the high accuracy reported in previous works,[42] we used the E-U-Net (Figure 2A) as our network architecture. The E-U-Net architecture
upgraded the classic U-Net[46] by swapping
its original encoder layers with a pretrained EfficientNet.[47] Since the EfficientNet was already trained on
the massive ImageNet data set, it provided more informative initial
weights than the randomly initialized layers of a U-Net trained from scratch,
as in previous approaches.[37] This transfer
learning strategy enables the model to utilize “knowledge”
of feature extraction learned from the ImageNet data set, achieving
faster convergence and better performance.[42] Since EfficientNet was designed using a compound scaling coefficient,
it is still relatively small in size. Our trained network used EfficientNet-B4
as the encoder and contained 25 million trainable parameters in total.
Figure 2
PICS training
procedure. (A) We used a network architecture called
the E-U-Net that replaces the encoder part of a standard U-Net with
the pretrained EfficientNet-B4. Within the encoder path, the input
images were downsampled 5 times through 7 blocks of encoder operations.
Each encoder operation consists of multiple MBConvX modules that consist
of convolutional layers, squeeze and excitation, and residual connections.
The decoder path consists of concatenation, convolution and upsampling
operations. (B) The model loss values on the training data set and
the validation data set after each epoch. We picked the model checkpoint
with the lowest validation loss as our final model and used it for
all analysis. (C) The model’s average F-1 score on the training
data set and the validation data set after each epoch.
We trained our E-U-Net with 2046 pairs of SLIM images and
ground
truth masks for 120 epochs. The network was optimized by an Adam optimizer[48] against the sum of the DICE loss[49] and the categorical focal loss.[50] After each epoch, we computed the model’s loss and
overall F-1 score on both the training set and the validation set,
which consists of 408 different image pairs (Figure 2B,C). The weights that
gave the lowest validation loss were selected and used
for all verification and analysis. The training procedure is described
in the Materials and Methods.
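To make the training objective concrete, the sum of a Dice loss and a categorical focal loss can be sketched in NumPy as below. This is an illustrative sketch, not the training code: the function names are hypothetical, the inputs are assumed to be a one-hot ground truth and softmax probabilities of shape (H, W, C), and the focal parameters gamma = 2 and alpha = 0.25 are assumed common defaults rather than values reported here.

```python
import numpy as np


def dice_loss(y_true, y_pred, eps=1e-7):
    """Soft Dice loss averaged over classes.

    y_true: one-hot ground truth, shape (H, W, C).
    y_pred: class probabilities, shape (H, W, C).
    """
    intersection = np.sum(y_true * y_pred, axis=(0, 1))
    denom = np.sum(y_true, axis=(0, 1)) + np.sum(y_pred, axis=(0, 1))
    dice = (2.0 * intersection + eps) / (denom + eps)
    return 1.0 - dice.mean()


def categorical_focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss: cross-entropy down-weighted for well-classified pixels."""
    p = np.clip(y_pred, eps, 1.0 - eps)
    per_pixel = -alpha * y_true * (1.0 - p) ** gamma * np.log(p)
    return per_pixel.sum(axis=-1).mean()


def combined_loss(y_true, y_pred):
    """Sum of the two terms, as described in the text."""
    return dice_loss(y_true, y_pred) + categorical_focal_loss(y_true, y_pred)
```

The Dice term rewards region overlap per class, while the focal term concentrates the gradient on pixels that are still misclassified, which is one common motivation for combining the two on class-imbalanced segmentation tasks.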
PICS Performance
After training the model, we evaluated
its performance on 408 unseen SLIM images from the test data set.
The test data set was selected from wells that are different from
the ones used for network training and validation during the experiment. Figure 3A shows randomly
selected images from the test data set. Figure 3B and 3C show the
corresponding ground truth cell cycle masks and the PICS cell cycle
masks, respectively. It can be seen that the trained model was able
to identify the cell body accurately.
Figure 3
PICS results on the test data set. (A)
SLIM images of HeLa cells
from the test data set. (B) Ground truth cell cycle phase masks. (C)
PICS-generated cell cycle phase masks. Scale bar is 100 μm.
We reported the raw performance of our PICS methods
in Figure S2, with pixel-wise precision,
recall,
and F1-score for each class. However, we noticed that these metrics
did not reflect the performance in terms of the number of cells. Thus,
we performed a postprocessing step on the inferred masks to enforce
particle-wise consistency, as detailed in the Materials
and Methods and Figure S3. After
this postprocessing step, we evaluated the model’s performance
on the cellular level and produced the cell count-based results shown
in Figure 4. Figure 4A shows the histogram
of cell body area for cells in different stages, derived from both
the ground truth masks and the prediction masks. Figure 4B and 4C show similar histograms of cellular dry mass and dry mass density,
respectively. The histograms indicated that there is a close overlap
between the quantities derived from the ground truth masks and the
prediction masks. The cell-wise precision, recall, and F-1 score for
all three stages are shown in Figure 4D. Each entry is normalized with respect to the ground
truth number of cells in that stage. Our deep learning model achieved
over 0.75 F-1 scores for both the G1 stage and the S stage, and a
0.6 F-1 score for the G2/M stage. The lower performance for the G2/M
stage is likely due to the round cells going out of focus during mitosis.
To better compare the performance of our PICS method with the previously
reported works, we produced two more confusion matrices (Figure S4) by merging labels to quantify the
accuracy of our method in classifying cells into [“G1/S”,
“G2/M”] and [“G1”, “S/G2/M”].
For all the classification formulations, we also computed the overall
accuracy. Compared to the overall accuracy of 0.91[13] from a method that used convolutional neural networks on
fluorescence image pairs to classify single cells into “G1/S”
or “G2”, our method achieved a comparable overall accuracy
of 0.89 (Figure S4A). Compared to the F-1 scores
of 0.94 and 0.88 for “G1” and “S/G2” respectively
from a method[14] that used convolutional neural
networks on fluorescence images, our method achieved a lower F-1 score
for “G1” and a comparable F-1 score for “S/G2/M”
(Figure S4B). Compared to the method[44] that classifies single-cell images from flow
cytometry, our method achieved a lower F-1 score for “G1”
and “G2/M” and a higher F-1 score for “S”.
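The cell-wise metrics and the merged-label comparisons can be derived from a confusion matrix of cell counts, as in the sketch below. The function names are hypothetical, and the convention that rows are ground truth classes and columns are predicted classes is an assumption of this sketch.

```python
import numpy as np


def per_class_metrics(confusion):
    """Precision, recall, F-1 per class, and overall accuracy.

    confusion[i, j] = number of cells of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion)
    predicted = confusion.sum(axis=0)  # column sums: cells predicted per class
    actual = confusion.sum(axis=1)     # row sums: ground truth cells per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / confusion.sum()
    return precision, recall, f1, accuracy


def merge_classes(confusion, groups):
    """Collapse classes, e.g. groups=[[0, 1], [2]] for ["G1/S", "G2/M"]."""
    confusion = np.asarray(confusion, dtype=float)
    merged = np.zeros((len(groups), len(groups)))
    for a, rows in enumerate(groups):
        for b, cols in enumerate(groups):
            merged[a, b] = confusion[np.ix_(rows, cols)].sum()
    return merged
```

Merging sums the rows and columns of the grouped classes, so confusions between two merged classes (for example G1 predicted as S) become correct predictions in the merged matrix, which is why merged-label accuracies are higher than three-class ones.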
Figure 4
PICS performance
on the test data set. (A–C) Histograms
of cell area, dry mass and dry mass density for cells in G1, S, and
G2/M, generated by the ground truth mask (in blue) and by PICS (in
green). A Gaussian distribution (in blue) was fitted to the ground
truth data and another Gaussian distribution (in red) was fitted to
the PICS results. (D) Confusion matrix for PICS inference on the test
data set. (E) Mean, standard deviation and their ratio (underlined
for visibility) of cell area, dry mass and dry mass density obtained
from the fitted Gaussian distributions. The top number is the fitted
parameter on the ground truth population, while the bottom number
is fitted on the PICS prediction population.
We calculated the means and standard deviations of the best fit
Gaussian for the area, dry mass, and dry mass density distributions
for populations of cells in each of the three stages: G1 (N = 4430 cells), S (N = 6726 cells), and
G2/M (N = 1865 cells). The standard deviation divided
by the mean, σ/μ, is a measure of the distribution spread.
These values are indicated in each panel of Figures 4A–C and summarized in Figure 4E (the top parameter was from
the ground truth population and the bottom parameter was from the
PICS prediction population). We note that the G1 phase is associated
with distributions that are most similar to a Gaussian. It is interesting
that the S-phase exhibits a bimodal distribution in both area and
dry mass, indicating the presence of a subpopulation of smaller cells
at the end of G1 phase. However, the dry mass density even for this
bimodal population becomes monomodal, suggesting that the dry mass
density is a more uniformly distributed parameter, independent of
cell size and weight. Similarly, the G2/M area and dry mass distributions
are skewed toward the origin, while the dry mass density appears to
have a minimum value of ∼0.0375 pg/μm² (within
the orange rectangles). Interestingly, early studies of fibroblast
spreading also found that there is a minimum value for the dry mass
density that cells seem to exhibit.[51]
PICS Application
The PICS method can be applied to
track the cell cycle transition of single cells, nondestructively. Figure 5A shows the time-lapse
SLIM measurements and PICS inference of HeLa cells. The time increment
was roughly 2 h between two measurements and the images at t = 2, 6, 10, and 14 h are displayed in Figure 5A. Our deep learning model
has not seen any of these SLIM images during training. The comparison
between the SLIM images and the PICS inference showed that the deep
learning model produced accurate cell body masks and assigned viable
cell cycle stages. We show in Figure 5B,C the results of manually tracking two cells in this
field of view across 16 h and using the PICS cell cycle masks to compute
their cellular area and dry mass. Figure 5B shows the cellular area and dry
mass change for the cell marked by the red rectangle. We observed
an abrupt drop in both the area and dry mass around t = 8 h, at which point the mother cell divides into two daughter
cells. The PICS cell cycle mask also captured this mitosis event as
it progressed from the “G2/M” label to the “G1”
label. We observed a similar drop in Figure 5C after 14 h due to mitosis of the cell marked
by the orange rectangle. Figure 5C also shows that the cell continues growing before t = 14 h and the PICS cell cycle mask progressed from the
“S” label to the “G2/M” label correspondingly.
Note that this long-term imaging is only possible due to the nondestructive
imaging allowed by SLIM. It is possible that the PICS inference will
produce an inaccurate stage label for some frames. For instance, PICS
inferred the label “G2/M” for the cell marked by the blue
rectangle at t = 2 and 10 h, but inferred the label “S”
for the same cell at t = 6 h. Such inconsistencies
can be manually corrected by making use of the time-lapse
progression of the measurement as well as the cell morphology measurements
from the SLIM image.
Figure 5
PICS on time lapse of FUCCI-HeLa cells. (A) SLIM images
and PICS
inference of cells measured at 2, 6, 10, and 14 h. The time interval
between imaging is roughly 2 h. We manually tracked two cells (marked
in red and orange). (B) Cell area and dry mass change of the cell
in the red rectangle, across 16 h. These values were obtained via
PICS inferred masks. We can observe an abrupt drop in cell dry mass
and area as the cell divides after around 8 h. (C) Cell area and dry
mass change of the cell in orange rectangle, across 16 h. We can observe
that the cell continues growing in the first 14 h as it goes through
G1, S, and G2 phase. It divides between hour 14 and hour 16, with
an abrupt drop in its dry mass and cell area. Scale bar is 100 μm.
We also demonstrated that the PICS method can be
used to study
the statistical distribution of cells across different stages within
the interphase. The PICS inferred cell area distribution across G1,
S, and G2/M is plotted in Figure 6A, whereby a clear shift between cellular area in these
stages can be observed. We performed Welch’s t test on these three groups of data points. To avoid the impact on p-value due to the large sample size, we randomly sampled
20% of all data points from each group and performed the t test on these subsets instead. After sampling, we have 884 cells
in G1, 1345 cells in S, and 373 cells in G2/M. The p-values are less than 10⁻³, indicating statistical
significance. The same analysis was performed on the cell dry mass
and cell dry mass density, as shown in Figures 6B,C. We observed a clear distinction between
cell dry mass in S and G2/M as well as between cell dry mass density
in G1 and S. These results agree with the general expectation that
cells are metabolically active and grow during G1 and G2. During S,
the cells remain metabolically inactive and replicate their DNA. Since
the DNA dry mass only accounts for a very small fraction of the total
cell dry mass,[32] the distinction between
G1 cell dry mass and S cell dry mass is less obvious than the distinction
between S cell dry mass and G2/M cell dry mass. We also noted that
our observation on the cell dry mass density distribution agrees with
previous findings.[31]
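The subsampling and Welch’s t test described above can be sketched as follows. This is a minimal illustration with hypothetical function names and an arbitrary random seed, not the analysis script used here. It returns the t statistic and the Welch–Satterthwaite degrees of freedom; the p-value then follows from the Student t distribution with those degrees of freedom (for example via `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math

import numpy as np


def subsample(values, fraction=0.2, seed=0):
    """Draw a random subset without replacement (seed chosen arbitrarily)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values)
    n = int(round(fraction * values.size))
    return rng.choice(values, size=n, replace=False)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size  # squared standard error of mean(a)
    vb = b.var(ddof=1) / b.size  # squared standard error of mean(b)
    t = (a.mean() - b.mean()) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t, df
```

Unlike Student’s t test, Welch’s version does not assume equal variances in the two groups, which suits populations of different sizes and spreads such as the three cell cycle stages.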
Figure 6
Statistical analysis
from PICS inference on the test data set.
(A) Histogram and box plot of cell area. The p-value
returned from Welch’s t test indicated statistical
significance. (B) Histogram and box plot of cell dry mass. The p-value returned from Welch’s t test
indicated statistical significance. (C) Histogram and box plot of
cell dry mass density. The p-value returned from
Welch’s t test indicated statistical significance
comparing cells in G1 and S. The box plot and Welch’s t test are computed on 20% of all data points in G1, S,
and G2/M, randomly sampled. The sample size is 884 for G1, 1345 for
S, and 373 for G2/M. Outliers are omitted from the box plot. (***p < 0.001).
Discussion
We
proposed a PICS-based cell cycle stage classification workflow
for fast, label-free cell cycle analysis on adherent cell cultures
and demonstrated it on the HeLa cell line. Our new method utilizes
trained deep neural networks to infer an accurate cell cycle mask
from a single SLIM image. The method can be applied to study single-cell
growth within the cell cycle as well as to compare the cellular parameter
distributions between cells in different cell cycle phases.

Compared to many existing methods of cell cycle detection,[7−10,13,14,44,45,52] our method yielded comparable accuracy for at least
one stage in the cell cycle interphase. The errors in our PICS inference
can be corrected when the time-lapse progression and QPI measurements
of cell morphology are taken into consideration. Due to the difference
in the underlying imaging modality and data analysis techniques, we
believe that our method has three main advantages. First, our method
uses a SLIM module, which can be installed as an add-on component
to a conventional phase contrast microscope. The user experience remains
the same as using a commercial microscope. Significantly, due to the
seamless integration with the fluorescence channel on the same field
of view, the instrument can collect the ground truth data very easily,
while the annotation is automatically performed via thresholding,
rather than manually. Second, our method does not rely on fluorescence
signals as input. On the contrary, our method is built upon the capability
of neural networks to extract label-free cell cycle markers from the
quantitative phase map. Thus, the method can be applied to live cell
samples over long periods of time without concerns of photobleaching
or degraded cell viability due to chemical toxicity, opening up new
opportunities for longitudinal investigations. Third, our approach
can be applied to large sample sizes consisting of entire fields of
view and hundreds of cells. Since we formulated the task as semantic
segmentation and trained our model on a data set containing images
with various cell counts, our method worked with FOVs containing up
to hundreds of cells. Also, since the U-Net[46] style neural network is fully convolutional, our trained model can
be applied to images with arbitrary size. Consequently, the method
can directly extend to other cell data sets or experiments with different
cell confluences, as long as the magnification and numerical aperture
stay the same. Since the input imaging data is nondestructive, we
can image large cell populations over many cell cycles and study cell
cycle phase-specific parameters at the single cell scale. As an illustration
of this capability, we measured distributions of cell area, dry mass
and dry mass density for populations of thousands of cells in various
stages of the cell cycle. We found that the dry mass density distribution
drops abruptly under a certain value for all cells, which indicates
that live cells require a minimum dry mass density.

During the
development of our method, we followed standard protocols
in the community,[53] such as preparing a
diverse enough training data set, properly splitting the training,
validation and test data set, and closely monitoring the model loss
convergence to ensure that our model can generalize. Our previous
studies showed that, with high-quality ground truth data, the deep
learning-based methods applied to quantitative phase images are generalizable
to predict cell viability[54] and nuclear
cytoplasmic ratio[37] on multiple cell lines.
Thus, although we only demonstrated our method on HeLa cells due to
the limited availability of cell lines engineered with FUCCI(CA)2,
we believe PICS-based instruments are well-suited for extending our
method to different cell lines and imaging conditions with minimal
effort to perform extra training. Our typical training takes approximately
20 h, while the inference is performed within 65 ms per frame.[37] Thus, we envision that our proposed workflow
is a valuable alternative to the existing methods for cell cycle stage
classification and eliminates the need for cell synchronization.
Materials
and Methods
FUCCI Cell and HeLa Cell Preparation
HeLa/FUCCI(CA)2[43] cells were acquired from the RIKEN cell bank
and kept frozen in a liquid nitrogen tank. Prior to the experiments,
we thawed and cultured cells into T75 flasks in Dulbecco’s
Modified Eagle Medium (DMEM with low glucose) containing 10% fetal
bovine serum (FBS) and incubated at 37 °C with 5% CO₂. When the cells reached 70% confluency, the flask was washed with
phosphate-buffered saline (PBS) and trypsinized with 4 mL of 0.25%
(w/v) Trypsin EDTA for 4 min. When the cells started to detach, they
were suspended in 4 mL of DMEM and passaged onto a glass-bottom six-well
plate. HeLa cells were then imaged after 2 days of growth.
SLIM Imaging
The SLIM system architecture is shown
in Figure 1A. We attached
a SLIM module (CellVista SLIM Pro; Phi Optics) to the output port
of a phase contrast microscope. Inside the SLIM module, the spatial
light modulator matched to the back focal plane of the objective controlled
the phase delay between the incident field and the reference field.
We recorded four intensity images at phase shifts of 0, π/2,
π, and 3π/2 and reconstructed the quantitative phase map
of the sample. We measured both the SLIM signal and the fluorescence
signal with a 10×/0.3 NA objective. The camera was an Andor
Zyla with a pixel size of 6.5 μm. The exposure times for the SLIM
channel and the fluorescence channel were set to 25 and 500 ms, respectively.
The scanning of the multiwell plate was performed automatically via
control software developed in-house.[37,55] For each well,
we scanned an area of 7.5 × 7.5 mm², which took approximately
16 min for the SLIM and the fluorescence channels. The data set
used in this study was collected over 20 h, with an interval of
approximately 30 min between successive rounds of scanning.
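The four-frame acquisition described above can be illustrated with a generic four-step phase-shifting estimate. This is a minimal sketch under an idealized interference model, I_k = A + B cos(Δφ + δ_k) with δ_k = 0, π/2, π, 3π/2, and it is not the vendor's SLIM reconstruction: it recovers only the phase difference Δφ between the two interfering fields, and any further conversion to the quantitative phase of the sample is not shown. The function name and the synthetic frames are hypothetical.

```python
import numpy as np

def phase_difference(i0, i1, i2, i3):
    """Estimate the phase difference from four phase-shifted frames.

    Assumes I_k = A + B*cos(dphi + delta_k) with shifts
    delta_k = 0, pi/2, pi, 3*pi/2 (idealized model, an assumption).
    """
    sine_term = i3 - i1    # equals 2*B*sin(dphi) under the model
    cosine_term = i0 - i2  # equals 2*B*cos(dphi) under the model
    return np.arctan2(sine_term, cosine_term)

# Synthetic check: build four frames from a known phase map and recover it.
rng = np.random.default_rng(0)
true_dphi = rng.uniform(-3.0, 3.0, size=(4, 4))
shifts = [0.0, np.pi / 2, np.pi, 3 * np.pi / 2]
frames = [1.0 + 0.5 * np.cos(true_dphi + s) for s in shifts]
estimate = phase_difference(*frames)
```

The background A and the modulation depth B cancel in the two differences, which is why four frames suffice without calibrating either quantity.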
Cellular Dry Mass Computation
We recovered the dry mass density as
$$m(x,y) = \frac{\lambda}{2\pi\gamma}\,\phi(x,y),$$
using the same procedure outlined in previous
works.[18,19] Here, λ = 550 nm is the central wavelength;
γ = 0.2 mL/g is the specific refraction increment, corresponding
to the average of reported values;[18,56] and ϕ(x,y) is the measured phase. This equation provides the dry mass density at
each pixel, and we integrated it over the region of interest to obtain the
cellular dry mass.
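The dry mass computation can be sketched in a few lines of NumPy. The wavelength and refraction increment are taken from the text; the sample-plane pixel size of 0.65 μm is an assumption (the 6.5 μm camera pixel divided by the nominal 10× magnification, with no additional relay magnification), and the function names are hypothetical.

```python
import numpy as np

WAVELENGTH_UM = 0.550        # central wavelength, 550 nm (from the text)
GAMMA_UM3_PER_PG = 0.2       # 0.2 mL/g is numerically 0.2 um^3/pg
PIXEL_SIZE_UM = 6.5 / 10.0   # assumption: camera pixel / nominal 10x magnification

def dry_mass_density(phase_rad):
    """Dry mass density in pg/um^2 from a phase map in radians."""
    return WAVELENGTH_UM / (2.0 * np.pi * GAMMA_UM3_PER_PG) * phase_rad

def cellular_dry_mass(phase_rad, cell_mask, pixel_size_um=PIXEL_SIZE_UM):
    """Integrate the density over a boolean region of interest (result in pg)."""
    density = dry_mass_density(phase_rad)
    return float(density[cell_mask].sum() * pixel_size_um ** 2)

# Usage: a flat 1 rad phase over a 10 x 10 pixel region of interest.
phase = np.ones((20, 20))
mask = np.zeros((20, 20), dtype=bool)
mask[5:15, 5:15] = True
mass_pg = cellular_dry_mass(phase, mask)
```

Expressing γ in μm³/pg keeps every quantity in micrometers and picograms, so no further unit conversion is needed.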
Ground Truth Cell Cycle Mask Generation
To prepare
the ground truth cell cycle masks for training the deep learning models,
we combined information from the SLIM channel and the fluorescence
channels (Figure S1A) by applying adaptive
thresholding (Figure S1B). All the code
was implemented in Python, using the scikit-image library. We first
applied the adaptive thresholding algorithm on the SLIM images to
generate accurate cell body masks. Then we applied the algorithm on
the mCherry fluorescence images and mVenus fluorescence images to
get the nuclei masks that indicate the presence of the fluorescence
signals. To ensure the quality of the generated masks, we first applied
the adaptive thresholding algorithm on a small subset of images with
a range of possible window sizes. Then we manually inspected the quality
of the generated masks and selected the best window size to apply
to the entire data set. After getting these three masks (cell body
mask, mCherry FL mask, and mVenus FL mask), we took the intersection
among them. Following the FUCCI color readout detailed in ref (43), the presence of mCherry
signal alone indicates that the cell is in the G1 stage, and the presence of mVenus
signal alone indicates that the cell is in the S stage. The overlap of
both signals indicates that the cell is in the G2 or M stage. Since the cell
mask is always larger than the nuclei mask, we filled in the entire
cell area with the corresponding label. To do so, we performed connected
component analysis on the cell body mask and counted the number of
pixels marked by each fluorescence signal in each cell body and took
the majority label. We handled cells with no fluorescence signal
by automatically labeling them as S because both fluorescence channels
yield low-intensity signals only at the start of the S phase.[43] Before using the masks for analysis, we also
performed traditional computer vision operations, e.g., hole filling,
on the generated masks to ensure the accuracy of the computed dry mass
and cell area (Figure S1C).
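The labeling rule can be sketched as follows. This is one possible reading of the procedure, not the authors' code: it assumes the three binary masks have already been produced by adaptive thresholding (for instance with scikit-image's `threshold_local`; the text does not name the exact function), uses `scipy.ndimage` for the connected-component step, and the integer class encoding and function name are hypothetical.

```python
import numpy as np
from scipy import ndimage

BACKGROUND, G1, S, G2M = 0, 1, 2, 3  # hypothetical integer encoding

def fucci_label_map(cell_mask, mcherry_mask, mvenus_mask):
    """Assign one cell cycle label to each connected component of the cell mask."""
    cell_mask = np.asarray(cell_mask, dtype=bool)
    mcherry_mask = np.asarray(mcherry_mask, dtype=bool)
    mvenus_mask = np.asarray(mvenus_mask, dtype=bool)

    # Per-pixel readout: mCherry alone -> G1, mVenus alone -> S, both -> G2/M.
    pixel_label = np.zeros(cell_mask.shape, dtype=np.uint8)
    pixel_label[mcherry_mask & ~mvenus_mask] = G1
    pixel_label[mvenus_mask & ~mcherry_mask] = S
    pixel_label[mcherry_mask & mvenus_mask] = G2M

    # Default structuring element: 4-connectivity in 2D.
    components, n_cells = ndimage.label(cell_mask)
    label_map = np.zeros(cell_mask.shape, dtype=np.uint8)
    for cell_id in range(1, n_cells + 1):
        region = components == cell_id
        # Count fluorescent pixels per class inside this cell body.
        votes = np.bincount(pixel_label[region], minlength=4)[1:]
        if votes.sum() == 0:
            label = S  # no fluorescence: labeled S, as described in the text
        else:
            label = int(np.argmax(votes)) + 1  # ties resolve to the lowest class index
        label_map[region] = label  # fill the whole cell body with the winning label
    return label_map
```

Hole filling on the resulting masks, as mentioned in the text, could be added with `scipy.ndimage.binary_fill_holes`; it is left out here to keep the sketch short.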
Deep Learning
Model Development
We used the E-U-Net
architecture[42] to develop the deep learning
model that can assign a cell cycle phase label to each pixel. The
E-U-Net upgraded the classic U-Net[46] architecture
by swapping its encoder component with a pretrained EfficientNet.[47] Compared to previously reported transfer-learning
strategies, e.g., utilizing a pretrained ResNet[57] for the encoder part, we believe the E-U-Net architecture
is superior since the pretrained EfficientNet attains higher performance
on the benchmark data set while remaining compact due to the compound
scaling strategy.[47] The EfficientNet
backbone we ended up using for this project was EfficientNet-B4 (Figure A). The entire E-U-Net-B4
model contains around 25 million trainable parameters, fewer than
the stock U-Net[46] and other variations.[58] We trained the network with 2046 image pairs in the training data
set and 408 image pairs in the validation data set. Each image contains
736 × 736 pixels. The model was optimized using an Adam optimizer[48] with default parameters against the sum of the
DICE loss[49] and the categorical focal loss.[50] The DICE loss was designed to maximize the dice
coefficient D (eq ) between the ground truth label (g) and prediction label (p) at each pixel. It has been shown in
previous works that DICE loss can help tackle class imbalance in the
data set.[59] Besides DICE loss, we also
utilized the categorical focal loss FL(p) (eq ). The categorical focal loss extended the cross entropy loss by
adding a modulating factor (1 – p)^γ, where γ here denotes the focusing
parameter rather than the refraction increment. It helped the model focus
more on wrong inferences by preventing easily classified pixels from dominating
the gradient. We tuned the ratio between these two loss values and
launched multiple training sessions. In the end we found the model
trained against an equally weighted DICE loss and categorical focal
loss gave the best results. The model was trained for 120 epochs,
taking over 18 h on an Nvidia V100 GPU. For learning rate scheduling, we
followed previous works[60] and implemented
learning rate warm-up and cosine learning rate decay. During the first
five epochs of training, the learning rate increased linearly
from 0 to 4 × 10⁻³. After that, we decreased
the learning rate at each epoch following the cosine function. On
the basis of our experiments, we ended up relaxing the learning rate
decay such that the learning rate in the final epoch was half
of the initial learning rate instead of zero.[60] We plotted the model’s loss value on both the training data
set and the validation data set after each epoch (Figure B) and picked the model checkpoint
with the lowest validation loss as our final model to avoid overfitting.
All the deep learning code was implemented using Python 3.8 and TensorFlow
2.3.
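The training objective and the learning rate schedule can be sketched in plain NumPy. This is an illustration under stated assumptions, not the TensorFlow code used in the study: the Dice form below is one common soft variant, the focusing parameter of 2 and the absence of a class-balancing weight are assumptions (the text gives neither), "initial learning rate" is read as the post-warm-up peak of 4 × 10⁻³, and all function names are hypothetical.

```python
import math
import numpy as np

def dice_loss(g, p, eps=1e-7):
    """1 - Dice coefficient, averaged over classes.

    g: one-hot ground truth, p: predicted probabilities, both shaped (H, W, C).
    """
    intersection = np.sum(g * p, axis=(0, 1))
    total = np.sum(g, axis=(0, 1)) + np.sum(p, axis=(0, 1))
    dice = (2.0 * intersection + eps) / (total + eps)
    return float(1.0 - dice.mean())

def categorical_focal_loss(g, p, focusing_gamma=2.0, eps=1e-7):
    """Cross entropy scaled by the modulating factor (1 - p)^gamma."""
    p = np.clip(p, eps, 1.0 - eps)
    per_pixel = -np.sum(g * (1.0 - p) ** focusing_gamma * np.log(p), axis=-1)
    return float(per_pixel.mean())

def combined_loss(g, p):
    """Equally weighted sum of the two losses, as described in the text."""
    return dice_loss(g, p) + categorical_focal_loss(g, p)

def learning_rate(epoch, total_epochs=120, warmup_epochs=5,
                  peak_lr=4e-3, final_fraction=0.5):
    """Linear warm-up, then cosine decay that ends at final_fraction * peak_lr."""
    if epoch < warmup_epochs:
        return peak_lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - 1 - warmup_epochs)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return peak_lr * (final_fraction + (1.0 - final_fraction) * cosine)
```

Setting `final_fraction=0.0` recovers the standard cosine decay to zero, which makes the relaxation described in the text a one-parameter change.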
Postprocessing
We evaluated the performance of our
trained E-U-Net on an unseen test data set and reported the precision,
recall, and F-1 score for each category: G1, S, G2/M, and background,
respectively (Figure S2). The pixel-wise
confusion matrix indicated our model achieved high performance in
segmenting the cell bodies from the background. However, since this
pixel-wise evaluation overlooked the biologically relevant instance,
i.e., the number of cells in each cell cycle stage, we performed an
extra step of postprocessing to evaluate that. We first performed
connected-component analysis on the raw model predictions. Within
each connected component, we applied a simple voting strategy in which
the majority label takes over the entire cell. Figure S3A,B illustrates this process. We believe enforcing
particle-wise consistency, in this case, is justified because a single
cell cannot be in two cell cycle stages at the
same time and because our model is highly accurate in segmenting cell
bodies, with over 0.96 precision and recall (Figure S2). We then computed the precision, recall, and F-1 score
for each category at the cellular level. For each particle in the
ground truth, we used its centroid (or the median coordinates if the
centroid falls out of the cell body) to determine if the predicted
label matches the ground truth. The cell-wise metrics were reported
in Figure B. Before using the postprocessed prediction masks to compute the
area and dry mass of each cell, we also performed hole filling, as
we did for the ground truth masks, to ensure the values are accurate
(Figure S3C).
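The postprocessing and the cell-wise evaluation can be sketched as below. This is a minimal illustration, not the authors' implementation: it assumes integer label maps with 0 for background and 1 to 3 for the cell cycle classes (a hypothetical encoding), treats each same-class connected component of the ground truth as one particle, and reads "median coordinates" literally as the per-axis median of the particle's pixel coordinates. Function names are hypothetical.

```python
import numpy as np
from scipy import ndimage

def majority_vote(pred):
    """Give every connected foreground component its most frequent label."""
    pred = np.asarray(pred, dtype=np.int64)
    components, n_cells = ndimage.label(pred > 0)
    voted = np.zeros_like(pred)
    for cell_id in range(1, n_cells + 1):
        region = components == cell_id
        voted[region] = np.argmax(np.bincount(pred[region]))
    return voted

def reference_point(region):
    """Centroid of a boolean region, or the median coordinates if it falls outside."""
    rows, cols = np.nonzero(region)
    r, c = int(round(rows.mean())), int(round(cols.mean()))
    if not region[r, c]:
        r, c = int(np.median(rows)), int(np.median(cols))
    return r, c

def cellwise_metrics(truth, pred, classes=(1, 2, 3)):
    """Per-class (precision, recall, F-1), counted over ground truth particles."""
    tp = {k: 0 for k in classes}
    fp = {k: 0 for k in classes}
    fn = {k: 0 for k in classes}
    for k in classes:
        particles, n = ndimage.label(truth == k)
        for pid in range(1, n + 1):
            r, c = reference_point(particles == pid)
            predicted = int(pred[r, c])
            if predicted == k:
                tp[k] += 1
            else:
                fn[k] += 1
                if predicted in fp:  # a background prediction adds no false positive
                    fp[predicted] += 1
    metrics = {}
    for k in classes:
        precision = tp[k] / (tp[k] + fp[k]) if tp[k] + fp[k] else 0.0
        recall = tp[k] / (tp[k] + fn[k]) if tp[k] + fn[k] else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[k] = (precision, recall, f1)
    return metrics
```

Because the loop runs over ground truth particles only, predicted cells with no ground truth counterpart do not enter the counts; a stricter evaluation would need an additional pass over the predicted components.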