Felix Buggenthin1, Florian Buettner1,2, Philipp S Hoppe3,4, Max Endele3, Manuel Kroiss1,5, Michael Strasser1, Michael Schwarzfischer1, Dirk Loeffler3,4, Konstantinos D Kokkaliaris3,4, Oliver Hilsenbeck3,4, Timm Schroeder3,4, Fabian J Theis1,5, Carsten Marr1.
Abstract
Differentiation alters molecular properties of stem and progenitor cells, leading to changes in their shape and movement characteristics. We present a deep neural network that prospectively predicts lineage choice in differentiating primary hematopoietic progenitors using image patches from brightfield microscopy and cellular movement. Surprisingly, lineage choice can be detected up to three generations before conventional molecular markers are observable. Our approach allows identification of cells with differentially expressed lineage-specifying genes without molecular labeling.
Year: 2017 PMID: 28218899 PMCID: PMC5376497 DOI: 10.1038/nmeth.4182
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Prediction of hematopoietic lineage choice up to three generations before molecular marker annotation using deep neural networks.
(a) Hematopoietic stem cells (gray) can differentiate and are annotated as committed towards the granulocytic/monocytic (GM, blue) lineage via detection of CD16/32, or towards the megakaryocytic/erythroid (MegE, red) lineage via GATA1-mCherry expression. These conventional markers necessarily appear after the lineage decision (gray box). (b,c) Exemplary image patches of a branch of single cells committing to either the GM (b, upper row) or the MegE (c, upper row) lineage (scale bars: 10 µm). Cells with no marker expression are called “latent”; cells with marker expression are called “annotated” (b,c, middle row). Our automatic image processing pipeline allows robust cell identification and thus quantification of movement and morphology (demonstrated with cell size in the lower rows of b and c). (d) A single image patch and the corresponding cell’s displacement (white node) with respect to the previous time point are fed into a convolutional neural network (CNN) consisting of convolutional and fully connected layers (see Methods for more details on the network architecture). The last fully connected hidden layer (yellow) can be interpreted as patch-specific features. (e) To account for temporal dependencies, we feed the CNN-derived patch features of a cell (yellow) into a recurrent neural network (RNN). The nodes in the hidden layer are connected to output nodes as well as to all other hidden nodes across time (left); this temporal dependency is further illustrated in an unrolled representation of the RNN (right), where yellow squares represent the patch feature vectors at a specific time point and forward/backward arrows reflect the bidirectional architecture of the RNN. Every patch is assigned a lineage score between 0 and 1 (0=MegE, 1=GM, 0.5=unsure). (f) Two experiments are used for training, while one experiment is left out to assess how well the learned model generalizes. We repeat this procedure three times in a round-robin fashion.
(g) The area under the receiver operating characteristic curve (AUC; 1.0=perfect classification, 0.5=random guessing) measures the performance of the trained models. Annotated cells (generations 0,+1,+2) and latent cells up to three generations before marker onset (generations -3,-2,-1) show AUCs higher than 0.77 (n=3 rounds, 4204 single cells in total). (h,i) AUCs when only (contiguous) subsets of image patches are used to compute the cell lineage score. AUCs over 0.75 were reached when using only the first ~25% of time points in the cell cycle, for both latent (h) and annotated (i) cells.
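The evaluation scheme in panels (f) to (i) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`round_robin_splits`, `cell_score`, `auc`) are hypothetical, and averaging the patch scores of a cell into one cell-level score is an assumption, since the caption does not state how patch scores are combined.

```python
from statistics import mean


def round_robin_splits(experiments):
    """Yield (training experiments, held-out experiment), holding out each once."""
    for held_out in experiments:
        train = [e for e in experiments if e != held_out]
        yield train, held_out


def cell_score(patch_scores, fraction=1.0):
    """Combine patch-level lineage scores (0=MegE, 1=GM) into one cell score.

    ASSUMPTION: the cell score is the mean over the first `fraction` of the
    cell's time-ordered patches; the source does not specify the aggregation.
    """
    k = max(1, int(len(patch_scores) * fraction))
    return mean(patch_scores[:k])


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen GM cell (label 1) receives a
    higher score than a randomly chosen MegE cell (label 0); ties count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one cell of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With three experiments, `round_robin_splits` produces the three train/test rounds of panel (f); calling `cell_score` with `fraction=0.25` mimics the restriction to the first ~25% of time points in panels (h) and (i).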
Figure 2. Subsets of cells with differential PU.1-eYFP expression can be distinguished two generations after experiment start.
(a) Increase of PU.1-eYFP concentration in a branch with annotated GM marker onset. (b) Decrease of PU.1-eYFP concentration in a branch with annotated MegE marker onset. Concentrations (black dots) are fitted with a B-spline (black/colored line). (c) Cells without marker expression (unknown or latent fate, black) show an intermediate PU.1-eYFP concentration compared to cells annotated for the GM (blue) and MegE (red) lineages. (d) Concentration of PU.1-eYFP for unknown and latent cells in generations after experiment start, subdivided into predicted GM (blue) and predicted MegE (red) based on the CNN-RNN lineage score for round 1 (see Supplementary Fig. 4 for rounds 2 and 3). PU.1-eYFP concentration differs significantly between the two predicted groups in generation 2 (P<0.01) and in generations 3-8 (P<0.001, unpaired Wilcoxon rank-sum test) after experiment start. (e) The significant difference in PU.1-eYFP expression in generation 2 after experiment start is detected between 55 predicted MegE (red) and 34 predicted GM (blue) cells.
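The group comparison in panel (d) relies on an unpaired Wilcoxon rank-sum test. The sketch below shows how such a test works, under stated assumptions: it uses the large-sample normal approximation without tie or continuity correction, and the function names are hypothetical. It is not the authors' analysis code, and a statistics library would normally be used instead.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of `values`; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test for two unpaired samples.

    ASSUMPTION: normal approximation of the rank-sum statistic, with no tie
    or continuity correction. Returns (z, p_value).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                       # rank sum of the first sample
    mu = n1 * (n1 + n2 + 1) / 2.0             # expected rank sum under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

Here `x` and `y` would hold the PU.1-eYFP concentrations of the predicted MegE and predicted GM cells of one generation; with group sizes such as the 55 and 34 cells of panel (e), the normal approximation is generally considered adequate.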