Mark-Anthony Bray, Sigrun M Gustafsdottir, Mohammad H Rohban, Shantanu Singh, Vebjorn Ljosa, Katherine L Sokolnicki, Joshua A Bittker, Nicole E Bodycombe, Vlado Dancík, Thomas P Hasaka, Cindy S Hon, Melissa M Kemp, Kejie Li, Deepika Walpita, Mathias J Wawer, Todd R Golub, Stuart L Schreiber, Paul A Clemons, Alykhan F Shamji, Anne E Carpenter.
Abstract
Background: Large-scale image sets acquired by automated microscopy of perturbed samples enable a detailed comparison of cell states induced by each perturbation, such as a small molecule from a diverse library. Highly multiplexed measurements of cellular morphology can be extracted from each image and subsequently mined for a number of applications.

Findings: This microscopy dataset includes 919 265 five-channel fields of view, representing 30 616 tested compounds, available at "The Cell Image Library" (CIL) repository. It also includes data files containing morphological features derived from each cell in each image, both at the single-cell level and at the population-averaged (i.e., per-well) level; the image analysis workflows that generated the morphological features are also provided. Quality-control metrics are provided as metadata, indicating fields of view that are out of focus or that contain highly fluorescent material or debris. Lastly, chemical annotations are supplied for the compound treatments applied.

Conclusions: Because computational algorithms and methods for handling single-cell morphological measurements are not yet routine, the dataset serves as a useful resource for the wider scientific community applying morphological (image-based) profiling. The dataset can be mined for many purposes, including small-molecule library enrichment and chemical mechanism-of-action studies, such as target identification. Integration with genetically perturbed datasets could enable identification of small-molecule mimetics of particular disease- or gene-related phenotypes that could be useful as probes or potential starting points for development of future therapeutics.
Keywords: U2OS; cellular morphology; high-content screening; image-based screening; phenotypic profiling; small-molecule library
Year: 2017 PMID: 28327978 PMCID: PMC5721342 DOI: 10.1093/gigascience/giw014
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Sample images of U2OS cells from the small-molecule Cell Painting experiment. Images are shown from a DMSO well (negative control, top row) and a parbendazole well (bottom row). The columns display the 5 channels imaged in the Cell Painting assay protocol (see Table 1 for details about the stains and channels imaged).
Table 1: Details of dyes, stained cellular sub-compartments, and channels imaged in the Cell Painting assay
| Dye | Organelle or cellular component | CellProfiler channel name | ImageXpress channel name |
|---|---|---|---|
| Hoechst 33342 | Nucleus | DNA | w1 |
| Concanavalin A/Alexa Fluor 488 conjugate | Endoplasmic reticulum | ER | w2 |
| SYTO 14 green fluorescent nucleic acid stain | Nucleoli, cytoplasmic RNA | RNA | w3 |
| Phalloidin/Alexa Fluor 594 conjugate, wheat germ agglutinin (WGA)/Alexa Fluor 594 conjugate | F-actin cytoskeleton, Golgi, plasma membrane | AGP | w4 |
| MitoTracker Deep Red | Mitochondria | Mito | w5 |
The CellProfiler channel name refers to the name given by the software to each channel; this nomenclature also applies to the naming of the extracted morphological features. The ImageXpress channel name refers to the text in the raw image file name identifying the acquired wavelength. Please note that this protocol was later updated to use Phalloidin/Alexa Fluor 568 and WGA/Alexa Fluor 555, as described in [9].
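The correspondence between the two naming schemes in the table can be written as a small lookup. The sketch below is illustrative only: the mapping itself comes from the table, but the file name pattern (a `_w<digit>` token somewhere in the name) and the example file name are assumptions, since the exact raw file naming convention is not given here.

```python
import re

# Mapping taken from the table above:
# ImageXpress wavelength token -> CellProfiler channel name.
WAVELENGTH_TO_CHANNEL = {
    "w1": "DNA",
    "w2": "ER",
    "w3": "RNA",
    "w4": "AGP",
    "w5": "Mito",
}

# Assumption: the wavelength token is preceded by an underscore in the
# raw image file name. The real naming scheme may differ.
_WAVELENGTH_TOKEN = re.compile(r"_(w[1-5])")


def channel_for_filename(filename: str) -> str:
    """Return the CellProfiler channel name for a raw image file name."""
    match = _WAVELENGTH_TOKEN.search(filename)
    if match is None:
        raise ValueError(f"no wavelength token (w1 to w5) found in {filename!r}")
    return WAVELENGTH_TO_CHANNEL[match.group(1)]


if __name__ == "__main__":
    # Hypothetical file name, used only to exercise the lookup.
    print(channel_for_filename("example_a01_s1_w4.tif"))
```

Because the CellProfiler names also appear in the extracted feature names, the same lookup could be used to relate a feature back to the raw image file it was measured from.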
Table 2: Summary of the raw and intermediately processed data included in this Data Descriptor and nomenclature in the GigaDB and GitHub repositories
| Data item | Location | Description |
|---|---|---|
| Raw fluorescence images | The Cell Image Library | Five fluorescence channels, acquired at 6 fields of view per well at ×20 magnification (0.656 μm/pixel). The experiment comprises 406 plates in 384-well format (plates 24277–26796). A bash shell script is included to facilitate downloading the archives. |
| CellProfiler pipelines | GitHub: pipelines folder, GigaDB: pipelines.zip | CellProfiler software was used to correct for uneven illumination, perform quality control, and delineate cells into nuclei, cell body, and cytoplasmic sub-compartments and measure morphological features for each sub-compartment. |
| Illumination correction functions | GigaDB: <plate_ID>/illumination_correction_functions | An ICF is an estimation of the spatial illumination distribution introduced by the microscopy optics. There is 1 ICF per channel for each plate. |
| Quality control metadata | GigaDB: <plate_ID>/quality_control | Each field of view is assessed for the presence of 2 artifacts (focal blur and saturated objects), and assigned a label of 1 if present and 0 if not. |
| Extracted morphological features | GigaDB: <plate_ID>/extracted_features | A SQLite database comprising 4 tables: (a) 1 per-image table of cellular statistics (e.g., cell count) and (b) 3 per-cell tables measuring size, shape, intensity, textural, and adjacency statistics for the nuclei, cytoplasm, and cell body. |
| Morphological profiles | GigaDB: <plate_ID>/profiles | Per-well averages of each extracted morphological feature computed across the cells. |
| Image curation statistics | GigaDB, GitHub: image_curation_statistics.csv | A summary of image statistics, such as the number of images, wells, and sites in the plates archived at The Cell Image Library, the number of sites with quality measures, and the number of wells with morphological profiles. |
| Chemical annotations | GigaDB, GitHub: chemical_annotations.csv | Chemical annotations, including the compound names, SMILES, and PubChem identifiers (CID/SID). |
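The relationship between the quality-control labels, the per-cell tables, and the per-well profiles summarized above can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration on a tiny in-memory database: the table names (`image`, `cells`), the column names (`well`, `is_blurry`, `is_saturated`, `area`), and all values are hypothetical and do not reflect the actual schema of the released files. Excluding flagged fields of view before averaging is also an assumption made for illustration, not a step confirmed by the dataset description.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a toy database with one per-image table and one per-cell table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE image (
            image_id     INTEGER PRIMARY KEY,
            well         TEXT,
            is_blurry    INTEGER,  -- 1 if focal blur is present, else 0
            is_saturated INTEGER   -- 1 if saturated objects are present, else 0
        );
        CREATE TABLE cells (
            image_id INTEGER,
            area     REAL          -- stand-in for any per-cell feature
        );
        """
    )
    conn.executemany(
        "INSERT INTO image VALUES (?, ?, ?, ?)",
        [
            (1, "A01", 0, 0),
            (2, "A01", 1, 0),  # flagged: focal blur
            (3, "A02", 0, 0),
            (4, "A02", 0, 1),  # flagged: saturated objects
        ],
    )
    conn.executemany(
        "INSERT INTO cells VALUES (?, ?)",
        [(1, 100.0), (1, 200.0), (2, 900.0), (3, 50.0), (4, 700.0)],
    )
    return conn


def per_well_mean_area(conn: sqlite3.Connection) -> dict:
    """Average a per-cell feature per well, skipping QC-flagged fields of view."""
    rows = conn.execute(
        """
        SELECT i.well, AVG(c.area)
        FROM cells AS c
        JOIN image AS i ON c.image_id = i.image_id
        WHERE i.is_blurry = 0 AND i.is_saturated = 0
        GROUP BY i.well
        ORDER BY i.well
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(per_well_mean_area(build_demo_db()))
    # prints {'A01': 150.0, 'A02': 50.0}
```

In the toy data, the cells from the two flagged fields (areas 900.0 and 700.0) are left out, so each well's average reflects only its unflagged field. A real profile would repeat this aggregation for every extracted feature rather than a single column.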