SMART-Q: An Integrative Pipeline Quantifying Cell Type-Specific RNA Transcription.

Xiaoyu Yang^1, Seth Bergenholtz^1, Lenka Maliskova^1, Mark-Phillip Pebworth^2,3, Arnold R Kriegstein^3,4, Yun Li^5, Yin Shen^1,4.

Abstract

Accurate RNA quantification at the single-cell level is critical for understanding the dynamics of gene expression and regulation across space and time. Single molecule FISH (smFISH), such as RNAscope, provides spatial and quantitative measurements of individual transcripts and can therefore be used to explore differential gene expression among a heterogeneous cell population if combined with cell identity information. However, such analysis is not straightforward, and existing image analysis pipelines cannot integrate both RNA transcripts and cellular staining information to automatically output cell type-specific gene expression. We developed an efficient and customizable analysis method, Single-Molecule Automatic RNA Transcription Quantification (SMART-Q), to enable the analysis of gene transcripts in a cell type-specific manner. SMART-Q efficiently infers cell identity information from multiplexed immuno-staining and quantifies cell type-specific transcripts using a 3D Gaussian fitting algorithm. Furthermore, we have optimized SMART-Q for user experience, with features such as flexible parameter specification, batch data outputs, and visualization of analysis results. SMART-Q meets the demands for efficient quantification of single-molecule RNA and can be widely used for cell type-specific RNA transcript analysis.

Entities: Chemical

Substances: RNA, Messenger; RNA

Year: 2020 PMID: 32348304 PMCID: PMC7190163 DOI: 10.1371/journal.pone.0228760

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752

Introduction

Comparative analysis of gene expression profiles among single cells is critically important for understanding the regulation of transcriptional activity, because transcriptional profiles are substantially heterogeneous across cell populations [1]. This issue is more pronounced in tissue samples and primary cells that involve multiple cell types or identities, making it indispensable to distinguish different cell types during analysis. The fluorescence in situ hybridization (FISH) method [2-4] has provided an avenue to investigate gene expression in single cells. Single molecule FISH (smFISH) [5], with multiple fluorescent oligos hybridized to each transcript, can quantitatively evaluate the stochastic expression of genes. RNAscope [6] has greatly improved the accuracy and efficacy of single-molecule transcript detection by utilizing double Z probes and successive amplification. With the aforementioned methods, gene transcripts can be quantified by counting individual dots in 3D stacks at the single-cell level. Various approaches have been developed to analyze data derived from FISH experiments [7-9]. These methods include a 3D Gaussian filtering step to correct illumination and a Laplacian of Gaussian enhancing step to obtain local maxima above a certain threshold [10]. Among them, starfish (https://spacetx-starfish.readthedocs.io/en/latest/), an open-source Python-based platform developed for analyzing spatial transcriptomics, is by far the most efficient imaging analysis tool for multiplexed spatial smFISH [11-17] and in-situ sequencing (ISS) data [18]. While starfish is useful for analyzing FISH data, it has several limitations. First and foremost, starfish lacks features for processing additional layers of information, such as cell marker immunostaining in a heterogeneous cell population, which hinders the analysis of gene transcription in complex developmental or pathological processes [19, 20]. In addition, there is still room for improvement in multiple aspects, including signal-to-noise enhancement, precise nuclei segmentation, and options for adjusting parameters during intermediate steps. Here, we present the Single-Molecule Automatic mRNA Transcription Quantification pipeline (SMART-Q), which offers flexible and user-friendly features for automatic detection of gene transcript signals and immunofluorescence signals and for precise segmentation of single cells. SMART-Q can analyze multiple channels in a single pipeline, and it accurately and efficiently quantifies cell type-specific single-molecule RNA by integrating cell marker information, with an improved user experience.
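The Laplacian of Gaussian step mentioned above can be illustrated with a short sketch. This is a minimal example that assumes NumPy and SciPy are available; it is not the code of starfish, SMART-Q, or any cited tool. The function name detect_spots_log and the values of sigma, rel_threshold, and size are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage as ndi


def detect_spots_log(stack, sigma=1.5, rel_threshold=0.5, size=5):
    """Return (z, y, x) coordinates of bright spots in a 3D stack.

    Hypothetical helper: sigma, rel_threshold and size are illustrative
    values, not parameters taken from starfish or SMART-Q.
    """
    stack = np.asarray(stack, dtype=np.float64)
    # The Laplacian of Gaussian is negative at the centre of a bright blob,
    # so the sign is flipped to turn spots into positive peaks.
    response = -ndi.gaussian_laplace(stack, sigma=sigma)
    # A voxel is a local maximum if it equals the maximum of its neighbourhood.
    is_peak = response == ndi.maximum_filter(response, size=size)
    # Keep only peaks above a threshold relative to the strongest response.
    strong = response > rel_threshold * response.max()
    return np.argwhere(is_peak & strong)


# Synthetic volume: two point sources blurred into diffraction-like spots.
volume = np.zeros((11, 40, 40))
volume[5, 10, 10] = 1.0
volume[5, 25, 30] = 1.0
volume = ndi.gaussian_filter(volume, sigma=1.5)
spots = detect_spots_log(volume)
```

On real images the threshold would need to be tuned against noise, which is the problem the min_mass discussion in the Results addresses.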

Materials and methods

Cell culture

Tissues were dissected and primary cells were dissociated from the developing dorsal cortex according to the protocol of Nowakowski et al. [21]. Samples were collected with prior informed consent in strict observance of legal and institutional ethical regulations. All protocols were approved by the Human Gamete, Embryo, and Stem Cell Research Committee (GESCR) and Institutional Review Board at the University of California, San Francisco. Cells were cultured on coverslips and infected with lentivirus expressing either GFP or mCherry. Cells were fixed in 4% PFA on Day 4 for staining.

RNAscope and immunocytochemistry staining

smFISH targeting nascent RNA of HES1 or BCL11A was performed using the RNAscope® Multiplex Fluorescent Reagent Kit v2 (ACDBio). Probes binding the intronic region of target genes were designed and synthesized by ACDBio. The FISH signal was labeled with TSA Plus Cyanine 5 (Perkin Elmer). Immunocytochemistry was carried out after the FISH procedure [22]. Antibodies targeting GFP (Abcam, ab1218), mCherry (Abcam, ab205402), GFAP (Ab4648) and SATB2 (Abcam, ab34735) were incubated overnight. Secondary antibodies, including Alexa Fluor 594 goat anti-chicken IgY (Thermo Fisher Scientific, A11042), Alexa Fluor 488 donkey anti-mouse IgG (Thermo Fisher Scientific, A21202), Alexa Fluor 546 donkey anti-mouse IgG (Thermo Fisher Scientific, A10036) and Alexa Fluor 488 donkey anti-rabbit IgG (Thermo Fisher Scientific, A21206), were incubated at RT for 1 hr. Nuclei were stained with DAPI for 5 min before mounting with ProLong™ Gold Antifade Mountant (Thermo Fisher Scientific, P36930).

Image acquisition

Images were acquired on a Leica TCS SP8 confocal microscope equipped with a 40× 1.43 NA oil objective. Two sequential scans were performed to avoid spectral overlap. The pixel size in the image plane was 0.285 μm × 0.285 μm. The Z-step size was 0.4 μm.

Code availability statement

The SMART-Q program is freely accessible on GitHub (https://github.com/shenlab-ucsf/SMART-Q).

Results

Enhanced architecture for source codes

In previous releases of starfish, the program was structured largely as a single executable function, requiring users to run through the entire pipeline before determining whether a parameter choice is appropriate. In addition, when the pipeline is transferred to Jupyter Notebook, the default scripts are not optimally organized or detailed for first-time users, creating an excessively front-loaded learning curve. SMART-Q is built on a new coding architecture that breaks the pipeline into separate modules (Fig 1), with the option to change parameters and assess quality at each step of the analysis. Because each module of the pipeline is standardized, users can move through the pipeline smoothly and change parameters much more efficiently when needed, without sacrificing accuracy or flexibility.

Fig 1

Schematic of SMART-Q’s workflow under the new coding architecture.

(A) 3D stacks of smFISH and immunofluorescence images obtained by confocal microscopy. GFP and mCherry are stained to represent different cell types. (B) Image files are converted to SMART-Q format as input. (C) (1) Filtering removes noise and amplifies signals. (2) Detection finds all RNA transcripts. (3a) Nuclei segmentation identifies all nuclei in the DAPI stain. (3b) If the user is quantifying mature mRNA, an additional step is implemented to determine the coordinates of all positive cells in each channel. (4) Nuclei are assigned to cell type-specific channel(s). (5) Final images and (6) final data are saved as PNG and Excel.

Specifically, we implement the workflow as follows: 3D stacks of images are converted into SMART-Q format for each experiment (Fig 1A and 1B). SMART-Q first filters images using Gaussian high pass and Gaussian low pass filters (Fig 1C(1)). The Gaussian high pass filters out background noise, while the Gaussian low pass amplifies and smooths signals from fluorescent spots [23]. The RNA signal is then detected in three dimensions by fitting Gaussians to fluorescent spots of the image (Fig 1C(2)) [10]. Segmentation is then performed on the nuclei channel in two dimensions to determine the location of each nucleus (Fig 1C(3a)). If nascent RNA is the target of analysis, then nuclei are simply assigned to cell channel(s) (Fig 1C(4)). If mature mRNA is the target of analysis, then segmentation is also performed on the cell marker channels (Fig 1C(3b)), and then nuclei are automatically assigned to cell marker channel(s) (Fig 1C(4)). Finally, the positional data derived from RNA detection and segmentation are integrated to determine the final quantification of transcripts in each nucleus or cell (Fig 1C(5)). At the end of the pipeline, images are saved for a quick review of the results and optional quality assurance. The final results and metadata are saved in Excel and CSV format. Quantification results are saved in cumulative batch files for optimal analysis within Excel (Fig 1C(6)).

For users who wish to customize the pipeline by modifying or adding a step, the code has been optimized to make it easily readable and adaptable. Each channel type (transcripts, nuclei, cells) has been simplified to a Python class object, while each step of the pipeline is represented as a single function that belongs solely to the channel type(s) that use it. With a specialized class for each of the three channel types, the code can easily accommodate any number of channels of each type. In addition, with clutter kept to a minimum, users can quickly locate the modules of the code that they wish to customize without having to spend time on irrelevant sections.
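The filtering step of the workflow (Fig 1C(1)) can be sketched as a Gaussian high pass followed by a Gaussian low pass. This is a minimal illustration that assumes NumPy and SciPy; it is not SMART-Q's own code, and the function names and sigma values below are hypothetical.

```python
import numpy as np
from scipy import ndimage as ndi


def gaussian_high_pass(stack, sigma):
    """Subtract a heavily blurred copy to remove slowly varying background."""
    stack = np.asarray(stack, dtype=np.float64)
    background = ndi.gaussian_filter(stack, sigma=sigma)
    # Negative values carry no signal after background removal, so clip them.
    return np.clip(stack - background, 0.0, None)


def gaussian_low_pass(stack, sigma):
    """Smooth pixel-level noise while keeping spot-sized structures."""
    return ndi.gaussian_filter(np.asarray(stack, dtype=np.float64), sigma=sigma)


def band_pass(stack, high_sigma=4.0, low_sigma=1.0):
    """High pass first, then low pass (sigma values are illustrative)."""
    return gaussian_low_pass(gaussian_high_pass(stack, high_sigma), low_sigma)


# Synthetic stack: uniform background of 100 with one bright voxel.
raw = np.full((9, 40, 40), 100.0)
raw[4, 20, 20] += 50.0
filtered = band_pass(raw)
```

In this toy example the uniform background is removed almost entirely, while the bright voxel survives as a smooth spot centred on its original position.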

Determination of the optimal threshold for RNA detection

The final detection of RNA transcripts heavily depends on the parameters chosen by the user. The parameter with the greatest impact on results, called min_mass in the program, is the minimum intensity that a diffraction-limited spot must have in order to be recognized by the detection function. In the previous starfish pipeline, choosing the correct value for this parameter was difficult, as it was not possible to precisely compare the results of RNA detection with the image of the original RNA signal, nor to quantify the intensity value of each detected spot. To solve this problem, SMART-Q provides a visualization tool that overlays the results of detection onto the post-filtered image of the RNA signal. SMART-Q provides a default threshold for FISH signal detection, and users can define additional higher and lower thresholds. The visualization utility identifies the spots in each intensity interval, categorizes them, and overlays the detection results relative to an upper bound and a lower bound of the user’s choice (Fig 2A and 2C). Users can then quickly judge the quality of the spots identified within each interval (Fig 2B and 2D). Displaying multiple intensity cutoffs simultaneously in SMART-Q greatly improves the user’s experience in choosing an appropriate value for the minimum intensity in RNA detection. Notably, our method yields RNA counts similar to those of FISH-quant (Fig 2E and 2F).
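The interval-based categorization described above can be sketched as follows. This is a minimal illustration, not SMART-Q's visualization code: the function name categorize_spots and the category labels are hypothetical, and the bounds reuse the 0.06 and 0.07 values from Fig 2A and 2B only as an example.

```python
import numpy as np


def categorize_spots(intensities, lower, upper):
    """Label each detected spot by where its intensity falls.

    Hypothetical helper: spots below `lower` are candidates for rejection,
    spots at or above `upper` are clearly bright, and spots in between are
    the ones a user would inspect when choosing min_mass.
    """
    intensities = np.asarray(intensities, dtype=float)
    categories = np.full(intensities.shape, "between", dtype=object)
    categories[intensities < lower] = "below_lower"
    categories[intensities >= upper] = "above_upper"
    return categories


# Four example spot intensities, compared against bounds of 0.06 and 0.07.
example = categorize_spots([0.05, 0.065, 0.08, 0.24], lower=0.06, upper=0.07)
```

Overlaying each category in a different color on the filtered image would then show whether the spots in the inspected interval look like genuine transcripts or noise.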

Fig 2

Assessing the quality of RNA detection results.

Graphic representations of the intensities of all detected spots relative to a chosen upper bound intensity and lower bound intensity. By comparing the results of different bounds, the user can determine an optimal minimum brightness for detecting gene transcripts. (A, B) Upper bound intensity of 0.07 and lower bound of 0.06. (C, D) Upper bound of 0.24 and lower bound of 0.23. B and D are zoomed-in views of the boxed subregions in A and C. The scale bars in A and C are 50 μm. (E) RNA detection by FISH-quant with the same data source as in B or D. Detected dots with min_intensity = 200, quality score > 60 (green circles). (F) Paired RNA detection was performed among 5 coverslips using SMART-Q or RNAscope. Paired t test. N = 12, P = 0.8030.

Increased accuracy and quality control for the segmentation of nuclei and cells

Segmentation has been recognized as a challenge shared by all existing methods, because segmentation results frequently contain errors that must be corrected. Some methods, such as FISH-quant, use the Moore-Neighbor tracing algorithm modified by Jacob’s stopping criteria to determine nuclei boundaries [24]. However, this approach requires users to manually trace every nucleus they wish to analyze, which is time-consuming and labor-intensive, and thus highly inefficient, especially for large data sets. Other methods, such as starfish, determine nuclei boundaries using the watershed algorithm, which finds nuclei boundaries based on local minima and produces a segmented region for each local minimum. However, this strategy is known for its tendency to over-segment cells, meaning that a single nucleus may be mistakenly fractured into multiple subcomponents [25-27]. Moreover, when nuclei or cells border each other, they are prone to be under-segmented, meaning that multiple nuclei may be merged into one. In addition, images may contain background noise or artifacts, which cannot be effectively removed by starfish. Thus, segmentation remains a critical step in pressing need of improvement. SMART-Q provides three solutions that mitigate these segmentation issues. First, we have added a new parameter, called minimum depth, to alleviate the over-segmentation that commonly occurs in other analysis pipelines [25, 28]. When the watershed function classifies pixels by measuring saliency in contours, regions that can be taken as catchment basins are formed by local geometric structure. The minimum depth is a factor reflecting the height between watershed minima and various lower boundary points, or the height limits between neighboring catchment basins. Defining the minimum depth enables the sequential merging of watershed regions whose depth is below the minimum. With this additional parameter, the new watershed function in SMART-Q ensures that each local minimum is significant enough to warrant separation of regions, thus effectively avoiding separations due to concavities with low depth. We recommend setting this factor to a value between 10^-6 and 10^-7, which in our experience is optimal for preventing over-segmentation. On the other hand, the watershed function with minimum depth also provides a way to handle clustered nuclei or cells, which are more often observed in tissue sections. When bordering nuclei are under-segmented, the depth in the bordering region tends to be shallow (Fig 3A), with a curvature similar to that of over-segmented regions. Leveraging this tendency, the user may set the minimum depth to slightly higher values, which artificially over-segments the merged nuclei, thereby effectively separating nuclei that share a border and correcting the under-segmentation error. The optimal range we recommend is between 5 × 10^-5 and 5 × 10^-6 for mammalian cells (Fig 3B).
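The idea of requiring a minimum depth before two watershed regions are kept apart can be sketched with standard tools. The example below assumes SciPy and scikit-image and uses an h-maxima transform on a distance map as a stand-in; it is not SMART-Q's watershed function, and segment_with_min_depth is a hypothetical name. In this sketch min_depth is measured in pixels of distance-map height, so the values recommended above do not transfer to it, and the way SMART-Q scales its own parameter is not reproduced here.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.morphology import h_maxima
from skimage.segmentation import watershed


def segment_with_min_depth(mask, min_depth):
    """Watershed a binary nuclei mask, ignoring basins shallower than min_depth.

    Hypothetical helper: a peak of the distance map only seeds its own
    region if it rises at least `min_depth` above the saddle that connects
    it to a higher peak.
    """
    distance = ndi.distance_transform_edt(mask)
    # Suppress peaks whose depth relative to their surroundings is too small.
    seeds = h_maxima(distance, min_depth)
    markers, _ = ndi.label(seeds)
    # Flood the inverted distance map from the surviving seeds only.
    return watershed(-distance, markers, mask=mask)


# Two touching nuclei modelled as overlapping disks of radius 10 and 7.
yy, xx = np.mgrid[0:40, 0:48]
mask = ((yy - 20) ** 2 + (xx - 15) ** 2 <= 100) | (
    (yy - 20) ** 2 + (xx - 29) ** 2 <= 49
)
split = segment_with_min_depth(mask, min_depth=1.0)
merged = segment_with_min_depth(mask, min_depth=3.0)
```

With the smaller value the neck between the two disks is deep enough to separate them, while the larger value treats the smaller disk as a shallow bump and keeps a single region.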

Fig 3

Use of minimum depth and correction in segmentation.

(A) Segmentation of nuclei is performed with starfish’s default settings. Magenta arrow designates undersegmentation, or multiple nuclei erroneously merged together. No minimum depth was used. (B) Minimum depth of 10^-5 is used to artificially oversegment the previously undersegmented nuclei. White arrow designates oversegmented nuclei. (C) Oversegmentation can easily be corrected using the new segmentation correcting function. (D) The original nuclei image, which accurately resembles the final nuclei segmentation. (E) Outline of nuclei by FISH-quant. The entire image was selected for autodetection. Empty arrows point to the under-segmented nuclei that need to be separated manually. The scale bar is 10 μm.

Second, SMART-Q introduces an additional parameter: minimum size. This parameter allows for automatic removal of artifacts and background noise. Because dead cells’ nuclei diminish in size upon death, they can easily be removed during segmentation with the minimum size parameter. Finally, for nuclei that have been mis-identified or over-segmented and cannot be fixed by parameter adjustment, SMART-Q provides a new function for direct quality correction. Previously in starfish, the user was unable to fix the results of segmentation directly and instead had to rely on iterative alteration of the parameters to achieve desirable results. Our method, in contrast, first visualizes the results of segmentation, giving a unique ID to each segmented area. With these unique IDs, users can flexibly manipulate the segmentation results, for example by deleting certain region(s) or merging multiple regions into one, simply by feeding SMART-Q the ID(s) of the corresponding region(s) (Fig 3B, 3C and 3D). This makes it feasible to manually correct incorrectly assigned nuclei after visual inspection and verification. For comparison, we analyzed the same image with the outline function in FISH-quant (Fig 3E). The entire image is selected, and nuclei are automatically detected by an intensity threshold that cannot be adjusted. When FISH-quant fails to separate two adjacent nuclei, the user has to divide the joined nuclei manually by drawing the boundary, which can be very inefficient and risks introducing bias. The new minimum depth and minimum size parameters significantly improve the accuracy and flexibility of the segmentation step. Any remaining issues in segmentation can now be directly fixed using our new quality correction function, substantially reducing the time and effort needed to achieve accurate segmentation.
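The minimum size filter and the ID-based corrections described above can be sketched on a labeled image, in which each segmented region is stored as a positive integer ID and the background as 0. This is an illustration assuming NumPy only; the function names are hypothetical and do not come from the SMART-Q code base.

```python
import numpy as np


def remove_small_regions(labels, min_size):
    """Set regions with fewer than min_size pixels to background (0)."""
    sizes = np.bincount(labels.ravel())
    small_ids = np.flatnonzero(sizes < min_size)
    small_ids = small_ids[small_ids != 0]  # never treat background as a region
    cleaned = labels.copy()
    cleaned[np.isin(labels, small_ids)] = 0
    return cleaned


def delete_regions(labels, ids):
    """Remove the regions whose IDs the user picked after inspection."""
    cleaned = labels.copy()
    cleaned[np.isin(labels, ids)] = 0
    return cleaned


def merge_regions(labels, ids):
    """Merge several over-segmented pieces into one region (lowest ID wins)."""
    merged = labels.copy()
    merged[np.isin(labels, ids)] = min(ids)
    return merged


# Toy segmentation: one nucleus split into regions 1 and 2, plus two specks.
labels = np.zeros((6, 6), dtype=int)
labels[0:3, 0:3] = 1
labels[0:3, 3:6] = 2
labels[5, 5] = 3
labels[4, 0:2] = 4
cleaned = remove_small_regions(labels, min_size=3)
corrected = merge_regions(cleaned, [1, 2])
```

Each function returns a new array, so a user can inspect the result of one correction before applying the next.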

Addition of a new feature to enable assignment of nuclei to cell types

RNA quantification experiments involving heterogeneous populations often seek to understand the differences between the various cell types or identities analyzed. For example, a researcher may aim to identify the differences in gene expression between wild type cells and CRISPR-modified cells or between differentiated and undifferentiated cells. To detect these differences, one must be able to accurately and efficiently assign an identity to each nucleus or cell to achieve precise quantification of RNA transcripts in a cell type-specific manner. To benefit fully from this new functionality, we need to determine the positional data of both each nucleus (e.g., DAPI staining) and each cellular channel (e.g., IF staining of cell markers). As an example, we infected radial glia cells with GFP- and mCherry-expressing lentivirus, then performed immunocytochemistry for GFP and mCherry proteins to model different cell types, together with FISH staining targeting the intronic region of HES1. In SMART-Q, we determine the positional data of each nucleus by performing segmentation on the DAPI staining (Fig 4A and 4E). Cell staining is categorized using immunofluorescence properties that differ across cell types, such that cells will only exhibit positive staining if they belong to a particular cell type or identity according to the user’s experimental design (Fig 4C and 4D). Different strategies are employed to determine the positions of channel-positive cells when quantifying nascent RNA and mature mRNA. In the case of mature mRNA quantification, where RNA signals are distributed in both the nucleus and the cytoplasm, the user is required to perform segmentation on both the cell identity and morphology channels. We can then integrate the data derived from cellular segmentation to determine which nuclei belong to channel-positive cells, and outline the territory for counting FISH signals. In the current version of SMART-Q, cell segmentation is based on a 2D segmentation algorithm and might not be accurate when there are overlapping boundaries. We therefore currently support only cell segmentation based on DAPI staining. In the case of nascent RNA quantification (Fig 4B), where cellular segmentation is not performed, we implement an approach that requires merely a threshold to determine the position of each bright region in a cell staining. The data are then integrated with nuclei segmentation results to automatically assign nuclei to channel-positive cells (Fig 4G and 4H). In both cases, users can perform quality control to ensure correct results. Next, we validated cell type-specific analysis in human primary cells from the developing cortex. Here, radial glia cells (RG) are labeled with GFAP antibody and excitatory neurons (eNs) are labeled with SATB2 antibody on the same slide. We quantified the expression of the HES1 gene, which is expressed only in RG, and the BCL11A gene, which is expressed in both RG and eNs. Using RNAscope, we demonstrate cell type-specific expression of HES1 in GFAP-positive cells (Fig 4I–4L), while BCL11A is expressed in both cell types (Fig 4M–4P).
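The threshold-based assignment used for nascent RNA can be sketched as follows: a marker channel is thresholded, each segmented nucleus is called positive or negative for that channel, and detected spots are counted per nucleus. This is a minimal illustration assuming NumPy; the function names and the min_fraction overlap rule are assumptions made for the example, since the text does not specify how SMART-Q decides that a nucleus overlaps a bright region.

```python
import numpy as np


def assign_nuclei(nuclei_labels, marker_image, threshold, min_fraction=0.5):
    """Map each nucleus ID to True if it lies in a marker-positive region.

    Assumption for illustration: a nucleus is positive when at least
    `min_fraction` of its pixels exceed `threshold` in the marker channel.
    """
    positive = marker_image > threshold
    assigned = {}
    for nucleus_id in np.unique(nuclei_labels):
        if nucleus_id == 0:  # 0 is background
            continue
        inside = nuclei_labels == nucleus_id
        assigned[int(nucleus_id)] = bool(positive[inside].mean() >= min_fraction)
    return assigned


def count_spots_per_nucleus(nuclei_labels, spot_coords):
    """Count detected (z, y, x) spots that fall inside each 2D nucleus."""
    counts = {int(i): 0 for i in np.unique(nuclei_labels) if i != 0}
    for _z, y, x in spot_coords:
        nucleus_id = int(nuclei_labels[int(round(y)), int(round(x))])
        if nucleus_id != 0:
            counts[nucleus_id] += 1
    return counts


# Toy data: two nuclei, with marker signal covering only the first one.
nuclei = np.zeros((10, 10), dtype=int)
nuclei[1:4, 1:4] = 1
nuclei[6:9, 6:9] = 2
marker = np.zeros((10, 10))
marker[0:5, 0:5] = 1.0
is_positive = assign_nuclei(nuclei, marker, threshold=0.5)
counts = count_spots_per_nucleus(
    nuclei, [(0, 2, 2), (1, 2.2, 3.0), (0, 7, 7), (0, 5, 5)]
)
```

Combining the two dictionaries gives transcript counts restricted to channel-positive nuclei, which is the quantity reported per cell type.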

Fig 4

Integrative analysis of cell type-specific transcript counts by SMART-Q.

(A, C, D) Immunocytochemistry staining for nuclei, GFP and mCherry. (B) RNAscope targeting HES1 transcripts in radial glia cells. (E) Nuclear segmentation. (F) Pseudocolor composite of all channels, including cell type-specific transcript counting. (G, H) Transcript counts in GFP/mCherry positive cells. (I, M) Composite image of immunocytochemistry staining in primary cells. Radial glia cells are labeled with GFAP (red), and excitatory neurons are labeled with SATB2 (green). (J, N) Nascent RNA transcript counts in all nuclei. (K, O) Nascent RNA transcript counts in identified radial glia cells. (L, P) Nascent RNA counts in identified excitatory neurons. (I-L) RNAscope with probes targeting the intronic region of HES1. (M-P) RNAscope with probes targeting intronic regions of BCL11A. The scale bars are 30 μm.

Streamlined procedure for saving data and parameter settings

While developing SMART-Q, we identified a need to save data for subsequent analyses in other programs, which might be implemented on platforms other than Python, such as R, Excel, or Google Sheets. To expedite this process, we chose to save final results in a batch file containing the results for every sample analyzed in a batch. We have written template files for post-analysis in Google Sheets, which users can customize for their purposes. We have also added a new feature that allows users to save settings and metadata for re-establishing the analysis of a specific sample. Previously in starfish, the settings for a sample were not stored once its analysis was complete. If a parameter or step in the pipeline later needed to be changed, it was impossible to re-establish the pipeline with the parameters previously used. In SMART-Q, a metadata file with all parameters and settings is now saved each time an analysis is run. These metadata files are not saved in batch as the final results are, but rather in individual files for each sample analyzed. If a parameter or step of the pipeline later needs to be altered, this metadata file can be used as input to a modified version of the pipeline to re-establish it with the previous parameters, while still allowing the user to change any parameters or steps of their choice.
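The split between a cumulative batch file and one metadata file per sample can be sketched with the Python standard library. The file names, the column names, and the choice of JSON for the metadata are assumptions made for this example; SMART-Q's actual file layout is not described in the text.

```python
import csv
import json
from pathlib import Path


def save_sample(out_dir, sample_name, params, counts):
    """Write per-sample metadata and append counts to a shared batch file.

    Hypothetical layout: <sample>_metadata.json holds the parameters and
    batch_results.csv accumulates one row per nucleus for every sample.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    meta_path = out_dir / f"{sample_name}_metadata.json"
    meta_path.write_text(json.dumps(params, indent=2, sort_keys=True))

    batch_path = out_dir / "batch_results.csv"
    is_new = not batch_path.exists()
    with batch_path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if is_new:  # write the header only once per batch
            writer.writerow(["sample", "nucleus_id", "transcript_count"])
        for nucleus_id, count in sorted(counts.items()):
            writer.writerow([sample_name, nucleus_id, count])
    return meta_path, batch_path


def load_params(meta_path, **overrides):
    """Reload saved parameters, optionally changing some before a re-run."""
    params = json.loads(Path(meta_path).read_text())
    params.update(overrides)
    return params
```

Calling load_params(meta_path, min_mass=0.08), for instance, would restore every saved setting for that sample while changing only the detection threshold.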

Simplified image file input feature and streamlined visualization for results display and quality assurance

We further equip SMART-Q with the ability to execute ImageJ or Fiji macro scripts, a feature that is lacking in other analysis pipelines. Users can now create composite images of the final results and execute any ImageJ or Fiji macro within SMART-Q, making it possible to create the desired images in a batch analysis and drastically reducing the time this process would otherwise require (Fig 4F). We also find it useful to save images at each step of the pipeline for quality assurance. We designed images that best exemplify the effects of each step, and SMART-Q now displays and saves images for the following steps: filtering, RNA detection, segmentation, channel assignment, and final quantification. These images allow users to review critical quality checkpoints whenever desired without having to re-run the pipeline.

Discussion

SMART-Q saves time when analyzing large datasets. For one image with 4 channels and 20 Z-stacks, it takes about 5 min to go from confocal image output to cell type-specific RNA counts. In comparison, we found that FISH-quant spends a similar amount of time on splitting images and 3D dot identification, but much more time on segmentation when multiple nuclei are outlined manually. Notably, multiple Jupyter Notebook interfaces can run simultaneously, while other pipelines can only analyze images one by one. Thus, by using SMART-Q, users can save days when processing hundreds of images. Some improvements can be implemented in the future to make SMART-Q even more powerful in cell identity assignment. Currently, segmentation is performed in 2D rather than in 3D, which creates issues when a significant portion of multiple nuclei or cells share the same z plane. In highly confluent tissue samples, the lack of an adequate method to accurately segment overlapping nuclei or cells means that the RNA transcripts belonging to overlapping regions cannot be properly quantified, and thus they must be removed from analysis during quality control. Similarly, overlapping nuclei can be difficult to assign to a particular cell type, as it may be difficult to determine which nucleus belongs to a channel-positive cell. Future efforts are warranted to ameliorate this issue by expanding the capability to segment along the z-axis as well, or by integrating an optional hand-drawing or semi-supervised drawing application into the segmentation method.

Conclusion

We have developed SMART-Q to quantify RNA transcripts at the single cell level with assigned cellular identity. Through its new modular design and a strong focus on ease of use and customizability, SMART-Q is applicable to any 3D FISH images and supports cell type-specific transcript quantification across different experimental approaches. SMART-Q can automatically assign nuclei to cell channels, allowing users to compare results, both quantitatively and visually, among any number of cell types depending on the user’s experimental design. SMART-Q provides quality control functionalities to test varying thresholds for RNA transcript detection, and to improve nuclei and cell segmentation. Finally, by saving both quantitative and qualitative results, SMART-Q enhances users’ capabilities with respect to quality assurance and streamlined analysis of quantification results. Overall, the streamlined and modular characteristics of SMART-Q significantly improve the user experience and make cell type-specific RNA quantification analysis highly accurate and efficient.

7 Feb 2020

PONE-D-20-01686
SMART-Q: An Integrative Pipeline Quantifying Cell Type-Specific RNA Transcription
PLOS ONE

Dear Dr. Shen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Reviewers are suggesting to test SMART-Q for more than one RNA and cell types, and provide the demonstration of the segmentation of both of nuclei and cells.

We would appreciate receiving your revised manuscript by Mar 23 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,
Ruijie Deng
Academic Editor
PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf
2. In your Methods, please state the exact origin of the cells used in your study.
3. Please amend the manuscript submission data (via Edit Submission) to include author Lenka Maliskova.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Partly

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this manuscript, the author developed an efficient and customizable analysis method, which can analyze gene transcripts in a cell type-specific manner. The method is facile and bring great convenience for users. I recommend its publication after some revisions as below:
1. The method is based on Z-stack images, RNA amplicons may merged together, or when target RNA has high expression, many amplicons may be too close to each other to be distinguished. The accuracy for RNA detection may be limited.
2. The author stated that their method has increased accuracy and quality control for the segmentation of nuclei and cells. However, they just showed the data for the segmentation of nuclei, while didn’t show the evidence of the segmentation of cells. Clinging cells always exist, and cell boundaries is more difficult to determined.
3. Another issue is how about the processing time SMART-Q required.

Reviewer #2: In this manuscript, the authors develop an efficient and customizable method to analyze data derived from FISH or RNAscope experiments, called Single-Molecule Automatic RNA Transcription Quantification (SMART-Q). The SMART-Q improves the features for processing additional layers of information compared with starfish, which is an open-source Python-based platform. It efficiently infers cell identity information from multiplexed immuno-staining and quantifies different cells with assigned cellular identity using a 3D Gaussian fitting algorithm. Furthermore, the authors have optimized SMART-Q for user experiences, such as flexible parameters specification, batch data outputs, and visualization of analysis results. SMART-Q may meet the demands for efficient quantification of single-molecule RNA and can be widely used for cell type-specific RNA transcript analysis. I think this work would meet the criteria of PLOS ONE, if the following issues are properly addressed.
1. The author claimed that the SMART-Q can quantify cell type-specific RNA transcription. However, for cell type, the authors use the same type of cell infected with lenti-virus expressing either GFP or mCherry to character different cellular identity, this is not strict and accurate; for specific RNA, the authors only use a nascent RNA of HES1, which can't reflect cell type-specific RNA transcription. Therefore, I suggest that two different cell types and two cell type-specific RNA are needed in this work.
2. After infected with lenti-virus expressing either GFP or mCherry, cells have already produced different fluorescence signals, which could be effectively distinguished. But, why was still immunocytochemistry carried out targeting GFP and mCherry? Please explain it.
3. Most of the references are published before 5-10 years, and it is recommended to update the references.

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

Submitted filename: PONE-D-20-01686-comment.docx

22 Mar 2020

Journal Requirements:

When submitting your revision, we need you to address these additional requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.plosone.org/attachments/PLOSOne_formatting_sample_main_body.pdf and http://www.plosone.org/attachments/PLOSOne_formatting_sample_title_authors_affiliations.pdf

We thank the editor for the comments. We checked the style and edited the manuscript accordingly.

2. In your Methods, please state the exact origin of the cells used in your study.

The cell source is now described in more detail in the Methods (lines 97-103).

3. Please amend the manuscript submission data (via Edit Submission) to include author Lenka Maliskova.

The author list has been amended to include Lenka Maliskova. We have also added Mark-Phillip Pebworth and Arnold Kriegstein in recognition of their contributions to the manuscript revision.

Review Comments to the Author:

Reviewer #1: In this manuscript, the author developed an efficient and customizable analysis method, which can analyze gene transcripts in a cell type-specific manner. The method is facile and bring great convenience for users. I recommend its publication after some revisions as below:

1. The method is based on Z-stack images, RNA amplicons may merged together, or when target RNA has high expression, many amplicons may be too close to each other to be distinguished. The accuracy for RNA detection may be limited.

We appreciate Reviewer #1’s recognition of the efficiency of SMART-Q in cell type-specific analysis of the FISH signal. When imaging RNAscope slides, we found that the average size of RNA amplicons is ~800 nm, and each dot appears in about 3-4 Z-slices. If most of the dots are merged, for example when the mean diameter exceeds 3-5 µm, the FISH protocol needs to be adjusted to obtain more specific and clearer staining. If only a small proportion of regular-sized amplicons are merged, SMART-Q can correct for this. In the 3D Gaussian algorithm, the fluorescence intensity is fitted to a Gaussian distribution in both the XY and XZ directions. Dots that overlap in the XY direction can therefore still be separated by the saddle between the two points in the XZ direction. In addition, 3D imaging is more comprehensive and accurate than 2D imaging for FISH quantification, because the number of dots may differ between layers along the Z-axis.

2. The author stated that their method has increased accuracy and quality control for the segmentation of nuclei and cells. However, they just showed the data for the segmentation of nuclei, while didn’t show the evidence of the segmentation of cells. Clinging cells always exist, and cell boundaries is more difficult to determined.

SMART-Q can perform 2D cell segmentation, which is based on the same algorithm as nuclear segmentation. However, the current SMART-Q version focuses on nascent RNA quantification, and cell segmentation is not the first segmentation choice. We agree with the difficulties Reviewer #1 mentioned in identifying cell boundaries. 3D segmentation might solve the problem better, but it requires near-perfect cell morphology staining and substantial computing power, which would compromise the efficiency that this version of SMART-Q aims for. We adjusted the description so as not to overclaim this function (lines 269-271) and will incorporate an updated 3D segmentation algorithm into SMART-Q in the future to serve more applications.

3. Another issue is how about the processing time SMART-Q required.

SMART-Q saves time when analyzing large datasets. For one image with four channels and 20 Z-slices, it takes about 1 min to split the image into individual channels and slices, 1 min to format them into SMART-Q inputs, and 3 min to perform the 3D analysis in Jupyter Notebook. We compared this processing time with that of FISH-quant. FISH-quant spends a similar amount of time on splitting images and 3D dot identification, but much more time on segmentation because cells or nuclei must be outlined manually. Notably, multiple Jupyter Notebook sessions can run simultaneously, whereas other pipelines can only analyze images one by one. When processing hundreds of images, this will save days. We added these descriptions to the manuscript (lines 327-363).

Reviewer #2: In this manuscript, the authors develop an efficient and customizable method to analyze data derived from FISH or RNAscope experiments, called Single-Molecule Automatic RNA Transcription Quantification (SMART-Q). The SMART-Q improves the features for processing additional layers of information compared with starfish, which is an open-source Python-based platform. It efficiently infers cell identity information from multiplexed immuno-staining and quantifies different cells with assigned cellular identity using a 3D Gaussian fitting algorithm. Furthermore, the authors have optimized SMART-Q for user experiences, such as flexible parameters specification, batch data outputs, and visualization of analysis results. SMART-Q may meet the demands for efficient quantification of single-molecule RNA and can be widely used for cell type-specific RNA transcript analysis. I think this work would meet the criteria of PLOS ONE, if the following issues are properly addressed.

1. The author claimed that the SMART-Q can quantify cell type-specific RNA transcription. However, for cell type, the authors use the same type of cell infected with lenti-virus expressing either GFP or mCherry to character different cellular identity, this is not strict and accurate; for specific RNA, the authors only use a nascent RNA of HES1, which can't reflect cell type-specific RNA transcription. Therefore, I suggest that two different cell types and two cell type-specific RNA are needed in this work.

We appreciate Reviewer #2’s comments. To address this concern, we used primary cultures of human embryonic cortex, which contain a mixture of radial glia (RG), intermediate progenitor cells (IPC), excitatory neurons (eN) and inhibitory neurons (iN). As a proof of principle, we picked an RG-specific gene, HES1, whose expression was detected in GFAP-positive cells (GFAP is a marker for RG) but not in SATB2-positive cells, which are eN (Fig 4I-L). We also picked another gene, BCL11A, which is expressed in both RG and eN, and we show the detection of positive FISH signals in both cell types using BCL11A FISH probes (Fig 4M-P). These results demonstrate the capability of SMART-Q in cell type-specific transcription analysis (lines 276-292).

2. After infected with lenti-virus expressing either GFP or mCherry, cells have already produced different fluorescence signals, which could be effectively distinguished. But, why was still immunocytochemistry carried out targeting GFP and mCherry? Please explain it.

With lentiviral vector-induced expression of GFP and mCherry, we could see the fluorescence in live cells, but it is bleached by the RNAscope staining procedure. To overcome this issue, we had to use immunocytochemistry to enhance the signal.

3. Most of the references are published before 5-10 years, and it is recommended to update the references.

We thank Reviewer #2 for the suggestion. Several recently published references, including refs 4, 7, 21 and 22, have been added.

Submitted filename: Response to Reviews.docx

1 Apr 2020

SMART-Q: An Integrative Pipeline Quantifying Cell Type-Specific RNA Transcription

PONE-D-20-01686R1

Dear Dr. Shen,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
With kind regards,
Ruijie Deng
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed
Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

**********

6. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)
Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

17 Apr 2020

PONE-D-20-01686R1

SMART-Q: An Integrative Pipeline Quantifying Cell Type-Specific RNA Transcription

Dear Dr. Shen:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Ruijie Deng
Academic Editor
PLOS ONE
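
In the response to Reviewer #1 in the review history above, the authors argue that two dots overlapping in XY can still be separated by the saddle between them along Z. The sketch below illustrates only that geometric argument on synthetic data. It is not SMART-Q code and it does not perform Gaussian fitting; it simply counts local maxima. The spot width (the ~800 nm amplicon size treated as a full width at half maximum), the 0.25 µm sampling step, the 1.25 µm Z offset between the two dots, and all function and variable names are assumptions chosen for illustration.

```python
import math

# Assumption: treat the ~800 nm amplicon size as the FWHM of an isotropic
# Gaussian spot, so sigma = FWHM / (2 * sqrt(2 * ln 2)), roughly 0.34 µm.
SIGMA_UM = 0.8 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
STEP_UM = 0.25  # assumed sampling step in X and Z


def intensity(x, y, z, centres, sigma=SIGMA_UM):
    """Sum of isotropic Gaussian spots evaluated at (x, y, z), in µm."""
    total = 0.0
    for cx, cy, cz in centres:
        r2 = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
        total += math.exp(-r2 / (2.0 * sigma ** 2))
    return total


def count_peaks(profile):
    """Count strict local maxima in a 1D intensity profile."""
    return sum(
        1
        for i in range(1, len(profile) - 1)
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]
    )


# Two hypothetical dots at the same XY position, 1.25 µm apart in Z.
centres = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.25)]
z_grid = [-1.0 + STEP_UM * k for k in range(14)]  # -1.0 .. 2.25 µm
x_grid = [-1.0 + STEP_UM * k for k in range(9)]   # -1.0 .. 1.0 µm

# 3D view: intensity along Z through the shared XY position.
z_profile = [intensity(0.0, 0.0, z, centres) for z in z_grid]

# 2D view: maximum-intensity projection over Z, sampled along X at y = 0.
x_projection = [
    max(intensity(x, 0.0, z, centres) for z in z_grid) for x in x_grid
]

peaks_along_z = count_peaks(z_profile)
peaks_in_projection = count_peaks(x_projection)
print(peaks_along_z, peaks_in_projection)
```

With these assumed parameters the profile along Z shows two maxima separated by a dip, while the 2D projection shows a single maximum, which is the situation the authors describe. Two equal Gaussians only produce such a dip when their centres are more than about two standard deviations apart, so dots closer than that would look merged in this sketch as well.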

22 in total

1. Stochastic gene expression in a single cell.

Authors: Michael B Elowitz; Arnold J Levine; Eric D Siggia; Peter S Swain
Journal: Science Date: 2002-08-16 Impact factor: 47.728

2. In situ sequencing for RNA analysis in preserved tissue and cells.

Authors: Rongqin Ke; Marco Mignardi; Alexandra Pacureanu; Jessica Svedlund; Johan Botling; Carolina Wählby; Mats Nilsson
Journal: Nat Methods Date: 2013-07-14 Impact factor: 28.547

3. Spatial organization of the somatosensory cortex revealed by osmFISH.

Authors: Simone Codeluppi; Lars E Borm; Amit Zeisel; Gioele La Manno; Josina A van Lunteren; Camilla I Svensson; Sten Linnarsson
Journal: Nat Methods Date: 2018-10-30 Impact factor: 28.547

4. FISH-quant: automatic counting of transcripts in 3D FISH images.

Authors: Florian Mueller; Adrien Senecal; Katjana Tantale; Hervé Marie-Nelly; Nathalie Ly; Olivier Collin; Eugenia Basyuk; Edouard Bertrand; Xavier Darzacq; Christophe Zimmer
Journal: Nat Methods Date: 2013-04 Impact factor: 28.547

5. Molecular hybridization of radioactive DNA to the DNA of cytological preparations.

Authors: M L Pardue; J G Gall
Journal: Proc Natl Acad Sci U S A Date: 1969-10 Impact factor: 11.205

6. Actin gene expression visualized in chicken muscle tissue culture by using in situ hybridization with a biotinated nucleotide analog.

Authors: R H Singer; D C Ward
Journal: Proc Natl Acad Sci U S A Date: 1982-12 Impact factor: 11.205

7. Single-molecule mRNA detection and counting in mammalian tissue.

Authors: Anna Lyubimova; Shalev Itzkovitz; Jan Philipp Junker; Zi Peng Fan; Xuebing Wu; Alexander van Oudenaarden
Journal: Nat Protoc Date: 2013-08-15 Impact factor: 13.491

8. Detection and segmentation of cell nuclei in virtual microscopy images: a minimum-model approach.

Authors: Stephan Wienert; Daniel Heim; Kai Saeger; Albrecht Stenzinger; Michael Beil; Peter Hufnagl; Manfred Dietel; Carsten Denkert; Frederick Klauschen
Journal: Sci Rep Date: 2012-07-11 Impact factor: 4.379

9. Nanoscale imaging of RNA with expansion microscopy.

Authors: Fei Chen; Asmamaw T Wassie; Allison J Cote; Anubhav Sinha; Shahar Alon; Shoh Asano; Evan R Daugharthy; Jae-Byum Chang; Adam Marblestone; George M Church; Arjun Raj; Edward S Boyden
Journal: Nat Methods Date: 2016-07-04 Impact factor: 28.547

10. A method for manual and automated multiplex RNAscope in situ hybridization and immunocytochemistry on cytospin samples.

Authors: Sara Chan; Audrey Filézac de L'Etang; Linda Rangell; Patrick Caplazi; John B Lowe; Valentina Romeo
Journal: PLoS One Date: 2018-11-20 Impact factor: 3.240

2 in total

1. QuantISH: RNA in situ hybridization image analysis framework for quantifying cell type-specific target RNA expression and variability.

Authors: Anni Virtanen; Sampsa Hautaniemi; Sanaz Jamalzadeh; Antti Häkkinen; Noora Andersson; Kaisa Huhtinen; Anna Laury; Sakari Hietanen; Johanna Hynninen; Jaana Oikkonen; Olli Carpén
Journal: Lab Invest Date: 2022-02-15 Impact factor: 5.502

2. Cell-type-specific 3D epigenomes in the developing human cortex.

Authors: Michael Song; Mark-Phillip Pebworth; Xiaoyu Yang; Ming Hu; Armen Abnousi; Changxu Fan; Jia Wen; Jonathan D Rosen; Mayank N K Choudhary; Xiekui Cui; Ian R Jones; Seth Bergenholtz; Ugomma C Eze; Ivan Juric; Bingkun Li; Lenka Maliskova; Jerry Lee; Weifang Liu; Alex A Pollen; Yun Li; Ting Wang; Arnold R Kriegstein; Yin Shen
Journal: Nature Date: 2020-10-14 Impact factor: 49.962
