Chanthirika Ragulan, Katherine Eason, Elisa Fontana, Gift Nyamundanda, Noelia Tarazona, Yatish Patil, Pawan Poudel, Rita T Lawlor, Maguy Del Rio, Si-Lin Koo, Wah-Siew Tan, Francesco Sclafani, Ruwaida Begum, Larissa S Teixeira Mendes, Pierre Martineau, Aldo Scarpa, Andrés Cervantes, Iain Beehuat Tan, David Cunningham, Anguraj Sadanandam.
Abstract
Previously, we classified colorectal cancers (CRCs) into five CRCAssigner (CRCA) subtypes with different prognoses and potential treatment responses, later consolidated into four consensus molecular subtypes (CMS). Here we demonstrate the analytical development and validation of a custom NanoString nCounter platform-based biomarker assay (NanoCRCA) to stratify CRCs into subtypes. To reduce costs, we switched from the standard nCounter protocol to a custom modified protocol. The assay included a reduced 38-gene panel that was selected using an in-house machine-learning pipeline. We applied NanoCRCA to 413 samples from 355 CRC patients. Of the fresh frozen samples (n = 237), a subset had matched microarray/RNAseq profiles (n = 47) or formalin-fixed paraffin-embedded (FFPE) samples (n = 58). We also analyzed a further 118 FFPE samples. We compared the assay results with the CMS classifier, different platforms (microarrays/RNAseq) and gene-set classifiers (38 and the original 786 genes). The standard and modified protocols showed high correlation (> 0.88) for gene expression. Technical replicates were highly correlated (> 0.96). NanoCRCA classified fresh frozen and FFPE samples into all five CRCA subtypes, with consistent classification of selected matched fresh frozen/FFPE samples. We demonstrate high and significant subtype concordance across protocols (100%), gene sets (95%), platforms (87%) and with CMS subtypes (75%) when evaluated across multiple datasets. Overall, with further validation, our NanoCRCA assay may facilitate prospective validation of CRC subtypes in clinical trials and beyond.
Year: 2019 PMID: 31113981 PMCID: PMC6529539 DOI: 10.1038/s41598-019-43492-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Assessment of different protocols and reproducibility of reduced gene subtype-based nCounter assay in fresh frozen samples. (a) Flowchart showing the major steps of the NanoCRCA assay protocols and the differences between the standard and modified protocols. Although the modified protocol has additional steps, it substantially reduces the cost without significantly increasing the time of the assay. (b,c) Heatmaps of expression levels of the selected 48 subtype-specific genes (and 2 additional genes) for 22 fresh frozen samples from the Montpellier and OriGene cohorts as measured on a custom nCounter panel using (b) the standard protocol and (c) both standard and modified protocols (median centred within protocols before clustering). (d) A scatter plot of gene expression measurements for 48 genes in 22 fresh frozen samples between the standard and modified protocols (median centred within protocols before correlation). Each point is coloured by the gene’s weight and subtype (PAM score) in the CRCA-786 centroids. The correlation coefficient (Pearson’s r) is shown. (e) Venn diagram indicating the number of samples that were classifiable by the standard and modified protocols, and the concordance between classifiable samples. (f) Heatmap of expression levels of the selected 48 subtype-specific genes (and 2 additional genes) from technical duplicates of 5 samples assayed using the modified protocol with a maximum interval of 40 weeks, showing the reproducibility of the assay. (g) A scatter plot of gene expression measurements for the 48 genes in 5 technical duplicates. Each point is coloured by the gene’s weight and subtype (PAM score) in the CRCA-786 centroids. The correlation coefficient (Pearson’s r) is shown.
Figure 2. Assessment of protocols and reproducibility of reduced gene subtype-based nCounter assay in FFPE samples. (a) Heatmap of expression levels of the selected 48 subtype-specific genes (and 2 additional genes) for 12 patient samples from the RETRO-C cohort as measured on a custom nCounter panel using both standard and modified protocols (24 samples, 12 from each protocol; median centred within protocols before clustering). (b) A scatter plot of gene expression measurements for 48 genes in 12 samples between the standard and modified protocols (median centred within protocols before correlation). Each point is coloured by the gene’s weight and subtype (PAM score) in the CRCA-786 centroids. The correlation coefficient (Pearson’s r) is shown. (c) Venn diagram indicating the number of samples that were classifiable by the standard and modified protocols, and the concordance between classifiable samples. (d) Heatmap of expression levels of the selected 48 subtype-specific genes (and 2 additional genes) from 5 technical duplicates assayed using the modified protocol with a maximum interval of 13 weeks, showing the reproducibility of the assay. (e) A scatter plot of gene expression measurements for 48 genes in 5 samples between technical duplicates. Each point is coloured by the gene’s weight and subtype (PAM score) in the CRCA-786 centroids. The correlation coefficient (Pearson’s r) is shown.
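The protocol comparison described in the captions of Figures 1 and 2 (median centring within each protocol, then Pearson's r between protocols) can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that centring is done per gene across the samples of one protocol, that expression values are already on a log scale, and the function names `median_centre`, `pearson_r` and `protocol_correlation` are hypothetical.

```python
from math import sqrt
from statistics import median


def median_centre(matrix):
    """Subtract each gene's median across samples (assumed per-gene centring).

    matrix: list of rows, one row per gene, one column per sample,
    all measured with the same protocol.
    """
    return [[value - median(row) for value in row] for row in matrix]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def protocol_correlation(standard, modified):
    """Centre each protocol separately, then correlate all gene/sample pairs."""
    flat_standard = [v for row in median_centre(standard) for v in row]
    flat_modified = [v for row in median_centre(modified) for v in row]
    return pearson_r(flat_standard, flat_modified)


# Toy data: 2 genes x 3 samples per protocol (made-up numbers).
standard = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
modified = [[6.0, 7.0, 8.0], [110.0, 120.0, 130.0]]
r = protocol_correlation(standard, modified)
```

In the toy data the modified protocol differs from the standard one only by a constant offset per gene, so centring within each protocol removes the offset before the correlation is computed.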
Figure 3. Selection of a robust 38-gene panel. (a) Overview of the process and pipelines used to select a robust gene set for the NanoCRCA assay using the in-laboratory developed idSample and intPredict computational tools. (b) Bar plot showing the probability of a sample from our original published dataset (n = 387) belonging to a given subtype, as assessed using idSample. The dotted line represents a cut-off of 70% probability of a single subtype in each sample. (c) A line plot showing median MCR and number of genes as selected using the intPredict pipeline and the samples selected in (b) (n = 195). The light blue band shows the 95% credible interval of the median MCRs. (d) Heatmap showing the gene expression of the 38-gene panel selected by the intPredict pipeline in (c) in the 195 samples selected by idSample in (b). The top bar shows the 786-gene signature-based subtype of the samples. (e) Line plots showing MCR using PAM at different numbers of genes for all the subtypes (upper) and individual subtypes (lower) for the 195 samples from (b). PS - prediction strength, PAM - prediction analysis of microarrays, BW - between-within group sum of squares ratio, RF - random forest, DLDA - diagonal linear discriminant analysis, SVM - support vector machine, SVR - support vector regression, and SV - support vector.
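The sample selection step in Figure 3(b) keeps samples whose probability for a single subtype reaches the 70% cut-off. A minimal sketch of that filter is shown below. It does not reproduce idSample itself: the function name `select_confident_samples`, the input layout, the placeholder subtype labels and the choice of an inclusive threshold (>=) are all assumptions made for illustration.

```python
def select_confident_samples(probabilities, cutoff=0.70):
    """Return {sample_id: subtype} for samples whose highest subtype
    probability reaches the cut-off; other samples are left out.

    probabilities: {sample_id: {subtype_label: probability}}
    """
    selected = {}
    for sample_id, subtype_probs in probabilities.items():
        best_subtype = max(subtype_probs, key=subtype_probs.get)
        # Inclusive comparison is an assumption; the caption only gives "70%".
        if subtype_probs[best_subtype] >= cutoff:
            selected[sample_id] = best_subtype
    return selected


# Made-up probabilities with placeholder subtype labels.
example = {
    "s1": {"subtype_A": 0.85, "subtype_B": 0.10, "subtype_C": 0.05},
    "s2": {"subtype_A": 0.45, "subtype_B": 0.40, "subtype_C": 0.15},
    "s3": {"subtype_A": 0.05, "subtype_B": 0.25, "subtype_C": 0.70},
}
confident = select_confident_samples(example)
```

Here sample `s2` is dropped because no single subtype dominates, which mirrors how the caption describes narrowing 387 samples to the 195 used for gene selection.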
Figure 4. NanoCRCA subtyping, pathway and 786-gene signature analysis with subtype stability. (a) Heatmap showing the expression of the 38-gene panel as measured using the NanoCRCA assay in the three fresh frozen cohorts (n = 179). From top to bottom, the upper bars indicate the cohort and the NanoCRCA subtypes as determined by nCounter profiles. The right-hand vertical bar indicates the subtype association of each gene. (b) Heatmap of nCounter PanCancer Progression Panel-based gene expression profiles from the Montpellier and OriGene cohorts of samples (n = 34). Upper bars are as in (a). Genes are grouped according to functional annotations provided by NanoString Technologies. (c,d) Heatmaps of (c) RNAseq/microarray gene expression profiles and (d) NanoCRCA gene expression profiles from samples from all three cohorts having matched data (n = 47). From top to bottom, the upper bars indicate the cohort, the CMS subtype, the CRCA-786 and CRCA-38 subtypes as determined by microarray/RNAseq profiles, and the NanoCRCA subtypes as determined by nCounter profiles. The right-hand vertical bar indicates the subtype association of each gene. (e) Distribution of subtypes according to each classifier. Samples that were of undetermined subtype were excluded for each classifier (NanoCRCA n = 40; CRCA-38 n = 43; CRCA-786 n = 43; CMS n = 37). P-values resulting from statistical tests of proportion for each subtype between the three CRCA classifiers are shown on the left-hand side. (f) Chord plot illustrating the tendency of samples to be classified as the same subtype between the three assays. Samples from all three cohorts which had no undetermined subtype calls were included (n = 34). Each arc connects the classification of a sample in two different assays, and each sample is represented by three arcs (connecting NanoCRCA, CRCA-38 and CRCA-786 subtypes). Samples with the same subtype in all three assays are coloured by that subtype. Samples that had discordant classification between the assays are coloured grey (4/34 samples).
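The three-way comparison in Figure 4(f) first drops samples with any undetermined call and then counts samples that receive the same subtype from all three classifiers. The sketch below illustrates that bookkeeping under stated assumptions: the function name `three_way_concordance`, the `"undetermined"` label and the placeholder subtype names are hypothetical, and the toy calls are made up.

```python
def three_way_concordance(calls, undetermined="undetermined"):
    """Count evaluable and fully concordant samples.

    calls: {sample_id: (nanocrca_call, crca38_call, crca786_call)}
    Returns (n_evaluable, n_concordant), where evaluable samples have no
    undetermined call in any of the three classifiers.
    """
    evaluable = {s: c for s, c in calls.items() if undetermined not in c}
    n_concordant = sum(1 for c in evaluable.values() if len(set(c)) == 1)
    return len(evaluable), n_concordant


# Made-up calls with placeholder subtype labels.
toy_calls = {
    "s1": ("subtype_A", "subtype_A", "subtype_A"),
    "s2": ("subtype_B", "subtype_B", "subtype_C"),
    "s3": ("subtype_C", "undetermined", "subtype_C"),
    "s4": ("subtype_B", "subtype_B", "subtype_B"),
}
n_evaluable, n_concordant = three_way_concordance(toy_calls)
percent_concordant = 100.0 * n_concordant / n_evaluable
```

Applied to the counts given in the caption (34 evaluable samples, 4 discordant), the same ratio would be 30/34, or about 88%.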
Figure 5. Montpellier cohort: NanoCRCA assay, its comparison with other platforms and the CMS classifier; and Singapore FF cohort: NanoCRCA assay. (a) A summary of the Montpellier cohort showing patient characteristics, sample size and available microarray data. (b) Heatmap showing the expression of the 38-gene panel in the Montpellier cohort as measured using the NanoCRCA assay (n = 17). From top to bottom, the upper bars indicate the CMS, CRCA-786 and CRCA-38 subtypes as determined by microarray profiles, and the NanoCRCA subtypes as determined by nCounter profiles. The right-hand vertical bar indicates the subtype association of each gene. The percentage of samples falling into each CMS and CRCA subtype is shown on the right. (c,d) Comparisons between NanoCRCA and the microarray-based classifications CRCA-38, CRCA-786 and CMS showing (c) percent concordance to NanoCRCA and (d) statistical significance (Fisher’s exact test) in the Montpellier cohort. (e) A summary of the Singapore FF cohort showing patient characteristics and sample size. (f) Heatmap showing the expression of the 38-gene panel in the Singapore FF cohort as measured using the NanoCRCA assay (n = 145). The subtypes as assigned using the NanoCRCA assay are shown on the top bar. The right-hand vertical bar indicates the subtype association of each gene. Subtype colours are as in (b).
Figure 6. Translation of assay via matched fresh frozen and FFPE tissue. (a) A summary of the INCLIVA-Valencia cohort, showing patient characteristics and sample size. (b) A scatter plot showing median expression of the 38 genes in samples with tumour cellularity ≥70% in both fresh frozen and FFPE tissues (n = 24). Colours indicate each gene’s association with the subtypes. (c) Alluvial diagram showing the subtype classification of matched fresh frozen and FFPE tissues for cellularity-selected samples (excluding undetermined samples; n = 14). (d) A summary of the Singapore FFPE cohort, showing patient characteristics and sample size. (e) A heatmap showing the expression of the 38-gene panel in Singapore FFPE samples as measured using the NanoCRCA assay (n = 106). The subtypes as assigned using the NanoCRCA assay are shown on the top bar. The right-hand vertical bar indicates the subtype association of each gene. Subtype colours are as in (b).