Paolo Martini, Gabriele Sales, Linda Diamante, Valentina Perrera, Chiara Colantuono, Sara Riccardo, Davide Cacchiarelli, Chiara Romualdi, Graziano Martello.
Abstract
Genomic imprinting and X chromosome inactivation (XCI) are two prototypical epigenetic mechanisms whereby a set of genes is expressed mono-allelically in order to fine-tune their expression levels. Defects in genomic imprinting have been observed in several neurodevelopmental disorders, in a wide range of tumours and in induced pluripotent stem cells (iPSCs). Single Nucleotide Variants (SNVs) are readily detectable by RNA-sequencing, allowing the determination of whether imprinted or X-linked genes are aberrantly expressed from both alleles, although standardised analysis methods are still missing. We have developed a tool, named BrewerIX, that provides comprehensive information about the allelic expression of a large, manually-curated set of imprinted and X-linked genes. BrewerIX does not require programming skills, runs on a standard personal computer, and can analyze both bulk and single-cell transcriptomes of human and mouse cells directly from raw sequencing data. BrewerIX confirmed previous observations regarding the bi-allelic expression of some imprinted genes in naive pluripotent cells and extended them to preimplantation embryos. BrewerIX also identified misregulated imprinted genes in breast cancer cells and in human organoids and identified genes escaping XCI in human somatic cells. We believe BrewerIX will be useful for the study of genomic imprinting and XCI during development and reprogramming, and for detecting aberrations in cancer, iPSCs and organoids. Because it is easy for non-computational biologists to use, its implementation could become standard practice during sample assessment, thus raising the robustness and reproducibility of future studies.
Year: 2022 PMID: 35177756 PMCID: PMC8854590 DOI: 10.1038/s42003-022-03087-4
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
Fig. 1 Analyses of imprinted gene expression in naive pluripotent cells with BrewerIX.
a BrewerIX rationale and overall implementation scheme for the Standard pipeline. b False discovery rate estimates obtained by comparing WES calls and BrewerIX biallelic calls in one male BJ fibroblast and two iPS cell lines. Three threshold combinations of overall depth (OD) and minimal coverage of the minor allele (MAC) were used; true positives (TP) in cyan; false positives (FP) in shades of orange. c BrewerIX gene summary panel results on bulk RNA-seq data from isogenic human fibroblasts (BJ FIBRO), primed (HPD00) and naive (HPD01/3/4) iPSCs. The larger the dot, the higher the number of SNVs supporting the biallelic call. The brighter the orange, the closer to 1 is the average of the allelic ratios (minor/major) of all the biallelic SNVs. Empty dots indicate detected genes with no evidence of biallelic expression, gray squares indicate genes detected but not reaching the user’s thresholds, while the absence of any symbol indicates that the gene was not detected. d BrewerIX SNV summary panel for MEG3 in the case study shown in panel c. A barplot for each sample is reported, with as many bars as the number of SNVs per gene. Solid colors represent SNVs with both alleles expressed; blue and red indicate the reference and the alternative/minor allele, respectively. Transparent colors indicate SNVs detected with no evidence of biallelic expression, while grayscale colors indicate SNVs that do not meet the minimum coverage. e Experimental validation of the indicated MEG3 SNVs by PCR followed by Sanger sequencing. The SNVs of interest are highlighted by a red box. See Supplementary Table 2 for a list of all SNVs validated. Each SNV was detected in two independent experiments, using either forward or reverse sequencing primers. f BrewerIX gene summary panel results on bulk RNA-seq data generated by Yagi et al.[47]. Murine ESCs were expanded in either 2i/L or S/L conditions, while mouse embryonic fibroblasts (MEF) serve as controls.
g BrewerIX gene summary panel results from bulk RNA-seq data of mESCs cultured in 2i/L or S/L (two biological replicates) by Kolodziejczyk and colleagues[48]. See Fig. 2a for matching single-cell RNA-seq samples.
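The classification described in panels b to d (overall depth, minor-allele coverage, and the minor/major allelic ratio feeding a per-gene summary) can be sketched as follows. This is only an illustrative sketch, not BrewerIX's actual code: the function names, the status labels, and the default thresholds (`min_od=20`, `min_mac=4`, `min_ar=0.2`) are assumptions chosen for the example.

```python
from statistics import mean


def classify_snv(ref_count, alt_count, min_od=20, min_mac=4, min_ar=0.2):
    """Classify one SNV from its reference/alternative read counts.

    Thresholds are hypothetical defaults, not values taken from the paper.
    Returns (status, allelic_ratio), where allelic_ratio is minor/major.
    """
    overall_depth = ref_count + alt_count
    if overall_depth == 0 or overall_depth < min_od:
        return "low_coverage", None  # drawn in grayscale in the SNV panel
    minor, major = sorted((ref_count, alt_count))
    allelic_ratio = minor / major
    if minor >= min_mac and allelic_ratio >= min_ar:
        return "biallelic", allelic_ratio
    return "monoallelic", allelic_ratio


def summarize_gene(snvs, **thresholds):
    """Collapse per-SNV calls into the gene-level summary of panel c.

    snvs is a list of (ref_count, alt_count) tuples for one gene in one sample.
    """
    calls = [classify_snv(ref, alt, **thresholds) for ref, alt in snvs]
    ratios = [ar for status, ar in calls if status == "biallelic"]
    if ratios:
        # Dot size ~ number of supporting SNVs, colour ~ mean allelic ratio.
        return {"status": "biallelic", "n_snvs": len(ratios), "mean_ar": mean(ratios)}
    if any(status == "monoallelic" for status, _ in calls):
        return {"status": "monoallelic", "n_snvs": 0, "mean_ar": None}
    if calls:
        return {"status": "below_threshold", "n_snvs": 0, "mean_ar": None}
    return {"status": "not_detected", "n_snvs": 0, "mean_ar": None}


if __name__ == "__main__":
    # Made-up read counts for three SNVs of a single gene.
    print(summarize_gene([(30, 10), (20, 20), (50, 1)]))
```

The four status values correspond to the four symbols in the gene summary panel: a filled dot, an empty dot, a gray square, and no symbol.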
Fig. 2 Analyses of single-cell RNA-seq data of mouse embryonic and human adult cells.
a Analysis of single-cell RNA-seq data from mESCs cultured in 2i/L or S/L, matching those shown in Fig. 1g. Results are summarized as percentages (degree of blue) of cells in which a given gene was expressed bi-allelically. The number of cells analyzed: 2i/L 384, S/L 288. b Average allelic ratio (AAR), defined as the average of paternal/maternal ratios across single cells, for all genes on the X chromosome in male and female embryonic cells detected by single-cell RNA-seq[41]. Wilcoxon tests were performed between pairs of sequential developmental stages of female embryos (mid2cell vs. late2cell, late2cell vs. 4-cell, 4-cell vs. 16cell, 16cell vs. earlyblast). The number of male (M) and female (F) cells for each developmental stage: mid2cell 6 M, 6 F; late2cell 4 M, 6 F; 4-cell 3 M, 11 F; 16cell 27 M, 23 F; earlyblast 28 M, 15 F. See also Supplementary Fig. 6. c Genes with frequent LOI across mouse developmental stages obtained by studying three datasets[41, 49, 50]. On the y axis, the average allelic ratios (AAR) of single samples (single cells or single embryos for the Santini dataset). Developmental stages have been collapsed into broader categories (cleavage, morula, and blastocyst, see “Methods”). Number of cells per developmental stage: Deng et al. zygote 4, early2cell 8, mid2cell 12, late2cell 10, 4-cell 14, 8-cell 28, 16cell 50, earlyblast 43, midblast 60, lateblast 30; Borensztein et al. 2-cell 6, 4-cell 10, 8-cell 29, 16-cell 15, 32-cell 26, 64-cell 20; Santini et al. blastocyst 8. See also Supplementary Figs. 7 and 8. d Analysis of single-cell RNA-seq data[34] from 772 human fibroblasts and 48 lymphoblastoid cells from 5 female individuals (IND1-5). Results are summarized as percentages (degree of blue) of cells in which a given gene was expressed bi-allelically. Gray indicates undetected genes. Number of cells: IND1 229, IND2 159, IND3 192, IND4 192, and IND5 48. e Results for X chromosome genes on samples described in panel d.
f BrewerIX gene summary panel results from bulk RNA-seq data from human breast cancer samples[53]. LN indicates matching metastatic lymph nodes. g Analysis of single-cell RNA-seq data from breast cancer samples, matching those analyzed in panel f. Number of cells: BC01 22, BC02 53, BC03 33, BC03LN 53, BC04 55, BC05 76, BC06 18, BC07LN 52, BC08 22, BC09 55, BC10 15, BC11 11. Gray indicates undetected genes.
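The two single-cell summaries used in this figure, the percentage of cells with bi-allelic expression of a gene and the average allelic ratio (AAR), can be written as a short sketch. Several details here are assumptions rather than statements about BrewerIX: the function names, the use of all analysed cells as the denominator of the percentage, and the choice to skip cells with no maternal reads when averaging paternal/maternal ratios.

```python
from statistics import mean


def percent_biallelic(calls):
    """Percentage of cells in which a gene is called bi-allelic.

    calls holds one entry per cell: True (bi-allelic), False (detected, no
    bi-allelic evidence) or None (gene not detected in that cell).
    Assumption: the denominator is every analysed cell. Returns None when the
    gene is undetected in all cells (shown in gray in the figure).
    """
    if all(call is None for call in calls):
        return None
    return 100.0 * sum(1 for call in calls if call is True) / len(calls)


def average_allelic_ratio(counts):
    """Average of paternal/maternal read-count ratios across single cells.

    counts is a list of (paternal_reads, maternal_reads) tuples, one per cell.
    Assumption: cells without maternal reads are skipped to avoid dividing by
    zero; the source does not say how such cells are handled.
    """
    ratios = [paternal / maternal for paternal, maternal in counts if maternal > 0]
    return mean(ratios) if ratios else None


if __name__ == "__main__":
    # Made-up values for four cells.
    print(percent_biallelic([True, False, None, True]))
    print(average_allelic_ratio([(5, 10), (10, 10), (3, 0)]))
```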
Fig. 3 Analysis of bulk and single-cell RNA-seq data from human organoids for 14 selected genes.
a Analysis of single-cell RNA-seq data from the fetal neocortex, cortical-like ventricle from cerebral organoids (Vent) and whole-cerebral organoids (minibrains). Gray indicates undetected genes. b Summarized view of the imprinting status of 14 selected genes in 4 different studies in human minibrains and cortical organoids.
Fig. 4 Precision-recall analysis.
Precision and recall analysis on the validated SNVs from this study and from Santini et al. a Precision and b recall using increasing overall depth (OD) and three different allelic ratio (AR) thresholds. c Precision and d recall using increasing OD and three P-value cutoffs for the binomial test. The horizontal dashed lines define the cutoffs of acceptable precision and recall values, while the green areas indicate the best overall depths.
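A precision-recall sweep over overall-depth thresholds, of the kind plotted in panels a and b, might look like the sketch below. It assumes that the validated set contains the SNVs confirmed as bi-allelic and that a SNV is predicted bi-allelic when it passes both an OD and an AR threshold. The function names, the data layout, and the example values are hypothetical.

```python
def precision_recall(predicted, validated):
    """Precision and recall of a predicted set against a validated set of SNV ids."""
    true_pos = len(predicted & validated)
    false_pos = len(predicted - validated)
    false_neg = len(validated - predicted)
    precision = true_pos / (true_pos + false_pos) if predicted else None
    recall = true_pos / (true_pos + false_neg) if validated else None
    return precision, recall


def sweep_overall_depth(candidates, validated, od_thresholds, min_ar):
    """Compute (precision, recall) for each overall-depth threshold.

    candidates is a list of (snv_id, overall_depth, allelic_ratio) tuples.
    """
    results = {}
    for threshold in od_thresholds:
        predicted = {
            snv_id
            for snv_id, depth, ratio in candidates
            if depth >= threshold and ratio >= min_ar
        }
        results[threshold] = precision_recall(predicted, validated)
    return results


if __name__ == "__main__":
    # Made-up candidates: (id, overall depth, minor/major allelic ratio).
    candidates = [("s1", 10, 0.5), ("s2", 25, 0.3), ("s3", 40, 0.1), ("s4", 40, 0.4)]
    validated = {"s1", "s2"}
    for threshold, (precision, recall) in sweep_overall_depth(
        candidates, validated, od_thresholds=[5, 20, 30], min_ar=0.2
    ).items():
        print(threshold, precision, recall)
```

Raising the depth threshold discards low-coverage calls, so recall can only stay level or fall, while precision may move either way. This is why the caption marks a range of best overall depths rather than a single optimum.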