Marco Cavalli, Klev Diamanti, Gang Pan, Rapolas Spalinskas, Chanchal Kumar, Atul Shahaji Deshmukh, Matthias Mann, Pelin Sahlén, Jan Komorowski, Claes Wadelius.
Abstract
The liver is the largest solid organ and a primary metabolic hub. In recent years, intact cell nuclei have been used to perform single-nuclei RNA-seq (snRNA-seq) on tissues that are difficult to dissociate and on flash-frozen archived tissue samples, to discover unknown and rare cell subpopulations. In this study, we performed snRNA-seq of a liver sample to identify subpopulations of cells based on nuclear transcriptomics. In 4282 single nuclei, we detected, on average, 1377 active genes and we identified seven major cell types. We integrated data from 94,286 distal interactions (p < 0.05) for 7682 promoters from a targeted chromosome conformation capture technique (HiCap) and mass spectrometry proteomics for the same liver sample. We observed a reasonable correlation between proteomics and in silico bulk snRNA-seq (r = 0.47) using tissue-independent gene-specific protein abundancy estimation factors. We specifically looked at genes of medical importance. The DPYD gene is involved in the pharmacogenetics of fluoropyrimidine toxicity and some of its variants are analyzed for clinical purposes. We identified a new putative polymorphic regulatory element, which may contribute to variation in toxicity. Hepatocellular carcinoma (HCC) is the most common type of primary liver cancer and we investigated all known risk genes. We identified a complex regulatory landscape for the SLC2A2 gene with 16 candidate enhancers. Three of them harbor somatic motif-breaking and other mutations in HCC in the Pan Cancer Analysis of Whole Genomes dataset and are candidates to contribute to malignancy. Our results highlight the potential of a multi-omics approach in the study of human diseases.
Keywords: human liver; multi-omics data integration; proteomics; snRNA-seq
Year: 2020 PMID: 32181701 PMCID: PMC7185313 DOI: 10.1089/omi.2019.0215
Source DB: PubMed Journal: OMICS ISSN: 1536-2310
FIG. 1. Schematic cross-sectional view of the structural organization and cell populations of a liver lobule. (A) Each lobule presents a radial structure with a central vein (CV) in the middle, from which HC cords radiate toward the so-called portal triad, which consists of branches of the portal vein (PV) and hepatic artery (HA) together with bile ducts. The sinus is delimited by HCs that are arranged back to back in cords, and it is lined by specialized sinusoidal endothelial cells. Kupffer and immune cells are located in the sinusoidal lumen, while hepatic stellate cells are localized in the space of Disse. (B) Workflow representing the design of the multi-omics study. The crystal structure of the glucose transporter was obtained from PDB (ID: 4ZWB, Deng et al., 2015). HC, hepatocyte.
FIG. 2. t-distributed stochastic neighbor embedding (t-SNE) plot for snRNA-seq analysis of 4282 liver nuclei. (A) Overview of different liver cell populations. (B–I) Expression of cell type-specific gene markers, shown on the right-hand side of each panel. snRNA-seq, single-nuclei RNA-seq.
FIG. 3. Multi-omics overview of DPYD. (A) Number of probe-distal HiCap interactions associated with liver-specific chromHMM annotations. (B) Expression levels from snRNA-seq in different liver cell types. The size of the dot represents the number of nuclei that express the gene, while the color intensity represents the overall level of expression. (C) Heatmap comparing the expected and experimental levels of protein abundance from RNA-seq and MS proteomics experiments, respectively. The first column shows the log2-average expression of genes from the in silico bulk snRNA-seq, while the second one illustrates the estimated protein abundance calculated after calibrating the in silico bulk snRNA-seq levels for RTP abundancy estimation factors. The third column shows the experimental level of the protein abundance detected by MS. The last two columns show the estimated protein abundance calculated after calibrating the log2-average in silico bulk scRNA-seq levels and the log2-average number of reads of bulk RNA-seq. (D, E) Circos plots illustrating interactions of the probe for DPYD with distal elements. (D) The first track shows gene annotations overlapping probe-distal interactions. The second track shows manually curated chromHMM annotations overlapping interacting regions, while the inner one shows the experimental HiCap interactions. The purple arrow marks the probe location. (E) A zoom-in on the area of interest for the circos plot in (D). The first two tracks show chromHMM and gene annotations, respectively. The third track shows six enhancers interacting with the DPYD promoter. The inner track shows probe-distal (red) HiCap interactions. The purple arrow marks the probe location. (F) The genomic landscape for the SNP rs74450569 from (E) (marked in blue), located in a distal element that overlaps an active enhancer and interacts with the DPYD probe. The UCSC genome browser tracks represent, from the top: (1) the ChIP-seq signals for two active enhancer-specific histone modifications and ChromHMM annotations in HepG2 and (2) the transcription factor binding from ChIP-seq experiments from the ENCODE project, with the coloring (light gray to dark gray) proportional to the signal strength observed in different cell lines (cell abbreviations can be found at: https://tinyurl.com/watv2v7). DPYD, dihydropyrimidine dehydrogenase; MS, mass spectrometry; RTP, mRNA-to-protein; scRNA-seq, single-cell RNA sequencing.
FIG. 4. Overview of the top 10 HCC genes with the largest number of significant probe-probe or probe-distal interactions detected in both proteomics and snRNA-seq. (A) Number of probe-distal (right panel) HiCap interactions associated with liver-specific chromHMM annotations. (B) Expression levels from snRNA-seq in different liver cell types. The size of the dot represents the number of nuclei that express the gene, while the color intensity represents the overall level of expression. (C) Heatmap comparing the expected and experimental levels of protein abundance from RNA-seq and MS proteomics experiments, respectively. The first column shows the log2-average expression of genes from the in silico bulk snRNA-seq, while the second one illustrates the estimated protein abundance calculated after calibrating the in silico bulk snRNA-seq levels for RTP abundancy estimation factors. The third column shows the experimental level of the protein abundance detected by MS. The last two columns show the estimated protein abundance calculated after calibrating the log2-average in silico bulk scRNA-seq levels and the log2-average number of reads of bulk RNA-seq. (D, E) Circos plots illustrating interactions of the probe for SLC2A2 with distal elements harboring active enhancer elements. (D) The first track shows gene annotations overlapping probe-distal interactions. The second track shows manually curated chromHMM annotations overlapping interacting regions, while the inner one shows the experimental HiCap interactions. The purple arrow marks the probe location. (E) A zoom-in on the area of interest for the circos plot in (D). The first two tracks show chromHMM and gene annotations, respectively. The third track shows HCC-specific mutations detected from the PanCancer consortium. The inner track shows the 16 probe-distal HiCap interactions overlapping active enhancer elements. The purple arrow marks the probe location. (F) The genomic landscape of the T > A motif-breaking mutation identified in (E) (marked in blue), located in one of the introns of the TNIK gene. The UCSC genome browser tracks represent, from the top: (1) the ChIP-seq signals for two active enhancer-specific histone modifications and ChromHMM annotations in HepG2 and (2) the transcription factor binding from ChIP-seq experiments from the ENCODE project, with the coloring (light gray to dark gray) proportional to the signal strength observed in different cell lines (cell abbreviations can be found at: https://tinyurl.com/watv2v7). HCC, hepatocellular carcinoma.
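Panel (C) of FIG. 3 and FIG. 4 compares protein abundance estimated from in silico bulk snRNA-seq with abundance measured by MS, and the abstract reports a correlation of r = 0.47 for that comparison. The following is a minimal Python sketch of how such a comparison could be set up, not the authors' pipeline. It assumes that the in silico bulk value is the mean count of a gene across nuclei, that RTP calibration is a per-gene multiplication, and that the correlation is a Pearson coefficient on log2 values; none of these details is stated in this record. All function names, gene labels, and numbers are hypothetical toy values.

```python
import math


def in_silico_bulk(counts):
    """Average each gene's counts over all nuclei (assumed definition of 'in silico bulk')."""
    return {gene: sum(values) / len(values) for gene, values in counts.items()}


def estimate_protein(bulk, rtp_factors):
    """Scale mean expression by a gene-specific RTP factor (multiplicative model is an assumption)."""
    return {
        gene: bulk[gene] * rtp_factors[gene]
        for gene in bulk
        if gene in rtp_factors and bulk[gene] > 0
    }


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def correlate_log2(estimated, measured):
    """Correlate log2 estimated vs. log2 measured abundance over shared, non-zero genes."""
    genes = sorted(g for g in estimated if g in measured and measured[g] > 0)
    xs = [math.log2(estimated[g]) for g in genes]
    ys = [math.log2(measured[g]) for g in genes]
    return pearson(xs, ys)


# Hypothetical toy input: three genes, three nuclei each.
counts = {"GENE_A": [0, 2, 4], "GENE_B": [1, 1, 1], "GENE_C": [8, 0, 4]}
rtp_factors = {"GENE_A": 10.0, "GENE_B": 100.0, "GENE_C": 5.0}
measured = {"GENE_A": 40.0, "GENE_B": 200.0, "GENE_C": 40.0}

bulk = in_silico_bulk(counts)
estimated = estimate_protein(bulk, rtp_factors)
r = correlate_log2(estimated, measured)
```

In this toy input the measured values are exactly twice the estimated ones, so the log2 values differ by a constant and the correlation is perfect; real data would give a lower value, such as the r = 0.47 reported in the abstract.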