Daniel Klevebring, Linn Fagerberg, Emma Lundberg, Olof Emanuelsson, Mathias Uhlén, Joakim Lundeberg.
Abstract
BACKGROUND: An interesting field of research in genomics and proteomics is the overlap between the transcriptome and the proteome. Recently, the tools to analyze gene and protein expression on a whole-genome scale have improved, including the availability of new-generation sequencing instruments and high-throughput antibody-based methods to analyze the presence and localization of proteins. In this study, we used massive transcriptome sequencing (RNA-seq) to investigate the transcriptome of a human osteosarcoma cell line and compared the expression levels with protein data obtained in situ from antibody-based immunohistochemistry (IHC) and immunofluorescence microscopy (IF).
Year: 2010 PMID: 21126332 PMCID: PMC3014981 DOI: 10.1186/1471-2164-11-684
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Study summary.
| Method | Number of antibodies | Number of genes analyzed | Percentage of total number of genes analyzed | Number of genes present in common subset | Percentage present in common subset |
|---|---|---|---|---|---|
| RNA-seq | na | na | na | 2345 | 85.3% |
| IHC | 5329 | 4380 | 21.2% | 2439 | 88.7% |
| IF | 3626 | 3268 | 15.9% | 2023 | 73.6% |
| Common subset | 2749 | 2749 | 13.3% | na | na |
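The percentages in the last column can be reproduced from the counts in the table. The sketch below is only an arithmetic check; it assumes that the three rows with "present" counts correspond to RNA-seq, IHC and IF in that order (inferred from the figure captions) and that the common subset contains 2749 genes. The function and variable names are illustrative.

```python
# Counts taken from the study summary table. The method labels are an
# assumption inferred from the figure captions, not stated in the table rows.
COMMON_SUBSET = 2749
present = {"RNA-seq": 2345, "IHC": 2439, "IF": 2023}


def percent_present(n_present, n_total=COMMON_SUBSET):
    """Return the share of the common subset called present, in percent."""
    return 100.0 * n_present / n_total


# Round to one decimal place, as in the table.
percentages = {method: round(percent_present(n), 1) for method, n in present.items()}

for method, pct in percentages.items():
    print(f"{method}: {pct}%")
```

With these counts the rounded values agree with the 85.3%, 88.7% and 73.6% listed in the table.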
Figure 1. Overview of the data types used. (A) Images acquired from IHC were automatically processed and annotated. (B) Images from IF were manually annotated with staining intensity and a validation score. (C) For RNA-sequencing, reads mapping uniquely to exons were counted and an RPKM value was calculated for each gene.
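As an illustration of panel (C), the following is a minimal sketch of the standard RPKM definition (reads per kilobase of exon model per million mapped reads). It is not taken from the study's pipeline; the function name, argument names and the example numbers are hypothetical.

```python
def rpkm(exon_reads, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads.

    exon_reads: reads mapping uniquely to the gene's exons
    exon_length_bp: summed exon length of the gene in base pairs
    total_mapped_reads: total number of mapped reads in the library
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return exon_reads * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on 2000 bp of exon, 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
print(example)
```

Normalizing by both exon length and library size makes values comparable between genes of different length and between sequencing runs of different depth.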
Figure 2. (A) Overlap of IHC data with RNA data. In IHC, 88.7% of the investigated genes are present. (B) As in (A), but overlap between IF and RNA-sequencing. In IF, 73.9% of the investigated genes are called present. The 'present overlaps' between IHC and RNA-sequencing and IF and RNA-seq are 77.2% and 64.4%, respectively.
Figure 3. (A) Venn diagram of presence flags for the three platforms (A = IHC, B = IF, C = RNA-seq). 60.1% of all genes investigated are present in all platforms. Only 34 genes (1.2%) are absent in all platforms. (B) Cumulative transcription density curves for the categories where RNA-sequencing data are available (C, AB, BC, ABC). A left-shifted curve indicates a larger fraction of lowly transcribed genes. Category C (blue line; RNA-seq only) contains genes with lower transcription than the full HPA subset (solid black line) (KS-test, one-sided, p = 3.8 × 10⁻⁷). Genes in BC (green line; RNA-seq positive, IF positive, IHC negative) also generally have a lower level of transcription than the HPA subset (KS-test, one-sided, p = 4 × 10⁻⁵). This is likely due to a higher sensitivity of IF compared with IHC. Interestingly, genes in the HPA subset generally display higher transcription levels (KS-test, one-sided, p < 2.2 × 10⁻¹⁶) than all protein-coding genes (dashed black line).
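The one-sided comparison of cumulative curves in panel (B) can be sketched as a two-sample Kolmogorov-Smirnov test. The code below is a minimal pure-Python illustration, not the authors' analysis: it computes the statistic D+ (the largest amount by which the empirical CDF of one sample lies above that of a reference) and uses the asymptotic approximation p ≈ exp(−2·D+²·nm/(n+m)), which is only reliable for reasonably large samples. All names and the toy expression values are hypothetical.

```python
import math
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Empirical CDF of a sorted sample evaluated at x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_one_sided(sample, reference):
    """One-sided two-sample KS test.

    Tests whether `sample` is shifted towards lower values than `reference`,
    i.e. whether its cumulative curve lies to the left (above).
    Returns (D+, asymptotic p-value).
    """
    s, r = sorted(sample), sorted(reference)
    # The supremum of the CDF difference is attained at an observed value.
    d_plus = max(ecdf(s, x) - ecdf(r, x) for x in s + r)
    d_plus = max(d_plus, 0.0)
    n, m = len(s), len(r)
    # Asymptotic (large-sample) approximation; an assumption of this sketch.
    p_value = math.exp(-2.0 * d_plus ** 2 * n * m / (n + m))
    return d_plus, p_value


# Toy data: a low-expression group against a higher-expression reference.
low_group = [0.1, 0.2, 0.3, 0.4]
reference_group = [1.0, 2.0, 3.0, 4.0]
d_stat, p_approx = ks_one_sided(low_group, reference_group)
print(d_stat, p_approx)
```

In practice a library routine such as `scipy.stats.ks_2samp` with a one-sided `alternative` would normally be preferred, since it can also compute exact p-values for small samples.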
Overlap between RNA and protein based on RNA expression bins.
| Fraction of genes based on transcript level | IHC, all antibodies (% present) | IF, all antibodies (% present) | IHC, antibodies with supportive Western blot (% present) | IF, antibodies with supportive Western blot (% present) |
|---|---|---|---|---|
| | 95.2% | 79.9% | 96.6% | 83.1% |
| | 82.1% | 67.8% | 83.4% | 73.2% |
DAVID enrichment for certain categories
| Group | Enriched theme | p-value |
|---|---|---|
| ABC | Nucleus | 1.03 × 10⁻¹⁴ |
| ABC | Intracellular | 8.22 × 10⁻²³ |
| AB | Glycosylation site:N-linked (GlcNAc...) | 9.97 × 10⁻⁶ |
| BC | Extracellular region | 4.39 × 10⁻⁷ |
Figure 4. (A) Western blot data for the groups defined in Figure 3A. Groups A and AB generally contain a larger fraction of low-scoring antibodies. (B) IF reliability scores for the same subgroups. The fraction of antibodies with a supportive staining in the ABC group is about three times higher than in the B group (p < 2 × 10⁻³).