Rickard Sandberg, Ingemar Ernberg.
Abstract
BACKGROUND: Cell lines as model systems of tumors and tissues are essential in molecular biology, although they only approximate the properties of in vivo cells in tissues. Cell lines have been selected under in vitro conditions for a long period of time, affecting many specific cellular pathways and processes.
Year: 2005 PMID: 16086847 PMCID: PMC1273632 DOI: 10.1186/gb-2005-6-8-r65
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Sources of gene-expression data
| Source | Number of cell lines | Number of normal tissue samples | Number of tumor samples | Dataset | Platform |
| [15] | 60 | - | - | I | Hu6800 |
| [17] | - | 59 | - | I | Hu6800 |
| [16] | - | 60 | 189 | I | Hu6800 |
| [18] | 25 | 65 | 5 | II | HGU95A |
Figure 1. Identification of outlier samples by correlation analysis and scalar factors. (a) Plot of the average correlation for each sample from pairwise comparisons to all other samples (y-axis). The samples were sorted according to their average correlation (x-axis). We used an average correlation of 0.34 as a cutoff (marked with a dashed line). (b) Comparison of the average correlation (x-axis) with the scalar factor used in the global scaling procedure (y-axis). Many of the samples with low average correlations had been rescaled using high scaling factors, indicating that they might have had poor hybridizations. Again, the dashed line displays the average correlation cutoff.
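The outlier screen described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Pearson correlation (the caption only says "correlation"), and the sample names and expression values are invented toy data. Only the 0.34 cutoff comes from the caption.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def average_correlations(samples):
    """Mean correlation of each sample to every other sample."""
    result = {}
    for name, profile in samples.items():
        others = [pearson(profile, other)
                  for other_name, other in samples.items()
                  if other_name != name]
        result[name] = sum(others) / len(others)
    return result

def flag_outliers(samples, cutoff=0.34):
    """Names of samples whose average correlation falls below the cutoff."""
    averages = average_correlations(samples)
    return sorted(name for name, r in averages.items() if r < cutoff)

# Toy data (hypothetical): three concordant samples and one discordant one.
samples = {
    "s1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "s2": [1.1, 2.1, 2.9, 4.2, 5.1],
    "s3": [0.9, 1.8, 3.2, 3.9, 4.8],
    "bad": [5.0, 1.0, 4.0, 2.0, 3.0],
}
outliers = flag_outliers(samples)
print(outliers)
```

With real data, each profile would hold one value per probe set, and the flagged samples could then be checked against their scaling factors as in panel (b).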
Figure 2. Correlation matrix between all normal samples from two studies. The gene-expression profiles of each normal tissue sample were compared to all other normal tissue samples from the other dataset by measuring the correlation across all genes. The normal samples from Hsiao et al. [17] are presented along the y-axis and samples from Ramaswamy et al. [16] along the x-axis. The correlation matrix displays each pairwise comparison and each entry is color-coded according to the scale bar to the right of the matrix. Black rectangles highlight correlation values between the samples from the same tissues in the two different datasets.
Figure 3. The gene-expression profiles of cell lines compared to normal and tumor tissues. (a) Projection of each sample in dataset I into SVD space drawn by the correlation of each sample to SVD eigenarray 1 (x-axis) and 2 (y-axis). The normal tissue samples of CNS origin from two laboratories (green squares, Hsiao et al. [17]; black squares, Ramaswamy et al. [16]) were overlapping, as well as the tumor tissue samples (red squares, Ramaswamy et al. [16]). The cell lines were separated from tissue samples by the first SVD eigenarray. Samples of lymphoma and leukemia origin were also separated in the SVD analysis. (b) Projection of each sample in dataset II into the SVD space drawn by the correlation of each sample to SVD eigenarray 1 (x-axis) and 2 (y-axis). The cell lines (crosses) were separated from tissue samples. Whole blood samples were distinctly clustered close to the cell lines. (c) Other separation of normal samples. Significance analysis of microarrays (SAM) was used to identify differentially expressed genes between cell line and tissue samples in dataset I. The number of statistically significant genes (x-axis) is plotted against the median and 90th percentile of the FDR (y-axis), estimated from 1,000 permutations. (d) SAM analysis of cell line versus tissue samples in dataset II, with parameters identical to those in (c). (e) Plot of the degree of differential expression between cell lines and tissues for each gene in dataset I (x-axis) versus dataset II (y-axis). The degree of differential expression was measured using the signal-to-noise metric [23].
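To make the signal-to-noise metric in panel (e) concrete, here is a small sketch. It assumes the commonly used form, the difference in class means divided by the sum of the class standard deviations; the exact definition used in the paper is the one in reference [23]. The use of the sample standard deviation, the gene names, and all expression values are assumptions for illustration.

```python
from statistics import mean, stdev

def signal_to_noise(cell_lines, tissues):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene across two sample classes.

    Uses the sample standard deviation; this choice is an assumption.
    """
    return (mean(cell_lines) - mean(tissues)) / (stdev(cell_lines) + stdev(tissues))

# Hypothetical expression values for two genes in two independent datasets.
dataset_1 = {
    "geneUp": ([8.0, 9.0, 10.0], [4.0, 5.0, 6.0]),    # higher in cell lines
    "geneDown": ([2.0, 3.0, 4.0], [7.0, 8.0, 9.0]),   # higher in tissues
}
dataset_2 = {
    "geneUp": ([7.0, 9.0, 11.0], [3.0, 4.0, 5.0]),
    "geneDown": ([1.0, 2.0, 3.0], [6.0, 8.0, 10.0]),
}

# One (x, y) point per gene, as in a dataset I versus dataset II scatter plot.
points = {
    gene: (signal_to_noise(*dataset_1[gene]), signal_to_noise(*dataset_2[gene]))
    for gene in dataset_1
}
for gene, (x, y) in points.items():
    print(gene, round(x, 2), round(y, 2))
```

Genes whose two scores share the same sign behave consistently in both datasets, which is the pattern the scatter plot in panel (e) is meant to reveal.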
Classification of cell lines and tissue samples across five datasets
| Dataset reference | Accuracy (%) | Number of cell lines | Number of tissue samples |
| Dataset I | 99* | 60 | 371 |
| Dataset II | 100 | 25 | 70 |
| Dataset III [8] | 100 | 10 | 123 |
| Dataset IV [24] | 95 | 15 | 64 |
| Dataset V [12] | 96 | 10 | 81 |
*One cell line (breast cell line HS578T) was misclassified as a tissue sample.
Figure 4. The gene-expression signature of in vitro growth. All genes found to be differentially expressed between cell lines and tissues across datasets I and II (576 genes) were subjected to hierarchical clustering (average linkage and Euclidean distance metric) using the Genesis software [43]. Before clustering, all genes were normalized to an average expression level of zero and a standard deviation of one (that is, unit length). Above the cluster image, samples are labeled as cell lines, normal tissues and tumor tissues (except for the primary cultures and FACS-sorted cells in dataset II, which were not annotated). (a) Top part of the cluster presents the genes found to be downregulated in vitro. These genes were not detected in vitro and were often only expressed in a subset of tissue samples. It is likely that these genes represent downregulated tissue markers from the respective tissues. (b) In contrast, genes found to be upregulated in vitro were highly expressed in all cell lines, while occasionally expressed in a few tissue samples. Specific clusters of genes in (a) and (b) are annotated on the right of the cluster image (clusters A to H). Specific groups of samples are annotated in color above the cluster image and by number below the cluster image (cluster numbers 1 to 7). Cluster number 1, kidney and liver samples; cluster number 2, lung and muscle; cluster number 3, lymphomas; cluster number 4, leukemias (ALL); cluster number 5, leukemias (AML); cluster number 6, CNS tumors (medulloblastoma and glioblastoma); cluster number 7, germinal center cells.
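The normalization and clustering steps named in the Figure 4 caption can be illustrated with a short sketch. The paper used the Genesis software; the code below is a plain-Python stand-in, not that tool. The gene names and values are invented, and the use of the population standard deviation is an assumption.

```python
from math import sqrt
from statistics import mean, pstdev

def standardize(profile):
    """Rescale one gene to mean 0 and standard deviation 1 (assumes non-constant values)."""
    m, s = mean(profile), pstdev(profile)
    return [(v - m) / s for v in profile]

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles):
    """Agglomerative clustering; returns a list of (members_a, members_b, distance) merges."""
    clusters = {name: [name] for name in profiles}  # cluster id -> member gene names
    merges = []
    while len(clusters) > 1:
        best = None
        ids = list(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                # Average linkage: mean of all pairwise distances between members.
                d = mean(euclidean(profiles[x], profiles[y])
                         for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical genes: A and B rise across samples, C and D fall.
raw = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.0, 4.0, 2.0],
}
normalized = {gene: standardize(values) for gene, values in raw.items()}
merges = average_linkage(normalized)
for members_a, members_b, distance in merges:
    print(members_a, members_b, round(distance, 3))
```

Standardizing first means genes are grouped by the shape of their expression pattern rather than by absolute level, which is why geneA and geneB end up together despite differing twofold in magnitude.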
Biological process upregulated in vitro
| GO category | Total number of genes | Genes changed | Log10 (p) | FDR | GO ID |
| Translation | 76 | 36 | -7.95 | 0.0000 | GO:0043037 |
| Ribosome biogenesis and assembly | 42 | 19 | -4.10 | 0.0037 | GO:0042254 |
| Ribosome biogenesis | 41 | 19 | -4.27 | 0.0000 | GO:0007046 |
| Regulation of translation | 33 | 14 | -2.81 | 0.0077 | GO:0006445 |
| Translational initiation | 23 | 13 | -4.20 | 0.0042 | GO:0006413 |
| tRNA metabolism | 27 | 12 | -2.69 | 0.0070 | GO:0006399 |
| tRNA modification | 23 | 11 | -2.82 | 0.0078 | GO:0006400 |
| tRNA aminoacylation for protein translation | 21 | 10 | -2.59 | 0.0125 | GO:0006418 |
| tRNA aminoacylation | 21 | 10 | -2.59 | 0.0125 | GO:0043039 |
| rRNA processing | 17 | 10 | -3.53 | 0.0056 | GO:0006364 |
| rRNA metabolism | 17 | 10 | -3.53 | 0.0056 | GO:0016072 |
| Regulation of translational initiation | 14 | 8 | -2.79 | 0.0075 | GO:0006446 |
| Translational elongation | 14 | 7 | -2.07 | 0.0400 | GO:0006414 |
| Transcription from Pol I promoter | 7 | 5 | -2.44 | 0.0141 | GO:0006360 |
| RNA processing | 123 | 52 | -9.02 | 0.0000 | GO:0006396 |
| RNA metabolism | 130 | 52 | -8.00 | 0.0000 | GO:0016070 |
| mRNA metabolism | 64 | 21 | -2.27 | 0.0217 | GO:0016071 |
| mRNA processing | 57 | 20 | -2.56 | 0.0123 | GO:0006397 |
| RNA splicing | 41 | 18 | -3.70 | 0.0030 | GO:0008380 |
| RNA splicing, via transesterification reactions with bulged adenosine as nucleophile | 33 | 15 | -3.37 | 0.0050 | GO:0000377 |
| RNA splicing, via transesterification reactions | 33 | 15 | -3.37 | 0.0050 | GO:0000375 |
| Nuclear mRNA splicing, via spliceosome | 33 | 15 | -3.37 | 0.0050 | GO:0000398 |
| RNA modification | 25 | 11 | -2.46 | 0.0143 | GO:0009451 |
| Nucleobase, nucleoside, nucleotide and nucleic acid metabolism | 806 | 192 | -4.43 | 0.0000 | GO:0006139 |
| Nucleotide metabolism | 61 | 20 | -2.18 | 0.0304 | GO:0009117 |
| Nucleotide biosynthesis | 45 | 16 | -2.21 | 0.0303 | GO:0009165 |
| Ribonucleotide metabolism | 28 | 13 | -3.09 | 0.0047 | GO:0009259 |
| Ribonucleotide biosynthesis | 27 | 13 | -3.28 | 0.0048 | GO:0009260 |
| Purine nucleotide metabolism | 29 | 12 | -2.37 | 0.0164 | GO:0006163 |
| Purine nucleotide biosynthesis | 26 | 12 | -2.86 | 0.0080 | GO:0006164 |
| Purine ribonucleotide metabolism | 25 | 11 | -2.46 | 0.0143 | GO:0009150 |
| Purine ribonucleotide biosynthesis | 24 | 11 | -2.63 | 0.0115 | GO:0009152 |
| Nucleoside triphosphate metabolism | 23 | 10 | -2.23 | 0.0299 | GO:0009141 |
| Ribonucleoside triphosphate metabolism | 20 | 9 | -2.17 | 0.0295 | GO:0009199 |
| Ribonucleoside triphosphate biosynthesis | 19 | 9 | -2.35 | 0.0167 | GO:0009201 |
| Nucleoside triphosphate biosynthesis | 20 | 9 | -2.17 | 0.0295 | GO:0009142 |
| Purine ribonucleoside triphosphate metabolism | 20 | 9 | -2.17 | 0.0295 | GO:0009205 |
| Purine ribonucleoside triphosphate biosynthesis | 19 | 9 | -2.35 | 0.0167 | GO:0009206 |
| Purine nucleoside triphosphate metabolism | 21 | 9 | -2.00 | 0.0413 | GO:0009144 |
| Purine nucleoside triphosphate biosynthesis | 19 | 9 | -2.35 | 0.0167 | GO:0009145 |
| Nucleoside metabolism | 14 | 7 | -2.07 | 0.0400 | GO:0009116 |
| Protein metabolism | 836 | 210 | -6.86 | 0.0000 | GO:0019538 |
| Protein biosynthesis | 207 | 72 | -7.76 | 0.0000 | GO:0006412 |
| Intracellular transport | 176 | 63 | -7.36 | 0.0000 | GO:0046907 |
| Protein transport | 149 | 52 | -5.75 | 0.0000 | GO:0015031 |
| Intracellular protein transport | 138 | 50 | -6.11 | 0.0000 | GO:0006886 |
| Amino acid and derivative metabolism | 126 | 36 | -2.31 | 0.0198 | GO:0006519 |
| Amino acid metabolism | 98 | 29 | -2.19 | 0.0297 | GO:0006520 |
| Ubiquitin-dependent protein catabolism | 48 | 26 | -7.39 | 0.0000 | GO:0006511 |
| Modification-dependent protein catabolism | 48 | 26 | -7.39 | 0.0000 | GO:0019941 |
| Protein targeting | 70 | 23 | -2.44 | 0.0139 | GO:0006605 |
| Protein folding | 46 | 22 | -5.14 | 0.0000 | GO:0006457 |
| Ubiquitin cycle | 31 | 12 | -2.10 | 0.0351 | GO:0006512 |
| Amino acid activation | 21 | 10 | -2.59 | 0.0125 | GO:0043038 |
| Polyamine metabolism | 5 | 4 | -2.27 | 0.0271 | GO:0006595 |
| Metabolism | 2008 | 457 | -12.88 | 0.0000 | GO:0008152 |
| Biosynthesis | 423 | 119 | -6.33 | 0.0000 | GO:0009058 |
| Energy pathways | 128 | 38 | -2.74 | 0.0074 | GO:0006091 |
| Energy derivation by oxidation of organic compounds | 89 | 32 | -4.02 | 0.0036 | GO:0015980 |
| Main pathways of carbohydrate metabolism | 56 | 20 | -2.67 | 0.0069 | GO:0006092 |
| Coenzyme and prosthetic group metabolism | 55 | 18 | -2.00 | 0.0419 | GO:0006731 |
| Coenzyme metabolism | 44 | 16 | -2.32 | 0.0200 | GO:0006732 |
| Glucose catabolism | 30 | 12 | -2.23 | 0.0307 | GO:0006007 |
| Coenzyme and prosthetic group biosynthesis | 31 | 12 | -2.10 | 0.0351 | GO:0046138 |
| Oxidative phosphorylation | 13 | 11 | -6.25 | 0.0000 | GO:0006119 |
| Coenzyme biosynthesis | 23 | 10 | -2.23 | 0.0299 | GO:0009108 |
| Cellular respiration | 11 | 9 | -4.94 | 0.0000 | GO:0045333 |
| Aerobic respiration | 9 | 8 | -4.92 | 0.0000 | GO:0009060 |
| Tricarboxylic acid cycle | 18 | 8 | -1.94 | 0.0462 | GO:0006099 |
| ATP synthesis coupled electron transport (sensu Eukaryota) | 6 | 5 | -2.91 | 0.0061 | GO:0042775 |
| ATP synthesis coupled electron transport | 6 | 5 | -2.91 | 0.0061 | GO:0042773 |
| Cell cycle | 324 | 89 | -4.32 | 0.0000 | GO:0007049 |
| Cell organization and biogenesis | 315 | 83 | -3.38 | 0.0054 | GO:0016043 |
| DNA metabolism | 188 | 64 | -6.53 | 0.0000 | GO:0006259 |
| Mitotic cell cycle | 153 | 58 | -7.84 | 0.0000 | GO:0000278 |
| Cytoplasm organization and biogenesis | 202 | 55 | -2.73 | 0.0073 | GO:0007028 |
| DNA replication and chromosome cycle | 83 | 30 | -3.85 | 0.0033 | GO:0000067 |
| M phase | 62 | 26 | -4.68 | 0.0000 | GO:0000279 |
| Nuclear organization and biogenesis | 79 | 25 | -2.36 | 0.0176 | GO:0006997 |
| DNA packaging | 69 | 25 | -3.31 | 0.0049 | GO:0006323 |
| S phase of mitotic cell cycle | 72 | 25 | -3.00 | 0.0043 | GO:0000084 |
| Chromosome organization and biogenesis (sensu Eukaryota) | 77 | 24 | -2.20 | 0.0300 | GO:0007001 |
| DNA replication | 67 | 23 | -2.72 | 0.0071 | GO:0006260 |
| Nuclear division | 54 | 22 | -3.82 | 0.0031 | GO:0000280 |
| Establishment and/or maintenance of chromatin architecture | 64 | 21 | -2.27 | 0.0217 | GO:0006325 |
| M phase of mitotic cell cycle | 45 | 20 | -4.15 | 0.0040 | GO:0000087 |
| DNA repair | 59 | 20 | -2.36 | 0.0173 | GO:0006281 |
| Mitosis | 42 | 19 | -4.10 | 0.0037 | GO:0007067 |
| Microtubule-based process | 45 | 19 | -3.61 | 0.0059 | GO:0007017 |
| DNA-dependent DNA replication | 35 | 15 | -3.04 | 0.0044 | GO:0006261 |
| Microtubule cytoskeleton organization and biogenesis | 27 | 14 | -3.94 | 0.0034 | GO:0000226 |
| G1/S transition of mitotic cell cycle | 35 | 13 | -2.06 | 0.0396 | GO:0000082 |
| G2/M transition of mitotic cell cycle | 21 | 9 | -2.00 | 0.0413 | GO:0000086 |
| M-phase specific microtubule process | 12 | 7 | -2.56 | 0.0132 | GO:0000072 |
| Chromosome segregation | 14 | 7 | -2.07 | 0.0400 | GO:0007059 |
| Microtubule nucleation | 9 | 6 | -2.65 | 0.0117 | GO:0007020 |
| DNA replication initiation | 10 | 6 | -2.32 | 0.0190 | GO:0006270 |
| Spindle assembly | 8 | 6 | -3.05 | 0.0045 | GO:0007051 |
| Tubulin folding | 9 | 6 | -2.65 | 0.0117 | GO:0007021 |
| Mitotic spindle assembly | 6 | 5 | -2.91 | 0.0061 | GO:0007052 |
| Pre-replicative complex formation and maintenance | 5 | 4 | -2.27 | 0.0271 | GO:0006267 |
| Histone modification | 12 | 7 | -2.56 | 0.0132 | GO:0016570 |
| Covalent chromatin modification | 12 | 7 | -2.56 | 0.0132 | GO:0016569 |
| Physiological process | 2917 | 574 | -3.84 | 0.0032 | GO:0007582 |
| Macromolecule biosynthesis | 345 | 100 | -5.98 | 0.0000 | GO:0009059 |
| Response to endogenous stimulus | 77 | 23 | -1.89 | 0.0486 | GO:0009719 |
| Response to DNA damage stimulus | 71 | 22 | -2.02 | 0.0412 | GO:0006974 |
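For readers unfamiliar with how a log10 p-value of this kind can arise from the "Total number of genes" and "Genes changed" columns, the sketch below computes an over-representation p-value with a one-sided hypergeometric test. This is an assumption for illustration only: the excerpt does not state which test the authors used, and the background sizes in the example are invented rather than taken from the paper.

```python
from math import comb, log10

def enrichment_p(background, changed_total, category_size, category_changed):
    """P(X >= category_changed) under a hypergeometric null.

    background       -- all genes tested (hypothetical in the example below)
    changed_total    -- genes called changed among the background
    category_size    -- genes annotated to the GO category
    category_changed -- changed genes within the category
    """
    upper = min(category_size, changed_total)
    tail = sum(
        comb(category_size, i) * comb(background - category_size, changed_total - i)
        for i in range(category_changed, upper + 1)
    )
    return tail / comb(background, changed_total)

# Toy numbers: 10 genes tested, 5 changed, a category of 5 genes with 4 changed.
p = enrichment_p(background=10, changed_total=5, category_size=5, category_changed=4)
print(round(p, 4), round(log10(p), 2))
```

Because many categories are tested at once, a raw p-value like this would normally be paired with a multiple-testing correction, which is the role of the FDR column in the table.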
Biological process downregulated in vitro
| GO category | Total number of genes | Genes changed | Log10 (p) | FDR | GO ID |
| Membrane signaling and cell adhesion | | | | | |
| Cell communication | 1088 | 565 | -13.76 | 0.0000 | GO:0007154 |
| Signal transduction | 831 | 428 | -8.86 | 0.0000 | GO:0007165 |
| Cell surface receptor linked signal transduction | 413 | 232 | -8.66 | 0.0000 | GO:0007166 |
| Cell adhesion | 257 | 139 | -4.10 | 0.0000 | GO:0007155 |
| Cell-cell signaling | 240 | 132 | -4.38 | 0.0000 | GO:0007267 |
| Cell motility | 197 | 105 | -2.91 | 0.0171 | GO:0006928 |
| G-protein coupled receptor protein signaling pathway | 175 | 102 | -4.87 | 0.0000 | GO:0007186 |
| Enzyme linked receptor protein signaling pathway | 107 | 61 | -2.78 | 0.0231 | GO:0007167 |
| Cell-cell adhesion | 87 | 53 | -3.41 | 0.0000 | GO:0016337 |
| G-protein signaling, coupled to IP3 second messenger (phospholipase C activating) | 35 | 23 | -2.32 | 0.0490 | GO:0007200 |
| Extracellular structure organization and biogenesis | 17 | 14 | -3.02 | 0.0029 | GO:0043062 |
| Extracellular matrix organization and biogenesis | 16 | 13 | -2.73 | 0.0225 | GO:0030198 |