Miguel A Andrade-Navarro, Femina Kanji, Carmen G Palii, Marjorie Brand, Harold Atkins, Carol Perez-Iratxeta.
Abstract
BACKGROUND: Gene transcripts specifically expressed in a particular cell type (cell-type specific gene markers) are useful for its detection and isolation from a tissue or other cell mixtures. However, finding informative marker genes can be problematic when working with a poorly characterized cell type, as markers can only be unequivocally determined once the cell type has been isolated. We propose a method that could identify marker genes of an uncharacterized cell type within a mixed cell population, provided that the proportion of the cell type of interest in the mixture can be estimated by some indirect method, such as a functional assay.
Year: 2013 PMID: 24090206 PMCID: PMC3853712 DOI: 10.1186/1472-6750-13-80
Source DB: PubMed Journal: BMC Biotechnol ISSN: 1472-6750 Impact factor: 2.563
Figure 1. Scheme of the proposed approach to obtain candidate markers starting from several cell mixtures that contain the cell type of interest and an estimate of its proportion in each mixture. After gene expression profiling by a high-throughput technology, the expression of genes uniquely expressed in the cell type of interest should best correlate with the cell proportion estimates.
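The ranking step in this scheme can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes Pearson correlation (the caption only says "correlate"), and the function name `rank_by_correlation` and the toy expression values are hypothetical.

```python
import numpy as np

def rank_by_correlation(expression, proportions):
    """Rank probe sets by Pearson correlation with the proportion estimates.

    expression:  array of shape (n_probesets, n_mixtures)
    proportions: array of shape (n_mixtures,), estimated fraction of the
                 cell type of interest in each mixture
    Returns (order, r): indices sorted from most to least correlated,
    and the correlation coefficient of every probe set.
    """
    expression = np.asarray(expression, dtype=float)
    proportions = np.asarray(proportions, dtype=float)

    # Center each probe set and the proportion vector.
    x = expression - expression.mean(axis=1, keepdims=True)
    p = proportions - proportions.mean()

    # Pearson r = covariance / (product of standard deviations).
    denom = np.sqrt((x ** 2).sum(axis=1) * (p ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        # Probe sets with no variance get r = 0 instead of NaN.
        r = np.where(denom > 0, (x @ p) / denom, 0.0)

    order = np.argsort(-r)  # descending correlation
    return order, r

# Toy data (assumed for illustration): five mixtures, three probe sets.
proportions = np.array([0.001, 0.01, 0.03, 0.02, 0.003])
expression = np.array([
    100.0 + 5000.0 * proportions,   # rises with the cell fraction
    np.full(5, 200.0),              # constant, uninformative
    300.0 - 1000.0 * proportions,   # falls with the cell fraction
])

order, r = rank_by_correlation(expression, proportions)
top_candidates = order[:2]  # most correlated probe sets first
```

In practice the top of `order` would be taken as the candidate marker list; how many candidates to keep is a separate choice.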
Table 1. Composition percentages of the four cell lines in the five mixtures (one row per mixture, one column per cell line)
| 0.1% | 11.0% | 15.0% | 73.9% |
| 1.0% | 7.0% | 24.0% | 68.0% |
| 3.0% | 5.0% | 30.0% | 62.0% |
| 2.0% | 9.0% | 35.0% | 54.0% |
| 0.3% | 13.0% | 40.0% | 46.7% |
Figure 2. Most correlated probe sets to Ly18 concentration. (A) Genes corresponding to the three probe sets most correlated with Ly18 cell line concentration. (B) Hybridization values in the four pure samples show that these probe sets are only appreciably expressed in Ly18 cells. (C) Hybridization values in the mixtures versus the fraction of Ly18 show high correlation.
Table 2. Performance of our method in detecting markers
| range | markers | top5 | top10 | top20 | expected top20 | correlation of 20th |
| --- | --- | --- | --- | --- | --- | --- |
| 0.1% to 3% | 486 | 4 | 7 | 13 | 0 to 1 (0.523) | 0.81 |
| 5% to 13% | 1237 | 3 | 5 | 5 | 1 to 2 (1.30) | 0.93 |
| 15% to 40% | 901 | 5 | 9 | 14 | 1 (0.95) | 0.95 |
| 46.7% to 73.9% | 792 | 2 | 3 | 4 | 1 (0.83) | 0.98 |
range: range of fractions of the cell type across the mixtures; markers: number of markers defined as positives from the analysis of the pure samples and present in at least one of the mixtures; top5 (top10, top20): number of positive markers (see Methods) among the 5 (10, 20) most correlated probe sets; expected top20: expected number of markers if 20 probe sets were selected at random; correlation of 20th: value of the correlation coefficient for the 20th most correlated probe set.
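One way to read the "expected top20" baseline is as the mean number of markers in a random draw of 20 probe sets without replacement, which is k × (markers / total). The sketch below illustrates that arithmetic only; the function name `expected_markers` is hypothetical, and because the total number of probe sets considered is not stated here, the numbers in the example are made up rather than taken from the table.

```python
def expected_markers(k, n_markers, n_total):
    """Expected number of markers among k probe sets drawn at random.

    This is the mean of a hypergeometric distribution: each drawn probe
    set is a marker with probability n_markers / n_total.
    n_total (all probe sets considered) is an assumed input; the legend
    above does not give it.
    """
    if not 0 <= n_markers <= n_total or not 0 <= k <= n_total:
        raise ValueError("need 0 <= n_markers <= n_total and 0 <= k <= n_total")
    return k * n_markers / n_total

# Illustrative values only: 50 markers among 1000 probe sets, draw 20.
baseline = expected_markers(k=20, n_markers=50, n_total=1000)
```

Comparing the observed top20 count against such a baseline shows how much the correlation ranking enriches for markers over chance.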
Figure 3. Marker scores versus correlation coefficients for probe sets in each cell line. High correlation coefficients correspond to high marker scores and often identify markers. Blue dots indicate markers, defined as present (P) in only that cell type, and orange dots are non-markers. The dashed lines indicate the value of correlation for the 20th most correlated probe set (see Table 2).
Figure 4. Relative performance (K562 in red, Ly18 in blue) decreases as the error in the estimates of the target cell type concentration increases.