Manuel Nieto-Díaz, Wolfgang Pita-Thomas, Manuel Nieto-Sampedro.
Abstract
BACKGROUND: Gene expression profiles of non-model mammals may provide valuable data for biomedical and evolutionary studies. However, owing to the lack of sequence information for other species, DNA microarrays are currently restricted to humans and a few model species. This limitation may be overcome by using arrays developed for a given species to analyse gene expression in a related one, an approach known as "cross-species analysis". Despite its potential usefulness, the accuracy and reproducibility of the gene expression measures obtained in this way are still open to doubt. The present study examines whether hybridization values from cross-species analyses are as reproducible as those from same-species analyses when using Affymetrix oligonucleotide microarrays.
Year: 2007 PMID: 17407579 PMCID: PMC1853087 DOI: 10.1186/1471-2164-8-89
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Table 1. Number of repeated probe pairs and sequences in the Affymetrix U133 Plus 2.0 GeneChip®.
| Number of Repetitions | Number of Sequences | Number of Probe Pairs |
| --- | --- | --- |
| 2 | 12794 | 12794 |
| 3 | 1924 | 2886 |
| 4 | 462 | 924 |
| 5 | 174 | 435 |
| 6 | 74 | 222 |
| 7 | 22 | 77 |
| 8 | 8 | 32 |
| 9 | 4 | 18 |
| 11 | 2 | 11 |
| 12 | 2 | 12 |
| 15 | 2 | 15 |
| 16 | 2 | 16 |
| 20 | 2 | 20 |
Both perfect match (PM) and mismatch (MM) sequences are counted when calculating the number of repeated sequences.
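The counts in the table above appear to be linked by a simple relation: because both members of a probe pair (PM and MM) are counted as sequences, the number of probe pairs in each row equals repetitions × sequences / 2. The following Python sketch checks this for every row. The relation is inferred from the table and its footnote rather than stated by the authors, and the names `ROWS` and `expected_probe_pairs` are illustrative only.

```python
# Rows copied from the table above: (repetitions, sequences, probe pairs).
ROWS = [
    (2, 12794, 12794),
    (3, 1924, 2886),
    (4, 462, 924),
    (5, 174, 435),
    (6, 74, 222),
    (7, 22, 77),
    (8, 8, 32),
    (9, 4, 18),
    (11, 2, 11),
    (12, 2, 12),
    (15, 2, 15),
    (16, 2, 16),
    (20, 2, 20),
]


def expected_probe_pairs(repetitions, sequences):
    """Probe pairs implied by a row, assuming PM and MM are both counted as sequences."""
    # Each sequence occurs `repetitions` times, and every probe pair
    # accounts for two of those occurrences (one PM, one MM).
    return repetitions * sequences // 2


# Rows whose reported probe-pair count differs from the inferred relation.
mismatches = [row for row in ROWS if expected_probe_pairs(row[0], row[1]) != row[2]]

total_sequences = sum(sequences for _, sequences, _ in ROWS)
total_probe_pairs = sum(pairs for _, _, pairs in ROWS)

print("rows that break the relation:", mismatches)
print("total repeated sequences (PM + MM):", total_sequences)
print("total probe pairs:", total_probe_pairs)
```

Under this reading every row is consistent, and half of the total sequence count (the PM half) is 7736, the number of repeated sequences quoted in the Figure 6 caption.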
Table 2. Description of the data used in the present study.
| Origin | GEO accession number | Tissue |
| --- | --- | --- |
| Present study | GSM93225 | Soft tissues of antler tip |
| Present study | GSM93226 | Soft tissues of antler base |
| Present study | GSM93227 | Soft tissues from skull frontal bone |
| Present study | GSM93228 | Soft tissues of antler tip |
| Dillman and Phillips, 2005 | GSM50690 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50691 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50692 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50693 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50694 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50695 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50696 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50697 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50698 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50699 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50700 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50701 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50702 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50703 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50704 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50705 | Whole blood (see Dillman and Phillips, 2005) |
| Dillman and Phillips, 2005 | GSM50706 | Whole blood (see Dillman and Phillips, 2005) |
Code gives the code names of the samples. GEO database accession numbers identify the corresponding expression data in the GEO database [33].
Figure 1. Comparison of the number of sequences. Each bar represents the mean number of PM sequences per class and species (error bars represent the standard error of the mean). Below the graph there is a summary of the two-way ANOVA results, using the number of sequences as the dependent variable and CV class and species as factors. The differences in the number of sequences per class between species are analyzed by the interaction term (CV Class*Species). Significant differences (p < 0.05) between species are marked by *.
Figure 2. Comparison of the number of MM and PM sequences. All details and features of the figure are the same as in Figure 1. The differences in the number of sequences per class between PM and MM sequences are analyzed by the interaction term (CV Class*PM/MM).
Figure 3. PM un-reproducible probe sequences (UPS) for the different species or groups of species. Results are detailed for each species or group of species and for CV over 0.5, 0.75 and 1. Freq gives the probability that a sequence presents a CV above each boundary. NRR stands for non-random range and specifies the number of samples with a CV above a defined value (0.5, 0.75 and 1) in a given sequence that cannot result from a random distribution of the hybridization values in each species or group of species. Seqs specifies the number of sequences yielding poorly reproducible hybridization in each species or group of species. On the right side there is a graphic representation of the distribution of the UPS in the different species and groups of species. A detailed description of the calculations and the intermediate data can be found in Additional file 6.
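The quantities named in the Figure 3 caption can be illustrated with a short Python sketch. It rests on two assumptions that are not taken from the paper: the CV is computed as the sample standard deviation divided by the mean, and the number of samples exceeding a CV boundary by chance follows a binomial distribution with success probability Freq. The authors' actual calculation is the one described in Additional file 6; the function names and the 0.05 cut-off here are hypothetical.

```python
from math import comb
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV of the hybridization values of one repeated sequence (sample SD / mean)."""
    return stdev(values) / mean(values)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def non_random_minimum(n_samples, freq, alpha=0.05):
    """Smallest number of high-CV samples that is unlikely under chance alone.

    Assumption: each sample exceeds the CV boundary independently with
    probability `freq`. Returns None if no count reaches the alpha level.
    """
    for k in range(n_samples + 1):
        if binomial_tail(n_samples, k, freq) < alpha:
            return k
    return None


# Hypothetical example: one sequence repeated in four probes of one sample,
# and a group of 4 samples in which 10% of sequences exceed the boundary.
print(coefficient_of_variation([120.0, 95.0, 310.0, 150.0]))
print(non_random_minimum(n_samples=4, freq=0.1))
```

In this reading, a sequence would be flagged as poorly reproducible only when at least the returned number of samples exceed the boundary, which is one plausible interpretation of the lower end of the non-random range.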
Figure 4. PM and MM un-reproducible probe sequences (UPS) for human samples. Results are detailed for MM and PM sequences, and for CV over 0.5, 0.75 and 1. All details as in Figure 3.
Figure 5. Comparison of the hybridization variability in repeated sequences. (a) Comparison between human and non-human PM data and (b) between PM and MM human data. Graphs detail the mean, interquartile and 50 to 95 percentile ranges and illustrate the magnitude and direction of the CV changes of each sample or species with respect to the mean human PM CV. The white and grey ranges in the graphs "species mean data" and "human mismatch data" are defined from the mean range of the human samples (white area) and the interquartile range (grey area) and are used to explore the magnitude of the change. The tables below each graph detail the results of the one-tailed t test used to determine whether the differences were significantly larger than 0. Sample codes are as detailed in Table 2. (*) denotes a significantly different mean hybridization CV between a given non-human species and human.
Figure 6. Correlation between mean hybridization value and CV value in the repeated sequences. Correlation coefficients were computed for each individual species and for all species together. The upper part of the table details the mean and standard deviation of the correlation coefficients of 7736 repeated sequences in each case. It also specifies the number of sequences with a negative correlation between mean and CV. The lower part of the table details the t-test parameters. The scatter plot below illustrates the correlation between mean hybridization value and CV value in a sequence repeated in 5 different probesets. It shows that a correlation exists when all species are considered together but that it does not hold for the individual species.
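The pattern described in the Figure 6 caption, a correlation that appears when groups are pooled but not within each group, can be reproduced with a minimal Python sketch. All numbers below are invented for illustration and do not come from the study, including the direction of the pooled correlation; `GROUPS`, `pearson`, `within_r` and `pooled_r` are hypothetical names.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented (mean hybridization value, CV) pairs for two groups of samples.
# The groups differ in overall level, but CV does not track the mean inside either.
GROUPS = {
    "group_a": ([1000, 1100, 1200, 1300, 1400], [0.20, 0.22, 0.19, 0.21, 0.20]),
    "group_b": ([100, 110, 120, 130, 140], [0.60, 0.58, 0.62, 0.61, 0.59]),
}

# Correlation computed separately inside each group.
within_r = {name: pearson(means, cvs) for name, (means, cvs) in GROUPS.items()}

# Correlation computed after pooling all groups together.
pooled_means = [m for means, _ in GROUPS.values() for m in means]
pooled_cvs = [c for _, cvs in GROUPS.values() for c in cvs]
pooled_r = pearson(pooled_means, pooled_cvs)

print("within-group correlations:", within_r)
print("pooled correlation:", pooled_r)
```

With these invented values the within-group coefficients stay close to zero while the pooled coefficient is strong, because the pooled value is driven by the difference in overall level between the groups rather than by any relationship inside them.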