Jan M Ruijter, Helene H Thygesen, Onard J L M Schoneveld, Atze T Das, Ben Berkhout, Wouter H Lamers.
Abstract
BACKGROUND: In experimental biology, including retrovirology and molecular biology, replicate measurement sessions very often show similar proportional differences between experimental conditions, but different absolute values, even though the measurements were presumably carried out under identical circumstances. Although statistical programs enable the analysis of condition effects despite this replication error, this approach is hardly ever used for this purpose. On the contrary, most researchers deal with such between-session variation by normalisation or standardisation of the data. In normalisation all values in a session are divided by the observed value of the 'control' condition, whereas in standardisation, the sessions' means and standard deviations are used to correct the data. Normalisation, however, adds variation because the control value is not without error, while standardisation is biased if the data set is incomplete.
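The two corrections described above can be illustrated with a minimal sketch. It assumes that each session is stored as a mapping from condition label to a single measured value; the labels, the numbers, and the use of the sample standard deviation are illustrative assumptions, not data or choices taken from the paper.

```python
from statistics import mean, stdev

def normalise(session, control="c"):
    """Divide every value in one session by that session's control value."""
    ref = session[control]
    return {cond: value / ref for cond, value in session.items()}

def standardise(session):
    """Subtract the session mean, then divide by the session standard deviation."""
    values = list(session.values())
    m, s = mean(values), stdev(values)  # sample sd: an assumption of this sketch
    return {cond: (value - m) / s for cond, value in session.items()}

# Two hypothetical sessions with the same proportions between conditions,
# but absolute values that differ by a factor of 10.
session_1 = {"c": 10.0, "x": 20.0, "y": 40.0}
session_2 = {"c": 100.0, "x": 200.0, "y": 400.0}

norm_1, norm_2 = normalise(session_1), normalise(session_2)
std_1, std_2 = standardise(session_1), standardise(session_2)
```

In this noise-free toy case both corrections remove the tenfold difference between the sessions. The normalised control value is exactly 1 in every session, so any measurement error in the control is pushed into the other conditions, and a session that lacks the control cannot be normalised at all.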
Year: 2006 PMID: 16398936 PMCID: PMC1368993 DOI: 10.1186/1742-4690-3-2
Source DB: PubMed Journal: Retrovirology ISSN: 1742-4690 Impact factor: 4.602
Figure 1. Comparison of normalisation, standardisation and factor correction. DNA constructs containing different enhancer, promoter, and intron sequences from the rat glutamine synthetase gene coupled to the firefly luciferase reporter gene were transfected into FTO-2B cells. Luciferase activity was measured 64 hours after transfection [1]. This plot shows the activity of 8 different DNA constructs (= conditions) measured in 6 independent measurement sessions (◆ □ ▲ ◇ ●). A: Original measurements, plotted on a logarithmic Y-axis. The approximately parallel lines connecting the results from each session indicate that most of the variation between the sessions is multiplicative. B: Data after normalisation, using condition 1 as 'control' (one session [◆] did not include condition 1 and had to be dropped). Note that the variation in the control condition ('c') is lost. C: Data after standardisation. Note that a linear transformation of the standardised values (standardised* = 410 + 305 × standardised) was required to enable this logarithmic plot. D: Data after applying factor correction. The minimal remaining distance between the lines indicates that factor correction is most effective in removing the multiplicative between-session variation.
Figure 2. Comparison of normalisation, standardisation and factor correction. Mean (and SEM) of the data of the molecular-biology data set from Figure 1. A: original data. B: normalised data. C: standardised data. D: data after factor correction. Note that normalisation, standardisation, and factor correction reduce the variation within each condition. However, normalisation (B) leads to loss of variance in the control condition ('c') and to added variation in the other conditions. Standardisation (C) of this incomplete data set leads to increased variation, compared to factor correction, in some conditions. With factor correction (D) all conditions retain their statistical variance, which is generally smaller than after normalisation and standardisation. An asterisk indicates a statistically significant difference between the DNA construct and construct 1 (t-test; P < 0.05). Note that the number of observations per construct in these comparisons ranges from 2 to 5.
Results of the application of both methods for estimation of session factors on a simulated data set. A multi-session experiment with 5 sessions and 5 conditions was simulated with 5 observations per combination of session and condition. Each condition was measured in 4 different sessions. In simulating data, the overall mean was set to 100 and the standard deviation was set to 10. Factors and condition effects are given in the table. The estimated session factors are all close to the factors used in the simulation for both methods and the factors estimated with the ratio method are well within the variance of those estimated with the maximum likelihood approach. The condition means estimated with the maximum likelihood method are close to the values used in the simulation.
| Ymean | sd | n | se |
|---|---|---|---|
| 100 | 10 | 20 | 2.24 |

| session | simulated factor | ratio factor | max. likelih. factor | max. likelih. se |
|---|---|---|---|---|
| 1 | 0.1 | 0.101 | 0.101 | 0.002 |
| 2 | 0.2 | 0.188 | 0.188 | 0.004 |
| 3 | 1 | 1.065 | 1.054 | 0.021 |
| 4 | 5 | 4.913 | 4.979 | 0.093 |
| 5 | 10 | 10.05 | 10.02 | 0.185 |

| condition | simulated effect | observed mean | observed se |
|---|---|---|---|
| A | -50 | 51.7 | 2.14 |
| B | -20 | 78.6 | 2.14 |
| C | 0 | 101.7 | 2.15 |
| D | 20 | 119.4 | 2.15 |
| E | 50 | 151.4 | 2.16 |
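The simulated design in the table can be approximated with a short sketch. Several points are assumptions rather than details given in the caption: which session each condition skips (`MISSING`), the noise model (Gaussian noise added before multiplying by the session factor), and above all the estimator, which is a simple ratio-style calculation based on geometric means of between-session ratios. It is not claimed to be the authors' ratio method, and the maximum likelihood approach is not shown.

```python
import math
import random
from statistics import mean

FACTORS = {1: 0.1, 2: 0.2, 3: 1.0, 4: 5.0, 5: 10.0}       # session factors from the table
EFFECTS = {"A": -50, "B": -20, "C": 0, "D": 20, "E": 50}  # condition effects from the table
# Assumption: the caption does not say which session each condition skips.
MISSING = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}

def simulate(sd=10.0, overall_mean=100.0, n_obs=5, seed=0):
    """Return {(condition, session): [observations]} with a multiplicative session effect."""
    rng = random.Random(seed)
    data = {}
    for cond, effect in EFFECTS.items():
        for session, factor in FACTORS.items():
            if session == MISSING[cond]:
                continue  # incomplete design: this combination was not measured
            data[cond, session] = [
                factor * (overall_mean + effect + rng.gauss(0.0, sd))
                for _ in range(n_obs)
            ]
    return data

def geo_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(mean([math.log(v) for v in values]))

def estimate_factors(data):
    """Ratio-style session factors; their geometric mean is 1 by construction."""
    cell = {key: mean(obs) for key, obs in data.items()}
    sessions = sorted({s for _, s in cell})
    conditions = sorted({c for c, _ in cell})
    factors = {}
    for j in sessions:
        ratios = []
        for k in sessions:
            shared = [c for c in conditions if (c, j) in cell and (c, k) in cell]
            # Geometric mean of the per-condition ratios between sessions j and k.
            ratios.append(geo_mean([cell[c, j] / cell[c, k] for c in shared]))
        factors[j] = geo_mean(ratios)
    return factors

def factor_correct(data, factors):
    """Divide every observation by the estimated factor of its session."""
    return {key: [v / factors[key[1]] for v in obs] for key, obs in data.items()}

data = simulate()
estimated = estimate_factors(data)
corrected = factor_correct(data, estimated)
```

Because the simulated factors in the table have a geometric mean of 1, the estimates from this sketch are directly comparable to them. With `sd=0.0` the factors are recovered up to floating-point error, and with noise they should scatter around the simulated values.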
Figure 3. Virus production of HIV-1 variants. The HIV-1 molecular clone LAI and derivatives with a modified mechanism of transcription regulation [13] and variation in the viral Tat gene were transfected into C33A cells. Virus production was measured at two days after transfection. The experiment was repeated seven times. A: mean values with standard deviation of observed data. B: normalisation of the data with the WT construct set at 100% in each session. C: corrected data after standardisation. D: data after removal of between-session variation with factor correction. WT: HIV-rtTA construct with wild-type Tat gene; A-D: HIV-rtTA variants with mutated Tat genes (to be described elsewhere); LAI: HIV-LAI proviral clone with unmodified mechanism of transcription regulation. An asterisk indicates a statistically significant difference between the virus variant and WT (t-test; P < 0.05). The number of observations per variant is 8.