I F Goldstein, J L Fleiss, M Goldstein, L Landovitz.
Abstract
In epidemiological studies using linear regression, it is often necessary for reasons of economy or unavailability of data to use as the independent variable not the variable ideally demanded by the hypothesis under study but some convenient practical approximation to it. We show that if the correlation coefficient between the "practical" and "ideal" variables can be obtained, then a range of uncertainty can be obtained within which the desired regression coefficient of dependent on "ideal" variable may lie. This range can be quite wide, even if the practical and ideal variables are fairly well correlated. These points are illustrated with data on observed regression coefficients from an air pollution epidemiological study, in which pollution measured at one station in a large metropolitan area (containing 40 aerometric stations) was used as the practical approximation to the city-wide average pollution. The uncertainties in the regression coefficients were found to exceed the regression coefficients themselves by large factors. The problem is one that may afflict application of linear regression in general, and suggests caution when selecting independent variables for regression analysis on the basis of convenience, rather than relevance to the hypotheses tested.
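The abstract does not give the formula for the range, so the following is only a minimal sketch of one standard way such a range can arise, not necessarily the authors' derivation. It assumes the textbook constraint that the three pairwise correlations among the dependent variable Y, the practical variable X, and the ideal variable Z must form a valid correlation matrix, and it further assumes the standard deviations of Y and Z are known. The function names and the numbers in the usage note are hypothetical.

```python
import math


def correlation_bounds(r_xy, r_xz):
    """Range of r_yz compatible with given r_xy and r_xz.

    Follows from requiring the 3x3 correlation matrix of (X, Y, Z)
    to be positive semidefinite. This is an assumed, standard
    constraint, not a formula taken from the abstract.
    """
    center = r_xy * r_xz
    half_width = math.sqrt((1.0 - r_xy ** 2) * (1.0 - r_xz ** 2))
    # Clamp only to guard against floating-point overshoot.
    return max(-1.0, center - half_width), min(1.0, center + half_width)


def slope_bounds(r_xy, r_xz, sd_y, sd_z):
    """Range for the slope of Y regressed on the ideal variable Z.

    Uses slope = r_yz * sd_y / sd_z, with sd_y and sd_z assumed known.
    """
    lo, hi = correlation_bounds(r_xy, r_xz)
    scale = sd_y / sd_z
    return lo * scale, hi * scale
```

With hypothetical values r_xy = 0.2 (dependent versus practical variable) and r_xz = 0.8 (practical versus ideal variable), `correlation_bounds(0.2, 0.8)` gives a range of about -0.43 to 0.75 for the correlation with the ideal variable. Under these assumptions even the sign of the desired coefficient is undetermined, which is consistent with the abstract's remark that the range can be wide despite a fairly high correlation between the two variables.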
Year: 1979
PMID: 540603
PMCID: PMC1637915
DOI: 10.1289/ehp.7932311
Source DB: PubMed
Journal: Environ Health Perspect
ISSN: 0091-6765
Impact factor: 9.031