Application of epidemiological findings to individuals.

Paolo Boffetta¹, Andrea Farioli², Emanuele Rizzello³.

Abstract

Three types of issues need to be considered in the application of epidemiology results to individuals. First, epidemiology results are subject to random error, and can be applied only to an ideal subject with average values of all variables under study, including potential confounders included in the regression models. Second, the observational nature of epidemiology makes it susceptible to systematic error, and any extrapolation to individuals would mirror the validity of the original results. Quantitative bias analysis has been proposed to assess the likelihood, direction and magnitude of bias, but this has not yet become part of the normal practice of epidemiology. Finally, external validity of the results (i.e., their application to individuals and populations other than those included in the underlying studies) needs to be addressed, including population-based factors, such as heterogeneity in exposure or disease circumstances, and individual-based factors, such as interaction of the risk factors of interest with other determinants of the disease. Similar considerations apply to the application of results of clinical trials to individual patients, although in these studies sources of systematic error are better controlled.

Year: 2020 PMID: 32096769 PMCID: PMC7809964 DOI: 10.23749/mdl.v111i1.9055

Source DB: PubMed Journal: Med Lav ISSN: 0025-7818 Impact factor: 1.275

Introduction

It is customary to separate clinical medicine, aimed at improving the health of individuals through prevention or treatment, from public health, aimed at improving health at the population level. Epidemiology provides research tools to both public health and clinical medicine: the implications of epidemiology results often clearly address one of these two domains, but in numerous instances they are applicable to both. Moving from these considerations, Rogawski et al. (23) distinguish between public health epidemiology, which “informs interventions that are applied to populations or that confer benefits beyond the individual” and medical epidemiology, which “informs interventions that improve the health of treated individual”. Based on this distinction, they argue in favor of public health epidemiology, which, in their opinion, has been neglected in favor of individual-oriented approaches. We argue that such a dichotomy is epistemologically incorrect, and we provide a framework to apply epidemiology results to both populations and individuals. It has been a long time since Rose highlighted the link between “sick individuals and sick populations” (24).

The nature of epidemiologic results

Epidemiology measures health-related conditions and events in groups of individuals, and compares them to derive inferences on possible determinants. Its results therefore represent averages of the likelihood that the condition or event occurs (or, in the case of continuous variables, that they take a particular value or range) in the different groups under study: at the individual level, the corresponding likelihood is just zero or one. For example, a measure of incidence indicates the number of individuals in whom the event of interest (e.g., diagnosis of a disease) occurs over the person-time of observation: while the measure can be interpreted as a hypothetical average likelihood that the event occurred in each individual under study, the actual individual likelihood of occurrence was one for cases and zero for non-cases. Analogously, a comparative measure such as a ratio of incidences between two groups indicates the ratio of the average likelihoods of the individuals in the two groups. The application of group-based likelihoods, and their comparative measures, to individuals is particularly helpful for making predictions regarding individuals outside the population-time under study: in practice, we predict the risk of an individual dying over a given period of time based on the most recent mortality rates of their population or, if these are not available, of a similar population, and we apply comparative measures to predict the risk of an individual with a given characteristic relative to their counterfactual without that characteristic. It is important to note that these considerations apply to results of studies based on both observational (epidemiology studies) and experimental designs (so-called clinical trials), although there are differences in their interpretation, as discussed below.

Issues in the application of group-derived measures to individuals

Precision

All biological variables are subject to random error, which is operationalized using probability distribution models derived from frequentist statistics: measures aimed at quantifying the variability, such as the standard error and the confidence interval, are customarily reported in clinical and epidemiological studies. The notion of random error and its quantification are familiar to most medical researchers: a simple interpretation is that the central measure of the parameter represents the value in the “average” individual or patient, and that the distribution of all individuals and patients in real life is described by the measures of variability. If the measures of interest are conditional on the distribution of other variables, as in the case of adjustment for potential confounders in stratified or regression analysis, those variables will also be considered in determining the elusive “average” subject. In this context, a multivariable relative risk of lung cancer among smokers equal to 10 can be interpreted as the ratio of the likelihood of developing lung cancer of the “average” smoker in the study population to that of their non-smoking counterfactual, where “average” refers not only to the variables capturing the carcinogenic effect of tobacco smoking (amount, duration, time since quitting, age at start, etc.), but also to other variables included in the regression models. Such an ideal average individual, and their counterfactual, are useful simplifications to explain the implications of group-based results (in the example above, “Our study shows that the risk of lung cancer of a smoker is 10 times higher than that of a non-smoker”); these results, however, cannot be applied with certainty to any real individual.
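As a minimal illustration of how random error is usually quantified, the sketch below computes a rate ratio with a Wald-type 95% confidence interval on the log scale. The case counts and person-years are hypothetical, the function name `rate_ratio_ci` is ours, and the large-sample approximation SE(ln RR) = sqrt(1/a + 1/b) is one common choice among several, not a method taken from the studies discussed here.

```python
import math

def rate_ratio_ci(cases_exp, pt_exp, cases_unexp, pt_unexp, z=1.96):
    """Rate ratio with a Wald confidence interval computed on the log scale."""
    rr = (cases_exp / pt_exp) / (cases_unexp / pt_unexp)
    # Large-sample standard error of ln(RR) for two Poisson counts
    se_log_rr = math.sqrt(1 / cases_exp + 1 / cases_unexp)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical data: 50 cases in 1,000 person-years among the exposed,
# 10 cases in 2,000 person-years among the unexposed
rr, lower, upper = rate_ratio_ci(50, 1000, 10, 2000)
print(f"RR = {rr:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

The interval expresses uncertainty about the group-level ratio; it does not indicate which individual smokers will or will not develop the disease.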

Internal validity

Random error is not the only factor complicating the application of group-derived results of epidemiology studies to individuals. Well-designed and conducted epidemiologic studies provide the best risk estimates when experimental approaches are not applicable. The observational nature of epidemiologic research, however, makes it susceptible to systematic error. Complete control of bias and confounding can seldom be achieved because of residual and unmeasured confounding, selection and information bias, and publication bias. Although the effect of known and measurable confounders can be controlled, at least in part, by including appropriate terms in regression models, control of bias requires appropriate provisions in the design, conduct and analysis of the study (25). In addition, quantitative bias analysis has been increasingly used to assess the possible effect of selected sources of bias (14). This represents a formal approach to provide a quantitative estimate of the likelihood, direction, and magnitude of the error introduced by one or multiple sources of bias. Several types of quantitative bias analysis have been described, depending on whether one or multiple types of bias are addressed, and whether a fixed value or a range of values is assigned to the bias parameters (15). The steps of a bias analysis are (i) identifying potential sources of bias, (ii) identifying sources of information on the bias parameters, (iii) deriving alternative values for the original study variables, and (iv) quantifying their effect on the original results. It has been recommended that quantitative bias analysis accompany any presentation of results of observational studies (7); however, most investigators simply ignore this recommendation, which leaves an unknown amount of bias in the results of such studies.
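Steps (i) to (iv) can be illustrated with one of the simplest forms of quantitative bias analysis: an external adjustment of an observed relative risk for a single unmeasured binary confounder, using fixed bias parameters. This is a sketch under stated assumptions, not the specific procedure of the cited recommendations. All numbers are hypothetical, the function name is ours, and the formula assumes that the confounder-disease relative risk is the same in exposed and unexposed subjects.

```python
def externally_adjusted_rr(rr_observed, rr_confounder, prev_exposed, prev_unexposed):
    """Adjust an observed relative risk for one unmeasured binary confounder.

    rr_confounder:  assumed confounder-disease relative risk
    prev_exposed:   assumed prevalence of the confounder among the exposed
    prev_unexposed: assumed prevalence of the confounder among the unexposed
    """
    bias_factor = (prev_exposed * (rr_confounder - 1) + 1) / (
        prev_unexposed * (rr_confounder - 1) + 1
    )
    return rr_observed / bias_factor

# (i) Potential source of bias: smoking was not measured in a hypothetical
#     occupational study that reported RR = 1.5.
# (ii) Bias parameters (assumed): smoking-disease RR of 10, and three scenarios
#      for smoking prevalence among exposed vs. unexposed workers.
rr_observed = 1.5
scenarios = [(0.40, 0.40), (0.50, 0.40), (0.60, 0.40)]

# (iii)-(iv) Derive the adjusted estimate under each scenario.
adjusted = [externally_adjusted_rr(rr_observed, 10.0, p1, p0) for p1, p0 in scenarios]
for (p1, p0), rr_adj in zip(scenarios, adjusted):
    print(f"smoking prevalence {p1:.0%} vs {p0:.0%}: adjusted RR = {rr_adj:.2f}")
```

Repeating the calculation over a range of plausible bias parameters shows how strongly the conclusion depends on quantities that the study itself cannot verify.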

External validity

External validity concerns the applicability of the results of a study to a population other than that under study. It is also referred to as ‘generalizability’ of the results. Lack of external validity does not reduce the ability of a study to contribute to causal inference, and failure to recognize this fact is one of the most common mistakes in the interpretation of clinical and epidemiological studies. However, external validity becomes an important issue when group-based results are applied to individuals. The considerations made above on the need to identify the “average” study subject to account for random error, and to control sources of bias to generate valid results, apply only to the populations from which the results were generated. Any application of results of epidemiological or clinical research to individuals outside the populations under study should address factors that may differ between the two. These factors can be operationally divided into two groups. The first group comprises factors external to the individuals to whom results are to be applied. Differences in exposure circumstances are one such factor. Table 1 illustrates this phenomenon in the case of the lower risk of lung cancer from tobacco smoking in Chinese smokers compared to European and American smokers (13). In a series of elegant studies conducted in two populations of Chinese smokers from Shanghai and Singapore, Yuan et al. (28) showed that the lower risk was likely due to the characteristics of the cigarettes consumed in China vs. Singapore: although the level of urinary cotinine (a marker of the amount of tobacco smoking) was comparable in the two groups, the level of 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL), a tobacco-specific nitrosamine and a marker of the carcinogenicity of tobacco smoke, was significantly lower in smokers from Shanghai, likely due to differences in curing and manufacturing processes between the traditional local Chinese cigarettes smoked in Shanghai and the standard industrial cigarettes smoked in Singapore.

Table 1

Levels of cotinine and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL) in urine samples of smoking lung cancer cases from Shanghai and Singapore (28)

Urinary biomarker	Shanghai (N=155)	Singapore (N=91)
Cotinine (ng/mg creatinine)	3,033	2,873
NNAL (pmol/mg creatinine)	0.23	0.89

Most occupational epidemiological studies are of retrospective design, and address the health effects of exposure that occurred in the past. In many industries there have been changes in technology and industrial hygiene, which have resulted in important changes in exposure circumstances and levels. Although in some cases detailed dose-response and other data are available that help transfer the results of these studies to other populations of workers in the same industries or jobs, in many instances these data are not available. Use of results of studies conducted in other countries, where technological and industrial hygiene conditions might differ, is another potential source of lack of generalizability. Another factor is the need to consider absolute, rather than relative, measures of occurrence. Exposure to a risk factor increases the number of cases of the disease of interest in the population, i.e., its incidence; however, the relative measure of incidence depends also on the incidence in the unexposed group. If the exposure adds the same absolute excess incidence in two populations (i.e., in the absence of effect modification on the additive scale), the relative measure of incidence will therefore be lower in the population with the higher incidence in the unexposed, as shown in table 2.
A well-known example of this phenomenon is the apparently stronger association between tobacco smoking and lung cancer in women compared to men: although various explanations have been proposed, such as a role of hormonal factors (21), the most likely explanation is the higher rate of lung cancer among men from causes other than tobacco smoking (e.g., occupational exposures). A reduction of the role of these other factors in men explains why the gender gap in relative risks observed in the past has disappeared in recent studies (22). The presence of interaction between the risk factor of interest and the determinants of the incidence among the unexposed would further affect the relative risk.

Table 2

Effect of incidence in the unexposed on the relative risk. Hypothetical example of two populations with 1000 exposed and 1000 unexposed subjects each, and a higher incidence in the unexposed in one population. The incidence due to the exposure is set to 40/1000 in both populations

	Population 1	Population 2
Incidence rate in unexposed	10/1000	20/1000
Incidence rate in exposed	50/1000	60/1000
Rate ratio	5	3
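The arithmetic behind Table 2 can be reproduced in a few lines. The sketch below uses only the figures given in the table; the function name is ours. It shows that the same absolute excess incidence (40/1000) yields different rate ratios when the baseline incidence differs.

```python
def rate_ratio(baseline_per_1000, excess_per_1000):
    """Rate ratio when exposure adds a fixed excess incidence to the baseline."""
    return (baseline_per_1000 + excess_per_1000) / baseline_per_1000

excess = 40  # incidence due to the exposure, per 1000, as in Table 2
for label, baseline in [("Population 1", 10), ("Population 2", 20)]:
    print(
        f"{label}: unexposed {baseline}/1000, "
        f"exposed {baseline + excess}/1000, "
        f"rate ratio {rate_ratio(baseline, excess):.0f}, "
        f"rate difference {excess}/1000"
    )
```

The rate difference is identical in the two populations, which is why absolute measures transfer across populations more directly than relative ones in this hypothetical setting.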

Characteristics of the individuals represent the second group of factors that may affect external validity. The simplest form they can take is that of modifiers of the effect of the exposure of interest (interaction), which is illustrated, in the form of a positive interaction, in table 3. The incidence among those unexposed to either factor is 10/1000. In the absence of factor B, exposure to factor A increases the incidence by 10/1000; in the absence of factor A, exposure to factor B increases the incidence by 20/1000. In the absence of interaction, the incidence among those exposed to both factors should be (10+10+20) = 40/1000; in the example in the table, the incidence among those exposed to both is 50/1000, suggesting a positive interaction between the two exposures (for the sake of simplicity, no consideration is given to the statistical significance of the interaction term). When rate ratios are used instead of incidence rates, and the group unexposed to both factors is taken as reference, the interaction is described by the formula $RR_{AB} \neq RR_{A} + RR_{B} - 1$, where $RR_{A}$, $RR_{B}$ and $RR_{AB}$ are the rate ratios for exposure to A only, to B only, and to both factors.

Table 3

Hypothetical example of positive interaction between two risk factors on the incidence of a disease

	Incidence of the disease
	Unexposed to A	Exposed to A
Unexposed to B	10/1000	20/1000
Exposed to B	30/1000	50/1000

Interaction is conceptually similar to the problem of difference in background incidence across populations described above; however, it applies to the characteristics of the individuals, irrespective of the distribution of the two risk factors in the population. Several examples of interaction have been identified among causes of chronic diseases, both genetic and environmental (in a broad sense). Although their effect on the risk of disease at the individual level can in principle be accounted for, a precise estimate of their magnitude is available only for a fraction of them, such as tobacco smoking and asbestos for lung cancer (20) and tobacco smoking and alcohol drinking for head and neck cancer (9).
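The check for departure from additivity in Table 3 can be written out explicitly. The sketch below uses only the incidences reported in the table; the variable names are ours. The final quantity, RR_AB - (RR_A + RR_B - 1), is sometimes called the relative excess risk due to interaction, and it equals zero when the two effects are purely additive.

```python
# Incidence per 1000 from Table 3, keyed by (exposed to A, exposed to B)
incidence = {
    (False, False): 10,
    (True, False): 20,
    (False, True): 30,
    (True, True): 50,
}

baseline = incidence[(False, False)]
excess_a = incidence[(True, False)] - baseline   # excess due to A alone
excess_b = incidence[(False, True)] - baseline   # excess due to B alone

# Joint incidence expected if the two excess incidences simply add up
expected_additive = baseline + excess_a + excess_b

# The same comparison on the ratio scale, with the doubly unexposed as reference
rr_a = incidence[(True, False)] / baseline
rr_b = incidence[(False, True)] / baseline
rr_ab = incidence[(True, True)] / baseline
departure = rr_ab - (rr_a + rr_b - 1)

print(f"expected under additivity: {expected_additive}/1000, "
      f"observed: {incidence[(True, True)]}/1000")
print(f"RR_A = {rr_a:.0f}, RR_B = {rr_b:.0f}, RR_AB = {rr_ab:.0f}, "
      f"departure from additivity = {departure:.0f}")
```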

Adequacy of statistical models aimed at measuring associations

In current epidemiological practice, observational studies are designed to identify or confirm an association between an exposure and the occurrence of a certain disease. The focus is placed on the measures of association (such as the relative risk, when measuring on the multiplicative scale, or the risk difference, when measuring on the additive scale); little attention is devoted to the overall performance of the regression model. Sometimes researchers present direct comparisons between two or more regression models by applying statistical tests (e.g., the likelihood ratio test (11)) or information criteria that are largely based on the likelihood function of the model (e.g., the Bayesian Information Criterion or the Akaike Information Criterion) (2, 3, 26). Of note, such comparisons indicate whether a specific model fits the sample data better than a few others; however, they do not convey information on the absolute goodness-of-fit of the models. A perhaps even worse practice is testing the goodness-of-fit (for instance, through the Hosmer-Lemeshow goodness-of-fit test (10)) and interpreting a low p-value as an indication that the model is performing well; in fact, these tests only inform that introducing a specific variable in the regression model contributes to improving the goodness-of-fit, but do not provide a meaningful measure of the overall performance of the regression model. People with a quantitative background (e.g., industrial hygienists) should be familiar with measuring and reporting the proportion of the variance in the dependent variable (disease status) that is predictable from the independent variable (exposure); in the context of simple linear regression, this can be achieved with the coefficient of determination (usually reported as R²). Outside linear regression, similar measures have been proposed, such as the McFadden pseudo-R² for logistic regression (17, 18).
This index takes a value of 0 in the empty model (no predictive value) and a value of 1 in case of perfect prediction. A conceptually similar index is Harrell’s C index of concordance, estimated after fitting a Cox proportional hazards regression model (8). Describing the properties and the (several) limitations of these indices goes beyond the scope of this paper. However, one consideration is worthwhile: how often is the reader of an epidemiological paper informed about the absolute goodness of fit of a regression model whose results are reported in the classic form of one or several relative risks (and corresponding confidence intervals and p-values)? It has been shown that in most observational studies the absolute goodness of fit of regression models is rather low (e.g., odds ratios from case-control studies with a McFadden pseudo-R² not higher than 0.3) (19). This circumstance might not be a limitation if the purpose of the analysis is to demonstrate the effect of a certain exposure in increasing (or decreasing) the relative or absolute risk of a specific condition; indeed, estimates of association will be valid as long as bias, including confounding, can be excluded (see above), independently of the overall goodness of fit of the regression model. Conversely, knowledge of the overall model performance measured through an absolute goodness-of-fit index is fundamental if the goal is to answer the following questions: Did a specific subject in the study population develop the condition under investigation due to a specific exposure (in-sample prediction)? Will a specific subject develop the condition under investigation, and when (out-of-sample prediction)? To be answered, these questions need an extremely high predictive value from the underlying regression models. Outside the clinical context (e.g., prediction of tumor response based on treatment protocols), this condition is seldom achieved.
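To make the behaviour of the McFadden pseudo-R² concrete, the following sketch computes it as 1 - lnL(model)/lnL(null) from a vector of 0/1 outcomes and a vector of predicted probabilities. The data and the three sets of predictions are hypothetical and hand-picked for illustration; in practice the probabilities would come from a fitted logistic regression. The function names are ours.

```python
import math

def log_likelihood(outcomes, probabilities):
    """Bernoulli log-likelihood of 0/1 outcomes given predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1 - p)
        for y, p in zip(outcomes, probabilities)
    )

def mcfadden_pseudo_r2(outcomes, probabilities):
    """1 - lnL(model) / lnL(null); the null model predicts the overall prevalence."""
    prevalence = sum(outcomes) / len(outcomes)
    ll_null = log_likelihood(outcomes, [prevalence] * len(outcomes))
    ll_model = log_likelihood(outcomes, probabilities)
    return 1 - ll_model / ll_null

# Hypothetical data: 4 cases among 10 subjects
outcomes = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
null_fit = [0.4] * 10                    # empty model, no predictive value
weak_fit = [0.5] * 4 + [1 / 3] * 6       # modest separation of cases and non-cases
sharp_fit = [0.99] * 4 + [0.01] * 6      # near-perfect prediction

r2 = {name: mcfadden_pseudo_r2(outcomes, fit)
      for name, fit in [("null", null_fit), ("weak", weak_fit), ("sharp", sharp_fit)]}
for name, value in r2.items():
    print(f"{name}: pseudo-R2 = {value:.3f}")
```

In this toy example the weakly separating predictions stay below 0.3, the range quoted above for many observational studies, while individual-level prediction would require values close to that of the sharp fit.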
A worthwhile example is the calculation of the risk of cardiovascular events based on a few strong determinants that are highly prevalent in the general population. The best-known example is the so-called “Framingham score”, which consists of a series of formulas derived from Cox proportional hazards regression models applied to a prospective population-based cohort study (4). The authors were able to fit models with a Harrell’s C >0.7 (a conventional, somewhat questionable, threshold that identifies models with good predictive value) based on a few variables: gender (the models were actually gender-specific), age, diabetes status, tobacco smoking, treated and untreated systolic blood pressure, total and high-density lipoprotein cholesterol (or body mass index, as a surrogate measure). Knowledge of these few data is used in current clinical practice to predict the 10-year risk of cardiovascular disease. However, a large body of literature suggests that the external validity of the formula might be limited (e.g., (27)); in particular, an overestimation of the risk has been observed in certain populations (5). This could occur because of improvements in the treatment and control of predisposing conditions (such as hypertension and diabetes) or because of a different baseline risk determined by lifestyle (including diet) and genetic factors. In summary, the use of estimates from observational studies to predict individual events is a complex process often hampered by the lack of fundamental knowledge on the disease process and, hence, by the limited predictive value of the multivariable regression models used to generate the results.
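For readers unfamiliar with the concordance index mentioned above, a naive pairwise version can be sketched as follows. This is an illustrative implementation under simplifying assumptions (no special handling of tied follow-up times, at least one usable pair), not the code used for the Framingham models; the follow-up data and risk scores are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Concordance index for right-censored data (naive O(n^2) pairwise version).

    A pair is usable when the subject with the shorter follow-up had the event.
    It is concordant when that subject also has the higher predicted risk;
    ties in predicted risk count as one half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must have the event strictly before subject j's follow-up ends
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable

# Hypothetical follow-up of five subjects (years); 1 = event, 0 = censored
times = [2, 4, 5, 7, 9]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.5, 0.1]  # higher score = higher predicted risk

c_index = harrell_c(times, events, risk_scores)
print(f"Harrell's C = {c_index:.2f}")
```

A value of 0.5 corresponds to ranking subjects no better than chance and 1.0 to perfect ranking, so a threshold of 0.7 still leaves many pairs of individuals ordered incorrectly.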

Considerations about clinical trials

The above discussion was formulated with respect to observational research. One can argue that these considerations do not apply to experimental studies, in which the determinant under investigation (exposure) is assigned to study subjects. In this respect, the results of trials, and in particular clinical trials, are directly applicable to individual patients with the same conditions as those included in the trials. After all, when clinicians prescribe a new drug to their patients based on the results of a trial, they do so because they expect in the patients the same effect shown in the trial. If clinical and other medical trials are well designed and executed, they can prevent bias from affecting their results. However, the other two sources of error in applying results from populations to individuals that were described above for observational studies also apply to trials. Results of trials are affected by random error, and would apply precisely only to a hypothetical “average” patient. In practice, the clinicians mentioned above would not be so naïve as to expect in each of their patients exactly the result reported in the trial: they prescribe the new drug with the expectation of seeing in their patients, on average, the effect observed in the trial, but they recognize that there might be plenty of individual variation in the response. More important, however, is the issue of external validity of results of experimental studies. The problem that trials, in particular treatment trials, include selected samples of the patients who might in principle benefit from the treatments under study has been increasingly recognized in the medical literature, in particular with respect to sociodemographic characteristics such as age (e.g., underrepresentation of elderly patients in clinical trials (16)) and race/ethnicity (e.g., overrepresentation of non-Hispanic Whites (6)).

Conclusions

Considerations about the applicability of results of epidemiology studies to individuals are analogous to those developed within the framework of personalized medicine. The goal of personalized medicine is to describe all individual characteristics that determine the response of the individual patient to a given treatment, and to select the most effective one (1). An analogous approach can be invoked for epidemiology, although issues of internal validity would complicate the process, as discussed above. Although an exhaustive description of all relevant individual factors remains elusive, steps can be taken in this direction. Systematic reviews, meta-analyses and umbrella reviews (12) help improve the precision of risk estimates and offer the opportunity for stratified analyses to address sources of heterogeneity of results across populations. Routine application of quantitative bias analysis (15), as discussed above, would improve the validity of inferences at the individual level. Integration of biology and epidemiology would contribute to reducing uncertainties on the external validity of the results. In conclusion, epidemiology results can be applied to individuals under the stringent framework we outlined here. As in most instances random error and threats to internal and external validity are only partially controlled, extrapolation to individuals remains tentative at best. One case in which extrapolation to individuals may be justified is that of high-penetrance susceptibility genes, for which results of clinical or epidemiological studies have shown such a high risk in carriers that considerations about random and systematic error have less relevance, and it may be justified to assume external validity even in the absence of direct evidence supporting it.
No potential conflict of interest relevant to this article was reported by the authors. PB is an associate editor of the journal, but this article was reviewed by an anonymous reviewer, who provided useful suggestions to improve it.

References

1. Comparisons of established risk prediction models for cardiovascular disease: systematic review.

Authors: George C M Siontis; Ioanna Tzoulaki; Konstantinos C Siontis; John P A Ioannidis
Journal: BMJ Date: 2012-05-24

2. Barriers to recruiting underrepresented populations to cancer clinical trials: a systematic review.

Authors: Jean G Ford; Mollie W Howerton; Gabriel Y Lai; Tiffany L Gary; Shari Bolen; M Chris Gibbons; Jon Tilburt; Charles Baffi; Teerath Peter Tanpitukpongse; Renee F Wilson; Neil R Powe; Eric B Bass
Journal: Cancer Date: 2008-01-15 Impact factor: 6.860

3. On the Need for Quantitative Bias Analysis in the Peer-Review Process.

Authors: Matthew P Fox; Timothy L Lash
Journal: Am J Epidemiol Date: 2017-05-15 Impact factor: 4.897

4. Good practices for quantitative bias analysis.

Authors: Timothy L Lash; Matthew P Fox; Richard F MacLehose; George Maldonado; Lawrence C McCandless; Sander Greenland
Journal: Int J Epidemiol Date: 2014-07-30 Impact factor: 7.196

5. Sick individuals and sick populations.

Authors: G Rose
Journal: Int J Epidemiol Date: 1985-03 Impact factor: 7.196

6. Evaluating the yield of medical tests.

Authors: F E Harrell; R M Califf; D B Pryor; K L Lee; R A Rosati
Journal: JAMA Date: 1982-05-14 Impact factor: 56.272

7. General cardiovascular risk profile for use in primary care: the Framingham Heart Study.

Authors: Ralph B D'Agostino; Ramachandran S Vasan; Michael J Pencina; Philip A Wolf; Mark Cobain; Joseph M Massaro; William B Kannel
Journal: Circulation Date: 2008-01-22 Impact factor: 29.690

8. Urinary levels of tobacco-specific nitrosamine metabolites in relation to lung cancer development in two prospective cohorts of cigarette smokers.

Authors: Jian-Min Yuan; Woon-Puay Koh; Sharon E Murphy; Yunhua Fan; Renwei Wang; Steven G Carmella; Shaomei Han; Katie Wickham; Yu-Tang Gao; Mimi C Yu; Stephen S Hecht
Journal: Cancer Res Date: 2009-03-24 Impact factor: 12.701