Literature DB >> 26668173

Debunking the curse of the rainbow jersey: historical cohort study.

Thomas Perneger1.   

Abstract

OBJECTIVE: To understand the underlying mechanism of the "curse of the rainbow jersey," the lack of wins that purportedly affects the current cycling world champion.
DESIGN: Historical cohort study.
SETTING: On the road. PARTICIPANTS: Professional cyclists who won the World Championship Road Race or the Tour of Lombardy, 1965-2013. MAIN OUTCOME MEASURES: Number of professional wins per season in the year when the target race was won (year 0) and in the two following years (years 1 and 2; the world champion wears the rainbow jersey in year 1). The following hypotheses were tested: the "spotlight effect" (that is, people notice when a champion loses), the "marked man hypothesis" (the champion, who must wear a visible jersey, is marked closely by competitors), and "regression to the mean" (a successful season will be generally followed by a less successful one).
RESULTS: On average, world champions registered 5.04 wins in year 0, 3.96 in year 1, and 3.47 in year 2; meanwhile, winners of the Tour of Lombardy registered 5.08, 4.22, and 3.83 wins. In a regression model that accounted for the propensity to win of each rider, the baseline year accrued more wins than did the other years (win ratio 1.49, 95% confidence interval 1.24 to 1.80), but the year in the rainbow jersey did not differ significantly from other cycling seasons.
CONCLUSIONS: The cycling world champion is significantly less successful during the year when he wears the rainbow jersey than in the previous year, but this is best explained by regression to the mean, not by a curse. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

Entities:  

Mesh:

Year:  2015        PMID: 26668173      PMCID: PMC4986283          DOI: 10.1136/bmj.h6304

Source DB:  PubMed          Journal:  BMJ        ISSN: 0959-8138


Mark Cavendish, 2011 road world champion

Introduction

Samuel Johnson chided doctors for believing that if a patient got better it was because they sent him to the waters, for mistaking “subsequence for consequence.”1 The alternative explanation—that patients consult when they feel poorly, and most get better regardless of treatment—requires a grasp of random variation. Mostly, we struggle with randomness.2 Doctors are not the only culprits. Consider professional cycling and the “curse of the rainbow jersey.”3 The “rainbow” jersey is worn by the current cycling world champion (it is an odd rainbow: the jersey is white, with bands of blue, red, black, yellow, and green across the chest). In 1965 British cyclist Tom Simpson won the World Championship Road Race, then broke his leg while skiing during the following winter and lost his 1966 season to this and other injuries. In the ensuing years, champion after champion encountered all manner of misery while wearing the jersey: injury, disease, family tragedy, doping investigations, even death, but especially a lack of wins.3 It soon became obvious that the rainbow jersey was cursed. Several explanations can be entertained. One is that the world champion is as likely to encounter difficulties as anyone, but, as he is the champion, people notice more. This is the “spotlight effect.” Another explanation is that the world champion, very noticeable in the rainbow jersey, is marked more closely by rivals, which lowers his chances of winning. This is the “marked man hypothesis.” Finally, random variation in success rates ensures that a very successful season, such as one during which the rider has won a major race, is likely to be followed by a less successful season. This is the “regression to the mean” phenomenon.4 In this study, I explored to what extent these hypotheses are supported by racing results of cycling champions.

Methods

The study population included winners of the Union Cycliste Internationale men’s World Championship Road Race from 1965 to 2013 and, for comparison, the winners of the Tour of Lombardy of the same years. The latter race is of comparable importance—it is one of five “monuments” among classic one day races—and takes place at the end of the racing season, just like the World Championship. The outcome variable was the number of individual wins in professional races during a given year, obtained from a publicly accessible database (www.procyclingstats.com). Win counts were obtained for three calendar years: year 0, at the end of which the rider won the target race (World Championship or Tour of Lombardy); year 1, during which the world champion wore the allegedly cursed jersey; and year 2, when all riders returned to curse-free status.

Study hypotheses

The hypothesised patterns for the average numbers of wins are (fig 1):

Fig 1  Three hypotheses under consideration: expected average number of wins in year when race took place (year 0), following year (year 1), and year after that (year 2), for winner of World Championship Road Race (empty circles) and winner of Tour of Lombardy (full circles)

Fig 1  Three hypotheses under consideration: expected average number of wins in year when race took place (year 0), following year (year 1), and year after that (year 2), for winner of World Championship Road Race (empty circles) and winner of Tour of Lombardy (full circles) “Spotlight effect”—the problems of the world champion are apparent only because of increased media attention, so the numbers of wins remain at the same level for the three years. “Marked man” hypothesis (indistinguishable from the rainbow curse)—a decrease in wins affects the current world champion, but this effect disappears in year 2 and does not affect the Lombardy winner. “Regression to the mean”—year 0 is a high outlier, and the number of wins returns to a lower level in years 1 and 2. The pattern is identical for the Lombardy winner. Combination of “marked man” and “regression to the mean.”

Statistical analysis

I tabulated the mean numbers of professional victories per rider and per year separately for winners of the World Championship and of the Tour of Lombardy. I used the Wilcoxon paired test for year to year comparisons. I used mixed negative binomial regression to evaluate the hypotheses.5 The dependent variable was the annual number of wins. Each rider was afforded an individual tendency to win, represented below by the random intercept αi. The index “i” identified the rider and remained identical if a rider won more than one target race (for example, Eddy Merckx won five target races and contributed 15 data points). An annual win count appeared more than once if it counted towards more than one target win; for example, for a repeat champion, the win total for year 1 of the first title was also the win total for year 0 of the second title. I built four models. The first (model 1) represented the “spotlight effect” and added to the random intercept a fixed effect for the race (World Championship=0, Tour of Lombardy=1): log(wins)=αi rideri+β1 Lombardy. The model of the “marked man” hypothesis (model 2) added a fixed effect for the year in the rainbow jersey (rainbow=1 for year 1 of the world champion, and=0 otherwise): log(wins)= αi rideri+β1 Lombardy+β2 rainbow. The model representing “regression to the mean” (model 3) included a fixed effect for the baseline year of both races (baseline=1 for year 0, and 0 for years 1 and 2): log(wins)= αi rideri+β1 Lombardy+β3 baseline. The fourth model (model 4) represented both the “marked man” and the “regression to the mean” hypotheses together: log(wins)= αi rideri+β1 Lombardy+β2 rainbow+β3 baseline. Regression coefficients β correspond to expected differences in logarithms of wins, and eβ express the ratio of wins. The a priori hypotheses put no constraint on β1 but required a negative β2 and a positive β3. I used the Akaike information criterion to identify the best fitting model. The criterion equals 2k–2LL, where k is the number of parameters of each model and LL its log-likelihood.6 Each model included three parameters (two for the negative binomial distribution and one for the variance of the random intercept) in addition to parameters of the fixed effects. The analyses were run on Stata version 13.

Results

The dataset included annual win totals for 289 rider years: for each race, 49 results in year 0, 49 in year 1, and 46 (World Championship) or 47 (Tour of Lombardy) in year 2. Totals were lower in year 2 because winners in 2013 contributed only years 0 and 1 (the 2015 season was incomplete at the time of analysis), and three win totals were missing due to retirement of riders. Several riders won more than one target race, and 63 different riders contributed data: 40 riders had one target win, 14 had two wins, seven had three wins, one had four wins, and one had five (Merckx, triple world champion and double Lombardy winner). Six riders won both races in the same season. Winners of both target races had similar annual numbers of wins: on average 4.18 (quartiles 1, 2.5, and 5) for world champions, and 4.37 (quartiles 1, 3, and 6) for Lombardy winners. Similarly, for winners of both races, the annual win total was higher in year 0 than in years 1 and 2 (table 1); the difference between year 0 and the following years was statistically significant, but the difference between years 1 and 2 was not.
Table 1

Mean number of professional racing wins for world champions and for Tour of Lombardy winners of preceding year

No of rider yearsMean (SD) No of winsQuartilesP value v year 0*P value v year 1*
World champions
Year 0495.04 (4.32)2, 3, 60.021
Year 1†493.96 (5.61)1, 2, 50.021
Year 2463.47 (5.18)1, 1, 50.0110.53
Lombardy winners
Year 0495.08 (4.04)2, 4, 70.030
Year 1494.22 (4.94)1, 2, 50.030
Year 2473.83 (4.55)1, 3, 50.0040.34

*Wilcoxon paired test.

†Corresponds to the year during which the “curse of the rainbow jersey” applies.

Mean number of professional racing wins for world champions and for Tour of Lombardy winners of preceding year *Wilcoxon paired test. †Corresponds to the year during which the “curse of the rainbow jersey” applies. The first regression model confirmed that the average number of annual wins did not differ significantly between world champions and Lombardy winners (table 2). Model 2 tested whether the year in the rainbow jersey was a special case; although the win ratio was less than 1, the reduction was small and statistically non-significant. Model 3 confirmed that the baseline year of both races was significantly more successful than the ensuing years. Model 4 confirmed that the rainbow year did not differ significantly from other years (this time the win ratio was above 1) but that the baseline year of either race was significantly more successful.
Table 2

Mixed negative binomial regression models with random rider specific intercept, and their goodness of fit statistics

Hypothesis representedModel 1: Spotlight effectModel 2: Marked man or rainbow curseModel 3: Regression to meanModel 4: Marked man and regression to mean
Ratio (95% CI) of winsP valueRatio (95% CI) of winsP valueRatio (95% CI) of winsP valueRatio (95% CI) of winsP value
Lombardy (v worlds)1.10 (0.85 to 1.42)0.461.05 (0.80 to 1.37)0.741.10 (0.86 to 1.42)0.791.14 (0.87 to 1.49)0.34
Rainbow year (v all others)0.86 (0.65 to 1.13)0.281.10 (0.82 to 1.47)0.53
Baseline year (v year 1 or 2)1.49 (1.24 to 1.80)<0.0011.53 (1.25 to 1.87)<0.001
Akaike information criterion*1389.361390.181373.101374.70

*Lower value is better.

Mixed negative binomial regression models with random rider specific intercept, and their goodness of fit statistics *Lower value is better. The comparison of goodness of fit statistics confirmed that models 3 and 4, which incorporated regression to the mean, were substantially better than models 1 or 2. The best fitting model was model 3, as it had the lowest value of the Akaike information criterion.

Discussion

The curse of the rainbow jersey probably does not exist. The current road racing world champion wins less on average than he did in the previous season, but this phenomenon is best explained by regression to the mean. The relative lack of success was not restricted to the season in the rainbow jersey but persisted in the following season and affected equally the winners of the Tour of Lombardy. There was nothing remarkable about the year spent wearing the rainbow jersey. Nevertheless, this study may not rule out a curse entirely, as it tested only one facet of the curse—the decrease in wins. I found no good data about the personal problems of professional cyclists. Also, all wins were given even weight: if the world champion is cursed to winning only minor races, this analysis would have missed that. Finally, this analysis did not account for any changes in doping practices, for lack of reliable data. The possibility remains that cyclists dope until they win an important race and stop afterwards. Regression towards the mean is unavoidable whenever the variable under study (here, sporting success) fluctuates over time, the correlation between consecutive observations is less than 1, and the baseline observation is defined by an arbitrarily high or low value (here, a season marked by an important win). Regression to the mean may explain, for instance, why patients who lose bone density in the first year are likely to reverse this trend at follow-up or why HIV related risk behaviours improve after enrolment into a prevention trial.7 8 This phenomenon occurs regularly in clinical medicine, research, and programme evaluation, as well as in other walks of life. For instance, some flight instructors believe that praising a pilot after a smooth landing is counterproductive but reprimanding a pilot after a rough landing leads to improvement.2 Their observation is correct—an extreme performance will be followed by a more average one—but the causal inference is not. Neither is this reaction particularly new. Quite possibly the proverb “Pride goeth before destruction” (King James Bible, Proverbs 16:18) should be credited with the first description of regression towards the mean, and not Francis Galton,9 who merely showed that chance and correlation, not the Lord or a large ego, were to blame. Professional cyclists, just like doctors, are prone to mistaking temporal sequence for causality Cycling world champions seem to have a horrible year wearing the champion’s stripes (“the curse of the rainbow jersey”) World champions win significantly less when they wear the rainbow jersey than during the previous year However, this is no different from the following year and is similar to the experience of winners of the Tour of Lombardy Regression towards the mean explains this pattern best Yesterday’s winner is not cursed if he does not win again today (and, by analogy, the patient did not necessarily get better because the doctor prescribed mud baths)
  4 in total

Review 1.  Regression to the mean: what it is and how to deal with it.

Authors:  Adrian G Barnett; Jolieke C van der Pols; Annette J Dobson
Journal:  Int J Epidemiol       Date:  2004-08-27       Impact factor: 7.196

2.  Judgment under Uncertainty: Heuristics and Biases.

Authors:  A Tversky; D Kahneman
Journal:  Science       Date:  1974-09-27       Impact factor: 47.728

3.  Monitoring osteoporosis therapy with bone densitometry: misleading changes and regression to the mean. Fracture Intervention Trial Research Group.

Authors:  S R Cummings; L Palermo; W Browner; R Marcus; R Wallace; J Pearson; T Blackwell; S Eckert; D Black
Journal:  JAMA       Date:  2000-03-08       Impact factor: 56.272

4.  Regression to the mean and changes in risk behavior following study enrollment in a cohort of U.S. women at risk for HIV.

Authors:  James P Hughes; Danielle F Haley; Paula M Frew; Carol E Golin; Adaora A Adimora; Irene Kuo; Jessica Justman; Lydia Soto-Torres; Jing Wang; Sally Hodder
Journal:  Ann Epidemiol       Date:  2015-03-21       Impact factor: 3.797

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.