Tzu-Kun Hsiao, Jodi Schneider.
Abstract
We present the first database-wide study on the citation contexts of retracted papers, which covers 7,813 retracted papers indexed in PubMed, 169,434 citations collected from iCite, and 48,134 citation contexts identified from the XML version of the PubMed Central Open Access Subset. Compared with previous citation studies that focused on comparing citation counts using two time frames (i.e., preretraction and postretraction), our analyses show the longitudinal trends of citations to retracted papers in the past 60 years (1960–2020). Our temporal analyses show that retracted papers continued to be cited, but that old retracted papers stopped being cited as time progressed. Analysis of the text progression of pre- and postretraction citation contexts shows that retraction did not change the way the retracted papers were cited. Furthermore, among the 13,252 postretraction citation contexts, only 722 (5.4%) citation contexts acknowledged the retraction. In these 722 citation contexts, the retracted papers were most commonly cited as related work or as an example of problematic science. Our findings deepen the understanding of why retraction does not stop citation and demonstrate that the vast majority of postretraction citations in biomedicine do not document the retraction.
Keywords: citation analysis; citation context analysis; intentional postretraction citation; postretraction citation; PubMed Central Open Access Subset; retraction
Year: 2022 PMID: 36186715 PMCID: PMC9520488 DOI: 10.1162/qss_a_00155
Source DB: PubMed Journal: Quant Sci Stud ISSN: 2641-3337
Citations acknowledging retraction reported in previous research
| Reference | % citations acknowledging retraction | # citations acknowledging retraction in the citations included for analysis | # retracted papers included for citation analysis |
|---|---|---|---|
| | 37.85 | 81/214 citations | One retracted Covid-19 paper |
| | 47.5 | 95/200 citations | Two retracted Covid-19 papers |
| | 21.05 | 4/19 sampled citations | 33 retracted Covid-19 papers |
| | 4.5 | 5/112 citations | One retracted paper |
| | 5.4 | 37/685 citations | 81 retracted dentistry papers |
| | 16.03 | 21/131 citations | 46 retracted psychological papers |
| | 1.07 | 6/559 citations | 54 retracted papers reporting a radiology-imaging diagnostic method |
| | 6.6 | 27/407 citations | 47 retracted radiation oncology papers |
| | 38.2 (2005–2010); 71.7 (2011–2018) | 123/322 (2005–2010); 360/502 (2011–2018) | Wakefield’s |
| | 25.8 | –/267 citations | 20 retracted papers by Scott S. Reuben |
| | 4.15 | 204/4,917 citations | 265 retracted papers in MEDLINE |
| | 3.5 | 0/37 citations in 2014; 2/57 citations in 2015 | A paper published in |
| | 6 | 14/247 citations in the 2000 sample; 8/144 citations in the 2005 sample | 1,112 retracted papers in PubMed |
| | 2.8 | 17/603 citations stratified random sampled from 5,393 citations | 102 papers affected by scientific misconduct |
| | < 3 in 9/10 papers; 29 in the paper having 96 citations | –/225 citations | 315 retracted papers in PubMed |
| | 6.4 (AIM); 7.7 (non-AIM) | 19/299 citations from AIM journals; 123/1,594 citations from non-AIM journals | 235 retracted papers in MEDLINE |
| | 5.7 | 17/298 citations | John Darsee’s papers |
| | 2.9 | 5/178 citations | 82 retracted papers identified from journals in |
Note: "–" indicates that the number was not reported. The 3.5% figure was inferred from data: 2/57 citations in 2015.
Descriptive statistics of the number of citations
| | Preretraction | In retraction year | Postretraction | Total |
|---|---|---|---|---|
| #Retracted papers | 7,766 | 7,766 | 7,766 | 7,813 |
| #Citations | | | | |
| Mean | 12.86 | 2.74 | 6.20 | 21.69 |
| SD | 40.28 | 7.23 | 17.87 | 55.25 |
| Q1 | 0.00 | 0.00 | 0.00 | 2.00 |
| Q2 | 1.00 | 1.00 | 2.00 | 7.00 |
| Q3 | 10.00 | 3.00 | 6.00 | 21.00 |
| Max | 1440.00 | 299.00 | 844.00 | 2011.00 |
47 papers lacked retraction year information.
2,088 citations from retraction notices and 15 problematic citations have been excluded.
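The three column groups above imply that each citation is assigned to a phase by comparing the citing year with the retraction year. A minimal sketch of that assignment follows. It assumes year-level granularity, and the function name and phase labels are illustrative rather than taken from the study's code.

```python
from typing import Optional


def citation_phase(citing_year: int, retraction_year: Optional[int]) -> str:
    """Assign a citation to a phase relative to the retraction year.

    Assumption: only calendar years are compared, which matches the
    separate "In retraction year" column but may differ from the
    authors' actual procedure.
    """
    if retraction_year is None:
        # Papers without a retraction year cannot be placed in a phase.
        return "missing retraction year"
    if citing_year < retraction_year:
        return "preretraction"
    if citing_year == retraction_year:
        return "in retraction year"
    return "postretraction"


if __name__ == "__main__":
    for citing, retracted in [(2008, 2010), (2010, 2010), (2015, 2010), (2015, None)]:
        print(citing, retracted, citation_phase(citing, retracted))
```

Keeping the retraction year as its own phase avoids guessing whether a same-year citation was written before or after the retraction notice appeared.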
Citation contexts of citations to retracted papers
| | Preretraction | (%) | In retraction year | (%) | Postretraction | (%) | Missing retraction year | (%) |
|---|---|---|---|---|---|---|---|---|
| Total | 28,439 | (100.00) | 6,412 | (100.00) | 13,252 | (100.00) | 31 | (100.00) |
| In main text | | | | | | | | |
| In introduction/background | 6,952 | (24.45) | 1,698 | (26.48) | 3,947 | (29.78) | 17 | (54.84) |
| In methods | 2,071 | (7.28) | 388 | (6.05) | 679 | (5.12) | 2 | (6.45) |
| In results | 3,249 | (11.42) | 708 | (11.04) | 1,190 | (8.98) | 1 | (3.23) |
| In conclusion/discussion | 8,089 | (28.44) | 1,881 | (29.34) | 4,156 | (31.36) | 4 | (12.90) |
| IMRaD not identified | 6,883 | (24.20) | 1,414 | (22.05) | 2,764 | (20.86) | 6 | (19.35) |
| In abstract | 14 | (0.05) | 4 | (0.06) | 1 | (0.01) | 0 | (0.00) |
| In supporting material | 4 | (0.01) | 6 | (0.09) | 5 | (0.04) | 0 | (0.00) |
| In tables and table/figure captions | 1,177 | (4.14) | 313 | (4.88) | 510 | (3.85) | 1 | (3.23) |
Sections for which the IMRaD section types were unidentifiable through the method described in Hsiao and Torvik (2020).
Postretraction citation contexts acknowledging the retraction
| Priority | Rule | # citation contexts identified | # citation contexts acknowledging the retraction | # false positives |
|---|---|---|---|---|
| 1 | At least one of the cue words (retract | 243 | 169 | 74 |
| 2 | At least one of the cue words (retract | 309 | 283 | 26 |
| 3 | Retraction notice is cited together with the retracted paper in the citing paper’s full text | 159 | 159 | 0 |
| Total | | 711 | 611 | 100 |
Cue words do not always refer to retraction. We manually inspected the identified citation contexts to exclude false positives. Some examples: retract* in neurite retraction; withdr* in withdrawal symptoms; error in error rate.
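The cue-word screening described in this note can be sketched as a regular-expression filter that only flags candidates for manual inspection. The pattern below uses just the three cues named in the note (retract*, withdr*, error). The full cue list is not reproduced here, so the pattern and the helper names are assumptions for illustration, not the study's implementation.

```python
import re

# Assumption: only the three cues named in the note above are covered.
CUE_PATTERN = re.compile(r"\b(?:retract\w*|withdr\w*|error\w*)", re.IGNORECASE)

# Phrases the note gives as examples of cue words that do not refer to retraction.
KNOWN_FALSE_POSITIVE_PHRASES = ("neurite retraction", "withdrawal symptoms", "error rate")


def find_cues(context: str) -> list[str]:
    """Return every cue-word match in a citation context, lowercased."""
    return [match.group(0).lower() for match in CUE_PATTERN.finditer(context)]


def looks_like_false_positive(context: str) -> bool:
    """Flag contexts containing one of the example false-positive phrases."""
    lowered = context.lower()
    return any(phrase in lowered for phrase in KNOWN_FALSE_POSITIVE_PHRASES)


if __name__ == "__main__":
    samples = [
        "This finding was reported in a paper that has since been retracted [12].",
        "Treatment induced neurite retraction in cultured neurons [7].",
        "Cells were cultured as previously described [3].",
    ]
    for text in samples:
        print(find_cues(text), looks_like_false_positive(text))
```

A filter like this can only narrow the set of contexts to read. As the note says, deciding whether a flagged context acknowledges the retraction still requires manual inspection.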
Locations of postretraction citation contexts
| | # postretraction citation contexts acknowledging retraction | (%) |
|---|---|---|
| In main text | | |
| In introduction/background | 154 | (21.33) |
| In methods | 32 | (4.43) |
| In results | 105 | (14.54) |
| In discussion/conclusion | 110 | (15.24) |
| IMRaD not identified | 284 | (39.34) |
| In supporting material | 3 | (0.42) |
| In tables and table/figure captions | 34 | (4.71) |
| Total | 722 | (100) |
The citation purposes
| Purpose | Description |
|---|---|
| Comparison | Authors of the citing paper compared “their” results or methods with those of the retracted paper. According to the tone, this category is further divided into negative (−), positive (+), and neutral (±). Negative tone refers to cases where inconsistency, contradiction, or discrepancy is reported in the comparison. Positive tone refers to cases where consistency is reported in the comparison. Neutral tone refers to cases where consistency between the compared results was unclear. |
| Correction | The retracted paper was cited to make a correction. |
| Example of problematic science | The retracted paper was cited to provide an example of problematic science. This purpose satisfies one of the following conditions: (a) the retracted paper was cited to provide an example of problematic research (e.g., irreproducible research, unreliable research, research involving scientific misconduct, a flawed study); (b) the retracted paper was cited to provide an example of where peer review failed and problematic science was published; (c) the retracted paper was cited to provide an example showing a problem in scientific research or scholarly communication; or (d) the retracted paper was cited to provide an example of the societal impact of problematic research. |
| Exclusion rationale | The retracted paper was cited to explain why it is excluded from use/consideration. Especially found in the context of research synthesis (e.g., review articles and meta-analyses that provide a formal exclusion rationale for papers that are not included.) This purpose can also be found in the literature review section of a research article. |
| Notify retraction included | Notify readers that one or more retracted papers were included in a different, previously published review article, guideline, or paper. |
| Related work | The retracted paper was cited to show what has been done or found in the past or was cited for one of the following reasons: (a) the retracted paper was once a landmark in the field; (b) the retracted paper was the origin/pioneer of something (e.g., “X first identified/described Y”, “X was identified as a novel …”, “X was initially proposed…”, or “X was originally…”); or (c) the retracted paper led to an important event in the field, such as Wakefield’s paper’s influence on the supposed autism–vaccine link and the antivaccine movement. |
| Republication of retraction | In the republication of the retracted paper, the authors cited the retracted paper to announce the republication. |
| Reproduce | A citation to the retracted paper was made because the citing paper tried to reproduce/repeat the finding or experiment mentioned in the retracted paper. |
| Subject of study | The cited retracted paper is the object of a case study about retraction, or is part of the data used in a study about retraction, scientific misconduct, or peer review. Note that in these studies, retracted papers can be cited in the results. |
| Use | Citing paper uses something from the cited retracted paper. This type of citation is often found in the Methods section. |
| Other | Those that do not belong to the above categories. |
Boxplot of time lag between publication and retraction arranged by retraction year and publication year. The box areas show interquartile range (IQR, from 25% to 75%) of the time lag between publication and retraction. The upper whisker of each box is the longest time lag smaller than 1.5 IQR above the 75th percentile; the lower whisker is the shortest time lag greater than 1.5 IQR below the 25th percentile. The grey dotted line in the lower panel shows the maximum possible time lag (i.e., up to 2020, the year of data collection) for papers to be retracted in each publication year.
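The whisker rule in this caption is the usual 1.5 × IQR convention. A minimal sketch of the computation for a list of publication-to-retraction time lags follows. The quartile method (`inclusive` in Python's `statistics.quantiles`) and the treatment of values exactly on a fence as inside are assumptions, since plotting libraries differ on both points.

```python
from statistics import quantiles


def boxplot_whiskers(time_lags: list[float]) -> dict:
    """Compute the box and whisker positions for a list of time lags (years)."""
    q1, _median, q3 = quantiles(time_lags, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [lag for lag in time_lags if lower_fence <= lag <= upper_fence]
    outliers = sorted(lag for lag in time_lags if lag < lower_fence or lag > upper_fence)
    return {
        "q1": q1,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Hypothetical time lags in years, not data from the study.
    print(boxplot_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

In the hypothetical sample, the 30-year lag lies beyond the upper fence, so the upper whisker stops at 8 and the 30-year lag would be drawn as a separate point.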
Distribution of active, inactive, and uncited papers.
The percentage of active retracted papers with a given retraction year over time. Note that the citation data from 2020 are incomplete because the data were collected in August 2020. The shades denote the percentage of active retracted papers in a given citation year. Red shades denote that more than 50% of the papers retracted in a given year were active in a given citation year. Blue shades denote that less than 50% of the papers retracted in a given year were active in a given citation year. The grey area is the preretraction phase. The yellow box shows the percentages of papers retracted before 1995 that are active in citation year 2013. The green box shows the percentages of papers retracted before 2014 that are active in citation year 2018.
Locations of citations to retracted papers. Text progression indicates the location of a citation context in the main text on a percentage scale. The IMRaD sections were identified from the section titles (Hsiao & Torvik, 2020). IMRaD not identified refers to the sections where the IMRaD section types could not be identified from the section titles. The y-axis scales do not range from 0–100% because each part of the text only has a few citation contexts. For clarity, to show the trends of citation contexts’ locations the scales of the y-axes were set from 0–3% for all citation contexts as well as for citation contexts in introduction/background sections and discussion/conclusion sections. For citation contexts in method sections, in result sections, and in sections for which the IMRaD section types were unidentifiable, the scales of the y-axes were set from 0–0.5%.
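As an illustration of the text progression measure, the sketch below places a citation context on a 0–100% scale and then computes the share of contexts falling in each part of the text, which is the kind of quantity plotted on the y-axes. It assumes character offsets within the main text. The study's exact unit of measurement is not stated here, so both function names and the offset-based definition are hypothetical.

```python
def text_progression(context_start: int, main_text_length: int) -> float:
    """Return the position of a citation context as a percentage of the main text."""
    if main_text_length <= 0:
        raise ValueError("main_text_length must be positive")
    if not 0 <= context_start < main_text_length:
        raise ValueError("context_start must fall inside the main text")
    return 100.0 * context_start / main_text_length


def progression_shares(progressions: list[float], n_bins: int = 100) -> list[float]:
    """Return the percentage of citation contexts falling in each progression bin."""
    counts = [0] * n_bins
    for value in progressions:
        # Clamp so that a progression of exactly 100% lands in the last bin.
        index = min(int(value * n_bins / 100), n_bins - 1)
        counts[index] += 1
    total = len(progressions)
    if total == 0:
        return [0.0] * n_bins
    return [100.0 * count / total for count in counts]


if __name__ == "__main__":
    print(text_progression(250, 1000))
    print(progression_shares([10.0, 30.0, 30.0, 90.0], n_bins=4))
```

With 100 bins, each bin holds only a small fraction of all contexts, which is consistent with the caption's remark that the y-axes span a few percent rather than 0–100%.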
Distribution of text progression by the age of citation relative to retraction year. Age represents the number of years preceding (negative) or following (positive) retraction. Markers in the cells denote the percentage of citation context in each group of citation contexts having the same age relative to retraction year.
Locations of the retracted papers being mentioned only once in the full text
| IMRaD | # Preretraction pairs | (%) | # Postretraction pairs (retraction not acknowledged) | (%) | # Postretraction pairs (retraction acknowledged) | (%) |
|---|---|---|---|---|---|---|
| Introduction/background | 3,272 | (27.52) | 2,359 | (33.04) | 47 | (16.43) |
| Methods | 690 | (5.80) | 397 | (5.56) | 13 | (4.55) |
| Results | 971 | (8.17) | 490 | (6.86) | 29 | (10.14) |
| Discussion/conclusion | 3,656 | (30.75) | 2,349 | (32.90) | 46 | (16.08) |
| IMRaD not identified | 3,300 | (27.76) | 1,545 | (21.64) | 151 | (52.80) |
| Total | 11,889 | (100) | 7,140 | (100) | 286 | (100) |
Number of different IMRaD sections where multiple mentions appeared
| #Different IMRaD sections | # Preretraction pairs | (%) | # Postretraction pairs (retraction not acknowledged) | (%) | # Postretraction pairs (retraction acknowledged) | (%) |
|---|---|---|---|---|---|---|
| 1 | 2,225 | (44.45) | 860 | (46.66) | 68 | (50.00) |
| 2 | 2,188 | (43.71) | 890 | (48.29) | 57 | (41.91) |
| 3 | 483 | (9.65) | 82 | (4.45) | 10 | (7.35) |
| 4 | 109 | (2.18) | 10 | (0.54) | 1 | (0.74) |
| 5 | 1 | (0.02) | 1 | (0.05) | 0 | (0) |
| Total | 5,006 | (100) | 1,843 | (100) | 136 | (100) |
Location of the retracted papers that were mentioned multiple times.
Distribution of the citation contexts belonging to the purposes
| Purpose | # citation contexts | (%) |
|---|---|---|
| Related work | 453 | (62.74) |
| Example of problematic science | 62 | (8.59) |
| Reproduce | 40 | (5.54) |
| Exclusion rationale | 35 | (4.85) |
| Subject of study | 33 | (4.57) |
| Comparison | 26 | (3.60) |
| Notify retraction included | 24 | (3.32) |
| Use | 20 | (2.77) |
| Other | 14 | (1.94) |
| Correction | 10 | (1.39) |
| Republication of retraction | 5 | (0.69) |
| Total | 722 | (100) |
Comparison: 11 (1.52%) were negative (−), 5 (0.69%) were neutral (±), and 10 (1.39%) were positive (+).
Citation purposes observed in different types of articles. The signs (−, +, ±) represent the tone identified in comparison citations. In research articles and review articles, most of the tones of comparison citations were negative (−), followed by positive (+) and neutral (±). In systematic reviews and meta-analyses, only positive comparison citations were observed. For articles about retraction, scientific misconduct, or peer review, the citation purposes observed in each article subtype are presented with different color keys.