Ivan Heibi, Silvio Peroni.
Abstract
In this article, we show the results of a quantitative and qualitative analysis of open citations on a popular and highly cited retracted paper: "Ileal-lymphoid-nodular hyperplasia, non-specific colitis and pervasive developmental disorder in children" by Wakefield et al., published in 1998. The main purpose of our study is to understand the behavior of the publications citing one retracted article and the characteristics of the citations the retracted article accumulated over time. Our analysis is based on a methodology that describes how we gathered the data, extracted the topics of the citing articles and visualized the results. The data and services used are all open and free, to foster the reproducibility of the analysis. The outcomes concern the analysis of the entities citing Wakefield et al.'s article and their related in-text citations. We observed a constantly increasing number of citations over the last 20 years, accompanied by a constant increase in the percentage of those acknowledging its retraction. Citing articles started either discussing or dealing with the retraction of Wakefield et al.'s article even before its full retraction in 2010. Articles in the social sciences domain citing Wakefield et al.'s article were among those that discussed its retraction the most. In addition, when observing the in-text citations, we noticed that a large number of the citations received by Wakefield et al.'s article focused on general discussions without recalling strictly medical details, especially after the full retraction. Medical studies did not hesitate to acknowledge the retraction of Wakefield et al.'s article and often provided strong negative statements about it.
Keywords: Citation analysis; Retraction; Science of Science; Topic modeling
Year: 2021 PMID: 34376878 PMCID: PMC8338205 DOI: 10.1007/s11192-021-04097-5
Source DB: PubMed Journal: Scientometrics ISSN: 0138-9130 Impact factor: 3.238
An overview of all the steps needed for generating an annotated dataset of WF-PUB-1998 citing entities, to be further used during this work. For each step, we provide a brief description, the inputs needed and the output produced. The output is represented as the list of features that will be included in the final dataset used for our analysis
| Step | Description | Input | Output |
|---|---|---|---|
| (1) Identifying and retrieving the citing entities | Identifying the list of entities citing WF-PUB-1998 and storing their main metadata | DOI of the retracted article | For each citing entity: (1.1) DOI (1.2) year of publication (1.3) title (1.4) venue id (ISSN/ISBN) (1.5) venue title |
| (2) Retrieving the citing entities characteristics | Annotating whether the citing entities have been or have not been retracted as well | DOIs of the citing entities | For each citing entity: (2.1) is / is not retracted |
| (3) Classifying the citing entities according to subject areas and subject categories | Classifying the citing entities into areas of study and specific subject categories, following the SCImago classification | ISSN/ISBN of publication venues of citing entities | For each citing entity: (3.1) subject area (3.2) subject category |
| (4) Extracting textual values from the citing entities | Extracting the citing entities' abstracts, the in-text reference pointers, citation contexts, title of the section where the in-text citations happen | DOIs of the citing entities | For each citing entity: (4.1) abstract (4.2) in-text citation section (4.3) in-text citation context (4.4) in-text reference pointer |
| (5) Annotating the in-text citations characteristics | Manually annotating the intent (based on the citation functions of CiTO) and sentiment of each in-text citation and specifying whether the text in citation contexts mentions the retraction of the cited article | In-text citation contexts | For each in-text citation: (5.1) citation intent (5.2) citation sentiment (5.3) retraction is / is not mentioned |
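The output features listed in the table above can be pictured as one record per citing entity, with a nested record per in-text citation. The following is a minimal sketch of such a structure; the class and field names are illustrative assumptions and are not taken from the authors' dataset schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InTextCitation:
    """One in-text citation of WF-PUB-1998 (features from steps 4 and 5)."""
    section: Optional[str]      # (4.2) section title or kind; may be missing
    context: str                # (4.3) citation context
    pointer: str                # (4.4) in-text reference pointer, e.g. "[3]"
    intent: str                 # (5.1) a CiTO citation function
    sentiment: str              # (5.2) "positive", "negative" or "neutral"
    mentions_retraction: bool   # (5.3) retraction is / is not mentioned


@dataclass
class CitingEntity:
    """One entity citing WF-PUB-1998 (features from steps 1 to 4)."""
    doi: str                                  # (1.1)
    year: int                                 # (1.2)
    title: str                                # (1.3)
    venue_id: Optional[str] = None            # (1.4) ISSN/ISBN, may be missing
    venue_title: Optional[str] = None         # (1.5)
    is_retracted: bool = False                # (2.1)
    subject_areas: List[str] = field(default_factory=list)       # (3.1)
    subject_categories: List[str] = field(default_factory=list)  # (3.2)
    abstract: str = ""                        # (4.1)
    in_text_citations: List[InTextCitation] = field(default_factory=list)

    def mentions_retraction(self) -> bool:
        """True if at least one citation context mentions the retraction."""
        return any(c.mentions_retraction for c in self.in_text_citations)
```

Keeping the in-text citations nested inside the citing entity lets an entity-level flag be derived from the per-citation annotations rather than stored twice.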
Fig. 1 The decision model for selecting the CiTO citation function used to annotate the citation intent of an examined in-text citation, based on its context. The first large row contains the three macro-categories: (1) “Reviewing …”, (2) “Affecting …” and (3) “Referring …”. Each macro-category has at least two subcategories and each subcategory refers to a set of citation functions. The first row defines which citation functions are suitable for it, with the help of a guiding sentence that needs to be completed according to the chosen subcategory and citation function
Fig. 2 The coherence score of different LDA topic models built using a variable number of topics, from 1 to 40. The topic model is based on the corpus and dictionary of the in-text citation contexts. The orange line is the average value and it plateaus around 22–23 topics
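One way to read a plateau off a coherence curve like the one in Fig. 2 is to pick the smallest number of topics after which the averaged score no longer improves by more than a small tolerance. The sketch below illustrates that idea only; the function name, the tolerance and the sample scores are assumptions for illustration, not the procedure or the values used in the study.

```python
from typing import Dict


def plateau_topic_count(coherence: Dict[int, float], tolerance: float = 0.005) -> int:
    """Return the smallest number of topics k such that no larger k
    improves the (averaged) coherence score by more than `tolerance`."""
    ks = sorted(coherence)
    for k in ks:
        # Best score reachable with more topics than k (or k's own score
        # when k is already the largest candidate).
        later_best = max((coherence[j] for j in ks if j > k), default=coherence[k])
        if later_best - coherence[k] <= tolerance:
            return k
    return ks[-1]


# Illustrative (made-up) averaged scores for k = 1..5 topics.
scores = {1: 0.30, 2: 0.35, 3: 0.40, 4: 0.402, 5: 0.401}
best_k = plateau_topic_count(scores)
```

With these made-up scores the gain beyond three topics stays under the tolerance, so `best_k` is 3.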
Fig. 3 The workflow, created via MITAO, we used for computing the LDA topic modeling and generating the LDAvis (LDA visualization) and MTMvis (Metadata-based Topic Modeling visualization) visualizations (the tools “LDAvis”, “MTMvis <period>” and “MTMvis <area>”). The green squares are used to specify input material which is considered by the various tools composing the workflow (i.e., the red rhombi). In particular, the workflow takes three inputs: (a) the vectorized corpus (“Corpus”), (b) a dictionary of words based on the tokenization results (“Dictionary”) and (c) the metadata of the original documents forming the corpus (“Meta”). The arrows between the tools indicate the direction of the data flow and the output-input relation among them. For instance, the execution of the workflow starts with the tool “LDA Topic Modeling”, which takes as input the “Corpus” and the “Dictionary” and produces an output that is used as part of the input for three other tools, i.e. “LDAvis”, “Terms X Topics” and “Docs X Topics”
The features that directly characterize the citing entities. The first column lists the features with a brief description, while the second column summarizes the related values we gathered
| WF-PUB-1998 citing entities features | Values |
|---|---|
| doi: The DOI of the citing article | Total: All the citing entities had a value specified |
| year: The year of publication of the citing article | Total: All the citing entities had a value specified Values: From 1998 (year of publication of WF-PUB-1998) to 2017 |
| title: The title of the citing article | Total: All the citing entities had a value specified |
| source_id: The ID (ISSN/ISBN) of the venue of publication of the citing article | Total: 599 (97%) citing entities had a value specified Values: ISSNs (548), ISBNs (51) |
| source_title: The title of the venue of publication of the citing article | Total: 603 (98%) citing entities had a value specified |
| retracted: A yes/no value depending on whether the citing article has or has not received at least one retraction notification | Total: 1 citing entity |
| area: The subject areas of the venue of publication of the citing article, based on the SCImago Journal Classification | Total: 576 (93%) citing entities had at least one value specified Values: 24 different values: "medicine" (380), "social sciences" (90), "nursing" (81), "biochemistry, genetics and molecular biology" (59), "psychology" (58), "pharmacology, toxicology and pharmaceutics" (54), "immunology and microbiology" (52), "arts and humanities" (28), "neuroscience" (24), "environmental science" (17), "agricultural and biological sciences" (16), "health professions" (15), "computer science" (13), "mathematics" (10), "business, management and accounting" (8), "engineering" (7), "dentistry" (7), "multidisciplinary" (7), "decision sciences" (7), "economics, econometrics and finance" (5), "earth and planetary sciences" (1), "chemical engineering" (1), "materials science" (1), "physics and astronomy" (1) |
| category: The subject categories of the venue of publication of the citing article, based on the SCImago Journal Classification | Total: 576 (93%) citing entities had a value specified Values: 170 different values |
| abstract: The abstract of the citing article | Total: All the citing entities had a value specified |
| mention_retraction: A yes/no value that indicates if at least one of the citation contexts of the citing article explicitly mentions the fact that the cited entity is retracted | Total: All the citing entities had a value specified Values: |
Fig. 4 A summary of the citing entities. The first column contains the periods P1–P3 we considered, the second column shows the distribution per year of the citing entities that do mention (in green) or do not mention (in red) the retraction of WF-PUB-1998, while the third column shows the distribution of the subject areas of the citing entities. (Color figure online)
Fig. 5 The LDAvis visualization built over the topic model obtained from the abstracts of the citing entities
The 13 topics available in the topic model obtained from the abstracts of the articles citing WF-PUB-1998. For each topic (row) we mention its proportion percentage in the corpus (column 1) and the 30 most relevant terms (column 2) and we give our interpretation of it (column 3)
| Topic (proportion) | Terms (the 30 most probable terms) | Interpretation |
|---|---|---|
| 1 (3.5%) | bias, epidem, philosoph, cell, experi, consum, behavior, protect, even, illich, acetaminophen, scientist, scienc, oxid, call, conserv, long, public, scientif, phone, campaign, occur, experiment, ethic, feed, among, politician, reject, dissent, insignific | Close to the social studies domain; may take ethical themes into consideration |
| 2 (4.2%) | retract, symptom, disabl, student, scienc, perspect, expertis, misconduct, forens, paper, probabl, studi, mandat, cite, diseas, treat, provid, research, replic, vaccin, tripl, librari, qualif, adjuv, might, take, younger, case, media, wakefield | Includes terms related to the retraction phenomenon (independent of the study domain) and terms used in scientometric analysis |
| 3 (46%) | vaccin, parent, children, health, inform, immun, public, decis, review, safeti, studi, measl, evid, articl, research, risk, disord, practic, autist, issu, factor, concern, diseas, report, increas, import, relat, child, effect, base | By far the largest of the 13 topics; it contains common terms that are highly frequent in the corpus and close to the themes of WF-PUB-1998 |
| 4 (3.6%) | access, nurs, bowel, knowledg, hepat, polici, global, mobil, portfolio, immunis, ask, biblic, newspap, pharmaceut, huge, visitor, time, citizen, percept, symptom, organ, internet, take, model, statist, carer, epoch, golden, scientif, cognit | Includes terms from the Medical and pharmaceutical field |
| 5 (5.1%) | vaccin, regress, anti, myth, movement, countri, syndrom, social, incom, diagnost, affect, determin, right, iron, hcws, children, overview, diseas, court, peopl, parent, million, routin, acupunctur, danger, mortal, immun, claim, degrad, intervent | Some terms are out of the medical field of study and might indicate a possible discussion |
| 6 (3.8%) | statist, cultur, describ, infant, citat, exampl, articl, symptom, case, american, journal, metabol, combin, parent, disagr, doctor, abl, associ, construction, bordetella, basi, advers, illustr, gliadin, illusori, literatur, indic, unit, eat, rather | A high number of terms are related to the scientometric field of study. All the terms are objective and do not indicate an opinion or a discussion |
| 7 (4.1%) | biomark, altmetr, nan, behavior, disord, occur, well, postpon, answer, herd, context, fear, genet, graphic, appeal, interquartil, order, vaccin, gene, sensori, evalu, modul, geneticist, chapter, lymphocyt, abstain, putat, approach, protect, homeopathi | Includes terms from the biology, pharmacology and genetics field of study. Might also indicate a statistical analysis along with an open discussion |
| 8 (4.9%) | vaccin, balanc, symptom, risk, reaction, link, subgroup, mump, regress, sphere, aefi, opioid, public, prevent, record, case, food, diseas, chronic, media, claim, allerg, week, resid, advers, children, estim, strain, cobalamin, associ | A large part of the terms are close to WF-PUB-1998 treated thematics. Mostly from the medical field of study |
| 9 (5.1%) | fraud, diseas, narrat, vaccin, complic, health, controversi, comorbid, measl, polici, coliti, bowel, neurolog, travel, inflammatori, movement, trust, ocean, research, retract, attribut, percept, public, futur, preserv, caus, ulcer, case, medic, problem | Concerns the retraction phenomenon, along with some medical expressions. It also includes strong terms such as “fraud” |
| 10 (4.9%) | vaccin, immun, misconduct, polici, patholog, retract, aefi, result, children, research, disord, qualit, report, caus, development, advanc, case, chang, internet, record, expos, mold, infecti, program, vaer, actor, live, sinc, mani, appli | General terms related to WF-PUB-1998 treated thematics, some are correlated with a discussion around the retraction phenomena |
| 11 (5.6%) | vaccin, health, immunis, engag, communic, reason, disord, virus, diagnost, messag, coverag, examin, make, chang, accept, client, diseas, measl, development, research, consid, resist, peopl, public, evid, observ, recent, imag, effect, pervas | Terms from the medical study field related to WF-PUB-1998 treated thematics |
| 12 (3.3%) | uncertainti, boy, scientif, gynecologist, debat, twitter, semant, ongo, intent, vaccin, disturb, variabl, messag, rhetor, liabil, frame, reddit, percept, content, sourc, gfcf, produc, paediatr, rais, pyridox, guilt, fact, advic, link, chang | Includes terms close to the computer science lexicon and from the pediatric field of study |
| 13 (5.9%) | vaccin, incid, scientif, erad, measl, frame, viral, literatur, enceph, diseas, controversi, mother, workshop, differ, propos, expert, infect, increas, evalu, genet, dramat, recent, coalit, frequent, communic, current, link, programm, polio, scienc | Includes a large number of general terms from different fields of study, part of them are correlated with WF-PUB-1998 thematics |
Fig. 6 MTMvis built on the topic model obtained from the abstracts of the citing entities, shown against the three periods P1–P3. For each period the visualization plots the topics distribution (e.g. topic 3 is the dominant topic in all the periods: P1, P2 and P3)
Fig. 7 MTMvis built on the topic model obtained from the abstracts of the citing entities, shown against their subject areas. For each subject area the visualization plots the topics distribution (e.g. topic 3 is the dominant topic in “arts and humanities”)
The features that directly characterize the in-text citations. The first column lists the features with a brief description, while the second column summarizes the related values we gathered, i.e. the total number and, if applicable, a classification of the different values
| WF-PUB-1998 in-text citations features | Values |
|---|---|
| intext_citation.section: The kind of section in the citing entity which includes the in-text citation, taken from the list in (Suppe | Total: 757 (87%) in-text citations had a value specified Values: 10 different values: |
| intext_citation.context: The textual context in the citing entity which includes the in-text citation | Total: All the in-text citations had a value specified |
| intext_citation.pointer: The string representing the in-text reference pointer (e.g., “[3]”) in the citing entity to the bibliographic reference of WF-PUB-1998 | Total: All the in-text citations had a value specified |
| intext_citation.intent: The citation intent related to the in-text citation in the citing entity, i.e., the author’s reason for citing WF-PUB-1998, taken among the citation functions defined in CiTO | Total: All the in-text citations had a value specified Values: 17 different values: |
| intext_citation.sentiment: The sentiment, classified as positive/negative/neutral, conveyed by the citation context of an in-text citation | Total: All the in-text citations had a value specified Values: |
Fig. 8 A summary of the in-text citations. All the data are classified under the three sentiments: negative (red), neutral (yellow) and positive (green). The first column contains the periods P1–P3 we considered, the second column shows the distribution per year of the in-text citations, the third column shows the citation intents distribution and the last column shows the in-text citation sections distribution
Fig. 9 The LDAvis visualization of the topic model created using the citation contexts of the in-text citations contained in the entities citing WF-PUB-1998
The 22 topics available in the topic model generated using the in-text citation contexts contained in the articles citing WF-PUB-1998. For each topic (row) we mention its proportion percentage in the corpus (column 1) and the 30 most relevant terms (column 2) and we give our interpretation of it (column 3)
| Topic (proportion) | Terms (the 30 most probable terms) | Interpretation |
|---|---|---|
| 1 (2.7%) | valu, reuter, thomson, worth, retract, would, impact, time, mean, paper, immun, comparison, figur, lancet, base, mening, cerebr, associ, identifi, greater, roach, regress, senior, later, europ, vitamin, viral, assign, consist, campaign | Includes few terms from the medical domain. Might talk about WF-RET-CASE from a statistical/mathematical perspective. Some terms include general info about the paper (metadata) |
| 2 (5.2%) | articl, control, associ, public, signific, affect, research, biopsi, scientif, follow, controversi, natur, patient, case, decad, preserv, evid, report, vaccin, use, multipl, clinic, subsequ, differ, first, cell, follicl, symptom, concern, studi | General terms which summarize what WF-PUB-1998 talks about |
| 3 (3.9%) | articl, case, colon, development, lancet, enterocol, report, assert, associ, public, vaccin, disord, follow, signific, diagnosi, base, group, mumpsrubella, treatment, controversi, parent, scientif, evid, result, symptom, studi, publish, link, reaction, unknown | General terms which summarize what WF-PUB-1998 talks about |
| 4 (3.9%) | articl, immun, dose, claim, lancet, requir, discredit, declin, find, infect, diseas, februari, colleagu, herd, coverag, respons, measl, report, first, sever, vaccin, regress, month, andrew, second, mump, research, countri, detail, parent | General terms which summarize what WF-PUB-1998 talks about. This topic might also outline other information related to the paper |
| 5 (3.6%) | vaccin, link, research, suggest, report, appear, measl, focus, scientif, develop, univers, reject, possibl, studi, associ, mump, investig, though, fund, even, continu, around, wakefieldet, newspap, result, doubt, relat, signific, high, evid | Discusses the controversy around WF-RET-CASE |
| 6 (2.8%) | design, studi, expert, consider, parent, bias, receiv, requir, risk, messag, control, occur, point, factor, time, start, howev, report, third, call, know, vaccin, connect, major, qualiti, media, research, school, best, mump | Does not include any medical term, it rather focuses on other related aspects concerned with WF-PUB-1998 |
| 7 (2.2%) | diseas, caus, characterist, process, read, inflamm, alarm, report, assert, without, show, ileocolon, bowel, declin, safeti, student, first, exampl, intestin, media, young, detail, autist, adult, possibl, scientif, studi, even, control, wide | General terms which summarize what WF-PUB-1998 talks about. Many terms are related to its medical background |
| 8 (13.2%) | regress, development, increas, hypothes, causal, vaccin, link, bowel, disord, case, report, measl, problem, symptom, author, associ, system, immun, mump, hypothesi, relationship, peptid, suggest, opioid, diseas, popul, onset, studi, risk, autist | General terms which summarize what WF-PUB-1998 talks about. Almost all the terms are related to its medical background |
| 9 (2.6%) | seri, development, parent, report, abnorm, loss, child, autist, associ, acquir, consecut, articl, spectrum, coliti, pervas, symptom, specif, attent, author, count, skill, normal, public, concurr, characterist, claim, abdomin, reduc, enabl, eight | A large number of terms from the medical domain. The connection with WF-PUB-1998 is less evident |
| 10 (3.9%) | mucos, regress, trigger, subtl, food, medic, symptom, development, pattern, extens, lead, clear, result, retract, disord, specif, condit, enterocol, intoler, possibl, research, also, develop, affect, suggest, case, diarrhea, includ, year, caus | General terms which summarize the medical background of WF-PUB-1998. Might also mention its retraction |
| 11 (3.7%) | understand, paper, period, peer, cohort, development, singl, review, widespread, technic, major, help, studi, rat, scienc, public, littl, interact, relationship, origin, week, outbreak, neurochem, normal, media, articl, sometim, regress, debat, report | Focuses on technical aspects and does not include any medical terminology |
| 12 (10.8%) | find, retract, uptak, studi, subsequ, media, vaccin, paper, fear, link, controversi, increas, evid, measl, scientif, articl, publish, number, safeti, mump, journal, belief, parent, public, mani, health, mother, concern, claim, lancet | Talks and discusses WF-PUB-1998 retraction phenomena from different perspectives, e.g. social impact. It can also talk about the negative impacts |
| 13 (3.6%) | lancet, cost, articl, rhetor, health, paper, public, text, scienc, begin, outbreak, emerg, autist, immedi, also, featur, interpret, acquir, origin, caus, controversi, distress, depart, note, might, languag, debat, measl, behavior, vaccin | Discusses WF-RET-CASE from different perspectives, yet far from the medical domain. The discussion might also not take the paper's retraction into consideration |
| 14 (5.6%) | unit, publish, studi, state, bowel, possibl, general, paper, measl, immunis, royal, case, press, diseas, report, free, group, three, kingdom, research, link, controversi, hospit, risk, would, receiv, vaccin, women, journal, earlier | Discusses the medical conclusions arising from WF-PUB-1998 |
| 15 (3.3%) | intestin, report, associ, studi, regress, retract, team, hospit, bowel, sever, find, ileum, royal, behavior, hypothesi, free, subsequ, caus, ethic, altmetr, problem, colleagu, vaccin, consider, research, connect, abnorm, paper, number, british | Discusses the medical conclusions arising from WF-PUB-1998. It might point out the emerging controversies of the paper |
| 16 (4.8%) | associ, development, regress, whether, initi, spectrum, vaccin, bowel, trigger, possibl, propos, widespread, autoimmun, autist, disord, specif, environment, diseas, increas, coverag, concern, public, articl, hypothesi, virus, aris, question, base, andrew, sinc | Talks about the medical themes and conclusions of WF-PUB-1998 |
| 17 (5.4%) | refer, articl, link, vaccin, skill, normal, histori, mump, acquir, side, bowel, support, concern, studi, public, research, suggest, associ, measl, demonstr, paper, lancet, evid, pediatr, symptom, exist, follow, prove, describ, diseas | Discusses the conclusions and impact of WF-PUB-1998 |
| 18 (4.6%) | receiv, countri, author, health, measl, articl, year, effect, andrew, eight, recent, parent, behavior, case, suggest, vaccin, paper, begin, safeti, british, sinc, studi, develop, short, mump, patient, campaign, publish, report, investig | Gives an overview of WF-RET-CASE, without necessarily analysing the contents and conclusions |
| 19 (3.2%) | articl, factor, constip, nonspecif, britain, caus, first, autist, measl, compon, year, upon, call, intestin, publish, dquo, retract, make, immun, event, disord, neurolog, unnecessari, permeabl, find, apoptosi, vaccin, describ, scientif, syndrom | Talks about WF-RET-CASE and the medical conclusions of the paper |
| 20 (1.6%) | theori, council, nationwid, queri, consequ, profession, medic, continu, ultim, despit, accept, long, case, optim, rat, vitamin, pertussi, myelogenesi, impair, persist, altern, indic, https, prsa, cross, field, coverag, notif, dismiss, collaps | Discusses WF-PUB-1998 as a case study that might be of interest to a better understanding of the research or more specifically the medical research |
| 21 (5%) | vaccin, coverag, public, associ, link, articl, suggest, publish, research, media, time, citat, claim, thimeros, across, prove, particular, measl, lancet, lead, whether, parent, paper, evid, subsequ, extens, sinc, mump, around, increas | Talks about WF-RET-CASE and consequences. Might also discuss the paper links and citations |
| 22 (4.4%) | connect, retract, paper, potenti, topic, disord, research, deer, receiv, exampl, studi, lancet, report, claim, elliman, causal, use, development, inflammatori, although, concern, data, type, exist, inform, bowel, publish, base, measl, attent | Discusses WF-PUB-1998 and retraction from non-medical aspects. Might refer to it as a retraction example |
Fig. 10 MTMvis created considering the topics extracted from the citation contexts of the in-text citations citing WF-PUB-1998 according to the periods P1–P3. For each period the visualization plots the topics distribution – e.g., topic 8 (in purple) is the dominant topic in P1
Fig. 11 MTMvis created considering the topics extracted from the citation contexts of the in-text citations citing WF-PUB-1998 according to the subject areas of the citing entities. For each subject area the visualization plots the topics distribution – e.g., topic 3 (in dark yellow) is the dominant topic of the “arts and humanities” subject area
Fig. 12 The evolution of topics 1, 2 and 5 during P2–P3 on all the subject areas plotted using MTMvis. MTMvis has been generated from the topic model created using the abstracts of the citing entities. The themes covered by these topics are close to the retraction phenomenon and use a limited number of terms from medical jargon
Fig. 13 The distribution of topic 1 over all the subject areas during P2–P3 plotted using MTMvis. MTMvis has been generated from the topic model created using the abstracts of the citing entities. Topic 1 includes terms from the social science domain and relates to ethical themes
Fig. 14 The subject areas of citing entities published in P2–P3 which include either topic 2 or topic 5 in their top 5 topics. The themes covered by these topics relate to the retraction phenomenon and use a limited number of terms from medical jargon
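The selection described for Fig. 14 (entities that have topic 2 or topic 5 among their top 5 topics) can be sketched as a ranking over a per-document topic distribution. This is a minimal illustration assuming each document is represented as a mapping from topic number to weight; the function names and the sample weights are hypothetical and not taken from the study's tooling or data.

```python
from typing import Dict, List, Set


def top_topics(doc_topics: Dict[int, float], n: int = 5) -> List[int]:
    """Return the n topic numbers with the highest weight for one document.
    Ties are broken by the lower topic number."""
    ranked = sorted(doc_topics.items(), key=lambda kv: (-kv[1], kv[0]))
    return [topic for topic, _ in ranked[:n]]


def has_topic_in_top(doc_topics: Dict[int, float], wanted: Set[int], n: int = 5) -> bool:
    """True if any of the wanted topics is among the document's top n."""
    return bool(wanted & set(top_topics(doc_topics, n)))


# Made-up topic weights for a single abstract.
doc = {1: 0.05, 2: 0.10, 3: 0.50, 4: 0.02, 5: 0.01, 6: 0.08, 7: 0.09, 8: 0.15}
selected = has_topic_in_top(doc, {2, 5})
```

Here topic 2 ranks third for the sample document, so the document would be selected even though topic 5 is far outside its top 5.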
Fig. 15 The four graphs illustrate how the use of citation intents changed over time (i.e., the three periods P1, P2 and P3) and according to their perceived sentiment. The citation intents cites as evidence, critiques and credits are illustrated in separate charts, which show an increase in negative sentiment across the three periods
Fig. 16 The distributions of the cites as evidence and credits citation intents among the sections (the recognizable ones) and during the three periods (i.e. P1–P3)
Fig. 17 The evolution over time of three groups of topics defined from the citation contexts of the in-text citations to WF-PUB-1998
Fig. 18 The increasing (left) and decreasing (right) topics of the in-text citation topic model, considering only the medicine area of study
A summary of the differences and similarities between our study and (Suelzer et al. 2019)
| Feature | Method in (Suelzer et al. 2019) | Results in (Suelzer et al. 2019) |
|---|---|---|
| Analyzed entities and the citation context | Differences: 1) The analysis does not address and collect the in-text citations as separated entities 2) The reviewed in-text citation contexts are not defined following a precise structure/rule – in our work, indeed, the context is defined by the anchor sentence, plus the prior and subsequent sentence Similarities: 1) the citing works are collected and analyzed as individual entities 2) the definition of three periods following the years of the partial and full retraction (i.e. P1, P2 and P3) | Differences: 1) The total number of citing entities is higher – 1,153 (compared to the 615 of our work) 2) The analysis does not mention any statistics regarding the number of in-text citations Similarities: 1) The percentage of citing entities gathered starting from the partial retraction to the last date considered (i.e. March 11, 2019) with respect to the total number is 76% (close to the 78% of our work). Generally, considering the periods P1, P2 and P3 the distribution is respectively 23%, 28% and 49% (close to our distribution: 22%, 24% and 54%) |
| Mentions of the retraction | Differences: 1) There is no definition of the textual contexts gathered in order to establish whether the citing work had mentioned the retraction or not 2) It also checks whether the retraction is mentioned in the reference list Similarities: 1) A citing entity mentions the retraction, only if the word “retraction” (and its derivatives) is used | Differences: 1) A higher percentage of entities mention the retraction after the partial retraction – i.e. P2-P3 (56% vs 25%). This gap is reduced when considering only the retraction mentioned in the citation context – 38.3% vs 25% Similarities: 1) The overall trend in P2-P3 is similar and shows a continuous increment in the number of entities mentioning the retraction over time 2) 2009 was the year with the lowest number of entities mentioning the retraction |
| Citation Intent | Differences: 1) The citations are characterized into 8 different categories, following the definitions in (Bornmann et al. 2) There is no distinction between the intent and the sentiment of the citation (e.g. a citation could be characterized as 3) The citation characteristics address the citing works as entities (in our work the annotations refer to the specific in-text citations) Similarities: 1) The annotation was performed following a set of rules which guided and helped the annotator | Differences: 1) The plotted results combine both the sentiment and the citation function in the same dimension, following the established annotation rules, which makes the comparison with our results difficult |
| Citation sentiment | Differences: 1) It is not part of the analyzed features, although this subject is addressed in the discussion section 2) This information is embedded in the citation intent value (e.g. using the characteristics | |
| Citation section | Differences: 1) It is not part of the analyzed features, although this subject is part of the discussion since some articles are provided as examples of citations appearing in the | |
| Text analysis (i.e. topic modeling) | Differences: 1) It is not part of the analyzed features. They give a brief summary regarding their observations of some examples that have cited WF-PUB-1998 | |
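Two rules stated in the table above lend themselves to a short sketch: in our work the citation context is the anchor sentence plus the prior and subsequent sentence, and a retraction counts as mentioned only if the word “retraction” (or one of its derivatives) is used. The code below is an illustration under stated assumptions: the text is assumed to be already split into sentences, the function names are hypothetical, and the regular expression is one possible reading of "derivatives".

```python
import re
from typing import List

# Matches "retract", "retracted", "retraction", "retractions", ...
RETRACTION_PATTERN = re.compile(r"\bretract\w*", re.IGNORECASE)


def find_anchor_indexes(sentences: List[str], pointer: str) -> List[int]:
    """Indexes of the sentences containing the in-text reference pointer."""
    return [i for i, sentence in enumerate(sentences) if pointer in sentence]


def citation_context(sentences: List[str], anchor_index: int) -> str:
    """Anchor sentence plus the prior and subsequent sentence, when present."""
    start = max(anchor_index - 1, 0)
    end = min(anchor_index + 2, len(sentences))
    return " ".join(sentences[start:end])


def mentions_retraction(context: str) -> bool:
    """True if the context uses 'retraction' or one of its derivatives."""
    return RETRACTION_PATTERN.search(context) is not None


# Made-up example text, already split into sentences.
sentences = [
    "Concerns about vaccines grew.",
    "A link was claimed in 1998 [3].",
    "The paper was later retracted.",
    "Coverage then recovered.",
]
contexts = [citation_context(sentences, i) for i in find_anchor_indexes(sentences, "[3]")]
```

A fixed three-sentence window keeps the annotation unit uniform across citing entities, at the cost of missing a retraction mention that appears further away from the anchor sentence.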