
A survey of biomedical journals to detect editorial bias and nepotistic behavior.

Alexandre Scanff¹, Florian Naudet¹, Ioana A Cristea², David Moher³,⁴, Dorothy V M Bishop⁵, Clara Locher¹.

Abstract

Alongside the growing concerns regarding predatory journal growth, other questionable editorial practices have gained visibility recently. Among them, we explored the usefulness of the Percentage of Papers by the Most Prolific author (PPMP) and the Gini index (level of inequality in the distribution of authorship among authors) as tools to identify journals that may show favoritism in accepting articles by specific authors. We examined whether the PPMP, complemented by the Gini index, could be useful for identifying cases of potential editorial bias, using all articles in a sample of 5,468 biomedical journals indexed in the National Library of Medicine. For articles published between 2015 and 2019, the median PPMP was 2.9%, and 5% of journals exhibited a PPMP of 10.6% or more. Among the journals with the highest PPMP or Gini index values, where a few authors were responsible for a disproportionate number of publications, a random sample was manually examined, revealing that the most prolific author was part of the editorial board in 60 cases (61%). The papers by the most prolific authors were more likely to be accepted for publication within 3 weeks of their submission. Results of analysis on a subset of articles, excluding nonresearch articles, were consistent with those of the principal analysis. In most journals, publications are distributed across a large number of authors. Our results reveal a subset of journals where a few authors, often members of the editorial board, were responsible for a disproportionate number of publications. To enhance trust in their practices, journals need to be transparent about their editorial and peer review practices.

Year: 2021 PMID: 34813595 PMCID: PMC8610247 DOI: 10.1371/journal.pbio.3001133

Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029

Introduction

Research integrity matters across the research ecosystem. In this process, scientific journal editors are key actors that ensure the trustworthiness of the scientific publication process. But, paraphrasing Dr. Drummond Rennie’s famous quote, who is guarding those guardians? [1] Some of our team (CL, IC, DM, and FN) had doubts that anyone does such safeguarding in the case of New Microbes and New Infections (NMNI), an Elsevier journal, whose most prolific author, Didier Raoult, coauthored 32% of its 728 published papers [2]. NMNI’s editor-in-chief and 6 additional associate editors of the journal work directly for, and report to, Raoult. Together, these editors authored 44% of the 728 papers published in the journal as of June 25, 2020. We suggested that such “self-promotion journals” were “a new type of illegitimate publishing entity, which could have certain key characteristics such as (i) a constantly high proportion of papers published by the same group of authors, (ii) relationships between the editors and these authors, and (iii) publication of low-quality research” [2]. We applied a preliminary approach to detect these “self-promotion journals” in the field of infectious disease using an easily computed measure: the proportion of contributions published in a journal by the most prolific author, i.e., the one who published the most articles in a given time period [2]. In journals publishing more than 50 papers over 5 years, it was rare to see journals where a specific author published more than 10% of the papers, and, indeed, NMNI was a clear outlier. Note, however, that this is a crude measure, as it is based on all published articles, whatever their type (research, letter, editorial, etc.), and therefore may give high scores for legitimate contributions by active editors. Coincidentally, one of our colleagues (DB) reported a similar analysis for the addiction subfield of psychology in a blog post [3]. This analysis focused on research articles only.
She found a bimodal distribution of “the percentage by the most prolific” measure, identical to that observed for NMNI, with only 3 out of 99 journals having a score over 8%. In 2 of these journals, the high score was attributable to the same individual, who was on the editorial board of the journal, and who had published together with the editor-in-chief. Bishop also noted that the same method identified a journal editor, Johnny Matson, who had been found previously to be publishing copiously in journals edited by himself or other editors [3]. Furthermore, for many of these papers, indirect evidence of superficial or absent peer reviews was suspected because of the remarkably rapid turnaround, often within a week or less, between the dates recorded for submission and manuscript acceptance. This circumstantial evidence of unethical editorial practice can only be obtained, however, in journals that report these dates for published manuscripts. These convergent analyses from different fields suggest that the Percentage of Papers by the Most Prolific author (PPMP) deserves consideration as a potential red flag to identify journals that are suspected of biased editorial decision-making—what we now term “nepotistic journals.” We may draw parallels between the PPMP and studies on resource distribution in economics. A highly prolific author who is an outlier on the PPMP measure in effect monopolizes a large proportion of a journal’s publications. This analogy supports another, more complex measure, the Gini index [4], used in econometrics to describe resource distribution inequalities. This measure was recently applied in bibliometrics to explore inequality in authorship across authors publishing in high-impact academic medical journals [5]. Applied to our context, it could be used to quantify imbalances in the patterns of authorship within a journal: the values of the Gini index range from 0 (perfect equality in numbers of articles among authors) to 1 (major inequality). 
We set out to apply these 2 indices to a very large dataset of biomedical journals over a 5-year time frame, to describe outliers using these indices, and to describe time intervals between submission and acceptance dates as a potential indicator of unfair or partisan editorial practices.

Results

Journal selection and description

Using the search query on the United States National Library of Medicine (NLM) catalog, 11,665 journals labeled with at least one of 152 “Broad Subject Terms” were retrieved. The flow chart of included journals details the reasons for noninclusion of some journals. After exclusions, 5,468 journals were analyzed in the principal analysis.

Flow chart of included journals.

Selection flow chart for journals labeled with at least one “Broad Subject Term” in the NLM. NLM, National Library of Medicine.

These journals published a total of 4,986,335 articles of which 4,183,917 were considered research articles (i.e., original articles, case reports, and reviews). The main characteristics of the journals analyzed are described in . Briefly, they published a median of 500 articles (IQR 262 to 964) for the 2015 to 2019 period, of which 426 (IQR 232 to 798) were considered research articles. Two “mega-journals” published more than 25,000 articles over the 5-year period (Scientific Reports with 95,900 articles, and PLOS ONE with 108,990 articles). For 3,668 journals (67%), there was at least one article without any author, and a median percentage of 0.9% of articles (IQR 0.4% to 2.1%) with no named author in these journals. The author with the largest number of articles in a given journal is a journalist (N = 767), and the author with the largest number of research articles is an academic (N = 252). Both authors publish in established journals (The BMJ and Physical Review Letters, respectively) and are members of their respective editorial boards.

NLM, National Library of Medicine; MPA, Most Prolific Author; PPMP, Percentage of Papers by the Most Prolific author.

Description of the indices

Percentage of papers by the most prolific author and the Gini index

For the principal analyses based on all articles published during the 2015 to 2019 period, the PPMP median was 2.88% (IQR 1.71% to 4.91%), and the 95th percentile of the PPMP value was 10.6%. Panel A of the figure “PPMP and Gini index among all articles” details the PPMP and numbers of published outputs for all journals. It also shows that the same number of publications by a prolific author can result in a different PPMP depending on the number of papers published in the journal. The most prolific author(s) in each journal published a median of 14 articles (IQR 8 to 25). For 1,022 journals (19%), there was more than one author with the same largest number of published articles. To assess authorship disparity arising not only from a single highly prolific author but also from a group of authors, we also computed the Gini index, a measure of the degree of unequal distribution of authorship within a journal. The Gini index ranges from 0 to 1, with smaller values indicating a more equal distribution of articles across authors and higher values representing greater inequality. Over the 2015 to 2019 period, the Gini index median was 0.183 (IQR 0.131 to 0.246), and the 95th percentile was 0.355. Panel B of the same figure details the Gini indices and numbers of published outputs for all journals, and panel C presents the correlation between the PPMP and the Gini index. This correlation was 0.35 (95% CI 0.33 to 0.37).

PPMP and Gini index among all articles.

Distribution of the PPMP (A) and Gini index (B) in relation to journal size, and comparison between the PPMP and the Gini index (C), across all articles published by all journals in the United States NLM catalog having at least one Broad Subject Term and having published at least 50 authored articles between 2015 and 2019. The data underlying this figure may be found in https://osf.io/6e3uf/. NLM, National Library of Medicine; PPMP, Percentage of Papers by the Most Prolific author.

For both indices, there were no meaningful differences between index values across years, with the 95th percentile ranging from 10.1% to 11.4% for the PPMP and 0.212 to 0.224 for the Gini index. Results of the sensitivity analyses based on “research articles” alone were consistent with those for all articles. Correlations between indices computed for all articles and for “research articles” alone were 0.84 (0.84 to 0.85) for the PPMP and 0.93 (0.93 to 0.94) for the Gini index. In 540 journals (9.9%), for at least a quarter of the authors, only the initials of their first name(s) were presented.

Field-specific variations

The distribution of the PPMP and Gini index for each NLM broad term is presented in . The median PPMP per field ranged from 1.1% to 9.5%, and the median Gini index per field ranged from 0.113 to 0.297.

Publication lag

Because of failures to report submission or acceptance dates, publication lag was not calculable for 2,743 journals (50.2%). Compared to journals that did report submission and acceptance dates, these journals had fewer authored articles (369 (IQR 200 to 712) versus 637 (IQR 355 to 1,186)). There were no differences for the Gini index but a higher PPMP (3.4% (IQR 2.0% to 5.9%) versus 2.4% (IQR 1.5% to 4.0%)). For the 2,725 journals with data on submission and publication (49.8%), the median publication lag for all authored articles over the 5 years was 85 days (IQR 53 to 123) for articles published by the most prolific author(s) versus 107 days (IQR 80 to 141) for articles not published by the most prolific author(s). The figure “Publication lag” shows the scatter plot for all articles, with the marginal density curve for the median publication lag for the most prolific author(s) versus nonprolific authors, for each journal: for articles authored by the most prolific author(s), the distribution of the publication lag was skewed toward a shorter time lag. Using a cutoff of 3 weeks for the median publication lag, 277 (10.2%) of the journals had a median below this for articles by the most prolific author(s), 51 (1.9%) journals had a median below this for articles not by prolific author(s), and 38 (1.4%) journals had a median below this for both types of articles (i.e., authored by the most prolific author(s) or not). For the most prolific authors, publication lag decreased with the number of articles published, as shown in the figure “Publication lag among articles of the most prolific author,” and not solely in outlier journals. The results of the sensitivity analyses based on “research articles” alone were consistent with those for all articles.
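The comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' analytic code: it assumes the lag is measured in days from submission to acceptance, and the record layout and the names `median_lags` and `below_cutoff` are hypothetical.

```python
from datetime import date
from statistics import median

CUTOFF_DAYS = 21  # the 3-week cutoff used in the text


def median_lags(records):
    """Return (median lag for the most prolific author(s), median lag for others).

    Each record is a tuple (submitted, accepted, by_most_prolific).
    A group with no articles yields None. The record layout is an assumption.
    """
    prolific, others = [], []
    for submitted, accepted, by_most_prolific in records:
        lag = (accepted - submitted).days
        (prolific if by_most_prolific else others).append(lag)
    return (
        median(prolific) if prolific else None,
        median(others) if others else None,
    )


def below_cutoff(lag, cutoff=CUTOFF_DAYS):
    """True when a journal's median lag is strictly below the cutoff."""
    return lag is not None and lag < cutoff


# Toy journal: two fast articles by the most prolific author, one slower article.
records = [
    (date(2019, 1, 1), date(2019, 1, 11), True),
    (date(2019, 2, 1), date(2019, 2, 15), True),
    (date(2019, 1, 1), date(2019, 4, 11), False),
]
prolific_lag, other_lag = median_lags(records)
print(prolific_lag, below_cutoff(prolific_lag), other_lag, below_cutoff(other_lag))
```

Computing one median per journal and per author group, rather than pooling all articles, mirrors the per-journal comparison reported in this section.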

Publication lag.

Distribution of the publication lag median for the subgroup of 2,725 (49.8%) journals reporting submission and publication dates. Median publication lags (in days) are presented for articles signed by the most prolific authors compared to the articles without any of the most prolific authors (with marginal density plot of distributions). The data underlying this figure may be found in https://osf.io/6e3uf/.

Publication lag among articles of the most prolific author.

Distribution of publication lag (in days) and number of articles authored for each of the most prolific authors, across all articles (with marginal density plot of distributions) for the subgroup of 2,725 (49.8%) journals reporting submission and publication dates. The data underlying this figure may be found in https://osf.io/6e3uf/.

Description of outliers and identification of nepotistic journals

Using the 95th percentile value, we identified through the principal analysis 480 outlier journals: 206 based on the PPMP alone, 206 based on the Gini index alone, and 68 based on both indices. The yearly and global distributions of these indices are presented for all journals in . The main characteristics of the 100 randomly selected outlier journals are presented in . Of the 100 journals identified through the principal analyses, 98 were reported in English, among which 31 were also in another language (either fully multilingual journals or translation of abstracts). The most common non-English languages were German (6 journals), Chinese (5 journals), Japanese (5 journals), and French and Italian (4 journals each). These outliers were well-established journals, with a median year of start of activity in 1990 (IQR 1976 to 2001). Only 56 of these 100 journals were indexed in Web of Science (WoS), which enables an assessment of the journal impact factor and other citation metrics. For these 56, the median journal impact factor was 2.9 (IQR 1.5 to 4.8) with a median self-citation ratio of 0.11 (IQR 0.047 to 0.21), corresponding to a median self-citing boost according to Ioannidis and Thombs of 13% (IQR 5% to 26%) [6]. The skewness and nonarticle inflation median was 86% (IQR 55% to 144%) [6]. Calculation of this metric was not possible for 11 journals (20%) that had a median of article citation of 0. Only 5 journals were indexed in the Directory of Open Access Journals as being full open access, and the median proportion of open access articles was 2.0% (IQR 0.47% to 8.0%). None of the outliers had an open peer review policy. For 2 of the 100 journals, the full composition of the editorial boards could not be found and only the editor-in-chief was known, but was not the most prolific author. In the remaining 98 journals, at least one of the most prolific authors was a member of the editorial board in 60 journals (61%), among whom 25 (26%) were the editors-in-chief.
Journals where at least one of the most prolific authors was on the editorial board tended to have a higher impact factor than the others with a median of 3.4 (IQR 2.0 to 5.3) versus a median of 1.4 (IQR 1.0 to 3.1). Because there was sometimes more than one author with the same largest number of published articles, 108 “most prolific” authors were identified in the 100 outlier journals. We identified errors in author identification for one journal, MMW-Fortschritte der Medizin (an outlier on the Gini index), where the most prolific “author” was named “Red,” which seems to be a diminutive for “Redaktion,” possibly encompassing several physical individuals. When “Red” was ruled out as a valid author name (i.e., these papers were considered as having no author), the next most prolific author was, however, a member of the editorial board; the PPMP increased very slightly from 7.5% of 4,920 authored articles to 7.6% of 4,553 authored articles, and the Gini index decreased very slightly from 0.637 to 0.616. Out of 1,978 identified authors (excepting “Red”), 1,435 (72.5%) did not publish more than one article in this journal. Among the remaining 107 individual authors, 95 were formally identified from WoS. Among the 12 remaining, identification was considered unreliable for 8 because of possible homonyms, and 4 were not indexed in WoS. For these 12 authors, manual disambiguation using PubMed and Google identified 6 journalists, 5 physicians with a consistent affiliation to the journal considered, and one author where a high risk of homonym persisted (“Wang, Y” in Journal of Clinical Otorhinolaryngology, Head, and Neck Surgery). Among the 95 other authors, the median of the H-index was 28 (IQR 13 to 50). Overlap between indices (PPMP versus Gini index) and analyses (principal versus sensitivity analyses) is presented in .
Among the 648 journals flagged as outliers either through the principal analyses, or through the sensitivity analyses, 299 (46.1%) are common to both analyses. Furthermore, qualitative analysis of the 100 randomly selected outlier journals identified through the sensitivity analyses was consistent with that for the principal analyses (see ).
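The outlier rule used in this section (a journal is flagged when it sits at or above the 95th percentile on the PPMP, the Gini index, or both) can be sketched as follows. This is an illustration under stated assumptions: the percentile interpolation method is not specified in the text, so the "inclusive" linear interpolation of Python's `statistics.quantiles` is assumed, and the function names and input layout are hypothetical.

```python
from statistics import quantiles


def percentile_95(values):
    """95th percentile by linear interpolation (assumed method: 'inclusive')."""
    return quantiles(values, n=20, method="inclusive")[-1]


def classify_outliers(indices):
    """Map each flagged journal to 'ppmp', 'gini', or 'both'.

    `indices` maps a journal name to its (PPMP, Gini index) pair.
    Journals below both cutoffs are left out of the result.
    """
    ppmp_cut = percentile_95([p for p, _ in indices.values()])
    gini_cut = percentile_95([g for _, g in indices.values()])
    flagged = {}
    for journal, (p, g) in indices.items():
        high_p, high_g = p >= ppmp_cut, g >= gini_cut
        if high_p and high_g:
            flagged[journal] = "both"
        elif high_p:
            flagged[journal] = "ppmp"
        elif high_g:
            flagged[journal] = "gini"
    return flagged


# Toy data: 100 journals whose two indices rank in opposite order.
toy = {f"J{i}": (i, 101 - i) for i in range(1, 101)}
flags = classify_outliers(toy)
print(sorted(flags.items()))
```

Keeping the three categories separate makes it straightforward to report counts in the same form as above, that is, journals flagged on one index alone versus on both.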

Discussion

In this comprehensive survey of 5,468 biomedical journals, we describe several features of editor–author relationships among which were the following: (i) an article output was sometimes dominated by the prolific contribution of one author or a group of authors; (ii) with time lags to publication that were in some instances shorter for these prolific authors (when this information was available); and (iii) more than half of those prolific authors were typically members of the journal’s editorial board. In our study, the threshold to define the 95th percentile was 10.6% of articles published by the most prolific author. In absolute terms, we believe that it is reasonable to closely inspect these journals, as one may question the judgment of an editor when more than 10% of the published papers are authored by the same person. To better characterize the lack of heterogeneity in authorship, we also computed the Gini index, for which the corresponding 95th percentile was 0.355 over 5 years and 0.20 when computed annually. One possible explanation is that a broader time frame allowed for more occasional authors to be recruited while maintaining the regular authors, revealing the latent heterogeneity. One other explanation is that some recent journals are relying on a group of authors in their first year before allowing these authors to move on to other publishing duties. The PPMP and the Gini index explore complementary patterns of asymmetry in publishing patterns. The PPMP reflects author practices, while the Gini index is more sensitive to groups of highly prolific authors. The Gini index has one advantage over the PPMP, in that it is less constrained by the total number of articles published. If a journal publishes a very large number of papers (e.g., 1,000 articles over 12 months), it becomes increasingly implausible that a single prolific author could account for 10% or more of them, as there is a natural upper limit to how many papers any one individual can author.
Conversely, for a journal that publishes very few papers, an author could be identified as above the 95th percentile with a relatively modest number of publications. In other words, there might be a masking effect in journals with higher rates of publishing causing lower publishing rate journals to show a greater number of PPMPs. Further developments may consider breaking the journals into groups based on rates of publishing and/or using crude number of publications by the most prolific author. For the subgroup of 2,725 (49.8%) journals reporting submission and publication dates, the time lag for the most prolific author(s) was shorter, suggesting that for certain journals, which are outliers on PPMP and/or Gini index, peer reviews may have been absent or only superficial for prolific authors. However, for the most prolific authors, the publication lag decreases with the number of articles published across all journals and not solely in the subsample of outliers. This suggests that our description of these outliers based on PPMP could be only the tip of the iceberg, capturing solely the most extreme cases of hyperprolific publication in a given journal. Because not all journals report publication lag data, this finding is necessarily based on a subgroup. This could introduce bias if journals reporting publication dates differ from those that do not in terms of factors such as size or monitoring of the publication process [7]. Our findings persisted when all articles were considered as well as when only “research articles” were considered (i.e., excluding articles explicitly referenced as editorial, correspondence, or news articles), suggesting that editorials, correspondence, and news are not the only drivers of the indicators we explored. Conversely, it is possible that the definition we used to identify research articles is not perfect and carries a risk of misclassification bias. 
However, analyzing both the overlap of PPMP and Gini and the overlap between both analysis populations cautions against making simple binary distinctions. When based on all articles, these metrics carry a risk of false positives, represented by active editors and/or professional writers. When based on research articles, they may miss problematic behaviors by certain editors. In other words, there is surely a gray zone; the proposed metrics are useful to delineate the big picture of medical journals’ behaviors, but the specific behaviors of these journals will necessarily deserve more fine-grained scrutiny in the future. For instance, it will be interesting to see whether hyperprolific authors publishing in their own journals adhere to COPE policy regarding conflicts of interest disclosures [8]. This was studied for authors of Cochrane reviews who are also editorial board members: The adherence to conflicts of interest policy was low [9]. We should beware of assuming that a hyperprolific author is necessarily engaged in questionable publishing practices: Some people are highly productive, and the speed with which good research can be completed is highly variable across research fields. Furthermore, authors may be represented in many papers because they play a key role in one aspect of the research, such as the statistical analysis, and senior researchers who oversee multiple projects may end up as authors on many papers. Similarly, shorter publication lags may occur simply because it is easier to find reviewers for eminent authors, or in a particular subject area, and/or because their expertise means that their papers require less revision. Nevertheless, there is no doubt that some highly prolific authors achieve an unusual level of productivity by exploiting the system or engaging in academic misconduct [10,11].
It is important to make a distinction between hyperprolific authors who publish a lot in a range of different journals and those who are exploiting a select pool of a few journals in which they appear as prominent authors (as we explored here). It is also very important to complement the PPMP and the Gini index with the absolute numbers of papers authored by the most prolific authors, because some problematic journal behaviors could pass unnoticed when only these 2 indices are used. On a random sample of outlier journals identified using the PPMP and/or the Gini index, we found that the prolific authors can be “established” scientists with a relatively high H-Index (for instance, a median of 28), and that 60% of these most prolific authors were editors-in-chief or members of the journal’s editorial board. About half of these journals had a median journal impact factor of 2.9 (IQR 1.5 to 4.8). These journals generally presented a large self-citation ratio, meaning that some of them may have questionable practices by manipulating their impact factor. The other half of these journals did not have an impact factor, possibly indicating they were new journals joining the WoS. Even though WoS uses an extensive list of eligibility criteria, it is also possible that some of the new journals are predatory, journals that are known to have “leaked into” trusted sources [7]. Importantly, while NLM is skewed toward English language journals, several outlier journals were in other languages. These journals are likely to represent smaller communities with the possibility of closer interactions between researchers. Our results underscore possible problematic relationships between authors who sit on editorial boards and decision-making editors. Typically, publishers promote independence between authors and journals. Hyperpublished authors may see such relationships as a way to more easily reach publication thresholds for hiring, promotion, and tenure. 
There may be defensible reasons for members of the editorial board of a journal to hyperpublish in a journal [3]. There are, for instance, certain research fields that are research niches, where the contributing authors are part of a very small community of specialists and are therefore the most likely authors. Although our findings are based solely on a subsample of journals, they provide crucial evidence that editorial decisions were not only unusually, but also selectively, fast for the favored subset of prolific authors. This pattern was also found by Sarigöl and colleagues when exploring favoritism toward collaborators and coeditors, which persists even after taking into account individual article quality, measured as citation and download numbers [7]. This phenomenon could have an impact on productivity-based metrics and suggests a risk of instrumentalization, if not corruption, of the scientific enterprise, by using journals as a “publication laundry” for “vanity publication” [12] by authors closely related to the editorial board. Exploiting productivity metrics has been widely described, in the form of self-citation, honorary authorship, and/or ghost writing. Manipulation of individual metrics by resorting to a dedicated “nepotistic” journal appears to be a little studied way of exploiting the system.

Limitations

Our descriptive and exploratory survey, based on a large available database, provides information about the broad scene of “nepotistic journals,” but it may miss some finer points, especially concerning the quality of articles published in these journals. The quality of a scientific article is a difficult concept to measure, and it cannot be easily summarized in quantitative metrics. We recommend a qualitative analysis of the papers published by the most prolific authors in journals flagged by these indicators, as well as an analysis of the relationship between this author and the editorial board. In addition, we restricted our analysis to journals indexed in the NLM under one or more of the existing broad terms. Some journals are registered without broad terms, requiring a manual pickup by the NLM. Consequently, our survey may have preferentially included the more established journals indexed in the NLM, with a durable presence in the database, and hence likely to have a better-quality global and editorial conduct than nonindexed journals. Similarly, because we restricted our search to journals publishing a minimum of 50 papers in the 2015 to 2019 period, we may have missed smaller journals with less professional editorial staff and/or from publishers with different standards. Importantly, our automated calculations rely on the articles identified through an NLM search, and it is possible that not all journals list all articles within PubMed. Although this is likely rare, we cannot exclude that certain authors use somewhat different names (for instance, change of name after marriage). We could not explore this bias that would have resulted in an underestimation of our indicators. More importantly, these calculations carry a risk of inaccuracy as a result of homonymy.
Misidentification and/or merging of author names could bias the PPMP and the Gini index in both directions, and the risk of merging increases when only the initial or first name is known, and in the case of authors with similar names. The greater risk of homonyms could partially explain the increased Gini index values for larger journals, without reference to a tendency to editorial misconduct. Our analysis of the random sample of outliers enabled a disambiguation procedure consisting of qualitatively inspecting the most prolific authors. Only 1 out of 108 “most prolific” authors within a given journal was considered to be at high risk of homonymy. Among these 108 authors, this procedure also enabled identification of 6 most prolific authors who were professional journalists, for whom high productivity is of course not an indicator of any academic misconduct, as they are professionals paid by the journal and not academics. The 2 proposed indicators, and their current calculation, should therefore not be used indiscriminately but could rather serve as a screening tool for potentially problematic journals that may then require careful exploration of their editorial practices. In addition, by analogy with citation-based metrics [13], we believe that no single metric can be sufficient but rather that different metrics can be complementary to inform about editorial behavior and that these metrics must not be used indiscriminately without considering all the identified limitations. While our results are exploratory and do not yet support a widespread use of these indicators, we hope that further research will help to establish these easily computed indexes as a resource for publishers, authors, and indeed scientific committees involved in promotion and tenure, to screen for potentially biased journals needing further investigation.
DORA paved the way for moving away from productivity-based metrics, and other efforts followed, such as the Hong Kong Principles for assessing researchers. Integrity-based metrics are indeed needed to overcome the limitations of productivity-based metrics [14]. A transparent declaration of interests in communicating research is surely one important aspect of scientific integrity and trustworthy science. This principle of course applies to financial conflicts of interest, which are often underdeclared by journal editors [15], and also to nonfinancial conflicts of interest such as editor–author relationships. The proposed indices could add transparency in the editorial decision-making and peer review process of any journal. This transparency is currently lacking toward the public and any stakeholder involved in the research community, such as COPE, the Committee on Publication Ethics. Guidance for editors and publishers should be developed to delineate good practices and prevent obvious misconduct.

Methods

We developed and followed a research protocol, which was prospectively registered on July 21, 2020, on the Open Science Framework (https://osf.io/6e3uf/). The analytic code and summarized data are also available at the same URL.

Data extraction

The eligibility criteria for the selection of journals were the following: (i) a biomedical journal referenced in the NLM in the MEDLINE database; (ii) having at least one “Broad Subject Term”; and (iii) having published more than 50 papers between January 2015 and December 2019. “Broad Subject Terms” are Medical Subject Headings (MeSH), terms used to describe a journal’s overall scope, and they are defined by the NLM for journals in the MEDLINE database [16]. Each journal was analyzed only once regardless of the number of “Broad Subject Term” associated with the journal (except in subgroup analysis by “Broad Subject Terms”). The 2015 to 2019 period was chosen, as this 5-year window enables a smoothing of random variations and description of recent practices. One author (AS) searched for changes to journal names during the 2015 to 2019 period and, in cases of renaming, pooled the articles published under the different names. To identify eligible journals, we used the Entrez programming utilities (E-utilities), which enable queries to the National Center for Biotechnology Information (NCBI) databases. The search query—presented in —was used to identify all biomedical journals in the NLM catalog having at least one of the “Broad Subject Terms” listed. Then, for each journal, article metadata was automatically collected with E-utilities. On account of technical restrictions, querying for article metadata was run from 2015 up to the date of extraction and then restricted to the period January 2015 to December 2019. To manage articles without an author name, the third selection criterion was slightly modified to focus on journals with at least 50 “authored articles”—i.e., articles with at least one identified author—over the 2015 to 2019 period (see “Protocol changes”). 
Publications reprinted in several journals (for instance, PRISMA statements published in 6 different journals to promote dissemination) did not receive special treatment, and no correction was applied, as each article was only examined in relation to its publication journal.
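The amended third criterion can be sketched as a simple filter over article metadata. This is a minimal Python illustration, not the authors' R code: the `Article` record and the `eligible_journals` function are hypothetical names, and the metadata are assumed to have already been downloaded.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Article:
    """Hypothetical minimal record for one article's metadata."""
    journal: str
    year: int
    authors: List[str]  # empty list when no author name is recorded


def eligible_journals(articles: List[Article],
                      start_year: int = 2015,
                      end_year: int = 2019,
                      min_authored: int = 50) -> Dict[str, int]:
    """Count authored articles per journal within the study window and
    keep journals with at least `min_authored` of them."""
    counts: Dict[str, int] = {}
    for art in articles:
        in_window = start_year <= art.year <= end_year
        if in_window and art.authors:  # skip articles without any author
            counts[art.journal] = counts.get(art.journal, 0) + 1
    return {j: n for j, n in counts.items() if n >= min_authored}
```

Articles outside the window or without authors do not count toward the threshold, which mirrors the restriction to the 2015 to 2019 period and to “authored articles.”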

Index calculation

Percentage of papers by the most prolific author and Gini index

For each journal, each author was identified by full name (i.e., family name and complete first name) or, failing that, by family name and first name initial(s). The number of articles authored by each such person was counted. When more than one author shared the same largest number of published articles, they were all considered as the “most prolific” authors. The PPMP was defined as the number of articles by the most prolific author (nmax) divided by the total number of authored articles in the journal (Ntot), expressed as a percentage: PPMP = nmax / Ntot. Complementary to the PPMP, the Gini index was used to explore inequality in the number of published articles that is attributable to more than one author. The Gini index for the number of publications by each author was calculated, with correction for the total number of authors (see the formula and example in the supporting information, “Formula and example to understand the Gini index”) [4]. The Gini index ranges from 0 to 1, with smaller values indicating a more equal distribution of articles across authors and higher values representing greater inequality. For the primary analysis, these 2 outcomes (PPMP and Gini index) were computed for all papers (including research articles, editorials, comments, etc.), and for a sensitivity analysis, they were computed only for research articles (using the NCBI publication type). In line with previous works [5,17], articles were considered as research articles if (i) the Publication Type field was coded “Journal Article” and (ii) the Abstract field was not empty.
Furthermore, articles were not considered as research articles if the Publication Type field was coded with any of the following labels: “Comment,” “Letter,” “Editorial,” “Published Erratum,” “News,” “Introductory Journal Article,” “Biography,” “Portrait,” “Congress,” “Interview,” “Retraction of Publication,” “Personal Narrative,” “Retracted Publication,” “Patient Education Handout,” “Lecture,” “Autobiography,” “Clinical Conference,” “Classical Article,” “Address,” “Legal Case,” “Expression of Concern,” “Festschrift,” “Overall,” “Bibliography,” “Corrected and Republished Article,” “Interactive Tutorial,” “Duplicate Publication,” “Directory,” “Newspaper Article,” “Periodical Index,” “Dictionary.”
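The two indices can be illustrated with a short Python sketch (the authors worked in R, with DescTools for the Gini index). The function names are hypothetical, each article is assumed to be a list of already normalized author names, and the Gini formula is assumed to be the standard rank-based form multiplied by n/(n − 1) as the correction for the number of authors.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def author_counts(articles: Sequence[Sequence[str]]) -> Counter:
    """Number of articles in which each author name appears."""
    counts: Counter = Counter()
    for authors in articles:
        counts.update(set(authors))  # count an author once per article
    return counts


def ppmp(articles: Sequence[Sequence[str]]) -> Tuple[float, List[str]]:
    """Percentage of Papers by the Most Prolific author, plus every
    author tied for the largest count."""
    authored = [a for a in articles if a]  # denominator: authored articles
    counts = author_counts(authored)
    n_max = max(counts.values())
    prolific = sorted(name for name, n in counts.items() if n == n_max)
    return 100.0 * n_max / len(authored), prolific


def gini(values: Sequence[float]) -> float:
    """Gini index with an n/(n-1) correction (assumed form)."""
    y = sorted(values)  # nondecreasing order
    n = len(y)
    if n < 2:
        raise ValueError("need at least two authors")
    weighted = sum(i * v for i, v in enumerate(y, start=1))
    g = 2 * weighted / (n * sum(y)) - (n + 1) / n
    return g * n / (n - 1)
```

For example, `ppmp([["A", "B"], ["A"], ["A", "C"], ["B"], []])` ignores the unsigned article and reports that author "A" signed 3 of the 4 authored articles; `gini(author_counts(...).values())` then summarizes how unevenly authorships are spread.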

Publication time lag

For each article, the publication lag—defined as the time between submission and acceptance of an article—was computed whenever possible. After this, each journal was characterized by (i) median publication lag for articles authored by at least one of the most prolific authors and (ii) median publication lag for articles not authored by the most prolific author(s).
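The per-journal comparison can be sketched in Python as follows. The dictionary keys (`authors`, `submitted`, `accepted`) and the function names are illustrative assumptions, and articles without both dates are simply skipped, in line with “whenever possible.”

```python
from datetime import date
from statistics import median
from typing import Dict, List, Optional, Set, Tuple


def lag_days(submitted: Optional[date], accepted: Optional[date]) -> Optional[int]:
    """Days between submission and acceptance, or None if a date is missing."""
    if submitted is None or accepted is None:
        return None
    return (accepted - submitted).days


def median_lags(articles: List[Dict], prolific: Set[str]) -> Tuple[Optional[float], Optional[float]]:
    """Median lag for articles with at least one most prolific author,
    and median lag for all other articles, within one journal."""
    with_prolific: List[int] = []
    without_prolific: List[int] = []
    for art in articles:
        lag = lag_days(art["submitted"], art["accepted"])
        if lag is None:
            continue  # lag cannot be computed for this article
        if prolific & set(art["authors"]):
            with_prolific.append(lag)
        else:
            without_prolific.append(lag)
    return (median(with_prolific) if with_prolific else None,
            median(without_prolific) if without_prolific else None)
```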

Description of outlier journals

Outliers were defined as journals with a PPMP value and/or a Gini index above their respective 95th percentiles in the principal analysis (i.e., on all articles) and in the sensitivity analysis (i.e., on research articles). For pragmatic reasons, 2 samples of 100 outlier journals were randomly selected (journals were first sorted by full name in alphabetical order and then randomly sampled using a random number generator with a seed arbitrarily set at 42; R function sample_n in the dplyr package). One reviewer (AS, FN, or CL) manually extracted characteristics related to the journal impact factor (WoS), open access policies (WoS and Directory of Open Access Journals), open peer review policies (Publons–Clarivate), the most prolific authors’ H-index (WoS), and presence and role (i.e., editor-in-chief or board member) on the editorial board of the journal (journal or publisher website). Where this information was available, we made a distinction between advisory boards (which were not considered in the analysis) and editorial boards. For the year 2019, the metrics “self-citation boost” (i.e., number of self-citing articles over number of non-self-citing articles) and “skewness and nonarticle inflation” (i.e., impact factor minus median of citations for an article, over median of citations for an article) were computed according to Ioannidis and Thombs [6]. Importantly, this extraction allowed us to explore qualitatively the possibility of homonyms between authors by comparing names, affiliations, research field, and all available qualitative information on NLM and WoS (using the WoS Author search tool). When doubt persisted, we used Google to identify the author at risk of homonymy and their credentials.
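The outlier rule and the seeded sampling step can be sketched in Python as below. This is an illustration only: the function names are hypothetical, the percentile uses linear interpolation (an assumption about the quantile definition), and Python's generator will not reproduce the journals drawn by R's sample_n even with the same seed.

```python
import random
from typing import Dict, List, Sequence, Tuple


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def find_outliers(journals: Dict[str, Tuple[float, float]], q: float = 95) -> List[str]:
    """Journals whose PPMP and/or Gini index exceeds the q-th percentile.
    `journals` maps a journal name to (ppmp, gini)."""
    ppmp_cut = percentile([v[0] for v in journals.values()], q)
    gini_cut = percentile([v[1] for v in journals.values()], q)
    return sorted(name for name, (p, g) in journals.items()
                  if p > ppmp_cut or g > gini_cut)


def sample_outliers(names: Sequence[str], k: int = 100, seed: int = 42) -> List[str]:
    """Sort names alphabetically, then draw k of them with a fixed seed."""
    ordered = sorted(names)
    rng = random.Random(seed)
    return rng.sample(ordered, min(k, len(ordered)))
```

Sorting before sampling makes the draw depend only on the set of journal names and the seed, not on the order in which the journals were retrieved.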

Data analysis

A descriptive analysis was performed using median, range, and quartiles for continuous variables, and counts and percentages for categorical variables. For both analyses, descriptions for the 100 outlier journals were computed overall and with respect to membership of any of the most prolific author(s) on the editorial board. Correlations were computed using Pearson’s coefficient, with 95% confidence interval (CI). To explore field-specific variations, the distribution of the 2 indices within each “Broad Subject Term” was graphically displayed. The yearly and overall distribution of the percentage of papers by each author and the Lorenz curve (a graphic representation of the cumulative distribution of appearances as an author) were presented for each of the potential outliers identified above. All analyses were conducted using R version 3.6 [18], with the following main packages: RISmed 2.1 for queries on journal characteristics [19], easyPubMed 2.13 for queries on article characteristics [20], DescTools 0.99 for Gini index calculation [21], and tidyverse 1.3 for miscellaneous tasks [22].
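A Pearson coefficient with a 95% CI can be computed as in the Python sketch below. The text does not state how the interval was obtained; the Fisher z-transformation used here is an assumption (it is the usual textbook approach), and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_ci(x: Sequence[float], y: Sequence[float],
               z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Return (r, lower, upper) using the Fisher z-transformation.
    Requires n >= 4 and |r| < 1 (atanh is undefined at +/-1)."""
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 paired observations")
    r = pearson_r(x, y)
    z = math.atanh(r)               # Fisher transform of r
    se = 1 / math.sqrt(n - 3)       # approximate standard error of z
    return r, math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

Here `x` and `y` would be, for instance, the PPMP and the Gini index across journals; `z_crit` is the standard normal quantile for a two-sided 95% interval.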

Protocol changes

Some unforeseen practical challenges arose because a few articles unexpectedly lacked author names, which precluded them from contributing to the numerator of the PPMP or to the Gini index. We therefore amended our definitions to make it explicit that the PPMP denominator was the number of articles with at least one identified author rather than all published articles, and that journals were included only if they had published at least 50 articles with author names, rather than counting all published articles. The 3-week threshold used to describe publication lag as being suggestive of unduly rapid or absent peer review was not initially specified in our protocol and was arbitrarily added for descriptive purposes. Three weeks seems plausible for a thorough review process, whereas expedited reviews within a few days can be suspicious if there is a pattern of fast reviews, especially if those fast reviews are applied selectively to certain favored authors. We also explored the relationship between publication lag for articles authored by any of the most prolific author(s) and the number of papers authored by these authors. The description of the outlier journals with respect to the membership of any of the most prolific author(s) on the editorial board was added a posteriori for exploratory purposes. We initially planned to focus our sensitivity analysis on “journal articles” only. During the peer review process, following a news article in Science [23], it became clear that this category was not specific enough. The protocol was therefore edited (https://osf.io/6evmz/) with a better definition of “research articles.” We have also described the overlap between both analyses and placed more emphasis on the sensitivity analysis by describing a random sample of outliers from this analysis (this was not part of our initial protocol).

Cross classification of Publication Types.

Publication types extracted from the MEDLINE metadata of each article and their co-occurrences, among all journals in the United States NLM catalog having at least one Broad Subject term and having published at least 50 signed articles between 2015 and 2019. The data underlying this figure may be found in https://osf.io/6e3uf/. NLM, National Library of Medicine. (TIF)

Formula and example to understand the Gini index.

We investigated the level of inequality in the distribution of authorship among authors using the Gini index. This statistical measure is derived from the Lorenz curve and is widely used in econometrics to describe income or wealth inequalities in a given population. In our study, “income” corresponds to the number of articles signed by authors in a given journal, and the “population” is all authors who have published at least one article between 2015 and 2019 in the journal. The Lorenz curve is a graphical representation of the ranked distribution, with the cumulative percentage of authors on the abscissa and the cumulative percentage of authorship on the ordinate. In case of complete equality across authors (i.e., each author within a journal has published the exact same number of articles), the Lorenz curve follows the 45-degree diagonal. As inequality increases (i.e., one author or a group of authors published more articles than other authors), the Lorenz curve moves further away from this diagonal of equal distribution. The Gini index is a measure of the area between the diagonal of equal distribution and the Lorenz curve, corrected for the number of authors. In practice, the calculation formula is $G = \frac{n}{n-1}\left(\frac{2\sum_{i=1}^{n} i\,y_i}{n\sum_{i=1}^{n} y_i} - \frac{n+1}{n}\right)$, where n is the total number of authors, and yi is the number of articles published by author i, with authors sorted in nondecreasing order of article numbers. The Gini index ranges from 0 to 1, with smaller values indicating a more equal distribution of articles across authors and higher values representing greater inequality.
As expressed in its formula, the Gini index calculation gives a higher weight to extreme positive values and may be comparable in interpretation to a normalized root-mean-square error against an expected distribution of “all authors appear exactly the same number of times.” As a toy example, hypothetical cases of distribution of authorship between 3 authors totaling 24 authorships (left), with the corresponding Lorenz curves and Gini indices (right), are shown in the figure. The dark blue line represents the diagonal of equal distribution (scenario 1), and the other lines represent the Lorenz curves of different scenarios of inequality (scenarios 2–5). The Gini formula counts the times an author’s name occurs but does not distinguish between papers contributed by different authors. Thus, scenario 1 shown here could occur with 8 articles published in common by the 3 authors, or with 4 articles published in common and each of the 3 authors separately publishing 4 other articles. (TIF)
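As a worked illustration of the calculation, consider a hypothetical journal (not one of the scenarios in the figure) in which 3 authors hold 4, 8, and 12 of the 24 authorships. The expression below assumes the standard rank-based form of the Gini index multiplied by n/(n − 1) as the correction for the number of authors; the exact form used by the authors is the one given in their supporting information.

```latex
\begin{align*}
G &= \frac{n}{n-1}\left(\frac{2\sum_{i=1}^{n} i\,y_i}{n\sum_{i=1}^{n} y_i} - \frac{n+1}{n}\right),
  \qquad y = (4, 8, 12),\quad n = 3 \\
\sum_{i=1}^{3} i\,y_i &= 1\cdot 4 + 2\cdot 8 + 3\cdot 12 = 56,
  \qquad \sum_{i=1}^{3} y_i = 24 \\
G &= \frac{3}{2}\left(\frac{2\cdot 56}{3\cdot 24} - \frac{4}{3}\right)
   = \frac{3}{2}\left(\frac{14}{9} - \frac{12}{9}\right)
   = \frac{1}{3} \approx 0.33
\end{align*}
```

Under the same assumed formula, equal shares (8, 8, 8) give G = 0, and G moves toward 1 as a single author's share comes to dominate.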

PPMP and Gini index among articles considered as research articles.

Distribution of (A) the PPMP and (B) the Gini index in relation to journal size, and (C) comparison between the PPMP and the Gini index, among articles considered as research articles (i.e., original articles, case reports, and reviews), published by all journals in the US NLM catalog having at least one Broad Subject term and having published at least 50 signed articles between 2015 and 2019. The data underlying this figure may be found in https://osf.io/6e3uf/. NLM, National Library of Medicine; PPMP, Percentage of Papers by the Most Prolific author. (TIF)

PPMP according to the subject area.

Distribution of the PPMP for each US NLM broad term represented by at least 10 journals having published at least 50 signed articles between 2015 and 2019. The number of journals covered by a Broad Subject term is shown next to the name of the field of study. The width of each box-and-whisker plot is relative to the number of journals. The vertical line marks the 95th percentile of PPMP among journals. The data underlying this figure may be found in https://osf.io/6e3uf/. NLM, National Library of Medicine; PPMP, Percentage of Papers by the Most Prolific author. (TIF)

Gini index according to the subject area.

Distribution of the Gini index for each US NLM broad term represented by at least 10 journals having published at least 50 signed articles between 2015 and 2019. The number of journals covered by a Broad Subject term is shown next to the name of the field of study. The width of each box-and-whisker plot is relative to the number of journals. The vertical line marks the 95th percentile of PPMP among journals. The data underlying this figure may be found in https://osf.io/6e3uf/. NLM, National Library of Medicine; PPMP, Percentage of Papers by the Most Prolific author. (TIF)

Publication lag for articles considered as research articles.

Distribution of the median publication lag (with marginal density plot of distributions) for the subgroup of 2,790 (52.2%) journals reporting submission and publication dates. Median publication lags (in days) are presented for articles signed by the most prolific authors compared to the articles without any of the most prolific authors, among articles considered as research articles (i.e., original articles, case reports, and reviews). The data underlying this figure may be found in https://osf.io/6e3uf/. (TIF)

Publication lag for articles considered as research articles with at least one most prolific author.

Distribution of the median publication lag (with marginal density plot of distributions) and number of articles authored for each of the most prolific authors, across articles considered as research articles (i.e., original articles, case reports, and reviews) for the subgroup of 2,790 (52.2%) journals reporting submission and publication dates. The data underlying this figure may be found in https://osf.io/6e3uf/. (TIF)

Venn diagram illustrating overlap between indices (PPMP versus Gini index) and analyses (principal versus sensitivity analyses).

The data underlying this figure may be found in https://osf.io/6e3uf/. PPMP, Percentage of Papers by the Most Prolific author. (TIF)

Description of included journals.

Main characteristics of all journals in the US NLM catalog having at least one Broad Subject term and having published at least 50 signed articles between 2015 and 2019. NLM, National Library of Medicine. (XLSX)

Yearly and global individual data for the main characteristics of each selected journal (N = 5,468).

(XLSX)

Main characteristics of 100 randomly selected outlier journals identified through the principal analyses and through the sensitivity analyses.

Journals are considered as outliers if they have a PPMP or a Gini index higher than the 95th percentile, among the journals in the US NLM catalog having at least one Broad Subject term and having published at least 50 signed articles between 2015 and 2019. NLM, National Library of Medicine. (XLSX)

US NLM catalog journal identification query.

NLM, National Library of Medicine. (DOCX)

27 Jan 2021

Dear Dr Locher,

Thank you for submitting your manuscript entitled "‘Nepotistic journals’: a survey of biomedical journals." for consideration as a Research Article by PLOS Biology. Your manuscript has now been evaluated by the PLOS Biology editorial staff, as well as by an academic editor with relevant expertise, and I'm writing to let you know that we would like to send your submission out for external peer review.

IMPORTANT: We will be reviewing your manuscript as a Meta-Research Article. Please could you change the article type to "Meta-Research Article" when you upload your additional metadata (see next paragraph)? No re-formatting is required.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Please re-submit your manuscript within two working days, i.e. by Jan 29 2021 11:59PM. Login to Editorial Manager here: https://www.editorialmanager.com/pbiology

During resubmission, you will be invited to opt-in to posting your pre-review manuscript as a bioRxiv preprint. Visit http://journals.plos.org/plosbiology/s/preprints for full details. If you consent to posting your current manuscript as a preprint, please upload a single Preprint PDF when you re-submit.

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Given the disruptions resulting from the ongoing COVID-19 pandemic, please expect delays in the editorial process.
We apologise in advance for any inconvenience caused and will do our best to minimize impact as far as possible. Feel free to email us at plosbiology@plos.org if you have any queries relating to your submission.

Kind regards,

Roli Roberts

Roland G Roberts, PhD, Senior Editor
PLOS Biology

26 Mar 2021

Dear Dr Locher,

Thank you very much for submitting your manuscript "‘Nepotistic journals’: a survey of biomedical journals." for consideration as a Meta-Research Article at PLOS Biology. Your manuscript has been evaluated by the PLOS Biology editors, an Academic Editor with relevant expertise, and by three independent reviewers; in addition the Academic Editor has kindly provided some extra guidance that I have included in the foot of this email.

You'll see that the reviewers are broadly positive about your study, but each raises a number of concerns that must be addressed (especially rev #2, who has provided a very thorough set of comments). You should also attend to the additional comments provided by the Academic Editor (note that s/he considers one of the requests from reviewer #3 to be optional).

In light of the reviews (below), we will not be able to accept the current version of the manuscript, but we would welcome re-submission of a much-revised version that takes into account the reviewers' comments and those of the Academic Editor. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent for further evaluation by the reviewers.

We expect to receive your revised manuscript within 3 months. Please email us (plosbiology@plos.org) if you have any questions or concerns, or would like to request an extension. At this stage, your manuscript remains formally under active consideration at our journal; please notify us by email if you do not intend to submit a revision so that we may end consideration of the manuscript at PLOS Biology.
**IMPORTANT - SUBMITTING YOUR REVISION** Your revisions should address the specific points made by each reviewer. Please submit the following files along with your revised manuscript: 1. A 'Response to Reviewers' file - this should detail your responses to the editorial requests, present a point-by-point response to all of the reviewers' comments, and indicate the changes made to the manuscript. *NOTE: In your point by point response to the reviewers, please provide the full context of each review. Do not selectively quote paragraphs or sentences to reply to. The entire set of reviewer comments should be present in full and each specific point should be responded to individually, point by point. You should also cite any additional relevant literature that has been published since the original submission and mention any additional citations in your response. 2. In addition to a clean copy of the manuscript, please also upload a 'track-changes' version of your manuscript that specifies the edits made. This should be uploaded as a "Related" file type. *Re-submission Checklist* When you are ready to resubmit your revised manuscript, please refer to this re-submission checklist: https://plos.io/Biology_Checklist To submit a revised version of your manuscript, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' where you will find your submission record. Please make sure to read the following important policies and guidelines while preparing your revision: *Published Peer Review* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. 
Please see here for more details: https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/ *PLOS Data Policy* Please note that as a condition of publication PLOS' data policy (http://journals.plos.org/plosbiology/s/data-availability) requires that you make available all data used to draw the conclusions arrived at in your manuscript. If you have not already done so, you must include any data used in your manuscript either in appropriate repositories, within the body of the manuscript, or as supporting information (N.B. this includes any numerical values that were used to generate graphs, histograms etc.). For an example see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5 *Blot and Gel Data Policy* We require the original, uncropped and minimally adjusted images supporting all blot and gel results reported in an article's figures or Supporting Information files. We will require these files before a manuscript can be accepted so please prepare them now, if you have not already uploaded them. Please carefully read our guidelines for how to prepare and upload this data: https://journals.plos.org/plosbiology/s/figures#loc-blot-and-gel-reporting-requirements *Protocols deposition* To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols Thank you again for your submission to our journal. We hope that our editorial process has been constructive thus far, and we welcome your feedback at any time. 
Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Roli Roberts

Roland G Roberts, PhD, Senior Editor, rroberts@plos.org, PLOS Biology

REVIEWERS' COMMENTS:

Reviewer #1: The authors perform an analysis related to research integrity, in particular of journals and their editors. For this they use the percentage of papers by the most prolific author (really a red flag if too high) and the Gini index of contributions over all authors of the journal (a more refined indicator in this case). They further used the median lag time to publication, especially for the most prolific author (another red flag to signal journals with questionable behavior). As a bibliometrician I testify that the authors used the appropriate methods. Of course, the authors could have used another concentration index, but I see no reason why they should in this case. I have two minor observations: 1) I do not understand the meaning in Fig. 4 of "number of articles with any most prolific author". Do they mean "at least one"? 2) the box and whisker graphs pages 26/27 are extremely small. Maybe they can be included, in a larger version, as supplementary material.

Reviewer #2: [identify themselves as Ivan Oransky and Alison Abritis] Thank you for the opportunity to review this manuscript, which is important as it sounds the alarm on what appears to be very problematic behavior at some journals. We would like to make a number of suggestions for revision. As detailed in our specific comments, we recommend that the authors make it more clear that many of the terms and concepts described in the manuscript have only appeared in their previous non-refereed work, including a preprint. We should stress that we encourage citing such sources, but we also think it is important to be transparent about their status and origin.
We would also strongly recommend that the journal invite specialists in economics and statistics to review this paper, particularly the formulas used and subsequent findings. This material is outside of our expertise, but is critical to the paper's conclusions. Introduction: Paragraphs one and two of the Introduction appear to be a summary of, and borrow language from, the authors' preprint published in July 2020 (reference #1), with little else to support the subsequent hypothesis. It is also unclear from the preprint whether the authors limited their dataset to research articles, or included items such as editorials, which would skew figures for active editors. Several phrases were taken directly from the preprint without suitable citation. (e.g. "We suggest that (1) a constantly high proportion of papers published by a group of authors, (2) particularly in the presence of relationships between the editors and these authors, and (3) publication of low-quality research, are key characteristics of a new type of illegitimate publishing entity, i.e. "self-promotion journals", which deserve further investigation.") The authors' passage that "In the field of academic publishing, the term 'self-promotion journal' was coined" is a good example of one that should be more transparent about the fact that the term was coined by the authors elsewhere and has not yet, best we can tell, been taken up by others. The second paragraph of the introduction includes a second reference, to a blog post by a co-author from prior works, and should similarly have more transparent language attached. The authors state, referring to the post's findings: "Furthermore, many of these papers, with superficial or absent peer reviews, could be detected by the remarkably rapid turn-around, often within a week or less, between the dates recorded for submission and acceptance. 
This additional evidence of unethical editorial practice can only be obtained, however, in journals that report these dates for published manuscripts." We would recommend indicating how these reported dates were interpreted, because for some journals "acceptance date" means the date the paper was accepted *after* having gone through review. The manuscript may have been "accepted" but still required revisions which were done prior to publication. Similarly, a long time period between submission and acceptance does not indicate that a thorough, or even any, peer review was performed. And again, the nature of the publication (research, review, letter, editorial) was not clarified. Were comments related to a single study (e.g., "Authors Reply to Commenter, et al.") written by the same person considered separate articles? The only other reference in the Introduction, found in the third paragraph, is to a statistical method (Gini) used primarily in economics for studying inequities in resource/income distributions. As noted in our general remarks, we would recommend that the journal seek an reviewer with this kind of expertise. In the third paragraph, the authors state: "These convergent analyses from different fields suggest that the Percentage of Papers by the Most Prolific author (PPMP) is a simple measure that can be used as a red flag to identify journals that are suspected of biased editorial decision-making - what we now term 'nepotistic journals.'" This suggests that there is ample proof that a PPMP is a) "a simple measure" and b) "can be used as a red flag to identify journals that are suspected of biased editorial decision-making." We would recommend that the authors stress in their language that this is not yet a validated measure. (Consider too that journals starting up may have a disproportionately small author list, until well-known enough to attract a variety of new authors.) Methods: Data extraction: 1. 
"by the NLM for journals in the MEDLINE database" (emphasis ours): Did the authors differentiate between indexed in NLM, PubMed Central, and Medline? The confusion about the differences in terms NLM uses is of course not the fault of the authors, but some clarity would be helpful. 2. What did the authors do about journals having 2 or more Broad Subject Terms (BST). Is there unconsidered duplication within the single BST terms, or was the journal list just considered as a whole without separation per BST? 3. What are considered "published papers"? There are plenty of non-research papers indexed - including letters, editorials, and even tables of contents. What means (if any) of exclusion were applied for the bottom limit of 50 publications. This is especially important since the authors needed to revise their protocol to exclude "articles without an author name" - which would be quite an anomaly if concentrating on research and/or review papers. 4. How did the authors find all the articles per author? Was it through a NLM search via Pubmed or through the individual journal's separate table of contents? Not all journals list all articles within Pubmed - how was the comprehensiveness of the article check ensured? 4. "Publications reprinted in several journals (e.g., PRISMA statements published in 6 different journals to promote dissemination) did not receive special treatment, and no correction was applied, as each article was only examined in relation to its publication journal." How many of these cases occurred? Did the authors examine whether excluding these reprints affected any of the findings? It does not seem safe to assume that it would have no effect. Index Calculation: 1. What did the authors do to confirm that similar/same names were the same person, or that different names were really different people (i.e., referring primarily last name/first name swaps - those using somewhat different names are likely rare, although they do exist). 2. 
"When there was more than one author with the same largest number of published articles, they were all considered as the "most prolific" authors." If "most" is to be applied to more than one, perhaps a better term applies. 3. Use of the Gini index should be assessed by a reviewer familiar with its use and application. 4. "For the primary analysis, these two outcomes (PPMP and Gini) were computed for all papers, and for a sensitivity analysis they were computed only for papers labelled as 'journal articles' (using the NCBI publication type)." Please explain the difference between "all papers" and for "journal articles." Description of outlier journals 1.The term "outlier" could be misleading, as clearly this is referring to a specific group of journals created by specific measured behaviors rated along a particular scale, and "outliers" typically refer to statistical anomalies not generally expected to be part of a group. 2. We will defer to the statisticians in evaluating the true randomness of the sample. However, first alphabetizing journal names (with or without "The" in the title?), then selecting a seed set of 42 suggests "pseudo-randomization," not true randomization. Consider also that journals may avoid names/titles starting with letters towards the end of the alphabet and it becomes even less truly random. Data analysis 1. See notes elsewhere about obtaining a review from someone with appropriate expertise in statistics or economics. Protocol Changes 1. "because some articles unexpectedly lacked author names," again requires clarification as to the types of publications included as "article." 2. "The 3-week threshold used to describe publication lag as being suggestive of unduly rapid or absent peer review was not initially specified in our protocol, and was added for descriptive purposes." What is the basis of using 3 weeks as an arbitrary delineator? 
Is it just an assumption, or is there literature to show a relationship between 3 weeks and the quality of peer review? If so, include a reference. If not, say so and provide a justification for the time limit.

Results: Journal Selection and Description

1. With such a broad range of numbers of articles published, might there not be a masking effect in journals with higher rates of publishing - causing lower-publishing-rate journals to show a greater number of PPMPs? Would it not have been more effective to break the journals into groups based on rates of publishing?

Description of the Indices

1. Statistics - defer to a specialist for review.

2. Slightly more than half the sample did not provide submission/acceptance rates. This suggests it is problematic to draw conclusions about "publication lag" calculations and comparisons.

Description of outliers and identification of nepotistic journals

1. "206 based on the PPMP and the Gini index considered separately, and 68 based on both indices." If the PPMP and the Gini index each identified 206 different journals, and only agreed on 68, can the authors comment on the significant difference?

2. "The main characteristics of the 100 randomly selected outlier journals are presented" A statistician should confirm random vs pseudo-random. Re: the discussion of journals "reported in English." This is somewhat misleading since by nature the NLM is skewed towards English language journals.

3. "We identified errors in author identification for one journal, MMW-Fortschritte der Medizin (an outlier on the Gini index), where the most prolific 'author' was named 'Red', which seems to be a diminutive for 'Redaktion', possibly encompassing several physical individuals." Not sure how this would be considered "an error"? "Redaktion" appears to be German for "Editorial Staff." So these would more appropriately belong to the "unnamed" articles?

4. "When 'Red' was ruled out as a valid author name, the next most prolific author was however a member of the editorial board,…" For a journal with a highly active editorial board as demonstrated by the number of "Red" articles, would this be unusual or unseemly? Might it be possible that MMW-Fortschritte der Medizin has its own publishing emphasis?

5. How did the authors confirm that the same-named authors were indeed the same person? Did they cross-check affiliations, or just assume the same name in the same field would be the same person?

Discussion:

"In this comprehensive survey of 5 468 biomedical journals, we characterized several features of editor-author relationships among which were the following: (i) article output was sometimes dominated by the prolific contribution of one author or a group of authors, (ii) time lags to publication were in some instances shorter for these prolific authors and (iii) prolific authors were typically members of the journal's editorial board." Criteria for "sometimes" and "typically" are unclear. Additionally, as pointed out previously, the publication lag data was for less than half the journals considered in the sample, but was compared against figures for all the journals.

"We concluded that defining the top 5% nepotistic journals required the threshold to be set at up to 10.6% of articles published by the most prolific author." "for the purposes of this study" should be added. This study is insufficient for a general standard.

"In absolute terms, we believe it is reasonable to question the judgement of an editor where more than 10% of the published papers are authored by the same person." This may be reasonable, but without at least a validated spot-check of some of these papers, this statement appears to go well beyond what can be concluded from the findings. 
We note that a preprint of this manuscript has been the subject of a news story in Science https://www.sciencemag.org/news/2021/02/journals-singled-out-favoritism that mentions the Didier Raoult oeuvre. Is there material there that could be cited?

"This suggests that a broader time-frame allowed for more occasional authors to be recruited while maintaining the regular authors, revealing the latent heterogeneity." Or revealed a beginning journal struggling to keep afloat long enough to attract a variety of authors and allow the "regular authors" to fade out to other publishing duties. Association is of course not causation.

"If a journal publishes a very large number of papers, it becomes increasingly implausible that a single prolific author could account for 10% or more of them, as there is a natural upper limit to how many papers any one individual can author." While this is not an unreasonable conclusion to draw, it should be noted that senior researchers who oversee multiple projects may end up as authors on many papers, which naturally ups their publication count.

"Our findings persisted when all articles were considered as well as when only 'journal articles' were considered (i.e. excluding articles explicitly referenced as editorial, correspondence or news articles), suggesting that editorials, correspondence and news, are not the only drivers of the indicators we explored." Did the authors confirm that the NLM labeling of type was accurate? PubMed relies on metadata from publishers, which has been shown to be incomplete or error-laden in a not insignificant number of cases.

"We should beware of assuming that a hyper-prolific author is necessarily engaged in questionable publishing practices: some people are naturally highly productive, and the speed with which good research can be completed is highly variable across research fields." Is a "hyper-prolific author" different to a PPMP? 
Here, those authors are given a pass, but a few paragraphs prior these authors cast aspersions on the editors for allowing such prolific behavior.

"It is also very important to complement the PPMP and the Gini index with the absolute numbers of papers authored by the most prolific authors, because some problematic journal behaviours could pass unnoticed when only these two indices are used." This suggests that the PPMP is not a "simple measure" as described elsewhere in the manuscript.

Limitations:

1. "but it may miss some finer points, especially concerning the quality of articles published in these journals." This is a significant limitation, as noted above, and deserves mention in the same breath as other discussions of quality in the manuscript.

2. "Some journals are registered without broad terms, requiring a manual pick-up by the NLM." This is unclear. If the authors are referring to Medline and article indexings - the Publisher applies to Medline. No BSTs are required for the application.

3. "Similarly, because we restricted our search to journals publishing a minimum of 50 papers in the 2015-2019 period, we may have missed smaller journals with less professional editorial practices." Why is the assumption that smaller journals have less professional editorial practices? If this is referring to editorial staff, which is not the same as "editorial practices," that should be made clear.

4. "Importantly, our automated calculations carry a risk of inaccuracy as a result of homonymy…. Our analysis of the random sample of outliers enabled a disambiguation procedure consisting in inspecting qualitatively the most prolific authors. Only 1 out of 108 "most prolific" authors within a given journal was considered to be at high risk of homonymy." What was the disambiguation procedure? If only looking at the ones already considered to be prolific, how does this method ensure that others may miss such a determination due to slight differences in the names used? 
(For example, Erin Nicole Potts-Kant has published as Erin Potts, Erin Kant, Erin N Potts-Kant, etc. Others have swapped first, middle and last names, with minor spelling changes, depending upon the journal.)

"Among these 108 authors, this procedure also enabled identification of the 6 most prolific authors, who were professional journalists for whom high productivity is of course not an indicator of any academic misconduct [emphasis ours], as they are professionals paid by the journal and not academics. The two proposed indicators, and their current calculation, should therefore not be used indiscriminately but could rather serve as a screening tool for potentially problematic journals that may then require careful exploration of their editorial practices." This also suggests the PPMP is not a "simple measure," and casts doubt on the earlier statement that "In absolute terms, we believe it is reasonable to question the judgement of an editor where more than 10% of the published papers are authored by the same person."

"The proposed indices could add transparency in the editorial decision-making and peer review process of any journal." We agree, provided the nuance and limitations are clearly reflected.

Reviewer #3: The paper proposes and analyzes two interesting metrics for detecting possibly nepotistic practices in journals. Both the PPMP and the Gini index are easy to compute and to understand, and they make it possible to identify outlier journals and authors that can then be double-checked for editorial misconduct. The experimental design, data and code are available in OSF. The analysis is rather descriptive and it would be great to see some statistical models in place (e.g., when characterizing the profile of most prolific authors and of outlier journals). Although I understand the pragmatic decision of restricting the analysis of outlier journals to 100, I think the work should better cover all 480 journals. 
The authors should more explicitly describe the actual contribution of some co-authors of this study.

COMMENTS FROM THE ACADEMIC EDITOR: The reports from the reviewers cover a lot of terrain in much detail. On top of this, there are a couple of points of a more general nature that I would like to give back as feedback that you might choose to use in a next version:

// take the reader by the hand with a bit more explanation of the somewhat more complex or less widely known concepts. The figures are a good example - add perhaps 1-2 more sentences on what can be learned from the different figures, and how they relate to each other. Another example is that some concepts are used, yet only explained in detail in the methods. That is relatively late due to the layout format of PLOS Biol. This means that some aspects of the methods have to be introduced with a couple of extra words in the intro/results or discussion section (e.g. Gini index, "journal articles").

// journal age is only touched upon lightly in the narrative - I would like to point out from personal experience with several new journals that when a new journal starts, there is active solicitation of papers from EiC to EB members to have enough copy in the starting up phase - What can be said about this phenomenon in relation to these indices?

// with regard to the suggestion of reviewer 3 to analyse a larger set of outlier journals I think it is important to note that while adding more observations will indeed increase precision, such a change will not affect the future decision of PLOS Biol to publish or not publish.

// The openness on the used method (registered analysis, explicit mention of deviations in planned analysis) is great to see. If you decide to add to or change the analysis, feel free to note that these changes came through peer review in order to keep your level of "open reporting on methods used" intact.

NB the level of detail on the selection procedure for outlier selection is recommendable. 
The seed number chosen is as obvious as right.

27 Jul 2021

5 Oct 2021

Dear Clara, Thank you for submitting your revised Meta-Research Article entitled "‘Nepotistic journals’: a survey of biomedical journals." for publication in PLOS Biology. I've now obtained advice from one of the original reviewers and have discussed their comments with the Academic Editor. In addition, as advised in one of my emails, both this reviewer (reviewer #2) and the Academic Editor felt that the manuscript would benefit from further assessment of the specialised statistical methods used (new reviewer #4); please accept my apologies for the additional delay caused by this and some additional communication problems over the summer months.

Based on the reviews, we will probably accept this manuscript for publication, provided you satisfactorily address the remaining minor point raised by reviewer #4 and the following:

a) Please could you choose a more explicit and appealing title? We suggest the following ideas (assuming these are supported by your findings): "Editorial bias and nepotistic behaviour detected in 10% of biomedical journals" or "Nepotistic behaviour detected in 10% of biomedical journals" or "A survey of biomedical journals to detect editorial bias and nepotistic behavior." I should also say that we like to avoid punctuation in titles.

b) Please could you supply a blurb, as instructed in the submission form?

c) Please attend to the request from reviewer #4. Keep the Materials and Methods section in its current place, but address the reviewer's concern by adding a description of the Gini index earlier in the manuscript in order to help the reader appreciate the implications of its use more fully.

d) We note that your financial statement currently says “The author(s) received no specific funding for this work" - please can you confirm that this is correct? 
e) We note that your supplementary Figures currently have rather confusing names. Please re-number them Figs S1, S2, S3, etc., provide legends for them, and cite them in the text.

f) We believe that the OSF deposition contains data and scripts sufficient to re-create all of the main and supplementary Figures. Can you confirm that this is the case? Please cite the location of the data clearly in each relevant main and supplementary Figure legend, e.g. "The data underlying this Figure may be found in https://osf.io/6e3uf/"

g) Please re-write the Abstract in the more verbose format that is standard for our journal. I also notice that something has gone wrong with the second sentence of the Abstract.

As you address these items, please take this last chance to review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the cover letter that accompanies your revised manuscript.

We expect to receive your revised manuscript within two weeks. To submit your revision, please go to https://www.editorialmanager.com/pbiology/ and log in as an Author. Click the link labelled 'Submissions Needing Revision' to find your submission record. Your revised submission must include the following:

- a cover letter that should detail your responses to any editorial requests, if applicable, and whether changes have been made to the reference list
- a Response to Reviewers file that provides a detailed response to the reviewers' comments (if applicable)
- a track-changes file indicating any changes that you have made to the manuscript.

NOTE: If Supporting Information files are included with your article, note that these are not copyedited and will be published as they are submitted. 
Please ensure that these files are legible and of high quality (at least 300 dpi) in an easily accessible file format. For this reason, please be aware that any references listed in an SI file will not be indexed. For more information, see our Supporting Information guidelines: https://journals.plos.org/plosbiology/s/supporting-information *Published Peer Review History* Please note that you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. Please see here for more details: https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for-published-peer-review/ *Early Version* Please note that an uncorrected proof of your manuscript will be published online ahead of the final version, unless you opted out when submitting your manuscript. If, for any reason, you do not want an earlier version of your manuscript published online, uncheck the box. Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us as soon as possible if you or your institution is planning to press release the article. *Protocols deposition* To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols Please do not hesitate to contact me should you have any questions. 
Best wishes, Roli

Roland G Roberts, PhD, Senior Editor, rroberts@plos.org, PLOS Biology

------------------------------------------------------------------------

DATA NOT SHOWN? - Please note that per journal policy, we do not allow the mention of "data not shown", "personal communication", "manuscript in preparation" or other references to data that is not publicly available or contained within this manuscript. Please either remove mention of these data or provide figures presenting the results and the data underlying the figure(s).

------------------------------------------------------------------------

REVIEWERS' COMMENTS:

Reviewer #2: [identify themselves as Ivan Oransky and Alison Abritis] We appreciate how seriously the authors have taken the suggestions of all of the reviewers, which indicates a high level of integrity in the work. The paper is much improved, and we are happy to recommend acceptance. We would still recommend that the editors have a statistical expert review the methods, but we leave that to their discretion.

Reviewer #4: Let me start by stating that in scientometrics/bibliometrics, the Gini-index is used more frequently, as a measure of concentration. I have personally used this as a measure of concentration with respect to field orientation, in other words, to what extent is a unit more mono- or interdisciplinary, and confront and correlate that with MRC research grant peer review assessments. With respect to this paper, a number of things strike me, in the first place the order of the sections in the manuscript. I would expect the Methods section before the Results section, which would mean that one would read about the used measures, among which the Gini-index, before reading results. This is unfortunate, as one now has to accept or assume the meaning of the Gini-index, and how it works, and what it does. 
In the Methods section I think the presentation of the Gini-index is not sufficient, reading the section I did not get a clear view on how this index works, how it is calculated, how it is applied in the dataset that is collected, and also not how this affects the outcomes. I think the authors should add such a paragraph to the manuscript, as many readers of the paper will probably not be very familiar with the index, so a further explanation helps better understand the outcomes of the study. One would then also create a better understanding of how the Gini-index relates to the PPMP-indicator presented in the study, as well as the way these two measures are correlated, and how to interpret their correlation. I liked the study and the paper, as it deals with an important research integrity issue, namely that of authorship, and everything that can go wrong there, which often remains invisible among all questionable research practices.

19 Oct 2021

20 Oct 2021

Dear Clara, On behalf of my colleagues and the Academic Editor, Bob Siegerink, I'm pleased to say that we can in principle offer to publish your Meta-Research Article "A survey of biomedical journals to detect editorial bias and nepotistic behavior" in PLOS Biology, provided you address any remaining formatting and reporting issues. These will be detailed in an email that will follow this letter and that you will usually receive within 2-3 business days, during which time no action is required from you. Please note that we will not be able to formally accept your manuscript and schedule it for publication until you have made the required changes.

Please take a minute to log into Editorial Manager at http://www.editorialmanager.com/pbiology/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process.

PRESS: We frequently collaborate with press offices. 
If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with biologypress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf. We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/. Thank you again for choosing PLOS Biology for publication and supporting Open Access publishing. We look forward to publishing your study. Best wishes, Roli Roland G Roberts, PhD Senior Editor PLOS Biology rroberts@plos.org

Table 1

Main characteristics of all journals in the United States NLM catalog having at least one Broad Subject Term and having published at least 50 authored articles between 2015 and 2019.

		All articles	Research Articles
Number of journals with ≥50 authored articles		5,468	5,341
Number of articles
	Median [IQR]	500 [262–964]	426 [232–798]
	Range	50–108,990	50–103,647
Number of articles with an author
	Median [IQR]	494 [257–952]	425 [232–795]
	Range	50–107,342	50–103,647
PPMP (%)
	Median [IQR]	2.88 [1.71–4.91]	2.56 [1.61–4.21]
	Range	0.1–39.9	0.126–44.8
PPMP 95th percentile (%)		10.6	8.77
Number of articles by MPA
	Median [IQR]	14 [8–25]	11 [7–18]
	Range	1–767	1–252
Tied as MPA		1,022 (19%)	1,271 (24%)
Gini index
	Median [IQR]	0.183 [0.131–0.246]	0.161 [0.113–0.219]
	Range	0.00–0.740	0.00–0.713
Gini index 95th percentile		0.355	0.324
Median of publication lag ratio (MPA/no MPA)
	Median [IQR]	0.829 [0.606–1.02]	0.879 [0.697–1.04]
	Range	0.00–26.8	0.0–754
	Not calculable	2,743 (50%)	2,551 (48%)

NLM, National Library of Medicine; MPA, Most Prolific Author; PPMP, Percentage of Papers by the Most Prolific author.
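
Several reviewer comments ask how the two indices in Table 1 are actually calculated. The sketch below is one plausible way to compute them for a single journal, not the study's own code. It assumes each article is represented as a list of author-name strings, that articles without any author are excluded from the denominator, and that the Gini index is the standard Gini coefficient applied to per-author article counts. The function names (`author_counts`, `ppmp`, `gini`) and the toy data are hypothetical, and no name disambiguation is attempted.

```python
from collections import Counter


def author_counts(articles):
    """Count articles per author; each article is a list of author names."""
    counts = Counter()
    for authors in articles:
        # Assumption: an author is counted once per article, even if listed twice.
        counts.update(set(authors))
    return counts


def ppmp(articles):
    """Percentage of authored articles signed by the most prolific author."""
    authored = [a for a in articles if a]  # drop articles with no author
    if not authored:
        return 0.0
    counts = author_counts(authored)
    # Tied most prolific authors share the same maximum, so the value is unchanged.
    return 100.0 * max(counts.values()) / len(authored)


def gini(values):
    """Standard Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with ranks i = 1..n over the ascending-sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical toy journal: five records, one of them without an author.
toy_journal = [["A", "B"], ["A"], ["A", "C"], ["D"], []]
print(ppmp(toy_journal))                          # share of papers by author "A"
print(gini(author_counts(toy_journal).values()))  # inequality of authorship counts
```

In this toy journal, author "A" signs 3 of the 4 authored articles, giving a PPMP of 75%, and the per-author counts (3, 1, 1, 1) give a Gini index of 0.25. A journal in which every author published the same number of papers would have a Gini index of 0, which is why the two indices can flag different journals.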
