
Citation analytics: Data exploration and comparative analyses of CiteScores of Open Access and Subscription-Based publications indexed in Scopus (2014-2016).

Aderemi A Atayero, Segun I Popoola, Jesse Egeonu, Olumuyiwa Oludayo.

Abstract

Citation counts are among the important metrics used to measure the relevance and impact of research publications. Citation analytics can be used to understand the gains of publishing scholarly peer-reviewed research outputs in either Open Access (OA) sources or Subscription-Based (SB) sources in a bid to increase citation impact. However, the data required for such comparative analysis must be freely accessible to support evidence-based findings and conclusions. In this data article, citation scores (CiteScores) of 2542 OA sources and 15,040 SB sources indexed in Scopus from 2014 to 2016 are presented and analyzed based on a set of five inclusion criteria. A dataset containing the CiteScores of the included OA and SB publication sources is attached as supplementary material to this data article to facilitate further reuse. Descriptive statistics and frequency distributions of OA CiteScores and SB CiteScores are presented in tables. Boxplot representations and scatter plots are provided to show the statistical distributions of OA CiteScores and SB CiteScores across the three sub-categories (Book Series, Journal, and Trade Journal). Correlation coefficient and p-value matrices are made available within the data article. In addition, Probability Density Functions (PDFs) and Cumulative Distribution Functions (CDFs) of OA CiteScores and SB CiteScores are computed, and the results are presented using tables and graphs. Furthermore, Analysis of Variance (ANOVA) and multiple comparison post-hoc tests are conducted to understand the statistical difference (and its significance, if any) in the citation impact of OA publication sources and SB publication sources based on CiteScore. In the long run, the data provided in this article will help policy makers and researchers in Higher Education Institutions (HEIs) to identify the appropriate publication source type and category for dissemination of scholarly research findings with maximum citation impact.

Keywords:  Analytics; Citation analytics; Citation impact; CiteScore; Data mining; Open Access; Smart campus

Year:  2018        PMID: 29892634      PMCID: PMC5993002          DOI: 10.1016/j.dib.2018.05.005

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

The dataset generated and made publicly available based on the stipulated criteria will help foster further investigation into the importance of Elsevier CiteScore and other source ranking methods [2], [3], [4].

Presenting this data in open access format will help researchers identify relevant sources as veritable outlets for dissemination of their research findings [5], [6].

Quite a lot of research findings often end up in subscription-only sources. This invariably limits access to such works and reduces their impact on future research significantly. This shortfall is mitigated by isolating and analyzing the OA sources of the largest global indexing body for scientific research [7], [8], [9].

Descriptive statistics, frequency distributions, one-way ANOVA and multiple comparison post-hoc tests that are presented in tables, plots, and graphs will make data interpretation much easier for useful insights, inferences, and logical conclusions [10], [11], [12], [13].

Detailed datasets that are made publicly available in a Microsoft Excel spreadsheet file attached to this article will encourage further explorative studies in this field of research.

Data

Analytics seeks to discover, interpret, and effectively communicate patterns in any given dataset. These attributes explain why analytics is becoming pervasive across various disciplines, including the ranking of Higher Education Institutions (HEIs). A very high premium is placed on scholarly research output, as evidenced by publication in relevant sources, as a proxy measure of excellence in the ranking of HEIs. Scopus by Elsevier is currently the world's largest abstract and citation database of peer-reviewed literature, with over 70 million records. CiteScore™, a measure of the average citations received per document published in a serial, is one of the three major indices used by Scopus to rank publication sources [14], [15], [16]. In this source ranking method, a higher value is better. The metric is comprehensive, transparent, and freely available for current sources indexed in Scopus.

Citation analytics can be used to understand the gains of publishing scholarly peer-reviewed research outputs in either Open Access (OA) sources or Subscription-Based (SB) sources in a bid to increase citation impact. However, the data required for such comparative analysis must be freely accessible to support evidence-based findings and conclusions. In this data article, citation scores (CiteScores) of 2542 OA sources and 15,040 SB sources indexed in Scopus from 2014 to 2016 are presented and analyzed based on a set of five inclusion criteria. Two publication source types (OA and SB) are considered, and both cover three sub-categories: Book Series, Journal, and Trade Journal. Precise information about the distribution of the CiteScore data across the source types and sub-categories is presented in Table 1. Under the OA source type, 5 Book Series sources, 2536 Journal sources, and 1 Trade Journal source met the inclusion criteria. On the other hand, 378 Book Series sources, 14,448 Journal sources, and 214 Trade Journal sources were included under the SB source type based on the same inclusion criteria. It is becoming increasingly popular for subscription-based source providers to grant authors the right to make their articles open for a fee. This practice is sometimes referred to as the hybrid model. However, the hybrid model is a subset of the subscription-based model. Hence, in this data article, the hybrid model is captured entirely under the SB category.
Table 1

Classification of scholarly research output publications.

Sub-category     Open Access (OA)   Subscription (SB)   Total
Book Series      5                  378                 383
Journal          2536               14,448              16,984
Trade Journal    1                  214                 215
Total            2542               15,040

Experimental design, materials and methods

In this data article, CiteScores of 2542 OA sources and 15,040 SB sources indexed in Scopus from 2014 to 2016 are presented and analyzed. The methodology for calculating the CiteScore metric is straightforward, as represented by Eqs. (1), (2). The methodology is further explained and illustrated in Fig. 6. CiteScore for year N (CiteScore N) is the sum of the citations received in year N by documents published in years N-1, N-2, and N-3, divided by the number of documents published in those three consecutive years.
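Written out, the verbal definition above amounts to the following expression. This is a sketch based only on that definition; the symbols C and D are notation assumed here for illustration and are not taken from the article's own equations.

```latex
\mathrm{CiteScore}_{N}
  = \frac{C_{N}(N-1) + C_{N}(N-2) + C_{N}(N-3)}
         {D_{N-1} + D_{N-2} + D_{N-3}}
```

Here C_N(Y) denotes the citations received in year N by documents published in year Y, and D_Y denotes the number of documents published in year Y. For N = 2016, the publication years in both the numerator and the denominator are 2013, 2014, and 2015.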
Fig. 6

Boxplot representation of CiteScore data of Journal sources in 2016.

According to Scopus, the 3-year CiteScore time window was chosen as the best fit for all subject areas. Research shows that a 3-year publication window is long enough to capture the citation peak of the majority of disciplines. A set of five inclusion criteria was established, namely: the publication source must be indexed in the Scopus database; the publication source must be active as at 28th December 2017; the publication must be written in English; the publication source type must be Book Series, Journal, or Trade Journal; and the publication source must have CiteScores in 2014, 2015, and 2016. The source identification numbers were anonymized using the format OA##### for OA publication sources and SB##### for SB publication sources, where # is an integer. Hence, the sequential Publication ID is OA00001 through OA2542 for OA publication sources, and SB00001 through SB15040 for SB publication sources. The descriptive statistics of the CiteScores of OA and SB scholarly research output sources for the three-year period are presented in Table 2. In order to show the central tendency and spread of the CiteScore data, boxplots are drawn for each publication source type. The boxplot representations of CiteScore data of Book Series, Journal, and Trade Journal sources for 2014, 2015, and 2016 are shown in Figs. 1–9.
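The five inclusion criteria and the anonymized ID format described above can be sketched as a simple filter. This is an illustrative sketch only: the `Source` record, its field names, and the sample entries are hypothetical and do not reflect the layout of the Scopus source list or of the supplementary spreadsheet.

```python
from dataclasses import dataclass

ALLOWED_TYPES = {"Book Series", "Journal", "Trade Journal"}
REQUIRED_YEARS = (2014, 2015, 2016)


@dataclass
class Source:
    # Hypothetical record layout, assumed for illustration.
    title: str
    indexed_in_scopus: bool
    active: bool            # status as at 28 December 2017
    language: str
    source_type: str
    open_access: bool
    citescores: dict        # year -> CiteScore


def meets_criteria(s):
    """Return True only if all five inclusion criteria hold."""
    return (
        s.indexed_in_scopus
        and s.active
        and s.language == "English"
        and s.source_type in ALLOWED_TYPES
        and all(year in s.citescores for year in REQUIRED_YEARS)
    )


def anonymize(sources):
    """Map sequential IDs (OA##### / SB#####) to the included sources."""
    counters = {"OA": 0, "SB": 0}
    result = {}
    for s in sources:
        if not meets_criteria(s):
            continue
        prefix = "OA" if s.open_access else "SB"
        counters[prefix] += 1
        result[f"{prefix}{counters[prefix]:05d}"] = s
    return result


full = {2014: 1.0, 2015: 1.1, 2016: 1.2}
sample = [
    Source("A", True, True, "English", "Journal", True, full),
    Source("B", True, True, "French", "Journal", False, full),       # not English
    Source("C", True, True, "English", "Book Series", False, full),
    Source("D", True, True, "English", "Journal", True, {2014: 0.5}),  # missing years
]
ids = list(anonymize(sample))
print(ids)
```

Sources B and D fail one criterion each, so only A and C receive IDs.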
Table 2

Descriptive statistics of CiteScore data of scholarly research outputs (2014–2016). OA: Open Access; SB: Subscription.

Statistic            2014 OA   2014 SB   2015 OA   2015 SB   2016 OA   2016 SB
Mean                 1.22      1.42      1.32      1.47      1.37      1.50
Median               0.78      0.85      0.82      0.92      0.92      0.94
Mode                 0.00      0.00      0.00      0.00      0.12      0.00
Standard Deviation   1.41      2.13      1.51      2.09      1.49      2.14
Variance             1.98      4.55      2.29      4.38      2.23      4.58
Kurtosis             31.72     256.26    39.57     127.03    23.11     240.01
Skewness             3.77      9.84      4.10      7.51      3.31      9.41
Range                21.11     89.91     25.19     66.45     18.29     89.23
Minimum              0.00      0.00      0.00      0.00      0.00      0.00
Maximum              21.11     89.91     25.19     66.45     18.29     89.23
Total Samples        2542      15,040    2542      15,040    2542      15,040
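The quantities in Table 2 can be computed for any list of CiteScores along the following lines. This is a minimal sketch using only the Python standard library; the article's computations were done in MATLAB, and the conventions chosen here (sample variance with n − 1, moment-based skewness, and non-excess kurtosis) are assumptions that may differ from those used by the authors. The input values are made up for illustration.

```python
import statistics


def describe(scores):
    """Descriptive statistics for one column of CiteScores."""
    n = len(scores)
    mean = statistics.fmean(scores)
    variance = statistics.variance(scores)      # sample variance (n - 1)
    # Population central moments, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "mean": mean,
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "std": variance ** 0.5,
        "variance": variance,
        "kurtosis": m4 / m2 ** 2,               # non-excess (normal = 3)
        "skewness": m3 / m2 ** 1.5,
        "range": max(scores) - min(scores),
        "min": min(scores),
        "max": max(scores),
        "n": n,
    }


stats = describe([0.0, 0.0, 1.0, 2.0, 7.0])     # illustrative values only
for name, value in stats.items():
    print(name, value)
```

A long right tail, as in the toy input, gives positive skewness, which is the same pattern Table 2 reports for both source types.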
Fig. 1

Boxplot representation of CiteScore data of Book Series sources in 2014.

Fig. 2

Boxplot representation of CiteScore data of Book Series sources in 2015.

Fig. 3

Boxplot representation of CiteScore data of Book Series sources in 2016.

Fig. 4

Boxplot representation of CiteScore data of Journal sources in 2014.

Fig. 5

Boxplot representation of CiteScore data of Journal sources in 2015.

Fig. 7

Boxplot representation of CiteScore data of Trade Journal sources in 2014.

Fig. 8

Boxplot representation of CiteScore data of Trade Journal sources in 2015.

Fig. 9

Boxplot representation of CiteScore data of Trade Journal sources in 2016.

Figs. 10–12 show the trends in the CiteScores of OA and SB publication sources in the sub-categories of Book Series, Journal, and Trade Journal, respectively, between 2014 and 2016. Probability Density Functions (PDFs) and Cumulative Distribution Functions (CDFs) of the dataset are also computed. Normal, Exponential, and non-parametric PDF and CDF models were fitted to the OA and SB CiteScore data, and the results are shown in Figs. 13–16. Distribution fitting parameters for OA CiteScore data, and their estimates and standard errors, are presented in Tables 3 and 4 respectively. In like manner, the distribution fitting parameters for SB CiteScore data, and their estimates and standard errors, are presented in Tables 5 and 6 respectively.
Fig. 10

Scatter plot of (a) OA (b) SB Book Series CiteScore data (2014–2016).

Fig. 11

Scatter plot of (a) OA (b) SB Journal CiteScore data (2014–2016).

Fig. 12

Scatter plot of (a) OA (b) SB Trade Journal CiteScore data (2014–2016).

Fig. 13

Probability density function plot of OA publications.

Fig. 14

Cumulative distribution function plot of OA publications.

Fig. 15

Probability density function plot of SB publications.

Fig. 16

Cumulative distribution function plot of SB publications.

Table 3

Distribution fitting parameters for OA CiteScore data (2014–2016).

                 Normal        Exponential
Log Likelihood   −13770.7      −9634.67
Domain           −∞ < y < ∞    0 < y < ∞
Mean             1.3013        1.3013
Variance         1.4724        1.6935
Table 4

Estimates and standard errors for OA CiteScore data distribution (2014–2016).

Parameter   Normal (Approx)   Normal (Std Err)   Exponential (Approx)   Exponential (Std Err)
µ           1.3013            0.0169             1.3013                 0.0149
σ           1.4724            0.0119
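The kind of fit summarized in Tables 3 and 4 can be sketched with textbook maximum-likelihood estimates. This is a minimal illustration in Python rather than the MATLAB toolbox used in the article; the large-sample standard-error formulas (σ/√n and σ/√(2n) for the Normal fit, µ/√n for the Exponential fit) are assumptions about what the toolbox reports, and the input data are made up.

```python
import math


def fit_normal(data):
    """Normal fit: mean, sample SD, approximate standard errors, log-likelihood."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))
    loglik = sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )
    return {
        "mu": mu,
        "sigma": sigma,
        "se_mu": sigma / math.sqrt(n),
        "se_sigma": sigma / math.sqrt(2 * n),
        "loglik": loglik,
    }


def fit_exponential(data):
    """Exponential fit parameterized by its mean (requires a positive mean)."""
    n = len(data)
    mu = sum(data) / n
    # Sum of log(1/mu) - x/mu over the data simplifies to -n * (log(mu) + 1).
    loglik = -n * (math.log(mu) + 1)
    return {"mu": mu, "se_mu": mu / math.sqrt(n), "loglik": loglik}


toy = [1.0, 2.0, 3.0]                  # illustrative values only
normal_fit = fit_normal(toy)
exp_fit = fit_exponential(toy)
print(normal_fit)
print(exp_fit)
```

As a rough consistency check, inserting the reported estimates (µ = 1.3013, σ = 1.4724) and a pooled sample size of n = 7626 (an assumption: three yearly values for each of the 2542 OA sources) into these standard-error formulas gives values that round to the 0.0169, 0.0119, and 0.0149 shown in Table 4.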
Table 5

Distribution fitting parameters for SB CiteScore data (2014–2016).

                 Normal        Exponential
Log Likelihood   −13770.7      −9634.67
Domain           −∞ < y < ∞    0 < y < ∞
Mean             1.3013        1.3013
Variance         1.4724        1.6935
Table 6

Estimates and standard errors for OA CiteScore data distribution (2014–2016).

Parameter   Normal (Approx)   Normal (Std Err)   Exponential (Approx)   Exponential (Std Err)
µ           1.3013            0.0169             1.3013                 0.0149
σ           1.4724            0.0119
Furthermore, correlation analyses are performed to establish a linear relationship between the OA CiteScores and the SB CiteScores. The correlation coefficient matrices and their corresponding p-values are presented in Tables 7–12. Analysis of Variance (ANOVA) and multiple comparison post-hoc tests are conducted to understand the statistical difference (and its significance, if any) in the citation impact of OA publication sources and SB publication sources based on CiteScore. The results of the ANOVA test and the multiple comparison post-hoc test are presented in Tables 13 and 14. The mean CiteScores of the six groups (Open Access Book Series, Open Access Journal, Open Access Trade Journal, Subscription Book Series, Subscription Journal, and Subscription Trade Journal) are shown in Figs. 17 and 18 to aid comparative analyses.
Table 7

Correlation coefficient matrix of Book Series CiteScore data (2014–2016).

Open Access Book Series
        2014      2015      2016
2014    1
2015    0.9566    1
2016    −0.0216   0.2624    1

Subscription Book Series
        2014      2015      2016
2014    1
2015    0.9828    1
2016    0.9696    0.9820    1
Table 8

P-value matrix of Book Series CiteScore data (2014–2016).

Open Access Book Series
        2014      2015      2016
2014    1
2015    0.0108    1
2016    0.9725    0.6698    1

Subscription Book Series
        2014      2015      2016
2014    1
2015    0.0000    1
2016    0.0000    0.0000    1
Table 9

Correlation coefficient matrix of Journal CiteScore data (2014–2016).

Open Access Journal
        2014      2015      2016
2014    1
2015    0.9549    1
2016    0.8986    0.9480    1

Subscription Journal
        2014      2015      2016
2014    1
2015    0.9780    1
2016    0.9668    0.9783    1
Table 10

P-value matrix of Journal CiteScore data (2014–2016).

Open Access Journal
        2014      2015      2016
2014    1
2015    0.0000    1
2016    0.0000    0.0000    1

Subscription Journal
        2014      2015      2016
2014    1
2015    0.0000    1
2016    0.0000    0.0000    1
Table 11

Correlation coefficient matrix of Trade Journal CiteScore data (2014–2016).

Open Access Trade Journal
        2014      2015      2016
2014    1
2015    1.0000    1
2016    1.0000    1.0000    1

Subscription Trade Journal
        2014      2015      2016
2014    1         0.9614    0.9320
2015    0.9614    1         0.9405
2016    0.9320    0.9405    1
Table 12

P-value matrix of Trade Journal CiteScore data (2014–2016).

Open Access Trade Journal
        2014      2015      2016
2014    1
2015    1.0000    1
2016    1.0000    1.0000    1

Subscription Trade Journal
        2014      2015      2016
2014    1         0.0000    0.0000
2015    0.0000    1         0.0000
2016    0.0000    0.0000    1
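Each entry in the correlation matrices above is a Pearson coefficient between the CiteScores of the same sources in two different years. A minimal sketch of that computation follows, together with the t statistic on which the usual significance test is based. The p-value itself needs the Student t distribution, which the Python standard library does not provide, so it is left out here. The input values are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation; needs at least two points and non-zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, on n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


scores_a = [1.0, 2.0, 3.0, 4.0]        # illustrative values only
scores_b = [2.0, 4.0, 6.0, 9.0]
r = pearson_r(scores_a, scores_b)
print(r, t_statistic(r, len(scores_a)))
```

For example, the five OA Book Series sources (Table 1) with r = 0.9566 between 2014 and 2015 give t ≈ 5.69 on 3 degrees of freedom, which is in line with the p-value of 0.0108 reported in Table 8.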
Table 13

ANOVA test results on CiteScore data (2014–2016).

Source of variation   Sum of squares   Degrees of freedom   Mean squares   F statistic   P-value
Group (Between)       1401.3           5                    280.268        67.66         9.79×10^−71
Error (Within)        218460.7         52740                4.142
Total                 219862           52745
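The entries of a one-way ANOVA table such as Table 13 follow from the between-group and within-group sums of squares. The sketch below shows the arithmetic for arbitrary groups. It is an illustration in Python, not the MATLAB routine used in the article, and it stops at the F statistic because the p-value requires the F distribution. The three toy groups are made up.

```python
def one_way_anova(groups):
    """Sums of squares, degrees of freedom, mean squares, and F statistic."""
    values = [x for group in groups for x in group]
    n_total = len(values)
    grand_mean = sum(values) / n_total
    group_means = [sum(group) / len(group) for group in groups]

    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


anova = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(anova)
```

The same arithmetic can be checked against Table 13: 1401.3 / 5 ≈ 280.26 and 218460.7 / 52740 ≈ 4.142, and their ratio is approximately 67.66, the reported F statistic.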
Table 14

Multiple comparison post-hoc test results.

Source type                  Source type                  Lower limit (95% CI)   Mean difference   Upper limit (95% CI)   P-value
Open Access Journal          Open Access Book Series      −0.5107                0.9883            2.4873                 0.4152
Open Access Journal          Open Access Trade Journal    −2.5056                0.8436            4.1928                 0.9799
Open Access Journal          Subscription Journal         −0.2590                −0.1869           −0.1148                0.0000
Open Access Journal          Subscription Trade Journal   0.9158                 1.1542            1.3925                 0.0000
Open Access Journal          Subscription Book Series     −0.0942                0.0904            0.2750                 0.7302
Open Access Book Series      Open Access Trade Journal    −3.8128                −0.1447           3.5235                 1.0000
Open Access Book Series      Subscription Journal         −2.6729                −1.1751           0.3226                 0.2212
Open Access Book Series      Subscription Trade Journal   −1.3490                0.1659            1.6808                 0.9996
Open Access Book Series      Subscription Book Series     −2.4053                −0.8979           0.6095                 0.5334
Open Access Trade Journal    Subscription Journal         −4.3791                −1.0305           2.3182                 0.9521
Open Access Trade Journal    Subscription Trade Journal   −3.0458                0.3105            3.6669                 0.9998
Open Access Trade Journal    Subscription Book Series     −4.1062                −0.7532           2.5997                 0.9880
Subscription Journal         Subscription Trade Journal   1.1104                 1.3410            1.5716                 0.0000
Subscription Journal         Subscription Book Series     0.1028                 0.2772            0.4517                 0.0001
Subscription Trade Journal   Subscription Book Series     −1.3503                −1.0638           −0.7773                0.0000
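A pairwise difference in a post-hoc table of this kind is significant at the 5% level when its 95% confidence interval excludes zero. The sketch below applies that rule to three rows copied from Table 14, reading the three numeric values of each row as lower limit, mean difference, and upper limit. That reading is an assumption, made because the middle value lies between the other two in every row. The function name is hypothetical.

```python
def significant_pairs(rows):
    """Return the pairs whose 95% confidence interval does not contain zero."""
    return [
        (group_a, group_b)
        for group_a, group_b, lower, _diff, upper in rows
        if lower > 0 or upper < 0
    ]


# (group A, group B, lower limit, mean difference, upper limit)
rows = [
    ("Open Access Journal", "Subscription Journal", -0.2590, -0.1869, -0.1148),
    ("Open Access Journal", "Subscription Book Series", -0.0942, 0.0904, 0.2750),
    ("Subscription Journal", "Subscription Trade Journal", 1.1104, 1.3410, 1.5716),
]
flagged = significant_pairs(rows)
print(flagged)
```

The two flagged pairs are the ones with a reported p-value of 0.0000, while the unflagged pair has a p-value of 0.7302, so the interval rule agrees with the p-value column for these rows.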
Fig. 17

Boxplot showing the comparison of CiteScores of publication sources.

Fig. 18

Multiple comparison post-hoc plot of CiteScore data (2014–2016).

Specifications Table

Subject area: Data Analytics
More specific subject area: Citation Analytics
Type of data: Tables, graphs, figures, and spreadsheet file
How data was acquired: Data was acquired from the publication source list available in the Scopus online database [1]. A set of five inclusion criteria was established, namely: the publication source must be indexed in the Scopus database; the publication source must be active as at 28th December 2017; the publication must be written in English; the publication source type must be Book Series, Journal, or Trade Journal; and the publication source must have CiteScores in 2014, 2015, and 2016.
Data format: Secondary, analyzed
Experimental factors: Publication sources that did not meet any of the five criteria for inclusion in the period under consideration were excluded.
Experimental features: Descriptive statistics, boxplot representations, scatter plots, frequency distributions, correlation and regression analyses, Probability Density Functions (PDFs), Cumulative Distribution Functions (CDFs), Analysis of Variance (ANOVA) test, and multiple comparison post-hoc tests are performed to explore the dataset provided in this data article. All statistical computations were done using the Machine Learning and Statistics toolbox in MATLAB 2016a software.
Data source location: Data is available as supplementary material to this data article
Data accessibility: In a bid to facilitate further work on citation analytics, detailed datasets are made publicly available in a Microsoft Excel spreadsheet file.