Jennifer L D'Souza, Neil R Smalheiser.
Abstract
In the present paper, we have created several novel journal similarity metrics. The MeSH odds ratio measures the topical similarity of any pair of journals, based on the major MeSH headings assigned to articles in MEDLINE. The second metric employed the 2009 Author-ity author name disambiguation dataset as a gold standard for estimating the author odds ratio. This gives a straightforward, intuitive answer to the question: Given two articles in PubMed that share the same author name (lastname, first initial), how does knowing only the identity of the journals (in which the articles were published) predict the relative likelihood that they are written by the same person vs. different persons? The article pair odds ratio detects the tendency of authors to publish repeatedly in the same journal, as well as in specific pairs of journals. The metrics can be applied not only to estimate the similarity of a pair of journals, but to provide novel profiles of individual journals as well. For example, for each journal, one can define the MeSH cloud as the number of other journals that are topically more similar to it than expected by chance, and the author cloud as the number of other journals that share more authors than expected by chance. These metrics for journal pairs and individual journals have been provided in the form of public datasets that can be readily studied and utilized by others.
Year: 2014 PMID: 25536326 PMCID: PMC4275247 DOI: 10.1371/journal.pone.0115681
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Journal pairs ranked by their normalized co-occurrence (Co) scores.
A random sample of 500,000 journal pairs was taken (from a total of 12,569,485 pairs in the study dataset) and plotted to show (A) MeSH Co score, (B) Author Co score, and (C) article pair Co score. All concordant journal pairs have MeSH and author Co scores = 1, and are not displayed in panels A and B. In panel C, article pair Co scores for concordant and discordant journal pairs are shown separately.
Figure 2. Journal pairs ranked by their odds ratios.
A random sample of 500,000 discordant journal pairs was taken (from a total of 12,569,485 pairs in the study dataset) and plotted to show (A) MeSH odds ratio, (B) Author odds ratio, and (C) article pair odds ratio. In addition, concordant journal pairs are shown for the article pair odds ratio. Note that the odds ratios are displayed on a log scale. A dotted line indicates the point at which the odds ratio = 1; points above this line show more observed co-occurrences than expected by chance, whereas those below the line show fewer than expected by chance. See Methods for details.
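The reading of Figure 2 (values above 1 mean more observed co-occurrences than expected by chance) can be illustrated with a minimal sketch. The chance model used here, simple independence over a shared item universe, is an assumption made only for illustration; the function name `observed_expected_ratio` and its parameters are hypothetical, and the actual odds ratios are defined in the paper's Methods and may be computed differently.

```python
import math


def observed_expected_ratio(n_a, n_b, n_ab, n_total):
    """Ratio of observed to chance-expected co-occurrences for journals A and B.

    n_a, n_b : number of items (e.g. MeSH headings or author names) seen in A, in B
    n_ab     : number of items seen in both journals
    n_total  : size of the item universe

    ASSUMPTION: chance is modelled as independence, so the expected overlap is
    n_a * n_b / n_total. This is an illustrative stand-in, not the paper's formula.
    """
    expected = n_a * n_b / n_total
    if expected == 0:
        raise ValueError("expected count is zero; ratio is undefined")
    return n_ab / expected


# Toy numbers (not taken from the paper's datasets).
ratio = observed_expected_ratio(n_a=200, n_b=150, n_ab=30, n_total=10_000)

# Figure 2 plots ratios on a log scale, so the dotted line at ratio = 1
# corresponds to log10(ratio) = 0.
log_ratio = math.log10(ratio)
print(ratio, log_ratio)
```

With these toy counts the expected overlap is 3, so an observed overlap of 30 lies one unit above the dotted line on a log10 axis.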
The journals most related to the journal Bioinformatics according to various metrics.
| most related by MeSH | MeSH odds ratio | most related by shared authors | Author odds ratio | most related by article pairs | article pair odds ratio | weblogs |
| Bioinformatics | 9.4 | Bioinformatics | 118.7 | Genome Inform | 85.6 | BMC Bioinformatics |
| BMC Bioinformatics | 6.9 | Brief. Bioinformatics | 54.5 | J. Comput. Biol. | 78.6 | |
| J. Comput. Biol. | 4.3 | J. Comput. Biol. | 53.9 | Proc Int Conf Intell Syst Mol Biol | 73.3 | |
| PLoS Comput. Biol. | 3.7 | Proc Int Conf Intell Syst Mol Biol | 50.0 | Pac Symp Biocomput | 69.1 | Genome Res. |
| Pac Symp Biocomput | 3.6 | BMC Bioinformatics | 47.7 | BMC Bioinformatics | 66.9 | |
| Comput. Appl. Biosci. | 3.6 | Pac Symp Biocomput | 47.5 | Brief. Bioinformatics | 58.8 | |
| IEEE Trans Image Process | 3.5 | J Bioinform Comput Biol | 43.2 | Genome Res. | 53.6 | |
| Genome Biol. | 3.4 | PLoS Comput. Biol. | 38.2 | J Bioinform Comput Biol | 52.0 | Proteins |
| J Bioinform Comput Biol | 3.4 | IEEE/ACM Trans Comput Biol Bioinform | 34.9 | Bioinformatics | 52.0 | |
| IEEE/ACM Trans Comput Biol Bioinform | 3.3 | Genome Inform | 33.9 | PLoS Comput. Biol. | 45.7 | Genome Biol. |
| Brief. Bioinformatics | 3.2 | In Silico Biol. (Gedrukt) | 31.9 | Genome Biol. | 44.3 | BMC Genomics |
| Genome Res. | 3.1 | BMC Syst Biol | 30.6 | Mol. Syst. Biol. | 40.2 | |
| IEEE Trans Pattern Anal Mach Intell | 3.1 | Genome Biol. | 30.4 | Comput. Appl. Biosci. | 39.2 | |
| BMC Genomics | 3.1 | Stat Appl Genet Mol Biol | 28.6 | BMC Syst Biol | 35.6 | PLoS Comput. Biol. |
| BMC Syst Biol | 3.0 | Comput. Appl. Biosci. | 27.8 | In Silico Biol. (Gedrukt) | 35.4 | |
| IEEE Trans Syst Man Cybern B Cybern | 2.9 | Appl. Bioinformatics | 26.9 | Curr Protoc Bioinformatics | 34.9 | J. Comput. Biol. |
| Med Image Comput Comput Assist Interv | 2.9 | Mol. Syst. Biol. | 26.6 | IEEE/ACM Trans Comput Biol Bioinform | 34.0 | |
| IEEE Trans Med Imaging | 2.9 | Comput Biol Chem | 25.8 | OMICS | 32.9 | |
| BioSystems | 2.8 | OMICS | 23.3 | Stat Appl Genet Mol Biol | 32.0 | |
| Proc Int Conf Intell Syst Mol Biol | 2.8 | Curr Protoc Bioinformatics | 21.1 | Proteins | 31.1 | Pac Symp Biocomput |
The 20 journals most related to the journal “Bioinformatics” are displayed according to MeSH odds ratio, author odds ratio, and article pair odds ratio. For comparison, the 20 most related journals are also displayed using the metric of Lu et al. [5], which is based on user clicking behavior recorded in weblogs of PubMed retrieval sessions. Note that the first three metrics measure the similarity of “Bioinformatics” to itself, whereas the metric of Lu et al. does not. Note also that over half of the top journals listed by the weblog metric are biological and general journals (shown in italics), which are not included in any of the metrics reported in the present paper.
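A "most related" column of this kind can be produced by sorting journals on a single metric. The sketch below uses the first four MeSH odds ratios from the table; the helper `most_related` and its `include_self` switch are hypothetical names introduced for illustration, the switch reflecting the note that the odds-ratio metrics score a journal against itself while the weblog metric does not.

```python
def most_related(scores, target, k=3, include_self=True):
    """Return the k journals with the highest score relative to `target`.

    scores       : dict mapping journal name -> similarity score with `target`
    include_self : keep the target's own entry (as the odds-ratio columns do)
                   or drop it (as a weblog-style ranking would)
    """
    items = [(journal, score) for journal, score in scores.items()
             if include_self or journal != target]
    # Highest score first; ties keep their input order because sorted() is stable.
    return sorted(items, key=lambda pair: pair[1], reverse=True)[:k]


# MeSH odds ratios relative to "Bioinformatics", copied from the table above.
mesh_or = {
    "Bioinformatics": 9.4,
    "BMC Bioinformatics": 6.9,
    "J. Comput. Biol.": 4.3,
    "PLoS Comput. Biol.": 3.7,
}

with_self = most_related(mesh_or, "Bioinformatics", k=2)
without_self = most_related(mesh_or, "Bioinformatics", k=2, include_self=False)
print(with_self)
print(without_self)
```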
Top 25 journals sorted by journal size (as of 2009).
| | Journal | Size | Discipline |
| 1 | | 147822 | Biochemistry |
| 2 | | 134066 | Science |
| 3 | | 102954 | Medicine |
| 4 | | 96559 | Molecular Biology |
| 5 | | 85035 | Biochemistry; Biophysics |
| 6 | | 83552 | Science |
| 7 | | 67414 | Biochemistry; Biophysics |
| 8 | | 56342 | Biochemistry |
| 9 | | 55918 | Medicine |
| 10 | | 54202 | Physiology |
| 11 | | 53305 | Allergy and Immunology |
| 12 | | 53274 | Medicine |
| 13 | | 51790 | Brain |
| 14 | | 51371 | Biochemistry |
| 15 | | 46506 | Bacteriology |
| 16 | | 44459 | Medicine |
| 17 | | 44057 | Neoplasms |
| 18 | | 43766 | Science |
| 19 | | 40385 | Biochemistry |
| 20 | | 38813 | Urology |
| 21 | | 36200 | Orthopedics |
| 22 | | 34920 | Medicine |
| 23 | | 34880 | Physiology |
| 24 | | 34531 | Neoplasms |
| 25 | | 34472 | Medicine |
Shown are the ISO-abbreviated journal names. Journal size is defined as the number of articles included within the 2009 Author-ity dataset. Note that some distinct journal names (e.g. BMJ and Br Med J) are successors of each other in different time periods. Disciplines are taken from JDI annotations. See Methods for details.
Top 25 journals sorted by size of MeSH cloud.
| | Journal | Discipline | Broadness index | MeSH cloud |
| 1 | | Medicine | 1.25 | 5399 |
| 2 | | Medicine | 0.88 | 5332 |
| 3 | | Medicine | 0.60 | 4885 |
| 4 | | Rheumatology | 0.58 | 4830 |
| 5 | | Medicine | 0.82 | 4784 |
| 6 | | Medicine | 0.96 | 4689 |
| 7 | | Medicine | 1.22 | 4555 |
| 8 | | Biology; Medicine | 0.46 | 4548 |
| 9 | | Rheumatology | 0.67 | 4543 |
| 10 | | Medicine | 0.59 | 4512 |
| 11 | | Neurology | 0.68 | 4501 |
| 12 | | Biology; Medicine | 1.31 | 4445 |
| 13 | | Internal Medicine | 0.58 | 4383 |
| 14 | | Pathology | 0.77 | 4380 |
| 15 | | Medicine | 0.92 | 4355 |
| 16 | | Internal Medicine | 0.90 | 4307 |
| 17 | | Science | 0.94 | 4249 |
| 18 | | Medicine | 1.29 | 4239 |
| 19 | | Medicine | 1.25 | 4232 |
| 20 | | Dentistry | 1.05 | 4226 |
| 21 | | Biology; Medicine | 0.99 | 4218 |
| 22 | | Medicine | 0.60 | 4213 |
| 23 | | Neurology | 0.79 | 4187 |
| 24 | | Drug Therapy; Pharmacology | 0.76 | 4184 |
| 25 | | Medicine | 0.97 | 4182 |
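The abstract defines the MeSH cloud of a journal as the number of other journals that are topically more similar to it than expected by chance. A minimal sketch of that count follows, assuming that "more than expected by chance" can be read as an odds ratio above 1 (the dotted line in Figure 2); the paper's Methods may apply a stricter criterion. The function name `cloud_size`, the `threshold` parameter, and the toy journal labels are hypothetical.

```python
def cloud_size(journal, odds_ratios, threshold=1.0):
    """Count the other journals whose odds ratio with `journal` exceeds `threshold`.

    odds_ratios : dict mapping an unordered journal pair, stored once as a
                  tuple (a, b), to its odds ratio
    ASSUMPTION: threshold = 1.0 stands in for "more than expected by chance".
    """
    count = 0
    for (a, b), value in odds_ratios.items():
        if a == b:
            continue  # concordant (self) pairs do not count as "other journals"
        if journal in (a, b) and value > threshold:
            count += 1
    return count


# Toy odds ratios (not taken from the paper's datasets).
toy = {
    ("J1", "J2"): 3.2,
    ("J1", "J3"): 0.4,
    ("J1", "J4"): 1.7,
    ("J2", "J3"): 2.0,
    ("J1", "J1"): 9.0,
}

print(cloud_size("J1", toy))
print(cloud_size("J3", toy))
```

The same function applied to author odds ratios would give the author cloud under the same assumption.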
Top 25 journals sorted by author cloud/MeSH cloud ratio.
| | Journal | Discipline | Broadness index | MeSH cloud | Size | # authors | author cloud | Author/MeSH cloud |
| 1 | | Drug Therapy | 1.67 | 60 | 321 | 916 | 535 | 8.92 |
| 2 | | Complementary Therapies | 0.54 | 96 | 2408 | 8356 | 842 | 8.77 |
| 3 | | Complementary Therapies | 0.56 | 50 | 1480 | 4723 | 432 | 8.64 |
| 4 | | Complementary Therapies | 0.62 | 92 | 1227 | 4709 | 763 | 8.29 |
| 5 | | Biochemistry | 0.10 | 151 | 147822 | 248546 | 1145 | 7.58 |
| 6 | | Genetics | 1.52 | 82 | 143 | 365 | 612 | 7.46 |
| 7 | | Pharmacology; Social Sciences | 0.44 | 105 | 5221 | 14841 | 760 | 7.24 |
| 8 | | Neurology; Pharmacology | 1.90 | 73 | 153 | 486 | 404 | 5.53 |
| 9 | | Dermatology; Pharmacology | 0.94 | 136 | 360 | 1000 | 713 | 5.24 |
| 10 | | Allergy and Immunology | 0.97 | 146 | 411 | 907 | 758 | 5.19 |
| 11 | | Genetics, Medical; Neoplasms | 0.13 | 324 | 6592 | 17288 | 1640 | 5.06 |
| 12 | | Biology; Reproductive Medicine | 1.23 | 135 | 216 | 517 | 646 | 4.79 |
| 13 | | Drug Therapy | 0.98 | 260 | 321 | 739 | 1236 | 4.75 |
| 14 | | Anti-Bacterial Agents | 0.45 | 168 | 5107 | 10268 | 776 | 4.62 |
| 15 | | Biochemistry; Chemistry Techniques, Analytical | 1.57 | 150 | 526 | 1465 | 680 | 4.53 |
| 16 | | Biochemistry; Biophysics | 0.18 | 380 | 85035 | 132654 | 1716 | 4.52 |
| 17 | | Biochemistry | 0.89 | 225 | 496 | 1173 | 986 | 4.38 |
| 18 | | Botany | 0.46 | 175 | 6137 | 14978 | 766 | 4.38 |
| 19 | | Pharmacology | 1.34 | 202 | 342 | 794 | 837 | 4.14 |
| 20 | | Biology | 1.26 | 84 | 1028 | 3287 | 347 | 4.13 |
| 21 | | Genetics, Medical | 0.19 | 546 | 9053 | 25992 | 2211 | 4.05 |
| 22 | | Science | 0.18 | 448 | 83552 | 138529 | 1792 | 4.00 |
| 23 | | Hematology | 0.60 | 200 | 420 | 1260 | 798 | 3.99 |
| 24 | | Toxicology | 1.62 | 73 | 132 | 303 | 275 | 3.77 |
| 25 | | Complementary Therapies | 0.33 | 51 | 730 | 2051 | 192 | 3.76 |
In this table, journal size is defined as the number of articles with listed authors in the Author-ity (2009) dataset.
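The last column of the table appears to be the author cloud divided by the MeSH cloud, rounded to two decimals. The short check below recomputes it for the first five rows, using only values copied from the table; the helper name `cloud_ratio` is introduced here for illustration.

```python
def cloud_ratio(author_cloud, mesh_cloud):
    """Author cloud divided by MeSH cloud, rounded to two decimals as in the table."""
    return round(author_cloud / mesh_cloud, 2)


# (MeSH cloud, author cloud, reported Author/MeSH cloud) for ranks 1-5 of the table.
rows = [
    (60, 535, 8.92),
    (96, 842, 8.77),
    (50, 432, 8.64),
    (92, 763, 8.29),
    (151, 1145, 7.58),
]

# Compare with a small tolerance rather than exact float equality.
matches = [abs(cloud_ratio(author, mesh) - reported) < 0.006
           for mesh, author, reported in rows]
print(matches)
```

A ratio well above 1 means a journal shares authors with many more journals than it resembles topically, which is what the ranking in this table highlights.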