
Erratum to: Using text mining for study identification in systematic reviews: a systematic review of current approaches.

Alison O'Mara-Eves, James Thomas, John McNaught, Makoto Miwa, Sophia Ananiadou.


Year:  2015        PMID: 25927201      PMCID: PMC4411935          DOI: 10.1186/s13643-015-0031-5

Source DB:  PubMed          Journal:  Syst Rev        ISSN: 2046-4053



Erratum

Following publication of our article [1], it has come to our attention that two of the formulae in Table 1 were incorrect. The formulae for the measures of precision and burden have been corrected (Table 1). We are publishing this erratum to update these formulae to the following:
Table 1. Definitions of performance measures reported in the studies

| Measure | # | Definition | Formula |
| --- | --- | --- | --- |
| Recall (sensitivity) | 22 | Proportion of correctly identified positives amongst all real positives | $\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$ |
| Precision | 18 | Proportion of correctly identified positives amongst all positives | $\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$ |
| F measure | 10 | Combines precision and recall. Values of β < 1.0 indicate precision is more important than recall, whilst values of β > 1.0 indicate recall is more important than precision | $F_{\beta,k} = \frac{(\beta^2+1)\,\mathrm{TP}_k}{(\beta^2+1)\,\mathrm{TP}_k+\mathrm{FP}_k+\beta^2\,\mathrm{FN}_k}$, where β is a value that specifies the relative importance of recall and precision |
| ROC (AUC) | 10 | Area under the curve traced out by graphing the true positive rate against the false positive rate. 1.0 is a perfect score and 0.50 is equivalent to a random ordering | |
| Accuracy | 8 | Proportion of agreements to total number of documents | $\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN}}$ |
| Work saved over sampling | 8 | The percentage of papers that the reviewers do not have to read because they have been screened out by the classifier | $\mathrm{WSS\ at\ 95\%\ recall} = \frac{\mathrm{TN}+\mathrm{FN}}{N-0.05}$ |
| Time | 7 | Time taken to screen (usually in minutes) | |
| Burden | 4 | The fraction of the total number of items that a human must screen (active learning) | $\mathrm{Burden} = \frac{\mathrm{tp}^T+\mathrm{tn}^T+\mathrm{fp}^T+\mathrm{tp}^U+\mathrm{fp}^U}{N}$ |
| Yield | 3 | The fraction of items that are identified by a given screening approach (active learning) | $\mathrm{Yield} = \frac{\mathrm{tp}^T+\mathrm{tp}^U}{\mathrm{tp}^T+\mathrm{tp}^U+\mathrm{fn}^U}$ |
| Utility | 5 | Relative measure of burden and yield that takes into account reviewer preferences for weighting these two concepts (active learning) | $\frac{\beta\cdot\mathrm{yield}+(1-\mathrm{burden})}{\beta+1}$, where β is the user-defined weight |
| Baseline inclusion rate | 2 | The proportion of includes in a random sample of items before prioritisation or classification takes place. The number to be screened is determined using a power calculation | $\frac{n_i}{n_t}$, where $n_i$ = number of items included in the random sample; $n_t$ = total number of items in the random sample |
| Performance (efficiency) | 2 | Number of relevant items selected divided by the time spent screening, where relevant items were those marked as included by two or more people | $\frac{\text{Selected, relevant items}}{\text{Time}}$ |
| Specificity | 2 | The proportion of correctly identified negatives (excludes) out of the total number of negatives | $\frac{\mathrm{TN}}{\mathrm{TN}+\mathrm{FP}}$ |
| True positives | 2 | The number of correctly identified positives (includes) | TP |
| False negatives | 1 | The number of incorrectly identified negatives (excludes) | FN |
| Coverage | 1 | The ratio of positives in the data pool that are annotated during active learning | $\frac{\mathrm{TP}^L}{\mathrm{TP}^L+\mathrm{FN}^L+\mathrm{TP}^U+\mathrm{FN}^U}$, where L refers to labelled items and U refers to unlabelled items |
| Unit cost | 1 | Expected time to label an item multiplied by the unit cost of the labeler (salary per unit of time), as calculated from their (known or estimated) salary | $\mathrm{time}_{\mathrm{expected}} \times \mathrm{cost}_{\mathrm{unit}}$ |
| Classification error | 1 | Proportion of disagreements to total number of documents | 100 % − accuracy % |
| Error | 1 | Total number of falsely classified items divided by the total number of items | $\frac{\sum(\mathrm{FP}+\mathrm{FN})}{\sum(\mathrm{TP}+\mathrm{FP}+\mathrm{FN}+\mathrm{TN})}$ |
| Absolute screening reduction | 1 | Number of items excluded by the classifier that do not need to be manually screened | TN + FN |
| Prioritised inclusion rate | 1 | The proportion of includes out of the total number screened, after prioritisation or classification takes place | $\frac{n_{\mathrm{ip}}}{n_{\mathrm{tp}}}$, where $n_{\mathrm{ip}}$ = number of items included in the prioritised sample; $n_{\mathrm{tp}}$ = total number of items in the prioritised sample |
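
As an illustration only, several of the Table 1 formulae, including the two corrected ones (precision and burden), can be written as short Python functions. This is a minimal sketch, not code from the article: the `Confusion` container, the function names and the example counts are hypothetical. Table 1 does not define the superscripts T and U used for burden and yield; the sketch assumes that T denotes items screened manually during training and U the remaining items labelled by the classifier. No guard against zero denominators is included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one screening run (hypothetical container)."""
    tp: int  # true positives: includes correctly identified
    fp: int  # false positives: excludes wrongly identified as includes
    fn: int  # false negatives: includes wrongly identified as excludes
    tn: int  # true negatives: excludes correctly identified

    @property
    def n(self) -> int:
        """Total number of documents."""
        return self.tp + self.fp + self.fn + self.tn


def recall(c: Confusion) -> float:
    """TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def precision(c: Confusion) -> float:
    """TP / (TP + FP), the corrected precision formula."""
    return c.tp / (c.tp + c.fp)


def specificity(c: Confusion) -> float:
    """TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: Confusion) -> float:
    """(TP + TN) / (TP + FP + FN + TN)."""
    return (c.tp + c.tn) / c.n


def f_measure(c: Confusion, beta: float = 1.0) -> float:
    """F_beta as in Table 1; beta > 1 weights recall more heavily."""
    b2 = beta ** 2
    return (b2 + 1) * c.tp / ((b2 + 1) * c.tp + c.fp + b2 * c.fn)


def burden(tp_t: int, tn_t: int, fp_t: int, tp_u: int, fp_u: int, n: int) -> float:
    """Corrected burden: (tp^T + tn^T + fp^T + tp^U + fp^U) / N.

    Assumption: *_t are counts from manually screened training items,
    *_u are counts from items labelled by the classifier.
    """
    return (tp_t + tn_t + fp_t + tp_u + fp_u) / n


def yield_(tp_t: int, tp_u: int, fn_u: int) -> float:
    """Yield: (tp^T + tp^U) / (tp^T + tp^U + fn^U)."""
    return (tp_t + tp_u) / (tp_t + tp_u + fn_u)


def utility(yield_value: float, burden_value: float, beta: float) -> float:
    """Utility: (beta * yield + (1 - burden)) / (beta + 1)."""
    return (beta * yield_value + (1 - burden_value)) / (beta + 1)


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    c = Confusion(tp=40, fp=10, fn=10, tn=140)
    print(f"recall={recall(c):.3f} precision={precision(c):.3f}")
    print(f"specificity={specificity(c):.3f} accuracy={accuracy(c):.3f}")
    print(f"F1={f_measure(c):.3f}")

    b = burden(tp_t=10, tn_t=30, fp_t=5, tp_u=30, fp_u=5, n=200)
    y = yield_(tp_t=10, tp_u=30, fn_u=10)
    print(f"burden={b:.3f} yield={y:.3f} utility={utility(y, b, beta=1.0):.3f}")
```

Passing the counts explicitly, rather than deriving them from predictions, keeps each function a direct transcription of its row in Table 1.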

1. Using text mining for study identification in systematic reviews: a systematic review of current approaches.

Authors: Alison O'Mara-Eves; James Thomas; John McNaught; Makoto Miwa; Sophia Ananiadou
Journal: Syst Rev. Date: 2015-01-14
