Aleksandra Nabożny, Bartłomiej Balcerzak, Adam Wierzbicki, Mikołaj Morzy, Małgorzata Chlabicz.
Abstract
BACKGROUND: The spread of false medical information on the web is rapidly accelerating. Establishing the credibility of web-based medical information has become a pressing necessity. Machine learning offers a solution that, when properly deployed, can be an effective tool in fighting medical misinformation on the web.
Keywords: active annotation; credibility; fake news; web-based medical information
Year: 2021 PMID: 34842547 PMCID: PMC8665397 DOI: 10.2196/26065
Source DB: PubMed Journal: JMIR Med Inform
Number of surrounding sentences (m) needed to understand the context and evaluate the credibility of a sentence for all data, only credible subset, only noncredible subset, and only neutral subset (n=10,649).
| m | All data, n (%) | Credible subset, n (%) | Noncredible subset, n (%) | Neutral subset, n (%) |
| 0 | 8565 (80.43) | 4955 (80.07) | 1377 (71.27) | 2233 (88.3) |
| 1 | 1958 (18.39) | 1165 (18.83) | 514 (26.6) | 279 (11.03) |
| 2 | 107 (1) | 57 (0.92) | 34 (1.76) | 16 (0.63) |
| 3 | 12 (0.11) | 5 (0.08) | 6 (0.31) | 1 (0.04) |
| >3 | 8 (0.07) | 6 (0.1) | 2 (0.05) | 0 (0) |
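As a rough cross-check of the table above, the following Python sketch computes the share of sentences for which annotators needed at least one surrounding sentence. The counts are copied from the "All data" column; the function and variable names are illustrative and do not come from the paper. The listed counts sum to 10,650, one more than the n stated in the caption, so the sketch normalizes by the sum of the listed counts rather than by n.

```python
# Counts from the "All data" column, in row order:
# m = 0, 1, 2, 3, and the table's final row.
all_data_counts = [8565, 1958, 107, 12, 8]


def share_needing_context(counts):
    """Return the fraction of sentences that needed any context (m >= 1).

    counts[0] is the number of sentences understood on their own (m = 0);
    the remaining entries are sentences that needed surrounding sentences.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one sentence")
    return (total - counts[0]) / total


share = share_needing_context(all_data_counts)
print(f"{share:.1%} of sentences needed at least one surrounding sentence")
```

The same function can be applied to the credible, noncredible, and neutral columns to compare how often each subset depends on context.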
Figure 1. Annotation interface: single sentence view.
Figure 2. Annotation interface: sentence in context view.
Figure 3. Sentence reranking: general idea.
Figure 4. Processing pipeline. PCA: principal component analysis; RoBERTa: Robustly Optimized Bidirectional Encoder Representations from Transformers Pretraining Approach.
Figure 5. Distribution of credible, noncredible, and neutral sentence labels within topics. CS: cesarean section; CRED: credible; NB: natural birth; NEU: neutral; NONCRED: noncredible; SSRI: selective serotonin reuptake inhibitor.
Lift results for the full data set. m is the number of top sentences from each cluster to be manually reviewed.
| lift@m | Number of clusters | Batch percentile 1% (approximately 100 sentences) | Batch percentile 10% (approximately 1000 sentences) | Batch percentile 20% | Batch percentile 40% |
| lift@5 | 200 | 1.36 | | | |
| lift@10 | 130 | 1.23 | 1.31 | 1.3 | |
| lift@15 | 100 | | 1.27 | 1.22 | 1.16 |
^a The best performing set of parameters for a given batch percentile is italicized.
Lift results for the cholesterol and statins topic. m is the number of top sentences from each cluster to be manually reviewed.
| lift@m | Number of clusters | Batch percentile 1% (approximately 20 sentences) | Batch percentile 10% (approximately 200 sentences) | Batch percentile 20% | Batch percentile 40% |
| lift@5 | 40 | 1.75 | 1.24 | 1.26 | 1.27 |
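The lift tables above report values without restating the formula. The following is a minimal sketch, assuming the conventional definition of lift: the rate of noncredible sentences in the reviewed batch divided by the rate in the whole data set. The function name and the toy labels are hypothetical and are not the study's data or code.

```python
from typing import Sequence


def lift(batch_labels: Sequence[int], all_labels: Sequence[int]) -> float:
    """Ratio of the positive rate in a batch to the positive rate overall.

    Labels are 1 for the class of interest (here, noncredible) and 0
    otherwise. This is the conventional lift definition, assumed here
    rather than taken from the source.
    """
    if not batch_labels or not all_labels:
        raise ValueError("both label sequences must be non-empty")
    base_rate = sum(all_labels) / len(all_labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when the overall positive rate is 0")
    batch_rate = sum(batch_labels) / len(batch_labels)
    return batch_rate / base_rate


# Toy example (made-up labels): 2 of 10 sentences overall are noncredible,
# and the reviewed batch of 5 sentences contains both of them.
all_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
batch_labels = [1, 1, 0, 0, 0]
toy_lift = lift(batch_labels, all_labels)
print(f"toy lift: {toy_lift:.2f}")
```

Under this assumed definition, a lift of 1.36 would mean that the reviewed batch contains noncredible sentences 1.36 times as often as a randomly drawn batch would be expected to, and a lift of 1 would mean no gain over random selection.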
The number of occurrences of a particular claim category within the cholesterol and statins subset of sentences.
| Claim category | Number of occurrences | Is related claim factually incorrect? | Is category based on the content or on the form? |
| Miscellaneous | 95 | N/A^a | Form |
| (stat) Side effects | 43 | Yes | Content |
| (chol) Not an indicator of CVD^b risk | 25 | Yes | Content |
| Diet as good as drugs | 22 | Yes | Form |
| (chol) Too low is harmful | 18 | Yes | Content |
| Lifestyle changes are enough | 15 | Yes | Content |
| Big pharma | 14 | Yes | Content |
| Inflammation theory | 14 | Yes | Content |
| (stat) Cause diabetes | 13 | Yes | Content |
| (stat) Not needed | 10 | Yes | Content |
| (chol) Makes cells and protects nerves | 8 | No | Content |
| (stat) Not effective | 7 | Yes | Content |
| (stat) Prescription based solely on (chol) level | 7 | Yes | Content |
| Detailed data | 7 | N/A | Form |
| (stat) Cause cognitive impairment | 6 | Yes | Content |
| (stat) Not studied enough | 6 | Yes | Content |
| High HDL^c neutralizes high LDL^d | 6 | No | Content |
| Harmful CoQ10^e loss | 4 | Yes | Content |
| (chol) Consumption not an issue | 3 | Yes | Content |
| Lifestyle versus statins | 2 | Yes | Content |
| No liver function monitoring | 2 | Yes | Content |
^a N/A: not applicable.
^b CVD: cardiovascular disease.
^c HDL: high-density lipoprotein.
^d LDL: low-density lipoprotein.
^e CoQ10: Coenzyme Q10.
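To make the content/form split in the table above easier to check, the sketch below totals the occurrence counts by the table's last column. The counts are transcribed from the table in row order; the variable names are illustrative. The total is a count of category occurrences, which this excerpt does not equate with a number of sentences.

```python
from collections import Counter

# (occurrences, basis) per claim category, in the table's row order.
rows = [
    (95, "Form"), (43, "Content"), (25, "Content"), (22, "Form"),
    (18, "Content"), (15, "Content"), (14, "Content"), (14, "Content"),
    (13, "Content"), (10, "Content"), (8, "Content"), (7, "Content"),
    (7, "Content"), (7, "Form"), (6, "Content"), (6, "Content"),
    (6, "Content"), (4, "Content"), (3, "Content"), (2, "Content"),
    (2, "Content"),
]

# Sum occurrences separately for form-based and content-based categories.
totals = Counter()
for occurrences, basis in rows:
    totals[basis] += occurrences

total_occurrences = sum(totals.values())
form_share = totals["Form"] / total_occurrences
print(dict(totals), total_occurrences, f"{form_share:.1%}")
```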
Claim category and explanations of claim categories extracted manually from all noncredible sentences from the cholesterol and statins topic.
| Claim category | Claim explanation |
| (stat) Side effects | Statins’ side effects outweigh the benefits |
| (chol) Not an indicator of CVD^a risk | Total cholesterol is not an indicator of CVD |
| Diet as good as drugs | Aggregation of different dietary interventions to lower cholesterol, triglycerides, or sugars |
| (chol) Too low is harmful | Too low cholesterol level is harmful |
| Lifestyle changes are enough | People can lower cholesterol level just by developing good habits and eating a proper diet |
| Big pharma | People (eg, physicians and pharmaceutical company workers) make considerable profit through prescribing statins |
| Inflammation theory | It is inflammation that causes CVD, not excessive cholesterol level; cholesterol is an effect, not a cause |
| (stat) Cause diabetes | Statins increase the risk of diabetes |
| (stat) Not needed | Statins are given to healthy people who do not need them |
| (chol) Makes cells and protects nerves | Cholesterol produces hormones that make body cells and protect nerves |
| (stat) Not effective | Statins do not fulfill their role in reducing the risk of CVD |
| (stat) Prescription based solely on (chol) level | Statin prescription is based solely on total cholesterol level |
| Detailed data | Sentences contain detailed data, for example, “LDL^b cholesterol level should not exceed 200 mg/dL” |
| (stat) Cause cognitive impairment | Statin consumption causes different forms of cognitive impairment (including memory loss and slow information processing) |
| (stat) Not studied enough | Statins’ effectiveness is not studied enough |
| High HDL^c neutralizes high LDL | HDL is a so-called good cholesterol, whereas LDL is a so-called bad cholesterol; high levels of the former neutralize negative consequences of high levels of the latter |
| Harmful CoQ10^d loss | Statin-related CoQ10 loss is harmful |
| (chol) Consumption not an issue | People should not worry about cholesterol consumption |
| Lifestyle versus statins | Lifestyle changes are more effective ways to prevent CVDs than statin consumption |
| No liver function monitoring | Monitoring of liver function tests is no longer recommended in patients on statin therapy |
| Miscellaneous | None of the above |
^a CVD: cardiovascular disease.
^b LDL: low-density lipoprotein.
^c HDL: high-density lipoprotein.
^d CoQ10: Coenzyme Q10.