José Alberto Benítez-Andrades, Maria-Esther Vidal, Rafael Pastor-Vargas, María Teresa García-Ordás, José-Manuel Alija-Pérez.
Abstract
BACKGROUND: Eating disorders affect an increasing number of people. Social networks provide information that can help.
Keywords: BERT; NLP; Twitter; bidirectional encoder representations from transformer; classification; data; deep learning; diet; disorder; eating disorder; machine learning; mental health; model; natural language processing; nutrition; performance; social media; weight
Year: 2022 PMID: 35200156 PMCID: PMC8914746 DOI: 10.2196/34492
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Study workflow: (A) data collection and preprocessing, (B) classification model training, and (C) evaluation. BERT: bidirectional encoder representations from transformer; ML: machine learning.
Categories of labeled tweets and examples.
| Category topics | Tweet |
| Written by someone who suffers from an eating disorder | i was stressed and ate a whole bowl of pasta, where’s my badge for being the worst anorexic #edtwt |
| Written by someone who does not have an eating disorder | Is your #teenager not eating or eating a lot less than normal? She might be suffering from #anorexia. We can help; please come see us https://t.co/GfStM1IVGz #weightloss #losingweight https://t.co/z5NK0tjNIt |
| Promotes eating disorders | Currently feeling like the best anorexic #eating disordertwt |
| Does not promote eating disorders | Higher-calorie diets could lead to a speedier recovery in patients with anorexia nervosa, study shows https://t.co/mipX3nrhHN |
| Informative | #AnorexiaNervosa – A Father and Daughter Perspec- |
| Noninformative | Binge eating makes me sad :( #eatingdisorder |
| Scientific | The problem extends to Food and Drug Administration and National Institutes of Health data sets used in a recent study appearing in Reproductive Toxicology. #ai #technology #BigData #ML https://t.co/DFvh6gNA38 |
| Nonscientific | Do not waste time thinking about what you could have done differently. Keep your eyes on the road ahead and do it differently now. #anorexia #eatingdisorder #recovery #nevergiveup #alwayskeepfighting |
Random forest hyperparameters.
| Category | criterion | max_depth | max_features | n_estimators |
| Category 1 | gini | 7 | log2 | 200 |
| Category 2 | gini | 8 | auto | 1000 |
| Category 3 | gini | 8 | sqrt | 800 |
| Category 4 | gini | 8 | auto | 1000 |
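As a rough illustration of how the hyperparameters in the table above might be passed to a random forest implementation, the sketch below stores them per category and converts them into keyword arguments. It assumes scikit-learn's `RandomForestClassifier`, which this record does not name explicitly; `RF_PARAMS` and `to_sklearn_kwargs` are hypothetical names. In scikit-learn, `max_features="auto"` was equivalent to `"sqrt"` for classifiers and was removed in version 1.3, so the helper maps it accordingly.

```python
# Hyperparameters copied from the table above, keyed by category.
RF_PARAMS = {
    "Category 1": {"criterion": "gini", "max_depth": 7, "max_features": "log2", "n_estimators": 200},
    "Category 2": {"criterion": "gini", "max_depth": 8, "max_features": "auto", "n_estimators": 1000},
    "Category 3": {"criterion": "gini", "max_depth": 8, "max_features": "sqrt", "n_estimators": 800},
    "Category 4": {"criterion": "gini", "max_depth": 8, "max_features": "auto", "n_estimators": 1000},
}


def to_sklearn_kwargs(category):
    """Return keyword arguments for one category (hypothetical helper)."""
    params = dict(RF_PARAMS[category])  # copy so the table values stay untouched
    # Assumption: a recent scikit-learn release, where "auto" is no longer
    # accepted for classifiers; it used to mean "sqrt".
    if params["max_features"] == "auto":
        params["max_features"] = "sqrt"
    return params


if __name__ == "__main__":
    for category in RF_PARAMS:
        print(category, to_sklearn_kwargs(category))
```

With scikit-learn installed, a classifier could then be built as `RandomForestClassifier(**to_sklearn_kwargs("Category 2"))`.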
Figure 2. Architecture of the recurrent neural network. LSTM: long short-term memory.
Figure 3. Architecture of the bidirectional long short-term memory (LSTM) network.
Terms and frequencies of the 10 most repeated terms in the initial data set and in the labeled subset of data.
| Term | Frequency, n |
| Initial data set | |
| hey mp | 230,013 |
| healthy | 210,430 |
| pltpinkmonday | 209,330 |
| eat | 183,436 |
| covid19 | 156,541 |
| edtwt | 123,175 |
| anorexia | 112,864 |
| disorders | 102,063 |
| endsars | 99,844 |
| bachelorette | 48,370 |
| problem | 45,959 |
| Labeled subset | |
| eat | 1132 |
| disorder | 830 |
| food | 410 |
| recovery | 382 |
| edtwt | 301 |
| binge | 282 |
| people | 245 |
| anorexic | 244 |
| research | 226 |
| study | 202 |
| problem | 199 |
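A term-frequency table like the one above can be produced with a simple counter. The sketch below is only an illustration: the tokenization rule, the tiny stop-word list, and the function name `top_terms` are assumptions, not the preprocessing used in the study. The two sample strings are taken from the labeled-tweet examples in this record.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "and", "for", "i", "me", "my", "the", "was"}


def top_terms(texts, n=10):
    """Count lowercase alphanumeric tokens across texts and return the n most common."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(token for token in tokens if token not in STOP_WORDS)
    return counts.most_common(n)


tweets = [
    "Binge eating makes me sad :( #eatingdisorder",
    "Currently feeling like the best anorexic #eating disordertwt",
]

if __name__ == "__main__":
    for term, frequency in top_terms(tweets, n=5):
        print(term, frequency)
```

Here "eating" is the only term that occurs in both tweets, so it ranks first; hashtags are reduced to their text because the pattern drops the `#` character.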
Classification performance.
| Model | Having eating disorders or not | | Encouraging eating disorders or not | | Informative or not | | Scientific or not | |
| | F1 score, % | Accuracy, % | F1 score, % | Accuracy, % | F1 score, % | Accuracy, % | F1 score, % | Accuracy, % |
| Random forest | 79.8 | 79.2 | 47 | 76.7 | 49.2 | 73.7 | 27.3 | 80.4 |
| Recurrent neural network | 83.2 | 82.6 | 61 | 82.1 | 67.3 | 70.7 | 67.3 | 70.7 |
| Bidirectional long short-term memory | 78.5 | 79.3 | 67.1 | 86.7 | 67.1 | 78.7 | 76.8 | 85.8 |
| Bidirectional encoder representations from transformer–based^a | 83.3 | 83 | 71.9 | 87.2 | 77.6 | 84.3 | 86 | 94.1 |
| RoBERTa^a | 83.8 | 83.1 | 74.3 | 88.5 | 77.6 | 84.4 | 86.4 | 94.2 |
| DistilBERT^a | 84 | 83.1 | 72.3 | 87.3 | 75 | 82.8 | 84.2 | 93.3 |
| CamemBERT^a | 79.1 | 78.7 | 73.6 | 87.8 | 74.7 | 81.7 | 82.5 | 92.3 |
| ALBERT^a | 81.2 | 80.4 | 74.3 | 88.2 | 73.8 | 81.5 | 83.3 | 93 |
| FlauBERT^a | 82.6 | 81.7 | 72.9 | 87.5 | 72.2 | 80 | 83.4 | 92.7 |
| RobBERT^a | 78.8 | 78.4 | 71.1 | 86.2 | 73.8 | 81.6 | 83 | 92.6 |
^a A pretrained model was used: bert-base-multilingual-cased for BERT, roberta-base for RoBERTa, distilbert-base-cased for DistilBERT, camembert-base for CamemBERT, albert-base-v1 for ALBERT, flaubert-base-cased for FlauBERT, and robbert-v2-dutch-base for RobBERT.
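The classification table reports both F1 score and accuracy, and the two can diverge sharply (for random forest on the "Scientific or not" task, F1 is 27.3% while accuracy is 80.4%). A gap like this commonly arises when classes are imbalanced, although this record states neither the class balance nor the F1 averaging scheme. The sketch below computes both metrics for the positive class of a binary task; the labels are invented toy data and `accuracy_and_f1` is a hypothetical helper.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); reported as 0.0 when the denominator is zero.
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Invented toy labels: 3 positives out of 10, and the model finds only one of them.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    accuracy, f1 = accuracy_and_f1(y_true, y_pred)
    print(f"accuracy={accuracy:.2f}, F1={f1:.2f}")
```

In this toy case the many correctly classified negatives keep accuracy high, while the two missed positives pull F1 down.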
Implementation time.
| Model | Time (seconds) | | | |
| | Having eating disorders or not | Encouraging eating disorders or not | Informative or not | Scientific or not |
| Random forest | 1.74 | 12.8 | 10.4 | 12.9 |
| Recurrent neural network | 152.1 | 163.1 | 151.5 | 153.7 |
| Bidirectional long short-term memory | 163.2 | 175.3 | 164.8 | 167.9 |
| Bidirectional encoder representations from transformer–based | 1257.4 | 1232.1 | 1292.7 | 1311.4 |
| RoBERTa | 1116.2 | 1158.8 | 1142.5 | 1192.8 |
| DistilBERT | 1343.3 | 1327.8 | 1332.0 | 1362.3 |
| CamemBERT | 1472.3 | 1457.5 | 1462.0 | 1493.4 |
| ALBERT | 1372.7 | 1352.3 | 1331.3 | 1392.5 |
| FlauBERT | 1203.9 | 1207.1 | 1202.1 | 1235.1 |
| RobBERT | 1234.4 | 1215.4 | 1319.7 | 1123.5 |
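To put the timings in perspective, the sketch below averages each model's time over the four tasks and expresses it relative to random forest. The values are copied from the implementation time table; the names `TIMES_S`, `mean_time`, and `slowdown_vs` are hypothetical.

```python
from statistics import mean

# Seconds per task, in the table's column order:
# having ED or not, encouraging ED or not, informative or not, scientific or not.
TIMES_S = {
    "Random forest": [1.74, 12.8, 10.4, 12.9],
    "Recurrent neural network": [152.1, 163.1, 151.5, 153.7],
    "Bidirectional long short-term memory": [163.2, 175.3, 164.8, 167.9],
    "BERT-based": [1257.4, 1232.1, 1292.7, 1311.4],
    "RoBERTa": [1116.2, 1158.8, 1142.5, 1192.8],
    "DistilBERT": [1343.3, 1327.8, 1332.0, 1362.3],
    "CamemBERT": [1472.3, 1457.5, 1462.0, 1493.4],
    "ALBERT": [1372.7, 1352.3, 1331.3, 1392.5],
    "FlauBERT": [1203.9, 1207.1, 1202.1, 1235.1],
    "RobBERT": [1234.4, 1215.4, 1319.7, 1123.5],
}


def mean_time(model):
    """Mean time in seconds over the four classification tasks."""
    return mean(TIMES_S[model])


def slowdown_vs(model, baseline="Random forest"):
    """Ratio of a model's mean time to the baseline's mean time."""
    return mean_time(model) / mean_time(baseline)


if __name__ == "__main__":
    for model in TIMES_S:
        print(f"{model}: mean {mean_time(model):.1f} s, "
              f"{slowdown_vs(model):.1f}x random forest")
```

On these figures the transformer-based models take on the order of a hundred times longer than random forest, which is the cost side of the F1 gains shown in the classification table.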