Christopher Michael Homan1, J Nicolas Schrading1, Raymond W Ptucha1, Catherine Cerulli2, Cecilia Ovesdotter Alm1.
Abstract
BACKGROUND: Social media is a rich, virtually untapped source of data on the dynamics of intimate partner violence, one that is both global in scale and intimate in detail.
Keywords: intimate partner violence; natural language processing; social media
Year: 2020; PMID: 33211021; PMCID: PMC7714648; DOI: 10.2196/15347
Source DB: PubMed; Journal: J Med Internet Res; ISSN: 1438-8871; Impact factor: 5.428
Figure 1. Counts per hour of #WhyIStayed (dotted) or #WhyILeft (solid) tweets from 9/8 to 9/12. Times are in Eastern Standard Time. Vertical lines mark 12-hour periods, and each label corresponds to the line on its left. We removed spam from this set, but not meta tweets.
Summary of labels from all four annotators, A1 through A4, compared with the gold standard. Each cell indicates the number of tweets an annotator gave a label to.
| Annotator | Hashtag | Ads | Jokes | Leave | Meta | Other | Stay | Total |
| A1 | La | 3 | 6 | 356 | 57 | 67 | 38 | 527 |
| A1 | Sb | 6 | 12 | 28 | 97 | 31 | 299 | 473 |
| A2 | L | 10 | 6 | 378 | 33 | 47 | 53 | 527 |
| A2 | S | 13 | 6 | 33 | 74 | 47 | 300 | 473 |
| A3 | L | 2 | 10 | 405 | 49 | 2 | 59 | 527 |
| A3 | S | 7 | 18 | 29 | 97 | 1 | 321 | 473 |
| A4c | L | 3 | 1 | 122 | 19 | 15 | 14 | 174 |
| A4c | S | 3 | 0 | 15 | 35 | 14 | 92 | 159 |
aL: #WhyILeft.
bS: #WhyIStayed.
cA4 annotated only the first 333 tweets.
Basic lexical statistics on the tokens and types in the two balanced sets. Types are unique tokens, whereas hapax legomena are types that occur only once in the data set.
| Parameters | #WhyIStayed | #WhyILeft |
| Number of tokens | 130,545 | 118,768 |
| Number of types | 7094 | 6269 |
| Type:token ratio | 0.054 | 0.053 |
| Number of hapax legomena | 3871 | 3340 |
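The quantities in the table above can be computed from a tokenized corpus with a simple frequency count. The following is a minimal sketch, not the authors' code: the function name `lexical_stats` and the toy token list are illustrative assumptions, and it assumes tokenization and preprocessing have already been done.

```python
from collections import Counter

def lexical_stats(tokens):
    """Return token count, type count, type:token ratio, and hapax count."""
    counts = Counter(tokens)
    n_tokens = sum(counts.values())   # every occurrence counts as a token
    n_types = len(counts)             # distinct tokens
    n_hapax = sum(1 for c in counts.values() if c == 1)  # types seen once
    ratio = n_types / n_tokens if n_tokens else 0.0
    return {
        "tokens": n_tokens,
        "types": n_types,
        "type_token_ratio": ratio,
        "hapax_legomena": n_hapax,
    }

# Toy, made-up token list (not data from the study).
sample = "i think he love me i think i can leave".split()
stats = lexical_stats(sample)

# The reported ratios follow from the table's own counts.
stayed_ratio = round(7094 / 130545, 3)
left_ratio = round(6269 / 118768, 3)
```

Applied to the table's counts, the ratio of types to tokens reproduces the reported 0.054 and 0.053.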
Top 9 most frequent unigrams (left) and bigrams (right) after preprocessing, with their respective frequencies in the Twitter devset.
| #WhyIStayed unigram | Frequency, n | #WhyILeft unigram | Frequency, n | #WhyIStayed bigram | Frequency, n | #WhyILeft bigram | Frequency, n |
| think | 1061 | love | 930 | think love | 127 | deserve better | 298 |
| love | 971 | realize | 888 | abusive relationship | 112 | finally realize | 103 |
| leave | 872 | want | 702 | feel like | 95 | realize deserve | 80 |
| abuse | 754 | leave | 613 | make feel | 89 | realize love | 67 |
| believe | 578 | know | 594 | try leave | 78 | want live | 66 |
| tell | 550 | better | 570 | emotional abuse | 72 | learn love | 61 |
| want | 540 | deserve | 558 | think deserve | 67 | want daughter | 59 |
| say | 529 | abuse | 507 | make believe | 64 | year old | 56 |
| know | 518 | life | 497 | kill leave | 57 | know deserve | 55 |
Top 9 most frequent trigrams after preprocessing, with their respective frequencies in the Twitter devset. The number in each cell indicates the number of times the trigram appeared in the dataset.
| #WhyIStayed | Frequency, n | #WhyILeft | Frequency, n |
| make feel like | 37 | realize deserve better | 56 |
| pregnant hit url | 25 | know deserve better | 40 |
| stay abusive relationship | 25 | finally realize deserve | 19 |
| change conversation url | 22 | son deserve better | 18 |
| leave man yell | 21 | true love hurt | 18 |
| abusive relationship url | 20 | daughter deserve better | 17 |
| man yell url | 20 | want daughter think | 15 |
| say kill leave | 20 | want daughter grow | 15 |
| church support spousal | 19 | daughter grow think | 15 |
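Frequency tables like the unigram, bigram, and trigram tables above can be produced by sliding a window of length n over each preprocessed tweet. The sketch below is illustrative only: `ngram_counts` is a hypothetical helper, the toy tweets are made up, and it assumes n-grams do not cross tweet boundaries, which the source does not state.

```python
from collections import Counter

def ngram_counts(token_lists, n):
    """Count n-grams within each token list (no n-gram spans two tweets)."""
    counts = Counter()
    for tokens in token_lists:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Toy preprocessed tweets (stop words removed, words lemmatized); made up.
tweets = [
    ["realize", "deserve", "better"],
    ["finally", "realize", "deserve", "better"],
    ["make", "feel", "like"],
]

unigrams = ngram_counts(tweets, 1)
bigrams = ngram_counts(tweets, 2)
trigrams = ngram_counts(tweets, 3)

# most_common(k) gives the k highest-frequency n-grams, as in the tables.
top_trigrams = trigrams.most_common(9)
```

Running the same function separately on the #WhyIStayed and #WhyILeft subsets would give the paired columns.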
The most indicative (in the direction of staying) verbs for abuser onto victim and victim as subject in the tweets having subject-verb-object structures. An exclamation point (!) before a verb indicates negation (eg, the phrase he did not love me would give the verb !love). Each weight cell gives the weight of the corresponding subject-verb-object structure as a support vector machine feature.
| Abuser onto victim | Weight of SVOa structure | Victim as subject | Weight of SVO structure |
| convince | 0.95 | realize | 0.98 |
| need | 0.94 | think | 0.91 |
| isolate | 0.94 | !think | 0.91 |
| promise | 0.92 | find | −0.88 |
| love | 0.90 | learn | −0.88 |
| !love | −0.89 | believe | 0.86 |
| !hit | 0.89 | !know | 0.84 |
| have | 0.87 | try | 0.80 |
| leave | −0.80 | felt | 0.73 |
| tell | 0.80 | know | −0.71 |
| be | 0.78 | tell | 0.71 |
| find | 0.76 | get | −0.70 |
| choke | −0.75 | N/Ab | N/A |
| kill | −0.74 | N/A | N/A |
aSVO: subject-verb-object.
bN/A: not applicable.
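The negation convention in the table above can be illustrated with a small encoding step. This is a sketch under stated assumptions: the `SVO` type and both helper functions are hypothetical, and the triples are assumed to come from an upstream dependency parser and lemmatizer that the code does not implement.

```python
from typing import NamedTuple

class SVO(NamedTuple):
    subject: str
    verb: str            # lemmatized verb, eg "love"
    obj: str
    negated: bool = False

def verb_feature(triple):
    """Prefix the verb with '!' when the clause is negated."""
    return ("!" if triple.negated else "") + triple.verb

def svo_feature(triple):
    """Join subject, (possibly negated) verb, and object into one feature."""
    return f"{triple.subject} {verb_feature(triple)} {triple.obj}"

# "he did not love me" is assumed to be parsed upstream into this triple.
triple = SVO("he", "love", "me", negated=True)
verb_only = verb_feature(triple)   # "!love"
full = svo_feature(triple)         # "he !love me"
```

Keeping negation inside the feature string lets a linear classifier assign separate weights to love and !love.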
Top 10 features, with their linear support vector machine weights, using n-grams and retweet counts as features and informal register replacement during preprocessing. Except for "try leave", the top features were all unigrams.
| #WhyIStayed | SVMa weight | #WhyILeft | SVM weight |
| think | 3.0 | realize | 3.3 |
| believe | 1.6 | finally | 2.4 |
| convince | 1.6 | tired | 1.7 |
| tell | 1.5 | realise | 1.4 |
| say | 1.3 | daughter | 1.4 |
| try leave | 1.1 | son | 1.4 |
| money | 1.0 | die | 1.3 |
| abuser | 0.9 | strong | 1.3 |
| feel | 0.9 | kill | 1.2 |
| young | 0.9 | anymore | 1.2 |
aSVM: support vector machine.
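To show how per-feature weights like those above arise, here is a minimal sketch of a linear support vector machine over bag-of-n-gram features. It is not the authors' setup. The solver (plain subgradient descent on the regularized hinge loss), the hyperparameters, the toy examples, and the sign convention (positive toward #WhyIStayed, negative toward #WhyILeft) are all assumptions. Retweet counts are omitted.

```python
from collections import defaultdict

def train_linear_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Fit a linear SVM by subgradient descent on the L2-regularized hinge loss.

    examples: list of (features, label) pairs, where features is a list of
    n-gram strings and label is +1 (#WhyIStayed) or -1 (#WhyILeft).
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        for features, label in examples:
            margin = label * sum(weights[f] for f in features)
            # L2 regularization: shrink every weight seen so far.
            for f in list(weights):
                weights[f] *= 1.0 - lr * reg
            # The hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                for f in features:
                    weights[f] += lr * label
    return dict(weights)

def top_features(weights, k, toward_stay=True):
    """Return the k features with the most extreme weights in one direction."""
    ranked = sorted(weights, key=weights.get, reverse=toward_stay)
    return ranked[:k]

# Toy, made-up training examples (not data from the study).
examples = [
    (["think", "love"], +1),
    (["believe", "change"], +1),
    (["think", "try leave"], +1),
    (["realize", "deserve"], -1),
    (["finally", "realize"], -1),
    (["tired", "finally"], -1),
]

weights = train_linear_svm(examples)
stay_top = top_features(weights, 3)
leave_top = top_features(weights, 3, toward_stay=False)
```

Because the model is linear, each weight can be read directly as how strongly a feature pushes a tweet toward one hashtag, which is what makes a ranked table of features interpretable.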
Top 10 subject-verb-object features for #WhyIStayed and #WhyILeft, with their support vector machine weights. An exclamation point (!) in front of a predicate verb indicates negation.
| #WhyIStayed | SVMa weights | #WhyILeft | SVM weights |
| he hurt me | 1.1 | he tell him | 1.3 |
| they !remember him | 1.1 | he !protect me | 1.2 |
| he need me | 1.1 | he !tell me | 1.0 |
| he convince me | 1.1 | he lie me | 1.0 |
| she convince me | 1.1 | he stab me | 1.0 |
| he give child | 1.0 | he do kid | 0.9 |
| he remind me | 1.0 | sister tell me | 0.89 |
| he wear me | 1.0 | she have baby | 0.89 |
| he !abuse kid | 1.0 | he strangle me | 0.78 |
| church tell me | 0.99 | he attack me | 0.77 |
aSVM: support vector machine.
Figure 2. A pictorial summary of our results, grouped according to the forces that keep people in abusive relationships or cause them to leave and the dynamics involved in leaving. In the dynamics section, gray arrows denote pairs of textual features that represent opposing pressures and appeared on opposite lists in the same table (1-, 2-, or 3-gram, subject-verb-object, and support vector machine classification features). SVM: support vector machine; SVO: subject-verb-object.