Sharad Jones, Carly Fox, Sandra Gillam, Ronald B Gillam.
Abstract
The accuracy of four machine learning methods in predicting narrative macrostructure scores was compared to scores obtained by human raters using a criterion-referenced progress monitoring rubric. The methods included some that rely on hand-engineered features and others that learn directly from the raw text. The predictive models were trained on a corpus of 414 narratives from a normative sample of school-aged children (5;0-9;11) who were given a standardized measure of narrative proficiency. Performance was measured using Quadratic Weighted Kappa (QWK), a metric of inter-rater reliability. The results indicated that one model, BERT, not only achieved significantly higher scoring accuracy than the other methods but was also consistent with scores obtained by human raters using a valid and reliable rubric. These findings suggest that a machine learning method, specifically BERT, shows promise as a way to automate the scoring of narrative macrostructure for potential use in clinical practice.
Year: 2019 PMID: 31671140 PMCID: PMC6822746 DOI: 10.1371/journal.pone.0224634
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Definition of MISL macrostructure elements and ENP.
| MISL Element | Definition |
|---|---|
| Character | The who or what in the story acting as the agent |
| Setting | The time and/or place the story or episode takes place |
| Initiating Event | An event or problem that causes the story to “take-off” |
| Plan | The idea the character(s) has to fix the problem in the story |
| Action | The action taken by the character in response to the initiating event |
| Consequence | A causally linked event following the character’s action |
| ENP | The number of modifiers following a given noun, that serve to describe the noun |
For full definitions of the macrostructure elements, see [9].
QWK of machine learning models trained on undergraduate scored data.
| MISL Element | CMRF w/ CV | TIRF w/ CV | GVEL w/ CV | BERT w/ TS |
|---|---|---|---|---|
| Character | 0.504 | 0.595 | 0.317 | 0.975 |
| Setting | 0.239 | 0.348 | 0.459 | 0.911 |
| Initiating Event | 0.498 | 0.533 | 0.485 | 0.945 |
| Plan | 0.423 | 0.536 | 0.335 | 0.953 |
| Action | 0.466 | 0.503 | 0.522 | 0.942 |
| Consequence | 0.494 | 0.500 | 0.493 | 0.790 |
| ENP | 0.480 | 0.437 | 0.454 | 0.908 |
QWK between each ML model and the undergraduate scorers, estimated either by 10-fold cross-validation (CV) or on a holdout test set (TS), as compute time permitted.
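Since every figure in these tables is a QWK value, a minimal sketch of how the metric can be computed may help in reading them. This is a generic textbook formulation and not the authors' code; the function name `quadratic_weighted_kappa`, its parameters, and the 0-3 score range in the usage lines are assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Agreement between two lists of integer scores, penalizing
    disagreements by the squared distance between the two scores.

    Hypothetical helper for illustration only.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    if max_score <= min_score:
        raise ValueError("need at least two score categories")

    n_cat = max_score - min_score + 1
    n = len(rater_a)

    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * n_cat for _ in range(n_cat)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1

    # Marginal score counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            # Count expected in this cell if the raters were independent.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # Both raters gave one identical score to every item; kappa is
        # formally undefined here, and this sketch reports full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Usage with made-up scores on an assumed 0-3 scale.
same = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 0, 3)
reversed_scores = quadratic_weighted_kappa([0, 1, 2, 3], [3, 2, 1, 0], 0, 3)
```

A value of 1 means the two sets of scores agree exactly, 0 means agreement no better than chance, and negative values mean systematic disagreement. Because the penalty grows with the square of the score difference, a model that is off by one point is penalized far less than one that is off by three.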
QWK of BERT to US, BERT to ES, and US to ES.
| MISL Element | BERT to US | BERT to ES | US to ES |
|---|---|---|---|
| Character | 0.975 | 0.938 | 0.956 |
| Setting | 0.911 | 0.591 | 0.601 |
| Initiating Event | 0.945 | 0.593 | 0.547 |
| Plan | 0.953 | 0.427 | 0.400 |
| Action | 0.942 | 0.396 | 0.417 |
| Consequence | 0.790 | 0.651 | 0.410 |
| ENP | 0.908 | 0.724 | 0.780 |
QWK between BERT and the undergraduate scores (US), compared with QWK between BERT and the expert scores (ES) and between US and ES.