The Tool for the Automatic Analysis of Cohesion 2.0: Integrating semantic similarity and text overlap.

Scott A Crossley, Kristopher Kyle, Mihai Dascalu.

Abstract

This article introduces the second version of the Tool for the Automatic Analysis of Cohesion (TAACO 2.0). Like its predecessor, TAACO 2.0 is a freely available text analysis tool that works on the Windows, Mac, and Linux operating systems; is housed on a user's hard drive; is easy to use; and allows for batch processing of text files. TAACO 2.0 includes all the original indices reported for TAACO 1.0, but it adds a number of new indices related to local and global cohesion at the semantic level, reported by latent semantic analysis, latent Dirichlet allocation, and word2vec. The tool also includes a source overlap feature, which calculates lexical and semantic overlap between a source and a response text (i.e., cohesion between the two texts based on measures of text relatedness). In the first study in this article, we examined the effects that cohesion features, prompt, essay elaboration, and enhanced cohesion had on expert ratings of text coherence, finding that global semantic similarity as reported by word2vec was an important predictor of coherence ratings. A second study was conducted to examine the source and response indices. In this study, we examined whether source overlap between the speaking samples found in the TOEFL-iBT integrated speaking tasks and the responses produced by test-takers was predictive of human ratings of speaking proficiency. The results indicated that the percentage of keywords found in both the source and response and the similarity between the source document and the response, as reported by word2vec, were significant predictors of speaking quality. Combined, these findings help validate the new indices reported for TAACO 2.0.
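
The two source overlap measures named in the abstract (shared keywords and vector-based similarity) can be illustrated with a minimal Python sketch. This is not TAACO's implementation: the function names, the keyword set, and the toy embedding table are all hypothetical, and averaging word vectors before taking the cosine is an assumed, commonly used way to compare two texts rather than the method the article confirms.

```python
import math


def keyword_overlap(source_tokens, response_tokens, keywords):
    """Fraction of the source's keywords that also appear in the response.

    `keywords` is a hypothetical, caller-supplied set; how TAACO selects
    keywords is not described in the abstract.
    """
    source_keywords = set(source_tokens) & set(keywords)
    if not source_keywords:
        return 0.0
    return len(source_keywords & set(response_tokens)) / len(source_keywords)


def mean_vector(tokens, vectors):
    """Average the embeddings of the tokens that have a known vector."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def semantic_overlap(source_tokens, response_tokens, vectors):
    """Cosine similarity between the averaged source and response vectors.

    Assumption: each text is represented by the mean of its word vectors.
    """
    s = mean_vector(source_tokens, vectors)
    r = mean_vector(response_tokens, vectors)
    if s is None or r is None:
        return 0.0
    return cosine(s, r)


# Toy 2-dimensional embeddings, invented purely for illustration.
toy_vectors = {
    "cells": [1.0, 0.0],
    "divide": [0.8, 0.2],
    "grow": [0.9, 0.1],
    "music": [0.0, 1.0],
}

source = ["cells", "divide", "rapidly"]
response = ["the", "cells", "grow"]

print(keyword_overlap(source, response, {"cells", "divide"}))
print(semantic_overlap(source, response, toy_vectors))
```

In practice the toy table would be replaced by vectors from a trained word2vec model, and tokenization and keyword selection would need to match whatever preprocessing the tool applies.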

Keywords:  Coherence; Cohesion; Essay quality; Natural language processing; Speaking proficiency

Year:  2019        PMID: 30298264     DOI: 10.3758/s13428-018-1142-4

Source DB:  PubMed          Journal:  Behav Res Methods        ISSN: 1554-351X


  6 in total

1.  Assisting students' writing with computer-based concept map feedback: A validation study of the CohViz feedback system.

Authors:  Christian Burkhart; Andreas Lachner; Matthias Nückles
Journal:  PLoS One       Date:  2020-06-29       Impact factor: 3.240

2.  A large-scaled corpus for assessing text readability.

Authors:  Scott Crossley; Aron Heintz; Joon Suh Choi; Jordan Batchelor; Mehrnoush Karimi; Agnes Malatinszky
Journal:  Behav Res Methods       Date:  2022-03-16

3.  Validity of a Computational Linguistics-Derived Automated Health Literacy Measure Across Race/Ethnicity: Findings from The ECLIPPSE Project.

Authors:  Dean Schillinger; Renu Balyan; Scott Crossley; Danielle McNamara; Andrew Karter
Journal:  J Health Care Poor Underserved       Date:  2021-05

4.  LOCO: The 88-million-word language of conspiracy corpus.

Authors:  Alessandro Miani; Thomas Hills; Adrian Bangerter
Journal:  Behav Res Methods       Date:  2021-10-25

5.  Widespread cortical thinning, excessive glutamate and impaired linguistic functioning in schizophrenia: A cluster analytic approach.

Authors:  Liangbing Liang; Angélica M Silva; Peter Jeon; Sabrina D Ford; Michael MacKinley; Jean Théberge; Lena Palaniyappan
Journal:  Front Hum Neurosci       Date:  2022-08-05       Impact factor: 3.473

6.  Developing and Testing Automatic Models of Patient Communicative Health Literacy Using Linguistic Features: Findings from the ECLIPPSE study.

Authors:  Scott A Crossley; Renu Balyan; Jennifer Liu; Andrew J Karter; Danielle McNamara; Dean Schillinger
Journal:  Health Commun       Date:  2020-03-02