
Measuring Text Difficulty Using Parse-Tree Frequency.

David Kauchak, Gondy Leroy, Alan Hogue.

Abstract

Text simplification often relies on dated, unproven readability formulas. As an alternative, and motivated by the success of term familiarity, we test a complementary measure: grammar familiarity. Grammar familiarity is measured as the frequency of a sentence's 3rd-level parse tree and is useful for evaluating individual sentences. We created a database of 140K unique 3rd-level parse structures by parsing and binning all 5.4M sentences in English Wikipedia. We then calculated the grammar frequencies across the corpus and created 11 frequency bins. We evaluated the measure with a user study and a corpus analysis. For the user study, we randomly selected 20 sentences from each bin, controlling for sentence length and term frequency, and recruited 30 readers per sentence (N=6,600) on Amazon Mechanical Turk. We measured actual difficulty (comprehension) using a Cloze test, perceived difficulty using a 5-point Likert scale, and time taken. Sentences with more frequent grammatical structures, even with very different surface presentations, were easier to understand, were perceived as easier, and took less time to read. Outcomes from readability formulas correlated with perceived but not with actual difficulty. Our corpus analysis shows how the metric can be used to understand grammar regularity in a broad range of corpora.
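
To make the measure concrete, the following is a minimal Python sketch of counting 3rd-level parse structures and grouping them into frequency bins. It is not the authors' implementation. It assumes sentences are already parsed into Penn-Treebank-style bracketed strings, that the root counts as level 1, and that bins are formed by equal-sized groups of frequency rank; the abstract does not specify the parser, the level convention, or the binning rule. The function names (`parse_bracketed`, `truncate`, `assign_bins`) and the example sentences are hypothetical.

```python
from collections import Counter


def parse_bracketed(s):
    """Parse a bracketed parse string into nested (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                pos += 1              # skip the word; only labels are kept
        pos += 1                      # consume ")"
        return (label, children)

    return read()


def truncate(tree, depth):
    """Render the labels of the top `depth` levels (assumption: root = level 1)."""
    label, children = tree
    if depth == 1 or not children:
        return label
    inner = " ".join(truncate(child, depth - 1) for child in children)
    return "(" + label + " " + inner + ")"


def assign_bins(counts, n_bins):
    """Assumed binning rule: split structures into n_bins groups by frequency rank."""
    ranked = [structure for structure, _ in counts.most_common()]
    return {s: i * n_bins // len(ranked) for i, s in enumerate(ranked)}


# Hypothetical, hand-written parses standing in for parser output.
parses = [
    "(S (NP (DT The) (NN dog)) (VP (VBD barked)))",
    "(S (NP (DT A) (NN cat)) (VP (VBD slept)))",
    "(S (NP (PRP She)) (VP (VBD saw) (NP (DT the) (NN bird))))",
]
counts = Counter(truncate(parse_bracketed(p), 3) for p in parses)
bins = assign_bins(counts, n_bins=2)  # the study used 11 bins over 140K structures
```

In this toy input the first two sentences differ in their words but share one 3rd-level structure, which is the sense in which sentences "with very different surface presentations" can have the same grammar frequency.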


Keywords:  Comprehension; Health Literacy; Patient Education; Readability; Text Difficulty; Text Simplification

Year: 2017; PMID: 29057293; PMCID: PMC5644354; DOI: 10.1002/asi.23855

Source DB: PubMed; Journal: J Assoc Inf Sci Technol; ISSN: 2330-1635; Impact factor: 2.687


