Gondy Leroy, James E Endicott.
Abstract
With increasing text digitization, digital libraries can personalize materials for individuals with different education levels and language skills. To this end, documents need meta-information describing their difficulty level. Previous attempts at such labeling used readability formulas, but the formulas have not been validated with modern texts and their outcome is seldom associated with actual difficulty. We focus on medical texts and are developing new, evidence-based meta-tags that are associated with perceived and actual text difficulty. This work describes a first tag, term familiarity, which is based on term frequency in the Google corpus. We evaluated its feasibility to serve as a tag by looking at a document corpus (N=1,073) and found that terms in blogs or journal articles displayed unexpected but significantly different scores. Term familiarity was then applied to texts and results from a previous user study (N=86) and could better explain differences in perceived and actual difficulty.
Keywords: Actual Difficulty; Health Informatics; Lexical Tags; Meta Information; Natural Language Processing; Perceived Difficulty
Year: 2011 PMID: 26618205 PMCID: PMC4662562 DOI: 10.1007/978-3-642-24826-9_38
Source DB: PubMed Journal: Digit Libraries Cult Herit Knowl Dissem Future Creat (2011)