Deborah Denman1, Renée Speyer1,2,3, Natalie Munro4, Wendy M Pearce5, Yu-Wei Chen4, Reinie Cordier2.
Abstract
Introduction: Standardized assessments are widely used by speech pathologists in clinical and research settings to evaluate the language abilities of school-aged children and to inform decisions about diagnosis, eligibility for services, and intervention. Given the significance of these decisions, it is important that assessments have sound psychometric properties. Objective: The aim of this systematic review was to examine the psychometric quality of currently available comprehensive language assessments for school-aged children and to identify assessments with the best evidence for use.
Keywords: Language Disorder; language assessment; language impairment; psychometric properties; reliability; validity
Year: 2017 PMID: 28936189 PMCID: PMC5594094 DOI: 10.3389/fpsyg.2017.01515
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
COSMIN domains, psychometric properties, aspects of psychometric properties and similar terms based on Mokkink et al. (2010c).
| Reliability | Internal consistency (The degree of the interrelatedness between items) | Internal reliability |
| | Reliability (Variance in measurements which is due to “true” differences among clients) | Inter-rater reliability |
| | Measurement error (Systematic and random error of a client's score that is not due to true changes in the construct to be measured) | Standard Error of Measurement |
| Validity | Content validity (The degree to which the content of an instrument is an adequate reflection of the construct to be measured) | n/a |
| | Construct validity (The degree to which scores are consistent with hypotheses based on the assumption that the instrument validly measures the construct to be measured) | n/a |
| | Aspect of construct validity—structural validity (The degree to which scores reflect the dimensionality of the measured construct) | Internal structure |
| | Aspect of construct validity—hypotheses testing (idem construct validity) | Concurrent validity |
| | Aspect of construct validity—cross-cultural validity (The degree to which the performance of the items on a translated or culturally adapted instrument is an adequate reflection of the performance of the items of the original version of the instrument) | n/a |
| | Criterion validity (The degree to which scores reflect measurement from a “gold standard”) | Sensitivity/specificity (when comparing assessment with gold standard) |
| Responsiveness | Responsiveness (The ability to detect change over time in the construct to be measured) | Sensitivity/specificity (when comparing two administrations of an assessment); Changes over time; Stability of diagnosis |
| Interpretability | Interpretability (The degree to which qualitative meaning can be assigned to quantitative scores obtained from the assessment) | n/a |
Interpretability is not considered a psychometric property.
Figure 1. Flowchart of the selection process according to PRISMA.
Search Terms used in database searches.
| Subject Headings | Child, preschool: 2–5 years; Child: 6–12 years | |
| English language; Preschool child < 1 to 6 years>; School child < 7 to 12 years> | ||
| No limitations | ||
| (English[lang]) AND (“child”[MeSH Terms:noexp] OR “child, preschool”[MeSH Terms]) | ||
| Free Text Words | English language; Child, preschool: 2–5 years; Child: 6–12 years; Publication date: 20130101-20141231 | |
| English language; Preschool child < 1 to 6 years>; School child < 7 to 12 years>; yr = “2013-Current” | ||
| English; Preschool age (2–5 years); School Age (6–12 years); Adolescence (13–17 years); Publication year: 2013–2014 | ||
| English; Preschool Child: 2–5 years; Child: 6–12 years; Publication date from 2013/01/01 to 2014/12/31 | |
| Gray Literature | No limitations | |
| No limitations | ||
| Free Text Words | English Language | |
| English language | ||
| English | ||
| English | ||
| English | ||
| Gray literature | English | |
| Publication year of assessment to current | ||
| No limitations | ||
| No limitations | ||
The title of each assessment and its acronym were used as the search strategy.
Criteria for measuring quality of findings for studies examining measurement properties based on Terwee et al. (2007) and Schellingerhout et al. (2011).
| Internal consistency | + | Subtests uni-dimensional (determined through factor analysis with adequate sample size) and Cronbach's alpha between 0.70 and 0.95 |
| ? | Dimensionality of subtests unknown (no factor analysis) or Cronbach's alpha not calculated | |
| − | Subtests uni-dimensional (determined through factor analysis with adequate sample size) and Cronbach's alpha < 0.70 or > 0.95 | |
| ± | Conflicting results | |
| NR | No information found on internal consistency | |
| NE | Not evaluated due to “poor” methodology rating on COSMIN | |
| Reliability | + | ICC/weighted Kappa equal to or > than 0.70 |
| ? | ICC/weighted Kappa not calculated, or doubtful design or method (e.g., time interval not appropriate) | |
| − | ICC/weighted Kappa < 0.70 with adequate methodology | |
| ± | Conflicting results | |
| NR | No information found on reliability | |
| NE | Not evaluated due to “poor” methodology on COSMIN | |
| Measurement error | + | MIC > SDC or MIC outside LOA |
| ? | MIC not defined or doubtful design or method | |
| − | MIC ≤ SDC or MIC equals or inside LOA with adequate methodology | |
| ± | Conflicting results | |
| NR | No information found on measurement error | |
| NE | Not evaluated due to “poor” methodology on COSMIN | |
| Content validity | + | Good methodology (i.e., an overall rating of “Good” or above on COSMIN criteria for content validity) and experts examined all items for content and cultural bias during development of assessment |
| ? | Questionable methodology or experts only employed to examine one aspect (e.g., cultural bias) | |
| − | No expert reviewer involvement | |
| ± | Conflicting results | |
| NR | No information found on content validity | |
| NE | Not evaluated due to “poor” methodology | |
| Structural validity | + | Factor analysis performed with adequate sample size. Factors explain at least 50% of variance |
| ? | No factor analysis or inadequate sample size. Explained variance not mentioned | |
| − | Factors explain < 50% of variance despite adequate methodology | |
| ± | Conflicting results | |
| NR | No information found on structural validity | |
| NE | Not evaluated due to “poor” methodology | |
| Hypothesis testing | + | Convergent validity: Correlation with assessments measuring similar constructs equal to or >0.5 and correlation is consistent with hypothesis |
| ? | Questionable methodology e.g., only correlated with assessments that are not deemed similar | |
| − | Discriminant validity: findings inconsistent with hypotheses (e.g., no significant difference identified from appropriate statistical analysis) | |
| ± | Conflicting results | |
| NR | No information found on hypothesis testing | |
| NE | Not evaluated due to “poor” methodology |
+, Positive result; −, Negative result; ?, Indeterminate result due to methodological shortcomings; ±, Conflicting results within the same study (e.g., high correlations for some results but not on others); NR, Not reported; NE, Not evaluated; MIC, minimal important change; SDC, smallest detectable change; LOA, limits of agreement; ICC, Intra-class correlation; SD, standard deviation.
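The internal consistency criterion above (uni-dimensional subtests with Cronbach's alpha between 0.70 and 0.95) can be made concrete with a small calculation. The sketch below is illustrative only — the function names and rating logic are ours, not the authors' — and assumes uni-dimensionality has already been established by factor analysis.

```python
# Minimal sketch (not from the review): compute Cronbach's alpha for a
# subtest and apply the Terwee-style rating band (0.70-0.95 -> "+"),
# assuming uni-dimensionality was shown separately via factor analysis.

def cronbach_alpha(item_scores):
    """item_scores: one inner list of scores per item, aligned across
    the same respondents."""
    k = len(item_scores)

    def variance(xs):  # population variance, applied to items and totals alike
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_var_sum = sum(variance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

def rate_internal_consistency(alpha, unidimensional):
    """Return the table's symbol for a given alpha."""
    if not unidimensional:
        return "?"  # dimensionality unknown -> indeterminate
    return "+" if 0.70 <= alpha <= 0.95 else "-"
```

For example, two perfectly parallel items yield an alpha of 1.0, which falls above the 0.95 ceiling and so earns a negative rating under this criterion.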
Level of evidence for psychometric quality for each measurement property based on Schellingerhout et al. (2011).
| Strong evidence | +++ or −−− | Consistent findings across 2 or more studies of “good” methodological quality OR one study of “excellent” methodological quality |
| Moderate evidence | ++ or −− | Consistent findings across 2 or more studies of “fair” methodological quality OR one study of “good” methodological quality |
| Weak evidence | + or − | One study of “fair” methodological quality (examining convergent or discriminant validity if rating hypothesis testing) |
| Conflicting evidence | ± | Conflicting findings across different studies (i.e., different studies with positive and negative findings) |
| Unknown | ? | Only available studies are of “poor” methodological quality |
| Not Evaluated | NE | Only available studies are of “poor” methodological quality as rated on COSMIN |
+, Positive result; –, Negative result.
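The levels in this table amount to a simple decision rule over the number, quality, and consistency of available studies. As a hedged illustration (our own encoding of the summarized rules, not code from the paper):

```python
# Illustrative sketch: map study quality ratings to a level-of-evidence
# label following the Schellingerhout-style rules in the table above.

def level_of_evidence(study_results):
    """study_results: list of (quality, finding) tuples, where quality is
    'excellent' | 'good' | 'fair' | 'poor' and finding is '+' or '-'."""
    usable = [(q, f) for q, f in study_results if q != "poor"]
    if not usable:
        return "?"                      # only poor-quality studies
    findings = {f for _, f in usable}
    if len(findings) > 1:
        return "±"                      # conflicting findings across studies
    sign = findings.pop()
    qualities = [q for q, _ in usable]
    if "excellent" in qualities or qualities.count("good") >= 2:
        return sign * 3                 # strong evidence
    if "good" in qualities or qualities.count("fair") >= 2:
        return sign * 2                 # moderate evidence
    return sign                         # weak evidence
```

So one "excellent" study gives "+++", a single "good" study gives "++", and a mix of positive and negative findings collapses to "±" regardless of quality.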
Summary of assessments included in the review.
| 6–11 years | Spoken language including pragmatics. | |
| Spoken and written language skills including phonemic awareness | ||
| 3–21 years | Spoken language including pragmatics | |
| 5;0–21;11 years | Spoken language; supplemental tests for reading, writing and pragmatics | |
| 3;0–6;11 years | Spoken language | |
| 4–9 years | Spoken language |
| 5;0–12;11 years | Spoken and written language |
| 6–11 years | Spoken language | |
| 3;0–7;5 years | Spoken language | |
| 3–21 years | Spoken language | |
| Birth-7;11 years | Spoken language | |
| 3;0–7;11 | Spoken language | |
| 8;0–17 years | Spoken language | |
| 4;0–8;11 years | Spoken language | |
| 2–90 years | Spoken language |
Normative data are based on U.S. school grade level. No normative data are provided for age level in this assessment.
Summary of assessments excluded from the review.
| 1 | Adolescent Language Screening Test (ALST) | Morgan and Guilford (1984) | 11–17 | Pragmatics, receptive vocabulary, expressive vocabulary, sentence formulation, morphology and phonology | Not published within last 20 years |
| 2 | Aston Index Revised (Aston) | Newton and Thomson (1982) | 5–14 | Receptive language, written language, reading, visual perception, auditory discrimination | Not published within last 20 years |
| 3 | Bracken Basic Concept Test-Expressive (BBCS:E) | Bracken (2006) | 3–6;11 | Expressive: basic concepts | Not comprehensive language assessment |
| 4 | Bracken Basic Concept Test-3rd Edition Receptive (BBCS:3-R) | Bracken (2006) | 3–6;11 | Receptive: basic concepts | Not comprehensive language assessment |
| 5 | Bankson Language Test-Second Edition (BLT-2) | Bankson (1990) | 3;0–6;11 | Semantics, syntax/morphology and pragmatics | Not published within last 20 years |
| 6 | Boehm Test of Basic concepts-3rd Edition (Boehm-3) | Boehm (2000) | Grades K-2 (US) | Basic concepts | Not comprehensive language assessment |
| 7 | Boehm Test of Basic Concepts Preschool-3rd Edition (Boehm-3 Preschool) | Boehm (2001) | 3;0–5;11 | Relational concepts | Not comprehensive language assessment |
| 8 | British Picture Vocabulary Scale-3rd Edition (BPVS-3) | Dunn et al. (2009) | 3–16 | Receptive vocabulary | Not comprehensive language assessment |
| 9 | Clinical Evaluation of Language Fundamentals–5th Edition Metalinguistics (CELF-5 Metalinguistic) | Wiig and Secord (2013) | 9;0–21;0 | Higher level language: making inferences, conversation skills, multiple meanings and figurative language | Not comprehensive language assessment |
| 10 | Clinical Evaluation of Language Fundamentals-5th Edition Screening (CELF-5 Screening) | Semel et al. (2013) | 5;0–21;11 | Receptive and expressive semantics and syntax | Screening assessment |
| 11 | Comprehensive Receptive and Expressive Vocabulary Test-Third Edition (CREVT-3) | Wallace and Hammill (2013) | 5–89 | Receptive and expressive vocabulary | Not comprehensive language assessment |
| 12 | Compton Speech and Language Screening Evaluation-Revised Edition | Compton (1999) | 3–6 | Expressive and receptive language, articulation, auditory memory and oral-motor co-ordination | Screening assessment |
| 13 | Executive Functions Test Elementary | Bowers and Huisingh (2014) | 7;0–12;11 | Higher level language: working memory, problem solving, inferring and making predictions | Not comprehensive language assessment |
| 14 | Expressive Language Test-2nd Edition (ELT-2) | Bowers, Huisingh et al. (2010) | 5;0–11;0 | Expressive language: sequencing, metalinguistics, grammar and syntax | Not comprehensive language assessment |
| 15 | Expressive One-Word Vocabulary Test-4th Edition (EOWPVT-4) | Martin and Brownell (2011) | 2–80 | Expressive vocabulary (picture naming) | Not comprehensive language assessment |
| 16 | Expression, Reception and Recall of Narrative Instrument (ERRNI) | Bishop (2004) | 4–15 | Narrative skills: story comprehension and retell | Not comprehensive language assessment |
| 17 | Expressive Vocabulary Test-Second Edition (EVT-2) | Williams (2007) | 2;6–90+ | Expressive vocabulary and word retrieval | Not comprehensive language assessment |
| 18 | Fluharty Preschool Speech and Language Screening Test-Second Edition (FPSLST-2) | Fluharty (2000) | 3;0–6;11 | Receptive and expressive language: sentence repetition, answering questions, describing actions, sequencing events and articulation | Screening assessment |
| 19 | Fullerton Language Test for Adolescents-Second Edition (FLTA-2) | Thorum (1986) | 11–Adult | Receptive and expressive language | Not published within last 20 years |
| 20 | Grammar and Phonology Screening Test (GAPS) | Van der Lely (2007) | 3;5–6;5 | Grammar and pre-reading skills | Not comprehensive language assessment |
| 21 | Kaufman Survey of Early Academic and Language Skills (K-SEALS) | Kaufman and Kaufman (1993) | 3;0–6;11 | Expressive and receptive vocabulary, numerical skills and articulation | Not published in last 20 years |
| 22 | Kindergarten Language Screening Test-Second Edition (KLST-2) | Gauthier and Madison (1998) | 3;6–6;11 | General language: question comprehension, following commands, sentence repetition, comparing and contrasting objects and spontaneous speech | Screening assessment |
| 23 | Language Processing Test 3 Elementary (LPT-3:P) | Richard and Hanner (2005) | 5–11 | Expressive semantics: word association, categorizing words, identifying similarities between words, defining words, describing words | Not comprehensive language assessment |
| 24 | Montgomery Assessment of Vocabulary Acquisition (MAVA) | Montgomery (2008) | 3–12 | Receptive and expressive vocabulary | Not comprehensive language assessment |
| 25 | Northwestern Syntax Screening Test (NSST) | Lee (1969) | Unknown | Syntax and morphology | Not published in last 20 years |
| 26 | Peabody Picture Vocabulary test-4th Edition (PPVT-IV) | Dunn and Dunn (2007) | 2;6–90 | Receptive vocabulary | Not comprehensive language assessment |
| 27 | Pragmatic Language Skills Inventory (PLSI) | Gillam and Miller (2006) | 5;0–12;11 | Pragmatics | Not comprehensive language assessment |
| 28 | Preschool Language Assessment Instrument-Second Edition (PLAI-2) | Blank et al. (2003) | 3;0–5;11 | Discourse | Not comprehensive language assessment |
| 29 | Preschool Language Scales-5th Edition Screener (PLS-5 Screener) | Zimmerman (2013) | Birth-7;11 | General language | Screening assessment |
| 30 | Receptive One-Word Picture Vocabulary Tests-Fourth Edition (ROWPVT-4) | Martin and Brownell (2010) | 2;0–70 | Receptive vocabulary | Not comprehensive language assessment |
| 31 | Renfrew Action Picture Test-Revised (RAPT-Revised) | Renfrew (2010) | 3–8 | Expressive language: information content, syntax and morphology | Not comprehensive language assessment |
| 32 | Renfrew Bus Story-Revised edition (RBS-Revised) | Renfrew (2010) | 3–8 | Narrative retell | Not comprehensive language assessment |
| 33 | Rhode Island Test of Language Structure | Engen and Engen (1983) | 3–6 | Receptive syntax (designed for hearing impairment but has norms for non-hearing impairment) | Not comprehensive language assessment |
| 34 | Screening Kit of Language Development (SKOLD) | Bliss and Allen (1983) | 2–5 | General language | Not published within last 20 years |
| 35 | Screening Test for Adolescent Language (STAL) | Prather and Breecher (1980) | 11–18 | General language | Not published in last 20 years |
| 36 | Social Emotional Evaluation (SEE) | Wiig (2008) | 6;0–12;0 | Social skills and higher level language | Not comprehensive language assessment |
| 37 | Social Language Development Test Elementary (SLDT-E) | Bowers et al. (2008) | 6–11 | Language for social interaction | Not comprehensive language assessment |
| 38 | Structured Photographic Expressive Language Test-Third Edition (SPELT-3) | Dawson and Stout (2003) | 4;0–9;11 | Expressive syntax and morphology | Not comprehensive language assessment |
| 39 | Structured Photographic Expressive Language Test Preschool-2nd Edition (SPELT-P:2) | Dawson et al. (2005) | 3;0–5;11 | Expressive syntax and morphology | Not comprehensive language assessment |
| 40 | Test for Auditory Comprehension of Language-Fourth Edition (TACL-4) | Carrow-Woolfolk (2014) | 3;0–12;11 | Receptive vocabulary, syntax and morphology | Not comprehensive language assessment |
| 41 | Test of Auditory Reasoning and Processing Skills (TARPS) | Gardner (1993) | 5–13;11 | Auditory processing: verbal reasoning, inferences, problem solving, acquiring and organizing information | Not published within last 20 years |
| 42 | Test for Examining Expressive Morphology (TEEM) | Shipley (1983) | 3;0–7;0 | Expressive morphology | Not published within last 20 years |
| 43 | Test of Early Grammatical Impairment (TEGI) | Rice and Wexler (2001) | 3;0–8;0 | Syntax and morphology | Not comprehensive language assessment |
| 44 | Test of Early Grammatical Impairment-Screener (TEGI-Screener) | Rice and Wexler (2001) | 3–6;11 | Syntax and morphology | Screening assessment |
| 45 | Test of Language Competence-Expanded (TLC-E) | Wiig and Secord (1989) | 5;0–18;0 | Semantics, syntax and pragmatics | Not published within last 20 years |
| 46 | Test of Narrative Language (TNL) | Gillam and Pearson (2004) | 5;0–11;11 | Narrative retell | Not comprehensive language assessment |
| 47 | Test of Pragmatic Language-Second Edition (TOPL-2) | Phelps-Terasaki and Phelps-Gunn (2007) | 6;0–18;11 | Pragmatic skills | Not comprehensive language assessment |
| 48 | Test of Problem Solving 3 Elementary (TOPS-3-Elementary) | Bowers et al. (2005) | | Language-based thinking | Not comprehensive language assessment |
| 49 | Test of Reception of Grammar (TROG-2) | Bishop (2003) | 4+ | Receptive grammar | Not comprehensive language assessment |
| 50 | Test of Semantic Skills-Intermediate (TOSS-I) | Huisingh et al. (2004) | 9–13 | Receptive and expressive semantics | Not comprehensive language assessment |
| 51 | Test of Semantic Skills-Primary (TOSS-P) | Bowers et al. (2002) | 4–8 | Receptive and expressive semantics | Not comprehensive language assessment |
| 52 | Test of Word Finding-Second Edition (TWF-2) | German (2000) | 4;0–12;11 | Expressive vocabulary: word finding | Not comprehensive language assessment |
| 53 | Test of Word Finding in Discourse (TWFD) | German (1991) | 6;6–12;11 | Word finding in discourse | Not comprehensive language assessment |
| 54 | Test of Word Knowledge (TOWK) | Wiig and Secord (1992) | 5–17 | Receptive and expressive vocabulary | Not published within last 20 years |
| 55 | Token Test for Children-Second Edition (TTFC-2) | McGhee et al. (2007) | 3;0–12;11 | Receptive: understanding of spoken directions | Not comprehensive language assessment |
| 56 | Wellcomm: A speech and language toolkit for the early years (Screening tool) English norms | Sandwell Primary Care Trust | 6 months–6 years | General language | Screening assessment |
| 57 | Wh—question comprehension test | Vicker (2002) | 4-Adult | Wh-question comprehension | Not comprehensive language assessment |
| 58 | Wiig Assessment of Basic Concepts (WABC) | Wiig (2004) | 2;6–7;11 | Receptive and expressive: basic concepts | Not comprehensive language assessment |
| 59 | Word Finding Vocabulary Test-Revised Edition (WFVT) | Renfrew (2010) | 3–8 | Expressive vocabulary: word finding | Not comprehensive language assessment |
| 60 | The WORD Test 2 Elementary (WORD-2) | Bowers et al. (2004) | 6–11 | Receptive and expressive vocabulary | Not comprehensive language assessment |
| 61 | Utah Test of Language Development (UTLD-4) | Mecham (2003) | 3;0–9;11 | Expressive semantics, syntax and morphology | Not comprehensive language assessment |
Articles selected for review.
| Eadie et al., | CELF:P-2 (Australian) Diagnostic accuracy | Investigation of sensitivity and specificity of the CELF:P-2 at age 4 years against the Clinical Evaluation of Language Fundamentals-4th Edition (CELF-4) at age 5 years |
| Hoffman et al., | CASL Structural Validity Hypothesis testing | Investigation of the construct (structural) validity of the CASL using factor analysis. Investigation of convergent validity between the CASL and Test of Language Development-Primary: 3rd Edition (TOLD-P:3) |
| Kaminski et al., | CELF:P-2 Hypothesis testing | Investigation of predictive validity and convergent validity between the CELF:P-2 and Preschool Early Literacy Indicators (PELI) |
| McKown et al., | CASL Internal consistency Reliability (test-retest) | Examination of the internal consistency of the Pragmatic Judgment subtest of the CASL Examination of test-retest reliability of the Pragmatic Judgment subtest of the CASL |
| Pesco and O'Neill, | CELF:P-2 DELV-NR Hypothesis testing | Investigation of whether performance on the DELV-NR and CELF:P-2 could be predicted by the Language Use Inventory (LUI) |
| Reichow et al., | CASL Hypothesis testing | Examination of the convergent validity between selected subtests from the CASL with the Vineland Adaptive Behavior Scales |
| Spaulding, | TELD-3 Hypothesis testing | Investigation of consistency between severity classification on the TELD-3 and the Utah Test of Language Development-4th Edition (UTLD-4) |
This subtest forms part of the overall composite score on the CASL.
Ratings of methodological quality and study outcome of reliability and validity studies for selected assessments.
| ACE6-11 | ACE6-11 Manual | 77.8 | Test-retest 75.9 Excell | 53.3 | 42.9 Fair | 25 | Convergent 52.2 Good |
| ALL | ALL Manual | 75.0 | Test-retest 72.4 Good | 20 | 92.9 Excell | 33.3 | Convergent 52.2 Good |
| CASL | CASL Manual | 57.1 | Test-retest 56.0 | 40 | 71.4 Good | 33.3 | Convergent 39.1 Fair |
| Hoffman et al., | NR | NR | NR | NR | 33.3 | Convergent 73.9 Good | |
| McKown et al., | 83.3 | Test-retest 62.0 | NR | NR | NR | NR | |
| Reichow et al., | NR | NR | NR | NR | NR | Convergent 52.2 Good | |
| CELF-5 | CELF-5 Manual | 71.4 | Test-retest 72.4 Good | 40 | 71.4 Good | 58.3 Good | Convergent 65.2 Good |
| CELF:P-2 | CELF:P-2 Manual | 71.4 | Test-retest 72.4 Good | 40 | 64.3 Good | 33.3 | Convergent 47.8 Fair |
| Kaminski et al., | NR | NR | NR | NR | NR | Convergent 56.5 Good | |
| Pesco and O'Neill, | NR | NR | NR | NR | NR | Convergent 47.8 Good | |
| NR | NR | NR | NR | NR | Convergent 65.2 Good | ||
| NR | NR | NR | NR | NR | Convergent 69.6 Good | ||
| DELV-NR | DELV-NR Manual | 66.7 | Test-retest 69 Good | 40 | 57.1 Good | 50 | Convergent 34.8 Fair |
| NR | NR | NR | NR | NR | Convergent 47.8 Good | ||
| ITPA-3 | ITPA-3 Manual | 71.4 | Test-retest 62.1 Good | 40 | 57.1 Fair | 50 Fair | Convergent 34.7 Fair |
| LCT-2 | LCT-2 Manual | 50 | Test-retest 34.6 Fair | 40 | 28.5 Fair | 50 | Discriminant 29.4 |
| NRDLS | NRDLS Manual | 66.7 | Test-retest 60.0 Good | 40.0 | 57.1 Good | NR | Convergent 52.2 Good |
| OWLS-II | OWLS-II Manual | 57.1 | Test-retest 72.4 Good | 40 | 71.4 Good | 33.4 | Convergent 21.7 Poor NR Discriminant 47.1 Fair |
| PLS-5 | PLS-5 Manual | 50 | Test-retest 69.0 Good | 40 | 71.4 Good | 57.1 | Convergent 56.5 Good |
| TELD-3 | TELD-3 Manual | 61.1 | Test-retest 72.4 Good | 33.4 | 71.4 Good | 41.7 | Convergent 39.1 Fair |
| Spaulding, | NR | NR | NR | NR | NR | Convergent 47.8 Fair | |
| TOLD-I:4 | TOLD-I:4 Manual | 71.4 | Test-retest 72.4 Good | 40 | 57.1 Fair | 33.4 | Convergent 60.9 Good |
| TOLD-P:4 | TOLD-P:4 Manual | 71.4 | Test-retest 69.0 Good | 40 | 57.1 Fair | 50 | Convergent 60.9 Good |
| WJIVOL | WJIVOL Manual | 57.2 | NE | 40 | 78.6 Excell | 50 | Convergent 43.5 Fair |
Study outcome ratings are based on Terwee et al. (2007) and Schellingerhout et al. (2011). Excellent (Excell) = 100–75.1, Good = 75–50.1, Fair = 50–25.1, and Poor = 25–0; NR, No study reported for this measurement property in this publication; NE, study not evaluated as “poor” methodological rating; +, ?, – = See Table 3;
Uni-dimensionality of scale not checked prior to internal consistency calculation;
Sample size for factor analysis not stated or small;
Type of statistical analysis used unclear or inappropriate statistical analysis according to COSMIN;
Error measurement calculated using Cronbach alpha or split-half reliability method;
Time interval between assessment administrations not deemed appropriate;
sample size small;
Internal consistency calculated on split-half reliability;
Only reported correlations between subtests (no study using factor analysis);
This study was also evaluated for another of the selected assessments.
Level of evidence for each assessment based on Schellingerhout et al. (2011).
| ACE6-11 | ? | ? | ? | ? | ? | ++ |
| ALL | ? | ? | ? | +++ | ? | +++ |
| CASL | ? | ? | ? | ? | ? | ++ |
| CELF-5 | ? | ++ | ? | ++ | ? | +++ |
| CELF:P-2 | ? | ? | ? | ++ | ? | +++ |
| DELV-NR | ? | ? | ? | ? | ? | ? |
| ITPA-3 | ? | ? | ? | ? | ? | + |
| LCT-2 | ? | ? | ? | ? | ? | + |
| NRDLS | ? | ? | ? | ? | NA | ++ |
| OWLS-II | ? | + | ? | ? | ? | + |
| PLS-5 | ? | ? | ? | ++ | ? | +++ |
| TELD-3 | ? | ? | ? | ? | ? | + |
| TOLD-I:4 | ? | ? | ? | ? | ? | ++ |
| TOLD-P:4 | ? | ? | ? | ? | ? | ++ |
| WJIVOL | ? | NA | ? | ? | ? | + |
+++ or −−−, Strong evidence of positive/negative result; ++ or −−, Moderate evidence of positive/negative result; + or −, Limited evidence of positive/negative result; ±, Conflicting evidence across different studies; ?, Unknown due to poor methodological quality (see Table).
Some studies outside of the manuals were rated as having conflicting evidence within the same study.
Diagnostic Accuracy data reported for each assessment.
| ALL | ALL Manual | 10% base rate for population sample; | −1 SD = 98 | −1SD = 89 | 10% base rate: | 10% base rate: |
| CELF-5 | CELF-5 Manual | 10% base rate for population sample; | −1 SD = 100 | −1 SD = 91 | 10% base rate: | 10% base rate: |
| CELF:P-2 | CELF:P-2 Manual | 20% base rate for population sample; | NR | NR | 20% base rate: | 20% base rate: |
| Eadie et al., | CELF:P-2 scores at 4 years against CELF-4 scores at 5 years | −1.25 SD = 64.0 | −1.25 SD = 92.9 | NR | NR | |
| DELV-NR | DELV-NR Manual | 10% base rate for population sample; | −1 SD = 95 | −1 SD = 93 | 10% base rate: | 10% base rate: |
| PLS-5 | PLS-5 Manual | 20% base rate for population sample; | With standard score 85 as cut-off = 91 | With standard score 85 as cut-off = 78 | 20% base rate: | 20% base rate: |
| TOLD-I:4 | TOLD-I:4 Manual | Criterion against other assessments: | With Standard Score 90 as cut-off: | With Standard Score 90 as cut-off: | With Standard Score 90 as cut-off: | NR |
| TOLD-P:4 | TOLD-P:4 Manual | Criterion against other assessments: | With Standard Score 90 as cut-off: | With Standard Score 90 as cut-off: | With Standard Score 90 as cut-off: | NR |
PPP, Positive Predictive Power; NPP, Negative Predictive Power; Base rate for population sample, percentage of population expected to identify with language impairment; Base rate for referral population, percentage of children referred for assessment who identify with language impairment; NR, Not reported in this study; SD, Number of standard deviations selected as cut-off for calculation;
PLOS, Pragmatic Language Observation Scale;
PPVT-3, Peabody Picture Vocabulary Test-Third Edition;
TOLD-P:4, Test of Language Development-Primary: 4th Edition;
WISC-IV, Wechsler Intelligence Scale for Children-4th Edition (Verbal Comprehension Composite);
Global Language Score, Metavariable combining PLOS, PPVT-3, TOLD-P:4, WISC-IV scores;
TOLD-I:4, Test of Language Development-Intermediate: 4th Edition;
Global Language Score, Metavariable combining PLOS and TOLD-I:4 scores.
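The PPP and NPP values in the diagnostic accuracy table depend on the assumed base rate as well as on sensitivity and specificity. As an illustrative sketch (the function and variable names are ours, not from the paper), Bayes' rule gives:

```python
# Hedged illustration: how sensitivity/specificity combine with an assumed
# base rate to yield Positive/Negative Predictive Power (PPP/NPP).

def predictive_power(sensitivity, specificity, base_rate):
    """All arguments as proportions (e.g., 0.98, 0.89, 0.10)."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    ppp = true_pos / (true_pos + false_pos)   # P(impaired | positive test)
    npp = true_neg / (true_neg + false_neg)   # P(not impaired | negative test)
    return ppp, npp
```

For example, with sensitivity 98% and specificity 89% at a 10% base rate (the −1 SD figures in the ALL row above), PPP comes out near 0.50 while NPP exceeds 0.99, which is why the tabulated predictive values are reported separately for each assumed base rate.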