
Screening for emotional distress in cancer patients: a systematic review of assessment instruments.

Andrea Vodermaier, Wolfgang Linden, Christopher Siu.

Abstract

Screening for emotional distress is becoming increasingly common in cancer care. This systematic review examines the psychometric properties of the existing tools used to screen patients for emotional distress, with the goal of encouraging screening programs to use standardized tools that have strong psychometrics. Systematic searches of MEDLINE and PsycINFO databases for English-language studies in cancer patients were performed using a uniform set of key words (eg, depression, anxiety, screening, validation, and scale), and the retrieved studies were independently evaluated by two reviewers. Evaluation criteria included the number of validation studies, the number of participants, generalizability, reliability, the quality of the criterion measure, sensitivity, and specificity. The literature search yielded 106 validation studies that described a total of 33 screening measures. Many generic and cancer-specific scales satisfied a fairly high threshold of quality in terms of their psychometric properties and generalizability. Among the ultrashort measures (ie, those containing one to four items), the Combined Depression Questions performed best in patients receiving palliative care. Among the short measures (ie, those containing five to 20 items), the Center for Epidemiologic Studies-Depression Scale and the Hospital Anxiety and Depression Scale demonstrated adequate psychometric properties. Among the long measures (ie, those containing 21-50 items), the Beck Depression Inventory and the General Health Questionnaire-28 met all evaluation criteria. The PsychoSocial Screen for Cancer, the Questionnaire on Stress in Cancer Patients-Revised, and the Rotterdam Symptom Checklist are long measures that can also be recommended for routine screening. In addition, other measures may be considered for specific indications or disease types. 
Some measures, particularly newly developed cancer-specific scales, require further validation against structured clinical interviews (the criterion standard for validation measures) before they can be recommended.

Year: 2009; PMID: 19826136; PMCID: PMC3298956; DOI: 10.1093/jnci/djp336

Source DB: PubMed; Journal: J Natl Cancer Inst; ISSN: 0027-8874; Impact factor: 13.506

Transient mood disturbances occur frequently among cancer patients during the disease trajectory, and depression often persists in these patients (1). Consequently, psychosocial counseling has become an integral part of cancer care, and several meta-analyses support its efficacy (2–4). More specifically, behavioral interventions (5–7) and supportive–expressive group therapy (8,9) are effective in reducing emotional distress in cancer patients. These treatments work best for patients with pronounced clinical symptoms of emotional distress (10). To maximize the use of limited treatment resources and provide equitable access to mental health services, emotionally distressed cancer patients need to be reliably identified. Traditionally, referrals for mental health services are either self-initiated or based on physician judgment. However, the concordance rates between patients’ self-report and physicians’ clinical impressions are low, which points to a need for standardized validated tools for measuring emotional distress (11,12). The so-called criterion standard clinical assessment interviews for emotional distress, whether standardized (eg, the Composite International Diagnostic Interview [CIDI] for DSM-IV Axis I Disorders) or structured (eg, the Structured Clinical Interview for DSM-IV Axis I Disorders [SCID-I]), are time consuming for both the patient and the clinical staff who administer them and are therefore costly, so their routine implementation in busy clinics is unlikely. Furthermore, patients who are receiving palliative care may not be physically able to complete lengthy diagnostic interviews. Thus, relatively brief but validated questionnaires would seem to be the tools of choice for routine screening of cancer patients’ emotional distress. Brief self-reports are easy to administer, inexpensive (some are even free), and, if properly validated, can help identify those patients most in need of professional mental health support. 
A distinct advantage of systematic screening of cancer patients for emotional distress is that it is likely to promote equal access to psychological services, whereas a system that is based only on physician- or patient-initiated referrals might fail to identify and/or overlook a substantial proportion of emotionally distressed patients who are in need of supportive treatment. Furthermore, systematic screening allows mental health staff to forecast their workload. To date, however, only a minority of cancer centers in the United States (13), the United Kingdom (14), and Canada (15) have implemented emotional distress screening of patients with standardized tools. Time constraints of health professionals and insufficient knowledge about the appropriate screening tool may partially account for the infrequent use of high-quality screening instruments in cancer care settings. The widely acknowledged shortage of professional staff for treatment follow-through suggests a need for screening tools with high sensitivity and high specificity that ensure that all patients in need of psychological support are identified. We posit that the choice of a screening tool ought to consider the psychometric properties of the instrument, with special emphasis on its sensitivity and specificity, the treatment environment, and the patient's disease stage. Psychological measures (which in this review are referred to as scales, tools, instruments, and measures) come in varying lengths and formats. One important distinguishing feature for various scales is their length, which is defined by the number of questions or test items they contain; the term “screening tool” usually refers to particularly short tests. Longer tests cost more money to administer but are sometimes needed to reach acceptable levels of reliability and validity. The advantages and disadvantages of screening tools of varying lengths are summarized in Table 1. 
We created the following length categories according to the number of items in a measure: ultrashort (one to four items), short (five to 20 items), and long (21–50 items); these cut points were chosen arbitrarily before data extraction and review. Ultrashort measures are typically limited to one psychological domain, such as depression or anxiety, and are the easiest to implement in routine care settings; however, they may not be appropriate for use in research settings. Their brevity presents a potential economic advantage because fewer staff resources are required for their administration and scoring. As one meta-analysis (16) demonstrated, ultrashort screening tools can possess adequate sensitivity to identify distressed patients but lack the specificity to rule out those patients who were wrongly identified as distressed (ie, false positives). Test instruments that contain more than four items can assess more aspects of emotional impairment and may possess superior psychometric properties. The trade-off is that routine use of longer tools, particularly their scoring and interpretation, places more of a burden on staff time. However, the availability of touch screen computer–based assessments can eliminate this disadvantage because the computer program can automatically score the assessment tool and generate a report.
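The length categories defined above can be written down as a small lookup. This is only an illustrative sketch: the function name `length_category` is hypothetical, and rejecting measures with more than 50 items reflects the inclusion limit stated later in the Methods.

```python
def length_category(n_items: int) -> str:
    """Map a measure's item count to the review's length category."""
    if n_items < 1:
        raise ValueError("a measure needs at least one item")
    if n_items <= 4:
        return "ultrashort"   # one to four items
    if n_items <= 20:
        return "short"        # five to 20 items
    if n_items <= 50:
        return "long"         # 21-50 items
    # Measures above 50 items were outside the scope of the review.
    raise ValueError("measures with more than 50 items were not reviewed")


# Item counts as listed in Table 3 for three of the reviewed measures.
for name, items in [("DT", 1), ("HADS", 14), ("BDI", 21)]:
    print(name, length_category(items))
```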

Table 1

Advantages and disadvantages of screening tools of varying length

Ultrashort (1–4 items)	Short (5–20 items)	Long (21–50 items)
Excellent chance for adoption in busy clinics	Moderate chance for adoption in busy clinics	Routine use unlikely unless automated
Sensitivity may be high, low-to-moderate specificity	Likely high sensitivity, moderate-to-high specificity	Specificity and sensitivity can be high
Can only assess one domain	Can assess multiple domains	Can assess multiple domains
Not suitable for research	May be suitable for research, needs to be tested	Excellent for research
Inexpensive	Some cost in scoring (can be minimized via automation)	Potentially costly scoring (can be minimized via automation)

Included in this review are both newly developed and well-established distress screening tools that have been validated in patients with cancer. To the best of our knowledge, this systematic review is the most comprehensive review of screening instruments for emotional distress in cancer patients to date. In this review, we define distress as a state of negative affect that is suggestive of affective disorders (ie, minor or major depressive disorder and dysthymia), anxiety disorders, and adjustment disorders (depressive, anxious, or mixed). Measures of related domains (eg, physical symptom distress, lack of social support, quality of life, and patient needs) were excluded.

Methods

Study Selection

The data extraction and study review process were performed according to the guidelines for systematic reviews of diagnostic tests in cancer (17). We searched MEDLINE (1966 to August 2008) and PsycINFO (1872 to August 2008) databases for English-language studies in cancer patients by using the following search terms (cancer OR screening OR instrument OR measure OR questionnaire OR validation) AND (distress OR depression OR anxiety OR adjustment disorder OR negative affect OR psychological). After eliminating the duplicate studies, the titles and abstracts of the remaining studies were reviewed independently by two authors (A. Vodermaier and C. Siu) (Figure 1). These authors also reviewed the full-length article for all studies that were retained, and their interrater reliability was calculated. Interrater reliability was computed as a kappa coefficient (κ = .86). Disagreements about whether or not studies met the inclusion criteria were resolved by seeking additional input from the second author (W. Linden). The first author (A. Vodermaier) performed a detailed assessment of the included studies and identified additional validation studies via cross-referencing.
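For readers unfamiliar with the kappa coefficient, the following sketch shows how agreement between two reviewers' include/exclude decisions can be corrected for chance. It assumes the unweighted Cohen kappa (the text does not say which variant was used), and the decision lists are invented for illustration; they are not the review's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa for two raters' categorical decisions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of studies")
    n = len(rater_a)
    # Proportion of studies on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical screening decisions for ten abstracts.
reviewer_1 = ["in", "in", "out", "out", "in", "out", "out", "out", "out", "out"]
reviewer_2 = ["in", "in", "out", "out", "out", "out", "out", "out", "out", "out"]
print(round(cohen_kappa(reviewer_1, reviewer_2), 2))
```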

Figure 1

Flowchart of studies included in systematic review.

Study Inclusion and Evaluation Criteria

A study was included in this review if it attempted to validate a newly developed cancer-specific questionnaire (either interviewer administered or standardized self-administered) or reported on an existing generic measure that had also been validated in a sample of cancer patients. The measure could not exceed 50 items and must have been published in a peer-reviewed English-language journal. We focused on published peer-reviewed studies because we expected them to be the most methodologically rigorous, thus yielding the strongest conclusions with regard to recommendations about tool choice. Studies included in this review were evaluated on the basis of the following criteria: the number of validation studies identified, the number of participants across studies, generalizability across cancer types and/or disease stages, reliability, type of the criterion measure (in which structured clinical interviews such as Composite International Diagnostic Interview or SCID represent the criterion standard), and validity. When information on sensitivity, specificity, positive predictive value, or negative predictive value was partly missing but could be computed on the basis of other data presented, we completed these computations.
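As an illustration of the kind of computation mentioned above, predictive values can be derived from sensitivity, specificity, and the prevalence of the target condition. The sketch below is a standard Bayes calculation, not a description of the exact steps taken for any particular study; the function name and the example values are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    # Expected proportions of the sample in each cell of the 2 x 2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    # Predictive values are undefined when nobody screens positive (or negative).
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos > 0 else float("nan")
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg > 0 else float("nan")
    return ppv, npv


# Hypothetical tool: sensitivity .80, specificity .70, one in four patients distressed.
ppv, npv = predictive_values(0.80, 0.70, 0.25)
print(round(ppv, 2), round(npv, 2))
```

Because both predictive values depend on prevalence, the same tool can show very different PPVs in palliative and outpatient samples even when its sensitivity and specificity are unchanged.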

Reliability

Using the recommendations of statistical experts (18), we required an internal consistency of .8 or higher for a screening instrument to warrant a designation of high quality. Internal consistency was usually reported in the included studies as the Cronbach alpha estimate (19) or as the Spearman–Brown rho coefficient (19). Reliability was available for the generic scales included in this systematic review. For newly developed cancer-specific scales whose psychometric properties have not yet been established, however, internal consistency must be reported before reliability can be judged adequate for a screening tool. Unless subscale reliabilities were specifically reported, the Cronbach alpha or Spearman–Brown rho represents the internal consistency for the entire scale. Test–retest reliability was considered a less important index of reliability than the scale's internal consistency because mood in cancer patients is known to be unstable and a function of where the patient is in the illness trajectory (20). Information on test–retest reliability or sensitivity to change was included in the description of studies when it was available.
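To make the internal consistency criterion concrete, the sketch below computes the Cronbach alpha from item-level responses using the usual formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The response matrix is invented, and the function name is hypothetical; none of the reviewed studies' raw data are reproduced here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item scale answered by four patients.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3], [4, 4]])
print(round(alpha, 2), "meets the .80 threshold" if alpha >= 0.80 else "below .80")
```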

Validity

We assumed that the typical screening measures included in this systematic review already have face and content validity. Therefore, this review focused on information about concurrent, construct, and discriminant validity. Concurrent validity is a test's ability to measure similar phenomena as do other tests for the same target variable, for example, other anxiety tests. Construct validity seeks agreement between a theoretical concept and a specific measure. For example, a researcher developing a depression scale will first make a concerted effort to define depression so that the new test actually captures the target variable of depression. Regarding quality of validation, we posit that the most important criterion is whether or not a screening tool has empirically validated cutoffs based on clearly identified sensitivity and specificity data. Hence, in this review, we placed the greatest emphasis on the results of receiver operating characteristic (ROC) analyses that provided empirically justified cutoffs for clinical decision making (ie, discriminant validity). The ROC curve is a graphical plot of the sensitivity vs 1 minus the specificity that provides information needed for choosing a useful cutoff. For this review, a tool was considered to have high validity if the average of its sensitivity and specificity estimates was .80 or higher. We searched for evidence of predictive validity in particular but could not find any study that was suitable to be included in the review.
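The following sketch illustrates how sensitivity and specificity are obtained at each candidate cutoff, which is the information an ROC curve plots. Two assumptions are made purely for illustration: a patient screens positive when the score is at or above the cutoff, and the "best" cutoff is the one with the highest average of sensitivity and specificity (mirroring the validity summary used in this review; individual validation studies may have chosen cutoffs differently). The scores and case labels are invented.

```python
def roc_points(scores, is_case, cutoffs):
    """Return (cutoff, sensitivity, specificity) for each candidate cutoff."""
    n_cases = sum(is_case)
    n_noncases = len(is_case) - n_cases
    if n_cases == 0 or n_noncases == 0:
        raise ValueError("need at least one case and one non-case")
    points = []
    for cutoff in cutoffs:
        # Assumption: score >= cutoff counts as a positive screen.
        true_pos = sum(s >= cutoff and case for s, case in zip(scores, is_case))
        true_neg = sum(s < cutoff and not case for s, case in zip(scores, is_case))
        points.append((cutoff, true_pos / n_cases, true_neg / n_noncases))
    return points


def best_cutoff(points):
    """Pick the point with the highest mean of sensitivity and specificity."""
    return max(points, key=lambda p: (p[1] + p[2]) / 2)


# Hypothetical screening scores and criterion diagnoses for eight patients.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
is_case = [False, False, False, True, True, False, True, True]
for cutoff, sens, spec in roc_points(scores, is_case, range(3, 8)):
    print(cutoff, sens, 1 - spec)  # ROC coordinates: sensitivity vs 1 - specificity
print("chosen cutoff:", best_cutoff(roc_points(scores, is_case, range(3, 8)))[0])
```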

Overall Judgment

Our evaluation of the validation studies used decision rules that are summarized in Table 2. The results of individual studies were averaged across each single measure such that the number of participants was weighted across studies within each measure to assess overall reliability, type of criterion measure, and validity. Reliability, type of criterion measure, and validity were rated as high, moderate, or low. These three ratings were condensed into a five-level overall judgment (excellent, good, moderate, fair, or poor) according to the decision rules described in Table 2. The overall judgment was “poor” if any of the three criteria was rated as low, reliability was not reported, no ROC analysis data were available, or the number of participants in a validation attempt was below a threshold of 100 when self-report scales were used as the criterion measure or below 50 when structured clinical interviews were used. Given that generalizability is not of general importance for screening tool choice, this criterion did not influence the overall judgment.
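A participant-weighted average of the kind described above can be sketched as follows. This is one plausible reading of the weighting (each study's estimate weighted by its number of participants); the exact procedure is not spelled out in the text, and the study values shown are hypothetical.

```python
def participant_weighted_mean(studies):
    """Average study-level estimates, weighting each by its sample size.

    `studies` is a sequence of (n_participants, estimate) pairs.
    """
    studies = list(studies)
    total_n = sum(n for n, _ in studies)
    if total_n <= 0:
        raise ValueError("total number of participants must be positive")
    return sum(n * estimate for n, estimate in studies) / total_n


# Two hypothetical validation studies of the same measure.
pooled_sensitivity = participant_weighted_mean([(100, 0.80), (300, 0.60)])
print(round(pooled_sensitivity, 2))
```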

Table 2

Decision rule

Reliability	Criterion measure	Validity	Judgment
High	High	High	Excellent
High	High	Moderate	Good
High	Moderate	High	Good
Moderate	High	High	Good
High	Moderate	Moderate	Moderate
Moderate	High	Moderate	Moderate
Moderate	Moderate	Moderate	Fair
Low or not reported*	Low	Low	Poor†
Construct validity data only
No. of participants across studies‡: n < 100 or n < 50

* For established scales, reliability did not necessarily have to be reported. Reliability also was not applicable for one- or two-item scales.

† If one or more criteria were rated low, the overall judgment was “poor.”

‡ n < 100 when a scale was used as the criterion measure, and n < 50 when a structured clinical interview was used as the criterion measure.
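The decision rule in Table 2 can be expressed as a lookup, sketched below under stated assumptions. Rows whose Judgment cell is blank in the table are read as sharing the judgment of the row above them (a reading consistent with the ratings in Table 3, eg, BSI-18). Combinations that Table 2 does not list raise an error rather than being guessed, and the caller is expected to pass "high" when reliability is not applicable or the scale is well established. All names are hypothetical.

```python
# (reliability, criterion measure, validity) -> overall judgment, as read from Table 2.
TABLE_2 = {
    ("high", "high", "high"): "excellent",
    ("high", "high", "moderate"): "good",
    ("high", "moderate", "high"): "good",
    ("moderate", "high", "high"): "good",
    ("high", "moderate", "moderate"): "moderate",
    ("moderate", "high", "moderate"): "moderate",
    ("moderate", "moderate", "moderate"): "fair",
}


def overall_judgment(reliability, criterion, validity,
                     n_participants, interview_criterion, roc_available=True):
    """Condense three ratings into the five-level judgment of Table 2."""
    # Minimum sample: 50 against a structured interview, 100 against a scale.
    min_n = 50 if interview_criterion else 100
    ratings = (reliability, criterion, validity)
    if ("low" in ratings or reliability == "not reported"
            or not roc_available or n_participants < min_n):
        return "poor"
    if ratings not in TABLE_2:
        raise ValueError(f"combination {ratings} is not listed in Table 2")
    return TABLE_2[ratings]


# Ratings and participant totals taken from Table 3 for two measures.
print(overall_judgment("high", "high", "high", 1002, True))       # CES-D
print(overall_judgment("high", "high", "moderate", 10203, True))  # HADS
```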

Results

The literature search identified 2747 publications. A total of 1416 studies remained after duplicate studies were removed. The decision steps are detailed in Figure 1. Data extraction and additional articles found via checks of cross-references resulted in 106 validation studies that described a total of 33 measures. Table 3 provides the summary judgments for the screening tools based on the predefined evaluation criteria. The key data for each study were extracted and are presented in Tables 4, 5, and 6. Table 4 presents the validation studies on questionnaires that contain one to four items, Table 5 describes those containing five to 20 items, and Table 6 covers those with 21–50 items. When a non–English-speaking country is noted in the “sample” column, it refers to a version of the scale that was translated according to standard forward and backward translation procedures [except in one study (125), where this procedure was not used]. The Brief Symptom Inventory–18 (BSI-18) (127), the BSI-53 Global Severity Index (128), the Center for Epidemiological Studies–Depression Scale (CES-D) (129), the General Health Questionnaire–12 (GHQ-12) (130), the Hospital Anxiety and Depression Scale (HADS) (131), the Patient Health Questionnaire–9 (PHQ-9) (132), the Symptom Checklist-90–Revised (133), and the State–Trait Anxiety Inventory–trait version (134) were used as criterion measures; the BSI-18, CES-D, and HADS were also used as screening tools.

Table 3

Tool evaluation*

Scale length	Measure	No. of items	No. of studies	No. of participants†	Generalizability	Reliability‡	Criterion measure§	Validity║	Judgment¶
Ultrashort	Anxiety question	1	1	79	No	—	Moderate	Moderate	Poor
	BCD	4	1	100	Yes	Low	High	Moderate	Poor
	Depression question	1	6	1008	No	—	High	Moderate	Good
	Interest question	1	2	376	No	—	Moderate	Moderate	Moderate
	Combination depression question	2	3	573	No	—	High	High	Excellent
	One-question interview	1	1	275	Yes	—	Low	Moderate	Poor
	DT	1	15	4088	Yes	Moderate	Moderate	Moderate	Fair
	ESAS	2	2	295	Yes	—	Moderate	Moderate	Fair
	VAS	1	6	3888 (3456)	Yes	Moderate	High	Moderate	Moderate
Short	BDI-SF	13	2	424	No	Low	High	Moderate	Poor
	BEDS-6	6	1	246	No	Moderate	High	Moderate	Moderate
	BSI-18	18	4	10 749 (1764)	Yes	High	Moderate	High	Good
	CES-D	20	4	1002 (93)	Yes	High	High	High	Excellent
	EPDS	10	4	470 (420)	No	High	High	Moderate	Good
	GHQ-12	12	2	267 (188)	Yes	—	High	Moderate	Good
	HADS	14	41	10 203 (6332)	Yes	High	High	Moderate	Good
	HQ-9	9	1	250	No	High	High	Moderate	Good
	IES	15	3	283 (95)	No	High	High	Low	Poor
	MAX-PC	18	3	930 (0)	No	High	High	—	Poor
	PDI	13	2	239	No	High	High	Moderate	Good
	PHQ-9	9	2	390 (0)	No	High	—	—	Poor
	PCL-C	17	3	429 (82)	No	High	High	Moderate	Good
	POMS-LASA	6	1	42 (0)	No	Moderate	Moderate	—	Poor
	ZSDS	20	6	1459 (155)	Yes	Low	High	Moderate	Poor
Long	BAI	21	1	33	Yes	—	High	High	Poor
	BDI	21	4	398 (293)	Yes	High	High	High	Excellent
	DI-C	26	1	63 (0)	No	High	—	—	Poor
	GHQ-28	28	2	170	Yes	High	High	High	Excellent
	MEQ	23	2	90 (45)	No	High	High	Moderate	Poor
	POMS-SF	37	1	428 (0)	No	High	—	—	Poor
	PSSCAN	21	2	1939 (101)	Yes	High	Moderate	High	Good
	QSC-R23	23	2	3366 (596)	Yes	High	Moderate	High	Good
	RSCL	30	8	2439 (863)	Yes	High	High	Moderate	Good

BAI = Beck Anxiety Inventory; BCD = Brief Case Find for Depression; BDI-SF = Beck Depression Inventory–Short Form; BEDS-6 = Brief Edinburgh Depression Scale; BSI-18 = Brief Symptom Inventory–18; CES-D = Center for Epidemiological Studies–Depression Scale; DI-C = Distress Inventory for Cancer; DT = Distress Thermometer; EPDS = Edinburgh Postnatal Depression Scale; ESAS = Edmonton Symptom Assessment System; GHQ = General Health Questionnaire; HADS = Hospital Anxiety and Depression Scale; HQ-9 = Hornheide Questionnaire–9; IES = Impact of Event Scale; MAX-PC = Memorial Anxiety Scale for Prostate Cancer; MEQ = Mood Evaluation Questionnaire; PCL-C = PTSD checklist; PDI = Psychological Distress Inventory; PHQ-9 = Patient Health Questionnaire–9; POMS-LASA = overall quality of life visual analog scale; POMS-SF = Profile of Moods State–short form; PSSCAN = Psychosocial Screen for Cancer; ROC = receiver operating characteristic; RSCL = Rotterdam Symptom Checklist; VAS = visual analog scale; ZSDS = Zung Self-Rating Depression Scale; — = no or insufficient information available.

† Number of participants refers to the total across studies that provide validity information based on ROC analyses data. If only a subsample was included in ROC analyses, then the subsample is indicated in parentheses.

‡ Low = Cronbach alpha or Spearman–Brown rho < .60, κ < .40, or r = .2. Moderate = Cronbach alpha or Spearman–Brown rho ≥ .60 and < .80, κ ≥ .40 and < .60, or r = .5. High = Cronbach alpha or Spearman–Brown rho ≥ .80, κ ≥ .60, or r = .8. When reliability information was low or missing, the scale was judged as poor unless reliability was not applicable (in the case of single- or double-item measures) or the scale was already well established.

§ Low = clinical diagnosis; moderate = validated questionnaire; high = structured clinical interview (criterion standard).

║ Low = averaged sensitivity and specificity < .6; moderate = averaged sensitivity and specificity ≥ .6 and < .8; high = averaged sensitivity and specificity ≥ .8.

¶ Judgment according to decision rule in Table 2.
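The rating thresholds in the Table 3 footnotes can be sketched as small helper functions. The sketch covers only the Cronbach alpha or Spearman–Brown band for reliability and reads the moderate validity band as running from .6 up to, but not including, .8. Function and dictionary names are hypothetical.

```python
# Quality of the criterion measure, as defined in the Table 3 footnotes.
CRITERION_QUALITY = {
    "clinical diagnosis": "low",
    "validated questionnaire": "moderate",
    "structured clinical interview": "high",
}


def rate_internal_consistency(coefficient):
    """Rate a Cronbach alpha or Spearman-Brown rho value."""
    if coefficient < 0.60:
        return "low"
    if coefficient < 0.80:
        return "moderate"
    return "high"


def rate_validity(sensitivity, specificity):
    """Rate validity from the average of sensitivity and specificity."""
    average = (sensitivity + specificity) / 2
    if average < 0.6:
        return "low"
    if average < 0.8:
        return "moderate"
    return "high"


# BEDS-6 values from Table 5: alpha = .78, sensitivity .72, specificity .83.
print(rate_internal_consistency(0.78), rate_validity(0.72, 0.83))
```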

Table 4

Ultrashort measures*

Scale name	Study, first author (reference)	Criterion	No. of participants	Sample	Reliability	Cutoff	Sensitivity	Specificity	PPV	NPV	Further validation
Anxiety question	Teunissen (21)	HADST	79	Palliative	—	1	.78	.52	.46	.82	No correlation with symptom distress except for pain
BCD	Jefford (22)	PRIME-MD	100	Mixed; 60% with metastatic disease	κ = .21	†	.67	.75	.41	.89	Patients with higher BCD scores had a lower ECOG performance status
Depression question	Chochinov (23)	SADS: AD+MDD	197	Palliative	—	1	1.00	1.00	1.00	1.00
	Lloyd-Williams (24,25)	SCID	74	Palliative	—	1	.55	.74	.44	.82
	Akechi (26)	SCID	209	Palliative; Japan	—
		AD+MDD				1	.47	.97	.81	.86
		MDD				1	.79	.92	.41	.98
	Kawase (27)	SCID: AD+MDD	282	Receiving radiotherapy; Japan	—	1	.42	.86	.22	.94
	Payne (28)	CD	167	Palliative	—	1	.70	.81	.57	.89
	Teunissen (21)	HADST	78	Palliative	—	1	.61	.94	.93	.65	No correlation with symptom distress except for pain
Interest question	Akechi (26)	SCID	209	Palliative; Japan	—
		AD+MDD				1	.47	.96	.76	.86
		MDD				1	.93	.92	.45	.99
	Payne (28)	CD	167	Palliative	—	1	.79	.73	.50	.91
Combination depression question	Chochinov (23)	SADS	197	Palliative	—	2	1.00	.98	.86	1.00
	Akechi (26)	SCID	209	Palliative; Japan	—
		AD+MDD				1	.68	.94	.76	.91
		MDD				1	1.00	.86	.33	1.00
	Payne (28)	CD	167	Palliative		2	.91	.68	.49	.96
One-question interview	Akizuki (29)	CD	275	Mixed; 42% with recurrent, metastatic disease; Japan	—	65	.80	.61	.76	.66	ECOG performance status and recurrence/metastatic disease were associated with poor mood
Distress Thermometer	Trask (30)	CRS	50	Awaiting bone marrow transplantation	r = .49; κ = .44	5	.79	.68	.60	.84	r(DT, HADSA) = .42; r(DT, HADSD) = .23. Nurses underestimated patients with high levels of distress
	Akizuki (29)	CD	275	Mixed; 42% with recurrent, metastatic disease; Japan	—	5	.84	.61	.77	.71
	Hoffman (31)	BSI	68	Mixed	α = .81 (of PL total)	4	.70	.59	.53	.75	Lacks a single cutoff. r(DT, BSI-18) = .61; r(DT, BSI) = .59
		BSI				5	.59	.71	.57	.73
		BSI-18				4	.90	.53	.25	.97
		BSI-18				5	.70	.64	.25	.93
	Akizuki (32)	CD	295	Mixed; 41% with recurrent, metastatic disease; Japan	—						r(DT, HADST) = .70; r(IT, HADST) = .71
		AD+MDD				4	.82	.82	.80	.84
		MDD				5	.89	.70	.39	.97
	Gil (33)	HADST	312	Mixed; 20% with metastatic disease; multicenter; Europe	—						r(DT, HADSA) = .50; r(DT, HADSD) = .40; r(MT, HADSA) = .56; r(MT, HADSD) = .61; r(DT, MT) = .40
		DT				4	.65	.79	.55	.85
		DT				5	.70	.73	.51	.86
		MT				3	.85	.72	.55	.92
		MT				4	.78	.77	.58	.90
	Jacobsen (34)	HADST	380	Mixed; multicenter	—	4	.77	.68	.44	.90	Patients above the cutoff were more likely to be female, have poorer performance status, and endorse more problems on the PL
	Jacobsen (34)	BSI-18	380	Mixed; multicenter	—	4	.70	.70	.53	.83
	Ransom (35)	CES-D	491	Bone marrow transplantation	—	4	.80	.70	.46	.92	r(DT, CES-D) = .59. Patients above the cutoff had poorer performance status and endorsed more problems on the PL
	Ozalp (36)	HADST	182	Mixed; Turkish	—	4	.73	.49	.47	.76	r(DT, HADSA) = .45; r(DT, HADSD) = .39
	Recklitis (37)	SCL-90-R	119	Adult survivors of childhood cancer	—	4	.56	.81	.56	.81
	Recklitis (37)	SCL-90-R	119	Adult survivors of childhood cancer	—	5	.64	.65	.44	.81
	Butt (38)	HADSA	597	Mixed	—	5	.86	.77	.56	.94	r(DT, HADST) = .56; r(DT, HADSA) = .38
	Butt (38)	HADSD	597		—	5	.63	.68	.39	.85	r(DT, HADST) = .56; r(DT, HADSA) = .38
	Dolbeault (39)	HADST	561	Outpatients; 19% with metastatic disease; French	—	3	.76	.82	.72	.85	r(DT, HADST) = .64. Women were more distressed, as well as patients taking analgesics and low in social support
	Gessler (40)	HADS	171	Outpatients; some palliative	Sensitive to change	5	.79	.81	.61	.91
		GHQ-12				5	.63	.83	.71	.77
		BSI-18				5	.88	.74	.38	.97
	Hegel (41)	PHQ-9	321	Breast, newly diagnosed	—	7	.81	.85	.37	.98
	Shim (42)	HADST	108	Mixed, 28% with metastatic disease; Korean	—	4	.83	.59	.57	.84	R² = .27 for HADSA; R² = .27 for HADSD
	Tuinman (43)	HADST	277	Mixed	α = .90 (of PL total)	5	.85	.67	.39	.95
ESAS	Teunissen (21)	HADST	79	Palliative	—
		Anxiety				5	.90	.76	.70	.93
		Depression				5	.61	.73	.71	.63
	Vignaroli (44)	HADST	216	Mixed	—
		Anxiety				2	.86	.56	.60	.84
		Depression				2	.77	.55	.50	.81
Visual analog scales	Chochinov (23)	SADS	197	Palliative	—	55	.72	.50	.17	.92
	Lees (45)	HADS	25	Palliative	—	—	—	—	—	—	r(VAS, HADSA) = −.59; r(VAS, HADSD) = −.82; r(VAS, HADST) = −.87
	Payne (46)	SCID	275	Breast	r(patient, physician) = .45; r(patient, nurse) = .42; r(physician, nurse) = .58	—	—	—	—	—	No differentiation between patients with or without DSM-IV diagnosis on VAS; r(VAS, HADS) = −.59; r(VAS, BSI) = −.44
		HADS
		BSI
		VAS (physician)
		VAS (nurse)
	Onelöv (47)	CES-D	3009	Mixed; widows/parents who had lost a husband/child to cancer; age-, gender-, residence region–matched population controls	—	3	.77	.77	.51	.91
	Onelöv (47)	STAI-T	3009			2	.52	.87	.64	.80
	Sela (48)	ZSDS	132	Palliative	—	—	—	—	—	—	r(VAS, ZSDS) = .82
	Singer (49)	SCID	250	Laryngeal; German	—
		Mental disorder				37	.76	.69	.30	.94
		Depression				41	.90	.69	.20	.99

AD = adjustment disorder; BCD = Brief Case Find for Depression; BSI-18 = Brief Symptom Inventory–18; CD = clinical diagnosis; CES-D = Center for Epidemiological Studies–Depression Scale; CRS = Coordinator Rating Scale; DT = Distress Thermometer; ECOG = Eastern Cooperative Oncology Group; ESAS = Edmonton Symptom Assessment System; GHQ-12 = General Health Questionnaire–12; HADS = Hospital Anxiety and Depression Scale; HADSA = HADS–anxiety subscale; HADSD = HADS–depression subscale; HADST = HADS–total; IT = Impact Thermometer; MDD = major depressive disorder; MT = Mood Thermometer; NPV = negative predictive value; PHQ-9 = Patient Health Questionnaire–9; PL = problem list; PPV = positive predictive value; PRIME-MD = Primary Care Evaluation of Mental Disorders; SADS = Schedule for Affective Disorders and Schizophrenia; SCID = Structured Clinical Interview for DSM (Diagnostic and Statistical Manual of Mental Disorders); SCL-90-R = Symptom Checklist-90–Revised; STAI-T = State–Trait Anxiety Inventory–trait version; VAS = visual analog scale; ZSDS = Zung Self-Rating Depression Scale; — = no information was available.

† No information on cutoff provided.

Table 5

Short measures*

Scale name	Study, first author (reference)	Criterion	No. of participants	Sample	Reliability†	Cutoff	Sensitivity	Specificity	PPV	NPV	Further validation
BDI-SF	Chochinov (23)	SADS	197	Palliative	—	8	.79	.71	.27	.96
	Love (50)	MILP	227	Breast cancer, metastatic							Overall agreement .70 and .65
		Depression			κ = .41	4	.84	.63	.52	.89
		MDD			κ = .18	5	.94	.63	.16	.99
BEDS-6	Lloyd-Williams (51)	PSE	246	Palliative	α = .78	6	.72	.83	.65	.87	FA revealed six items from the 10-item EPDS accounting for 36% of variance
BSI-18	Zabora (52)	BSI-53 GSI	1543	Mixed	GSI: α = .89	10	.91	.93	.73	.98	PCA resulted in four factors accounting for 58% of variance
	Recklitis (53)	SCL-90-R	221	Adult survivors of childhood cancer	Depression: α = .92; anxiety: α = .87; somatization: α = .82	50 (t score)	.97	.85	.79	.98	Correlations with SCL-90-R were .93, .89, .88, and .94
	Hjoerleifsdottir (54)	—	40	Mixed outpatients under treatment; Icelandic	Depression: α = .85; anxiety: α = .91; somatization: α = .50	—	—	—	—	—	Anxiety was highest in patients receiving CT; more women experienced depression than men
	Recklitis (55)	—	8945	Adult survivors of childhood cancer	Depression: α = .88; anxiety: α = .81; somatization: α = .75	—	—	—	—	—	FA confirmed the three-factor structure in both sexes
CES-D	Hann (56)	POMS-SF	117; 62	Breast cancer; healthy comparison subjects (patients’ friends and relatives)	α = .89; test–retest reliability: r = .57	—	—	—	—	—	r(CES-D, POMS) = .66; r(CES-D, STAI) = .77; r(CES-D, MCS) = −.65
		STAI-S
		MCS
	Schroevers (57)	STAI-S	475; 255	Mixed; age- and sex-matched healthy reference group; Dutch	α(DA) = .87; α(PA) = .75	—	—	—	—	—	FA resulted in two factors (depressed affect; positive affect). Depressive affect appeared to be the more reliable and valid subscale. Moderate correlations with criterion measures. DA discriminated patient group from reference group
		GHQ
		RSCL
		POMS-LASA
		SWLS
		EPQ
	Katz (58)	SADS	60	Head and neck cancer outpatients	—	17	1.00	.85	.63	1.00
	Hopko (59)	ADIS-IV	33	Mixed	α = .90	17	1.00	.79	.92	1.00
EPDS	Lloyd-Williams (60)	PSE	100	Palliative	α = .78	13	.81	.79	.53	.94	FA resulted in three factors accounting for 52% of variance
	Lloyd-Williams (61)	—	50	Palliative	α = .81; κ = .77	—	—	—	—	—	Scores were assessed biweekly over 12 weeks and were stable in those below the cutoff at baseline
	Lloyd-Williams (25)	SCID	74	Palliative	α = .83; ρ = .77	13	.70	.80	.56	.88
	Lloyd-Williams (51)	PSE	246	Palliative	α = .78	11	.72	.74	.55	.85
GHQ-12	Le Fevre (62)	CIS-R	79	Palliative		—	—	—	—	—	AUC = .81
	Reuter (63)	CIDI	188	Mixed, 64% with metastatic disease; German	—
		Any disorder				5	.55	.73	.39	.84
		Depression				2	.93	.49	.55	.91
		Anxiety				6	.67	.76	.60	.81
HADS	Razavi (64)	DIS	210	Oncology inpatients; French	—
		MDD
		HADST				19	.70	.75	.36	.93
		HADSD				9	.71	.76	.37	.93
		HADSA				11	.54	.75	.30	.89
		MDD and AD
		HADST				13	.75	.75	.87	.57
		HADSD				7	.59	.78	.86	.45
		HADSA				8	.64	.72	.84	.46
	Hopwood (65)	CIS	81	Breast cancer, advanced	—
		HADST				18	.75	.74	.48	.90
		HADSA				11	.75	.90	.56	.95
		HADSD				11	.75	.75	.43	.92
	Moorey (66)	—	568	Mixed 12 weeks after diagnosis or recurrence; 10% with metastatic disease	α(A) = .93; α(D) = .90	—	—	—	—	—	FA resulted in a two-factor solution. Stability in subsamples. FA correlation was .5
	Brandberg (67)	—	273	Skin cancer; Swedish	α(A) = .94; α(D) = .97; α(R) = .88	—	—	—	—	—	FA resulted in three factors (anxiety, depression, restlessness)
	Razavi (68)	SCID	117	Hodgkin and non-Hodgkin lymphoma outpatients; French	—	10	.84	.66	.59	.88	Lacks sensitivity in women aged 50 y or older
	Ibbotson (69)	PAS	513	Mixed	—	14	.80	.76	.40	.95
	Ramirez (70)	PSE	91	Preoperative breast cancer outpatients	—	11	.84	.83	.78	.88
	Berard (71)	CD: HADSD	100	Mixed, 21% with metastatic disease; French	—	8	.71	.95	.79	.93
						10	.43	.96	.75	.86
	Kugaya (72)	SCID:	128	Mixed; Japanese	α(A) = .77; α(D) = .79; test–retest reliability (8 days): r(A) = .73; r(D) = .82; sensitive to change
		AD and MDD
		HADST				11	.92	.65	.61	.93
		HADSA				8	.75	.88	.78	.86
		HADSD				5	.92	.58	.56	.92
		MDD
		HADST				20	.82	.96	.78	.97
		HADSA				8	.94	.88	.53	.99
		HADSD				11	.82	.95	.74	.97
	Costantini (73)	SCID	132	Breast cancer; Italian	α(A) = .80; α(D) = .85; sensitive to change	10	.84	.79	.71	.89	Factor structure replicated
	Hall (74)	PSE	269	Breast cancer, within 3 months after diagnosis	—
		HADSA				7	.72	.80	.78	.74
		HADSD				7	.37	.93	.76	.71
	Hosaka (75)	SCID	50	Otolaryngology inpatients (half with benign disease, half with malignant disease); Japanese	—	12	.91	.96	.95	.93
	Le Fevre (62)	CIS Endicott depression	79	Palliative	—	20	.77	.85	.48	.95	Sensitivities are considerably lower when tested against any psychiatric disorder or moderate and severe depression
	Payne (46)		275	Breast cancer outpatients	—	—	—	—	—	—	r(HADS, BSI) = .68; r(VAS, HADS) = −.59
	Kugaya (76)	SCID	107	Newly diagnosed head and neck cancer; Japanese	—	15	.72	.81	.43	.94	Advanced stage and living alone were associated with distress
	Cull (77)	PSE	172	Chemotherapy outpatients	—	9	.85	.71	.47	.94	Best screening result: if MHI-5 <11, then patient considered nondistressed. If MHI-5 ≥11, then use HADS
	Lloyd-Williams (78)	PSE	100	Mixed with metastatic disease	α = .85						FA resulted in a four-factor solution
		HADST				19	.68	.67	.36	.88
		HADSD				11	.54	.74	.38	.85
		HADSA				10	.59	.68	.34	.85
	Morasso (79)	SCID	113	Breast cancer; Italian	—	11	.71	.61	.52	—
			105			11	.67	.77	.63	—
			132			11	.70	.88	.78	.83
	Reuter (63)	CIDI	188	Mixed, 64% with metastatic disease; German	—
		Mental disorder				16	.60	.79	.47	.86
		Depression				17	.79	.76	.69	.84
		Anxiety				13	.88	.57	.53	.90
	Love (80)	MILP	303	Breast cancer, newly diagnosed	—
		HADSD				8	.23	.95	.74	.68
		HADSA				8	.34	.73	.13	.90
		HADST				19	.17	.98	.92	.59
	Smith (81)	PSE or SCAN	1474; 381 PSE or SCAN	Mixed, outpatients	—						FA structure was replicated for the whole sample and subsamples of sex, age, and patients with metastatic disease
		HADSA				‡	.70	.41
		HADSD				‡	.70	.48
	Akizuki (29)	CD	275	Mixed; 42% with recurrent, metastatic disease; Japanese	—	10	.92	.57	.77	.82
	Montazeri (82)	Disease stages; emotional functioning, QoL	167	Breast cancer; 38% with metastatic disease; Persian	α(A) = .78; α(D) = .86	—	—	—	—	—	r(HADST, EF) = −.70; r(HADST, QoL) = −.77; higher distress was associated with more advanced disease
	Keller (83)	SCID	77	Mixed; 45% with advanced disease; German	—	16	.86	.87	.73	.94	Patients above the cutoff were more likely to be female and to suffer from physical symptoms
		CD: Surgeons	178			16	.64	.48	.31	.79
		CD: Nurses	165			16	.72	.57	.37	.85
	Jefford (22)	PRIME-MD	100	Mixed; 60% with metastatic disease	κ = .27 (with HADSD)	11	.48	.95	.71	.87
	Katz (58)	SADS	60	Head and neck cancer outpatients, 38% with metastatic disease	—
		HADSD				5	1.00	.90	.69	1.00
		HADST				12	1.00	.95	.86	1.00
	Love (50)	MILP	227	Breast cancer; metastatic
		Depression			κ = .17	11	.16	.97	.75	.71
		MDD only			κ = .29	7	.81	.81	.24	.98
	Mystakidou (84)	ECOG performance status; STAI-S	120	Palliative; Greek	α(A) = .89; α(D) = .70; κ(A) = .85; κ(D) = .76	—	—	—	—	—	Subscales discriminated well between subgroups of patients with different disease severity; r(HADSA, STAI) = .68; r(HADSD, STAI) = .49
	Osborne (85)	—	731; 158	Breast cancer, population-based, age-matched reference group of healthy women	—	—	—	—	—	—	Women with breast cancer scored lower on anxiety (7.5 vs 8.2) and depression (3.3 vs 4.2) compared with the reference group
	Akizuki (32)	CD	295	Mixed; 41% with recurrent, metastatic disease; Japanese	—
		AD and MDD				15	.76	.86	.83	.80
		MDD only				17	.77	.74	.39	.94
	Gil (33)	—	312	Outpatients; Southern Europe	α = .81 both subscales	—	—	—	—	—	FA confirmed two-factor solution
	Rodgers (86)	—	110	Breast cancer	α(T) = .85; α(A) = .79; α(D) = .87	—	—	—	—	—	FA resulted in three subscales: negative affectivity, autonomic anxiety, anhedonic depression
	Thomas (87)	—	240	Mixed, 17% with metastatic disease; Indian	α(A) = .71; α(D) = .85	—	—	—	—	—	FA resulted in a slightly different two-factor structure
	Akechi (26)	SCID	209	Palliative	—
		AD and MDD
		HADST				13	.80	.67	.41	.92
		HADSD				7	.78	.58	.35	.90
		MDD
		HADST				17	.71	.77	.19	.97
		HADSD				9	.86	.69	.17	.99
	Muszbek (88)	ECOG performance status; BDI; No. of emotional problems	715	Mixed; Hungarian	α(A) = .81; α(D) = .83	—	—	—	—	—	Factor structure confirmed. Patients with lower performance status, more advanced disease, and No. of emotional problems reported more distress
	Smith (89)	PSE or SCAN	1855; 381 PSE or SCAN	Mixed	—						Slightly different from the original scale's two-factor structure. However, it did not increase the subscales’ discriminant validity
		HADSA				8	.67	.61	.37	.84
		HADSD				7	.73	.64	.40	.88
		HADST				11	.70	.70	.44	.87
	Néron (90)	MADRS	49; four time points of measurement	Nonresectable lung cancer, newly diagnosed	ρ = .54	11	.63	1.00	1.00	.74	HADST and MADRS correlated r = .8
						11	.52	1.00	1.00	.68
						11	.82	1.00	1.00	.86
						11	.57	1.00	1.00	.78
	Walker (91)	SCID	361	Mixed	—
		HADST				15	.87	.85	.35	.99
		HADSD				7	.90	.88	.40	.99
		HADSA				9	.87	.83	.31	.99
	Miklavcic (92)	CSI	202	Female; mixed; Slovenian	—	—	—	—	—	—	r(CSI, HADSD) = .81; r(CSI, HADSA) = .91
	Özalp (93)	SCID	183	Breast cancer; 29% with advanced disease; Turkish	—
		AD
		HADST				10	.84	.55	.23	.95
		HADSD				6	.72	.67	.26	.94
		HADSA				5	.88	.53	.23	.97
		MDD
		HADST				17	.71	.80	.28	.96
		HADSD				5	.88	.60	.19	.98
		HADSA				7	.65	.69	.18	.95
	Singer (49)	SCID	250	Laryngeal cancer outpatients; German	—
		Mental disorder				14	.70	.80	.40	.93
		Depression				17	.85	.86	.35	.98
HQ-9	Singer (49)	SCID	250	Laryngeal cancer outpatients; German	—
		Mental disorder				2	.67	.78	.40	.93
		Depression				2	.85	.74	.25	.97
IES	Kirsh (94)	SCID	95	Undergoing BMT	α = .93	‡	.87	.24
	Chen (95)	HADS	106	Oral cancer, newly diagnosed; Chinese	Intrusion: α = .91; avoidance: α = .81; test–retest reliability (3 days): r = .97	—	—	—	—	—	FA resulted in two factors accounting for 56% of variance. Moderate correlation with HADS subscales
	Mystakidou (96)	HADS	82	Advanced; Greek	Avoidance: α = .77, ρ = .96; intrusion: α = .72, ρ = .96; hyperarousal: α = .85, ρ = .94	—	—	—	—	—	FA resulted in three factors: intrusion, avoidance, hyperarousal accounting for 57% of variance. Moderate correlations with HADS subscales
MAX-PC	Roth (97)	HADS	385	Prostate cancer, early stage	α = .89; test–retest reliability: r = .89	—	—	—	—	—	FA resulted in three factors: prostate cancer anxiety, fear of recurrence, PSA anxiety; r(MAX-PC, HADS) = .57; r(MAX-PC, DT) = .45
	Roth (97)	DT	385	Prostate cancer, early stage	α = .89; test–retest reliability: r = .89	—	—	—	—	—
	Roth (98)	HADS	367	Prostate cancer, 55% with metastatic disease	α = .90	—	—	—	—	—	Factor structure was replicated; r(MAX-PC, HADSA) = .51; r(MAX-PC, DT) = .60
	Roth (98)	DT	367	Prostate cancer, 55% with metastatic disease	α = .90	—	—	—	—	—
	Dale (99)	HADS	178	Undergoing prostate biopsy	α = .91	—	—	—	—	—	r(MAX-PC, HADSA) = .71
PDI	Morasso (100)	CD	102; 107; 225	Breast cancer; mixed; mixed; Italian	α = .88; κ = .83	29	.75	.85	.83	.79
	Morasso (79)	SCID		Breast cancer; Italian	—
		Before	113			28	.71	.69	.58	—
		During	105			28	.80	.70	.61	—
		After CT	132			28	.72	.87	.77	.84
PHQ-9	Omoro (101)	QOL	48	Head and neck cancer; Kenyan	α = .80; ICC = .71	—	—	—	—	—	Correlation with TNM stage and head and neck cancer–specific QoL scale
	Fann (102)	SDS	342	Mixed	—	—	—	—	—	—	Moderate negative correlations with QoL, low correlation with symptom distress
	Fann (102)	EORTC- QLQ30	342	Mixed	—	—	—	—	—	—
PCL-C	Andrykowski (103)	SCID	82	Breast cancer	Criterion A: κ = .91; criterion B: κ = .71; criterion C: κ = .64; criterion D: κ = .94	50	.60	.99	.75	.97
	Smith (104)	IES	111	BMT survivors	α = .89	—	—	—	—	—	FA resulted in four factors: numbing, memories of cancer treatment, hyperarousal, and avoidance. Group comparisons (no, subclinical, and clinical symptoms) demonstrated concurrent and discriminant validity
	Smith (104)	BSI	111	BMT survivors	α = .89	—	—	—	—	—
	DuHamel (105)	SF-36	236	BMT survivors	—	—	—	—	—	—	Four factor model: reexperiencing, avoidance, numbing, arousal. No association with sociodemographic and clinical variables
POMS-LASA	Sutherland (106)	—	42	Mixed, participated in 6-week CBT program	The measure demonstrated sensitivity to change	—	—	—	—	—	r(POMS, POMS-LASA) = .76; r(POMS-LASA, SCL-90-R) = .67
ZSDS	Passik (107)	Physician rating	1109	Outpatients	κ = .17	—	—	—	—	—
	Dugan (108)		1109	Outpatients	α = .84	—	—	—	—	—	Correlation with short form r = .92.
	Passik (109)	—	1109	Cancer outpatients	—	—	—	—	—	—	Exploratory FA resulted in four factors accounting for 48% of the variance
	Passik (110)	MINI	60	Oncology inpatients	—						r(ZSDS, MINI) = −.66; r(BZSDS, MINI) = −.57
		ZSDS				49	.58	.93	.90	.64
		BZSDS				57	.24	1.00	1.00	.52
	Kirsh (94)	SCID	95	Undergoing BMT	α = .89	‡	.83	.55	.66	.76
	Sharpley (111)	—	195	Prostate cancer	α(ZSAS) = .77; α(ZSDS) = .84	—	—	—	—	—	FA of combined scale (ZSAS + ZSDS) resulted in four factors: loss-depression, fear, two somatic subscales

AD = adjustment disorder; ADIS-IV = Anxiety Disorder Interview Schedule for Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition (DSM-IV); AUC = area under the curve; BAI = Beck Anxiety Inventory; BDI(-SF) = Beck Depression Inventory(-Short Form); BEDS-6 = Brief Edinburgh Depression Scale; BMT = bone marrow transplantation; BSI-18 = Brief Symptom Inventory–18; BZSDS = Brief Zung Self-Rating Depression Scale; CBT = cognitive behavioral therapy; CD = clinical diagnosis; CES-D = Center for Epidemiological Studies–Depression Scale; CIDI = Composite International Diagnostic Interview; CIS(-R) = Clinical Interview Schedule (Revised); CSI = Clinical Structured Interview; CT = chemotherapy; DA = depressed affect; DIS = Diagnostic Interview Schedule; DT = Distress Thermometer; ECOG = Eastern Cooperative Oncology Group; EF = emotional functioning; EORTC-QLQ 30 = EORTC Quality of Life Questionnaire–30; EPDS = Edinburgh Postnatal Depression Scale; EPQ = Eysenck Personality Questionnaire; FA = factor analysis; GHQ-12 = General Health Questionnaire–12; GSI = Global Severity Index; HADS = Hospital Anxiety and Depression Scale; HADSA = HADS–anxiety subscale; HADSD = HADS–depression subscale; HADST = Hospital Anxiety and Depression Scale–total scale; HQ-9 = Hornheide Questionnaire; ICC = intraclass correlation; IES = Impact of Event Scale; MADRS = Montgomery–Asberg Depression Rating Scale; MAX-PC = Memorial Anxiety Scale for Prostate Cancer; MCS = Mental Health Summary Scale from the MOS Short-Form 36 Health Survey; MDD = major depressive disorder; MHI = Mental Health Inventory; MILP = Monash Interview for Liaison Psychiatry; MINI = Mini-International Neuropsychiatric Interview; PA = positive affect; PAS = Psychiatric Assessment Schedule; PCA = principal component analysis; PCL-C = PTSD checklist; PDI = Psychological Distress Inventory; PHQ-9 = Patient Health Questionnaire–9; POMS-LASA = overall quality of life visual analog scale; POMS-SF = Profile of Mood 
States-Short Form; PRIME-MD = Primary Care Evaluation of Mental Disorders; PSA = prostate specific antigen; PSE = Present State Examination; RSCL = Rotterdam Symptom Checklist; SADS = Schedule for Affective Disorders and Schizophrenia; SCAN = Schedule for Clinical Assessment in Neuropsychiatry; SCID = Structured Clinical Interview for DSM (Diagnostic and Statistical Manual of Mental Disorders); SCL-90-R = Symptom Checklist-90–Revised; SDS = Symptom Distress Scale; SF-36 = Medical Outcomes Study (MOS) Short Form-36 Health Survey; STAI-S = State–Trait Anxiety Inventory–state version; SWLS = Satisfaction with Life Scale; VAS = visual analog scale; ZSAS = Zung Self-Rating Anxiety Scale; ZSDS = Zung Self-Rating Depression Scale. — = no information was available.

Single letters in parentheses represent abbreviations of HADS subscales (A = anxiety, D = depression, R = restlessness) and the total scale (T).

‡ No information on cutoff provided.
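
The two-step rule reported for Cull (77) in Table 5 (patients scoring below 11 on the MHI-5 are considered nondistressed; all others go on to complete the HADS) can be sketched as follows. This is only an illustrative sketch: the function name and return labels are hypothetical, the HADS cutoff of 9 is taken from the same table row, and treating a HADS score at or above that cutoff as a positive screen is an assumption rather than something the table states.

```python
def two_step_screen(mhi5_score, hads_score=None, mhi5_cutoff=11, hads_cutoff=9):
    """Hypothetical two-step screen: MHI-5 first, HADS only if needed."""
    # Step 1: an MHI-5 score below the cutoff ends the screening.
    if mhi5_score < mhi5_cutoff:
        return "nondistressed"
    # Step 2: the HADS is required but has not been completed yet.
    if hads_score is None:
        return "administer HADS"
    # Assumption: a HADS score at or above the cutoff is a positive screen.
    if hads_score >= hads_cutoff:
        return "screen positive"
    return "screen negative"
```

Under these assumptions, only patients who reach the MHI-5 cutoff are asked to complete the longer questionnaire, which is the point of the two-step design.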

Table 6

Long measures*

Scale name	Study, first author (reference)	Criterion	No. of participants	Sample	Reliability	Cutoff	Sensitivity	Specificity	PPV	NPV	Further validation
BAI	Hopko (59)	ADIS-IV	33	Mixed	α = .94	10	.83	.89	.95	.67
BDI	Berard (71)	CD	100	Mixed, 21% with metastatic disease	α = .93	16	.86	.95	.82	.96
	Katz (58)	SADS	60	Head and neck cancer outpatients, 38% with metastatic disease	—	13	.92	.90	.69	.98
	Jefford (22)	PRIME-MD	100	Mixed, 60% with metastatic disease	κ = .43	19	.52	.90	.58	.88
	Mystakidou (112)	HADS	105	Palliative; Greek	α = .91; test–retest reliability (7 days): r = .86	—	—	—	—	—	r(BDI, HADSA) = .54; r(BDI, HADSD) = .66
	Hopko (59)	ADIS-IV	33	Mixed		22	.92	1.00	1.00	.82
DI-C	Thomas (113)		63	Head and neck cancer; Indian	α = .85						Four factors: personal, spiritual, physical, and family domains
GHQ-28	Hughson (114)	PSE	75	Breast cancer, undergoing chemotherapy	ρ = .73	10	.93	.92	.74	.98
	Hughson (114)	PSE	75	Breast cancer, undergoing chemotherapy	κ = .74	10	.93	.92	.74	.98
	Ibbotson (69)	PAS	95	Mixed		8	.75	.92	.69
MEQ	Meyer (115)	SCID	45	Palliative	α = .94	†	.88	.61	.58	.89	Fatigue was associated with severe depression
	Meyer (115)	SCID	45	Palliative	κ = .52	†	.88	.61	.58	.89	Fatigue was associated with severe depression
	Meyer (116)	SCID	45	Palliative	κ = .47	—	—	—	—	—	Sensitivity to change demonstrated by monthly follow-ups (6 times)
		PCP			κ = .24
		Depression Q			κ = .29
POMS-SF	Baker (117)		428	Awaiting BMT	α = .78 to α = .90	—	—	—	—	—	FA resulted in six factors. Evidence for convergent and discriminant validity provided.
PSSCAN	Linden (118,119)	HADSA	1057; 570; 101; 78; 85; 56	Mixed, newly diagnosed	α = .83; test–retest reliability: r = .64 (2 months)	11	.92	.98	1.00	.42	r(PSSCAN-A, HADSA) = .72 and .82; r(PSSCAN-D, HADSD) = .59 and .75, for two different samples
		HADSD				8	.79	.83	.65	.91
						11	1.00	.86	.73	.77
						8	.89	.76	.28	.98
QSC-R23	Herschbach (120,121)	HADSD	1721	Mixed, 28% and 29% with metastatic disease; German	α = .89; sensitive to change	1.69	.88	.72	—	—	FA resulted in five factors. Convergent validity established
QSC-R23	Herschbach (120,121)	HADSA	1721	Mixed, 28% and 29% with metastatic disease; German	α = .89; sensitive to change	1.64	.90	.71
RSCL	de Haes (122)	—	86; 56; 611	During CT or follow-up; during CT, ovarian; under treatment, disease-free control subjects	α = .88 to .94 (psychological subscale), α = .71 to .88 (somatic subscales)	—	—	—	—	—	FA resulted in four factors: one psychological, three somatic
	Hall (74)	PSE	267	Breast cancer, within 3 months after diagnosis	—	11	.31	.96	.90	.54
	Hopwood (65)	CIS	81	Breast cancer outpatients	—	11	.75	.80	.56	.91
	Paci (123)	STAI	61; 147	Breast cancer outpatients; healthy women; Italian	α = .91; α = .87	—	—	—	—	—	Factor structure confirmed. Healthy women and breast cancer patients did not differ with respect to psychological and physical functioning. Modest to moderate correlations with state and trait anxiety
	Watson (124)	HADS	266; 168	Mixed	α = .86	—	—	—	—	—	FA resulted in five factors that include 26 items. Moderate correlations between psychological and physical summary scales and with HADS and PAIS, respectively, except for HADSA and RSCL physical subscale. Sensitivity to differences in treatment stages
	Watson (124)	PAIS	266; 168	Mixed	α = .77	—	—	—	—	—
	Ibbotson (69)	PAS	266	Mixed	—	7	.83	.71	.37	.95
	Agra (125)	NHP	118	Palliative; Spanish	α = .74 to .90; test–retest reliability: r = .71 to .88	—	—	—	—	—	Moderate to high correlations with NHP
	Tchen (126)	—	63	Non-Hodgkin lymphoma; age 65 y or older; French	α = .60 to .83	—	—	—		—	FA converged into two factors

ADIS-IV = Anxiety Disorder Interview Schedule for Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition (DSM-IV ); BAI = Beck Anxiety Inventory; BDI = Beck Depression Inventory; CD = clinical diagnosis; CIS = Clinical Interview Schedule; DI-C = Distress Inventory for Cancer; FA = factor analysis; GHQ-28 = General Health Questionnaire-28; HADS = Hospital Anxiety and Depression Scale; HADSA = HADS–anxiety subscale; HADSD = HADS–depression subscale; HADST = HADS–total; ICC = intraclass correlation; MEQ = Mood Evaluation Questionnaire; NHP = Nottingham Health Profile; PAIS = Psychosocial Adjustment to Illness Scale; PAS = Psychiatric Assessment Schedule; PCP = palliative care professional; POMS-SF = Profile of Moods State–short form; PRIME-MD = Primary Care Evaluation of Mental Disorders; PSSCAN = Psychosocial Screen for Cancer; PSSCAN-A = PSSCAN–anxiety subscale; PSSCAN-D = PSSCAN–depression subscale; PSE = Present State Examination; QSC-R23 = Questionnaire on Stress in Cancer Patients; RSCL = Rotterdam Symptom Checklist; SADS = Schedule for Affective Disorders and Schizophrenia; SCID = Structured Clinical Interview for DSM (Diagnostic and Statistical Manual of Mental Disorders); STAI = State–Trait Anxiety Inventory. — = no information was available.

† No information on cutoff provided.
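
The PPV and NPV columns in Tables 5 and 6 depend on how common the disorder was in each sample, not only on sensitivity and specificity. This is one reason why studies with similar sensitivity and specificity can report very different PPVs. A minimal sketch of the standard relationship is given below; the function name is hypothetical, and the prevalence values in the usage note are illustrative and are not drawn from any study in this review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a screen applied at a given prevalence."""
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    # Expected proportions of the screened population in each cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined: no positive or no negative screens")
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv
```

For example, `predictive_values(0.87, 0.85, 0.10)` and `predictive_values(0.87, 0.85, 0.40)` use the same sensitivity and specificity, but the PPV is much lower at the lower (hypothetical) prevalence.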


Ultrashort Measures

A total of 29 studies examined the use of ultrashort screening instruments (Table 4). The majority of these ultrashort measures were validated for use in patients with advanced cancer. The single-item question “Are you anxious?” (21) was studied as a screening tool for emotional distress in palliative care patients and showed insufficient specificity to rule out nonanxious patients. The anxiety subscale of the HADS (131) was used as the criterion measure. The Brief Case Find for Depression is a four-item scale that was validated against the Primary Care Evaluation of Mental Disorders in a small sample of cancer patients (22). Its interrater reliability was low. The measure had moderate specificity and performed worse than the HADS and the Beck Depression Inventory (BDI) (135) in ruling out nondepressed patients. Lengthy questionnaires may be especially burdensome for patients in palliative care. For this reason, several studies have tested single questions from diagnostic interviews against structured clinical interviews as a screening method for depressive disorders in palliative care patients. Altogether, seven studies (21,23–28) examined the psychometric properties of single screening questions; four of these studies (23,24,26,27) tested the single question against a structured clinical assessment of the diagnosis. Three studies (23,26,28) also examined the combination of the two screening questions (hereafter referred to as the combination depression questions) that represent the first and second diagnostic criteria for a depressive disorder. The first criterion—“Are you depressed?”—yielded perfect sensitivity and specificity to detect any kind of depressive disorder and outperformed the BDI and the visual analog scales in one study (23). However, in several other studies, it had low sensitivity to detect any affective disorder (24–27), whereas its sensitivity to detect a major depressive disorder was high across all studies. 
The second diagnostic criterion for a depressive disorder—“Have you lost interest?”—showed the same pattern as the first diagnostic criterion question (26) in that it was much less sensitive in detecting minor disorders, such as adjustment disorder, than in detecting major depressive disorder (26). The combined screening questions did not increase the specificity compared with each individual question but increased the sensitivity (26,28). An alternative screening tool, the one-question interview, was developed by Akizuki et al. (29) and asked patients to “Please grade your mood during the past week by assigning it a score from 0 to 100, with a score of 100 representing your usual relaxed mood. A score of 60 is considered the passing grade.” The measure had comparable psychometric properties to the HADS (131) and the National Comprehensive Cancer Network Distress Thermometer (DT) (136), but its criterion validity was low. The National Comprehensive Cancer Network DT was introduced more than a decade ago (136) and measures overall emotional distress with one item on an 11-point rating scale (from 0 = no distress to 10 = extreme distress). Although domain-specific distress can be measured with a complementary problem list that asks whether problems exist in practical, familial, emotional, physical, or spiritual domains, most of the studies included in this review provided psychometric information only on the DT itself. Altogether, 15 validation studies (29–43) examined the DT. Of these, eight studies (33,34,36,38–40,42,43) used the HADS as the criterion measure, four studies (31,35,37,41) used exclusively other distress or depression scales, and two studies (29,32) relied on clinical diagnosis to assess the validity of the DT. The DT scale was tested in populations of cancer patients with mixed diagnoses and disease stages, breast cancer patients, and patients awaiting bone marrow transplantation. 
Two studies (31,43) provided information on the internal consistency of the problem list. Its overall reliability was good but was insufficient for some of the subscales. Sensitivity to change has been shown in one study (40): changes in DT scores at 4 and 8 weeks were comparable to changes in the criterion measures’ scores. However, the interrater reliability, that is, the congruence of patient self-report compared with nurses’ judgments, tested in one study (29), was moderate. Nurses seemed to underestimate the actual distress of the patients (29). Taken together, criterion measures were weak to moderate, and most studies demonstrated moderate specificity for the DT. The optimal cutoff for identifying clinically significant distress in most studies was defined as 4 or 5, depending on the diagnostic criteria or the validation measures used. Compared with nondistressed patients, distressed patients reported more problems on the problem list (34,35), had lower Eastern Cooperative Oncology Group performance status (34,35), and were more likely to be female (34,38). Several modifications and extensions of the DT have been developed, including two-item screening tools that combine the DT with an impact thermometer, which asks patients about the impact of distress on their daily life activity (32), and with a mood thermometer (33). Both alternatives have been tested in comparison with the DT and demonstrate better psychometric properties. Two studies (21,44) examined subscales of the Edmonton Symptom Assessment System (137) that measure anxiety and depressive symptoms in comparison with the HADS. The Edmonton Symptom Assessment System was developed to assess symptom distress in palliative care patients. The scale demonstrated moderate validity as a screening tool for emotional distress in palliative care patients. 
Six studies (23,45–49) examined the validity of visual analog scales that were derived from the Memorial Pain Assessment Card mood subscale (138) as screening tools for emotional distress in various populations of cancer patients. One study (46) reported a moderate correlation between patients’ self-reported distress and the distress levels rated by their physicians. Another study (49) that compared several screening instruments with structured clinical interviews provided evidence that visual analog scales performed worse than other screening measures.
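
To illustrate how the choice between a cutoff of 4 and a cutoff of 5 on a single-item measure such as the DT changes screening accuracy, the sketch below computes sensitivity and specificity of a cutoff against a criterion classification. The function name is hypothetical, and it assumes that a score at or above the cutoff counts as a positive screen.

```python
def screen_accuracy(scores, criterion_positive, cutoff):
    """Return (sensitivity, specificity) of `score >= cutoff` vs. the criterion."""
    tp = fp = fn = tn = 0
    for score, is_case in zip(scores, criterion_positive):
        flagged = score >= cutoff  # assumption: at or above cutoff = positive
        if flagged and is_case:
            tp += 1
        elif flagged and not is_case:
            fp += 1
        elif not flagged and is_case:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)
```

Raising the cutoff can only lower or preserve sensitivity and can only raise or preserve specificity, which is why the optimal value depends on the criterion measure and on how much weight is given to missed cases.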

Short Measures

Most of the screening measures that have been validated for use in cancer patients have between five and 20 items. Altogether, 72 studies described 15 screening instruments of this length (Table 5). The BDI–Short Form is a widely used depression scale that consists of 13 items (139). Two studies (23,50) examined the psychometric properties of this scale in populations of patients with advanced cancer. The BDI–Short Form demonstrated low interrater reliability and moderate specificity. The BSI-18 is a self-report scale that was designed to assess clinically relevant psychological symptoms (127). The scale was tested against its two long forms, the BSI-53 Global Severity Index (128) and the Symptom Checklist-90–Revised (133). With these criterion measures, the BSI-18 demonstrated excellent reliability and validity in a large mixed sample of cancer patients with a sensitivity and specificity of .91 and .93, respectively (52), and in adult survivors of childhood cancer with a sensitivity and specificity of .97 and .85, respectively (53). Internal consistency was high for the anxiety and depression subscales (54,55). Results of a factor analysis confirmed the scale's three-factor structure (ie, depression, anxiety, and somatization) (55). The CES-D (129) is a 20-item depression measure that has been validated in mixed samples of cancer patients and reference groups of healthy control subjects (56–59). Results from factor analyses (57) suggested that the negative affect subscale of the CES-D was a better measure of depression than the CES-D total score. The CES-D demonstrated good internal consistency (56,57,59). Two studies (58,59) provided information on the scale's sensitivity and specificity and revealed that it has very good psychometric properties. 
The Edinburgh Postnatal Depression Scale, a 10-item scale that was initially developed to screen for postpartum depression in new mothers, measures guilt, worthlessness, and hopelessness (140), which are symptoms that may also discriminate between depressed and nondepressed patients with advanced cancer. This scale was examined as a screening tool for depression in patients with advanced cancer (25,51,60,61) and tested against a structured clinical interview as the criterion (25,51,60). The sensitivity and specificity of the Edinburgh Postnatal Depression Scale were adequate, and it performed better than the HADS in this population. The Edinburgh Postnatal Depression Scale also demonstrated good internal consistency and interrater reliability (25,60,61). A short form of this scale, the six-item Brief Edinburgh Depression Scale, had psychometric properties that were comparable to those of the original scale (51). The GHQ-12 (130) was tested as a screening tool for psychological distress in two studies (62,63) and compared with the HADS. Both studies demonstrated that the psychometric properties of the GHQ-12 were adequate but inferior to those of the HADS in samples of patients with advanced cancer. The HADS is a 14-item questionnaire that assesses anxiety and depressive symptoms in medical settings (131). A total of 41 of the identified validation studies of screening tools used to detect psychological distress in cancer patients were conducted on the HADS or compared its psychometric properties with other scales (22,26,29,32,33,46,49,50,58,62–93). Ten studies (33,66,67,73,79,81,86–89) tested whether the known two-factor structure of the HADS (which corresponds to the anxiety and depression subscales of the questionnaire) could be replicated in samples of cancer patients. Most of those studies (33,66,73,81,87–89) replicated the two-factor structure of the HADS in cancer patients. 
Two studies (67,86) yielded a three-factor solution and one study (78) a four-factor solution. Smith et al. (81) demonstrated in a very large sample of cancer patients that the two-factor structure was stable across subsamples that were stratified by age, sex, and disease stage. The internal consistency of each subscale and of the total scale was shown to be adequate (33,66,67,72,73,78,82,86,88), and the scale was sensitive to change (72) in cancer patients. Twenty-six studies (22,26,49,50,58,62–65,68–70,72–81,83,89,91,93) examined the discriminant validity of the HADS by comparing it with structured clinical assessments such as the SCID, Present State Examination, Clinical Interview Schedule, Clinical Interview Schedule–Revised, Psychiatric Assessment Schedule, Monash Interview for Liaison Psychiatry, Schedule for Affective Disorders and Schizophrenia, Schedule for Clinical Assessment in Neuropsychiatry, Composite International Diagnostic Interview, Primary Care Evaluation of Mental Disorders, and the Diagnostic Interview Schedule. Ten studies (49,58,62,65,70,72,73,75,83,91) showed that the screening performance of the HADS was high, 14 studies (22,26,63,64,68,69,74,76–79,81,89,93) showed moderate performance, and two studies (50,80) reported low screening performance. One study (69) reported that the HADS performed better in patients who were disease free or who had stable disease than in patients in acute treatment or with advanced disease. The HADS failed as a screening instrument in patients newly diagnosed with breast cancer (80). In some studies (65,74,93), the anxiety subscale of the HADS performed better than the depression subscale. Other studies demonstrated that the HADS total score had psychometric properties that were comparable (65) or superior (49) to those of the anxiety or depression subscales. 
We were disconcerted to find that cutoffs for distinguishing anxious or depressed patients from nonanxious or nondepressed patients differed widely across studies and that this variability had not been justified. The cutoffs for the HADS total score ranged from 8 to 22 and for the subscale scores from 5 to 11. The Hornheide Questionnaire Short Form is a nine-item questionnaire that was validated in 122 German patients with head and skin cancer following surgery and had high internal consistency (α = .81) (141). One study (49) compared different screening measures in a sample of German patients with laryngeal cancer and found that the psychometric properties of the Hornheide Questionnaire Short Form and of the other instruments were inferior to those of the HADS. The Impact of Event Scale was originally developed as an instrument to measure posttraumatic stress and is a 15-item scale that is widely used to assess emotional distress in cancer patients (142). One study (94) examined the discriminant validity of the Impact of Event Scale to detect adjustment disorder in patients undergoing bone marrow transplantation and found that this scale had inadequate specificity for use as a screening tool in this population. Other studies (95,96) did not provide further evidence for recommending the Impact of Event Scale as a distress screening tool in cancer patients. The Memorial Anxiety Scale for Prostate Cancer is an 18-item scale that was developed for use in prostate cancer patients and consists of three subscales: prostate cancer anxiety, prostate-specific antigen anxiety, and fear of recurrence (97). Except for the prostate-specific antigen anxiety subscale, the Memorial Anxiety Scale for Prostate Cancer has good internal consistency. Preliminary results of the scale's validity have been reported (97,98), but clinical cutoffs have yet to be established. The Memorial Anxiety Scale for Prostate Cancer was also validated for use in men undergoing prostate biopsy (99). 
The Psychological Distress Inventory (100) is a 13-item scale that was developed to measure distress in Italian breast cancer patients. Its reliability and validity indices are good (79,100). The discriminant validity of the Psychological Distress Inventory was tested against a structured clinical interview as the criterion, and cutoffs of 28 (79) and 29 (100) have been considered clinically significant. However, its use is limited to Italian-speaking patients. The Patient Health Questionnaire–9 (PHQ-9) measures depressive symptoms according to Diagnostic and Statistical Manual of Mental Disorders–Fourth Edition (143) criteria. The PHQ-9 was validated in a large sample of primary care and obstetrics and gynecology patients and found to have strong psychometric properties (132). The PHQ-9 also demonstrated adequate reliability as well as concurrent and divergent validity in a small study of head and neck cancer patients (101) and in a study that used a touch screen computerized version of the questionnaire (102). However, information on the scale's sensitivity and specificity with regard to clinical decision making in cancer patients is lacking. The Post Traumatic Stress Disorder Checklist-Civilian Version (144) was tested as a measure of posttraumatic stress in breast cancer patients (103) and in survivors of bone marrow transplantation (104,105). The latter two studies (104,105) examined the scale's construct validity and demonstrated that it had high reliability and a four-factor structure. In a sample of breast cancer patients, the measure showed moderate sensitivity but high specificity to detect posttraumatic stress disorder (103). One study evaluated the Profile of Mood States–Linear Analog Self-Assessment (106) as a screening instrument in cancer patients with mixed diagnoses and for patients with different stages of disease and compared it with the original Profile of Mood States and the Symptom Checklist-90–Revised. 
The measure demonstrated sensitivity to change and concurrent validity. However, not enough data are available on its psychometric properties to recommend its use in clinical decision making. The Zung Self-Rating Depression Scale is a 20-item questionnaire that evaluates depression (145). Six studies (94,107–111) reported information on the scale's psychometric properties in cancer patients. A 13-item short form of this scale is highly correlated (r = .92) with the long form (110). Although the scale (long form) had high reliability, it demonstrated low concordance rates with physician ratings of depression (107) and moderate validity (94,110) when used for cancer patients. Also, the short-form scale was found to have inadequate sensitivity compared with the long-form scale (110).

Long Measures

Nine scales, each with more than 20 items, were identified for screening cancer patients (Table 6). One small study of cancer patients (59) examined the psychometric properties of the Beck Anxiety Inventory (146). The study provided evidence that the Beck Anxiety Inventory can be a valid measure to screen cancer patients for emotional distress, but there is not enough validation information available to justify a recommendation at this time. Five studies (22,58,59,71,112) examined the psychometric properties of the 21-item BDI (135), and all but one (112) provided data from ROC analyses. One study (22) showed that the scale possessed low sensitivity, whereas the other studies demonstrated that it had excellent sensitivity and specificity to detect any depressive disorder. The Distress Inventory for Cancer (113) was developed for use in head and neck cancer patients. To our knowledge, the only information available to date is on the scale's construct validity, and more studies on the scale's discriminant validity are necessary before a recommendation is possible. The GHQ-28 (130) was tested as a screening tool for psychological distress in two studies (69,114), where it demonstrated high sensitivity and specificity to detect cancer patients with psychiatric symptoms. The Mood Evaluation Questionnaire (147) is a 23-item measure that demonstrated excellent internal consistency but only moderate agreement with SCID interview data (115). Its discriminant validity was adequate (115). The Mood Evaluation Questionnaire has been used for repeated assessments in patients with advanced cancer (116). One study (117) provided information about the construct validity of the Profile of Mood States–Short Form (148) for patients awaiting bone marrow transplantation. A factor analysis identified six factors that provided evidence for construct validity. The internal consistency of the subscales was high, with Cronbach alphas that ranged from .78 to .90 (117). 
To date, there is insufficient information on this scale's validity to make recommendations for its implementation in routine screening. The 21-item Psychosocial Screen for Cancer was developed in mixed samples of cancer patients, and its psychometrics are good (118,119). The scale assesses six domains: depressive symptoms, anxiety symptoms, quality of life (global), quality of life (number of days impaired), perceived social support, and social support desired. The anxiety and depression subscales were highly sensitive and specific when compared with the HADS. In addition, normative data exist that compare different samples of cancer patients with healthy control subjects and with a control group of persons with a chronic disease other than cancer (119). Specificity data suggest the use of a cutoff of 11 for screening of an anxiety or depressive disorder and a cutoff of 8 for screening of anxiety and depressive symptoms. The Questionnaire on Stress in Cancer Patients–Revised is a 23-item validated scale that was developed in a large sample of German patients with diverse cancer diagnoses (120,121). The Questionnaire on Stress in Cancer Patients–Revised consists of five subscales that measure psychosomatic symptoms, anxiety, information gaps, impairments in everyday life, and social distress. The Questionnaire on Stress in Cancer Patients–Revised is highly sensitive and moderately specific in detecting anxiety and depressive symptoms compared with the HADS. However, its use is limited to German-speaking patients because to our knowledge, no psychometric information exists on its translation into English (121). The Rotterdam Symptom Checklist (RSCL) is a 30-item questionnaire that has been used extensively in clinical trials (122). Although some studies showed a four- (122,123) or five-factor structure of this scale (124), a two-factor psychological and composite somatic structure has also been suggested (122–126). 
The psychological subscale demonstrated stability across subsamples as well as high internal consistency (122,124). Three studies provided information from ROC analyses: Two studies (65,69) reported that the RSCL had moderate psychometric properties for use as a screening tool, and one (74) found that the RSCL failed as a screening tool because of its low sensitivity. The RSCL was superior to the HADS in two studies (65,69) for samples of patients with progressive disease. Three studies reported on the psychometric properties of non-English [French (126), Italian (123), and Spanish (125)] versions of the questionnaire and showed results congruent with the original report, thus providing evidence for its use in cross-cultural settings. One study (149) reported only on an extension of the physical symptom scales of the RSCL and, therefore, was not included in this systematic review.
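Internal consistency is reported throughout the results above as Cronbach alpha (for example, the range of .78 to .90 cited for one scale). As a reading aid, the following is a minimal sketch of how that coefficient is computed from item-level responses. The function name and the example responses are hypothetical and are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every respondent needs the same number of items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of the summed (total) score across respondents.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items each (illustration only).
responses = [[2, 3, 3], [1, 1, 2], [3, 3, 2], [0, 1, 1]]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items vary together; the thresholds that the review treats as adequate or high are those defined in its own evaluation criteria, not in this sketch.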

Discussion

We have provided extensive details on tool psychometrics, as well as details on types of tools and extent of validation, to guide clinicians’ own choice of an assessment instrument for routine emotional distress screening. Making recommendations about which screening tools should be used depends on the context in which tools are going to be implemented and the intended objectives that may vary across settings and users. The following recommendations were based on composite quality criteria that we defined using transparent decision rules (Table 2). Among ultrashort measures, the two-item combination depression questions had the best psychometric properties. The widely used DT had been subjected to the most validation studies on the largest patient samples but was not validated against a structured clinical interview with established sufficient psychometrics. For the DT, the sensitivity and specificity findings were lower than 80% in about half and two-thirds, respectively, of the validation studies. However, some evidence suggests that modifications of the DT, such as the Mood Thermometer (33), or expansions, such as the Impact Thermometer (32), may represent improvements over the original scale. Our findings regarding ultrashort measures differ in part from the results of other meta-analyses and reviews on screening tool validity. Meta-analyses (16,150) as well as studies in primary care (151,152) have demonstrated a lack of specificity in ultrashort measures (including the DT) for identifying depression. However, our results reveal that this criticism does not apply to the combination depression questions as these were found to demonstrate high specificity. When it comes to ultrashort measures, patients have reported that a single-item interview format did not accurately describe or capture their mood (38,116). 
In line with these findings, Ohno et al. (153) reported that 65% of patients responded to the question “Are you depressed or not?” with “neither,” which indicates their uncertainty when rating emotional distress with such a simple question, even though their HADS scores suggested that they had clinical depression. Furthermore, agreement between ultrashort and longer measures in identifying distressed patients detected by structured clinical interviews was poor (115). Problems with determining the face validity of single-item measures as well as patients’ difficulty with scaling on single-item screening tools could explain these discrepant findings. Consequently, further comparison studies investigating tools of different lengths should be conducted. Among the short measures, we can recommend the CES-D as a screening tool for depression because it met all criteria for quality. The most extensive validation existed for the HADS, and this was the case across disease types and stages as well as across languages and cultures. The scale has been extensively tested against criterion standards. Note that many other tools relied on the HADS for discriminant validation. Studies that compared the discriminant validity of the HADS against other scales found that the HADS was superior (26,49,58,62,63) or equivalent (65,69) to other measures. With regard to whether or not to use the total score or the subscale scores of the HADS, several studies showed that the total score was superior in nonpsychiatric patients (49,65,154). The BSI-18 and the GHQ-12 are short measures that also demonstrated good psychometric properties. Nevertheless, ROC analyses of the BSI-18 were based on comparisons of short form with the long form of the same instrument and do not, therefore, represent independent validation (52,55). In addition, the GHQ-12 consistently performed worse than the HADS (62,63). Nonetheless, both scales have also been used as criterion measures in validation attempts of other scales. 
The Post Traumatic Stress Disorder Checklist-Civilian Version, the Psychological Distress Inventory, and the Hornheide Questionnaire Short Form are short measures that demonstrated adequate psychometric properties. However, their use to date is limited to specific cancer types or language applications. For patients receiving palliative care, the Edinburgh Postnatal Depression Scale or its short form, the six-item Brief Edinburgh Depression Scale, demonstrated adequate psychometric properties. Because of the strong psychometric properties of the PHQ-9 in large samples of primary care and obstetrics and gynecology patients (132), this scale deserves further empirical evaluation of its value for distress screening of cancer patients. Among the long measures, the BDI and the GHQ-28 met all quality criteria. The Psychosocial Screen for Cancer has not been validated against a structured clinical interview but otherwise met all criteria. In addition, the Psychosocial Screen for Cancer provides information on the social support that a patient desired and actually received, which may also guide decision making in psycho-oncological follow-up. The Questionnaire on Stress in Cancer Patients–Revised was validated in a large sample of cancer patients and provided good psychometric properties. The existing English version of the scale, therefore, deserves recommendation as a screening tool for emotional distress in cancer patients. Finally, the RSCL is a long measure that demonstrated adequate psychometric properties for distress screening. Cancer-specific tools may provide more relevant information than generic scales on patients with a specific type of cancer; however, some of these tools, such as the Memorial Anxiety Scale for Prostate Cancer (97), require additional validation. Furthermore, the routine use of cancer-specific tools is particularly likely to be implemented in specialized centers such as those that treat breast or prostate cancer patients. 
Facilities that treat patients with a broader disease spectrum may benefit most from a screening tool that can be applied to a mixed patient population, such as well-established scales including the BDI, the CES-D, the GHQ-28, or the HADS. Furthermore, the use of a scale that assesses anxiety as well as depressive symptoms, such as the BSI-18, GHQ-28, the HADS, the Psychosocial Screen for Cancer, or the RSCL, may prevent anxiety disorders from being overlooked within a routine screening program. We argue that, depending on the physical condition of the patients and the treatment setting, relatively short tools should be used for the screening of palliative care patients or patients who are undergoing strenuous treatment. Furthermore, the use of shorter tools for routine screening in an inpatient setting is easier to justify and implement. By contrast, patients who have completed treatment, have follow-up appointments, or are attending rehabilitative care may have more physical resources (eg, compared with patients under chemotherapy treatment or palliative care patients) and more time to complete longer questionnaires. Moreover, cancer patients who are undergoing treatment may require immediate psychological support, whereas cancer survivors may need to adapt to the disease in the long term. For the latter patients, a more extensive psychological assessment seems to be needed. Although single-item interviews may have a useful role in assessing distress in palliative care patients by minimizing patient burden, it is also true that somewhat longer scales may have higher content validity and may be better suited for longitudinal assessments. Future research should compare the accuracy and appropriateness of tools of differing lengths in specific treatment settings. Choosing a tool for routine screening of cancer patients requires a trade-off between a measure with adequate psychometric properties and one with a reasonable length. 
It has been shown that computerized versions of screening instruments that use touch screen technology can be used successfully, including by older patients (155). The use of fully computerized touch screen and autoscoring technology minimizes the workload of oncology treatment personnel, further reduces costs, and ensures the continuity and standardization of its application. The usefulness of a screening program for emotional distress can be evaluated according to whether or not screened patients accept referral to a mental health professional. Shimizu et al. (156) found that neither patient demographic variables nor the level of physical functioning, disease stage, or treatment status was associated with acceptance of a referral by the patient, whereas level of distress was, thus providing evidence that screening for emotional distress can result in enhanced utilization of psychological treatment. Compared with structured clinical interviews, distress screening instruments tend to overestimate the prevalence rates of depressive disorders in cancer patients (116). In this regard, measures that have superior psychometric properties may, therefore, reduce the workload of psycho-oncology staff and allow for the accurate forecasting of resource needs. When clinic staff, alone or in cooperation with researchers, want to undertake distress tracking over time to assess treatment outcomes and/or learn more about adjustment processes longitudinally, then ultrashort screening tools tend to fall short because they lack a range of scores. Only the longer versions of measures can accomplish such objectives. Several limitations of this systematic review must be noted. Some validation studies or measures could have been overlooked because of the fact that only peer-reviewed articles were included in this review. On the other hand, the scientific accuracy of such studies or measures would have remained unclear because of their lack of peer review. 
Furthermore, we only included validation studies that provided information on construct validity, discriminant validity, and/or concurrent validity for at least one additional measure, and we excluded feasibility studies that only reported on the measure itself or on a translation of the measure. Many studies that were included only reported on limited aspects of validation. Of these, several described results of factor analyses, as well as subscale and total scale reliabilities, whereas others provided data from ROC analyses without information on reliability. Also, many included studies did not provide sufficient descriptive statistics to allow us to compute missing indices of sensitivity, specificity, positive predictive values, and negative predictive values. Consequently, the conclusions we draw in this review depend on the information given in the original reports. However, the strength of a systematic review is that it provides a broader scope than meta-analyses, which typically combine studies of varying types and consequently provide only summary statistics. Hence, this systematic review is, to our knowledge, the most comprehensive review to date that addresses a broad range of screening tools, varying types of cancers, and disease stages. In conclusion, several generic and newly developed cancer-specific instruments meet high-quality criteria for use in emotional distress screening of cancer patients. Many general emotional distress screening tools focus on depression. Nonetheless, highly prevalent transient anxiety or mixed emotional disorders that occur during the cancer diagnosis and treatment trajectory deserve the attention of clinicians. Hence, the exclusive use of a depression scale may overlook other disorders (eg, anxiety disorders). Consequently, a scale that measures mixed emotional states rather than depression only has clear merit for clinical practice. 
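The limitation noted above, that many studies lacked the descriptive statistics needed to compute missing indices, refers to the four cells of a screening-by-criterion table. The sketch below shows, under that assumption, how sensitivity, specificity, and the positive and negative predictive values follow from those counts. The function name and the example counts are hypothetical and do not come from any reviewed study.

```python
def screening_indices(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts.

    tp: screen positive, criterion positive   fp: screen positive, criterion negative
    fn: screen negative, criterion positive   tn: screen negative, criterion negative
    An index is None when its denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # share of true cases detected
        "specificity": ratio(tn, tn + fp),  # share of non-cases correctly cleared
        "ppv": ratio(tp, tp + fp),          # share of positive screens that are cases
        "npv": ratio(tn, tn + fn),          # share of negative screens that are non-cases
    }


# Hypothetical counts for 200 screened patients (illustration only).
indices = screening_indices(tp=45, fp=30, fn=5, tn=120)
```

With these made-up counts, a tool can look strong on sensitivity and specificity while still referring a sizeable share of patients who do not meet the criterion, which is why a report that omits the underlying counts limits what a reviewer can recompute.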
Apart from purely psychometric considerations, large-scale implementation of screening for emotional distress may not occur if a given test has to be purchased for each use. This factor alone may have an impact on the choice of a screening tool, given that some well-validated screening tools have to be purchased for every use, whereas others are available at no cost. Another useful criterion for deciding which tool to use is the treatment setting. For example, treatment centers that specialize in breast or prostate cancer may prefer to use disease-specific measures. In terms of actual decision making, it is important to recognize that a measure's sensitivity and specificity are a function of the cutoff that is used to distinguish anxious or depressed patients from nonanxious or nondepressed patients. Higher cutoffs improve the measure's specificity, and treatment facilities can decide upfront, by consciously choosing a specific cutoff, the amount of psychological and psychotherapeutic follow-up treatment they are willing to or can provide. Given that we were able to find a large number of well-executed validation studies on distress screening tools, we question whether the development of additional tools at this time should be discouraged to avoid redundancy. However, it may be worthwhile to initiate additional attempts to improve the validity of work on the tools that have good psychometric properties but that have not yet been validated against criterion standards. Worthy of note is an ongoing National Institutes of Health project—the Patient-Reported Outcomes Measurement Information System network (http://www.nihpromis.org/default.aspx)—to improve measures of patient-reported outcomes. A number of tools for the assessment of emotional distress in patients with chronic diseases are in the process of being developed within this network that may be useful as potential screening tools for emotional distress in cancer patients in the future. 
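The point made above, that sensitivity and specificity are a function of the chosen cutoff, can be illustrated with a small sketch. It assumes that higher scores indicate more distress and that a patient screens positive when the score is at or above the cutoff. The scores, criterion diagnoses, and cutoffs are invented for illustration and do not represent any instrument's published thresholds.

```python
def sens_spec_at_cutoff(scores, is_case, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as screen positive."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire scores and criterion diagnoses (True = case).
scores = [3, 5, 7, 8, 9, 10, 11, 12, 14, 16, 18, 20]
is_case = [False, False, False, False, True, False,
           False, True, True, True, True, True]

# Sweep three illustrative cutoffs and collect (cutoff, sensitivity, specificity).
sweep = [(c, *sens_spec_at_cutoff(scores, is_case, c)) for c in (8, 11, 15)]
for cutoff, sens, spec in sweep:
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy sample, raising the cutoff lowers sensitivity and raises specificity, which mirrors the trade-off a treatment facility makes when it selects a cutoff to match the follow-up capacity it can provide.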
Empirical findings published to date do not allow us to judge the predictive validity of screening tools for emotional distress. Nonetheless, the screening tools recommended here are effective for routine screening of emotional distress based on their high sensitivity and specificity. However, further information is needed about how screening affects long-term outcomes and patient quality of life.

Funding

Canadian Institutes for Health Research (team grant #AQC83559).

138 in total

1. A new psychosocial screening instrument for use with cancer patients.

Authors: J Zabora; K BrintzenhofeSzoc; P Jacobsen; B Curbow; S Piantadosi; C Hooker; A Owens; L Derogatis
Journal: Psychosomatics Date: 2001 May-Jun Impact factor: 2.386

2. The PHQ-9: validity of a brief depression severity measure.

Authors: K Kroenke; R L Spitzer; J B Williams
Journal: J Gen Intern Med Date: 2001-09 Impact factor: 5.128

3. Rasch analysis of the dimensional structure of the Hospital Anxiety and Depression Scale.

Authors: A B Smith; E P Wright; R Rush; D P Stark; G Velikova; P J Selby
Journal: Psychooncology Date: 2006-09 Impact factor: 3.894

4. Distress and coping in cancer patients: feasibility of the Icelandic version of BSI 18 and the WOC-CA questionnaires.

Authors: E Hjörleifsdóttir; I R Hallberg; I A Bolmsjö; E D Gunnarsdóttir
Journal: Eur J Cancer Care (Engl) Date: 2006-03 Impact factor: 2.520

5. Acceptability of common screening methods used to detect distress and related mood disorders-preferences of cancer specialists and non-specialists.

Authors: Alex J Mitchell; Stephen Kaar; Chris Coggan; Joanne Herdman
Journal: Psychooncology Date: 2008-03 Impact factor: 3.894

6. Depressive symptoms in advanced cancer. Part 2. Depression over time; the role of the palliative care professional.

Authors: H A Martine Meyer; Claire Sinnott; Paul T Seed
Journal: Palliat Med Date: 2003-10 Impact factor: 4.762

7. Use of the Zung Self-Rating Depression Scale in cancer patients: feasibility as a screening tool.

Authors: W Dugan; M V McDonald; S D Passik; B D Rosenfeld; D Theobald; S Edgerton
Journal: Psychooncology Date: 1998 Nov-Dec Impact factor: 3.894

8. The value of the Hospital Anxiety and Depression Scale (HADS) for comparing women with early onset breast cancer with population-based reference women.

Authors: R H Osborne; G R Elsworth; M A G Sprangers; F J Oort; J L Hopper
Journal: Qual Life Res Date: 2004-02 Impact factor: 4.147

9. Assessment of psychological distress in prospective bone marrow transplant patients.

Authors: P C Trask; A Paterson; M Riba; B Brines; K Griffith; P Parker; J Weick; P Steele; K Kyro; J Ferrara
Journal: Bone Marrow Transplant Date: 2002-06 Impact factor: 5.483

10. Development of a distress inventory for cancer: preliminary results.

Authors: B C Thomas; V N Mohan; I Thomas; M Pandey
Journal: J Postgrad Med Date: 2002 Jan-Mar Impact factor: 1.476
