How to Assess Shoulder Functionality: A Systematic Review of Existing Validated Outcome Measures.

Rocio Aldon-Villegas1, Carmen Ridao-Fernández1, Dolores Torres-Enamorado2, Gema Chamorro-Moriana1.   

Abstract

The objective of this review was to compile validated functional shoulder assessment tools and analyse the methodological quality of their validations. Secondarily, we aimed to provide a comparison of the tools, including parameter descriptions, indications/applications, languages and operating instructions, to help choose the most suitable tool for future clinical and research approaches. A systematic review (PRISMA) was conducted using PubMed, WoS, Scopus, CINAHL, Dialnet and reference lists until 2020. The main criteria for inclusion were that papers were original studies of validated tools or validation studies. Pre-established tables showed tools, validations, items/components, etc. The QUADAS-2 and COSMIN-RB were used to assess the methodological quality of validations. Ultimately, 85 studies were selected, covering 32 tools and 111 validations. Risk of bias scored lower than applicability, and patient selection obtained the best scores (QUADAS-2). Internal consistency had the highest quality and PROMs development the lowest (COSMIN-RB). Responsiveness was the most analysed metric property. Modified UCLA and SST obtained the highest quality in shoulder instability surgery, and SPADI in pain. The most frequently addressed topic was activities of daily living (81%). We compiled 32 validated functional shoulder assessment tools, and conducted an analysis of the methodological quality of 111 validations associated with them. Modified UCLA and SST showed the highest methodological quality in instability surgery and SPADI in pain.

Keywords:  assessment scale; methodological quality; outcome measure; psychometrics properties; shoulder; systematic review

Year:  2021        PMID: 34066777      PMCID: PMC8151204          DOI: 10.3390/diagnostics11050845

Source DB:  PubMed          Journal:  Diagnostics (Basel)        ISSN: 2075-4418


1. Introduction

Shoulders are essential to human beings’ functionality. Their specific biomechanics make the shoulder the most mobile joint complex in the body, providing the upper limbs with mobility in the three axes of space [1]. Shoulder dysfunctions are frequent and are the third most common cause of musculoskeletal consultations in primary health care [2]. In fact, arthritis [3], rotator cuff (RC) injuries [4], shoulder instabilities [5] and fractures [6,7] constitute a large part of traumatology dysfunctions. Approximately 1% of the adult population in developed countries visit their doctor annually for shoulder pain [8]. For example, the incidence in the United Kingdom is 9.5 per 1000 inhabitants [9] and the annual prevalence in Spain is 70 to 200 cases per 1000 residents [10]. Currently, functional shoulder assessment methods are necessary to identify structural and/or biomechanical changes, and to link them to patients’ functional limitations and disabilities [11]. Furthermore, their use has increased in recent years [12], as they enable therapists and patients to work more objectively, standardise professional terms and develop and apply treatment protocols. All of this favours comparative analysis of the results obtained by different interventions [13] and justifies the development of many functional shoulder assessment tools, with varying degrees of methodological quality and efficacy in the clinical setting. However, the wide range of possibilities and the difficulty of accessing outcome measures mean that selecting the most appropriate way to assess shoulder function and disability can be a difficult task [11]. The content of each tool should be adapted to shoulder pathologies, symptoms, user characteristics and cultural, population and occupational contexts [11].
In addition, clinicians should also consider methodological quality criteria, based on the tools’ validation studies and practical characteristics (e.g., duration and administration method), before making a decision [14]. Thus, because of the important role shoulders play in human beings’ functionality, the high incidence of shoulder dysfunctions, the importance of functional evaluation, the large number of tools and the difficulty in accessing them, the purpose of this systematic review was to compile the validated functional assessment tools and analyse the methodological quality of the validations associated with them. A second aim was to provide an operational comparison of the tools by means of parameter descriptions, indications, applications, languages and instructions for use, in order to help choose the most suitable tool for future clinical and research approaches.

2. Materials and Methods

The method employed in this systematic review is based on the PRISMA statement [15]. The protocol was registered in the PROSPERO database (CRD42020218616).

2.1. Data Sources and Search Strategy

An electronic search of PubMed, Web of Science (WoS), Scopus, Cumulative Index to Nursing and Allied Health Literature (CINAHL) and Dialnet (accessed on 11 January 2021) was conducted from inception through 2020, inclusive. The reference lists in this review and in each selected study were also considered to find other related articles. All papers that met the inclusion criteria were accepted. Most of the search terms used in this study came from MeSH (Medical Subject Headings). Other terms of interest were included due to their frequent use. The terms applied and the full list of search strategies are reported in Table 1.

Table 1. Terms and search strategies.

Terms | Identifier
Scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating | 1
Shoulder | 2

Database | Search strategy | Simplified strategy | Filters employed
PubMed | shoulder AND (scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating) | 1 AND 2 | In humans
Web of Science | shoulder AND (scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating) | 1 AND 2 | -
Scopus | shoulder AND (scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating) | 1 AND 2 | -
CINAHL | shoulder AND (scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating) | 1 AND 2 | -
Dialnet | shoulder AND (scale OR scor* OR questionnaire OR test OR index OR assess* OR examination OR measure OR evaluation OR rating) | 1 AND 2 | -

Note: MeSH terms are in italics. All databases were filtered by language. Papers written in English, Spanish and French were included. Abbreviations: scor*: score, scoring; assess*: assess, assessment.
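
The simplified strategy "1 AND 2" in Table 1 can be expanded mechanically into the full query string. The following is a minimal sketch of that expansion; the helper names `or_group` and `build_query` are illustrative assumptions and do not correspond to any database's API.

```python
# Term groups as listed in Table 1 (identifiers 1 and 2).
TERMS_1 = ["scale", "scor*", "questionnaire", "test", "index", "assess*",
           "examination", "measure", "evaluation", "rating"]
TERMS_2 = ["shoulder"]


def or_group(terms):
    """Join terms with OR; parenthesise only when the group has several terms."""
    joined = " OR ".join(terms)
    return f"({joined})" if len(terms) > 1 else joined


def build_query(*groups):
    """Combine the term groups with AND, in the order given."""
    return " AND ".join(or_group(group) for group in groups)


# "1 AND 2", written with the shoulder term first as in Table 1.
query = build_query(TERMS_2, TERMS_1)
print(query)
```

Language and species filters are applied through each database's own interface and are deliberately left out of this sketch.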

2.2. Study Selection

The included papers met the following inclusion criteria: original studies of validated functional shoulder assessment tools or validation studies (original or subsequent) associated with the identified scales, including physical tools or not; for human beings; validation studies in English, Spanish or French (outcome measures were included in any language). The reviewers, RA and DT, separately screened titles and abstracts of the search results to check whether the studies met the pre-established inclusion criteria. GC resolved any disagreements. The full texts of the studies that met the criteria were acquired and the causes for any exclusion at this stage were documented.

2.3. Data Extraction

Data extraction was carried out by one reviewer (RA) and verified by a second reviewer (CR). Discrepancies between reviewers were resolved by a third reviewer (GC), who assessed the information independently. A pre-designed table details data regarding the shoulder assessment tools: authors, years, original validation studies and subsequent validation studies, indications/applications, countries of origin, languages, descriptions and instructions for use, observations (e.g., recommendations, location of the physical scale) and bibliographic references of interest. Another table shows the study population of the validations. In addition, a pre-designed comparative table includes data regarding the contents of items and components of each tool. The content percentages are represented in a complementary bar chart. The quality assessment results were recorded in two standardised tables.

2.4. Assessment of Methodological Quality

Two assessment scales were used to evaluate the methodological quality of the validations included: the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) [16] and the COnsensus-based Standards for the selection of health Measurement Instruments Risk of Bias checklist (COSMIN RB) [17]. QUADAS-2 is an evaluation scale for the diagnostic criteria of validation studies. It was employed to assess risk of bias and applicability. The seven items of this tool helped identify a “low”, “high” or “unclear” risk of bias in each domain, or concerns regarding applicability. QUADAS-2 is recommended by the Agency for Healthcare Research and Quality, the Cochrane Collaboration and the UK National Institute for Health and Clinical Excellence for use in systematic reviews regarding diagnostics [16]. COSMIN RB [17] was used as a tool to assess the adequacy and validity of the identified outcome measures [18]. This checklist consists of ten boxes, which correspond to ten metric properties, each of which contains items concerning aspects of design and statistical method. Each assessment was classified in a range of four levels: “very good”, “adequate”, “doubtful” and “inadequate” [17]. COSMIN RB does not take into account the metric properties that have not been applied in papers; that is, it does not evaluate the absence of metric properties negatively.
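
How COSMIN RB ratings combine can be illustrated with a short sketch. The four levels and the rule that absent properties are not penalised come from the text above. Taking the lowest item rating as the rating of a box follows the "worst score counts" convention of the COSMIN manual; it is an assumption here, since this review does not spell it out. The function names and the example data are hypothetical.

```python
# Rating levels ordered from best to worst, as listed in the text.
LEVELS = ["very good", "adequate", "doubtful", "inadequate"]


def box_rating(item_ratings):
    """Rating of one box: the worst rating among its items.

    ASSUMPTION: 'worst score counts', taken from the COSMIN manual,
    not from this review.
    """
    return max(item_ratings, key=LEVELS.index)


def rate_study(boxes):
    """Rate only the metric properties a study assessed.

    Properties with no rated items are skipped rather than scored
    negatively, mirroring the behaviour described in the text.
    """
    return {prop: box_rating(items) for prop, items in boxes.items() if items}


# Hypothetical study: three boxes, one of them not assessed.
example = {
    "internal consistency": ["very good", "adequate"],
    "responsiveness": ["very good", "doubtful", "adequate"],
    "measurement error": [],
}
ratings = rate_study(example)
print(ratings)
```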

3. Results

3.1. Search Results

A total of 184,043 records were identified in electronic databases and through manual search (reference lists). Following the removal of duplicates, 93,159 studies were screened by title, abstract and full text. Studies were rejected if they were not original studies of validated functional shoulder assessment scales or validation studies (whether original or subsequent) associated with these outcome measures; were not in humans; or were not published in English, French or Spanish. After the screening, 85 studies were selected: 32 validated functional shoulder assessment scales and 73 validation studies including 111 validations associated with them. Note that some validation studies validated more than one tool, and some original studies were not validation papers. Figure 1 presents the flow diagram of the study selection process based on the PRISMA protocol [15].
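
The record flow reported above implies two figures that the text does not state directly. As a small worked check, using only the three counts given in the paragraph (the variable names are illustrative):

```python
# Counts reported in the text.
records_identified = 184_043
records_screened = 93_159   # after removal of duplicates
studies_selected = 85

# Derived figures (not stated explicitly in the text).
duplicates_removed = records_identified - records_screened
excluded_at_screening = records_screened - studies_selected

print(duplicates_removed)      # 90884
print(excluded_at_screening)   # 93074
```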

Figure 1. PRISMA flow diagram. Note: Some validation studies validated more than one assessment tool, and a few original studies were not validation papers.

3.2. Characteristics of the Included Assessment Tools

Table 2 shows a summary of the descriptive data from the selected shoulder outcome measures: authors, years, original validation studies, other subsequent validation studies, indications or applications, countries of origin and languages, descriptions and instructions for use and observations (e.g., recommendations, location of physical scales).
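
Several tools summarised in Table 2 derive their totals from explicit formulas (ASES, FSS, MSQ, ShortWORC and SPADI). As a minimal sketch, the formulas quoted in the table could be written as follows. The function and argument names are illustrative assumptions, as is the 0 to 10 item range used for the SPADI (the table only says that a VAS and an NRS version exist). The ASES parentheses follow the most plausible reading of the printed formula. The forms cited in the table remain the reference for clinical use.

```python
def ases_score(vas_pain, adl_total):
    """ASES: ((10 - VAS pain) * 5) + (5/3 * cumulative ADL score).

    A total of 100 implies a cumulative ADL score of at most 30.
    """
    return (10 - vas_pain) * 5 + 5 * adl_total / 3


def fss_score(pain_score, adl_total):
    """FSS: (pain score * 5) + (ADL score / 2); pain 0-10, ADL 0-100."""
    return pain_score * 5 + adl_total / 2


def percent_of_normal(total, maximum):
    """Percentage formula quoted for the MSQ (maximum=314) and the
    ShortWORC (maximum=700): ((maximum - total) / maximum) * 100."""
    return (maximum - total) / maximum * 100


def spadi_score(pain_items, disability_items, item_max=10):
    """SPADI: each subscale is rescaled to 0-100, then the two are averaged.

    ASSUMPTION: every item is scored from 0 to item_max.
    """
    pain = sum(pain_items) / (item_max * len(pain_items)) * 100
    disability = sum(disability_items) / (item_max * len(disability_items)) * 100
    return (pain + disability) / 2


# Hypothetical answers, for illustration only.
print(ases_score(vas_pain=2, adl_total=24))
print(fss_score(pain_score=8, adl_total=70))
print(percent_of_normal(total=175, maximum=700))
print(spadi_score([5] * 5, [10] * 8))
```

Note that the direction of each score differs between tools: for the ASES and FSS a higher total indicates a better shoulder, whereas for the SPADI and ShortWORC a higher raw total indicates more disability.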

Table 2. Characteristics of the included assessment tools.

Tool (author, year) | Original validation studies | Other subsequent validation studies | Indications/applications | Country of origin | Languages | Description and operating instructions | Observations (recommendations, physical scale, etc.)
1. AMERICAN SHOULDER AND ELBOW SURGEONS STANDARDIZED SHOULDER ASSESSMENT FORM (ASES) (Richards et al., 1994) [19] | Beaton et al., 1996 [20]. | Beaton et al., 1998 [21], Cook et al., 2002 [22], Michener et al., 2002 [23], Oh et al., 2009 [24], Kemp et al., 2012 [25], Sciascia et al., 2017 [26], Dabija et al., 2019 [27], Vrotsou et al., 2019 [28], Gotlin et al., 2020 [29], Hou et al., 2020 [30], Baumgarten et al., 2020 [31]. | Shoulder instability [25], total shoulder arthroplasty [26], rotator cuff (RC) tears [27], LHBT tenotomy [32], proximal humerus fracture [33], scapular dyskinesia and shoulder pain [34]. | United States | English [19], German [35], Italian [36], Arabic [37], Turkish [38], Dutch [39], Finnish [40], Portuguese [41], Spanish [42,43]. | It consists of 2 sections: a patient self-evaluation and a clinician assessment. The patient self-evaluation form is divided into 2 parts: pain and instability, and activities of daily living (ADL). The clinician assessment portion consists of 3 components: range of motion (ROM), signs, and strength and instability. The shoulder score is derived by the following formula: ((10 − visual analogue scale pain score) × 5) + (5/3 × cumulative ADL score). The maximum score (100 points) indicates optimal state of the shoulder. Beaton et al. [21] made a modification to ADL; 2 items were eliminated and 5 were added. | The physical scale can be found in [19].
2. CONSTANT-MURLEY SCORE (CMS) (Constant et al., 1987) [44] | Conboy et al., 1996 [45]. | Cook et al., 2002 [22], Angst et al., 2008 [46], Razmjou et al., 2008 [47], Rocourt et al., 2008 [48], Oh et al., 2009 [24], Kemp et al., 2012 [25], Ban et al., 2016 [49], Mahabier et al., 2016 [50], Sciascia et al., 2017 [26], James-Belin et al., 2018 [51]. | Shoulder arthroplasty [46], RC disease [47], shoulder instability [25], clavicle fractures [49], humeral shaft fractures [50], subacromial pain [52]. | United States | English [44], Chinese [53], French [54], Portuguese-Brazilian [55], Italian [56], Arabic [57]. | It consists of 13 items divided into 4 components: pain (15 points), ADL (20 points), ROM (40 points), and strength (25 points). The maximum score (100 points) indicates optimal state of the shoulder. Self-administered section and clinician assessment section. | CMS is one of the most commonly used international shoulder scoring scales [53]. The physical scale can be found in [44].
3. DUTCH SHOULDER DISABILITY QUESTIONNAIRE (DUTCH-SDQ) (Van der Heijden et al., 1996) [58] | Van der Windt et al., 1998 [59]. | Van der Heijden et al., 2000 [60], Paul et al., 2004 [61]. | Shoulder disorders [59,60], shoulder pain [61,62]. | Netherlands | English [59], Spanish [62]. | It is composed of 16 questions related to shoulder functionality. Each item is scored by ticking a “yes”, “no”, or “not applicable” box according to whether or not it describes the patient. Items ticked “yes” are summed and normalized to 100. The maximum score (100 points) indicates the highest degree of disability. Self-administered. | The physical scale can be found in [59].
4. FLEXILEVEL SCALE OF SHOULDER FUNCTION (FLEX-SF) (Cook et al., 2003) [63] | Cook et al., 2003 [63]. | - | Shoulder disorders [64], frozen shoulder syndrome [65], shoulder tightness [66], adhesive capsulitis [67], subacromial impingement syndrome [68], RC tears [69]. | United States | English [63]. | It includes 3 tests that target low, medium, and high shoulder function. Each level is composed of 15 items. Each item is valued from 0 to 5. The patient performs 1 of these 3 levels of difficulty based on their lesion. The maximum score (60 points) indicates optimal state of the shoulder. Self-administered. | The physical scale can be found in [63].
5. FUDAN UNIVERSITY SHOULDER SCORE (Ge et al., 2013) [70] | Ge et al., 2013 [70]. | - | Shoulder disorders [70], arthroscopic repair of the supraspinatus [71]. | China | English [70]. | It is composed of 4 domains: pain (20 points), ADL (27 points), ROM and strength (32 points), and satisfaction of the patient and clinician (21 points). The maximum score (100 points) indicates the optimal state of the shoulder. Self-administered section and clinician assessment section. | Domains 1 and 2 comprise self-report assessments by patient, domain 3 is set up for clinician assessment. Section 4 is designed for both patient and clinician assessments. The physical scale can be found in [70].
6. FUNCTIONAL SHOULDER SCORE (FSS) (Iossifidis et al., 2015) [72] | Iossifidis et al., 2015 [72]. | - | RC disorders [72]. | United Kingdom | English [72]. | It is composed of 11 items divided into 2 categories: pain (1 item; 50 points) and ADL (10 items; 50 points). Each item has a possible score from 0 to 10 points. The maximum score (100 points) indicates the optimal state of the shoulder. Total score = (pain score × 5) + (ADL score/2). Self-administered. | The physical scale can be found in [72].
7. KOREAN SHOULDER SCORING SYSTEM (KSS) (Tae et al., 2009) [73] | Tae et al., 2009 [73]. | - | RC disorders [73], RC repair [74,75,76], acromioclavicular joint dislocation [77], humeral fracture [78], adhesive capsulitis [79]. | Korea | English [73]. | It is composed of 5 domains: function (ADL) (30 points); pain (20 points); satisfaction (10 points); ROM (20 points); and muscle power, consisting of strength (10 points) and endurance (10 points). The maximum score (100 points) indicates optimal state of the shoulder. Self-administered section and clinician assessment section. | The physical scale can be found in [73].
8. MELBOURNE INSTABILITY SHOULDER SCALE (MISS) (Watson et al., 2005) [80] | Watson et al., 2005 [80]. | - | Shoulder instability [80,81]. | United States | English [80]. | It is composed of 22 items divided into 4 subgroups: pain (15 points), instability (33 points), function (32 points), and occupation and sporting demands (20 points). The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. | The scale contains a personal data sheet and medical information of interest. The physical scale can be found in [80].
9. MODIFIED CONSTANT-MURLEY SCORE (CMS) (Constant et al., 2008) [82] | Van de Water et al., 2014 [83]. | - | Proximal humeral fracture [83], shoulder impingement syndrome [84], shoulder pain [85], RC tears [86]. | United States | English [82], Danish [84], Greek [85], Turkish [87]. | It consists of 13 items divided into 4 components: pain (15 points), ADL (20 points), ROM (40 points), and strength (25 points). The maximum score (100 points) indicates optimal state of the shoulder. Self-administered section and clinician assessment section. | It is a modification of the Constant-Murley Score [44]. Constant modified how to measure pain (VAS added), ADL (questions included), ROM (more indications), and strength (guidelines for correct measures). The physical scale can be found in [82].
10. MODIFIED ROWE SHOULDER SCORE (MRS) (Rowe et al., 1981) [88] | Romeo et al., 1996 [89]. | - | Shoulder subluxation [88], shoulder instability [89], anterior capsulolabral reconstruction [90], proximal humerus fractures [91], posterior shoulder dislocation [92], and SLAP lesion [93]. | United States | English [88], Portuguese [93]. | It consists of 4 components: function (50 points), pain (10 points), stability (30 points), and ROM (10 points). The maximum score (100 points) indicates optimal state of the shoulder. Interpretation: excellent (90–100 points), good (70–89 points), fair (40–49 points), and poor (<39 points). Self-administered section and clinician assessment section. | It is a modification of the Rowe Scale [94]. The items and the interpretation of both scales are different. The physical scale can be found in [88].
11. MODIFIED UNIVERSITY OF CALIFORNIA-LOS ANGELES SHOULDER SCALE (UCLA) (Ellman et al., 1986) [95] | Cook et al., 2002 [22]. | Oh et al., 2009 [24], Van de Water et al., 2014 [83], Vascellari et al., 2018 [96]. | Shoulder disorders [22], shoulder surgery [24], proximal humeral fractures [83], anterior shoulder instability surgery [96], RC repair and proximal humeral fracture osteosynthesis [97], impingement syndrome [98]. | United States | English [95], Italian [99]. | It is composed of 5 components: pain (10 points), function (10 points), ROM (5 points), muscular strength (5 points), and patient satisfaction (5 points). The maximum score (35 points) indicates optimal state of the shoulder. Interpretation: excellent evaluation (35–34 points), good (33–29 points), fair (27–21 points), and poor result (<20 points). Self-administered section and clinician assessment section. | The physical scale can be found in [95].
12. MUNICH SHOULDER QUESTIONNAIRE (MSQ) (Schmidutz et al., 2012) [100] | Schmidutz et al., 2012 [100]. | - | Shoulder disorders [100], reconstruction of proximal humerus fractures [101], subacromial impingement [102], dislocated fracture of the lateral clavicle [103]. | Germany | English [100]. | It consists of 30 items divided into 6 domains: ROM (5 items; 50 points), power of the shoulder (1 item; 24 points), pain (6 items; 60 points), work and ADL (9 items; 90 points), recreational activities/sports (6 items; 60 points), and social life (3 items; 30 points). The maximum score (314 points) indicates optimal state of the shoulder. The score can be reported as a percentage of normal by subtracting the total from 314, dividing by 314, and multiplying by 100, i.e., ((314 − total score)/314) × 100. Self-administered. | It consists of 3 parts: cover sheet, objective section, and subjective assessment. The physical scale can be found in [100].
13. OXFORD INSTABILITY SCORE (OIS) (Dawson et al., 1999) [104] | Dawson et al., 1999 [104]. | Van der Linde et al., 2017 [105]. | Shoulder instability [105,106], arthroscopic Bankart repair [107], SLAP lesion [108]. | United Kingdom | English [104], Dutch [106], Italian [109], Turkish [110]. | It is composed of 12 questions in relation to shoulder instability (5 points), pain (10 points), occupational sphere (5 points), ADL (20 points), physical and sport activities (5 points), social life (5 points), and psychosocial aspects (10 points). The maximum score (60 points) indicates the highest degree of disability. Self-administered. | The physical scale can be found in [104].
14. OXFORD SHOULDER SCORE (OSS) (Dawson et al., 1996) [111] | Dawson et al., 1996 [111]. | Van de Water et al., 2014 [83]. | Impingement or tendinitis of the shoulder [112], disorders of the RC [113], proximal humerus fractures [113], shoulder pain [114], shoulder disorders, rheumatoid arthritis [115], frozen shoulder [116]. | United Kingdom | English [111], German [112], Romanian [113], French [114], Portuguese-Brazilian [115], Portuguese [117], Polish [118], Turkish [119], Korean [120], Chinese [121], Italian [122], Dutch [123], Persian [124], Danish [125], Norwegian [126], Arabic [127], Spanish [128]. | It is composed of 12 items divided into 2 subscales: pain (20 points) and ADL (40 points). Each item is rated from 1 to 5 points. The maximum score (60 points) indicates the highest degree of disability. Self-administered. | The physical scale can be found in [111].
15. PEDIATRIC/ADOLESCENT SHOULDER SURVEY (PASS) (Edmonds et al., 2017) [129] | Edmonds et al., 2017 [129]. | - | Shoulder disorders [129], shoulder instability [130], glenoid labral pathology [131]. | United States | English [129]. | It consists of 13 questions that assess symptoms, limitations, need for compensatory mechanisms, and emotional distress. Each question is provided on a 0 to 5 scale (questions 2, 4, 9, 10, 12, 13) or 0 to 10 scale (questions 1, 3, 5–8, 11). Once reverse scoring is applied to items 1 through 9, the reverse scores from items 1 through 9 are summed together with the actual scores from items 10 through 13. The formula for the total score is: SUM (reverse score 1–9, 10–13)/100. The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. | The PASS was developed because most of the adult-age questionnaires ask questions that are not age appropriate. The physical scale can be found in [129].
16. PENN SHOULDER SCORE (PSS) (Leggin et al., 1999) [132] | Cook et al., 2001 [133]. | Leggin et al., 2006 [134]. | Shoulder disorders [135], subacromial pain syndrome [136], reverse shoulder arthroplasty [137], RC repair [138], scapular dyskinesis [139], shoulder pain [140]. | United States | English [134], Turkish [135], Portuguese [140], Brazilian [141]. | It consists of 24 items divided into 3 components: pain (30 points), satisfaction (10 points), and function (60 points). The pain subscale consists of 3 pain items. All are based on a 10-point numeric rating scale. Patient satisfaction is also assessed with a 10-point numeric rating scale. The function subsection is based on a sum of 20 items, each with a 4-point Likert scale. The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. | The PSS can be used in the aggregate or each subscale individually. The physical scale can be found in [134].
17. ROTATOR CUFF QUALITY OF LIFE (RC-QOL) (Hollinshead et al., 2000) [142] | Hollinshead et al., 2000 [142]. | Razmjou et al., 2006 [143], Eubank et al., 2017 [144]. | RC disease [142,144], impingement syndrome, RC repair, acromioplasty, or decompression surgeries [143], full-thickness RC tears [145], latissimus dorsi tendon transfer and partial cuff repair in irreparable postero-superior RC tear [146], chronic RC tear [147]. | Canada | English [142], Italian [148], Chinese [149,150], Turkish [151], German [152], Spanish [153]. | It is composed of 34 items divided into 5 domains: symptoms and physical complaints (16 items), sport/recreation (4 items), work-related concerns (4 items), lifestyle issues (5 items), and social and emotional issues (5 items). Each item has a possible score from 0 to 100 (100 mm visual analogue scale). The maximum score (100 mm) indicates optimal state of the shoulder. Self-administered. | The instrument provides instructions to the patients. The physical scale can be found in [142].
18. ROWE SCALE (Rowe et al., 1978) [94] | Romeo et al., 1996 [89]. | Oh et al., 2009 [24]. | Shoulder instability [89], shoulder surgery [24], anterior shoulder luxation [154], anterior shoulder reconstruction [155], Latarjet surgery for traumatic anterior shoulder instability [156], arthroscopic Bankart repair for shoulder instability [157]. | United States | English [94], Portuguese [158]. | It is composed of 3 components: shoulder stability (50 points), ROM (20 points), and function (30 points). The maximum score (100 points) indicates optimal state of the shoulder. Interpretation: excellent evaluation (90–100 points), good (75–89 points), fair (74–51 points), and poor evaluation (50–0 points). Self-administered section and clinician assessment section. | There are 4 different Rowe score versions. This is the original version of the Modified Rowe Scale. This is the first scale developed for this purpose. The physical scale can be found in [94].
19. SHORT WESTERN ONTARIO ROTATOR CUFF INDEX (SHORTWORC) (Razmjou et al., 2012) [159] | Razmjou et al., 2012 [159]. | Dewan et al., 2016 [160], Dewan et al., 2018 [161], Furtado et al., 2020 [162]. | RC repair [159,160,161], RC pathology [162]. | Canada | English [159]. | It consists of 7 items, including all items from the WORC work and lifestyle domains except the one relating to roughhousing. Each item has a possible score from 0 to 100 (100 mm visual analogue scale) and these scores are added to give a total score from 0 to 700 points. The maximum score (700 points) indicates the highest degree of disability. The score can be reported as a percentage of normal by subtracting the total from 700, dividing by 700, and multiplying by 100, i.e., ((700 − total score)/700) × 100. Self-administered. | If answers to 10% of questions are missing for an index, the index is considered to be missing completely. The physical scale can be found in [159].
20. SHOULDER ACTIVITY RATING SCALE (SARS) (Brophy et al., 2005) [12] | Brophy et al., 2005 [12]. | - | Shoulder disorders [12,163], total shoulder arthroplasty [164]. | United States | English [12], Persian [163]. | It is a numeral sum of scores for physical activities: carrying objects 8 pounds or heavier by hand, handling objects overhead, weight training with arms, swinging motion, and lifting objects 25 pounds or heavier. Each of the 5 activity items was scored from never performed (0 points) to daily (4 points). Two additional multiple choice questions provide a score assessing participation in contact and overhead sports. The maximum score (20 points) indicates optimal state of the shoulder. Self-administered. | The physical scale can be found in [12].
21. SHOULDER FUNCTION INDEX (SFInX) (Van de Water et al., 2015) [165] | Van de Water et al., 2015 [165]. | Van de Water et al., 2015 [166]. | Proximal humeral fractures [165,166]. | Australia | English [165]. | It is composed of 13 questions that evaluate shoulder function. The scoring categories for 5 items are “able” or “unable”, and 8 items also have a middle “partially able” category, which is chosen when compensation is used to complete the task. Total raw scores are converted to a 0–100 interval level SFInX score using the conversion table on the assessment form. The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. | The physical scale can be found in [165].
22. SHOULDER PAIN AND DISABILITY INDEX (SPADI) (Roach et al., 1991) [167] | Roach et al., 1991 [167]. | Beaton et al., 1996 [20], Heald et al., 1997 [168], Beaton et al., 1998 [21], Roddey et al., 2000 [169], Cook et al., 2001 [133], Cook et al., 2002 [22], Paul et al., 2004 [61], MacDermid et al., 2006 [170], Angst et al., 2008 [46], Bicer et al., 2010 [171], Staples et al., 2010 [172], Hill et al., 2011 [173], Riley et al., 2015 [174], Jerosch-Herold et al., 2017 [175], Thoomes de Graaf et al., 2017 [176], James-Belin et al., 2018 [51], Vascellari et al., 2018 [96], Riley et al., 2019 [177], Dabija et al., 2019 [27], Boake et al., 2020 [178]. | Shoulder disorders [168], shoulder pain [171], adhesive capsulitis [172], RC disease [51], shoulder arthroplasty [179]. | United States | English [167], German [179], Arabic [180], Chinese [181,182], Danish [183], Dutch [184], Greek [185,186], Italian [99,187], Korean [188], Nepali [189], Slovene [190], Thai [191], Indian [192], Japanese [193], Spanish [194]. | It contains 13 items that assess two domains: a 5-item subscale that measures pain and an 8-item subscale that measures disability. Each subscale is summed and transformed to a score out of 100. A mean is taken of the two subscales. The maximum score (100 points) indicates the highest degree of disability. Self-administered. | There are 2 versions of the SPADI; the original version has each item scored on a visual analogue scale (VAS). The second version has items scored on a numerical rating scale (NRS) [195]. The physical scale can be found in [167].
23. SHOULDER PAIN SCORE (SPS) (Winters et al., 1996) [196] | Winters et al., 1996 [196]. | - | Shoulder arthroplasty [197], periarthritis humeroscapularis [198], arthroscopic RC repair [199], subacromial impingement [200], laparoscopic gastric bypass [201], oral squamous cell carcinoma [202]. | Netherlands | English [196]. | It contains 7 items about pain: pain at rest, pain in motion, nightly pain, sleeping problems caused by pain, incapability of lying on the painful side, degree of radiation, and numerical pain scale. Each item was scored from none (0 points) to severe/past the elbow (4 points). The maximum score (28 points) indicates the highest degree of disability. Self-administered. | The physical scale can be found in [196].
24. SHOULDER RATING QUESTIONNAIRE (SRQ) (L’Insalata et al., 1997) [203] | L’Insalata et al., 1997 [203]. | Paul et al., 2004 [61]. | Shoulder pain [61], shoulder disorders [204,205], shoulder pain or limitation of function [188]. | United States | English [203], Dutch [204], Portuguese [205], Korean [188]. | It is composed of 21 items divided into 6 groups: global evaluation (domain score multiplied by 1.5; score range, 0 to 15 points), pain (domain score multiplied by 4; score range, 8 to 40 points), ADL (domain score multiplied by 2; score range, 4 to 20 points), recreational and athletic activities (domain score multiplied by 1.5; score range, 3 to 15 points), work (domain score multiplied by 1; score range, 2 to 10 points), and satisfaction (points not included in the total score). The global assessment domain consists of a 10 cm visual analogue scale. This scale is scored from 0 to 10 points. Each of the other scored domains consists of a series of multiple-choice questions with 5 selections per score from 1 to 5 points. The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. | The physical scale can be found in [203].
25. SIMPLE SHOULDER TEST (SST) (Lippitt et al., 1993) [206]. Beaton et al., 1996 [20], Beaton et al., 1998 [21], Roddey et al., 2000 [169], Cook et al., 2001 [133], Godfrey et al., 2007 [207], Oh et al., 2009 [24], Roy et al., 2010 [208], Hsu et al., 2017 [209], Vascellari et al., 2018 [96], Baumgarten et al., 2020 [31]. Shoulder pain [21], shoulder disorders [169], shoulder instability and RC injuries [207], shoulder arthroplasty [209], anterior shoulder instability surgery [96], proximal humerus fracture [210]. United States. English [206], Italian [99], Dutch [211], Persian [212], Portuguese-Brazilian [213], Lithuanian [214], Spanish [215]. It is composed of 12 questions related to function, pain, strength, and ROM. The questions are on a dichotomous scale (1 = yes and 0 = no). The maximum score (12 points) indicates optimal state of the shoulder. Self-administered. The physical scale can be found in [206].
26. SINGLE ASSESSMENT NUMERIC EVALUATION RATING (SANE) (Williams et al., 1999) [216]. Sciascia et al., 2017 [26], Gowd et al., 2019 [217], Thigpen et al., 2018 [218], Cohn et al., 2020 [219]. Shoulder surgery [216], total shoulder arthroplasty [26,217,220], RC disease [221], glenoid labral pathology [131], shoulder instability [222]. United States. English [216]. It consists of a single question about function, valued from 0 to 100 points. The question is “How would you rate your shoulder’s function with 100 being normal?” The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. The physical scale can be found in [216].
27. SUBJECTIVE SHOULDER RATING SCALE (SSRS) (Kohn et al., 1992) [223]. Beaton et al., 1996 [20], Kohn et al., 1997 [224], Beaton et al., 1998 [21]. Shoulder disorders [21,224], shoulder pain [20]. Germany. English [224]. It is composed of 5 components: pain (35 points), ROM (35 points), instability (15 points), activity (10 points), and overhead work (5 points). The maximum score (100 points) indicates optimal state of the shoulder. Self-administered. The physical scale can be found in [224].
28. UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA)(Amstutz et al., 1981) [225] Romeo et al., 1996 [89].Roddey et al., 2000 [169].Shoulder instability [89], shoulder disorders [169], adhesive capsulitis [226], calcifying tendinitis of the shoulder [227], proximal humeral fractures [228], RC repair [229].United StatesEnglish [225].It is composed of 3 components: pain (10 points), function (10 points), and muscle power and ROM (10 points).The maximum score (30 points) indicates optimal state of the shoulder.Self-administered section and clinician assessment section.There is also a Modified UCLA [95].The physical scale can be found in [225].
29. UNITED KINGDOM SHOULDER DISABILITY QUESTIONNAIRE (UK-SDQ) (Croft et al., 1994) [230]. Croft et al., 1994 [230], Paul et al., 2004 [61]. Shoulder pain [61,230]. United Kingdom. English [230], Italian [231]. It contains 22 items about problems with daily living related to shoulder pain. The questions are on a dichotomous scale (1 = yes and 0 = no). The maximum score (22 points) indicates the highest degree of disability. Self-administered. The physical scale can be found in [230].
30. WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI) (Kirkley et al., 1998) [232]. Kirkley et al., 1998 [232], Oh et al., 2009 [24], Kemp et al., 2012 [25], Van der Linde et al., 2017 [105]. Shoulder instability [232], shoulder surgery [24], surgical correction of shoulder instability [25,233], posterior shoulder instability [234], SLAP lesion or recurrent anterior dislocation [235]. United States. English [232], French [236,237], Danish [238], Dutch [239,240], German [241], Hebrew [242], Italian [243], Japanese [244], Swedish [245], Turkish [246], Arabic [247,248], Spanish [249]. It contains 21 items that assess 4 domains: physical symptoms (10 items), sport/recreation/work (4 items), lifestyle (4 items), and emotions (3 items). Each item has a possible score from 0 to 100 (100 mm visual analogue scale) and these scores are added to give a total score from 0 to 2100 points. The maximum score (2100 points) indicates the highest degree of disability. The score can be reported as a percentage of normal by subtracting the total from 2100, dividing by 2100, and multiplying by 100, i.e., ((2100 − total score)/2100) × 100. Self-administered. The physical scale can be found in [232].
31. WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS) (Lo et al., 2001) [250]. Lo et al., 2001 [250], Sciascia et al., 2017 [26]. Osteoarthritis of the shoulder [250,251], total shoulder arthroplasty [26,252], proximal humeral fracture [253]. United States. English [250], Danish [254], Italian [255], Swedish [256], Chinese [257]. It is composed of 19 items representing 4 domains: 6 questions for pain and physical symptoms; 5 for sport, recreation, and work function; 5 for lifestyle function; and 3 for emotional function. Each item has a possible score from 0 to 100 (100 mm visual analogue scale) and these scores are added to give a total score from 0 to 1900 points. The maximum score (1900 points) indicates the highest degree of disability. The score can be reported as a percentage of normal by subtracting the total from 1900, dividing by 1900, and multiplying by 100, i.e., ((1900 − total score)/1900) × 100. Self-administered. The physical scale can be found in [250].
32. WESTERN ONTARIO ROTATOR CUFF INDEX (WORC) (Kirkley et al., 2003) [258]. Kirkley et al., 2003 [258], Razmjou et al., 2006 [143], Gadsboell et al., 2017 [259]. RC disease [258,260], impingement syndrome, RC repair, acromioplasty, or decompression surgeries [143], scapula alata [259], subacromial impingement syndrome [261]. United States. English [258], Brazilian-Portuguese [262], Chinese [263], Dutch [264,265], Japanese [266], Persian [267], Turkish [268], Danish [269], Canadian-French [270], Polish [271], Swedish [272]. It is composed of 21 items representing 5 domains: 6 questions in the physical symptoms domain, 4 in sports and recreation, 4 in work, 4 in lifestyle, and 3 in the emotional domain. Each item has a possible score from 0 to 100 (100 mm visual analogue scale) and these scores are added to give a total score from 0 to 2100 points. The maximum score (2100 points) indicates the highest degree of disability. The score can be reported as a percentage of normal by subtracting the total from 2100, dividing by 2100, and multiplying by 100, i.e., ((2100 − total score)/2100) × 100. Self-administered. The physical scale can be found in [258].

Note: Some validation studies validated more than one shoulder tool. Some physical scales are found in papers that are not original or validation studies (see observations). The fourth column (applications/indications) shows a maximum of 6 applications or indications as examples.
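
The scoring rules described in the table for the Western Ontario indices (WOSI, WOOS, WORC) and for the SPADI can be illustrated with a short sketch. This is not code from the review or from the original tool authors. The function names `percent_of_normal` and `spadi_total` are hypothetical, and the SPADI example assumes the numerical rating scale version with each item scored from 0 to 10, which the table does not state explicitly.

```python
# Illustrative sketch only; function names are hypothetical.

# Maximum raw totals as listed in the table (number of items x 100 mm VAS).
MAX_TOTAL = {"WOSI": 2100, "WOOS": 1900, "WORC": 2100}


def percent_of_normal(total_score, max_total):
    """Convert a raw total (higher = more disability) to a percentage of normal."""
    if not 0 <= total_score <= max_total:
        raise ValueError("total_score must lie between 0 and max_total")
    return (max_total - total_score) / max_total * 100


def spadi_total(pain_items, disability_items, item_max=10):
    """SPADI total: rescale each subscale to 0-100, then average the two.

    item_max=10 is an assumption (NRS version, items scored 0-10).
    """
    if len(pain_items) != 5 or len(disability_items) != 8:
        raise ValueError("SPADI expects 5 pain items and 8 disability items")
    pain = sum(pain_items) / (len(pain_items) * item_max) * 100
    disability = sum(disability_items) / (len(disability_items) * item_max) * 100
    return (pain + disability) / 2


# A WOSI raw total of 525 corresponds to 75% of normal.
print(percent_of_normal(525, MAX_TOTAL["WOSI"]))  # 75.0
# Mid-scale answers on every SPADI item give a total of 50.
print(spadi_total([5] * 5, [5] * 8))  # 50.0
```

Note that the subtraction is performed before the division: the raw total is first subtracted from the maximum, and only then is the result divided by the maximum.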

3.3. Assessment of Methodological Quality

The results of the QUADAS-2 [16] and the COSMIN RB [273] for the 111 validations from 73 selected studies are shown in Table 3 and Table 4.
Table 3

Assessment of the methodological quality with QUADAS-2.

Tools | Validation Studies | Risk of Bias: Patient Selection | Risk of Bias: Index Test | Risk of Bias: Reference Standard | Risk of Bias: Flow and Timing | Applicability: Patient Selection | Applicability: Index Test | Applicability: Reference Standard
1. AMERICAN SHOULDER AND ELBOW SURGEONS STANDARDIZED SHOULDER ASSESSMENT FORM (ASES) [19] Beaton et al., 1996 [20] * ??
Beaton et al., 1998 [21] ?--- -
Cook et al., 2002 [22] --- -
Michener et al., 2002 [23] ??
Oh et al., 2009 [24] ??
Kemp et al., 2012 [25] --- -
Sciascia et al., 2017 [26] ??
Dabija et al., 2019 [27] ??
Vrotsou et al., 2019 [28] --- -
Gotlin et al., 2020 [29] --- -
Hou et al., 2020 [30] --- -
Baumgarten et al., 2020 [31] --- -
2. CONSTANT-MURLEY SCORE (CMS) [44] Conboy et al., 1996 [45] *?--- -
Cook et al., 2002 [22] --- -
Angst et al., 2008 [46] --- -
Razmjou et al., 2008 [47] ??
Rocourt et al., 2008 [48] --- -
Oh et al., 2009 [24] ??
Kemp et al., 2012 [25] --- -
Ban et al., 2016 [49] ??
Mahabier et al., 2016 [50] ??
Sciascia et al., 2017 [26] ??
James-Belin et al., 2018 [51] --- -
3. DUTCH SHOULDER DISABILITY QUESTIONNAIRE (DUTCH-SDQ) [58] Van der Windt et al., 1998 [59] * ??
Van der Heijden et al., 2000 [60]?--- -
Paul et al., 2004 [61] ??
4. FLEXILEVEL SCALE OF SHOULDER FUNCTION (FLEX-SF) [63] Cook et al., 2003 [63] * ??
5. FUDAN UNIVERSITY SHOULDER SCORE [70]Ge et al., 2013 [70] *???
6. FUNCTIONAL SHOULDER SCORE (FSS) [72] Iossifidis et al., 2015 [72] * ??
7. KOREAN SHOULDER SCORING SYSTEM (KSS) [73]Tae et al., 2009 [73] * ??
8. MELBOURNE INSTABILITY SHOULDER SCALE (MISS) [80] Watson et al., 2005 [80] * --- -
9. MODIFIED CONSTANT-MURLEY SCORE [82] Van de Water et al., 2014 [83] *???
10. MODIFIED ROWE SHOULDER SCORE (MRS) [88] Romeo et al., 1996 [89] *
11. MODIFIED UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [95] Cook et al., 2002 [22] * --- -
Oh et al., 2009 [24] ??
Van de Water et al., 2014 [83] ??
Vascellari et al., 2018 [96] ??
12. MUNICH SHOULDER QUESTIONNAIRE (MSQ) [100] Schmidutz et al., 2012 [100] * ??
13. OXFORD INSTABILITY SCORE (OIS) [104]Dawson et al., 1999 [104] *???
Van der Linde et al., 2017 [105] ??
14. OXFORD SHOULDER SCORE (OSS) [111] Dawson et al., 1996 [111] * ??
Van de Water et al., 2014 [83]???
15. PEDIATRIC/ADOLESCENT SHOULDER SURVEY (PASS) [129] Edmonds et al., 2017 [129] * ???
16. PENN SHOULDER SCORE (PSS) [132] Cook et al., 2001 [133] * --- -
Leggin et al., 2006 [134]???
17. ROTATOR CUFF QUALITY OF LIFE (RC-QOL) [142] Hollinshead et al., 2000 [142] * ???
Razmjou et al., 2006 [143] ??
Eubank et al., 2017 [144] ??
18. ROWE SCALE [94]Romeo et al., 1996 [89] *
Oh et al., 2009 [24] ??
19. SHORT WESTERN ONTARIO ROTATOR CUFF INDEX (SHORT-WORC) [159] Razmjou et al., 2012 [159] * ??
Dewan et al., 2016 [160] ??
Dewan et al., 2018 [161]???
Furtado et al., 2020 [162] --- -
20. SHOULDER ACTIVITY RATING SCALE (SARS) [12]Brophy et al., 2005 [12] *????
21. SHOULDER FUNCTION INDEX (SFInX) [165]Van de Water et al., 2015 [165] * --- -
Van de Water et al., 2015 [166] ??
22. SHOULDER PAIN AND DISABILITY INDEX (SPADI) [167]Roach et al., 1991 [167] *???
Beaton et al., 1996 [20] ??
Heald et al., 1997 [168]
Beaton et al., 1998 [21]?--- -
Roddey et al., 2000 [169]???
Cook et al., 2001 [133] --- -
Cook et al., 2002 [22] --- -
Paul et al., 2004 [61] ??
MacDermid et al., 2006 [170] ??
Angst et al., 2008 [46] --- -
Bicer et al., 2010 [171] ??
Staples et al., 2010 [172]???
Hill et al., 2011 [173] ??
Riley et al., 2015 [174] --- -
Jerosch-Herold et al., 2017 [175]?--- -
Thoomes de Graaf et al., 2017 [176] ??
James-Belin et al., 2018 [51] --- -
Vascellari et al., 2018 [96] ??
Riley et al., 2019 [177] --- -
Dabija et al., 2019 [27] ??
Boake et al., 2020 [178] --- -
23. SHOULDER PAIN SCORE (SPS) [196] Winters et al., 1996 [196] *?--- -
24. SHOULDER RATING QUESTIONNAIRE (SRQ) [203]L’Insalata et al., 1997 [203] *???
Paul et al., 2004 [61] ??
25. SIMPLE SHOULDER TEST (SST) [206] Beaton et al., 1996 [20] * ??
Beaton et al., 1998 [21]?--- -
Roddey et al., 2000 [169]???
Cook et al., 2001 [133] --- -
Godfrey et al., 2007 [207] ??
Oh et al., 2009 [24] ??
Roy et al., 2010 [208]???
Hsu et al., 2017 [209] ??
Vascellari et al., 2018 [96] ??
Baumgarten et al., 2020 [31] --- -
26. SINGLE ASSESSMENT NUMERIC EVALUATION RATING (SANE) [216]Sciascia et al., 2017 [26] * ??
Gowd et al., 2019 [217]?--- -
Thigpen et al., 2018 [218] ??
Cohn et al., 2020 [219] ??
27. SUBJECTIVE SHOULDER RATING SCALE (SSRS) [223] Beaton et al., 1996 [20] *???
Kohn et al., 1997 [224] ??
Beaton et al., 1998 [21]?--- -
28. UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [225] Romeo et al., 1996 [89] *
Roddey et al., 2000 [169]???
29. UNITED KINGDOM SHOULDER DISABILITY QUESTIONNAIRE (UK-SDQ) [230] Croft et al., 1994 [230] * --- -
Paul et al., 2004 [61] ??
30. WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI) [232] Kirkley et al., 1998 [232] * ??
Oh et al., 2009 [24] ??
Kemp et al., 2012 [25] --- -
Van der Linde et al., 2017 [105] ??
31. WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS) [250] Lo et al., 2001 [250] *???
Sciascia et al., 2017 [26] ??
32. WESTERN ONTARIO ROTATOR CUFF INDEX (WORC) [258]Kirkley et al., 2003 [258] *???
Razmjou et al., 2006 [143] ??
Gadsboell et al., 2017 [259] --- -

Note: * Original validation studies. There were validation studies which validated more than one shoulder tool. Interpretation: , low risk of bias or low concerns regarding applicability; , high risk of bias or high concerns regarding applicability; ?, unclear risk of bias or unclear concerns regarding applicability; -, not applicable.

Table 4

Assessment of methodological quality with COSMIN Risk of Bias checklist.

Tools | Validation Studies | PROMs Development | Content Validity | Structural Validity | Internal Consistency | Reliability | Measurement Error | Criterion Validity | Construct Validity | Responsiveness
1. AMERICAN SHOULDER AND ELBOW SURGEONS STANDARDIZED SHOULDER ASSESSMENT FORM (ASES) [19] Beaton et al., 1996 [20] * Very good
Beaton et al., 1998 [21] Adequate Doubtful
Cook et al., 2002 [22] Very goodInadequate
Michener et al., 2002 [23] Very goodAdequateInadequate Very goodInadequate
Oh et al., 2009 [24] Very good Very goodVery goodDoubtful
Kemp et al., 2012 [25] AdequateVery good
Sciascia et al., 2017 [26] Inadequate
Dabija et al., 2019 [27] Inadequate InadequateDoubtful
Vrotsou et al., 2019 [28] Inadequate
Gotlin et al., 2020 [29] Inadequate
Hou et al., 2020 [30] Doubtful
Baumgarten et al., 2020 [31] Doubtful
2. CONSTANT-MURLEY SCORE (CMS) [44] Conboy et al., 1996 [45] * InadequateAdequate
Cook et al., 2002 [22] Inadequate
Angst et al., 2008 [46] Doubtful
Razmjou et al., 2008 [47] Very good Inadequate
Rocourt et al., 2008 [48] Adequate
Oh et al., 2009 [24] Inadequate DoubtfulVery goodDoubtful
Kemp et al., 2012 [25] AdequateVery good
Ban et al., 2016 [49] InadequateAdequateAdequate Very good
Mahabier et al., 2016 [50] Very good Very goodVery good
Sciascia et al., 2017 [26] Inadequate
James-Belin et al., 2018 [51] Adequate Very good
3. DUTCH SHOULDER DISABILITY QUESTIONNAIRE (DUTCH-SDQ) [58] Van der Windt et al., 1998 [59] * Very good
Van der Heijden et al., 2000 [60]Inadequate Very good
Paul et al., 2004 [61] Very good Inadequate
4. FLEXILEVEL SCALE OF SHOULDER FUNCTION (FLEX-SF) [63] Cook et al., 2003 [63] *Inadequate Inadequate InadequateInadequate
5. FUDAN UNIVERSITY SHOULDER SCORE [70]Ge et al., 2013 [70] *Doubtful Very good InadequateInadequate
6. FUNCTIONAL SHOULDER SCORE (FSS) [72] Iossifidis et al., 2015 [72] *Inadequate AdequateVery goodInadequateVery good Very goodDoubtful
7. KOREAN SHOULDER SCORING SYSTEM (KSS) [73]Tae et al., 2009 [73] *InadequateInadequate Very good Very goodInadequateDoubtful
8. MELBOURNE INSTABILITY SHOULDER SCALE (MISS) [80] Watson et al., 2005 [80] *Inadequate AdequateAdequate
9. MODIFIED CONSTANT-MURLEY SCORE [82] Van de Water et al., 2014 [83] * Doubtful AdequateAdequate Very goodVery good
10. MODIFIED ROWE SHOULDER SCORE (MRS) [88] Romeo et al., 1996 [89] * Inadequate
11. MODIFIED UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [95]Cook et al., 2002 [22] * Inadequate
Oh et al., 2009 [24] Inadequate DoubtfulVery goodDoubtful
Van de Water et al., 2014 [83] Doubtful AdequateAdequate Very goodVery good
Vascellari et al., 2018 [96] AdequateAdequate Very good
12. MUNICH SHOULDER QUESTIONNAIRE (MSQ) [100]Schmidutz et al., 2012 [100] *Inadequate Inadequate
13. OXFORD INSTABILITY SCORE (OIS) [104] Dawson et al., 1999 [104] * Inadequate Very goodDoubtful InadequateInadequate
Van der Linde et al., 2017 [105] Inadequate
14. OXFORD SHOULDER SCORE (OSS) [111] Dawson et al., 1996 [111] *Inadequate DoubtfulDoubtful AdequateDoubtful
Van de Water et al., 2014 [83] Doubtful AdequateAdequate Very goodVery good
15. PEDIATRIC/ADOLESCENT SHOULDER SURVEY (PASS) [129]Edmonds et al., 2017 [129] *Inadequate Very goodDoubtful Very goodDoubtful
16. PENN SHOULDER SCORE (PSS) [132] Cook et al., 2001 [133] * Inadequate
Leggin et al., 2006 [134] InadequateDoubtfulInadequate InadequateInadequate
17. ROTATOR CUFF QUALITY OF LIFE (RC-QOL) [142] Hollinshead et al., 2000 [142] *Doubtful Inadequate Doubtful
Razmjou et al., 2006 [143] Very goodDoubtful
Eubank et al., 2017 [144] Doubtful Very goodDoubtful Very goodInadequate
18. ROWE SCALE [94]Romeo et al., 1996 [89] * Inadequate
Oh et al., 2009 [24] Inadequate Very goodVery goodDoubtful
19. SHORT WESTERN ONTARIO ROTATOR CUFF INDEX (SHORT-WORC) [159] Razmjou et al., 2012 [159] *Inadequate Very good Doubtful InadequateInadequate
Dewan et al., 2016 [160] Very goodAdequateAdequate
Dewan et al., 2018 [161] InadequateInadequate
Furtado et al., 2020 [162] Adequate
20. SHOULDER ACTIVITY RATING SCALE (SARS) [12] Brophy et al., 2005 [12] *Doubtful Doubtful Inadequate
21. SHOULDER FUNCTION INDEX (SFInX) [165] Van de Water et al., 2015 [165] *Inadequate
Van de Water et al., 2015 [166] AdequateAdequate AdequateAdequate
22. SHOULDER PAIN AND DISABILITY INDEX (SPADI) [167] Roach et al., 1991 [167] *Inadequate DoubtfulInadequate Very goodInadequate
Beaton et al., 1996 [20] Very good
Heald et al., 1997 [168] Very goodInadequate
Beaton et al., 1998 [21] Adequate Doubtful
Roddey et al., 2000 [169] Very goodInadequate Very good
Cook et al., 2001 [133] Inadequate
Cook et al., 2002 [22] Very goodInadequate
Paul et al., 2004 [61] Very good Inadequate
MacDermid et al., 2006 [170] InadequateVery good Very goodVery good
Angst et al., 2008 [46] Doubtful
Bicer et al., 2010 [171] Very goodAdequate Very good
Staples et al., 2010 [172] Very goodDoubtful
Hill et al., 2011 [173] InadequateVery good Very good
Riley et al., 2015 [174] Very good
Jerosch-Herold et al., 2017 [175] Very good
Thoomes de Graaf et al., 2017 [176] AdequateAdequate Adequate
James-Belin et al., 2018 [51] Adequate Very good
Vascellari et al., 2018 [96] Very goodAdequateAdequate Very good
Riley et al., 2019 [177] Doubtful
Dabija et al., 2019 [27] Inadequate InadequateDoubtful
Boake et al., 2020 [178] Very good Inadequate
23. SHOULDER PAIN SCORE (SPS) [196]Winters et al., 1996 [196] *Inadequate DoubtfulVery good
24. SHOULDER RATING QUESTIONNAIRE (SRQ) [203] L’Insalata et al., 1997 [203] *Inadequate Very goodDoubtful DoubtfulAdequateInadequate
Paul et al., 2004 [61] Very good Very good
25. SIMPLE SHOULDER TEST (SST) [206] Beaton et al., 1996 [20] * Very good
Beaton et al., 1998 [21] Adequate Doubtful
Roddey et al., 2000 [169] Very goodInadequate Very good
Cook et al., 2001 [133] Inadequate
Godfrey et al., 2007 [207] Inadequate Adequate Very goodVery good
Oh et al., 2009 [24] Very good DoubtfulVery goodDoubtful
Roy et al., 2010 [208] Very goodDoubtful
Hsu et al., 2017 [209] Inadequate Very good Doubtful
Vascellari et al., 2018 [96] Very goodAdequateAdequate Very good
Baumgarten et al., 2020 [31] Doubtful
26. SINGLE ASSESSMENT NUMERIC EVALUATION RATING (SANE) [216] Sciascia et al., 2017 [26] * Inadequate
Gowd et al., 2019 [217] Inadequate
Thigpen et al., 2018 [218] AdequateAdequateVery goodInadequateDoubtful
Cohn et al., 2020 [219] Inadequate
27. SUBJECTIVE SHOULDER RATING SCALE (SSRS) [223] Beaton et al., 1996 [20] * Very good
Kohn et al., 1997 [224]Inadequate Inadequate
Beaton et al., 1998 [21] Adequate Doubtful
28. UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [225] Romeo et al., 1996 [89] * Inadequate
Roddey et al., 2000 [169] Very good
29. UNITED KINGDOM SHOULDER DISABILITY QUESTIONNAIRE (UK-SDQ) [230] Croft et al., 1994 [230] *InadequateInadequate
Paul et al., 2004 [61] Very good Inadequate
30. WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI) [232] Kirkley et al., 1998 [232] *Doubtful Inadequate InadequateInadequate
Oh et al., 2009 [24] Very good Very goodVery goodDoubtful
Kemp et al., 2012 [25] AdequateVery good
Van der Linde et al., 2017 [105] Inadequate
31. WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS) [250] Lo et al., 2001 [250] *Doubtful Adequate InadequateDoubtful
Sciascia et al., 2017 [26] Inadequate
32. WESTERN ONTARIO ROTATOR CUFF INDEX (WORC) [258] Kirkley et al., 2003 [258] *Doubtful Adequate Inadequate
Razmjou et al., 2006 [143] Very goodDoubtful
Gadsboell et al., 2017 [259] Inadequate

Note: * Original validation studies. Some validation studies validated more than one shoulder tool. Only one section was suppressed (cross-cultural validity/measurement invariance), as it was not considered in any of the included validation studies. Abbreviations: PROMs, patient reported outcome measures.

The methodological quality results are summarised below.

QUADAS-2. The risk of bias section obtained worse results than the applicability section. Risk of bias: patient selection stood out with 81/111 positive outcomes (72.97%). Index test and reference standard obtained 4/111 positive results (3.60%) and 70/111 unclear results (63.06%). Index test, reference standard, and flow and timing could not be evaluated in 37/111 cases (33.33%) because the validation processes lacked a reference standard. Concerns regarding applicability: patient selection and index test obtained the best possible score in all validations (100%). As in the risk of bias section, the reference standard could not be evaluated in 37/111 cases (33.33%). Regarding total QUADAS-2 scores, 4/111 validations (3.60%) stood out with 6 positive results, and 22/111 validations (19.81%) achieved 5 results of “low” risk of bias or “low” concerns regarding applicability.

COSMIN RB. Patient reported outcome measures (PROMs) development was analysed in 22/111 validations (19.81%); the score was “doubtful” in 6/22 cases (27.27%) and “inadequate” in 16/22 cases (72.73%). Content validity was addressed in 10/111 validations (9%), and the score was “adequate” in 1/10 cases (10%), “doubtful” in 4/10 cases (40%) and “inadequate” in 5/10 cases (50%). Structural validity was taken into account in only 6/111 validations, and “very good” and “inadequate” each accounted for 33.33% of the results. Internal consistency was taken into account in 31/111 validations (27.92%), and 24/31 obtained “very good” results (77.41%). Reliability was addressed in 58/111 validations (52.25%), and “adequate” and “inadequate” each accounted for 41.38% of the results. Measurement error was calculated in 16/111 validations (14.41%), and most of its scores were “adequate” (76.47%). Criterion validity was considered in 16/111 validations (14.41%), and 12/16 stood out with “very good” results (75%). Construct validity was evaluated in 59/111 validations (53.15%), and 34/59 stood out with “very good” results (57.62%). Responsiveness was the most frequently measured metric property: it was considered in 64/111 validations (57.65%). Of these, 14/64 obtained “very good” results (21.87%), 2/64 “adequate” (3.12%), 27/64 “doubtful” (42.18%), and 21/64 “inadequate” (32.81%). Regarding the general COSMIN RB score, 9/111 validations obtained “very good” scores for the metric properties they addressed.
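
The proportions reported for each COSMIN RB property are simple shares of the validations that addressed that property. As a minimal sketch (the function name `rating_shares` is hypothetical, and this is not the authors' analysis code), the PROMs development figures can be reproduced from the counts given above:

```python
from collections import Counter


def rating_shares(ratings, digits=2):
    """Return the percentage of validations that received each rating."""
    counts = Counter(ratings)
    total = sum(counts.values())
    return {rating: round(n / total * 100, digits) for rating, n in counts.items()}


# PROMs development: 6 "doubtful" and 16 "inadequate" out of 22 validations.
proms_development = ["doubtful"] * 6 + ["inadequate"] * 16
print(rating_shares(proms_development))
# {'doubtful': 27.27, 'inadequate': 72.73}
```

The sketch rounds to two decimals; some percentages in the text differ from a rounded value in the last digit, which suggests they may have been truncated instead.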

3.4. Indications/Applications, Transcultural Adaptations and Administration

In relation to the applications and indications, 13/32 tools (40.63%) [72,73,80,88,94,104,142,159,165,225,232,250,258] were initially designed to assess specific populations: shoulder pathologies like RC disease [73,142,159,258], instability [80,104,232], proximal humeral fracture [165] and osteoarthritis [250]; or surgical interventions like Bankart repair [88,94], RC disease repair [72] and total arthroplasty [225]. The populations in which the tools have been validated are listed below, differentiating among populations regarding symptoms/signs, pathologies and surgical treatments, whether general or specific techniques (Table 5).
Table 5

Study population of the validations.

Tools | Validation Studies | Populations: Symptoms/Signs | Populations: Pathologies | Populations: Surgical Interventions
1. AMERICAN SHOULDER AND ELBOW SURGEONS STANDARDIZED SHOULDER ASSESSMENT FORM (ASES) [19] Beaton et al., 1996 [20] * Shoulder disorders
Beaton et al., 1998 [21] PainRC disease, OA, glenohumeral instability, malunion of a shoulder fxRC repair, total shoulder arthroplasty
Cook et al., 2002 [22]Shoulder dysfunction
Michener et al., 2002 [23]WeaknessImpingement syndrome, instability/dislocation, RC syndrome, adhesive capsulitis, humeral fx, RC and adhesive capsulitisSurgery
Oh et al., 2009 [24] RC disorder, SLAP lesion, shoulder instability
Kemp et al., 2012 [25] Symptoms of shoulder instabilityShoulder instability
Sciascia et al., 2017 [26] Primary glenohumeral OATotal shoulder arthroplasty
Dabija et al., 2019 [27] RC tear
Vrotsou et al., 2019 [28] Subacromial pathology with/without RC rupture, tendinopathy, instabilitySurgery repair
Gotlin et al., 2020 [29] RC repair
Hou et al., 2020 [30]PainRC tear, frozen shoulder, impingement syndrome, instability of shoulder, AC joint arthritis, SLAP lesion, biceps tendinopathy
Baumgarten et al., 2020 [31] RC repair and total shoulder arthroplasty
2. CONSTANT-MURLEY SCORE (CMS) [44] Conboy et al., 1996 [45] * Dislocation, arthritis, impingement
Cook et al., 2002 [22]Shoulder dysfunction
Angst et al., 2008 [46] Primary unilateral or bilateral total shoulder arthroplasty
Razmjou et al., 2008 [47] Impingement syndrome or partial thickness RC tearsRC repair
Rocourt et al., 2008 [48]Shoulder dysfunctions
Oh et al., 2009 [24] RC disorders, isolated SLAP lesions, shoulder instability
Kemp et al., 2012 [25]Anterior glenohumeral instabilityShoulder instability
Ban et al., 2016 [49] Clavicle fx
Mahabier et al., 2016 [50] Humeral shaft fx
Sciascia et al., 2017 [26] Glenohumeral OATotal shoulder arthroplasty
James-Belin et al., 2018 [51]PainDegenerative RC disease (tendinopathy with or without full-thickness tear)
3. DUTCH SHOULDER DISABILITY QUESTIONNAIRE (DUTCH-SDQ) [58] Van der Windt et al., 1998 [59] *PainCapsular syndrome, acute bursitis, acromioclavicular syndrome, subacromial syndrome
Van der Heijden et al., 2000 [60]Pain and restricted passive ROM glenohumeral
Paul et al., 2004 [61]Pain
4. FLEXILEVEL SCALE OF SHOULDER FUNCTION (FLEX-SF) [63] Cook et al., 2003 [63] * Shoulder pathology
5. FUDAN UNIVERSITY SHOULDER SCORE [70]Ge et al., 2013 [70] *Pain or discomfortRC tear, biceps tendon injury, subacromial impingement, labrum injury, frozen shoulder, tendinopathy
6. FUNCTIONAL SHOULDER SCORE (FSS) [72] Iossifidis et al., 2015 [72] * RC diseaseRC repair
7. KOREAN SHOULDER SCORING SYSTEM (KSS) [73]Tae et al., 2009 [73] * RC tears, impingement syndrome or RC tendinopathy
8. MELBOURNE INSTABILITY SHOULDER SCALE (MISS) [80] Watson et al., 2005 [80] * Glenohumeral dislocation or subluxationSurgical stabilization
9. MODIFIED CONSTANT-MURLEY SCORE [82] Van de Water et al., 2014 [83] * Isolated proximal humeral fx
10. MODIFIED ROWE SHOULDER SCORE (MRS) [88] Romeo et al., 1996 [89] * Shoulder stabilization proceduresBankart-type repairs, capsular shifts, arthroscopic stabilizations
11. MODIFIED UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [95]Cook et al., 2002 [22] * Shoulder dysfunction
Oh et al., 2009 [24] RC disorders, isolated SLAP lesions, shoulder instability
Van de Water et al., 2014 [83] Isolated proximal humeral fx
Vascellari et al., 2018 [96] Anterior shoulder instabilityArthroscopic Bankart repair, open Bristow-Latarjet procedure **
12. MUNICH SHOULDER QUESTIONNAIRE (MSQ) [100] Schmidutz et al., 2012 [100] * Shoulder disorder
13. OXFORD INSTABILITY SCORE (OIS) [104]Dawson et al., 1999 [104] * Shoulder instability
Van der Linde et al., 2017 [105] Primary and recurrent shoulder instability
14. OXFORD SHOULDER SCORE (OSS) [111] Dawson et al., 1996 [111] * Impingement syndrome, RC tear, calcified deposits in the RC tendon, primary or secondary OA, inflammatory arthritis, adhesive capsulitis
Van de Water et al., 2014 [83] Isolated proximal humeral fracture
15. PEDIATRIC/ADOLESCENT SHOULDER SURVEY (PASS) [129] Edmonds et al., 2017 [129] *Complaints related to the shoulder
16. PENN SHOULDER SCORE (PSS) [132] Cook et al., 2001 [133] *Pain and dysfunction
Leggin et al., 2006 [134] Impingement/tendinopathy, RC tear, instability, adhesive capsulitis/frozen shoulder, proximal humerus fx, acromioclavicular joint arthritis, glenohumeral joint arthritis
17. ROTATOR CUFF QUALITY OF LIFE (RC-QOL) [142] Hollinshead et al., 2000 [142] * RC disease
Razmjou et al., 2006 [143] Impingement syndromeRC repair
Eubank et al., 2017 [144] Chronic full-thickness RC tear
18. ROWE SCALE [94]Romeo et al., 1996 [89] * Shoulder stabilization proceduresBankart-type repairs, capsular shifts, arthroscopic stabilizations
Oh et al., 2009 [24] RC disorders, isolated SLAP lesions, shoulder instability
19. SHORT WESTERN ONTARIO ROTATOR CUFF INDEX (SHORT-WORC) [159]Razmjou et al., 2012 [159] * RC pathology with biceps lesionAcromioplasty, RC repair, debridement, tenodesis or tenotomy of LHB
Dewan et al., 2016 [160] RC repair with or without acromioplasty
Dewan et al., 2018 [161] RC repair
Furtado et al., 2020 [162] RC disorders
20. SHOULDER ACTIVITY RATING SCALE (SARS) [12] Brophy et al., 2005 [12] * RC tears, glenohumeral joint OA, RC arthropathyReverse total shoulder arthroplasty
21. SHOULDER FUNCTION INDEX (SFInX) [165]Van de Water et al., 2015 [165] * Isolated proximal humeral fx, proximal humeral fx-dislocation
Van de Water et al., 2015 [166] Isolated proximal humeral fx or proximal humeral fx-dislocation
22. SHOULDER PAIN AND DISABILITY INDEX (SPADI) [167]Roach et al., 1991 [167] *Pain
Beaton et al., 1996 [20] Shoulder disorders
Heald et al., 1997 [168]Pain, weaknessImpingement/tendinopathy/bursitis, instability/dislocation, RC syndrome, adhesive capsulitis, fx, sternoclavicular or acromioclavicular joint subluxation, contusionArthroscopic surgery, RC repair
Beaton et al., 1998 [21]PainRC disease, OA, glenohumeral instability, malunion of a shoulder fxRC repair, total shoulder arthroplasty
Roddey et al., 2000 [169] Shoulder disorders
Cook et al., 2001 [133]Pain and dysfunction
Cook et al., 2002 [22]Dysfunction
Paul et al., 2004 [61]Pain
MacDermid et al., 2006 [170]Pain
Angst et al., 2008 [46] Primary unilateral or bilateral total shoulder arthroplasty
Bicer et al., 2010 [171]PainAdhesive capsulitis, RC/biceps tendinopathy, RC tear, myofascial, OA, bursitis
Staples et al., 2010 [172]Pain, stiffnessAdhesive capsulitis
Hill et al., 2011 [173]Pain, stiffness
Riley et al., 2015 [174]Pain
Jerosch-Herold et al., 2017 [175]Pain
Thoomes de Graaf et al., 2017 [176]Pain
James-Belin et al., 2018 [51]PainDegenerative RC disease (tendinopathy with or without full-thickness tear)
Vascellari et al., 2018 [96] Anterior shoulder instabilityArthroscopic Bankart repair, open Bristow-Latarjet procedure **
Riley et al., 2019 [177]Pain
Dabija et al., 2019 [27] RC tears
Boake et al., 2020 [178] RC repair
23. SHOULDER PAIN SCORE (SPS) [196] Winters et al., 1996 [196] *Pain
24. SHOULDER RATING QUESTIONNAIRE (SRQ) [203] L’Insalata et al., 1997 [203] * Impingement, instability, complete tear of the RC, OA of the glenohumeral joint, adhesive capsulitis, OA of the acromioclavicular joint
Paul et al., 2004 [61]Pain
25. SIMPLE SHOULDER TEST (SST) [206] Beaton et al., 1996 [20] * Shoulder disorder
Beaton et al., 1998 [21]PainRC disease, OA, glenohumeral instability, malunion of a shoulder fxRC repair, total shoulder arthroplasty
Roddey et al., 2000 [169] Shoulder disorders
Cook et al., 2001 [133]Pain and dysfunction
Godfrey et al., 2007 [207] Shoulder instability, RC injury
Oh et al., 2009 [24] RC disorders, isolated SLAP lesions, shoulder instability
Roy et al., 2010 [208] Shoulder arthroplasty: hemiarthroplasty, total shoulder arthroplasty, reverse total shoulder arthroplasty
Hsu et al., 2017 [209] OA, rheumatoid arthritis, avascular necrosis, capsulorrhaphy arthropathy, post-traumatic arthritis, cuff tear arthropathyShoulder arthroplasty
Vascellari et al., 2018 [96] Anterior shoulder instabilityArthroscopic Bankart repair, open Bristow-Latarjet procedure **
Baumgarten et al., 2020 [31] RC repair and total shoulder arthroplasty
26. SINGLE ASSESSMENT NUMERIC EVALUATION RATING (SANE) [216] Sciascia et al., 2017 [26] * Primary glenohumeral OATotal shoulder arthroplasty
Gowd et al., 2019 [217] Primary glenohumeral arthritis and RC arthropathyAnatomic or reverse total shoulder arthroplasty
Thigpen et al., 2018 [218]Signs and symptoms of subacromial impingement or adhesive capsulitis Primary arthroscopic RC repair, total shoulder replacement
Cohn et al., 2020 [219] Total shoulder arthroplasty or reverse total shoulder arthroplasty
27. SUBJECTIVE SHOULDER RATING SCALE (SSRS) [223] Beaton et al., 1996 [20] * Shoulder disorders
Kohn et al., 1997 [224] Anterior shoulder reconstructions, subacromial decompressions
Beaton et al., 1998 [21]PainRC disease, OA, glenohumeral instability, malunion of a shoulder fxRC repair, total shoulder arthroplasty
28. UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [225]Romeo et al., 1996 [89] * Shoulder stabilization proceduresBankart-type repairs, capsular shifts, arthroscopic stabilizations
Roddey et al., 2000 [169] Shoulder disorders
29. UNITED KINGDOM SHOULDER DISABILITY QUESTIONNAIRE (UK-SDQ) [230] Croft et al., 1994 [230] *Pain
Paul et al., 2004 [61]Pain
30. WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI) [232] Kirkley et al., 1998 [232] * Instability shoulder
Oh et al., 2009 [24] RC disorder, SLAP lesion, shoulder instability
Kemp et al., 2012 [25]Symptoms of shoulder instabilityShoulder instability
Van der Linde et al., 2017 [105] Primary and recurrent shoulder instability
31. WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS) [250]Lo et al., 2001 [250] *PainOA
Sciascia et al., 2017 [26] Primary glenohumeral OATotal shoulder arthroplasty
32. WESTERN ONTARIO ROTATOR CUFF INDEX (WORC) [258]Kirkley et al., 2003 [258] *SymptomsRC tendinopathy, RC tendinopathy with no tear, partial-thickness RC tears, full-thickness RC tears, RC arthropathy
Razmjou et al., 2006 [143] Impingement syndrome
Gadsboell et al., 2017 [259]Scapula alata

Note: * Original validation studies. ** Specific surgical technique. Abbreviations: fx, fracture; OA, osteoarthritis; RC, rotator cuff; ROM, range of motion.

Regarding the study populations, 58/111 validations (52.25%) were performed in a general population and 53/111 (47.75%) in a specific population (defined by pathology, surgical intervention or sign). A specific pathology was analysed in 28/111 validations (25.22%): RC disease in 12/111 (10.81%), shoulder instability in 7/111 (6.31%), humeral fracture in 6/111 (5.41%), and clavicle fracture, OA and adhesive capsulitis in 1/111 each (0.90%). Surgical interventions were analysed in 24/111 validations (21.62%): shoulder arthroplasty in 12/111 (10.81%), and RC repair and shoulder stabilisation procedures (such as Bankart-type repairs or capsular shifts) in 6/111 each (5.41%). A sign, scapula alata, was analysed in 1/111 validations (0.90%). Regarding transcultural adaptations of rating scales, 17/32 outcome measures (53.12%) [12,19,44,58,82,88,94,104,111,132,142,167,203,206,232,250,258] were validated in other languages. The tools with the most cross-cultural adaptations to other languages were the Oxford Shoulder Score (OSS) [111], validated in 17 cases; the Shoulder Pain and Disability Index (SPADI) [167], in 15; the Western Ontario Shoulder Instability Index (WOSI) [232], in 12; and the Western Ontario Rotator Cuff Index (WORC) [258], in 11. In relation to the administration of the scale, 23/32 outcome measures (71.87%) [12,58,63,72,80,100,104,111,129,132,142,159,165,167,196,203,206,216,223,230,232,250,258] are self-administered and 9/32 (28.12%) [19,44,70,73,82,88,94,95,225] have to be administered by expert clinicians.
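The population breakdown reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original review: the counts are copied from the paragraph, while the dictionary names and the helper functions `share` and `is_consistent` are illustrative assumptions.

```python
TOTAL_VALIDATIONS = 111

# Counts as reported in the paragraph above (names are illustrative).
population = {"general": 58, "specific": 53}
specific = {"pathology": 28, "surgical intervention": 24, "sign": 1}
pathology = {
    "RC disease": 12,
    "shoulder instability": 7,
    "humeral fracture": 6,
    "clavicle fracture": 1,
    "OA": 1,
    "adhesive capsulitis": 1,
}
surgery = {"arthroplasty": 12, "RC repair": 6, "stabilisation procedures": 6}


def share(count, total=TOTAL_VALIDATIONS):
    """Return count/total expressed as a percentage."""
    return 100 * count / total


def is_consistent():
    """Check that every subgroup adds up to the total of its parent group."""
    return (
        sum(population.values()) == TOTAL_VALIDATIONS
        and sum(specific.values()) == population["specific"]
        and sum(pathology.values()) == specific["pathology"]
        and sum(surgery.values()) == specific["surgical intervention"]
    )


if __name__ == "__main__":
    print("subgroups add up:", is_consistent())
    for name, n in population.items():
        print(f"{name}: {n}/{TOTAL_VALIDATIONS} = {share(n):.2f}%")
```

The check only confirms that the subgroup counts are internally consistent; it says nothing about how individual validations were assigned to each group.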

3.5. Content Approached by Items and Components of the Tools

Table 6 shows the items and components of the outcome measures grouped by content.
Table 6

Content approached by items and components of outcome measures.

Outcome Measures | ROM | Shoulder Stability | Pain | Patient/Clinician Satisfaction | Muscle Power/Strength | Physical Symptoms/Signs | ADL | Physical and Sport Activities | Work | Social Life | Psychological Aspects
1. AMERICAN SHOULDER AND ELBOW SURGEONS STANDARDIZED SHOULDER ASSESSMENT FORM (ASES) [19]
2. CONSTANT-MURLEY SCORE (CMS) [44]
3. DUTCH SHOULDER DISABILITY QUESTIONNAIRE (DUTCH-SDQ) [58]
4. FLEXILEVEL SCALE OF SHOULDER FUNCTION (FLEX-SF) [63]
5. FUDAN UNIVERSITY SHOULDER SCORE [70]
6. FUNCTIONAL SHOULDER SCORE (FSS) [72]
7. KOREAN SHOULDER SCORING SYSTEM (KSS) [73]
8. MELBOURNE INSTABILITY SHOULDER SCALE (MISS) [80]
9. MODIFIED CONSTANT-MURLEY SCORE [82]
10. MODIFIED ROWE SHOULDER SCORE (MRS) [88]
11. MODIFIED UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [95]
12. MUNICH SHOULDER QUESTIONNAIRE (MSQ) [100]
13. OXFORD INSTABILITY SCORE (OIS) [104]
14. OXFORD SHOULDER SCORE (OSS) [111]
15. PEDIATRIC/ADOLESCENT SHOULDER SURVEY (PASS) [129]
16. PENN SHOULDER SCORE (PSS) [132]
17. ROTATOR CUFF QUALITY OF LIFE (RC-QOL) [142]
18. ROWE SCALE [94]
19. SHORT WESTERN ONTARIO ROTATOR CUFF INDEX (SHORTWORC) [159]
20. SHOULDER ACTIVITY RATING SCALE (SARS) [12]
21. SHOULDER FUNCTION INDEX (SFInX) [165]
22. SHOULDER PAIN AND DISABILITY INDEX (SPADI) [167]
23. SHOULDER PAIN SCORE (SPS) [196]
24. SHOULDER RATING QUESTIONNAIRE (SRQ) [203]
25. SIMPLE SHOULDER TEST (SST) [206]
26. SINGLE ASSESSMENT NUMERIC EVALUATION RATING (SANE) [216]
27. SUBJECTIVE SHOULDER RATING SCALE (SSRS) [223]
28. UNIVERSITY OF CALIFORNIA—LOS ANGELES SHOULDER SCALE (UCLA) [225]
29. UNITED KINGDOM SHOULDER DISABILITY QUESTIONNAIRE (UK-SDQ) [230]
30. WESTERN ONTARIO SHOULDER INSTABILITY INDEX (WOSI) [232]
31. WESTERN ONTARIO OSTEOARTHRITIS OF THE SHOULDER INDEX (WOOS) [250]
32. WESTERN ONTARIO ROTATOR CUFF INDEX (WORC) [258]

Abbreviations: ADL, activities of daily living; ASES, American Shoulder and Elbow Surgeons Standardized Shoulder Assessment Form; CMS, Constant-Murley Score; Dutch-SDQ, Dutch Shoulder Disability Questionnaire; FLEX-SF, Flexilevel Scale of Shoulder Function; FSS, Functional Shoulder Score; MISS, Melbourne Instability Shoulder Scale; MSQ, Munich Shoulder Questionnaire; OIS, Oxford Instability Score; OSS, Oxford Shoulder Score; PSS, Penn Shoulder Score; RC-QOL, Rotator Cuff Quality Of Life; ROM, range of motion; SHORTWORC, Short Western Ontario Rotator Cuff Index; SARS, Shoulder Activity Rating Scale; SFInX, Shoulder Function Index; SPADI, Shoulder Pain and Disability Index; SPS, Shoulder Pain Score; SRQ, Shoulder Rating Questionnaire; SST, Simple Shoulder Test; SANE, Single Assessment Numeric Evaluation Rating; SSRS, Subjective Shoulder Rating Scale; UCLA, University of California—Los Angeles Shoulder Scale; UK-SDQ, United Kingdom Shoulder Disability Questionnaire; WOSI, Western Ontario Shoulder Instability Index; WOOS, Western Ontario Osteoarthritis of the Shoulder index; WORC, Western Ontario Rotator Cuff Index.

No topic was included in every tool, and no scale addressed all the contents presented. The frequency with which each topic was considered by the evaluated tools is shown as a percentage in the bar graph in Figure 2.
Figure 2

Frequency with which the scales consider specific topics. Abbreviations: ADL, activities of daily living; ROM, range of motion.

The contents addressed, in descending order of frequency, were: activities of daily living (ADL) (81.25%), pain (78.13%), range of motion (ROM) (65.63%), muscle power or strength (62.5%), physical and sport activities (62.5%), work (59.38%), psychological aspects (28.13%), shoulder stability (25%), physical symptoms or signs such as compensations, weakness, stiffness, tenderness or atrophy (18.75%), patient or clinician satisfaction (15.63%) and social life (12.5%).
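The percentages above follow from dividing the number of tools that address each topic by the 32 tools reviewed. The sketch below illustrates that calculation in Python. The per-topic counts are an assumption: they are not listed in the text and are back-calculated here from the reported percentages (for example, 81.25% of 32 is 26). The names `topic_counts`, `pct` and `frequency_table` are likewise illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

N_TOOLS = 32

# Assumed counts, inferred from the reported percentages (count = pct * 32 / 100).
topic_counts = {
    "ADL": 26,
    "pain": 25,
    "ROM": 21,
    "muscle power or strength": 20,
    "physical and sport activities": 20,
    "work": 19,
    "psychological aspects": 9,
    "shoulder stability": 8,
    "physical symptoms or signs": 6,
    "patient or clinician satisfaction": 5,
    "social life": 4,
}


def pct(count, n_tools=N_TOOLS):
    """Percentage of tools, rounded half-up to two decimal places."""
    value = Decimal(100 * count) / Decimal(n_tools)
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def frequency_table(counts, n_tools=N_TOOLS):
    """Return (topic, percentage) pairs in descending order of frequency."""
    rows = [(topic, pct(n, n_tools)) for topic, n in counts.items()]
    # sorted() is stable, so tied topics keep their original order.
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for topic, value in frequency_table(topic_counts):
        print(f"{topic}: {value}%")
```

Half-up rounding with `decimal` is used because binary floating-point formatting would round exact halves such as 78.125 to the nearest even digit instead.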

4. Discussion

This systematic review compiled 32 validated functional assessment scales and analysed the methodological quality of 111 validations from 73 validation studies associated with these tools. Secondarily, an operational comparison of the tools was carried out to help choose the most appropriate one in each case, providing a detailed analysis of their characteristics: authors, years, validation studies, indications or applications, origins, languages, instructions for use and observations, as well as the topics addressed.

4.1. Methodological Quality

The QUADAS-2 [16] and the COSMIN RB [273] were used in a complementary way to assess methodological quality, which helped to determine how reliable the results of the validations were [17]. Regarding the QUADAS-2 [16], the patient selection domain obtained the best results because a large number of validations, such as that of Van der Windt [59], described the selection methodology and the patients included. However, two validations [144,160] used a convenience sample, which increases the risk of bias in their results. The index test, reference standard, and flow and timing domains could not be evaluated in 33% of the validations because they did not include a reference standard. From a clinical perspective, a reference standard is crucial, since it allows the outcome measure being validated to be compared with a method of proven quality that can generate scientific evidence. From a methodological perspective, the ideal validation study should include a blind and independent comparison between the tool to be validated and the reference standard, with both assessed in the same patient at the same time [16]. This was done by authors such as MacDermid et al. [170].

This review used the COSMIN RB (2018) [17], the updated version of the COSMIN (2012) [274], developed exclusively for use in systematic reviews of outcome measures [17]. Consequently, it allowed a better assessment of the reliability of the results obtained, increased transparency and, therefore, a higher methodological quality of this study [17]. The update is also more intuitive and easier for reviewers to administer. However, the new PROMs development section [17] made the overall results less favourable than those that would have been obtained with the COSMIN [274], because the COSMIN RB makes it difficult for validation studies that include the design and development of the tool to obtain positive scores.

Regarding the metric properties evaluated in this checklist, responsiveness was the most frequently addressed. Nevertheless, it obtained a large number of inadequate results because, among other reasons, the validations did not describe the intervention applied, as in Ge et al. [70], or the construct measured by the comparison instrument was not clear, as in the study by Razmjou et al. [47]. For internal consistency, most of the validations obtained the highest possible score with the COSMIN RB. An example is the validation of Cook et al. [22], in which the internal consistency statistic was calculated for each scale or subscale using Cronbach’s alpha. Regarding the design and development of the functional assessment methods, a large percentage of the validations obtained inadequate or doubtful results, because the researcher–patient interviews were neither recorded nor documented with notes, as in the validation of Razmjou et al. [159]. Furthermore, validations with a small sample size, such as that of L’Insalata et al. [203] with 30 patients, obtained an “inadequate” result. A “doubtful” result requires 31 to 49 subjects, and an “adequate” result at least 50. Optimising the sample size is essential for good methodological quality: if the sample is too small, the study may not be able to detect an effect of interest, and if it is too large, resources are used unnecessarily [275].

The comprehensive quality analysis with the QUADAS-2 [16] and the COSMIN RB [17] showed that the two tools did not always agree. For example, the validation of the Modified Rowe Shoulder Score [88] by Romeo et al. [89] scored very favourably on the QUADAS-2 but obtained unfavourable results for reliability, the only metric property assessed with the COSMIN RB. The unfavourable reliability result was due to uncertainty about whether the patients’ health condition was the same at the time of each measurement. In contrast, the validations by Vascellari et al. [96] of the Modified University of California—Los Angeles Shoulder Scale (UCLA) [95] and the Simple Shoulder Test (SST) [206] obtained good results with both the QUADAS-2 and the COSMIN RB. This was interpreted as high reliability of the Modified UCLA [95] and the SST [206] for assessing arthroscopic Bankart repair or the open Bristow-Latarjet procedure for recurrent anterior shoulder instability. The same occurred with the validation by Van der Water et al. [165] of the Shoulder Function Index (SFInX) [165] and that by Bicer et al. [171] of the SPADI [167]. Thus, the SFInX [165] and the SPADI [167] are highly recommended in proximal humerus fractures and shoulder pain, respectively.

The Modified UCLA [95], the SPADI [167] and the SST [206] were the tools with the highest methodological quality according to the QUADAS-2 and the COSMIN RB. Each had at least one validation with positive results: “low” risk of bias in 5/7 criteria (QUADAS-2), as well as “very good” and “adequate” ratings (COSMIN RB). Although these three tools were validated for a wide variety of dysfunctions, they showed the highest quality for the assessment of surgical interventions for shoulder instability [96] (Modified UCLA [95] and SST [206]) and of shoulder pain [171] (SPADI [167]).

Different factors need to be taken into consideration when choosing an assessment method. A high level of methodological quality is therefore neither the only characteristic to take into account nor necessarily the main one. Sometimes the specificity of a scale for a given population may be the key to the clinician’s decision making. This is the case, for example, for scales designed specifically for a surgical intervention [72].

Of the four validations of the Modified UCLA [95], the most recent [96] (2018) obtained the highest quality, reflecting scientific development over time. Similarly, the quality of the 10 validations of the SST [206] improved between 1996 [20] and 2020 [31]. These findings are linked to the current standards expected by prestigious scientific journals. In contrast, the quality of the 21 validations of the SPADI [167] did not improve over time (1991 [167] to 2020 [178]); in fact, the best quality was obtained in 2010 [171].
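The sample-size thresholds mentioned in this section for the design and development rating (30 patients rated “inadequate”, 31 to 49 “doubtful”, at least 50 “adequate”) can be expressed as a small rule. The sketch below is illustrative only: the function name `rate_sample_size` is hypothetical, it encodes just the cut-offs stated in the text, and it does not reproduce the full COSMIN RB checklist, which uses further criteria and rating levels.

```python
def rate_sample_size(n_patients):
    """Rate a sample size using only the cut-offs stated in this section.

    Assumption: any sample below 31 is treated as "inadequate", which is
    consistent with the 30-patient example given in the text.
    """
    if n_patients < 0:
        raise ValueError("sample size cannot be negative")
    if n_patients >= 50:
        return "adequate"
    if n_patients >= 31:
        return "doubtful"
    return "inadequate"


if __name__ == "__main__":
    for n in (30, 31, 49, 50):
        print(n, "->", rate_sample_size(n))
```

For example, the 30-patient sample cited for L’Insalata et al. [203] falls in the lowest band under this rule, matching the rating reported above.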

4.2. Indications/Applications and Cross-Cultural Adaptations

Most of the functional assessment scales included have been applied to different shoulder injuries, as they are considered general assessment tools. However, as this study shows, some outcome measures were designed for a specific pathology, possibly because of its high incidence [276]. These pathologies include RC injuries [73,142,159,258], shoulder instabilities [80,104,232], proximal humerus fractures [165] and osteoarthritis [250]. Over the years, some of these tools have nevertheless been validated and applied for other dysfunctions. This is the case for the WORC [258], which was originally designed only for RC injuries but was later also validated for scapula alata [259], expanding its possible applications. On occasion, specific scales have even been applied to other populations without having been validated; for example, the WOOS [250] was created for osteoarthritis and was subsequently used for proximal humerus fractures [253]. There are also tools for specific populations defined by symptoms/signs or general surgeries [26]. RC assessment methods stand out for the large number of validations that support them [27,47,72,73,142,144,159,160,161,162,207,258], which cover both RC-specific and general scales. It is noteworthy, however, that another frequent dysfunction, adhesive capsulitis [277], has no specific validated tool; only a general scale, the SPADI [167], has been validated for it. Many of the functional assessment scales included are known worldwide and have cross-cultural adaptations in other languages. The tool with the most cross-cultural adaptations is the OSS [111], as it is fast, practical, reliable, valid and clinically sensitive to change. By contrast, despite having fewer adaptations to other languages, the CMS [44] is used more often [53] than the OSS [111], both in clinical practice and in scientific dissemination. 
Being aware of the wide variety of existing scales and their general or specific applications, linked to validations that ratify their effectiveness, makes it easier for the clinician to choose the most appropriate method in each case.

4.3. Tool Administration

Regarding administration, the majority of the assessment tools (72%) were self-administered by the patients following the authors’ instructions. These outcome measures therefore used simple language that users could understand, such as: “Is your shoulder comfortable with your arm at rest by your side?” [206] and “How much difficulty do you have sleeping because of your shoulder?” [232]. The Munich Shoulder Questionnaire [100] additionally aids comprehension with images representing positions or actions. Other scales, by contrast, require specialised shoulder clinicians to assess ROM [19,44,70,73,82,88,94,95,225], medical signs [19], muscle strength or power [19,44,70,82,95,225] and stability [19,88,94]. These assessments can be made through observation, palpation, instrumentation (goniometer), assessment tests (Daniels for strength, Apprehension Test for stability), etc. In recent decades, clinicians have tended to take patients’ perception into account [278], which improves communication between patient and clinician [279]. Indeed, the original version of the renowned UCLA [225] was modified to include the degree of patient satisfaction [95]. Complementing objective data with this perception improves the evaluation and, therefore, decision making throughout the functional recovery process. This would explain the notable increase in the design and development of PROMs.

4.4. Content Addressed by the Items and Components of the Tools

The contents of the scales are discussed below in descending order of frequency. ADL were the most frequently considered component: only 6 of the 32 outcome measures [12,88,94,129,196,223] did not address them. Including them is essential, since they measure the medical condition in terms of functionality [279]. Specifically, “reaching above head level” was included by the vast majority of scales (81.25%).

The second most frequently included aspect was shoulder pain [19,44,58,70,72,73,80,82,88,95,100,104,111,129,132,142,167,196,203,223,225,230,232,250,258], possibly because of its high incidence in the population [280]. The Shoulder Pain Score [196], in particular, focuses solely on this topic. Night-time pain is specifically highlighted, since sleep quality generally decreases in patients who suffer from it [281]. Lack of rest impairs the ability to perform ADL and can even have an emotional impact, which justifies its consideration in the assessment tools. Indeed, Constant et al. [82] modified their original version to include night-time pain, among other items.

The ROM [12,19,44,58,63,70,73,80,82,88,94,95,100,129,132,142,203,206,223,225,232] was the third most considered topic. The large range of motion of the shoulder stands out in this respect [1], as it enables the overhead (supracranial) movements needed for usual ADL.

Physical and sports activities [12,44,80,82,88,94,100,104,129,132,142,165,203,206,216,223,230,232,250,258] and muscle strength and power [12,19,44,63,70,73,80,82,88,95,100,104,129,142,159,165,167,203,206,225,232] were the fourth most frequently considered topics. Both aspects are closely related: imbalance between external and internal rotation forces, as well as infraspinatus muscle atrophy, is common in volleyball players [282]. There is also a clear link between certain sports and many shoulder injuries. For example, glenohumeral laxity and instability and scapular dyskinesia commonly affect swimmers [283], and RC disorders, especially subacromial impingement, are typical of golfers [284].

The direct relationship between shoulder impairment and work performance makes it essential to address the work area [12,44,80,82,88,94,100,104,111,132,142,159,203,206,216,223,232,250,258]. Shoulder disorders are the third most common cause of musculoskeletal consultations [2]. Surgical interventions, in particular, are directly linked with temporary work disability, which may at times become permanent. This absence from work not only causes socioeconomic losses but also affects the mental and emotional state [285].

Only a few tools addressed psychological aspects [58,100,104,129,142,230,232,250,258] and shoulder stability [19,80,88,94,104,129,223,232]. Shoulder stability and muscle strength are closely related, so much so that shoulder stability is improved through strength training [286]. Furthermore, stability together with a large shoulder ROM is essential for the adequate execution of ADL [287]. Psychological factors, for their part, are especially relevant: they can lead to chronicity or modify the perceived intensity of the pain and therefore the degree of dysfunction [288]. Physical signs and symptoms [19,129,159,232,250,258], degree of satisfaction [70,73,95,132,203] and social life [100,104,129,142] were the least addressed aspects, even though they also influence functionality.

Despite the importance of the contents discussed above, this review did not identify any functional assessment tool that included all of them. For this reason, the authors suggest that a future study could develop an outcome measure of high methodological quality that covers all of these contents.

4.5. Limitations and Strengths

Regarding the limitations of this review, its length, which resulted from the high number of tools identified and validations analysed, led the authors to exclude the methodological quality of cross-cultural adaptations. Even so, we decided to provide the references to make it easier for interested readers to find them. As to its strengths, the paper compiled 32 validated shoulder outcome measures, providing a unique and useful document to help the clinician choose the most appropriate tool in each case. In addition, the methodological quality of the 111 validations associated with these scales was analysed not only with the COSMIN RB but also with the QUADAS-2. This resulted in an even stronger basis for creating scientific evidence.

5. Conclusions

A necessary and practical compilation of 32 functional shoulder outcome measures was undertaken. The rating scales were systematically evaluated, and the methodological quality of 111 validations associated with these tools was analysed. An operational comparison of the outcome measures was also provided to facilitate the choice of the most appropriate one for both clinical and research settings. The Modified University of California—Los Angeles Shoulder Scale and the Simple Shoulder Test showed the highest quality in the assessment of surgical interventions for shoulder instability, as did the Shoulder Pain and Disability Index for shoulder pain. The level of methodological quality is not the only factor to consider when selecting an assessment method. Specificity regarding the population, among other factors, could be decisive. A large number of functional assessment tools were applied to different shoulder injuries, widening the choice available in clinical practice. The scales were mostly self-administered, reflecting the tendency to consider patients’ perceptions. Activities of daily living, together with pain, were the contents most frequently addressed by the outcome measures.
  274 in total

1.  Shoulder kinematic features using arm elevation and rotation tests for classifying patients with frozen shoulder syndrome who respond to physical therapy.

Authors:  Jing-lan Yang; Chein-wei Chang; Shiau-yee Chen; Jiu-jenq Lin
Journal:  Man Ther       Date:  2007-10-02

2.  Secondary motions of the shoulder during arm elevation in patients with shoulder tightness.

Authors:  Jing-Lan Yang; Tung-Wu Lu; Feng-Ching Chou; Chein-Wei Chang; Jiu-Jenq Lin
Journal:  J Electromyogr Kinesiol       Date:  2008-12-16       Impact factor: 2.368

3.  EXERCISE REHABILITATION IN THE NON-OPERATIVE MANAGEMENT OF ROTATOR CUFF TEARS: A REVIEW OF THE LITERATURE.

Authors:  Peter Edwards; Jay Ebert; Brendan Joss; Gev Bhabra; Tim Ackland; Allan Wang
Journal:  Int J Sports Phys Ther       Date:  2016-04

4.  The assessment of shoulder instability. The development and validation of a questionnaire.

Authors:  J Dawson; R Fitzpatrick; A Carr
Journal:  J Bone Joint Surg Br       Date:  1999-05

5.  An Intra-articular Steroid Injection at 6 Weeks Postoperatively for Shoulder Stiffness After Arthroscopic Rotator Cuff Repair Does Not Affect Repair Integrity.

Authors:  In-Bo Kim; Dong Wook Jung
Journal:  Am J Sports Med       Date:  2018-06-20       Impact factor: 6.202

6.  Validation and reliability of a Spanish version of Simple Shoulder Test (SST-Sp).

Authors:  M D Membrilla-Mesa; V Tejero-Fernández; A I Cuesta-Vargas; M Arroyo-Morales
Journal:  Qual Life Res       Date:  2014-07-20       Impact factor: 4.147

7.  Surgical Release of the Pectoralis Minor Tendon for Scapular Dyskinesia and Shoulder Pain.

Authors:  Matthew T Provencher; Hannah Kirby; Lucas S McDonald; Petar Golijanin; Daniel Gross; Kevin J Campbell; Lance LeClere; George Sanchez; Shawn Anthony; Anthony A Romeo
Journal:  Am J Sports Med       Date:  2016-09-30       Impact factor: 6.202

8.  Cross-cultural adaptation and validation of the Romanian Oxford Shoulder Score.

Authors:  Horia Haragus; Radu Prejbeanu; Jenel Patrascu; Cosmin Faur; Mihai Roman; Razvan Melinte; Bogdan Timar; Ion Codorean; William Stetson; Guido Marra
Journal:  Medicine (Baltimore)       Date:  2018-06       Impact factor: 1.889

9.  Translation and validation of the Western Ontario Osteoarthritis of the Shoulder (WOOS) index - the Danish version.

Authors:  Jeppe V Rasmussen; John Jakobsen; Bo S Olsen; Stig Brorson
Journal:  Patient Relat Outcome Meas       Date:  2013-09-18

10.  Interpretation and content validity of the items of the numeric rating version short-WORC to evaluate outcomes in management of rotator cuff pathology: a cognitive interview approach.

Authors:  Rochelle Furtado; Joy C MacDermid; Dianne M Bryant; Kenneth J Faber; George S Athwal
Journal:  Health Qual Life Outcomes       Date:  2020-03-30       Impact factor: 3.186

