
Development and validation of a tool for evaluating YouTube-based medical videos.

Mehmet Akif Guler1, Esref Orkun Aydın2.   

Abstract

BACKGROUND/AIMS: Today, one of the ways to access medical information is the internet. Our objective was to develop a measurement tool to assess the quality of online medical videos.
METHODS: Online videos covering a variety of subjects (COVID-19, low back pain, weight loss, hypertension, cancer, chest pain, vaccination, asthma, allergy, and cataracts) were evaluated using our Medical Quality Video Evaluation Tool (MQ-VET) by 25 medical and 25 non-medical professionals. Exploratory factor analysis, Cronbach's alpha, and correlation coefficients were used to assess the validity and reliability of the MQ-VET.
RESULTS: The final MQ-VET consisted of 15 items and four sections. The Cronbach's alpha reliability coefficient for the full MQ-VET was 0.72, and the internal consistency for all factors was good (between 0.73 and 0.81). The correlation between the DISCERN questionnaire scores and MQ-VET scores was significant.
CONCLUSION: Collectively, our findings indicated that the MQ-VET is a valid and reliable tool that will help to standardize future evaluations of online medical videos.
© 2021. The Author(s), under exclusive licence to Royal Academy of Medicine in Ireland.

Keywords:  Medical videos; Questionnaire; Reliability; Validity; YouTube

Year:  2021        PMID: 34825344      PMCID: PMC8616030          DOI: 10.1007/s11845-021-02864-0

Source DB:  PubMed          Journal:  Ir J Med Sci        ISSN: 0021-1265            Impact factor:   2.089


Introduction

Eight out of ten internet users access medical information online [1]. The YouTube platform, in particular, allows users to create medical content without any obligation to post verified information [2, 3]. In 2007, Keelan et al. first examined the quality of immunization-related online videos [4]. Many subsequent studies have assessed the reliability of medical videos on YouTube; presently, the search term “YouTube” returns more than 1,500 publications on PubMed and Scopus (accessed on 17 Jan 2021) [1].

However, a standardized tool for evaluating medical videos is lacking. Most previous studies used novel, topic-specific scoring systems based on the literature and the authors’ own knowledge [5]. The generalizability of these scoring systems is poor, the results obtained with them are difficult to reproduce, and their validity and reliability have not been adequately measured.

A variety of tools to evaluate the accuracy of medical information are available, including the DISCERN instrument, the Health on the Net (HON) code, the Journal of the American Medical Association (JAMA) evaluation system, the brief DISCERN instrument, the global quality score (GQS), and the video power index (VPI); medical videos can also be evaluated subjectively [1, 5, 6]. The HON Foundation devised eight principles for websites to abide by, called the HONcode [7]. Certification by the HON Foundation is available for a fee, but the quality of the medical information itself is not rated; furthermore, the validity and reliability of this system for YouTube videos have not been confirmed. The JAMA scoring system was created to evaluate medical information on websites [8] but has not been validated for videos. The DISCERN instrument was created nearly 20 years ago for application to “written information about treatment choices” [9, 10]; again, it has not been validated for medical videos.
In addition, the second part of the DISCERN questionnaire focuses on treatment information; thus, videos that exclude treatment information yield misleading results [10]. The VPI, a measure of audience approval, is calculated as the number of likes a video has received divided by the total number of likes and dislikes [11]. Although frequently used, this scoring system is not suitable for evaluating the quality and reliability of medical videos. Given the lack of suitable instruments, we developed a reliable instrument, the Medical Quality Video Evaluation Tool (MQ-VET), for use by both patients and healthcare professionals.
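The like-based VPI definition given above is simple enough to state in code. This is a minimal stdlib-Python sketch (the function name and the zero-ratings guard are ours; the example counts are those of the Table 1 COVID-19 video; note that some papers define the VPI differently, e.g. combining like and view ratios):

```python
def video_power_index(likes: int, dislikes: int) -> float:
    """VPI as described here: likes / (likes + dislikes).

    Measures audience approval only -- it says nothing about medical
    accuracy, which is why it is unsuitable as a quality score.
    """
    total = likes + dislikes
    if total == 0:
        return 0.0  # unrated video: treat approval as zero rather than divide by zero
    return likes / total

# Example: the COVID-19 video from Table 1 (5465 likes, 359 dislikes)
approval = video_power_index(5465, 359)  # ~0.94, i.e. ~94% of raters approved
```

Because likes and dislikes accumulate over time, the same video yields a different VPI at each evaluation date, which is the repeatability problem noted in the Discussion.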

Materials and methods

Instrument development and item generation

Our original questionnaire included 42 novel items based on published evaluations of medical video quality. All questionnaires used in YouTube-related articles, as well as questions used in subjective evaluations, were examined by both authors [1, 5, 12–14]. Candidate items were rated by the authors (0 points, not applicable; 10 points, highly applicable). Duplicate questions and those scoring below the average were excluded, leaving a total of 28 questions.

Participants

Videos were evaluated by 25 medical and 25 non-medical participants, all of whom had obtained sufficient scores on one of the country’s valid English language tests and were fluent in English. The questionnaire items were rated by participants in terms of quality and relevance (0–10 points per item). The face and content validity of the questionnaire were also evaluated using the same 10-point rating system. After excluding items with a score < 7, the MQ-VET included 19 items. Ten unique videos (the first video returned for each popular topic, drawn from different medical subjects using YouTube’s default search settings), differing in uploader and medical topic, were evaluated using a 5-point Likert scale (Table 1). The DISCERN instrument was also used to evaluate each video to assess concurrent validity.
Table 1

General information of the evaluated YouTube videos (updated on 12.02.2021)

Video link | Topic | Source | View count | Comment count | Like/dislike | Uploaded
https://www.youtube.com/watch?v=i0ZabxXmH4Y | COVID-19 | Institution | 517,921 | 536 | 5465/359 | 15 January 2020
https://www.youtube.com/watch?v=BOjTegn9RuY | Low back pain | Medical doctor | 1,988,635 | 1237 | 16,390/1012 | 24 January 2014
https://www.youtube.com/watch?v=2MoGxae-zyo | How to lose weight | Private | 139,422,338 | 116,586 | 2,801,454/18,000 | 8 August 2019
https://www.youtube.com/watch?v=X5TknCu3RV0 | Hypertension | Internet source | 157,110 | 72 | 2301/38 | 24 August 2019
https://www.youtube.com/watch?v=SGaQ0WwZ_0I | Cancer | Institution | 2,423,944 | 2228 | 10,298/1067 | 31 October 2013
https://www.youtube.com/watch?v=vEQQidcJF1Q | Chest pain | Institution | 96,213 | 172 | 398/33 | 15 October 2019
https://www.youtube.com/watch?v=Atrx1P2EkiQ | Vaccination | Company | 364,676 | 75 | 3310/188 | 30 January 2020
https://www.youtube.com/watch?v=PzfLDi-sL3w | Asthma | Company | 3,964,819 | 11,162 | 74,291/1009 | 11 May 2017
https://www.youtube.com/watch?v=llZFx8n-WCQ | Allergy | Website/video channel | 94,736 | 23 | 974/28 | 18 November 2019
https://www.youtube.com/watch?v=d5D0B2PoC7U | Cataract | Institution | 207,946 | 41 | 1049/37 | 26 October 2015

Statistical analysis

Statistical analyses were performed using SPSS ver. 20.0 (IBM, Armonk, NY, USA). Data distribution was examined using the Shapiro–Wilk test and histograms. Continuous variables are expressed as means ± standard deviation (SD) with ranges, and categorical variables as numbers and percentages. For item analysis, kurtosis, inter-item correlations (IIC), and item-total correlations were calculated. Exploratory factor analysis (EFA) was conducted to verify construct validity. The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and Bartlett’s test of sphericity were used to check whether the data were suitable for EFA; in general, KMO values between 0.8 and 1 indicate adequate sampling, whereas values below 0.6 indicate inadequate sampling [15]. Factors in the EFA were extracted using principal components analysis with varimax rotation (kappa = 4). Reliability was assessed using Cronbach’s alpha. Spearman’s correlation coefficients between MQ-VET and DISCERN scores were used to assess concurrent validity. After the study was completed, a post hoc power analysis was performed using G*Power ver. 3.1.9.2 (Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany). Using the bivariate normal model correlation from the exact test family, based on the correlation between the MQ-VET and DISCERN scores, the post hoc power was 0.81 (Table 2).
Table 2

MQ-VET and DISCERN scores of the YouTube videos on different topics

Topic | MQ-VET Part 1 | Part 2 | Part 3 | Part 4 | Total | DISCERN Sect. 1 | Sect. 2 | Sect. 3 | Total
COVID-19 | 18.12 ± 4.74 | 16.96 ± 2.39 | 13.24 ± 1.49 | 12.60 ± 2.08 | 60.92 ± 8.36 | 31.48 ± 4.02 | 15.06 ± 5.87 | 2.46 ± 1.26 | 49 ± 7.05
Low back pain | 18.42 ± 4.22 | 16.72 ± 2.81 | 13 ± 1.85 | 12.04 ± 2.49 | 60.18 ± 9.16 | 30.54 ± 4.67 | 25.02 ± 4.11 | 3.9 ± 0.73 | 59.46 ± 6.4
How to lose weight | 12.18 ± 3.70 | 10.60 ± 4.18 | 12.74 ± 2.25 | 11.24 ± 2.97 | 46.76 ± 8.61 | 22.66 ± 5.23 | 14.52 ± 6.58 | 2.12 ± 1.08 | 39.3 ± 11.06
Hypertension | 13.06 ± 4.14 | 15.18 ± 2.89 | 12.56 ± 1.78 | 10.64 ± 2.57 | 51.44 ± 7.66 | 22.52 ± 5.26 | 16.48 ± 7.48 | 2.18 ± 1.28 | 41.18 ± 13.28
Cancer | 14.10 ± 4.33 | 14.16 ± 2.66 | 12.14 ± 2.81 | 9.86 ± 2.49 | 50.26 ± 9.12 | 25.30 ± 6.76 | 21.18 ± 5.96 | 3.26 ± 1.08 | 49.74 ± 11.89
Chest pain | 11.44 ± 2.89 | 12 ± 3.36 | 11.54 ± 9.04 | 9.04 ± 2.50 | 44.02 ± 7.18 | 19.24 ± 6.15 | 14.12 ± 5.99 | 1.96 ± 1.02 | 35.08 ± 11.86
Vaccination | 13.96 ± 3.54 | 15.56 ± 2.74 | 12.36 ± 2.81 | 10.70 ± 2.43 | 52.58 ± 8.26 | 22.68 ± 6.39 | 18.20 ± 5.59 | 2.72 ± 1.19 | 43.60 ± 11.14
Asthma | 14.08 ± 4.30 | 15.40 ± 2.51 | 12.34 ± 2.35 | 11.82 ± 2.45 | 53.64 ± 8.13 | 28.14 ± 3.30 | 23.62 ± 4.37 | 2.92 ± 1.24 | 54.68 ± 6.90
Allergy | 11.08 ± 4.94 | 12.44 ± 3.16 | 9.34 ± 2.70 | 10.46 ± 2.08 | 43.32 ± 9.41 | 22.94 ± 4.91 | 16.48 ± 6.54 | 2.52 ± 1.11 | 41.94 ± 10.82
Cataract | 10.80 ± 4.55 | 16.44 ± 2.77 | 12.40 ± 2.80 | 9.36 ± 2.21 | 49.00 ± 6.81 | 22.80 ± 5.17 | 13.50 ± 7.50 | 1.84 ± 1.09 | 38.14 ± 12.54
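The reliability and concurrent-validity statistics named under “Statistical analysis” (Cronbach’s alpha; Spearman’s rho) are standard and can be sketched in stdlib Python. This is an illustrative re-implementation on toy data, not the study’s SPSS computation; the rank helper is a simplification that assumes no tied scores:

```python
from statistics import variance  # sample variance (ddof = 1)

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rater rows (each row = one rater's item scores)."""
    k = len(scores[0])
    item_var_sum = sum(variance(col) for col in zip(*scores))  # per-item variances
    total_var = variance([sum(row) for row in scores])         # variance of total scores
    return k / (k - 1) * (1 - item_var_sum / total_var)

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of ranks (no tie correction)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for pos, i in enumerate(order):
            r[i] = pos + 1
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy data (4 raters x 3 items): perfectly consistent items give alpha = 1.0
scores = [[1, 1, 1], [2, 2, 2], [4, 4, 4], [5, 5, 5]]
alpha = cronbach_alpha(scores)  # -> 1.0
```

Spearman’s rho, unlike Pearson’s r, depends only on rank order, which is why it suits ordinal Likert-scale totals such as the MQ-VET and DISCERN scores.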

Results

The mean age of the participants was 30.98 ± 4.38 years (range: 25–42 years). The professions of the participants were as follows: doctor (44%, n = 22), pharmacist (6%, n = 3), academic/teacher (20%, n = 10), and engineer (24%, n = 12); profession data were missing in three cases. There were 23 (46%) participants with a bachelor’s degree, 6 (12%) with a master’s degree, and 21 (42%) with a doctorate.

Exploratory factor analysis

The Kaiser–Meyer–Olkin (KMO) value for the 19-item MQ-VET was 0.83, and Bartlett’s test statistic was χ² = 3920.72 (p < 0.001); thus, the data were suitable for further analysis. The first exploratory factor analysis (EFA) yielded five factors. The component correlation matrix was orthogonal, and varimax rotation with Kaiser normalization was applied. The factor loadings of the final EFA are displayed in Table 3. Ultimately, our questionnaire included 15 items across four factors (5, 4, 3, and 3 items for factors 1–4, respectively).
Table 3

Factor loadings from exploratory factor analysis

Items | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Communality
13. Date and updates, if any, are clearly stated | 0.822 | – | – | – | 0.724
15. The recording date of the video and date on which the information was accessed are mentioned | 0.808 | – | – | – | 0.701
12. The resources and references used are clearly stated | 0.764 | – | – | – | 0.606
14. Concerns about advertising and potential conflicts of interest have been resolved | 0.731 | – | – | – | 0.670
23. Sufficient information was provided about the identity of the presenter in the video | 0.571 | – | – | – | 0.578
9. The materials used in the video facilitated learning | – | 0.783 | – | – | 0.729
10. The video covered the basic concepts of the subject | – | 0.696 | – | – | 0.550
17. To explain the medical topic, visual resources were used sufficiently | – | 0.671 | – | – | 0.672
19. The medical terms used were well-explained | – | 0.456 | 0.444 | – | 0.418
22. The sound quality of the video was sufficient | – | – | 0.857 | – | 0.764
21. The image quality of the video was sufficient | – | – | 0.822 | – | 0.704
1. The information in the video is clear and understandable | – | – | 0.496 | 0.459 | 0.488
4. The video generally met my expectations | – | – | – | 0.774 | 0.698
5. Information about the video content was provided at the beginning | – | – | – | 0.744 | 0.591
2. The video provided new knowledge and skills | – | – | – | 0.702 | 0.577
Eigenvalue | 4.817 | 2.151 | 1.303 | 1.201 |
Explained variance (%) | 32.112 | 14.340 | 8.684 | 8.004 |
Cumulative explained variance (%) | 32.112 | 46.453 | 55.137 | 63.140 |
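Factor retention in a principal-components EFA of this kind is commonly decided by the Kaiser criterion (keep components whose eigenvalues exceed 1), and the per-factor explained variance follows directly from the eigenvalues, since the total variance of a correlation matrix equals the number of items. A small stdlib-Python sketch using the eigenvalues reported in Table 3 (the helper names are ours; the paper does not state the retention rule explicitly):

```python
def kaiser_retained(eigenvalues):
    """Kaiser criterion: retain components whose eigenvalue exceeds 1."""
    return [e for e in eigenvalues if e > 1.0]

def explained_variance_pct(eigenvalues, n_items):
    """Percent of total variance per component; for a correlation matrix,
    total variance equals the number of items."""
    return [100 * e / n_items for e in eigenvalues]

# Eigenvalues of the four retained MQ-VET factors (Table 3), 15-item instrument
eigenvalues = [4.817, 2.151, 1.303, 1.201]
retained = kaiser_retained(eigenvalues)                 # all four exceed 1
pct = explained_variance_pct(eigenvalues, n_items=15)   # ~[32.1, 14.3, 8.7, 8.0]
```

The computed percentages reproduce Table 3’s “Explained variance” row to rounding (32.112, 14.340, 8.684, 8.004), which is a useful internal consistency check on the reported factor solution.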

Concurrent validity

The correlation between the final MQ-VET and the DISCERN questionnaire was used to assess concurrent validity; the scores of both questionnaires are shown in Table 2. The first part of the MQ-VET correlated significantly with all sections of DISCERN: Sect. 1 (rho = 0.617, p < 0.001), Sect. 2 (rho = 0.508, p < 0.001), Sect. 3 (rho = 0.436, p < 0.001), and the DISCERN total score (rho = 0.640, p < 0.001). The second part of the MQ-VET also correlated significantly, albeit weakly, with all sections of DISCERN: Sect. 1 (rho = 0.456, p < 0.001), Sect. 2 (rho = 0.167, p < 0.001), Sect. 3 (rho = 0.123, p = 0.006), and the DISCERN total score (rho = 0.326, p < 0.001). The third part of the MQ-VET correlated significantly, but weakly, only with Sect. 1 of DISCERN (rho = 0.228, p < 0.001), and not with Sect. 2, Sect. 3, or the DISCERN total score (p = 0.975, p = 0.578, and p = 0.18, respectively). The fourth part of the MQ-VET correlated significantly with all DISCERN scores: Sect. 1 (rho = 0.510, p < 0.001), Sect. 2 (rho = 0.231, p < 0.001), Sect. 3 (rho = 0.205, p < 0.001), and the total score (rho = 0.395, p < 0.001). The total MQ-VET score correlated significantly with all DISCERN sections: Sect. 1 (rho = 0.654, p < 0.001), Sect. 2 (rho = 0.377, p < 0.001), Sect. 3 (rho = 0.320, p < 0.001), and the DISCERN total score (rho = 0.564, p < 0.001).

Reliability

Regarding internal consistency, the Cronbach’s alpha value was 0.81, 0.78, 0.75, and 0.73 for factors 1–4, respectively. The Cronbach’s alpha reliability coefficient for the overall MQ-VET questionnaire was 0.72.

Discussion

Collectively, our results confirmed the validity and reliability of the MQ-VET questionnaire. Although previous publications have discussed the quality of medical videos on YouTube [16–18], standardized assessment tools were not utilized; typically, de novo questionnaires were devised based on the literature and the authors’ own knowledge [1, 5]. Several tools exist for the evaluation of written online information [5], but their applicability to videos is not known [7]. The DISCERN questionnaire is designed to evaluate treatment options [10] and is therefore inappropriate for analyzing videos lacking treatment information. The JAMA questionnaire, the GQS, and the HONcode were likewise created to evaluate medical websites and written information on the internet. The VPI was designed specifically for evaluating videos, but it assesses popularity rather than quality and content; scores based on popularity change over time, which impairs repeatability [11]. The MQ-VET resolves the aforementioned issues, and its validity and reliability have been demonstrated for a variety of medical topics. Moreover, the MQ-VET was designed for use by both medical professionals and the general population. Evaluation of additional medical topics by more reviewers will provide further support for the MQ-VET, and translation into other languages will increase its utility. This study was limited by the low number of participants and videos, and by the lack of test–retest reliability testing of the MQ-VET; however, we believe that these issues will be addressed in future studies (Table 4).
Table 4

Final version of the Medical Quality Video Evaluation Tool

Each item is rated on a 5-point Likert scale: strongly disagree (1 point), disagree (2 points), neutral (3 points), agree (4 points), strongly agree (5 points).

Part 1
1. Dates of updates, if any, are clearly stated
2. The recording date of the video and date on which the information was accessed are mentioned
3. The resources and references used are clearly stated
4. Concerns about advertising and potential conflicts of interest have been resolved
5. Sufficient information was provided about the identity of the presenter in the video

Part 2
6. The materials used in the video facilitated learning
7. The video covered the basic concepts of the subject
8. To explain the medical topic, visual resources were used sufficiently
9. The medical terms used were well-explained

Part 3
10. The sound quality of the video was sufficient
11. The image quality of the video was sufficient
12. The information in the video is clear and understandable

Part 4
13. The video generally met my expectations
14. Information about the video content was provided at the beginning
15. The video provided new knowledge and skills
In conclusion, we have developed a questionnaire to evaluate the quality of online medical videos posted by both medical professionals and members of the general public. We believe that this tool will help standardize evaluations of online videos.
References: 8 in total

1.  An exploratory assessment of weight loss videos on YouTube™.

Authors:  C H Basch; I C-H Fung; A Menafro; C Mo; J Yin
Journal:  Public Health       Date:  2017-07-12       Impact factor: 2.427

2.  YouTube as a source of information on immunization: a content analysis.

Authors:  Jennifer Keelan; Vera Pavri-Garcia; George Tomlinson; Kumanan Wilson
Journal:  JAMA       Date:  2007-12-05       Impact factor: 56.272

3.  Content of widely viewed YouTube videos about celiac disease.

Authors:  C H Basch; G C Hillyer; P Garcia; C E Basch
Journal:  Public Health       Date:  2019-01-23       Impact factor: 2.427

4.  YouTube as a source of information on fibromyalgia.

Authors:  Tugba Ozsoy-Unubol; Ebru Alanbay-Yagci
Journal:  Int J Rheum Dis       Date:  2020-12-23       Impact factor: 2.454

5.  Healthcare information on YouTube: A systematic review.

Authors:  Kapil Chalil Madathil; A Joy Rivera-Rodriguez; Joel S Greenstein; Anand K Gramopadhye
Journal:  Health Informatics J       Date:  2014-03-25       Impact factor: 2.681

6.  Evaluating the Accuracy and Quality of the Information in Kyphosis Videos Shared on YouTube.

Authors:  Mehmet Nuri Erdem; Sinan Karaca
Journal:  Spine (Phila Pa 1976)       Date:  2018-11-15       Impact factor: 3.468

7.  English-language videos on YouTube as a source of information on self-administer subcutaneous anti-tumour necrosis factor agent injections.

Authors:  Sena Tolu; Ozan Volkan Yurdakul; Betul Basaran; Aylin Rezvani
Journal:  Rheumatol Int       Date:  2018-05-14       Impact factor: 2.631

8.  A systematic review of patient inflammatory bowel disease information resources on the World Wide Web.

Authors:  André Bernard; Morgan Langille; Stephanie Hughes; Caren Rose; Desmond Leddin; Sander Veldhuyzen van Zanten
Journal:  Am J Gastroenterol       Date:  2007-05-19       Impact factor: 10.864

