Applying data mining techniques to explore user behaviors and watching video patterns in converged IT environments.

Abstract

Comfortable leisure and entertainment is expected through multimedia. Web multimedia systems provide diversified multimedia interactions, for example, sharing knowledge, experience and information, and establishing common watching habits. People use information technology (IT) systems to watch multimedia videos and to perform interactive functions. Moreover, IT systems enhance multimedia interactions between users. To explore user behaviors in viewing multimedia videos by key points in time, multimedia video watching patterns are analyzed by data mining techniques. Data mining methods were used to analyze users' video watching patterns in converged IT environments. After the experiment, we recorded the processes of clicking the Web multimedia video player. The system logs of using the video player are classified into four variables, playing time, active playing time, played amount, and actively played amount. To explore the four variables, we apply the k-means clustering technique to organize the similar playing behavior patterns of the users into three categories: actively engaged users, watching engaged users, and long engaged users. Finally, we applied statistical analysis methods to compare the three categories of users' watching behaviors. The results showed that there were significant differences among the three categories. © Springer-Verlag GmbH Germany, part of Springer Nature 2021.

Keywords: Converged IT environments; Data mining techniques; User behaviors; Watching video patterns

Year: 2021 PMID: 33425047 PMCID: PMC7775737 DOI: 10.1007/s12652-020-02712-6

Source DB: PubMed Journal: J Ambient Intell Humaniz Comput

Introduction

In recent years, multimedia has made leisure and entertainment more comfortable. Web multimedia systems (e.g., YouTube and Vimeo) not only offer people a platform for sharing information, but also provide recreation that helps them relax body and mind (Duncm 2011). Our society was once based primarily on industry, but it is now evolving into an information-based society in which people are immersed in advanced technological environments and face numerous complex, ill-structured, rapidly changing tasks, situations, and problems (Lai et al. 2019; Su et al. 2019a, b; Su et al. 2020). Traditional Web systems are therefore not appropriate for addressing these challenges, and Web multimedia systems need to evolve with the required interactive tools to survive in a world filled with various kinds of information. In addition to entertainment, people prefer multimedia platforms that have interactive characteristics. Cheng et al. (2008) showed that users share their videos, watch other people’s videos, and express whether they like videos on Web multimedia systems. Multimedia designers must develop new content so that people can absorb the new knowledge, skills, and abilities that are necessary for dealing with complicated situations.

The COVID-19 pandemic has halted flipped classroom education worldwide, and multimedia materials are becoming very popular because they enable learning without physical contact (Angeli and Valanides 2020; Kim et al. 2014; Pears et al. 2007; Su et al. 2017). Pears et al. (2007) reviewed the literature on flipped classroom education from the perspective of curriculum, pedagogy, and instruments. Kim et al. (2014) presented the experience of flipped classrooms in a university, where flipped classrooms were applied to engineering-related multimedia courses. Unlike traditional education, where the instructors lecture in the classroom and the learners finish their homework assignments at home, flipped classrooms allow learners to learn anytime and anywhere to develop basic knowledge of new materials before class, so that instructors have more time to conduct in-class learning activities, such as collaborative learning, problem-based learning, and group hands-on activities (Akçayır and Akçayır 2018). Flipped classrooms are still considered a new instructional method for improving traditional education, and the approach offers more opportunities to utilize IT aids for pre-class preparation.

Watching multimedia materials is an important pre-class learning activity in flipped classrooms (Cheng et al. 2008, 2013; Duncm 2011; Kopcha and Sullivan 2008; Thompson et al. 2008; Su and Chen 2020). Users usually use Web multimedia systems (e.g., Facebook or YouTube) to watch multimedia videos provided by instructors in order to build background knowledge before class. Multimedia interaction also promotes the growth of Web multimedia systems (Duncm 2011). The most well-known and successful multimedia system is YouTube, which provides users with a platform to upload and watch videos, post likes and dislikes, and leave comments (Cheng et al. 2008). Thompson et al. (2008) reported that interesting videos on YouTube offered relevant examples. Users need to watch YouTube videos before class, discuss the content in the discussion area of the videos, and finally answer questions in the learning system. Jovanović et al. (2019) required users to watch multimedia video assignments on Facebook before class and take notes in their notebooks. Multimedia systems for watching multimedia videos have several advantages in flipped classrooms.
First, users can select multimedia videos for viewing at their own convenience, which benefits their watching motivation (Kopcha and Sullivan 2008). Second, user behaviors in viewing videos can be recorded in the system logs, and the logs can be analyzed to understand those behaviors in order to provide personalized assistance.

Data mining techniques have been developed to reduce manpower and reproduce human intelligence, and they can perform more efficiently than humans. They must be able to learn autonomously, and a significant amount of data and experience is recorded through data mining and statistical methods. In addition, data mining techniques can use data exploration to integrate large quantities of unrelated data, find useful correlations, and recover valuable information from the data. Data mining technologies are divided into statistics, classification, clustering, regression, and association (Chou et al. 2020; Lee et al. 2018; Romero and Ventura 2010). Lee et al. (2018) applied user profile data to predict the levels of tasks and expertise of programmers. The results showed that user profile data help to predict programmer expertise in easy or difficult tasks. Thus, data mining techniques are used to analyze the usage behavior data that systems generate, for example, data on watching multimedia videos, to collect effective information for multimedia designers and developers to improve future Web multimedia systems.

To explore the behavioral data of watching multimedia videos, related studies (Brinton et al. 2016; Dringus and Ellis 2005; Mourdi et al. 2019; Ledbetter et al. 2016; Liu and Xiu 2017; Lin et al. 2013; Su et al. 2019a, b; Xie et al. 2017) have analyzed the system logs of multimedia annotation systems in which users use video players to watch multimedia video assignments and annotate the content of the videos. Ellis et al. (2015) developed a Web multimedia annotation system that allows users to add an annotation at any time point of the multimedia videos they are playing. Lai et al. (2020) proposed a multimedia annotation system in which users draw regular shapes (e.g., arrows, rectangles, and circles) to annotate the video content at any time point while a video is being played. Their results revealed that the annotation system helps users concentrate on the critical areas of the multimedia video. In general, multimedia video players provide several basic playing operations, such as play, pause, seek bar forward, seek bar backward, resume, volume control, and playback speed control. In addition to conventional multimedia video players, some studies propose Web multimedia annotation systems that provide functions for annotating multimedia videos, so that users can analyze multimedia video content and reflect on it.

Previous studies have successfully applied clustering methods to analyze users’ multimedia video viewing behaviors (Cheng et al. 2013; Dai et al. 2019; Dai et al. 2020a, b; Lee et al. 2018; Ledbetter et al. 2016; Lin et al. 2013; MacQueen 1967). Kodinariya and Makwana (2013) applied the k-means clustering algorithm to minimize the sum of squared errors between the data points in each cluster and the cluster center. K-means, proposed by MacQueen (1967), is an unsupervised machine learning method that groups similar data into categories. It is simple, easy to implement, and efficient for larger datasets. Therefore, this study uses the k-means clustering method to analyze the system logs recorded in multimedia annotation systems. From the system logs, we can obtain the time users spend on performing the experimental activity and the time they spend watching multimedia videos.
To support the entire experimental activity, we developed a Web multimedia annotation system that provides three video annotation formats: comments, discussions, and questions (Su et al. 2015). Multimedia video designers can use these annotation formats to conduct different experimental activities in viewing multimedia videos and to motivate users to watch them. For example, question annotations can be used to test whether the users understood the content of the multimedia video. Our contribution is to explore the viewing processes of using the system and to apply data mining methods to the time users spent watching multimedia videos in order to identify meaningful multimedia video watching behaviors. The rest of this paper is organized as follows. We provide the methodology, instruments, data collection, data analysis, and the data mining method in Sect. 2. The experimental results are described in Sect. 3, where we demonstrate the effectiveness of the proposed method. Finally, we present the conclusions and suggestions in Sect. 4.

Methods

Research subjects and procedure

The research subjects in this study were mainly freshmen at a college in northern Taiwan. Twenty-seven users participated in the experiment via voluntary registration. The activity was titled “the techniques and applications of virtualization.” Its objective was to enhance users’ knowledge and concepts of cloud and virtualization technologies (Krup 2019). The technical materials were based on Krup (2019), and we developed multimedia videos about the concept of virtualization technology, the basic operations of Docker virtualization tools, and skills and demonstrations of practical development. The experiment ran for 100 min per week. Before the experiment, the instructor introduced the steps for viewing multimedia video assignments on the Web multimedia annotation system, which provides three multimedia annotation formats: comments, discussions, and questions. The designer then published the multimedia video assignments on the system. The participants were given a 3-week period to watch the multimedia videos in the multimedia annotation system. They could view the videos anytime and anywhere and could answer the designer’s questions by repeatedly watching the videos. After the experiment, we summarized the system logs of the multimedia annotation system to further explore how the users’ video-watching behaviors affect their motivations.

Instruments

Web multimedia annotation system

The interface of the Web multimedia annotation system (Su et al. 2015) is shown in Fig. 1. The system has two main functions: the multimedia video function and the play bar function. The multimedia video function presents one video at a time, and participants use it to play or pause the video. The play bar function includes a playing operation that resumes or pauses the video, a video timeline to which annotations can be added at specific time points, and a volume operation that controls the volume.

Fig. 1

Schematic of the Web multimedia annotation system

Each added annotation is represented as a tag on the timeline. After editing the annotations for the video, the designer can publish the multimedia video with the annotations as an assignment for the users to watch. Users can view the video, use the play button, play bar, and volume button to control their multimedia video watching process, and also keep up with designers’ annotations. They can also add comment annotations to record their ideas and opinions to discuss the video content with peers. Finally, each multimedia video assignment provides a monitoring page in which several messages are presented, for example, who had or had not viewed the multimedia video, how many comments, discussions, and question annotations are created by a user, and how many questions a user has answered. The user can use this function to obtain the viewing status of each multimedia video assignment.

Multimedia video assignments

As shown in Fig. 2, the multimedia videos in this experiment are about how to use the Docker virtualization tools. The students were assumed not to have watched the multimedia videos before they were published. To make it easy for users to understand the objective of the experimental activity, the instructor designed three multimedia videos: the concept of virtualization technology, the basic operation of Docker virtualization tools, and skills and demonstration of practical development. The total length of the multimedia videos was 943 s. The instructors uploaded the multimedia videos to the multimedia annotation system and added some question annotations to them, leading the users to complete the multimedia video assignments.

Fig. 2

Viewing multimedia video assignments

Data collection and analysis

The experiment concluded once the participants had used the Web multimedia annotation system to watch all multimedia video assignments. In the system, users view multimedia videos through the playbar section: they click the playbar section with the mouse to play and pause the video, and the video timeline area controls the watching time. Users can click on any position of the playbar section to watch the multimedia video from that time point. Moreover, the viewing behavior patterns of multimedia videos are saved with time information. At the end of the experiment, we organized the system log data of the multimedia annotation system to analyze the viewing behavior patterns. To analyze the system logs, we established four variables to represent the users’ video watching patterns: “playing time”, “active playing time”, “played amount”, and “actively played amount”. These variables and their definitions are shown in Table 1. Their values are based on the system logs of the Web multimedia annotation system, which record each playback from its starting time to its stopping time.

Table 1

Operational definitions of the four variables

Variable	Definition
Playing Time	Total time the multimedia video was played for
Active Playing Time	Total time the user’s mouse pointer was concentrated on the multimedia video while the video was playing
Played Amount	Total amount of multimedia video played by the user
Actively Played Amount	Total amount of multimedia video played while the user’s mouse pointer was concentrated on the multimedia video
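
The definitions in Table 1 can be illustrated with a minimal Python sketch. The log record format (`Segment`), the function names, and the rule that rewatched parts count once toward the played amount are all assumptions made for illustration; the system's actual log schema is not described in the text. The sketch also assumes playback at normal speed, so that the time spent on a segment equals its length in the video.

```python
from typing import Dict, List, NamedTuple, Tuple


class Segment(NamedTuple):
    """One uninterrupted stretch of playback (hypothetical log record)."""
    video_start: float  # video position when playback started (s)
    video_end: float    # video position when playback stopped (s)
    focused: bool       # True if the mouse pointer stayed on the video


def covered_length(intervals: List[Tuple[float, float]]) -> float:
    """Length of the union of intervals, so rewatched parts count once."""
    total = 0.0
    current_end = float("-inf")
    for start, end in sorted(intervals):
        if end <= current_end:
            continue  # fully inside a part that was already counted
        total += end - max(start, current_end)
        current_end = end
    return total


def watching_variables(segments: List[Segment]) -> Dict[str, float]:
    """Compute the four variables of Table 1 under the stated assumptions."""
    all_iv = [(s.video_start, s.video_end) for s in segments]
    active_iv = [(s.video_start, s.video_end) for s in segments if s.focused]
    return {
        "playing_time": sum(e - b for b, e in all_iv),
        "active_playing_time": sum(e - b for b, e in active_iv),
        "played_amount": covered_length(all_iv),
        "actively_played_amount": covered_length(active_iv),
    }
```

Under this reading, the playing time can exceed the total video length when a user replays parts of a video, while the played amount cannot.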

The elbow K-means clustering method is applied to categorize the four variables of the users into similar groups. The seven steps are shown in Fig. 3.

Fig. 3

The flowchart of the elbow k-means clustering algorithm

Step 1. Set the number of categories to k.
Step 2. Randomly choose the initial points as the center points.
Step 3. Apply the Euclidean distance formula (1) to calculate the distance between each data point and each center point.
Step 4. Assign every data point to the center point at the closest distance.
Step 5. Separate all data points into k categories according to formula (2), where a data point belongs to category i if its distance to the center point of category i is the minimum over all k categories.
Step 6. In the iterative procedure, if no data point changes category, the iteration stops.
Step 7. Find the optimal k categories.

The Kruskal–Wallis test is used to determine whether the three clusters differ significantly on the four variables (Schutz et al. 1998). A significant Kruskal–Wallis result indicates that at least one category differs from another. In that case, we conduct the pairwise Mann–Whitney-U test to compare the three categories and determine which category differs from the others (Lopez et al. 2015).
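
Steps 1 to 6 can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the `seed` and `max_iter` parameters, and the centroid update (recomputing each center as the mean of its members, the standard k-means step that the listed steps leave implicit) are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points (Step 3)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points: List[Point], k: int, seed: int = 0,
            max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Return (labels, centers) after running k-means on the points."""
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the initial centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Steps 3-5: assign each point to the category of its nearest center.
        new_labels = [
            min(range(k), key=lambda i: euclidean(p, centers[i]))
            for p in points
        ]
        # Step 6: stop when no point changes category.
        if new_labels == labels:
            break
        labels = new_labels
        # Assumed update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def distortion(points: List[Point], labels: List[int],
               centers: List[List[float]]) -> float:
    """Mean squared distance between each point and its assigned center."""
    return sum(euclidean(p, centers[lab]) ** 2
               for p, lab in zip(points, labels)) / len(points)
```

Running `k_means` for several values of k and recording `distortion` for each yields the curve that the elbow method in Step 7 inspects.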

Results and discussions

Descriptive statistics of watching multimedia video patterns

During the 3-week period, all users watched multimedia videos in the multimedia annotation system. The total duration of the multimedia videos is 943 s, covering virtualization technology concepts, basic operations of Docker virtualization tools, and hands-on development skills and demonstrations. Table 2 presents the min, max, mean, and standard deviation (S.D.) of the variables related to video watching patterns. Although the standard deviations of the playing time, active playing time, and actively played amount were large, that of the played amount was relatively small. The results indicate that all users completed the multimedia video assignments, but their video watching patterns were very different. Regarding the users’ annotation behaviors, all users answered the related questions, but few users created annotations in the multimedia videos. Fifteen users created comment annotations, with an average of 9.36 comments. Upon carefully examining their comment annotations, we found that they used the comment annotations to segment the multimedia video and take notes. From the descriptive statistics of the number of times users watched multimedia video assignments, the number of users’ playing times was 4.17, and the number of users’ active playing times was 1.245. For the played amount, the users’ minimum length (Min) was 813 s. For the actively played amount, Min was 259 s. On average, the users concentrated on watching 88% of the multimedia video assignments.

Table 2

Descriptive statistics of four variables

No	Variables	Min	Max	Mean	S.D
1	Playing Time (seconds)	934	3830	1632.12	732.35
2	Active Playing Time (seconds)	326	2832	1221.18	823.36
3	Played Amount (seconds)	813	943	912.31	48.22
4	Actively Played Amount (seconds)	259	932	832.83	201.83
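
One way to read the 88% figure is as the ratio of the mean actively played amount in Table 2 to the total video length. This reading is an assumption, since the text does not state how the percentage was derived:

```latex
\frac{832.83\ \text{s}}{943\ \text{s}} \approx 0.883 \approx 88\%
```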

Analysis of multimedia video watching behaviors

To explore the distinct behavioral patterns of watching video assignments, the k-means clustering algorithm was applied to four variables: playing time, active playing time, played amount, and actively played amount. Following the elbow k-means clustering method, the variables are classified into video-watching patterns in a manner similar to Kodinariya and Makwana (2013). The users’ video-watching data were shifted to a common scale before conducting the elbow k-means clustering method, because the distance is greatly affected by the scale of the four variables and it is customary to normalize the data first. First, the non-normalized video-watching data are converted into normalized data. Second, we calculate the distance between each data point and the center point to categorize users’ video-watching patterns into similar categories. Each iteration refines the categories by calculating the mean squared distance (distortion) between the centroid points and the data points. Finally, the turning point of the distortion curve is at k = 3, as shown in Fig. 4, so the optimal number of categories is three.
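
The normalization and the search for the turning point can be sketched as follows. Both choices here are assumptions: the text does not say which normalization was used (min-max scaling is shown), and the elbow in Fig. 4 is read from a plot, whereas this sketch picks the k with the largest second difference of the distortion curve as a simple numerical stand-in. The function names are illustrative.

```python
from typing import List, Sequence


def min_max_normalize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Rescale every column (variable) to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero range; divide by 1.0 to avoid dividing by 0.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(row, lows, spans)]
            for row in rows]


def elbow_k(distortions: Sequence[float], first_k: int = 1) -> int:
    """Return the k at the sharpest bend of the distortion curve.

    distortions[i] is the distortion obtained with k = first_k + i.
    The bend is measured by the second difference of the curve.
    """
    if len(distortions) < 3:
        raise ValueError("need distortions for at least three values of k")
    best_i, best_bend = 1, float("-inf")
    for i in range(1, len(distortions) - 1):
        bend = distortions[i - 1] - 2 * distortions[i] + distortions[i + 1]
        if bend > best_bend:
            best_i, best_bend = i, bend
    return first_k + best_i
```

For example, with made-up distortions `[100, 60, 20, 17, 15]` for k = 1 to 5, the curve drops steeply until k = 3 and flattens afterwards, so `elbow_k` returns 3.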

Fig. 4

The elbow K-means clustering method for finding optimal k categories

Since the optimal number of clusters is k = 3, we classify users’ video-watching patterns into three similar categories. For example, the playing time and played amount were assigned values of 1, 2, and 3: the lowest third received a value of 1, the intermediate third a value of 2, and the highest third a value of 3, indicating low, moderate, and high watching time and amount. Moreover, we observed the average status and overall characteristics of the three categories. The results of the clustering analysis are shown in Table 3. The findings indicated that the users in category 3 (c3, n = 9) had the highest values for each variable. The users in category 1 (c1, n = 9) played the multimedia videos for a longer time and length than the users in category 2 (c2, n = 9). However, the users in category 1 (c1) spent a shorter time with their mouse pointer concentrated on the multimedia videos than the users in category 2 (c2).
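
The coding of a variable into the values 1, 2, and 3 can be sketched as a rank-based split into thirds. The function name and the tie handling (ties keep their original order) are assumptions for illustration; the text does not specify how boundary cases were treated.

```python
from typing import List, Sequence


def tercile_codes(values: Sequence[float]) -> List[int]:
    """Code each value as 1 (lowest third), 2 (middle third), or 3 (highest)."""
    n = len(values)
    # Indices of the values in ascending order; sorted() is stable, so tied
    # values keep their original relative order.
    order = sorted(range(n), key=lambda i: values[i])
    codes = [0] * n
    for rank, idx in enumerate(order):
        codes[idx] = 1 + (3 * rank) // n
    return codes
```

With 27 users, this assigns the value 1 to the nine lowest observations, 2 to the next nine, and 3 to the nine highest. Averaging the codes within a cluster gives values between 1 and 3, the same range as the centroids reported in Table 3.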

Table 3

K-means clustering analysis of the clustering centroids of three categories

Category	Variable
Category	Playing time	Active playing time	Played amount	Actively played amount
Actively Engaged Users (Category 1, c1)	2.10	1.40	1.90	1.30
Watching Engaged Users (Category 2, c2)	1.42	1.83	1.80	2.30
Long Engaged Users (Category 3, c3)	3.21	2.95	2.20	2.43

To clearly compare users’ video watching patterns among the three categories, the Kruskal–Wallis test was used. As shown in Table 4, the users in the three categories demonstrated significantly different behaviors, except for the played amount (χ² (2, N = 27) = 4.123, p = 0.182). The post hoc analysis (pairwise Mann–Whitney-U test) revealed statistically significant differences in several comparisons: playing time (category 1 vs. category 2; category 2 vs. category 3), active playing time (category 1 vs. category 3; category 2 vs. category 3), and actively played amount (category 1 vs. category 3).
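
The two tests can be sketched in plain Python as follows. This is a simplified illustration rather than the analysis actually run: it omits tie corrections, supports exactly three groups (for which the chi-square p-value with 2 degrees of freedom is exp(-H/2)), and uses the normal approximation without continuity correction for the Mann–Whitney U p-value, which is rough for small groups. The function names are illustrative; statistical packages provide more complete versions.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis_3(groups: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return (H, p) for exactly three groups, without tie correction."""
    if len(groups) != 3:
        raise ValueError("this sketch supports exactly three groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, math.exp(-h / 2)  # chi-square survival function for df = 2


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, math.erfc(abs(z) / math.sqrt(2))
```

In use, `kruskal_wallis_3` would be applied to each variable across the three categories, and `mann_whitney_u` to each pair of categories only for variables with a significant Kruskal–Wallis result.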

Table 4

Analysis of multimedia video-watching behaviors among the three categories

Variable	Category
	Actively engaged users (Category 1, c1)		Watching engaged users (Category 2, c2)		Long engaged users (Category 3, c3)		Kruskal–Wallis analysis	Mann–Whitney-U analysis
	Mean	S.D	Mean	S.D	Mean	S.D	p	Pairwise comparison
Playing time	1289.23	252.23	936.52	51.20	2168.00	823.76	0.002**	c1 > c2 **; c3 > c2 **
Active playing time	739.08	311.62	832.30	272.63	1820.60	720.33	0.001**	c3 > c1 **; c3 > c2 **
Played amount	940.20	62.42	912.40	60.12	952.20	32.12	0.182
Actively played amount	583.40	242.38	867.00	265.12	936.33	65.72	0.043*	c3 > c1 **

*p < 0.05 shows difference; **p < 0.01 shows the obvious difference

In summary, users in category 3 (c3) had higher values on the four variables than those in category 1 (c1) and category 2 (c2). Therefore, category 3 was labeled as long engaged users. Category 1 had significantly longer playing times and answered more questions correctly than category 2. Category 1 had a shorter active playing time and a smaller actively played amount than category 2, although these differences were not statistically significant. These results may indicate that category 2 focused on the multimedia video and therefore left it less frequently. In contrast, category 1 frequently left the video while it was playing, so category 1 showed a longer playing time but a shorter active playing time and a smaller actively played amount. Since category 1 answered significantly more questions correctly than category 2, category 1 may have left the video to consult additional references in other browser tabs or windows when answering the questions. Therefore, category 1 was labeled as actively engaged users and category 2 was labeled as watching engaged users.

Conclusion and suggestions

People use information technology (IT) systems to watch multimedia videos, and the interactive functions of converged social media and IT systems enhance multimedia interactions between users. To explore users’ behaviors of watching multimedia videos by key points in time, users’ video-watching patterns were analyzed using data mining techniques. This study applied clustering methods to explore users’ multimedia video watching patterns in converged IT environments. Moreover, we developed a Web multimedia annotation system that allows users to conduct different multimedia video watching activities. An experiment was conducted to demonstrate the contributions of this research.

For users’ multimedia video-watching behavioral variables, the descriptive statistics yielded meaningful findings. First, the users viewed all multimedia video assignments, but their video watching behaviors were very different. Second, the users answered the video questions, and fifteen users created comment annotations. Since the instructor did not force the users to answer the questions or to use the annotation functions, the results may indicate that the multimedia annotation functions are useful, and thus the users were willing to use them actively.

By analyzing the users’ multimedia video watching patterns, the four variables (playing time, active playing time, played amount, and actively played amount) were found to be very different, and the users can be classified into three categories based on them. We labeled the three categories as actively engaged users (c1), watching engaged users (c2), and long engaged users (c3). According to the Kruskal–Wallis test and Mann–Whitney-U test results, long engaged users (c3) spent more time in the four variables than actively engaged users (c1) and watching engaged users (c2). Actively engaged users (c1) had significantly longer playing times, and showed a shorter active playing time and a smaller actively played amount than watching engaged users (c2). Because actively engaged users (c1) answered more questions correctly than watching engaged users (c2), these results may indicate that actively engaged users (c1) left the multimedia video frequently and referred to additional references for answering questions, while watching engaged users (c2) focused on viewing the multimedia video and did not answer any question.

8 in total

1. Relationship Between Teachers' Teaching Modes and Students' Temperament and Learning Motivation in Confucian Culture During the COVID-19 Pandemic.

Authors: Chuan-Yu Mo; Jiyang Jin; Peiqi Jin
Journal: Front Psychol Date: 2022-05-26

2. Influencing Factors of University Relocation on College Students' Intention to Engage in Local Entrepreneurship and Employment.

Authors: Shihao Chen; Qianqian Zhang; Qun Zhao; Huiru Deng; Yu-Sheng Su
Journal: Front Psychol Date: 2021-12-13

3. Immersive Learning Design for Technology Education: A Soft Systems Methodology.

Authors: C H Wu; Y M Tang; Y P Tsang; K Y Chau
Journal: Front Psychol Date: 2021-12-17

4. Study of Virtual Reality Immersive Technology Enhanced Mathematics Geometry Learning.

Authors: Yu-Sheng Su; Hung-Wei Cheng; Chin-Feng Lai
Journal: Front Psychol Date: 2022-02-17

5. Predicting Heavy Metal Concentrations in Shallow Aquifer Systems Based on Low-Cost Physiochemical Parameters Using Machine Learning Techniques.

Authors: Thi-Minh-Trang Huynh; Chuen-Fa Ni; Yu-Sheng Su; Vo-Chau-Ngan Nguyen; I-Hsien Lee; Chi-Ping Lin; Hoang-Hiep Nguyen
Journal: Int J Environ Res Public Health Date: 2022-09-26 Impact factor: 4.614

6. An Empirical Exploration of Sports Sponsorship: Activation of Experiential Marketing, Sponsorship Satisfaction, Brand Equity, and Purchase Intention.

Authors: Chun-Hua Hsiao; Kai-Yu Tang; Yu-Sheng Su
Journal: Front Psychol Date: 2021-06-24

7. Safe Sexual Behavior Intentions among College Students: The Construction of an Extended Theory of Planned Behavior.

Authors: Chien-Liang Lin; Yuan Ye; Peng Lin; Xiao-Ling Lai; Yuan-Qing Jin; Xin Wang; Yu-Sheng Su
Journal: Int J Environ Res Public Health Date: 2021-06-11 Impact factor: 3.390

8. Innovative Pedagogy and Design-Based Research on Flipped Learning in Higher Education.

Authors: Li Zhao; Wei He; Yu-Sheng Su
Journal: Front Psychol Date: 2021-02-18
