Individualized video recommendation modulates functional connectivity between large scale networks.

Conghui Su¹, Hui Zhou¹, Chunjie Wang², Fengji Geng³, Yuzheng Hu¹.

Abstract

With the emergence of AI-powered recommender systems and their extensive use in the video streaming service, questions and concerns also arise. Why can recommended video content continuously capture users' attention? What is the impact of long-term exposure to personalized video content on one's behaviors and brain functions? To address these questions, we designed an fMRI experiment presenting participants with personally recommended videos and generally recommended ones. To examine how large-scale networks were modulated by personalized video content, graph theory analysis was applied to investigate the interaction between seven networks, including the ventral and dorsal attention networks (VAN, DAN), frontal-parietal network (FPN), salience network (SN), and three subnetworks of default mode network (dorsal medial prefrontal (dMPFC), Core, and medial temporal lobe (MTL)). Our results showed that viewing nonpersonalized video content mainly enhanced the connectivity in the DAN-FPN-Core pathway, whereas viewing personalized ones increased not only the connectivity in this pathway but also the DAN-VAN-dMPFC pathway. In addition, both personalized and nonpersonalized short videos decreased the couplings between SN and VAN as well as between two DMN subsystems, Core and MTL. Collectively, these findings uncovered distinct patterns of network interactions in response to short videos and provided insights into potential neural mechanisms by which human behaviors are biased by personally recommended content.

Keywords: graph theory; network interaction; personalized short videos; recommender system

Year: 2021 PMID: 34363282 PMCID: PMC8519862 DOI: 10.1002/hbm.25616

Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.038

INTRODUCTION

We explore the real world and society primarily through visual and auditory stimulation from the surrounding environment. The advent of digital devices and global online streaming services has greatly extended the temporal and spatial limits of what we can explore by providing countless videos recorded from daily life or created purely from imagination. In such a digital era, individuals can proactively select interesting videos based on their personal habits, preferences, cultural background, and so on. Further, their patterns of selection now appear to be “captured” and “predicted” by powerful recommendation algorithms that usually suggest to users the same type of personalized video content in which they become immersed (Davidson et al., 2010; Shani & Gunawardana, 2011). Prolonged exposure to personalized audiovisual stimuli might limit the diversity of content people are exposed to, thus leading to biased beliefs, behavior, and brain function (Pariser, 2011). As modern AI‐based recommendation algorithms lack interpretability, investigating brain responses to recommended video content might help us understand, from a new perspective, the potential brain basis of human self‐stimulation behaviors with AI‐recommended audiovisual content. Watching videos is a complex and dynamic process involving attention, emotion, and social cognition, which might require the coordinated interplay of multiple brain networks. The literature has shown that the human brain is organized into distributed large‐scale networks and that their dynamic interactions are essential for complex mental processing (Damoiseaux et al., 2006; Power et al., 2011; Sporns, Chialvo, Kaiser, & Hilgetag, 2004).
Previous neuroimaging studies of naturalistic stimuli, including movies, music, and narrative stories, have shown the engagement of the default mode network (DMN), attention networks, salience network (SN), and frontal–parietal network (FPN) (Bottenhorn et al., 2018; Brandman, Malach, & Simony, 2021; Brauchli, Leipold, & Jäncke, 2020; Kim, Kay, Shulman, & Corbetta, 2017; Simony et al., 2016). The existing literature using audiovisual video clips, however, has mainly focused on the neural activity related to emotion and social cognition (Gao, Weber, Wedell, & Shinkareva, 2020; Goldberg, Preminger, & Malach, 2014; Iacoboni et al., 2004; Lahnakoski et al., 2012; Lee Masson, Pillet, Boets, & Op de Beeck, 2020). Very few studies have investigated the impact of “individualized” attributes on brain activity, partly because short video sharing platforms have become popular only in recent years, with the emergence of AI‐powered recommender systems. Our previous study examined the effect of short video viewing on regional brain activation (Su et al., 2021). We found that both personalized and nonpersonalized video viewing activated primary visual and auditory cortices, with the former inducing higher activation in multiple regions including prefrontal cortex, temporal cortex, premotor cortex, and cerebellum. In particular, the DMN displayed functional heterogeneity: its dorsal medial prefrontal cortex (dMPFC) subsystem was more activated when participants were viewing personalized videos than generalized videos, whereas the medial temporal lobe (MTL) subsystem did not differ between the two conditions (Su et al., 2021). In the present study, we were interested in how personally recommended video content modulated brain network interactions.
Based on previous studies on the potential roles of large‐scale networks in the high‐level perception of movie stimuli (Betti et al., 2013; Emerson, Short, Lin, Gilmore, & Gao, 2015; Gao & Lin, 2012; Li, Lu, & Yan, 2020) and our prior findings (Su et al., 2021), we focused on high‐order networks including the DMN, the dorsal attention network (DAN), the ventral attention network (VAN), SN, and FPN. An extensive body of literature has suggested that the DMN is associated with internally oriented mental processes, such as self‐referential processing, autobiographical memory, and theory of mind (Buckner, Andrews‐Hanna, & Schacter, 2008; Rameson, Satpute, & Lieberman, 2010; Spreng & Grady, 2010). However, more and more studies have demonstrated that the sub‐regions of DMN are not engaged uniformly across a variety of cognitive processes (Bellana, Liu, Diamond, Grady, & Moscovitch, 2017; Harrison et al., 2008; Xu, Yuan, & Lei, 2016). Further, previous research suggests that there are three heterogeneous subsystems in DMN: the midline Core, the dMPFC subsystem, and the MTL subsystem (Andrews‐Hanna, Reidler, Sepulcre, Poulin, & Buckner, 2010). The dMPFC subsystem, including the dorsal medial PFC, the temporoparietal junction (TPJ), the lateral temporal cortex, and the temporal pole, is thought to engage in social cognition and semantic comprehension (Andrews‐Hanna, Smallwood, & Spreng, 2014). On the contrary, the MTL subsystem, mainly comprising the ventromedial prefrontal cortex (vMPFC), the hippocampus, and the posterior inferior parietal lobe, has been implicated in autobiographical memory (Kim, 2012), future‐oriented thought, and memory retrieval (Andrews‐Hanna, Reidler, Huang, & Buckner, 2010; Andrews‐Hanna, Saxe, & Yarkoni, 2014). The Core subsystem, which is comprised of anterior medial prefrontal cortex (aMPFC) and PCC, is associated with self‐referential processing (Kim, 2012) and autobiographical memory (Andrews‐Hanna, Saxe, & Yarkoni, 2014). 
Previous studies using naturalistic stimulus paradigms have demonstrated the pivotal role of the DMN in movie watching and social cognition (Brandman et al., 2021; Jääskeläinen, Sams, Glerean, & Ahveninen, 2021; Redcay & Moraczewski, 2020). However, the three subnetworks of the DMN may play different roles in differentiating a user's favorite videos from uninteresting ones because of the functional heterogeneity mentioned above. The dynamic features of visual and auditory stimuli require efficient attentional control for effective information processing during video watching. Such attentional control is thought to be mediated by two anatomically distinct networks (Vossel, Geng, & Fink, 2014). The DAN, including the frontal eye fields and superior parietal lobes, is involved in goal‐directed and voluntary top‐down orienting of attention (Corbetta, Patel, & Shulman, 2008; Corbetta & Shulman, 2002). In contrast, the VAN, including the TPJ, supramarginal gyrus, and middle frontal gyrus, is important for reorienting attention toward salient and novel sensory stimuli via bottom‐up inputs (Corbetta, Kincade, Ollinger, McAvoy, & Shulman, 2000; Corbetta & Shulman, 2002; Vossel et al., 2014). Recently, a meta‐analysis of neuroimaging studies of social interactions supported the roles of the DMN and VAN in social cognition (Feng et al., 2021). In particular, the dMPFC has been found to process social interaction during movie viewing (Wagner, Kelley, Haxby, & Heatherton, 2016). Further, the connectivity between the dMPFC and TPJ, two key regions from the dMPFC subsystem of the DMN and the VAN, respectively, is associated with the ability to understand others' mental states (Li, Mai, & Liu, 2014). Using psychophysiological interaction analyses, our previous study (Su et al., 2021) also showed that the dMPFC and temporal pole, two vital nodes in the dMPFC subsystem, increased their connections with the TPJ but decreased their coupling with the anterior cingulate cortex (ACC).
In addition, Vincent, Kahn, Snyder, Raichle, and Buckner (2008) proposed that the SN and FPN might integrate information from DMN and DAN. The FPN is also thought to play an essential role in top‐down cognitive control functions (Cole et al., 2013; Niendam et al., 2012). The SN, primarily comprised of the anterior insula and dorsal anterior cingulate cortex (dACC), is associated with directing attention to salient stimuli (Seeley et al., 2007). Spreng, Stevens, Chamberlain, Gilmore, and Schacter (2010) suggested that SN and FPN can preferentially couple with either DAN or DMN depending on task demands. Specifically, internally oriented autobiographical planning increased their couplings with DMN, whereas the external planning task increased their couplings with DAN. In light of studies mentioned above, video watching would modulate brain activity at a large‐scale network level. However, how personally recommended video content modulates within and between large‐scale network interactions remains unknown. To address this question, the present study designed an fMRI experiment with three conditions: watching personalized videos (PV), watching generalized videos (GV), and a brief rest (Rest) (Figure 1). A key feature that distinguishes the PV from the GV is that the content of PV has more self‐related elements that may generate more self‐referential and self‐conscious internal thoughts. Given the vital roles of VAN and DAN in attentional orientation, we hypothesized that the two attention networks would exhibit more inter‐network connectivity in response to short video stimuli. Considering the dual roles of DMN in both internal‐oriented thought and external stimuli processing, we conjectured the three subsystems would display differential patterns of connectivity between rest and video watching conditions. 
Besides, we hypothesized that the FPN and SN might function as hubs to interact with DMN, DAN, and VAN to satisfy the need for comprehension and saliency detection when participants were watching videos.

FIGURE 1

The block design of the fMRI experiment. The personalized video (PV) block and the generalized video (GV) block were presented intermittently. Each block lasted 60 s and was followed by a 30 s fixation period.

METHODS AND MATERIALS

Participants and experimental design

Thirty healthy students from Zhejiang University participated in this study (14 females; age: 23.73 ± 2.38 years [M ± SD]). All participants were adults and had a history of using TikTok. They reported no major diseases or mental health problems. As none of the subjects showed excessive head motion during scanning (maximum head displacement >3 mm and mean framewise head displacement >0.5 mm), no participant was excluded from our final fMRI analyses due to head motion. We obtained their written informed consent before the experiment. This study was approved by the Ethics Committee of Zhejiang University. All the participants viewed two types of video clips in the scanner, namely PV and GV. PV refers to videos recommended by the App (TikTok in this study) for each user, whereas GV refers to videos recommended by the App for a new user with no usage history. Both PV and GV were short video clips recorded from TikTok, with a total length of six minutes per condition. The difference was that the GV lacked user‐specific attributes, whereas the PV were customized for each participant by an AI‐based recommender algorithm in the App. The AI‐based recommender algorithm “learns” a user's preference from his/her watching feedback, including repeated watching, comments, shares, and “liked” tags in TikTok, and then delivers a stream of videos to the user accordingly (Ma & Hu, 2021). It has been shown that the video viewing completion rate is an important indicator: if a user watches a video to the end rather than quickly scrolling past it, the algorithm will be more likely to recommend videos of the same type/tag later (Chen, He, Mao, Chung, & Maharjan, 2019). The same dataset has been used to locate brain activation elicited by short videos (Su et al., 2021). A block design including three conditions (PV, GV, and Rest) was used in this experiment. Both PV and GV conditions contained six blocks.
Each block lasted for 1 min and consisted of 1 to 6 short videos, followed by a 30‐s rest period during which only a white fixation cross was presented on the screen. All stimuli were presented with E‐Prime 3.0 (Psychology Software Tools, https://www.pstnet.com); participants watched them through an angled mirror and heard the soundtrack through headphones. To counterbalance the effect of video order, fourteen participants watched PV followed by GV and the rest watched GV followed by PV. None of the videos was shown to participants before the experiment. In the scanner, participants were simply instructed to watch the videos as they usually would. After scanning, each participant was interviewed to evaluate their preference for each video clip on a scale from 1 (dislike extremely) to 3 (like extremely). They also reported their overall preference for PV and GV at the end of the fMRI experiment.

Imaging acquisition

Participants were scanned in a Siemens 3.0‐T scanner (MAGNETOM Prisma, Siemens Healthcare, Erlangen, Germany) using a 20‐channel coil. Structural images were acquired during a 5 min 18 s scan with a T1‐weighted magnetization prepared rapid gradient echo sequence: TR = 2,300 ms, TE = 2.32 ms, slice thickness = 0.9 mm, voxel size = 0.90 × 0.90 × 0.90 mm³, voxel matrix = 256 × 256, flip angle = 8°, field of view = 240 mm. During each session of the task‐based scan, a total of 1,095 whole‐brain volumes were collected using a T2*‐weighted gradient echo planar imaging sequence with multiband acceleration (TR = 1,000 ms, TE = 34 ms, slice thickness = 2.50 mm, voxel size = 2.50 × 2.50 × 2.50 mm³, voxel matrix = 92 × 92, flip angle = 50°, field of view = 230 mm², number of slices = 52, MB‐factor = 4).

Image preprocessing

Preprocessing of fMRI data included the following steps. First, slice timing correction and head motion correction were performed using AFNI (Cox, 1996). Then, tissue segmentation was conducted to extract brains using SPM12 (https://www.fil.ion.ucl.ac.uk/spm/). Structural and functional images were normalized to MNI space (2.5 mm isotropic spatial resolution) using ANTs (http://stnava.github.io/ANTs/). Finally, spatial smoothing was conducted with a 5 mm full‐width‐at‐half‐maximum Gaussian kernel. Before calculating correlation coefficients between networks, a voxel‐wise multiple regression approach was used to remove the effects of confounding variables, as follows. The block‐wise hemodynamic response to each task condition (PV, GV, Rest) and their first‐derivative terms were used to remove the task‐evoked block‐wise changes in fMRI signals (Whitfield‐Gabrieli & Nieto‐Castanon, 2012). The first five principal components of white‐matter and cerebrospinal fluid signals, six motion parameters, and their temporal first‐order derivatives were used as covariates to remove physiology‐ and motion‐related confounds. High‐pass filtering (>0.001 Hz) and linear detrending were also performed. After the above preprocessing steps, the Conn toolbox (https://web.conn-toolbox.org/resources/manual) was used for condition‐dependent functional connectivity analysis of the block design: the BOLD time series was divided into the scans associated with each block, and all scans for the same condition were concatenated. Finally, ROI‐to‐ROI correlation coefficients were calculated and graph theory analysis was performed (see details below).
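The confound-removal step described above can be illustrated with a minimal sketch. It is not the Conn toolbox implementation: it assumes ordinary least squares on a single time series, uses only six simulated motion parameters (the `motion`, `neural`, and `bold` arrays are hypothetical data), and folds linear detrending into the design matrix.

```python
import numpy as np

def regress_out(signal, confounds):
    """Return the residual of `signal` after least-squares removal of `confounds`.

    signal:    (n_timepoints,) time series for one voxel or ROI
    confounds: (n_timepoints, n_regressors) nuisance regressors
    """
    n = signal.shape[0]
    # First-order temporal derivatives, padded with a zero row to keep the length.
    derivs = np.vstack([np.zeros((1, confounds.shape[1])), np.diff(confounds, axis=0)])
    # An intercept and a linear trend implement the detrending step.
    trend = np.linspace(-1.0, 1.0, n)
    design = np.column_stack([np.ones(n), trend, confounds, derivs])
    beta, *_ = np.linalg.lstsq(design, signal, rcond=None)
    return signal - design @ beta

rng = np.random.default_rng(0)
n_tp = 200
motion = rng.standard_normal((n_tp, 6))          # hypothetical six motion parameters
neural = rng.standard_normal(n_tp)               # hypothetical signal of interest
bold = neural + motion @ rng.standard_normal(6)  # series contaminated by motion
clean = regress_out(bold, motion)
```

By construction, the residual `clean` is uncorrelated with every regressor in the design matrix, which is the property the preprocessing relies on before correlations between ROIs are computed.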

Graph theory analysis

Graph theory analysis was employed to characterize interactions between multiple networks during the task. The Conn functional connectivity toolbox 19.b (Whitfield‐Gabrieli & Nieto‐Castanon, 2012) was applied to construct condition‐specific functional connectivity matrices for each participant using a template of 81 regions of interest (ROIs) detailed below. We defined the large‐scale networks with 81 ROIs primarily based on the work of Power and colleagues (Power et al., 2011). Nevertheless, to better characterize the functional heterogeneity of the DMN, we replaced the DMN coordinates in Power's template with the coordinates used by Andrews‐Hanna, Reidler, Huang, and Buckner (2010) and Andrews‐Hanna, Reidler, Sepulcre, et al. (2010), in which the DMN Core subsystem was defined by four nodes (2 for each hemisphere), the dMPFC subsystem was defined by 7 nodes (3 for each hemisphere, with the dMPFC on the medial plane), and the MTL subsystem was defined by 9 nodes (4 for each hemisphere, with the vMPFC at the midline). Since the Power atlas (Power et al., 2011) includes several networks that were not of interest here, we used only the coordinates of four of its networks, namely the FPN, VAN, DAN, and SN. The set of ROIs (5 mm radius spheres) was generated using 3dcalc in AFNI (Cox, 1996) with the center coordinates defined above (Figure 2). These networks were visualized with the BrainNet Viewer program (https://www.nitrc.org/projects/bnv/) (Xia, Wang, & He, 2013).
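To make the ROI definition concrete, the following sketch builds one spherical mask on a voxel grid. It is an illustration in NumPy rather than the AFNI 3dcalc command used in the study, and it works in voxel indices instead of MNI coordinates; the grid size and center are hypothetical, while the 5 mm radius and 2.5 mm voxel size are taken from the text.

```python
import numpy as np

def sphere_mask(shape, center_vox, radius_mm, voxel_mm):
    """Boolean mask of voxels whose centers lie within radius_mm of center_vox."""
    grids = np.indices(shape)                            # one index grid per axis
    center = np.asarray(center_vox).reshape(3, 1, 1, 1)
    dist_mm = np.sqrt(((grids - center) ** 2).sum(axis=0)) * voxel_mm
    return dist_mm <= radius_mm

# Hypothetical 20 x 20 x 20 grid with the sphere centered in the middle.
mask = sphere_mask(shape=(20, 20, 20), center_vox=(10, 10, 10),
                   radius_mm=5.0, voxel_mm=2.5)
n_voxels = int(mask.sum())
```

With these settings the sphere reaches two voxels from its center along each axis, so each ROI time course would be an average over a few dozen voxels.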

FIGURE 2

The 81 regions of interest encompass seven networks, including three subsystems of default mode network (Core, dMPFC, and MTL), FPN, VAN, DAN, and SN. DAN, dorsal attention network; dMPFC, the dorsal medial prefrontal cortex subsystem; FPN, frontal–parietal network; MTL, the medial temporal lobe subsystem; VAN, ventral attention network; SN, salience network

Each ROI's average time course was extracted, and pair‐wise Pearson correlations between these time courses were calculated after the preprocessing procedures described above. Following previous routines of graph theory analysis (Reineberg & Banich, 2016; Wang, Hu, Weng, Chen, & Liu, 2020), we set a sparsity of 15% to remove the weak links and retain the strong positive correlations. Since there is still no consensus on the best threshold, we also verified the results with proportional thresholds of 10% and 20%. The overall results were consistent (supplemental material, Table S1); therefore, we report only the results for 15% in the main text. The Brain Connectivity Toolbox (Rubinov & Sporns, 2010) was used to calculate the graph theory measures, including seven intra‐network connections and twenty‐one inter‐network connections, separately for the two task conditions and rest. Statistical analysis was conducted using SPSS 21.0 software (SPSS Inc, Chicago, IL). A one‐way repeated measures analysis of variance (ANOVA) was used to test whether there were any differences among the three conditions. The Bonferroni correction was used to control false positives for multiple comparisons. As mentioned above, we focused on seven networks, which generate seven intra‐network analyses and twenty‐one inter‐network analyses. Therefore, the corrected statistical significance threshold was set at p = .05/7 = .007 for intra‐network connections and p = .05/21 = .0024 for inter‐network connections.
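The thresholding and network-level summary described above can be sketched as follows. This is not the Brain Connectivity Toolbox code: the text does not specify how network-level connection strength was derived, so the sketch assumes it is the mean of the thresholded edge weights, and the time courses and the network labels `A`, `B`, `C` are hypothetical. The last lines reproduce the comparison counts and Bonferroni thresholds stated in the text.

```python
import numpy as np
from itertools import combinations

def proportional_threshold(corr, density):
    """Keep the strongest `density` fraction of positive edges; zero the rest."""
    w = np.array(corr, dtype=float)
    np.fill_diagonal(w, 0.0)
    w[w < 0] = 0.0                                   # discard negative correlations
    iu = np.triu_indices_from(w, k=1)
    n_keep = int(round(density * iu[0].size))
    order = np.argsort(w[iu])[::-1][:n_keep]         # positions of the strongest edges
    rows, cols = iu[0][order], iu[1][order]
    out = np.zeros_like(w)
    out[rows, cols] = w[rows, cols]
    out[cols, rows] = w[rows, cols]                  # keep the matrix symmetric
    return out

def network_connectivity(w, labels):
    """Mean edge weight within each network and between each pair (assumed summary)."""
    labels = np.asarray(labels)
    nets = sorted(set(labels.tolist()))
    result = {}
    for a in nets:
        idx = np.where(labels == a)[0]
        block = w[np.ix_(idx, idx)]
        iu = np.triu_indices(idx.size, k=1)
        result[(a, a)] = float(block[iu].mean()) if iu[0].size else 0.0
    for a, b in combinations(nets, 2):
        result[(a, b)] = float(w[np.ix_(labels == a, labels == b)].mean())
    return result

rng = np.random.default_rng(1)
ts = rng.standard_normal((300, 12))                  # hypothetical ROI time courses
corr = np.corrcoef(ts, rowvar=False)                 # pair-wise Pearson correlations
labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4           # hypothetical network labels
w = proportional_threshold(corr, density=0.15)
conn = network_connectivity(w, labels)

# Seven networks give 7 intra-network and C(7, 2) = 21 inter-network comparisons.
n_networks = 7
n_inter = len(list(combinations(range(n_networks), 2)))
alpha_intra = 0.05 / n_networks
alpha_inter = 0.05 / n_inter
```

Rerunning the same functions with `density=0.10` and `density=0.20` would mirror the robustness check reported for the 10% and 20% thresholds.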

RESULTS

Behavioral information

Rating scores were obtained from twenty‐eight participants; two participants were excluded because their evaluation data were incomplete. A paired t‐test showed that the mean preference rating was significantly higher for PV than for GV (t = 3.647, p = .0011). In addition, 26 of the 28 participants reported a higher overall preference for PV. These results confirmed that participants did prefer PV to GV.
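For readers unfamiliar with the test, a paired t statistic is the mean of the within-participant differences divided by its standard error. The sketch below computes it with the Python standard library; the ratings in `pv` and `gv` are hypothetical values for five participants, not the study's data.

```python
import math
from statistics import mean, stdev

def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for matched lists x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # mean difference / standard error
    return t, n - 1

pv = [2.5, 2.0, 3.0, 2.5, 2.0]   # hypothetical mean ratings for personalized videos
gv = [2.0, 2.0, 2.0, 2.0, 1.5]   # hypothetical mean ratings for generalized videos
t_stat, df = paired_t(pv, gv)
```

With 28 participants, as in the behavioral analysis above, the same computation would have 27 degrees of freedom.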

Task modulation on intra‐network interactions

For the intra‐network connections, video watching exerted a significant influence on the dorsal attention network. As revealed by the repeated measures ANOVA, the connectivity within the DAN differed significantly among the GV, PV, and Rest conditions (F = 12.34, p < .0001). Post hoc analyses with Bonferroni correction indicated higher connectivity under the GV condition than at Rest (GV vs. Rest: t = 5.05, p < .0001) and under the PV condition (GV vs. PV: t = 3.02, p = .016). However, DAN connectivity did not differ between the PV and Rest conditions (PV vs. Rest: t = 1.95, p = .183). No significant differences among the three conditions were found for the other six intra‐network connectivity measures, and therefore no pair‐wise comparisons were tested. The ANOVA statistics are shown in Figure 3, and the intra‐network connectivity under each condition is plotted in Figure 4.
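The omnibus test reported above is a one‐way repeated measures ANOVA. As a minimal sketch of how such an F statistic is formed (the study used SPSS; this version applies no sphericity correction, and the small `data` array is hypothetical), the total variability is partitioned into condition, subject, and error terms:

```python
import numpy as np

def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA without sphericity correction.

    data: (n_subjects, n_conditions) array.
    Returns the F statistic and its two degrees of freedom.
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()   # between conditions
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_total = ((data - grand) ** 2).sum()
    ss_error = ss_total - ss_cond - ss_subj                  # condition x subject residual
    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error

# Hypothetical connectivity values: 3 subjects (rows) x 3 conditions (columns).
data = [[1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [3.0, 4.0, 8.0]]
f_stat, df_cond, df_error = rm_anova_oneway(data)
```

For a design with 30 participants and three conditions, the uncorrected degrees of freedom from this formula would be 2 and 58; whether the reported statistics used a correction is not stated in the text.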

FIGURE 3

The significance matrix of ANOVA comparing within and between seven network connectivity among three conditions (diagonal lines denote intra‐network connectivity). *p < .05/7 (intra) or p < .05/21 (inter)

FIGURE 4

Bar diagrams of average intra‐network functional connectivity. Error bars represent standard error of the mean (SE). *p < .05; **p < .01; ***p < .001

Task modulation on inter‐network interactions

As shown in Figure 5, watching videos modulated the inter‐network connectivity between the DMN Core and MTL (F = 17.42, p < .0001), as well as between the Core and FPN (F = 6.7, p = .002). Post hoc tests indicated that the Core‐MTL connectivity was lower under both the PV and GV conditions than under the Rest condition (PV vs. Rest: t = −5.0, p < .0001; GV vs. Rest: t = −4.48, p = .0003), with no difference between the PV and GV conditions (t = 1.24, p = .68). In contrast, the Core‐FPN connectivity was higher than at Rest under the GV condition (GV vs. Rest: t = 3.96, p = .001), with only a marginal, nonsignificant increase under the PV condition (PV vs. Rest: t = 2.47, p = .059).

FIGURE 5

Bar diagrams of average inter‐network functional connectivity showing significant differences among the three conditions with Bonferroni correction. Error bars represent standard error of the mean (SE). *p < .05; **p < .01; ***p < .001

The inter‐network connectivity between the VAN and three networks, namely the dMPFC subsystem of the DMN (F = 7.62, p = .0012), the DAN (F = 7.38, p = .0014), and the SN (F = 15.5, p < .0001), showed significant differences among the three conditions. Post hoc analyses indicated higher VAN connectivity with the dMPFC under the PV than under the GV condition (PV vs. GV: t = 4.3, p = .0005), but no difference between GV and Rest (GV vs. Rest: t = −1.39, p = .53). In contrast, a significant reduction in connectivity between the VAN and SN was observed for both the PV (PV vs. Rest: t = −5.01, p < .0001) and GV (GV vs. Rest: t = −3.57, p = .0038) conditions when compared to the Rest condition. The pattern of change in the VAN‐DAN connectivity was the reverse of that in the VAN‐SN connectivity. The inter‐network connectivity between the VAN and DAN was significantly higher under the PV condition than under the GV condition and Rest (PV vs. GV: t = 2.64, p = .039; PV vs. Rest: t = 3.39, p = .006), with the latter two showing no difference (t = 1.38, p = .54). In addition, the connectivity between the FPN and DAN was also modulated by video watching (F = 11.28, p < .0001). This connectivity was significantly higher under the GV condition than at Rest (GV vs. Rest: t = 4.6, p = .0002), and the difference in FPN‐DAN connectivity between the PV and GV conditions was also significant (PV vs. GV: t = −3.27, p = .0084). No other inter‐network connectivity differed significantly among the three conditions, and therefore no further pair‐wise comparisons were made.

DISCUSSION

In this study, we applied graph theory to examine whether and how the connectivity within and between large‐scale networks was modulated by personalized video content when subjects were watching short videos in the scanner. As summarized in Figure 6, PV increased the interaction between DAN and VAN, and between VAN and the dMPFC subsystem of DMN; GV increased the intra‐connections of DAN and interaction of DAN‐FPN and FPN‐Core; both PV and GV reduced the inter‐network connection of Core‐MTL and SN‐VAN. As expected, the VAN and DAN did show greater connectivity with other networks, and the three DMN subnetworks exhibited functional heterogeneity in response to short videos. Intriguingly, the FPN appeared to increase coupling with DAN and part of DMN directly, whereas the SN showed reduced coupling with VAN during short video watching. As discussed below, the difference of functional connectivity between video‐watching state (both PV and GV) and resting state may be attributed to the intrinsic competition between internal and external attention, whereas the network connectivity alterations between PV and GV conditions would help us to understand how personalized attribute of video content influences human brain functions.

FIGURE 6

Schematic illustration of the modulation of network interactions by short videos. The thick solid lines represent increased coupling (compared to Rest), whereas the dashed lines represent reduced coupling (compared to Rest) (Orange, PV; Blue, GV). The thin solid line denotes stronger interconnectivity between the VAN and dMPFC in response to PV than to GV.

The modulation of personalized video content

The most significant difference in brain activity elicited by the two types of short videos was manifested in the interaction between the DMN and the attention networks. Specifically, the VAN showed greater connectivity with the DAN and the dMPFC subsystem of the DMN in response to PV than to GV. The dMPFC subsystem has been found to be selectively activated when individuals consider their current mental states (Andrews‐Hanna, Reidler, Huang, & Buckner, 2010; Andrews‐Hanna, Reidler, Sepulcre, et al., 2010). A meta‐analysis by Andrews‐Hanna and colleagues (2014) also suggested its role in mentalizing, social cognition, and semantic comprehension. The VAN plays a role in detecting novel stimuli and capturing attention in a bottom‐up way (Corbetta et al., 2008; Corbetta & Shulman, 2002), and it is regarded as an “alerting” system capable of detecting environmental changes (Kim, 2014; Macaluso, 2010). Such increased information flow between the VAN and the dMPFC subsystem could contribute to individuals' continuously enhanced attention to current stimuli, which, in turn, may help them achieve a better appreciation of personalized video content. Besides, an extensive body of literature has suggested the roles of attention networks in temporarily maintaining verbal and visual information and their competitive relationship in short‐term memory (Anticevic, Repovs, Shulman, & Barch, 2010; Majerus et al., 2012; Todd, Fougnie, & Marois, 2005). Evidence indicates that activation of the dMPFC subsystem tends to accompany the processing of internal or self‐generated information in social tasks (Buckner et al., 2008; Lieberman, 2007). Self‐referential processing is similar to episodic memory retrieval in that it relates more to recollection than to familiarity (Sajonz et al., 2010). A meta‐analysis (Kim, 2010) found that recollection responses induced higher DMN activity than familiarity responses.
In contrast, elevated activation of VAN was observed when familiarity response was increased (Kim, 2010). Taken together, the enhanced coupling between VAN and dMPFC subsystem under PV condition might indicate that the VAN‐dMPFC pathway is a potential neural basis conveying familiar salience of video stimuli to self‐referential processing, leading to a more immersed state when participants were watching personalized videos.

Reduced coupling of Core‐MTL and VAN‐SN during short video watching

Our results revealed a decreased coupling between the DMN subsystems (Core and MTL) when participants watched short videos regardless of type. The DMN is initially defined as a set of regions showing higher activation during the resting state (Raichle et al., 2001; Shulman et al., 1997), and its functional connectivity has been found to be modulated by external stimuli and associated with task performance (Esposito et al., 2009; Hampson, Driesen, Skudlarski, Gore, & Constable, 2006; Newton, Morgan, Rogers, & Gore, 2011). Increasing evidence has supported the functional dissociation of DMN across cognitive tasks. Individuals who experienced more spontaneous episodic thoughts about the past and future would exhibit higher functional connectivity within the MTL subsystem (Andrews‐Hanna, Reidler, Huang, & Buckner, 2010; Andrews‐Hanna, Reidler, Sepulcre, et al., 2010). The individual's tendency to daydream is positively associated with dynamic functional connectivity between PCC and the MTL subsystem (Kucyi & Davis, 2014). A study on rumination (Chen et al., 2020) also found an increased connectivity between Core and MTL and a decreased coupling between Core and dMPFC when subjects were in a state of rumination, compared to that in a distraction state. Based on these studies, it is reasonable to conjecture that such a decreased coupling between Core and MTL during short video viewing is related to less occurrence of spontaneous thoughts due to enhanced engagement in processing external video stimuli. Contrary to our hypothesis that SN would function as a hub to interact with multiple networks, our results showed reduced connectivity between SN and VAN during video watching task (Figure 6). The short‐video watching task is a process that not only involves constantly detecting and receiving various audiovisual stimuli, but also requires high‐level cognitive processes to comprehend and evaluate the content. 
Both the VAN and SN are thought to play crucial roles in detecting behavior‐relevant salient stimuli (Corbetta et al., 2008; Seeley et al., 2007). Emerging evidence has also suggested the involvement of the SN in a wide range of cognitive and affective tasks (Ham, Leff, de Boissezon, Joffe, & Sharp, 2013), including switching between internally directed mental processes and externally oriented attention (Menon & Uddin, 2010; Sridharan, Levitin, & Menon, 2008). Cascio et al. (2014) compared the neural responses of individuals with autism spectrum disorder (ASD) and healthy controls while they passively viewed pictures of their own personalized interests and pictures of interest to others. They found that the anterior insula and mid‐dorsal ACC, two critical nodes of the SN, showed greater activation when young individuals with ASD viewed pictures of interest to themselves than pictures of interest to others. Further, Kohls et al. chose individualized video stimuli for each participant (in both the ASD and control groups) based on self‐ and parent‐reported circumscribed interests, and they found that the caudate, thalamus, vMPFC, ACC, and insula showed greater BOLD responses in both groups when participants watched videos of personalized interest than when they watched social‐related videos (Kohls, Antezana, Mosner, Schultz, & Yerys, 2018). Hence, it was reasonable to expect that the SN would show enhanced coupling with other networks during video watching. The apparent discrepancy between our results and the above‐mentioned findings indicates that the pattern of regional brain activation might differ from the pattern of network connectivity in a dynamic video watching task, yet the psychological and neural mechanisms underlying this difference warrant further investigation. In addition, a previous study has indicated that the SN plays a modulatory role when a rapid behavioral change occurs (Dosenbach et al., 2006).
A working memory study has also suggested that high cognitive loads lead to increased integration between the SN and both the DMN and the executive‐control network (Liang, Zou, He, & Yang, 2015). As a totally “passive” task, watching short videos does not require much cognitive control to generate outputs or responses, nor does it require any internal reflection. This is consistent with our previous finding that the dorsal anterior cingulate cortex, a component of the SN, displayed deactivation during short video watching (Su et al., 2021). It seems that the VAN might contribute more to detecting the saliency of stimuli when a voluntary response is not required in a task, whereas the SN, which also plays an important role in cognitive control, would reduce its engagement in such tasks. Given the critical role of the SN in human brain function and the neural plasticity induced by long‐term depression (Citri & Malenka, 2008), how reduced regional activation and weakened between‐network connectivity of the SN would impact one's behavior after frequent and long‐term short video watching warrants further investigation.

Increased coupling within DAN, and between FPN and both DAN and Core, when watching short videos

The increased connectivity within DAN may indicate outward attention (Hopfinger, Buonocore, & Mangun, 2000), and the enhanced coupling between FPN and DAN as well as the Core subsystem of DMN may support sustained attentional focus on external stimuli. Cognitive control can be implemented by enhancing functional connectivity both within and between networks (Ray et al., 2020). Previous research has suggested that connectivity within a network is related to a specific cognitive process, whereas connectivity between networks is essential for effectively communicating and integrating information across cognitive processes (Warren et al., 2014). The dynamic process of watching short videos demands continuous external attention and more cognitive control to focus on current stimuli, in which DAN and FPN play a vital role (Zanto & Gazzaley, 2013). Numerous studies support the view that the FPN might mediate internal and external cognition via a dynamic balance in its coupling with the DMN and attention networks (Smallwood, Brown, Baird, & Schooler, 2012; Spreng, Sepulcre, Turner, Stevens, & Schacter, 2013; Vincent et al., 2008). Increased FPN‐DAN connectivity occurs in tasks that require cognitive control for external attention (Maillet, Beaty, Kucyi, & Schacter, 2019). Although the FPN has been suggested to show increased connectivity with DAN and decreased coupling with DMN during externally‐directed attention tasks (Spreng et al., 2010), our study found that the FPN enhanced its coupling with both the DAN and the DMN (Core subsystem) while participants watched short videos. This inconsistency is likely attributable to the functional heterogeneity of the DMN, whose three subsystems have divergent functions.
Another possibility is that the FPN is also characterized by heterogeneity (Yeo et al., 2011); part of the FPN couples with the DMN to regulate internal mental processes, whereas another part is connected to the DAN to modulate external attention (Dixon et al., 2018). Furthermore, the relatively weaker FPN‐DAN and intra‐DAN connectivity under the PV condition might be associated with competition for attentional resources between internal processing and external input. As discussed above, the PV was tailored to each user and was highly self‐related, which was thought to evoke more DMN‐related self‐referential processes.

One limitation of the present study is that low‐level video features, such as sound frequency, volume, luminance, color, and rhythm, differed in PV across subjects. Such differences are unavoidable when delivering personalized video content, and the contribution of these low‐level features to the modulation of large‐scale network interaction is yet to be determined. As our previous study on regional brain activation in response to PV and GV showed no differences in the primary visual and auditory cortices between the two conditions (Su et al., 2021), it is more likely that the “personalized” attribute is the dominant influence on large‐scale network couplings. Another limitation is that the video viewing task in the present work, like other naturalistic stimulus designs, lacks overt or built‐in measures of attention (Eickhoff, Milham, & Vanderwal, 2020). It has been shown that some features of short videos (such as cuts and fast pacing) can elicit more involuntary attention responses (Bolls, Muehling, & Yoon, 2003), whereas video content that an individual is interested in seems to induce more voluntary attention. It is, therefore, important to differentiate what type of attention is modulated by individualized video content and how it relates to personalized video interest.
Although it is still challenging to assess attentional level, future studies can integrate techniques such as in‐scanner eye‐tracking (Kim, Jin, Jo, & Lee, 2020), physiological measures of arousal, and Predictive Eye Estimation Regression (Son et al., 2019) to track how PV and GV modulate attention while participants watch short videos in the scanner.

CONCLUSION

In sum, the present study provides tentative evidence for the reconfiguration of several functional networks in response to short videos, and for the modulation of intra‐ and inter‐network connectivity by personalized videos. The three subsystems of the DMN displayed heterogeneous functional connectivity in the video watching task, and the connectivity between the dMPFC subsystem and VAN might be associated with processing self‐related personalized content. These findings may advance our understanding of the dynamic interaction of brain networks during naturalistic stimulus processing, and shed light on the potential neural basis underlying the effects of recommended content on human behaviors.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

Table S1. The statistical results of intra‐ and inter‐network connectivity with different thresholds.

85 in total

1. Orienting of spatial attention and the interplay between the senses. (Review)

Authors: Emiliano Macaluso
Journal: Cortex Date: 2009-05-28 Impact factor: 4.027

2. Functional-anatomic fractionation of the brain's default network.

Authors: Jessica R Andrews-Hanna; Jay S Reidler; Jorge Sepulcre; Renee Poulin; Randy L Buckner
Journal: Neuron Date: 2010-02-25 Impact factor: 17.173

3. The subsystem mechanism of default mode network underlying rumination: A reproducible neuroimaging study.

Authors: Xiao Chen; Ning-Xuan Chen; Yang-Qian Shen; Hui-Xian Li; Le Li; Bin Lu; Zhi-Chen Zhu; Zhen Fan; Chao-Gan Yan
Journal: Neuroimage Date: 2020-07-22 Impact factor: 6.556

4. Affective neural response to restricted interests in autism spectrum disorders.

Authors: Carissa J Cascio; Jennifer H Foss-Feig; Jessica Heacock; Kimberly B Schauder; Whitney A Loring; Baxter P Rogers; Jennifer R Pryweller; Cassandra R Newsom; Jurnell Cockhren; Aize Cao; Scott Bolton
Journal: J Child Psychol Psychiatry Date: 2013-10-07 Impact factor: 8.982

5. Evaluating fMRI-Based Estimation of Eye Gaze During Naturalistic Viewing.

Authors: Jake Son; Lei Ai; Ryan Lim; Ting Xu; Stanley Colcombe; Alexandre Rosa Franco; Jessica Cloud; Stephen LaConte; Jonathan Lisinski; Arno Klein; R Cameron Craddock; Michael Milham
Journal: Cereb Cortex Date: 2020-03-14 Impact factor: 5.357

6. An fMRI Study of Affective Congruence across Visual and Auditory Modalities.

Authors: Chuanji Gao; Christine E Weber; Douglas H Wedell; Svetlana V Shinkareva
Journal: J Cogn Neurosci Date: 2020-02-28 Impact factor: 3.225

7. Social cognitive neuroscience: a review of core processes.

Authors: Matthew D Lieberman
Journal: Annu Rev Psychol Date: 2007 Impact factor: 24.137

8. Watching social interactions produces dorsomedial prefrontal and medial parietal BOLD fMRI signal increases compared to a resting baseline.

Authors: Marco Iacoboni; Matthew D Lieberman; Barbara J Knowlton; Istvan Molnar-Szakacs; Mark Moritz; C Jason Throop; Alan Page Fiske
Journal: Neuroimage Date: 2004-03 Impact factor: 6.556