
Development and validation of the Interoceptive States Static Images (ISSI) database.

Federica Biotti1, Sarah Ahmad2, Racquel Quinn2, Rebecca Brewer2.   

Abstract

Internal bodily signals provide an essential function for human survival. Accurate recognition of such signals in the self, known as interoception, supports the maintenance of homeostasis, and is closely related to emotional processing, learning and decision-making, and mental health. While numerous studies have investigated interoception in the self, the recognition of these states in others has not been examined despite its crucial importance for successful social relationships. This paper presents the development and validation of the Interoceptive States Static Images (ISSI), introducing a validated database of 423 visual stimuli for the study of non-affective internal state recognition in others, freely available to other researchers. Actors were photographed expressing various exemplars of both interoceptive states and control actions. The images went through a two-stage validation procedure, the first involving free-labelling and the second using multiple choice labelling and quality rating scales. Five scores were calculated for each stimulus, providing information about the quality and specificity of the depiction, as well as the extent to which labels matched the intended state/action. Results demonstrated that control action stimuli were more recognisable than internal state stimuli. Inter-category variability was found for the internal states, with some states being more recognisable than others. Recommendations for the utilisation of ISSI stimuli are discussed. The stimulus set is freely available to researchers, alongside data concerning recognisability.
© 2021. The Author(s).

Keywords:  Bodily signals; Internal states; Interoception; Social interaction; Static images

Year:  2021        PMID: 34651297      PMCID: PMC9374619          DOI: 10.3758/s13428-021-01706-2

Source DB:  PubMed          Journal:  Behav Res Methods        ISSN: 1554-351X


Introduction

Internal bodily signals, such as hunger, thirst, fatigue, nausea, pain, temperature, and cardiac and respiratory signals, are essential for human survival, indicating the physiological state and functioning of the body (e.g. the sensation of thirst signalling the level of dehydration in the body). The ability to perceive and identify these internal sensations, known as interoception (Craig, 2003a), is fundamental to multiple psychological processes, such as emotional processing (e.g., Critchley & Garfinkel, 2017; Garfinkel & Critchley, 2013; Schachter & Singer, 1962; Seth, 2013), and learning and decision-making (Bechara & Damasio, 2002; Dunn et al., 2010; Werner et al., 2009). Furthermore, a growing body of research has linked interoception to mental health and subjective wellbeing; atypical perception of interoceptive states has been found in several mental health conditions and neurodevelopmental disorders, such as Eating Disorders (Klabunde et al., 2013; Pollatos et al., 2008), autism (Garfinkel et al., 2016; Hatfield et al., 2019; Mul et al., 2018; Nicholson et al., 2019), anxiety and Panic Disorder (Ehlers, 1993; Paulus & Stein, 2006; see Khalsa et al., 2018 for a review), depression (Dunn et al., 2007; Furman et al., 2013; Forrest et al., 2015; Harshaw, 2015; see Eggart et al., 2019 for a review), and schizophrenia (Ardizzi et al., 2016). Given the vital role of interoception in understanding typical emotion processing and learning and decision-making, as well as its atypicality in several mental health conditions, research on interoception and emotion has grown significantly in recent years. While numerous studies have focused on the perception of interoceptive states in the self, very few (e.g., Kaulard et al., 2012) have researched the recognition of these states in others, beyond the domain of affective emotion (e.g., happiness, anger, sadness). 
Recognition of others’ affective emotional states (which feature an interoceptive component; Schachter & Singer, 1962) has been studied in detail, in typical adulthood, clinical samples, and across development; indeed a PubMed search using the term “emotion recognition” generated 15,009 results. Recognition of others’ emotional states is crucial for successful social interactions, as well as building and maintaining relationships, making it an important area for psychological research. Recognition of interoceptive states (beyond the affective domain) in others, including identifying others’ hunger, nausea, pain, and breathlessness, for example, is presumably equally important for social interaction, and arguably more important from an evolutionary perspective (as identifying perturbations in these states is necessary in order to offer care and assistance to others). Similarly, studying the mechanisms behind the ability to recognise other people’s bodily sensations is crucial to improve our understanding of empathy for these states in others, with important theoretical and clinical implications. It is somewhat surprising, therefore, that research has, thus far, neglected to investigate this ability. One reason for the dearth of research investigating the recognition of others’ non-affective internal states is presumably the lack of available stimuli. Compared to affective emotion recognition research, the lack of stimuli for the investigation of non-affective state recognition is striking. Since the publication of the “Pictures of Facial Affect” (Ekman & Friesen, 1976), the first standardised battery of facial emotion stimuli, several databases of visual stimuli depicting facial and bodily affective expressions have been developed (e.g., Beaupré et al., 2000; Langner et al., 2010; Lundqvist et al., 1998; Matsumoto & Ekman, 1988; Volkova et al., 2014; Wingenbach et al., 2016). 
Visual stimuli depicting facial and bodily expressions of affective states have been a key component of emotion research, and substantially contributed to our knowledge of affective and cognitive neuroscience, and social and clinical psychology. A purpose-built battery of stimuli depicting non-affective interoceptive states in others will enable research on social cognition to investigate the ability to perceive and recognise these signals in others. This will lead to an expansion of our theoretical understanding of the constructs of interoception and social perception in typical adult populations, developmental samples and clinical groups, both at the behavioural and neurological levels. This report presents the development and validation of the Interoceptive States Static Images (ISSI), a database of full body static images of actors expressing either a non-affective interoceptive state or a control action, which is freely available to other researchers. The battery consists of 423 stimuli, which depict eight actors expressing various exemplars of nine internal states and nine control actions. All photos were taken from a frontal view in a controlled environment, and underwent a standardised image processing procedure to control for lighting conditions, size, position, and background. Stimuli were validated in two stages, one utilising free labelling and the other utilising visual analogue rating scales. Recognition data for each individual stimulus and for the state and control actions overall, including the extent to which they are confused with each other, are provided.

Methods

Stimulus development

Actors

Eight trained actors (four female) aged 22 to 48 were recruited through online and campus advertisements. Neither ethnicity nor first language was specified as a recruitment criterion, though both were recorded; no specific ethnic group was targeted and no actors were excluded on the basis of ethnicity. Recruitment stopped once the required number of actors was reached, and all actors who responded to the recruitment call reported being of Caucasian ethnicity. Actors were either drama students or had previously completed acting training. They were informed about the procedure and the purpose of the stimulus set, and consented to take part in the recording session and to their images being used in scientific research, presented at conferences, published in academic journal articles, and shared with other researchers. Actors received financial remuneration for their time.

Procedure

Prior to the recording session, actors were provided with the list of internal states and actions they would be required to perform, and were asked to practice depicting each state or action in advance. During the recording session, they were required to wear black trousers, black socks, and a black t-shirt. Female actors were asked not to wear make-up and to tie their hair back so that the face was completely visible at all times. Photos were taken in a purpose-built photography studio. Actors stood in a specified position in the centre of a white background, facing a camera mounted on a tripod. Softbox LED lighting was used to keep lighting conditions constant across shooting sessions and to reduce shadows. Actors first produced ten control actions (jumping, clapping, lifting, running, washing hands, spinning/twirling, stumbling, walking, waving, beckoning) and then expressed ten non-emotional internal states (cold, fatigue, nausea, pain, breathlessness, hunger, thirst, hot, satiety, itch). For each stimulus category, the actor was asked to practice briefly and then pose the state or control action five separate times, yielding five exemplars of the same stimulus. Between attempts, the actor re-set to a neutral body position and re-positioned in the middle of the background. Between stimulus categories, a longer break allowed actors to rest and prepare for the next category. The order of stimulus production was fixed and did not vary across actors.

Image processing

Raw photos were edited in Adobe Photoshop 2019. The backdrop was replaced with an artificial white matte background. Image artefacts and distracting visual information (e.g. tattoos) were removed. Brightness and contrast were adjusted and standardised across images. Sharpness was increased using the Smart Sharpen function at 309%, with a 0.6-pixel radius, 100% noise reduction, and lens blur removal. Image size and actor position were matched across stimuli using a 3456 × 5184-pixel white template. The first stimulus was positioned in the centre of the template and scaled to the desired size, chosen so that the actor subtended 12° of visual angle vertically when viewed at 60 cm. Guidelines were drawn to delimit the boundaries of the actor in this position (the extremes of the head and feet on the vertical axis, and of the right and left shoulders on the horizontal axis), providing a frame of reference for all subsequent stimuli. For each actor, images were layered onto the original template and the size and position of the actor were adjusted to fit these guidelines. Each layered image was saved as a new file.
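
The 12° vertical extent at 60 cm fixes the physical size at which stimuli should be displayed. The geometry can be sketched as follows; this is an illustrative calculation, and the `pixels_per_cm` display density is an assumption of the sketch, not part of the ISSI specification:

```python
import math

def stimulus_height_cm(visual_angle_deg, viewing_distance_cm):
    """Physical height an image must have on screen to subtend the
    given vertical visual angle at the given viewing distance."""
    half_angle = math.radians(visual_angle_deg / 2)
    return 2 * viewing_distance_cm * math.tan(half_angle)

def stimulus_height_px(height_cm, pixels_per_cm):
    """Convert the physical height to pixels; pixels_per_cm depends
    on the particular display and is assumed for illustration."""
    return round(height_cm * pixels_per_cm)

# The ISSI sizing target: 12 deg vertically at 60 cm, i.e. about 12.6 cm
target_cm = stimulus_height_cm(12, 60)
```

At 60 cm, 12° corresponds to roughly 12.6 cm of vertical extent; the pixel count then follows from the density of whichever display is used.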

Stimuli validation

All stimuli underwent a researcher-led pre-selection based on basic visual properties. Photos in which parts of the actor’s body were missing (e.g. the head falling outside the top edge of the picture in some ‘jumping’ exemplars), or in which motion blur could not be resolved through editing, were removed from the database. As a result, all stimuli depicting the control action ‘stumbling’ were removed, as a high proportion of these images contained several such issues. To retain equal numbers of control actions and internal states, stimuli depicting ‘thirst’ were also removed: actors reported that this state was difficult to portray because it lacks a visible behavioural response, and the authors agreed that the stimuli were not recognisable as depicting thirst. For each actor and each stimulus category, the four exemplars with the highest visual quality, as judged by the researchers, were selected for inclusion in the first validation task, yielding a total of 560 stimuli.

Stimulus selection: Free-labelling task

Forty participants (four male) aged 18–30 years (M = 19.05, SD = 2.68) were recruited through the Royal Holloway, University of London (RHUL) SONA System to take part in a free-labelling task. Participants were all students at RHUL and received course credits for their participation. There were no exclusion criteria for this task, although any diagnosis of a mental health condition was recorded. A general description of the task procedure was provided, but participants were not informed of the aim of the study until the end of the session, to avoid influencing their responses. Stimuli were divided into two sets of 280 images, and each participant viewed only one of the two sets, in order to reduce fatigue. Instructions were standardised across participants and delivered verbatim by the experimenter as follows: “You will see a series of body postures, one by one. For each one, you need to provide a very brief description of what you think the body posture represents (for example, what the person is doing, thinking or feeling). There will be many stimuli, so it’s very important that you keep your answers as brief as possible. Ideally, you will use a single word or a short phrase. For example, if you see an image depicting a person sneezing, you can simply answer ‘sneezing’. If you think that the person could be doing, thinking or feeling more than one thing, you can give multiple answers, but please try to keep the description of each one brief. If I need more details, I will ask for them. There are no right or wrong answers, so I will not provide any feedback during or after the session. I will simply record your answers and occasionally intervene if I think something is not clear or if I need more details.” Following the instructions, participants were invited to ask any questions they had about the procedure. The experimenter then sat a few metres behind the participant and typed their responses verbatim.
When additional information was required, the experimenter used standardised phrases to prompt the participant. If the answer required more details, the experimenter would say “Can you tell me more about that?”. If the answer was ambiguous/unclear, the experimenter would say “Can you be more specific?” or “Can you tell me what you mean by that?”. Finally, if participants’ responses were too verbose, the experimenter would say “Try to use single words or short phrases”. This task took approximately 20 min to complete.

Stimulus validation: Label selection and rating task

Based on the results of the free-labelling task, 423 stimuli were selected for the second validation step (details of the selection procedure are given in the Results section). Of these, 202 stimuli depicted nine internal states (breathlessness, cold, fatigue, hot, hunger, itch, nausea, pain, satiety) (Fig. 1a) and 221 depicted nine control actions (beckoning, clapping, jumping, lifting, running, twirling, walking, washing hands, waving) (Fig. 1b). Participants were recruited from the RHUL SONA System, the Testable Minds database (www.testable.org), and through advertisements on social media. A total of 412 participants (169 female) aged 18–71 years (M = 30.08, SD = 10.56) with no diagnosis of any mental health condition took part in an online labelling task. The task was programmed in Testable and presented each participant with a random sample of 100 stimuli. On each trial, a single image was presented in the centre of the computer screen and remained visible until the participant had finished responding. Participants were provided with a list of the nine internal state and nine action labels (presented in alphabetical order) and asked to select the label that best described the image. If unsure, participants could select more than one label, or skip to the next trial if they thought no label applied. Following label selection, participants rated how well each chosen label described the image on a five-point Likert scale (Very Poorly; Poorly; Moderately; Well; Very Well). This task took approximately 30 min to complete.
Fig. 1

Examples of stimuli from the ISSI database. a Examples of internal state stimuli. b Examples of control action stimuli

Results

Free-labelling task

Participants’ responses in this stage were analysed qualitatively. First, two coders independently coded the responses for accuracy (i.e., identification of the intended state or action). A score of 1 was given to responses that either correctly identified the state or action or, for state stimuli, correctly described the portrayed action associated with the state (e.g. for the ‘fatigue’ stimuli, both ‘tired’ and ‘yawning’ were considered correct responses). A score of 0 was given to inaccurate responses (e.g. ‘hot’ or ‘shocked’ to describe a ‘breathlessness’ stimulus). Where coders disagreed, responses were discussed by all authors until agreement was reached. Inter-coder agreement was near perfect (Cohen’s κ = 0.81). Each stimulus was given a recognisability index (RI), corresponding to its mean accuracy score. Overall, internal state and action stimuli were recognised correctly 65% and 75% of the time, respectively. Of the internal states, itch (M = 88%, SD = 13%, range 55–100%) and cold (M = 88%, SD = 15%, range 55–100%) were the best recognised, while hunger was the least well recognised (M = 22%, SD = 8%, range 10–40%). Among the control actions, walking was the best recognised (M = 91%, SD = 12%, range 55–100%), whereas beckoning was the least well recognised (M = 49%, SD = 11%, range 30–75%). See Table 1 for a full summary of RIs.
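
The two quantities above can be sketched as follows; this is an illustrative implementation of a mean-accuracy index and Cohen's kappa, not the authors' analysis code:

```python
from collections import Counter

def recognisability_index(accuracy_codes):
    """RI for a single stimulus: the mean of the agreed binary (0/1)
    accuracy codes across participants' free-label responses."""
    return sum(accuracy_codes) / len(accuracy_codes)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' binary codes of the same responses."""
    n = len(coder_a)
    # Observed agreement: proportion of responses coded identically
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: from each coder's marginal code frequencies
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[code] * freq_b[code] for code in freq_a) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)
```

For example, a stimulus judged correct by 15 of 20 raters has RI = 0.75, i.e. 75% recognition.
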
Table 1

Recognisability Indices (RI) for each Internal State and Action category. RIs represent the proportion of recognition accuracy in the free-labelling task (Stage 1)

Category           RI mean % (SD)   RI min %   RI max %
BREATHLESSNESS     29 (16)          5          65
COLD               88 (15)          55         100
FATIGUE            80 (18)          45         100
HOT                35 (19)          5          65
HUNGER             22 (8)           10         40
ITCH               88 (13)          55         100
NAUSEA             70 (15)          40         95
PAIN               79 (20)          30         100
SATIETY            39 (16)          15         70
BECKONING          49 (11)          30         75
CLAPPING           85 (15)          55         100
JUMPING            78 (14)          50         100
LIFTING            65 (23)          10         100
RUNNING            84 (15)          55         100
TWIRLING           66 (16)          45         95
WALKING            91 (12)          55         100
WASHING HANDS      57 (10)          30         75
WAVING             80 (12)          60         100
Based on the RI, each stimulus was categorised by recognisability into five bands: Very poor (RI 0.00–0.20), Poor (RI 0.21–0.40), Average (RI 0.41–0.60), Good (RI 0.61–0.80), and Very good (RI 0.81–1.00). All stimuli categorised as Very good, Good, or Average were kept in the final database. In addition, a minimum of two exemplars per actor was retained for each stimulus category: where fewer than two of an actor’s stimuli in a category were rated Very good, Good, or Average, the two stimuli with the highest RI were retained instead. See Appendix Table 3 for RI scores for every retained stimulus. A final set of 423 stimuli was retained and used in the second stage of validation¹. In this final set, 209 stimuli depicted male actors and 214 depicted female actors. Each of the eight actors appeared in at least 50 stimuli, and the most depicted actor appeared in 56.
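
The banding and retention rules can be sketched as follows; this is an illustrative reading of the stated thresholds, not the authors' code:

```python
def ri_category(ri):
    """Map an RI (proportion, 0-1) to the five recognisability bands."""
    for upper, label in [(0.20, "Very poor"), (0.40, "Poor"),
                         (0.60, "Average"), (0.80, "Good"),
                         (1.00, "Very good")]:
        if ri <= upper:
            return label

def retain(exemplar_ris):
    """RIs of one actor's exemplars within one stimulus category.
    Keep all rated Average or better (RI > 0.40); if fewer than two
    qualify, keep the actor's two highest-RI exemplars instead."""
    keep = [ri for ri in exemplar_ris if ri > 0.40]
    if len(keep) < 2:
        keep = sorted(exemplar_ris, reverse=True)[:2]
    return keep
```

So an actor whose four exemplars scored 0.10, 0.15, 0.50, and 0.90 in a category keeps the last two, while an actor with no exemplar above 0.40 still keeps their two best.
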
Table 3

Recognisability, quality, specificity, and accuracy scores for each stimulus of the ISSI database. RI Recognisability Index, QI Quality Index, SI Specificity Index, SI+Maximum-Distractor Specificity Index, CR Choice Rate, CR+High-Quality Choice Rate. The stimulus name indicates, in order, the state/action displayed, the actor’s gender (M or F), the actor’s identifier (1–4), and the exemplar (1–4)

STIMULUS | RI (%) | QI | SI | SI+ | CR (%) | CR+ (%)
BREATHLESSNESS_M1_1203.462.282.198577
BREATHLESSNESS_M1_2102.681.331.297463
BREATHLESSNESS_M2_1452.30– 0.40– 0.585738
BREATHLESSNESS_M2_2252.881.511.437874
BREATHLESSNESS_M3_1652.19– 0.65– 0.835536
BREATHLESSNESS_M3_2153.041.761.707569
BREATHLESSNESS_M4_151.56– 1.20– 1.444321
BREATHLESSNESS_M4_2153.772.642.628882
BREATHLESSNESS_F1_1253.742.952.879086
BREATHLESSNESS_F1_2253.202.312.257979
BREATHLESSNESS_F2_1452.150.150.035748
BREATHLESSNESS_F2_2303.312.522.488483
BREATHLESSNESS_F3_1503.221.641.577867
BREATHLESSNESS_F3_2453.221.341.207664
BREATHLESSNESS_F4_1253.382.031.978574
BREATHLESSNESS_F4_2252.891.491.337764
COLD_M1_11004.514.364.359899
COLD_M1_2904.363.933.879995
COLD_M1_31004.083.713.699596
COLD_M1_4603.723.173.169290
COLD_M2_1904.233.863.839495
COLD_M2_2553.743.183.168786
COLD_M2_31004.494.254.259898
COLD_M2_41004.363.973.949692
COLD_M3_1954.504.054.029993
COLD_M3_2703.913.243.199387
COLD_M3_3954.483.933.869993
COLD_M3_41004.153.673.669489
COLD_M4_11004.624.013.9710096
COLD_M4_2603.722.442.368976
COLD_M4_3904.323.703.669892
COLD_M4_41004.534.074.039896
COLD_F1_1854.303.513.499794
COLD_F1_2803.873.523.509591
COLD_F1_3853.933.753.709891
COLD_F1_4953.803.613.599294
COLD_F2_1954.594.184.169897
COLD_F2_2954.294.094.0810099
COLD_F2_3954.323.873.849897
COLD_F2_4954.554.344.3210098
COLD_F3_11004.323.633.569793
COLD_F3_21004.473.983.949795
COLD_F3_31004.634.174.1210097
COLD_F3_4954.233.713.679792
COLD_F4_1604.273.963.939695
COLD_F4_21004.394.013.999796
COLD_F4_3553.232.552.528684
COLD_F4_4753.522.462.448575
FATIGUE_M1_1450.97– 2.19– 2.303017
FATIGUE_M1_2652.862.712.717783
FATIGUE_M1_3701.97– 0.62– 0.735834
FATIGUE_M1_4652.450.320.226748
FATIGUE_M2_1702.902.312.257376
FATIGUE_M2_21003.182.642.647980
FATIGUE_M2_3603.051.471.297760
FATIGUE_M3_1803.943.763.729186
FATIGUE_M3_2452.380.520.476745
FATIGUE_M3_3903.893.273.249291
FATIGUE_M4_1954.113.062.999386
FATIGUE_M4_2904.042.752.699381
FATIGUE_M4_3703.762.122.069076
FATIGUE_M4_4903.492.632.608980
FATIGUE_F1_1703.343.083.038081
FATIGUE_F1_2953.753.413.409088
FATIGUE_F1_3652.912.822.807583
FATIGUE_F2_1803.272.552.488178
FATIGUE_F2_2953.333.203.197985
FATIGUE_F2_31003.673.343.338287
FATIGUE_F2_4952.922.572.537377
FATIGUE_F3_11003.423.133.128184
FATIGUE_F3_2502.231.791.756064
FATIGUE_F3_31003.513.103.078283
FATIGUE_F4_1953.583.193.178283
FATIGUE_F4_2953.602.822.778883
HOT_M1_1102.700.420.277242
HOT_M1_2352.761.401.356862
HOT_M2_1452.700.860.747354
HOT_M2_2302.180.100.065552
HOT_M3_152.01– 0.60– 0.785827
HOT_M3_2201.94– 0.91– 0.985229
HOT_M4_1252.940.290.217044
HOT_M4_2353.121.110.997557
HOT_F1_1602.540.960.876555
HOT_F1_2551.76– 0.22– 0.284938
HOT_F2_1202.19– 0.08– 0.175940
HOT_F2_2101.54– 0.25– 0.364636
HOT_F3_1452.170.880.805751
HOT_F3_2352.740.770.656549
HOT_F4_1652.991.281.217166
HOT_F4_2602.991.671.617171
HOT_F4_3452.941.221.127162
HUNGER_M1_1302.26– 0.07– 0.185742
HUNGER_M1_2402.540.610.496648
HUNGER_M2_1201.53– 1.60– 1.794120
HUNGER_M2_2251.52– 1.53– 1.794520
HUNGER_M3_1201.82– 0.46– 0.615232
HUNGER_M3_2251.93– 0.87– 1.075826
HUNGER_M4_1101.19– 1.78– 1.953420
HUNGER_M4_2101.13– 2.42– 2.723610
HUNGER_F1_1151.29– 1.37– 1.523922
HUNGER_F1_2250.94– 2.39– 2.602711
HUNGER_F2_1252.020.190.055748
HUNGER_F2_2151.85– 0.76– 0.955226
HUNGER_F3_1201.93– 0.23– 0.355541
HUNGER_F3_2151.15– 1.92– 2.063419
HUNGER_F4_1251.70– 0.51– 0.605033
HUNGER_F4_2251.85– 0.43– 0.535138
ITCH_M1_1703.312.682.628382
ITCH_M1_21004.233.373.319689
ITCH_M1_3903.742.812.779179
ITCH_M2_1954.383.843.839694
ITCH_M2_2954.383.983.959393
ITCH_M2_31004.503.583.559790
ITCH_M2_41004.323.923.919496
ITCH_M3_11003.882.952.929086
ITCH_M3_2853.863.203.188888
ITCH_M3_3652.17– 0.62– 0.695537
ITCH_M3_4802.870.820.777357
ITCH_M4_1754.103.613.619590
ITCH_M4_2754.002.842.799382
ITCH_M4_3953.912.872.848979
ITCH_M4_4854.013.063.029282
ITCH_F1_1954.313.913.849490
ITCH_F1_2954.293.973.959494
ITCH_F1_3954.253.593.599590
ITCH_F2_11004.353.773.719691
ITCH_F2_2854.143.663.639389
ITCH_F2_3954.093.733.739494
ITCH_F2_4853.693.363.339091
ITCH_F3_11003.963.593.569089
ITCH_F3_2652.440.500.466351
ITCH_F3_3954.163.843.849595
ITCH_F3_4953.833.383.369491
ITCH_F4_11004.514.154.149596
ITCH_F4_2752.771.331.247065
ITCH_F4_3552.21– 0.15– 0.315843
ITCH_F4_4954.113.143.078883
NAUSEA_M1_1603.091.451.417766
NAUSEA_M1_2753.371.601.518064
NAUSEA_M1_3803.922.402.318874
NAUSEA_M2_1603.401.921.848271
NAUSEA_M2_2853.011.301.217661
NAUSEA_M2_3503.331.531.398261
NAUSEA_M3_1803.892.512.419377
NAUSEA_M3_2503.291.301.208161
NAUSEA_M4_1904.082.362.278873
NAUSEA_M4_2853.251.231.157965
NAUSEA_M4_3653.491.261.158460
NAUSEA_M4_4803.812.152.049071
NAUSEA_F1_1402.410.08– 0.126441
NAUSEA_F1_2653.361.921.858066
NAUSEA_F2_1503.191.911.808068
NAUSEA_F2_2753.612.822.788683
NAUSEA_F2_3603.652.632.558581
NAUSEA_F2_4704.083.333.269488
NAUSEA_F3_1753.301.481.417659
NAUSEA_F3_2954.363.493.409386
NAUSEA_F3_3503.662.322.258675
NAUSEA_F4_1853.942.822.778984
NAUSEA_F4_2754.143.163.129286
NAUSEA_F4_3704.163.153.149186
PAIN_M1_1853.943.022.989186
PAIN_M1_21004.353.793.769795
PAIN_M1_31004.574.334.339797
PAIN_M1_4803.863.133.089382
PAIN_M2_1903.452.041.988571
PAIN_M2_21003.852.862.819082
PAIN_M2_3953.651.921.848872
PAIN_M2_4653.421.751.658367
PAIN_M3_1854.213.643.619692
PAIN_M3_21004.534.004.009795
PAIN_M3_3704.123.263.239589
PAIN_M3_41004.464.024.009998
PAIN_M4_1904.404.013.999594
PAIN_M4_2653.332.082.037971
PAIN_M4_3954.454.274.259795
PAIN_F1_1803.161.651.568658
PAIN_F1_2953.903.203.199185
PAIN_F1_3803.582.472.439081
PAIN_F2_1402.532.052.047379
PAIN_F2_2552.771.491.437565
PAIN_F3_1804.043.703.679293
PAIN_F4_1553.792.932.849280
PAIN_F4_2553.162.011.958373
PAIN_F4_3303.092.062.057871
SATIETY_M1_1301.18– 1.61– 1.813320
SATIETY_M1_2251.80– 0.49– 0.604739
SATIETY_M2_1351.73– 0.79– 0.894336
SATIETY_M2_2551.84– 0.75– 1.024532
SATIETY_M3_1451.59– 1.28– 1.444325
SATIETY_M3_2150.79– 2.25– 2.452318
SATIETY_M4_1501.84– 0.56– 0.714538
SATIETY_M4_2400.99– 2.38– 2.732615
SATIETY_F1_1351.22– 1.57– 1.823419
SATIETY_F1_2301.21– 1.93– 2.123321
SATIETY_F2_1200.78– 2.30– 2.472115
SATIETY_F2_2151.38– 1.35– 1.444025
SATIETY_F3_1501.68– 0.73– 0.914235
SATIETY_F3_2601.63– 0.71– 0.844737
SATIETY_F3_3702.661.271.166459
SATIETY_F4_1451.81– 0.36– 0.544940
SATIETY_F4_2501.79– 0.48– 0.584741
BECKONING_M1_1503.813.553.539292
BECKONING_M1_2504.034.014.009294
BECKONING_M2_1754.303.813.799391
BECKONING_M2_2654.143.953.959293
BECKONING_M2_3504.284.044.039393
BECKONING_M2_4504.243.983.969392
BECKONING_M3_1353.623.143.108987
BECKONING_M3_2403.813.663.639193
BECKONING_M4_1403.953.493.469087
BECKONING_M4_2453.493.163.138585
BECKONING_F1_1703.973.643.649091
BECKONING_F1_2503.813.533.529191
BECKONING_F1_3503.433.103.058787
BECKONING_F2_1302.471.911.897775
BECKONING_F2_2453.713.873.859296
BECKONING_F3_1503.172.682.678384
BECKONING_F3_2353.182.822.798481
BECKONING_F4_1553.262.712.708285
BECKONING_F4_2503.552.872.848382
CLAPPING_M1_11004.604.364.369897
CLAPPING_M1_2803.723.373.359185
CLAPPING_M1_31004.344.174.139696
CLAPPING_M1_41004.363.953.929792
CLAPPING_M2_1903.862.972.919083
CLAPPING_M2_21004.594.284.249795
CLAPPING_M2_31004.634.384.369896
CLAPPING_M2_41004.494.074.049590
CLAPPING_M3_1753.562.572.538581
CLAPPING_M3_2853.933.313.319391
CLAPPING_M3_3702.951.091.037558
CLAPPING_M3_4853.882.972.939280
CLAPPING_M4_1553.402.802.768586
CLAPPING_M4_2553.142.302.248478
CLAPPING_M4_3903.482.552.538882
CLAPPING_M4_41004.384.094.059793
CLAPPING_F1_11004.634.394.3710096
CLAPPING_F1_21004.223.743.719592
CLAPPING_F1_3603.151.621.618063
CLAPPING_F1_4703.532.172.168473
CLAPPING_F2_1904.444.214.179896
CLAPPING_F2_2854.253.883.889691
CLAPPING_F2_3803.372.011.938367
CLAPPING_F2_4903.392.021.988369
CLAPPING_F3_1852.700.600.546850
CLAPPING_F3_2802.900.850.817453
CLAPPING_F3_3552.751.081.046960
CLAPPING_F4_1853.723.163.158986
CLAPPING_F4_2653.532.992.968986
CLAPPING_F4_31004.253.893.889593
CLAPPING_F4_4953.702.542.528681
JUMPING_M1_1853.522.061.988173
JUMPING_M1_2854.393.713.679389
JUMPING_M1_3954.343.893.899493
JUMPING_M1_41004.353.883.839290
JUMPING_M3_1653.082.712.678082
JUMPING_M3_2802.722.272.227480
JUMPING_M4_1752.731.821.797171
JUMPING_M4_2802.281.641.606471
JUMPING_F1_1653.143.093.028188
JUMPING_F1_2553.052.522.487779
JUMPING_F1_3903.233.153.158491
JUMPING_F1_4803.403.303.308690
JUMPING_F2_1503.272.882.848685
JUMPING_F2_2803.042.682.637579
JUMPING_F2_31004.354.194.189896
JUMPING_F2_4953.963.943.939293
JUMPING_F3_1903.593.243.228687
JUMPING_F3_2753.382.172.138274
JUMPING_F3_3603.713.383.378588
JUMPING_F3_4803.883.563.548891
JUMPING_F4_1752.792.102.097076
JUMPING_F4_2603.322.732.727880
JUMPING_F4_3852.902.422.377176
LIFTING_M1_11004.314.064.049495
LIFTING_M1_2754.223.823.809193
LIFTING_M2_1152.291.501.456669
LIFTING_M2_2553.552.892.878582
LIFTING_M3_1603.362.762.738281
LIFTING_M3_2352.892.362.347377
LIFTING_M4_1603.593.263.258686
LIFTING_M4_2653.823.093.068684
LIFTING_M4_3854.293.853.789691
LIFTING_F1_1854.043.823.788790
LIFTING_F1_2653.663.253.248586
LIFTING_F1_3803.853.593.558591
LIFTING_F2_1854.073.823.809092
LIFTING_F2_2102.342.142.137081
LIFTING_F3_1653.993.783.768992
LIFTING_F3_2653.402.602.558479
LIFTING_F4_1854.204.234.209196
LIFTING_F4_2653.983.683.658688
LIFTING_F4_3854.334.264.249395
LIFTING_F4_4603.592.972.908485
RUNNING_M1_1653.792.802.798884
RUNNING_M1_2903.752.932.918883
RUNNING_M1_3903.382.222.168474
RUNNING_M2_1953.893.393.349591
RUNNING_M2_2803.412.282.228280
RUNNING_M2_31003.893.043.029281
RUNNING_M2_41003.242.182.148176
RUNNING_M3_11004.233.803.759592
RUNNING_M4_1753.352.212.178275
RUNNING_M4_2702.721.261.207359
RUNNING_M4_3953.873.193.189187
RUNNING_M4_4903.752.772.749081
RUNNING_F1_1553.322.122.048570
RUNNING_F1_2552.370.900.856958
RUNNING_F1_3752.450.830.826655
RUNNING_F1_4703.031.861.837967
RUNNING_F2_1603.502.051.978674
RUNNING_F2_2853.522.352.338474
RUNNING_F2_3953.522.212.208577
RUNNING_F2_4904.033.613.609492
Table 3 (excerpt): scores for each control action stimulus. FL (%) = proportion of Stage 1 free-labelling responses matching the intended action; QI, SI, SI+, CR and CR+ are defined in Table 2.

Stimulus              FL (%)  QI    SI     SI+    CR (%)  CR+ (%)
RUNNING_F3_1          80      3.55  2.52   2.47   87      75
RUNNING_F3_2          100     4.61  4.34   4.29   97      96
RUNNING_F3_3          100     4.52  4.21   4.19   98      96
RUNNING_F3_4          100     4.24  3.70   3.68   92      88
RUNNING_F4_1          90      4.31  3.68   3.64   94      90
RUNNING_F4_2          85      3.17  1.66   1.62   77      67
TWIRLING_M1_1         45      2.10  0.60   0.52   59      54
TWIRLING_M1_2         65      2.83  1.62   1.55   75      66
TWIRLING_M1_3         60      2.89  2.16   2.12   77      73
TWIRLING_M2_1         80      3.65  3.31   3.28   91      92
TWIRLING_M3_1         55      3.93  3.60   3.56   90      89
TWIRLING_M3_2         65      3.71  3.39   3.35   88      89
TWIRLING_M3_3         90      3.22  2.60   2.58   81      80
TWIRLING_M4_1         50      2.89  1.96   1.95   76      71
TWIRLING_M4_2         85      2.99  1.96   1.94   79      71
TWIRLING_F1_1         50      3.03  1.35   1.34   76      64
TWIRLING_F1_2         70      3.94  3.55   3.49   90      90
TWIRLING_F1_3         90      3.66  2.91   2.86   87      84
TWIRLING_F1_4         95      3.72  2.98   2.95   86      85
TWIRLING_F2_1         50      1.81  –0.46  –0.58  51      41
TWIRLING_F2_2         60      2.47  0.63   0.54   62      56
TWIRLING_F3_1         60      3.83  3.27   3.25   89      87
TWIRLING_F3_2         55      3.16  1.28   1.25   73      66
TWIRLING_F4_1         45      3.38  2.90   2.86   83      84
TWIRLING_F4_2         80      3.30  2.43   2.39   79      76
WALKING_M1_1          100     4.24  3.97   3.97   97      95
WALKING_M1_2          100     3.95  3.71   3.69   93      92
WALKING_M1_3          90      3.76  3.48   3.47   91      91
WALKING_M1_4          65      3.37  3.35   3.34   86      91
WALKING_M2_1          100     4.19  3.85   3.84   96      95
WALKING_M2_2          95      3.06  2.24   2.21   78      78
WALKING_M2_3          80      3.61  3.26   3.19   90      89
WALKING_M2_4          75      2.78  3.05   3.04   76      91
WALKING_M3_1          100     3.10  1.84   1.82   77      67
WALKING_M3_2          100     4.10  3.80   3.80   93      93
WALKING_M3_3          90      3.57  2.43   2.40   90      78
WALKING_M3_4          100     4.28  4.12   4.10   96      95
WALKING_M4_1          95      3.75  3.55   3.55   92      95
WALKING_M4_2          90      2.69  1.79   1.70   75      70
WALKING_M4_3          80      3.41  3.22   3.20   85      89
WALKING_M4_4          95      3.73  3.48   3.43   91      88
WALKING_F1_1          100     4.28  4.11   4.09   95      94
WALKING_F1_2          100     4.29  4.11   4.09   95      94
WALKING_F1_3          80      2.81  2.29   2.27   74      78
WALKING_F1_4          55      2.74  2.67   2.64   75      84
WALKING_F2_1          95      3.56  3.13   3.12   88      87
WALKING_F2_2          85      3.62  3.29   3.29   90      92
WALKING_F2_3          65      2.04  2.45   2.45   60      83
WALKING_F2_4          100     4.27  4.13   4.10   96      97
WALKING_F3_1          95      3.59  3.38   3.34   89      90
WALKING_F3_2          95      4.27  4.11   4.11   96      98
WALKING_F3_3          100     4.32  4.16   4.14   93      94
WALKING_F3_4          100     4.03  3.78   3.78   96      95
WALKING_F4_1          95      4.04  3.84   3.83   94      95
WALKING_F4_2          100     4.02  3.81   3.80   89      92
WALKING_F4_3          100     4.15  3.94   3.91   96      96
WALKING_F4_4          90      3.84  3.28   3.27   90      90
WASHING_HANDS_M1_1    50      3.34  2.68   2.65   85      81
WASHING_HANDS_M1_2    55      3.86  3.37   3.35   90      88
WASHING_HANDS_M2_1    60      3.88  3.44   3.43   88      88
WASHING_HANDS_M2_2    60      3.20  2.31   2.23   82      76
WASHING_HANDS_M3_1    55      3.97  3.80   3.80   90      92
WASHING_HANDS_M3_2    55      3.10  2.13   2.10   76      75
WASHING_HANDS_M3_3    55      3.48  2.69   2.65   82      81
WASHING_HANDS_M4_1    30      3.08  2.55   2.52   83      83
WASHING_HANDS_M4_2    45      3.68  3.24   3.24   87      88
WASHING_HANDS_F1_1    45      4.15  3.42   3.40   95      88
WASHING_HANDS_F1_2    70      4.11  3.64   3.62   91      90
WASHING_HANDS_F1_3    65      4.20  3.54   3.49   90      86
WASHING_HANDS_F2_1    75      3.99  3.41   3.40   90      87
WASHING_HANDS_F2_2    60      3.77  3.26   3.25   87      88
WASHING_HANDS_F3_1    55      3.41  2.71   2.67   84      83
WASHING_HANDS_F3_2    65      3.67  3.28   3.26   87      87
WASHING_HANDS_F3_3    55      2.86  1.69   1.60   77      68
WASHING_HANDS_F4_1    65      3.82  3.04   2.98   85      85
WASHING_HANDS_F4_2    55      3.69  3.44   3.43   86      90
WAVING_M1_1           85      4.12  3.59   3.56   97      90
WAVING_M1_2           90      4.11  3.60   3.58   93      91
WAVING_M1_3           75      3.84  3.22   3.20   89      83
WAVING_M1_4           85      3.76  3.01   2.99   86      84
WAVING_M2_1           80      3.60  3.10   3.08   85      88
WAVING_M2_2           95      3.90  3.38   3.36   89      88
WAVING_M2_3           100     3.85  3.41   3.41   95      93
WAVING_M2_4           95      4.17  3.61   3.59   90      89
WAVING_M3_1           80      4.19  3.68   3.67   93      93
WAVING_M3_2           85      3.99  3.46   3.44   88      88
WAVING_M3_3           70      3.20  2.79   2.76   82      88
WAVING_M3_4           100     4.07  3.61   3.60   90      92
WAVING_M4_1           70      3.53  2.92   2.90   89      85
WAVING_M4_2           80      3.68  3.12   3.11   88      86
WAVING_M4_3           70      3.57  2.74   2.72   85      81
WAVING_M4_4           75      3.35  2.64   2.62   83      81
WAVING_F1_1           70      3.66  2.79   2.78   84      84
WAVING_F1_2           70      3.66  3.06   3.06   87      85
WAVING_F1_3           60      2.86  2.31   2.31   77      79
WAVING_F1_4           85      3.80  3.29   3.28   87      88
WAVING_F2_1           85      3.85  3.34   3.32   85      85
WAVING_F2_2           90      4.03  3.63   3.58   91      88
WAVING_F2_3           75      3.48  2.94   2.90   85      87
WAVING_F2_4           100     3.83  3.52   3.50   87      89
WAVING_F3_1           75      3.72  3.32   3.30   86      88
WAVING_F3_2           70      3.95  3.50   3.48   92      91
WAVING_F3_3           65      2.76  2.23   2.22   79      82
WAVING_F3_4           60      3.06  2.60   2.57   86      84
WAVING_F4_1           80      3.66  3.34   3.31   89      91
WAVING_F4_2           70      3.69  3.04   3.02   87      87
WAVING_F4_3           95      3.94  3.63   3.61   91      91
WAVING_F4_4           70      3.83  3.28   3.28   89      89

Label selection and rating task

Quality and Accuracy Scores

Each stimulus was rated by a mean of 97 participants (min = 74, max = 123). There are multiple ways in which the validity and quality of stimuli can be defined, so to allow researchers to select stimuli based on their own requirements, a comprehensive range of stimulus measures has been created and is provided below. For each stimulus, five separate scores were calculated: the quality index (QI); the specificity index (SI); the maximum-distractor specificity index (SI+); the choice rate (CR); and the high-quality choice rate (CR+) (Table 2). The scores were calculated based on the ratings of the whole sample (both female and male observers), as well as on the ratings of female and male observers separately.
Table 2

Summary of scores. T = target; D = distractor. In the formulae for QI, SI, and SI+, T and D correspond to a value between 0 and 5 (participants’ ratings of how well a stimulus depicts a given state label). In the formulae for CR and CR+, T and D correspond to a binary value: 0 or 1 (indicating whether the label was selected (1) or not (0)). n = total number of stimulus ratings across all participants; the subscript i indexes individual stimulus ratings across all participants.

Quality Index (QI). How well the target label describes the image. Formula: QI = ΣT_i / n. Range: 0 to 5. Interpretation: 0 = target label not selected; 1 = very poor depiction; 5 = very good depiction.

Specificity Index (SI). How well the target label describes the image, over and above distractor state/action labels. Formula: SI = Σ(T_i − D̄_i) / n, where D̄_i is the mean rating of the distractor labels selected on rating i. Range: −5 to 5. Interpretation: negative values = target label received a lower rating than the distractor labels taken together; 0 = target and distractor labels are rated equally; positive values = target label received a higher rating than the distractor labels taken together.

Maximum-distractor Specificity Index (SI+). How well the target label describes the image, over and above the distractor receiving the highest rating. Formula: SI+ = Σ(T_i − D_i,max) / n. Range: −5 to 5. Interpretation: negative values = target received a lower rating than the highest-rated distractor; 0 = target and highest-rated distractor are rated equally; positive values = target received a higher rating than the highest-rated distractor.

Choice Rate (CR). Proportion of raters who selected the target label, regardless of the quality rating. Formula: CR = (ΣT_i,selected / n) × 100. Range: 0% to 100%. Interpretation: 0% = target label was never selected to describe the stimulus; 100% = target label was always selected to describe the stimulus.

High-quality Choice Rate (CR+). Proportion of raters who gave the target label (rather than a distractor label) the highest quality rating on that trial. Formula: CR+ = (ΣT_i,max / n) × 100. Range: 0% to 100%. Interpretation: 0% = target label was never rated higher than the distractors; 100% = target label was always rated higher than the distractors.

The QI is a score ranging between 0 and 5, and was computed by taking the mean (across all stimulus ratings) of all quality judgements given to the target (intended) label. A score of 0 was assigned whenever the target label was not selected. The QI therefore reflects the extent to which the target label is perceived as describing the image well. High QI scores indicate that the target label describes the image very well. Conversely, lower QI scores indicate that the target label does not describe the stimulus well. The SI reflects the extent to which the target label is perceived as a good description of the image, over and above distractor states or action labels.
SI was computed by subtracting the mean rating given to selected distractor labels from the rating given to the target label, and taking the mean of these values across all stimulus ratings. SI values range between – 5 and 5. Negative values indicate that the target label received a lower score than the distractor labels taken together. Conversely, positive values signify that the target label received a higher rating compared to distractor labels taken together. The SI+ was obtained by subtracting the highest distractor rating from the rating given to the target label, and taking the mean of these values across all stimulus ratings. SI+ is a score ranging between – 5 and 5, whereby negative values indicate that distractor labels were given higher ratings than the target label, whilst positive values indicate that the target label received a higher rating than the distractor with the highest rating. The SI and SI+ are more conservative scores than the QI, as they take into account the discrepancy between ratings of intended and unintended labels. Values of SI and SI+ close to 0 indicate that the target label is not perceived to be a better description of the stimulus than the distractor labels. The CR consists of the proportion of participants who selected the target label to describe the stimulus, regardless of the quality rating given. CR scores range from 0% to 100%, whereby 0% indicates that the target label was never selected to describe the image, whilst 100% indicates that the target label was always selected to describe the image. The CR+ is the proportion of participants who gave the target label the highest quality rating of all labels. CR+ was calculated by assigning a score of 1 to each rating in which the target label received the highest quality rating; whenever a distractor obtained a quality rating equal to or higher than the target, a score of 0 was assigned.
CR+ scores of 0% indicate that the target label was never rated higher than distractor labels when describing the image. CR+ scores of 100% indicate that the target label always received the highest rating, compared to distractor labels, when describing the image. All five scores are presented for each stimulus in Table 3 in the Appendix. Whole-sample analyses revealed that QI scores were higher for action stimuli (M = 3.62, SD = .55) than for internal state stimuli (M = 3.22, SD = 1.02) [t(421) = – 5.09, p < .001]. Separate ANOVAs were conducted for QI of internal states and QI of control actions, with Stimulus Category (all internal state/action stimulus categories) and Actor Sex (male, female) as IVs. For the internal states, a significant main effect of Stimulus Category [F (17, 184) = 62.97, p < .001, η2 = .73] was found. Cold received the highest QI (M = 4.2, SD = .35). Conversely, the lowest QI was attributed to hunger (M = 1.67, SD = .44) (Fig. 2a). Post hoc t tests were conducted across all pairs of states with Bonferroni corrections and are shown in Fig. 2a. Similarly, the ANOVA for the action stimuli resulted in a significant main effect of Stimulus Category [F (17, 203) = 2.63, p = .009, η2 = .09]. Clapping stimuli had the highest QI (M = 3.8, SD = .59), while the mean QI for twirling was the lowest of the action stimulus set (M = 3.18, SD = .60) (Fig. 2b). Post hoc t tests comparing all pairs of actions, with Bonferroni corrections, are shown in Fig. 2b. Actor Sex did not contribute to variations in QI for either internal states [F (17, 184) = .06, p = .81] or control actions [F (17, 203) = .49, p = .48], and did not interact with Stimulus Category in either internal states [F(17, 184) = 1.75, p = .09] or control actions [F(17, 203) = 1.38, p = .21].
Fig. 2

Distribution of Quality Index (QI) scores across different Stimulus Categories of Internal States (a) and Control Actions (b). The boxplots for each state and action are presented. Individual stimuli are plotted as single data points over the boxplot. Both graphs are presented alongside tables of post hoc t tests showing the mean difference (row - column) for each pair of Internal States (panel a) and Control Actions (panel b). Asterisks denote statistical significance at alpha level of .001 (**) and .05 (*) after Bonferroni corrections. The p value before Bonferroni correction is reported in italics below the mean difference value

SI scores were higher for action stimuli (M = 3.01, SD = .86) than internal state stimuli (M = 1.90, SD = 1.84) [t(421) = – 8.08, p < .001]. Again, separate ANOVAs were conducted for the action and the internal states stimulus sets, with SI as the DV and Stimulus Category and Actor Sex as IVs. For the internal states, the main effect of Stimulus Category was significant [F (17, 184) = 63.197, p < .001, η2 = .73]. Cold had the highest SI (M = 3.71, SD = .50), while SI was lowest for satiety (M = – 1.07, SD = .91) (Fig. 3a). Post hoc t tests for Stimulus Category using Bonferroni corrections are shown in Fig. 3a. There was a significant main effect of Actor Sex [F (17, 184) = 5.04, p = .02, η2 = .03], whereby SI scores were higher for stimuli depicted by female actors (M = 1.99, SD = 1.77) than those portraying male actors (M = 1.81, SD = 1.90). Actor Sex did not interact significantly with Stimulus Category [F (17, 184) = 1.96, p = .054]. For the action stimulus set, a significant main effect of Stimulus Category [F (17, 203) = 4.76, p < .001, η2 = .16] was observed. Beckoning and twirling had the highest (M = 3.36, SD = .58) and lowest (M = 2.21, SD = 1.14) SIs, respectively (Fig. 3b). Post hoc t tests using Bonferroni corrections were conducted on Stimulus Category and are reported in Fig. 3b.
The main effect of Actor Sex [F (17, 203) = .41, p = .52] and the interaction between Actor Sex and Stimulus Category [F (17, 203) = 1.67, p = .11] were non-significant.
Fig. 3

Distribution of Specificity Index (SI) scores across different Stimulus Categories of Internal States (a) and Control Actions (b). The boxplots for each state and action are presented. Individual stimuli are plotted as single data points over the boxplot. Both graphs are presented alongside tables of post hoc t tests showing the mean difference (row - column) for each pair of Internal States (panel a) and Control Actions (panel b). Asterisks denote statistical significance at alpha level of .001 (**) and .05 (*) after Bonferroni corrections. The p value before Bonferroni corrections is reported in italics below the mean difference value

SI+ was significantly higher for action stimuli (M = 2.99, SD = .87) than for internal state stimuli (M = 1.82, SD = 1.89) [t(421) = – 8.21, p < .001]. Separate ANOVAs were conducted for the action and internal state stimulus sets, with Stimulus Category and Actor Sex as IVs and SI+ scores as the DV. A significant main effect of Stimulus Category was found for internal state stimuli [F (17, 184) = 63.797, p < .001, η2 = .735]. Cold and Satiety had the highest (M = 3.68, SD = .51) and lowest (M = – 1.25, SD = .95) SI+, respectively (Fig. 4a). Bonferroni-corrected post hoc t tests were conducted on all the levels of Stimulus Category and are reported in Fig. 4a. A significant main effect of Actor Sex was found [F (17, 184) = 5.35, p < .05, η2 = .03], whereby stimuli depicting female actors (M = 1.93, SD = 1.82) received slightly higher SI+ scores than those depicting male actors (M = 1.73, SD = 1.96). Actor Sex did not interact with Stimulus Category [F (17, 184) = 1.93, p = .06]. The ANOVA for SI+ scores of action stimuli resulted in a significant main effect of Stimulus Category [F (17, 203) = 4.87, p < .001, η2 = .16]. Beckoning was the category to receive the highest SI+ scores (M = 3.34, SD = .58), whilst Twirling received the lowest SI+ scores (M = 2.17, SD = 1.16) (Fig. 4b).
Post hoc t tests for each pair of action categories were conducted, using Bonferroni corrections (Fig. 4b). Actor sex did not contribute to variation in total SI+ scores [F (17, 203) = .38, p = .54] and did not interact with Stimulus Category [F (17, 203) = 1.65, p = .11].
Fig. 4

Distribution of Max-distractor Specificity Index (SI+) scores across different Stimulus Categories of Internal States (a) and Control Actions (b). The boxplots for each state and action are presented. Individual stimuli are plotted as single data points over the boxplot. Both graphs are presented alongside tables of post hoc t tests showing the mean difference (row - column) for each pair of Internal States (panel a) and Control Actions (panel b). Asterisks denote statistical significance at alpha level of .001 (**) and .05 (*) after Bonferroni corrections. The p value before post hoc corrections is reported in italics below the mean difference value

CR scores showed that participants selected the target label to describe action stimuli (M = 86%, SD = 8%) significantly more often than they did to describe internal states (M = 77%, SD = 20%) [t(421) = – 6.10, p < .001]. ANOVAs were computed for CR scores of internal states and control actions separately, with Stimulus Category and Actor Sex as IVs. For the internal states, a significant main effect of Stimulus Category [F (17, 184) = 73.89, p < .001, η2 = .76] was observed. Cold was the state with the highest CR (96%), whilst satiety had the lowest CR (40%) (Fig. 5a). Post hoc t tests using Bonferroni corrections are displayed in Fig. 5a. There was no main effect of Actor Sex [F (17, 184) = .09, p = .93] or interaction between Actor Sex and Stimulus Category [F (17, 184) = 1.08, p = .38]. The ANOVA for the action stimuli resulted in a significant main effect of Stimulus Category [F (17, 203) = 4.03, p < .001, η2 = .14]. Clapping had the highest CR (89%), whereas twirling was the action with the lowest CR (78%) (Fig. 5b). Post hoc t tests with Bonferroni correction across all categories are shown in Fig. 5b. Actor Sex did not contribute to variations of CR [F (17, 203) = 1.25, p = .26] or interact with Stimulus Category [F (17, 203) = .83, p = .58].
Fig. 5

Distribution of Choice Rate (CR) scores across different Stimulus Categories of Internal States (Panel a) and Control Actions (Panel b). The boxplots for each state and action are presented. Individual stimuli are plotted as single data points over the boxplot. Both graphs are presented alongside tables of post-hoc t-tests showing the mean difference (row - column) for each pair of Internal States (Panel a) and Control Actions (Panel b). Asterisks denote statistical significance at alpha level of .001 (**) and .05 (*) after Bonferroni corrections. The p value before post hoc corrections is reported in italics below the mean difference value

Finally, CR+ scores for action stimuli (M = 84%, SD = 10%) were significantly higher than CR+ scores for internal state stimuli (M = 69%, SD = 25%) [t(421) = – 8.56, p < .001]. Once again, separate ANOVAs were conducted for the CR+ scores of action and internal states stimuli, with Stimulus Category and Actor Sex as IVs. For the internal states, a main effect of Stimulus Category was found [F (17, 184) = 61.80, p < .001, η2 = .73]. Cold stimuli received the highest (M = 92%, SD = 6%) CR+ scores, whilst Satiety stimuli had the lowest CR+ scores (M = 30%, SD = 12%) (Fig. 6a). Bonferroni-corrected post hoc t tests across all pairs of states are shown in Fig. 6a. A significant main effect of Actor Sex was found [F (17, 184) = 5.77, p < .05, η2 = .03], whereby internal state stimuli depicting female actors (M = 70%, SD = 24%) received slightly higher CR+ scores than those depicting male actors (M = 67%, SD = 25%). Actor Sex did not interact significantly with Stimulus Category [F (17, 184) = 1.62, p = .12]. The ANOVA for the action stimuli returned a significant main effect of Stimulus Category [F (17, 203) = 6.27, p < .001, η2 = .198], whereby Walking and Twirling received the highest (M = 89%, SD = 8%) and lowest (M = 75%, SD = 14%) CR+ scores, respectively (Fig. 6b).
Post-hoc t-tests with Bonferroni corrections across all pairs of actions are shown in Fig. 6b. The effect of Actor Sex on variations of CR+ scores did not reach statistical significance [F (17, 203) = .41, p = .52]. Likewise, Actor Sex did not interact with Stimulus Category [F (17, 203) = 1.71, p = .10].
Fig. 6

Distribution of High-quality Choice Rate (CR+) scores across different Stimulus Categories of Internal States (a) and Control Actions (b). The boxplots for each state and action are presented. Individual stimuli are plotted as single data points over the boxplot. Both graphs are presented alongside tables of post hoc t tests showing the mean difference (row - column) for each pair of Internal States (panel a) and Control Actions (panel b). Asterisks denote statistical significance at alpha level of .001 (**) and .05 (*) after Bonferroni corrections. The alpha level before post hoc corrections is reported in italics below the mean difference value

To investigate the effect of observer gender on the evaluation of internal state stimuli, separate ANOVAs were conducted for the five recognition indices with Stimulus Category (all the internal states) and Observer Gender (female, male) as factors. The ANOVA with QI as DV did not reveal a main effect of Observer Gender [F(17, 386) = .57, p = .45, η2 = .001]. Moreover, Observer Gender did not interact significantly with Stimulus Category [F(17, 386) = .77, p = .63, η2 = .02]. The ANOVA for SI scores returned a main effect of Stimulus Category [F(17, 386) = 111.41, p < .001, η2 = .698], and a main effect of Observer Gender [F(17, 386) = 10.74, p = .001, η2 = .03], whereby female observers (M = 2.09, SD = 1.83) had higher SI indices than male observers (M = 1.76, SD = 1.90). Observer Gender did not interact significantly with Stimulus Category [F(17, 386) = .57, p = .80, η2 = .01]. The ANOVA for SI+ scores revealed a main effect of Stimulus Category [F(17, 386) = 112.55, p < .001, η2 = .70], and a main effect of Observer Gender [F(17, 386) = 11.25, p = .001, η2 = .03], with female observers (M = 2.03, SD = 1.88) having higher SI+ indices than male observers (M = 1.67, SD = 1.95), but no interaction between Observer Gender and Stimulus Category [F(17, 386) = .54, p = .82, η2 = .01].
The ANOVA for CR scores resulted in a main effect of Stimulus Category [F(17, 386) = 123.91, p < .001, η2 = .72], and a main effect of Observer Gender [F(17, 386) = 7.13, p = .008, η2 = .02], where female observers (M = 95.53, SD = 4.65) had slightly higher CR scores than males (M = 94.87, SD = 4.87). The interaction between the two factors was non-significant [F(17, 386) = .68, p = .708, η2 = .01]. Finally, the ANOVA for CR+ scores returned a main effect of Stimulus Category [F(17, 386) = 108.03, p < .001, η2 = .691], and a main effect of Observer Gender [F(17, 386) = 5.66, p = .02, η2 = .01], with higher CR+ scores in female observers (M = 72.59, SD = 23.70) than male observers (M = 69.34, SD = 24.50), but no interaction between Observer Gender and Stimulus Category [F(17, 386) = .56, p = .812, η2 = .01].
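For readers reusing the database with new rater samples, the five indices in Table 2 can be recomputed directly from raw rating data. The following Python sketch is illustrative only, not the authors' analysis code; the function name (`score_stimulus`) and the data layout (one dict per rater, mapping each label the rater selected to its 1–5 quality rating) are assumptions for the example.

```python
def score_stimulus(trials, target):
    """Compute the five ISSI indices for one stimulus.

    trials: list of dicts, one per rater, mapping each label the rater
    selected to its quality rating (1-5). Unselected labels are absent;
    an unselected target counts as 0, per the QI definition.
    """
    n = len(trials)
    qi = si = si_plus = cr = cr_plus = 0.0
    for ratings in trials:
        t = ratings.get(target, 0)  # 0 if the target label was not selected
        distractors = [r for lab, r in ratings.items() if lab != target]
        mean_d = sum(distractors) / len(distractors) if distractors else 0
        max_d = max(distractors) if distractors else 0
        qi += t                      # QI: mean target rating
        si += t - mean_d             # SI: target minus mean distractor rating
        si_plus += t - max_d         # SI+: target minus highest distractor rating
        cr += 1 if target in ratings else 0          # CR: target selected at all
        cr_plus += 1 if t > max_d and t > 0 else 0   # CR+: ties with a distractor count as 0
    return {
        "QI": qi / n,
        "SI": si / n,
        "SI+": si_plus / n,
        "CR": 100 * cr / n,
        "CR+": 100 * cr_plus / n,
    }
```

A single pass over the trials yields all five scores; note that CR+ requires the target to strictly exceed every selected distractor, matching the rule that a distractor rating equal to the target's scores 0.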

Confusion across stimulus categories

To determine which states/actions were confused with each other, confusion scores were created based on CR and CR+ scores. Confusion matrices were created whereby each row corresponds to the intended state or action portrayed by the actor, and each column represents the proportion of times each state or action label was selected regardless of quality rating (in the CR matrix), or the proportion of times each state or action label was given the highest quality rating (in the CR+ matrix). Among the internal states, some categories were particularly often confused with others: Hunger stimuli were often rated as depicting Pain (CR = 46%; CR+ = 20%) and Nausea (CR = 40%; CR+ = 16%), Satiety stimuli were also rated as depicting Hunger (CR = 39%; CR+ = 25%) and Nausea (CR = 36%; CR+ = 17%), and Nausea stimuli were often rated as depicting Pain (CR = 30%; CR+ = 7%) (Fig. 7a). The confusion matrix for action stimuli, by contrast, revealed lower levels of confusion (i.e. target actions were less often labelled as non-target actions): Clapping stimuli were sometimes labelled as depicting Washing Hands (CR = 22%; CR+ = 10%), Running stimuli were also rated as depicting Walking (CR = 23%; CR+ = 11%), Twirling stimuli were sometimes rated as depicting Jumping (CR = 16%; CR+ = 6%), Waving stimuli were also rated as depicting Beckoning (CR = 14%; CR+ = 6%), and Beckoning stimuli were occasionally labelled as depicting Waving (CR = 10%; CR+ = 4%) (Fig. 7b).
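A CR-style confusion matrix of this kind can be assembled directly from the label selections. The sketch below is a minimal illustration; the function name and data layout (one `(intended_state, selected_labels)` pair per rating event) are assumptions, not taken from the paper.

```python
from collections import defaultdict

def confusion_matrix(responses, labels):
    """Build a CR-style confusion matrix.

    Rows are intended states; each cell gives the percentage of rating
    events for that intended state in which the column label was selected.
    responses: list of (intended_state, selected_labels) pairs, where
    selected_labels is the set of labels the rater chose on that event.
    """
    counts = {row: defaultdict(int) for row in labels}
    totals = defaultdict(int)
    for intended, selected in responses:
        totals[intended] += 1
        for lab in selected:
            counts[intended][lab] += 1
    return {
        row: {col: (100 * counts[row][col] / totals[row]) if totals[row] else 0.0
              for col in labels}
        for row in labels
    }
```

A CR+ matrix would be built the same way, except that for each rating event only the single label that received the highest quality rating is counted.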
Fig. 7

Confusion matrices showing the proportion of the time that each label was used to describe stimuli of each intended state (Choice Rate (CR) matrix) and the proportion of the time that each label was given the highest quality rating to describe stimuli of each intended state (High-quality Choice Rate (CR+) matrix). Confusion matrices are presented separately for Internal States (panel a) and Control Actions (panel b)


Discussion

The current report presents the creation and validation of the ISSI database, a novel stimulus set of 423 static images representing non-affective internal bodily states and control actions. Each stimulus is presented alongside a range of indices from the second stage of validation, representing the quality and specificity of depiction, and the extent to which each stimulus was recognised as the intended state or action. Confusion matrices of internal states and control actions are also included to provide an indication of which states and which actions tend to be confused with each other. The stimuli are freely available to researchers for use in scientific research and can be downloaded from the Insulab website (https://www.insulab.uk). Overall, 77% (range: 40–96%) of participants selected the intended label to describe the internal state stimuli, and 86% (range: 78–89%) of participants selected the intended label to describe action stimuli. When observer gender was considered, female observers gave higher ratings and were more likely to select the intended label for the stimuli compared to male observers. Within the internal state stimulus set, there was high variability between stimulus categories in terms of quality and specificity of depiction, and proportion of participants selecting the target state, with the pattern of results across different indices being relatively consistent. Satiety was the most difficult state to recognise and discriminate from other states, followed by hunger. Hunger and satiety stimuli were given fairly low quality (QI) scores, with the majority being given a mean score below 2 (‘Poor’ on the rating scale), and negative specificity (SI and SI+) scores, indicating that distractor labels were often judged to be better descriptors of the stimulus than the target label.
Similarly, CR and CR+ scores were often under 50%, indicating that the target label was selected to describe the stimulus (CR), or as the best descriptor of the stimulus (CR+), less than half of the time. Other internal state categories, however, were given high quality and specificity ratings, and the intended label was selected frequently. The vast majority of cold, itch, pain, fatigue and nausea stimuli, for example, were given QI scores above 3, positive SI/SI+ scores, and CR/CR+ scores above 70%. While there is therefore variability across internal state categories and individual stimuli, all stimuli rated in the second validation stage have been retained in the final stimulus set, in order for researchers to select stimuli according to their own research requirements. While we would recommend using stimuli with high quality, specificity and choice rate scores where studies require stimuli that have been validated and are recognisable by typical participants as their intended state, the range of recognition scores also allows for the study of ambiguous stimuli, or internal states that are easily confused. Notably, action stimuli were consistently recognised better than internal state stimuli, and there was less variability among different action stimulus categories in terms of quality and specificity ratings, and the extent to which the intended label was selected to describe the stimuli. Variability was also observed across actors, both in terms of quality of depiction and recognisability of the stimuli produced. Individual differences in the ability to produce recognisable non-affective internal states are expected, and the predictors of such differences should be investigated in future research. Previous work on facial expressions of emotion indicates, for example, that autistic individuals produce less typical emotional expressions compared to neurotypical individuals (e.g., Brewer et al., 2016; Langdell, 1981).
Further research is needed to elucidate whether a similar pattern is observed for the expression of interoceptive states. It is likely that internal states were recognised less well than action stimuli due to the associations and similarities between internal states giving rise to greater confusability. In particular, there is an over-representation of gastric internal signals in the current stimulus set (i.e. nausea, hunger, satiety), which could be responsible for lower specificity scores and choice rates for these stimulus categories. Actors frequently expressed these internal states by placing their hands on or around the abdomen, likely making these stimuli difficult to differentiate. Crucially, despite variability in the low-level visual features of the stimuli within state categories, there was consistency across actors’ depictions of states, and visual cues were often in line with those that would be expected based on the location at which states are perceived within the body (e.g., the abdomen). Notably, recognition scores are likely to be dramatically increased if fewer gastric response options are available to participants (e.g., researchers could include nausea, hunger, and satiety under the same umbrella term ‘gastric discomfort’); in the current validation task, the availability of all target labels may have led to more conservative recognition estimates, while in a two-alternative forced choice task where stimuli must be labelled as either cold or satiety, for example, it is likely that participants would perform near ceiling, as the visual cues associated with these states are highly distinct. Indeed, in tasks assessing affective emotion recognition, recognition accuracy is improved by having fewer available response options, or less confusable response options in alternative forced choice tasks. 
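The umbrella-term suggestion above (scoring nausea, hunger, and satiety as a single ‘gastric discomfort’ category) can be illustrated with a simple label remapping; the label names and response pairs are hypothetical.

```python
# Collapse the three gastric states into one umbrella term, as suggested
# for reducing confusability between visually similar depictions.
GASTRIC = {"nausea", "hunger", "satiety"}

def collapse_gastric(label):
    return "gastric discomfort" if label in GASTRIC else label

def recognition_rate(pairs):
    """Proportion of (intended, chosen) pairs that match after collapsing."""
    hits = sum(1 for intended, chosen in pairs
               if collapse_gastric(intended) == collapse_gastric(chosen))
    return hits / len(pairs)

# A hunger stimulus labelled 'satiety' counts as correct under the umbrella
# term, so the estimated recognition rate rises.
pairs = [("hunger", "satiety"), ("hunger", "hunger"), ("cold", "itch")]
print(recognition_rate(pairs))  # 2/3: the gastric confusion is no longer an error
```

Without the remapping, only one of the three pairs would count as correct, which is the sense in which the full label set yields more conservative recognition estimates.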
For example, angry facial expressions were less likely to be labelled as depicting anger in a task where response options included “anger”, “frustration”, and “contempt” than when fewer closely related response options were available (Russell, 1993). Similarly, recognition of happiness expressions (which often shows ceiling effects even in those with difficulties recognising other facial expressions) has been found to be impaired in those with emotion processing impairments (alexithymia) when stimuli depicting pain are included in the recognition task, likely due to painful expressions sharing perceptual characteristics with happy expressions (Brewer et al., 2015). In contrast, action stimulus categories were more distinct from each other in their associated behavioural cues, and therefore less confusable with each other. Naturalness of expression may also have played a role in the disparity of recognition scores among stimulus categories. Although visible behavioural expressions of hunger and satiety, such as rubbing the abdomen (both states) or exhaling heavily (satiety), do occur, they may be less spontaneous than behavioural expressions of other states such as feeling cold (e.g. rubbing one’s arms) or feeling itchy (e.g. scratching one’s skin). This may be due to the behavioural responses to cold and itch serving a purpose to reduce the internal state, and thus being performed more often, rather than serving a more communicative purpose and therefore only being used in social situations, and potentially less frequently. Similarly, actions that are performed with a more communicative purpose may be more frequently accompanied by a verbal description (e.g. stating ‘I’m so hungry’ while rubbing one’s abdomen), reducing the requirement for an observer to recognise the visual signals. 
It is worth noting that facial expressions of affective states can be either spontaneous or posed for communicative purposes, and these tend to differ in their visual features, such as onset time, duration, and amplitude of physical facial movement (Schmidt et al., 2006; Valstar et al., 2006). It is likely that spontaneous and posed/communicative expressions of non-affective internal states also differ, and the extent to which they differ may vary across internal states. It is possible that actors’ depictions of internal states were therefore more recognisable for states where spontaneous and posed expressions of the state are more similar, making the actors’ depiction more ecologically valid. For states which either are infrequently expressed, or for which spontaneous expressions differ greatly from posed expressions, actors’ depictions may have been less recognisable. Notably, the communicative value of individual internal states may also vary across different cultures. Future research is needed to examine cross-cultural influences on the expression and recognition of internal states. Moreover, the expression of certain internal states is likely to be multidimensional, with expressions including a combination of visual (e.g. kinematic), auditory (e.g. vocal) and contextual cues. Recognition of internal states in others may, therefore, be greatly improved by the addition of vocal cues, body movement, or contextual information. When observing an individual rubbing their abdomen, for example, contextual information might be necessary in order to interpret the action accurately as a sign of hunger (e.g. it is lunch time and we are in a queue to buy food), rather than a sign of satiety (e.g. we just ate a large meal). Future research is needed to elucidate whether some states rely more than others on visual cues for their expression, and what type of cues are necessary for their recognition. 
The availability of stimuli depicting states that are easily confused with each other in this stimulus set will make it possible to address these research questions. Notably, research into the perception and recognition of non-affective internal states in others will pose new methodological challenges, in part complementary to those faced when studying the perception of internal states in the self. On one hand, some internal states (e.g. itch, fatigue) are associated with visual cues but are difficult to measure objectively, potentially making these states easier to study in others than in the self. Conversely, some internal states, such as cardiac signal changes, are easy to objectively assess in the individual, but are not accompanied by visual cues, making them difficult to observe in others. Another crucial aspect to consider is the relative role of facial and bodily information in participants’ recognition of the current stimulus set. Facial cues were not obscured from the stimuli in either validation stage, as both facial expressions and postural cues are likely to be important for conveying internal states, and full body postures were deemed to be the most ecologically valid. It is possible, however, that facial and body information are recognisable in isolation, or that the relative contribution of facial and body cues to state recognition varies across internal states. As emotional cues are particularly expressed by the face (Adolphs, 1999; Frith, 2009), and it may be possible to experience affective and non-affective states simultaneously, interference effects from emotional cues may be especially evident when facial cues are present. While we note that stimuli have only been validated with integrated facial and body cues, it is of course possible for future work to investigate recognition from distinct regions of the stimuli, for example by separating or manipulating facial and body cues. 
Crucially, the theoretical distinction between affective (emotional) and interoceptive states is not clear-cut. Here we refer to interoceptive states as internal bodily sensations beyond the affective domain. With this, we do not imply that emotions and interoceptive states are necessarily separate entities. On the contrary, according to a prominent model of emotion perception, interoception is a fundamental component of emotional experience, which derives from sensory and affective experiences in combination with contextual cues (Schachter & Singer, 1962). However, it is common in the literature to find emotion processing and interoception treated as separate components. Similarly, some states, such as pain, seem to be considered as both emotional and interoceptive, with Craig describing pain as a ‘homeostatic emotion’ due to its sensory component alongside a motivational drive to re-establish the body’s homeostasis (Craig, 2003b). This definition could arguably be applied to a number of interoceptive states. Future work is needed to assess whether individuals process affective and interoceptive states in others differently. To this end, the need for stimuli depicting internal sensations beyond the affective domain is all the more critical. Going forward, it is important that categories of internal states are clearly defined, both theoretically and operationally. In conclusion, the ISSI stimulus set will allow, for the first time, the investigation of humans’ ability to recognise non-affective internal states in others. There are opportunities for investigating this basic process, for example the role of contextual cues and the contribution of facial and body postural cues to recognition, as well as for investigating correlates of individual differences in this ability, the genetic and neural basis of recognition, developmental trajectories, and the relationship between psychopathology and recognition abilities. 
Less recognisable stimuli have not been eliminated from the database, as researchers are encouraged to select stimuli based on their specific needs and research questions. If the aim of the study is that of assessing the accuracy of internal state recognition, then we advise researchers to select stimuli with higher quality, specificity, and choice rates, as these offer greater validity. The availability of more ambiguous stimuli, however, will allow investigation of individual differences in interpretation, and the biasing role of additional cues, for example. Researchers using the ISSI stimuli are encouraged to report their stimulus selection process transparently, and may utilise the validation statistics in the ISSI database to do this.
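A selection step of the kind recommended above might be sketched as a simple threshold filter over the validation indices. The record format and field names below are illustrative assumptions, and the thresholds merely echo the values discussed in this report (QI above 3, positive SI, CR above 70%); researchers should set cut-offs to suit their own designs.

```python
# Hypothetical stimulus records carrying the validation indices.
stimuli = [
    {"id": "itch_01", "QI": 3.4, "SI": 0.21, "CR": 0.81},
    {"id": "satiety_03", "QI": 1.8, "SI": -0.10, "CR": 0.42},
]

def well_validated(s, min_qi=3.0, min_cr=0.70):
    """Keep stimuli with high quality, positive specificity, and high choice rate."""
    return s["QI"] >= min_qi and s["SI"] > 0 and s["CR"] >= min_cr

selected = [s["id"] for s in stimuli if well_validated(s)]
print(selected)  # only the itch stimulus passes all three thresholds
```

Reporting the thresholds applied (and the resulting stimulus list) is one straightforward way to document the selection process transparently.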
References (32 in total)

1. Barnaby D Dunn, Hannah C Galton, Ruth Morgan, Davy Evans, Clare Oliver, Marcel Meyer, Rhodri Cusack, Andrew D Lawrence, Tim Dalgleish. Listening to your heart: How interoception shapes emotion experience and intuitive decision making. Psychol Sci, 2010.
2. Lauren N Forrest, April R Smith, Robert D White, Thomas E Joiner. (Dis)connected: An examination of interoception in individuals with suicidality. J Abnorm Psychol, 2015.
3. Chris Frith. Role of facial expressions in social interactions. Philos Trans R Soc Lond B Biol Sci, 2009.
4. Hugo D Critchley, Sarah N Garfinkel. Interoception and emotion (Review). Curr Opin Psychol, 2017.
5. Sarah N Garfinkel, Hugo D Critchley. Interoception, emotion and brain: New insights link internal physiology to social behaviour. Commentary on "Anterior insular cortex mediates bodily sensibility and social anxiety" by Terasawa et al. (2012). Soc Cogn Affect Neurosci, 2013.
6. Olga Pollatos, Anne-Lene Kurz, Jessica Albrecht, Tatjana Schreder, Anna Maria Kleemann, Veronika Schöpf, Rainer Kopietz, Martin Wiesmann, Rainer Schandry. Reduced perception of bodily signals in anorexia nervosa. Eat Behav, 2008.
7. A D Craig. A new view of pain as a homeostatic emotion (Review). Trends Neurosci, 2003.
8. Megan Klabunde, Dean T Acheson, Kerri N Boutelle, Scott C Matthews, Walter H Kaye. Interoceptive sensitivity deficits in women recovered from bulimia nervosa. Eat Behav, 2013.
9. Ekaterina Volkova, Stephan de la Rosa, Heinrich H Bülthoff, Betty Mohler. The MPI emotional body expressions database for narrative scenarios. PLoS One, 2014.
10. Rebecca Brewer, Federica Biotti, Caroline Catmur, Clare Press, Francesca Happé, Richard Cook, Geoffrey Bird. Can neurotypical individuals read autistic facial expressions? Atypical production of emotional facial expressions in autism spectrum disorders. Autism Res, 2015.
