S A Cassidy, B Stenger, L Van Dongen, K Yanagisawa, R Anderson, V Wan, S Baron-Cohen, R Cipolla.
Abstract
Adults with Autism Spectrum Conditions (ASC) experience marked difficulties in recognising the emotions of others and responding appropriately. The clinical characteristics of ASC mean that face-to-face or group interventions may not be appropriate for this clinical group. This article explores the potential of a new interactive technology, converting text to emotionally expressive speech, to improve emotion processing ability and attention to faces in adults with ASC. We demonstrate a method for generating a near-videorealistic avatar (XpressiveTalk), which can produce a video of a face uttering input text in a large variety of emotional tones. We then demonstrate that general population adults can correctly recognise the emotions portrayed by XpressiveTalk. Adults with ASC are significantly less accurate than controls, but still above chance levels, when inferring emotions from XpressiveTalk. Both groups are significantly more accurate when inferring sad emotions from XpressiveTalk than from the original actress, and rate these expressions as significantly more preferred and realistic. The potential applications of XpressiveTalk as an assistive technology for adults with ASC are discussed.
Keywords: Assistive technology; Autism spectrum conditions; Emotion recognition; Intervention; Social cognition
Year: 2016 PMID: 27375348 PMCID: PMC4913554 DOI: 10.1016/j.cviu.2015.08.011
Source DB: PubMed Journal: Comput Vis Image Underst ISSN: 1077-3142 Impact factor: 3.876
Fig. 1. Cluster adaptive training (CAT). Each cluster is represented by a decision tree and defines a basis in expression space. Given a position in this expression space, the properties of the HMMs to use for synthesis can be found as a linear sum of the cluster properties.
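The linear sum described in this caption can be written out as a short sketch. The notation below is an assumption introduced for illustration and is not taken from the caption: $P$ is the number of clusters, $\lambda_i$ is the $i$-th coordinate of the position in expression space, and $\mu_{c(m,i)}$ is the mean stored at the leaf of cluster $i$'s decision tree that HMM component $m$ falls into.

$$
\mu_m = \sum_{i=1}^{P} \lambda_i \, \mu_{c(m,i)}
$$

Under this reading, moving through expression space changes only the weights $\lambda_i$, while the cluster means stay fixed.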
Fig. 2. Error histograms for three iterations of the model building process. Errors decrease with each new iteration of the model.
Fig. 3. Active Appearance Model. The shape mesh is shown in (a). Example synthesis results for (b) neutral, (c) tender, (d) happy, (e) sad, (f) afraid and (g) angry.
Participant characteristics. Autism Quotient (AQ) scores are missing for three participants in the typical control group.
| | ASC group | Control group |
|---|---|---|
| Age (years), mean ± S.D. (range) | 40.9 ± 13.2 (19–63) | 43.7 ± 14.8 (16–63) |
| AQ, mean ± S.D. (range) | 40.4 ± 6.2 (19–49) | 17.8 ± 10.4 (3–42) |
Fig. 4. Screenshot of the interface for synthesising with XpressiveTalk. The interface allows for inputting text and setting the values of the expression parameters which are used to create the animation of the talking avatar.
Confusion matrices showing the percentage of emotion inferences for real faces and XpressiveTalk in the typical group.
| Emotion response (rows) / correct emotion (columns) | Real face: Happy | Real face: Sad | Real face: Angry | Real face: Afraid | Real face: Neutral | XpressiveTalk: Happy | XpressiveTalk: Sad | XpressiveTalk: Angry | XpressiveTalk: Afraid | XpressiveTalk: Neutral |
|---|---|---|---|---|---|---|---|---|---|---|
| Happy | 87.2 | 0.0 | 0.0 | 0.0 | 1.9 | 66.0 | 0.0 | 1.3 | 0.0 | 1.9 |
| Sad | 0.0 | 74.4 | 0.0 | 5.8 | 3.2 | 0.0 | 85.9 | 0.6 | 10.9 | 0.0 |
| Angry | 1.3 | 0.0 | 94.9 | 2.6 | 1.9 | 1.9 | 0.0 | 64.7 | 1.9 | 3.2 |
| Afraid | 0.6 | 22.4 | 1.9 | 89.1 | 1.9 | 15.4 | 12.2 | 15.4 | 85.9 | 0.0 |
| Neutral | 10.9 | 3.2 | 3.2 | 2.6 | 91.0 | 16.7 | 1.9 | 17.9 | 1.3 | 94.9 |
Confusion matrices showing the percentage of emotion inferences for real faces and XpressiveTalk in the ASC group.
| Emotion response (rows) / correct emotion (columns) | Real face: Happy | Real face: Sad | Real face: Angry | Real face: Afraid | Real face: Neutral | XpressiveTalk: Happy | XpressiveTalk: Sad | XpressiveTalk: Angry | XpressiveTalk: Afraid | XpressiveTalk: Neutral |
|---|---|---|---|---|---|---|---|---|---|---|
| Happy | 77.5 | 0.0 | 1.9 | 0.0 | 2.5 | 43.8 | 0.0 | 2.5 | 0.0 | 6.9 |
| Sad | 0.0 | 60.0 | 0.0 | 13.8 | 4.4 | 5.0 | 79.4 | 2.5 | 11.3 | 3.8 |
| Angry | 4.4 | 1.3 | 86.3 | 5.6 | 2.5 | 1.3 | 0.0 | 53.1 | 6.3 | 5.0 |
| Afraid | 2.5 | 20.6 | 2.5 | 68.8 | 3.1 | 14.4 | 13.8 | 19.4 | 60.0 | 0.6 |
| Neutral | 15.6 | 18.1 | 9.4 | 11.9 | 87.5 | 35.6 | 6.9 | 22.5 | 22.5 | 83.8 |
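To make the diagonal of the two group-level confusion matrices easier to compare, the following is a minimal sketch in Python that copies the percent-correct entries from the tables above and computes, per emotion, the difference between XpressiveTalk and the real face. The names `ACCURACY`, `CHANCE`, `gain_from_avatar` and `above_chance` are hypothetical helpers introduced here. The 20% chance level is an assumption based on five response options, and the comparison is purely descriptive; it is not the significance test reported in the abstract.

```python
EMOTIONS = ["Happy", "Sad", "Angry", "Afraid", "Neutral"]

# Diagonal entries (percent correct) copied from the typical-group and
# ASC-group confusion matrices above, in the order given by EMOTIONS.
ACCURACY = {
    ("Typical", "Real face"): [87.2, 74.4, 94.9, 89.1, 91.0],
    ("Typical", "XpressiveTalk"): [66.0, 85.9, 64.7, 85.9, 94.9],
    ("ASC", "Real face"): [77.5, 60.0, 86.3, 68.8, 87.5],
    ("ASC", "XpressiveTalk"): [43.8, 79.4, 53.1, 60.0, 83.8],
}

# Assumption: five equally likely response options, so chance is 20%.
CHANCE = 100.0 / len(EMOTIONS)


def gain_from_avatar(group):
    """Percentage-point change in accuracy (XpressiveTalk minus real face)."""
    real = ACCURACY[(group, "Real face")]
    synth = ACCURACY[(group, "XpressiveTalk")]
    return {e: round(s - r, 1) for e, r, s in zip(EMOTIONS, real, synth)}


def above_chance(group, condition):
    """True if every emotion's raw accuracy exceeds the assumed chance level."""
    return all(acc > CHANCE for acc in ACCURACY[(group, condition)])


if __name__ == "__main__":
    for group in ("Typical", "ASC"):
        print(group, gain_from_avatar(group))
        print(group, "XpressiveTalk above chance:",
              above_chance(group, "XpressiveTalk"))
```

On these figures, sad is the only emotion for which both groups are more accurate with XpressiveTalk than with the real face, which matches the pattern described in the abstract.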
Confusion matrices showing the percentage of emotion inferences for real faces and XpressiveTalk (typical and ASC groups combined).
| Emotion response (rows) / correct emotion (columns) | Real face: Happy | Real face: Sad | Real face: Angry | Real face: Afraid | Real face: Neutral | XpressiveTalk: Happy | XpressiveTalk: Sad | XpressiveTalk: Angry | XpressiveTalk: Afraid | XpressiveTalk: Neutral |
|---|---|---|---|---|---|---|---|---|---|---|
| Happy | 82.3 | 0.0 | 0.9 | 0.0 | 2.2 | 54.7 | 0.0 | 1.9 | 0.0 | 4.4 |
| Sad | 0.0 | 67.1 | 0.0 | 9.8 | 3.8 | 2.5 | 82.6 | 1.6 | 11.1 | 1.9 |
| Angry | 2.8 | 0.6 | 90.5 | 4.1 | 2.2 | 1.6 | 0.0 | 58.9 | 4.1 | 4.1 |
| Afraid | 1.6 | 21.5 | 2.2 | 78.8 | 2.5 | 14.9 | 13.0 | 17.4 | 72.8 | 0.3 |
| Neutral | 13.3 | 10.8 | 6.3 | 7.3 | 89.2 | 26.3 | 4.4 | 20.3 | 12.0 | 89.2 |
Preference rating for each emotion in the ASC and typical control group, in the real face and XpressiveTalk conditions.
| Group | Real face: Happy | Real face: Sad | Real face: Angry | Real face: Afraid | Real face: Neutral | XpressiveTalk: Happy | XpressiveTalk: Sad | XpressiveTalk: Angry | XpressiveTalk: Afraid | XpressiveTalk: Neutral |
|---|---|---|---|---|---|---|---|---|---|---|
| ASC | 44.6 | 22.8 | 33.3 | 39.2 | 34.9 | 28.8 | 39.3 | 24.7 | 28.7 | 43.9 |
| Typical control | 58.1 | 34.0 | 41.4 | 46.1 | 44.8 | 40.1 | 49.2 | 32.3 | 38.4 | 57.1 |
| Total | 51.3 | 28.4 | 37.3 | 42.6 | 39.8 | 34.4 | 44.2 | 28.5 | 33.5 | 50.4 |
Realism rating for each emotion in the ASC and typical control group, in the real face and XpressiveTalk conditions.
| Group | Real face: Happy | Real face: Sad | Real face: Angry | Real face: Afraid | Real face: Neutral | XpressiveTalk: Happy | XpressiveTalk: Sad | XpressiveTalk: Angry | XpressiveTalk: Afraid | XpressiveTalk: Neutral |
|---|---|---|---|---|---|---|---|---|---|---|
| ASC | 61.6 | 36.3 | 64.0 | 47.5 | 32.2 | 32.4 | 63.9 | 36.8 | 40.3 | 62.6 |
| Typical control | 69.4 | 44.0 | 70.0 | 53.4 | 37.6 | 38.7 | 70.7 | 40.2 | 50.3 | 73.3 |
| Total | 65.5 | 40.1 | 66.9 | 50.4 | 34.9 | 35.5 | 67.2 | 38.5 | 45.2 | 67.9 |