Fang Tian, Hailun Xie, Yiying Song, Siyuan Hu, Jia Liu.
Abstract
The face inversion effect (FIE) is a behavioral marker of face-specific processing whereby the recognition of inverted faces is disproportionately disrupted relative to that of inverted non-face objects. One hypothesis is that while upright faces are represented by a face-specific mechanism, inverted faces are processed as objects. However, evidence from neuroimaging studies is inconclusive, possibly because the face system, such as the fusiform face area, interacts with the object system, and therefore observations from the face system may indirectly reflect influences from the object system. Here we examined the FIE in an artificial face system, the visual geometry group network-face (VGG-Face), a deep convolutional neural network (DCNN) specialized for identifying faces. In line with neuroimaging studies on humans, a stronger FIE was found in VGG-Face than in a DCNN pretrained to process objects. Critically, further classification error analysis revealed that VGG-Face miscategorized inverted faces as objects behaviorally, and analysis of internal representations revealed that VGG-Face represented inverted faces in a fashion similar to objects. In short, our study supports the hypothesis that inverted faces are represented as objects in a pure face system.
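For readers who want a concrete sense of the behavioral measure, the following is a minimal sketch of how a face inversion effect could be quantified for any image classifier. It assumes PyTorch-style models and pre-batched image tensors; the function names, the use of torchvision's vflip, and top-1 accuracy as the metric are illustrative assumptions, not the authors' code.

import torch
import torchvision.transforms.functional as TF

def top1_accuracy(model, images, labels):
    # Fraction of images whose highest-scoring class matches the ground-truth label.
    model.eval()
    with torch.no_grad():
        preds = model(images).argmax(dim=1)
    return (preds == labels).float().mean().item()

def inversion_effect(model, images, labels):
    # FIE for one batch: accuracy on upright images minus accuracy on the same
    # images flipped top-to-bottom (i.e., inverted).
    upright_acc = top1_accuracy(model, images, labels)
    inverted_images = TF.vflip(images)  # flips along the height dimension for the whole batch
    inverted_acc = top1_accuracy(model, inverted_images, labels)
    return upright_acc - inverted_acc

Under these assumptions, comparing the returned score for a face-trained network (e.g., VGG-Face weights obtained from a third-party release) against an object-trained network such as torchvision's AlexNet would mirror the contrast described in the abstract.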
Keywords: AlexNet; VGG-Face; deep convolutional neural network; face inversion effect; face system
Year: 2022 PMID: 35615057 PMCID: PMC9124772 DOI: 10.3389/fncom.2022.854218
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 3.387
Figure 1. Example stimuli in our study. (A) Top, upright faces; bottom, inverted faces. (B) Top, upright objects; bottom, inverted objects.
Figure 2. Recognition performance and representations of VGG-Face. (A) Recognition accuracy of VGG-Face for upright and inverted faces and objects. Error bars denote the standard error of the mean across the 30 groups of images in each condition. (B) Classification confusion matrix of VGG-Face. The percentages in the matrix denote the classification errors of VGG-Face in each condition. (C) Representational similarity matrices of the three FC layers in VGG-Face. Colors in the matrix indicate correlation values between activation patterns of different stimulus identities, with cool colors indicating low correlation and warm colors indicating high correlation.
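As a hedged illustration of how a representational similarity matrix like the one in panel (C) might be computed, the sketch below correlates a fully connected layer's activation patterns across stimuli. The forward-hook extraction and the choice of Pearson correlation via np.corrcoef are assumptions about implementation, not details taken from the paper.

import numpy as np
import torch

def layer_rsm(model, layer, images):
    # Returns an (N, N) matrix of Pearson correlations between the given layer's
    # activation patterns for N stimuli.
    activations = []
    handle = layer.register_forward_hook(
        lambda module, inp, out: activations.append(out.detach().flatten(1))
    )
    model.eval()
    with torch.no_grad():
        model(images)  # the forward pass fills `activations` via the hook
    handle.remove()
    patterns = torch.cat(activations).cpu().numpy()  # shape (N, n_features)
    return np.corrcoef(patterns)  # row-wise correlations -> (N, N) matrix

In a torchvision-style VGG-16 the three FC layers sit at model.classifier[0], model.classifier[3], and model.classifier[6], so calling layer_rsm once per layer and plotting each matrix with a diverging colormap would yield the cool-to-warm layout described in the caption; the paper's VGG-Face implementation may index its layers differently.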
Figure 3. Recognition performance and representations of AlexNet. (A) Recognition accuracy of AlexNet for upright and inverted faces and objects. Error bars denote the standard error of the mean across the 30 groups of images in each condition. (B) Classification confusion matrix of AlexNet. The percentages in the matrix denote the classification errors of AlexNet in each condition. (C) Representational similarity matrices of the three FC layers in AlexNet. Colors in the matrix indicate correlation values between activation patterns of different stimulus identities, with cool colors indicating low correlation and warm colors indicating high correlation.
Figure 4. Recognition performance of VGG-16 pretrained for object classification and AlexNet pretrained for face recognition. (A) Recognition accuracy of VGG-16 pretrained for object classification, for upright and inverted faces and objects. (B) Recognition accuracy of AlexNet pretrained for face recognition, for upright and inverted faces and objects. Error bars denote the standard error of the mean across the 30 groups of images in each condition.