Isabelle Hupont, Songül Tolan, Hatice Gunes, Emilia Gómez.
Abstract
This work focuses on facial processing, which refers to artificial intelligence (AI) systems that take facial images or videos as input data and perform some AI-driven processing to obtain higher-level information (e.g. a person's identity, emotions, demographic attributes) or newly generated imagery (e.g. with modified facial attributes). Facial processing tasks, such as face detection, face identification, facial expression recognition or facial attribute manipulation, are generally studied as separate research fields and without considering a particular scenario, context of use or intended purpose. This paper studies the field of facial processing in a holistic manner. It establishes the landscape of key computational tasks, applications and industrial players in the field in order to identify the 60 most relevant applications adopted for real-world uses. These applications are analysed in the context of the new proposal of the European Commission for harmonised rules on AI (the AI Act) and the 7 requirements for Trustworthy AI defined by the European High Level Expert Group on AI. More particularly, we assess the risk level conveyed by each application according to the AI Act and reflect on current research, technical and societal challenges towards trustworthy facial processing systems.
Year: 2022 PMID: 35739185 PMCID: PMC9223252 DOI: 10.1038/s41598-022-14981-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Definitions as provided in the European Commission’s AI Act proposal that are particularly relevant for categorising facial processing systems and applications. Note that the AI Act is currently under discussion with European co-legislators (as of June 2022) and these definitions might be subject to change.
| Concept | Definition as in the AI Act | Article |
|---|---|---|
| Biometric data | Personal data resulting from specific technical processing relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data. | Article 3(33) |
| Emotion recognition system | AI system for the purpose of identifying or inferring emotions or intentions of natural persons on the basis of their biometric data. | Article 3(34) |
| Biometric categorisation system | AI system for the purpose of assigning natural persons to specific categories, such as sex, age, hair colour, eye colour, tattoos, ethnic origin or sexual or political orientation, on the basis of their biometric data. | Article 3(35) |
| Remote biometric identification system | AI system for the purpose of identifying natural persons at a distance through the comparison of a person’s biometric data with the biometric data contained in a reference database, and without prior knowledge of the user of the AI system whether the person will be present and can be identified. | Article 3(36) |
| “Real-time” remote biometric identification system | Remote biometric identification system whereby the capturing of biometric data, the comparison and the identification all occur without a significant delay. This comprises not only instant identification, but also limited short delays in order to avoid circumvention. | Article 3(37) |
| “Post” remote biometric identification system | Remote biometric identification system other than a “real-time” remote biometric identification system. | Article 3(38) |
Figure 1. Objective of this work: establishing the landscape of facial processing tasks and applications in the context of the 7 requirements for Trustworthy AI and the new European AI Act. The requirements for Trustworthy AI are the 7 pillars upon which facial processing tasks must be built. A facial processing application is a real-world use case that utilises one or more facial processing tasks, with a particular intended purpose and in a concrete context of use; such applications are the object of the AI Act. Face drawings are courtesy of Pixabay (https://pixabay.com).
Description of the most relevant facial processing tasks in AI research. Each task is assigned an acronym which is used as reference throughout the paper.
| Task | Acronym | Description |
|---|---|---|
| Face detection | FD | Determines the presence of faces in an image and, if present, returns the location and extent of each face. |
| Facial landmark extraction | FLE | Locates facial salient features, such as points around the eyes, nose and mouth. |
| Face tracking | FT | Tracks (i.e. follows) the position of each face appearing in a video, from the point it enters until the point it leaves the scene. |
| Face identification | FI | Carries out a one-to-many (1:N) query for a “live” detected face against a database of known faces (e.g. a blacklist of N persons). It involves the extraction of a biometric template for each detected face, i.e. a small-size feature vector containing the most relevant facial information. |
| Face verification | FV | Also called face authentication, it performs a one-to-one (1:1) query for a “live” detected face against a reference facial image of a known person. |
| Kinship verification | KV | Aims at finding out whether or not there is a kin (i.e. family) relationship between given persons by analysing their facial images. |
| Face spoofing detection | FSD | It is also known as |
| Facial expression recognition | FER | “Facial expression recognition” and “facial emotion recognition” have been used interchangeably in the literature. The FER task aims at automatically detecting expressions of emotion from a person’s facial image or video. |
| Action Unit detection | AU | Action Units (AUs) encode movements of facial muscles and their intensity according to the Facial Action Coding System (FACS). |
| Facial attribute estimation | FAE | Recognises whether certain attributes are present in given facial images. |
| Facial attribute manipulation | FAM | Synthesises or removes desired attributes from the original facial image. |
| Automatic lip reading | ALR | Decodes speech (spoken words) exclusively by analysing facial (lip/mouth region) images, i.e. mimicking the human capability to perform lip reading. |
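The difference between the one-to-many query of FI and the one-to-one query of FV can be illustrated with a minimal sketch. It assumes that biometric templates are plain feature vectors compared by cosine similarity against a fixed threshold; the function names, the threshold value and the toy vectors are hypothetical and are not taken from the paper, and real systems rely on learned embeddings and calibrated thresholds.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(probe, reference, threshold=0.6):
    """FV (1:1): does the probe template match one claimed reference template?"""
    return cosine_similarity(probe, reference) >= threshold

def identify(probe, gallery, threshold=0.6):
    """FI (1:N): return the best-matching identity in the gallery, or None.

    `gallery` maps an identity label to its stored template. The threshold
    (an assumed value) rejects probes that match nobody well enough.
    """
    best_id, best_score = None, -1.0
    for identity, template in gallery.items():
        score = cosine_similarity(probe, template)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id if best_score >= threshold else None

# Toy 3-dimensional templates (illustrative values only).
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
probe = [0.9, 0.1, 0.0]

match = identify(probe, gallery)            # 1:N search over the gallery
claimed = verify(probe, gallery["bob"])     # 1:1 check against one identity
```

In this sketch FV answers a yes/no question about a single claimed identity, whereas FI searches the whole gallery and may return no match at all.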
Figure 2. Inputs, outputs and most commonly used computational pipelines for the facial processing tasks studied in this work, identified by their corresponding acronym.
Areas considered for the assessment of facial processing applications. The top eight rows correspond to the “high-risk” areas mentioned in the AI Act, under Annex III. The number of use cases related to each area is also shown, per type of system (BI Biometric Identification, BC Biometric Categorisation, ER Emotion Recognition, and OT Other) and in total. The most frequent application area is in bold and the second most frequent is underlined.
| Code | Area | BI | BC | ER | OT | Total |
|---|---|---|---|---|---|---|
| BIC | Biometric identification and categorisation of natural persons | 0 | 0 | |||
| MCI | Management and operation of critical infrastructure | 8 | 0 | 0 | 1 | 9 |
| EDU | Education and vocational training | 3 | 1 | 3 | 0 | 7 |
| EMP | Employment, workers management and access to self-employment | 5 | 1 | 1 | 0 | 7 |
| SER | Access to and enjoyment of essential private services and public services benefits | 2 | 1 | 0 | 0 | 3 |
| LE | Law enforcement | 2 | 2 | 24 | ||
| MIG | Migration, asylum and border control management | 2 | 1 | 1 | 0 | 4 |
| JUS | Administration of justice and democratic processes | 1 | 1 | 1 | 3 | 6 |
| ENT | Entertainment and leisure | 7 | 2 | |||
| MKT | Marketing and retail | 4 | 2 | 15 | ||
| CUL | Culture, art and heritage | 0 | 0 | 3 | 1 | 4 |
| CLI | Clinical use in medicine and healthcare | 6 | 2 | 4 | 20 | |
| FIN | Finances and banking | 4 | 1 | 0 | 0 | 5 |
| SOC | Social assistance | 1 | 1 | 2 | 2 | 6 |
| VSU | Video-surveillance for security | 9 | 2 | 1 | 6 | 18 |
| TRA | Transportation and mobility | 4 | 1 | 2 | 0 | 7 |
| TOU | Tourism, hotels and restaurants | 3 | 2 | 1 | 1 | 7 |
| IND | Industry and logistics | 4 | 0 | 0 | 0 | 4 |
| POL | Politics | 2 | 0 | 1 | 1 | 4 |
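As a quick arithmetic check on the table above, the per-type counts can be summed and compared with the reported totals. The sketch below is a minimal, hypothetical helper (the names `ROWS`, `inconsistent_rows` and `rank_by_total` are not from the paper), and it deliberately includes only the rows for which all five counts are present in the table as reproduced here.

```python
# (code, BI, BC, ER, OT, Total), copied from rows with all five counts present.
ROWS = [
    ("MCI", 8, 0, 0, 1, 9),
    ("EDU", 3, 1, 3, 0, 7),
    ("EMP", 5, 1, 1, 0, 7),
    ("SER", 2, 1, 0, 0, 3),
    ("MIG", 2, 1, 1, 0, 4),
    ("JUS", 1, 1, 1, 3, 6),
    ("CUL", 0, 0, 3, 1, 4),
    ("FIN", 4, 1, 0, 0, 5),
    ("SOC", 1, 1, 2, 2, 6),
    ("VSU", 9, 2, 1, 6, 18),
    ("TRA", 4, 1, 2, 0, 7),
    ("TOU", 3, 2, 1, 1, 7),
    ("IND", 4, 0, 0, 0, 4),
    ("POL", 2, 0, 1, 1, 4),
]

def inconsistent_rows(rows):
    """Return the area codes whose per-type counts do not add up to the total."""
    return [code for code, bi, bc, er, ot, total in rows
            if bi + bc + er + ot != total]

def rank_by_total(rows):
    """Return (code, total) pairs sorted by total, most frequent area first."""
    return sorted(((row[0], row[5]) for row in rows),
                  key=lambda pair: pair[1], reverse=True)

problems = inconsistent_rows(ROWS)
ranking = rank_by_total(ROWS)
```

Because rows with missing cells are excluded, the ranking covers only this subset and does not reproduce the bold and underlined entries mentioned in the caption.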
Details on the 60 identified use cases. Application identifier (ID) starts with BI for Biometric Identification; BC for Biometric Categorisation; ER for Emotion Recognition; and OT for Others. BI applications marked with * and ** indicate “non-remote” and “post” BI, respectively. Applications marked with † could possibly be “high-risk” when the AI system is a safety component or part of a medical device or machine. The risk levels conveyed by each use case according to the AI Act are marked with coloured circles as follows: “unacceptable risk” or prohibited practice; “high-risk”; “transparency” risk; “minimal” risk. Some use cases may exceptionally entail two different risk levels, depending on their application area, because of exceptions stated in the legal text. In these cases, we have underlined the exception area in its corresponding colour. Please note that risk labels have been assigned by the authors based on their own interpretation of the AI Act. At the time of writing this paper (June 2022), the AI Act is under discussion with the European co-legislators and the assignment of risk levels might be subject to change in the future.
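The labelling scheme described in this caption can be represented as a small data structure. The sketch below is illustrative only: it assumes that an application identifier consists of the type prefix, a number and an optional `*` or `**` marker (for example `BI7**`), which is a hypothetical format since the caption does not spell out the exact identifier syntax. The class and function names are likewise assumptions.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class RiskLevel(Enum):
    """The four AI Act risk levels named in the caption."""
    UNACCEPTABLE = "unacceptable risk"
    HIGH = "high-risk"
    TRANSPARENCY = "transparency risk"
    MINIMAL = "minimal risk"

SYSTEM_TYPES = {
    "BI": "Biometric Identification",
    "BC": "Biometric Categorisation",
    "ER": "Emotion Recognition",
    "OT": "Others",
}
# Markers defined in the caption for BI applications only.
BI_MARKERS = {"*": "non-remote", "**": "post"}

# Assumed identifier layout: prefix, digits, optional one or two asterisks.
_ID_PATTERN = re.compile(r"^(BI|BC|ER|OT)(\d+)(\*{0,2})$")

@dataclass(frozen=True)
class ParsedId:
    system_type: str
    number: int
    bi_mode: Optional[str]  # "non-remote", "post", or None when unmarked

def parse_application_id(app_id: str) -> ParsedId:
    """Split a (hypothetically formatted) application ID into its parts."""
    match = _ID_PATTERN.match(app_id)
    if match is None:
        raise ValueError(f"unrecognised application ID: {app_id!r}")
    prefix, number, stars = match.groups()
    if stars and prefix != "BI":
        raise ValueError("the * and ** markers apply to BI applications only")
    return ParsedId(SYSTEM_TYPES[prefix], int(number), BI_MARKERS.get(stars))

# A use case may entail two risk levels, so a set of levels is a natural fit.
example_levels = frozenset({RiskLevel.HIGH, RiskLevel.UNACCEPTABLE})
```

Storing risk levels as a set rather than a single value reflects the caption's remark that some use cases entail two different levels depending on the application area.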
Summary of the main challenges and research needs towards trustworthy facial processing identified in this work.
| Requirement for Trustworthy AI | Challenge | Research needs |
|---|---|---|
| 1. Human agency and oversight | The | Investigate on new ways of providing users with the most |
| | Too general accuracy-centered metrics are provided to the users as a proxy to systems’ performance, which might cause | Design |
| 2. Technical robustness and safety | Existing | Improve the quality and annotations of existing facial datasets, in the pursuit of AI Act’s |
| | Current | Research on |
| | Facial processing systems are | Increase |
| 3. Privacy and data governance | Existing | Develop |
| | The training of facial processing systems requires compiling and | Investigate on |
| | New forms of | Research on |
| 4. Transparency | The | Dataset creators to |
| | Current facial processing systems are commonly used as | Investigate |
| 5. Diversity, non-discrimination and fairness | Facial datasets are | Create |
| | Most popular evaluation benchmarks do not provide protocols to | Create |
| 6. Societal and environmental well-being | Facial processing systems are becoming increasingly distributed and computationally complex, and their deployment might entail a | Research on |
| | Largest facial | In order to give the opportunity to smaller companies and institutions to create competitive and innovative facial processing products, establish |
| 7. Accountability | There are | State the allowed uses and conditions of distribution |