Yiling Li, Yi Lin, Hongwei Ding, Chunbo Li.
Abstract
BACKGROUND: The use of clinical databases in the study of mental disorders is essential to the diagnosis and treatment of patients with mental illness. Whereas text corpora capture only the content of what is said, speech corpora also capture tone, emotion, rhythm and many other signals beyond content. Hence, the design and development of speech corpora for patients with mental disorders is increasingly important. AIM: This review aims to identify the existing speech corpora for mental disorders from online databases and peer-reviewed journals in order to demonstrate both achievements and challenges in this area.
Keywords: mental disorder; review; speech database; systematic
Year: 2019 PMID: 31423472 PMCID: PMC6677935 DOI: 10.1136/gpsych-2018-100022
Source DB: PubMed Journal: Gen Psychiatr ISSN: 2517-729X
Search terms in Cochrane Library
| Operator | Search range | Keywords |
| | Title, Abstract, Keyword | Mental disorder |
| AND | Title, Abstract, Keyword | Speech |
| AND | All text | Database |
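The three rows above combine into a single Boolean query. The following is a minimal sketch of that composition, assuming a hypothetical `(operator, field, keyword)` representation of each row; the `"keyword" [field]` notation is illustrative only and is not the Cochrane Library's actual query syntax.

```python
# Hypothetical representation of the search table: (operator, field, keyword).
SEARCH_ROWS = [
    (None, "Title, Abstract, Keyword", "Mental disorder"),
    ("AND", "Title, Abstract, Keyword", "Speech"),
    ("AND", "All text", "Database"),
]


def build_query(rows):
    """Join the rows into one Boolean string (illustrative notation only)."""
    parts = []
    for operator, field, keyword in rows:
        clause = f'"{keyword}" [{field}]'
        # The first row has no operator; later rows are prefixed with theirs.
        parts.append(clause if operator is None else f"{operator} {clause}")
    return " ".join(parts)


query = build_query(SEARCH_ROWS)
print(query)
```

Keeping the rows as data rather than a hard-coded string makes it easy to see that every term is joined with AND, so a record must match all three keywords to be retrieved.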
Figure 1. Flowchart of literature search and screening.
Description of nine speech databases and their subsidiaries
| Rank | Corpus | Year | Language | Average age | Subjects, n | Speech length (hour) | Location | Media type | Description | Citations, n |
| 1 | AphasiaBank | | | | | 12 | | | | 170 |
| | | 2016 | Cantonese | – | 7/2 | | Hong Kong, China | Video | Native Cantonese speakers with stroke-induced aphasia | |
| | | 2017 | Croatian | – | 10/10 | | Zagreb, Croatia | Video | Native Croatian speakers with stroke-induced aphasia | |
| | | 2016 | French | – | 11/14 | | France | Audio | Native French speakers with aphasia | |
| | | 2011 | Italian | – | 10 | | USA | Video | Native Italian speakers with aphasia | |
| | | 2015 | Mandarin | 45 | 9 | | China | Video | All patients with Mandarin as L1 and the aetiology is cerebral vascular accident (CVA) | |
| | | 2011 | Spanish | – | 4 | | USA | Video | Communication impairments by monolingual and bilingual speakers of Spanish and/or English | |
| 2 | WRAP | | English | 54 | 64/200 | 30–40 | USA | Audio | Connected speech problems of patients with dementia | 168 |
| 3 | Orozco-Arroyave Database | 2014 | Spanish | 62; 60 | 50/50 | >150 | Spain | Audio | Speech recordings of patients with Parkinson’s disease and healthy controls | 57 |
| 4 | DementiaBank | | | | | 70–80 | | | | 17 |
| | | 2016 | English | 68; 72 | 2 | | USA | Video | Individuals with Alzheimer's disease: language tasks from a Telerounds presentation | |
| | | 2016 | English | 81 | 6 | | USA | Audio | Individuals with Alzheimer's disease: conversation and Cookie-Theft picture descriptions | |
| | | 2016 | English | – | 208/104 | | USA | Audio | Dementia and control data for four language tasks from a large longitudinal study | |
| | | 2016 | English | 66 | 1 | | USA | Video | Individual with primary progressive aphasia longitudinal data | |
| | | 2016 | English | – | 36 | | USA | Audio | Individuals with primary progressive aphasia data | |
| | | 2016 | English | – | – | | Germany | Audio | Primary progressive aphasia data | |
| | | 2016 | Mandarin | – | 52 | | China | Audio | Individuals with dementia data | |
| | | 2012 | Spanish | – | 21 | | Spain | Audio | Individuals with Alzheimer's disease and dementia data | |
| | | 2016 | Taiwanese | – | 16 | | China | Audio | Individuals with dementia | |
| 5 | Cambridge Cookie-Theft Corpus | 2010 | English | 54 | 87/227 | 41.5 | Cambridge | Audio, brain scans | Individuals who have suffered from brain injury given the language task of picture description | 9 |
| 6 | CoDAS | 2006 | Dutch | 54 | 6 | 0.5 | Netherlands and Flanders | Audio | A pilot study of six aphasic speakers with two levels of annotation: an orthographic-phonetic transcription and a part-of-speech (POS) tagging | 2 |
| 7 | GREECAD | 2016 | Greek | 55 | 72/28 | 1 | Athens | Audio | An annotated Greek Corpus of Aphasic Discourse | 1 |
| 8 | FluencyBank | | | | | 46 | | | | 1 |
| | | 2013 | English | 7 | 25/25 | | Washington, DC | Audio | Children with epilepsy and controls | |
| | | 2005 | English | 4 | 100/50 | | USA | Audio | Seminal study of children who stutter with controls | |
| | | 2012 | English | 3 | 23/15 | | USA | Audio | Children who stutter and controls | |
| | | 2006 | English | 42 | 12 | | USA | Video | Interviews from the Voices of Stuttering project | |
| | | 1997 | German | 6 | 94 | | Ulm | Audio | Children who stutter from Ulm | |
| 9 | DAIC-WOZ | 2014 | English | – | – | 50 | USA | Video | Anxiety, depression and post-traumatic stress disorder data from the University of Southern California | 124 |
Year denotes the establishment of the speech database.
n refers to the number of subjects and citations.
– denotes that the information is not available or that the project is still ongoing with an increasing number of subjects.
/ separates the number of patients from the number of controls (patients/controls).
CoDAS, Corpus of Dutch Aphasic Speech; DAIC-WOZ, Distress Analysis Interview Corpus-Wizard of Oz; GREECAD, Greek Corpus of Aphasic Discourse; POLER, Plasticity of Language in Epilepsy Research; WRAP, Wisconsin Registry for Alzheimer’s Prevention.
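For readers who want to process the "Subjects, n" column programmatically, the notation defined in the footnotes can be parsed as sketched below. This is a minimal sketch, not part of the review: the `Subjects` type and `parse_subjects` function are hypothetical names, and treating a lone number as a patient count with no controls reported is an assumption.

```python
from typing import NamedTuple, Optional


class Subjects(NamedTuple):
    patients: Optional[int]
    controls: Optional[int]


def parse_subjects(cell: str) -> Subjects:
    """Parse a 'Subjects, n' cell such as '7/2', '94' or '–'."""
    cell = cell.strip()
    # An en dash (or blank) means the count is unavailable or still growing.
    if cell in {"–", "-", ""}:
        return Subjects(None, None)
    # 'patients/controls' notation.
    if "/" in cell:
        patients, controls = cell.split("/", 1)
        return Subjects(int(patients), int(controls))
    # Assumption: a lone number is a patient count with no controls reported.
    return Subjects(int(cell), None)


print(parse_subjects("7/2"))
print(parse_subjects("–"))
```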