Cheng Wan1,2, Xuewen Ge1, Junjie Wang1,2, Xin Zhang1,3, Yun Yu1,2, Jie Hu1,2, Yun Liu1,3, Hui Ma4.
Abstract
Mood disorders are common mental disorders that show familial aggregation. Extracting family history of psychiatric disorders from large volumes of electronic hospitalization records supports further study of onset characteristics among patients with a mood disorder. This study uses an observational clinical data set of in-patients at Nanjing Brain Hospital, affiliated with Nanjing Medical University, covering the past 10 years. We propose a model that combines a pretrained language model, Bidirectional Encoder Representations from Transformers (BERT), with a Convolutional Neural Network (CNN). We first project the electronic hospitalization records into a low-dimensional dense matrix via the pretrained Chinese BERT model, then feed the dense matrix into stacked CNN layers to capture high-level text features; finally, a fully connected layer extracts family history from these features. The accuracy of our BERT-CNN model was 97.12 ± 0.37% on the real-world data set from Nanjing Brain Hospital. We further studied the correlation between mood disorders and family history of psychiatric disorders.
Keywords: electronic health records; family history; mood disorder; pretrained BERT CNN model; psychiatric disorder
Year: 2022 PMID: 35669265 PMCID: PMC9163373 DOI: 10.3389/fpsyt.2022.861930
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 5.435
Figure 1. The research pipeline.
Figure 2. Architecture of the BERT–CNN model.
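The data flow described in the abstract (dense token vectors from a pretrained encoder, stacked convolutions, then a fully connected layer) can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the authors' implementation: random vectors stand in for the BERT output, every size and weight is made up, the helper names (`rand_matrix`, `conv1d_relu`, `max_over_time`, `dense_sigmoid`) are hypothetical, and the max pooling step and the one-sigmoid-score-per-label output are assumptions that the source does not confirm.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def conv1d_relu(tokens, kernels, width):
    """Slide each kernel over `width` consecutive vectors, then apply ReLU.

    tokens:  list of seq_len vectors, each of length dim
    kernels: list of flat weight vectors, each of length width * dim
    returns: list of (seq_len - width + 1) vectors, each of length len(kernels)
    """
    out = []
    for start in range(len(tokens) - width + 1):
        window = [x for vec in tokens[start:start + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(k, window))) for k in kernels])
    return out

def max_over_time(features):
    """Keep the strongest response of each channel across all positions."""
    return [max(col) for col in zip(*features)]

def dense_sigmoid(vector, weights):
    """Fully connected layer with one sigmoid score per output label."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, vector))))
            for row in weights]

# Toy sizes (assumptions); a real BERT encoder emits much wider vectors.
SEQ_LEN, DIM, CHANNELS, WIDTH, N_LABELS = 12, 8, 4, 3, 5

# Stand-in for the encoder output: one dense vector per token of a record.
token_vectors = rand_matrix(SEQ_LEN, DIM)

# Two stacked convolution layers.
layer1 = conv1d_relu(token_vectors, rand_matrix(CHANNELS, WIDTH * DIM), WIDTH)
layer2 = conv1d_relu(layer1, rand_matrix(CHANNELS, WIDTH * CHANNELS), WIDTH)

pooled = max_over_time(layer2)  # one value per channel
scores = dense_sigmoid(pooled, rand_matrix(N_LABELS, CHANNELS))
```

Each convolution shortens the sequence by `WIDTH - 1` positions, so the 12 token vectors become 10 and then 8 feature vectors before pooling reduces them to a single vector for the output layer.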
Demographic characteristics of the study participants.
| Diagnosis names and ICD-10 codes of records | Bipolar affective disorder (F31) | Depressive episode (F32) | Recurrent depressive disorder (F33) |
|---|---|---|---|
| Total records, n | 2,123 | 5,353 | 4,530 |
| Sex, n (%) | | | |
| Female | 1,263 (16.0) | 3,383 (43.0) | 3,204 (41.0) |
| Male | 860 (20.6) | 1,970 (47.4) | 1,326 (32.0) |
| Age (years), n (%) | | | |
| ≤ 18 | 148 (18.7) | 539 (68.4) | 101 (12.9) |
| 18–40 | 949 (30.6) | 1,468 (47.2) | 691 (22.2) |
| 41–60 | 707 (16.2) | 1,977 (45.6) | 1,666 (38.2) |
| 61–80 | 303 (8.6) | 1,315 (36.9) | 1,937 (54.5) |
| >80 | 16 (7.8) | 54 (26.3) | 135 (65.9) |
| Marital status, n (%) | | | |
| Married | 1,179 (13.6) | 3,855 (44.3) | 3,660 (42.1) |
| No spouse | 940 (28.6) | 1,491 (45.2) | 865 (26.2) |
| Not specified | 4 (25.0) | 7 (43.7) | 5 (31.3) |
| Family history of psychiatric disorder | | | |
| –Disease name, n (%) | | | |
| Schizophrenia | 145 (31.5) | 151 (32.8) | 169 (36.7) |
| Unspecified non-organic psychosis | 26 (21.4) | 58 (47.9) | 37 (30.6) |
| MDD | 177 (17.2) | 419 (40.5) | 437 (42.3) |
| BD | 35 (59.4) | 12 (20.3) | 12 (20.3) |
| Other non-organic mental disorders | 212 (32.4) | 250 (38.2) | 192 (29.4) |
| –Kinship level, n (%) | | | |
| First degree | 368 (23.1) | 593 (37.2) | 632 (39.6) |
| Second degree | 115 (33.6) | 145 (42.3) | 82 (23.9) |
| Third degree | 18 (28.5) | 23 (36.5) | 22 (34.9) |
| Not mentioned | 1,622 (16.2) | 4,592 (45.8) | 3,794 (37.9) |
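The percentages in the demographic table appear to be row percentages, that is, the share of each diagnosis within a characteristic rather than within a diagnosis column. A short check on the two sex rows, with counts and printed percentages copied from the table, is sketched below; the helper name `row_percentages` is hypothetical.

```python
# Counts per diagnosis (F31, F32, F33) copied from the table.
counts = {
    "Female": (1263, 3383, 3204),
    "Male": (860, 1970, 1326),
}
# Percentages as printed in the table, for comparison.
reported = {
    "Female": (16.0, 43.0, 41.0),
    "Male": (20.6, 47.4, 32.0),
}

def row_percentages(row):
    """Share of each diagnosis within one row, in percent."""
    total = sum(row)
    return tuple(100.0 * n / total for n in row)

computed = {label: row_percentages(row) for label, row in counts.items()}

# Largest absolute difference between recomputed and printed percentages.
max_gap = max(
    abs(c - r)
    for label in counts
    for c, r in zip(computed[label], reported[label])
)

print({k: tuple(round(v, 1) for v in vals) for k, vals in computed.items()})
print(f"largest gap to the printed values: {max_gap:.2f} points")
```

The recomputed shares match the printed ones to within a fraction of a percentage point, which is consistent with the row-wise reading; the small residual gaps presumably reflect how the printed values were rounded.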
Performance measures of the model.
| Averaging | Metric | | | p |
|---|---|---|---|---|
| | Accuracy (sd) | 0.971 (2.60e-3) | 0.970 (5.86e-3) | |
| Micro | F1 (sd) | 0.721 (2.03e-2) | 0.708 (3055e-2) | |
| Micro | Precision (sd) | 0.568 (2.39e-2) | 0.559 (5.13e-2) | |
| Micro | Recall (sd) | 0.989 (1.41e-2) | 0.973 (1.92e-2) | 0.851 |
| Macro | F1 (sd) | 0.668 (5.26e-2) | 0.606 (5.84e-2) | |
| Macro | Precision (sd) | 0.563 (9.03e-2) | 0.515 (1.32e-1) | |
| Macro | Recall (sd) | 0.936 (7.23e-3) | 0.841 (6.87e-3) | 0.296 |
| | | | | |
| | Accuracy (sd) | 0.972 (1.98e-3) | 0.972 (1.82e-3) | |
| Micro | F1 (sd) | 0.724 (1.41e-2) | 0.722 (1.25e-2) | |
| Micro | Precision (sd) | 0.576 (1.32e-2) | 0.576 (1.10e-2) | |
| Micro | Recall (sd) | 0.973 (2.29e-2) | 0.969 (2.64e-2) | 0.973 |
| Macro | F1 (sd) | 0.630 (1.32e-2) | 0.613 (8.27e-2) | |
| Macro | Precision (sd) | 0.559 (1.32e-1) | 0.524 (1.42e-1) | |
| Macro | Recall (sd) | 0.847 (1.11e-2) | 0.828 (1.27e-2) | 0.701 |
p > 0.05 indicates stable performance of the model under different hyperparameter combinations.
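The performance table reports both micro- and macro-averaged scores. The difference can be sketched with the standard definitions: micro averaging pools the counts of all labels before scoring, while macro averaging scores each label and takes the unweighted mean. The per-label counts and label names below are hypothetical and not taken from the study, and the macro F1 convention (mean of per-label F1) is an assumption, since the source does not state which one it uses.

```python
def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def micro_average(per_label):
    """Pool the counts of all labels first, then compute the scores once."""
    tp = sum(c[0] for c in per_label.values())
    fp = sum(c[1] for c in per_label.values())
    fn = sum(c[2] for c in per_label.values())
    return prf(tp, fp, fn)

def macro_average(per_label):
    """Score each label separately, then take the unweighted mean."""
    scores = [prf(*c) for c in per_label.values()]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))

# Hypothetical (tp, fp, fn) counts: one frequent label and one rare label.
toy_counts = {"frequent_label": (90, 10, 10), "rare_label": (1, 4, 4)}

micro_p, micro_r, micro_f1 = micro_average(toy_counts)
macro_p, macro_r, macro_f1 = macro_average(toy_counts)

# Consistency check on a reported pair: micro F1 is the harmonic mean of
# micro precision and micro recall (first block, first value column).
f1_from_reported = 2 * 0.568 * 0.989 / (0.568 + 0.989)
```

On the toy counts the micro scores are dominated by the frequent label, while the macro scores drop because the rare label counts equally. The table shows the same direction, with macro F1 below micro F1 in every column, and the harmonic mean of the reported micro precision and recall reproduces the reported micro F1 of 0.721 to within rounding.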
Figure 3. Heatmap representation of disorder extraction scores from different evaluation metrics. Acc, accuracy; P, precision; R, recall; Lr1, learning rate = 1e-4; Lr2, learning rate = 1e-5; B16, batch size = 16; B32, batch size = 32.
Risk of MDD as first diagnosis at admission presented as adjusted ORs grouped by family history of psychiatric disorder.
| Family history of psychiatric disorder | Adjusted OR | CI | p |
|---|---|---|---|
| Family history of MDD | | | |
| Yes | 1.058 | 0.851–1.327 | 0.617 |
| No | 1.00 (ref.) | 1.00 (ref.) | |
| Family history of schizophrenia | | | |
| Yes | 0.464 | 0.356–0.611 | <0.05 |
| No | 1.00 (ref.) | 1.00 (ref.) | |
| Family history of BD | | | |
| Yes | 0.137 | 0.071–0.264 | <0.05 |
| No | 1.00 (ref.) | 1.00 (ref.) | |
| Family history of unspecified non-organic psychosis | | | |
| Yes | 0.643 | 0.368–1.180 | 0.135 |
| No | 1.00 (ref.) | 1.00 (ref.) | |
| Family history of other non-organic mental disorders | | | |
| Yes | 0.409 | 0.332–0.506 | <0.05 |
| No | 1.00 (ref.) | 1.00 (ref.) | |
Adjusted for sex, age, marital status, and profession.
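Adjusted ORs of this kind are usually estimated on the log scale, so an interval and its p-value carry the same information. The sketch below assumes each reported interval is a 95% Wald interval, which the source does not state, and recovers the implied two-sided p-value from the OR and its limits. The function name `wald_p_from_ci` is hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def wald_p_from_ci(odds_ratio, ci_low, ci_high):
    """Two-sided p-value implied by an OR and its (assumed 95%) Wald interval."""
    # On the log scale the interval is estimate +/- Z_95 * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

# (OR, lower, upper, reported p) for the rows with an exact p-value.
rows = {
    "MDD": (1.058, 0.851, 1.327, 0.617),
    "Unspecified non-organic psychosis": (0.643, 0.368, 1.180, 0.135),
}
implied = {name: wald_p_from_ci(o, lo, hi) for name, (o, lo, hi, _) in rows.items()}

# Rows reported only as "<0.05".
p_schizophrenia = wald_p_from_ci(0.464, 0.356, 0.611)
p_bd = wald_p_from_ci(0.137, 0.071, 0.264)
```

Under that assumption the implied p-values fall close to the reported 0.617 and 0.135, and the schizophrenia and BD rows come out far below 0.05, so the table is internally consistent with 95% intervals. Small differences are expected because the ORs and limits are rounded to three decimals.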