Abstract
BACKGROUND: Developing machine learning models to support health analytics requires a better understanding of the statistical properties of self-rated expression statements used in health-related communication and decision making. To address this, our current research analyzes self-rated expression statements concerning the coronavirus COVID-19 epidemic and, with a new methodology, identifies how statistically significant differences between groups of respondents can be linked to machine learning results.
Keywords: Convolutional neural network; Coronavirus; Decision making; Disabled; Interpretation; Machine learning; Patient; Personalized care; Self-rating; The need for help
Year: 2022 PMID: 35249538 PMCID: PMC8898191 DOI: 10.1186/s12874-021-01502-8
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
A description of the proposed new methodology of influence analysis concerning machine learning. The methodology can be applied to measure the patient’s “need for help” ratings of expression statements with respect to groupings based on the answer values of background questions, and further to evaluate the applicability of training and validating a machine learning model to learn the groupings concerning the ratings
| Step | Description |
|---|---|
| 1 | Each person gives the … The rating answers given by the person form his/her “need for help” rating profile. (Described in the chapter “Gathering ratings about expression statements from persons representing various background features”) |
| 2 | (Described in the chapter “Identifying statistically significant rating differences for expression statements in respect to background questions”) |
| 3 | (Described in the chapter “Training and validation of a machine learning model to learn groupings concerning the ratings”) |
| 4 | (Described in the chapter “Comparing the validation accuracies of the machine learning model with the probabilities of pure chance”) |
| 5 | (Described in the chapter “Contrasting the validation accuracies of the machine learning model with the statistically significant rating differences in respect to groupings”) |
| 6 | (Described in the chapter “Drawing conclusions about the applicability of the current machine learning model”) |
Fig. 1 A schematic illustration of steps 1-6 of the methodology of influence analysis concerning machine learning described in Table 1
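Step 2 of the methodology (identifying statistically significant rating differences between groups) can be sketched for the two-group case as follows. This is a minimal illustration, not the authors' implementation: the respondent data, the threshold value and the function names are hypothetical, and the p-value uses the plain normal approximation of the Wilcoxon rank-sum test without tie or continuity correction, so it may differ slightly from the values a statistics package reports.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..N; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(group1, group2):
    """Two-sided Wilcoxon rank-sum test (normal approximation, assumption)."""
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    w = sum(ranks[:n1])                       # rank sum of the first group
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2))      # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical respondents: (background answer 1-9, "need for help" rating 0-10).
respondents = [(3, 8), (5, 7), (6, 9), (4, 6), (8, 3),
               (9, 4), (7, 2), (8, 5), (9, 1), (7, 4)]
threshold = 7  # hypothetical split: group 1 is x < 7, group 2 is x >= 7

# Ratings are transformed into the range 0.0-1.0 before comparison.
g1 = [rating / 10 for answer, rating in respondents if answer < threshold]
g2 = [rating / 10 for answer, rating in respondents if answer >= threshold]

diff = mean(g1) - mean(g2)   # difference of mean ratings between the groups
z, p = rank_sum_test(g1, g2)
print(f"diff={diff:.2f}, z={z:.2f}, p={p:.4f}")
```

For three groups, the same split would produce three pairwise differences and an overall Kruskal-Wallis test, which this sketch does not cover.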
Expression statements (ES) concerning the coronavirus COVID-19 epidemic that each person rated with respect to the impression about the “need for help”
| Identifier | Expression statement | Rating range |
|---|---|---|
| ES1 | “I have a flu.” | 0-10 |
| ES2 | “I have a cough.” | 0-10 |
| ES3 | “I have a shortness of breath.” | 0-10 |
| ES4 | “My health condition is weakening.” | 0-10 |
| ES5 | “I have a sore throat.” | 0-10 |
| ES6 | “I have muscular ache.” | 0-10 |
| ES7 | “I have a fever.” | 0-10 |
| ES8 | “A sudden fever rises for me with 38 degrees of Celsius or more.” | 0-10 |
| ES9 | “I suspect that I have now become infected by the coronavirus.” | 0-10 |
| ES10 | “I have now become infected by the coronavirus.” | 0-10 |
| ES11 | “I am quarantined from meeting other people ordinarily so that the spreading of an infectious disease could be prevented.” | 0-10 |
| ES12 | “I must be inside a house without getting out.” | 0-10 |
| ES13 | “I must be without a human companion.” | 0-10 |
| ES14 | “I do not cope in everyday life independently without getting help from other persons.” | 0-10 |
| ES15 | “I do not cope at home independently without getting help from persons who originate outside of my home.” | 0-10 |
| ES16 | “I have an infectious disease.” | 0-10 |
| ES17 | “I have an infectious disease that has been verified by a doctor.” | 0-10 |
| ES18 | “I suspect that I have an infectious disease.” | 0-10 |
| ES19 | “I have a bad health condition.” | 0-10 |
| ES20 | “I have an ordinary health condition.” | 0-10 |
Background questions (BQ) presented to the person
| Background question | Question presented to the person | Answer alternatives |
|---|---|---|
| BQ1: an estimated health condition | “What kind of health condition you have currently according to your opinion?” | A 9-point Likert scale supplied with the following partial labeling: “9 Good”, “8 –”, “7 Rather good”, “6 –”, “5 Medium”, “4 –”, “3 Rather bad”, “2 –”, “1 Bad”. |
| BQ2: a health problem reduces ability | “Do you have a permanent or long-lasting disease or such deficit, ailment or disability that reduces your ability to work or to perform your daily living activities? Here the question refers to all long-lasting diseases identified by a doctor, and also to such ailments not identified by a doctor which have lasted at least three months but which affect your ability to perform your daily living activities.” | No or yes. |
| BQ3: one or more diseases identified by a doctor | “Has there been a situation that a doctor has identified in you one or several of the following diseases?” | The person answers by selecting one or more options from a list of diseases. |
| BQ4: a continuous or repeated need for a doctor’s care | “Do you need continuously or repeatedly care given by a doctor for a long-lasting disease, deficit or disability that you have just mentioned?” | No or yes. |
| BQ5: the quality of life | “How would you rate your quality of life? Give your estimate based on the latest two weeks.” | A 9-point Likert scale supplied with the following partial labeling: “9 Very good”, “8 –”, “7 Good”, “6 –”, “5 Neither good nor bad”, “4 –”, “3 Bad”, “2 –”, “1 Very bad”. |
| BQ6: the satisfaction about health | “How satisfied are you with your health? Give your estimate based on the latest two weeks.” | A 9-point Likert scale supplied with the following partial labeling: “9 Very satisfied”, “8 –”, “7 Satisfied”, “6 –”, “5 Neither satisfied nor dissatisfied”, “4 –”, “3 Dissatisfied”, “2 –”, “1 Very dissatisfied”. |
| BQ7: the satisfaction about ability | “How satisfied are you with your ability to perform your daily living activities? Give your estimate based on the latest two weeks.” | A 9-point Likert scale supplied with the following partial labeling: “9 Very satisfied”, “8 –”, “7 Satisfied”, “6 –”, “5 Neither satisfied nor dissatisfied”, “4 –”, “3 Dissatisfied”, “2 –”, “1 Very dissatisfied”. |
| BQ8: the sex | “Tell what is your sex. The answer alternatives are similar as in the earlier health surveys of Finnish Institute for Health and Welfare (THL) to maintain comparability with the earlier results.” | Man or woman. |
| BQ9: the age | “Tell what is your age.” | Age in years selected from the following range: 16 years, 17 years, …, 99 years, 100 years or more. |
Expression statements (ES) having statistically significant “need for help” rating differences in the grouping based on the answer values of each background question (BQ), and an evaluation of how well the convolutional neural network model can learn a labeling that matches the grouping (n = 673). M = mean, Mdn=median, SD=standard deviation
|
|
| |||||||
|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
|
| ||
BQ1, two groups: x < 7 (n1=263), x>=7 (n2=410) | ES6: diffg1&g2=0,07 (95% CI [0,03; 0,11], p = 0.0011); ES8: diffg1&g2=-0,08 (95% CI [-0,14; -0,02], p = 0.0073); ES9: diffg1&g2=-0,08 (95% CI [-0,14; -0,02], p = 0.0068); ES10: diffg1&g2=-0,09 (95% CI [-0,15; -0,02], p = 0.0049); ES7: diffg1&g2=-0,05 (95% CI [-0,10; 0,00], p = 0.0384); ES16: diffg1&g2=-0,06 (95% CI [-0,12; 0,00], p = 0.0403); ES17: diffg1&g2=-0,08 (95% CI [-0,14; -0,02], p = 0.0143); ES18: diffg1&g2=-0,05 (95% CI [-0,10; 0,00], p = 0.0358); | M = 11.26 Mdn=11 SD=2.39 | M = 0.55 Mdn=0.55 SD=0.03 | M = 0.73 Mdn=0.72 SD=0.02 | M = 0.59 Mdn=0.59 SD=0.01 | M = 0.69 Mdn=0.69 SD=0.02 | 0.61 | 0.08 |
BQ1, three groups: x < 6 (n1= 218), 6<=x < 8 (n2=207), x>=8 (n3=248) | ES5: diffg1&g3=-0,01 (95% CI [-0,05; 0,03], p = 0.446), diffg1&g2=-0,05 (95% CI [-0,09; 0,00], p = 0.054), diffg2&g3=0,04 (95% CI [0,00; 0,08], p = 0.102), pg1&g2&g3=0.0449; ES8: diffg1&g3=-0,08 (95% CI [-0,15; -0,02], p = 0.067), diffg1&g2=-0,07 (95% CI [-0,14; 0,00], p = 0.067), diffg2&g3=-0,01 (95% CI [-0,07; 0,06], p = 0.929), pg1&g2&g3=0.0489; ES9: diffg1&g3=-0,09 (95% CI [-0,16; -0,02], p = 0.050), diffg1&g2=-0,08 (95% CI [-0,15; 0,00], p = 0.058), diffg2&g3=-0,01 (95% CI [-0,08; 0,06], p = 0.891), pg1&g2&g3=0.0355; ES11: diffg1&g3=0,08 (95% CI [0,02; 0,14], p = 0.015), diffg1&g2=0,01 (95% CI [-0,05; 0,07], p = 0.858), diffg2&g3=0,07 (95% CI [0,02; 0,13], p = 0.015), pg1&g2&g3=0.0108; | M = 4.85 Mdn=4 SD=1.8 | M = 1.02 Mdn=1.03 SD=0.03 | M = 0.48 Mdn=0.47 SD=0.03 | M = 1.06 Mdn=1.06 SD=0.01 | M = 0.40 Mdn=0.40 SD=0.02 | 0.37 | 0.03 |
BQ2, two groups: x < 2 (n1=219), x>=2 (n2=454) | ES11: diffg1&g2=-0,08 (95% CI [-0,13; -0,03], p = 0.0014); ES6: diffg1&g2=-0,06 (95% CI [-0,10; -0,02], p = 0.0039); ES3: diffg1&g2=0,06 (95% CI [0,01; 0,12], p = 0.0476); ES14: diffg1&g2=0,06 (95% CI [-0,01; 0,12], p = 0.04); ES15: diffg1&g2=0,08 (95% CI [0,01; 0,14], p = 0.0189); | M = 5.55 Mdn=6 SD=2.91 | M = 0.57 Mdn=0.56 SD=0.04 | M = 0.69 Mdn=0.69 SD=0.02 | M = 0.63 Mdn=0.63 SD=0.01 | M = 0.66 Mdn=0.66 SD=0.02 | 0.67 | -0.01 |
BQ4, two groups: x < 2 (n1=364), x>=2 (n2=309) | ES6: diffg1&g2=-0,06 (95% CI [-0,09; -0,02], p = 0.0064); ES11: diffg1&g2=-0,06 (95% CI [-0,10; -0,01], p = 0.0165); | M = 3.44 Mdn=3 SD=1.28 | M = 0.67 Mdn=0.67 SD=0.01 | M = 0.59 Mdn=0.60 SD=0.02 | M = 0.68 Mdn=0.67 SD=0 | M = 0.57 Mdn=0.57 SD=0.02 | 0.54 | 0.03 |
BQ5, two groups: x < 7 (n1=274), x>=7 (n2=399) | ES6: diffg1&g2=0,06 (95% CI [0,02; 0,10], p = 0.0024); ES9: diffg1&g2=-0,08 (95% CI [-0,14; -0,02], p = 0.0043); ES10: diffg1&g2=-0,08 (95% CI [-0,15; -0,02], p = 0.0036); ES11: diffg1&g2=0,06 (95% CI [0,01; 0,11], p = 0.0168); ES16: diffg1&g2=-0,06 (95% CI [-0,12; -0,01], p = 0.0271); ES17: diffg1&g2=-0,07 (95% CI [-0,13; -0,01], p = 0.0303); | M = 3.27 Mdn=3 SD=1.65 | M = 0.64 Mdn=0.63 SD=0.02 | M = 0.65 Mdn=0.65 SD=0.03 | M = 0.66 Mdn=0.67 SD=0 | M = 0.60 Mdn=0.60 SD=0.02 | 0.59 | 0.01 |
BQ5, three groups: x < 6 (n1=190), 6<=x < 8 (n2=271), x>=8 (n3=212) | ES9: diffg1&g3=-0,15 (95% CI [-0,22; -0,07], p = 0.0005), diffg1&g2=-0,09 (95% CI [-0,17; -0,02], p = 0.0230), diffg2&g3=-0,05 (95% CI [-0,12; 0,02], p = 0.0965), pg1&g2&g3=0.0007; ES10: diffg1&g3=-0,15 (95% CI [-0,23; -0,07], p = 0.0004), diffg1&g2=-0,09 (95% CI [-0,17; -0,02], p = 0.0112), diffg2&g3=-0,06 (95% CI [-0,13; 0,02], p = 0.1699), pg1&g2&g3=0.0005; ES6: diffg1&g3=0,07 (95% CI [0,03; 0,12], p = 0.016), diffg1&g2=0,02 (95% CI [-0,02; 0,07], p = 0.393), diffg2&g3=0,05 (95% CI [0,01; 0,09], p = 0.023), pg1&g2&g3=0.0093; ES8: diffg1&g3=-0,11 (95% CI [-0,18; -0,04], p = 0.013), diffg1&g2=-0,09 (95% CI [-0,16; -0,02], p = 0.013), diffg2&g3=-0,02 (95% CI [-0,08; 0,05], p = 0.985), pg1&g2&g3=0.0117; ES16: diffg1&g3=-0,09 (95% CI [-0,16; -0,02], p = 0.04), diffg1&g2=-0,08 (95% CI [-0,15; -0,01], p = 0.04), diffg2&g3=-0,01 (95% CI [-0,08; 0,05], p = 0.76), pg1&g2&g3=0.0301; ES17: diffg1&g3=-0,10 (95% CI [-0,18; -0,03], p = 0.04), diffg1&g2=-0,08 (95% CI [-0,15; -0,01], p = 0.06), diffg2&g3=-0,02 (95% CI [-0,09; 0,04], p = 0.53), pg1&g2&g3=0.0329; ES20: diffg1&g3=0,04 (95% CI [-0,02; 0,09], p = 0.022), diffg1&g2=-0,01 (95% CI [-0,07; 0,04], p = 0.928), diffg2&g3=0,05 (95% CI [0,00; 0,10], p = 0.022), pg1&g2&g3=0.0139; | M = 3.63 Mdn=4 SD=1.33 | M = 1.05 Mdn=1.05 SD=0.02 | M = 0.44 Mdn=0.44 SD=0.03 | M = 1.07 Mdn=1.07 SD=0.01 | M = 0.42 Mdn=0.43 SD=0.03 | 0.40 | 0.02 |
BQ6, two groups: x < 7 (n1=318), x>=7 (n2=355) | ES11: diffg1&g2=0,08 (95% CI [0,04; 0,13], p = 0.0006); ES6: diffg1&g2=0,06 (95% CI [0,02; 0,09], p = 0.0056); | M = 5.35 Mdn=5 SD=1.61 | M = 0.62 Mdn=0.63 SD=0.02 | M = 0.63 Mdn=0.63 SD=0.03 | M = 0.65 Mdn=0.65 SD=0 | M = 0.60 Mdn=0.60 SD=0.02 | 0.53 | 0.07 |
BQ6, three groups: x < 6 (n1=240), 6<=x < 8 (n2=229), x>=8 (n3=204) | ES11: diffg1&g3=0,09 (95% CI [0,03; 0,15], p = 0.0077), diffg1&g2=0,03 (95% CI [-0,03; 0,08], p = 0.3516), diffg2&g3=0,06 (95% CI [0,00; 0,12], p = 0.0649), pg1&g2&g3=0.0098; ES6: diffg1&g3=0,07 (95% CI [0,02; 0,12], p = 0.019), diffg1&g2=0,04 (95% CI [0,00; 0,09], p = 0.141), diffg2&g3=0,03 (95% CI [-0,02; 0,07], p = 0.141), pg1&g2&g3=0.0199; | M = 3.89 Mdn=4 SD=1.8 | M = 1.05 Mdn=1.06 SD=0.03 | M = 0.41 Mdn=0.41 SD=0.03 | M = 1.08 Mdn=1.08 SD=0 | M = 0.39 Mdn=0.39 SD=0.03 | 0.36 | 0.03 |
BQ7, two groups: x < 7 (n1=201), x>=7 (n2=472) | ES6: diffg1&g2=0,08 (95% CI [0,04; 0,12], p = 0.0005); ES11: diffg1&g2=0,07 (95% CI [0,02; 0,12], p = 0.0078); ES19: diffg1&g2=0,07 (95% CI [0,02; 0,11], p = 0.0048); | M = 7.26 Mdn=7 SD=1.63 | M = 0.53 Mdn=0.54 SD=0.02 | M = 0.75 Mdn=0.74 SD=0.01 | M = 0.59 Mdn=0.59 SD=0 | M = 0.72 Mdn=0.72 SD=0.01 | 0.70 | 0.02 |
BQ7, three groups: x < 6 (n1=143), 6<=x < 8 (n2=214), x>=8 (n3=316) | ES6: diffg1&g3=0,09 (95% CI [0,04; 0,13], p = 0.0051), diffg1&g2=0,04 (95% CI [-0,01; 0,09], p = 0.1801), diffg2&g3=0,05 (95% CI [0,00; 0,09], p = 0.0619), pg1&g2&g3=0.0042; ES11: diffg1&g3=0,10 (95% CI [0,04; 0,16], p = 0.0086), diffg1&g2=0,03 (95% CI [-0,04; 0,09], p = 0.4526), diffg2&g3=0,07 (95% CI [0,02; 0,12], p = 0.0186), pg1&g2&g3=0.0035; | M = 1.31 Mdn=1 SD=0.61 | M = 1.05 Mdn=1.06 SD=0.02 | M = 0.45 Mdn=0.45 SD=0.02 | M = 1.07 Mdn=1.07 SD=0.01 | M = 0.47 Mdn=0.48 SD=0.02 | 0.47 | 0.00 |
BQ8, two groups: x < 2 (n1=123), x>=2 (n2=550) | ES4: diffg1&g2=-0,11 (95% CI [-0,17; -0,05], p = 0.0001); ES12: diffg1&g2=-0,13 (95% CI [-0,20; -0,06], p = 0.0002); ES14: diffg1&g2=-0,20 (95% CI [-0,27; -0,13], p = 0.0000); ES15: diffg1&g2=-0,20 (95% CI [-0,28; -0,12], p = 0.0000); ES3: diffg1&g2=-0,10 (95% CI [-0,16; -0,03], p = 0.0031); ES10: diffg1&g2=-0,12 (95% CI [-0,20; -0,04], p = 0.0050); ES11: diffg1&g2=-0,08 (95% CI [-0,14; -0,02], p = 0.0058); ES8: diffg1&g2=-0,09 (95% CI [-0,16; -0,02], p = 0.0225); ES9: diffg1&g2=-0,10 (95% CI [-0,18; -0,03], p = 0.0142); ES13: diffg1&g2=-0,07 (95% CI [-0,14; -0,01], p = 0.0223); ES16: diffg1&g2=-0,10 (95% CI [-0,17; -0,02], p = 0.0159); ES17: diffg1&g2=-0,09 (95% CI [-0,17; -0,02], p = 0.0319); ES18: diffg1&g2=-0,08 (95% CI [-0,14; -0,01], p = 0.0242); | M = 6.14 Mdn=6 SD=1.69 | M = 0.42 Mdn=0.42 SD=0.02 | M = 0.83 Mdn=0.83 SD=0.01 | M = 0.48 Mdn=0.48 SD=0.01 | M = 0.79 Mdn=0.78 SD=0.01 | 0.82 | -0.03 |
BQ9, two groups: x < 51 (n1=333), x>=51 (n2=340) | ES1: diffg1&g2=0,09 (95% CI [0,05; 0,12], p = 0.0000); ES2: diffg1&g2=0,10 (95% CI [0,06; 0,13], p = 0.0000); ES3: diffg1&g2=0,13 (95% CI [0,08; 0,18], p = 0.0000); ES4: diffg1&g2=0,10 (95% CI [0,05; 0,15], p = 0.0006); ES5: diffg1&g2=0,08 (95% CI [0,04; 0,11], p = 0.0000); ES14: diffg1&g2=0,14 (95% CI [0,08; 0,19], p = 0.0001); ES15: diffg1&g2=0,14 (95% CI [0,08; 0,20], p = 0.0001); ES7: diffg1&g2=0,06 (95% CI [0,01; 0,10], p = 0.0133); ES8: diffg1&g2=0,09 (95% CI [0,04; 0,15], p = 0.0466); ES11: diffg1&g2=-0,05 (95% CI [-0,09; 0,00], p = 0.0485); ES13: diffg1&g2=0,05 (95% CI [0,01; 0,10], p = 0.0297); ES19: diffg1&g2=0,04 (95% CI [0,00; 0,08], p = 0.0193); | M = 5.79 Mdn=6 SD=1.44 | M = 0.58 Mdn=0.58 SD=0.02 | M = 0.69 Mdn=0.69 SD=0.02 | M = 0.61 Mdn=0.61 SD=0.01 | M = 0.68 Mdn=0.68 SD=0.02 | 0.51 | 0.17 |
BQ9, three groups: x < 40 (n1=225), 40<=x < 60 (n2=231), x>=60 (n3=217) | ES1: diffg1&g3=0,10 (95% CI [0,06; 0,14], p = 0.0000), diffg1&g2=0,07 (95% CI [0,03; 0,11], p = 0.0002), diffg2&g3=0,03 (95% CI [-0,01; 0,06], p = 0.0716), pg1&g2&g3=0.0000; ES2: diffg1&g3=0,12 (95% CI [0,08; 0,16], p = 0.0000), diffg1&g2=0,07 (95% CI [0,03; 0,11], p = 0.0007), diffg2&g3=0,05 (95% CI [0,01; 0,09], p = 0.0162), pg1&g2&g3=0.0000; ES3: diffg1&g3=0,17 (95% CI [0,11; 0,23], p = 0.0000), diffg1&g2=0,06 (95% CI [0,01; 0,12], p = 0.110), diffg2&g3=0,10 (95% CI [0,04; 0,17], p = 0.003), pg1&g2&g3=0.0000; ES4: diffg1&g3=0,13 (95% CI [0,07; 0,19], p = 0.0011), diffg1&g2=0,02 (95% CI [-0,04; 0,07], p = 0.5450), diffg2&g3=0,11 (95% CI [0,05; 0,18], p = 0.0011), pg1&g2&g3=0.0004; ES5: diffg1&g3=0,09 (95% CI [0,04; 0,13], p = 0.0000), diffg1&g2=0,03 (95% CI [-0,01; 0,08], p = 0.042), diffg2&g3=0,05 (95% CI [0,01; 0,10], p = 0.012), pg1&g2&g3=0.0000; ES14: diffg1&g3=0,17 (95% CI [0,11; 0,24], p = 0.0002), diffg1&g2=0,05 (95% CI [-0,01; 0,12], p = 0.6097), diffg2&g3=0,12 (95% CI [0,05; 0,19], p = 0.0031), pg1&g2&g3=0.0002; ES15: diffg1&g3=0,18 (95% CI [0,11; 0,25], p = 0.0004), diffg1&g2=0,06 (95% CI [0,00; 0,13], p = 0.5430), diffg2&g3=0,12 (95% CI [0,04; 0,19], p = 0.0082), pg1&g2&g3=0.0006; ES7: diffg1&g3=0,09 (95% CI [0,03; 0,15], p = 0.0064), diffg1&g2=0,01 (95% CI [-0,04; 0,06], p = 0.7293), diffg2&g3=0,08 (95% CI [0,02; 0,14], p = 0.0120), pg1&g2&g3=0.0043; ES11: diffg1&g3=-0,08 (95% CI [-0,14; -0,03], p = 0.0069), diffg1&g2=-0,10 (95% CI [-0,15; -0,04], p = 0.0033), diffg2&g3=0,02 (95% CI [-0,04; 0,07], p = 0.5752), pg1&g2&g3=0.0017; ES8: diffg1&g3=0,13 (95% CI [0,06; 0,20], p = 0.016), diffg1&g2=0,03 (95% CI [-0,04; 0,09], p = 0.956), diffg2&g3=0,11 (95% CI [0,04; 0,18], p = 0.016), pg1&g2&g3=0.0116; ES10: diffg1&g3=0,14 (95% CI [0,07; 0,22], p = 0.034), diffg1&g2=0,02 (95% CI [-0,05; 0,09], p = 0.995), diffg2&g3=0,12 (95% CI [0,04; 0,20], p = 0.034), pg1&g2&g3=0.0245; ES19: 
diffg1&g3=0,06 (95% CI [0,01; 0,11], p = 0.034), diffg1&g2=0,03 (95% CI [-0,01; 0,08], p = 0.198), diffg2&g3=0,03 (95% CI [-0,03; 0,08], p = 0.198), pg1&g2&g3=0.0351; ES20: diffg1&g3=-0,07 (95% CI [-0,12; -0,02], p = 0.127), diffg1&g2=0,01 (95% CI [-0,04; 0,06], p = 0.268), diffg2&g3=-0,08 (95% CI [-0,13; -0,02], p = 0.026), pg1&g2&g3=0.0253; | M = 7.13 Mdn=7 SD=1.45 | M = 0.93 Mdn=0.93 SD=0.03 | M = 0.54 Mdn=0.54 SD=0.03 | M = 0.98 Mdn=0.98 SD=0.01 | M = 0.50 Mdn=0.50 SD=0.03 | 0.34 | 0.16 |
1 For groupings of two groups the difference of mean ratings (each mean rating in the range 0.0-1.0) is computed by the formula $(M_{g1} - M_{g2})$, and for groupings of three groups by the formula $\max(\{(M_{g1} - M_{g3}), (M_{g1} - M_{g2}), (M_{g2} - M_{g3})\})$. The Wilcoxon rank-sum test (for two groups) and the Kruskal-Wallis test (for three groups) indicate the statistically significant rating differences (p < 0.05) between groups. Each rating difference ($\mathrm{diff}_{g1\&g3}$, $\mathrm{diff}_{g1\&g2}$ and $\mathrm{diff}_{g2\&g3}$) is supplied with the corresponding 95% confidence interval (CI) and the p-value of the Wilcoxon pairwise comparison. The parameter $p_{g1\&g2\&g3}$ shows the p-value of the Kruskal-Wallis test for three groups
2 Training and validation metrics of the convolutional neural network model are averaged from 100 separate training and validation sequences to learn a labeling that matches the grouping (n = 673)
3 For groupings of two groups the probability of pure chance of classifying the rating profiles correctly is computed by the formula $\max(\{n_1, n_2\})/(n_1 + n_2)$, and for groupings of three groups by the formula $\max(\{n_1, n_2, n_3\})/(n_1 + n_2 + n_3)$
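The probability of pure chance in note 3 is the share of the largest group, that is, the accuracy of always predicting the majority group. The sketch below computes it for two groupings whose group sizes appear in the table above and compares it with a validation accuracy. Reading 0.69 and 0.50 as the mean validation accuracies of those rows is an interpretation of the flattened table, and the function name is hypothetical.

```python
def chance_probability(group_sizes):
    """Share of the largest group: max(n_i) / sum(n_i)."""
    return max(group_sizes) / sum(group_sizes)


# BQ1, two groups: n1=263, n2=410 (group sizes taken from the table).
chance_bq1 = chance_probability([263, 410])
margin_bq1 = 0.69 - chance_bq1   # assumed mean validation accuracy minus chance

# BQ9, three groups: n1=225, n2=231, n3=217.
chance_bq9 = chance_probability([225, 231, 217])
margin_bq9 = 0.50 - chance_bq9   # assumed mean validation accuracy minus chance

print(f"BQ1: chance={chance_bq1:.2f}, margin={margin_bq1:.2f}")
print(f"BQ9: chance={chance_bq9:.2f}, margin={margin_bq9:.2f}")
```

A margin close to zero or negative suggests that the model has learned little beyond the majority-group baseline for that grouping.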
Fig. 2 Gathering the “need for help” rating for an expression statement on an 11-point Likert scale with an online questionnaire
Layers of the convolutional neural network model used in the machine learning experiments
Model: “sequential”

| Layer (type) | Output Shape | Param # |
|---|---|---|
| rescaling_1 (Rescaling) | (None, 20, 25, 3) | 0 |
| conv2d (Conv2D) | (None, 20, 25, 16) | 448 |
| max_pooling2d (MaxPooling2D) | (None, 10, 12, 16) | 0 |
| conv2d_1 (Conv2D) | (None, 10, 12, 32) | 4,640 |
| max_pooling2d_1 (MaxPooling2D) | (None, 5, 6, 32) | 0 |
| conv2d_2 (Conv2D) | (None, 5, 6, 64) | 18,496 |
| max_pooling2d_2 (MaxPooling2D) | (None, 2, 3, 64) | 0 |
| flatten (Flatten) | (None, 384) | 0 |
| dense (Dense) | (None, 128) | 49,280 |
| dense_1 (Dense) | (None, 2) | 258 |
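The output shapes and parameter counts in the layer table can be checked with plain arithmetic. The sketch below assumes 3×3 convolution kernels with "same" padding and 2×2 max pooling. These settings are not stated in the table, but they reproduce every listed shape and count. The helper names are hypothetical.

```python
def conv2d_params(in_channels, filters, kernel=(3, 3)):
    """Weights (kh * kw * in_channels * filters) plus one bias per filter."""
    return kernel[0] * kernel[1] * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


def max_pool(shape, size=2):
    """Halve height and width (floor division); channels stay unchanged."""
    height, width, channels = shape
    return (height // size, width // size, channels)


shape = (20, 25, 3)   # input shape after the rescaling layer (0 parameters)
layers = []           # (layer name, output shape, parameter count)

for filters in (16, 32, 64):
    params = conv2d_params(shape[2], filters)
    shape = (shape[0], shape[1], filters)   # "same" padding keeps height/width
    layers.append(("conv2d", shape, params))
    shape = max_pool(shape)
    layers.append(("max_pooling2d", shape, 0))

flat = shape[0] * shape[1] * shape[2]
layers.append(("flatten", (flat,), 0))
layers.append(("dense", (128,), dense_params(flat, 128)))
layers.append(("dense_1", (2,), dense_params(128, 2)))

total_params = sum(params for _, _, params in layers)
for name, out_shape, params in layers:
    print(f"{name:15s} {str(out_shape):15s} {params}")
print("sum of parameter counts:", total_params)
```

The final dense layer has two units, which matches a grouping of two groups.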
Frequencies of persons giving the answer values 1-9 for the background questions BQ1 and BQ5-BQ7 (n = 673). M = mean, Mdn=median, SD=standard deviation
| Background question | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | M | Mdn | SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| BQ1: an estimated health condition | 11 (2%) | 3 (~0%) | 40 (6%) | 66 (10%) | 98 (15%) | 45 (7%) | 162 (24%) | 129 (19%) | 119 (18%) | 6.53 | 7 | 1.97 |
| BQ5: the quality of life | 7 (1%) | 6 (1%) | 29 (4%) | 47 (7%) | 101 (15%) | 84 (12%) | 187 (28%) | 123 (18%) | 89 (13%) | 6.53 | 7 | 1.77 |
| BQ6: the satisfaction about health | 17 (3%) | 10 (1%) | 68 (10%) | 61 (9%) | 84 (12%) | 78 (12%) | 151 (22%) | 138 (21%) | 66 (10%) | 6.13 | 7 | 2.04 |
| BQ7: the satisfaction about ability | 8 (1%) | 9 (1%) | 44 (7%) | 28 (4%) | 54 (8%) | 58 (9%) | 156 (23%) | 128 (19%) | 188 (28%) | 6.98 | 7 | 1.98 |
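The summary statistics in the table follow directly from the frequencies. As a minimal sketch, the BQ1 row can be expanded into individual answer values and summarized with the standard library. Whether the reported SD is the sample or the population standard deviation is not stated, so the sample version is used here as an assumption; both round to the same value for this row.

```python
import statistics

# Frequencies of answer values 1-9 for BQ1, copied from the table row.
bq1_frequencies = {1: 11, 2: 3, 3: 40, 4: 66, 5: 98, 6: 45, 7: 162, 8: 129, 9: 119}

# Expand the frequency table into one answer value per respondent.
answers = [value for value, count in bq1_frequencies.items() for _ in range(count)]

n = len(answers)
mean_value = statistics.mean(answers)
median_value = statistics.median(answers)
sd_value = statistics.stdev(answers)   # sample standard deviation (assumption)

print(f"n={n}, M={mean_value:.2f}, Mdn={median_value}, SD={sd_value:.2f}")
```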
The distribution of answer values for the background questions BQ2-BQ4 and BQ8-BQ9. M = mean, Mdn=median, SD=standard deviation
| Background question | Distribution of answer values |
|---|---|
| BQ2: a health problem reduces ability | No (coded as 1): 219 (33%); Yes (coded as 2): 454 (67%) (M = 1.67; Mdn=2; SD=0.47) |
| BQ3: one or more diseases identified by a doctor | Disease category (the number of unique persons who selected the category): Lung diseases: 126; Heart and circulatory diseases: 177; Joint and back diseases: 301; Injuries: 103; Mental health problems: 188; Vision and hearing deficits: 191; Other diseases: 345 |
| BQ4: a continuous or repeated need for a doctor’s care | No (coded as 1): 364 (54%); Yes (coded as 2): 309 (46%) (M = 1.46; Mdn=1; SD=0.50) |
| BQ8: the sex | Man (coded as 1): 123 (18%); Woman (coded as 2): 550 (82%) (M = 1.82; Mdn=2; SD=0.39) |
| BQ9: the age | Belonging to an age range category (the lower bound is included in the range but not the upper bound): 16-20 years: 143 (21%); 20-30 years: 21 (3%); 30-40 years: 61 (9%); 40-50 years: 96 (14%); 50-60 years: 135 (20%); 60-70 years: 141 (21%); 70-80 years: 64 (10%); 80-90 years: 12 (2%); 90 years or more: 0 (0%) (M = 46.93; Mdn=51; SD=19.57) |
Fig. 3 a The “need for help” rating mean values (transformed into the range 0.0-1.0) for expression statements ES4, ES9-ES10 and ES19-ES20 in respect to the person’s answer value to the background question BQ1 (an estimated health condition, 1-9), n = 673. b-c Increase of the “need for help” rating mean values from the baseline rating mean value that the person gives for the expression statement ES20 (“I have an ordinary health condition.”), n = 673
Fig. 4 a-e The relative frequency of respondents for each alternative “need for help” rating value (transformed into the range 0.0-1.0) concerning expression statements ES4, ES9-ES10 and ES19-ES20 in respect to the person’s answer value to the background questions BQ1 (an estimated health condition) and BQ9 (the age), n = 673. f Rating value distributions for the expression statements ES4, ES9-ES10 and ES19-ES20 in respect to all respondents together, n = 673
Pairs of expression statements (ES) and background questions (BQ) having a significant correlation (>=0.70; see [38]) based on a Kendall rank-correlation measure, all statistically significant at the level p < 0.001, and the highest cosine similarity values, which include the same pairs of expression statements and background questions
| Pair | Kendall rank correlation | Cosine similarity |
|---|---|---|
| ES16&ES17 | 0.91 | 0.97 |
| ES14&ES15 | 0.86 | 0.95 |
| ES9&ES10 | 0.79 | 0.92 |
| ES16&ES18 | 0.78 | 0.90 |
| ES17&ES18 | 0.77 | 0.89 |
| ES7&ES8 | 0.75 | 0.87 |
| ES1&ES2 | 0.73 | 0.80 |
| BQ1&BQ6 | 0.71 | 0.82 |
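The two measures in the table can be illustrated on a pair of rating vectors. The sketch below computes a Kendall rank correlation (the tau-b variant, which adjusts for ties) and a cosine similarity with the standard library only. Which Kendall variant the authors used is not stated, so tau-b is an assumption, and the rating vectors and function names are hypothetical.

```python
import math


def kendall_tau_b(x, y):
    """Kendall tau-b: (concordant - discordant) / sqrt(untied_x * untied_y)."""
    concordant = discordant = untied_x = untied_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx != 0:
                untied_x += 1          # pair is not tied in x
            if dy != 0:
                untied_y += 1          # pair is not tied in y
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    return (concordant - discordant) / math.sqrt(untied_x * untied_y)


def cosine_similarity(x, y):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical ratings (0-10) given by five respondents to two statements.
ratings_a = [0, 5, 5, 8, 10]
ratings_b = [1, 4, 6, 8, 10]

tau = kendall_tau_b(ratings_a, ratings_b)
cos = cosine_similarity(ratings_a, ratings_b)
print(f"Kendall tau-b={tau:.2f}, cosine similarity={cos:.2f}")
```

Cosine similarity depends on the magnitudes of the ratings, while Kendall's measure depends only on their ordering, so the two values for a given pair generally differ.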
Fig. 5 The “need for help” rating mean values of expression statements ES1-ES20 (transformed into the range 0.0-1.0) in respect to two groups based on the answer values of the background question BQ1 (an estimated health condition, 1-9). The “group 1” contains those respondents who gave an answer value that was lower than 7 (n1=263), and the “group 2” contains all the other respondents (n2=410)
Fig. 6 Loss and accuracy for training and validation of the convolutional neural network model for one sequence to learn a labeling that matches the grouping of two groups based on the answer values of the background question BQ1 (an estimated health condition) (n = 673)