Mareike Buhl1,2, Gülce Akin2,3, Samira Saak1,2, Ulrich Eysholdt1,2,4, Andreas Radeloff2,4, Birger Kollmeier1,2,5,6, Andrea Hildebrandt2,3.
Abstract
For supporting clinical decision-making in audiology, Common Audiological Functional Parameters (CAFPAs) were suggested as an interpretable intermediate representation of audiological information taken from various diagnostic sources within a clinical decision-support system (CDSS). Ten different CAFPAs were proposed to represent specific functional aspects of the human auditory system, namely hearing threshold, supra-threshold deficits, binaural hearing, neural processing, cognitive abilities, and a socio-economic component. CAFPAs were established as a viable basis for deriving audiological findings and treatment recommendations, and it has been demonstrated that model-predicted CAFPAs, with machine learning models trained on expert-labeled patient cases, are sufficiently accurate to be included in a CDSS, although they require further validation by experts. The present study aimed to validate model-predicted CAFPAs based on previously unlabeled cases from the same data set. Here, we ask to what extent domain experts agree with the model-predicted CAFPAs and whether potential disagreement can be understood in terms of patient characteristics. To these ends, an expert survey was designed and administered to two highly experienced audiology specialists. They were asked to evaluate model-predicted CAFPAs and to estimate audiological findings from the patients' audiological information, which was presented to them simultaneously. The results revealed strong relative agreement between the two experts and, importantly, between the experts and the model predictions for all CAFPAs except the neural processing and binaural hearing-related ones. However, experts tended to score CAFPAs over a larger value range but, on average across patients, with smaller scores than the machine learning models. For the hearing threshold-associated CAFPA at frequencies below 0.75 kHz and for the cognitive CAFPA, not only the relative agreement but also the absolute agreement between machine and experts was very high. For those CAFPAs with an average difference between the model- and expert-estimated values, patient characteristics were predictive of the disagreement. The findings are discussed in terms of how they can help toward further improvement of model-predicted CAFPAs to be incorporated in a CDSS for audiology.
Keywords: CAFPAs; CDSS; audiological diagnostics; expert knowledge; expert validation; machine learning; precision audiology
Year: 2022 PMID: 36081868 PMCID: PMC9446152 DOI: 10.3389/fneur.2022.960012
Source DB: PubMed Journal: Front Neurol ISSN: 1664-2295 Impact factor: 4.086
Figure 1. (A) Definition of Common Audiological Functional Parameters (CAFPAs). From left to right, the functional aspects CA1-CA4 and CU1-CU2 are frequency-dependent, and from top to bottom, the functional aspects range from peripheral to central. The CAFPAs are defined on a continuum ranging on the interval [0, 1], with 0 representing “normal” and 1 representing “maximally impaired”. Panel (A) of the figure was taken from Buhl (22). (B) Schematic representation of the clinical decision-support system (CDSS) by Buhl (22) (left part, based on labeled data) and relationships to the current study (right, based on unlabeled data). Labeled and unlabeled measurement data originate from the same database (light brown box). Light green arrows depict expert knowledge and blue arrows depict statistical predictions of CAFPAs. Numbered arrows represent contributions of previous studies: collection of expert knowledge in the opposite direction of audiological diagnostics (19); collection of expert knowledge based on individual patients from the currently used database (20); comparison of classification based on audiological measures vs. expert-estimated CAFPAs (21); comparison of classification based on expert-estimated CAFPAs vs. model-predicted CAFPAs. The prediction models were developed by Saak et al. (23) based on the expert-estimated CAFPAs from Buhl et al. (20). The prediction models were derived based on labeled patients (left) and applied to unlabeled patients (right). The experts' task in the current study was to validate the model-predicted CAFPAs (dashed light green arrow) and to estimate audiological findings for unlabeled patients.
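The CAFPA definition in the caption above can be illustrated with a small data structure. This is only a sketch: the class name `CafpaProfile`, the lowercase field names, and the validation behavior are assumptions made for illustration and are not taken from the study or its software.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CafpaProfile:
    """Hypothetical container for the ten CAFPAs of one patient."""
    ca1: float  # hearing threshold-related (CA1-CA4, frequency-dependent)
    ca2: float
    ca3: float
    ca4: float
    cu1: float  # supra-threshold deficits (CU1-CU2, frequency-dependent)
    cu2: float
    cb: float   # binaural hearing
    cn: float   # neural processing
    cc: float   # cognitive component
    ce: float   # socio-economic component

    def __post_init__(self):
        # Each CAFPA lies on [0, 1]: 0 = "normal", 1 = "maximally impaired".
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name}={value} is outside [0, 1]")


# Made-up example values, not patient data.
example = CafpaProfile(0.2, 0.3, 0.5, 0.6, 0.4, 0.5, 0.1, 0.1, 0.0, 0.3)
```

Rejecting out-of-range values at construction time is one possible design choice; clipping or rescaling would be alternatives if scores arrive on another scale.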
Figure 2. Patient data and CAFPA evaluation sheet as implemented in the electronic version of the expert survey. Patient cases were displayed one at a time. The survey sheet is shown in German as the survey was conducted in Germany. For the main terms, a translation is given in the following. Upper row: Patient ID, gender, and age. Measurements: Audiogram (right and left), LL: air conduction, KL: bone conduction, and hearing loss plotted over frequency. Goettingen sentence test (GOESA) in noise, intelligibility plotted over SRT. Loudness scaling (Adaptive CAtegorical LOudness Scaling (ACALOS); right and left), loudness plotted over level, and black line: normal-hearing reference. Native language. Tinnitus according to the home questionnaire (right and left). Hearing problems in quiet and in noise (scale from none to very much). Verbal intelligence test: z-score (negative scores: below average, positive scores: above average). Socio-economic status: lower class, middle class, and upper class. DemTect: suspicion of dementia, mild cognitive impairment, and age-specific normal cognitive abilities. CAFPAs: the meaning of the different parameters is given in Figure 1A.
Table 1. Agreement between experts and stability of experts' ratings.
| CAFPA | ICC [CI] | p | ICC [CI] | p | ICC [CI] | p |
|---|---|---|---|---|---|---|
| CA1 | 0.90 [0.72; 0.97] | 0.00 | | 0.11 | 0.99 [0.99; 1.00] | 0.00 |
| CA2 | 0.96 [0.87; 0.98] | 0.00 | 0.97 [0.92; 0.99] | 0.00 | 0.99 [0.98; 1.00] | 0.00 |
| CA3 | 0.95 [0.86; 0.98] | 0.00 | 0.99 [0.97; 1.00] | 0.00 | 0.99 [0.98; 1.00] | 0.00 |
| CA4 | 0.92 [0.75; 0.97] | 0.00 | 0.84 [0.53; 0.95] | 0.00 | 0.98 [0.96; 0.99] | 0.00 |
| CU1 | | 0.09 | 0.89 [0.68; 0.96] | 0.00 | 0.96 [0.92; 0.98] | 0.00 |
| CU2 | 0.94 [0.81; 0.98] | 0.00 | 0.90 [0.71; 0.97] | 0.00 | 0.98 [0.96; 0.99] | 0.00 |
| CB | singular | 0.00 | 0.85 [0.54; 0.95] | 0.00 | 0.92 [0.84; 0.97] | 0.00 |
| CN | | 0.00 | 0.82 [0.47; 0.94] | 0.00 | 0.96 [0.91; 0.98] | 0.00 |
| CC | | 0.01 | 0.96 [0.88; 0.99] | 0.00 | 0.94 [0.88; 0.98] | 0.00 |
| CE | 0.86 [0.58; 0.95] | 0.00 | 0.97 [0.91; 0.99] | 0.00 | 0.96 [0.92; 0.98] | 0.00 |
CA1–CA4, hearing threshold-related CAFPAs; CU1–CU2, Suprathreshold-deficits related CAFPAs; CB, binaural hearing; CN, neural processing; CC, cognitive components of hearing; CE, socio-economic status; E1, Expert 1 who rated 15 patient cases two times; E2, Expert 2 who rated 15 patient cases 12 times; ICC, intra-class correlation; CI, confidence interval.
Bold numbers indicate estimated agreements with a lower than acceptable effect size.
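The distinction between relative and absolute agreement can be made concrete with intra-class correlations. The sketch below assumes single-rater, two-way ICC forms (a consistency variant and an absolute-agreement variant computed from ANOVA mean squares); the record does not state which ICC form the study used, so the function `icc_two_way` and the numbers fed to it are purely illustrative.

```python
def icc_two_way(ratings):
    """Return (consistency ICC, absolute-agreement ICC) for single ratings.

    ratings: list of rows, one row per patient, one column per rater.
    Assumed two-way ANOVA formulation; not necessarily the study's choice.
    """
    n = len(ratings)        # number of patients
    k = len(ratings[0])     # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Mean squares for patients (rows), raters (columns), and residual error.
    ms_rows = k * sum((r - grand) ** 2 for r in row_means) / (n - 1)
    ms_cols = n * sum((c - grand) ** 2 for c in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    absolute = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    return consistency, absolute


# Made-up scores: the second rater is always 0.2 below the first.
ratings = [[0.3, 0.1], [0.4, 0.2], [0.5, 0.3], [0.6, 0.4], [0.8, 0.6]]
consistency, absolute = icc_two_way(ratings)
print(consistency, absolute)
```

With a constant offset between raters, the consistency form stays at its maximum while the absolute-agreement form drops, which mirrors the pattern of high relative agreement alongside systematic score differences.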
Table 2. Relative agreement between statistically predicted CAFPAs and experts' opinion.
| CAFPA | ICC [CI] | p | ICC [CI] (M-E1) | p |
|---|---|---|---|---|
| CA1 | 0.94 [0.85; 0.98] | 0.00 | 0.94 [0.92; 0.96] | 0.00 |
| CA2 | 0.98 [0.94; 0.99] | 0.00 | 0.96 [0.94; 0.97] | 0.00 |
| CA3 | 0.97 [0.93; 0.99] | 0.00 | 0.96 [0.95; 0.97] | 0.00 |
| CA4 | 0.94 [0.87; 0.98] | 0.00 | 0.94 [0.91; 0.95] | 0.00 |
| CU1 | | 0.00 | 0.86 [0.80; 0.90] | 0.00 |
| CU2 | 0.94 [0.86; 0.98] | 0.00 | 0.90 [0.86; 0.93] | 0.00 |
| CB | | 0.01 | | 0.00 |
| CN | | 0.13 | | 0.00 |
| CC | 0.88 [0.72; 0.96] | 0.00 | | 0.00 |
| CE | 0.91 [0.79; 0.97] | 0.00 | 0.82 [0.75; 0.87] | 0.00 |
CA1–CA4, hearing threshold-related CAFPAs; CU1–CU2, Suprathreshold-deficits related CAFPAs; CB, binaural hearing; CN, neural processing; CC, cognitive components of hearing; CE, socio-economic status; M, model = statistical model-predicted CAFPA, refer to Saak et al. (23); E1, Expert 1 who rated 15 patient cases two times and in total 150 different patients (used in second column M-E1); E2, Expert 2 who rated 15 patient cases 12 times; ICC, intra-class correlation; CI, confidence interval.
Bold numbers indicate estimated agreements with a lower than acceptable effect size.
Figure 3. Scatterplots visualizing the relative agreement between statistically predicted CAFPAs and the expert's opinion (for Expert 1, N = 150 patients; corresponding to the second column of Table 2).
Table 3. Main effect of evaluator in the linear mixed effects regression (LMER) models with evaluators (M and E1) nested within patients.
| CAFPA | β (SE) | CI | p |
|---|---|---|---|
| CA1 | −2.09 (0.70) | −3.48; −0.71 | 0.00 |
| CA2 | −2.33 (0.64) | −3.60; −1.06 | 0.00 |
| CA3 | −3.20 (0.64) | −4.46; −1.94 | 0.00 |
| CA4 | −3.37 (0.85) | −5.05; −1.68 | 0.00 |
| CU1 | −5.14 (1.07) | −7.24; −3.04 | 0.00 |
| CU2 | −6.95 (0.83) | −8.60; −5.31 | 0.00 |
| CB | −17.79 (1.43) | −20.61; −14.98 | 0.00 |
| CN | −10.71 (1.24) | −13.61; −8.27 | 0.00 |
| CC | 0.27 (1.04) | −1.79; 2.33 | 0.79 |
| CE | 7.21 (1.05) | 5.15; 9.27 | 0.00 |
CA1–CA4, hearing threshold-related CAFPAs; CU1–CU2, Suprathreshold-deficits related CAFPAs; CB, binaural hearing; CN, neural processing; CC, cognitive components of hearing; CE, socio-economic status.
Evaluator was dummy coded with 0 = machine learning model, 1 = expert (1). Npatients = 150. β: regression weight (fixed effect) of CAFPAs depending on the within-patient factor (machine learning model vs. expert); it indicates the difference between experts' ratings across patients on average as compared with the statistical model; SE, standard error of the regression weight estimate; CI, confidence interval.
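The meaning of the evaluator effect can be sketched without a mixed-model library. The sketch assumes a balanced design in which every patient has exactly one model score and one expert score. In that case the fixed effect of a dummy-coded evaluator (0 = model, 1 = expert) in a random-intercept model coincides with the mean within-patient difference. The standard error and the normal-approximation interval below are those of a paired comparison, which a mixed model would typically reproduce closely but not necessarily exactly. The function name `evaluator_effect` and all numbers are hypothetical.

```python
import math
import statistics


def evaluator_effect(model_scores, expert_scores):
    """Mean expert-minus-model difference with SE and an approximate 95% CI."""
    if len(model_scores) != len(expert_scores):
        raise ValueError("each patient needs one model and one expert score")
    # Within-patient differences remove the patient-specific intercept.
    diffs = [e - m for m, e in zip(model_scores, expert_scores)]
    beta = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    ci = (beta - 1.96 * se, beta + 1.96 * se)  # normal approximation
    return beta, se, ci


# Made-up scores for four patients, not study data.
model_scores = [50.0, 60.0, 70.0, 40.0]
expert_scores = [45.0, 58.0, 66.0, 37.0]
beta, se, ci = evaluator_effect(model_scores, expert_scores)
print(beta, se, ci)
```

A negative value of `beta` here corresponds to the expert scoring lower than the model on average, which is the reading given for the negative β-weights in the table.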
Table 4. β-weights (of the cross-level interaction) indicating whether the difference between the expert and statistical model depends on the patients' audiological measures.
| Predictor | p-values of significant effects (CAFPA assignment and β-weights not preserved) |
|---|---|
| Age | |
| Sex | 0.00, 0.02, 0.02 |
| PTA | 0.01, 0.00, 0.00, 0.00 |
| SES | 0.00 |
| GOESA | 0.00, 0.00, 0.00, 0.00 |
| WST | |
| DemTect | 0.03, 0.00 |
| Tinnitus | 0.00 |
| Tinnitus | |
| ACALOS | 0.00, 0.02 |
| ACALOS | 0.00, 0.04, 0.00 |
| ACALOS | |
Note that only significant results have been listed and an empty cell in the table indicates a null effect. Shaded rows or columns indicate that no significant results were obtained at all for the respective predictor or CAFPA.
CA1–CA4, hearing threshold-related CAFPAs; CU1–CU2, Suprathreshold-deficits related CAFPAs; CB, binaural hearing; CN, neural processing; CC, cognitive components of hearing; CE, socio-economic status.
Δ indicates the difference between the expert and the statistical models. p-values indicate the probability of observing the respective prediction of the difference, or more extreme ones, assuming the null hypothesis of no difference is true. The coefficient estimates originate from 10 different models, one model for each CAFPA. All predictors listed in the table were simultaneously included in the model, along with their interaction with the within-patient condition variable (model = 0; expert = 1). Thus, β-weights indicate cross-level interaction effects (within-patient condition variable and between-patient predictors as listed in the first column of the table).
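A cross-level interaction of this kind can be read as a regression of the expert-minus-model difference on a patient characteristic. The sketch below is deliberately simplified and rests on stated assumptions: it uses a single hypothetical predictor, whereas the study entered all predictors simultaneously, and it assumes a balanced design with one model score and one expert score per patient. The function name `difference_slope` and the data are invented for illustration.

```python
import statistics


def difference_slope(predictor, model_scores, expert_scores):
    """Least-squares intercept and slope of (expert - model) on one predictor."""
    diffs = [e - m for m, e in zip(model_scores, expert_scores)]
    mean_x = statistics.mean(predictor)
    mean_d = statistics.mean(diffs)
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(predictor, diffs))
    slope = sxy / sxx                     # how the disagreement changes with the predictor
    intercept = mean_d - slope * mean_x   # disagreement at predictor value 0
    return intercept, slope


# Made-up data: the expert-minus-model difference shrinks as the predictor grows.
predictor = [10.0, 20.0, 30.0, 40.0]
model_scores = [20.0, 30.0, 40.0, 50.0]
expert_scores = [21.0, 30.0, 39.0, 48.0]
intercept, slope = difference_slope(predictor, model_scores, expert_scores)
print(intercept, slope)
```

A nonzero slope indicates that the size of the expert-model disagreement depends on the patient characteristic, which is what a significant cross-level interaction expresses.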
Figure 4. CAFPA patterns for the four most frequent audiological findings (columns) as indicated by Expert 1. Model-predicted (first row) and expert-validated CAFPAs (second row). The background color represents the median of the respective CAFPA for all patients associated with the respective audiological finding. The horizontal color bar includes the interquartile range in addition to the median.