Julio Cesar Cavalcanti, Anders Eriksson, Plinio A Barbosa.
Abstract
The purpose of this study was to assess the speaker-discriminatory potential of a set of speech timing parameters while probing their suitability for forensic speaker comparison applications. The recordings comprised spontaneous dialogues between twin pairs over mobile phones, recorded directly with professional headset microphones. Speaker comparisons were performed between twin speakers engaged in a dialogue (i.e., intra-twin pairs) and among all subjects (i.e., cross-twin pairs). The participants were 20 Brazilian Portuguese speakers, ten male identical twin pairs from the same dialectal area. A set of 11 speech timing parameters was extracted and analyzed, including speech rate, articulation rate, syllable duration (V-V unit), vowel duration, and pause duration. Three system performance estimates were considered for assessing the suitability of the parameters for speaker comparison purposes, namely global Cllr, EER, and AUC values. These were interpreted while also taking into consideration the analysis of effect sizes. Overall, speech rate and articulation rate were found to be the most reliable parameters, displaying the largest effect sizes for the factor "speaker" and the best system performance outcomes, namely the lowest Cllr and EER values and the highest AUC values. Conversely, smaller effect sizes were found for the other parameters, which is compatible with a lower explanatory potential of speaker identity for the duration of such units and a possibly higher linguistic control over their temporal variation. In addition, there was a tendency for speech timing estimates based on larger temporal intervals to present larger effect sizes and better speaker-discriminatory performance. Finally, identical twin pairs were found to be remarkably similar in their speech temporal patterns at the macro and micro levels while engaging in a dialogue, resulting in poor system discriminatory performance. Possible underlying factors for such a striking convergence in identical twins' speech timing patterns are presented and discussed.
Year: 2022 PMID: 35061853 PMCID: PMC8782339 DOI: 10.1371/journal.pone.0262800
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Data segmentation and annotation.
Parameters’ categories followed by median, mean, standard deviation and range (of individual means) across subjects.
| Parameter | Category | Median | Mean | Standard deviation | Range (means) |
|---|---|---|---|---|---|
| SRATE | Macro | 4.6 vv/s | 4.6 vv/s | 1.3 vv/s | 3.5–5.7 vv/s |
| ARTRATE I | Macro | 5.5 vv/s | 5.4 vv/s | 1.1 vv/s | 4.7–6.2 vv/s |
| ARTRATE II | Macro | 6.0 vv/s | 5.9 vv/s | 1.0 vv/s | 5.2–6.6 vv/s |
| SGDUR | Macro | 1.0 s | 1.2 s | 702 ms | 853–1,687 ms |
| VVDUR I | Micro | 160 ms | 207 ms | 199 ms | 168–264 ms |
| VVDUR II | Micro | 150 ms | 163 ms | 98 ms | 137–205 ms |
| VOWEL DUR | Micro | 67 ms | 84 ms | 67 ms | 69–104 ms |
| SILENT PAUSES | Pause-related | 480 ms | 547 ms | 333 ms | 398–772 ms |
| FILLED PAUSES | Pause-related | 255 ms | 298 ms | 146 ms | 204–373 ms |
| ALL PAUSES | Pause-related | 365 ms | 449 ms | 301 ms | 345–649 ms |
| IPI | Pause-related | 2.0 s | 2.3 s | 1.3 s | 1.4–3.5 s |
Number of data points, p-value and χ² for the Kruskal-Wallis test (df = 19), number of significant differences among all speakers and within intra-twin pairs (Dunn's Multiple Comparison Test, df = 19, two-tailed test, p < 0.025 with Bonferroni adjustment), followed by effect sizes (η²).
| Parameter | N | p-value/χ² | Cross-pair differences | Intra-twin differences | Effect size (cross-pairs) | Magnitude (cross-pairs) |
|---|---|---|---|---|---|---|
| SRATE | 851 | < 0.001/148.7 | 40 (21.0%) | – | 15.6% | Large |
| ARTRATE I | 851 | < 0.001/147.8 | 47 (27.7%) | – | 15.5% | Large |
| ARTRATE II | 851 | < 0.001/121.3 | 26 (13.6%) | – | 12.3% | Moderate |
| SGDUR | 2,107 | < 0.001/156.5 | 42 (22.1%) | – | 6.5% | Moderate |
| VVDUR I | 12,609 | < 0.001/305.0 | 75 (39.4%) | G1-G2 | 2.2% | Small |
| VVDUR II | 10,495 | < 0.001/268.3 | 62 (32.6%) | G1-G2 | 2.3% | Small |
| VOWEL DUR | 9,447 | < 0.001/183.5 | 54 (28.4%) | J1-J2 | 1.7% | Small |
| SIL PAUSES | 864 | < 0.001/58.2 | 7 (3.6%) | C1-C2 | 4.6% | Small |
| FIL PAUSES | 560 | < 0.001/64.8 | 10 (5.2%) | – | 8.3% | Moderate |
| ALL PAUSES | 1,424 | < 0.001/66.9 | 7 (3.6%) | – | 3.3% | Small |
| IPI | 675 | < 0.001/92.8 | 21 (11.5%) | F1-F2 | 11.3% | Moderate |
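
The effect sizes in the table above can be approximately recovered from the reported χ² (Kruskal-Wallis H) and N values. The sketch below assumes the commonly used estimator η² = (H − k + 1)/(n − k), with k = 20 speakers, and the conventional cut-offs of 0.01, 0.06, and 0.14 for small, moderate, and large effects. Neither the estimator nor the cut-offs are confirmed by the text here, and the function names are illustrative only.

```python
from math import comb

N_SPEAKERS = 20  # 20 speakers per the abstract, hence df = 19


def kruskal_eta_squared(h, n, k=N_SPEAKERS):
    """Eta-squared from a Kruskal-Wallis H statistic (assumed estimator)."""
    return (h - k + 1) / (n - k)


def magnitude(eta2):
    """Label an effect size using conventional cut-offs (an assumption)."""
    if eta2 >= 0.14:
        return "Large"
    if eta2 >= 0.06:
        return "Moderate"
    if eta2 >= 0.01:
        return "Small"
    return "Negligible"


# (H, N) pairs copied from the table above.
TABLE_ROWS = {
    "SRATE": (148.7, 851),
    "ARTRATE I": (147.8, 851),
    "SIL PAUSES": (58.2, 864),
    "IPI": (92.8, 675),
}

# Number of distinct speaker pairs available for pairwise comparisons.
N_PAIRS = comb(N_SPEAKERS, 2)

if __name__ == "__main__":
    print(f"speaker pairs: {N_PAIRS}")
    for name, (h, n) in TABLE_ROWS.items():
        eta2 = kruskal_eta_squared(h, n)
        print(f"{name}: eta^2 = {100 * eta2:.1f}% ({magnitude(eta2)})")
```

For the four rows included here, this estimator gives the tabulated percentages to one decimal place. For some of the remaining rows it differs from the table by 0.1 to 0.2 percentage points, which may reflect rounding of the reported χ² values or a slightly different estimator.
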
Raw and calibrated log-likelihood-ratio costs (Cllr), equal error rates (EER), multi-class AUC values for cross-pair comparisons, and effect sizes (η²).
| Parameter | Cllr (raw) | Cllr (calibrated) | EER | AUC | Effect size |
|---|---|---|---|---|---|
| SRATE | 0.78 | 0.78 | 0.28 | 0.64 | Large |
| ARTRATE I | 0.76 | 0.75 | 0.27 | 0.64 | Large |
| ARTRATE II | 0.78 | 0.75 | 0.31 | 0.62 | Moderate |
| SGDUR | 0.96 | 0.89 | 0.35 | 0.59 | Moderate |
| VVDUR I | 0.82 | 0.81 | 0.33 | 0.55 | Small |
| VVDUR II | 0.92 | 0.84 | 0.30 | 0.55 | Small |
| VOWEL DUR | 0.95 | 0.90 | 0.40 | 0.54 | Small |
| SIL PAUSES | 6.06 | 1.00 | 0.55 | 0.58 | Small |
| FIL PAUSES | 2.81 | 1.00 | 0.50 | 0.61 | Moderate |
| ALL PAUSES | 9.97 | 1.00 | 0.50 | 0.56 | Small |
| IPI | 0.88 | 0.88 | 0.43 | 0.63 | Moderate |
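
For readers unfamiliar with the three performance estimates, the following is a minimal sketch of how Cllr, EER, and AUC are commonly defined for a set of same-speaker and different-speaker comparisons. It is not the authors' pipeline: the function names and the toy likelihood ratios are hypothetical, and the EER is taken at the threshold where the miss rate and false-alarm rate are closest, which is one of several conventions.

```python
import math


def cllr(same_lrs, diff_lrs):
    """Log-likelihood-ratio cost from likelihood ratios (not log LRs)."""
    same_cost = sum(math.log2(1 + 1 / lr) for lr in same_lrs) / len(same_lrs)
    diff_cost = sum(math.log2(1 + lr) for lr in diff_lrs) / len(diff_lrs)
    return 0.5 * (same_cost + diff_cost)


def auc(same_scores, diff_scores):
    """Probability that a same-speaker score exceeds a different-speaker one."""
    wins = sum(
        1.0 if s > d else 0.5 if s == d else 0.0
        for s in same_scores
        for d in diff_scores
    )
    return wins / (len(same_scores) * len(diff_scores))


def eer(same_scores, diff_scores):
    """Error rate at the threshold where miss and false-alarm rates are closest."""
    thresholds = sorted(set(same_scores) | set(diff_scores)) + [math.inf]
    best_gap, best_rate = None, None
    for t in thresholds:
        miss = sum(s < t for s in same_scores) / len(same_scores)
        false_alarm = sum(d >= t for d in diff_scores) / len(diff_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (miss + false_alarm) / 2
    return best_rate


if __name__ == "__main__":
    # Hypothetical likelihood ratios, for illustration only.
    same = [4.0, 2.0, 0.5]
    diff = [0.25, 0.5, 2.0]
    print("Cllr:", cllr(same, diff))
    print("EER:", eer(same, diff))
    print("AUC:", auc(same, diff))
```

Under this definition, a system that always outputs a likelihood ratio of 1 has a Cllr of exactly 1, so values close to 1 indicate that a parameter carries little usable speaker-discriminatory information, while lower values are better.
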
Number of significant differences (Dunn’s Multiple Comparison Test, df = 19, two-tailed test, p < 0.025 with Bonferroni adjustment) in intra-twin pair comparisons for VVDUR I, VVDUR II, VOWEL DUR, and SIL PAUSES for downsized samples.
| Parameter | N | Random sampling Intra-twins 3 replications | Random sampling Intra-twins 10 replications | Random sampling Intra-twins 20 replications |
|---|---|---|---|---|
| VVDUR I | 8,580 | G1-G2 (1x) | G1-G2 (1x) | G1-G2 (4x)/J1-J2 (1x)/E1-E2 (1x) |
| VVDUR II | 7,020 | – | G1-G2 (1x) | G1-G2 (2x) |
| VOWEL DUR | 8,040 | – | J1-J2 (7x) | J1-J2 (8x) |
| SIL PAUSES | 440 | – | C1-C2 (1x)/A1-A2 (1x) | C1-C2 (3x) |
Fig 2. Density plots for speech rate (A), articulation rate I (B), articulation rate II (C), V-V unit duration I (D), V-V unit duration II (E), vowel duration (F), silent pauses (G), filled pauses (H), and IPI (I).
Fig 3. ROC curves and AUC values for intra-twin pair comparisons.
Fig 4. ROC curves and AUC values for cross-pair comparisons (I).
Fig 5. ROC curves and AUC values for cross-pair comparisons (II).