Catherine B Ashley1, Ryan D Snyder1, James E Shepherd1, Catalina Cervantes2, Nitish Mittal3, Sheila Fleming4, Jaxon Bailey2, Maisie D Nievera2, Sharmin Islam Souleimanova2, Bill Nyaoga2, Lauren Lichtenfeld2, Alicia R Chen2, W Todd Maddox5, Christine L Duvauchelle2,6.
Abstract
Ultrasonic vocalizations (USVs) are known to reflect emotional processing, brain neurochemistry, and brain function. Collecting and processing USV data is manual, time-intensive, and costly. This creates a significant bottleneck: it limits researchers' ability to employ fully effective and nuanced experimental designs and serves as a barrier to entry for other researchers. In this report, we provide a snapshot of the current development and testing of Acoustilytix™, a web-based automated USV scoring tool. Acoustilytix applies machine learning to USV detection and classification and is recording-environment-agnostic. We summarize the user features identified as desirable by USV researchers and how these were implemented. These include the ability to easily upload USV files, to output a list of detected USVs with associated parameters in csv format, and to manually verify or modify an automatically detected call. With no user intervention or tuning, Acoustilytix achieves 93% sensitivity (the proportion of true calls that Acoustilytix detects) and 73% precision (the proportion of Acoustilytix detections that are true calls rather than false positives) in call detection across four unique recording environments, outperforming the popular DeepSqueak algorithm (sensitivity = 88%; precision = 41%). Future work will include integration of machine-learning-based call type classification that will recommend a call type to the user for each detected call. Call classification accuracy is currently in the 71-79% range and is expected to improve as more USV files are scored by expert scorers, providing more training data for the classification model. We also describe a recently developed Acoustilytix feature, built upon a foundation of learning science, that offers a fast and effective way to train hand-scorers using automated learning principles without the need for an expert hand-scorer to be present. The key is that trainees practice classifying hundreds of calls with immediate corrective feedback based on an expert's USV classification. We showed that this approach is highly effective, with inter-rater reliability (i.e., kappa statistics) between trainees and the expert ranging from 0.30 to 0.75 (average = 0.55) after only 1000-2000 calls of training. We conclude with a brief discussion of future improvements to the Acoustilytix platform.
Keywords: addiction; automated scoring; dopamine; drug development; drug discovery; machine learning; mental health; ultrasonic vocalization
Year: 2021 PMID: 34209754 PMCID: PMC8301917 DOI: 10.3390/brainsci11070864
Source DB: PubMed Journal: Brain Sci ISSN: 2076-3425
Sensitivity and precision of Acoustilytix and DeepSqueak for four experimental conditions across four recording environments.
| Recording Environment | Brief Description of Study Manipulation | # of Hand-Scored Calls | Sensitivity (%): Acoustilytix | Sensitivity (%): DeepSqueak | p-value | Precision (%): Acoustilytix | Precision (%): DeepSqueak | p-value |
|---|---|---|---|---|---|---|---|---|
| 1 | Cocaine | 8552 | 92.5 | 91.5 | 0.016 | 74.5 | 42.0 | <0.00001 |
| 2 | Ethanol | 1720 | 94.0 | 92.1 | 0.029 | 72.0 | 65.8 | 0.00008 |
| 3 | Morphine | 5771 | 90.6 | 75.8 | <0.00001 | 83.6 | 33.1 | <0.00001 |
| 4 | Sex | 3462 | 96.4 | 97.5 | 0.0078 | 59.5 | 46.1 | <0.00001 |
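The abstract reports single sensitivity and precision figures across all four environments, while the table above lists them per environment. The following is a minimal sketch of one way such figures could be pooled: by summing back-calculated counts rather than averaging percentages. The pooling method is an assumption (the record does not state how the overall figures were obtained), and the name `pooled_metrics` is hypothetical.

```python
# Per-environment values copied from the table above.
# Tuple layout: (hand-scored calls, sensitivity %, precision %).
ACOUSTILYTIX = [(8552, 92.5, 74.5), (1720, 94.0, 72.0),
                (5771, 90.6, 83.6), (3462, 96.4, 59.5)]
DEEPSQUEAK = [(8552, 91.5, 42.0), (1720, 92.1, 65.8),
              (5771, 75.8, 33.1), (3462, 97.5, 46.1)]


def pooled_metrics(rows):
    """Pool sensitivity and precision across environments by summing counts.

    Assumption (not stated in the source): sensitivity = TP / hand-scored
    calls and precision = TP / all detections, so approximate counts can be
    back-calculated from the rounded percentages in the table.
    """
    hand_scored = sum(n for n, _, _ in rows)
    true_pos = sum(n * sens / 100 for n, sens, _ in rows)
    detections = sum((n * sens / 100) / (prec / 100) for n, sens, prec in rows)
    return 100 * true_pos / hand_scored, 100 * true_pos / detections


for name, rows in (("Acoustilytix", ACOUSTILYTIX), ("DeepSqueak", DEEPSQUEAK)):
    sens, prec = pooled_metrics(rows)
    print(f"{name}: sensitivity {sens:.1f}%, precision {prec:.1f}%")
```

Under these assumptions the pooled values round to 93% sensitivity and 73% precision for Acoustilytix and 88% and 41% for DeepSqueak, which agrees with the figures quoted in the abstract. An unweighted mean of the four rows would give different numbers, because the environments contribute very different numbers of calls.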
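Table 2. Call types in the Wright et al. [42] classification scheme and their mapping to the five-call composite with representative spectrograms from Acoustilytix.
| Wright et al. [42] | Five-Call Composite |
|---|---|
| Flat | Fixed Frequency 50 |
| Short | Fixed Frequency 50 |
| Upward Ramp | Frequency Modulated 50 |
| Downward Ramp | Frequency Modulated 50 |
| Split | Frequency Modulated 50 |
| Step Up | Frequency Modulated 50 |
| Step Down | Frequency Modulated 50 |
| Multi-step | Frequency Modulated 50 |
| Inverted-U | Frequency Modulated 50 |
| Complex | Frequency Modulated 50 |
| Composite | Frequency Modulated 50 |
| Trill | Frequency Modulated with Trills 50 |
| Flat-Trill Combo | Frequency Modulated with Trills 50 |
| Trill with jumps | Frequency Modulated with Trills 50 |
| 22-kHz call | Long 22-kHz |
| 22-kHz call | Short 22-kHz (>300 ms) |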
Figure 1. Representative spectrograms for each of the call types in Table 2.
Number of each of five call types in the five-call composite dataset classified by an expert scorer.
| Call Types | Count |
|---|---|
| Fixed Frequency 50 | 202 |
| Frequency Modulated 50 | 342 |
| Frequency Modulated with Trill 50 | 132 |
| Long 22-kHz | 184 |
| Short 22-kHz | 71 |
Evaluation of the five-call composite classification model performance on the test dataset.
| Five-Call Type | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Fixed Frequency 50 | 0.77 | 0.69 | 0.73 | 49 |
| Frequency Modulated 50 | 0.73 | 0.87 | 0.80 | 112 |
| Frequency Modulated with Trill 50 | 0.81 | 0.62 | 0.70 | 42 |
| Long 22-kHz | 0.92 | 0.87 | 0.89 | 53 |
| Short 22-kHz | 0.82 | 0.75 | 0.78 | 24 |
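The F1-Score column in the test-set table is the harmonic mean of the Precision and Recall columns. A minimal sketch that recomputes it from the tabulated values follows; the helper name `f1_score` is illustrative and is not taken from the Acoustilytix code base.

```python
# (precision, recall, reported F1) per call type, copied from the test-set table.
TEST_REPORT = {
    "Fixed Frequency 50": (0.77, 0.69, 0.73),
    "Frequency Modulated 50": (0.73, 0.87, 0.80),
    "Frequency Modulated with Trill 50": (0.81, 0.62, 0.70),
    "Long 22-kHz": (0.92, 0.87, 0.89),
    "Short 22-kHz": (0.82, 0.75, 0.78),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for call_type, (precision, recall, reported) in TEST_REPORT.items():
    computed = f1_score(precision, recall)
    print(f"{call_type}: computed {computed:.3f}, reported {reported:.2f}")
```

Because the table gives precision and recall rounded to two decimals, the recomputed F1 can differ from the reported value by about 0.01.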
Evaluation of the five-call composite classification model performance on the training dataset.
| Five-Call Type | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Fixed Frequency 50 | 0.89 | 0.80 | 0.84 | 153 |
| Frequency Modulated 50 | 0.82 | 0.93 | 0.87 | 230 |
| Frequency Modulated with Trill 50 | 0.92 | 0.81 | 0.86 | 90 |
| Long 22-kHz | 0.92 | 0.92 | 0.92 | 131 |
| Short 22-kHz | 0.86 | 0.77 | 0.81 | 47 |
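Overall accuracy is not listed in the training-set table, but it can be approximated from the Recall and Support columns, since recall times support estimates the number of correctly classified calls in each class. The sketch below assumes Support is the number of expert-labelled calls per class; the function name `overall_accuracy` is hypothetical.

```python
# (recall, support) per call type, copied from the training-set table.
TRAIN_REPORT = {
    "Fixed Frequency 50": (0.80, 153),
    "Frequency Modulated 50": (0.93, 230),
    "Frequency Modulated with Trill 50": (0.92 and 0.81, 90),
    "Long 22-kHz": (0.92, 131),
    "Short 22-kHz": (0.77, 47),
}


def overall_accuracy(report):
    """Approximate accuracy as the support-weighted mean of per-class recall.

    recall * support estimates the correctly classified calls of a class,
    so the sum over classes divided by the total support is the accuracy.
    """
    total = sum(support for _, support in report.values())
    correct = sum(recall * support for recall, support in report.values())
    return correct / total


print(f"Approximate training accuracy: {overall_accuracy(TRAIN_REPORT):.3f}")
```

The training supports sum to 651 of the 931 expert-classified calls, about 70%. The result is approximate because the tabulated recall values are rounded.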
Kappa statistics for experienced and novice hand-scorers relative to an expert scorer.
| Hand-scorer | Test Following Unstructured Training | Test Following Acoustilytix Training | p-value |
|---|---|---|---|
| Experienced Scorers | | | |
| 1 | 0.42 | 0.29 | <0.00001 |
| 2 | 0.55 | 0.60 | 0.031 |
| 3 | 0.30 | 0.75 | <0.00001 |
| 4 | 0.36 | 0.69 | <0.00001 |
| 5 | 0.49 | 0.64 | <0.00001 |
| Novice Scorers | | | |
| 6 | NA | 0.47 | NA |
| 7 | NA | 0.42 | NA |
Table A1. Example of a confusion matrix comparing call types between two raters. Rows are Rater 2's classifications; columns are Rater 1's.
| Rater 2 \ Rater 1 | Fixed Frequency 50 | Frequency Modulated 50 | Frequency Modulated with Trill 50 | Long 22 | Short 22 | Row Totals |
|---|---|---|---|---|---|---|
| Fixed Frequency 50 | 110 | 75 | 3 | 0 | 1 | 189 |
| Frequency Modulated 50 | 32 | 305 | 16 | 0 | 0 | 353 |
| Frequency Modulated with Trill 50 | 6 | 126 | 60 | 0 | 0 | 192 |
| Long 22 | 2 | 1 | 0 | 1 | 0 | 4 |
| Short 22 | 0 | 0 | 0 | 0 | 10 | 10 |
| Column Totals | 150 | 507 | 79 | 1 | 11 | 748 |
Expected frequencies calculated from confusion matrix in Table A1.
| Call Type | Expected Frequency |
|---|---|
| Fixed Frequency 50 | 37.90 |
| Frequency Modulated 50 | 239.27 |
| Frequency Modulated with Trill 50 | 20.28 |
| Long 22 | 0.0053 |
| Short 22 | 0.1470 |
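The expected frequencies above are the chance-agreement counts for each call type: the row total times the column total of the confusion matrix, divided by the total number of calls. The sketch below recomputes them and carries the calculation through to a kappa value. It assumes the standard unweighted Cohen's kappa, which this record does not spell out, and the function name `cohen_kappa` is illustrative.

```python
# Confusion matrix from Table A1: rows are Rater 2, columns are Rater 1.
LABELS = ["Fixed Frequency 50", "Frequency Modulated 50",
          "Frequency Modulated with Trill 50", "Long 22", "Short 22"]
MATRIX = [
    [110, 75, 3, 0, 1],
    [32, 305, 16, 0, 0],
    [6, 126, 60, 0, 0],
    [2, 1, 0, 1, 0],
    [0, 0, 0, 0, 10],
]


def cohen_kappa(matrix):
    """Return (expected diagonal frequencies, kappa) for a square matrix."""
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Expected count of chance agreements for each call type.
    expected = [r * c / n for r, c in zip(row_totals, col_totals)]
    observed_agreement = sum(matrix[i][i] for i in range(len(matrix))) / n
    chance_agreement = sum(expected) / n
    kappa = (observed_agreement - chance_agreement) / (1 - chance_agreement)
    return expected, kappa


expected, kappa = cohen_kappa(MATRIX)
for label, value in zip(LABELS, expected):
    print(f"{label}: {value:.4f}")
print(f"kappa = {kappa:.3f}")
```

For this example matrix the two raters agree on 486 of 748 calls, and under the stated assumption the resulting kappa is roughly 0.42.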