Anna V Müller, José M Amigo, Nicoline R Wichmann, Frederik B Witschas, Fintan J McEvoy.
Abstract
Audio fingerprinting involves the extraction of quantitative frequency descriptors that can be used for indexing, search and retrieval of audio signals in sound recognition software. We propose a similar approach for medical ultrasonographic Doppler audio signals. Power Doppler periodograms were generated from 84 ultrasonographic Doppler signals from the common carotid arteries of 22 dogs. Frequency features were extracted from each periodogram and included in a principal component analysis (PCA). From the PCA, 10 audio samples were classified pairwise as either similar or dissimilar. These pairings were compared with a similar classification based on standard quantitative parameters used in medical ultrasound and with a classification performed by a panel of listeners. The ranking of sound files according to degree of similarity differed between the frequency and conventional classification methods. The panel of listeners had an 88% agreement with the classification based on quantitative frequency features. This agreement differed significantly from the score expected by chance (p < 0.001). The results indicate that the proposed frequency-based classification has perceptual relevance for human listeners and that the method is feasible. Audio fingerprinting of medical Doppler signals is potentially useful for indexing a dataset and for searching it for similar and dissimilar audio samples.
Year: 2020 PMID: 32051504 PMCID: PMC7015996 DOI: 10.1038/s41598-020-59274-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flowchart showing the two data streams taken from the medical ultrasound machine. The machine has two outputs: time domain data parameters and an analog signal, which is recorded. The time domain data were available directly from the machine and were included in a principal component analysis that allowed them to be ranked according to degree of similarity. The analog signal is first digitized in an analog-to-digital (AD) converter. The resulting digital data are exported to a computer, where the signal processing software performs a Fourier transform to extract the frequency domain data. These data are included in a second principal component analysis that allowed the identification of similar and dissimilar pairs, as well as a ranking of the sound files according to degree of similarity.
Time domain spectral Doppler features in 22 dogs.
| Parameter | Median | Mean | SD | Min | Max |
|---|---|---|---|---|---|
| PSV left | 139.95 | 143.36 | 19.23 | 112.28 | 184.02 |
| PSV right | 136.28 | 141.07 | 25.23 | 104.52 | 192.17 |
| EDV left | 35.98 | 37.20 | 13.27 | 22.08 | 71.15 |
| EDV right | 30.30 | 31.59 | 10.05 | 17.62 | 51.00 |
| PI left | 1.89 | 2.07 | 0.73 | 1.04 | 4.13 |
| PI right | 2.28 | 2.39 | 0.84 | 1.46 | 4.44 |
| RI left | 0.76 | 0.74 | 0.07 | 0.60 | 0.85 |
| RI right | 0.77 | 0.78 | 0.05 | 0.70 | 0.85 |
PSV = peak systolic velocity (cm/s), EDV = end diastolic velocity (cm/s), PI = pulsatility index, RI = resistive index. Values for these parameters were generated by software in the ultrasound machine during each scan. The table shows combined data for 22 dogs. SD = standard deviation.
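The record does not state how the ultrasound machine derives PI and RI. As a rough illustration only, the sketch below uses the standard textbook definitions, RI = (PSV - EDV) / PSV and PI = (PSV - EDV) / time-averaged mean velocity. The function names and the input values are hypothetical and are not measurements from the study.

```python
def resistive_index(psv, edv):
    """Resistive index: (PSV - EDV) / PSV, with both velocities in the same unit."""
    if psv <= 0:
        raise ValueError("PSV must be positive")
    return (psv - edv) / psv


def pulsatility_index(psv, edv, mean_velocity):
    """Pulsatility index: (PSV - EDV) / time-averaged mean velocity."""
    if mean_velocity <= 0:
        raise ValueError("mean velocity must be positive")
    return (psv - edv) / mean_velocity


# Hypothetical single-cycle values in cm/s (assumed, not taken from the table).
ri = resistive_index(140.0, 35.0)
pi = pulsatility_index(140.0, 35.0, 50.0)
print(f"RI = {ri:.2f}, PI = {pi:.2f}")  # RI = 0.75, PI = 2.10
```

Note that the time-averaged mean velocity needed for PI is not listed in the table, so PI cannot be recomputed from the tabulated values alone.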
Figure 2. Example of an original Doppler audio signal in the time domain (top) and in the frequency domain (bottom), where the power spectral density estimate (PSDE) of the same signal is displayed after Fourier transform. The PSDE has been processed so that high and low frequency signals (i.e., noise from instrumentation and adjacent tissues) are excluded. The ten most prominent peaks are automatically selected and numbered 1 to 10 according to their location on the x-axis (i.e., frequency). In further analysis, the location, prominence, height and width at half prominence of each peak were used. In the lower plot, frequency is displayed as bin number. The range of frequencies in the 200 bins displayed is approximately 800–16,000 Hz.
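The feature extraction described in this caption can be sketched roughly as follows. This is a minimal illustration using SciPy, not the authors' software, which the record does not name. The sampling rate, the band limits (borrowed from the approximate displayed range of 800–16,000 Hz, since the actual filter cut-offs are not given) and the function name `peak_features` are all assumptions.

```python
import numpy as np
from scipy.signal import periodogram, find_peaks


def peak_features(signal, fs, f_lo=800.0, f_hi=16000.0, n_peaks=10):
    """Return one row per peak: location (Hz), prominence, height, width (Hz)."""
    freqs, psd = periodogram(signal, fs=fs)

    # Discard very low and very high frequencies (assumed cut-offs).
    band = (freqs >= f_lo) & (freqs <= f_hi)
    freqs, psd = freqs[band], psd[band]

    # (None, None) requests each property without filtering any peaks out;
    # rel_height=0.5 measures the width at half prominence.
    idx, props = find_peaks(
        psd,
        height=(None, None),
        prominence=(None, None),
        width=(None, None),
        rel_height=0.5,
    )

    # Keep the n_peaks most prominent peaks, then order them by frequency.
    top = np.sort(np.argsort(props["prominences"])[::-1][:n_peaks])
    bin_width = freqs[1] - freqs[0]  # Hz per bin, converts widths to Hz
    return np.column_stack([
        freqs[idx[top]],
        props["prominences"][top],
        props["peak_heights"][top],
        props["widths"][top] * bin_width,
    ])


# Synthetic stand-in for a Doppler recording: three tones plus weak noise.
fs = 44100  # assumed sampling rate in Hz
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
demo = (1.00 * np.sin(2 * np.pi * 1000 * t)
        + 0.50 * np.sin(2 * np.pi * 3000 * t)
        + 0.25 * np.sin(2 * np.pi * 5000 * t)
        + 0.01 * rng.standard_normal(fs))
features = peak_features(demo, fs, n_peaks=3)
print(features[:, 0])  # peak locations in Hz
```

Flattening the rows of such an array would give one feature vector per audio file, which is the kind of input a principal component analysis expects.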
Figure 3. Principal component analysis. (A) Scores plot. (B) Loadings plot. Each score represents an audio file and each loading represents a frequency feature extracted from all of the audio files. Proximity in the scores plot indicates feature similarity with regard to the variance explained by the principal components. The loadings represent the weighted importance of each feature. For example, loadings to the left on PC1 had a greater influence on scores to the left on PC1. The colored circles indicate sound files identified as dissimilar, where the same color indicates a dissimilar pair. The sound files enclosed in grey ellipses indicate similar pairs. On the plot axes, numbers in parentheses indicate the variance captured. PC = principal component.
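To make the relationship between scores, loadings and pair similarity concrete, here is a small NumPy sketch. It assumes the features are autoscaled before PCA and that similarity is judged by Euclidean distance in the PC1/PC2 plane; the record states only that proximity in the scores plot indicates similarity, so both choices, the function names and the random demo data are assumptions.

```python
import numpy as np


def pca_scores_loadings(X, n_components=2):
    """PCA via SVD on autoscaled data (rows = audio files, columns = features)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one row per audio file
    loadings = Vt[:n_components].T                    # one row per feature
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained


def rank_pairs(scores):
    """List (distance, i, j) for all file pairs, most similar pair first."""
    n = len(scores)
    pairs = [
        (float(np.linalg.norm(scores[i] - scores[j])), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    ]
    return sorted(pairs)


# Random stand-in data: 10 files x 6 features, with file 1 a copy of file 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 6))
X[1] = X[0]

scores, loadings, explained = pca_scores_loadings(X)
ranked = rank_pairs(scores)
print("most similar pair:", ranked[0][1:])
print("most dissimilar pair:", ranked[-1][1:])
print("variance captured by PC1, PC2:", np.round(explained, 3))
```

The head of the ranked list corresponds to candidate similar pairs and the tail to candidate dissimilar pairs. Distances are measured only in the retained components, so two files that differ mainly along a discarded component can still appear close.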