Ivan Miguel Pires, Rui Santos, Nuno Pombo, Nuno M. Garcia, Francisco Flórez-Revuelta, Susanna Spinsante, Rossitza Goleva, Eftim Zdravevski.
Abstract
An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of the sound in that particular environment. This paper reviews audio fingerprinting techniques that can be used with the acoustic data acquired from mobile devices. A comprehensive literature search was conducted in order to identify relevant English language works aimed at the identification of the environment of ADLs using data acquired with mobile devices, published between 2002 and 2017. In total, 40 studies were analyzed and selected from 115 citations. The results highlight several audio fingerprinting techniques, including Modified discrete cosine transform (MDCT), Mel-frequency cepstrum coefficients (MFCC), Principal Component Analysis (PCA), Fast Fourier Transform (FFT), Gaussian mixture models (GMM), likelihood estimation, logarithmic moduled complex lapped transform (LMCLT), support vector machine (SVM), constant Q transform (CQT), symmetric pairwise boosting (SPB), Philips robust hash (PRH), linear discriminant analysis (LDA) and discrete cosine transform (DCT).
Keywords: Activities of Daily Living (ADL); acoustic sensors; artificial intelligence; data processing; fingerprint recognition; mobile computing; signal processing algorithms; systematic review
Year: 2018 PMID: 29315232 PMCID: PMC5795595 DOI: 10.3390/s18010160
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Study Analysis.
| Paper | Year of Publication | Population | Purpose of the Study | Devices | Raw Data Available | Source Code Available |
|---|---|---|---|---|---|---|
| ACM | | | | | | |
| Sui et al. [ | 2014 | 2500 pieces of 8 s advertisement audio; 200 pieces of audio randomly selected from the existing database and 50 pieces of irrelevant audio used as test audio | To search for audio in the database by content rather than by name | Mobile Phone (Android) | No | No |
| Liu [ | 2012 | 100,000 MP3 fragments | To create an MP3 sniffer system that includes audio fingerprinting | Not mentioned | Yes | Only for feature extraction |
| Liu et al. [ | 2011 | 10,000 MP3 fragments | Proposes an MP3 fingerprint system for the recognition of several clips | Not mentioned | The same data as [ | The same source code as [ |
| IEEE | | | | | | |
| Tsai et al. [ | 2016 | Multi-channel audio recordings of 75 real research group meetings, approximately 72 h of meetings in total | Proposes an adaptive audio fingerprint based on spectrotemporal eigenfilters | Mobile phones, tablets or laptop computers | Yes | No |
| Casagranda et al. [ | 2015 | 1024 samples | Proposes an audio fingerprinting method that uses GPS and acoustic fingerprints | Smartphone | No | No |
| Nagano et al. [ | 2015 | Approximately 1,518,177 min (25,303 h) of songs | Proposes a method to accelerate audio fingerprinting techniques by skipping the search for irrelevant signal sections | Not mentioned | Yes | No |
| Ziaei et al. [ | 2015 | 1062 10 s clips | Proposes a method to analyze and classify daily activities in personal audio recordings | Not mentioned | Yes | No |
| George et al. [ | 2015 | 1500 audio files | Proposes an audio fingerprinting method based on landmarks in the audio spectrogram | Computer | No | No |
| Kim et al. [ | 2015 | 6000 television advertisements with a total time of 1110 h | Proposes a television advertisement search based on audio fingerprinting in real environments | Television | No | No |
| Seo [ | 2014 | 1000 classic, jazz, pop, rock, and hip-hop songs | Proposes a binary audio fingerprint matching method using auxiliary information | Not mentioned | No | No |
| Rafii et al. [ | 2014 | Several songs with a duration between 6 and 9 s | Proposes an audio fingerprinting method for recognition of some clips | Computer and Smartphone | No | No |
| Naini et al. [ | 2014 | 1000 songs | Proposes an audio fingerprinting method based on maximization of the mutual information across the distortion channel | Not mentioned | No | No |
| Yang et al. [ | 2014 | 200,000 songs | Proposes a music identification system based on space-saving audio fingerprints | Not mentioned | No | No |
| Yin et al. [ | 2014 | 958 randomly chosen query excerpts | Proposes an audio fingerprinting algorithm that uses compressed-domain spectral entropy | Not mentioned | No | No |
| Wang et al. [ | 2014 | 100,000 songs | Proposes an audio fingerprinting method that uses GPUs | Not mentioned | No | No |
| Lee et al. [ | 2014 | 3000 TV advertisements | Proposes a high-performance audio fingerprint extraction method for identifying television commercial advertisements | Television | No | No |
| Shibuya et al. [ | 2013 | 1374 television programs (792 h in total) | Proposes a method of identifying media content from an audio signal recorded in reverberant and noisy environments using a mobile device | Smartphone, tablet, notebook, desktop, or another mobile device | No | No |
| Bisio et al. [ | 2013 | 20 sounds | Proposes the Improved Real-Time TV-channel Recognition (IRTR) method | Smartphone | No | No |
| Lee et al. [ | 2013 | 1000 songs as positive samples and 999 songs as negatives | Proposes a method that speeds up the search process, reducing the number of database accesses | Not mentioned | No | No |
| Bisio et al. [ | 2012 | 100,000 songs | Proposes an audio fingerprint algorithm adapted to mobile devices | Smartphone | No | No |
| Anguera et al. [ | 2012 | Several datasets | Proposes an audio fingerprinting algorithm that encodes the local spectral energies around salient points selected among the main spectral peaks in a given signal | Not mentioned | No | No |
| Duong et al. [ | 2012 | 300 real-world recordings in a living room | Proposes an audio fingerprinting method that combines the Fingerprinting technique with Generalized cross correlation | iPad | No | No |
| Wang et al. [ | 2012 | 20 music clips of 5 s each | Proposes an audio fingerprinting algorithm for recognition of some clips | Not mentioned | No | No |
| Xiong et al. [ | 2012 | 835 popular songs | Proposes an audio fingerprinting algorithm based on dynamic subband locating and normalized spectral subband centroid (SSC) | Not mentioned | No | No |
| Deng et al. [ | 2011 | 100 audio files | Proposes an audio fingerprinting algorithm based on harmonic enhancement and SSC of audio signal | Not mentioned | No | No |
| Pan et al. [ | 2011 | 62-h audio database of 1000 tracks | Proposes an audio feature in spectrum, local energy centroid, for audio fingerprinting | Not mentioned | No | No |
| Martinez et al. [ | 2011 | 3600 s of several real-time tests | Presents an audio fingerprinting method with a low-cost embedded reconfigurable platform | Computer | No | No |
| Cha [ | 2011 | 1000 songs | Proposes an indexing scheme and a search algorithm based on the index | Computer | No | Only pseudo-code for fingerprint matching |
| Schurmann et al. [ | 2011 | 7500 experiments | Proposes an audio fingerprinting method for the recognition of some clips | Computer | No | No |
| Son et al. [ | 2010 | 500 popular songs | Proposes an audio fingerprinting method using sub-fingerprint masking based on the predominant pitch extraction | Mobile devices | Yes | No |
| Chang et al. [ | 2010 | 17,208 audio clips | Presents a sub-Nyquist audio fingerprinting system for music recognition, which utilizes Compressive Sampling (CS) theory | Not mentioned | No | No |
| Umapathy et al. [ | 2007 | 213 audio signals | Proposes an audio feature extraction and a multi-group classification using the local discriminant bases (LDB) technique | Not mentioned | No | No |
| Kim et al. [ | 2007 | 100 Korean broadcast TV programs | Proposes an audio fingerprinting method for identification of bookmarked audio segments | Computer | No | No |
| Sert et al. [ | 2006 | Approximately 45 min of pop, rock, and country songs | Proposes an audio fingerprinting method from the most representative section of an audio clip | Not mentioned | No | No |
| Ramalingam et al. [ | 2006 | 250 audio files | Proposes an audio fingerprinting method using several features | Not mentioned | No | No |
| Ghouti et al. [ | 2006 | Two perceptually similar audio contents | Proposes an audio fingerprinting algorithm that uses balanced multiwavelets (BMW) | Not mentioned | No | No |
| Cook et al. [ | 2006 | 7,106,069 fingerprints | Proposes an audio fingerprinting algorithm for the fast indexing and searching of a metadata database | PDA or computer | Yes | No |
| Seo et al. [ | 2005 | 8000 classic, jazz, pop, rock, and hip-hop songs | Proposes an audio fingerprinting method based on normalized SSC | Not mentioned | No | No |
| Haitsma et al. [ | 2003 | 256 sub-fingerprints | Proposes to handle larger speed changes either by storing the fingerprint at multiple speeds in the database or by extracting the query fingerprint at multiple speeds and performing multiple database queries | Not mentioned | No | No |
| Haitsma et al. [ | 2002 | 256 sub-fingerprints | Proposes an audio fingerprinting system for recognition of some clips | Not mentioned | No | No |
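Several of the studies above (e.g., George et al., Kim et al., Lee et al.) build fingerprints from peak pairs in the audio spectrogram. The following is a minimal sketch of that idea, assuming a Hann-windowed short-time FFT and, as a simplification of George et al.'s three peaks per slice, only the single strongest peak per time slice; the frame and hop sizes are illustrative, not taken from any of the reviewed papers.

```python
import numpy as np

def spectrogram(signal, frame_size=512, hop=256):
    """Magnitude spectrogram via a Hann-windowed short-time FFT."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_size] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def peak_landmarks(spec, fan_out=3):
    """Pick the strongest peak per time slice and pair it with the peaks of
    the next `fan_out` slices, yielding (f1, f2, dt) landmark triples."""
    peaks = np.argmax(spec, axis=1)          # one peak per time slice
    landmarks = []
    for t1, f1 in enumerate(peaks):
        for dt in range(1, fan_out + 1):
            t2 = t1 + dt
            if t2 < len(peaks):
                landmarks.append((int(f1), int(peaks[t2]), dt))
    return landmarks

# Demo: fingerprint a synthetic two-tone signal (440 Hz + 880 Hz, 1 s at 8 kHz).
sr = 8000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
marks = peak_landmarks(spectrogram(audio))
```

In a full system each (f1, f2, dt) triple would be hashed into a key for constant-time database lookup, which is what makes landmark schemes scale to large catalogs.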
Study summaries.
| Paper | Outcomes |
|---|---|
| Sui et al. [ | The authors propose a two-level audio fingerprint retrieval algorithm to satisfy the demand for accurate and efficient search of advertisement audio. Based on clips of 8 s of advertisements, the authors build a database with 2500 audio fingerprints. The results show that the algorithm, implemented with parallel processing, yields a precision of 100%. |
| Liu [ | The authors create an MP3 sniffer system and test it with multi-resolution local descriptions. The system has a database of 100,000 MP3 tones, and the authors report that it has high performance: 100 queries for identifying unknown MP3 tones took less than 2 s to process. |
| Liu et al. [ | The authors describe an MP3 fingerprinting system that compares the normalized distance between two MP3 fingerprints to detect a false identification. The authors identify the possible features of the song and build a large database. For the identification, the authors test near-neighbor searching schemes and compare them with the indexing scheme, which utilizes the PCA technique, the QUery Context (QUC)-tree, and the MP3 signatures. The conclusions show that the system has a maximum average error equal to 4.26%. |
| Tsai et al. [ | The authors propose a method for aligning a set of overlapping meeting recordings, which uses an audio fingerprint representation based on spectrotemporal eigenfilters that are learned on-the-fly in an unsupervised manner. The proposed method is able to achieve more than 99% alignment accuracy at a reasonable error tolerance of 0.1 s. |
| Casagranda et al. [ | The authors propose an audio fingerprinting algorithm based on the spectral features of the audio samples. The authors reported that the algorithm is noise tolerant, which is a key feature for audio based group detection. |
| Nagano et al. [ | The authors propose an approach to accelerate fingerprinting techniques and apply it to the divide-and-locate (DAL) method. The reported results show that DAL3 can reduce the computational cost of DAL to approximately 25%. |
| Ziaei et al. [ | The authors propose a method to analyze and classify daily activities in personal audio recordings (PARs), which uses speech activity detection (SAD), speaker diarization, and a number of audio, speech and lexical features to characterize events in daily audio streams. The reported overall accuracy of the method is approximately 82%. |
| George et al. [ | The authors propose an audio fingerprinting method that is tolerant to time-stretching and is scalable. The proposed method uses three peaks per time slice, unlike Shazam, which uses only one. Additive noise degrades the lowest frequency bin, so the algorithm's performance decreases at higher noise levels compared to other algorithms. |
| Kim et al. [ | The authors propose a television advertisement search based on an audio peak-pair hashing method. The reported results show that the proposed method achieves respectable results compared to other methods. |
| Seo [ | The authors propose an asymmetric fingerprint matching method which utilizes auxiliary information obtained while extracting fingerprints from the unknown input audio. The experiments, carried out with one thousand songs under various distortions, compare the performance of the asymmetric matching with the conventional Hamming distance. Reported results suggest that the proposed method outperforms the conventional Hamming distance. |
| Rafii et al. [ | The authors propose an audio fingerprinting system with two stages: fingerprinting and matching. The system uses CQT and a threshold method for the fingerprinting stage, and the Hamming similarity and the Hough Transform for the matching stage, reporting an accuracy between 61% and 81%. |
| Naini et al. [ | The authors present a method for designing fingerprints that maximizes a mutual information metric, using a greedy optimization method that relies on the information bottleneck (IB) method. The results report a maximum accuracy around 65% in the recognition. |
| Yang et al. [ | The authors propose an efficient music identification system that utilizes a kind of space-saving audio fingerprints. The experiments were conducted on a database of 200,000 songs and a query set of 20,000 clips compressed in MP3 format with different bit rates. The authors report that, compared to other methods, this method reduces memory consumption and keeps the recall rate at approximately 98%. |
| Yin et al. [ | The authors propose a compressed-domain audio fingerprinting algorithm for MP3 music identification in the Internet of Things. The algorithm achieves promising results on robustness and retrieval precision rates under various time-frequency audio signal distortions including the challenging pitch shifting and time-scale modification. |
| Wang et al. [ | The authors propose parallelized schemes for audio fingerprinting over GPU. In the experiments, the speedup factors of the landmark lookup and landmark analysis are verified, and the overall response time is reduced. |
| Lee et al. [ | The authors propose a salient audio peak pair fingerprint extraction based on CQT. The reported results show that the proposed method has better results compared to other methods, and is suitable for many practical portable consumer devices. |
| Shibuya et al. [ | The authors develop a method that uses the quadratically interpolated FFT (QIFFT) for the audio fingerprint generation in order to identify media content from an audio signal recorded in a reverberant or noisy environment with an accuracy around 96%. |
| Bisio et al. [ | The authors present an improvement of the parameter configuration used by the Philips audio fingerprint computation algorithm in order to reduce the computational load and consequent energy consumption in the smartphone client. The results show a significant reduction of computational time and power consumption of more than 90% with a limited decrease in recognition performance. |
| Lee et al. [ | The authors propose an audio fingerprint search algorithm for music retrieval from large audio databases. The results of the proposed method achieve 80–99% search accuracy for input audio samples of 2–3 s with signal-to-noise ratio (SNR) of 10 dB or above. |
| Bisio et al. [ | The authors present an optimization of the Philips Robust Hash audio fingerprint computation algorithm, in order to adapt it to run on a smartphone device. In the experiments, the authors report that the proposed algorithm has an accuracy of 95%. |
| Anguera et al. [ | The authors present a novel local audio fingerprint called Masked Audio Spectral Keypoints (MASK) that is able to encode, with few bits, the audio information of any kind in an audio document. MASK fingerprints encode the local energy distribution around salient spectral points by using a compact binary vector. The authors report an accuracy around 58%. |
| Duong et al. [ | The authors present a new approach based on audio fingerprinting techniques. The results of this study indicate that a high level of synchronization accuracy can be achieved for a recording period as short as one second. |
| Wang et al. [ | The authors present an audio fingerprinting algorithm, where the audio fingerprints are produced based on 2-Dimagel, reporting an accuracy between 88% and 99%. |
| Xiong et al. [ | The authors propose an improved audio fingerprinting algorithm based on dynamic subband locating and normalized Spectral Subband Centroid (SSC). The authors claim that the algorithm can recognize unknown audio clips correctly, even in the presence of severe noise and distortion. |
| Deng et al. [ | The authors propose an audio fingerprinting algorithm based on harmonic enhancement and Spectral Subband Centroid (SSC). The authors build a database with 100 audio files, and also implement several techniques to reduce the noise and other degradations, proving the reliability of the method when severe channel distortion is present. The results report an accuracy between 86% and 93%. |
| Pan et al. [ | The authors propose a method for fingerprinting generation using the local energy centroid (LEC) as a feature. They report that the method is robust to different noise conditions and, when the linear speed is not changed, the audio fingerprint method based on LEC obtains an accuracy of 100%, reporting better results than Shazam’s fingerprinting. |
| Martinez et al. [ | The authors present a music information retrieval algorithm based on audio fingerprinting techniques. The size of the frame window influences the performance of the algorithm: the best frame window size for short audio tracks is between 32 ms and 64 ms, and 128 ms otherwise. |
| Cha [ | The author proposes an indexing scheme for large audio fingerprint databases. The method shows a higher performance than the Haitsma-Kalker method with respect to accuracy and speed. |
| Schurmann et al. [ | The authors propose a fuzzy-cryptography scheme that is adaptable in its noise tolerance through the parameters of the error correcting code used and the audio sample length. In a laboratory environment, the authors utilized sets of recordings for five situations at three loudness levels and four relative positions of microphones and audio source. The authors derive the expected Hamming distance among audio fingerprints through 7500 experiments. The fraction of identical bits is above 0.75 for fingerprints from the same audio context, and below 0.55 otherwise. |
| Son et al. [ | The authors present an audio fingerprinting algorithm to recognize songs in real noisy environments, which outperforms the original Philips algorithm in recognizing polyphonic music in similar real environments. |
| Chang et al. [ | The authors introduce Compressive Sampling (CS) theory to audio fingerprinting for music recognition, proposing a CS-based sub-Nyquist audio fingerprinting system. The authors claim that this system achieves an accuracy of 93.43% while reducing the sampling rate used for the extraction of musical features. |
| Umapathy et al. [ | The authors present a novel local discriminant bases (LDB)-based audio classification scheme covering a wide range of audio signals. After the experiments, the obtained results suggest significant potential for LDB-based audio classification in auditory scene analysis or environment detection. |
| Kim et al. [ | The authors develop a system that retrieves desired bookmarked video segments using audio fingerprint techniques based on the logarithmic modified Discrete Cosine Transform (DCT) modulation coefficients (LMDCT-MC) feature and a two-stage bit vector searching method. The authors state that the search accuracy obtained is 99.67%. |
| Sert et al. [ | The authors propose an audio fingerprinting model based on the Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC) features, reporting accuracies of 93% and 91%, respectively. |
| Ramalingam et al. [ | The authors propose a method to create audio fingerprints by Gaussian Mixtures using features extracted from the short-time Fourier transform (STFT) of the signal. The experiments were performed on a database of 250 audio files, obtaining the highest identification rate of 99.2% with spectral centroid. |
| Ghouti et al. [ | The authors propose a framework for robust identification of audio content using short robust hashing codes. The forward balanced multiwavelet (BMW) transform is applied to each audio frame with 5 decomposition levels; after the subbands' coefficients are distributed into 32 different blocks, the estimation quantization (EQ) scheme is applied and the hashes are computed. |
| Cook et al. [ | The authors propose a system that allows audio content identification and association of metadata in very restricted embedded environments. The authors report that the system has better performance than the method based on a more traditional n-dimensional hashing scheme, but it achieves results with 2% less accuracy. |
| Seo et al. [ | The authors propose an audio fingerprinting method based on the normalized Spectral Subband Centroid (SSC), where the match is performed using the square of the Euclidean distance. The normalized SSC obtains better results than the widely-used features, such as tonality and Mel Frequency Cepstral Coefficients (MFCC). |
| Haitsma et al. [ | The authors extend their audio fingerprinting approach to handle speed changes, with negligible effects on other aspects such as robustness and reliability. They show that the developed method is robust in the case of linear speed changes. |
| Haitsma et al. [ | The authors present an approach to audio fingerprinting in which the fingerprint extraction is based on the extraction of a 32-bit sub-fingerprint every 11.8 ms. They also develop a fingerprint database and implement a two-phase search algorithm, achieving excellent performance and allowing the analytical modeling of false acceptance rates. |
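The Haitsma et al. scheme summarized above derives each 32-bit sub-fingerprint from the signs of energy differences between adjacent frequency bands, differenced again across adjacent frames. The following is a minimal sketch assuming 33 log-spaced bands between 300 Hz and 2 kHz; the hop size is illustrative (the original uses heavily overlapping 0.37 s frames).

```python
import numpy as np

def sub_fingerprints(signal, sr=5000, frame=2048, hop=64, n_bands=33):
    """Per frame, compute energies in n_bands log-spaced bands (300-2000 Hz);
    bit m of frame n is 1 iff (E[n,m]-E[n,m+1]) - (E[n-1,m]-E[n-1,m+1]) > 0."""
    window = np.hanning(frame)
    edges = np.geomspace(300.0, 2000.0, n_bands + 1)
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    n_frames = 1 + (len(signal) - frame) // hop
    energies = np.empty((n_frames, n_bands))
    for n in range(n_frames):
        spectrum = np.abs(np.fft.rfft(signal[n * hop:n * hop + frame] * window)) ** 2
        for m in range(n_bands):
            in_band = (freqs >= edges[m]) & (freqs < edges[m + 1])
            energies[n, m] = spectrum[in_band].sum()
    band_diff = energies[:, :-1] - energies[:, 1:]    # 32 band differences
    bits = (band_diff[1:] - band_diff[:-1]) > 0       # time difference -> 32 bits
    return bits.astype(np.uint8)

# Demo: fingerprint a 1 s, 1 kHz test tone.
sr = 5000
tone = np.sin(2 * np.pi * 1000 * np.arange(sr) / sr)
fp = sub_fingerprints(tone, sr=sr)
```

Because only signs are kept, the resulting bits are largely insensitive to overall gain and mild spectral shaping, which is what gives the scheme its robustness.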
Figure 1. Flow diagram of identification and inclusion of papers.
Distribution of the features extracted in the studies.
| Features | Average Accuracy of Features | Number of Studies |
|---|---|---|
| Fast Fourier Transform (FFT) | 93.85% | 16 |
| Thresholding | 90.49% | 6 |
| Normalized spectral subband centroid (SSC) | 93.44% | 5 |
| Mel-frequency cepstrum coefficients (MFCC) | 97.30% | 4 |
| Maximum | 87.57% | 3 |
| Local peaks and landmarks | 82.32% | 3 |
| Shannon entropy | 99.10% | 2 |
| Rényi entropy | 99.10% | 2 |
| MPEG-7 descriptors | 97.50% | 2 |
| Spectral bandwidth | 97.10% | 2 |
| Spectral flatness measure | 97.10% | 2 |
| Modified discrete cosine transform (MDCT) | 93.00% | 2 |
| Constant Q transform (CQT) | 85.40% | 2 |
| Short-time Fourier transform (STFT) | 84.50% | 2 |
| Average | 83.00% | 2 |
| Minimum | 83.00% | 2 |
| Sum of the spectrum modulus of every frame | 100.00% | 1 |
| Sum of global spectrum modulus in two stages | 100.00% | 1 |
| Local energy centroid (LEC) | 100.00% | 1 |
| Time-frequency power spectral | 100.00% | 1 |
| Long-term logarithmic modified discrete cosine transform (DCT) modulation coefficients (LMDCT-MC) | 99.67% | 1 |
| Bit packing | 99.20% | 1 |
| Spectral band energy | 99.20% | 1 |
| Spectral crest factor | 99.20% | 1 |
| Spectral similarity | 99.00% | 1 |
| Timbral texture | 99.00% | 1 |
| Band periodicity | 99.00% | 1 |
| Linear prediction coefficient derived cepstral coefficients (LPCCs) | 99.00% | 1 |
| Zero crossing rate | 99.00% | 1 |
| Octaves | 99.00% | 1 |
| Single local description | 96.00% | 1 |
| Multiple local description | 96.00% | 1 |
| Chroma vectors | 96.00% | 1 |
| MP3 signatures | 96.00% | 1 |
| Time averaging | 96.00% | 1 |
| Quadratic interpolation | 96.00% | 1 |
| Sinusoidal quantification | 96.00% | 1 |
| Frequency-axial discretization | 96.00% | 1 |
| Time-axial warping | 96.00% | 1 |
| Logarithmic moduled complex lapped transform spectral peaks | 86.50% | 1 |
| Correlation coefficient | 70.00% | 1 |
| Matching score | 70.00% | 1 |
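The spectral subband centroid features that recur in the table above (SSC, normalized SSC) reduce each subband to its energy-weighted mean frequency, normalized within the band's frequency range. A minimal sketch, with illustrative band edges:

```python
import numpy as np

def subband_centroids(power_spectrum, freqs, edges):
    """Spectral subband centroid per band, normalized to [0, 1]
    within each band's frequency range."""
    centroids = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_band = (freqs >= lo) & (freqs < hi)
        p, f = power_spectrum[in_band], freqs[in_band]
        c = (f * p).sum() / p.sum() if p.sum() > 0 else lo
        centroids.append((c - lo) / (hi - lo))   # normalize within the band
    return np.array(centroids)

# Demo: for a flat spectrum, each band's centroid sits near the band's middle.
freqs = np.linspace(0, 4000, 512)
flat = np.ones_like(freqs)
edges = [0, 1000, 2000, 4000]
ssc = subband_centroids(flat, freqs, edges)
```

A peaked spectrum pulls the centroid of the affected band toward the peak, which is why SSC vectors discriminate well between timbres even under channel distortion.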
Distribution of the methods implemented in the studies.
| Methods | Average Accuracy of Methods | Number of Studies |
|---|---|---|
| Other methods | 90.78% | 15 |
| Two level search algorithm | 99.84% | 2 |
| Likelihood estimation | 97.10% | 3 |
| Principal Component Analysis (PCA) | 90.13% | 2 |
| Hamming distances between each fingerprint | 83.50% | 2 |
| Streaming audio fingerprinting (SAF) | 100.00% | 1 |
| Human auditory system (HAS) | 100.00% | 1 |
| Divide and locate (DAL) | 100.00% | 1 |
| Gaussian mixture models (GMM) modelling | 99.20% | 1 |
| Local discriminant bases (LDBS)-based automated multigroup audio classification system | 99.00% | 1 |
| Linear discriminant analysis (LDA) | 99.00% | 1 |
| Local maximum chroma energy (LMCE) | 99.00% | 1 |
| Expanded hash table lookup method | 97.40% | 1 |
| Query Context (QUC)-tree | 96.00% | 1 |
| Improved Real-Time TV-channel Recognition (IRTR) | 95.00% | 1 |
| Sub-Nyquist audio fingerprinting system | 93.43% | 1 |
| Logarithmic moduled complex lapped transform (LMCLT) peak pair | 86.50% | 1 |
| TO-Combo-SAD (Threshold Optimized Combo SAD) algorithm | 84.25% | 1 |
| Support vector machine (SVM) | 84.25% | 1 |
| Hough Transform between each fingerprint | 81.00% | 1 |
| Masked audio spectral keypoints (MASK) | 58.00% | 1 |
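Several of the matching methods above (Hamming distances between fingerprints, two-level search) reduce to comparing binary fingerprint blocks by their bit error rate. A brute-force sketch follows; the 0.35 acceptance threshold and the database contents are illustrative, not taken from the reviewed papers.

```python
import numpy as np

def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two equally sized fingerprint blocks."""
    return float(np.mean(fp_a != fp_b))

def best_match(query, database, threshold=0.35):
    """Slide the query block over every stored fingerprint, keep the lowest
    bit error rate, and accept the match only if it is below the threshold."""
    name, offset, best_ber = None, None, 1.0
    for song, ref in database.items():
        for off in range(len(ref) - len(query) + 1):
            ber = bit_error_rate(query, ref[off:off + len(query)])
            if ber < best_ber:
                name, offset, best_ber = song, off, ber
    return (name, offset, best_ber) if best_ber <= threshold else (None, None, best_ber)

# Demo: a noisy 10-frame excerpt of song_a is still found at its offset.
rng = np.random.default_rng(0)
db = {"song_a": rng.integers(0, 2, (200, 32)),
      "song_b": rng.integers(0, 2, (200, 32))}
query = db["song_a"][50:60].copy()
query[0, :5] ^= 1                      # simulate five bit errors from noise
match = best_match(query, db)
```

Unrelated random blocks differ in roughly half their bits, so a threshold well below 0.5 separates genuine matches from chance; the indexing schemes surveyed above exist precisely to avoid this exhaustive scan.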
Potential accuracies obtained by combining the most accurate methods with the most accurate features (mean accuracies of 99% or higher, as reported by their authors).
| Features | SAF | HAS | DAL | TLS | GMM | LDBS | LDA | LMCE |
|---|---|---|---|---|---|---|---|---|
| Local energy centroid (LEC) | 100.00% | 100.00% | 100.00% | 99.92% | 99.60% | 99.50% | 99.50% | 99.50% |
| Sum of global spectrum modulus in two stages | 100.00% | 100.00% | 100.00% | 99.92% | 99.60% | 99.50% | 99.50% | 99.50% |
| Sum of the spectrum modulus of every frame | 100.00% | 100.00% | 100.00% | 99.92% | 99.60% | 99.50% | 99.50% | 99.50% |
| Time-frequency power spectral | 100.00% | 100.00% | 100.00% | 99.92% | 99.60% | 99.50% | 99.50% | 99.50% |
| Long-term logarithmic modified discrete cosine transform (DCT) modulation coefficients (LMDCT-MC) | 99.84% | 99.84% | 99.84% | 99.76% | 99.44% | 99.34% | 99.34% | 99.34% |
| Bit packing | 99.60% | 99.60% | 99.60% | 99.52% | 99.20% | 99.10% | 99.10% | 99.10% |
| Spectral band energy | 99.60% | 99.60% | 99.60% | 99.52% | 99.20% | 99.10% | 99.10% | 99.10% |
| Spectral crest factor | 99.60% | 99.60% | 99.60% | 99.52% | 99.20% | 99.10% | 99.10% | 99.10% |
| Rényi entropy | 99.55% | 99.55% | 99.55% | 99.47% | 99.15% | 99.05% | 99.05% | 99.05% |
| Shannon entropy | 99.55% | 99.55% | 99.55% | 99.47% | 99.15% | 99.05% | 99.05% | 99.05% |
| Band periodicity | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
| Linear prediction coefficient derived cepstral coefficients (LPCCs) | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
| Octaves | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
| Spectral similarity | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
| Timbral texture | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
| Zero crossing rate | 99.50% | 99.50% | 99.50% | 99.42% | 99.10% | 99.00% | 99.00% | 99.00% |
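The Shannon and Rényi entropy features listed in the tables can be computed directly from each frame's power spectrum normalized to a probability distribution. A minimal sketch:

```python
import numpy as np

def spectral_entropies(frames_power, alpha=2.0):
    """Shannon and Rényi (order alpha) entropies, in bits, of each frame's
    power spectrum normalized to a probability distribution."""
    p = frames_power / frames_power.sum(axis=1, keepdims=True)
    p = np.clip(p, 1e-12, None)                 # guard against log2(0)
    shannon = -(p * np.log2(p)).sum(axis=1)
    renyi = np.log2((p ** alpha).sum(axis=1)) / (1 - alpha)
    return shannon, renyi

# Demo: a flat frame maximizes entropy (log2 of the bin count), while a
# frame with energy concentrated in one bin scores much lower.
flat = np.ones((1, 8))
peaky = np.array([[100.0, 1, 1, 1, 1, 1, 1, 1]])
s_flat, r_flat = spectral_entropies(flat)
s_peaky, r_peaky = spectral_entropies(peaky)
```

Entropy summarizes how spread out a frame's energy is with a single number per frame, which makes it a compact and noise-tolerant fingerprint feature.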