
Performance Evaluation of Subharmonic-to-Harmonic Ratio (SHR) Computation.

Christian T Herbst

Abstract

Subharmonics are an important class of voice signals, relevant for speech, pathological voice, singing, and animal bioacoustics. They arise from special cases of amplitude modulation (AM) or frequency modulation (FM) of the time-domain signal. Surprisingly, to date there is only one open-source subharmonics detector available to the scientific community: Sun's subharmonic-to-harmonic ratio (SHR). Here, this algorithm was subjected to a formal evaluation with two data sets of synthesized and empirical speech samples. Both data sets consisted of electroglottographic (EGG) signals, i.e., a physiological correlate of vocal fold oscillation that bypasses vocal tract acoustics. Data Set I contained 2560 synthesized EGG signals with varying degrees of AM and FM, fundamental frequency (fo), periodicity, and signal-to-noise ratio (SNR). Data Set II was made up of 25 EGG samples extracted from the CMU Arctic speech database. To establish a "ground truth" of subharmonicity, these samples were manually annotated by a group of five external experts. Analysis of the synthesized data suggested that the SHR metric is relatively robust as long as the subharmonic modulation extent is below 0.35 for the FM scenario and below 0.7 for the AM scenario. In the CMU Arctic speech data samples, the SHR analysis reached a maximum sensitivity of about 87% at a specificity of over 90%, but only with adaptive algorithm parameter settings. In contrast, the algorithm's default parameter settings correctly classified only about 9% of all subharmonic instances. The SHR is a useful metric for assessing the degree of subharmonics contained in voice signals, but only with adaptive parameter settings. In particular, the frequency ceiling should be set to five times the highest fo, and the frame length to at least five times the largest fundamental period of the analyzed signal. For subharmonic classification, a threshold of SHR ≥ 0.01 is recommended.
Copyright © 2019 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
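
The parameter recommendations in the abstract can be turned into a small helper. The following is only a minimal sketch, not the SHR algorithm itself and not part of Sun's implementation: the function names `recommended_shr_settings` and `is_subharmonic` are hypothetical, and the sketch assumes that the lowest and highest fo of the analyzed signal are already known (for instance from a prior pitch estimate).

```python
def recommended_shr_settings(fo_min_hz, fo_max_hz):
    """Derive adaptive analysis settings from a known fo range (hypothetical helper).

    Follows the abstract's recommendations:
      - frequency ceiling = 5 x the highest fo
      - frame length >= 5 x the largest fundamental period (1 / lowest fo)
    """
    if not 0 < fo_min_hz <= fo_max_hz:
        raise ValueError("expected 0 < fo_min_hz <= fo_max_hz")
    return {
        "frequency_ceiling_hz": 5.0 * fo_max_hz,
        # The largest period belongs to the lowest fo, so 5 * (1 / fo_min).
        "min_frame_length_s": 5.0 / fo_min_hz,
    }


def is_subharmonic(shr_value, threshold=0.01):
    """Classify a frame as subharmonic using the recommended SHR >= 0.01 rule."""
    return shr_value >= threshold


# Example with an assumed fo range of 80 to 250 Hz.
settings = recommended_shr_settings(80.0, 250.0)
print(settings)
print(is_subharmonic(0.2), is_subharmonic(0.005))
```

For the assumed range of 80 to 250 Hz, this gives a frequency ceiling of 1250 Hz and a minimum frame length of 62.5 ms. The SHR values passed to `is_subharmonic` would have to come from an actual SHR analysis, which this sketch does not perform.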

Keywords: EGG; Electroglottography; Period doubling; SHR; Subharmonic-to-harmonic ratio; Subharmonics; Voice

Year: 2020. PMID: 32165022. DOI: 10.1016/j.jvoice.2019.11.005

Source DB: PubMed. Journal: J Voice. ISSN: 0892-1997. Impact factor: 2.009


