Speaker-dependent multipitch tracking using deep neural networks.



Abstract

Multipitch tracking is important for speech and signal processing. However, it is challenging to design an algorithm that achieves both accurate pitch estimation and correct speaker assignment. In this paper, deep neural networks (DNNs) are used to model the probabilistic pitch states of two simultaneous speakers. To capture speaker-dependent information, two types of DNN with different training strategies are proposed. The first is trained for each speaker enrolled in the system (speaker-dependent DNN), and the second is trained for each speaker pair (speaker-pair-dependent DNN). Several extensions are then introduced to relax these constraints: gender-pair-dependent DNNs, speaker adaptation of gender-pair-dependent DNNs, and training with multiple energy ratios. A factorial hidden Markov model (FHMM) integrates the pitch probabilities and generates the most likely pitch tracks with a junction tree algorithm. Experiments show that the proposed methods substantially outperform other speaker-independent and speaker-dependent multipitch trackers on two-speaker mixtures. With multi-ratio training, the proposed methods achieve consistent performance across various energy ratios of the two speakers in a mixture.
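The decoding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the junction tree algorithm with a brute-force Viterbi search over the product of the two speakers' pitch states, which gives the same most likely joint path for a small state space but scales poorly. The function name `joint_viterbi`, the shared transition matrix, the uniform initial prior, and the use of a product of per-speaker probabilities as the frame score are all assumptions made for illustration.

```python
import math
from itertools import product


def safe_log(x):
    """Natural log that maps zero probability to -inf instead of raising."""
    return math.log(x) if x > 0 else float("-inf")


def joint_viterbi(p1, p2, trans):
    """Most likely pair of pitch-state tracks for two Markov chains.

    p1[t][i], p2[t][j]: per-frame probabilities of pitch state i (speaker 1)
    and j (speaker 2); hypothetical stand-ins for DNN outputs.
    trans[a][b]: probability of moving from state a to state b, shared by
    both chains (an assumption). State 0 can be read as "unvoiced".
    """
    n = len(trans)
    states = list(product(range(n), range(n)))  # joint states (i, j)

    # Initial scores, assuming a uniform prior over joint states.
    score = {(i, j): safe_log(p1[0][i]) + safe_log(p2[0][j]) for i, j in states}
    backpointers = []

    for t in range(1, len(p1)):
        new_score, pointers = {}, {}
        for i, j in states:
            def candidate(prev):
                a, b = prev
                # The two chains transition independently of each other.
                return score[prev] + safe_log(trans[a][i]) + safe_log(trans[b][j])

            best = max(states, key=candidate)
            pointers[(i, j)] = best
            new_score[(i, j)] = (
                candidate(best) + safe_log(p1[t][i]) + safe_log(p2[t][j])
            )
        score = new_score
        backpointers.append(pointers)

    # Backtrack from the best final joint state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [s[0] for s in path], [s[1] for s in path]


# Toy example: three pitch states and "sticky" transitions.
trans = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
p1 = [[0.1, 0.8, 0.1], [0.1, 0.4, 0.5], [0.1, 0.8, 0.1]]  # noisy middle frame
p2 = [[0.1, 0.1, 0.8], [0.1, 0.1, 0.8], [0.1, 0.1, 0.8]]
track1, track2 = joint_viterbi(p1, p2, trans)
```

In the toy example, the middle frame for speaker 1 slightly favors state 2, but the transition model keeps the decoded track on state 1. This shows how temporal continuity can smooth frame-level pitch probabilities.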


Year: 2017 PMID: 28253703 PMCID: PMC6909980 DOI: 10.1121/1.4973687

Source DB: PubMed Journal: J Acoust Soc Am ISSN: 0001-4966 Impact factor: 1.840


1. Application of deep neural network and deep reinforcement learning in wireless communication.

Authors: Ming Li; Hui Li
Journal: PLoS One Date: 2020-07-02 Impact factor: 3.240


1. YIN, a fundamental frequency estimator for speech and music.

2. A fast learning algorithm for deep belief nets.

3. An audio-visual corpus for speech perception and automatic speech recognition.

4. Cepstrum pitch determination.

5. Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises.
