ROBUST SPEAKER IDENTIFICATION VIA TWO-STAGE VECTOR QUANTIZATION ENHANCEMENT
Keywords:
Speaker Identification, Vector Quantization, Feature Extraction

Abstract
Speaker identification systems are critical components of many applications, including security, authentication, and voice-controlled devices. However, their performance can degrade under environmental noise, channel distortion, and speaker variability. This paper presents an enhanced speaker identification system that uses two-stage vector quantization to improve robustness against such challenges. In the first stage, the input speech features are vector-quantized to reduce dimensionality and enhance discriminability; in the second stage, a classifier trained on the quantized feature vectors performs the speaker identification. Experimental results demonstrate that the two-stage vector quantization approach significantly improves the robustness of the speaker identification system, achieving higher accuracy even in noisy and adverse conditions.
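The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the codebook sizes, the simple k-means training routine, and the use of a second, residual codebook followed by a minimum-distortion decision rule (a common classifier choice in VQ-based speaker identification) are all assumptions, since the paper's exact configuration is not given here. Feature vectors (e.g. MFCC frames) are modeled per speaker by a stage-1 codebook plus a stage-2 codebook trained on the stage-1 residuals.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd k-means; returns a (k, dim) codebook for X."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest codeword
        labels = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                C[j] = pts.mean(0)
    return C

def quantize(X, C):
    """Map each row of X to its nearest codeword in C."""
    return C[((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)]

def train_two_stage(X, k1=8, k2=8):
    """Stage 1 models the features; stage 2 models the stage-1 residuals."""
    C1 = kmeans(X, k1)
    C2 = kmeans(X - quantize(X, C1), k2, seed=1)
    return C1, C2

def distortion(X, C1, C2):
    """Mean squared error left after passing X through both stages."""
    r = X - quantize(X, C1)          # stage-1 residual
    r = r - quantize(r, C2)          # stage-2 residual
    return (r ** 2).sum(-1).mean()

def identify(X, codebooks):
    """Pick the speaker whose two-stage codebook fits X best."""
    return min(codebooks, key=lambda s: distortion(X, *codebooks[s]))
```

A toy usage: train one two-stage codebook per enrolled speaker on that speaker's feature frames, then score a test utterance against every codebook and return the minimum-distortion speaker.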
References
Atal, B., “Effectiveness of Linear Prediction Characteristics of the Speech Wave for Automatic Speaker Identification and Verification,” Journal of the Acoustical Society of America, Vol. 55, pp. 1304–1312 (1974).
White, G. M. and Neely, R. B., “Speech Recognition Experiments with Linear Prediction, Bandpass Filtering, and Dynamic Programming,” IEEE Trans. on Acoustics, Speech, Signal Processing, Vol. 24, pp. 183–188 (1976).
Vergin, R., O’Shaughnessy, D. and Farhat, A., “Generalized Mel Frequency Cepstral Coefficients for Large-Vocabulary Speaker-Independent Continuous-Speech Recognition,” IEEE Trans. on Speech and Audio Processing, Vol. 7, pp. 525–532 (1999).
Furui, S., “Cepstral Analysis Technique for Automatic Speaker Verification,” IEEE Trans. on Acoustics, Speech, Signal Processing, Vol. 29, pp. 254–272 (1981).
Tishby, N. Z., “On the Application of Mixture AR Hidden Markov Models to Text Independent Speaker Recognition,” IEEE Trans. on Signal Processing, Vol. 39, pp. 563–570 (1991).
Yu, K., Mason, J. and Oglesby, J., “Speaker Recognition Using Hidden Markov Models, Dynamic Time Warping and Vector Quantisation,” IEE Proceedings – Vision, Image and Signal Processing, Vol. 142, pp. 313–318 (1995).
Reynolds, D. A. and Rose, R. C., “Robust Text-Independent Speaker Identification Using Gaussian Mixture Speaker Models,” IEEE Trans. on Speech and Audio Processing, Vol. 3, pp. 72–83 (1995).
Miyajima, C., Hattori, Y., Tokuda, K., Masuko, T., Kobayashi, T. and Kitamura, T., “Text-Independent Speaker Identification Using Gaussian Mixture Models Based on Multi-space Probability Distribution,” IEICE Trans. on Information and Systems, Vol. E84-D, pp. 847–855 (2001).
Alamo, C. M., Gil, F. J. C., Munilla, C. T. and Gomez, L. H., “Discriminative Training of GMM for Speaker Identification,” Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP 1996), Vol. 1, pp. 89–92 (1996).
Pellom, B. L. and Hansen, J. H. L., “An Efficient Scoring Algorithm for Gaussian Mixture Model Based Speaker Identification,” IEEE Signal Processing Letters, Vol. 5, pp. 281–284 (1998).
License
Copyright (c) 2014 Wan-Chen Li

This work is licensed under a Creative Commons Attribution 4.0 International License.
All content published in the Journal of Applied Science and Social Science (JASSS) is protected by copyright. Authors retain the copyright to their work, and grant JASSS the right to publish the work under a Creative Commons Attribution License (CC BY). This license allows others to distribute, remix, adapt, and build upon the work, even commercially, as long as they credit the author(s) for the original creation.