Machine Learning for Speaker Recognition [Hardcover]

Man-Wai Mak (The Hong Kong Polytechnic University), Jen-Tzung Chien (National Chiao Tung University, Taiwan)
  • Format: Hardback, 336 pages, Worked examples or Exercises; 4 Tables, black and white; 4 Halftones, black and white; 129 Line drawings, black and white
  • Publication date: 30-Jun-2020
  • Publisher: Cambridge University Press
  • ISBN-10: 1108428126
  • ISBN-13: 9781108428125
"In the last ten years, many methods have been developed and deployed for real-world biometric applications and multimedia information systems. Machine learning has been playing a crucial role in these applications where the model parameters could be learned and the system performance could be optimized. As for speaker recognition, researchers and engineers have been attempting to tackle the most difficult challenges: noise robustness and domain mismatch. These efforts have now been fruitful, leading to commercial products starting to emerge, e.g., voice authentication for e-banking and speaker identification in smart speakers. Research in speaker recognition has traditionally been focused on signal processing (for extracting the most relevant and robust features) and machine learning (for classifying the features). Recently, we have witnessed the shift in the focus from signal processing to machine learning. In particular, many studies have shown that model adaptation can address both robustness and domain mismatch. As for robust feature extraction, recent studies also demonstrate that deep learning and feature learning can be a great alternative to traditional signal processing algorithms. This book has two perspectives: Machine Learning and Speaker Recognition. The machine learning perspective gives readers insights on what makes state-of-the-art systems perform so well. The speaker recognition perspective enables readers to apply machine learning techniques to address practical issues (e.g., robustness under adverse acoustic environments and domain mismatch) when deploying speaker recognition systems. The theories and practices of speaker recognition are tightly connected in the book"--

More information

Learn fundamental and advanced machine learning techniques for robust speaker recognition and domain adaptation with this useful toolkit.
Part I. Fundamental Theories:
1. Introduction;
2. Learning algorithms;
3. Machine learning models;
Part II. Advanced Studies:
4. Deep learning models;
5. Robust speaker verification;
6. Domain adaptation;
7. Dimension reduction and data augmentation;
8. Future direction;
Index.
Man-Wai Mak is an Associate Professor in the Department of Electronic and Information Engineering at The Hong Kong Polytechnic University. Jen-Tzung Chien is a Chair Professor in the Department of Electrical and Computer Engineering, National Chiao Tung University, Taiwan. He has published extensively, including the book Bayesian Speech and Language Processing (Cambridge 2015). He is currently serving as an elected member of the IEEE Machine Learning for Signal Processing (MLSP) Technical Committee.