Discriminative Learning for Speech Recognition: Theory and Practice [Paperback]

In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error (MCE), and minimum phone/word error (MPE/MWE). The unification is presented, with rigorous mathematical analysis, in a common rational-function form. This common form enables the use of the growth-transformation (or extended Baum-Welch) optimization framework in discriminative learning of model parameters. In addition to the necessary background and tutorial material on the subject, we include technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete hidden Markov models (HMMs), and continuous-density HMMs in discriminative learning. Selected experimental results obtained firsthand by the authors are presented to show that discriminative learning can yield superior speech recognition performance over conventional parameter learning. Details of major algorithmic implementation issues with practical significance are provided to enable practitioners to translate the theory developed in the earlier parts of the book directly into engineering practice.
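As a sketch of the unification described above (the notation here is an assumption based on the table of contents, which refers to a criterion-dependent function C(s) in product form for MMI and summation form for MCE and MPE/MWE; it is not quoted verbatim from the book), the common rational-function form of the objective can be written as:

```latex
% Unified rational-function form of the discriminative training objective
% for a training utterance X with candidate label sequences s:
O(\Lambda) \;=\;
  \frac{\sum_{s} p(X, s \mid \Lambda)\, C(s)}
       {\sum_{s} p(X, s \mid \Lambda)}
% where \Lambda denotes the model parameters and C(s) is a
% criterion-dependent weighting of label sequence s (e.g., a product
% form for MMI, a summation form for MCE and MPE/MWE). Because O(\Lambda)
% is a ratio of two polynomial-like functions of the model parameters,
% growth-transformation (extended Baum-Welch) updates apply.
```

Casting each criterion as such a ratio of two sums over label sequences is what allows a single growth-transformation update rule to cover MMI, MCE, and MPE/MWE.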
Introduction and Background
1(24)
What is Discriminative Learning?
1(1)
What is Speech Recognition?
2(2)
Roles of Discriminative Learning in Speech Recognition
4(1)
Background: Basic Probability Distributions
5(12)
Multinomial Distributions
6(1)
Gaussian and Mixture-of-Gaussian Distributions
7(1)
Exponential-Family Distribution
7(10)
Background: Basic Optimization Concepts and Techniques
17(6)
Basic Definitions
18(1)
Necessary and Sufficient Conditions for an Optimum
18(1)
Lagrange Multiplier Method for Constrained Optimization
19(1)
Gradient Descent Method
20(1)
Growth Transformation Method: Introduction
21(2)
Organization of the Book
23(2)
Statistical Speech Recognition: A Tutorial
25(6)
Introduction
25(1)
Language Modeling
26(1)
Acoustic Modeling and HMMs
27(4)
Discriminative Learning: A Unified Objective Function
31(16)
Introduction
31(1)
A Unified Discriminative Training Criterion
32(1)
Notations
32(1)
The Central Result
32(1)
MMI and its Unified Form
33(2)
Introduction to MMI Criterion
33(1)
Reformulation of the MMI Criterion Into Its Unified Form
34(1)
MCE and its Unified Form
35(4)
Introduction to the MCE Criterion
35(3)
Reformulation of the MCE Criterion Into Its Unified Form
38(1)
Minimum Phone/Word Error and its Unified Form
39(2)
Introduction to the MPE/MWE Criterion
39(1)
Reformulation of the MPE/MWE Criterion Into Its Unified Form
40(1)
Discussions and Comparisons
41(6)
Discussion and Elaboration on the Unified Form
41(2)
Comparisons With Another Unifying Framework
43(4)
Discriminative Learning Algorithm for Exponential-Family Distributions
47(12)
Exponential-Family Models for Classification
47(1)
Construction of Auxiliary Functions
48(1)
GT Learning for Exponential-Family Distributions
49(5)
Estimation Formulas for Two Exponential-Family Distributions
54(5)
Multinomial Distribution
54(1)
Multivariate Gaussian Distribution
55(4)
Discriminative Learning Algorithm for Hidden Markov Model
59(16)
Estimation Formulas for Discrete HMM
59(8)
Constructing Auxiliary Function F(Λ; Λ')
60(1)
Constructing Auxiliary Function V(Λ; Λ')
60(1)
Simplifying Auxiliary Function V(Λ; Λ')
61(4)
GT by Optimizing Auxiliary Function U(Λ; Λ')
65(2)
Estimation Formulas for CDHMM
67(3)
Relationship with Gradient-Based Methods
70(1)
Setting Constant D for GT-Based Optimization
71(4)
Existence Proof of Finite D in GT Updates for CDHMM
72(3)
Practical Implementation of Discriminative Learning
75(16)
Computing ΔΓ(i, r, t) in Growth-Transform Formulas
75(4)
Product Form of C(s) (for MMI)
76(2)
Summation Form of C(s) (MCE and MPE/MWE)
78(1)
Computing ΔΓ(i, r, t) Using Lattices
79(9)
Computing ΔΓ(i, r, t) for MMI Involving Lattices
80(3)
Computing ΔΓ(i, r, t) for MPE/MWE Involving Lattices
83(4)
Computing ΔΓ(i, r, t) for MCE Involving Lattices
87(1)
Arbitrary Exponent Scaling in MCE Implementation
88(1)
Arbitrary Slope in Defining MCE Cost Function
89(2)
Selected Experimental Results
91(6)
Experimental Results on the Small ASR Task: TIDIGITS
91(1)
Telephony LV-ASR Applications
92(5)
Epilogue
97(6)
Summary of Book Contents
97(1)
Summary of Contributions
98(1)
Remaining Theoretical Issue and Future Direction
99(4)
Major Symbols Used in the Book and Their Descriptions 103(2)
Mathematical Notation 105(2)
Bibliography 107(4)
Author Biography 111