Preface |
|
xi | |
Abbreviations |
|
xiii | |
1 Introduction |
|
1 | (14) |
|
1.1 Importance of Music Emotion Recognition |
|
|
1 | (3) |
|
1.2 Recognizing the Perceived Emotion of Music |
|
|
4 | (2) |
|
1.3 Issues of Music Emotion Recognition |
|
|
6 | (6) |
|
1.3.1 Ambiguity and Granularity of Emotion Description |
|
|
6 | (1) |
|
1.3.2 Heavy Cognitive Load of Emotion Annotation |
|
|
7 | (1) |
|
1.3.3 Subjectivity of Emotional Perception |
|
|
8 | (1) |
|
1.3.4 Semantic Gap between Low-Level Audio Signal and High-Level Human Perception |
|
|
9 | (3) |
|
|
12 | (3) |
2 Overview of Emotion Description and Recognition |
|
15 | (20) |
|
|
15 | (6) |
|
2.1.1 Categorical Approach |
|
|
16 | (2) |
|
2.1.2 Dimensional Approach |
|
|
18 | (2) |
|
2.1.3 Music Emotion Variation Detection |
|
|
20 | (1) |
|
|
21 | (11) |
|
2.2.1 Categorical Approach |
|
|
22 | (7) |
|
|
23 | (2) |
|
2.2.1.2 Data Preprocessing |
|
|
25 | (1) |
|
|
26 | (2) |
|
2.2.1.4 Feature Extraction |
|
|
28 | (1) |
|
|
28 | (1) |
|
2.2.2 Dimensional Approach |
|
|
29 | (2) |
|
2.2.3 Music Emotion Variation Detection |
|
|
31 | (1) |
|
|
32 | (3) |
3 Music Features |
|
35 | (20) |
|
|
36 | (1) |
|
|
37 | (5) |
|
|
42 | (2) |
|
|
44 | (7) |
|
|
51 | (3) |
|
|
54 | (1) |
4 Dimensional MER by Regression |
|
55 | (26) |
|
4.1 Adopting the Dimensional Conceptualization of Emotion |
|
|
55 | (2) |
|
|
57 | (2) |
|
4.2.1 Weighted Sum of Component Functions |
|
|
57 | (1) |
|
|
58 | (1) |
|
4.2.3 System Identification Approach (System ID) |
|
|
58 | (1) |
|
4.3 The Regression Approach |
|
|
59 | (3) |
|
|
59 | (1) |
|
4.3.2 Problem Formulation |
|
|
60 | (1) |
|
4.3.3 Regression Algorithms |
|
|
60 | (3) |
|
4.3.3.1 Multiple Linear Regression |
|
|
60 | (1) |
|
4.3.3.2 ε-Support Vector Regression |
|
|
61 | (1) |
|
4.3.3.3 AdaBoost Regression Tree (AdaBoost.RT) |
|
|
62 | (1) |
|
|
62 | (1) |
|
|
63 | (5) |
|
|
63 | (2) |
|
|
65 | (2) |
|
|
67 | (1) |
|
|
67 | (1) |
|
4.6 Performance Evaluation |
|
|
68 | (11) |
|
4.6.1 Consistency Evaluation of the Ground Truth |
|
|
68 | (2) |
|
4.6.2 Data Transformation |
|
|
70 | (1) |
|
|
71 | (3) |
|
4.6.4 Accuracy of Emotion Recognition |
|
|
74 | (3) |
|
4.6.5 Performance Evaluation for Music Emotion Variation Detection |
|
|
77 | (1) |
|
4.6.6 Performance Evaluation for Emotion Classification |
|
|
78 | (1) |
|
|
79 | (2) |
5 Ranking-Based Emotion Annotation and Model Training |
|
81 | (26) |
|
|
81 | (1) |
|
5.2 Ranking-Based Emotion Annotation |
|
|
82 | (2) |
|
5.3 Computational Model for Ranking Music by Emotion |
|
|
84 | (6) |
|
|
85 | (1) |
|
|
85 | (5) |
|
|
85 | (1) |
|
|
85 | (2) |
|
|
87 | (3) |
|
|
90 | (1) |
|
|
90 | (6) |
|
|
92 | (3) |
|
|
95 | (1) |
|
5.6 Performance Evaluation |
|
|
96 | (8) |
|
5.6.1 Cognitive Load of Annotation |
|
|
97 | (1) |
|
5.6.2 Accuracy of Emotion Recognition |
|
|
98 | (6) |
|
5.6.2.1 Comparison of Different Feature Representations |
|
|
99 | (1) |
|
5.6.2.2 Comparison of Different Learning Algorithms |
|
|
100 | (2) |
|
|
102 | (2) |
|
5.6.3 Subjective Evaluation of the Prediction Result |
|
|
104 | (1) |
|
|
104 | (1) |
|
|
105 | (2) |
6 Fuzzy Classification of Music Emotion |
|
107 | (12) |
|
|
107 | (1) |
|
|
108 | (4) |
|
6.2.1 Fuzzy k-NN Classifier |
|
|
108 | (1) |
|
6.2.2 Fuzzy Nearest-Mean Classifier |
|
|
109 | (3) |
|
|
112 | (1) |
|
|
113 | (1) |
|
|
113 | (1) |
|
6.4.2 Feature Extraction and Feature Selection |
|
|
113 | (1) |
|
6.5 Performance Evaluation |
|
|
114 | (3) |
|
6.5.1 Accuracy of Emotion Classification |
|
|
114 | (1) |
|
6.5.2 Music Emotion Variation Detection |
|
|
114 | (3) |
|
|
117 | (2) |
7 Personalized MER and Groupwise MER |
|
119 | (16) |
|
|
119 | (2) |
|
|
121 | (1) |
|
|
122 | (2) |
|
|
124 | (4) |
|
|
124 | (2) |
|
7.4.2 Personal Information Collection |
|
|
126 | (1) |
|
|
127 | (1) |
|
7.5 Performance Evaluation |
|
|
128 | (6) |
|
7.5.1 Performance of the General Method |
|
|
128 | (2) |
|
7.5.2 Performance of GWMER |
|
|
130 | (1) |
|
7.5.3 Performance of PMER |
|
|
130 | (4) |
|
|
134 | (1) |
8 Two-Layer Personalization |
|
135 | (10) |
|
|
135 | (1) |
|
|
136 | (1) |
|
8.3 Residual Modeling and Two-Layer Personalization Scheme |
|
|
137 | (2) |
|
8.4 Performance Evaluation |
|
|
139 | (4) |
|
|
143 | (2) |
9 Probabilistic Music Emotion Distribution Prediction |
|
145 | (28) |
|
|
145 | (1) |
|
|
146 | (2) |
|
9.3 The KDE-Based Approach to Music Emotion Distribution Prediction |
|
|
148 | (9) |
|
9.3.1 Ground Truth Collection |
|
|
148 | (2) |
|
|
150 | (3) |
|
9.3.2.1 ν-Support Vector Regression |
|
|
151 | (1) |
|
9.3.2.2 Gaussian Process Regression |
|
|
151 | (2) |
|
|
153 | (3) |
|
9.3.3.1 Weighted by Performance |
|
|
153 | (1) |
|
|
154 | (2) |
|
9.3.4 Output of Emotion Distribution |
|
|
156 | (1) |
|
|
157 | (4) |
|
|
157 | (1) |
|
|
157 | (4) |
|
9.5 Performance Evaluation |
|
|
161 | (6) |
|
9.5.1 Comparison of Different Regression Algorithms |
|
|
161 | (1) |
|
9.5.2 Comparison of Different Distribution Modeling Methods |
|
|
162 | (3) |
|
9.5.3 Comparison of Different Feature Representations |
|
|
165 | (1) |
|
9.5.4 Evaluation of Regressor Fusion |
|
|
166 | (1) |
|
|
167 | (5) |
|
|
172 | (1) |
10 Lyrics Analysis and Its Application to MER |
|
173 | (14) |
|
|
173 | (1) |
|
10.2 Lyrics Feature Extraction |
|
|
174 | (5) |
|
|
175 | (1) |
|
10.2.2 Probabilistic Latent Semantic Analysis (PLSA) |
|
|
176 | (1) |
|
|
177 | (2) |
|
10.3 Multimodal MER System |
|
|
179 | (2) |
|
10.4 Performance Evaluation |
|
|
181 | (3) |
|
10.4.1 Comparison of Multimodal Fusion Methods |
|
|
181 | (2) |
|
10.4.2 Performance of PLSA Model |
|
|
183 | (1) |
|
10.4.3 Performance of Bi-Gram Model |
|
|
184 | (1) |
|
|
184 | (3) |
11 Chord Recognition and Its Application to MER |
|
187 | (10) |
|
|
187 | (4) |
|
11.1.1 Beat Tracking and PCP Extraction |
|
|
188 | (1) |
|
11.1.2 Hidden Markov Model and N-Gram Model |
|
|
188 | (2) |
|
|
190 | (1) |
|
|
191 | (2) |
|
11.2.1 Longest Common Chord Subsequence |
|
|
192 | (1) |
|
|
192 | (1) |
|
|
193 | (1) |
|
11.4 Performance Evaluation |
|
|
193 | (3) |
|
11.4.1 Evaluation of Chord Recognition System |
|
|
193 | (1) |
|
11.4.2 Accuracy of Emotion Classification |
|
|
194 | (2) |
|
|
196 | (1) |
12 Genre Classification and Its Application to MER |
|
197 | (10) |
|
|
197 | (1) |
|
12.2 Two-Layer Music Emotion Classification |
|
|
198 | (1) |
|
12.3 Performance Evaluation |
|
|
199 | (6) |
|
|
199 | (1) |
|
12.3.2 Analysis of the Correlation between Genre and Emotion |
|
|
200 | (3) |
|
12.3.3 Evaluation of the Two-Layer Emotion Classification Scheme |
|
|
203 | (5) |
|
12.3.3.1 Computational Model |
|
|
203 | (1) |
|
12.3.3.2 Evaluation Measures |
|
|
203 | (1) |
|
|
204 | (1) |
|
|
205 | (2) |
13 Music Retrieval in the Emotion Plane |
|
207 | (6) |
|
13.1 Emotion-Based Music Retrieval |
|
|
207 | (1) |
|
13.2 2D Visualization of Music |
|
|
208 | (1) |
|
|
208 | (2) |
|
13.3.1 Query by Emotion Point (QBEP) |
|
|
209 | (1) |
|
13.3.2 Query by Emotion Trajectory (QBET) |
|
|
209 | (1) |
|
13.3.3 Query by Artist and Emotion (QBAE) |
|
|
209 | (1) |
|
13.3.4 Query by Lyrics and Emotion (QBLE) |
|
|
209 | (1) |
|
|
210 | (2) |
|
|
212 | (1) |
14 Future Research Directions |
|
213 | (6) |
|
14.1 Exploiting Vocal Timbre for MER |
|
|
213 | (1) |
|
14.2 Emotion Distribution Prediction Based on Rankings |
|
|
214 | (1) |
|
14.3 Personalized Emotion-Based Music Retrieval |
|
|
215 | (1) |
|
14.4 Situational Factors of Emotion Perception |
|
|
215 | (1) |
|
14.5 Connections between Dimensional and Categorical MER |
|
|
216 | (1) |
|
14.6 Music Retrieval and Organization in 3D Emotion Space |
|
|
216 | (3) |
References |
|
219 | (18) |
Index |
|
237 | |