Preface  xiii

1 Data Analysis  1
  1.1 Perspectives of Data Analysis  1
  1.2 Strategies and Stages of Data Analysis  3
    1.3.1 Heterogeneity in Data Sources  5
      1.3.1.1 Heterogeneity in Study Subject Populations  5
      1.3.1.2 Heterogeneity in Data due to Timing of Generations  5
    1.3.3 Spurious Correlation  6
  1.4 Data Sets Analyzed in This Book  7
    1.4.2 Riboflavin Production with Bacillus Subtilis  7
    1.4.4 The Boston Housing Data Set  8
2 Examining Data Distribution  9
    2.1.1 Histogram, Stem-and-Leaf, Density Plot  9
    2.1.3 Quantile-Quantile (Q-Q) Plot, Normal Plot, Probability-Probability (P-P) Plot  11
    2.2.2 Ellipse - Visualization of Covariance and Correlation  13
    2.2.3 Multivariate Normality Test  17
  2.3 More Than Two Dimensions  19
    2.3.1 Scatter Plot Matrix  19
  2.4 Visualization of Categorical Data  25
3 Regressions  29
    3.2.1 Example: Lasso on Continuous Data  31
    3.2.2 Example: Lasso on Binary Data  32
    3.2.3 Example: Lasso on Survival Data  33
    3.3.1 Example: Group Lasso on Gene Signatures  35
    3.4.1 Example: Lasso, Group Lasso, Sparse Group Lasso on Simulated Continuous Data  38
    3.4.2 Example: Lasso, Group Lasso, Sparse Group Lasso on Gene Signatures Continuous Data  41
    3.5.1 Example: Adaptive Lasso on Continuous Data  46
    3.5.2 Example: Adaptive Lasso on Binary Data  47
    3.6.1 Example: Elastic Net on Continuous Data  51
    3.6.2 Example: Elastic Net on Binary Data  52
  3.7 The Sure Screening Method  53
    3.7.1 The Sure Screening Method  54
    3.7.2 Sure Independence Screening on Model Selection  55
    3.7.3 Example: SIS on Continuous Data  56
    3.7.4 Example: SIS on Survival Data  56
  3.8 Identify Minimal Class of Models  57
    3.8.1 Analysis Using Minimal Models  58
4 Recursive Partitioning Modeling  59
  4.1 Recursive Partitioning Modeling via Trees  59
    4.1.1 Elements of Growing a Tree  59
    4.1.2 The Impurity Function  60
      4.1.2.1 Definition of Impurity Function  61
      4.1.2.2 Measure of Node Impurity - the Gini Index  61
    4.1.3 Misclassification Cost  61
    4.1.5 Example of Recursive Partitioning  63
      4.1.5.1 Recursive Partitioning with Binary Outcomes  63
      4.1.5.2 Recursive Partitioning with Continuous Outcomes  65
      4.1.5.3 Recursive Partitioning for Survival Outcomes  67
    4.2.1 Mechanism of Action of Random Forests  72
    4.2.2 Variable Importance  72
    4.2.3 Random Forests for Regression  73
    4.2.4 Example of Random Forest Data Analysis  73
      4.2.4.1 randomForest for Binary Data  73
      4.2.4.2 randomForest for Continuous Data  76
  4.3 Random Survival Forest  77
    4.3.1 Algorithm to Construct RSF  78
    4.3.2 Individual and Ensemble Estimate at Terminal Nodes  79
  4.4 XGBoost: A Tree Boosting System  81
    4.4.1 Example Using xgboost for Data Analysis  83
      4.4.1.1 xgboost for Binary Data  83
      4.4.1.2 xgboost for Continuous Data  84
    4.4.2 Example - xgboost for Cox Regression  87
  4.5 Model-based Recursive Partitioning  88
    4.5.1 The Recursive Partitioning Algorithm  89
  4.6 Recursive Partition for Longitudinal Data  91
    4.6.2 Recursive Partition for Longitudinal Data Based on Baseline Covariates  92
    4.6.4 Example of Recursive Partitioning of Longitudinal Data  93
  4.7 Analysis of Ordinal Data  95
  4.8 Examples - Analysis of Ordinal Data  96
    4.8.1 Analysis of Cleveland Clinic Heart Data (Ordinal)  96
    4.8.2 Analysis of Cleveland Clinic Heart Data (Twoing)  97
  4.9 Advantages and Disadvantages of Trees  99
5 Support Vector Machine  101
  5.1 General Theory of Classification and Regression in Hyperplane  101
      5.1.2.1 Method of Stochastic Approximation  103
      5.1.2.2 Method of Sigmoid Approximations  103
      5.1.2.3 Method of Radial Basis Functions  104
  5.2 SVM for Indicator Functions  104
    5.2.1 Optimal Hyperplane for Separable Data Sets  104
      5.2.1.1 Constructing the Optimal Hyperplane  105
    5.2.2 Optimal Hyperplane for Non-Separable Sets  106
      5.2.2.1 Generalization of the Optimal Hyperplane  106
    5.2.3 Support Vector Machine  108
      5.2.4.1 Polynomial Kernel Functions  110
      5.2.4.2 Radial Basis Kernel Functions  110
    5.2.5 Example: Analysis of Binary Classification Using SVM  110
    5.2.6 Example: Effect of Kernel Selection  112
  5.3 SVM for Continuous Data  112
    5.3.1 Minimizing the Risk with ε-insensitive Loss Functions  113
    5.3.2 Example: Regression Analysis Using SVM  115
  5.4 SVM for Survival Data Analysis  117
    5.4.1 Example: Analysis of Survival Data Using SVM  118
  5.5 Feature Elimination for SVM  119
    5.5.1 Example: Gene Selection via SVM with Feature Elimination  120
  5.6 Sparse Bayesian Learning with Relevance Vector Machine (RVM)  122
    5.6.1 Example: Regression Analysis Using RVM  125
    5.6.2 Example: Curve Fitting for SVM and RVM  125
  5.7 SV Machines for Function Estimation  127
6 Cluster Analysis  129
  6.1 Measure of Distance/Dissimilarity  129
    6.1.1 Continuous Variables  130
    6.1.2 Binary and Categorical Variables  130
    6.1.4 Other Measures of Dissimilarity  131
  6.2 Hierarchical Clustering  131
    6.2.2 Example of Hierarchical Clustering  133
    6.3.1 General Description of K-means Clustering  135
    6.3.2 Estimating the Number of Clusters  137
  6.4 The PAM Clustering Algorithm  139
    6.4.1 Example of K-means with PAM Clustering Algorithm  141
    6.5.1 Example of Bagged Clustering  142
  6.6 RandomForest for Clustering  144
    6.6.1 Example: Random Forest for Clustering  144
  6.7 Mixture Models/Model-based Cluster Analysis  145
  6.8 Stability of Clusters  147
    6.9.1 Determination of Clusters  148
    6.9.2 Example of Consensus Clustering on RNA Sequence Data  149
  6.10 The Integrative Clustering Framework  151
    6.10.1 Example: Integrative Clustering  152
7 Neural Network  155
  7.1 General Theory of Neural Network  155
  7.2 Elemental Aspects and Structure of Artificial Neural Networks  156
  7.3 Multilayer Perceptrons  157
    7.3.1 The Simple (Single Unit) Perceptron  157
    7.3.2 Training Perceptron Learning  157
  7.4 Multilayer Perceptrons (MLP)  158
    7.4.1 Architectures of MLP  158
    7.5.1 Model Parameterization  160
  7.6 A Few Pros and Cons of Neural Networks  161
8 Causal Inference and Matching  173
  8.2 Three Layer Causal Hierarchy  173
  8.3 Seven Tools of Causal Inference  174
  8.4 Statistical Framework of Causal Inferences  176
  8.6 Methodologies of Matching  178
    8.6.1 Nearest Neighbor (or greedy) Matching  178
      8.6.1.1 Example Using Nearest Neighbor Matching  178
    8.6.3 Mahalanobis Distance Matching  181
    8.8.1 Analysis of Data After Matching  188
9 Business  197
  9.1 Case Study One: Marketing Campaigns of a Portuguese Banking Institution  197
    9.1.1 Description of Data  197
      9.1.2.1 Analysis via Lasso  198
      9.1.2.2 Analysis via Elastic Net  198
      9.1.2.4 Analysis via rpart  200
      9.1.2.5 Analysis via randomForest  200
      9.1.2.6 Analysis via xgboost  202
  9.3 Case Study Two: Polish Companies Bankruptcy Data  204
    9.3.1 Description of Data  204
      9.3.2.1 Analysis of Year-1 Data (univariate analysis)  207
      9.3.2.2 Analysis of Year-3 Data (univariate analysis)  209
      9.3.2.3 Analysis of Year-5 Data (univariate analysis)  210
      9.3.2.4 Analysis of Year-1 Data (composite analysis)  212
      9.3.2.5 Analysis of Year-3 Data (composite analysis)  214
      9.3.2.6 Analysis of Year-5 Data (composite analysis)  216
10 Analysis of Response Profiles  221
  10.3 Transition of Response States  224
  10.4 Classification of Response Profiles  225
    10.4.1 Dissimilarities Between Response Profiles  225
    10.4.2 Visualizing Clusters via Multidimensional Scaling  226
    10.4.3 Response Profile Differences among Clusters  227
    10.4.4 Significant Clinical Variables for Each Cluster  228
  10.5 Modeling of Response Profiles via GEE  230
    10.5.2 Estimation of Marginal Regression Parameters  231
    10.5.4 Results of Modeling  231
Bibliography  235

Index  24