|
|
xv | |
|
|
xix | |
About the Authors |
|
xxi | |
Preface |
|
xxiii | |
|
|
1 | (16) |
|
|
1 | (3) |
|
|
4 | (4) |
|
1.2.1 Mathematical notation |
|
|
4 | (1) |
|
1.2.2 Relationships and 'effects' |
|
|
5 | (1) |
|
|
6 | (1) |
|
1.2.4 There is no 'true' model |
|
|
7 | (1) |
|
1.3 Statistical Inference |
|
|
8 | (7) |
|
1.3.1 The sampling distribution |
|
|
9 | (1) |
|
1.3.2 Bias, efficiency and consistency |
|
|
10 | (2) |
|
1.3.3 Hypothesis tests and confidence intervals |
|
|
12 | (2) |
|
1.3.4 Substantive importance and practical significance |
|
|
14 | (1) |
|
|
15 | (2) |
|
Part A General Principles of Effective Presentation |
|
|
17 | (80) |
|
2 Best Practices for Graphs and Tables |
|
|
19 | (18) |
|
|
19 | (1) |
|
2.2 When to Use Tables and Graphs |
|
|
20 | (2) |
|
2.3 Constructing Effective Tables |
|
|
22 | (5) |
|
2.3.1 Significant digits and rounding |
|
|
23 | (1) |
|
2.3.2 Placement of numerical values and text |
|
|
23 | (3) |
|
2.3.3 Delineating information |
|
|
26 | (1) |
|
2.4 Constructing Clear and Informative Graphs |
|
|
27 | (8) |
|
2.4.1 Graphical perception |
|
|
28 | (1) |
|
|
29 | (2) |
|
|
31 | (4) |
|
2.4.4 Grouping and ordering |
|
|
35 | (1) |
|
|
35 | (2) |
|
3 Methods for Visualizing Distributions |
|
|
37 | (26) |
|
3.1 Displaying Distributions of Categorical Variables |
|
|
38 | (4) |
|
3.2 Displaying Distributions of Quantitative Variables |
|
|
42 | (11) |
|
3.2.1 Histograms and density estimation |
|
|
42 | (6) |
|
3.2.2 Quantile comparison plots |
|
|
48 | (4) |
|
3.2.3 Boxplots and violin plots |
|
|
52 | (1) |
|
|
53 | (8) |
|
|
61 | (2) |
|
4 Exploring and Describing Relationships |
|
|
63 | (34) |
|
4.1 Two Categorical Variables |
|
|
63 | (8) |
|
|
63 | (4) |
|
|
67 | (2) |
|
|
69 | (2) |
|
4.2 Categorical Explanatory Variable and Quantitative Dependent Variable |
|
|
71 | (7) |
|
4.2.1 Side-by-side boxplots and violin plots |
|
|
71 | (1) |
|
4.2.2 Superposition of density estimates |
|
|
72 | (2) |
|
4.2.3 Tests of differences across groups |
|
|
74 | (2) |
|
|
76 | (2) |
|
4.3 Two Quantitative Variables |
|
|
78 | (10) |
|
4.3.1 Adding regression lines to scatterplots |
|
|
80 | (1) |
|
4.3.2 Jittering a scatterplot |
|
|
81 | (1) |
|
4.3.3 Encoding a control variable in a scatterplot |
|
|
82 | (1) |
|
4.3.4 Correlations, scatterplot matrices and linear scatterplot arrays |
|
|
83 | (4) |
|
|
87 | (1) |
|
4.4 Multivariate Displays |
|
|
88 | (7) |
|
4.4.1 Bivariate density estimation |
|
|
88 | (3) |
|
4.4.2 Three-dimensional density estimation |
|
|
91 | (2) |
|
|
93 | (2) |
|
|
95 | (2) |
|
|
97 | (174) |
|
5 The Linear Regression Model |
|
|
99 | (36) |
|
|
99 | (1) |
|
5.2 Ordinary Least Squares Regression |
|
|
100 | (7) |
|
5.2.1 Basics of the linear model |
|
|
100 | (1) |
|
5.2.2 Alternatives to minimizing the least squared residuals |
|
|
101 | (1) |
|
5.2.3 Ordinary least squares estimator |
|
|
102 | (2) |
|
5.2.4 Inference and assumptions of the linear model |
|
|
104 | (3) |
|
5.3 Hypothesis Tests and Confidence Intervals |
|
|
107 | (2) |
|
5.3.1 Individual coefficients |
|
|
107 | (1) |
|
5.3.2 Difference between two coefficients in the same model |
|
|
108 | (1) |
|
5.3.3 Difference between two coefficients from different models |
|
|
109 | (1) |
|
5.4 Assessing and Comparing Model Fit |
|
|
109 | (9) |
|
5.4.1 Residual standard error and R² for assessing overall model fit |
|
|
110 | (1) |
|
5.4.2 F-tests for nested models |
|
|
111 | (2) |
|
5.4.3 Analysis of deviance for nested models |
|
|
113 | (1) |
|
5.4.4 Tests for non-nested models |
|
|
114 | (1) |
|
5.4.5 Using information criteria to compare models |
|
|
115 | (2) |
|
|
117 | (1) |
|
5.4.7 Some general advice |
|
|
117 | (1) |
|
5.5 Relative Importance of Predictors |
|
|
118 | (4) |
|
5.5.1 Scaling quantitative variables |
|
|
119 | (1) |
|
5.5.2 Standardized coefficients and related methods |
|
|
120 | (2) |
|
5.6 Interpreting and Presenting OLS Models: Some Empirical Examples |
|
|
122 | (8) |
|
5.6.1 Some general considerations |
|
|
122 | (1) |
|
5.6.2 General interpretation and presentation of regression models |
|
|
123 | (3) |
|
5.6.3 Assessing competing models |
|
|
126 | (2) |
|
5.6.4 Assessing relative importance |
|
|
128 | (2) |
|
5.7 Linear Probability Model |
|
|
130 | (3) |
|
|
130 | (1) |
|
5.7.2 Problems with the error distribution |
|
|
130 | (1) |
|
5.7.3 Problems with prediction |
|
|
131 | (2) |
|
|
133 | (2) |
|
6 Assessing the Impact and Importance of Multi-category Explanatory Variables |
|
|
135 | (32) |
|
|
135 | (1) |
|
6.2 Coding Multi-category Explanatory Variables |
|
|
136 | (10) |
|
|
136 | (2) |
|
6.2.2 Deviation or effect coding |
|
|
138 | (1) |
|
6.2.3 Comparing estimates from dummy and deviation coding |
|
|
139 | (2) |
|
6.2.4 Ordered explanatory variables |
|
|
141 | (4) |
|
6.2.5 The 'reference category problem' |
|
|
145 | (1) |
|
6.3 Revisiting Statistical Significance: Multi-category Predictors |
|
|
146 | (8) |
|
6.3.1 Omnibus F-test for the null hypothesis that all coefficients equal zero |
|
|
146 | (1) |
|
6.3.2 Testing the difference between two coefficients |
|
|
147 | (1) |
|
6.3.3 Quasi-variances: Testing differences across all groups |
|
|
148 | (3) |
|
6.3.4 Comparing confidence intervals |
|
|
151 | (3) |
|
6.4 Relative Importance of Sets of Regressors |
|
|
154 | (6) |
|
6.4.1 Scaling and standardization revisited |
|
|
156 | (1) |
|
6.4.2 Comparing the relative importance of sets of regressors |
|
|
157 | (3) |
|
6.5 Graphical Presentation of Additive Effects |
|
|
160 | (6) |
|
6.5.1 Dot plots for comparing coefficients |
|
|
161 | (1) |
|
6.5.2 Fitted values and effect displays |
|
|
162 | (1) |
|
6.5.3 Compact letter displays for categorical variables |
|
|
163 | (2) |
|
6.5.4 Factorplots for the effects of categorical variables |
|
|
165 | (1) |
|
6.5.6 Choosing the 'right' display |
|
|
166 | (1) |
|
|
166 | (1) |
|
7 Identifying and Handling Problems in Linear Models |
|
|
167 | (36) |
|
|
167 | (5) |
|
7.1.1 The importance of regression diagnostics |
|
|
168 | (1) |
|
7.1.2 Motivating example: Inequality and democratic history |
|
|
169 | (3) |
|
|
172 | (5) |
|
7.2.1 Component-plus-residual plots (or partial-residual plots) |
|
|
173 | (2) |
|
7.2.2 Testing for nonlinearity |
|
|
175 | (2) |
|
7.3 Influential Observations |
|
|
177 | (16) |
|
7.3.1 Breakdown point, influence function and OLS |
|
|
177 | (1) |
|
7.3.2 Identifying outliers: Studentized residuals |
|
|
178 | (1) |
|
7.3.3 Measuring leverage: Hat values |
|
|
179 | (1) |
|
7.3.4 Cook's D: Overall influence of individual observations |
|
|
180 | (1) |
|
7.3.5 Residual-residual plots: Comparing OLS and robust regression residuals |
|
|
181 | (3) |
|
7.3.6 DFBETAS: Influence of individual observations on specific coefficients |
|
|
184 | (1) |
|
7.3.7 Partial regression plots: Joint influence |
|
|
185 | (3) |
|
7.3.8 How to handle outliers |
|
|
188 | (5) |
|
|
193 | (1) |
|
|
193 | (2) |
|
7.6 Other Issues of Concern |
|
|
195 | (7) |
|
7.6.1 Dependent observations |
|
|
195 | (2) |
|
7.6.2 Control variables, omitted-variable bias and endogeneity |
|
|
197 | (3) |
|
|
200 | (2) |
|
|
202 | (1) |
|
8 Modelling and Presentation of Curvilinear Effects |
|
|
203 | (36) |
|
|
203 | (1) |
|
8.2 Curvilinearity in the Linear Model Framework |
|
|
204 | (1) |
|
8.3 Nonlinear Transformations |
|
|
205 | (10) |
|
8.3.1 Fitted values and effect displays |
|
|
209 | (1) |
|
|
210 | (3) |
|
8.3.3 First differences: An alternative to marginal effects |
|
|
213 | (2) |
|
8.4 Polynomial Regression |
|
|
215 | (4) |
|
8.4.1 Centred and orthogonal polynomial terms |
|
|
216 | (1) |
|
8.4.2 Exploring the effects of polynomials: An empirical example |
|
|
217 | (2) |
|
|
219 | (8) |
|
8.5.1 Piecewise linear regression splines |
|
|
220 | (3) |
|
8.5.2 Cubic regression splines |
|
|
223 | (1) |
|
8.5.3 Comparing methods: An empirical example |
|
|
224 | (3) |
|
8.6 Nonparametric Regression |
|
|
227 | (5) |
|
8.6.1 Local polynomial regression |
|
|
228 | (2) |
|
|
230 | (1) |
|
8.6.3 Comparing the LOESS and smoothing spline fits |
|
|
231 | (1) |
|
8.7 Generalized Additive Models |
|
|
232 | (5) |
|
8.7.1 The backfitting estimation procedure |
|
|
233 | (2) |
|
8.7.2 Plotting the fitted curve |
|
|
235 | (1) |
|
8.7.3 Using GAMs to test functional form |
|
|
236 | (1) |
|
|
237 | (2) |
|
9 Interaction Effects in Linear Models |
|
|
239 | (32) |
|
|
239 | (1) |
|
9.2 Understanding Interaction Effects |
|
|
240 | (5) |
|
9.2.1 Specifying interaction effects |
|
|
240 | (3) |
|
9.2.2 Necessity of statistical significance |
|
|
243 | (1) |
|
9.2.3 Both 'sides' of the interaction and other considerations |
|
|
244 | (1) |
|
9.3 Interactions between Two Categorical Variables |
|
|
245 | (4) |
|
9.3.1 Type II tests for an overall interaction effect |
|
|
245 | (2) |
|
9.3.2 Effect displays based on quasi-variances using OVT confidence intervals |
|
|
247 | (2) |
|
9.4 Interactions between One Categorical Variable and One Quantitative Variable |
|
|
249 | (8) |
|
9.4.1 Calculating simple slopes |
|
|
249 | (2) |
|
9.4.2 Pairwise comparisons |
|
|
251 | (2) |
|
9.4.3 Plotting the conditional effects |
|
|
253 | (1) |
|
9.4.4 Assessing group differences: Subset models versus interaction effects |
|
|
253 | (4) |
|
9.5 Interactions between Two Continuous Variables |
|
|
257 | (7) |
|
9.5.1 Testing the effects of one variable at different levels of the other |
|
|
259 | (1) |
|
9.5.2 Marginal effect graphs |
|
|
260 | (2) |
|
9.5.3 Plotting fitted values |
|
|
262 | (2) |
|
9.6 Interaction Effects: Some Cautions and Recommendations |
|
|
264 | (5) |
|
9.6.1 Data density issues |
|
|
265 | (1) |
|
9.6.2 Assessing linearity |
|
|
266 | (2) |
|
9.6.3 Three-way interactions |
|
|
268 | (1) |
|
|
269 | (2) |
|
Part C The Generalized Linear Model and Extensions |
|
|
271 | (104) |
|
10 Generalized Linear Models |
|
|
273 | (44) |
|
10.1 Basics of the Generalized Linear Model |
|
|
274 | (3) |
|
|
274 | (1) |
|
|
274 | (2) |
|
10.1.3 Random component and exponential family |
|
|
276 | (1) |
|
10.2 Maximum Likelihood Estimation |
|
|
277 | (4) |
|
10.2.1 Likelihood functions for some common GLMs |
|
|
278 | (2) |
|
10.2.2 Assumptions of the model |
|
|
280 | (1) |
|
10.3 Hypothesis Tests and Confidence Intervals |
|
|
281 | (2) |
|
10.3.1 Single-parameter tests |
|
|
281 | (1) |
|
10.3.2 Multiple-parameter tests |
|
|
282 | (1) |
|
10.3.3 Confidence intervals |
|
|
282 | (1) |
|
|
283 | (3) |
|
10.4.1 Pseudo-R² measures of fit |
|
|
283 | (1) |
|
|
284 | (2) |
|
10.5 Empirical Example: Using Poisson Regression to Predict Counts |
|
|
286 | (2) |
|
10.6 Understanding Effects of Variables |
|
|
288 | (11) |
|
|
288 | (2) |
|
|
290 | (1) |
|
10.6.3 'Average' case versus 'observed' case approaches |
|
|
291 | (5) |
|
10.6.4 Hypothesis tests and confidence intervals for first differences |
|
|
296 | (1) |
|
10.6.5 Effect displays for GLMs |
|
|
297 | (2) |
|
10.7 Measuring Variable Importance |
|
|
299 | (1) |
|
|
300 | (15) |
|
|
301 | (1) |
|
10.8.2 Assessing the functional form: Component-plus-residual plots |
|
|
302 | (7) |
|
10.8.3 Assessing the appropriateness of the variance model: Residual plots |
|
|
309 | (4) |
|
10.8.4 Influential observations |
|
|
313 | (2) |
|
|
315 | (2) |
|
11 Categorical Dependent Variables |
|
|
317 | (42) |
|
|
317 | (1) |
|
11.2 Regression Models for Binary Outcomes |
|
|
318 | (3) |
|
11.2.1 Revisiting the linear probability model |
|
|
318 | (2) |
|
11.2.2 Logit and probit models |
|
|
320 | (1) |
|
11.3 Interpreting Effects in Logit and Probit Models |
|
|
321 | (13) |
|
11.3.1 Odds and odds ratios for the logit model |
|
|
321 | (2) |
|
11.3.2 Predicted and fitted probabilities |
|
|
323 | (1) |
|
|
324 | (1) |
|
|
325 | (3) |
|
11.3.5 Effect displays for binary regression models |
|
|
328 | (1) |
|
11.3.6 Compression and interaction effects |
|
|
328 | (1) |
|
11.3.7 Difference in first differences |
|
|
329 | (5) |
|
11.4 Model Fit for Binary Regression Models |
|
|
334 | (3) |
|
11.4.1 Classification tables |
|
|
334 | (1) |
|
11.4.2 Proportional reduction in error |
|
|
334 | (2) |
|
11.4.3 Expected proportion correctly predicted |
|
|
336 | (1) |
|
11.5 Diagnostics Specific to Binary Regression Models |
|
|
337 | (4) |
|
|
337 | (1) |
|
11.5.2 Empirical versus predicted probabilities |
|
|
338 | (1) |
|
11.5.3 Separation anxiety |
|
|
338 | (3) |
|
11.6 Extending the Binary Regression Model: Ordered and Multinomial Models |
|
|
341 | (15) |
|
11.6.1 Ordinal regression model |
|
|
341 | (7) |
|
11.6.2 Comparing ordered logit and linear regression models |
|
|
348 | (3) |
|
11.6.3 The proportional odds assumption |
|
|
351 | (1) |
|
11.6.4 Multinomial regression models |
|
|
352 | (1) |
|
11.6.5 Revisiting the reference category problem |
|
|
353 | (3) |
|
|
356 | (3) |
|
12 Conclusions and Recommendations |
|
|
359 | (16) |
|
|
359 | (1) |
|
12.2 Choosing the Right Estimator |
|
|
360 | (3) |
|
12.2.1 Exploratory distributions |
|
|
360 | (2) |
|
12.2.2 Preliminary analysis of relationships |
|
|
362 | (1) |
|
12.2.3 Choosing the initial regression model |
|
|
362 | (1) |
|
12.3 Research Design and Measurement Issues |
|
|
363 | (4) |
|
|
363 | (1) |
|
12.3.2 Causation and endogeneity |
|
|
364 | (1) |
|
|
365 | (1) |
|
|
365 | (1) |
|
|
366 | (1) |
|
|
366 | (1) |
|
12.4 Evaluating the Model |
|
|
367 | (4) |
|
12.4.1 Detecting and handling nonlinearity |
|
|
368 | (1) |
|
12.4.2 Evaluating and handling undue influence |
|
|
369 | (1) |
|
12.4.3 Problems with the error distribution |
|
|
370 | (1) |
|
12.5 Effective Presentation of Results |
|
|
371 | (3) |
|
12.5.1 Simple additive effects |
|
|
371 | (1) |
|
|
372 | (1) |
|
12.5.3 Interaction effects |
|
|
373 | (1) |
|
|
374 | (1) |
|
Appendix: Data and Computing |
|
|
375 | (12) |
|
|
375 | (1) |
|
|
376 | (11) |
References |
|
387 | (20) |
Index |
|
407 | |