Preface   xiii

Part I: Fundamentals of Bayesian Inference   1
|
1 Probability and inference   3
1.1 The three steps of Bayesian data analysis   3
1.2 General notation for statistical inference   4
1.3 Bayesian inference   6
1.4 Discrete probability examples: genetics and spell checking   8
1.5 Probability as a measure of uncertainty   11
1.6 Example of probability assignment: football point spreads   13
1.7 Example: estimating the accuracy of record linkage   16
1.8 Some useful results from probability theory   19
1.9 Computation and software   22
1.10 Bayesian inference in applied statistics   24
1.11 Bibliographic note   25
1.12 Exercises   27
|
2 Single-parameter models   29
2.1 Estimating a probability from binomial data   29
2.2 Posterior as compromise between data and prior information   32
2.3 Summarizing posterior inference   32
2.4 Informative prior distributions   34
2.5 Estimating a normal mean with known variance   39
2.6 Other standard single-parameter models   42
2.7 Example: informative prior distribution for cancer rates   47
2.8 Noninformative prior distributions   51
2.9 Weakly informative prior distributions   55
2.10 Bibliographic note   56
2.11 Exercises   57
|
3 Introduction to multiparameter models   63
3.1 Averaging over 'nuisance parameters'   63
3.2 Normal data with a noninformative prior distribution   64
3.3 Normal data with a conjugate prior distribution   67
3.4 Multinomial model for categorical data   69
3.5 Multivariate normal model with known variance   70
3.6 Multivariate normal with unknown mean and variance   72
3.7 Example: analysis of a bioassay experiment   74
3.8 Summary of elementary modeling and computation   78
3.9 Bibliographic note   78
3.10 Exercises   79
|
4 Asymptotics and connections to non-Bayesian approaches   83
4.1 Normal approximations to the posterior distribution   83
4.2 Large-sample theory   87
4.3 Counterexamples to the theorems   89
4.4 Frequency evaluations of Bayesian inferences   91
4.5 Bayesian interpretations of other statistical methods   92
4.6 Bibliographic note   97
4.7 Exercises   98

5 Hierarchical models   101
5.1 Constructing a parameterized prior distribution   102
5.2 Exchangeability and setting up hierarchical models   104
5.3 Fully Bayesian analysis of conjugate hierarchical models   108
5.4 Estimating exchangeable parameters from a normal model   113
5.5 Example: parallel experiments in eight schools   119
5.6 Hierarchical modeling applied to a meta-analysis   124
5.7 Weakly informative priors for hierarchical variance parameters   128
5.8 Bibliographic note   132
5.9 Exercises   134

Part II: Fundamentals of Bayesian Data Analysis   139

6 Model checking   141
6.1 The place of model checking in applied Bayesian statistics   141
6.2 Do the inferences from the model make sense?   142
6.3 Posterior predictive checking   143
6.4 Graphical posterior predictive checks   153
6.5 Model checking for the educational testing example   159
6.6 Bibliographic note   161
6.7 Exercises   163
|
7 Evaluating, comparing, and expanding models   165
7.1 Measures of predictive accuracy   166
7.2 Information criteria and cross-validation   169
7.3 Model comparison based on predictive performance   178
7.4 Model comparison using Bayes factors   182
7.5 Continuous model expansion   184
7.6 Implicit assumptions and model expansion: an example   187
7.7 Bibliographic note   192
7.8 Exercises   193
|
8 Modeling accounting for data collection   197
8.1 Bayesian inference requires a model for data collection   197
8.2 Data-collection models and ignorability   199
8.3 Sample surveys   205
8.4 Designed experiments   214
8.5 Sensitivity and the role of randomization   218
8.6 Observational studies   220
8.7 Censoring and truncation   224
8.8 Discussion   229
8.9 Bibliographic note   229
8.10 Exercises   230
|
9 Decision analysis   237
9.1 Bayesian decision theory in different contexts   237
9.2 Using regression predictions: incentives for telephone surveys   239
9.3 Multistage decision making: medical screening   245
9.4 Hierarchical decision analysis for radon measurement   246
9.5 Personal vs. institutional decision analysis   256
9.6 Bibliographic note   257
9.7 Exercises   257

Part III: Advanced Computation   259
|
10 Introduction to Bayesian computation   261
10.1 Numerical integration   261
10.2 Distributional approximations   262
10.3 Direct simulation and rejection sampling   263
10.4 Importance sampling   265
10.5 How many simulation draws are needed?   267
10.6 Computing environments   268
10.7 Debugging Bayesian computing   270
10.8 Bibliographic note   271
10.9 Exercises   272
|
11 Basics of Markov chain simulation   275
11.1 Gibbs sampler   276
11.2 Metropolis and Metropolis-Hastings algorithms   278
11.3 Using Gibbs and Metropolis as building blocks   280
11.4 Inference and assessing convergence   281
11.5 Effective number of simulation draws   286
11.6 Example: hierarchical normal model   288
11.7 Bibliographic note   291
11.8 Exercises   291
|
12 Computationally efficient Markov chain simulation   293
12.1 Efficient Gibbs samplers   293
12.2 Efficient Metropolis jumping rules   295
12.3 Further extensions to Gibbs and Metropolis   297
12.4 Hamiltonian Monte Carlo   300
12.5 Hamiltonian dynamics for a simple hierarchical model   305
12.6 Stan: developing a computing environment   307
12.7 Bibliographic note   308
12.8 Exercises   309
|
13 Modal and distributional approximations   311
13.1 Finding posterior modes   311
13.2 Boundary-avoiding priors for modal summaries   313
13.3 Normal and related mixture approximations   318
13.4 Finding marginal posterior modes using EM   320
13.5 Approximating conditional and marginal posterior densities   325
13.6 Example: hierarchical normal model (continued)   326
13.7 Variational inference   331
13.8 Expectation propagation   338
13.9 Other approximations   343
13.10 Unknown normalizing factors   345
13.11 Bibliographic note   348
13.12 Exercises   349

Part IV: Regression Models   351
|
14 Introduction to regression models   353
14.1 Conditional modeling   353
14.2 Bayesian analysis of the classical regression model   354
14.3 Regression for causal inference: incumbency in congressional elections   358
14.4 Goals of regression analysis   364
14.5 Assembling the matrix of explanatory variables   365
14.6 Regularization and dimension reduction for multiple predictors   367
14.7 Unequal variances and correlations   369
14.8 Including numerical prior information   376
14.9 Bibliographic note   378
14.10 Exercises   378
|
15 Hierarchical linear models   381
15.1 Regression coefficients exchangeable in batches   382
15.2 Example: forecasting U.S. presidential elections   383
15.3 Interpreting a normal prior distribution as additional data   388
15.4 Varying intercepts and slopes   390
15.5 Computation: batching and transformation   392
15.6 Analysis of variance and the batching of coefficients   395
15.7 Hierarchical models for batches of variance components   398
15.8 Bibliographic note   400
15.9 Exercises   402
|
16 Generalized linear models   405
16.1 Standard generalized linear model likelihoods   406
16.2 Working with generalized linear models   407
16.3 Weakly informative priors for logistic regression   412
16.4 Example: hierarchical Poisson regression for police stops   420
16.5 Example: hierarchical logistic regression for political opinions   422
16.6 Models for multivariate and multinomial responses   423
16.7 Loglinear models for multivariate discrete data   428
16.8 Bibliographic note   431
16.9 Exercises   432
|
17 Models for robust inference   435
17.1 Aspects of robustness   435
17.2 Overdispersed versions of standard probability models   437
17.3 Posterior inference and computation   439
17.4 Robust inference and sensitivity analysis for the eight schools   441
17.5 Robust regression using t-distributed errors   444
17.6 Bibliographic note   445
17.7 Exercises   446
|
18 Models for missing data   449
18.1 Notation   449
18.2 Multiple imputation   451
18.3 Missing data in the multivariate normal and t models   454
18.4 Example: multiple imputation for a series of polls   456
18.5 Missing values with counted data   462
18.6 Example: an opinion poll in Slovenia   463
18.7 Bibliographic note   466
18.8 Exercises   467

Part V: Nonlinear and Nonparametric Models   469
|
19 Parametric nonlinear models   471
19.1 Example: serial dilution assay   471
19.2 Example: population toxicokinetics   477
19.3 Bibliographic note   485
19.4 Exercises   486

20 Basis function models   487
20.1 Splines and weighted sums of basis functions   487
20.2 Basis selection and shrinkage of coefficients   490
20.3 Non-normal models and multivariate regression surfaces   494
20.4 Bibliographic note   498
20.5 Exercises   498
|
21 Gaussian process models   501
21.1 Gaussian process regression   501
21.2 Example: birthdays and birthdates   505
21.3 Latent Gaussian process models   510
21.4 Functional data analysis   512
21.5 Density estimation and regression   513
21.6 Bibliographic note   516
21.7 Exercises   516

22 Finite mixture models   519
22.1 Setting up and interpreting mixture models   519
22.2 Example: reaction times and schizophrenia   524
22.3 Label switching and posterior computation   533
22.4 Unspecified number of mixture components   536
22.5 Mixture models for classification and regression   539
22.6 Bibliographic note   542
22.7 Exercises   543
|
23 Dirichlet process models   545
23.1 Bayesian histograms   545
23.2 Dirichlet process prior distributions   546
23.3 Dirichlet process mixtures   549
23.4 Beyond density estimation   557
23.5 Hierarchical dependence   560
23.6 Density regression   568
23.7 Bibliographic note   571
23.8 Exercises   573

A Standard probability distributions   575
A.1 Continuous distributions   575
A.2 Discrete distributions   583
Bibliographic note   584

B Outline of proofs of limit theorems   585
Bibliographic note   588

C Computation in R and Stan   589
C.1 Getting started with R and Stan   589
C.2 Fitting a hierarchical model in Stan   589
C.3 Direct simulation, Gibbs, and Metropolis in R   594
C.4 Programming Hamiltonian Monte Carlo in R   601
C.5 Further comments on computation   605
Bibliographic note   606

References   607
Author Index   641
Subject Index   649