Foreword | xiii
Preface | xv
Terminology | xvii
Software | xxi
1 Why should I care about statistical prediction models? | 1 | (7)
1.1 The many uses of prediction models in medicine | 3 | (1)
1.2 The unique messages of this book | 4 | (2)
1.3 Prognostic factor modeling philosophy | 6 | (1)
1.4 The rest of this book | 7 | (1)
2 I am going to make a prediction model. What do I need to know? | 8 | (39)
2.1 Prediction model framework | 8 | (4)
| 8 | (1)
| 8 | (1)
2.1.3 The event of interest | 8 | (1)
2.1.4 The prediction time horizon and follow-up | 9 | (1)
| 10 | (1)
2.1.6 Risks and risk predictions | 10 | (1)
2.1.7 Classification of risk | 11 | (1)
2.1.8 Predictor variables | 11 | (1)
| 12 | (1)
2.2 Prediction performance | 12 | (5)
2.2.1 Proper scoring rules | 13 | (1)
| 14 | (1)
| 14 | (1)
2.2.4 Explained variation | 15 | (1)
2.2.5 Variability and uncertainty | 15 | (1)
2.2.6 The interpretation is relative | 16 | (1)
| 16 | (1)
2.2.8 Average versus subgroups | 17 | (1)
| 17 | (3)
2.3.1 Study design and sources of information | 17 | (1)
| 18 | (1)
| 18 | (1)
2.3.4 Randomized clinical trial | 18 | (1)
| 19 | (1)
2.3.6 Given treatment and treatment options | 19 | (1)
2.3.7 Sample size calculation | 19 | (1)
| 20 | (5)
| 20 | (1)
| 21 | (1)
| 21 | (1)
| 21 | (1)
| 22 | (1)
| 22 | (3)
| 25 | (8)
2.5.1 Risk prediction model | 25 | (2)
| 27 | (1)
2.5.3 How is prediction modeling different from statistical inference? | 27 | (2)
| 29 | (1)
| 29 | (1)
2.5.6 Expert selects the candidate predictors | 29 | (1)
2.5.7 How to select variables for inclusion in the final model | 30 | (1)
2.5.8 All possible interactions | 31 | (1)
| 32 | (1)
| 32 | (1)
| 33 | (6)
2.6.1 The conventional model | 33 | (1)
2.6.2 Internal and external validation | 33 | (1)
2.6.3 Conditional versus expected performance | 34 | (1)
| 34 | (1)
| 35 | (1)
| 35 | (2)
2.6.7 Model checking and goodness of fit | 37 | (1)
| 38 | (1)
| 39 | (8)
| 39 | (1)
2.7.2 Odds ratios and hazard ratios are not predictions of risks | 40 | (1)
2.7.3 Do not blame the metric | 40 | (2)
2.7.4 Censored data versus competing risks | 42 | (2)
2.7.5 Disease-specific survival | 44 | (1)
| 44 | (1)
2.7.7 Data-dependent decisions | 44 | (1)
| 45 | (1)
2.7.9 Independent predictor | 45 | (1)
2.7.10 Automated variable selection | 45 | (2)
3 How should I prepare for modeling? | 47 | (15)
3.1 Definition of subjects | 47 | (2)
| 49 | (1)
3.3 Pre-selection of predictor variables | 50 | (2)
3.4 Preparation of predictor variables | 52 | (6)
3.4.1 Categorical variables | 53 | (1)
3.4.2 Continuous variables | 53 | (1)
3.4.3 Derived predictor variables | 54 | (1)
3.4.4 Repeated measurements | 55 | (1)
| 56 | (1)
| 57 | (1)
3.5 Preparation of event time outcome | 58 | (4)
3.5.1 Illustration without competing risks | 58 | (1)
3.5.2 Illustration with competing risks | 59 | (1)
3.5.3 Artificial censoring at the prediction time horizon | 60 | (2)
4 I am ready to build a prediction model | 62 | (41)
4.1 Specifying the model type | 63 | (3)
4.1.1 Uncensored binary outcome | 63 | (1)
4.1.2 Right-censored time-to-event outcome (no competing risks) | 64 | (1)
4.1.3 Right-censored time-to-event outcome with competing risks | 65 | (1)
| 66 | (4)
4.2.1 Uncensored binary outcome | 66 | (1)
4.2.2 Right-censored time-to-event outcome (without competing risks) | 67 | (1)
4.2.3 Right-censored time-to-event with competing risks | 68 | (2)
4.3 Including predictor variables | 70 | (14)
4.3.1 Categorical predictor variables | 71 | (5)
4.3.2 Continuous predictor variables | 76 | (5)
4.3.3 Interaction effects | 81 | (3)
| 84 | (3)
| 84 | (1)
4.4.2 Conventional model strategy | 85 | (1)
4.4.3 Whether to use a standard regression model or something else | 86 | (1)
| 87 | (2)
4.5.1 How to prevent overfitting the data | 87 | (1)
4.5.2 How to deal with missing values | 88 | (1)
4.5.3 How to deal with non-converging models | 89 | (1)
4.6 What you should put in your manuscript | 89 | (11)
| 89 | (1)
| 90 | (1)
| 91 | (3)
| 94 | (2)
| 96 | (4)
| 100 | (3)
| 100 | (1)
4.7.2 Internet calculator | 100 | (1)
4.7.3 Cost-benefit analysis (waiting lists) | 100 | (3)
5 Does my model predict accurately? | 103 | (47)
5.1 Model assessment roadmap | 104 | (3)
5.1.1 Visualization of the predictions | 104 | (1)
5.1.2 Calculation of model performance | 105 | (1)
5.1.3 Visualization of model performance | 106 | (1)
5.2 Uncensored binary outcome | 107 | (18)
5.2.1 Distribution of the predicted risks | 107 | (6)
| 113 | (3)
| 116 | (2)
| 118 | (7)
5.3 Right-censored time-to-event outcome (without competing risks) | 125 | (11)
5.3.1 Distribution of the predicted risks | 126 | (2)
5.3.2 Brier score with censored data | 128 | (3)
5.3.3 Time-dependent AUC for censored data | 131 | (3)
5.3.4 Calibration curve for censored data | 134 | (2)
| 136 | (11)
5.4.1 Distribution of the predicted risks | 136 | (2)
5.4.2 Brier score with competing risks | 138 | (4)
5.4.3 Time-dependent AUC for competing risks | 142 | (1)
5.4.4 Calibration curve for competing risks | 143 | (4)
5.5 The Index of Prediction Accuracy (IPA) | 147 | (1)
5.6 Choice of prediction time horizon | 148 | (1)
5.7 Time-dependent prediction performance | 149 | (1)
6 How do I decide between rival models? | 150 | (30)
6.1 Model comparison roadmap | 151 | (1)
6.2 Analysis of rival prediction models | 151 | (18)
6.2.1 Uncensored binary outcome | 152 | (4)
6.2.2 Right-censored time-to-event outcome (without competing risks) | 156 | (9)
| 165 | (4)
6.3 Clinically relevant change of prediction | 169 | (6)
6.4 Does a new marker improve prediction? | 175 | (5)
6.4.1 Many new predictors | 179 | (1)
6.4.2 Updating a subject's prediction | 179 | (1)
7 What would make me an expert? | 180 | (44)
7.1 Multiple cohorts / Multi-center studies | 180 | (2)
7.2 The role of treatment for making a prediction model | 182 | (4)
| 183 | (2)
7.2.2 Comparative effectiveness tables | 185 | (1)
7.3 Learning curve paradigm | 186 | (1)
7.4 Internal validation (data splitting) | 187 | (18)
| 187 | (6)
| 193 | (1)
7.4.3 Multiple splits (cross-validation) | 194 | (7)
7.4.4 Dilemma of internal validation | 201 | (1)
7.4.5 The apparent and the 632+ estimator | 202 | (1)
| 202 | (3)
| 205 | (14)
7.5.1 Missing values in the learning data | 207 | (8)
7.5.2 Missing values in the validation data | 215 | (4)
7.6 Time-varying coefficient models | 219 | (1)
7.7 Time-varying predictor variables | 220 | (4)
8 Can't the computer just take care of all of this? | 224 | (33)
8.1 Zero layers of cross-validation | 225 | (7)
8.1.1 What may happen if you do not look at the data | 225 | (2)
8.1.2 Unsupervised modeling steps | 227 | (5)
| 232 | (1)
8.2 One layer of cross-validation | 232 | (8)
8.2.1 Penalized regression | 233 | (7)
8.2.2 Supervised spline selection | 240 | (1)
8.3 Machine learning (two levels of cross-validation) | 240 | (10)
| 243 | (5)
8.3.2 Deep learning and artificial neural networks | 248 | (2)
| 250 | (7)
9 Things you might have expected in our book | 257 | (11)
9.1 Threshold selection for decision making | 257 | (1)
9.2 Number of events per variable | 258 | (1)
9.3 Confidence intervals for predicted probabilities | 258 | (1)
9.4 Models developed from case-control data | 259 | (1)
| 259 | (1)
9.6 Backward elimination and stepwise selection | 260 | (1)
9.7 Rank correlation (c-index) for survival outcome | 260 | (1)
9.8 Integrated Brier score | 261 | (1)
9.9 Net reclassification index and the integrated discrimination improvement | 261 | (1)
9.10 Re-classification tables | 262 | (4)
9.11 Boxplots of rival models conditional on the outcome | 266 | (2)
Bibliography | 268 | (16)
Index | 284