|
1 Introduction to Modern Survey Analytics |
|
|
1 | (34) |
|
1.1 Information and Survey Data |
|
|
3 | (1) |
|
|
4 | (12) |
|
|
5 | (2) |
|
1.2.2 Target Audience and Sample Size |
|
|
7 | (2) |
|
1.2.2.1 Key Parameters to Estimate |
|
|
9 | (1) |
|
1.2.2.2 Sample Design to Use |
|
|
9 | (1) |
|
|
10 | (1) |
|
|
10 | (1) |
|
|
10 | (1) |
|
1.2.2.6 Additional Information |
|
|
10 | (2) |
|
1.2.3 Screener and Questionnaire Design |
|
|
12 | (2) |
|
|
14 | (1) |
|
|
14 | (2) |
|
1.2.6 Report Writing and Presentation |
|
|
16 | (1) |
|
1.3 Sample Representativeness |
|
|
16 | (6) |
|
1.3.1 Digression on Indicator Variables |
|
|
20 | (1) |
|
1.3.2 Calculating the Population Parameters |
|
|
21 | (1) |
|
1.4 Estimating Population Parameters |
|
|
22 | (3) |
|
|
25 | (5) |
|
1.5.1 Consumer Study: Yogurt Consumption |
|
|
25 | (2) |
|
1.5.2 Public Sector Study: VA Benefits Survey |
|
|
27 | (1) |
|
1.5.3 Public Opinion Study: Toronto Casino Opinion Survey |
|
|
28 | (2) |
|
1.5.4 Public Opinion Study: San Francisco Airport Customer Satisfaction Survey |
|
|
30 | (1) |
|
1.6 Why Use Python for Survey Data Analysis? |
|
|
30 | (2) |
|
1.7 Why Use Jupyter for Survey Data Analysis? |
|
|
32 | (3) |
|
2 First Step: Working with Survey Data |
|
|
35 | (48) |
|
2.1 Best Practices: First Steps to Analysis |
|
|
36 | (7) |
|
2.1.1 Installing and Importing Python Packages |
|
|
36 | (3) |
|
2.1.2 Organizing Routinely Used Packages, Functions, and Formats |
|
|
39 | (2) |
|
2.1.3 Defining Data Paths and File Names |
|
|
41 | (1) |
|
2.1.4 Defining Your Functions and Formatting Statements |
|
|
42 | (1) |
|
2.1.5 Documenting Your Data with a Dictionary |
|
|
42 | (1) |
|
2.2 Importing Your Data with Pandas |
|
|
43 | (5) |
|
2.3 Handling Missing Values |
|
|
48 | (4) |
|
2.3.1 Identifying Missing Values |
|
|
49 | (1) |
|
2.3.2 Reporting Missing Values |
|
|
49 | (1) |
|
2.3.3 Reasons for Missing Values |
|
|
50 | (1) |
|
2.3.4 Dealing with Missing Values |
|
|
51 | (1) |
|
2.3.4.1 Use the fillna() Method |
|
|
51 | (1) |
|
2.3.4.2 Use the Interpolation Method |
|
|
51 | (1) |
|
2.3.4.3 An Even More Sophisticated Method |
|
|
52 | (1) |
|
2.4 Handling Special Types of Survey Data |
|
|
52 | (4) |
|
|
52 | (1) |
|
2.4.1.1 Multiple Responses |
|
|
53 | (1) |
|
2.4.1.2 Multiple Responses by ID |
|
|
53 | (1) |
|
2.4.1.3 Multiple Responses Delimited |
|
|
54 | (1) |
|
2.4.1.4 Indicator Variable |
|
|
54 | (1) |
|
|
54 | (1) |
|
2.4.2 Categorical Questions |
|
|
54 | (2) |
|
2.5 Creating New Variables, Binning, and Rescaling |
|
|
56 | (11) |
|
2.5.1 Creating Summary Variables |
|
|
58 | (4) |
|
|
62 | (2) |
|
2.5.3 Other Forms of Preprocessing |
|
|
64 | (3) |
|
2.6 Knowing the Structure of the Data Using Simple Statistics |
|
|
67 | (3) |
|
2.6.1 Descriptive Statistics and DataFrame Checks |
|
|
68 | (1) |
|
2.6.2 Obtaining Value Counts |
|
|
69 | (1) |
|
2.6.3 Styling Your DataFrame Display |
|
|
69 | (1) |
|
|
70 | (10) |
|
2.7.1 Complex Weight Calculation: Raking |
|
|
73 | (2) |
|
|
75 | (5) |
|
|
80 | (3) |
|
3 Shallow Survey Analysis |
|
|
83 | (30) |
|
|
84 | (2) |
|
3.1.1 Ordinal-Based Summaries |
|
|
85 | (1) |
|
3.1.2 Nominal-Based Summaries |
|
|
86 | (1) |
|
3.2 Basic Descriptive Statistics |
|
|
86 | (3) |
|
|
89 | (5) |
|
|
94 | (17) |
|
3.4.1 Visuals Best Practice |
|
|
95 | (1) |
|
3.4.2 Data Visualization Background |
|
|
95 | (3) |
|
|
98 | (1) |
|
|
99 | (2) |
|
3.4.5 Other Charts and Graphs |
|
|
101 | (4) |
|
3.4.5.1 Histograms and Boxplots for Distributions |
|
|
105 | (1) |
|
|
105 | (4) |
|
|
109 | (2) |
|
3.5 Weighted Summaries: Crosstabs and Descriptive Statistics |
|
|
111 | (2) |
|
4 Beginning Deep Survey Analysis |
|
|
113 | (64) |
|
|
114 | (8) |
|
4.1.1 Hypothesis Testing Background |
|
|
115 | (3) |
|
4.1.2 Examples of Hypotheses |
|
|
118 | (1) |
|
4.1.3 A Formal Framework for Statistical Tests |
|
|
118 | (1) |
|
4.1.4 A Less Formal Framework for Statistical Tests |
|
|
119 | (1) |
|
4.1.5 Types of Tests to Use |
|
|
120 | (2) |
|
4.2 Quantitative Data: Tests of Means |
|
|
122 | (20) |
|
|
122 | (4) |
|
4.2.2 Test of Two Means for Two Populations |
|
|
126 | (1) |
|
4.2.2.1 Standard Errors: Independent Populations |
|
|
126 | (3) |
|
4.2.2.2 Standard Errors: Dependent Populations |
|
|
129 | (2) |
|
4.2.3 Test of More Than Two Means |
|
|
131 | (11) |
|
4.3 Categorical Data: Tests of Proportions |
|
|
142 | (11) |
|
|
143 | (1) |
|
4.3.2 Comparing Proportions: Two Independent Populations |
|
|
144 | (2) |
|
4.3.3 Comparing Proportions: Paired Populations |
|
|
146 | (1) |
|
4.3.4 Comparing Multiple Proportions |
|
|
147 | (6) |
|
|
153 | (5) |
|
4.5 Advanced Visualization |
|
|
158 | (19) |
|
4.5.1 Extended Visualizations |
|
|
159 | (3) |
|
|
162 | (3) |
|
|
165 | (1) |
|
|
166 | (11) |
|
5 Advanced Deep Survey Analysis: The Regression Family |
|
|
177 | (32) |
|
5.1 The Regression Family and Link Functions |
|
|
178 | (1) |
|
5.2 The Identity Link: Introduction to OLS Regression |
|
|
179 | (8) |
|
5.2.1 OLS Regression Background |
|
|
180 | (1) |
|
5.2.2 The Classical Assumptions |
|
|
180 | (1) |
|
5.2.3 Example of Application |
|
|
181 | (1) |
|
5.2.4 Steps for Estimating an OLS Regression |
|
|
182 | (4) |
|
5.2.5 Predicting with the OLS Model |
|
|
186 | (1) |
|
5.3 The Logit Link: Introduction to Logistic Regression |
|
|
187 | (13) |
|
5.3.1 Logistic Regression Background |
|
|
189 | (3) |
|
5.3.2 Example of Application |
|
|
192 | (2) |
|
5.3.3 Steps for Estimating a Logistic Regression |
|
|
194 | (6) |
|
5.3.4 Predicting with the Logistic Regression Model |
|
|
200 | (1) |
|
5.4 The Poisson Link: Introduction to Poisson Regression |
|
|
200 | (9) |
|
5.4.1 Poisson Regression Background |
|
|
200 | (1) |
|
5.4.2 Example of Application |
|
|
201 | (1) |
|
5.4.3 Steps for Estimating a Poisson Regression |
|
|
201 | (1) |
|
5.4.4 Predicting with the Poisson Regression Model |
|
|
202 | (1) |
|
|
203 | (6) |
|
6 Sample of Specialized Survey Analyses |
|
|
209 | (28) |
|
|
210 | (7) |
|
|
210 | (1) |
|
|
210 | (1) |
|
6.1.3 Creating the Design Matrix |
|
|
211 | (1) |
|
6.1.4 Fielding the Conjoint Study |
|
|
212 | (2) |
|
6.1.5 Estimating a Conjoint Model |
|
|
214 | (1) |
|
6.1.6 Attribute Importance Analysis |
|
|
215 | (2) |
|
|
217 | (7) |
|
6.3 Correspondence Analysis |
|
|
224 | (4) |
|
|
228 | (9) |
|
|
237 | (14) |
|
7.1 Complex Sample Survey Estimation Effects |
|
|
239 | (1) |
|
7.2 Sample Size Calculation |
|
|
240 | (1) |
|
|
241 | (3) |
|
|
244 | (2) |
|
|
245 | (1) |
|
|
245 | (1) |
|
|
246 | (5) |
|
7.5.1 One-Sample Test: Hypothesized Mean |
|
|
247 | (1) |
|
7.5.2 Two-Sample Test: Independence Case |
|
|
248 | (1) |
|
7.5.3 Two-Sample Test: Paired Case |
|
|
248 | (3) |
|
8 Bayesian Survey Analysis: Introduction |
|
|
251 | (52) |
|
8.1 Frequentist vs Bayesian Statistical Approaches |
|
|
253 | (6) |
|
8.2 Digression on Bayes' Rule |
|
|
259 | (6) |
|
8.2.1 Bayes' Rule Derivation |
|
|
259 | (2) |
|
8.2.2 Bayes' Rule Reexpressions |
|
|
261 | (1) |
|
8.2.3 The Prior Distribution |
|
|
262 | (1) |
|
8.2.4 The Likelihood Function |
|
|
263 | (1) |
|
8.2.5 The Marginal Probability Function |
|
|
263 | (1) |
|
8.2.6 The Posterior Distribution |
|
|
264 | (1) |
|
8.2.7 Hyperparameters of the Distributions |
|
|
264 | (1) |
|
8.3 Computational Method: MCMC |
|
|
265 | (4) |
|
8.3.1 Digression on Markov Chain Monte Carlo Simulation |
|
|
265 | (4) |
|
8.3.2 Sampling from a Markov Chain Monte Carlo Simulation |
|
|
269 | (1) |
|
8.4 Python Package pyMC3: Overview |
|
|
269 | (1) |
|
|
270 | (3) |
|
8.5.1 Basic Data Analysis |
|
|
272 | (1) |
|
8.6 Benchmark OLS Regression Estimation |
|
|
273 | (1) |
|
|
274 | (15) |
|
8.7.1 pyMC3 Bayesian Regression Setup |
|
|
274 | (6) |
|
8.7.2 Bayesian Estimation Results |
|
|
280 | (1) |
|
|
280 | (2) |
|
8.7.2.2 The Visualization Output |
|
|
282 | (7) |
|
8.8 Extensions to Other Analyses |
|
|
289 | (11) |
|
8.8.1 Sample Mean Analysis |
|
|
290 | (1) |
|
8.8.2 Sample Proportion Analysis |
|
|
290 | (1) |
|
8.8.3 Contingency Table Analysis |
|
|
291 | (4) |
|
8.8.4 Logit Model for Contingency Table |
|
|
295 | (2) |
|
8.8.5 Poisson Model for Count Data |
|
|
297 | (3) |
|
|
300 | (3) |
|
|
300 | (1) |
|
8.9.2 Half-Normal Distribution |
|
|
300 | (1) |
|
8.9.3 Bernoulli Distribution |
|
|
301 | (2) |
|
9 Bayesian Survey Analysis: Multilevel Extension |
|
|
303 | (34) |
|
9.1 Multilevel Modeling: An Introduction |
|
|
304 | (4) |
|
9.1.1 Omitted Variable Bias |
|
|
305 | (2) |
|
9.1.2 Simple Handling of Data Structure |
|
|
307 | (1) |
|
9.1.3 Nested Market Structures |
|
|
307 | (1) |
|
9.2 Multilevel Modeling: Some Observations |
|
|
308 | (4) |
|
9.2.1 Aggregation and Disaggregation Issues |
|
|
309 | (1) |
|
|
310 | (1) |
|
|
311 | (1) |
|
9.2.4 Ubiquity of Hierarchical Structures |
|
|
311 | (1) |
|
9.3 Data Visualization of Multilevel Data |
|
|
312 | (6) |
|
9.3.1 Basic Data Visualization and Regression Analysis |
|
|
313 | (5) |
|
|
318 | (5) |
|
9.4.1 Pooled Regression Model |
|
|
318 | (1) |
|
9.4.2 Unpooled (Dummy Variable) Regression Model |
|
|
319 | (2) |
|
9.4.3 Multilevel Regression Model |
|
|
321 | (2) |
|
9.5 Multilevel Modeling Using pyMC3: Introduction |
|
|
323 | (5) |
|
9.5.1 Multilevel Model Notation |
|
|
324 | (1) |
|
9.5.2 Multilevel Model Formulation |
|
|
324 | (1) |
|
9.5.3 Example Multilevel Estimation Set-up |
|
|
325 | (3) |
|
9.5.4 Example Multilevel Estimation Analyses |
|
|
328 | (1) |
|
9.6 Multilevel Modeling with Level Explanatory Variables |
|
|
328 | (1) |
|
9.7 Extensions of Multilevel Models |
|
|
328 | (9) |
|
9.7.1 Logistic Regression Model |
|
|
330 | (2) |
|
|
332 | (1) |
|
|
332 | (1) |
|
|
333 | (4) |
References |
|
337 | (6) |
Index |
|
343 | |