
E-book: Product Analytics: Applied Data Science Techniques for Actionable Consumer Insights


DRM restrictions

  • Copying:

    not allowed

  • Printing:

    not allowed

  • E-book usage:

    Digital rights management (DRM)
    The publisher has supplied this book in encrypted form, which means that you need to install free software to unlock and read it. To read this e-book, you must create an Adobe ID. The e-book can be downloaded to 6 devices (one user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you need to install this free app: PocketBook Reader (iOS / Android)

    To read this e-book on a PC or Mac, you need Adobe Digital Editions (a free application designed specifically for e-books; it is not the same as Adobe Reader, which you probably already have on your computer).

    You cannot read this e-book with an Amazon Kindle.

This guide shows how to combine data science with social science to gain unprecedented insight into customer behavior, so you can change it. Joanne Rodrigues-Craig bridges the gap between predictive data science and statistical techniques that reveal why important things happen (why customers buy more, or why they immediately leave your site), so you can get more of the behaviors you want and fewer of those you don’t.

Drawing on extensive enterprise experience and deep knowledge of demographics and sociology, Rodrigues-Craig shows how to create better theories and metrics, so you can accelerate the process of gaining insight, altering behavior, and earning business value. You’ll learn how to:
  • Develop complex, testable theories for understanding individual and social behavior in web products 
  • Think like a social scientist and contextualize individual behavior in today’s social environments 
  • Build more effective metrics and KPIs for any web product or system
  • Conduct more informative and actionable A/B tests 
  • Explore causal effects, reflecting a deeper understanding of the differences between correlation and causation
  • Alter user behavior in a complex web product 
  • Understand how relevant human behaviors develop, and the prerequisites for changing them
  • Choose the right statistical techniques for common tasks such as multistate and uplift modeling 
  • Use advanced statistical techniques to model multidimensional systems 
  • Do all of this in R (with sample code available in a separate code manual)
Preface xvii
Acknowledgments xxiii
About the Author xxv
I Qualitative Methodology
1(66)
1 Data in Action: A Model of a Dinner Party
3(22)
1.1 The User Data Disruption
4(3)
1.1.1 Don't Leave the Users out of the Model
4(1)
1.1.2 The Junior Analyst
5(1)
1.1.3 The Opposite of the Misguided Analyst: The Data Guru
6(1)
1.2 A Model of a Dinner Party
7(6)
1.2.1 Why Are Social Processes Difficult to Analyze?
9(1)
1.2.2 A Party Is a Process
9(1)
1.2.3 A Party Is an Open System
10(1)
1.2.4 A "Great" Party Is Hard to Define
10(1)
1.2.5 Party Guests' Motives and Opinions Are Often Unknown
11(1)
1.2.6 A Party Presents a Variable Search Problem
12(1)
1.2.7 The Real Secret to a Great Party Is Elusive
12(1)
1.3 What's Unique about User Data?
13(10)
1.3.1 Human Behavior Is a Process, Not a Problem
13(2)
1.3.2 No Clear and Defined Outcomes
15(3)
1.3.3 Social Systems Have Rampant Problems of Incomplete Information
18(1)
1.3.4 Social Systems Consist of Millions of Potential Behaviors
18(1)
1.3.5 Social Systems Are Often Open Systems
19(1)
1.3.6 Inferring Causation Is Almost Impossible
20(3)
1.4 Why Does Causation Matter?
23(1)
1.5 Actionable Insights
24(1)
2 Building a Theory of the Social Universe
25(22)
2.1 Building a Theory
25(11)
2.1.1 Won't Fancy Algorithms Solve All Our Problems?
26(1)
2.1.2 The Pervasive (and Generally Useless) One-Off Fact
26(1)
2.1.3 The Art of the Typology
27(1)
2.1.4 The Project Design Process: Theory Building
28(1)
2.1.5 Steps to a Good Theory
29(1)
2.1.6 Description: Questions and Goals
30(1)
2.1.7 Analytical: Theory and Concepts
31(1)
2.1.8 Qualities of a "Good" Theory
32(4)
2.2 Conceptualization and Measurement
36(4)
2.2.1 Conceptualization
36(2)
2.2.2 Measurement
38(1)
2.2.3 Hypothesis Generation
39(1)
2.3 Theories from a Web Product
40(6)
2.3.1 User Type Purchasing Model
40(1)
2.3.2 Feed Algorithm Model
41(1)
2.3.3 Middle School Dance Model
42(4)
2.4 Actionable Insights
46(1)
3 The Coveted Goalpost: How to Change Human Behavior
47(20)
3.1 Understanding Actionable Insight
47(3)
3.2 It's All about Changing "Your" Behavior
50(5)
3.2.1 Is It True Behavior Change?
51(1)
3.2.2 Quitting Smoking: The Herculean Task of Behavioral Change
52(1)
3.2.3 Measuring Behavior Change
53(2)
3.3 A Theory about Human Behavioral Change
55(4)
3.3.1 Learning
55(1)
3.3.2 Cognition
56(1)
3.3.3 Randomized Variable Investment Schedule
56(1)
3.3.4 Outsized Positive Rewards and Mitigated Losses
57(1)
3.3.5 Fogg Model of Change
57(1)
3.3.6 ABA Model of Change
58(1)
3.4 Change in a Web Product
59(2)
3.5 What Are Realistic Expectations for Behavioral Change?
61(5)
3.5.1 What Percentage of Users Will See a Real Change in Our Product?
61(1)
3.5.2 Are Certain Behaviors Easier to Change?
62(2)
3.5.3 Behavioral Change Worksheet
64(2)
3.6 Actionable Insights
66(1)
II Basic Statistical Methods
67(70)
4 Distributions in User Analytics
69(16)
4.1 Why Are Metrics Important?
69(13)
4.1.1 Statistical Tools for Metric Development
70(1)
4.1.2 Distributions
70(1)
4.1.3 Exploring a Distribution
71(1)
4.1.4 Mean, Median, and Mode
72(3)
4.1.5 Variance
75(2)
4.1.6 Sampling
77(1)
4.1.7 Other Measures
78(1)
4.1.8 The Exponential Distribution
79(1)
4.1.9 Bivariate Distribution
80(2)
4.2 Actionable Insights
82(3)
5 Retained? Metric Creation and Interpretation
85(22)
5.1 Period, Age, and Cohort
85(6)
5.1.1 Period
86(1)
5.1.2 Age
86(1)
5.1.3 Cohort
86(1)
5.1.4 Lexis Diagram
86(1)
5.1.5 Period versus Cohort?
87(1)
5.1.6 Cohort Retention
88(1)
5.1.7 Period Retention
89(1)
5.1.8 Standardization
90(1)
5.2 Metric Development
91(15)
5.2.1 Conceptualization
92(1)
5.2.2 Acquisition
93(1)
5.2.3 Ratio-Based Metrics
93(4)
5.2.4 Retention
97(3)
5.2.5 Engagement
100(2)
5.2.6 Revenue
102(1)
5.2.7 Progression Ratios
103(3)
5.3 Actionable Insights
106(1)
6 Why Are My Users Leaving? The Ins and Outs of A/B Testing
107(30)
6.1 An A/B Test
107(2)
6.2 The Curious Case of Free Weekly Events
109(4)
6.2.1 Spurious Correlation
110(2)
6.2.2 Selection Bias
112(1)
6.3 But It's Correlated
113(4)
6.3.1 Proportional Comparisons
113(1)
6.3.2 Linear Correlations
114(2)
6.3.3 Nonlinear Relationships
116(1)
6.4 Why Randomness?
117(2)
6.5 The Nuts and Bolts of an A/B Test
119(13)
6.5.1 A/B Test Process
119(1)
6.5.2 The Setup
120(2)
6.5.3 Randomization
122(1)
6.5.4 Hypothesis Testing
123(7)
6.5.5 Power Analysis
130(2)
6.6 Pitfalls in A/B Testing
132(3)
6.6.1 General Pitfalls
132(2)
6.6.2 No Randomness
134(1)
6.6.3 Differing Patterns Between Groups
134(1)
6.6.4 Differing Patterns in Long- and Short-Run Effects
135(1)
6.7 Actionable Insights
135(2)
III Predictive Methods
137(68)
7 Modeling the User Space: k-Means and PCA
139(12)
7.1 What Is a Model?
139(1)
7.2 Clustering Techniques
140(10)
7.2.1 Segmenting Users, Novice Users, and Unsupervised Learning
141(9)
7.3 Actionable Insights
150(1)
8 Predicting User Behavior: Regression, Decision Trees, and Support Vector Machines
151(22)
8.1 Predictive Inference
151(1)
8.2 Much Ado about Prediction?
152(2)
8.2.1 Applications of Predictive Algorithms
153(1)
8.2.2 Prediction in Behavioral Contexts Is Rarely Just Prediction
154(1)
8.3 Predictive Modeling
154(15)
8.3.1 Simple Explanation: Methods
155(14)
8.4 Validation of Supervised Learning Models
169(3)
8.4.1 k-Fold Cross-Validation
169(1)
8.4.2 Leave-One-Out Cross-Validation
170(1)
8.4.3 Precision, Recall, and the F1-Score
170(2)
8.5 Actionable Insights
172(1)
Appendix
172(1)
9 Forecasting Population Changes in Product: Demographic Projections
173(32)
9.1 Why Should We Spend Time on the Product Life Cycle?
174(1)
9.2 Birth, Death, and the Full Life Cycle
174(3)
9.3 Different Models of Retention
177(6)
9.3.1 The Transition Matrix
178(1)
9.3.2 Snowmobile Transition Example
179(4)
9.4 The Art of Population Prediction
183(20)
9.4.1 Population Projection Example
184(1)
9.4.2 User Death by a Thousand Cuts
185(8)
9.4.3 Exponential Growth Example
193(10)
9.5 Actionable Insights
203(2)
IV Causal Inference Methods
205(72)
10 In Pursuit of the Experiment: Natural Experiments and Difference-in-Difference Modeling
207(18)
10.1 Why Causal Inference?
208(1)
10.2 Causal Inference versus Prediction
208(3)
10.3 When A/B Testing Doesn't Work
211(2)
10.3.1 Broader Social Phenomena
211(1)
10.3.2 Historical Action
212(1)
10.3.3 No Infrastructure
212(1)
10.4 Nuts and Bolts of Causal Inference from Real-World Data
213(9)
10.4.1 Causal Inference Terminology
213(1)
10.4.2 Natural Experiments
214(4)
10.4.3 Operationalizing Geographic Space: Difference-in-Difference Modeling
218(4)
10.5 Actionable Insights
222(3)
11 In Pursuit of the Experiment, Continued
225(18)
11.1 Regression Discontinuity
226(3)
11.1.1 Nuts and Bolts of RD
226(1)
11.1.2 Potential RD Designs
226(1)
11.1.3 The Enemy of the Good: Nonrandom Selection at the Cut Point
227(1)
11.1.4 RD Complexities
228(1)
11.1.5 Graphing the Data
229(1)
11.2 Estimating the Causal Effect of Gaining a Badge
229(5)
11.2.1 Comparing Models
230(2)
11.2.2 Checking for Selection in Confounding Variables
232(2)
11.3 Interrupted Time Series
234(4)
11.3.1 Simple Regression Analysis
235(1)
11.3.2 Time-Series Modeling
236(2)
11.4 Seasonality Decomposition
238(3)
11.5 Actionable Insights
241(2)
12 Developing Heuristics in Practice
243(16)
12.1 Determining Causation from Real-World Data
243(1)
12.2 Statistical Matching
244(7)
12.2.1 Basics of Matching
244(1)
12.2.2 What Features "Cause" a User to Buy?
245(1)
12.2.3 Matching Theory
245(6)
12.3 Problems with Propensity Score Matching
251(2)
12.3.1 Omitted Variable Bias and Better Matching Methods
251(2)
12.3.2 No Coverage
253(1)
12.4 Matching as a Heuristic
253(1)
12.5 The Best Guess
254(3)
12.6 Final Thoughts
257(1)
12.7 Actionable Insights
258(1)
13 Uplift Modeling
259(18)
13.1 What Is Uplift?
259(1)
13.2 Why Uplift?
260(1)
13.3 Understanding Uplift
261(1)
13.4 Prediction and Uplift
261(1)
13.5 Difficulties with Uplift
262(13)
13.5.1 Lalonde Data Set
263(1)
13.5.2 Uplift Modeling
263(1)
13.5.3 Two-Model Approach
264(1)
13.5.4 Interaction Model
265(1)
13.5.5 Tree-Based Methods
266(1)
13.5.6 Random Forests
267(8)
13.6 Actionable Insights
275(2)
V Basic, Predictive, and Causal Inference Methods in R
277(110)
14 Metrics in R
279(30)
14.1 Why R?
279(1)
14.2 R Fundamentals: A Very Basic Introduction to R and Its Setup
280(5)
14.2.1 The R Language
280(1)
14.2.2 RStudio
281(1)
14.2.3 Installing Packages
282(1)
14.2.4 RMarkdown
282(2)
14.2.5 Reading Data into R
284(1)
14.2.6 NAs in R
284(1)
14.3 Sampling from Distributions in R
285(5)
14.4 Summary Statistics
290(1)
14.5 Q-Q Plot
291(2)
14.6 Calculating Variance and Higher Moments
293(1)
14.7 Histograms and Binning
294(7)
14.8 Bivariate Distribution and Correlation
301(1)
14.8.1 Calculating Metrics
302(3)
14.9 Parity Progression Ratios
305(2)
14.10 Summary
307(2)
15 A/B Testing, Predictive Modeling, and Population Projection in R
309(34)
15.1 A/B Testing in R
309(11)
15.1.1 Statistical Testing
310(8)
15.1.2 Power Analysis
318(2)
15.2 Clustering
320(4)
15.2.1 k-Means
320(1)
15.2.2 Principal Components Analysis
321(3)
15.3 Predictive Modeling
324(9)
15.3.1 Linear Regression
324(2)
15.3.2 Logistic Regression
326(1)
15.3.3 Decision Trees
327(3)
15.3.4 Support Vector Machines
330(1)
15.3.5 Cross-Validation
331(2)
15.4 Population Projection
333(9)
15.4.1 Example 1: User Death by a Thousand Cuts
336(3)
15.4.2 Example 2: The Exponential Growth Example
339(3)
15.5 Actionable Insights
342(1)
16 Regression Discontinuity, Matching, and Uplift in R
343(44)
16.1 Difference-in-Difference Modeling
343(3)
16.2 Regression Discontinuity and Time-Series Modeling
346(11)
16.2.1 Regression Discontinuity
347(5)
16.2.2 Interrupted Time Series
352(4)
16.2.3 Seasonality Decomposition
356(1)
16.3 Statistical Matching
357(13)
16.3.1 Plotting Balance
366(1)
16.3.2 Caliper Matching
367(1)
16.3.3 GenMatch()
368(2)
16.4 Uplift Modeling
370(13)
16.4.1 Two-Model Solution
371(1)
16.4.2 Interaction Model
372(3)
16.4.3 Causal Conditional Inference Forest and Uplift Forest Models
375(8)
16.5 Actionable Insights
383(1)
Appendix
383(4)
Conclusion 387(4)
Bibliography 391(6)
Index 397
Joanne Rodrigues is an experienced data scientist with master's degrees in mathematics, political science, and demography. She has six years of experience in statistical computing and R programming, as well as experience with Python for data science applications. Her management experience at enterprise companies draws on her ability to understand human behavior by using economic and sociological theory in the context of complex mathematical models.