Klientų aptarnavimas: +370 652 87781

Pagalba | Naujas vartotojas | Prisijungti

El. knyga: Handbook of Regression Modeling in People Analytics: With Examples in R and Python [Taylor & Francis e-book]

Keith McNulty

Formatas: 272 pages, 48 Line drawings, color; 48 Illustrations, color
Išleidimo metai: 30-Jul-2021
Leidėjas: Chapman & Hall/CRC
ISBN-13: 9781003194156

Kitos knygos pagal šią temą:

Taylor & Francis e-book
Kaina: 106,17 €*
* this price gives unlimited concurrent access for unlimited time
Standartinė kaina: 151,67 €
Sutaupote 30%

Formatas: 272 pages, 48 Line drawings, color; 48 Illustrations, color
Išleidimo metai: 30-Jul-2021
Leidėjas: Chapman & Hall/CRC
ISBN-13: 9781003194156

Kitos knygos pagal šią temą:

More info about Taylor & Francis e-books

Partnerių interneto svetainė: https://www.taylorfrancis.com/books/9781003194156

"This book is a learning resource on inferential statistics and regression analysis. It teaches how to do a wide range of statistical analyses in both R and in Python, ranging from simple hypothesis testing to advanced multivariate modelling. Although itis primarily focused on examples related to the analysis of people and talent, the methods easily transfer to any discipline. The book hits a 'sweet spot' where there is just enough mathematical theory to support a strong understanding of the methods, but with a step-by-step guide and easily reproducible examples and code, so that the methods can be put into practice immediately. This makes the book accessible to a wide readership, from public and private sector analysts and practitioners to students andresearchers"--

Despite the recent rapid growth in machine learning and predictive analytics, many of the statistical questions that are faced by researchers and practitioners still involve explaining why something is happening. Regression analysis is the best ‘swiss army knife’ we have for answering these kinds of questions.

This book is a learning resource on inferential statistics and regression analysis. It teaches how to do a wide range of statistical analyses in both R and in Python, ranging from simple hypothesis testing to advanced multivariate modelling. Although it is primarily focused on examples related to the analysis of people and talent, the methods easily transfer to any discipline. The book hits a ‘sweet spot’ where there is just enough mathematical theory to support a strong understanding of the methods, but with a step-by-step guide and easily reproducible examples and code, so that the methods can be put into practice immediately. This makes the book accessible to a wide readership, from public and private sector analysts and practitioners to students and researchers.

Key Features:

• 16 accompanying datasets across a wide range of contexts (e.g. academic, corporate, sports, marketing)
• Clear step-by-step instructions on executing the analyses.
• Clear guidance on how to interpret results.
• Primary instruction in R but added sections for Python coders.
• Discussion exercises and data exercises for each of the main chapters.
• Final chapter of practice material and datasets ideal for class homework or project work.

Foreword

xiii

Alexis Fink

Introduction

1 The Importance of Regression in People Analytics

(8)

1.1 Why is regression modeling so important in people analytics?

(1)

1.2 What do we mean by `modeling'?

(3)

1.2.1 The theory of inferential modeling

(2)

1.2.2 The process of inferential modeling

(1)

1.3 The structure, system and organization of this book

(3)

2 The Basics of the R Programming Language

(30)

2.1 What is R?

(1)

2.2 How to start using R

(1)

2.3 Data in R

(7)

2.3.1 Data types

(1)

2.3.2 Homogeneous data structures

(2)

2.3.3 Heterogeneous data structures

(2)

2.4 Working with dataframes

(6)

2.4.1 Loading and tidying data in dataframes

(4)

2.4.2 Manipulating dataframes

(2)

2.5 Functions, packages and libraries

(5)

2.5.1 Using functions

(1)

2.5.2 Help with functions

(1)

2.5.3 Writing your own functions

(1)

2.5.4 Installing packages

(1)

2.5.5 Using packages

(1)

2.5.6 The pipe operator

(1)

2.6 Errors, warnings and messages

(2)

2.7 Plotting and graphing

(3)

2.7.1 Plotting in base R

(2)

2.7.2 Specialist plotting and graphing packages

(1)

2.8 Documenting your work using R Markdown

(3)

2.9 Learning exercises

(2)

2.9.1 Discussion questions

(1)

2.9.2 Data exercises

(1)

3 Statistics Foundations

(26)

3.1 Elementary descriptive statistics of populations and samples

(6)

3.1.1 Mean, variance and standard deviation

(3)

3.1.2 Covariance and correlation

(3)

3.2 Distribution of random variables

(3)

3.2.1 Sampling of random variables

(1)

3.2.2 Standard errors, the t-distribution and confidence intervals

(2)

3.3 Hypothesis testing

(9)

3.3.1 Testing for a difference in means (Welch's i-test)

(3)

3.3.2 Testing for a non-zero correlation between two variables (i-test for correlation)

(2)

3.3.3 Testing for a difference in frequency distribution between different categories in a data set (Chi-square test)

(2)

3.4 Foundational statistics in Python

(4)

3.5 Learning exercises

(3)

3.5.1 Discussion tiuestions

(1)

3.5.2 Data exercises

(2)

4 Linear Regression for Continuous Outcomes

(36)

4.1 When to use it

(4)

4.1.1 Origins and intuition of linear regression

(1)

4.1.2 Use cases for linear regression

(1)

4.1.3 Walkthrough example

(2)

4.2 Simple linear regression

(7)

4.2.1 Linear relationship between a single input and an outcome

(1)

4.2.2 Minimising the error

(3)

4.2.3 Determining the best fit

(1)

4.2.4 Measuring the fit of the model

(2)

4.3 Multiple linear regression

(6)

4.3.1 Running a multiple linear regression model and interpreting its coefficients

(1)

4.3.2 Coefficient confidence

(1)

4.3.3 Model `goodness-of-fit'

(3)

4.3.4 Making predictions from your model

(1)

4.4 Managing inputs in linear regression

(4)

4.4.1 Relevance of input variables

(1)

4.4.2 Sparseness (`missingness') of data

(1)

4.4.3 Transforming categorical inputs to dummy variables

(2)

4.5 Testing your model assumptions

(7)

4.5.1 Assumption of linearity and additivity

(2)

4.5.2 Assumption of constant error variance

(1)

4.5.3 Assumption of normally distributed errors

(1)

4.5.4 Avoiding high collinearity and multicollinearity between input variables

(3)

4.6 Extending multiple linear regression

(4)

4.6.1 Interactions between input variables

(3)

4.6.2 Quadratic and higher-order polynomial terms

(1)

4.7 Learning exercises

(4)

4.7.1 Discussion questions

(1)

4.7.2 Data exercises

(4)

5 Binomial Logistic Regression for Binary Outcomes

101

(26)

5.1 When to use it

102

(4)

5.1.1 Origins and intuition of binomial logistic regression

102

(1)

5.1.2 Use cases for binomial logistic regression

103

(1)

5.1.3 Walkthrough example

104

(2)

5.2 Modeling probabilistic outcomes using a logistic function

106

(6)

5.2.1 Deriving the concept of log odds

107

(2)

5.2.2 Modeling the log odds and interpreting the coefficients

109

(1)

5.2.3 Odds versus probability

110

(2)

5.3 Running a multivariate binomial logistic regression model

112

(10)

5.3.1 Running and interpreting a multivariate binomial logistic regression model

113

(3)

5.3.2 Understanding the fit and goodness-of-fit of a binomial logistic regression model

116

(4)

5.3.3 Model parsimony

120

(2)

5.4 Other considerations in binomial logistic regression

122

(2)

5.5 Learning exercises

124

(3)

5.5.1 Discussion questions

124

(1)

5.5.2 Data exercises

124

(3)

6 Multinomial Logistic Regression for Nominal Category Outcomes

127

(16)

6.1 When to use it

127

(4)

6.1.1 Intuition for multinomial logistic regression

127

(1)

6.1.2 Use cases for multinomial logistic regression

128

(1)

6.1.3 Walkthrough example

128

(3)

6.2 Running stratified binomial models

131

(2)

6.2.1 Modeling the choice of Product A versus other products

131

(2)

6.2.2 Modeling other choices

133

(1)

6.3 Running a multinomial regression model

133

(5)

6.3.1 Defining a reference level and running the model

134

(2)

6.3.2 Interpreting the model

136

(1)

6.3.3 Changing the reference

137

(1)

6.4 Model simplification, fit and goodness-of-fit for multinomial logistic regression models

138

(2)

6.4.1 Gradual safe elimination of variables

138

(1)

6.4.2 Model fit and goodness-of-fit

139

(1)

6.5 Learning exercises

140

(3)

6.5.1 Discussion questions

140

(1)

6.5.2 Data exercises

141

(2)

7 Proportional Odds Logistic Regression for Ordered Category Outcomes

143

(20)

7.1 When to use it

143

(5)

7.1.1 Intuition for proportional odds logistic regression

143

(2)

7.1.2 Use cases for proportional odds logistic regression

145

(1)

7.1.3 Walkthrough example

145

(3)

7.2 Modeling ordinal outcomes under the assumption of proportional odds

148

(7)

7.2.1 Using a latent continuous outcome variable to derive a proportional odds model

148

(2)

7.2.2 Running a proportional odds logistic regression model

150

(3)

7.2.3 Calculating the likelihood of an observation being in a specific ordinal category

153

(1)

7.2.4 Model diagnostics

154

(1)

7.3 Testing the proportional odds assumption

155

(4)

7.3.1 Sighting the coefficients of stratified binomial models

156

(1)

7.3.2 The Brant-Wald test

157

(1)

7.3.3 Alternatives to proportional odds models

158

(1)

7.4 Learning exercises

159

(4)

7.4.1 Discussion questions

159

(1)

7.4.2 Data exercises

160

(3)

8 Modeling Explicit and Latent Hierarchy in Data

163

(24)

8.1 Mixed models for explicit hierarchy in data

164

(6)

8.1.1 Fixed and random effects

164

(1)

8.1.2 Running a mixed model

165

(5)

8.2 Structural equation models for latent hierarchy in data

170

(15)

8.2.1 Running and assessing the measurement model

173

(7)

8.2.2 Running and interpreting the structural model

180

(5)

8.3 Learning exercises

185

(2)

8.3.1 Discussion questions

185

(1)

8.3.2 Data exercises

185

(2)

9 Survival Analysis for Modeling Singular Events Over Time

187

(16)

9.1 Tracking and illustrating survival rates over the study period

189

(4)

9.2 Cox proportional hazard regression models

193

(4)

9.2.1 Running a Cox proportional hazard regression model

194

(2)

9.2.2 Checking the proportional hazard assumption

196

(1)

9.3 Frailty models

197

(3)

9.4 Learning exercises

200

(3)

9.4.1 Discussion questions

200

(1)

9.4.2 Data exercises

201

(2)

10 Alternative Technical Approaches in R and Python

203

(18)

10.1 `Tidier' modeling approaches in R

204

(5)

10.1.1 The broom package

204

(4)

10.1.2 The parsnip package

208

(1)

10.2 Inferential statistical modeling in Python

209

(12)

10.2.1 Ordinary Least Squares (OLS) linear regression

209

(2)

10.2.2 Binomial logistic regression

211

(1)

10.2.3 Multinomial logistic regression

212

(1)

10.2.4 Structural equation models

213

(2)

10.2.5 Survival analysis

215

(3)

10.2.6 Other model variants

218

(3)

11 Power Analysis to Estimate Required Sample Sizes for Modeling

221

(14)

11.1 Errors, effect sizes and statistical power

222

(2)

11.2 Power analysis for simple hypothesis tests

224

(4)

11.3 Power analysis for linear regression models

228

(1)

11.4 Power analysis for log-likelihood regression models

229

(2)

11.5 Power analysis for hierarchicarregression models

231

(1)

11.6 Power analysis using Python

232

(3)

12 Further Exercises for Practice

235

(12)

12.1 Analyzing graduate salaries

235

(2)

12.1.1 The graduates data set

236

(1)

12.1.2 Discussion questions

236

(1)

12.1.3 Data exercises

236

(1)

12.2 Analyzing a recruiting process

237

(2)

12.2.1 The recruiting data set

238

(1)

12.2.2 Discussion questions

238

(1)

12.2.3 Data exercises

239

(1)

12.3 Analyzing the drivers of performance ratings

239

(2)

12.3.1 The employee performance data set

240

(1)

12.3.2 Discussion questions

240

(1)

12.3.3 Data exercises

241

(1)

12.4 Analyzing promotion differences between groups

241

(2)

12.4.1 The promotion data set

242

(1)

12.4.2 Discussion questions

242

(1)

12.4.3 Data exercises

242

(1)

12.5 Analyzing feedback on learning programs

243

(4)

12.5.1 The learning data set

243

(1)

12.5.2 Discussion questions

244

(1)

12.5.3 Data exercises

244

(3)

References

247

(2)

Glossary

249

(4)

Index

253

Keith McNulty, PhD is a leading practitioner of applied statistics, psychometrics and people analytics. He is currently Global Director of Talent Science and Analytics at McKinsey and Company.

Pastovi nuoroda: https://www.kriso.lt/db/9781003194156_pe.html

Raktažodžiai:

El. knyga: Handbook of Regression Modeling in People Analytics: With Examples in R and Python [Taylor & Francis e-book]

Paskyra ir nustatymai

Paieška

Ieškoti duomenų bazėje

Patikslinti paiešką

Temos Publishers Subjects

Pasirinkti pirkinių krepšelį