
E-book: Probability and Statistics for Data Science: Math + R + Data

4.33/5 (28 ratings by Goodreads)

DRM restrictions

  • Copying: not allowed
  • Printing: not allowed
  • Using the e-book:

    Digital rights management (DRM)
    The publisher has supplied this book in encrypted form, which means that free software must be installed to unlock and read it. To read this e-book, you must create an Adobe ID. More information here. The e-book can be downloaded to up to 6 devices (one user with the same Adobe ID).

    Required software
    To read this e-book on a mobile device (phone or tablet), you must install this free app: PocketBook Reader (iOS / Android)

    To read this e-book on a PC or Mac, you need Adobe Digital Editions (a free application designed specifically for e-books; it is not the same as Adobe Reader, which you probably already have on your computer).

    You cannot read this e-book on an Amazon Kindle.

Probability and Statistics for Data Science: Math + R + Data covers mathematical statistics (distributions, expected value, estimation, and so on), but takes the phrase Data Science in the title quite seriously:

  • Real datasets are used extensively.
  • All data analysis is supported by R coding.
  • Includes many Data Science applications, such as PCA, mixture distributions, random graph models, Hidden Markov models, linear and logistic regression, and neural networks.
  • Leads the student to think critically about the how and why of statistics, and to see the big picture.
  • Not theorem/proof-oriented, but concepts and models are stated in a mathematically precise manner.

Prerequisites are calculus, some matrix algebra, and some experience in programming.

Reviews

"I quite like this book. I believe that the book describes itself quite well when it says: Mathematically correct yet highly intuitiveThis book would be great for a class that one takes before one takes my statistical learning class. I often run into beginning graduate Data Science students whose background is not math (e.g., CS or Business) and they are not readyThe book fills an important niche, in that it provides a self-contained introduction to material that is useful for a higher-level statistical learning course. I think that it compares well with competing books, particularly in that it takes a more "Data Science" and "example driven" approach than more classical books." ~Randy Paffenroth, Worchester Polytechnic Institute

"This text by Matloff (Univ. of California, Davis) affords an excellent introduction to statistics for the data science studentIts examples are often drawn from data science applications such as hidden Markov models and remote sensing, to name a few All the models and concepts are explained well in precise mathematical terms (not presented as formal proofs), to help students gain an intuitive understanding." ~CHOICE

Table of Contents

About the Author
To the Instructor
To the Reader

I Fundamentals of Probability

1 Basic Probability Models
1.1 Example: Bus Ridership
1.2 A "Notebook" View: the Notion of a Repeatable Experiment
1.2.1 Theoretical Approaches
1.2.2 A More Intuitive Approach
1.3 Our Definitions
1.4 "Mailing Tubes"
1.5 Example: Bus Ridership Model (cont'd.)
1.6 Example: ALOHA Network
1.6.1 ALOHA Network Model Summary
1.6.2 ALOHA Network Computations
1.7 ALOHA in the Notebook Context
1.8 Example: A Simple Board Game
1.9 Bayes' Rule
1.9.1 General Principle
1.9.2 Example: Document Classification
1.10 Random Graph Models
1.10.1 Example: Preferential Attachment Model
1.11 Combinatorics-Based Computation
1.11.1 Which Is More Likely in Five Cards, One King or Two Hearts?
1.11.2 Example: Random Groups of Students
1.11.3 Example: Lottery Tickets
1.11.4 Example: Gaps between Numbers
1.11.5 Multinomial Coefficients
1.11.6 Example: Probability of Getting Four Aces in a Bridge Hand
1.12 Exercises

2 Monte Carlo Simulation
2.1 Example: Rolling Dice
2.1.1 First Improvement
2.1.2 Second Improvement
2.1.3 Third Improvement
2.2 Example: Dice Problem
2.3 Use of runif() for Simulating Events
2.4 Example: Bus Ridership (cont'd.)
2.5 Example: Board Game (cont'd.)
2.6 Example: Broken Rod
2.7 How Long Should We Run the Simulation?
2.8 Computational Complements
2.8.1 More on the replicate() Function
2.9 Exercises

3 Discrete Random Variables: Expected Value
3.1 Random Variables
3.2 Discrete Random Variables
3.3 Independent Random Variables
3.4 Example: The Monty Hall Problem
3.5 Expected Value
3.5.1 Generality - Not Just for Discrete Random Variables
3.5.2 Misnomer
3.5.3 Definition and Notebook View
3.6 Properties of Expected Value
3.6.1 Computational Formula
3.6.2 Further Properties of Expected Value
3.7 Example: Bus Ridership
3.8 Example: Predicting Product Demand
3.9 Expected Values via Simulation
3.10 Casinos, Insurance Companies and "Sum Users," Compared to Others
3.11 Mathematical Complements
3.11.1 Proof of Property E
3.12 Exercises

4 Discrete Random Variables: Variance
4.1 Variance
4.1.1 Definition
4.1.2 Central Importance of the Concept of Variance
4.1.3 Intuition Regarding the Size of Var(X)
4.1.3.1 Chebychev's Inequality
4.1.3.2 The Coefficient of Variation
4.2 A Useful Fact
4.3 Covariance
4.4 Indicator Random Variables, and Their Means and Variances
4.4.1 Example: Return Time for Library Books, Version I
4.4.2 Example: Return Time for Library Books, Version II
4.4.3 Example: Indicator Variables in a Committee Problem
4.5 Skewness
4.6 Mathematical Complements
4.6.1 Proof of Chebychev's Inequality
4.7 Exercises

5 Discrete Parametric Distribution Families
5.1 Distributions
5.1.1 Example: Toss Coin Until First Head
5.1.2 Example: Sum of Two Dice
5.1.3 Example: Watts-Strogatz Random Graph Model
5.1.3.1 The Model
5.2 Parametric Families of Distributions
5.3 The Case of Importance to Us: Parametric Families of pmfs
5.4 Distributions Based on Bernoulli Trials
5.4.1 The Geometric Family of Distributions
5.4.1.1 R Functions
5.4.1.2 Example: A Parking Space Problem
5.4.2 The Binomial Family of Distributions
5.4.2.1 R Functions
5.4.2.2 Example: Parking Space Model
5.4.3 The Negative Binomial Family of Distributions
5.4.3.1 R Functions
5.4.3.2 Example: Backup Batteries
5.5 Two Major Non-Bernoulli Models
5.5.1 The Poisson Family of Distributions
5.5.1.1 R Functions
5.5.1.2 Example: Broken Rod
5.5.2 The Power Law Family of Distributions
5.5.2.1 The Model
5.5.3 Fitting the Poisson and Power Law Models to Data
5.5.3.1 Poisson Model
5.5.3.2 Straight-Line Graphical Test for the Power Law
5.5.3.3 Example: DNC E-mail Data
5.6 Further Examples
5.6.1 Example: The Bus Ridership Problem
5.6.2 Example: Analysis of Social Networks
5.7 Computational Complements
5.7.1 Graphics and Visualization in R
5.8 Exercises

6 Continuous Probability Models
6.1 A Random Dart
6.2 Individual Values Now Have Probability Zero
6.3 But Now We Have a Problem
6.4 Our Way Out of the Problem: Cumulative Distribution Functions
6.4.1 CDFs
6.4.2 Non-Discrete, Non-Continuous Distributions
6.5 Density Functions
6.5.1 Properties of Densities
6.5.2 Intuitive Meaning of Densities
6.5.3 Expected Values
6.6 A First Example
6.7 Famous Parametric Families of Continuous Distributions
6.7.1 The Uniform Distributions
6.7.1.1 Density and Properties
6.7.1.2 R Functions
6.7.1.3 Example: Modeling of Disk Performance
6.7.1.4 Example: Modeling of Denial-of-Service Attack
6.7.2 The Normal (Gaussian) Family of Continuous Distributions
6.7.2.1 Density and Properties
6.7.2.2 R Functions
6.7.2.3 Importance in Modeling
6.7.3 The Exponential Family of Distributions
6.7.3.1 Density and Properties
6.7.3.2 R Functions
6.7.3.3 Example: Garage Parking Fees
6.7.3.4 Memoryless Property of Exponential Distributions
6.7.3.5 Importance in Modeling
6.7.4 The Gamma Family of Distributions
6.7.4.1 Density and Properties
6.7.4.2 Example: Network Buffer
6.7.4.3 Importance in Modeling
6.7.5 The Beta Family of Distributions
6.7.5.1 Density Etc.
6.7.5.2 Importance in Modeling
6.8 Mathematical Complements
6.8.1 Hazard Functions
6.8.2 Duality of the Exponential Family with the Poisson Family
6.9 Computational Complements
6.9.1 R's integrate() Function
6.9.2 Inverse Method for Sampling from a Density
6.9.3 Sampling from a Poisson Distribution
6.10 Exercises

II Fundamentals of Statistics

7 Statistics: Prologue
7.1 Importance of This Chapter
7.2 Sampling Distributions
7.2.1 Random Samples
7.3 The Sample Mean - a Random Variable
7.3.1 Toy Population Example
7.3.2 Expected Value and Variance of X̄
7.3.3 Toy Population Example Again
7.3.4 Interpretation
7.3.5 Notebook View
7.4 Simple Random Sample Case
7.5 The Sample Variance
7.5.1 Intuitive Estimation of σ²
7.5.2 Easier Computation
7.5.3 Special Case: X Is an Indicator Variable
7.6 To Divide by n or n-1?
7.6.1 Statistical Bias
7.7 The Concept of a "Standard Error"
7.8 Example: Pima Diabetes Study
7.9 Don't Forget: Sample ≠ Population!
7.10 Simulation Issues
7.10.1 Sample Estimates
7.10.2 Infinite Populations?
7.11 Observational Studies
7.12 Computational Complements
7.12.1 The *apply() Functions
7.12.1.1 R's apply() Function
7.12.1.2 The lapply() and sapply() Functions
7.12.1.3 The split() and tapply() Functions
7.12.2 Outliers/Errors in the Data
7.13 Exercises

8 Fitting Continuous Models
8.1 Why Fit a Parametric Model?
8.2 Model-Free Estimation of a Density from Sample Data
8.2.1 A Closer Look
8.2.2 Example: BMI Data
8.2.3 The Number of Bins
8.2.3.1 The Bias-Variance Tradeoff
8.2.3.2 The Bias-Variance Tradeoff in the Histogram Case
8.2.3.3 A General Issue: Choosing the Degree of Smoothing
8.3 Advanced Methods for Model-Free Density Estimation
8.4 Parameter Estimation
8.4.1 Method of Moments
8.4.2 Example: BMI Data
8.4.3 The Method of Maximum Likelihood
8.4.4 Example: Humidity Data
8.5 MM vs. MLE
8.6 Assessment of Goodness of Fit
8.7 The Bayesian Philosophy
8.7.1 How Does It Work?
8.7.2 Arguments For and Against
8.8 Mathematical Complements
8.8.1 Details of Kernel Density Estimators
8.9 Computational Complements
8.9.1 Generic Functions
8.9.2 The gmm Package
8.9.2.1 The gmm() Function
8.9.2.2 Example: Bodyfat Data
8.10 Exercises

9 The Family of Normal Distributions
9.1 Density and Properties
9.1.1 Closure under Affine Transformation
9.1.2 Closure under Independent Summation
9.1.3 A Mystery
9.2 R Functions
9.3 The Standard Normal Distribution
9.4 Evaluating Normal cdfs
9.5 Example: Network Intrusion
9.6 Example: Class Enrollment Size
9.7 The Central Limit Theorem
9.7.1 Example: Cumulative Roundoff Error
9.7.2 Example: Coin Tosses
9.7.3 Example: Museum Demonstration
9.7.4 A Bit of Insight into the Mystery
9.8 X̄ Is Approximately Normal
9.8.1 Approximate Distribution of X̄
9.8.2 Improved Assessment of Accuracy of X̄
9.9 Importance in Modeling
9.10 The Chi-Squared Family of Distributions
9.10.1 Density and Properties
9.10.2 Example: Error in Pin Placement
9.10.3 Importance in Modeling
9.10.4 Relation to Gamma Family
9.11 Mathematical Complements
9.11.1 Convergence in Distribution, and the Precisely-Stated CLT
9.12 Computational Complements
9.12.1 Example: Generating Normal Random Numbers
9.13 Exercises

10 Introduction to Statistical Inference
10.1 The Role of Normal Distributions
10.2 Confidence Intervals for Means
10.2.1 Basic Formulation
10.3 Example: Pima Diabetes Study
10.4 Example: Humidity Data
10.5 Meaning of Confidence Intervals
10.5.1 A Weight Survey in Davis
10.6 Confidence Intervals for Proportions
10.6.1 Example: Machine Classification of Forest Covers
10.7 The Student-t Distribution
10.8 Introduction to Significance Tests
10.9 The Proverbial Fair Coin
10.10 The Basics
10.11 General Normal Testing
10.12 The Notion of "p-Values"
10.13 What's Random and What Is Not
10.14 Example: The Forest Cover Data
10.15 Problems with Significance Testing
10.15.1 History of Significance Testing
10.15.2 The Basic Issues
10.15.3 Alternative Approach
10.16 The Problem of "P-hacking"
10.16.1 A Thought Experiment
10.16.2 Multiple Inference Methods
10.17 Philosophy of Statistics
10.17.1 More about Interpretation of CIs
10.17.1.1 The Bayesian View of Confidence Intervals
10.18 Exercises

III Multivariate Analysis

11 Multivariate Distributions
11.1 Multivariate Distributions: Discrete
11.1.1 Example: Marbles in a Bag
11.2 Multivariate Distributions: Continuous
11.2.1 Motivation and Definition
11.2.2 Use of Multivariate Densities in Finding Probabilities and Expected Values
11.2.3 Example: Train Rendezvous
11.3 Measuring Co-variation
11.3.1 Covariance
11.3.2 Example: The Committee Example Again
11.4 Correlation
11.4.1 Sample Estimates
11.5 Sets of Independent Random Variables
11.5.1 Mailing Tubes
11.5.1.1 Expected Values Factor
11.5.1.2 Covariance Is 0
11.5.1.3 Variances Add
11.6 Matrix Formulations
11.6.1 Mailing Tubes: Mean Vectors
11.6.2 Covariance Matrices
11.6.3 Mailing Tubes: Covariance Matrices
11.7 Sample Estimate of Covariance Matrix
11.7.1 Example: Pima Data
11.8 Mathematical Complements
11.8.1 Convolution
11.8.1.1 Example: Backup Battery
11.8.2 Transform Methods
11.8.2.1 Generating Functions
11.8.2.2 Sums of Independent Poisson Random Variables Are Poisson Distributed
11.9 Exercises

12 The Multivariate Normal Family of Distributions
12.1 Densities
12.2 Geometric Interpretation
12.3 R Functions
12.4 Special Case: New Variable Is a Single Linear Combination of a Random Vector
12.5 Properties of Multivariate Normal Distributions
12.6 The Multivariate Central Limit Theorem
12.7 Exercises

13 Mixture Distributions
13.1 Iterated Expectations
13.1.1 Conditional Distributions
13.1.2 The Theorem
13.1.3 Example: Flipping Coins with Bonuses
13.1.4 Conditional Expectation as a Random Variable
13.1.5 What about Variance?
13.2 A Closer Look at Mixture Distributions
13.2.1 Derivation of Mean and Variance
13.2.2 Estimation of Parameters
13.2.2.1 Example: Old Faithful Estimation
13.3 Clustering
13.4 Exercises

14 Multivariate Description and Dimension Reduction
14.1 What Is Overfitting Anyway?
14.1.1 "Desperate for Data"
14.1.2 Known Distribution
14.1.3 Estimated Mean
14.1.4 The Bias/Variance Tradeoff: Concrete Illustration
14.1.5 Implications
14.2 Principal Components Analysis
14.2.1 Intuition
14.2.2 Properties of PCA
14.2.3 Example: Turkish Teaching Evaluations
14.3 The Log-Linear Model
14.3.1 Example: Hair Color, Eye Color and Gender
14.3.2 Dimension of Our Data
14.3.3 Estimating the Parameters
14.4 Mathematical Complements
14.4.1 Statistical Derivation of PCA
14.5 Computational Complements
14.5.1 R Tables
14.5.2 Some Details on Log-Linear Models
14.5.2.1 Parameter Estimation
14.5.2.2 The loglin() Function
14.5.2.3 Informal Assessment of Fit
14.6 Exercises

15 Predictive Modeling
15.1 Example: Heritage Health Prize
15.2 The Goals: Prediction and Description
15.2.1 Terminology
15.3 What Does "Relationship" Mean?
15.3.1 Precise Definition
15.3.2 Parametric Models for the Regression Function m()
15.4 Estimation in Linear Parametric Regression Models
15.5 Example: Baseball Data
15.5.1 R Code
15.6 Multiple Regression
15.7 Example: Baseball Data (cont'd.)
15.8 Interaction Terms
15.9 Parametric Estimation
15.9.1 Meaning of "Linear"
15.9.2 Random-X and Fixed-X Regression
15.9.3 Point Estimates and Matrix Formulation
15.9.4 Approximate Confidence Intervals
15.10 Example: Baseball Data (cont'd.)
15.11 Dummy Variables
15.12 Classification
15.12.1 Classification = Regression
15.12.2 Logistic Regression
15.12.2.1 The Logistic Model: Motivations
15.12.2.2 Estimation and Inference for Logit
15.12.3 Example: Forest Cover Data
15.12.4 R Code
15.12.5 Analysis of the Results
15.12.5.1 Multiclass Case
15.13 Machine Learning: Neural Networks
15.13.1 Example: Predicting Vertebral Abnormalities
15.13.2 But What Is Really Going On?
15.13.3 R Packages
15.14 Computational Complements
15.14.1 Computational Details in Section 15.5.1
15.14.2 More Regarding glm()
15.15 Exercises

16 Model Parsimony and Overfitting
16.1 What Is Overfitting?
16.1.1 Example: Histograms
16.1.2 Example: Polynomial Regression
16.2 Can Anything Be Done about It?
16.2.1 Cross-Validation
16.3 Predictor Subset Selection
16.4 Exercises

17 Introduction to Discrete Time Markov Chains
17.1 Matrix Formulation
17.2 Example: Die Game
17.3 Long-Run State Probabilities
17.3.1 Stationary Distribution
17.3.2 Calculation of π
17.3.3 Simulation Calculation of π
17.4 Example: 3-Heads-in-a-Row Game
17.5 Example: Bus Ridership Problem
17.6 Hidden Markov Models
17.6.1 Example: Bus Ridership
17.6.2 Computation
17.7 Google PageRank
17.8 Computational Complements
17.8.1 Initializing a Matrix to All 0s
17.9 Exercises

IV Appendices

A R Quick Start
A.1 Starting R
A.2 Correspondences
A.3 First Sample Programming Session
A.4 Vectorization
A.5 Second Sample Programming Session
A.6 Recycling
A.7 More on Vectorization
A.8 Default Argument Values
A.9 The R List Type
A.9.1 The Basics
A.9.2 S3 Classes
A.10 Data Frames
A.11 Online Help
A.12 Debugging in R

B Matrix Algebra
B.1 Terminology and Notation
B.1.1 Matrix Addition and Multiplication
B.2 Matrix Transpose
B.3 Matrix Inverse
B.4 Eigenvalues and Eigenvectors
B.5 Mathematical Complements
B.5.1 Matrix Derivatives

Bibliography
Index

Norman Matloff is a professor of computer science at the University of California, Davis, and was formerly a statistics professor there. He is on the editorial boards of the Journal of Statistical Software and The R Journal. His book Statistical Regression and Classification: From Linear Models to Machine Learning was the recipient of the Ziegel Award for the best book reviewed in Technometrics in 2017. He is a recipient of his university's Distinguished Teaching Award.