
Foundations of Data Science [Hardcover]

4.24/5 (29 ratings by Goodreads)
Avrim Blum, John Hopcroft (Cornell University, New York), Ravi Kannan
  • Format: Hardback, 432 pages, height x width x thickness: 259x182x27 mm, weight: 930 g, contains worked examples and exercises
  • Publication date: 23-Jan-2020
  • Publisher: Cambridge University Press
  • ISBN-10: 1108485065
  • ISBN-13: 9781108485067
"This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data"--

Reviews

'This beautifully written text is a scholarly journey through the mathematical and algorithmic foundations of data science. Rigorous but accessible, and with many exercises, it will be a valuable resource for advanced undergraduate and graduate classes.' Peter Bartlett, University of California, Berkeley

'The rise of the Internet, digital media, and social networks has brought us to the world of data, with vast sources from every corner of society. Data Science - aiming to understand and discover the essences that underlie the complex, multifaceted, and high-dimensional data - has truly become a 'universal discipline', with its multidisciplinary roots, interdisciplinary presence, and societal relevance. This timely and comprehensive book presents - by bringing together material from diverse fields of computing - a full spectrum of mathematical, statistical, and algorithmic material fundamental to data analysis, machine learning, and network modeling. Foundations of Data Science offers an effective roadmap to approach this fascinating discipline and engages more advanced readers with rigorous mathematical/algorithmic theory.' Shang-Hua Teng, University of Southern California

'A lucid account of mathematical ideas that underlie today's data analysis and machine learning methods. I learnt a lot from it, and I am sure it will become an invaluable reference for many students, researchers and faculty around the world.' Sanjeev Arora, Princeton University, New Jersey

'It provides a very broad overview of the foundations of data science that should be accessible to well-prepared students with backgrounds in computer science, linear algebra, and probability theory. These are all important topics in the theory of machine learning, and it is refreshing to see them introduced together in a textbook at this level.' Brian Borchers, MAA Reviews

'One plausible measure of [Foundations of Data Science's] impact is the book's own citation metrics. Semantic Scholar (https://www.semanticscholar.org) reports 81 citations, with 42 citations related to background or methods; [Foundations of Data Science] appears to be on course to becoming influential.' M. Mounts, Choice

More information

Covers the mathematical and algorithmic foundations of data science: machine learning, high-dimensional geometry, and the analysis of large networks.
1 Introduction  1(3)
2 High-Dimensional Space  4(25)
2.1 Introduction  4(1)
2.2 The Law of Large Numbers  4(4)
2.3 The Geometry of High Dimensions  8(1)
2.4 Properties of the Unit Ball  8(5)
2.5 Generating Points Uniformly at Random from a Ball  13(2)
2.6 Gaussians in High Dimension  15(1)
2.7 Random Projection and Johnson-Lindenstrauss Lemma  16(2)
2.8 Separating Gaussians  18(2)
2.9 Fitting a Spherical Gaussian to Data  20(1)
2.10 Bibliographic Notes  21(1)
2.11 Exercises  22(7)
3 Best-Fit Subspaces and Singular Value Decomposition (SVD)  29(33)
3.1 Introduction  29(2)
3.2 Preliminaries  31(1)
3.3 Singular Vectors  31(3)
3.4 Singular Value Decomposition (SVD)  34(2)
3.5 Best Rank-k Approximations  36(1)
3.6 Left Singular Vectors  37(2)
3.7 Power Method for Singular Value Decomposition  39(3)
3.8 Singular Vectors and Eigenvectors  42(1)
3.9 Applications of Singular Value Decomposition  42(11)
3.10 Bibliographic Notes  53(1)
3.11 Exercises  54(8)
4 Random Walks and Markov Chains  62(47)
4.1 Stationary Distribution  65(2)
4.2 Markov Chain Monte Carlo  67(4)
4.3 Areas and Volumes  71(2)
4.4 Convergence of Random Walks on Undirected Graphs  73(8)
4.5 Electrical Networks and Random Walks  81(4)
4.6 Random Walks on Undirected Graphs with Unit Edge Weights  85(7)
4.7 Random Walks in Euclidean Space  92(3)
4.8 The Web as a Markov Chain  95(3)
4.9 Bibliographic Notes  98(1)
4.10 Exercises  99(10)
5 Machine Learning  109(50)
5.1 Introduction  109(1)
5.2 The Perceptron Algorithm  110(1)
5.3 Kernel Functions and Nonlinearly Separable Data  111(2)
5.4 Generalizing to New Data  113(5)
5.5 VC-Dimension  118(8)
5.6 VC-Dimension and Machine Learning  126(1)
5.7 Other Measures of Complexity  127(1)
5.8 Deep Learning  128(6)
5.9 Gradient Descent  134(4)
5.10 Online Learning  138(7)
5.11 Boosting  145(3)
5.12 Further Current Directions  148(4)
5.13 Bibliographic Notes  152(1)
5.14 Exercises  152(7)
6 Algorithms for Massive Data Problems: Streaming, Sketching, and Sampling  159(23)
6.1 Introduction  159(1)
6.2 Frequency Moments of Data Streams  160(9)
6.3 Matrix Algorithms Using Sampling  169(8)
6.4 Sketches of Documents  177(1)
6.5 Bibliographic Notes  178(1)
6.6 Exercises  179(3)
7 Clustering  182(33)
7.1 Introduction  182(3)
7.2 k-Means Clustering  185(4)
7.3 k-Center Clustering  189(1)
7.4 Finding Low-Error Clusterings  189(1)
7.5 Spectral Clustering  190(7)
7.6 Approximation Stability  197(2)
7.7 High-Density Clusters  199(2)
7.8 Kernel Methods  201(1)
7.9 Recursive Clustering Based on Sparse Cuts  202(1)
7.10 Dense Submatrices and Communities  202(3)
7.11 Community Finding and Graph Partitioning  205(3)
7.12 Spectral Clustering Applied to Social Networks  208(2)
7.13 Bibliographic Notes  210(1)
7.14 Exercises  210(5)
8 Random Graphs  215(59)
8.1 The G(n,p) Model  215(7)
8.2 Phase Transitions  222(10)
8.3 Giant Component  232(3)
8.4 Cycles and Full Connectivity  235(4)
8.5 Phase Transitions for Increasing Properties  239(2)
8.6 Branching Processes  241(5)
8.7 CNF-SAT  246(6)
8.8 Nonuniform Models of Random Graphs  252(2)
8.9 Growth Models  254(7)
8.10 Small-World Graphs  261(5)
8.11 Bibliographic Notes  266(1)
8.12 Exercises  266(8)
9 Topic Models, Nonnegative Matrix Factorization, Hidden Markov Models, and Graphical Models  274(44)
9.1 Topic Models  274(3)
9.2 An Idealized Model  277(2)
9.3 Nonnegative Matrix Factorization  279(2)
9.4 NMF with Anchor Terms  281(1)
9.5 Hard and Soft Clustering  282(1)
9.6 The Latent Dirichlet Allocation Model for Topic Modeling  283(2)
9.7 The Dominant Admixture Model  285(2)
9.8 Formal Assumptions  287(3)
9.9 Finding the Term-Topic Matrix  290(5)
9.10 Hidden Markov Models  295(3)
9.11 Graphical Models and Belief Propagation  298(1)
9.12 Bayesian or Belief Networks  299(1)
9.13 Markov Random Fields  300(1)
9.14 Factor Graphs  301(1)
9.15 Tree Algorithms  301(2)
9.16 Message Passing in General Graphs  303(7)
9.17 Warning Propagation  310(1)
9.18 Correlation between Variables  311(4)
9.19 Bibliographic Notes  315(1)
9.20 Exercises  315(3)
10 Other Topics  318(23)
10.1 Ranking and Social Choice  318(4)
10.2 Compressed Sensing and Sparse Vectors  322(3)
10.3 Applications  325(2)
10.4 An Uncertainty Principle  327(3)
10.5 Gradient  330(2)
10.6 Linear Programming  332(2)
10.7 Integer Optimization  334(1)
10.8 Semi-Definite Programming  334(2)
10.9 Bibliographic Notes  336(1)
10.10 Exercises  337(4)
11 Wavelets  341(19)
11.1 Dilation  341(1)
11.2 The Haar Wavelet  342(3)
11.3 Wavelet Systems  345(1)
11.4 Solving the Dilation Equation  346(1)
11.5 Conditions on the Dilation Equation  347(3)
11.6 Derivation of the Wavelets from the Scaling Function  350(3)
11.7 Sufficient Conditions for the Wavelets to Be Orthogonal  353(2)
11.8 Expressing a Function in Terms of Wavelets  355(1)
11.9 Designing a Wavelet System  356(1)
11.10 Applications  357(1)
11.11 Bibliographic Notes  357(1)
11.12 Exercises  357(3)
12 Background Material  360(51)
12.1 Definitions and Notation  360(1)
12.2 Useful Relations  361(4)
12.3 Useful Inequalities  365(7)
12.4 Probability  372(8)
12.5 Bounds on Tail Probability  380(6)
12.6 Applications of the Tail Bound  386(1)
12.7 Eigenvalues and Eigenvectors  387(13)
12.8 Generating Functions  400(4)
12.9 Miscellaneous  404(3)
12.10 Exercises  407(4)
References 411(10)
Index 421
Avrim Blum is Chief Academic Officer at the Toyota Technological Institute at Chicago and formerly Professor at Carnegie Mellon University, Pennsylvania. He has over 25,000 citations for his work in algorithms and machine learning. He has received the AI Journal Classic Paper Award, the ICML/COLT 10-Year Best Paper Award, a Sloan Fellowship, an NSF NYI award, and the Herb Simon Teaching Award, and is a Fellow of the Association for Computing Machinery (ACM).

John Hopcroft is a member of the National Academy of Sciences and the National Academy of Engineering, and a foreign member of the Chinese Academy of Sciences. He received the Turing Award in 1986, was appointed to the National Science Board in 1992 by President George H. W. Bush, and was presented with the Friendship Award by Premier Li Keqiang for his work in China.

Ravi Kannan is Principal Researcher at Microsoft Research, India. He was the recipient of the Fulkerson Prize in Discrete Mathematics (1991) and the ACM Knuth Prize (2011). He is a distinguished alumnus of the Indian Institute of Technology, Bombay, and his past faculty appointments include the Massachusetts Institute of Technology, Carnegie Mellon University, Pennsylvania, Yale University, Connecticut, and the Indian Institute of Science.