Using data from one season of NBA games, Basketball Data Science: With Applications in R is the perfect book for anyone interested in learning and applying data analytics in basketball. Whether assessing the spatial performance of an NBA player’s shots or analyzing the impact of high-pressure game situations on the probability of scoring, this book discusses a variety of case studies and hands-on examples using a custom R package. The code is supplied so readers can reproduce the analyses themselves or create their own. Assuming basic statistical knowledge, Basketball Data Science: With Applications in R is suitable for students, technicians, coaches, data analysts and applied researchers.
Features:
· One of the first books to provide statistical and data mining methods for the growing field of analytics in basketball.
· Presents tools for modelling, along with graphs and figures to visualize the data.
· Includes real world case studies and examples, such as estimations of scoring probability using the Golden State Warriors as a test case.
· Provides the source code and data so readers can do their own analyses on NBA teams and players.
Table of Contents:
Foreword
Preface
Authors
PART I Getting Started Analyzing Basketball Data
    1.1 What Is Data Science?
      1.1.1 Knowledge representation
      1.1.2 A tool for decisions and not a substitute for human intelligence
    1.2 Data Science in Basketball
    1.3 How the Book Is Structured
  Chapter 2 Data and Basic Statistical Analyses
    2.2 Basic Statistical Analyses
      2.2.1 Pace, Ratings, Four Factors
      2.2.6 Variability analysis
      2.2.7 Inequality analysis
  Chapter 3 Discovering Patterns in Data
    3.1 Quantifying Associations Between Variables
      3.1.1 Statistical dependence
    3.2 Analyzing Pairwise Linear Correlation Among Variables
    3.3 Visualizing Similarities Among Individuals
    3.4 Analyzing Network Relationships
    3.5 Estimating Event Densities
      3.5.1 Density with respect to a concurrent variable
      3.5.3 Joint density of two variables
    3.6 Focus: Shooting Under High-Pressure Conditions
  Chapter 4 Finding Groups in Data
      4.2.1 k-means clustering of NBA teams
      4.2.2 k-means clustering of Golden State Warriors' shots
    4.3 Agglomerative Hierarchical Clustering
      4.3.1 Hierarchical clustering of NBA players
    4.4 Focus: New Roles in Basketball
  Chapter 5 Modeling Relationships in Data
      5.1.1 Simple linear regression model
    5.2 Nonparametric Regression
      5.2.1 Polynomial local regression
      5.2.2 Gaussian kernel smoothing
        5.2.2.1 Estimation of scoring probability
        5.2.2.2 Estimation of expected points
    5.3 Focus: Surface Area Dynamics and Their Effects on the Team Performance
PART III Computational Insights
  Chapter 6 The R Package BasketballAnalyzeR
    6.4 Building Interactive Graphics
Bibliography
Index
Paola Zuccolotto and Marica Manisera are, respectively, Full and Associate Professor of Statistics at the University of Brescia. Paola Zuccolotto is the scientific director of the Big & Open Data Innovation Laboratory (BODaI-Lab), where she coordinates, together with Marica Manisera, the international project Big Data Analytics in Sports (BDsports).
They carry out scientific research in Statistical Science, taking both methodological and applied approaches. They have authored or co-authored several scientific articles in international journals and books, and have participated in many national and international conferences, also as organizers of specialized sessions, often on the topic of Sports Analytics. They regularly act as scientific reviewers for the world's most prestigious journals in the field of Statistics.
Paola Zuccolotto is a member of the Editorial Advisory Board of the Journal of Sports Sciences, while Marica Manisera is Associate Editor of the Journal of Sports Analytics; both are guest co-editors of special issues of international journals on Statistics in Sports. The International Statistical Institute (ISI) entrusted them with the task of revitalizing its Special Interest Group (SIG) on Sports Statistics. Marica Manisera is the Chair of the renewed ISI SIG on Sport.
Both teach undergraduate and graduate courses in Statistics and are responsible for the scientific area dedicated to Sports Analytics in the PhD program Analytics for Economics and Management at the University of Brescia. They also teach courses and seminars on Sports Analytics in university master's programs in Sports Engineering and in specialized training projects for people working in the sports world. They supervise student internships, final reports and master's theses in Statistics, often with applications to sports data. They also collaborate with high-school teachers, creating experimental educational projects to bring students closer to quantitative subjects through Sports Analytics.