Big Data: Concepts, Technology, and Architecture [Hardcover]

Balamurugan Balusamy (Galgotias University, Greater Noida, India), Amir H. Gandomi (University of Technology Sydney, Australia), Nandhini Abirami R (VIT University, Vellore, India), Seifedine Kadry (Noroff University College, Kristiansand, Norway)
  • Format: Hardback, 368 pages, height x width x thickness: 10x10x10 mm, weight: 454 g
  • Publication date: 15-Jun-2021
  • Publisher: John Wiley & Sons Inc
  • ISBN-10: 1119701821
  • ISBN-13: 9781119701828
"This book offers comprehensive coverage of Big Data tools, terminologies and technologies for researchers, business professionals and graduates. This book begins with an overview of what Big Data is and emphasizes all the key concepts of big data end to end. Big Data concepts, technologies, terminologies and storing, processing and analysis techniques, and much more, are all logically organized and reinforced by diagrams and case studies. This book refines readers' understanding of Big Data with in-depth analysis of key concepts. The case studies provided in this book give insight on key concepts. The initial chapters of the book shed light on various characteristics of Big Data that distinguish it from traditional database management systems. Big Data Analytics are covered in detail in a separate chapter. Hadoop, the heart of Big Data, is handled in the Big Data processing chapter and a deep understanding of its concepts is provided"--

Learn Big Data from the ground up with this complete and up-to-date resource from leaders in the field 

Big Data: Concepts, Technology, and Architecture delivers a comprehensive treatment of Big Data tools, terminology, and technology perfectly suited to a wide range of business professionals, academic researchers, and students. Beginning with a thorough overview of what we mean when we say “Big Data,” the book moves on to discuss every stage of the lifecycle of Big Data.

You’ll learn about the creation of structured, unstructured, and semi-structured data, data storage solutions, traditional SQL-based database solutions, data processing, data analytics, machine learning, and data mining. You’ll also discover how specific technologies like Apache Hadoop, SQOOP, and Flume work.

Big Data also covers the central topic of big data visualization with Tableau, and you’ll learn how to create scatter plots, histograms, and bar, line, and pie charts with that software.

Accessibly organized, Big Data includes illuminating case studies throughout the material, showing you how the included concepts have been applied in real-world settings. Some of those concepts include: 

  • The common challenges facing big data technology and technologists, like data heterogeneity and incompleteness, data volume and velocity, storage limitations, and privacy concerns 
  • Relational and non-relational databases, like RDBMS, NoSQL, and NewSQL databases 
  • Virtualizing Big Data through encapsulation, partitioning, and isolation, as well as big data server virtualization
  • Apache software, including Hadoop, Cassandra, Avro, Pig, Mahout, Oozie, and Hive 
  • The Big Data analytics lifecycle, including business case evaluation, data preparation, extraction, transformation, analysis, and visualization 

Perfect for data scientists, data engineers, and database managers, Big Data also belongs on the bookshelves of business intelligence analysts who are required to make decisions based on large volumes of information. Executives and managers who lead teams responsible for keeping or understanding large datasets will also benefit from this book. 

 

Acknowledgments
About the Author
1 Introduction to the World of Big Data
1.1 Understanding Big Data
1.2 Evolution of Big Data
1.3 Failure of Traditional Database in Handling Big Data
1.4 3Vs of Big Data
1.5 Sources of Big Data
1.6 Different Types of Data
1.7 Big Data Infrastructure
1.8 Big Data Life Cycle
1.9 Big Data Technology
1.10 Big Data Applications
1.11 Big Data Use Cases
Chapter 1 Refresher
2 Big Data Storage Concepts
2.1 Cluster Computing
2.2 Distribution Models
2.3 Distributed File System
2.4 Relational and Non-Relational Databases
2.5 Scaling Up and Scaling Out Storage
Chapter 2 Refresher
3 NoSQL Database
3.1 Introduction to NoSQL
3.2 Why NoSQL
3.3 CAP Theorem
3.4 ACID
3.5 BASE
3.6 Schemaless Databases
3.7 NoSQL (Not Only SQL)
3.8 Migrating from RDBMS to NoSQL
Chapter 3 Refresher
4 Processing, Management Concepts, and Cloud Computing
Part I Big Data Processing and Management Concepts
4.1 Data Processing
4.2 Shared Everything Architecture
4.3 Shared-Nothing Architecture
4.4 Batch Processing
4.5 Real-Time Data Processing
4.6 Parallel Computing
4.7 Distributed Computing
4.8 Big Data Virtualization
Part II Managing and Processing Big Data in Cloud Computing
4.9 Introduction
4.10 Cloud Computing Types
4.11 Cloud Services
4.12 Cloud Storage
4.13 Cloud Architecture
Chapter 4 Refresher
5 Driving Big Data with Hadoop Tools and Technologies
5.1 Apache Hadoop
5.2 Hadoop Storage
5.3 Hadoop Computation
5.4 Hadoop 2.0
5.5 HBASE
5.6 Apache Cassandra
5.7 SQOOP
5.8 Flume
5.9 Apache Avro
5.10 Apache Pig
5.11 Apache Mahout
5.12 Apache Oozie
5.13 Apache Hive
5.14 Hive Architecture
5.15 Hadoop Distributions
Chapter 5 Refresher
6 Big Data Analytics
6.1 Terminology of Big Data Analytics
6.2 Big Data Analytics
6.3 Data Analytics Life Cycle
6.4 Big Data Analytics Techniques
6.5 Semantic Analysis
6.6 Visual analysis
6.7 Big Data Business Intelligence
6.8 Big Data Real-Time Analytics Processing
6.9 Enterprise Data Warehouse
Chapter 6 Refresher
7 Big Data Analytics with Machine Learning
7.1 Introduction to Machine Learning
7.2 Machine Learning Use Cases
7.3 Types of Machine Learning
Chapter 7 Refresher
8 Mining Data Streams and Frequent Itemset
8.1 Itemset Mining
8.2 Association Rules
8.3 Frequent Itemset Generation
8.4 Itemset Mining Algorithms
8.5 Maximal and Closed Frequent Itemset
8.6 Mining Maximal Frequent Itemsets: the GenMax Algorithm
8.7 Mining Closed Frequent Itemsets: the Charm Algorithm
8.8 CHARM Algorithm Implementation
8.9 Data Mining Methods
8.10 Prediction
8.11 Important Terms Used in Bayesian Network
8.12 Density Based Clustering Algorithm
8.13 DBSCAN
8.14 Kernel Density Estimation
8.15 Mining Data Streams
8.16 Time Series Forecasting
9 Cluster Analysis
9.1 Clustering
9.2 Distance Measurement Techniques
9.3 Hierarchical Clustering
9.4 Analysis of Protein Patterns in the Human Cancer-Associated Liver
9.5 Recognition Using Biometrics of Hands
9.6 Expectation Maximization Clustering Algorithm
9.7 Representative-Based Clustering
9.8 Methods of Determining the Number of Clusters
9.9 Optimization Algorithm
9.10 Choosing the Number of Clusters
9.11 Bayesian Analysis of Mixtures
9.12 Fuzzy Clustering
9.13 Fuzzy C-Means Clustering
10 Big Data Visualization
10.1 Big Data Visualization
10.2 Conventional Data Visualization Techniques
10.3 Tableau
10.4 Bar Chart in Tableau
10.5 Line Chart
10.6 Pie Chart
10.7 Bubble Chart
10.8 Box Plot
10.9 Tableau Use Cases
10.10 Installing R and Getting Ready
10.11 Data Structures in R
10.12 Importing Data from a File
10.13 Importing Data from a Delimited Text File
10.14 Control Structures in R
10.15 Basic Graphs in R
Index

BALAMURUGAN BALUSAMY, PhD, is a Professor with the School of Computing Science and Engineering at Galgotias University, Greater Noida, India.

NANDHINI ABIRAMI R is an IT Consultant and Research Scholar at VIT University in Vellore, India.

SEIFEDINE KADRY, PhD, is a Professor of Data Science at the Faculty of Applied Computing and Technology at Noroff University College, Kristiansand, Norway.

AMIR H. GANDOMI, PhD, is a Professor of Data Science at the Faculty of Engineering & Information Technology, University of Technology Sydney, Australia.