Book description
Explore big data concepts, platforms, analytics, and their applications using the power of Hadoop 3
About This Book
- Learn Hadoop 3 to build effective big data analytics solutions on-premises and in the cloud
- Integrate Hadoop with other big data tools such as R, Python, Apache Spark, and Apache Flink
- Exploit big data using Hadoop 3 with real-world examples
Big Data Analytics with Hadoop 3 is for you if you want to build high-performance analytics solutions for your enterprise or business using Hadoop 3's powerful features, or if you are new to big data analytics. A basic understanding of the Java programming language is required.
What You Will Learn
- Explore the new features of Hadoop 3 along with HDFS, YARN, and MapReduce
- Get well-versed in the analytical capabilities of the Hadoop ecosystem using practical examples
- Integrate Hadoop with R and Python for more efficient big data processing
- Learn to use Hadoop with Apache Spark and Apache Flink for real-time data analytics
- Set up a Hadoop cluster on AWS cloud
- Perform big data analytics on AWS using Elastic MapReduce
Apache Hadoop is the most popular platform for big data processing, and can be combined with a host of other big data tools to build powerful analytics solutions. Big Data Analytics with Hadoop 3 shows you how to do just that, by providing insights into the software as well as its benefits with the help of practical examples.
Once you have taken a tour of Hadoop 3's latest features, you will get an overview of HDFS, MapReduce, and YARN, and how they enable faster, more efficient big data processing. You will then learn how to integrate Hadoop with open source tools such as Python and R to analyze and visualize data and perform statistical computing on big data. Next, you will explore how to use Hadoop 3 with Apache Spark and Apache Flink for real-time data analytics and stream processing. Finally, you will see how to use Hadoop to build analytics solutions on the cloud, along with an end-to-end pipeline for big data analysis, using practical use cases.
By the end of this book, you will be well-versed in the analytical capabilities of the Hadoop ecosystem. You will be able to build powerful solutions that perform big data analytics and extract insights from your data.
Style and approach
Filled with practical examples and use cases, this book will not only help you get up and running with Hadoop, but will also take you further into big data analytics.
Table of contents
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributors
- Preface
- Introduction to Hadoop
  - Hadoop Distributed File System
  - MapReduce framework
  - YARN
  - Other changes
  - Installing Hadoop 3
  - Summary
- Overview of Big Data Analytics
  - Introduction to data analytics
  - Introduction to big data
  - Distributed computing using Apache Hadoop
  - The MapReduce framework
  - Hive
  - Apache Spark
  - Visualization using Tableau
  - Summary
- Big Data Processing with MapReduce
- Scientific Computing and Big Data Analysis with Python and Hadoop
- Statistical Big Data Computing with R and Hadoop
- Batch Analytics with Apache Spark
- Real-Time Analytics with Apache Spark
- Batch Analytics with Apache Flink
- Stream Processing with Apache Flink
  - Introduction to streaming execution model
  - Data processing using the DataStream API
  - Summary
- Visualizing Big Data
- Introduction to Cloud Computing
- Using Amazon Web Services
  - Amazon Elastic Compute Cloud
  - Launching multiple instances of an AMI
  - What is AWS Lambda?
  - Introduction to Amazon S3
    - Getting started with Amazon S3
    - Comprehensive security and compliance capabilities
    - Query in place
    - Flexible management
    - Most supported platform with the largest ecosystem
    - Easy and flexible data transfer
    - Backup and recovery
    - Data archiving
    - Data lakes and big data analytics
    - Hybrid Cloud storage
    - Cloud-native application data
    - Disaster recovery
  - Amazon DynamoDB
  - Amazon Kinesis Data Streams
  - AWS Glue
  - Amazon EMR
  - Summary
Product information
- Title: Big Data Analytics with Hadoop 3
- Author(s):
- Release date: May 2018
- Publisher(s): Packt Publishing
- ISBN: 9781788628846