Chapter 2. Getting Hadoop Up and Running

Now that we have explored the opportunities and challenges presented by large-scale data processing and why Hadoop is a compelling choice, it's time to get things set up and running.

In this chapter, we will do the following:

  • Learn how to install and run Hadoop on a local Ubuntu host
  • Run some example Hadoop programs and get familiar with the system
  • Set up the accounts required to use Amazon Web Services products such as EMR
  • Create an on-demand Hadoop cluster on Elastic MapReduce
  • Explore the key differences between a local and a hosted Hadoop cluster

Hadoop on a local Ubuntu host

For our exploration of Hadoop outside the cloud, we shall give examples using one or more Ubuntu hosts. A single machine (be it a physical ...

