Reviews

On Nov 10 Marco Rapella wrote: Getting Started with Impala – O’Reilly Media
This book introduces Impala SQL for Hadoop in a simple and crystal clear way. Impala is a modern and very interesting new entry in the Hadoop ecosystem. By implementing Impala, Cloudera provided a very useful layer on top of Hadoop. Impala makes SQL knowledge valuable again without requiring other low-level Hadoop skills. SQL is already known to many data professionals coming to Hadoop from various platforms and DBMSs, and this layer helps those professionals be ready and focused on the real value of Big Data: producing insights. To me the book reaches two important goals: first, it's highly readable and concise; second, it shows something under active development in a "stable" and "actionable" way. To make his view clear, the author delivers two main messages in the book: first, why the Impala implementation is good news for Hadoop; second, how to be ready for daily tasks and avoid pitfalls.

Rating: 5.0

On Oct 25 Surachart Opun wrote: Good Impala SQL tutorial
It focuses on Impala (SQL) and guides the reader with excellent examples. Easy to read and understand. It was not written to explain everything about Impala (SQL). Each tutorial gives plenty of ideas that can be adapted to real-world use.

Rating: 4.0

On Oct 24 Arthur Zubarev wrote: Easy-to-understand, good starter for doing things with Impala
Impala is a recent but very valuable addition to the Hadoop ecosystem. I must say (after reading the book) Cloudera made a big step forward in the right direction. The rationale behind bringing Impala to life is the proliferation of SQL. SQL as a language has many flavors, but in one form or another it is already known to data practitioners coming to Hadoop from various platforms and DBMSs. Impala implements a subset of the ANSI-92 SQL specification; regardless, even the subset is powerful enough to make a developer productive. In my opinion, since SQL is based on algebra and sets, and because HDFS (Hadoop) is just able to expose datasets, Impala is the right choice for DML and DDL even for Big Data projects. At 110 pages the book is not terribly long, but bear in mind Impala as a product is still under active development. As a bonus, the author has a close relationship with the product, working at Cloudera; this is a big plus resulting in top professional content. John structured the book so it is basically divided into two parts: the first and largest is on the Impala implementation and its role in data analysis and processing; the second covers the most commonly used tasks, pitfalls, or simply advice and techniques. What I did not find is more on how to use it with Hive, Sqoop, HBase and Pig; I will take a star off my rating for this. Let me reiterate, the book covers Cloudera's Hadoop Impala distribution; if you are using a different distribution, Impala is not part of it. Like I said, I am giving this book 4 out of 5 stars. Good work John! Disclaimer: the book was provided to me for free as part of O’Reilly’s blogger reviewer program.

Rating: 4.0

Top Reviewers

Michal Konrad Owsiak, 95 Reviews

Santosh Shanbhag, 64 Reviews

Surachart Opun, 61 Reviews

Doron Katz, 57 Reviews

Shawn Day, 55 Reviews


Featured Review

Programming Python

Matt Keranen wrote:
Long and comprehensive
A tutorial that will be useful as a later reference.

Rating: 4.0