Errata for Programming Hive
The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint the date of the correction will be displayed in the column titled "Date Corrected".
The following errata were submitted by our customers and approved as valid errors by the author or editor.
| Version |
Location |
Description |
Submitted By |
Date Submitted |
Date Corrected |
| Printed, PDF |
Page ?
? |
The book text makes use of an employees data set.
I find no reference to a set of data that can be used with the book to run through the book examples as working exercises.
Do you have a data set to download for use with the book?
Did I miss it in the text?
I have the stock data; I'm referring to the Employee data you work with. You never tell us where to get it so we can work along with you on the examples.
Before creating my own data set, which is tedious, thought I'd ask you if you have one already constructed and available.
Note from the Author or Editor: I'll prepare a zip file of "extras" like this, which is a reasonable request.
|
Anonymous |
Oct 30, 2012 |
|
| Printed, PDF, ePub |
Page 5
United States |
In figure 1-1, there are several references to the word 'ie', but given the input, the word/token should be 'is'.
Note from the Author or Editor: Yes, this is correct. The figure is incorrect.
|
Bill Bates |
Oct 11, 2012 |
|
| Printed, PDF, ePub |
Page 5
United States |
In figure 1-1, in the reducers, the token/key 'there' should have a value of '[1,1]' and the token/key 'uses' should have the value '[1]'. It appears these two values have been mixed up.
Note from the Author or Editor: Yes, this is correct. The figure is incorrect.
|
Bill Bates |
Oct 11, 2012 |
|
| PDF |
Page 12
Hive sample |
'\s' SHOULD be '\\s'
Note from the Author or Editor: Correct.
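The same doubling applies in Java, which may make the correction easier to see. The sketch below is not the book's Hive sample; it assumes that Hive string literals treat the backslash as an escape character much as Java string literals do, so the regex `\s` has to be written with two backslashes. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class WhitespaceSplit {
    // In Java source, "\\s" is the two-character regex \s (one whitespace character).
    static String[] splitOnWhitespace(String line) {
        return line.split("\\s");
    }

    public static void main(String[] args) {
        // Splits the sample line into its three words.
        System.out.println(Arrays.toString(splitOnWhitespace("hive uses mapreduce")));
    }
}
```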
|
Tatsuo Kawasaki |
Apr 20, 2013 |
|
| Printed, PDF, ePub |
Page 42
Table of types and 4th paragraph after the table |
Our description of the new TIMESTAMP type followed the specification for the feature that was planned, but the implemented feature doesn't support all 3 formats listed. Instead, it only supports the UTC string format: "YYYY-MM-DD HH:MM:SS.FFFFFFFFF".
So, the "Literal Syntax Example" cell in the table should contain just this text: "'2012-02-03 12:34:56.123456789' (JDBC-compliant java.sql.Timestamp format)".
The 4th paragraph after the table should be modified to say this, although we could omit the parenthetical "Note":
Values of the new TIMESTAMP type must be strings that follow the JDBC date string format convention, YYYY-MM-DD hh:mm:ss.fffffffff. (Note: when support for TIMESTAMP was under development, support for integer and float literals was planned, where they would be interpreted as seconds and seconds plus nanoseconds, respectively, since the Unix epoch time. These formats were not implemented.)
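As an illustration of the format named above, the following sketch parses the example literal with `java.sql.Timestamp.valueOf`, which accepts the `yyyy-mm-dd hh:mm:ss[.fffffffff]` form. It shows the Java format only and makes no claim about how Hive itself parses the string. The class and method names are hypothetical.

```java
import java.sql.Timestamp;

final class TimestampLiteralCheck {
    // Returns true if the string is in the JDBC timestamp escape format.
    static boolean isJdbcTimestamp(String literal) {
        try {
            Timestamp.valueOf(literal);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2012-02-03 12:34:56.123456789");
        // The fractional part is kept at nanosecond precision.
        System.out.println(ts.getNanos());
        // An integer seconds-since-epoch string is not in this format.
        System.out.println(isJdbcTimestamp("1328272496"));
    }
}
```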
|
Dean Wampler |
Oct 21, 2012 |
|
| PDF |
Page 53
CREATE TABLE statement |
It seems LOCATION should be put before TBLPROPERTIES.
READ:
:
TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02 10:00:00', ...)
LOCATION '/user/hive/warehouse/mydb.db/employees';
SHOULD READ:
:
LOCATION '/user/hive/warehouse/mydb.db/employees'
TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02 10:00:00', ...);
Note from the Author or Editor: Yes, this is correct.
|
Tatsuo Kawasaki |
Apr 21, 2013 |
|
| PDF |
Page 68, 122, 123, 127
|
Data Type 'LONG' is not supported in Hive. It should be 'BIGINT'.
Note from the Author or Editor: OOPS! Yes, these LONGs should be BIGINTs in the Hive queries.
|
Tatsuo Kawasaki |
Apr 21, 2013 |
|
| PDF |
Page 71
1 |
The book text makes use of an employees data set.
I find no reference to a set of data that can be used with the book to run through the book examples as working exercises.
Do you have a data set to download for use with the book?
Did I miss it in the text?
I have the stock data; I'm referring to the Employee data you work with. You never tell us where to get it so we can work along with you on the examples.
Before creating my own data set, which is tedious, thought I'd ask you if you have one already constructed and available.
Note from the Author or Editor: I have posted a small example file here: http://polyglotprogramming.com/employees.txt
It uses the default delimiters for Hive, so just create the table as described in the book and drop this file in the appropriate HDFS directory.
|
Anonymous |
Feb 21, 2013 |
|
| PDF |
Page 119
Section "Rebuilding an Index" (starts on previous page) |
The statement:
ALTER INDEX employees_index ON TABLE employees PARTITION (country = 'US') REBUILD;
should not have the word TABLE. It should be
ALTER INDEX employees_index ON employees PARTITION (country = 'US') REBUILD;
Also, later on the page, the same error occurs in a different statement:
DROP INDEX IF EXISTS employees_index ON TABLE employees;
should be
DROP INDEX IF EXISTS employees_index ON employees;
|
Dean Wampler |
Feb 23, 2013 |
|
| PDF |
Page 135
In "Optimized Join" section |
/* streamtable(table_name) */ should be
/*+ streamtable(table_name) */
|
Dean Wampler |
Feb 23, 2013 |
|
| Printed |
Page 179
3rd paragraph |
Hi,
This is a very minor typo:
The third paragraph starts as " The benefit of this type of UDFT...." whereas it should be
" The benefit of this type of UDTF...." where UDTF stands for User-Defined Table Generating Function.
Regards,
Ramki.
Note from the Author or Editor: OOPS! Correct. Should be UDTF
|
Ramki Palle |
Feb 11, 2013 |
|
| Printed |
Page 185
2nd section |
Which version of Hive implements "CREATE TEMPORARY MACRO"? I looked for more information online and all I could find was a proposed patch on a still-open ticket:
https://issues.apache.org/jira/browse/HIVE-2655
Note from the Author or Editor: This is a bit embarrassing; it appears that the feature isn't actually in any Hive release, even the latest 0.10.0.
The text should be amended to say this is a planned feature that may appear in a release soon.
|
Terran Melconian |
Nov 12, 2012 |
|
| PDF |
Page 209
7th line |
"Avro is a serialization systemit’s main feature" there appears to be at least a comma and space missing between "system" and "it's" (and there shouldn't be an apostrophe).
Note from the Author or Editor: Yes, should be "... system. It's ..."
|
Peter Marron |
Mar 20, 2013 |
|
| Printed |
Page 217
java code to check "bad" table names |
Hi,
I understand that the intention of the code is to identify the external tables whose data reside inside the warehouse directory, which is /user/hive/warehouse.
The if clause currently reads:
if (t.getTableType().equals("MANAGED_TABLE") &&
! u.getPath()contains("/user/hive/warehouse") ) {
System.out.println (t.getTableName()
+ " is a non external table mounted inside /user/hive/warehouse" );
bad.add (t.getTableName());
}
There are two issues here:
1. The check inside the if clause.
2. The message in the println statement.
The code should be
if (! t.getTableType().equals("MANAGED_TABLE") &&
u.getPath()contains("/user/hive/warehouse") ) {
System.out.println (t.getTableName()
+ " is an external table mounted inside /user/hive/warehouse" );
bad.add (t.getTableName());
}
or
if (t.getTableType().equals("EXTERNAL") &&
u.getPath()contains("/user/hive/warehouse") ) {
System.out.println (t.getTableName()
+ " is an external table mounted inside /user/hive/warehouse" );
bad.add (t.getTableName());
}
Regards,
Ramki.
Note from the Author or Editor: UPDATE: Missing "." in one expression:
I think the second proposed alternative is better:
if (t.getTableType().equals("EXTERNAL") &&
u.getPath().contains("/user/hive/warehouse") ) {
System.out.println (t.getTableName()
+ " is an external table mounted inside /user/hive/warehouse" );
bad.add (t.getTableName());
}
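A self-contained sketch of the corrected check may help when trying it outside Hive. Everything here other than the condition itself is an assumption: `TableInfo` is a hypothetical stand-in for the metastore table object `t` and the location path `u.getPath()`, the table names are invented, and the type string is passed in as a parameter because the exact value a given metastore version reports for external tables should be verified rather than taken from this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WarehouseTableCheck {
    static final String WAREHOUSE_DIR = "/user/hive/warehouse";

    // Hypothetical stand-in for the metastore table (t) and its location path (u.getPath()).
    static final class TableInfo {
        final String name;
        final String tableType;
        final String path;

        TableInfo(String name, String tableType, String path) {
            this.name = name;
            this.tableType = tableType;
            this.path = path;
        }
    }

    // Returns the names of tables of the given type whose data sits inside the warehouse directory.
    static List<String> findBad(List<TableInfo> tables, String externalType) {
        List<String> bad = new ArrayList<>();
        for (TableInfo t : tables) {
            if (t.tableType.equals(externalType) && t.path.contains(WAREHOUSE_DIR)) {
                System.out.println(t.name + " is an external table mounted inside " + WAREHOUSE_DIR);
                bad.add(t.name);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Invented sample tables: only the second one should be reported.
        List<TableInfo> tables = Arrays.asList(
            new TableInfo("employees", "MANAGED_TABLE", "/user/hive/warehouse/mydb.db/employees"),
            new TableInfo("stocks_ext", "EXTERNAL", "/user/hive/warehouse/stocks_ext"),
            new TableInfo("logs_ext", "EXTERNAL", "/data/logs"));
        System.out.println(findBad(tables, "EXTERNAL"));
    }
}
```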
|
Ramki Palle |
Feb 11, 2013 |
|
| PDF |
Page 221
3rd Paragraph |
The 3rd paragraph appears to be missing at least one line. It trails off with "however it could output". It could output what?
Note from the Author or Editor: OOPS! I'm not sure what we intended for the missing part of the sentence, but let's just write "..., however it could output to text files."
|
Peter Marron |
Mar 20, 2013 |
|
| PDF |
Page 300
Middle of page |
READ: a number of segments separated by /r/n (carriage..
SHOULD READ: a number of segments separated by \r\n (carriage..
Note from the Author or Editor: Correct. This is a typo.
|
Tatsuo Kawasaki |
Apr 14, 2013 |
|
| ePub |
Page 18013
location 18013 in Kindle version (section on HBase) |
The "CREATE TABLE" statements that are listed for HBase are incorrect. Both the internal and external DDL that is listed contain only a single non-key column mapping in the hbase.columns.mapping SERDEPROPERTIES, while they use two non-key columns in the actual table creation statement. Both of these create table statements are invalid.
The DDL statements can be corrected by either adding an additional column mapping or reducing the number of columns defined in the table definition.
Note from the Author or Editor: Correct. Should be:
CREATE TABLE hbase_stocks(key INT, name STRING, price FLOAT) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,stock:val,price:val") TBLPROPERTIES ("hbase.table.name" = "stocks");
(I added ",price:val" to the SERDEPROPERTIES clause.)
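The mismatch described in this erratum can be caught with a simple count. The sketch below assumes the rule is that the comma-separated entries in `hbase.columns.mapping` (including `:key`) must equal the number of columns in the Hive table definition; it is a hypothetical helper, not part of Hive or its HBase storage handler.

```java
final class HBaseMappingCheck {
    // True when the mapping has one entry per Hive column (the :key entry counts as one).
    static boolean mappingMatches(String columnsMapping, int hiveColumnCount) {
        return columnsMapping.split(",").length == hiveColumnCount;
    }

    public static void main(String[] args) {
        // hbase_stocks declares three columns: key, name, price.
        System.out.println(mappingMatches(":key,stock:val", 3));
        System.out.println(mappingMatches(":key,stock:val,price:val", 3));
    }
}
```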
|
Gabriel Reid |
Feb 10, 2013 |
|
|