Chapter 16. Hive Thrift Service
Hive has an optional component known as HiveServer or HiveThrift that allows access to Hive over a single port. Thrift is a software framework for scalable cross-language services development. See http://thrift.apache.org/ for more details. Thrift allows clients using languages including Java, C++, Ruby, and many others, to programmatically access Hive remotely.
The CLI is the most common way to access Hive. However, the design of the CLI can make it difficult to use programmatically. The CLI is a fat client; it requires a local copy of all the Hive components and configuration as well as a copy of a Hadoop client and its configuration. Additionally, it works as an HDFS client, a MapReduce client, and a JDBC client (to access the metastore). Even with the proper client installation, having all of the correct network access can be difficult, especially across subnets or datacenters.
Starting the Thrift Server
To get started with the HiveServer, start it in the background using the service knob for hive:
$ cd $HIVE_HOME
$ bin/hive --service hiveserver &
Starting Hive Thrift Server
A quick way to ensure the HiveServer is running is to use the netstat command to determine if port 10000 is open and listening for connections:
$ netstat -nl | grep 10000
tcp 0 0 :::10000 :::* LISTEN
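The same check can also be made from a program, which is handy when a script must confirm the server is reachable before submitting work. The following is a minimal sketch in Python, not part of Hive itself: the helper name is_port_open and the host "localhost" are assumptions for illustration, and port 10000 is the port the netstat check looks for.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout seconds."""
    try:
        # create_connection completes the TCP handshake and raises OSError
        # (including timeouts and "connection refused") on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "localhost" is an assumed host; 10000 is the port checked with netstat.
    print(is_port_open("localhost", 10000))
```

Unlike netstat, which inspects the local socket table, this probe attempts a real connection, so it can also be run from a remote client machine to confirm that the network path to the server is open. It only shows that something is accepting TCP connections on the port, not that the listener is actually the HiveServer.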
(Some whitespace removed.) As mentioned, the HiveService uses Thrift. Thrift provides an interface language. With the interface, the Thrift compiler generates code that creates network ...