Book description
Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases.
Once you set up your Oozie server, you’ll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie’s security capabilities.
- Install and configure an Oozie server, and get an overview of basic concepts
- Journey through the world of writing and configuring workflows
- Learn how the Oozie coordinator schedules and executes workflows based on triggers
- Understand how Oozie manages data dependencies
- Use Oozie bundles to package several coordinator apps into a data pipeline
- Learn about security features and shared library management
- Implement custom extensions and write your own EL functions and actions
- Debug workflows and manage Oozie’s operational details
Table of contents
- Foreword
- Preface
- 1. Introduction to Oozie
- 2. Oozie Concepts
- 3. Setting Up Oozie
- 4. Oozie Workflow Actions
- 5. Workflow Applications
- 6. Oozie Coordinator
- 7. Data Trigger Coordinator
- 8. Oozie Bundles
- 9. Advanced Topics
- 10. Developer Topics
- 11. Oozie Operations
- Index
Product information
- Title: Apache Oozie
- Author(s):
- Release date: May 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781449369750