Book description
Working with unbounded and fast-moving data streams has historically been difficult. But with Kafka Streams and ksqlDB, building stream processing applications is easy and fun. This practical guide shows data engineers how to use these tools to build highly scalable stream processing applications for moving, enriching, and transforming large amounts of data in real time.
Mitch Seymour, data services engineer at Mailchimp, explains important stream processing concepts against a backdrop of several interesting business problems. You'll learn the strengths of both Kafka Streams and ksqlDB to help you choose the best tool for each unique stream processing project. Non-Java developers will find the ksqlDB path to be an especially gentle introduction to stream processing.
- Learn the basics of Kafka and the pub/sub communication pattern
- Build stateless and stateful stream processing applications using Kafka Streams and ksqlDB
- Perform advanced stateful operations, including windowed joins and aggregations
- Understand how stateful processing works under the hood
- Learn about ksqlDB's data integration features, powered by Kafka Connect
- Work with different types of collections in ksqlDB and perform push and pull queries
- Deploy your Kafka Streams and ksqlDB applications to production
Table of contents
- Foreword
- Preface
- I. Kafka
- 1. A Rapid Introduction to Kafka
- II. Kafka Streams
- 2. Getting Started with Kafka Streams
- 3. Stateless Processing
- Stateless Versus Stateful Processing
- Introducing Our Tutorial: Processing a Twitter Stream
- Project Setup
- Adding a KStream Source Processor
- Serialization/Deserialization
- Filtering Data
- Branching Data
- Translating Tweets
- Merging Streams
- Enriching Tweets
- Serializing Avro Data
- Adding a Sink Processor
- Running the Code
- Empirical Verification
- Summary
- 4. Stateful Processing
- 5. Windows and Time
- 6. Advanced State Management
- 7. Processor API
- When to Use the Processor API
- Introducing Our Tutorial: IoT Digital Twin Service
- Project Setup
- Data Models
- Adding Source Processors
- Adding Stateless Stream Processors
- Creating Stateless Processors
- Creating Stateful Processors
- Periodic Functions with Punctuate
- Accessing Record Metadata
- Adding Sink Processors
- Interactive Queries
- Putting It All Together
- Combining the Processor API with the DSL
- Processors and Transformers
- Putting It All Together: Refactor
- Summary
- III. ksqlDB
- 8. Getting Started with ksqlDB
- 9. Data Integration with ksqlDB
- 10. Stream Processing Basics with ksqlDB
- 11. Intermediate and Advanced Stream Processing with ksqlDB
- IV. The Road to Production
- 12. Testing, Monitoring, and Deployment
- A. Kafka Streams Configuration
- B. ksqlDB Configuration
- Index
Product information
- Title: Mastering Kafka Streams and ksqlDB
- Author(s): Mitch Seymour
- Release date: February 2021
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492062493