Book description
Every enterprise application creates data, whether it consists of log messages, metrics, user activity, or outgoing messages. Moving all this data is just as important as the data itself. With this updated edition, application architects, developers, and production engineers new to the Kafka streaming platform will learn how to handle data in motion. Additional chapters cover Kafka's AdminClient API, transactions, new security features, and tooling changes.
Engineers from Confluent and LinkedIn responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream processing applications with this platform. Through detailed examples, you'll learn Kafka's design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
You'll examine:
- Best practices for deploying and configuring Kafka
- Kafka producers and consumers for writing and reading messages
- Patterns and use-case requirements to ensure reliable data delivery
- Best practices for building data pipelines and applications with Kafka
- How to perform monitoring, tuning, and maintenance tasks with Kafka in production
- The most critical metrics among Kafka's operational measurements
- Kafka's delivery capabilities for stream processing systems
Table of contents
- Foreword to the Second Edition
- Foreword to the First Edition
- Preface
- 1. Meet Kafka
- 2. Installing Kafka
- 3. Kafka Producers: Writing Messages to Kafka
- 4. Kafka Consumers: Reading Data from Kafka
  - Kafka Consumer Concepts
  - Creating a Kafka Consumer
  - Subscribing to Topics
  - The Poll Loop
  - Configuring Consumers
    - fetch.min.bytes
    - fetch.max.wait.ms
    - fetch.max.bytes
    - max.poll.records
    - max.partition.fetch.bytes
    - session.timeout.ms and heartbeat.interval.ms
    - max.poll.interval.ms
    - default.api.timeout.ms
    - request.timeout.ms
    - auto.offset.reset
    - enable.auto.commit
    - partition.assignment.strategy
    - client.id
    - client.rack
    - group.instance.id
    - receive.buffer.bytes and send.buffer.bytes
    - offsets.retention.minutes
  - Commits and Offsets
  - Rebalance Listeners
  - Consuming Records with Specific Offsets
  - But How Do We Exit?
  - Deserializers
  - Standalone Consumer: Why and How to Use a Consumer Without a Group
  - Summary
- 5. Managing Apache Kafka Programmatically
- 6. Kafka Internals
- 7. Reliable Data Delivery
- 8. Exactly-Once Semantics
- 9. Building Data Pipelines
- 10. Cross-Cluster Data Mirroring
- 11. Securing Kafka
- 12. Administering Kafka
- 13. Monitoring Kafka
- 14. Stream Processing
- A. Installing Kafka on Other Operating Systems
- B. Additional Kafka Tools
- Index
Product information
- Title: Kafka: The Definitive Guide, 2nd Edition
- Author(s):
- Release date: November 2021
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492043089