Chapter 2. Getting Started with Kafka Streams

Kafka Streams is a lightweight yet powerful Java library for enriching, transforming, and processing real-time streams of data. In this chapter, you will be introduced to Kafka Streams at a high level. Think of it as a first date, where you will learn a little about Kafka Streams’ background and get an initial glance at its features.

By the end of this date, er…I mean chapter, you will understand the following:

  • Where Kafka Streams fits in the Kafka ecosystem

  • Why Kafka Streams was built in the first place

  • What kinds of features and operational characteristics are present in this library

  • Who Kafka Streams is appropriate for

  • How Kafka Streams compares to other stream processing solutions

  • How to create and run a basic Kafka Streams application

So without further ado, let’s get our metaphorical date started with a simple question for Kafka Streams: where do you live (…in the Kafka ecosystem)?

The Kafka Ecosystem

Kafka Streams lives among a group of technologies that are collectively referred to as the Kafka ecosystem. In Chapter 1, we learned that at the heart of Apache Kafka is a distributed, append-only log that we can produce messages to and read messages from. Furthermore, the core Kafka code base includes some important APIs for interacting with this log (which is separated into categories of messages called topics). Three APIs in the Kafka ecosystem, which are summarized in Table 2-1, are concerned with the movement ...
