Building Spark Streaming Applications with Kafka

We have now gone through the components of Apache Kafka and the different APIs that can be used to develop applications on top of it. In the previous chapter, we learned about Kafka producers, brokers, and consumers, along with best practices for using Kafka as a messaging system.

In this chapter, we will cover Apache Spark, a distributed in-memory processing engine. We will then walk through Spark Streaming concepts and how to integrate Apache Kafka with Spark.

In short, we will cover the following topics:

  • Introduction to Spark
  • Internals of Spark, such as RDDs
  • Spark Streaming
  • Receiver-based approach (Spark-Kafka integration)
  • Direct approach (Spark-Kafka ...
