Kafka Fundamentals
Published by O'Reilly Media, Inc.
A hands-on course in mastering Kafka at scale
Apache Kafka is an increasingly popular foundation for large-scale software systems. In this course, you’ll learn how to use Kafka to publish and subscribe to data streams, and how Kafka can be applied to a range of use cases. You’ll also learn how to install and configure a Kafka cluster, and how to use the Kafka APIs to produce and consume data. We’ll also discuss how to connect Kafka to technologies for stream processing, log aggregation, and other big-data workloads.
What you’ll learn and how you can apply it
- Why Kafka is scalable
- How to interact with Kafka
- Kafka’s role in enterprise architectures
- How to design Kafka topics and partitions
Participants will be able to:
- Install and configure Kafka
- Publish data to Kafka
- Subscribe to data from Kafka
- Design Kafka topics and partitions
This live event is for you because...
- You are a software architect with experience building enterprise systems, and you need to ensure that your systems are scalable and fault tolerant
- You are a software developer with Java experience, and you need to build software on top of Kafka
Prerequisites
- Basic knowledge of Java
Recommended Preparation:
A GitHub link to the installation instructions will be provided.
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
DAY 1
Introduction (Lecture ~ 20 min)
- Who are we
- What is Kafka
- Explain first lab
Verify that everything is installed and working (Lab ~ 20 min)
- Install Kafka through Docker
- Run a simple example of Kafka
Introduction to Kafka (Lecture ~ 30 min)
- Kafka under the hood
- What is a topic
- What is a partition
- What is a producer
- What is a consumer
Creating a topic and passing a message (Lab ~30 min)
- Create a topic
- Run a simple consumer
- Run a simple producer
Dissecting the first example (Lecture/Discussion ~ 30 min)
- Walkthrough of the first lab
- Questions and answers
Design of Kafka topics and partitions (Lecture ~ 30 min)
- Case study
- How to select topics
- How to select partitions
Exercise: Designing topics and partitions (Group Project ~ 20 min)
- Design topics and partitions
DAY 2
Evaluation of the designs and suggested solutions (Discussion ~20 min)
- Discussion of the suggested solution(s)
- Recommended design of case study
Implement Topics and Partitions for case study (Lab ~30 min)
- Define a topic and partition in Kafka
- Create a consumer and producer
- Run a test script
Scaling Kafka (Lecture ~30 min)
- Kafka Brokers
- Kafka Clusters
- Cluster mirroring
- Consumer groups
Streaming APIs for Kafka (Lecture ~20 min)
- What is streaming
- Why use streams
- Programming to streams
- Example streams using Spark
Streaming and IoT Case Study (Lab ~30 min)
- Consume a stream from Kafka
- Build a Spark application over the Kafka stream
Kafka Administration and Integration (Lecture ~30 min)
- Integration with Big Data tools (Storm, Spark, Hadoop)
- Kafka Connect
- Certified Kafka connectors
- Kafka administration
- Kafka monitoring
- Security
Your Instructor
Petter Graff
Petter Graff is a partner at Northscaler, helping Fortune 500 companies reach their potential through training, consulting, and custom development. Petter has extensive experience building large-scale software systems for many Fortune 500 companies. He was the main architect behind the open source project Yaktor (yaktor.io), which relies on Kafka to deliver messages across large clusters of computation nodes. Yaktor and Kafka have been used by various companies to build systems processing millions of messages per second. Petter is also a frequent conference speaker and an O’Reilly author (check out his video series on Design Patterns in Java).