Kafka Connect architecture

The following image shows Kafka Connect's architecture:

The data flow can be explained as follows:

  • Various sources are connected to the Kafka Connect cluster, which pulls data from them.
  • The Kafka Connect cluster consists of a set of worker processes, which are containers that execute connectors and tasks. The workers automatically coordinate with each other to distribute work and provide scalability and fault tolerance.
  • The Kafka Connect cluster pushes the data to the Kafka cluster.
  • The Kafka cluster persists the data to the brokers' local disks or to Hadoop.
  • Stream-processing applications such as Storm, Spark Streaming, and Flink pull ...
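
The way workers share tasks and recover from a failed worker, described in the list above, can be illustrated with a small toy model. This is only a sketch under stated assumptions: it uses simple round-robin assignment, which is not necessarily how Kafka Connect assigns work internally, and the function names, worker names, and task names are all hypothetical.

```python
def assign_tasks(workers, tasks):
    """Spread tasks across workers round-robin.

    Toy model only: this is an assumed strategy for illustration,
    not Kafka Connect's actual assignment algorithm.
    """
    if not workers:
        raise ValueError("at least one worker is required")
    assignment = {worker: [] for worker in workers}
    for index, task in enumerate(tasks):
        worker = workers[index % len(workers)]
        assignment[worker].append(task)
    return assignment


def rebalance_after_failure(assignment, failed_worker):
    """Redistribute all tasks across the workers that are still alive."""
    survivors = [w for w in assignment if w != failed_worker]
    all_tasks = sorted(task for tasks in assignment.values() for task in tasks)
    return assign_tasks(survivors, all_tasks)


# Hypothetical cluster: three workers sharing five tasks of one connector.
workers = ["worker-1", "worker-2", "worker-3"]
tasks = [f"jdbc-source-task-{i}" for i in range(5)]

initial = assign_tasks(workers, tasks)
after_failure = rebalance_after_failure(initial, "worker-2")

for worker, assigned in after_failure.items():
    print(worker, assigned)
```

In this sketch, no task is lost when `worker-2` disappears: its tasks are picked up by the remaining workers, which is the fault-tolerance property the list describes. Adding a worker and reassigning would spread the same tasks more thinly, which is the scalability side of the same idea.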
