Native streaming versus micro-batch

Let's examine how stateful stream processing (as found in Apex and Flink) compares to the micro-batch approach of Apache Spark Streaming.

Let's look at the following diagram:

In the preceding diagram, the top half shows an example of processing in Spark Streaming and the bottom half shows an example in Apex. Because of its underlying "stateless" batch architecture, Spark Streaming processes a stream by dividing it into small batches (micro-batches) that typically last from 500 ms to a few seconds. A new task is scheduled for every micro-batch. Once scheduled, the new task needs to be initialized. Such initialization ...
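To make the contrast concrete, here is a minimal sketch in plain Python. It uses neither the Spark Streaming API nor the Apex API; the names `micro_batches`, `run_micro_batch`, `RunningSum`, and `run_native`, as well as the sample events, are hypothetical and exist only for illustration. The micro-batch path cuts the stream into 500 ms slices and starts one task per slice, while the native path keeps a single long-lived operator that handles each event as it arrives.

```python
from collections import defaultdict


def micro_batches(events, interval_ms=500):
    """Group (timestamp_ms, value) events into fixed-length time slices."""
    batches = defaultdict(list)
    for timestamp_ms, value in events:
        batches[timestamp_ms // interval_ms].append(value)
    # Return only the non-empty slices, in time order.
    return [batches[key] for key in sorted(batches)]


def run_micro_batch(events, interval_ms=500):
    """Micro-batch style: one new task per batch, state handed over explicitly."""
    tasks_started = 0
    carried_total = 0  # state must be passed from one batch to the next
    results = []
    for batch in micro_batches(events, interval_ms):
        tasks_started += 1  # stands in for scheduling and initializing a task
        carried_total += sum(batch)
        results.append(carried_total)
    return results, tasks_started


class RunningSum:
    """Native streaming style: a long-lived operator that owns its state."""

    def __init__(self):
        self.total = 0

    def process(self, value):
        self.total += value
        return self.total


def run_native(events):
    """Create the operator once, then feed it every event individually."""
    operator = RunningSum()
    return [operator.process(value) for _, value in events]


# Hypothetical sample stream of (timestamp in ms, value) pairs.
events = [(0, 1), (120, 2), (480, 3), (510, 4), (1200, 5)]

batch_results, tasks_started = run_micro_batch(events)
native_results = run_native(events)
```

With these sample events, the micro-batch path starts three tasks and emits one result per batch, whereas the native path emits one result per event from a single operator instance. The sketch only counts tasks; it does not model what scheduling or initializing a task actually costs.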
