Chapter 3. What Flink Does
Apache Flink brings a fresh approach to the role of stream processor, completing the streaming architecture described in Chapter 2. One strength of a technology like this is that it lets you build applications that fit real life well. To understand what Flink does and how you might want to use it, consider some of the key aspects that make it versatile, in particular those that let it address “correctness” in several important ways.
Different Types of Correctness
In Chapter 1, we saw the consequences of not doing streaming well. Here, we look at how Flink helps you do streaming correctly and what that means. In the simplest sense, people think of correctness as accuracy: if you are counting, for example, have you counted correctly? Accuracy matters, but a number of other issues also affect what counts as “correct,” especially if you think of it in the slightly larger terms of how well your computation fits the world you are trying to model and analyze. Another way to put this is: for your data processing, you want “what you want, what you expect, when you want it.”
Natural Fit for Sessions
One way in which streaming in general, and Flink in particular, offers correctness is through a more natural fit between how computational windows are defined and how data actually occurs. Consider tracking the activity of three users (A, B, and C in Figure 3-1) on a website monitored through clickstream ...