Chapter 8. Architectures for Streaming

In this chapter, you will learn why the industry is moving inexorably away from batch and toward streaming. We will discuss different streaming architectures and how to choose between them. We will also take a deeper dive into two of these architectures, micro-batching and streaming pipelines, and discuss how to support real-time, ad hoc querying in both. Finally, sometimes the reason to do streaming is to autonomously take some action when certain events happen, and we will discuss how to architect such automated systems.

The Value of Streaming

Businesses along the entire technology maturity spectrum, from digital natives to more traditional companies, and across many industries, are recognizing the increasing value of making faster decisions. For example, consider business A, which takes three days to approve a vehicle loan. Business B, on the other hand, will approve or deny a loan in minutes. That added convenience gives business B a competitive advantage.

Even better than making faster decisions is being able to make decisions in context. Making a decision while the event is still in progress (see Figure 8-1) is significantly more valuable than making it even a few minutes later. For example, if you can detect a fraudulent credit card when it is presented for payment and reject the transaction, you can avoid the costly process of getting reimbursed.

Figure 8-1. The value of a decision typically drops with ...
