Chapter 6. Implementing Streaming Applications

So far in the book, we have used Ray to implement serverless batch applications, in which data is collected, or provided by the user, and then used for calculations. Another important group of use cases requires you to process data in real time. We use the overloaded term real time to mean processing the data as it arrives, within some latency constraints. This type of data processing is called streaming.

In this book, we define streaming as taking action on a series of data close to the time that the data is created.
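To make the definition concrete, here is a minimal sketch in plain Python (not Ray) of acting on each item as it arrives rather than after the whole dataset has been collected. The names event_source, process_stream, and max_latency_s are hypothetical and chosen only for illustration, and the latency budget is an assumed value rather than anything prescribed by the text.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (creation timestamp, value)


def event_source(
    values: Iterable[float],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Event]:
    """Hypothetical source: stamps each value with its creation time as it is produced."""
    for value in values:
        yield clock(), value


def process_stream(
    events: Iterable[Event],
    max_latency_s: float,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[float, bool]]:
    """Act on every event as it arrives instead of waiting for the full dataset.

    Yields the running total so far and whether the event was handled
    within the assumed latency budget (max_latency_s).
    """
    running_total = 0.0
    for created_at, value in events:
        running_total += value
        on_time = (clock() - created_at) <= max_latency_s
        yield running_total, on_time
```

Iterating over process_stream(event_source([1.0, 2.0, 3.0]), max_latency_s=1.0) produces one result per incoming value, so a consumer can react to each update immediately. A batch version would instead gather all the values first and compute a single total at the end.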

Some common streaming use cases include the following:

Log analysis

A way of gaining insights into the state of your hardware and software. It is typically implemented as distributed processing of streams of logs as they are produced.

Fraud detection

The monitoring of financial transactions in real time to spot anomalies that signal fraud and to stop fraudulent transactions.

Cybersecurity

The monitoring of interactions with the system to detect anomalies, allowing the identification of security issues in real time to isolate threats.

Streaming logistics ...
