Chapter 2: Implementing, Testing, and Deploying Basic Pipelines

Now that we are familiar with the basic concepts of streaming data processing, this chapter takes a deep dive into building something practical with Apache Beam.

The purpose of this chapter is to give you hands-on experience in solving practical problems from start to finish. The chapter is divided into subsections, each following the same structure:

  1. Defining a practical problem
  2. Discussing the problem decomposition (and how to solve the problem using Beam's PTransform)
  3. Implementing a pipeline to solve the defined problem
  4. Testing and validating that we have implemented our pipeline correctly
  5. Deploying the pipeline, both locally and to a running cluster ...
