Table of contents
- Foreword
- Preface
- I. Fundamentals of Stream Processing with Apache Spark
- 1. Introducing Stream Processing
- 2. Stream-Processing Model
- 3. Streaming Architectures
- 4. Apache Spark as a Stream-Processing Engine
- 5. Spark’s Distributed Processing Model
  - Running Apache Spark with a Cluster Manager
  - Spark’s Own Cluster Manager
  - Understanding Resilience and Fault Tolerance in a Distributed System
  - Data Delivery Semantics
  - Microbatching and One-Element-at-a-Time
  - Bringing Microbatch and One-Record-at-a-Time Closer Together
  - Dynamic Batch Interval
  - Structured Streaming Processing Model
- 6. Spark’s Resilience Model
- A. References for Part I
- II. Structured Streaming
- 7. Introducing Structured Streaming
- 8. The Structured Streaming Programming Model
- 9. Structured Streaming in Action
- 10. Structured Streaming Sources
- 11. Structured Streaming Sinks
- 12. Event Time–Based Stream Processing
- 13. Advanced Stateful Operations
- 14. Monitoring Structured Streaming Applications
- 15. Experimental Areas: Continuous Processing and Machine Learning
- B. References for Part II
- III. Spark Streaming
- 16. Introducing Spark Streaming
- 17. The Spark Streaming Programming Model
- 18. The Spark Streaming Execution Model
- 19. Spark Streaming Sources
- 20. Spark Streaming Sinks
- 21. Time-Based Stream Processing
- 22. Arbitrary Stateful Streaming Computation
- 23. Working with Spark SQL
- 24. Checkpointing
- 25. Monitoring Spark Streaming
- 26. Performance Tuning
- C. References for Part III
- IV. Advanced Spark Streaming Techniques
- 27. Streaming Approximation and Sampling Algorithms
  - Exactness, Real Time, and Big Data
  - The Exactness, Real-Time, and Big Data triangle
  - Approximation Algorithms
  - Hashing and Sketching: An Introduction
  - Counting Distinct Elements: HyperLogLog
  - Counting Element Frequency: Count Min Sketches
  - Ranks and Quantiles: T-Digest
  - Reducing the Number of Elements: Sampling
- 28. Real-Time Machine Learning
- D. References for Part IV
- V. Beyond Apache Spark
- 29. Other Distributed Real-Time Stream Processing Systems
- 30. Looking Ahead
- E. References for Part V
- Index
Product information
- Title: Stream Processing with Apache Spark
- Publisher(s): O'Reilly Media, Inc.