Scaling Python with Ray

Book description

Serverless computing enables developers to concentrate solely on their applications rather than worrying about where they are deployed. With Ray, a general-purpose serverless implementation in Python, programmers and data scientists can hide servers, implement stateful applications, support direct communication between tasks, and access hardware accelerators.

In this book, experienced software architecture practitioners Holden Karau and Boris Lublinsky show you how to scale existing Python applications and pipelines, allowing you to stay in the Python ecosystem while reducing single points of failure and manual scheduling. Scaling Python with Ray is ideal for software architects and developers eager to explore successful case studies and learn more about decision and measurement effectiveness.

If your data processing or server application has grown beyond what a single computer can handle, this book is for you. You'll explore distributed processing with Ray (a pure Python implementation of serverless) and learn how to:

  • Implement stateful applications with Ray actors
  • Build workflow management in Ray
  • Use Ray as a unified system for batch and stream processing
  • Apply advanced data processing with Ray
  • Build microservices with Ray
  • Implement reliable Ray applications

Table of contents

  1. Foreword
  2. Preface
    1. What You Will Learn
    2. A Note on Responsibility
    3. Conventions Used in This Book
    4. License
    5. Using Code Examples
    6. O’Reilly Online Learning
    7. How to Contact Us
    8. Acknowledgments
      1. From Holden
      2. From Boris
  3. 1. What Is Ray, and Where Does It Fit?
    1. Why Do You Need Ray?
    2. Where Can You Run Ray?
    3. Running Your Code with Ray
    4. Where Does It Fit in the Ecosystem?
      1. Big Data / Scalable DataFrames
      2. Machine Learning
      3. Workflow Scheduling
      4. Streaming
      5. Interactive
    5. What Ray Is Not
    6. Conclusion
  4. 2. Getting Started with Ray (Locally)
    1. Installation
      1. Installing for x86 and M1 ARM
      2. Installing (from Source) for ARM
    2. Hello Worlds
      1. Ray Remote (Task/Futures) Hello World
      2. Data Hello World
      3. Actor Hello World
    3. Conclusion
  5. 3. Remote Functions
    1. Essentials of Ray Remote Functions
    2. Composition of Remote Ray Functions
    3. Ray Remote Best Practices
    4. Bringing It Together with an Example
    5. Conclusion
  6. 4. Remote Actors
    1. Understanding the Actor Model
    2. Creating a Basic Ray Remote Actor
    3. Implementing the Actor’s Persistence
    4. Scaling Ray Remote Actors
    5. Ray Remote Actors Best Practices
    6. Conclusion
  7. 5. Ray Design Details
    1. Fault Tolerance
    2. Ray Objects
    3. Serialization/Pickling
      1. cloudpickle
      2. Apache Arrow
    4. Resources / Vertical Scaling
    5. Autoscaler
    6. Placement Groups: Organizing Your Tasks and Actors
    7. Namespaces
    8. Managing Dependencies with Runtime Environments
    9. Deploying Ray Applications with the Ray Job API
    10. Conclusion
  8. 6. Implementing Streaming Applications
    1. Apache Kafka
      1. Basic Kafka Concepts
      2. Kafka APIs
    2. Using Kafka with Ray
    3. Scaling Our Implementation
    4. Building Stream-Processing Applications with Ray
      1. Key-Based Approach
      2. Key-Independent Approach
    5. Going Beyond Kafka
    6. Conclusion
  9. 7. Implementing Microservices
    1. Understanding Microservice Architecture in Ray
      1. Deployment
      2. Additional Deployment Capabilities
      3. Deployment Composition
    2. Using Ray Serve for Model Serving
      1. Simple Model Service Example
      2. Considerations for Model-Serving Implementations
      3. Speculative Model Serving Using the Ray Microservice Framework
    3. Conclusion
  10. 8. Ray Workflows
    1. What Is Ray Workflows?
    2. How Is It Different from Other Solutions?
    3. Ray Workflows Features
      1. What Are the Main Features?
      2. Workflow Primitives
    4. Working with Basic Workflow Concepts
      1. Workflows, Steps, and Objects
      2. Dynamic Workflows
      3. Virtual Actors
    5. Workflows in Real Life
      1. Building Workflows
      2. Managing Workflows
      3. Building a Dynamic Workflow
      4. Building Workflows with Conditional Steps
      5. Handling Exceptions
      6. Handling Durability Guarantees
      7. Extending Dynamic Workflows with Virtual Actors
      8. Integrating Workflows with Other Ray Primitives
      9. Triggering Workflows (Connecting to Events)
      10. Working with Workflow Metadata
    6. Conclusion
  11. 9. Advanced Data with Ray
    1. Creating and Saving Ray Datasets
    2. Using Ray Datasets with Different Tools
    3. Using Tools on Ray Datasets
      1. pandas-like DataFrames with Dask
      2. Indexing
      3. Shuffles
      4. Embarrassingly Parallel Operations
      5. Working with Multiple DataFrames
      6. What Does Not Work
      7. What’s Slower
      8. Handling Recursive Algorithms
      9. What Other Functions Are Different
      10. pandas-like DataFrames with Modin
      11. Big Data with Spark
      12. Working with Local Tools
    4. Using Built-in Ray Dataset Operations
    5. Implementing Ray Datasets
    6. Conclusion
  12. 10. How Ray Powers Machine Learning
    1. Using scikit-learn with Ray
    2. Using Boosting Algorithms with Ray
      1. Using XGBoost
      2. Using LightGBM
    3. Using PyTorch with Ray
    4. Reinforcement Learning with Ray
    5. Hyperparameter Tuning with Ray
    6. Conclusion
  13. 11. Using GPUs and Accelerators with Ray
    1. What Are GPUs Good At?
    2. The Building Blocks
    3. Higher-Level Libraries
    4. Acquiring and Releasing GPU and Accelerator Resources
    5. Ray’s ML Libraries
    6. Autoscaler with GPUs and Accelerators
    7. CPU Fallback as a Design Pattern
    8. Other (Non-GPU) Accelerators
    9. Conclusion
  14. 12. Ray in the Enterprise
    1. Ray Dependency Security Issues
    2. Interacting with the Existing Tools
    3. Using Ray with CI/CD Tools
    4. Authentication with Ray
    5. Multitenancy on Ray
    6. Credentials for Data Sources
    7. Permanent Versus Ephemeral Clusters
      1. Ephemeral Clusters
      2. Permanent Clusters
    8. Monitoring
    9. Instrumenting Your Code with Ray Metrics
    10. Wrapping Custom Programs with Ray
    11. Conclusion
  15. A. Space Beaver Case Study: Actors, Kubernetes, and More
    1. High-Level Design
    2. Implementation
      1. Outbound Mail Client
      2. Shared Actor Patterns and Utilities
      3. Mail Server Actor
      4. Satellite Actor
      5. User Actor
      6. SMS Actor and Serve Implementation
    3. Testing
    4. Deployment
    5. Conclusion
  16. B. Installing and Deploying Ray
    1. Installing Ray Locally
    2. Using Ray Docker Images
    3. Using Ray Clusters
      1. Installing Ray on AWS
      2. Installing Ray on IBM Cloud
      3. Installing Ray on Kubernetes
      4. Installing Ray on a kind Cluster
      5. Using ray up
      6. Using the Ray Kubernetes Operator
      7. Installing Ray on OpenShift
    4. Conclusion
  17. C. Debugging with Ray
    1. General Debugging Tips with Ray
    2. Serialization Errors
    3. Local Debugging with Ray Local
    4. Remote Debugging
      1. Ray’s Integrated Debugger (via Pdb)
      2. Other Tools
    5. Ray and Container Exit Codes
    6. Ray Logs
    7. Container Errors
    8. Native Errors
    9. Conclusion
  18. Index
  19. About the Authors

Product information

  • Title: Scaling Python with Ray
  • Author(s): Holden Karau, Boris Lublinsky
  • Release date: November 2022
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098118808