Chapter 7. Implementing Microservices
Ray was initially created as a framework for implementing reinforcement learning but gradually morphed into a full-fledged serverless platform. Similarly, Ray Serve was initially introduced as a better way to serve ML models and has since evolved into a full-fledged microservices framework. In this chapter, you will learn how to use Ray Serve to implement a general-purpose microservice framework and how to use that framework for model serving.
Complete code of all examples used in this chapter can be found in the folder /ray_examples/serving in the book’s GitHub repo.
Understanding Microservice Architecture in Ray
The Ray microservice architecture (Ray Serve) is implemented on top of Ray by leveraging Ray actors. Three kinds of actors are created to make up a Serve instance:
- Controller: A global actor unique to each Serve instance that manages the control plane. It is responsible for creating, updating, and destroying the other actors. All of the Serve API calls (e.g., creating or getting a deployment) use the controller for their execution.
- Router: There is one router per node. Each router is a Uvicorn HTTP server that accepts incoming requests, forwards them to replicas, and responds after they are completed.
- Worker replica: Worker replicas execute the user-defined code in response to a request. Each replica processes individual requests from the routers.
User-defined code is implemented using a Ray deployment, an extension of a Ray actor ...
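To make this architecture concrete, here is a minimal sketch of a Serve deployment, assuming the serve.start()/Deployment.deploy() style of the Serve API that was current for the Ray versions this book targets (newer Ray releases expose the same ideas through serve.run()). The Hello class name and the /hello route are illustrative placeholders, not code from the book's repo.

import requests
import ray
from ray import serve

# Starting Ray and Serve creates the controller actor and an HTTP proxy
# (router) actor, which by default listens on port 8000.
ray.init()
serve.start()

# A deployment wraps user-defined code; num_replicas controls how many
# worker replica actors Serve creates to process requests in parallel.
@serve.deployment(route_prefix="/hello", num_replicas=2)
class Hello:
    def __call__(self, request):
        return "hello from a Serve replica"

# Deploying registers the route with the router and starts the replica actors.
Hello.deploy()

# The request is accepted by the router and forwarded to one of the replicas.
print(requests.get("http://localhost:8000/hello").text)

In this sketch the controller manages the deployment's lifecycle, while the router load-balances incoming HTTP requests across the two replica actors.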