Machine Learning

Efficient Machine Learning Inference

By Alejandro Lince and Steven Ross

The benefits of multi-model serving where latency matters

© 2024, O’Reilly Media, Inc. All trademarks and registered trademarks appearing on oreilly.com are the property of their respective owners.
