Efficient Machine Learning Inference
By Alejandro Lince and Steven Ross
The benefits of multi-model serving where latency matters