Chapter 9. Gaining Practical Expertise with Scaling Across All Dimensions

Chapter 8 covered the theory and foundational knowledge you need to scale model training beyond data parallelism, exploring model, pipeline, tensor, and hybrid parallelism. This chapter continues that discussion with practical experience in using these distributed training paradigms. You will also review some tools and libraries that are useful for vertical scaling and take a closer look at DeepSpeed (introduced in Hands-On Exercise #5 in Chapter 7) through a vertical scaling lens. The chapter ends with a practical exercise in more automated multidimensional hybrid training using DeepSpeed.

Hands-On Exercises: Model, Tensor, Pipeline, and Hybrid Parallelism

In this series of Hands-On Exercises, you will build a recommendation engine for movies, using the DeepFM model to explore simple implementations of vertical scaling. To keep the implementations easy to follow, monitoring and profiling tools have largely been omitted from these exercises. However, the tools and software discussed in Chapters 4 and 7 are equally applicable to profiling and benchmarking model, pipeline, and hybrid parallel programs.

The Dataset

The movie recommender will be based on the MovieLens dataset, open sourced by GroupLens Research. This dataset has two parts: the ratings of the movies ...
