26

–––––––––––––––––––––––

Scalable Runtime Environments for Large-Scale Parallel Applications

Camille Coti and Franck Cappello

26.1   INTRODUCTION

Parallel programming techniques define the sequence of instructions that each process must execute on some data and how these data are provided to each process. Parallel programming environments take this parallelism into account and support a parallel application by putting its components in touch with one another and managing the distributed execution.

Parallel programming environments are made of three distinct components [1]:

  • a job scheduler, deciding which machines will be used by a given parallel application
  • a runtime environment, in charge of making interprocess communications possible, I/O and signal forwarding, monitoring the processes, and ensuring the finalization of the application
  • a communication library, in charge of data movements between the processes.

This chapter presents the support provided to the application by the runtime environment in the context of large-scale systems. First, we define what a runtime environment is and how it interacts with the other components of the parallel

programming and execution environment in Section 26.2. Then we present how it can deploy the application in a scalable way in Section 26.4 and how the communication infrastructure that supports the application is important in Section 26.3. We present its role in fault tolerance in Section 26.5 and we finish by two case studies in ...

Get Scalable Computing and Communications: Theory and Practice now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.