Chapter 8. Performance Tuning

Performance tuning means making small, targeted adjustments to your Presto cluster to improve its speed and efficiency. The process starts by analyzing the existing system to determine where and how to improve. Once we have identified the areas for improvement, we can implement changes to maximize performance.

The chapter is organized into five parts. In the first part, we’ll introduce some basic concepts related to performance tuning, including its motivation and the performance tuning life cycle. In the second part, we’ll look at the Presto query execution model, which helps you understand where to act when bottlenecks occur. In the third part, we’ll analyze some popular approaches to performance tuning in Presto, including resource allocation, storage, and query optimization. In the fourth part, we’ll focus on Aria Scan, a project to improve Presto’s performance by increasing table scan efficiency. Finally, we’ll implement a practical use case to show how to tune some configuration parameters in Presto.

Introducing Performance Tuning

As you have learned in the previous chapters, Presto is a distributed query engine that enables you to query large datasets stored in multiple data sources. As the size and complexity of datasets grow, it becomes increasingly important to optimize the query execution process to minimize query response time and ensure the timely availability of data to users. Performance tuning identifies ...
