Chapter 7. Monitoring and Managing Greenplum

Effectively managing an MPP database shares many of the challenges of PostgreSQL database management, with some major differences due to the nature of the multi-instance architecture. Greenplum offers a set of tools to facilitate the monitoring and management of a multinode cluster. Understanding factors that have an immediate effect (e.g., What is the current state of Greenplum? Why is this query running so much slower today than yesterday?) can help you quickly troubleshoot day-to-day issues. Understanding longer-term issues (e.g., Which tables need to be analyzed? When should I expect my disk to fill up? What can I do to optimize frequent long queries to ease load?) ensures that issues can be addressed proactively and rarely catch the user off guard.

Both long- and short-term issues can be addressed by Greenplum Command Center, an application bundled with Pivotal Greenplum (the commercial version of the Greenplum Database).

Greenplum Command Center

Perhaps one of the most opaque aspects of Greenplum operations is the current status of a running query. DBAs are frequently asked, “Why is my query taking so long?” The question seems simple, but diagnosing the factors that contribute to a slow query or a group of slow queries requires diving into multiple parts of the database and understanding how queries are executed in an MPP environment. Common culprits include locking and blocking of tables by competing queries or ...
