Advanced Relational Database Table Optimization

Pete Forde with Rowan Hick (see their bios in Contributors)

Problem

I have the problem everyone wants: popularity! Even with a proper caching strategy and sensible indexes in place, my application database is quickly becoming a bottleneck. Like most web applications, my resources tend to be frequently read, with relatively infrequent updates. I have adhered to common design best practices; my schema is properly normalized. What more can I do to scale my database throughput capacity?

Solution

The solutions listed next won’t work for everyone, but there’s a pretty good chance that they’ll solve some of your problems.

Denormalization

Are you consistently joining the same tables together? You might consider adding redundant columns to your high-traffic tables so that you can reduce or eliminate joins. For example, given a Users.province_id fk> Province.id relationship, it might make sense to create a Users.province column and store the literal text value directly on the Users table.

Of course, they call these databases “relational” for a reason! By giving that up, you accept the burden of updating the same data in multiple places. You might have to change how your application retrieves these new redundant values. Profile to find bottlenecks, and weigh the tradeoff cost of each optimization.

In most cases, it’s not recommended that you drop or abandon your normalized join tables completely; you should add redundant columns only for data that ...

Get Facebook Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.