Eliminating Duplicates from a Table

Problem

You want to remove duplicate rows from a table so that it contains only unique rows.

Solution

Select the unique rows from the table into a second table that you use to replace the original one. Or add a unique index to the table using ALTER TABLE, which will remove duplicates as it builds the index. Or use DELETE ... LIMIT n to remove all but one instance of a specific set of duplicate rows.

Discussion

Preventing Duplicates from Occurring in a Table discusses how to prevent duplicates from being added to a table by creating it with a unique index. However, if you forget to include a unique index when you create a table, you may discover later that it contains duplicates and that it’s necessary to apply some sort of duplicate-removal technique. The catalog_list table used earlier is an example of this, because it contains several instances in which the same person is listed multiple times:

mysql>SELECT * FROM catalog_list ORDER BY last_name, first_name; +-----------+-------------+--------------------------+ | last_name | first_name | street | +-----------+-------------+--------------------------+ | Baxter | Wallace | 57 3rd Ave. | | BAXTER | WALLACE | 57 3rd Ave. | | Baxter | Wallace | 57 3rd Ave., Apt 102 | | Brown | Bartholomew | 432 River Run | | Isaacson | Jim | 515 Fordam St., Apt. 917 | | McTavish | Taylor | 432 River Run | | Pinter | Marlene | 9 Sunset Trail | | Pinter | Marlene | 9 Sunset Trail | +-----------+-------------+--------------------------+ ...

Get MySQL Cookbook, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.