Saturday, August 27, 2005

Remove duplicate rows / entries from a mySQL database table

Here is a fairly quick and painless way to remove duplicate rows from a mySQL database table.

To remove the duplicates in mySQL 3.23 and above we will move the non duplicates into a temporary table, then delete the old table, and rename the temporary table to the old table to finish the circle.

First Create the temporary table and populate it with the values from the old table.

NOTE: As of mySQL 3.23, you can create one table from another by adding a SELECT statement at the end of the CREATE TABLE statement :: CREATE TABLE new_tbl SELECT * FROM orig_tbl;


CREATE TABLE newtable
SELECT * FROM oldtable WHERE 1 GROUP BY [column to remove duplicates from];



Now we have a newtable with all of the duplicates removed. So next we drop the old table (the one with all the duplicates still remaining)

DROP TABLE oldtable;


Now we just rename the newtable to the name of the oldtable so we don't break our whole code base!

RENAME TABLE newtable TO oldtable;


Thats it, duplicates removed. With newer versions of mySQL and with other database engines such as Oracle 9 you can remove duplicates in a single statement and step but this method will still work (using any changes in the SQL commands necessary for your engine).

0 Comments:

Post a Comment

<< Home