1/17/2006
Delete Duplicate Rows in MySQL
1.5 million rows in a products_categories table. That’s 1,500,000 products to categories relations. I needed a quick way to remove products from multiple categories to test the performance hit of the HUGE products_categories table. A quick and dirty is to remove duplicate productids from the table. Here’s a quick howto.
Create a temporary table with only one row for each productid (the GROUP BY clause). Change this GROUP BY to be the field with duplicates to remove. You can also add some logic here to select which records to keep.
Empty the original table (you created a backup already right?).
Copy the ‘clean’ records back to the original table.
This can be used to normalize a table to add a primary key, data mining, cleaning a database or just for testing.
Popularity: 30%



September 30th, 2008 at 10:08 am
Its a good solution…for removing duplicates