Debugging Duplicates In A Dataset
Sometimes you are working on a query and you want to debug whether there are duplicates in your table.
select id count(*) group by id having count(*) > 1
The query will show all the rows (grouped by id) with counts greater than 1.
Unintentional duplicates can occur after a join. Instead of fixing this by using
select distinct (which is a code smell), try to figure out why the duplication is happening.
Once you find the rows with duplicates, you can then
select * for those records to see if there is any relationship between them that explains the duplication.