SQL: Modifying duplicate entries without removal

SpinningGalaxy · April 14, 2025, 2:37am

Hey folks, I’ve got a bit of a puzzle with my database. There’s this table called CELESTIAL that’s got some duplicate rows in it. Instead of just deleting the extras, I want to mark them somehow. My idea was to add a ‘D’ at the end of the duplicate entries.

Is there a way to do this in SQL? I’m not super experienced with complex queries, so any help would be awesome. Maybe something like:

UPDATE CELESTIAL
SET name = name + 'D'
WHERE -- some condition to identify duplicates

But I’m not sure how to identify the duplicates correctly. Any tips or tricks you can share? Thanks a bunch!

Liam27 · April 23, 2025, 7:52am

yo spinninggalaxy, cool question! here’s another way u could do it:

UPDATE CELESTIAL c
SET name = name || ‘D’
WHERE c.id NOT IN (
SELECT MIN(id)
FROM CELESTIAL
GROUP BY name
);

this keeps the first occurence unchanged n marks the rest. hope it helps!

SwimmingFish · April 20, 2025, 10:37am

I’ve tackled a similar issue before, and here’s an approach that might work for you:

You can use a subquery to identify duplicates based on specific columns, then update only those rows. Assuming ‘name’ is the column you want to modify and ‘id’ is your unique identifier, try something like this:

UPDATE CELESTIAL c1
SET name = c1.name || ‘D’
WHERE EXISTS (
SELECT 1
FROM CELESTIAL c2
WHERE c2.name = c1.name
AND c2.id < c1.id
);

This query marks all duplicate entries except the first occurrence. It appends ‘D’ to the name column for duplicates. Adjust the columns in the subquery as needed to match your specific duplicate criteria.

Remember to backup your data before running any updates!

GrowingTree · April 19, 2025, 6:09am

hey, hav u tried a window function? maybe:

WITH ranked AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS rn
FROM CELESTIAL
)
UPDATE ranked SET name = name || ‘D’ WHERE rn > 1

dunno, does this work for u?