Monthly Archives: May 2006

How to delete from a table records which do not exist in another one ?

So, you want to delete records from a table when they are not in another one. To cleanup orphans for example.

The query is easy to write, but depending on the implementation, it may be really slow.

Let’s say you have two tables, A and B, with an “ident” prmary key column in each of them. You want to delete from A records whose “ident” is not in B.

The easiest way is to write :
[code lang=”sql”]
SELECT *
FROM A
WHERE NOT EXISTS ( SELECT * FROM B WHERE B.Ident = A.Ident)
[/code]
Another way :
[code lang=”sql”]
DELETE
FROM A
WHERE A.Ident NOT IN ( SELECT B.Ident FROM B )
[/code]
However, if both queries are corrects, they may be really long execute. That’s because for each record in A, the query must go through all the records in B to ensure there’s no match.

The following query is much faster, since we don’t use a “NOT IN” query, but a “IN” query. The subquery can be constructed once for all, and not for each records, and the comparison stops as soon as the record is found:
[code lang=”sql”]
DELETE
FROM A
WHERE A.Ident IN (SELECT A2.Ident from A A2 LEFT JOIN B on A2.Ident = B.Ident WHERE B.Ident IS NULL)
[/code]