SQL - How to select every row from a table where a column value is not unique

SQL FAQ: How can I select every row from a database table where a column value is not unique?

I’m working on an problem today where a Drupal article can have many revisions, and the way Drupal works is that a table named node_revisions has a nid field for “node id,” and a vid field for “revision id.” If you have five revisions of an article (i.e., a blog post), there will be five records in this table, and each record will have the same nid value and a unique vid. If an article has no revisions, this table will have one entry with a unique nid and unique vid.

To find all of the articles that have more than one revision, I just need a SQL query to find all of the table entries where the nid is not distinct. To find all of those articles I use this query:

SELECT * FROM node_revisions WHERE nid IN
(SELECT nid FROM node_revisions GROUP BY nid HAVING COUNT(1) > 1)
ORDER BY nid ASC

This query comes back as desired, showing all the rows where the nid is not unique.

Be warned that this SQL query can take a long time to run. My database table has less than 10,000 rows, and this query takes about 25 seconds to return, and the computer I’m running it on has an SSD drive and it’s not slow. I don’t use SQL subqueries too often, so at the moment I don’t know how to optimize this query.

If you just want to test the query on your system to make sure it works, add a LIMIT clause, like this:

SELECT nid, vid FROM node_revisions WHERE nid IN
(SELECT nid FROM node_revisions GROUP BY nid HAVING COUNT(1) > 1)
ORDER BY nid ASC
LIMIT 20

On my computer this query returns almost immediately.