Richard C.
SQL tables can have rows with duplicate values.
What SQL query can we run using SELECT to find these duplicates?
Let’s use the following table of people with email addresses as an example. The code will run on SQL Server, MySQL, and PostgreSQL.
CREATE TABLE Person (
    Id INT PRIMARY KEY,
    Name VARCHAR(255),
    Email VARCHAR(255)
);

INSERT INTO Person (Id, Name, Email) VALUES
    (1, 'Amir', 'coolguy@example.com'),
    (2, 'Sofia', 's.martinez@example.com'),
    (3, 'Aya', 'aya@example.com'),
    (4, 'Mateo', 'coolguy@example.com'),
    (5, 'Leila', 'leila@example.com'),
    (6, 'Yara', 'yara@example.com'),
    (7, 'Ndidi', 'theman@example.com'),
    (8, 'Santiago', 's.martinez@example.com');
The query below returns all the duplicate emails but not the users' names. This is because every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function such as COUNT.
SELECT Email
FROM Person
GROUP BY Email
HAVING COUNT(*) > 1;
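As a small variation on the query above, an aggregate can be added to the SELECT list to show how many times each duplicated email occurs. This is only a sketch against the example Person table; the column alias Occurrences and the ORDER BY clause are illustrative additions, not part of the original query.

```sql
-- Count how many rows share each duplicated email.
-- "Occurrences" is an illustrative alias, not a name from the original article.
SELECT Email, COUNT(*) AS Occurrences
FROM Person
GROUP BY Email
HAVING COUNT(*) > 1
ORDER BY Email;  -- added so the row order is deterministic
```

With the sample data, this should return two rows: coolguy@example.com and s.martinez@example.com, each with a count of 2.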
To retrieve Name and Id too, we can use the duplicate-email SELECT as a subquery in the query below.
SELECT p.Id, p.Name, p.Email
FROM Person AS p
JOIN (
    SELECT Email
    FROM Person
    GROUP BY Email
    HAVING COUNT(*) > 1
) AS duplicates
    ON p.Email = duplicates.Email;
The results of this query are:
Id | Name | Email |
---|---|---|
4 | Mateo | coolguy@example.com |
1 | Amir | coolguy@example.com |
8 | Santiago | s.martinez@example.com |
2 | Sofia | s.martinez@example.com |
Our solution was to join the list of duplicate emails to the original Person table to include the names and IDs of the people with those emails.
Since this subquery does not depend on the outer query, it is run only once, not once per row, so the query should be fast.
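An alternative to the join, shown here only as a sketch, is a window function that counts the rows sharing each email. This is a different technique from the one used above, and it assumes a database version with window function support (for MySQL, that means 8.0 or later). The names EmailCount and counted are illustrative.

```sql
-- Alternative sketch: count rows per email with a window function,
-- then keep only the rows whose email appears more than once.
-- "EmailCount" and "counted" are illustrative names.
SELECT Id, Name, Email
FROM (
    SELECT
        Id,
        Name,
        Email,
        COUNT(*) OVER (PARTITION BY Email) AS EmailCount
    FROM Person
) AS counted
WHERE EmailCount > 1
ORDER BY Email, Id;  -- added so the row order is deterministic
```

On the sample data this should return the same four people as the join query. Which form performs better depends on the database and its indexes, so it is worth checking the query plan before choosing one.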
If duplicate values are something you don’t want in your table, you can prevent them from being entered by adding a UNIQUE constraint to the column.
CREATE TABLE Person (
    Id INT PRIMARY KEY,
    Name VARCHAR(255),
    Email VARCHAR(255) UNIQUE
);
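If the table already exists, the constraint can usually be added afterwards instead of recreating the table. The statement below is a minimal sketch; the constraint name UQ_Person_Email is a hypothetical choice, not something from the original example.

```sql
-- Add a UNIQUE constraint to an existing table.
-- "UQ_Person_Email" is a hypothetical constraint name.
-- This fails while duplicate emails are still present,
-- so the duplicates must be cleaned up first.
ALTER TABLE Person
ADD CONSTRAINT UQ_Person_Email UNIQUE (Email);
```

Run against the sample data as inserted above, this statement would be rejected, because coolguy@example.com and s.martinez@example.com each appear twice. The duplicate-finding queries can be used to locate the rows to fix before adding the constraint.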