How do I find duplicate values in a SQL table?

Richard C.

The Problem

SQL tables can have rows with duplicate values. These might be:

  • An email address signifying duplicate contacts.
  • A product code signifying duplicate items in an online store.
  • A medical ICD code signifying that you double-billed a patient.

What SQL query can we run using SELECT to find duplicates like these?

Let’s use the following table of people with email addresses as an example. The code will run on SQL Server, MySQL, and PostgreSQL.

CREATE TABLE Person (
    Id INT PRIMARY KEY,
    Name VARCHAR(255),
    Email VARCHAR(255)
);

INSERT INTO Person (Id, Name, Email) VALUES
    (1, 'Amir', 'coolguy@example.com'),
    (2, 'Sofia', 's.martinez@example.com'),
    (3, 'Aya', 'aya@example.com'),
    (4, 'Mateo', 'coolguy@example.com'),
    (5, 'Leila', 'leila@example.com'),
    (6, 'Yara', 'yara@example.com'),
    (7, 'Ndidi', 'theman@example.com'),
    (8, 'Santiago', 's.martinez@example.com');

The Solution

The query below returns every email that appears more than once, but not the names of the people who share it. This is because, in general, every column in the SELECT list must either appear in the GROUP BY clause or be wrapped in an aggregate function such as COUNT.

SELECT Email FROM Person GROUP BY Email HAVING COUNT(*) > 1;
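It can help to see how many times each email occurs. The variation below is a minimal sketch of the same GROUP BY query with two aggregate columns added; the aliases Occurrences and FirstId are illustrative names, not part of the original schema.

```sql
-- List each duplicated email with the number of rows that share it.
-- Aggregates such as COUNT(*) and MIN(Id) are allowed alongside GROUP BY
-- because they produce one value per group.
SELECT
    Email,
    COUNT(*) AS Occurrences,  -- rows sharing this email
    MIN(Id)  AS FirstId       -- lowest Id within the group
FROM Person
GROUP BY Email
HAVING COUNT(*) > 1
ORDER BY Email;
```

With the sample data above, this should return two rows: coolguy@example.com and s.martinez@example.com, each with an Occurrences value of 2.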

To retrieve Name and Id too, we can use the above SELECT result as a subquery in the query below.

SELECT p.Id, p.Name, p.Email
FROM Person AS p
JOIN (
    SELECT Email
    FROM Person
    GROUP BY Email
    HAVING COUNT(*) > 1
) AS duplicates
    ON p.Email = duplicates.Email;

The query returns the following rows. Their order may vary, because the query has no ORDER BY clause:

Id  Name      Email
4   Mateo     coolguy@example.com
1   Amir      coolguy@example.com
8   Santiago  s.martinez@example.com
2   Sofia     s.martinez@example.com

Our solution joins the list of duplicate emails back to the original Person table to retrieve the names and IDs of the people who share those emails.
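The same rows can also be found with a window function instead of a join. This is an alternative sketch, not the approach used above; it assumes a database that supports window functions (SQL Server, PostgreSQL, or MySQL 8.0 and later). The names EmailCount and counted are illustrative.

```sql
-- Count the rows sharing each email without collapsing them into groups,
-- then keep only the rows whose email occurs more than once.
SELECT Id, Name, Email
FROM (
    SELECT
        Id,
        Name,
        Email,
        COUNT(*) OVER (PARTITION BY Email) AS EmailCount
    FROM Person
) AS counted
WHERE EmailCount > 1
ORDER BY Email, Id;  -- explicit ordering makes the output deterministic
```

Because of the ORDER BY clause, the sample data should come back as Amir (1), Mateo (4), Sofia (2), then Santiago (8).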

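Because the subquery does not reference the outer query (it is not correlated), the database can evaluate it once rather than once per row of Person, so the query generally performs well.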
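On a large table, an index on the grouped column may help both the GROUP BY and the join. The statement below is a sketch under that assumption; the index name IX_Person_Email is hypothetical, and whether the query planner actually uses the index depends on the database and the data.

```sql
-- Optional: index the column used for grouping and joining.
CREATE INDEX IX_Person_Email ON Person (Email);
```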

Using Constraints to Prevent the Problem

If duplicate values are something you don’t want in your table, you can prevent them from being entered by adding a UNIQUE constraint to the column.

CREATE TABLE Person (
    Id INT PRIMARY KEY,
    Name VARCHAR(255),
    Email VARCHAR(255) UNIQUE
);
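If the Person table already exists, the constraint can be added in place instead of recreating the table. A minimal sketch, assuming the constraint name UQ_Person_Email (hypothetical) and that the existing duplicates have already been removed or corrected; the statement fails while duplicate emails remain in the table.

```sql
-- Add a UNIQUE constraint to an existing table.
-- This is rejected if the Email column still contains duplicate values.
ALTER TABLE Person
    ADD CONSTRAINT UQ_Person_Email UNIQUE (Email);

-- Once the constraint is in place, an insert that repeats an existing
-- email is rejected with a constraint-violation error.
-- (Id 9 and the name 'Example' are illustrative values.)
INSERT INTO Person (Id, Name, Email)
VALUES (9, 'Example', 'aya@example.com');
```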

© 2024 • Sentry is a registered Trademark of Functional Software, Inc.