By | May 7, 2025

How To Avoid Duplicate In SQL

To avoid duplicates in SQL, you typically want to ensure that your data is clean, unique, and consistent. There are several approaches to preventing, identifying, or removing duplicates in SQL queries, depending on your needs.

Here’s how you can avoid or manage duplicates in SQL:

🔍 How to Avoid Duplicates in SQL

✅ 1. Use the DISTINCT Keyword

  • The DISTINCT keyword helps you select unique records from a query.
  • It’s useful when you want to avoid duplicates in the results of a SELECT query.

Example:

sqlCopyEditSELECT DISTINCT column_name
FROM table_name;
  • This will return only unique values for column_name.

Example (multiple columns):

sqlCopyEditSELECT DISTINCT column1, column2
FROM table_name;
  • This returns unique combinations of column1 and column2.

✅ 2. Use the GROUP BY Clause

  • The GROUP BY clause groups rows that have the same values in specified columns and is useful for aggregate functions like COUNT(), SUM(), AVG(), etc.
  • This helps avoid duplicate rows in results and allows you to work with aggregates.

Example:

sqlCopyEditSELECT column1, COUNT(*)
FROM table_name
GROUP BY column1;
  • This will count the occurrences of each unique value in column1, eliminating duplicates in the output.

✅ 3. Use ROW_NUMBER() for Deleting Duplicates

  • When you need to remove duplicates from a table, you can use window functions like ROW_NUMBER() to assign a unique number to each row within a partition of data. This helps you identify duplicate rows for deletion.

Example:

sqlCopyEditWITH CTE AS (
  SELECT column1, column2, ROW_NUMBER() OVER (PARTITION BY column1, column2 ORDER BY column1) AS row_num
  FROM table_name
)
DELETE FROM CTE
WHERE row_num > 1;
  • This query assigns a row number to each duplicate based on column1 and column2. It then deletes all rows with a row_num greater than 1, effectively removing duplicates.

✅ 4. Use the UNION Operator

  • The UNION operator combines the results of two queries but automatically removes duplicate rows.
  • This is useful if you are combining results from two separate queries and want only unique rows in the result.

Example:

sqlCopyEditSELECT column1 FROM table1
UNION
SELECT column1 FROM table2;
  • This query will combine data from both tables and remove any duplicates in the final result.

✅ 5. Add Unique Constraints or Indexes to the Table

  • To prevent duplicates from being inserted into a table, you can add a unique constraint or unique index to one or more columns.
  • This ensures that duplicate data cannot be inserted into the table.

Example (Creating a Unique Constraint):

sqlCopyEditALTER TABLE table_name
ADD CONSTRAINT constraint_name UNIQUE (column1, column2);
  • This will ensure that combinations of column1 and column2 are unique in the table.

Example (Creating a Unique Index):

sqlCopyEditCREATE UNIQUE INDEX index_name
ON table_name (column1, column2);
  • This will enforce uniqueness for column1 and column2 in table_name.

✅ 6. Use INSERT IGNORE to Prevent Duplicate Inserts (MySQL)

  • In MySQL, you can use the INSERT IGNORE statement to prevent inserting duplicates that would violate a unique constraint.

Example:

sqlCopyEditINSERT IGNORE INTO table_name (column1, column2)
VALUES (value1, value2);
  • This will ignore the insert if it causes a duplicate based on the table’s unique constraints.

✅ 7. Use ON CONFLICT DO NOTHING (PostgreSQL)

  • In PostgreSQL, you can use ON CONFLICT DO NOTHING to prevent inserting a row if it conflicts with an existing row that violates a unique constraint.

Example:

sqlCopyEditINSERT INTO table_name (column1, column2)
VALUES (value1, value2)
ON CONFLICT (column1) DO NOTHING;
  • This prevents duplicate rows for column1 from being inserted.

✅ 8. Check for Duplicates Before Insertion

  • Before inserting new data, you can check if the data already exists in the table to prevent duplicates.

Example:

sqlCopyEditIF NOT EXISTS (SELECT 1 FROM table_name WHERE column1 = value1)
BEGIN
    INSERT INTO table_name (column1) VALUES (value1);
END;
  • This checks if a value already exists before performing the insert operation.

⚠️ When to Avoid Duplicates:

  • Data entry: Always enforce constraints such as primary keys or unique indexes to ensure clean data.
  • Reporting: Use DISTINCT or GROUP BY to remove duplicates when querying for reports or analytics.
  • Data cleaning: Use window functions or CTEs (ROW_NUMBER(), RANK()) to identify and remove duplicate rows in your database.