SQL is a programming language now used by software developers across the world.
The language is used to create relational databases, which can help to store structured data and information.
However, relational tables often end up holding multiple records with duplicate data.
Because of this, many developers use Distinct to remove duplicate rows from a query result, so that only unique records are returned.
In the article below, we are going to teach you how to use Distinct in SQL so that you can use the keyword to return only the unique data you need.
Just follow our helpful guide and you will find everything you need to get started…
What Is SQL?
SQL stands for Structured Query Language. This programming language allows you to create databases and query them.
If you want to become a data scientist or database developer, learning SQL is essential.
The language is a powerful tool for querying structured data stored in relational databases.
The syntax of SQL is similar to English, allowing programmers to write queries using simple commands.
SQL has been around since the 1970s. It was originally developed at IBM as the query language for System R, the company's experimental relational database system.
Since then, it has evolved into a standard language used by almost every major software company.
What Is Distinct?
Distinct is a keyword used to remove duplicate rows from a query result. It only removes duplicates where the values match exactly.
This means that two rows count as duplicates only if they hold the same value in every selected column. Rows that differ in any selected column are both kept.
Distinct is a very useful feature of SQL because it reduces the number of rows returned by a query without having to write additional code.
For example, suppose we want to select the names of all employees who have worked for at least five years. We can use the Distinct keyword to ensure that no name appears twice.
How To Use Distinct In SQL
Distinct can be used to remove duplicate rows from a query result. For example, if we had a table named people with three columns: name, age, and gender, then we could write a query like this: SELECT DISTINCT name FROM people.
The keyword is used in conjunction with the SELECT keyword and is helpful for avoiding duplicate values in the returned columns.
Only the unique values are returned when we use the Distinct keyword:
- SELECT DISTINCT returns only distinct (or different) values.
- DISTINCT removes duplicate values from the query result. The table itself is not changed.
- DISTINCT can be used in conjunction with aggregates such as COUNT, AVG and MAX.
- DISTINCT can operate on a single column.
- DISTINCT also supports multiple columns, in which case it returns each unique combination of values.
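As a rough illustration of the points above, the sketch below runs SELECT DISTINCT and COUNT(DISTINCT ...) against a small in-memory SQLite database through Python's built-in sqlite3 module. The rows in the people table are made up for this example, and SQLite is only an assumption chosen because it needs no setup.

```python
import sqlite3

# Hypothetical sample rows for the article's `people` table (name, age, gender).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, gender TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ana", 31, "F"), ("Ben", 45, "M"), ("Ana", 28, "F"), ("Cy", 45, "M")],
)

# DISTINCT on one column: each name is returned once.
names = [row[0] for row in conn.execute(
    "SELECT DISTINCT name FROM people ORDER BY name")]

# DISTINCT inside an aggregate: count the different ages rather than the rows.
total_rows, distinct_ages = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT age) FROM people").fetchone()

print(names)                      # ['Ana', 'Ben', 'Cy']
print(total_rows, distinct_ages)  # 4 3
```

The table still holds four rows after both queries; DISTINCT only affects what the query returns.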
Syntax
SELECT DISTINCT expressions FROM tables [WHERE conditions]
Parameters
Expressions: The columns or calculations that you want to return.
Tables: The tables that you want to return the data from. There must be at least one table in the FROM clause.
WHERE Conditions: The conditions that records must meet in order to be selected. This clause is optional.
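To see how the three parts of the syntax fit together, here is a minimal sketch that uses Python's built-in sqlite3 module. The orders table, its columns, and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical `orders` table used only to illustrate the syntax.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, country TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Kim", "Canada", 250),
        ("Lee", "Canada", 300),
        ("Raj", "India", 50),
        ("Sam", "Brazil", 120),
        ("Raj", "India", 40),
    ],
)

# expressions = country, tables = orders, conditions = amount > 100.
# The WHERE clause filters the rows first, then DISTINCT removes duplicates.
query = "SELECT DISTINCT country FROM orders WHERE amount > 100 ORDER BY country"
countries = [row[0] for row in conn.execute(query)]

print(countries)  # ['Brazil', 'Canada']
```

India is missing from the result because none of its orders pass the WHERE condition, and Canada appears once even though two of its orders qualify.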
Things To Remember
- When only one expression is provided in the DISTINCT clause, the query will return the unique values of that expression.
- When more than one expression is provided in the DISTINCT clause, the query will return the unique combinations of the listed expressions.
- In SQL, the DISTINCT clause does not ignore NULL values. So when using the DISTINCT clause, the result will include NULL as a distinct value.
Example
Suppose we have a table named employees that holds the following data:
ID | Name | Age | Gender | Salary |
001 | John | 25 | M | 18000 |
002 | Tina | 30 | F | 22000 |
003 | Ryan | 26 | M | 20000 |
004 | Raden | 22 | F | 18000 |
Now, let’s use the SELECT query below to return all of the salary records, including the duplicates.
SQL> SELECT salary FROM employees ORDER BY salary;
When the above query is executed, it will return all the records, including duplicate data. In the table above, John and Raden have the same salary, 18000.
Salary |
18000 |
18000 |
20000 |
22000 |
Now, let’s use the DISTINCT keyword with the query above.
SQL> SELECT DISTINCT salary FROM employees ORDER BY salary;
When the above query is executed, it will remove the duplicate records from the result and show the following:
Salary |
18000 |
20000 |
22000 |
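The two queries can be tried side by side with the sketch below, which loads the employees table from this example into an in-memory SQLite database through Python's built-in sqlite3 module. The choice of SQLite and the column types are assumptions. The SQL statements themselves are the ones shown above.

```python
import sqlite3

# The employees table from the example; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(id TEXT, name TEXT, age INTEGER, gender TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?, ?)",
    [
        ("001", "John", 25, "M", 18000),
        ("002", "Tina", 30, "F", 22000),
        ("003", "Ryan", 26, "M", 20000),
        ("004", "Raden", 22, "F", 18000),
    ],
)

# Without DISTINCT: every salary is returned, duplicates included.
all_salaries = [row[0] for row in conn.execute(
    "SELECT salary FROM employees ORDER BY salary")]

# With DISTINCT: each salary value is returned once.
unique_salaries = [row[0] for row in conn.execute(
    "SELECT DISTINCT salary FROM employees ORDER BY salary")]

print(all_salaries)     # [18000, 18000, 20000, 22000]
print(unique_salaries)  # [18000, 20000, 22000]
```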
How To Find Unique Values (Example)
Let’s take a look at the DISTINCT clause and how it can be used to find unique values within one column in the table.
In this example, we have a table named suppliers that holds the following data:
ID | Supplier | Product | State |
100 | Sunfresh Goods | Fruit | Florida |
200 | Rainbow Supplies | Craft | Washington |
300 | Julie’s Botanics | Plants | Florida |
400 | Seafresh Treats | Fish | Maine |
500 | Inkspot Press | Books | California |
600 | Good Time Spirits | Alcohol | Georgia |
By using the following SELECT DISTINCT query, we are going to find the unique states from the table above.
SELECT DISTINCT state FROM suppliers ORDER BY state;
When the query is executed, it will return the unique states from the table. Florida appears twice in the table but only once in the result:
State |
California |
Florida |
Georgia |
Maine |
Washington |
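For readers who want to check the result themselves, the sketch below rebuilds the suppliers table in an in-memory SQLite database with Python's built-in sqlite3 module and runs the same query. SQLite and the column types are assumptions made for the sake of a runnable example.

```python
import sqlite3

# The suppliers table from the example; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE suppliers (id INTEGER, supplier TEXT, product TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO suppliers VALUES (?, ?, ?, ?)",
    [
        (100, "Sunfresh Goods", "Fruit", "Florida"),
        (200, "Rainbow Supplies", "Craft", "Washington"),
        (300, "Julie’s Botanics", "Plants", "Florida"),
        (400, "Seafresh Treats", "Fish", "Maine"),
        (500, "Inkspot Press", "Books", "California"),
        (600, "Good Time Spirits", "Alcohol", "Georgia"),
    ],
)

# Six suppliers, but Florida is shared by two of them.
states = [row[0] for row in conn.execute(
    "SELECT DISTINCT state FROM suppliers ORDER BY state")]

print(states)
# ['California', 'Florida', 'Georgia', 'Maine', 'Washington']
```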
How To Find Unique Values In Multiple Columns (Example)
The SQL DISTINCT clause can be used to remove duplicate records across multiple fields in the SELECT statement.
In this example, we have a table named company that holds the following data:
ID | Company | City | State |
470 | Microsoft | Redmond | Washington |
471 | HCL | Redwood City | California |
472 | Zomato Foods | Springdale | Arkansas |
473 | Wipro | Redwood City | California |
474 | Flowers Company | Thomasville | Georgia |
475 | Adani | Westlake Village | California |
By using the following SELECT DISTINCT query, we are going to return each unique city and state combination.
SELECT DISTINCT city, state FROM company ORDER BY city, state;
When the query is executed, it will return the unique city and state combinations. Redwood City, California appears twice in the table but only once in the result:
City | State |
Redmond | Washington |
Redwood City | California |
Springdale | Arkansas |
Thomasville | Georgia |
Westlake Village | California |
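The sketch below, again using Python's built-in sqlite3 module as an assumed test bed, runs the multi-column query and compares it with a DISTINCT on state alone. The comparison query is an addition for illustration and shows that DISTINCT looks at the whole combination of selected columns.

```python
import sqlite3

# The company table from the example; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER, company TEXT, city TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?, ?)",
    [
        (470, "Microsoft", "Redmond", "Washington"),
        (471, "HCL", "Redwood City", "California"),
        (472, "Zomato Foods", "Springdale", "Arkansas"),
        (473, "Wipro", "Redwood City", "California"),
        (474, "Flowers Company", "Thomasville", "Georgia"),
        (475, "Adani", "Westlake Village", "California"),
    ],
)

# DISTINCT over two columns: a row is dropped only if BOTH values repeat.
pairs = conn.execute(
    "SELECT DISTINCT city, state FROM company ORDER BY city, state").fetchall()

# DISTINCT over one column, for comparison: California collapses to one row.
states = [row[0] for row in conn.execute(
    "SELECT DISTINCT state FROM company ORDER BY state")]

print(len(pairs))  # 5
print(states)      # ['Arkansas', 'California', 'Georgia', 'Washington']
```

Westlake Village and Redwood City share a state, yet both stay in the two-column result because their city values differ.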
How Distinct Clauses Handle Null Values (Example)
In SQL, the DISTINCT clause treats NULL as a unique value. In this example, we have a table named food that holds the following data:
ID | Food | Category |
1 | Steak | 55 |
2 | Donuts | 55 |
3 | Mango | NULL |
4 | Burgers | 70 |
5 | Candy | 35 |
By using the following SELECT DISTINCT query, we are going to return the unique category values from the table, which should include the null value.
SELECT DISTINCT category FROM food ORDER BY category;
When the query is executed, it will return the unique values, with the null value also included in the results. Note that whether NULL is sorted first or last depends on the database system:
Category |
NULL |
35 |
55 |
70 |
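The NULL behaviour can be checked with the sketch below, which uses Python's built-in sqlite3 module. Two things here are assumptions and not part of the example above: SQLite as the database, and a sixth row (Papaya) added so that the table holds two NULL categories. Python shows SQL NULL as None.

```python
import sqlite3

# The food table from the example, plus one hypothetical extra row (id 6)
# so that two rows have a NULL category.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE food (id INTEGER, food TEXT, category INTEGER)")
conn.executemany(
    "INSERT INTO food VALUES (?, ?, ?)",
    [
        (1, "Steak", 55),
        (2, "Donuts", 55),
        (3, "Mango", None),
        (4, "Burgers", 70),
        (5, "Candy", 35),
        (6, "Papaya", None),  # hypothetical row, not in the article's table
    ],
)

# DISTINCT keeps NULL as a value and collapses the two NULLs into one row.
categories = [row[0] for row in conn.execute(
    "SELECT DISTINCT category FROM food ORDER BY category")]

# By contrast, COUNT(DISTINCT ...) skips NULL altogether.
counted = conn.execute("SELECT COUNT(DISTINCT category) FROM food").fetchone()[0]

print(categories)  # [None, 35, 55, 70]  (SQLite sorts NULL first)
print(counted)     # 3
```

The difference between the two results is worth remembering: SELECT DISTINCT returns NULL as one of the values, while the COUNT aggregate leaves it out.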
Final Thoughts
Using Distinct in SQL may sound complicated, but it is a great way to return unique values and leave duplicate data out of your results.
Fortunately, SQL uses a syntax similar to English, so adding Distinct to a query only takes a simple command.