Postgresql Distinct Clause

PostgreSQL: DISTINCT Clause

In this PostgreSQL post explains how to use the PostgreSQL DISTINCT clause with syntax and examples.

Description

The PostgreSQL DISTINCT clause is used to remove duplicates from the result set. The DISTINCT clause can only be used with SELECT statements.

Syntax

The syntax for the DISTINCT clause in PostgreSQL is:

SELECT DISTINCT | DISTINCT ON (distinct_expressions)
expressions
FROM tables
[WHERE conditions];

Parameters or Arguments

distinct_expressions

The expressions to use to remove the duplicates.

expressions

The columns or calculations that you wish to retrieve.

tables

The tables that you wish to retrieve records from. There must be at least one table listed in the FROM clause.

WHERE conditions

Optional. The conditions that must be met for the records to be selected.

Note

  • When only one expression is provided in the DISTINCT clause, the query will return the unique values for that expression.
  • When more than one expression is provided in the DISTINCT clause, the query will retrieve unique combinations for the expressions listed.
  • When the DISTINCT ON keywords are specified, the query will return the unique values for distinct_expressions and return other fields for the selected records based on the ORDER BY clause (limit of 1).
  • In PostgreSQL, the DISTINCT clause doesn't ignore NULL values. So when using the DISTINCT clause in your SQL statement, your result set will include NULL as a distinct value.

Example - With Single Expression

Let's look at the simplest DISTINCT clause example in PostgreSQL. We can use the DISTINCT clause to return a single field that removes the duplicates from the result set.

For example:

SELECT DISTINCT last_name
FROM contacts
ORDER BY last_name;

This PostgreSQL DISTINCT example would return all unique last_name values from the contacts table.

Example - With Multiple Expressions

Let's look at how you might use the PostgreSQL DISTINCT clause to remove duplicates from more than one field in your SELECT statement.

For example:

SELECT DISTINCT last_name, city, state
FROM contacts
ORDER BY last_name, city, state;

This PostgreSQL DISTINCT clause example would return each unique last_name, city, and state combination from the contacts table. In this case, the DISTINCT applies to each field listed after the DISTINCT keyword, and therefore returns distinct combinations.

Example - DISTINCT ON

One thing that is unique in PostgreSQL, compared to other databases, is that you have one more option when using the DISTINCT clause - which is called DISTINCT ON.

The DISTINCT ON clause will return only the first row for the DISTINCT ON (distinct_expressions), based on the ORDER BY clause provided in the query. Any other fields listed in the SELECT statement will be returned for that first row. It is kind of like doing a LIMIT of 1 for each DISTINCT ON (distinct_expressions) combination.

Let's explore further how to use DISTINCT ON in the DISTINCT clause, and what it returns.

So we could modify the example above:

SELECT DISTINCT ON (last_name) last_name, city, state
FROM contacts
ORDER BY last_name, city, state;

This DISTINCT clause example, which uses the DISTINCT ON keywords, would return all unique last_name values. But in this case, for each unique last_name value, it would return only the first unique last_name record it encounters based on the ORDER BY clause along with the city and state values from that record.

It is not returning the unique last_name, city, and state combinations. It is, in effect, doing a LIMIT of 1 for each DISTINCT ON (last_name) and returning the corresponding city and state values after it chooses the records to return.