In this PostgreSQL post explains how to use the PostgreSQL DISTINCT clause with syntax and examples.
The PostgreSQL DISTINCT clause is used to remove duplicates from the result set. The DISTINCT clause can only be used with SELECT statements.
The syntax for the DISTINCT clause in PostgreSQL is:
The expressions to use to remove the duplicates.
The columns or calculations that you wish to retrieve.
The tables that you wish to retrieve records from. There must be at least one table listed in the FROM clause.
Optional. The conditions that must be met for the records to be selected.
Let's look at the simplest DISTINCT clause example in PostgreSQL. We can use the DISTINCT clause to return a single field that removes the duplicates from the result set.
For example:
This PostgreSQL DISTINCT example would return all unique last_name values from the contacts table.
Let's look at how you might use the PostgreSQL DISTINCT clause to remove duplicates from more than one field in your SELECT statement.
For example:
This PostgreSQL DISTINCT clause example would return each unique last_name, city, and state combination from the contacts table. In this case, the DISTINCT applies to each field listed after the DISTINCT keyword, and therefore returns distinct combinations.
One thing that is unique in PostgreSQL, compared to other databases, is that you have one more option when using the DISTINCT clause - which is called DISTINCT ON.
The DISTINCT ON clause will return only the first row for the DISTINCT ON (distinct_expressions), based on the ORDER BY clause provided in the query. Any other fields listed in the SELECT statement will be returned for that first row. It is kind of like doing a LIMIT of 1 for each DISTINCT ON (distinct_expressions) combination.
Let's explore further how to use DISTINCT ON in the DISTINCT clause, and what it returns.
So we could modify the example above:
This DISTINCT clause example, which uses the DISTINCT ON keywords, would return all unique last_name values. But in this case, for each unique last_name value, it would return only the first unique last_name record it encounters based on the ORDER BY clause along with the city and state values from that record.
It is not returning the unique last_name, city, and state combinations. It is, in effect, doing a LIMIT of 1 for each DISTINCT ON (last_name) and returning the corresponding city and state values after it chooses the records to return.