GROUP BY and HAVING: The Difference Explained

Every confusion about GROUP BY and HAVING boils down to one thing: they run at different points in the query. Get the execution order right once and nothing about these clauses is tricky again.

Execution order (simplified)

FROM and JOIN: assemble the raw rows.
WHERE: filter rows before grouping.
GROUP BY: collapse rows into one per group.
Aggregate functions (COUNT, SUM, ...): computed on each group.
HAVING: filter groups based on aggregate results.
SELECT projection: choose columns to return.
ORDER BY, LIMIT.

The cardinal rule

Every column in the SELECT list must be either in the GROUP BY or inside an aggregate. Otherwise the database cannot know which row to pick.

-- Broken: sales is not grouped and is not aggregated
SELECT region, sales
FROM orders
GROUP BY region;

-- Fix 1: group by both
SELECT region, sales
FROM orders
GROUP BY region, sales;

-- Fix 2 (usually what you wanted): aggregate sales
SELECT region, SUM(sales) AS total_sales
FROM orders
GROUP BY region;

WHERE vs HAVING

Use WHERE to drop individual rows before grouping; use HAVING to drop whole groups based on their aggregate.

-- Regions whose 2026 sales exceed $1M
SELECT region, SUM(amount) AS total
FROM orders
WHERE order_year = 2026          -- per-row filter
GROUP BY region
HAVING SUM(amount) > 1000000     -- per-group filter
ORDER BY total DESC;

Filters that do not depend on aggregates belong in WHERE because WHERE runs first and shrinks the dataset the engine has to group. Putting them in HAVING works but is slower.

The PostgreSQL tutorial compresses the whole topic into one sentence worth memorizing: “WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.”

GROUP BY vs DISTINCT

SELECT DISTINCT col and SELECT col FROM t GROUP BY col return the same rows. Use DISTINCT when you only need uniqueness, and GROUP BY when you also want to aggregate. And when you need an aggregate on every row without collapsing the groups, that is the job of window functions.

COUNT quirks

COUNT(*): every row, including those with NULLs.
COUNT(col): rows where col is not NULL.
COUNT(DISTINCT col): unique non-NULL values.

Grouping by expressions

You can group by anything that is not an aggregate. A common example is bucketing by month:

SELECT
  DATE_TRUNC('month', created_at) AS month,
  COUNT(*) AS signups
FROM users
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;

DATE_TRUNCis PostgreSQL's spelling, and PostgreSQL is a reasonable default dialect to practice in: it was the most-used database in the 2025 Stack Overflow Developer Survey, at 55.6% of respondents. Other engines name the bucketing function differently, but the GROUP BY mechanics are identical everywhere.

Some databases let you write GROUP BY 1 to group by the first column in the projection, which is handy in interviews but fragile in production code.

Common mistakes

Putting an aggregate in WHERE. Errors out. Aggregates only exist after grouping; use HAVING.
Grouping by a derived column alias. Standard SQL requires you to repeat the expression; PostgreSQL and MySQL allow the alias but others do not.
Forgetting that ORDER BY happens after the projection. Referring to an aggregate by its alias in ORDER BY is fine in most databases.

Practice this live

The Exchange chapter introduces COUNT, SUM, and GROUP BY. The Director chapter pushes further into HAVING and complex multi-join aggregations.

GROUP BY and HAVING, Clearly