Here’s a common coding scenario for SQL Server developers:

“I want to see the oldest amount due for each account, along with the account number and due date, ordered by account number.”

Since the release of SQL Server 2005, the simplest way to do this has been to use a window function like ROW_NUMBER.  In many cases, everything you need can be done in a single SELECT statement with your window function. The trouble comes when you want to incorporate that function in some other way. For instance, using it a WHERE clause.

Let’s use this example from AdventureWorks2012:

/* This gives us all orders for each account as determined by DueDate
, and the result set is ordered by AccountNumber */

SELECT ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY DueDate) AS rn
, AccountNumber
, DueDate
, TotalDue
FROM Sales.SalesOrderHeader AS soh
ORDER BY AccountNumber

This works perfectly and gives us every row in the table. Now, let’s modify it to meet the scenario we outlined earlier, and only show the oldest order for each account.

We could try this…

/* This gives us the oldest order for each account as determined by DueDate */
SELECT ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY DueDate) AS rn
, AccountNumber
, DueDate
, TotalDue
FROM Sales.SalesOrderHeader AS soh
WHERE rn = 1
ORDER BY AccountNumber

…to which, SQL Server gives you a Dikembe Mutumbo finger-wag:

Msg 207, Level 16, State 1, Line 7
Invalid column name 'rn'.

“Invalid column name”?

Maybe we need to not alias it. Let’s try again.

SELECT ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY DueDate) AS rn
, AccountNumber
, DueDate
, TotalDue
FROM Sales.SalesOrderHeader AS soh
WHERE ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY DueDate) = 1
ORDER BY AccountNumber

Okay, that’s much bett–

Msg 4108, Level 15, State 1, Line 7
Windowed functions can only appear in the SELECT or ORDER BY clauses.

Why can’t we put a window function in a where clause?

Because the WHERE clause already happened.

Logical Query Processing

SQL Server doesn’t process parts of a query in the same order they’re written. Rather than start with SELECT the way we read and write it, here’s the order SQL Server progresses through:

  1. FROM
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. SELECT
  6. ORDER BY
  7. TOP

The first four steps are all about getting the source data and reducing the result set down. Steps 5 & 6 determine which columns are presented and in which order. Step 7 (TOP) is only applied at the end because you can’t say which rows are in the top n rows until the set has been sorted. (You can read Itzik Ben-Gan’s explanation of this process in way more detail here.)

Since the WHERE clause happens before the SELECT, it’s too late in the process to add the window function to the WHERE clause. It’d have to loop back around a second time to re-evaluate WHERE.

There’s No Magic Hack

Unfortunately, the logical query processing model is a fundamental way of doing business for SQL Server, so we can’t just cheat around it. Instead, we have to reference our window function through a subquery or CTE (common table expression). In other words, we have to code in a way that makes SQL Server evaluate a WHERE clause after the SELECT containing the window function:

/* This gives us the oldest order for each account as determined by DueDate */
SELECT rn, AccountNumber, DueDate, TotalDue
FROM 
(
	SELECT ROW_NUMBER() OVER (PARTITION BY AccountNumber ORDER BY DueDate) AS rn
	, AccountNumber
	, DueDate
	, TotalDue
	FROM Sales.SalesOrderHeader AS soh
) as t
WHERE t.rn = 1
ORDER BY AccountNumber

By putting the WHERE clause in the outer query and the window function in the subquery (also called an inner query), we get the window function to come before the WHERE.

It’s not a shortcut, but it’s the shortest path to get what we want.

On a related note, window functions can’t be used in an UPDATE statement, either.

Trivia Isn’t Always Trivial

More internal concepts like logical query processing won’t often play into your day-to-day work. However, having at least an awareness of deeper workings like these can not only save you time when troubleshooting T-SQL, but also understand the limitations of the code base you have to work with.