When complex SQL queries are generated for you, there is no need to expose SQL’s complexity when you handle everything visually. The unnecessary complexity of the SQL OVER ROWS clause is hidden by a simple visual selection of a frame on screen. You can much better understand how ROWS works after playing for just a few minutes with our visual SQL query builder for window functions.

Why is the OVER ROWS Clause so Complex?

When the 2012 T-SQL features were first announced back in 2011, I must confess that framing intimidated me a bit. I didn’t really like the syntax and didn’t quite understand what was going on.

Expert T-SQL Window Functions in SQL Server – by Kathi Kellenberger, 2015

I witnessed firsthand, in one of my past jobs, how great windows functions are and how they can solve problems very hard to fix any another way. When our database architect at that time suggested we could use just plain SQL queries to transform huge server-side batches of data that would normally require complex cursor-based operations, I didn’t believe it was actually possible. But I ended up by successfully implementing the algorithms he suggested.

I still suspect today most people find the window functions most complex than they are. One reason is there have been some confusing, complex and uninspired syntax choices there. I always wondered why the inline ORDER BY clause was replicated as it was, but we needed a new PARTITION BY clause for what obviously is an inline GROUP BY clause.

The nightmare comes however from the ROWS/RANGE clause, that sometimes can be used after an OVER ORDER BY to further restrict the inline frame. Leaving aside the additional confusing alternative RANGE keyword and implementation, it took me a while to figure out UNBOUNDED PRECEDING is just the FIRST frame row, and UNBOUNDED FOLLOWING is the LAST frame row.

Usability of this clause is simply bad. Look at how much space they needed at Oracle to explain what should have been the definition of a simple FROM-TO interval:

Oracle Rows

What OVER ROWS Clause Actually Does

ROWS filters the partition frame for each current row, by specifying its top and bottom limits, that’s all. It’s an additional filter you may add after an OVER ORDER BY inline specification. And it can be used with an aggregate function.

The so-called offset-based or navigational analytic functions LEAD/LAG, FIRST/LAST_VALUE and NTH_VALUE, will also internally use their own frame. LAG and FIRST_VALUE look backwards, at a relative offset or just the first frame row. While LEAD and LAST_VALUE look forward, as a relative offset or just the last frame row.

First Last Value All

When you have no ROWS clause at all, the frame is still filtered by default between the first and current row in partition. You do not select by default everything within the partition, from first to last row, as you may wrongly assume.

When you pass just one single value, with no BETWEEN, the end value of your frame is considered the CURRENT ROW by default. You don’t have just one single row value selected, as you may wrongly assume.

Improved OVER ROWS Visual Clause in Data Xtractor

When we started providing Data Xtractor support for the ROWS clause, we fell in the same trap of trying to map the so many variations to different functions. We ended up with no less than eight OverRowsFrom and OverRowsTo functions, which was huge.

  1. We figured out at the end we actually need one single function for all, with two FromTo mandatory parameters for the beginning and starting of the frame:
  2. From-To parameters have to be mandatory, to avoid guessing the default values and to always give a precise indication of what the frame limits are.
  3. From-To parameters should use shorter and more intuitive values and keywords. As it’s hard to guess what UNBOUNDED would stand for, we rather used FIRST and LAST for the relative maximum and minimum edges of a set.
  4. CURRENT would obviously be for the current row, and a strict positive or negative offset would stand for the relative numbers of rows after or before the current row. A zero offset is also valid in SQL and transparently translates into the CURRENT row, but we also eliminated on purpose this additional unnecessary redundant complication.
  5. Both parameters can be the same, except you can never have FIRST as second parameter and LAST as first parameter.
  6. There are always two parameters, never just one. This is to avoid guessing if that’s about selecting just one single value, or a range with ending edge you must guess again what its value could be.

We also figured out for this specific function that the generic Edit Function Arguments popup wouldn’t work. The function is too specific and visually sliding the two edges of the frame on screen would be easier and provide a much better user experience.

  1. By sliding the edges of a visual frame, you can never reverse by mistake the start end end limits.
  2. The visual frame selection does not allow you to select the first row or last row alone.
  3. The drop-down controls become enabled only when the top negative or bottom positive offset values are required. This is only when one of the frame margins edges a current +/- offset entry.
  4. Each frame change triggers the update of both the SQL ROWS clause and our OverRowsBetween function as popup title.

Visual OVER ROWS Edit Popup in Data Xtractor

OverRowsBetween(from FIRST, to CURRENT) is the default definition for our function. When you don’t have a ROWS clause, or when you skip the second parameter from its SQL definition, this is by default the selected frame. And this is what you get as first popup on any new frame selection:

SQL Over Rows Clause First to Current

OverRowsBetween(from offset 1, to offset 5) is an example on how to limit the frame to one specific set. This could be one single row, rows with current row in the middle, or rows on one side of the current row.  This works with strict positive or negative offset values, or on the current row.

You can never select just the first or the last row: we tested all supported relational databases and this is not allowed. Generated SQL ROWS clause also translates into a BETWEEN n AND m expression, never into a single value – which you may assume you can use, when it actually translates into a frame ending in CURRENT ROW instead.

SQL Over Rows Clause Between Following

OverRowsBetween(from offset 1000000, to LAST) is a last example on how you can pass whatever big positive numeric values you want, as they will always be kept valid. You can never switch invalid from and to values for the frame or allow for invalid combinations. And you’ll never have to remember what the actual SQL syntax is, as it is generated for you on screen after any possible change:

SQL Over Rows Clause From Following to Last

Visual SQL Queries Calling OverRowsBetween

Following visual SQL query emulates LAG and LEAD with the ROWS frame selection of a single row.

First column looks at the second entry two rows behind, for each current row. Our custom visual OverRowsBetween function takes the same -2 offset value for both from and to parameters. Thirst column looks at the second entry two rows ahead, for each current row. Our custom visual OverRowsBetween function takes the same 2 offset value for both from and to parameters.

Using Min or Max on the single value will simply return that value. Unlike Avg or Sum, Min and Max also work on text values.

Rows Clause Single Row

The next visual SQL query emulates FIRST_VALUE and LAST_VALUE with some specific ROWS frame selections.

First column selects the minimum value from the frame between the first and current row, with an intuitive OverRowsBetween(FIRST, CURRENT) call. This is the frame always selected by default, and the explicit call can be obviously omitted. For names sorted in ascending order, as they are here, this will always be the first Apples string value.

First column selects the maximum value from the frame between the current and last row, with an intuitive OverRowsBetween(CURRENT, LAST) call. For names sorted in ascending order, as they are here, this will always be the last Plums string value.

Rows First Last Value

Support and Compatibility Issues

The visual SQL OVER ROWS clause has been tested and implemented in Data Xtractor for the following relational databases, with built-in window functions support:

  • SQLite – which supports window functions since 2018, with v3.25.
  • Microsoft SQL Server – including Azure.
  • Oracle
  • IBM DB2
  • PostgreSQL –  Amazon Redshift, which is based on Postgres, should work as well (not tested yet).
  • MySQL – which supports window functions since v8.0.2. Amazon Aurora, which is MySQL-based, should work as well once they start using a more recent version.
  • MariaDB – which supports window functions since v10.2.
  • Firebird – it supports window functions now, but no ROWS clause yet.
  • SQL Anywhere – it may work as well, but not tested.

Categories: Query Builder

Cristian Scutaru

I designed and implemented the Data Xtractor suite, with Model Xtractor, Query Xtractor, and Visual Xtractor as separate modules. I am a software architect and developer with over 30 years professional experience. I’ve been working with relational databases for almost three decades and I was constantly unhappy with the relative limitation of those tools used to connect directly to a platform, and instantly extract and display data in flexible ways.