Window Function Framing: ROWS vs RANGE vs GROUPS
Visual description of the ROWS, RANGE and GROUPS window function clauses. Understand what frame is selected when used on a set with duplicate values. Get the difference between a current row and a current group. Think in terms of either individual rows, or groups with duplicates. Use offset rows, groups or values.
Table of Contents
ROWS/RANGE/GROUPS Clauses
Framing in window functions is the further restriction of a selected partition frame based on the current row. Standard SQL defines the ROWS, RANGE and GROUPS clauses for this purpose, which must follow one ORDER BY clause within an OVER construct. Here is a graphical description of this clause, fully implemented now in SQLite:
At the time of this writing, only last version (11) of PostgreSQL and SQLite implemented the GROUPS clause. The RANGE clause is only partially implemented in SQL Server and previous versions of PostgreSQL, in the sense no n PRECEDING/FOLLOWING is allowed in the RANGE clause. The ROWS clause is implemented by most databases with window function support, except Firebird.
Frame Definitions Query
Install Data Xtractor and connect to any SQLite database. This could be one of the Chinook or Employees samples automatically connected to on an empty project file. Generate all demo queries, then open and run the Frame Definitions generic demo query, found under the Queries > Demo Queries > Basic > Window Functions folder. Switch to design mode, to see the visual SQL builder with the results:
We’ll use this data in all other examples. To keep it simple, we selected 10 constant values (as quantity) in ascending order in each of the OVER ORDER BY clauses, and in the query itself. This means each OVER frame will render part of the set of values from the first query column, in the exact same order. Remark that real-life queries may sort the query and the OVER columns in different ways.
Current visual SQL query uses a different ROWS, RANGE or GROUPS framing clause in each of the last three columns. As GROUPS is presently supported only on PostgreSQL v11+ and SQLite v3.28+, this will work only on these database types. OverOrderBy translates directly into an OVER ORDER BY clause with one single field.
OverRows, OverRange and OverGroups are internal functions that translate directly into the ROWS, RANGE, and GROUPS clauses, each with an optional EXCLUDE (but we will not use EXCLUDE here). First two arguments are keywords or numeric offset values for the beginning and the end of the selected frame. In our query, all last three columns limit the frame from 3 PRECEDING to the CURRENT ROW. We’ll further explain what this is and how it works.
The results shows all quantity values sorted in ascending order. Rows with consecutive duplicate values are highlighted in red. As we’ll see, duplicates are the main difference in how ROWS, RANGE and GROUPS work. Set DISTINCT option for the query – to show only different values, with no duplicates – and you will see that ROWS, RANGE and GROUPS show always the exact same values, with no difference at all.
We highlighted in light blue the current row, for which we will further determine what’s selected in each case.
Current Row and Current Group
Main difference between ROWS and the other two clauses is ROWS operates on individual rows, while RANGE and GROUPS operates on groups. A peer group contains all consecutive rows with the exact same value (or combination of values, when sorted by multiple fields). Unlike the current row, the current group contains all ties or duplicates of the current row.
When you use a window function on the CURRENT ROW, remember this will relate to the individual physical row only for ROWS. For RANGE and GROUPS, when the current row has duplicates around, the function applies to all of them. Read also RANGE vs ROWS: When CURRENT ROW is not a Row.
When our physical current row is the 6th row in the result set, CURRENT ROW returns only the value 8 for ROWS. However, both RANGE and GROUPS will consider two rows, as 8 has also a previous duplicate.
Preceding with an Offset Value
Right-click on the cell icon of the OverRows function call. Click on the Edit Function Arguments menu command. The highly specialized popup helps you select any supported window frame you want:
Generated SQL for each frame is displayed in the bottom-right corner, with a textual description on what and why the frame is defined this way. The generated SQL for our full query – after Hiding SQL for Subqueries – is:
SELECT quantity,
SUM(quantity) OVER (ORDER BY quantity ROWS 3 PRECEDING) AS "ROWS",
SUM(quantity) OVER (ORDER BY quantity RANGE 3 PRECEDING) AS "RANGE",
SUM(quantity) OVER (ORDER BY quantity GROUPS 3 PRECEDING) AS "GROUPS"
FROM FruitsU
ORDER BY 1
The popup with plenty of possible configuration options is fully described here. What we are interested with is what are the rows considered for our current row selection (6th physical row in the result set) for a 3 PRECEDING frame.
ROWS goes back 3 individual rows from the current row, plus the current row itself (always selected). The SUM for this frame will be (1 + 5 + 8) from the offset + 8 from the current row = 22.
RANGE, unlike ROWS, uses an offset value (not count) that must be applied to the current row value, to find the group with that value looking back. You don’t count rows or groups looking back, instead you are looking for the first row with the value 8 (from current row) – 3 (from the PRECEDING offset) = 5. The SUM for this frame will be 5 (from the only previous group with this value) + (8 + 8) as sum of all values from the current group = 21. Beware database systems like Microsoft SQL Server and PostgreSQL prior v11 support RANGE, but with no n PRECEDING/FOLLOWING offsets.
GROUPS is somehow similar to ROWS, but it will look back for a number of peer groups, instead of individual rows. As 5, 1, and (0 + 0) are the three groups with 3 distinct values before the current (8 + 8) group, the SUM for this frame will also be 22, as for ROWS. But this is just because the first group here contains two 0 values. Beware only PostgreSQL v11+ and SQLite v3.28+ support today this GROUP clause.
General Preceding and Following
This picture shows all possible combinations for PRECEDING and FOLLOWING with an offset value.
While for ROWS it’s simply straightforward, using positive integer numbers as offset for individual rows, GROUPS uses similar integers for groups back and forth.
RANGE uses a value offset, not a count. 5 FOLLOWING will still return just the (8 + 8) current group, as for CURRENT ROW. That’s because there is no row forward with the 8 + 5 = 13 value. But 6 FOLLOWING will discover the group with 8 + 6 = 14 value and include it in the frame. The SUM on this frame will be (8 + 8) from the current group + (14 + 14 + 14) from the group within the offset range.
First Row and Last Row
Whenever you start your frame with the first row, or end it with the last row, this translates into an either UNBOUNDED PRECEDING (for the first row) or UNBOUNDED FOLLOWING (for the last row) in the SQL query. All ROWS/RANGE/GROUPS clauses behave the same way, and our popup will generate just a ROWS clause:
Beware the ROWS clause should NOT be used on sets with duplicate values, as described here. Returned results may be unpredictable, as demonstrated below.
The returned values in the ROWS column on one of my MySQL servers had a wrong sequence. GROUPS is not supported on MySQL, so only ROWS and RANGE appear in the previous query. However, the sequence returned in ROWS for the group with 14 values is wrong: it should clearly be (35, 44, 50), not (44, 50, 35).
What happened is the ORDER BY of the outer query and the ORDER BY in OVER may have returned rows with the same duplicate values in a different order!
You may try this on your own MySQL server (or any other database) and get a correct result. I tried this on a SQLFiddle online with the same MySQL version and it worked fine. But do not rely on this, it does not matter. All database vendors make no guarantee on which order an unsorted set or subset of rows is returned. For duplicate rows, with the same value, your specific small inner sequence is not sorted, it doesn’t matter if you sort the rest.