RANGE vs ROWS: When CURRENT ROW is not a Row
Why and when ROWS CURRENT ROW and RANGE CURRENT ROW return different results. How you can visually design this SQL query with window functions and prevent this confusion.
Table of Contents
ROWS vs RANGE with CURRENT ROW
Install Data Xtractor and connect to the SQLite Employees sample database, as described here. The emp table has employees with their SAL salaries. Create a new text query from scratch and paste the following SQL:
SELECT DISTINCT sal as salary,
SUM(sal) OVER (ORDER BY sal ROWS CURRENT ROW) AS ROWS,
SUM(sal) OVER (ORDER BY sal RANGE CURRENT ROW) AS RANGE
FROM emp
ORDER BY sal
The query shows distinct salary numbers in ascending order, in both the query and each of the OVER window function clauses. As long as both ROWS and RANGE clauses with just a CURRENT ROW return the …current row, we should expect each column to show the same salary value, as it follows:
Remove the DISTINCT keyword and run the query again:
The visible difference is RANGE values for duplicates actually return as CURRENT ROW all rows with the same value as the current row! And this is actually the current peer group. The SUM of all these ties is no longer the same as the current row.
ROWS and RANGE with Visual SQL Query
While this is clearly a point of possible confusion when using window functions with plain typed SQL, let’s have a look at how the visual SQL query builder from Data Xtractor helps you avoid this situation.
Create another totally similar query, but as a built or designed query. The final result will look like this:
- Expand the Tables folder, right-click on the emp table and select Add to New Built Query.
- Set No Pagination.
- Select the SAL column three time, and remove all other columns from the design grid.
- Hide the diagram.
- Sort the query by the first column.
- Change the headers of the three columns to salary, ROWS and RANGE.
- Add as first Formula to the second column the Aggregate Function Sum.
- Add a call to the Window Function OverOrderBy, with emp-SAL field as first arg.
- Add another call to the Window Function OverRows, and select the current row:
Add the exact same function calls to the third column, except select OverRange and the current group instead, at the last step:
Conclusion
- ROWS CURRENT ROW and RANGE CURRENT ROW will likely select different frames when working on sets with duplicate values. The fact that CURRENT ROW in RANGE actually returns the current group of rows with the same value as the current row is always properly documented, but still confusing and easy to forget when you type SQL.
- Our visual SQL queries determine the ROWS or RANGE clause keyword based on the current row or current group selection. This way, you don’t even need to know what clause is actually generated and you may focus on what exactly you want your frame to return.
- Remark that the GROUPS clause, when supported, returns just like RANGE the current peer group. While RANGE CURRENT ROW and GROUPS CURRENT ROW are equivalent, you can only enter the second format in a text query.
- Our highly specialized popup transparently generates the ROWS, RANGE, GROUPS and EXCLUDE standard SQL clauses. It shows an error if some of these features are not supported by your database. The popup focuses on the actual frame limits, not on the specific SQL syntax. However, the generated SQL and full textual description on each selection appear on screen after each interaction.