LAG Function: Explained and Emulated with ROWS/RANGE

Published by Cristian Scutaru on

LAG window function looks at some row behind. It can be emulated with the ROWS BETWEEN clause. We do it here and then compare it with the direct call. Finally, we show you how LAG works with a visual image of the partition used by both LEAD/LAG and the ROWS-based window functions.

Table of Contents

Visual SQL Query with the LAG Function

Download Data Xtractor for SQLite, and create the SQL query below with the visual SQL query builder. The query basically works on ANY relational database supporting window functions, as it is built with constants concatenated by UNION calls.

What it does is it selects all fruit names from a simple dynamic table built this way.

SQL LAG Function

  1. First column returns all fruit names sorted in ascending order.
  2. Second column uses our LAG function with an offset of 2, on a similar partition of fruit names.
  3. Third column emulates our LAG function with the MIN aggregate OVER the same partition, but limiting the window with the ROWS BETWEEN clause, to select one single entry two rows back.

Second and third column both return a value displayed two rows before by the main query. This is what LAG does: it looks back on the partition, starting from the current row, and it returns the row value at some specified offset.

Generated SQL Query with LAG Call

Switch to SQL mode for the generated SQL query. We simplified the UNION SELECT subquery, to keep it short and focused.

WITH fruitsu_ AS (
   SELECT 'Apples' AS "name"
   UNION SELECT 'Oranges'
   UNION SELECT 'Plums'
   ...
   UNION SELECT 'Clementines')

SELECT fruitsu_."name" AS "name",
   LAG(fruitsu_."name", 2) OVER (ORDER BY fruitsu_."name") AS "Lag",
   MIN(fruitsu_."name") OVER (ORDER BY fruitsu_."name"
      ROWS BETWEEN 2 PRECEDING AND 2 PRECEDING) AS "Min"
FROM fruitsu_
ORDER BY 1

 

The initial top subquery just returns a few rows with fruit names. All window function calls are in the second SELECT subquery.

Second SELECT line calls the LAG function as we described before. Remark the OVER clauses after both LAG and MIX, which define partitions similar to the main query, with sorted fruit names. In real-life scenarios, the partitions may be different, but here, for demo purposes, it’s just easier to use the same set.

How the LAG Function Actually Works

Both LEAD and LAG window functions look at some row forwards or backwards from the current row, in their partitions defined with OVER. In our example, the LAG function jumped two rows before. For Grapes in current row, it returned Cherries.

LAG Function Description

When emulated with ROWS BETWEEN n AND n, we also limited the partition to the one record n rows before. You can call a MIN or MAX aggregate function, or even a SUM or AVG on numeric fields, to return that single value.

For the first two rows, both LAG and MIN look two records before, but they find nothing and return NULL.

Support for LAG and Compatibility Issues

LAG called directly and emulated has been tested with Data Xtractor on several RDBMSs. It works well on the following relational database systems:

  • SQLite – which supports window functions since 2018, with v3.25.
  • Microsoft SQL Server – works well, also on Azure.
  • Oracle – it works fine.
  • IBM DB2 – works fine.
  • PostgreSQL – ok. Amazon Redshift, which is based on Postgres, should work as well (not tested yet).
  • MySQL – which supports window functions since v8.0.2. Amazon Aurora, which is MySQL-based, should work as well once they start using a more recent version.
  • MariaDB – which supports window functions since v10.2.
  • Firebird – LAG works fine, but it does not support the ROWS/RANGE clause yet.
  • SQL Anywhere – it may work as well, but not tested.

Categories: Query Builder

Cristian Scutaru

I designed and implemented the Data Xtractor suite, with Model Xtractor, Query Xtractor, and Visual Xtractor as separate modules. I am a software architect and developer with over 30 years professional experience. I’ve been working with relational databases for almost three decades and I was constantly unhappy with the relative limitation of those tools used to connect directly to a platform, and instantly extract and display data in flexible ways.