Shared Inline Aggregates with Window Functions for MySQL

Published by Cristian Scutaru on

Design SQL queries with window functions and reusable OVER clause declarations, regardless of whether or not your database supports the standard SQL WINDOW definition. Design visual SQL queries using MySQL window functions through Data Xtractor. Generate reusable WINDOW definitions, or simply isolate a reusable OVER clause in the visual designer.

Table of Contents

Aggregates per Year

Install Data Xtractor and the Chinook database for MySQL. Also generate the demo queries. All the SQL queries from this article will also appear as generated demo queries under the Queries – Demo Queries – Chinook Tutorial – Window Functions folder.

For all Invoice table entries, collect different typical aggregates – SUM, AVG, MIN, MAX – for each year.

InlineAggs - Year Aggregates

This is a typical GROUP BY query that we will use as reference, with generated SQL:

SELECT EXTRACT(YEAR FROM invoice_.`InvoiceDate`) AS `Year`,
   SUM(invoice_.`Total`) AS `Sum`,
   AVG(invoice_.`Total`) AS `Avg`,
   MIN(invoice_.`Total`) AS `Min`,
   MAX(invoice_.`Total`) AS `Max`
FROM `chinook`.`invoice` AS invoice_
GROUP BY 1
ORDER BY 1

Inline Aggregates with Window Functions

This new queries displays granular data from the Invoice table, in Data and Amount fields. There is one such different entry for each table row. However, window functions – defined with inline OVER clauses for each column – can also add aggregates per year to each row.

Last four query expressions show aggregate values similar to the one collected before, but inline for each row. For instance, total sales for year 2009 are $449.46. This value appears as Sum in the previous GROUP BY query, and also inline, for each 2009 entry, in the rows below:

InlineAggs - Window Functions

For inline aggregates, select one Window Aggregate function for the first Formula row, and follow with a continuous combination of Over…(…) Window Function calls. You can add other functions or formulas before or after, but all Over calls must stick together.

These Over function calls will translate into the typical parts of a SQL OVER clause: with PARTITION BY, ORDER BY and ROWS/RANGE lines.

An OVER clause can be also left empty, or can reference a separate window definition, with the WINDOW clause, when supported by your database.

Generated SQL for our previous query shows:

  1. The same SQL OVER clause generated for SUM, AVG, MIN and MAX aggregates.
  2. Each partition grouped by the current Year, extracted from the InvoiceDate field.
  3. Each partition sorted ascending by the same InvoiceDate field.
  4. For any row, a frame with all ROWS for that Year.
SELECT EXTRACT(YEAR FROM invoice_.`InvoiceDate`) AS `Year`,
   invoice_.`InvoiceDate` AS `Date`,
   invoice_.`Total` AS `Amount`,
   SUM(invoice_.`Total`) OVER (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
      ORDER BY invoice_.`InvoiceDate`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Sum`,
   AVG(invoice_.`Total`) OVER (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
      ORDER BY invoice_.`InvoiceDate`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Avg`,
   MIN(invoice_.`Total`) OVER (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
      ORDER BY invoice_.`InvoiceDate`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Min`,
   MAX(invoice_.`Total`) OVER (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
      ORDER BY invoice_.`InvoiceDate`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Max`
FROM `chinook`.`invoice` AS invoice_
ORDER BY 1, 2

Shared Window Definitions for Window Functions

Previous query was great – it already shown how easy you can design SQL queries with window functions in Data Xtractor – but it has a problem. Redundancy happens often in real life with these kind of inline aggregates: you may have to specify again and again the same OVER clause. This is time consuming and it also adds clutter and unnecessary complexity to the generated SQL.

SQL: 2003 standard described a WINDOW clause just a few vendors still implement today. MySQL added support for the WINDOW clause, but only recently. A WINDOW clause allows you to define an OVER clause once, and reference it several times in your query.

Our visual SQL query builder took a step further: you may now start a separate query expression (i.e. designer grid column) with a call to an Over window function, and that column will be hidden and used as a template for a window definition. The Over(…) call from other active columns can then reference that same query expression:

InlineAggs - Window Reuse

The WINDOW clause, when supported, appears before the ORDER BY clause of the main query. Here is the generated SQL for MySQL v8.0.2+, which added built-in support for this standard SQL construct:

SELECT EXTRACT(YEAR FROM invoice_.`InvoiceDate`) AS `Year`,
   invoice_.`InvoiceDate` AS `Date`,
   invoice_.`Total` AS `Amount`,
   SUM(invoice_.`Total`) OVER `w` AS `Sum`,
   AVG(invoice_.`Total`) OVER `w` AS `Avg`,
   MIN(invoice_.`Total`) OVER `w` AS `Min`,
   MAX(invoice_.`Total`) OVER `w` AS `Max`
FROM `chinook`.`invoice` AS invoice_
WINDOW `w` AS (
   PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
   ORDER BY invoice_.`InvoiceDate`
   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY 1, 2

 

But what about the other relational databases, you may ask. Those with no native WINDOW clause support…

Well, we’ll still obviously have to use the previous SQL syntax. But the beauty is you may still use just one separate column in our visual SQL query builder and declare your OVER clause once. This may translate into an important productivity gain when you have to type such queries on a daily basis.

Nested Window Definitions for Window Functions

The same SQL: 2003 standard also specifies support for nested WINDOW definitions. You should be able to reference another window definition in your current window definition.

We also reflected this in our visual SQL query builder: any Over(…) call – from both an OVER and WINDOW clause – can reference other window definitions. The query below isolated the SUM inline aggregate, whose definition spreads over the last three builder columns. It uses first the w1 window, which is also defined based on w2.

Shared Window Definitions

New versions of MySQL, and PostgreSQL (since v8.4), may be the only well-known relational databases with built-in support for such nested WINDOW clauses. And PostgreSQL also required the dependent windows to be declared at the end:

SELECT EXTRACT(YEAR FROM invoice_.`InvoiceDate`) AS `Year`,
   invoice_.`InvoiceDate` AS `Date`,
   invoice_.`Total` AS `Amount`,
   SUM(invoice_.`Total`) OVER (`w1`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Sum`
FROM `chinook`.`invoice` AS invoice_
WINDOW `w2` AS (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)),
   `w1` AS (`w2`
      ORDER BY invoice_.`InvoiceDate`)
ORDER BY 1, 2

 

But what about databases such as SQLite and SQL Anywhere, with native WINDOW support, but not nested? Well, we’ll still generate the WINDOW clause, but with no nested references. w1, in this case, will no longer refer to w2, but will have all w2’s window definition repeated inside:


SELECT DATEPART(YEAR, invoice_."InvoiceDate") AS "Year",
   invoice_."InvoiceDate" AS "Date",
   invoice_."Total" AS "Amount",
   SUM(invoice_."Total") OVER ("w1"
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS "Sum"
FROM "DBA"."Invoice" AS invoice_
WINDOW "w2" AS (
      PARTITION BY DATEPART(YEAR, invoice_."InvoiceDate")),
   "w1" AS (
      PARTITION BY DATEPART(YEAR, invoice_."InvoiceDate")
      ORDER BY invoice_."InvoiceDate")
ORDER BY 1, 2

 

When there is no WINDOW built-in support, or when you check the Emulate WINDOW Clause option, you automatically get your typical SQL query with explicit OVER clauses. All related Over calls will be collected into one single OVER clause for your active column.


SELECT EXTRACT(YEAR FROM invoice_.`InvoiceDate`) AS `Year`,
   invoice_.`InvoiceDate` AS `Date`,
   invoice_.`Total` AS `Amount`,
   SUM(invoice_.`Total`) OVER (
      PARTITION BY EXTRACT(YEAR FROM invoice_.`InvoiceDate`)
      ORDER BY invoice_.`InvoiceDate`
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS `Sum`
FROM	`chinook`.`invoice` AS invoice_
ORDER BY	1, 2

 

And there is more. As you may define your calls in no particular order, and we take care to always call them the right way. We warn you of redundancies and eliminate them. We tell you what was wrong, but we try to still render a correct query. We make you more productive.

WINDOW Clause Support

To conclude, here are our Data Xtractor relational databases with window functions support.

  1. MySQL – supported since v8.0.2, including the nested WINDOW references.
  2. PostgreSQL – including nested WINDOW references, with dependents declared last.
  3. SQLite – added recently. WINDOW clause supported, but no nested references.
  4. SAP/Sybase SQL Anywhere – WINDOW clause supported, but no nested references.
  5. MariaDB – no WINDOW clause.
  6. Oracle – no WINDOW clause.
  7. Microsoft SQL Server – no WINDOW clause.
  8. Amazon Redshift – no WINDOW clause.
  9. Firebird – no WINDOW clause.
  10. IBM Db2 – no WINDOW clause.

Categories: Query Builder

Cristian Scutaru

I designed and implemented the Data Xtractor suite, with Model Xtractor, Query Xtractor, and Visual Xtractor as separate modules. I am a software architect and developer with over 30 years professional experience. I’ve been working with relational databases for almost three decades and I was constantly unhappy with the relative limitation of those tools used to connect directly to a platform, and instantly extract and display data in flexible ways.