Crosstab Queries

Table of Contents

Pivot Columns

On any group query with one simple aggregation (i.e. one single column using Count, Sum, Avg, Min or Max) and two or more Group By columns, you may switch the Group By on one of the last columns to Pivot. All unique values that you had for the pivot column when it was Group By will “rotate” into a series of columns with those headers. Your results grid becomes a cross-table (or a crosstab, or a pivot table), with row headers the remaining Group By column(s), column headers the distinct values of the pivot, and aggregate values for each intersect.

The following simple group query tells you how many employees you have in each job and department (4 SALESMAN in department 30, 2 CLERK in department 20 etc):

pivot-query-1

 

By simply “pivoting” the JOB column (by changing its aggregate from Group By to Pivot), you get a different 2D view:

pivot-query-2

 

pivot-query-3

You can still clearly see there are 4 SALESMAN in department 30, 2 CLERK in department 20 etc, but through a sparse bi-directional table.

Emulated Queries and Native Pivot Support

SQL Server and Oracle have native PIVOT syntax, and Microsoft Access has a specific TRANSFORM-PIVOT syntax. However, for platforms with no specific crossstab support, or when Always Emulate Pivot option (by default False) is on, we emulate crosstabs as well!

From the outside, through our generic visual interface, you’ll see no difference at all. If you are curious about the implementation details, inspect the platform-specific generated SQL queries.

With few exceptions, they can be also reused as subqueries.

Dynamic and Static Pivot Values

Pivot values are static when you provide a In List Values operation in the Where clause for the pivot column. When not, we issue a transparent query to the database to get these distinct values. That’s great for you, because you do not have to know in advance what distinct values exist for that field. We do this for you transparently, because all crosstab queries (emulated or not) require to hard-code them in the SQL query itself.

Both static and dynamic lists of values could be sorted by your typical Sort choice. Max Pivot Columns option (default 20) may limit the number of values we expand from the server. Remove NULL from Lists option (default True) will also automatically discard dynamic NULL values, that may not work well when used as columns in SQL Server and Oracle.

Let’s see a more complex pivot table or crosstab, where column headers are obtained through a lookup, then sorted and filtered through a static list of values:

pivot-query-4

 

pivot-query-5

Crosstabs with Drill-Down

Your pivot column can be a very complex field, enhanced through Formula transformations. Following query shows total salaries for employees hired in a same year, in each department, starting with the highest year value (dynamic year values!). We changed the data format as well, just to show it’s possible and it works. And yes, we generated the exact same query, with the exact look and feel, for all our supported platforms! There could be huge differences underneath on how this is actually implemented from platform to platform (and there are!), but you will see the same simple behavior when you move from SQL Server to Oracle, to Access, to MySQL or SQLite:

pivot-query-9

 

pivot-query-8

 

As for the regular grouping queries, the beauty is you can drill-down on each row and look at the aggregated data. In ACCOUNTING, there are two people hired in 1981 (CLARK and KING), with sum of salaries $2,450 + $5,000 = $7,450. We also have MILLER, hired in 1982, with $1,300 salary:

pivot-query-10

Sparse vs Compact Cross-Table

We want lists of employees in each job, where job names appear as column headers. First attempt may be with a pivot on JOB obviously, MAX of employee names and group by employee numbers. But this would lead to a sparse table:

pivot-query-12

 

pivot-query-14

 

An interesting solution is using a window function! RowNumber here generates a 1, 2, 3, 4 suite for each list of employee names under each job:

pivot-query-15

 

pivot-query-11

 

Conclusion

Generated queries for all these use cases are not trivial and may be time-consuming for sure, if written manually. There are also many platform-specific differences, even completely different implementations. However, with Query Xtractor, most of this work seems trivial.